
199
UPLOADS


The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG01797. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01797/7901240/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG01747. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01747/7898108/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG01670. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01670/7897691/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG01256. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01256/7885169/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG02095. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02095/7927067/1
The Dataset Collection
by Jonathan Pevsner
data

eye 2

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for NA19359. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA19359/7928666/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for NA19795. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA19795/7932278/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for NA21144. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA21144/7951910/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG01171. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01171/7884881/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG03730. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03730/7923617/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG02536. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02536/7931063/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for NA19737. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA19737/7931006/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for NA20884. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA20884/7949462/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG02006. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02006/7914449/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG03279. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03279/7897946/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG01347. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01347/7890095/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG03446. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03446/7901288/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG01702. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01702/7898024/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for NA20282. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA20282/7934999/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG03464. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03464/7901504/1
The Dataset Collection
by Jonathan Pevsner
data

eye 2

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for NA19463. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA19463/7929641/1
The Dataset Collection
by Jonathan Pevsner
data

eye 2

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG03926. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03926/7929389/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG02142. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02142/7927550/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG03977. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03977/7929782/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG01682. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01682/7897922/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for NA18951. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA18951/7892684/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG03095. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03095/7890959/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for NA19429. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA19429/7929002/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for NA20344. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA20344/7936628/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for NA11931. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA11931/7936667/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG01113. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01113/7883471/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG01204. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01204/7885034/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG03792. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03792/7927244/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for NA19467. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA19467/7929692/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG02793. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02793/7940702/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for NA19663. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA19663/7930124/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG02661. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02661/7934975/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG00154.
Source: https://figshare.com/articles/dataset/gVCF_HG00154/7872560/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG01362. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01362/7890326/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for NA18907. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA18907/7891043/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG00530. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG00530/7879991/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG03190. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03190/7895180/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG02860. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02860/7884980/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for NA20289. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA20289/7935014/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for NA21086. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA21086/7950086/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG01948. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01948/7910672/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG03897. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03897/7928993/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG02799. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02799/7940798/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG01746. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01746/7898099/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG03598. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03598/7908563/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for NA19472. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA19472/7929845/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG02070. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02070/7925396/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG03812. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03812/7928066/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG01624. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01624/7895774/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for NA18864. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA18864/7890737/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG02286. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02286/7928966/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for NA19475. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA19475/7929941/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG02554. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02554/7931255/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG00097.
Source: https://figshare.com/articles/dataset/gVCF_HG00097/7841411/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for NA18504. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA18504/7944293/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG03452. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03452/7901339/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG03874. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03874/7928795/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for NA19185. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA19185/7927103/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG01950. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01950/7911317/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG03558. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03558/7907651/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for HG02561. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02561/7931462/1
The Dataset Collection
by Jonathan Pevsner
data

eye 1

favorite 0

comment 0

1000 Genomes gVCF mapped to hs37d5 for NA20786. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA20786/7944296/1
OMICS-DOI-LANDING-CRAWL-2019-04
by Internet Archive Web Group
data

eye 0

favorite 0

comment 0

DOI-LANDING-CRAWL-2018-06
by Internet Archive Web Group
data

eye 6

favorite 0

comment 0

UNPAYWALL-PDF-CRAWL-2020-11
by Internet Archive Web Group
data

eye 0

favorite 0

comment 0

OA-JOURNAL-CRAWL-2020-07
by Internet Archive Web Group
data

eye 1

favorite 0

comment 0

Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data

eye 21

favorite 0

comment 0

Bulk Bibliographic Metadata
by Internet Archive Web Group
data

eye 23

favorite 0

comment 0

This item contains some bulk research affiliation datasets from Internet Archive cataloging efforts. These are mostly strings included in research papers that indicate the institutional affiliations of specific authors (e.g., with a home department, university, or company) at the time of publication. These datasets might be useful for efforts to build complete indices of research organizations, or to test normalization code that maps raw strings to organization identifiers. Attribution and links...
The Dataset Collection
by Internet Archive Web Group
data

eye 67

favorite 0

comment 0

This item contains both metadata and fulltext PDF content (from the public web) related to research on COVID-19 and past influenza pandemics. This content backs the https://covid19.fatcat.wiki search interface. Rough numbers:
- over 51,000 metadata records from 2020-04-10 release of CORD19 corpus
- over 79,000 metadata records total (union of the above plus fatcat.wiki keyword matches)
- over 45,000 fulltext PDF files and derived PNG thumbnails and pdftotext text files
The upstream...
Topic: COVID-19, Coronavirus, SARS-CoV-2
UNPAYWALL-PDF-CRAWL-2020-11
collection
199
ITEMS
1.6M
VIEWS
by Internet Archive Web Group
collection

eye 1.6M

Bulk Bibliographic Metadata
by Internet Archive Web Group
data

eye 18

favorite 0

comment 0

This item contains SPARQL query exports of journal metadata from Wikidata, in particular ISSN/QID mappings. The SPARQL query that was run is included as wikidata.sparql.
Internet Archive Research Publication Crawls
collection
21,054
ITEMS
100.6M
VIEWS
by Internet Archive Web Group
collection

eye 100.6M

A series of open web crawls targeting journal articles, technical memos, essays, datasets, and other research publications. This collection contains WARC and CDX files that end up in Wayback (https://web.archive.org). See also bibliographic metadata corpora at https://archive.org/details/ia_biblio_metadata
by Internet Archive Web Group
collection

eye 6,488

This collection contains web crawl data for a random selection of 500k (0.5 million) Crossref DOI redirects, including the doi.org redirect requests. The intent of this crawl is to gather loose statistics on the number of failing redirects and the number of host websites that block automated crawling, and to collect a corpus of HTML landing pages for metadata extraction (e.g., "signposting" HTTP headers, linked data HTML metadata, semantic markup). Total size of (uncompressed) WARC data is 50 GB,...
Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data

eye 14

favorite 0

comment 0

Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data

eye 5

favorite 0

comment 0

DIRECT-OA-CRAWL-2019
by Internet Archive Web Group
data

eye 5

favorite 0

comment 0

UNPAYWALL-PDF-CRAWL-2020-11
by Internet Archive Web Group
data

eye 0

favorite 0

comment 0

UNPAYWALL-PDF-CRAWL-2018-07
by Internet Archive Web Group
data

eye 12

favorite 0

comment 0

MAG-PDF-CRAWL-2020-07
by Internet Archive Web Group
data

eye 0

favorite 0

comment 0

Bulk Bibliographic Metadata
by Internet Archive Web Group
data

eye 10

favorite 1

comment 0

Lists of URLs for PDFs on the web (and preserved in the Wayback Machine) that are likely to contain research materials.
Bulk Bibliographic Metadata
by Internet Archive Web Group
data

eye 12

favorite 0

comment 0

Test runs of large-scale matching algorithms (sha1 to DOI). Will likely be obsolete soon, and not useful for others.
Community Software
by Internet Archive Web Group
software

eye 111

favorite 0

comment 0

This item contains re-compiled .jar files for JVM (Java, Scala, etc) software packages used by the archive's "sandcrawler" journal ingest pipeline.
Community Texts
by Internet Archive Web Group
texts

eye 4

favorite 0

comment 0

UNPAYWALL-PDF-CRAWL-2018-07
collection
1,241
ITEMS
14.5M
VIEWS
by Internet Archive Web Group
collection

eye 14.5M

Web archive data from a crawl of open access PDF URLs provided by Unpaywall.
OA-JOURNAL-CRAWL-2019-08
collection
201
ITEMS
2.7M
VIEWS
by Internet Archive Web Group
collection

eye 2.7M

PubMed Central Crawl (2019-10)
collection
216
ITEMS
407,225
VIEWS
by Internet Archive Web Group
collection

eye 407,225

Bulk Bibliographic Metadata
by Internet Archive Web Group
data

eye 33

favorite 0

comment 0

A mirror of the Unpaywall (aka oaDOI.org) metadata corpus, primarily consisting of public open access flags for a large number of Crossref-registered DOIs (identifiers representing published journal articles and other works). For more information see: http://unpaywall.org/products/snapshot
Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data

eye 17

favorite 0

comment 0

Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data

eye 16

favorite 0

comment 0

Bulk Bibliographic Metadata
by Internet Archive Web Group
data

eye 93

favorite 0

comment 0

Data-munged title-level metadata combined from: DOAJ, ROAD, Norwegian Register, and Internet Archive crawled metadata. See SOURCES.md for URLs of upstream metadata, and ISSN_matching.html for Jupyter notebook used to derive this dataset.
DATACITE-DOI-CRAWL-2020-01
collection
1,417
ITEMS
3.7M
VIEWS
by Internet Archive Web Group
collection

eye 3.7M

Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data

eye 16

favorite 0

comment 0

Bulk Bibliographic Metadata
by Internet Archive Web Group
data

eye 7

favorite 0

comment 0

This item contains hash lists of PDF files crawled from the public web specifically to preserve the scholarly record. It does not contain hashes of *all* PDFs the archive has ever seen, only a subset. Not all of these hashes are necessarily journal articles or other research outputs, but we have reason to believe the large majority are.
Bulk Bibliographic Metadata
by Internet Archive Web Group
data

eye 10

favorite 0

comment 0

This item contains datasets of homepage URLs found by hand using search engines and bibliographic metadata (e.g., ISSN and journal title). The "long-tail" batch contains about 4,600 journal lookup results, with about 3,900 successful homepage URLs found. The list of journals was created in May 2020, and the lookup work was completed in June 2020. IA staff member Richard Greydanus ran this batch of lookups. All of this metadata can be considered public domain, or CC-0 (Creative Commons Zero)...
Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data

eye 9

favorite 0

comment 0