Dec 18, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG01797. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01797/7901240/1
Dec 17, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG01747. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01747/7898108/1
Dec 15, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG01670. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01670/7897691/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG01256. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01256/7885169/1
Dec 17, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG02095. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02095/7927067/1
Dec 17, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for NA19359. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA19359/7928666/1
Dec 18, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for NA19795. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA19795/7932278/1
Dec 18, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for NA21144. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA21144/7951910/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG01171. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01171/7884881/1
Dec 19, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG03730. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03730/7923617/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG02536. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02536/7931063/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for NA19737. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA19737/7931006/1
Dec 19, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for NA20884. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA20884/7949462/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG02006. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02006/7914449/1
Dec 20, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG03279. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03279/7897946/1
Dec 18, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG01347. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01347/7890095/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG03446. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03446/7901288/1
Dec 18, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG01702. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01702/7898024/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for NA20282. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA20282/7934999/1
Dec 17, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG03464. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03464/7901504/1
Dec 18, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for NA19463. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA19463/7929641/1
Dec 12, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG03926. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03926/7929389/1
Dec 18, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG02142. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02142/7927550/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG03977. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03977/7929782/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG01682. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01682/7897922/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for NA18951. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA18951/7892684/1
Dec 18, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG03095. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03095/7890959/1
Dec 18, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for NA19429. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA19429/7929002/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for NA20344. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA20344/7936628/1
Dec 17, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for NA11931. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA11931/7936667/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG01113. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01113/7883471/1
Dec 18, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG01204. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01204/7885034/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG03792. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03792/7927244/1
Dec 19, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for NA19467. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA19467/7929692/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG02793. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02793/7940702/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for NA19663. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA19663/7930124/1
Dec 20, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG02661. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02661/7934975/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG00154.
Source: https://figshare.com/articles/dataset/gVCF_HG00154/7872560/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG01362. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01362/7890326/1
Dec 12, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for NA18907. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA18907/7891043/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG00530. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG00530/7879991/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG03190. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03190/7895180/1
Dec 18, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG02860. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02860/7884980/1
Dec 19, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for NA20289. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA20289/7935014/1
Dec 19, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for NA21086. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA21086/7950086/1
Dec 19, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG01948. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01948/7910672/1
Dec 18, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG03897. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03897/7928993/1
Dec 18, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG02799. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02799/7940798/1
Dec 17, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG01746. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01746/7898099/1
Dec 19, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG03598. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03598/7908563/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for NA19472. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA19472/7929845/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG02070. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02070/7925396/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG03812. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03812/7928066/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG01624. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01624/7895774/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for NA18864. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA18864/7890737/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG02286. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02286/7928966/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for NA19475. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA19475/7929941/1
Dec 18, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG02554. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02554/7931255/1
Dec 18, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG00097.
Source: https://figshare.com/articles/dataset/gVCF_HG00097/7841411/1
Dec 19, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for NA18504. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA18504/7944293/1
Dec 19, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG03452. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03452/7901339/1
Dec 19, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG03874. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03874/7928795/1
Dec 16, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for NA19185. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA19185/7927103/1
Dec 20, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG01950. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG01950/7911317/1
Dec 18, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG03558. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG03558/7907651/1
Dec 19, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for HG02561. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_HG02561/7931462/1
Dec 18, 2021, by Jonathan Pevsner (data)
1000 Genomes gVCF mapped to hs37d5 for NA20786. Complete collection: https://doi.org/10.6084/m9.figshare.c.4414307
Source: https://figshare.com/articles/dataset/gVCF_NA20786/7944296/1
Sep 29, 2020, by Internet Archive Web Group (data)
This item contains some bulk research affiliation datasets from Internet Archive cataloging efforts. These are mostly strings included in research papers that indicate the institutional affiliations of specific authors (eg, with a home department, university, or company) at the time of publication. These might be useful datasets for efforts to build complete indices of research organizations, or to test normalization code that maps raw strings to organization identifiers. Attribution and links...
Apr 10, 2020, by Internet Archive Web Group (data)
This item contains both metadata and fulltext PDF content (from the public web) related to research on COVID-19 and past influenza pandemics. This content backs the https://covid19.fatcat.wiki search interface. Rough numbers:
- over 51,000 metadata records from 2020-04-10 release of CORD19 corpus
- over 79,000 metadata records total (union of the above plus fatcat.wiki keyword matches)
- over 45,000 fulltext PDF files and derived PNG thumbnails and pdftotext text files
The upstream...
Topic: COVID-19, Coronavirus, SARS-CoV-2
Nov 2, 2020, by Internet Archive Web Group
This item contains SPARQL query exports of journal metadata from wikidata; in particular ISSN/QID mappings. The SPARQL query run is included as wikidata.sparql
Dec 19, 2017, by Internet Archive Web Group
A series of open web crawls targeting journal articles, technical memos, essays, datasets, and other research publications. This collection contains WARC and CDX files that end up in Wayback (https://web.archive.org). See also bibliographic metadata corpuses at https://archive.org/details/ia_biblio_metadata
May 7, 2018, by Internet Archive Web Group
This collection contains web crawl data for a random selection of 500k (0.5 million) Crossref DOI redirects, including the doi.org redirect requests. The intent of this crawl is to gather loose statistics on the number of failing redirects, number of host websites that block automated crawling, and a corpus of HTML landing pages for metadata extraction (eg, "signposting" HTTP headers, linked data HTML metadata, semantic markup). Total size of (uncompressed) WARC data is 50 GB,...
Apr 11, 2019, by Internet Archive Web Group (data)
Jul 20, 2020, by Internet Archive Web Group (data)
URL lists to PDFs on the web (and preserved in the wayback machine) which are likely to contain research materials.
Test runs of large-scale matching algorithms (sha1 to DOI). Will likely be obsolete soon, and not useful for others.
Jun 4, 2018, by Internet Archive Web Group (software)
This item contains re-compiled .jar files for JVM (Java, Scala, etc) software packages used by the archive's "sandcrawler" journal ingest pipeline.
Sep 10, 2019, by Internet Archive Web Group (texts)
Jul 17, 2018, by Internet Archive Web Group
Web archive data from a crawl of open access PDF URLs provided by Unpaywall.
Aug 1, 2019, by Internet Archive Web Group
Oct 12, 2019, by Internet Archive Web Group
A mirror of the Unpaywall (aka oaDOI.org) metadata corpus, primarily consisting of public open access flags for a large number of Crossref-registered DOIs (identifiers representing published journal articles and other works). For more information see: http://unpaywall.org/products/snapshot
Data-munged title-level metadata combined from: DOAJ, ROAD, Norwegian Register, and Internet Archive crawled metadata. See SOURCES.md for URLs of upstream metadata, and ISSN_matching.html for Jupyter notebook used to derive this dataset.
Jan 24, 2020, by Internet Archive Web Group
This item contains hash lists of PDF files crawled from the public web specifically to preserve the scholarly record. It does not contain hashes of *all* PDFs the archive has ever seen, only a subset. Not all of these hashes are necessarily journal articles or other research outputs, but we have reason to believe the large majority are.
This item contains datasets of homepage URLs found by hand using search engines and bibliographic metadata (eg, ISSN and journal title). The "long-tail" batch contains about 4,600 journal lookup results, with about 3,900 successful homepage URLs found. The list of journals was created in May 2020, and the lookup work completed in June 2020. IA staff member Richard Greydanus ran this batch of lookups. All of this metadata can be considered public domain, or CC-0 (Creative Commons Zero)...