Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data: 16 views, 0 favorites, 0 comments

Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data: 55 views, 1 favorite, 0 comments

UNPAYWALL-PDF-CRAWL-2019-04
by Internet Archive Web Group
data: 2 views, 0 favorites, 0 comments

Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data: 46 views, 0 favorites, 0 comments

Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data: 200 views, 3 favorites, 0 comments

See: https://guide.fatcat.wiki/reference_graph.html License: CC-0
Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data: 19 views, 0 favorites, 0 comments

This item contains bulk metadata exported from https://fatcat.wiki. With the exception of the 'abstracts' file (for which no aggregate license or copyright claims can be made; downstream users are responsible for their use), all metadata here is licensed CC-0 (public domain release) and may be used for any purpose. Downstream users are strongly encouraged to provide attribution and link here to the snapshot, as well as give credit to upstream sources (including Crossref, ORCID, DOAJ, the ISSN...
UNPAYWALL-PDF-CRAWL-2019-04
by Internet Archive Web Group
collection: 641 items, 6.3M views

arXiv Content Crawl (2019-10)
by Internet Archive Web Group
collection: 37 items, 95,778 views

Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data: 21 views, 0 favorites, 0 comments

Bulk Bibliographic Metadata
by Internet Archive Web Group
data: 23 views, 0 favorites, 0 comments

This item contains some bulk research affiliation datasets from Internet Archive cataloging efforts. These are mostly strings included in research papers that indicate the institutional affiliations of specific authors (e.g., a home department, university, or company) at the time of publication. These datasets might be useful for efforts to build complete indices of research organizations, or to test normalization code that maps raw strings to organization identifiers. Attribution and links...
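As a rough illustration of the "normalization code that maps raw strings to organization identifiers" mentioned above, here is a minimal Python sketch. The function names, the lookup table, and the `org:example-...` identifiers are all hypothetical; nothing here reflects the actual layout of the datasets in this item or any real identifier scheme.

```python
import re
import unicodedata

# Hypothetical lookup table: normalized organization name -> made-up identifier.
# A real pipeline would load this from an organization registry instead.
HYPOTHETICAL_ORG_IDS = {
    "internet archive": "org:example-0001",
    "universite de montreal": "org:example-0002",
}


def normalize_affiliation(raw: str) -> str:
    """Reduce a raw affiliation string to lowercase ASCII words."""
    # Decompose accented characters, then drop the combining marks.
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Lowercase and replace every run of non-alphanumerics with one space.
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def lookup_org(raw: str):
    """Return the identifier of the first known name found in the string, or None."""
    key = normalize_affiliation(raw)
    for name, ident in HYPOTHETICAL_ORG_IDS.items():
        if name in key:
            return ident
    return None
```

Substring matching like this is deliberately naive; it is only meant to give a test harness something to compare a more careful normalizer against.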
UNPAYWALL-PDF-CRAWL-2021-05
by Internet Archive Web Group
collection: 123 items, 1.1M views

Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data: 34 views, 0 favorites, 0 comments

Bulk Bibliographic Metadata
by Internet Archive Web Group
data: 4 views, 0 favorites, 0 comments

Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data: 11 views, 0 favorites, 0 comments

Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data: 50 views, 0 favorites, 0 comments

OMICS-DOI-LANDING-CRAWL-2019-04
by Internet Archive Web Group
collection: 4 items, 14,313 views

This crawl started in April 2019 as an informal collaboration with Crossref. It covers a smallish number (100k) of DOI redirects and landing pages (plus PDF outlinks, and maybe a couple of other hops) for a single large publisher (OMICS, which has multiple subsidiaries). The intent is to get a reasonably good capture that can be used as canonical preservation copies of the landing pages. A secondary goal is to get decent fulltext capture coverage.
Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data: 16 views, 0 favorites, 0 comments

Community Texts
by Internet Archive Web Group
texts: 5 views, 0 favorites, 0 comments

Bulk Bibliographic Metadata
by Internet Archive Web Group
data: 10 views, 1 favorite, 0 comments

Lists of URLs for PDFs on the web (also preserved in the Wayback Machine) that are likely to contain research materials.
Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data: 12 views, 0 favorites, 0 comments

PubMed Central Crawl (2019-10)
by Internet Archive Web Group
collection: 216 items, 513,745 views

Open Access Journal Test Crawl (2018)
by Internet Archive Web Group
data: 8 views, 0 favorites, 0 comments

Bulk Bibliographic Metadata
by Internet Archive Web Group
data: 28 views, 0 favorites, 0 comments

This dump includes all tables, including the OAuth authentication tables, which could be a privacy (but not a security) concern. At this time only IA staff have accounts, so the snapshot, which is intended mostly for disaster recovery, is still public.
Bulk Bibliographic Metadata
by Internet Archive Web Group
data: 39 views, 0 favorites, 0 comments

This item contains a complete PostgreSQL SQL database snapshot from https://fatcat.wiki, in binary 'pg_dump tar mode' format. With the exception of the 'abstracts' table (for which no aggregate license or copyright claims can be made; downstream users are responsible for their use), all metadata here is licensed CC-0 (public domain release) and may be used for any purpose. Downstream users are strongly encouraged to provide attribution and link here to the snapshot, as well as give credit to...
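Archives written in pg_dump's tar format are normally read back with PostgreSQL's `pg_restore` tool. The sketch below only assembles the two command lines one might start with (list the archive's table of contents, then restore into an existing scratch database). The file name `fatcat_snapshot.tar` and database name `fatcat_scratch` are placeholders, not the real names used in this item, and the helper function is hypothetical.

```python
import shlex


def build_pg_restore_commands(dump_path: str, dbname: str):
    """Return (list_cmd, restore_cmd) argument lists for pg_restore.

    Nothing is executed here; pass a list to subprocess.run() to run it.
    """
    # --list prints the archive's table of contents without restoring anything.
    list_cmd = ["pg_restore", "--list", dump_path]
    # --no-owner skips restoring object ownership, which helps when the
    # roles from the original server do not exist locally.
    # The target database (dbname) is assumed to exist already.
    restore_cmd = ["pg_restore", "--no-owner", "--dbname", dbname, dump_path]
    return list_cmd, restore_cmd


if __name__ == "__main__":
    # Placeholder names; substitute the actual downloaded file and database.
    for cmd in build_pg_restore_commands("fatcat_snapshot.tar", "fatcat_scratch"):
        print(shlex.join(cmd))
```

Listing the contents first is a cheap way to check that the download is intact and to see which tables (for example, the 'abstracts' table with its separate licensing caveat) are present before committing to a full restore.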
Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data: 27 views, 1 favorite, 0 comments

This item contains a complete PostgreSQL SQL database snapshot from https://fatcat.wiki, in binary 'pg_dump tar mode' format. With the exception of the 'abstracts' table (for which no aggregate license or copyright claims can be made; downstream users are responsible for their use), all metadata here is licensed CC-0 (public domain release) and may be used for any purpose. Downstream users are strongly encouraged to provide attribution and link here to the snapshot, as well as give credit to...
Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data: 28 views, 0 favorites, 0 comments

DIRECT-OA-CRAWL-2019
by Internet Archive Web Group
collection: 2,566 items, 6.1M views

Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data: 78 views, 0 favorites, 0 comments

UNPAYWALL-PDF-CRAWL-2019-04
by Internet Archive Web Group
data: 0 views, 0 favorites, 0 comments

Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data: 10 views, 0 favorites, 0 comments

Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data: 22 views, 1 favorite, 0 comments

Fatcat Database Snapshots and Bulk Metadata Exports
by Internet Archive Web Group
data: 19 views, 0 favorites, 0 comments

See README.md
Web PDF GROBID Corpus (June 2019)
by Internet Archive Web Group
collection: 10 items, 52 views