Bulk Bibliographic Metadata
data

eye 20

favorite 0

comment 0

Bulk Bibliographic Metadata
data

eye 7

favorite 0

comment 0

Bulk Bibliographic Metadata
data

eye 7

favorite 0

comment 0

Bulk Bibliographic Metadata
data

eye 11

favorite 0

comment 0

Bulk Bibliographic Metadata
by ORCID, Inc.
data

eye 15

favorite 0

comment 0

This item contains an annual copy of the ORCID public data file, as originally downloaded from: https://orcid.figshare.com/articles/dataset/ORCID_Public_Data_File_2021/16750535 See also: https://info.orcid.org/orcids-2021-public-data-file-is-now-available More details about this content and its use are available at: https://orcid.org/content/orcid-public-data-file This dataset is available under the public domain (CC-0).
Bulk Bibliographic Metadata
by OpenAPC
data

eye 18

favorite 0

comment 0

Downloaded from:  https://github.com/OpenAPC/openapc-de/blob/master/data/apc_de.csv See also:  https://openapc.github.io/about/
Bulk Bibliographic Metadata
data

eye 6

favorite 0

comment 0

Mirrored from: https://isaw.nyu.edu/publications/awol-index/ Note creator request: The content of The AWOL Index is derived from: Charles E. Jones, AWOL - The Ancient World Online (ISSN 2156-2253), 2009-. That content is re-used and re-mixed here under the terms of AWOL's Creative Commons Attribution Share-Alike 3.0 Unported license. The production and publication of The AWOL Index contributes significant additional value both to the content itself and to its presentation...
Bulk Bibliographic Metadata
by Allen Institute for Artificial Intelligence
data

eye 33

favorite 0

comment 0

This is a backup of the "Open Academic Search" corpus, published by Semantic Scholar / Allen Institute for AI. For more info see http://labs.semanticscholar.org/corpus/. In particular, note the terms and conditions, and the request: We request that any published research that makes use of this data cites the following paper: Waleed Ammar et al. 2018. Construction of the Literature Graph in Semantic Scholar. NAACL. ...
Bulk Bibliographic Metadata
by Library Genesis
data

eye 81

favorite 0

comment 0

Snapshot as of 2019-04-15. Contains SQL dumps for multiple databases: complete Library Genesis; comic book database; fiction database; 'compact' Library Genesis database; scientific magazines. The SQL dumps were generated by a MySQL/MariaDB database. *** THIS ITEM DOES NOT CONTAIN ANY BOOKS *** Upstream does not provide checksums, so all checksums should be treated with some doubt. The databases were archived upstream with the RAR archiver; file names have been changed to include the creation date.
Bulk Bibliographic Metadata
by Impactstory
data

eye 37

favorite 0

comment 0

A mirror of the Unpaywall (aka oaDOI.org) metadata corpus, primarily consisting of public open access flags for a large number of Crossref-registered DOIs (identifiers representing published journal articles and other works). For more information see: http://unpaywall.org/products/snapshot
Bulk Bibliographic Metadata
by Wikimedia Research
data

eye 12

favorite 0

comment 0

Contains (at least) a list of DOIs cited by various language Wikipedias as of March 2018. Transformed by Charles using lists linked from https://blog.wikimedia.org/2018/04/05/ten-most-cited-sources-wikipedia/
Bulk Bibliographic Metadata
by Crossref
data

eye 76

favorite 0

comment 0

'crossref-works.json.xz' is the original file. 'works_crossref.elasticsearch.json.gz' contains a subset of metadata for most (but not all) works, restructured to be loaded directly into an Elasticsearch index. DOI: 10.6084/m9.figshare.4816720.v1 Via: https://figshare.com/articles/Metadata_for_all_DOIs_in_Crossref_JSON_MongoDB_exports_of_all_works_from_the_Crossref_API/4816720
Mirrored from:  https://www.arc.gov.au/excellence-research-australia/era-2018-journal-list
Bulk Bibliographic Metadata
by NCBI
data

eye 9

favorite 0

comment 0

Downloaded from: ftp://ftp.ncbi.nlm.nih.gov/pubmed/J_Entrez.txt
This item contains a transformed copy (single gzip'd JSON-per-line file, instead of tarball of xz-zipped JSON per-source files) of the metadata in item https://archive.org/details/core_oa_metadata_20180301. All the same licenses and caveats apply.
Bulk Bibliographic Metadata
by Allen Institute for Artificial Intelligence
data

eye 16

favorite 0

comment 0

This is a mirror of the Semantic Scholar Graph of References in Context (GORC) dataset. Use of this dataset is under terms of the Semantic Scholar Dataset License: http://web.archive.org/web/20200118202545/http://api.semanticscholar.org/corpus/legal/ See also: https://github.com/allenai/s2-gorc https://arxiv.org/abs/1911.02782
Bulk Bibliographic Metadata
by Allen Institute for Artificial Intelligence
data

eye 56

favorite 0

comment 0

This is a backup of the "Open Academic Search" corpus, published by Semantic Scholar / Allen Institute for AI. For more info see http://labs.semanticscholar.org/corpus/. In particular, note the terms and conditions: Semantic Scholar Open Research Corpus is licensed under ODC-BY. When using the Semantic Scholar Open Research Corpus (“S2 ORC”) in a product or service, or including data in a redistribution, please cite the following paper: Waleed Ammar et al. 2018. Construction...
Bulk Bibliographic Metadata
by Cariniana
data

eye 21

favorite 0

comment 0

Downloaded from, eg:  https://cariniana.ibict.br/index.php/preservacao-de-publicacoes-digitais/periodicos-eletronicos
Bulk Bibliographic Metadata
by Internet Archive Web Group
data

eye 61

favorite 0

comment 0

This item contains sqlite3 database snapshots, URL crawl status, and other metadata useful for doing analytics on journal OA coverage, homepage status, etc. Particularly in the context of https://fatcat.wiki. Source code: https://github.com/bnewbold/chocula
Bulk Bibliographic Metadata
by dblp
data

eye 5

favorite 0

comment 0

Bulk Bibliographic Metadata
by Internet Archive Web Group
data

eye 5

favorite 0

comment 0

Bulk Bibliographic Metadata
by Japan Link Center
data

eye 46

favorite 0

comment 0

Downloaded from http://japanlinkcenter.org/top/material/material_metadata.html
Bulk Bibliographic Metadata
data

eye 167

favorite 0

comment 0

Bulk Bibliographic Metadata
data

eye 7

favorite 0

comment 0

Bulk Bibliographic Metadata
by Japan Link Center
data

eye 29

favorite 0

comment 0

Downloaded from http://japanlinkcenter.org/top/material/material_metadata.html
Bulk Bibliographic Metadata
data

eye 9

favorite 0

comment 0

Bulk Bibliographic Metadata
by Internet Archive Web Group
data

eye 9

favorite 0

comment 0

About 1 million unique PDFs from Global Wayback before year 2000.
Bulk Bibliographic Metadata
by Internet Archive Web Group
data

eye 9

favorite 0

comment 0

Snapshot of Internet Archive (petabox) file-level metadata (eg, PDF hashes) for files under the 'journals' collection as of December 2018. Note: includes a small number of items not actually under the 'journals' collection hierarchy due to how the input item list was generated, and a small fraction (estimate 500?) of items didn't dump successfully. A bit sloppy!
Bulk Bibliographic Metadata
by Internet Archive Web Group
data

eye 7

favorite 0

comment 0

Bulk Bibliographic Metadata
data

eye 8

favorite 0

comment 0

Mirrored from:  https://github.com/njahn82/vanished_journals/tree/master/data
Bulk Bibliographic Metadata
data

eye 15

favorite 0

comment 0

This is the 2020 "baseline" PubMed/MEDLINE bibliographic metadata corpus, originally published in December 2019. Downloaded from https://www.nlm.nih.gov/databases/download/pubmed_medline.html
Bulk Bibliographic Metadata
by EZB
data

eye 17

favorite 0

comment 0

See README for details. Scraped from: http://ezb.uni-regensburg.de/ezeit/services/collections.phtml?bibid=AAAAA&colors=1&lang=en http://ezb.uni-regensburg.de/ezeit/services/xmloutput.phtml?bibid=AAAAA&colors=1&lang=de#6.2
Bulk Bibliographic Metadata
by Microsoft Academic
data

eye 64

favorite 0

comment 0

This is an updated snapshot of the Microsoft Academic Graph corpus. Microsoft generously makes this corpus available at no cost under the ODC-BY "open data license" ( https://opendatacommons.org/licenses/by/1.0/ ). See the link for details; at a minimum this license requires downstream users to acknowledge the creator. You can read more about the corpus, including how to obtain updated copies on Microsoft Azure, a schema reference, etc, at the following URLs and in the following...
Bulk Bibliographic Metadata
data

eye 3

favorite 0

comment 0

Bulk Bibliographic Metadata
by Internet Archive Web Group
data

eye 30

favorite 0

comment 0

This dump includes all tables (including oauth authentication tables which could be a privacy, but not security, concern). At this time only IA staff have accounts, so the snapshot, which is intended mostly for disaster recovery, is still public.
Bulk Bibliographic Metadata
by Impactstory
data

eye 36

favorite 0

comment 0

A mirror of the Unpaywall (aka oaDOI.org) metadata corpus, primarily consisting of public open access flags for a large number of Crossref-registered DOIs (identifiers representing published journal articles and other works). For more information see: http://unpaywall.org/products/snapshot
Bulk Bibliographic Metadata
data

eye 17

favorite 0

comment 0

Bulk Bibliographic Metadata
data

eye 30

favorite 0

comment 0

Bulk Bibliographic Metadata
data

eye 12

favorite 0

comment 0

Downloaded from: https://grid.ac/downloads
Bulk Bibliographic Metadata
by Bruns A, Lenke C, Schmidt C, Taubert NC
data

eye 20

favorite 0

comment 0

ISSN-GOLD-OA provides a matching list of ISSN for Gold Open Access (OA) journals. The intention was to compile a matching table that is as complete as possible by using different publicly available sources. The data set offers a basis for various journal-related issues in bibliometric studies on Gold OA. The list is an updated version of ISSN-GOLD-OA. For a detailed description of the method, data sources used and the definition of the table fields, please refer to the original...
Bulk Bibliographic Metadata
data

eye 10

favorite 0

comment 0

Bulk Bibliographic Metadata
data

eye 22

favorite 0

comment 0

Bulk Bibliographic Metadata
data

eye 6

favorite 0

comment 0

Bulk Bibliographic Metadata
by Microsoft Academic
data

eye 141

favorite 0

comment 0

This is a mirror of the RDF dump posted at:  http://ma-graph.org/rdf-dumps/ The license provided with this metadata is: Open Data Commons Attribution License (ODC-By) v1.0
Bulk Bibliographic Metadata
data

eye 24

favorite 0

comment 0

OAI-PMH metadata collected from the arxiv.org endpoint, using the arXivRaw schema. Collected in two batches: up through ~2017, then up through May 22nd, 2019.
Bulk Bibliographic Metadata
by Japan Link Center
data

eye 14

favorite 0

comment 0

Downloaded from http://japanlinkcenter.org/top/material/material_metadata.html
Bulk Bibliographic Metadata
data

eye 10

favorite 0

comment 0

This item contains snapshots of the PubMed Central OA subset file manifests, linked from https://www.ncbi.nlm.nih.gov/pmc/tools/openftlist
Bulk Bibliographic Metadata
data

eye 8

favorite 0

comment 0

Bulk Bibliographic Metadata
data

eye 34

favorite 0

comment 0

Downloaded from: https://zenodo.org/record/1438356
Bulk Bibliographic Metadata
data

eye 13

favorite 0

comment 0

Bulk Bibliographic Metadata
by Impactstory
data

eye 7

favorite 0

comment 0

Bulk Bibliographic Metadata
by Allen Institute for Artificial Intelligence
data

eye 15

favorite 0

comment 0

This is a backup of the "Open Academic Search" corpus, published by Semantic Scholar / Allen Institute for AI. For more info see http://labs.semanticscholar.org/corpus/. In particular, note the terms and conditions, and the request: We request that any published research that makes use of this data cites the following paper: Waleed Ammar et al. 2018. Construction of the Literature Graph in Semantic Scholar. NAACL. ...
Bulk Bibliographic Metadata
by aminer.org
data

eye 246

favorite 0

comment 0

A copy of the "Open Academic Graph" corpus published by aminer.org and Microsoft Academic Graph in Summer 2017. Contains almost 120 GB (compressed) of bibliographic metadata for hundreds of millions of publications. Related publications include: Jie Tang, Jing Zhang, Limin Yao, Juanzi Li, Li Zhang, and Zhong Su. ArnetMiner: Extraction and Mining of Academic Social Networks. In Proceedings of the Fourteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining...
Bulk Bibliographic Metadata
data

eye 20

favorite 0

comment 0

This is the 2019 "baseline" PubMed/MEDLINE bibliographic metadata corpus, originally published in December 2018. Downloaded from https://www.nlm.nih.gov/databases/download/pubmed_medline.html
Bulk Bibliographic Metadata
by Wikipedia Editors
data

eye 29

favorite 0

comment 0

This is a corpus of millions of citations from Wikipedia articles, for a subset of language wikis, created using the wikiciteparser Python library. 
Bulk Bibliographic Metadata
by Internet Archive Web Group
data

eye 24

favorite 0

comment 0

This item contains some bulk research affiliation datasets from Internet Archive cataloging efforts. These are mostly strings included in research papers that indicate the institutional affiliations of specific authors (eg, with a home department, university, or company) at the time of publication. These might be useful datasets for efforts to build complete indices of research organizations, or to test normalization code that maps raw strings to organization identifiers. Attribution and links...
Bulk Bibliographic Metadata
data

eye 27

favorite 0

comment 0

Bulk Bibliographic Metadata
by Internet Archive Web Group
data

eye 20

favorite 0

comment 0

This item contains SPARQL query exports of journal metadata from wikidata; in particular ISSN/QID mappings. The SPARQL query run is included as wikidata.sparql
Bulk Bibliographic Metadata
by Directory of Open Access Journals
data

eye 50

favorite 0

comment 0

From: https://doaj.org/public-data-dump
Bulk Bibliographic Metadata
by ORCID, Inc.
data

eye 168

favorite 0

comment 0

This item contains an annual copy of the ORCID public data file, as originally downloaded from: https://orcid.figshare.com/articles/dataset/ORCID_Public_Data_File_2020/13066970 More details about this content and its use are available at: https://orcid.org/content/orcid-public-data-file This dataset is available under the public domain (CC-0).
Bulk Bibliographic Metadata
data

eye 11

favorite 0

comment 0

Bulk Bibliographic Metadata
by Internet Archive Web Group
data

eye 33

favorite 0

comment 0

A mirror of the Unpaywall (aka oaDOI.org) metadata corpus, primarily consisting of public open access flags for a large number of Crossref-registered DOIs (identifiers representing published journal articles and other works). For more information see: http://unpaywall.org/products/snapshot
Bulk Bibliographic Metadata
by SciELO
data

eye 11

favorite 0

comment 0

Bulk Bibliographic Metadata
by Impactstory
data

eye 20

favorite 0

comment 0

A mirror of the Unpaywall (aka oaDOI.org) metadata corpus, primarily consisting of public open access flags for a large number of Crossref-registered DOIs (identifiers representing published journal articles and other works). For more information see: http://unpaywall.org/products/snapshot
Bulk Bibliographic Metadata
by OurResearch
data

eye 147

favorite 0

comment 1

This is an archive of the "beta" pre-release of the OpenAlex bibliographic metadata corpus. It was downloaded from AWS S3 "requester pays" bucket, then the individual files were compressed with gzip (pigz command), which reduced on-disk size significantly. Downloads of some files needed to be restarted, which seems to have worked ok, but potentially could have introduced corruption. This initial snapshot is dated in file names as "2021-10-11", and that date is used...
Bulk Bibliographic Metadata
by Internet Archive Web Group
data

eye 8

favorite 0

comment 0

This item contains hash lists of PDF files crawled from the public web specifically to preserve the scholarly record. It does not contain hashes of *all* PDFs the archive has ever seen, only a subset. Not all of these hashes are necessarily journal articles or other research outputs, but we have reason to believe the large majority are.
Bulk Bibliographic Metadata
by Internet Archive Web Group
data

eye 11

favorite 0

comment 0

This item contains datasets of homepage URLs found by hand using search engines and bibliographic metadata (eg, ISSN and journal title). The "long-tail" batch contains about 4,600 journal lookup results, with about 3,900 successful homepage URLs found. The list of journals was created in May 2020, and the lookup work completed in June 2020. IA staff member Richard Greydanus ran this batch of lookups. All of this metadata can be considered public domain, or CC-0 (Creative Commons Zero)...
Bulk Bibliographic Metadata
data

eye 14

favorite 0

comment 0

Bulk Bibliographic Metadata
data

eye 8

favorite 0

comment 0

Bulk Bibliographic Metadata
by Crossref
data

eye 91

favorite 0

comment 0

Mirrored via torrent from academic torrents: https://academictorrents.com/details/0c6c3fbfdc13f0169b561d29354ea8b188eb9d63 https://www.crossref.org/blog/free-public-data-file-of-112-million-crossref-records/
Bulk Bibliographic Metadata
data

eye 11

favorite 0

comment 0

Bulk Bibliographic Metadata
data

eye 8

favorite 0

comment 0

Bulk Bibliographic Metadata
by aminer.org
data

eye 701

favorite 0

comment 0

A copy of the "Open Academic Graph v2" (OAGv2) corpus published by aminer.org and Microsoft Academic Graph in early 2019. Contains roughly 90 GB (compressed) of bibliographic metadata for hundreds of millions of publications. Related publications include: Jie Tang, Jing Zhang, Limin Yao, Juanzi Li, Li Zhang, and Zhong Su. ArnetMiner: Extraction and Mining of Academic Social Networks. In Proceedings of the Fourteenth ACM SIGKDD International Conference on Knowledge Discovery and Data...
Bulk Bibliographic Metadata
by Internet Archive
data

eye 8

favorite 0

comment 0

This item contains KBART files of Internet Archive "serials" (aka, journals, magazines, conference proceedings, other periodicals) preservation holdings. They include both digitized content in archive.org, and web archived content ("fatcat").
Bulk Bibliographic Metadata
data

eye 8

favorite 0

comment 0

Bulk Bibliographic Metadata
data

eye 21

favorite 0

comment 0

Bulk Bibliographic Metadata
data

eye 55

favorite 0

comment 0

This item contains a set of "Keeper's Reports" summarizing journal content preservation coverage from major archival services and networks (Portico, LOCKSS, CLOCKSS).
Bulk Bibliographic Metadata
data

eye 19

favorite 0

comment 0

Bulk Bibliographic Metadata
by dblp
data

eye 22

favorite 0

comment 0

Bulk Bibliographic Metadata
by Impactstory
data

eye 117

favorite 0

comment 0

A mirror of the Unpaywall (aka oaDOI.org) metadata corpus, primarily consisting of public open access flags for a large number of Crossref-registered DOIs (identifiers representing published journal articles and other works). For more information see: http://unpaywall.org/products/snapshot
Bulk Bibliographic Metadata
data

eye 10

favorite 0

comment 0

Bulk Bibliographic Metadata
by Jan Szczepanski
data

eye 21

favorite 0

comment 0

Downloaded from: https://www.ebsco.com/sites/g/files/nabnos191/files/acquiadam-assets/Jan-Szczepanski-Open-Access-Journals-2018_0.docx
Bulk Bibliographic Metadata
data

eye 11

favorite 0

comment 0

Bulk Bibliographic Metadata
by dblp
data

eye 7

favorite 0

comment 0

Bulk Bibliographic Metadata
by dblp
data

eye 27

favorite 0

comment 0

Bulk Bibliographic Metadata
data

eye 40

favorite 0

comment 0

Bulk Bibliographic Metadata
data

eye 2

favorite 0

comment 0

Bulk Bibliographic Metadata
by Internet Archive Web Group
data

eye 39

favorite 0

comment 0

This item contains a complete PostgreSQL SQL database snapshot from https://fatcat.wiki, in binary 'pg_dump tar mode' format. With the exception of the 'abstracts' table (for which no aggregate license or copyright claims can be made; downstream users are responsible for their use), all metadata here is licensed CC-0 (public domain release) and may be used for any purpose. Downstream users are strongly encouraged to provide attribution and link here to the snapshot, as well as give credit to...
Bulk Bibliographic Metadata
by moreo.info
data

eye 7

favorite 0

comment 0

Bulk Bibliographic Metadata
data

eye 20

favorite 0

comment 0

This item contains work-level metadata about papers on academia.edu, obtained through their OAI-PMH interface.
Bulk Bibliographic Metadata
data

eye 7

favorite 0

comment 0

Bulk Bibliographic Metadata
by Allen Institute for Artificial Intelligence
data

eye 185

favorite 0

comment 0

Semantic Scholar Open Research Corpus is licensed under ODC-BY. When using the Semantic Scholar Open Research Corpus (“S2 ORC”) in a product or service, or including data in a redistribution, please cite the following paper: Waleed Ammar et al. 2018. Construction of the Literature Graph in Semantic Scholar. NAACL https://www.semanticscholar.org/paper/09e3cf5704bcb16e6657f6ceed70e93373a54618 This site is provided by The Allen Institute for Artificial Intelligence (“AI2”) as a service...
Bulk Bibliographic Metadata
by creator
data

eye 33

favorite 0

comment 0

Bulk Bibliographic Metadata
by JURN
data

eye 13

favorite 0

comment 0

JURN is a scholarly web search engine implemented as a custom Google search index. A subset of its resources is included in a directory at: http://www.jurn.org/directory/ This item contains snapshots of the directory in the form of TSV files. At least to start, these contain only title + URL, but we hope to reconcile them with, or look them up by, ISSN.
Bulk Bibliographic Metadata
data

eye 9

favorite 0

comment 0

This item contains a set of "Keeper's Reports" summarizing journal content preservation coverage from major archival services and networks (Portico, LOCKSS, CLOCKSS). See README for links to where these files were downloaded from.
Bulk Bibliographic Metadata
by Directory of Open Access Journals
data

eye 50

favorite 0

comment 0

Downloaded from https://doaj.org/csv and the OAI-PMH interface. File names encode the date when data was downloaded.
Bulk Bibliographic Metadata
by Japan Link Center
data

eye 28

favorite 0

comment 0

Downloaded from http://japanlinkcenter.org/top/material/material_metadata.html
Bulk Bibliographic Metadata
by Allen Institute for Artificial Intelligence
data

eye 24

favorite 0

comment 0

This is a snapshot of the AI2 (Semantic Scholar) "Open Research Corpus". These files were originally downloaded from: http://labs.semanticscholar.org/corpus/ Note the restrictions in the 'license.txt' file. 'index.html' is a backup of the landing page, which includes field content. 'papers-*-sample.zip' is a subset of the data useful for exploration. Semantic Scholar is a project of the Allen Institute for Artificial Intelligence.
Bulk Bibliographic Metadata
by DOAJ
data

eye 9

favorite 0

comment 0

Bulk Bibliographic Metadata
data

eye 21

favorite 1

comment 0