Bulk Bibliographic Metadata
Jun 1, 2017 CiteSeerX Group at PSU
data

eye 193

favorite 0

comment 0

This is a mirror of a CiteSeerX database dump, downloaded from S3. It's hosted here for easy Internet Archive analytics access, and so we don't need to re-pay S3 download fees. See also: http://csxstatic.ist.psu.edu/about/data
Bulk Bibliographic Metadata
Jun 26, 2017 Allen Institute for Artificial Intelligence
data

eye 51

favorite 0

comment 0

This is a snapshot of the AI2 (Semantic Scholar) "Open Research Corpus", as downloaded June 26th, 2017. These files were originally downloaded from: http://labs.semanticscholar.org/corpus/ Note restrictions in the 'license.txt' file. 'index.html' is a backup of the landing page, which includes field content. 'papers-2017-02-21-sample.zip' is a subset of the data useful for exploration. Semantic Scholar is a project of the Allen Institute for Artificial Intelligence.
Bulk Bibliographic Metadata
Jun 26, 2017 Microsoft Academic Search
data

eye 323

favorite 0

comment 0

This is a copy of the Microsoft Academic Graph corpus of scholarly publications and citations, based on crawls from the open web. Metadata (authors, DOI numbers, journals, citations, keywords, affiliations, etc) is included for more than 125 million publications. The corpus is a single 27GB zipfile that extracts into about 96GB of flat tab-separated text files, cross-referenced using identifier columns. Schema information can be found in the `readme.txt` file, and usage restrictions can be...
Bulk Bibliographic Metadata
Jul 3, 2017 ROAD: Directory of Open Access Scholarly Resources
data

eye 90

favorite 0

comment 0

This is a backup of ROAD/ISSN metadata, downloaded July 3rd, 2017 from http://road.issn.org/en/contenu/download-road-records Dumps in both MARC XML and RDF format are included. These files are under the Creative Commons Attribution-NonCommercial 4.0 International Public License (aka, CC-BY-NC).
Topic: metadata
Bulk Bibliographic Metadata
Aug 29, 2017 Crossref
data

eye 76

favorite 0

comment 0

'crossref-works.json.xz' is the original file. 'works_crossref.elasticsearch.json.gz' contains a subset of metadata for most (but not all) works, restructured to be loaded directly into an Elasticsearch index. DOI: 10.6084/m9.figshare.4816720.v1 Via: https://figshare.com/articles/Metadata_for_all_DOIs_in_Crossref_JSON_MongoDB_exports_of_all_works_from_the_Crossref_API/4816720
Bulk Bibliographic Metadata
Sep 12, 2017
data

eye 20

favorite 0

comment 0

Copy of the MEDLINE 2017 Baseline of PubMed metadata, provided by the US National Library of Medicine (NLM).
Bulk Bibliographic Metadata
Sep 19, 2017
data

eye 182

favorite 0

comment 0

Manifest of Internet Archive's identified scholarly works in digital form (eg, journal articles). See README.html for details.
Bulk Bibliographic Metadata
Sep 21, 2017
data

eye 342

favorite 0

comment 0

A snapshot of the oaDOI DOI/URL database, including open access status for each paper. oaDOI is the API backing unpaywall; see oadoi.org for more details. This dataset is intended for NON-COMMERCIAL USE ONLY; contact oaDOI for details or commercial support.
Bulk Bibliographic Metadata
Sep 27, 2017 aminer.org
data

eye 246

favorite 0

comment 0

A copy of the "Open Academic Graph" corpus published by aminer.org and Microsoft Academic Graph in Summer 2017. Contains almost 120 GB (compressed) of bibliographic metadata for hundreds of millions of publications. Related publications include: Jie Tang, Jing Zhang, Limin Yao, Juanzi Li, Li Zhang, and Zhong Su. ArnetMiner: Extraction and Mining of Academic Social Networks. In Proceedings of the Fourteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining...
Bulk Bibliographic Metadata
Oct 3, 2017
data

eye 20

favorite 0

comment 0

This item contains work-level metadata about papers on academia.edu, obtained through their OAI-PMH interface.
Bulk Bibliographic Metadata
data

eye 116

favorite 0

comment 0

Manifest of Internet Archive's identified scholarly works in digital form (eg, journal articles). See README.html for details.
Bulk Bibliographic Metadata
Dec 10, 2017 CORE
data

eye 18

favorite 0

comment 0

This item contains mappings between CORE (https://core.ac.uk/) internal identifiers (simple integer numbers) and DOIs. This listing (a simple two-column TSV file) is derived from their publicly available metadata corpus.
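A two-column TSV mapping like the one described above can be loaded into a dictionary with the standard library. The following is only a sketch: the function name, the column order (CORE identifier first, DOI second), and the lowercasing of DOIs are assumptions for illustration, not details confirmed by the item.

```python
import csv


def load_core_doi_map(path):
    """Load a two-column TSV into a dict of {core_id: doi}.

    Assumes (hypothetically) that column 1 is the integer CORE identifier
    and column 2 is the DOI. Rows that do not fit that shape, such as a
    header row, are skipped.
    """
    mapping = {}
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) != 2:
                continue  # malformed or empty line
            core_id, doi = row
            try:
                key = int(core_id)
            except ValueError:
                continue  # header row or non-integer identifier
            # DOIs are case-insensitive, so normalize to lowercase.
            mapping[key] = doi.strip().lower()
    return mapping
```

Lowercasing is a design choice that makes later joins against other DOI lists simpler; drop it if the original casing matters.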
Bulk Bibliographic Metadata
Dec 14, 2017 Allen Institute for Artificial Intelligence
data

eye 24

favorite 0

comment 0

This is a snapshot of the AI2 (Semantic Scholar) "Open Research Corpus". These files were originally downloaded from: http://labs.semanticscholar.org/corpus/ Note restrictions in the 'license.txt' file. 'index.html' is a backup of the landing page, which includes field content. 'papers-*-sample.zip' is a subset of the data useful for exploration. Semantic Scholar is a project of the Allen Institute for Artificial Intelligence.
Bulk Bibliographic Metadata
Dec 14, 2017
data

eye 93

favorite 0

comment 0

Downloaded from https://core.ac.uk/services "The data aggregated from repositories by the CORE system can be accessed in two ways, through the CORE API or by downloading the data to your computer. The former option is practical if you want to build a service on top of CORE while the latter is something we recommend to those who would like to analyse the CORE dataset and/or apply some computationally intensive batch processes. If you use CORE in your work, we kindly request you to cite one...
Bulk Bibliographic Metadata
Dec 14, 2017
data

eye 64

favorite 0

comment 0

Downloaded from https://core.ac.uk/services "The data aggregated from repositories by the CORE system can be accessed in two ways, through the CORE API or by downloading the data to your computer. The former option is practical if you want to build a service on top of CORE while the latter is something we recommend to those who would like to analyse the CORE dataset and/or apply some computationally intensive batch processes. If you use CORE in your work, we kindly request you to cite one...
Bulk Bibliographic Metadata
Dec 14, 2017 Directory of Open Access Journals
data

eye 49

favorite 0

comment 0

Downloaded from https://doaj.org/csv and the OAI-PMH interface.
Bulk Bibliographic Metadata
Dec 14, 2017 ISSN
data

eye 429

favorite 1

comment 0

Unlike most ISSN metadata, this mapping file is publicly available.
Bulk Bibliographic Metadata
Dec 15, 2017 Sci-Hub
data

eye 264

favorite 0

comment 0

On 2017-03-19, the Twitter user @Sci_Hub posted a list of 62,835,101 DOIs contained in Sci-Hub: https://twitter.com/Sci_Hub/status/843546352219017218 This item contains a copy of the list. This item contains no PDFs, papers, fulltext, or other copyrighted content. Important note: not all DOIs in this list are valid (ie, some do not resolve via doi.org).
Bulk Bibliographic Metadata
Dec 15, 2017 Datacite
data

eye 67

favorite 0

comment 0

This item contains snapshots of the Datacite OAI-PMH metadata feed, as captured with the tool 'metha'.
Bulk Bibliographic Metadata
data

eye 13

favorite 0

comment 0

Standard paper bibliographic metadata corpuses (eg, Crossref, Pubmed, Arxiv) transformed into simple tab-separated and JSON formats.
Bulk Bibliographic Metadata
Dec 18, 2017
data

eye 141

favorite 0

comment 0

Manifest of Internet Archive's identified scholarly works in digital form (eg, journal articles). See README.html for details.
Bulk Bibliographic Metadata
Jan 20, 2018
data

eye 21

favorite 0

comment 0

Bulk Bibliographic Metadata
Jan 24, 2018 Crossref
data

eye 655

favorite 2

comment 0

This file is a snapshot dump of the Crossref DOI metadata API, containing entries for over 94 million DOIs. Compared to the previous 2017-03 version (see archive.org item "crossref_doi_dump_201703"), this snapshot has a few million more works, but the corpus size is much larger (29 GB compressed vs. 7 GB compressed) as it now contains significantly more citation data, due to the efforts of the Initiative for Open Citations (I4OC) project. This was generated by running the scripts...
Bulk Bibliographic Metadata
Jan 24, 2018 Directory of Open Access Journals
data

eye 93

favorite 0

comment 0

Downloaded from https://doaj.org/csv and the OAI-PMH interface. File names encode the date when data was downloaded.
Bulk Bibliographic Metadata
Jan 25, 2018 ROAD: Directory of Open Access Scholarly Resources
data

eye 145

favorite 0

comment 0

This is a backup of ROAD/ISSN metadata from http://road.issn.org/en/contenu/download-road-records Dumps in both MARC XML and RDF format are included; see sub-directory for date of download. See also earlier July 2017 dump at: https://archive.org/download/road-issn-2017 These files are under the Creative Commons Attribution-NonCommercial 4.0 International Public License (aka, CC-BY-NC).
Topic: metadata
Bulk Bibliographic Metadata
data

eye 24

favorite 0

comment 0

Standard paper bibliographic metadata corpuses (eg, Crossref, Pubmed, Arxiv) transformed into simple tab-separated and JSON formats.
Bulk Bibliographic Metadata
Jan 30, 2018
data

eye 249

favorite 0

comment 0

Manifest of Internet Archive's identified scholarly works in digital form (eg, journal articles). See README.html for details.
Bulk Bibliographic Metadata
Mar 1, 2018 Norwegian Centre for Research Data
data

eye 19

favorite 0

comment 0

This item contains a snapshot of the "Norwegian Register for Scientific Journals, Series and Publishers", as downloaded from https://dbh.nsd.uib.no/publiseringskanaler/AlltidFerskListe. As the name indicates, this is a registry of international Journals (aka "titles", or "serials"); the scope is not limited to Norwegian or Nordic publications.
Bulk Bibliographic Metadata
Mar 9, 2018
data

eye 17

favorite 0

comment 0

This item contains a set of "Keeper's Reports" summarizing journal content preservation coverage from major archival services and networks (Portico, LOCKSS, CLOCKSS). See README for links to where these files were downloaded from.
Topics: Keeper's Reports, Metadata, Preservation
Bulk Bibliographic Metadata
Mar 23, 2018 OCLC
data

eye 17

favorite 0

comment 0

This is a copy of the VIAF ("Virtual International Authority File") as downloaded from OCLC on 2018-03-07. Download urls are in the original_urls.txt text file. See also: https://viaf.org/viaf/data/
Bulk Bibliographic Metadata
Mar 23, 2018 Sci-Hub
data

eye 40

favorite 0

comment 0

This item contains a dump of download statistics as downloaded from Sci-Hub (see original_urls.txt) in March, 2018.
Bulk Bibliographic Metadata
Apr 4, 2018 Crossref
data

eye 44

favorite 0

comment 1

Metadata from the Crossref DOI registrar about "titles" (aka, individual Journals), in CSV format. Originally fetched from: https://wwwold.crossref.org/titlelist/titleFile.csv
Bulk Bibliographic Metadata
Apr 5, 2018 Internet Archive Web Group
data

eye 97

favorite 0

comment 0

Data-munged title-level metadata combined from: DOAJ, ROAD, Norwegian Register, and Internet Archive crawled metadata. See SOURCES.md for URLs of upstream metadata, and ISSN_matching.html for Jupyter notebook used to derive this dataset.
Bulk Bibliographic Metadata
Apr 19, 2018 ORCID, Inc
data

eye 26

favorite 0

comment 0

This item contains an annual copy of the ORCID public data file, as originally downloaded from: https://orcid.org/content/download-file More details about this content and its use are available at: https://orcid.org/content/orcid-public-data-file This dataset is available under the public domain (CC-0). The DOI of this dataset is: https://doi.org/10.14454/07243.2013.001
Bulk Bibliographic Metadata
Apr 19, 2018 ORCID, Inc
data

eye 55

favorite 0

comment 0

This item contains an annual copy of the ORCID public data file, as originally downloaded from: https://orcid.org/content/download-file More details about this content and its use are available at: https://orcid.org/content/orcid-public-data-file This dataset is available under the public domain (CC-0). The DOI of this dataset is: https://doi.org/10.6084/m9.figshare.4134027
Bulk Bibliographic Metadata
Apr 19, 2018 ORCID, Inc
data

eye 170

favorite 0

comment 0

This item contains an annual copy of the ORCID public data file, as originally downloaded from: https://orcid.org/content/download-file More details about this content and its use are available at: https://orcid.org/content/orcid-public-data-file This dataset is available under the public domain (CC-0). The DOI of this dataset is: https://doi.org/10.6084/m9.figshare.5479792
Bulk Bibliographic Metadata
Apr 19, 2018 ORCID, Inc
data

eye 30

favorite 0

comment 0

This item contains an annual copy of the ORCID public data file, as originally downloaded from: https://orcid.org/content/download-file More details about this content and its use are available at: https://orcid.org/content/orcid-public-data-file This dataset is available under the public domain (CC-0). The DOI of this dataset is: https://doi.org/10.6084/m9.figshare.1582705
Bulk Bibliographic Metadata
Apr 19, 2018 ORCID, Inc
data

eye 48

favorite 0

comment 0

This item contains an annual copy of the ORCID public data file, as originally downloaded from: https://orcid.org/content/download-file More details about this content and its use are available at: https://orcid.org/content/orcid-public-data-file This dataset is available under the public domain (CC-0). The DOI of this dataset is: https://doi.org/10.14454/07243.2014.001
Bulk Bibliographic Metadata
May 9, 2018 Allen Institute for Artificial Intelligence
data

eye 42

favorite 0

comment 0

This is a snapshot of the AI2 (Semantic Scholar) "Open Research Corpus", as released May 3rd, 2018. These files were originally downloaded from AWS S3, via: http://labs.semanticscholar.org/corpus/ Note restrictions in the 'license.txt' file. 'index.html' is a backup of the landing page, which includes field content. 'sample-S2-records.gz' is a subset of the data useful for exploration. Semantic Scholar is a project of the Allen Institute for Artificial Intelligence.
Bulk Bibliographic Metadata
Jul 17, 2018 Impactstory
data

eye 47

favorite 0

comment 0

A mirror of the Unpaywall (aka oaDOI.org) metadata corpus, primarily consisting of public open access flags for a large number of Crossref-registered DOIs (identifiers representing published journal articles and other works). For more information see: http://unpaywall.org/products/snapshot
Bulk Bibliographic Metadata
Aug 13, 2018
data

eye 51

favorite 0

comment 0

Downloaded from https://core.ac.uk/services "The data aggregated from repositories by the CORE system can be accessed in two ways, through the CORE API or by downloading the data to your computer. The former option is practical if you want to build a service on top of CORE while the latter is something we recommend to those who would like to analyse the CORE dataset and/or apply some computationally intensive batch processes. If you use CORE in your work, we kindly request you to cite one...
Bulk Bibliographic Metadata
Sep 6, 2018 Wikidata Project
data

eye 83

favorite 0

comment 0

This item contains a copy of the 2018-09-03 snapshot of bibliographic metadata extracted from Wikidata. These datasets were downloaded from: http://uri.gbv.de/wikicite/20180903/ More information at: https://github.com/wikicite/wikicite-data#readme and http://wikicite.org/
Bulk Bibliographic Metadata
Sep 6, 2018 EuropePMC
data

eye 50

favorite 0

comment 0

Data mirrored from https://europepmc.org/downloads Contains a mapping between PubMed IDs (PMID), PubMedCentral IDs (PMCID), and DOI numbers, for over 29 million works.
Bulk Bibliographic Metadata
Sep 9, 2018 Internet Archive Web Group
data

eye 89

favorite 0

comment 0

This is a mapping between: - DOIs (Crossref) - PubMed PMID and PMCID (NIH) - CORE record identifier (core.ac.uk) - Wikidata QIDs See README and scripts for details.
Bulk Bibliographic Metadata
Sep 15, 2018 Internet Archive Web Group
data

eye 15

favorite 0

comment 0

This is a derivative of https://archive.org/download/ia_papers_manifest_2018-01-25, which contains JSON objects that can be inserted into a fatcat catalog.
Bulk Bibliographic Metadata
Sep 15, 2018 Internet Archive Web Group
data

eye 13

favorite 0

comment 0

Test runs of large-scale matching algorithms (sha1 to DOI). Will likely be obsolete soon, and not useful for others.
Bulk Bibliographic Metadata
Sep 22, 2018 Crossref
data

eye 485

favorite 1

comment 0

This file is a snapshot dump of the Crossref DOI metadata API, containing entries for over 99 million DOIs. This was generated by running the scripts at: https://github.com/greenelab/crossref (git commit: 768a49ba1d8ba1971f00471950514716a9f699c8) The script completed on 2018-09-20. Format is xz-compressed JSON (one JSON object per line).
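A dump in this format (xz-compressed, one JSON object per line) can be streamed without decompressing it to disk first. The sketch below uses only the Python standard library; the function names are hypothetical, and the key to count is left to the caller because the exact record layout of the dump is not described here.

```python
import json
import lzma


def iter_json_lines(path):
    """Yield one parsed JSON object per non-empty line of an .xz file."""
    # lzma.open in text mode decompresses incrementally, so the full
    # uncompressed corpus never has to fit in memory or on disk.
    with lzma.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_records_with_key(path, key):
    """Count records that are JSON objects containing `key` at top level."""
    return sum(
        1
        for record in iter_json_lines(path)
        if isinstance(record, dict) and key in record
    )
```

Streaming line by line keeps memory use flat regardless of corpus size, which matters for snapshots in the tens of gigabytes.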
Bulk Bibliographic Metadata
Sep 27, 2018 Internet Archive Web Group
data

eye 28

favorite 0

comment 0

Contains a TSV file with SHA1, file size, wayback URLs, and metadata extracted from PDF by GROBID. Not intended for external use, but might be of interest. DOES NOT CONTAIN FULLTEXT CONTENT.
Fatcat Database Snapshots and Bulk Metadata Exports
Sep 27, 2018 Internet Archive Web Group
data

eye 86

favorite 0

comment 0

Bulk Bibliographic Metadata
data

eye 30

favorite 0

comment 0

This item contains a transformed copy (single gzip'd JSON-per-line file, instead of tarball of xz-zipped JSON per-source files) of the metadata in item https://archive.org/details/core_oa_metadata_20180301. All the same licenses and caveats apply.
Bulk Bibliographic Metadata
Nov 29, 2018 Impactstory
data

eye 20

favorite 0

comment 0

A mirror of the Unpaywall (aka oaDOI.org) metadata corpus, primarily consisting of public open access flags for a large number of Crossref-registered DOIs (identifiers representing published journal articles and other works). For more information see: http://unpaywall.org/products/snapshot
Bulk Bibliographic Metadata
Dec 3, 2018 Japan Link Center
data

eye 46

favorite 0

comment 0

Downloaded from http://japanlinkcenter.org/top/material/material_metadata.html
Bulk Bibliographic Metadata
Dec 20, 2018 Internet Archive Web Group
data

eye 9

favorite 0

comment 0

Snapshot of Internet Archive (petabox) file-level metadata (eg, PDF hashes) for files under the 'journals' collection as of December 2018. Note: includes a small number of items not actually under the 'journals' collection hierarchy due to how the input item list was generated, and a small fraction (estimate 500?) of items didn't dump successfully. A bit sloppy!
Bulk Bibliographic Metadata
Dec 20, 2018 Internet Archive Web Group
data

eye 8

favorite 0

comment 0

This item contains hash lists of PDF files crawled from the public web specifically to preserve the scholarly record. It does not contain hashes of *all* PDFs the archive has ever seen, only a subset. Not all of these hashes are necessarily journal articles or other research outputs, but we have reason to believe the large majority are.
Bulk Bibliographic Metadata
Jan 20, 2019
data

eye 34

favorite 0

comment 0

Downloaded from: https://zenodo.org/record/1438356
Bulk Bibliographic Metadata
data

eye 12

favorite 0

comment 0

Downloaded from: https://grid.ac/downloads
Bulk Bibliographic Metadata
Jan 24, 2019 moreo.info
data

eye 7

favorite 0

comment 0

Bulk Bibliographic Metadata
Jan 24, 2019 NCBI
data

eye 9

favorite 0

comment 0

Downloaded from: ftp://ftp.ncbi.nlm.nih.gov/pubmed/J_Entrez.txt
Bulk Bibliographic Metadata
Jan 24, 2019 Directory of Open Access Journals
data

eye 50

favorite 0

comment 0

Downloaded from https://doaj.org/csv and the OAI-PMH interface. File names encode the date when data was downloaded.
Bulk Bibliographic Metadata
Jan 24, 2019
data

eye 9

favorite 0

comment 0

This item contains a set of "Keeper's Reports" summarizing journal content preservation coverage from major archival services and networks (Portico, LOCKSS, CLOCKSS). See README for links to where these files were downloaded from.
Bulk Bibliographic Metadata
Jan 24, 2019 Jan Szczepanski
data

eye 21

favorite 0

comment 0

Downloaded from: https://www.ebsco.com/sites/g/files/nabnos191/files/acquiadam-assets/Jan-Szczepanski-Open-Access-Journals-2018_0.docx
Bulk Bibliographic Metadata
Jan 28, 2019 JSTOR
data

eye 220

favorite 1

comment 0

As downloaded from: https://www.jstor.org/dfr/about/sample-datasets "The Early Journal Content (EJC) on JSTOR includes public domain journal articles published in the United States before 1923 and articles published in other countries before 1870, and includes discourse and scholarship in the arts and humanities, economics and politics, and in mathematics and other sciences. The EJC dataset includes full-text OCR and article-level metadata."
Bulk Bibliographic Metadata
Jan 29, 2019 Internet Archive Web Group
data

eye 7

favorite 0

comment 0

Fatcat Database Snapshots and Bulk Metadata Exports
Feb 5, 2019 Internet Archive Web Group
data

eye 19

favorite 0

comment 0

This item contains bulk metadata exported from https://fatcat.wiki. With the exception of the 'abstracts' file (for which no aggregate license or copyright claims can be made; downstream users are responsible for their use), all metadata here is licensed CC-0 (public domain release) and may be used for any purpose. Downstream users are strongly encouraged to provide attribution and link here to the snapshot, as well as give credit to upstream sources (including Crossref, ORCID, DOAJ, the ISSN...
Fatcat Database Snapshots and Bulk Metadata Exports
Feb 5, 2019 Internet Archive Web Group
data

eye 28

favorite 1

comment 0

This item contains a complete PostgreSQL SQL database snapshot from https://fatcat.wiki, in binary 'pg_dump tar mode' format. With the exception of the 'abstracts' table (for which no aggregate license or copyright claims can be made; downstream users are responsible for their use), all metadata here is licensed CC-0 (public domain release) and may be used for any purpose. Downstream users are strongly encouraged to provide attribution and link here to the snapshot, as well as give credit to...
Bulk Bibliographic Metadata
Feb 15, 2019 NCBI
data

eye 10

favorite 0

comment 0

This item contains snapshots of the PubMed Central OA subset file manifests, linked from https://www.ncbi.nlm.nih.gov/pmc/tools/openftlist
Bulk Bibliographic Metadata
Feb 15, 2019 aminer.org
data

eye 701

favorite 0

comment 0

A copy of the "Open Academic Graph v2" (OAGv2) corpus published by aminer.org and Microsoft Academic Graph in early 2019. Contains roughly 90 GB (compressed) of bibliographic metadata for hundreds of millions of publications. Related publications include: Jie Tang, Jing Zhang, Limin Yao, Juanzi Li, Li Zhang, and Zhong Su. ArnetMiner: Extraction and Mining of Academic Social Networks. In Proceedings of the Fourteenth ACM SIGKDD International Conference on Knowledge Discovery and Data...
Bulk Bibliographic Metadata
Feb 15, 2019 Allen Institute for Artificial Intelligence
data

eye 15

favorite 0

comment 0

This is a backup of the "Open Academic Search" corpus, published by Semantic Scholar / Allen Institute for AI. For more info see http://labs.semanticscholar.org/corpus/. In particular, note the terms and conditions, and the request: We request that any published research that makes use of this data cites the following paper: Waleed Ammar et al. 2018. Construction of the Literature Graph in Semantic Scholar. NAACL. ...
Bulk Bibliographic Metadata
Feb 15, 2019 NCBI
data

eye 20

favorite 0

comment 0

This is the 2019 "baseline" PubMed/MEDLINE bibliographic metadata corpus, originally published in December 2018. Downloaded from https://www.nlm.nih.gov/databases/download/pubmed_medline.html
Bulk Bibliographic Metadata
Feb 21, 2019 Wikimedia Research
data

eye 12

favorite 0

comment 0

Contains (at least) a list of DOIs cited by various language Wikipedias as of March 2018. Transformed by Charles using lists linked from https://blog.wikimedia.org/2018/04/05/ten-most-cited-sources-wikipedia/
Bulk Bibliographic Metadata
Apr 5, 2019 Directory of Open Access Journals
data

eye 50

favorite 0

comment 0

From: https://doaj.org/public-data-dump
Bulk Bibliographic Metadata
Apr 17, 2019 Library Genesis
data

eye 81

favorite 0

comment 0

Snapshot as of 2019-04-15, containing SQL dumps for multiple databases: the complete Library Genesis database, the comic book database, the fiction database, the 'compact' Library Genesis database, and scientific magazines. The SQL dumps were generated by a MySQL/MariaDB database. *** THIS ITEM DOES NOT CONTAIN ANY BOOKS *** Upstream does not provide checksums, so all checksums should be treated with some doubt. The databases were archived by upstream with the RAR archiver; file names have been changed to include the creation date.
Bulk Bibliographic Metadata
Apr 23, 2019 Impactstory
data

eye 36

favorite 0

comment 0

A mirror of the Unpaywall (aka oaDOI.org) metadata corpus, primarily consisting of public open access flags for a large number of Crossref-registered DOIs (identifiers representing published journal articles and other works). For more information see: http://unpaywall.org/products/snapshot
Bulk Bibliographic Metadata
Apr 24, 2019 Internet Archive Web Group
data

eye 30

favorite 0

comment 0

This dump includes all tables (including oauth authentication tables which could be a privacy, but not security, concern). At this time only IA staff have accounts, so the snapshot, which is intended mostly for disaster recovery, is still public.
Fatcat Database Snapshots and Bulk Metadata Exports
Apr 30, 2019 Internet Archive Web Group
data

eye 17

favorite 0

comment 0

Fatcat Database Snapshots and Bulk Metadata Exports
May 1, 2019 Internet Archive Web Group
data

eye 12

favorite 0

comment 0

Bulk Bibliographic Metadata
May 13, 2019 Microsoft Academic
data

eye 141

favorite 0

comment 0

This is a mirror of the RDF dump posted at:  http://ma-graph.org/rdf-dumps/ The license provided with this metadata is: Open Data Commons Attribution License (ODC-By) v1.0
Bulk Bibliographic Metadata
May 22, 2019 arxiv.org
data

eye 24

favorite 0

comment 0

OAI-PMH metadata collected from the arxiv.org endpoint, using the arXivRaw schema. Collected in two batches: up through ~2017, then up through May 22nd, 2019.
Bulk Bibliographic Metadata
May 23, 2019 Internet Archive Web Group
data

eye 39

favorite 0

comment 0

This item contains a complete PostgreSQL SQL database snapshot from https://fatcat.wiki, in binary 'pg_dump tar mode' format. With the exception of the 'abstracts' table (for which no aggregate license or copyright claims can be made; downstream users are responsible for their use), all metadata here is licensed CC-0 (public domain release) and may be used for any purpose. Downstream users are strongly encouraged to provide attribution and link here to the snapshot, as well as give credit to...
Fatcat Database Snapshots and Bulk Metadata Exports
Jun 4, 2019 Internet Archive Web Group
data

eye 20

favorite 0

comment 0

See README.md
Fatcat Database Snapshots and Bulk Metadata Exports
Jun 4, 2019 Internet Archive Web Group
data

eye 24

favorite 0

comment 0

Bulk Bibliographic Metadata
Jun 27, 2019 Internet Archive Web Group
data

eye 5

favorite 0

comment 0

Fatcat Database Snapshots and Bulk Metadata Exports
Jul 8, 2019 Internet Archive Web Group
data

eye 29

favorite 0

comment 0

Fatcat Database Snapshots and Bulk Metadata Exports
Jul 8, 2019 Internet Archive Web Group
data

eye 35

favorite 0

comment 0

Bulk Bibliographic Metadata
Jul 9, 2019 Bruns A, Lenke C, Schmidt C, Taubert NC
data

eye 20

favorite 0

comment 0

ISSN-GOLD-OA provides a matching list of ISSN for Gold Open Access (OA) journals. The intention was to compile a matching table that is as complete as possible by using different publicly available sources. The data set offers a basis for various journal-related issues in bibliometric studies on Gold OA. The list is an updated version of ISSN-GOLD-OA. For a detailed description of the method, data sources used and the definition of the table fields, please refer to the original...
Bulk Bibliographic Metadata
Jul 12, 2019 EZB
data

eye 17

favorite 0

comment 0

See README for details. Scraped from: http://ezb.uni-regensburg.de/ezeit/services/collections.phtml?bibid=AAAAA&colors=1&lang=en http://ezb.uni-regensburg.de/ezeit/services/xmloutput.phtml?bibid=AAAAA&colors=1&lang=de#6.2
Bulk Bibliographic Metadata
Jul 31, 2019 OpenAPC
data

eye 18

favorite 0

comment 0

Downloaded from:  https://github.com/OpenAPC/openapc-de/blob/master/data/apc_de.csv See also:  https://openapc.github.io/about/
Bulk Bibliographic Metadata
Jul 31, 2019 Internet Archive Web Group
data

eye 20

favorite 0

comment 0

This item contains SPARQL query exports of journal metadata from wikidata; in particular ISSN/QID mappings. The SPARQL query run is included as wikidata.sparql
Bulk Bibliographic Metadata
Aug 1, 2019 Internet Archive Web Group
data

eye 61

favorite 0

comment 0

This item contains sqlite3 database snapshots, URL crawl status, and other metadata useful for doing analytics on journal OA coverage, homepage status, etc. Particularly in the context of https://fatcat.wiki. Source code: https://github.com/bnewbold/chocula
Sep 9, 2019 Internet Archive Web Group
collection

eye 1,547

This collection holds database snapshots (SQL) and bulk metadata exports (JSON and TSV) from https://fatcat.wiki (an Internet Archive service).
Fatcat Database Snapshots and Bulk Metadata Exports
Sep 10, 2019 Internet Archive Web Group
data

eye 12

favorite 0

comment 0

See README.md
Fatcat Database Snapshots and Bulk Metadata Exports
Sep 10, 2019 Internet Archive Web Group
data

eye 20

favorite 0

comment 0

See README.md
Bulk Bibliographic Metadata
Oct 3, 2019 Internet Archive Web Group
data

eye 24

favorite 0

comment 0

This item contains some bulk research affiliation datasets from Internet Archive cataloging efforts. These are mostly strings included in research papers that indicate the institutional affiliations of specific authors (eg, with a home department, university, or company) at the time of publication. These might be useful datasets for efforts to build complete indices of research organizations, or to test normalization code that maps raw strings to organization identifiers. Attribution and links...
Bulk Bibliographic Metadata
Oct 6, 2019 Crossref
data

eye 344

favorite 1

comment 0

This file is a snapshot dump of the Crossref DOI metadata API, containing entries for over 107 million DOIs. This was generated by running the scripts at: https://github.com/greenelab/crossref (git commit: 768a49ba1d8ba1971f00471950514716a9f699c8) The script started on 2019-09-09 and completed on 2019-10-06. Format is xz-compressed JSON (one JSON object per line).
Fatcat Database Snapshots and Bulk Metadata Exports
Oct 11, 2019 Internet Archive Web Group
data

eye 11

favorite 0

comment 0

Fatcat Database Snapshots and Bulk Metadata Exports
Oct 11, 2019 Internet Archive Web Group
data

eye 17

favorite 0

comment 0

Bulk Bibliographic Metadata
Nov 5, 2019 Allen Institute for Artificial Intelligence
data

eye 33

favorite 0

comment 0

This is a backup of the "Open Academic Search" corpus, published by Semantic Scholar / Allen Institute for AI. For more info see http://labs.semanticscholar.org/corpus/. In particular, note the terms and conditions, and the request: We request that any published research that makes use of this data cites the following paper: Waleed Ammar et al. 2018. Construction of the Literature Graph in Semantic Scholar. NAACL. ...
Bulk Bibliographic Metadata
Nov 16, 2019 ORCID, Inc.
data

eye 100

favorite 1

comment 0

This item contains an annual copy of the ORCID public data file, as originally downloaded from: https://orcid.figshare.com/articles/ORCID_Public_Data_File_2019/9988322 This dump contains over 7.31M summary entities (ORCIDs). More details about this content and its use are available at: https://orcid.org/content/orcid-public-data-file This dataset is available under the public domain (CC-0).
Fatcat Database Snapshots and Bulk Metadata Exports
Dec 14, 2019 Internet Archive Web Group
data

eye 10

favorite 0

comment 0

Fatcat Database Snapshots and Bulk Metadata Exports
Dec 14, 2019 Internet Archive Web Group
data

eye 22

favorite 1

comment 0