Title
Date Added
Creator
Bulk Bibliographic Metadata
Jan 24, 2018 Crossref
data

eye 655

favorite 2

comment 0

This file is a snapshot dump of the Crossref DOI metadata API, containing entries for over 94 million DOIs. Compared to the previous 2017-03 version (see archive.org item "crossref_doi_dump_201703"), this snapshot has a few million more works, but the corpus size is much larger (29 GB compressed vs. 7 GB compressed) as it now contains significantly more citation data, due to the efforts of the Initiative for Open Citations (I4OC) project. This was generated by running the scripts...
Bulk Bibliographic Metadata
Jan 24, 2018 Directory of Open Access Journals
data

eye 93

favorite 0

comment 0

Downloaded from https://doaj.org/csv and the OAI-PMH interface. File names encode the date when data was downloaded.
Bulk Bibliographic Metadata
Jan 25, 2018 ROAD: Directory of Open Access Scholarly Resources
data

eye 145

favorite 0

comment 0

This is a backup of ROAD/ISSN metadata from http://road.issn.org/en/contenu/download-road-records Dumps in both MARC XML and RDF format are included; see sub-directory for date of download. See also earlier July 2017 dump at: https://archive.org/download/road-issn-2017 These files are under the Creative Commons Attribution-NonCommercial 4.0 International Public License (aka, CC-BY-NC).
Topic: metadata
Bulk Bibliographic Metadata
Mar 1, 2018 Norwegian Centre for Research Data
data

eye 19

favorite 0

comment 0

This item contains a snapshot of the "Norwegian Register for Scientific Journals, Series and Publishers", as downloaded from https://dbh.nsd.uib.no/publiseringskanaler/AlltidFerskListe. As the name indicates, this is a registry of international journals (aka "titles" or "serials"); the scope is not limited to Norwegian or Nordic publications.
Bulk Bibliographic Metadata
Mar 9, 2018
data

eye 17

favorite 0

comment 0

This item contains a set of "Keeper's Reports" summarizing journal content preservation coverage from major archival services and networks (Portico, LOCKSS, CLOCKSS). See README for links to where these files were downloaded from.
Topics: Keeper's Reports, Metadata, Preservation
Bulk Bibliographic Metadata
Apr 5, 2018 Internet Archive Web Group
data

eye 97

favorite 0

comment 0

Data-munged title-level metadata combined from: DOAJ, ROAD, Norwegian Register, and Internet Archive crawled metadata. See SOURCES.md for URLs of upstream metadata, and ISSN_matching.html for Jupyter notebook used to derive this dataset.
Bulk Bibliographic Metadata
May 9, 2018 Allen Institute for Artificial Intelligence
data

eye 42

favorite 0

comment 0

This is a snapshot of the AI2 (Semantic Scholar) "Open Research Corpus", as released May 3rd, 2018. These files were originally downloaded from AWS S3, via: http://labs.semanticscholar.org/corpus/ Note the restrictions in the 'license.txt' file. 'index.html' is a backup of the landing page, which includes field content. 'sample-S2-records.gz' is a subset of the data useful for exploration. Semantic Scholar is a project of the Allen Institute for Artificial Intelligence.
Bulk Bibliographic Metadata
Aug 13, 2018
data

eye 51

favorite 0

comment 0

Downloaded from https://core.ac.uk/services "The data aggregated from repositories by the CORE system can be accessed in two ways, through the CORE API or by downloading the data to your computer. The former option is practical if you want to build a service on top of CORE while the latter is something we recommend to those who would like to analyse the CORE dataset and/or apply some computationally intensive batch processes. If you use CORE in your work, we kindly request you to cite one...
Bulk Bibliographic Metadata
Sep 6, 2018 Wikidata Project
data

eye 83

favorite 0

comment 0

This item contains a copy of the 2018-09-03 snapshot of bibliographic metadata extracted from Wikidata. These datasets were downloaded from: http://uri.gbv.de/wikicite/20180903/ More information at: https://github.com/wikicite/wikicite-data#readme and http://wikicite.org/
Bulk Bibliographic Metadata
Sep 6, 2018 EuropePMC
data

eye 50

favorite 0

comment 0

Data mirrored from https://europepmc.org/downloads Contains a mapping between PubMed IDs (PMID), PubMedCentral IDs (PMCID), and DOI numbers, for over 29 million works.
Bulk Bibliographic Metadata
Sep 9, 2018 Internet Archive Web Group
data

eye 89

favorite 0

comment 0

This is a mapping between:
- DOIs (Crossref)
- PubMed PMID and PMCID (NIH)
- CORE record identifier (core.ac.uk)
- Wikidata QIDs
See README and scripts for details.
Bulk Bibliographic Metadata
Sep 22, 2018 Crossref
data

eye 485

favorite 1

comment 0

This file is a snapshot dump of the Crossref DOI metadata API, containing entries for over 99 million DOIs. This was generated by running the scripts at: https://github.com/greenelab/crossref (git commit: 768a49ba1d8ba1971f00471950514716a9f699c8) The script completed on 2018-09-20. Format is xz-compressed JSON (one JSON object per line).
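For readers who want to explore a dump in this format (xz-compressed, one JSON object per line), the following is a minimal Python sketch of streaming it without decompressing the whole file first. The function names and the file name in the usage note are hypothetical, and the field names inside each record are assumptions, so check them against the actual data before relying on them.

```python
import json
import lzma
from typing import Iterator


def iter_records(path: str) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of an xz-compressed file."""
    # lzma.open in text mode decompresses lazily while iterating,
    # so the full dump never has to fit in memory.
    with lzma.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_with_field(path: str, field: str) -> int:
    """Count records that carry a given top-level key (key name is an assumption)."""
    return sum(1 for record in iter_records(path) if field in record)
```

A call such as `count_with_field("crossref-works.json.xz", "reference")` (hypothetical file name and key) would then give a rough count of records that include citation data.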
Bulk Bibliographic Metadata
data

eye 30

favorite 0

comment 0

This item contains a transformed copy (single gzip'd JSON-per-line file, instead of tarball of xz-zipped JSON per-source files) of the metadata in item https://archive.org/details/core_oa_metadata_20180301. All the same licenses and caveats apply.
Bulk Bibliographic Metadata
Nov 29, 2018 Impactstory
data

eye 20

favorite 0

comment 0

A mirror of the Unpaywall (aka oaDOI.org) metadata corpus, primarily consisting of public open access flags for a large number of Crossref-registered DOIs (identifiers representing published journal articles and other works). For more information see: http://unpaywall.org/products/snapshot
Bulk Bibliographic Metadata
Dec 3, 2018 Japan Link Center
data

eye 46

favorite 0

comment 0

Downloaded from http://japanlinkcenter.org/top/material/material_metadata.html
Bulk Bibliographic Metadata
Jan 20, 2019
data

eye 34

favorite 0

comment 0

Downloaded from: https://zenodo.org/record/1438356
Bulk Bibliographic Metadata
data

eye 12

favorite 0

comment 0

Downloaded from: https://grid.ac/downloads
Bulk Bibliographic Metadata
Jan 24, 2019 moreo.info
data

eye 7

favorite 0

comment 0

Bulk Bibliographic Metadata
Jan 24, 2019 Jan Szczepanski
data

eye 21

favorite 0

comment 0

Downloaded from: https://www.ebsco.com/sites/g/files/nabnos191/files/acquiadam-assets/Jan-Szczepanski-Open-Access-Journals-2018_0.docx
Bulk Bibliographic Metadata
Feb 21, 2019 Wikimedia Research
data

eye 12

favorite 0

comment 0

Contains (at least) a list of DOIs cited by various language Wikipedias as of March 2018. Transformed by Charles using lists linked from https://blog.wikimedia.org/2018/04/05/ten-most-cited-sources-wikipedia/
Bulk Bibliographic Metadata
May 13, 2019 Microsoft Academic
data

eye 141

favorite 0

comment 0

This is a mirror of the RDF dump posted at:  http://ma-graph.org/rdf-dumps/ The license provided with this metadata is: Open Data Commons Attribution License (ODC-By) v1.0
Fatcat Database Snapshots and Bulk Metadata Exports
Jan 19, 2020 Internet Archive Web Group
data

eye 17

favorite 0

comment 0

Fatcat Database Snapshots and Bulk Metadata Exports
Jan 19, 2020 Internet Archive Web Group
data

eye 6

favorite 0

comment 0

Bulk Bibliographic Metadata
Feb 3, 2020 Microsoft Academic
data

eye 145

favorite 1

comment 0

This is an updated snapshot of the Microsoft Academic Graph corpus. Microsoft generously makes this corpus available at no cost under the ODC-BY "open data license" ( https://opendatacommons.org/licenses/by/1.0/ ). See the link for details; at a minimum this license requires downstream users to acknowledge the creator. You can read more about the corpus, including how to obtain updated copies on Microsoft Azure, a schema reference, etc, at the following URLs and in the following...
Bulk Bibliographic Metadata
Feb 6, 2020 Allen Institute for Artificial Intelligence
data

eye 56

favorite 0

comment 0

This is a backup of the "Open Academic Search" corpus, published by Semantic Scholar / Allen Institute for AI. For more info see http://labs.semanticscholar.org/corpus/. In particular, note the terms and conditions: Semantic Scholar Open Research Corpus is licensed under  ODC-BY . When using the Semantic Scholar Open Research Corpus (“S2 ORC”) in a product or service, or including data in a redistribution, please cite the following paper: Waleed Ammar et al. 2018. Construction...
Bulk Bibliographic Metadata
Feb 14, 2020 Library Genesis
data

eye 385

favorite 2

comment 0

Snapshot as of 2019-04-15, containing SQL dumps for multiple databases: the complete Library Genesis database, the comic book database, the fiction database, the 'compact' Library Genesis database, and scientific magazines. The SQL dumps were generated by a MySQL/MariaDB database. *** THIS ITEM DOES NOT CONTAIN ANY BOOKS *** Upstream does not provide checksums, so all checksums should be treated with some doubt. The databases were archived upstream with the RAR archiver; file names have been changed to include the creation date.
Fatcat Database Snapshots and Bulk Metadata Exports
Mar 4, 2020 Internet Archive Web Group
data

eye 5

favorite 0

comment 0

Fatcat Database Snapshots and Bulk Metadata Exports
Mar 4, 2020 Internet Archive Web Group
data

eye 10

favorite 0

comment 0

Fatcat Database Snapshots and Bulk Metadata Exports
Apr 6, 2020 Internet Archive Web Group
data

eye 18

favorite 0

comment 0

Fatcat Database Snapshots and Bulk Metadata Exports
Apr 6, 2020 Internet Archive Web Group
data

eye 33

favorite 0

comment 0

Bulk Bibliographic Metadata
Apr 7, 2020 Internet Archive Web Group
data

eye 33

favorite 0

comment 0

A mirror of the Unpaywall (aka oaDOI.org) metadata corpus, primarily consisting of public open access flags for a large number of Crossref-registered DOIs (identifiers representing published journal articles and other works). For more information see: http://unpaywall.org/products/snapshot
Bulk Bibliographic Metadata
Apr 9, 2020 Crossref
data

eye 91

favorite 0

comment 0

Mirrored via torrent from Academic Torrents: https://academictorrents.com/details/0c6c3fbfdc13f0169b561d29354ea8b188eb9d63 https://www.crossref.org/blog/free-public-data-file-of-112-million-crossref-records/
Bulk Bibliographic Metadata
May 4, 2020 Impactstory
data

eye 117

favorite 0

comment 0

A mirror of the Unpaywall (aka oaDOI.org) metadata corpus, primarily consisting of public open access flags for a large number of Crossref-registered DOIs (identifiers representing published journal articles and other works). For more information see: http://unpaywall.org/products/snapshot
Bulk Bibliographic Metadata
May 5, 2020 SciELO
data

eye 11

favorite 0

comment 0

Fatcat Database Snapshots and Bulk Metadata Exports
May 27, 2020 Internet Archive Web Group
data

eye 8

favorite 0

comment 0

Fatcat Database Snapshots and Bulk Metadata Exports
May 27, 2020 Internet Archive Web Group
data

eye 19

favorite 0

comment 0

Bulk Bibliographic Metadata
Jun 23, 2020 SciELO
data

eye 11

favorite 0

comment 0

Bulk Bibliographic Metadata
Jun 23, 2020 Internet Archive Web Group
data

eye 11

favorite 0

comment 0

This item contains datasets of homepage URLs found by hand using search engines and bibliographic metadata (eg, ISSN and journal title). The "long-tail" batch contains about 4,600 journal lookup results, with about 3,900 successful homepage URLs found. The list of journals was created in May 2020, and the lookup work completed in June 2020. IA staff member Richard Greydanus ran this batch of lookups. All of this metadata can be considered public domain, or CC-0 (Creative Commons Zero)...
Bulk Bibliographic Metadata
Jun 23, 2020
data

eye 8

favorite 0

comment 0

Mirrored from:  https://github.com/njahn82/vanished_journals/tree/master/data
Bulk Bibliographic Metadata
data

eye 6

favorite 0

comment 0

Mirrored from:  https://www.arc.gov.au/excellence-research-australia/era-2018-journal-list
Bulk Bibliographic Metadata
Jul 6, 2020 Microsoft Academic
data

eye 64

favorite 0

comment 0

This is an updated snapshot of the Microsoft Academic Graph corpus. Microsoft generously makes this corpus available at no cost under the ODC-BY "open data license" ( https://opendatacommons.org/licenses/by/1.0/ ). See the link for details; at a minimum this license requires downstream users to acknowledge the creator. You can read more about the corpus, including how to obtain updated copies on Microsoft Azure, a schema reference, etc, at the following URLs and in the following...
Fatcat Database Snapshots and Bulk Metadata Exports
Aug 7, 2020 Internet Archive Web Group
data

eye 26

favorite 0

comment 0

Fatcat Database Snapshots and Bulk Metadata Exports
Aug 7, 2020 Internet Archive Web Group
data

eye 34

favorite 0

comment 0

Bulk Bibliographic Metadata
Aug 8, 2020 DOAJ
data

eye 11

favorite 0

comment 0

Bulk Bibliographic Metadata
Aug 8, 2020 dblp
data

eye 27

favorite 0

comment 0

Bulk Bibliographic Metadata
Sep 4, 2020 Allen Institute for Artificial Intelligence
data

eye 185

favorite 0

comment 0

Semantic Scholar Open Research Corpus is licensed under  ODC-BY . When using the Semantic Scholar Open Research Corpus (“S2 ORC”) in a product or service, or including data in a redistribution, please cite the following paper: Waleed Ammar et al. 2018. Construction of the Literature Graph in Semantic Scholar. NAACL https://www.semanticscholar.org/paper/09e3cf5704bcb16e6657f6ceed70e93373a54618 This site is provided by The Allen Institute for Artificial Intelligence (“AI2”) as a service...
Fatcat Database Snapshots and Bulk Metadata Exports
Sep 29, 2020 Internet Archive Web Group
data

eye 58

favorite 0

comment 0

This item contains an example corpus of citations between scholarly documents, as extracted from the fatcat (https://fatcat.wiki) corpus as of the 2020-08-05 bulk release export. This corpus itself was generated from a fatcat-scholar "intermediate" fulltext dump which is not public, using software in the fatcat-scholar repository in mid-September 2020. See also the README for some more notes, and the "sample" file.
Bulk Bibliographic Metadata
Oct 9, 2020 Impactstory
data

eye 7

favorite 0

comment 0

Fatcat Database Snapshots and Bulk Metadata Exports
Oct 10, 2020 Internet Archive Web Group
data

eye 28

favorite 0

comment 0

Fatcat Database Snapshots and Bulk Metadata Exports
Oct 11, 2020 Internet Archive Web Group
data

eye 12

favorite 0

comment 0

Bulk Bibliographic Metadata
Nov 17, 2020 DOAJ
data

eye 41

favorite 1

comment 0

Bulk Bibliographic Metadata
Dec 1, 2020 ORCID, Inc.
data

eye 168

favorite 0

comment 0

This item contains an annual copy of the ORCID public data file, as originally downloaded from: https://orcid.figshare.com/articles/dataset/ORCID_Public_Data_File_2020/13066970 More details about this content and its use are available at: https://orcid.org/content/orcid-public-data-file This dataset is available under the public domain (CC-0).
Fatcat Database Snapshots and Bulk Metadata Exports
Dec 8, 2020 Internet Archive Web Group
data

eye 17

favorite 0

comment 0

Bulk Bibliographic Metadata
Dec 18, 2020 dblp
data

eye 22

favorite 0

comment 0

Fatcat Database Snapshots and Bulk Metadata Exports
Dec 30, 2020 Internet Archive Web Group
data

eye 17

favorite 0

comment 0

Fatcat Database Snapshots and Bulk Metadata Exports
Dec 30, 2020 Internet Archive Web Group
data

eye 29

favorite 0

comment 0

Bulk Bibliographic Metadata
Mar 2, 2021 Harshdeep Singh, Robert West, & Giovanni Colavizza
data

eye 12

favorite 0

comment 0

Mirrored from: https://zenodo.org/record/3940692 Harshdeep Singh, Robert West, & Giovanni Colavizza. (2020). Wikipedia Citations: A comprehensive dataset of citations with identifiers extracted from English Wikipedia (Version 0.2) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3940692
Bulk Bibliographic Metadata
May 24, 2021 CORE.ac.uk
data

eye 20

favorite 0

comment 0

Mirrored from: https://core.ac.uk/documentation/dataset Dataset created for Deduplication of Scholarly Documents using Locality Sensitive Hashing and Word Embeddings (LREC 2020) (62 MB compressed, 204 MB in total) License: Open Data Commons Attribution (ODC-By) license.