Bulk Bibliographic Metadata
by Wikimedia Research
data
Contains (at least) a list of DOIs cited by various language Wikipedias as of March 2018. Transformed by Charles using lists linked from https://blog.wikimedia.org/2018/04/05/ten-most-cited-sources-wikipedia/
Mirrored from:  https://www.arc.gov.au/excellence-research-australia/era-2018-journal-list
This item contains a transformed copy (single gzip'd JSON-per-line file, instead of tarball of xz-zipped JSON per-source files) of the metadata in item https://archive.org/details/core_oa_metadata_20180301. All the same licenses and caveats apply.
Bulk Bibliographic Metadata
by Japan Link Center
data
Downloaded from http://japanlinkcenter.org/top/material/material_metadata.html
Bulk Bibliographic Metadata
data
Downloaded from: https://grid.ac/downloads
Bulk Bibliographic Metadata
by Microsoft Academic
data
This is a mirror of the RDF dump posted at: http://ma-graph.org/rdf-dumps/
The license provided with this metadata is: Open Data Commons Attribution License (ODC-By) v1.0
Bulk Bibliographic Metadata
data
Downloaded from: https://zenodo.org/record/1438356
Bulk Bibliographic Metadata
by Jan Szczepanski
data
Downloaded from: https://www.ebsco.com/sites/g/files/nabnos191/files/acquiadam-assets/Jan-Szczepanski-Open-Access-Journals-2018_0.docx
Bulk Bibliographic Metadata
by moreo.info
data
Bulk Bibliographic Metadata
by Impactstory
data
A mirror of the Unpaywall (aka oaDOI.org) metadata corpus, primarily consisting of public open access flags for a large number of Crossref-registered DOIs (identifiers representing published journal articles and other works). For more information see: http://unpaywall.org/products/snapshot
Bulk Bibliographic Metadata
data
This item contains a set of "Keeper's Reports" summarizing journal content preservation coverage from major archival services and networks (Portico, LOCKSS, CLOCKSS). See README for links to where these files were downloaded from.
Topics: Keeper's Reports, Metadata, Preservation
Bulk Bibliographic Metadata
by Allen Institute for Artificial Intelligence
data
This is a snapshot of the AI2 (Semantic Scholar) "Open Research Corpus", as released May 3rd, 2018. These files were originally downloaded from AWS S3, via: http://labs.semanticscholar.org/corpus/
Note restrictions in the 'license.txt' file. 'index.html' is a backup of the landing page, which includes field content. 'sample-S2-records.gz' is a subset of the data useful for exploration. Semantic Scholar is a project of the Allen Institute for Artificial Intelligence.
Bulk Bibliographic Metadata
by ROAD: Directory of Open Access Scholarly Resources
data
This is a backup of ROAD/ISSN metadata from http://road.issn.org/en/contenu/download-road-records
Dumps in both MARC XML and RDF format are included; see sub-directory for date of download. See also earlier July 2017 dump at: https://archive.org/download/road-issn-2017
These files are under the Creative Commons Attribution-NonCommercial 4.0 International Public License (aka, CC-BY-NC).
Topic: metadata
Bulk Bibliographic Metadata
by Norwegian Centre for Research Data
data
This item contains a snapshot of the "Norwegian Register for Scientific Journals, Series and Publishers", as downloaded from https://dbh.nsd.uib.no/publiseringskanaler/AlltidFerskListe. As the name indicates, this is a registry of international journals (aka "titles", or "serials"); the scope is not limited to Norwegian or Nordic publications.
Bulk Bibliographic Metadata
by EuropePMC
data
Data mirrored from https://europepmc.org/downloads
Contains a mapping between PubMed IDs (PMID), PubMedCentral IDs (PMCID), and DOI numbers, for over 29 million works.
Bulk Bibliographic Metadata
by Internet Archive Web Group
data
This is a mapping between:
- DOIs (Crossref)
- PubMed PMID and PMCID (NIH)
- CORE record identifier (core.ac.uk)
- Wikidata QIDs
See README and scripts for details.
Bulk Bibliographic Metadata
by Internet Archive Web Group
data
Data-munged title-level metadata combined from: DOAJ, ROAD, Norwegian Register, and Internet Archive crawled metadata. See SOURCES.md for URLs of upstream metadata, and ISSN_matching.html for Jupyter notebook used to derive this dataset.
Bulk Bibliographic Metadata
by Wikidata Project
data
This item contains a copy of the 2018-09-03 snapshot of bibliographic metadata extracted from Wikidata. These datasets were downloaded from: http://uri.gbv.de/wikicite/20180903/
More information at: https://github.com/wikicite/wikicite-data#readme and http://wikicite.org/
Bulk Bibliographic Metadata
by Directory of Open Access Journals
data
Downloaded from https://doaj.org/csv and the OAI-PMH interface. File names encode the date when data was downloaded.
Bulk Bibliographic Metadata
data
Downloaded from https://core.ac.uk/services "The data aggregated from repositories by the CORE system can be accessed in two ways, through the CORE API or by downloading the data to your computer. The former option is practical if you want to build a service on top of CORE while the latter is something we recommend to those who would like to analyse the CORE dataset and/or apply some computationally intensive batch processes. If you use CORE in your work, we kindly request you to cite one...
Bulk Bibliographic Metadata
by Crossref
data
This file is a snapshot dump of the Crossref DOI metadata API, containing entries for over 99 million DOIs. This was generated by running the scripts at: https://github.com/greenelab/crossref (git commit: 768a49ba1d8ba1971f00471950514716a9f699c8)
The script completed on 2018-09-20. Format is xz-compressed JSON (one JSON object per line).
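Because the snapshot is described as xz-compressed JSON with one object per line, it can be read as a stream rather than decompressed in full. The following is a minimal sketch using only the Python standard library; the file name in the usage note and the "DOI" field name are assumptions for illustration and are not stated in the item description.

```python
import json
import lzma
from typing import Iterator


def iter_records(path: str) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of an xz-compressed file."""
    # lzma.open in text mode decompresses incrementally, so the whole
    # dump never has to fit in memory or be unpacked on disk.
    with lzma.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_records_with_field(path: str, field: str = "DOI") -> int:
    """Count records that contain the given top-level key.

    The default key "DOI" is an assumption about the record layout.
    """
    return sum(1 for record in iter_records(path) if field in record)
```

A call such as `count_records_with_field("crossref-works.json.xz")` (hypothetical file name) would then walk the dump once, line by line.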
Bulk Bibliographic Metadata
by Crossref
data
This file is a snapshot dump of the Crossref DOI metadata API, containing entries for over 94 million DOIs. Compared to the previous 2017-03 version (see archive.org item "crossref_doi_dump_201703"), this snapshot has a few million more works, but the corpus size is much larger (29 GB compressed vs. 7 GB compressed) as it now contains significantly more citation data, due to the efforts of the Initiative for Open Citations (I4OC) project. This was generated by running the scripts...
Open Access Journal Test Crawl (2018)
by Internet Archive Web Group
collection
794 items, 13.4M views
UNPAYWALL-PDF-CRAWL-2018-07
by Internet Archive Web Group
data
UNPAYWALL-PDF-CRAWL-2018-07
by Internet Archive Web Group
data
See also the crawl logs item for this crawl.
DOI-LANDING-CRAWL-2018-06
by Internet Archive Web Group
data
This item contains output files related to the DOI-LANDING-CRAWL-2018-06 crawl of Crossref DOI redirect landing pages:
- list of Crossref DOI numbers attempted
- an index of DOI, URL, and final HTTP status codes
DOI-LANDING-CRAWL-2018-06
by Internet Archive Web Group
data
by Internet Archive Web Group
collection
This collection contains web crawl data for a random selection of 500k (0.5 million) Crossref DOI redirects, including the doi.org redirect requests. The intent of this crawl is to gather loose statistics on the number of failing redirects, number of host websites that block automated crawling, and a corpus of HTML landing pages for metadata extraction (eg, "signposting" HTTP headers, linked data HTML metadata, semantic markup). Total size of (uncompressed) WARC data is 50 GB,...
DOI-LANDING-CRAWL-2018-06
by Internet Archive Web Group
data
DOI-LANDING-CRAWL-2018-06
by Internet Archive Web Group
collection
279 items, 3.8M views
CORE-UPSTREAM-CRAWL-2018-11
by Internet Archive Web Group
collection
741 items, 2.3M views
Crawl of "upstream" URLs from CORE (core.ac.uk) metadata dump. Only a partial seedlist of files crawled.
UNPAYWALL-PDF-CRAWL-2018-07
by Internet Archive Web Group
collection
1,241 items, 18.3M views
Web archive data from a crawl of open access PDF URLs provided by Unpaywall.