
Crossref DOI Resolution Test Crawl (May 2018)

Internet Archive Web Group

This collection contains web crawl data for a random sample of 500,000 Crossref DOI redirects, including the doi.org redirect requests themselves. The crawl is intended to gather rough statistics on the number of failing redirects and the number of host websites that block automated crawling, and to build a corpus of HTML landing pages for metadata extraction (e.g., "signposting" HTTP headers, linked-data HTML metadata, semantic markup).
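As a rough illustration of the kind of statistics described above, the following minimal sketch tallies outcomes for DOI redirect chains. It is not the crawl's actual tooling: the function names, the outcome labels, the sample DOIs, and the choice of which HTTP status codes count as "blocked" are all assumptions made for this example.

```python
from collections import Counter

# Assumption: these statuses are treated as "host blocks automated crawling".
# The crawl's real criteria are not stated in the collection description.
BLOCKED_STATUSES = {401, 403, 429}


def classify_chain(statuses):
    """Classify one DOI redirect chain by its final HTTP status code."""
    if not statuses:
        return "no_response"
    final = statuses[-1]
    if 200 <= final < 300:
        return "resolved"
    if final in BLOCKED_STATUSES:
        return "blocked"
    if 300 <= final < 400:
        # The chain ended on a redirect that was never followed to a page.
        return "incomplete_redirect"
    return "failed"


def tally(chains):
    """Count outcomes across a mapping of DOI -> list of status codes."""
    return Counter(classify_chain(codes) for codes in chains.values())


# Hypothetical sample data; these DOIs are placeholders, not crawl results.
sample = {
    "10.1000/example-a": [302, 301, 200],
    "10.1000/example-b": [302, 403],
    "10.1000/example-c": [302, 404],
    "10.1000/example-d": [],
}

if __name__ == "__main__":
    for outcome, count in sorted(tally(sample).items()):
        print(outcome, count)
```

Classifying on the final status alone keeps the sketch short; a real analysis would probably also look at redirect loops, timeouts, and robots.txt exclusions.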




5 results


Crossref DOI Resolution Test Crawl (May 2018)
May 8, 2018
data

Views: 3, favorites: 0, comments: 0

Configuration, Reports, and Logs for DOI-LANDING-TESTCRAWL-2018-05 crawl.
Internet Archive crawldata of web PDF content captured by wbgrp-svc285.us.archive.org:DOI-LANDING-TESTCRAWL-2018-05 from Sat May 5 16:31:23 PDT 2018 to Mon May 7 14:29:26 PDT 2018.
Topic: crawldata
Internet Archive crawldata of web PDF content captured by wbgrp-svc285.us.archive.org:DOI-LANDING-TESTCRAWL-2018-05 from Fri May 4 14:20:49 PDT 2018 to Sat May 5 09:31:21 PDT 2018.
Topic: crawldata
Crossref DOI Resolution Test Crawl (May 2018)
web

Views: 2,127, favorites: 0, comments: 0

Internet Archive crawldata of web PDF content captured by wbgrp-svc285.us.archive.org:DOI-LANDING-TESTCRAWL-2018-05 from Fri May 4 03:47:28 PDT 2018 to Fri May 4 11:47:17 PDT 2018.
Topic: crawldata
Crossref DOI Resolution Test Crawl (May 2018)
web

Views: 4,030, favorites: 0, comments: 0

Internet Archive crawldata of web PDF content captured by wbgrp-svc285.us.archive.org:DOI-LANDING-TESTCRAWL-2018-05 from Fri May 4 02:32:06 PDT 2018 to Thu May 3 22:33:51 PDT 2018.
Topic: crawldata