Skip to main content
SHOW DETAILS
up-solid down-solid
eye
Title
Date Reviewed
Creator
UNPAYWALL-PDF-CRAWL-2019-04
- Internet Archive Web Group
data

eye 2

favorite 0

comment 0

Custom Crawl Services
- Internet Archive Web Group
data

eye 0

favorite 0

comment 0

This item contains a copy of log files found on the Internet Archive (Web Group) machine `wbgrp-svc263.us.archive.org` on 2018-05-29, under the `/3` directory. These are logs of file transfer status between various crawler machines; they are not known to contain any sensitive metadata (eg, personal information, IPs, or other security-sensitive information), but are being keep `access-restricted` anyways. This data is almost certainly unimportant and could be deleted; it is being preserved out...
OAI-PMH-CRAWL-2020-06
- Internet Archive Web Group
data

eye 2

favorite 0

comment 0

arXiv Content Crawl (2019-10)
arXiv Content Crawl (2019-10)
collection
37
ITEMS
95,778
VIEWS
- Internet Archive Web Group
collection

eye 95,778

Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 11:19:38 PDT 2017 to Thu Jul 6 04:33:05 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 06:56:02 PDT 2017 to Thu Jul 6 00:08:40 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 13:45:57 PDT 2017 to Thu Jul 6 07:00:09 PDT 2017.
Topic: crawldata
CiteSeerX URL Crawl 2017
web

eye 10,243

favorite 0

comment 0

Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 08:35:43 PDT 2017 to Thu Jul 6 01:46:47 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 02:31:51 PDT 2017 to Wed Jul 5 19:45:00 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Fri Jul 7 06:46:26 PDT 2017 to Fri Jul 14 15:21:22 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 07:54:24 PDT 2017 to Wed Jul 5 01:08:02 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 06:58:20 PDT 2017 to Wed Jul 5 00:11:16 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 09:03:33 PDT 2017 to Wed Jul 5 02:16:39 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 10:48:22 PDT 2017 to Wed Jul 5 04:00:37 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 17:47:56 PDT 2017 to Wed Jul 5 11:02:06 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 18:19:45 PDT 2017 to Wed Jul 5 11:33:54 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 18:52:51 PDT 2017 to Wed Jul 5 12:06:48 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 18:41:03 PDT 2017 to Wed Jul 5 11:56:32 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 16:49:18 PDT 2017 to Wed Jul 5 10:04:13 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 13:27:22 PDT 2017 to Wed Jul 5 06:40:32 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 19:41:55 PDT 2017 to Wed Jul 5 12:59:15 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 22:05:23 PDT 2017 to Wed Jul 5 15:42:16 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 12:53:52 PDT 2017 to Thu Jul 6 06:07:14 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 15:07:36 PDT 2017 to Thu Jul 6 08:28:04 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 02:09:00 PDT 2017 to Wed Jul 5 19:24:01 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 11:31:07 PDT 2017 to Thu Jul 6 08:54:41 PDT 2017.
Topic: crawldata
DOI-LANDING-CRAWL-2018-06
DOI-LANDING-CRAWL-2018-06
collection
279
ITEMS
3.6M
VIEWS
- Internet Archive Web Group
collection

eye 3.6M

DOAJ-CRAWL-2020-11
DOAJ-CRAWL-2020-11
collection
102
ITEMS
1M
VIEWS
- Internet Archive Web Group
collection

eye 1M

UNPAYWALL-PDF-CRAWL-2020-03
UNPAYWALL-PDF-CRAWL-2020-03
collection
344
ITEMS
2.2M
VIEWS
- Internet Archive Web Group
collection

eye 2.2M

CORE-UPSTREAM-CRAWL-2018-11
CORE-UPSTREAM-CRAWL-2018-11
collection
741
ITEMS
2.2M
VIEWS
- Internet Archive Web Group
collection

eye 2.2M

Crawl of "upstream" URLs from CORE (core.ac.uk) metadata dump. Only a partial seedlist of files crawled.
UNPAYWALL-PDF-CRAWL-2019-04
UNPAYWALL-PDF-CRAWL-2019-04
collection
641
ITEMS
6.3M
VIEWS
- Internet Archive Web Group
collection

eye 6.3M

SEMSCHOLAR-DIRECT-PDF-CRAWL-2020-02
SEMSCHOLAR-DIRECT-PDF-CRAWL-2020-02
collection
1,011
ITEMS
1.8M
VIEWS
- Internet Archive Web Group
collection

eye 1.8M

UNPAYWALL-PDF-CRAWL-2018-07
- Internet Archive Web Group
data

eye 1

favorite 0

comment 0

Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 05:01:45 PDT 2017 to Tue Jul 4 22:50:03 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 07:35:29 PDT 2017 to Wed Jul 5 00:48:05 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 05:45:37 PDT 2017 to Tue Jul 4 23:00:36 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 07:07:49 PDT 2017 to Wed Jul 5 00:18:34 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 07:44:33 PDT 2017 to Wed Jul 5 00:57:33 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 06:37:37 PDT 2017 to Tue Jul 4 23:54:39 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 06:27:07 PDT 2017 to Tue Jul 4 23:40:46 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 08:03:45 PDT 2017 to Wed Jul 5 01:16:31 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 09:55:35 PDT 2017 to Wed Jul 5 03:08:11 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 08:33:05 PDT 2017 to Wed Jul 5 01:46:32 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 10:26:12 PDT 2017 to Wed Jul 5 03:41:41 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 12:17:48 PDT 2017 to Wed Jul 5 05:29:23 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 18:09:15 PDT 2017 to Wed Jul 5 11:23:47 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 11:48:17 PDT 2017 to Wed Jul 5 05:01:28 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 11:59:15 PDT 2017 to Wed Jul 5 05:11:17 PDT 2017.
Topic: crawldata
CiteSeerX URL Crawl 2017
web

eye 10,878

favorite 0

comment 0

Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 12:08:27 PDT 2017 to Wed Jul 5 05:22:22 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 02:20:32 PDT 2017 to Wed Jul 5 19:33:31 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 11:58:36 PDT 2017 to Thu Jul 6 05:13:01 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 06:57:33 PDT 2017 to Thu Jul 6 02:08:23 PDT 2017.
Topic: crawldata
CiteSeerX URL Crawl 2017
web

eye 11,059

favorite 0

comment 0

Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 08:53:13 PDT 2017 to Thu Jul 6 02:05:47 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 14:52:50 PDT 2017 to Thu Jul 6 08:15:06 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 09:38:29 PDT 2017 to Thu Jul 6 02:51:41 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 05:56:39 PDT 2017 to Wed Jul 5 23:08:54 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 07:42:06 PDT 2017 to Thu Jul 6 00:54:28 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 22:56:35 PDT 2017 to Thu Jul 6 17:34:26 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Fri Jul 7 03:24:06 PDT 2017 to Thu Jul 6 21:44:17 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 09:48:37 PDT 2017 to Thu Jul 6 03:05:10 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 00:01:18 PDT 2017 to Wed Jul 5 17:59:10 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 07:32:04 PDT 2017 to Thu Jul 6 00:46:06 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 01:36:31 PDT 2017 to Wed Jul 5 18:48:11 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 11:08:03 PDT 2017 to Thu Jul 6 04:22:39 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 03:01:58 PDT 2017 to Wed Jul 5 20:14:53 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 12:19:36 PDT 2017 to Thu Jul 6 05:34:14 PDT 2017.
Topic: crawldata
MSAG-PDF-CRAWL-2017
collection
1,855
ITEMS
14M
VIEWS
- Internet Archive Web Group
collection

eye 14M

Microsoft Academic Graph public corpus (Feb 2016) PDF URLs, filtered to remove large sites (pubmed, citeseerx, arxiv) and already-crawled URLs.
Topics: papers, journals
MAG-PDF-CRAWL-2020-07
MAG-PDF-CRAWL-2020-07
collection
196
ITEMS
2M
VIEWS
- Internet Archive Web Group
collection

eye 2M

OMICS-DOI-LANDING-CRAWL-2019-04
OMICS-DOI-LANDING-CRAWL-2019-04
collection
4
ITEMS
14,313
VIEWS
- Internet Archive Web Group
collection

eye 14,313

This crawl started in April 2019, as an informal collaboration with Crossref. Crawling a smallish number (100k) DOI redirects and landing pages (plus PDF outlinks, and maybe a couple other hops) for a single large publisher (OMICS, which has multiple subsidiaries). Intent is to get reasonably good capture that can be used as canonical preservation copies of the landing pages. Secondary goal is to get decent fulltext capture coverage.
SCIELO-CRAWL-2020-07
SCIELO-CRAWL-2020-07
collection
41
ITEMS
218,859
VIEWS
- Internet Archive Web Group
collection

eye 218,859

Open Access Journal Test Crawl (2018)
Open Access Journal Test Crawl (2018)
collection
794
ITEMS
12.3M
VIEWS
- Internet Archive Web Group
collection

eye 12.3M

DIRECT-OA-CRAWL-2019
- Internet Archive Web Group
data

eye 5

favorite 0

comment 0

MAG-PDF-CRAWL-2020-07
- Internet Archive Web Group
data

eye 0

favorite 0

comment 0

UNPAYWALL-PDF-CRAWL-2020-11
- Internet Archive Web Group
data

eye 0

favorite 0

comment 0

DATACITE-DOI-CRAWL-2020-01
DATACITE-DOI-CRAWL-2020-01
collection
1,417
ITEMS
4.3M
VIEWS
- Internet Archive Web Group
collection

eye 4.3M

Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 14:27:18 PDT 2017 to Wed Jul 5 07:42:31 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 05:26:51 PDT 2017 to Wed Jul 5 22:39:49 PDT 2017.
Topic: crawldata
CiteSeerX URL Crawl 2017
web

eye 10,162

favorite 0

comment 0

Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 09:01:08 PDT 2017 to Thu Jul 6 02:12:56 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 13:15:48 PDT 2017 to Thu Jul 6 06:27:35 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 10:39:46 PDT 2017 to Thu Jul 6 03:52:14 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 07:05:25 PDT 2017 to Thu Jul 6 00:16:46 PDT 2017.
Topic: crawldata
CiteSeerX URL Crawl 2017
web

eye 11,501

favorite 0

comment 0

Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 01:25:08 PDT 2017 to Wed Jul 5 18:40:27 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 04:45:52 PDT 2017 to Wed Jul 5 21:59:01 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 06:35:45 PDT 2017 to Wed Jul 5 23:47:46 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 21:42:48 PDT 2017 to Wed Jul 5 15:07:34 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 16:02:02 PDT 2017 to Thu Jul 6 09:30:48 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 15:40:27 PDT 2017 to Thu Jul 6 09:06:51 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 16:23:41 PDT 2017 to Thu Jul 6 09:49:34 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 07:26:10 PDT 2017 to Wed Jul 5 00:38:51 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 09:13:11 PDT 2017 to Wed Jul 5 02:27:43 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 11:06:40 PDT 2017 to Wed Jul 5 04:21:44 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 13:16:31 PDT 2017 to Wed Jul 5 06:29:05 PDT 2017.
Topic: crawldata
CiteSeerX URL Crawl 2017
web

eye 12,826

favorite 0

comment 0

Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 12:57:06 PDT 2017 to Wed Jul 5 06:10:16 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc284.us.archive.org:CITESEERX-CRAWL-2017 from Wed Jul 5 14:37:19 PDT 2017 to Wed Jul 5 07:53:37 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 09:57:54 PDT 2017 to Thu Jul 6 03:09:56 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 03:47:41 PDT 2017 to Wed Jul 5 20:59:49 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 06:25:00 PDT 2017 to Wed Jul 5 23:39:28 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 15:25:17 PDT 2017 to Thu Jul 6 08:45:39 PDT 2017.
Topic: crawldata
Internet Archive crawldata of uncrawled CiteseerX PDF URLs captured by wbgrp-svc285.us.archive.org:CITESEERX-CRAWL-2017 from Thu Jul 6 08:08:25 PDT 2017 to Thu Jul 6 01:20:22 PDT 2017.
Topic: crawldata
UNPAYWALL-PDF-CRAWL-2018-07
- Internet Archive Web Group
data

eye 20

favorite 0

comment 0