8,554
UPLOADS


Title
Date Archived
Creator
The Archive Team Just In Time Grabs
May 3, 2011
web

eye 4,309

favorite 3

comment 0

Founded in 2004, Encyclopedia Dramatica (ED) was a free-form Wiki dedicated to collecting all manner of internet subculture, including illustrations, descriptions and histories, especially as related to trolling, troll activities, and internet drama. Unlike many mainstream wikis such as Wikipedia, ED was intentionally indecent, obtuse, and inaccurate - the actual information related to a situation would come with further research, not by simply reading the related ED article. Constantly on the...
Topics: encyclopedia dramatica, anonymous, trolling, wiki, wikidump, lulz
Web Crawls
May 3, 2011 City of Vancouver
web

eye 392

favorite 0

comment 0

As part of an open data initiative, the City of Vancouver government opened up a large swath of its public data on a website called data.vancouver.ca. This continually growing site is meant to house all relevant city data available in a public and transparent manner. This is an April 2011 snapshot of the city data in one large (31 gigabyte) package, including all datasets, web pages, and collected materials related to data.vancouver.ca. Data includes XML, KML, CSV and other formats. Information...
Topics: government, vancouver, canada, opendata, kml, xml, csv, datasets
The Archive Team Geocities Valhalla
web

eye 5,992

favorite 5

comment 0

This is a collection of Geocities data downloaded by a bunch of people who call themselves ARCHIVE TEAM, who began scraping the Yahoo! Geocities site during a six month period in 2009, before Yahoo! shut down geocities.com on October 26th, 2009. This collection is compressed in a UNIX filesystem with both 7zip archives and tape archives (gtar). This collection was put together by nearly 100 folks assembling at the news of the death of Geocities, a website that allowed free hosting of web pages...
The Archive Team Geocities Valhalla
web

eye 970

favorite 1

comment 0

This is a collection of Geocities data downloaded by a bunch of people who call themselves ARCHIVE TEAM, who began scraping the Yahoo! Geocities site during a six month period in 2009, before Yahoo! shut down geocities.com on October 26th, 2009. This collection is compressed in a UNIX filesystem with both 7zip archives and tape archives (gtar). This collection was put together by nearly 100 folks assembling at the news of the death of Geocities, a website that allowed free hosting of web pages...
The Archive Team Geocities Valhalla
web

eye 1,470

favorite 1

comment 0

This is a collection of Geocities data downloaded by a bunch of people who call themselves ARCHIVE TEAM, who began scraping the Yahoo! Geocities site during a six month period in 2009, before Yahoo! shut down geocities.com on October 26th, 2009. This collection is compressed in a UNIX filesystem with both 7zip archives and tape archives (gtar). This collection was put together by nearly 100 folks assembling at the news of the death of Geocities, a website that allowed free hosting of web pages...
The Archive Team Geocities Valhalla
web

eye 2,015

favorite 2

comment 0

This is a collection of Geocities data downloaded by a bunch of people who call themselves ARCHIVE TEAM, who began scraping the Yahoo! Geocities site during a six month period in 2009, before Yahoo! shut down geocities.com on October 26th, 2009. This collection is compressed in a UNIX filesystem with both 7zip archives and tape archives (gtar). This collection was put together by nearly 100 folks assembling at the news of the death of Geocities, a website that allowed free hosting of web pages...
The Archive Team Geocities Valhalla
web

eye 1,511

favorite 1

comment 0

This is a collection of Geocities data downloaded by a bunch of people who call themselves ARCHIVE TEAM, who began scraping the Yahoo! Geocities site during a six month period in 2009, before Yahoo! shut down geocities.com on October 26th, 2009. This collection is compressed in a UNIX filesystem with both 7zip archives and tape archives (gtar). This collection was put together by nearly 100 folks assembling at the news of the death of Geocities, a website that allowed free hosting of web pages...
The Archive Team Geocities Valhalla
web

eye 1,667

favorite 1

comment 0

This is a collection of Geocities data downloaded by a bunch of people who call themselves ARCHIVE TEAM, who began scraping the Yahoo! Geocities site during a six month period in 2009, before Yahoo! shut down geocities.com on October 26th, 2009. This collection is compressed in a UNIX filesystem with both 7zip archives and tape archives (gtar). This collection was put together by nearly 100 folks assembling at the news of the death of Geocities, a website that allowed free hosting of web pages...
The Archive Team Geocities Valhalla
web

eye 1,173

favorite 1

comment 0

This is a collection of Geocities data downloaded by a bunch of people who call themselves ARCHIVE TEAM, who began scraping the Yahoo! Geocities site during a six month period in 2009, before Yahoo! shut down geocities.com on October 26th, 2009. This collection is compressed in a UNIX filesystem with both 7zip archives and tape archives (gtar). This collection was put together by nearly 100 folks assembling at the news of the death of Geocities, a website that allowed free hosting of web pages...
The Archive Team Geocities Valhalla
web

eye 1,216

favorite 1

comment 0

This is a collection of Geocities data downloaded by a bunch of people who call themselves ARCHIVE TEAM, who began scraping the Yahoo! Geocities site during a six month period in 2009, before Yahoo! shut down geocities.com on October 26th, 2009. This collection is compressed in a UNIX filesystem with both 7zip archives and tape archives (gtar). This collection was put together by nearly 100 folks assembling at the news of the death of Geocities, a website that allowed free hosting of web pages...
The Archive Team Just In Time Grabs
May 11, 2011
web

eye 826

favorite 3

comment 0

From the README: This is a collection of mirrors maintained by gopher.quux.org. These mirrors were taken offline in 2006 due to bandwidth constraints. This collection prepared April 2010 by John Goerzen -------------------------------------------------- mirrors.tar.bz2 -------------------------------------------------- Compressed size: 1.6GB Uncompressed size: 3.8GB File count: 102736 The content includes: boombox.micro.umn.edu /pub/gopher from the FTP site, including various historic Gopher...
Topics: gopher, quux, archiveteam, usenet
The Archive Team Just In Time Grabs
May 11, 2011 boingboing.net / individual authors
web

eye 284

favorite 0

comment 0

Two collections of Boing Boing postings provided by the cultural website boingboing.net on its 5th and 11th anniversaries. Includes the HTML/text aspects of the postings, along with various author and creation information. From the 2011 BoingBoing.net posting: "Having very recently celebrated Boing Boing's eleventh bloggaversary, we're releasing an update of our previous archival release of Boing Boing posts. "This time, we're releasing a 120.3MB XML file (38.3MB zip) of 63,999 posts...
Topics: BoingBoing, posts archive, blogging
The Archive Team Just In Time Grabs
May 11, 2011
web

eye 913

favorite 0

comment 0

Explanatory file included with this archive, with slight edits: On Monday 24th January 2011 the BBC announced [1] that it would be restructuring its online department - with 360 job losses and the deletion of 200 of its top level directories (including the websites that live under them - eg http://www.bbc.co.uk/blast). 172 of those top level directories [2] were due to be deleted within the coming 12 months. Most of these sites are already 'mothballed' [3], which means that the BBC has...
The Archive Team Just In Time Grabs
web

eye 10,399

favorite 0

comment 0

This dataset is a collection of scraped public twitter updates used in coordination with an academic project to study the geolocation data related to twittering. From the explanatory PDF in the dataset collection: We provide both training set and test set (collected from September 2009 to January 2010) in the paper You Are Where You Tweet: A Content-Based Approach to Geo-locating Twitter Users in CIKM 2010. The training set contains 115,886 Twitter users and 3,844,612 updates from the users....
Topics: academic paper, twitter, tweets, location, geolocation, archiveteam
The Archive Team Just In Time Grabs
May 11, 2011
web

eye 296

favorite 0

comment 0

EtherPad was a web-based collaborative real-time editor, allowing authors to simultaneously edit a text document, and see all of the participants' edits in real-time, with the ability to display each author's text in their own color. Very popular and in use by educators, businesses, and developers, Etherpad gained a strong following, but was later purchased by Google. With the introduction of the competing Wave application, Google announced a shutdown of Etherpad in favor of Wave. To outcry,...
Topics: etherpad, archiveteam, archive
The Archive Team Just In Time Grabs
May 11, 2011
web

eye 732

favorite 0

comment 0

(BudhaM0nk) i want hard drives so small i can snort them up like powder and increase my brain capacity Comprising the wit, wisdom, brilliance and buffoonery of thousands of individuals over decades, Quote Databases have provided easy access to amusing snatches of conversation from IRC and other online gathering places. While many of these sites are still up, this package of compressed archives allows easy access to the full collections of quotes from various sources. This collection was built in...
Topics: quotes, qdb, archiveteam
The Archive Team Just In Time Grabs
web

eye 12,751

favorite 0

comment 0

Facebook data scrape related to paper "The Social Structure of Facebook Networks", by Amanda L. Traud, Peter J. Mucha, Mason A. Porter. "We study the social structure of Facebook "friendship" networks at one hundred American colleges and universities at a single point in time, and we examine the roles of user attributes - gender, class year, major, high school, and residence - at these institutions. We investigate the influence of common attributes at the dyad level in...
Topics: academic, facebook, facebook networks, networks, archiveteam
The Archive Team Just In Time Grabs
May 11, 2011
web

eye 231

favorite 0

comment 0

American Powerblogs was a blog hosting service that provided ease-of-use access to blogging software. Allowing its users the ability to create their own subdomains and presentation style, Powerblogs was used by a relatively small but energetic community of bloggers. This is a 108-blog snapshot of the final month of Powerblogs, before their shutdown.
The Archive Team Just In Time Grabs
May 11, 2011
web

eye 526

favorite 0

comment 0

Voluntary dataset on affinities of 60,000+ Reddit users, recorded in 2010. From the enclosed readme file: "I filtered the list of votes for the list of users that gave us permission to use their data. For the curious, that's 67,059 users: 62,763 with "public votes" and 6,726 with "allow my data to be used for research"...I'm trying to use it to build a recommender, and I've got some preliminary source code. I'm looking for feedback on all of these steps, since I'm not...
Topics: reddit, database, archiveteam, affinities, mysql
The Archive Team Just In Time Grabs
May 11, 2011
web

eye 286

favorite 0

comment 0

This is a panic download of the starwars.yahoo.com forums and profiles, done before the closure of same by Yahoo on December 15, 2009. This includes as many messages, profiles, and pages related to the site as could be easily brought in.
Topics: archiveteam, archive, starwars.yahoo.com, yahoo, use the force
Disk Drives: Collections of Files from the Era of the Drive
May 21, 2011 Jason Scott
web

eye 7,280

favorite 11

comment 0

In 1998, Jason Scott, a child of the quickly-fading Bulletin Board System (BBS) era in computer communication, checked the then-seemingly-infinite World Wide Web to see what it had to say and show about these single-line online services he had used since the early 1980s. Shocked to find that very little had made it online, he assembled his personal collection of textfiles, archives and memorabilia and began a site called TEXTFILES.COM, intending it to be a living museum of the early days of the...
Topics: textfiles.com, bbs, bulletin board system, g-files, philes, hacking, phreaking, anarchy, messages,...
The Archive Team Just In Time Grabs
Jun 8, 2011 Jeff Atwood, Stackoverflow.com
web

eye 899

favorite 1

comment 1

Stack Overflow / Stack Exchange Creative Commons data dump, to start of April 2011. Includes - http://stackoverflow.com - http://serverfault.com - http://superuser.com - http://meta.stackoverflow.com - http://meta.serverfault.com - http://meta.superuser.com - http://stackapps.com And any other public (non-beta) website and its corresponding meta site at http://stackexchange.com/sites The original torrent of this material was provided hosting by ClearBits.
Topics: Stackoverflow, serverfault.com, stackoverflow.com, superuser.com
The Archive Team Friendster Snapshot Collection
Jul 5, 2011 Archiveteam
web

eye 478

favorite 0

comment 0

Before its relaunch as a gaming website, Friendster was a social networking website that allowed users to connect with their friends. One element of the site was the set of groups that members could join. This dataset contains the group memberships of all Friendster groups. It is the result of an extensive crawl of Friendster.com at the end of June 2011. It was performed as part of the ArchiveTeam project to archive part of the Friendster data before the service relaunched. The data files...
Topics: Friendster, Groups, Group Lists, Membership Lists, Archive Team
The Archive Team Friendster Snapshot Collection
Jul 5, 2011 Archiveteam
web

eye 6,809

favorite 1

comment 0

Before its relaunch as a gaming website, Friendster was a social networking website that allowed users to connect with their friends. The central element of the site was the 'friends list', showing the contacts of the user. This dataset contains the connections between all Friendster users. It is the result of an extensive crawl of Friendster.com at the end of June 2011. It was performed as part of the ArchiveTeam project to archive part of the Friendster data before the service relaunched. The...
Topics: Friendster, Friends, Friend Lists, Membership Lists, Archive Team
The Archive Team Just In Time Grabs
Jul 5, 2011 Archive Team
web

eye 494

favorite 0

comment 0

In May of 2011, Salon announced the deletion of the Table Talk message base, with 30 days' notice, after 16 years of operation. With little attempt to find a new home for the site, and with little reason given, the site was ultimately deleted in June of 2011 and replaced with an article reminiscing on the history of Table Talk. Archive Team has downloaded the full public threads of Table Talk, excluding group threads that had a semi-private setting, predating most search engines.
Topics: Table Talk, Discussions, Salon, Archive Team
The Archive Team Just In Time Grabs
Jul 25, 2011
web

eye 63

favorite 0

comment 0

WGET grab of WELL.COM user websites conducted in November of 2008. Includes every externally-findable userpage and subdirectory of Google, for purposes of historical research and archiving.
The Archive Team Just In Time Grabs
Jul 28, 2011
web

eye 470

favorite 0

comment 0

A web archive of News of the World before it closed its doors. This is a copy of *most of* the www.newsoftheworld.co.uk website. I haven't checked everything, but from some quick exploring around the data myself, I only found a few missing pages. This mirror was started about 2-3 days before the site went down, so I can't be sure whether everything made it; any pages you find that 404 were probably scraped after the site went down. I was also away on holidays at the time this all happened...
Topics: News of The World, NOTW
Away From Keyboard
Jul 29, 2011
web

eye 285

favorite 1

comment 0

An archive of Len Sassaman, still filling up with items.
The Archive Team Just In Time Grabs
Aug 1, 2011
web

eye 301

favorite 0

comment 0

Billed as "Twitpic for Audio", the Twaud.io service (Twitter Audio) allowed a short-form URL to post audio snippets on Twitter. Launched in May of 2009 by Massive Robot, Twaud.io was one of a number of third-party services bringing rich content access to Twitter streams. With a limit of 10 megabytes, no limit on content or approach, and an easy to use API, Twaud.io seemed poised for some level of success. In 2011, Twaud.io announced it was shutting down, and gracefully cut off...
Topics: mp3, twaudio, twaud.io, Massive Robot, audio
The Archive Team Just In Time Grabs
Aug 10, 2011
web

eye 45,531

favorite 0

comment 0

This is a download of http://forum.nos.nl/, the online discussion forums of Dutch public broadcaster NOS. It contains messages posted in 2005, 2006 and 2007 by visitors of the NOS website. Discussion topics include news, politics and NOS programmes. Downloaded June 2011. -- The archive is available in several formats: * a copy of the HTML page of each topic (including every message posted on the forum) * an XML file for each topic, providing the messages extracted from the HTML * a wget...
Topics: forum.nos.org, NOS, Forum, Archive
The Archive Team Just In Time Grabs
web

eye 27,119

favorite 0

comment 0

This is a Heritrix crawl of http://llink.nl/, the website of Dutch public broadcasting association LLiNK, made on 23 and 24 June 2011. It includes the main website as well as the programme-specific websites of LLiNK radio and television programmes. The crawl logs and order file are available in llink-20110624-crawl-logs.tar.bz2 -- The MD5 checksums of the files are: 050b714c6df98a29bdb6c1ff077c6953 llink-20110623100606-00000.warc.gz 49de8110b71ba8da7607eafbfe80fd50...
Topics: LLiNK, Dutch, Archive, Webgrab, NPO
The Archive Team Just In Time Grabs
Aug 17, 2011
web

eye 252

favorite 1

comment 0

Billed as "Twitpic for Audio", the Twaud.io service (Twitter Audio) allowed a short-form URL to post audio snippets on Twitter. Launched in May of 2009 by Massive Robot, Twaud.io was one of a number of third-party services bringing rich content access to Twitter streams. With a limit of 10 megabytes, no limit on content or approach, and an easy to use API, Twaud.io seemed poised for some level of success. In 2011, Twaud.io announced it was shutting down, and gracefully cut off...
Topics: mp3, twaudio, twaud.io, Massive Robot, audio
The Archive Team Just In Time Grabs
web

eye 552

favorite 1

comment 0

This is a Heritrix web archive of www.jana-news.ly, the site of the official state news agency in Libya. The archive was made on August 22, 2011, when the site was still online. The last article was published on August 21, 2011. On August 25 this was still the most recent information on the site. The site has sections in Arabic, English and French. From Wikipedia's description of JANA: -- The Jamahiriya News Agency, also known as JANA, was the official state news agency in Libya. It was founded...
The Archive Team Just In Time Grabs
Sep 2, 2011
web

eye 729

favorite 2

comment 0

Thingiverse is a website dedicated to the sharing of user-created digital design files. Providing primarily open source hardware designs licensed under the GNU General Public License or Creative Commons licenses, users choose the type of user license they wish to attach to the designs they share. 3D printers, laser cutters, milling machines and many other technologies can be used to physically create the files shared by the users on Thingiverse. Thingiverse is widely used in the DIY technology...
The Archive Team Just In Time Grabs
web

eye 258

favorite 0

comment 0

This is a Heritrix web crawl of VKBlog.nl, or Volkskrantblog, the weblogging service of Dutch newspaper De Volkskrant. Subscribers of the newspaper could use the service to run their own blogs, which some 18,000 of them actually did. Started in 2005, the service was intended to increase the use of citizen journalism at De Volkskrant. In January 2011, De Volkskrant announced the closure of the service later in the year. The actual shutdown was postponed several times while De Volkskrant...
Wikileaks.org Archive
Sep 4, 2011
web

eye 2,053

favorite 1

comment 0

Topic: WikiLeaks, Julian Assange
The Archive Team Friendster Snapshot Collection
web

eye 31

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 40

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 333

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 52

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 40

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 25

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 22

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 33

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 45

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 27

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 30

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 22

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 35

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 27

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 20

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 40

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 127

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 69

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 28

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 24

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 54

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 35

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 26

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 33

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 22

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 47

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 49

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 43

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 21

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 27

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 29

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 22

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 25

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 54

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 46

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 24

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 27

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 56

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 45

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 48

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 26

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 29

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 20

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 59

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 40

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 33

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 28

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 43

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 30

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 116

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 24

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 59

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 26

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 76

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 26

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 76

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 25

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 45

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 27

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 68

favorite 0

comment 0

The Archive Team Friendster Snapshot Collection
web

eye 60

favorite 0

comment 0