[CITASA] Call for Participation in WebSci'16 Hackathon [Exploring the Past of the Web: Alexandria & Archive-It Hackathon]

Ujwal Gadiraju gadiraju at l3s.de
Tue Mar 8 05:11:45 EST 2016



Exploring the Past of the Web: Alexandria & Archive-It Hackathon
Web Science 2016 Hackathon: http://www.websci16.org/hackathon

Hackathon Chairs

Avishek Anand, L3S Research Center, Germany
Jefferson Bailey, Internet Archive, USA

The Web has pervaded all walks of life and has become an important
corpus for the humanities, the social sciences, computer science, and
other disciplines. Web archives collect, preserve, and provide ongoing
access to ephemeral Web pages and hence encode traces of human thought,
activity, and history. This makes them a valuable resource for analysis
and study. However, there have been only a few concerted efforts to
bring together tools, platforms, storage, processing frameworks, and
existing collections for mining and analysing Web archives.

We present the Alexandria & Archive-It Hackathon @ WebSci’16 as a forum
for scientists, engineers, practitioners, and enthusiasts to work with
Web archive collections at scale, and to use and help build tools that
realize the largely untapped potential of Web archives in their
research and work. The goal of the Hackathon is to bring together a
small and focused group of participants to work collaboratively with
Web archive collections using open-source tools and platforms and to
discuss new ideas for exploring and analyzing these collections.

We will provide access to focused, subject-specific Web archive
collections from a diverse set of institutions and topics. The data
consists of collections from Archive-It, Internet Archive’s web
archiving service, and is housed on a commercial data cluster (provided
generously by www.altiscale.com) for processing and analysis. The
collections can also be browsed on the Web through their collection
pages at https://archive-it.org/. The topics range from web pages
collected around events (like the U.S. Occupy Movement) and interest
groups (politics, art, et cetera) to home pages (museums, universities)
and more. All collections were archived over a notable period of time
and can support multiple analytical approaches and tools.

A range of collections will be available for use in the hackathon. Some 
examples of the types of collections to be included:

[1]. Human Rights web archive collected by Columbia University: 
[2]. Occupy Movement 2011/2012, collected by Internet Archive: 
[3]. Auction Houses web archive, collected by New York Art Resources
Consortium: https://www.archive-it.org/collections/2135
[4]. Contemporary Women Artists on the Web, collected by National Museum 
of Women in the Arts: https://archive-it.org/collections/2973

To lower the entry barrier for accessing and analysing this data, we
will hold a short hands-on session on Day 1 using existing open-source
tools, and will provide some coaching during the Hackathon to groups
not yet fully fluent in working with large data clusters. We
want to ensure that participation will be truly cross-disciplinary with 
the hope of fostering cross-fertilization of ideas from users and 
researchers from multiple disciplines, including social and political 
sciences, the humanities, and computer science. We will end the 
Hackathon on Day 4 with presentations of team accomplishments as well as 
discussions and exchange of ideas for future projects and 

The Hackathon will run in parallel to the WebSci’16 conference, to allow 
participants to register and attend the conference, and will finish one 
day after the conference. Participants will receive promotional 
materials from the event hosts and Internet Archive and Archive-It. The 
research team with the most accomplished plan, project, or future work 
will receive a complimentary Archive-It account that can be used to 
build their own web archive collection for use in their own future 
research. Alexandria and Archive-It also plan on convening additional 
hackathons and web archive data mining challenges in conjunction with 
future conferences and events.


The registration for the Hackathon is free for WebSci'16 participants, 
however we waive off the charges for participating only in the 

To register for "Hackathon only": if you only want to attend the
hackathon, register at http://websci16.org/registration by first
selecting "Dinner only" and then, on the next page, selecting
"Hackathon only" below your personal details.

If you have any questions, feel free to contact us at
websci-hackathon at l3s.de.

Best Regards,
Ujwal Gadiraju

Ujwal Gadiraju
L3S Research Center
Leibniz Universität Hannover
30167 Hannover, Germany

Phone: +49. 511. 762-5772
Fax: +49. 511. 762-19712
E-Mail: gadiraju at l3s.de
Web: www.l3s.de/~gadiraju/
