The Documentalist

Conference: Crisis Mapping 2009

Posted in Conferences & Meetings, technology by Sarah on September 29, 2009

There will be a conference in Cleveland, Ohio, October 13-16, 2009, focusing on the emerging field of crisis mapping.  Conference registration is now closed, but it will be worthwhile to follow the results of the working groups hosted at the event.  See Crisis Mapping 2009: The First International Conference on Crisis Mapping (ICCM) for more information.  The conference is hosted by the Harvard Humanitarian Initiative and the Department of Political Science at John Carroll University, with sponsorship from the Open Society Institute (OSI), Humanity United (HU), and the US Institute of Peace (USIP).

Crisis Mapping

Crisis Mapping is an emerging field in humanitarian work that takes advantage of a variety of technologies and techniques to dynamically map events so that monitoring groups, governments, international bodies, or other interested parties can follow emerging crises and respond to them.  Crises can be anything from natural disasters to large-scale violence.  The idea behind crisis mapping is to allow individuals witnessing an event to post information to dynamic maps (often built on Google Maps) by uploading images or text reports from a mobile device.  These images or text files are linked to a particular geographic location on an interactive map, where users can click on marked locations and view the posted information.  As individuals submit information to a map, crisis patterns emerge, allowing for better intervention strategies.  Mapping also records key data for tracing the emergence and movement of human rights events anywhere in the world.
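The basic unit of a crisis map is simple: a report tied to a coordinate and a time.  As a rough illustration (the field names are mine, not any particular platform's), here is how such a witness report might be packaged as a GeoJSON Feature, a structure most interactive mapping tools can plot as a clickable marker:

```python
def to_feature(lat: float, lon: float, report: str, timestamp: str) -> dict:
    """Package a witness report as a GeoJSON Feature.

    Note that GeoJSON orders coordinates [longitude, latitude].  The
    "properties" dict is free-form; a real platform would likely add
    media links, category tags, and verification status.
    """
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"report": report, "time": timestamp},
    }

# One hypothetical report posted from a mobile device:
feature = to_feature(
    41.4993, -81.6944,
    "Crowd gathering at main square", "2009-09-29T14:00:00Z",
)
```

A collection of such features is exactly what lets patterns emerge on a map: each submission adds one more clickable point.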


For detailed information on crisis mapping, visit iRevolution, a blog maintained by Patrick Meier, a Ph.D. candidate at the Fletcher School of Law and Diplomacy at Tufts University.  Mr. Meier maintains the blog as part of his dissertation research, which:

… analyzes the impact of the information revolution on repressive rule and social resistance. I am particularly interested in how repressive regimes and resistance groups use information communication technologies to further their own strategic and tactical goals. To this end, I provide research guidance to DigiActive, a non-profit initiative dedicated to digital activism, and serve on the Board of Advisers for Digital Democracy.

The blog focuses on how humanitarian efforts and social resistance groups take advantage of a variety of technologies and methods for sharing information to counter social injustice, human rights abuses, and repression.  Mr. Meier profiles various mapping and information programs throughout his blog posts.  He has also produced a 38 minute video Introduction to Crisis Mapping that provides a good overview of different interactive mapping techniques and technologies.


Masking on-line activity to protect human rights workers

Posted in Reviews, technology by Sarah on September 24, 2009

One of our goals with this blog is to provide information about various services and technologies we encounter that might interest human rights activists, scholars, or archivists.   If you have suggestions for products or services you think  we should review, please let me know!


Tor Anonymity Online

Interlocking layers of information.


As the world increasingly shifts to the Web for communicating, there has been an equal increase in efforts by repressive governments, corporations, and agencies to read over people’s shoulders (so to speak) by monitoring the flow of communication and information across the Web.  This is worrisome in general, but it is particularly problematic for human rights work.  Activists, whistle-blowers, witnesses, and victims want to share what they know, but they have to be careful about using the Web: repressive regimes have gotten quite good at intercepting emails, blogs, and other electronic forms of communication and using them to identify the physical locations and identities of senders and recipients.  They then use that information to detain people, or even as an excuse to torture or execute them.  Recent events in Iran and China are perfect examples of how governments monitor and censor internet communications with the intent to quash popular movements by committing violence against those who try to stand up to repressive policies and practices.

One popular resource for circumventing Web censorship is a free program called Tor, an acronym for “the onion router.”   Onion routing differs from traditional communication on the Web in that it sends encrypted messages along a roundabout path rather than zipping them straight from point A to point B.  Typically, when a user sends a message through an internet browser or email program, it moves directly from sender to recipient across a single provider’s servers (e.g., Yahoo!, Gmail, Hotmail).   This may be an incredibly efficient means of moving information, but a lot of identifying information is visible to third parties that lurk on these servers: your IP address provides precise information about where you are physically located when you send a given message, and the delivery header attached to a message to make sure it reaches the recipient contains information about the content of the message itself that is easily read by outside parties.  (This, by the way, is what allows Gmail to customize advertisements within their email program: they can scan the content of headers for messages you send and receive, looking for key words that trigger classes of ads.)

An onion router is a program that masks your message from outside view by hiding it in a layered data bundle called an “onion.”  The onion shuttles the hidden message through a randomly selected series of proxy servers, making it much more difficult for anyone monitoring net activity to identify the sender, the receiver, or the content of the message.   There are several programs that provide this service for a fee, but in Tor’s case the service is free because the proxy network consists of members who voluntarily make their PCs available as network nodes, or “onion routers.”  Information about the Tor Project (as the program is formally called) is available at the project’s Web site, where you can download free software that allows you to send and receive messages anonymously.  There are no limitations on who can use the service, and users aren’t required to volunteer their own PCs to the network, but they are strongly encouraged to.

What is Onion Routing?

Imagine standing in a large, crowded room and you are handed a brown paper cylinder with your name on it.  The person who hands it to you tells you to peel the paper with your name on it off of the cylinder to expose a new layer with a new name on it–your task is to deliver the cylinder to the person named, tell her to peel that layer of paper off and pass it on to the next person named and tell him to do the same.  This goes on until the very center of the cylinder is handed off to the person to whom it is addressed.  The idea is that the center of the cylinder contains a message sent to the final recipient by the very first person to hand the cylinder off.  But, because the cylinder traveled through so many hands, and along a random path through the crowd, anyone observing the receipt of the final message (or any of the hand-offs at any point along the way, for that matter) has no idea where it came from originally; he or she only saw the final hand-off in a relay of hand-offs.

Onion routing diagram

The situation described above is a good analogy for onion routing.  When a user logs into Tor using the modified version of the Firefox Web browser provided at the Tor Web page, the program automatically scans the network of member PCs to identify which ones are available for data receipt and transfer.  The program then writes a series of layered encryption codes that will route the sender’s message through a randomly selected subset of those PCs.  When each PC in the selected series receives the bundle, it reads the layer of code addressed to it, which instructs the PC to re-encrypt the message bundle and send it to the next PC in the series; that PC, in turn, reads and executes its own layer of coded instructions, and so on until the message reaches its destination.  Third parties observing the net as the message moves through this roundabout path can see that information is moving, but they only see the activity between one PC and the next; they don’t see the complete path and so cannot easily identify the origin and destination of the message.  Furthermore, because the message is wrapped in layers of encryption, it cannot be read while in transit.  And, because the message bundle is re-encrypted at every step of the way, an observer will see one unreadable message arriving at a PC and what looks like a different unreadable message going out to the next PC in the chain.  Thus, Tor offers an effective alternative to such visibility by hiding this information; the trade-off is that delivery is slow and cumbersome.  (For more detailed information about how this process functions, see the onion routing article at Wikipedia or the overview page at Tor.)
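The wrap-and-peel mechanics can be illustrated with a toy script.  To be clear, the XOR “cipher” below is not real cryptography, and Tor’s actual protocol is far more involved; the sketch only shows the layering idea, with one key per relay:

```python
import os
from itertools import cycle


def xor_layer(data: bytes, key: bytes) -> bytes:
    # Toy symmetric "cipher": XOR against a repeating key.  Applying it
    # twice with the same key recovers the original bytes.  Stand-in
    # only -- real onion routing uses proper encryption.
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def build_onion(message: bytes, relay_keys: list) -> bytes:
    # The sender wraps the message in one layer per relay, innermost
    # layer (the last relay's) first -- like the layers of paper on the
    # cylinder in the analogy above.
    onion = message
    for key in reversed(relay_keys):
        onion = xor_layer(onion, key)
    return onion


# Three relays, each holding its own random key.
keys = [os.urandom(16) for _ in range(3)]
onion = build_onion(b"meet at dawn", keys)

# Each relay in turn peels exactly one layer and forwards the rest;
# after the last peel, the final recipient holds the plaintext.
for key in keys:
    onion = xor_layer(onion, key)
```

No relay ever sees both the original sender and the plaintext destination, which is the property the paper-cylinder analogy is driving at.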

Web Ecology Announces Free Translation Tool

Posted in technology, Twitter by Sarah on September 20, 2009

The Rosetta Stone.


The Web Ecology project (covered in the post from September 4, 2009) announced the release of their first open source resource tool on Friday September 18, 2009.  The tool works with Google’s language tools to detect, translate, and transliterate print language on the Web.  In the words of Jon Beilin, the author of the announcement:

One of the tenets of Web Ecology is accessibility to the field through open tools and open data. At the Web Ecology Project, we’re working to get more of our code in a clean, commented, and releasable state. The first tool that we have queued up for release is a Python module allowing easy use of Google Language Tools, involving language detection and translation, with transliteration in an experimental state (Google has not yet released the API spec for the transliteration portion so that was reverse-engineered).

Please visit the full post describing the Google Language Python Module to see an example of how the code works for translating print material on the Web, and to download the program, which is an MIT/X11-licensed release.  Web Ecology plans to continue developing and releasing Web research tools, so keep an eye on the site to learn more as developments emerge.
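I have not seen the module’s source, but a thin wrapper around the 2009-era Google AJAX Language API would look roughly like the sketch below.  The endpoint is the one Google publicly documented at the time; the function names and error handling are my own assumptions:

```python
import json
import urllib.parse

# Public endpoint of Google's AJAX Language API, circa 2009 (since retired).
BASE = "http://ajax.googleapis.com/ajax/services/language/translate"


def build_translate_url(text: str, source: str, target: str) -> str:
    """Construct the GET request URL for one translation query."""
    params = {"v": "1.0", "q": text, "langpair": f"{source}|{target}"}
    return BASE + "?" + urllib.parse.urlencode(params)


def parse_response(raw: str) -> str:
    """Pull the translated string out of the API's JSON envelope."""
    body = json.loads(raw)
    if body.get("responseStatus") != 200:
        raise RuntimeError(body.get("responseDetails", "translation failed"))
    return body["responseData"]["translatedText"]


url = build_translate_url("hola", "es", "en")
```

Splitting URL construction from response parsing keeps the network call itself (via urllib or any HTTP client) trivial to swap out or test with canned data.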

Book Review: Video For Change

Posted in Reviews, technology by Sarah on September 18, 2009
Book image courtesy of


Title: Video for Change: A Guide for Advocacy and Activism

Editors: Sam Gregory, Gillian Caldwell, Ronit Avni, and Thomas Harding (with a foreword by Peter Gabriel)

Publication information: Ann Arbor, Michigan: Pluto Press. © 2005.

To Purchase: See The WITNESS Store


Video for Change is produced by the human rights organization WITNESS (their motto is “See it. Film it. Change it.”) and consists of seven practical chapters on how everyday people can easily and professionally incorporate video cameras into activism.  The basic message?  You don’t have to be a professional filmmaker to make effective films with professional polish.  The book’s goal is to get cameras into the hands of advocates and get them recording as quickly as possible so that important human rights material gets recorded, disseminated, and saved.

Topics covered include:

  • The power of video in advocacy and how to plan for its most effective use.
  • Safety issues, ranging from protecting your own safety as an advocate to protecting the safety of those you record, as well as issues related to informed consent and the legal use of videos once they are created.
  • Strategies for using video as a storytelling medium, including advice on structuring a final video to make the most impact within a well-planned narrative.
  • Straightforward technical advice on equipment and video techniques that will allow everyday people with no formal training in film-making to quickly begin using video in their field work.
  • Editing advice aimed at helping advocates keep their target audiences in mind as they compile their final documentary product.
  • Video as legal evidence: advice on how to ensure that the videos advocates produce can be admitted in a court of law, including guidelines for collecting appropriate metadata and provenance.

Each chapter draws on case studies to illustrate the efficacy of video in human rights work and provides diagrams, photographs, charts, and other visual aids to present straightforward steps for moving through the entire process of video advocacy: from recording the raw footage to producing and disseminating the final edited product.  At the end of the book, there are a number of appendices providing model recording and production plans, templates for consent and release forms, and production checklists.

The Human Rights Electronic Evidence Study at CRL

Posted in Reports by Sarah on September 16, 2009

How do we create these things?

What do we do with these things?

CRL is currently engaged in a study of how NGOs  think about and engage with digital documentation as a form of evidence for human rights activities (whether activism, scholarship, or legal action).  We’re interested in learning about the challenges human rights professionals encounter when creating and preserving electronic and digital documents (videos, emails, etc.) that could serve as evidence in a variety of human rights contexts; institutions’ goals for maintaining and using digital documents to support sustained human rights work; and strategies that work well for gathering and preserving them to meet long-term goals.  The goal in all of this is to devise strategies and practices for mobilizing human rights documentation and evidence as a vital resource for sustained activism, scholarship, and policy-making.

If–after reading the overview below–you’d like to talk to me about anything related to our study, please let me know by leaving a comment on this post and I will get right back to you–I’ll be able to see your email address, but the rest of the world will not.



Overview of the Human Rights Electronic Evidence Study


Social and digital media (e.g., photos, blogs, videos, and Twitter tweets) continue to gain mainstream recognition as powerful tools for creating awareness of human rights abuses; as a result, digital materials circulated via the World Wide Web constitute a potential treasure trove of primary source materials for activists, scholars, and policy-makers seeking to effect change. Unfortunately, maintaining these resources for long-term human rights work vexes human rights field professionals on several levels, creating frustration and disorganization in collection efforts.

In order to help human rights field workers, scholars, archivists, and legal practitioners meet the many challenges related to preserving digital documentation of human rights work, the Center for Research Libraries-Global Resources Network (CRL-GRN) is engaged in an 18-month “Human Rights Electronic Evidence Study” (funded by The John D. and Catherine T. MacArthur Foundation). The goals are to understand the processes and related challenges of collecting, utilizing, and maintaining documentation as a human rights resource and to develop strategies for addressing these challenges.  Doing so will support continued work in human rights activism, policy-making, scholarship, and legal action.

Challenges related to digital documentation

Volume and rapid distribution of electronic documents:

Thanks to increased access to handheld digital devices and the internet, digital documentation is cheap, easy to produce, and quickly disseminated.  This creates a challenge of volume, as images, texts, and videos of human rights events flood the World Wide Web.  Relatedly, organizations are increasingly creating internal, operational documents via electronic means, resulting in a larger volume of production than paper documentation previously allowed.  In both cases, the challenge is to keep pace with the rapid generation of new digital items as we preserve relevant materials for supporting continued work in human rights advocacy.

Ephemeral nature of electronic documents & rapidly changing technology:

Digital documents are ephemeral: they lack the tangibility of handwritten notes, printed photos, or typed reports that encourages people to save them.  And because computer storage is a limited resource for non-profit organizations, such documents (e.g., email or internal memoranda) are frequently deleted with no hard-copy back-up.  To further complicate the issue, all forms of digital documents are created in a context of constantly changing technological media, making it difficult to maintain access to documents that do get saved.  In each case, the result is the loss of valuable material for sustained activism, policy-making, and scholarship.

Recording provenance, contextual information & metadata:

Because there is no uniform and easy means for doing so, field workers often do not collect provenance (chain-of-custody), contextual information, and metadata associated with the digital documents they create—information that is necessary if documents are to continue to serve scholarship, policy making, or legal work.  Relatedly, because there are no guidelines for sorting and storing documents, many organizations save or delete internal documentation in a piecemeal fashion.  Both scenarios result in incomplete and disorganized records.
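What a “uniform and easy means” for capturing provenance might look like is precisely the open question the study addresses; purely as an illustration, a minimal record could fingerprint each file and log every hand-off.  All field names below are hypothetical, not a proposed standard:

```python
import hashlib


def describe_document(content: bytes, creator: str, created: str,
                      context: str) -> dict:
    # Fingerprint the file so later copies can be verified against the
    # original, and record who made it, when, and in what context.
    return {
        "sha256": hashlib.sha256(content).hexdigest(),
        "creator": creator,
        "created": created,
        "context": context,
        "custody": [],  # chain of custody: one entry per hand-off
    }


def record_transfer(record: dict, holder: str, date: str) -> dict:
    # Append one custody entry each time the document changes hands.
    record["custody"].append({"holder": holder, "date": date})
    return record


record = describe_document(
    b"<video bytes>", "field worker A", "2009-09-16", "protest, city center")
record_transfer(record, "NGO head office", "2009-09-20")
record_transfer(record, "archive", "2009-10-01")
```

Even a record this thin would let a downstream archivist or court verify that the bytes are unaltered and reconstruct who held the file when.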

Strategies for addressing challenges

Goals for the study:

Given the challenges described above, there is a clear need to establish simple mechanisms for documenting the provenance, context, and metadata for digital documents and for organizing these materials so that they continue to serve the needs of human rights activists, scholars, and policy-makers well into the future.  The current study begins to address this need by focusing on four tasks:

1) Assess the practices and technologies used by local and regional monitoring groups and activists to create and collect documentation in electronic media of human rights abuses and violations;

2) Determine how adequately these practices and technologies support advocacy, investigations, reporting, and legal proceedings on a local and international basis, and serve “downstream” uses by researchers and archivists;

3) Identify practical measures, tools, and standards that can improve practice and ensure greater integrity and durability of the electronic evidence;

4) Make available on-line training and guidance to help local groups and organizations collect and manage such evidence more effectively.

Products of the study:

In order to accomplish these goals, CRL-GRN is currently interviewing and surveying human rights field workers, organizational leaders, and archivists concerning the extent of their interaction with, concerns about, and desires for digital documentation.  The information from these interviews and surveys will be used to create the following products:

1) A report of shared challenges related to producing, collecting, and preserving electronic or digital documentation for sustained activism against human rights abuses and violations;

2) A report of existing “best practices” for handling such documentation, providing guidelines for groups shifting their documentation practices to a digital format;

3) An on-line “Human Rights Resources Network,” supported by CRL-GRN, which will allow human rights field workers, administrators, scholars, and archivists to share information, collect and access resources, and collaborate on protocols for handling digital documentation.

Ultimately, this work will establish key infrastructure for accumulating and preserving information that will serve sustained scholarly, political, and legal work in human rights, thus increasing our understanding of—and engagement with—human rights abuses and activism around the world.

New Twitter Terms Potentially Impact Archiving

Posted in Archiving Solutions, Twitter by Sarah on September 15, 2009
The confusing world of copyright.  Image courtesy of vtualerts.


Twitter Announces New Copyright Terms

On September 10, 2009, Twitter announced that they have updated their copyright terms for user posts (see their blog for an overview).  Previously, Twitter simply assured users that their posts were their own, but encouraged people to consider their material part of the public domain.  The new terms specify that, while tweets remain the property of the users who post them, users, by virtue of agreeing to Twitter’s terms and conditions, automatically grant Twitter the right to worldwide distribution of tweets, as well as the right to distribute tweets to outside organizations for purposes of media coverage or research.

I’ve included Twitter’s original terms of use and their new terms below so that you can compare and contrast.  It appears that the new terms may allow for straightforward harvesting and archiving of Twitter tweets.

Twitter’s original copyright statement:

Copyright (What’s Yours is Yours)

1. We claim no intellectual property rights over the material you provide to the Twitter service. Your profile and materials uploaded remain yours.  You can remove your profile at any time by deleting your account.  This will also remove any text and images you have stored in the system.

2. We encourage users to contribute their creations to the public domain or consider progressive licensing terms

3. Twitter undertakes to obey all relevant copyright laws.  We will review all claims of copyright infringement received and remove content deemed to have been posted or distributed in violation of any such laws.

(Source: accessed 8/28/2009 at 1:00 pm.  N.B.–clicking on the link to the left will take you to the new terms and conditions of use)

Twitter’s new copyright statement:

Your rights

You retain your rights to any Content you submit, post or display on or through the Services. By submitting, posting or displaying Content on or through the Services, you grant us a worldwide, non-exclusive, royalty-free license (with the right to sublicense) to use, copy, reproduce, process, adapt, modify, publish, transmit, display and distribute such Content in any and all media or distribution methods (now known or later developed).

TIP: This license is you authorizing us to make your Tweets available to the rest of the world and to let others do the same. But what’s yours is yours – you own your content.

You agree that this license includes the right for Twitter to make such Content available to other companies, organizations or individuals who partner with Twitter for the syndication, broadcast, distribution or publication of such Content on other media and services, subject to our terms and conditions for such Content use.

TIP: Twitter has an evolving set of rules for how API developers can interact with your content. These rules exist to enable an open ecosystem with your rights in mind.

Such additional uses by Twitter, or other companies, organizations or individuals who partner with Twitter, may be made with no compensation paid to you with respect to the Content that you submit, post, transmit or otherwise make available through the Services.

We may modify or adapt your Content in order to transmit, display or distribute it over computer networks and in various media and/or make changes to your Content as are necessary to conform and adapt that Content to any requirements or limitations of any networks, devices, services or media.

You are responsible for your use of the Services, for any Content you provide, and for any consequences thereof, including the use of your Content by other users and our third party partners. You understand that your Content may be rebroadcasted by our partners and if you do not have the right to submit Content for such use, it may subject you to liability. Twitter will not be responsible or liable for any use of your Content by Twitter in accordance with these Terms. You represent and warrant that you have all the rights, power and authority necessary to grant the rights granted herein to any Content that you submit.

Twitter gives you a personal, worldwide, royalty-free, non-assignable and non-exclusive license to use the software that is provided to you by Twitter as part of the Services. This license is for the sole purpose of enabling you to use and enjoy the benefit of the Services as provided by Twitter, in the manner permitted by these Terms.

Twitter Rights

All right, title, and interest in and to the Services (excluding Content provided by users) are and will remain the exclusive property of Twitter and its licensors. The Services are protected by copyright, trademark, and other laws of both the United States and foreign countries. Nothing in the Terms gives you a right to use the Twitter name or any of the Twitter trademarks, logos, domain names, and other distinctive brand features. Any feedback, comments, or suggestions you may provide regarding Twitter, or the Services is entirely voluntary and we will be free to use such feedback, comments or suggestions as we see fit and without any obligation to you.

(Source: accessed 9/15/2009 at 10:45 am)

Breaking Tweets: Twitter Tweets Informing Human Rights News

Posted in Human Rights news, Twitter by Sarah on September 10, 2009
"A little bird told me" A weekly op-ed column at Breaking Tweets.  Image courtesy of

"A little bird told me." Image courtesy of

Here’s a fun site that illustrates how Twitter tweets can make informative news pieces related to human rights.  Breaking Tweets, in its own words, produces “world news, Twitter-style” by creating journalistic news articles based on first-person information posted on Twitter.  Tweets are pulled together into coherent stories; these stories, along with links to the tweets that inform them, are archived at the website.  Many of the news items produced by Breaking Tweets focus on issues directly related to human rights, social justice, or environmental justice.  The site also allows you to view stories by region, grouping them by major geographic area: Africa, Americas, Asia, Europe, Mideast, and Oceania.

Harvesting and Preserving Twitter Tweets: A Model from the Web Ecology Project

Posted in Archiving Solutions, Twitter by Sarah on September 4, 2009

How do you capture that? Image courtesy of tweetwheel

Every day, users of the social media platform Twitter send out streams of “tweets” (short text messages of 140 characters or fewer) to communicate about events, share photos, and link readers, or “followers,” to other on-line sources of information.  Thus, when users tweet about human rights events or issues, Twitter becomes a powerful tool for human rights work, both for mobilizing action and documenting events.  In the case of human rights, a portion of tweets become first-person records of key events and therefore constitute a valuable potential resource for human rights scholarship, activism, and legal action.  However, collecting and archiving those tweets for such work can be challenging due to the volume of tweets produced and their fleeting nature.  Fortunately, the Web Ecology Project (WEP—an overview of the organization can be found at the end of this report) has devised a workable solution for harvesting Twitter tweets.[1] By using readily available server technologies, working with Twitter’s established access and data sharing policies, and drawing on the skills of trained programmers, the research team at the WEP collects, stores, and archives massive numbers of Twitter tweets.[2] Their tweet-harvesting set-up is straightforward and can potentially be implemented by any organization wishing to gather similar materials from Twitter, as long as it has access to a programmer who can help manage the process.

The first step to collecting and archiving Twitter tweets is gaining access to Twitter’s Application Programming Interface (API), which WEP accomplished by following a standard application process established by Twitter for permitting access to their data.[3] An API serves as a common access point that allows various programs and platforms to “talk” to each other through shared variables, even if they do not share the same programming language.  Basically, the API allows programmers to build applications that share information between platforms (for example, the ability to post Twitter tweets via Facebook or Facebook updates via Twitter).

With API access secured, the next step is to capture and download data from Twitter’s database.  The WEP’s programmers accomplish this by writing code that requests data from Twitter’s servers via the API.  The code instructs Twitter’s server to harvest data that meet specific search criteria contained in the code request—typically key words or phrases that appear in tweets about the event or topic of interest.  For example, if a researcher wished to collect tweets related to the 2009 Iranian presidential election, she would submit search terms such as #iranelection, Neda, Ahmadinejad, et cetera.  When Twitter’s data server receives the code command, it pulls all tweets containing any of the requested terms, bundles them as a data packet, and sends the packet back to the WEP’s server.
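As a concrete sketch of that request step: the 2009-era Twitter Search API accepted plain GET queries.  The exact parameters WEP used are not described in this report, so treat the following as an illustration only:

```python
import urllib.parse

# Public search endpoint of the 2009-era Twitter Search API.
SEARCH_BASE = "http://search.twitter.com/search.json"


def build_search_url(terms: list, per_page: int = 100) -> str:
    """OR the search terms together, as in the #iranelection example.

    A tweet matching any one of the terms will be returned; `rpp`
    (results per page) caps how many come back in one response.
    """
    query = " OR ".join(terms)
    params = {"q": query, "rpp": per_page}
    return SEARCH_BASE + "?" + urllib.parse.urlencode(params)


url = build_search_url(["#iranelection", "Neda", "Ahmadinejad"])
```

Sending that URL with any HTTP client returns the matching tweets as a JSON data packet, which is the bundle the paragraph above describes.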

Once the data arrive at the WEP’s server, the tweets are dumped into a massive database program as individual text files accompanied by relevant metadata: the time and date the tweet was created, the Twitter user name, and the location (if available). The database is essentially a meta-form of an Excel spreadsheet organized in rows and columns; it is the sort of thing that any trained server programmer can create when establishing a server’s architecture.  Once the tweets are grouped and stored in this database, they are searchable and sortable, so both qualitative and quantitative analyses can be run on them.  And, most importantly, the database is easily archived and shared, because a database of this sort is a fundamental type of programming that does not change much over time, meaning that the content will remain readable down the line.
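As an illustration of the kind of row-and-column store described above (the schema is mine, not WEP’s), a few lines of SQLite suffice to make harvested tweets searchable and sortable:

```python
import sqlite3


def make_store(path: str = ":memory:") -> sqlite3.Connection:
    # One row per tweet, with the metadata fields the report lists.
    conn = sqlite3.connect(path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS tweets (
            id       INTEGER PRIMARY KEY,   -- Twitter's tweet id
            user     TEXT NOT NULL,
            created  TEXT NOT NULL,         -- time and date of the tweet
            location TEXT,                  -- NULL when unavailable
            text     TEXT NOT NULL
        )""")
    return conn


def save_tweets(conn: sqlite3.Connection, rows) -> None:
    # INSERT OR IGNORE keeps re-harvested duplicates out of the table.
    conn.executemany(
        "INSERT OR IGNORE INTO tweets (id, user, created, location, text) "
        "VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


conn = make_store()
save_tweets(conn, [
    (1, "user_a", "2009-06-20T10:00:00", "Tehran",
     "Where is my vote? #iranelection"),
    (2, "user_b", "2009-06-20T10:05:00", None,
     "Reports of crowds downtown #iranelection"),
])
```

Plain SQL queries then support both the qualitative and quantitative analyses mentioned above, and the file itself is a stable, widely readable archive format.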

Though the request and delivery process that the Web Ecology Project has established is rapid and efficient, a couple of important limitations affect it. First, once a code request is sent, harvest and delivery of data are automatic; the request process itself, however, is not.  Code must be hand-written and manually sent, which can complicate archiving tweets for the duration of an important event.  Typically, Twitter users responding to events send out tweets for a few days, which means that data need to be downloaded for the duration of the event in order to capture as much relevant material as possible.  Since the WEP programmers have not yet devised a means of sending automated requests to Twitter, they have to manually re-send requests for a particular set of terms at regular intervals over the course of several days as they follow a trending topic on Twitter. Second, although Twitter shares their data freely, they stipulate one limitation on harvesting: only data up to five days old may be collected in response to a code request (though Twitter does maintain a database of all tweets posted since the service came online in 2006).  These limitations should not hinder harvesting, however, if a researcher or archivist is diligent and begins requesting data shortly after an event begins to trend on Twitter and then regularly re-sends the request until the event dies down.
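While the request step is manual on Twitter’s side, the re-sending cycle can at least be scripted by the collector.  A sketch of such a polling loop follows; the `fetch` callable stands in for whatever code actually sends the API request, and the id-based de-duplication is my assumption, not a documented part of WEP’s process:

```python
import time


def poll(fetch, interval_seconds: float, rounds: int,
         since_id: int = 0) -> list:
    """Re-send the same query `rounds` times, `interval_seconds` apart.

    `fetch(since_id)` should return a list of tweet dicts newer than
    `since_id`; tracking the newest id seen avoids re-harvesting tweets
    already collected in an earlier round.
    """
    harvested = []
    for _ in range(rounds):
        batch = fetch(since_id)
        if batch:
            since_id = max(t["id"] for t in batch)
            harvested.extend(batch)
        time.sleep(interval_seconds)
    return harvested
```

Run every few hours for the life of a trending topic, a loop like this stays comfortably inside the five-day harvesting window noted above.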

These exceptions aside, the process described above provides a model for one means of establishing and maintaining archives of fleeting, first-person, digital documentation of key events produced through social media platforms.  Though the Web Ecology Project team established the process explained above for collecting and archiving Twitter data, other social media platforms, such as Facebook, MySpace, or LinkedIn, also use APIs to integrate their functions with other social networking platforms in order to meet users’ desires to work seamlessly between their various social presences on the web.  Therefore, the process described for the Web Ecology Project’s Twitter research would also apply to collecting and archiving digital documentation from a variety of social media sources.

Notes on the Web Ecology Project

The Web Ecology Project is an unaffiliated research group made up of independent researchers in the Boston area who are interested in the social processes of the internet and social media.  Members pool their expertise and resources to conduct research into social media trends.  To date, the group has received no outside funds (public or private); members of the WEP have therefore pooled their own resources to purchase infrastructure such as servers, and all of the work they do is voluntary.  That said, their business model is shifting and evolving as interest in their work grows, and they are contemplating a means of doing contractual research.  With regard to the data they collect and archive, the WEP researchers make them available to interested parties when and where appropriate, within the limits of legal restrictions involving Twitter and Twitter users.  If you are interested in learning more about data availability, contact the group directly.  Dataset availability depends on WEP research: they can only share data that they originally collected for their own research interests.  At the time this report was written, researchers at the WEP stated that they plan to store all of the databases and archives they create indefinitely as a resource for future investigators.  For more information on the goals and objectives of the Web Ecology Project, see their mission statement.

[1] Special thanks goes to Dharmishta Rood of the Web Ecology Project for explaining the data harvesting and archiving process described in this report.

[2] Copyright on all tweets belongs to Twitter users; however, Twitter encourages users to contribute their tweets to the public domain (see Twitter’s terms of service for details on copyright).  Tweets submitted as such fall under fair-use rules for copyright.

[3] See Twitter’s site for details of the API application process.