Introducing the Internet Connectivity Statistics Dataverse: A High-Precision Dataset of Internet Connections for Scientific Research

Date and Time

April 18, 2019
05:00PM - 06:30PM EDT

Location

RCC Conference Room, 26 Trowbridge St., Cambridge MA

The mass adoption of the Internet has boosted the demand for scientific explanations of the effects of digitalization. What is the impact of social media on elections and polarization? What is the effect of digital technologies on economic growth, inequality or unemployment? How is public health affected by increased access to medical websites?

Official statistics typically provide country-year resolution, but researchers need more precision to account for variation within countries, such as urban versus rural areas, and for shorter-term effects such as seasonal dynamics and sudden shocks. In addition, researchers working with highly precise Internet data need to address the challenges introduced by privacy legislation.

The Internet Connectivity Statistics Dataverse is the most precise dataset of Internet connections available for scientific research, and it helps overcome both the precision and the privacy challenges. First, we analyze global Internet traffic using remote sensing to estimate connectivity by month, down to city resolution. Because we rely on direct observation of the Internet, we can also obtain estimates in areas where official statistics are not available and data cannot otherwise be retrieved, such as authoritarian regimes or territories experiencing political violence. Second, we estimate connectivity statistics using differential privacy algorithms, and we test the accuracy of our estimates. Finally, we make the statistics available to the entire research community through the Harvard University Dataverse, the most prominent research data sharing software, maintained by the Institute for Quantitative Social Science.
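
The announcement does not say which differential privacy algorithm the dataset uses. Purely as an illustration of the general idea, the following minimal sketch applies the standard Laplace mechanism to a count, assuming each connection changes the count by at most one (sensitivity 1). The function names, the city labels, and the counts are hypothetical and are not taken from the dataset.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a zero-centered Laplace distribution.

    The difference of two independent exponential draws with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, rng, sensitivity=1.0):
    """Return a noisy count using the Laplace mechanism.

    Assumption: one connection changes the count by at most `sensitivity`.
    A smaller epsilon means stronger privacy and a noisier estimate.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(2019)
    # Hypothetical monthly connection counts per city (made-up numbers).
    monthly_counts = {"city_a": 12500, "city_b": 830}
    for city, count in monthly_counts.items():
        noisy = private_count(count, epsilon=0.5, rng=rng)
        print(city, round(noisy))
```

Testing the accuracy of such estimates, as the paragraph above mentions, amounts to comparing the noisy values against the true ones: the expected absolute error of this mechanism is sensitivity / epsilon, so it does not grow with the size of the count.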

In this session we will explain the method and introduce the dataset, with a view to exploring possible collaborations with researchers interested in using our data, contributing new validations, and helping expand our spatiotemporal coverage. The utility of Internet remote sensing is illustrated by this paper on digital discrimination published in Science: http://science.sciencemag.org/content/353/6304/1151

Speakers: Mercè Crosas, Harvard University's Research Data Officer, with the Office of the Vice Provost for Research, and Chief Data Science and Technology Officer, Harvard University Institute for Quantitative Social Science; Suso Baleato, Postdoctoral Fellow, Harvard University Institute for Quantitative Social Science.

Sponsor: RCC.