Applying Data Warehousing and Big Data Techniques to Analyze Internet Performance. Barbosa, T. M. S., Souza, R., Cruz, S. M. S., Campos, M. L. M., & Cottrell, L. In Proceedings of the 4th International Conference on Internet Applications, Protocols and Services.
Measuring the quality of the Internet is essential to evaluate the performance of data links around the world and to keep track of how countries have improved their connections over the years. Moreover, Internet performance measurements help in understanding network bottlenecks, support troubleshooting, and even offer insights into the impact of major events such as tsunamis, fiber cuts, or social upheavals. For this reason, since 1998, the PingER (Ping End-to-end Reporting) initiative at SLAC National Accelerator Laboratory has monitored the end-to-end performance of Internet links spread over 160 countries, providing a worldwide history of Internet performance. Data containing network measurements are collected daily from PingER Measurement Agents (MAs) and stored in flat files. As a result, PingER maintains a valuable fine-grained big dataset consisting of Internet performance data from around the world. However, due to the large amounts of data, performing sophisticated joint analyses on those files can be so difficult that it becomes infeasible in some scenarios. In this paper, we apply data warehousing techniques to transform the data in those flat files into structured data, using a data model that facilitates complex analyses. We load the transformed data into a big distributed data warehouse that is able to perform complex analytical queries on large volumes of data in seconds. Finally, we show some data analyses correlating Internet performance data with hypothetical real-world scenarios.
