The DARPA SEARCHLIGHT Dataset of Application Network Traffic. Ardi, C., Aubry, C., Kocoloski, B., DeAngelis, D., Hussain, A., Troglia, M., & Schwab, S. In Proceedings of the 15th Workshop on Cyber Security Experimentation and Test (CSET '22), pages 59–64, New York, NY, USA, 2022. Association for Computing Machinery.
Researchers are in constant need of reliable data to develop and evaluate AI/ML methods for networks and cybersecurity. While Internet measurements can provide realistic data, such datasets lack ground truth about application flows. We present a ∼750 GB dataset that includes ∼2000 systematically conducted experiments and the resulting packet captures with video streaming, video teleconferencing, and cloud-based document editing applications. This curated and labeled dataset has bidirectional and encrypted traffic with complete ground truth that can be widely used for assessments and evaluation of AI/ML algorithms.
@inproceedings{10.1145/3546096.3546103,
author = {Ardi, Calvin and Aubry, Connor and Kocoloski, Brian and DeAngelis, Dave and Hussain, Alefiya and Troglia, Matt and Schwab, Stephen},
title = {The DARPA SEARCHLIGHT Dataset of Application Network Traffic},
year = {2022},
isbn = {9781450396844},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3546096.3546103},
doi = {10.1145/3546096.3546103},
abstract = {Researchers are in constant need of reliable data to develop and evaluate AI/ML methods for networks and cybersecurity. While Internet measurements can provide realistic data, such datasets lack ground truth about application flows. We present a ∼ 750GB dataset that includes ∼ 2000 systematically conducted experiments and the resulting packet captures with video streaming, video teleconferencing, and cloud-based document editing applications. This curated and labeled dataset has bidirectional and encrypted traffic with complete ground truth that can be widely used for assessments and evaluation of AI/ML algorithms.},
booktitle = {Proceedings of the 15th Workshop on Cyber Security Experimentation and Test},
pages = {59--64},
numpages = {6},
keywords = {network traffic, network experimentation, datasets},
location = {Virtual, CA, USA},
series = {CSET '22}
}
