A Baseline for NSFW Video Detection in e-Learning Environments. de Freitas, P. V. A.; Santos, G. N. P. d.; Busson, A. J. G.; Guedes, Á. L. V.; and Colcher, S. In Proceedings of the 25th Brazillian Symposium on Multimedia and the Web, of WebMedia '19, pages 357–360. ACM.
The broad use of video capture and services for its storage and transmission has enabled the production of a massive volume of video data. This usage presents a challenge in controlling the type of content that is loaded for these video storage services. The Internet slang NSFW (Not Safe For Work) is often used as a warning for media that contain inappropriate content, such as nudity, intense sexuality, violence, gore or other potentially disturbing subject matter. Convolutional Neural Network (CNN) architectures, or ConvNets, have become the primary method used for audio-visual pattern recognition. In this work, we intend to: (1) create a CNN-based model for video feature extraction; and (2) validate these video features with baseline models for NSFW video classification using a multi-modal approach. In initial experimentation, our best model achieves a recall of 96.6% for the NSFW class.
@inproceedings{de_freitas_baseline_2019,
	location = {New York, {NY}, {USA}},
	title = {A Baseline for {NSFW} Video Detection in e-Learning Environments},
	rights = {All rights reserved},
	isbn = {978-1-4503-6763-9},
	url = {http://doi.acm.org/10.1145/3323503.3360625},
	doi = {10.1145/3323503.3360625},
	series = {{WebMedia} '19},
	abstract = {The broad use of video capture and services for its storage and transmission has enabled the production of a massive volume of video data. This usage presents a challenge in controlling the type of content that is loaded for these video storage services. The Internet slang {NSFW} (Not Safe For Work) is often used as a warning for media that contain inappropriate content, such as nudity, intense sexuality, violence, gore or other potentially disturbing subject matter. Convolutional Neural Network ({CNN}) architectures, or {ConvNets}, have become the primary method used for audio-visual pattern recognition. In this work, we intend to: (1) create a {CNN}-based model for video feature extraction; and (2) validate these video features with baseline models for {NSFW} video classification using a multi-modal approach. In initial experimentation, our best model achieves a recall of 96.6\% for the {NSFW} class.},
	pages = {357--360},
	booktitle = {Proceedings of the 25th Brazillian Symposium on Multimedia and the Web},
	publisher = {{ACM}},
	author = {de Freitas, Pedro V. A. and Santos, Gabriel N. P. dos and Busson, Antonio J. G. and Guedes, Álan L. V. and Colcher, Sérgio},
	urldate = {2019-11-15},
	date = {2019}
}