Natural Language Watermarking for German Texts. Halvani, O., Steinebach, M., Wolf, P., & Zimmermann, R. In IH&MMSec'13, June 17–19, 2013, Montpellier, France, 2013.
In this paper we present four informed natural language watermark embedding methods, which operate on the lexical and syntactic layer of German texts. Our scheme provides several benefits in comparison to state-of-the-art approaches, as for instance that it is not relying on complex NLP operations like full sentence parsing, word sense disambiguation, named entity recognition or semantic role parsing. Even rich lexical resources (e.g. WordNet or the Collins thesaurus), which play an essential role in many previous approaches, are unnecessary for our system. Instead, our methods require only a Part-Of-Speech Tagger, simple wordlists that act as black- and whitelists and a trained classifier, which automatically predicts the ability of potential lexical or syntactic patterns to carry portions of the watermark message. Besides this, a part of the proposed methods can be easily adapted into other Indo-European languages, since the grammar rules the methods rely on are not restricted only to the German language. Because the methods perform only lexical and minor syntactic transformations, the watermarked text is not affected by grammatical distortion and simultaneously the meaning of the text is preserved in 82.14% of the cases.
@inproceedings{ halvani_natural_2013,
  title = {Natural Language Watermarking for German Texts},
  abstract = {In this paper we present four informed natural language watermark embedding methods, which operate on the lexical and syntactic layer of German texts. Our scheme provides several benefits in comparison to state-of-the-art approaches, as for instance that it is not relying on complex {NLP} operations like full sentence parsing, word sense disambiguation, named entity recognition or semantic role parsing. Even rich lexical resources (e.g. {WordNet} or the Collins thesaurus), which play an essential role in many previous approaches, are unnecessary for our system. Instead, our methods require only a Part-Of-Speech Tagger, simple wordlists that act as black- and whitelists and a trained classifier, which automatically predicts the ability of potential lexical or syntactic patterns to carry portions of the watermark message. Besides this, a part of the proposed methods can be easily adapted into other Indo-European languages, since the grammar rules the methods rely on are not restricted only to the German language. Because the methods perform only lexical and minor syntactic transformations, the watermarked text is not affected by grammatical distortion and simultaneously the meaning of the text is preserved in 82.14\% of the cases.},
  booktitle = {{IH}\&{MMSec}'13, June 17--19, 2013, Montpellier, France.},
  author = {Halvani, Oren and Steinebach, Martin and Wolf, Patrick and Zimmermann, Ralf},
  year = {2013},
  keywords = {parsing}
}
