Generation of Efficient Parsers Through Direct Compilation of XML Schema Grammars. Perkins, E., Matsa, M., Gaitatzes Kostoulas, M., Heifets, A., & Mendelsohn, N. IBM Systems Journal, 45(2):225-244, 2006.
With the widespread adoption of SOAP and Web services, XML-based processing, and parsing of XML documents in particular, is becoming a performance-critical aspect of business computing. In such scenarios, XML is often constrained by an XML Schema grammar, which can be used during parsing to improve performance. Although traditional grammar-based parser generation techniques could be applied to the XML Schema grammar, the expressiveness of XML Schema does not lend itself well to the generic intermediate representations associated with these approaches. In this paper we present a method for generating efficient parsers by using the schema component model itself as the representation of the grammar. We show that the model supports the full expressive power of the XML Schema, and we present results demonstrating significant performance improvements over existing parsers.
@article{per06,
  author = {Eric Perkins and Morris Matsa and Margaret {Gaitatzes Kostoulas} and Abraham Heifets and Noah Mendelsohn},
  title = {Generation of Efficient Parsers Through Direct Compilation of XML Schema Grammars},
  journal = {IBM Systems Journal},
  year = {2006},
  volume = {45},
  number = {2},
  pages = {225--244},
  doi = {10.1147/sj.452.0225},
  uri = {http://www.research.ibm.com/journal/sj/452/perkins.html},
  uri = {http://www.research.ibm.com/journal/sj/452/perkins.pdf},
  abstract = {With the widespread adoption of SOAP and Web services, XML-based processing, and parsing of XML documents in particular, is becoming a performance-critical aspect of business computing. In such scenarios, XML is often constrained by an XML Schema grammar, which can be used during parsing to improve performance. Although traditional grammar-based parser generation techniques could be applied to the XML Schema grammar, the expressiveness of XML Schema does not lend itself well to the generic intermediate representations associated with these approaches. In this paper we present a method for generating efficient parsers by using the schema component model itself as the representation of the grammar. We show that the model supports the full expressive power of the XML Schema, and we present results demonstrating significant performance improvements over existing parsers.}
}
