Accurate keyphrase extraction by discriminating overlapping phrases. Haddoud, M. & Abdeddaim, S. Journal of Information Science, April, 2014.
In this paper we define the document phrase maximality index (DPM-index), a new measure to discriminate overlapping keyphrase candidates in a text document. As an application we developed a supervised learning system that uses 18 statistical features, among them the DPM-index and five other new features. We experimentally compared our results with those of 21 keyphrase extraction methods on SemEval-2010/Task-5 scientific articles corpus. When all the systems extract 10 keyphrases per document, our method enhances by 13% the F-score of the best system. In particular, the DPM-index feature increases the F-score of our keyphrase extraction system by a rate of 9%. This makes the DPM-index contribution comparable to that of the well-known TFIDF measure on such a system.
@article{ haddoud_accurate_2014,
  title = {Accurate keyphrase extraction by discriminating overlapping phrases},
  url = {http://dx.doi.org/10.1177/0165551514530210},
  doi = {10.1177/0165551514530210},
  abstract = {In this paper we define the document phrase maximality index ({DPM}-index), a new measure to discriminate overlapping keyphrase candidates in a text document. As an application we developed a supervised learning system that uses 18 statistical features, among them the {DPM}-index and five other new features. We experimentally compared our results with those of 21 keyphrase extraction methods on {SemEval}-2010/Task-5 scientific articles corpus. When all the systems extract 10 keyphrases per document, our method enhances by 13\% the F-score of the best system. In particular, the {DPM}-index feature increases the F-score of our keyphrase extraction system by a rate of 9\%. This makes the {DPM}-index contribution comparable to that of the well-known {TFIDF} measure on such a system.},
  journal = {Journal of Information Science},
  author = {Haddoud, Mounia and Abdeddaim, Said},
  month = apr,
  year = {2014},
  keywords = {terminology_extraction}
}
