Efficient computation of approximate gene clusters based on reference occurrences. Jahn, K. J Comput Biol, 18(9):1255–1274, 2011.
Whole genome comparison based on the analysis of gene cluster conservation has become a popular approach in comparative genomics. While gene order and gene content as a whole randomize over time, it is observed that certain groups of genes, which are often functionally related, remain co-located across species. However, the conservation is usually not perfect, which turns the identification of these structures, often referred to as approximate gene clusters, into a challenging task. In this article, we present an efficient set-distance-based approach that computes approximate gene clusters by means of reference occurrences. We show that it yields highly comparable results to the corresponding non-reference-based approach, while its polynomial runtime allows for approximate gene cluster detection in parameter ranges that used to be feasible only with simpler, e.g., max-gap-based, gene cluster models. To further illustrate the performance and predictive power of our algorithm, we compare it to a state-of-the-art approach for max-gap gene cluster computation.
@Article{jahn11efficient,
  author    = {Katharina Jahn},
  title     = {Efficient computation of approximate gene clusters based on reference occurrences},
  journal   = {J Comput Biol},
  year      = {2011},
  volume    = {18},
  number    = {9},
  pages     = {1255--1274},
  abstract  = {Whole genome comparison based on the analysis of gene cluster conservation has become a popular approach in comparative genomics. While gene order and gene content as a whole randomize over time, it is observed that certain groups of genes, which are often functionally related, remain co-located across species. However, the conservation is usually not perfect, which turns the identification of these structures, often referred to as approximate gene clusters, into a challenging task. In this article, we present an efficient set-distance-based approach that computes approximate gene clusters by means of reference occurrences. We show that it yields highly comparable results to the corresponding non-reference-based approach, while its polynomial runtime allows for approximate gene cluster detection in parameter ranges that used to be feasible only with simpler, e.g., max-gap-based, gene cluster models. To further illustrate the performance and predictive power of our algorithm, we compare it to a state-of-the-art approach for max-gap gene cluster computation.},
  comment   = {reference gene clusters are "introduced"},
  doi       = {10.1089/cmb.2011.0132},
  keywords  = {3AGC; gene clusters; reference gene clusters; gecko},
  owner     = {Sebastian},
  pmid      = {21899430},
  timestamp = {2012.03.20},
}
