A Fast and Unbiased Procedure to Randomize Ecological Binary Matrices with Fixed Row and Column Totals. Strona, G., Nappo, D., Boccacci, F., Fattorini, S., & San-Miguel-Ayanz, J. Nature Communications, June, 2014.
A well-known problem in numerical ecology is how to recombine presence-absence matrices without altering row and column totals. A few solutions have been proposed, but all of them present some issues in terms of statistical robustness (that is, their capability to generate different matrix configurations with the same probability) and their performance (that is, the computational effort that they require to generate a null matrix). Here we introduce the 'Curveball algorithm', a new procedure that differs from existing methods in that it focuses rather on matrix information content than on matrix structure. We demonstrate that the algorithm can sample uniformly the set of all possible matrix configurations requiring a computational effort orders of magnitude lower than that required by available methods, making it possible to easily randomize matrices larger than 10^8 cells.
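The constraint described in the abstract (shuffle presences while keeping every row and column total fixed) can be illustrated with a pair-trade randomization. The following is only a minimal sketch in the spirit of the Curveball idea, not the authors' implementation: the function name `curveball_sketch`, its parameters `n_trades` and `seed`, and the choice to store each row as a set of column indices are all assumptions made for illustration.

```python
import random


def curveball_sketch(matrix, n_trades, seed=None):
    """Randomize a 0/1 matrix while preserving row and column totals.

    Hypothetical helper for illustration; requires at least two rows.
    """
    rng = random.Random(seed)
    n_cols = len(matrix[0])
    # Represent each row by the set of column indices that hold a presence.
    rows = [{j for j, v in enumerate(r) if v} for r in matrix]

    for _ in range(n_trades):
        # Pick two distinct rows at random.
        a, b = rng.sample(range(len(rows)), 2)
        only_a = rows[a] - rows[b]   # presences found in row a but not in b
        only_b = rows[b] - rows[a]   # presences found in row b but not in a
        if not only_a or not only_b:
            continue                 # nothing can be traded between these rows
        shared = rows[a] & rows[b]   # presences common to both rows stay put
        # Pool the tradable presences and deal them back at random,
        # giving each row as many as it contributed.
        pool = sorted(only_a | only_b)
        rng.shuffle(pool)
        k = len(only_a)
        rows[a] = shared | set(pool[:k])
        rows[b] = shared | set(pool[k:])

    # Convert the index sets back into a 0/1 matrix.
    return [[1 if j in r else 0 for j in range(n_cols)] for r in rows]
```

Each trade moves a presence only between the two chosen rows within the same column, so column totals are unchanged, and each row receives back exactly as many presences as it gave up, so row totals are unchanged too. How many trades are needed for adequate mixing is not specified here and would have to be chosen by the user.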
@article{stronaFastUnbiasedProcedure2014,
  title = {A Fast and Unbiased Procedure to Randomize Ecological Binary Matrices with Fixed Row and Column Totals},
  author = {Strona, Giovanni and Nappo, Domenico and Boccacci, Francesco and Fattorini, Simone and {San-Miguel-Ayanz}, Jesus},
  year = {2014},
  month = jun,
  volume = {5},
  issn = {2041-1723},
  doi = {10.1038/ncomms5114},
  abstract = {A well-known problem in numerical ecology is how to recombine presence-absence matrices without altering row and column totals. A few solutions have been proposed, but all of them present some issues in terms of statistical robustness (that is, their capability to generate different matrix configurations with the same probability) and their performance (that is, the computational effort that they require to generate a null matrix). Here we introduce the 'Curveball algorithm', a new procedure that differs from existing methods in that it focuses rather on matrix information content than on matrix structure. We demonstrate that the algorithm can sample uniformly the set of all possible matrix configurations requiring a computational effort orders of magnitude lower than that required by available methods, making it possible to easily randomize matrices larger than $10^{8}$ cells.},
  journal = {Nature Communications},
  keywords = {*imported-from-citeulike-INRMM,~INRMM-MiD:c-13223536,data-transformation-modelling,ecology,presence-absence,pseudo-random,statistics},
  lccn = {INRMM-MiD:c-13223536}
}
