Evidence for Soft Bounds in Ubuntu Package Sizes and Mammalian Body Masses. Gherardi, M., Mandrà, S., Bassetti, B., & Cosentino Lagomarsino, M. Proceedings of the National Academy of Sciences, 110(52):21054–21058, December, 2013.
[Significance] Not unlike a big city, a large software project grows in a complex way, involving many developers and even more users, but a predictive framework to understand these temporal patterns is lacking. We focus on software size and analyze the changes of the Ubuntu open source operating system, finding two quantitative laws. First, growth is driven by changes in scale rather than by addition-subtraction; second, evolution toward larger sizes between two consecutive releases is limited by bounds that depend on the starting size of a package. Strikingly, a stochastic model that implements these two laws is predictive. Finally, we provide evidence that similar principles could be in place for the evolution of body mass in mammals.
[Abstract] The development of a complex system depends on the self-coordinated action of a large number of agents, often determining unexpected global behavior. The case of software evolution has great practical importance: knowledge of what is to be considered atypical can guide developers in recognizing and reacting to abnormal behavior. Although the initial framework of a theory of software exists, the current theoretical achievements do not fully capture existing quantitative data or predict future trends. Here we show that two elementary laws describe the evolution of package sizes in a Linux-based operating system: first, relative changes in size follow a random walk with non-Gaussian jumps; second, each size change is bounded by a limit that is dependent on the starting size, an intriguing behavior that we call "soft bound." Our approach is based on data analysis and on a simple theoretical model, which is able to reproduce empirical details without relying on any adjustable parameter and generates definite predictions. The same analysis allows us to formulate and support the hypothesis that a similar mechanism is shaping the distribution of mammalian body sizes, via size-dependent constraints during cladogenesis. Whereas generally accepted approaches struggle to reproduce the large-mass shoulder displayed by the distribution of extant mammalian species, this is a natural consequence of the softly bounded nature of the process. Additionally, the hypothesis that this model is valid has the relevant implication that, contrary to a common assumption, mammalian masses are still evolving, albeit very slowly.
@article{gherardiEvidenceSoftBounds2013,
  title = {Evidence for Soft Bounds in {{Ubuntu}} Package Sizes and Mammalian Body Masses},
  author = {Gherardi, Marco and Mandr{\`a}, Salvatore and Bassetti, Bruno and Cosentino Lagomarsino, Marco},
  year = {2013},
  month = dec,
  volume = {110},
  pages = {21054--21058},
  issn = {1091-6490},
  doi = {10.1073/pnas.1311124110},
  abstract = {[Significance] Not unlike a big city, a large software project grows in a complex way, involving many developers and even more users, but a predictive framework to understand these temporal patterns is lacking. We focus on software size and analyze the changes of the Ubuntu open source operating system, finding two quantitative laws. First, growth is driven by changes in scale rather than by addition-subtraction; second, evolution toward larger sizes between two consecutive releases is limited by bounds that depend on the starting size of a package. Strikingly, a stochastic model that implements these two laws is predictive. Finally, we provide evidence that similar principles could be in place for the evolution of body mass in mammals.

 [Abstract] The development of a complex system depends on the self-coordinated action of a large number of agents, often determining unexpected global behavior. The case of software evolution has great practical importance: knowledge of what is to be considered atypical can guide developers in recognizing and reacting to abnormal behavior. Although the initial framework of a theory of software exists, the current theoretical achievements do not fully capture existing quantitative data or predict future trends. Here we show that two elementary laws describe the evolution of package sizes in a Linux-based operating system: first, relative changes in size follow a random walk with non-Gaussian jumps; second, each size change is bounded by a limit that is dependent on the starting size, an intriguing behavior that we call ``soft bound.'' Our approach is based on data analysis and on a simple theoretical model, which is able to reproduce empirical details without relying on any adjustable parameter and generates definite predictions. The same analysis allows us to formulate and support the hypothesis that a similar mechanism is shaping the distribution of mammalian body sizes, via size-dependent constraints during cladogenesis. Whereas generally accepted approaches struggle to reproduce the large-mass shoulder displayed by the distribution of extant mammalian species, this is a natural consequence of the softly bounded nature of the process. Additionally, the hypothesis that this model is valid has the relevant implication that, contrary to a common assumption, mammalian masses are still evolving, albeit very slowly.},
  journal = {Proceedings of the National Academy of Sciences},
  keywords = {ecology,evolution,free-software,modelling,multi-scale,multiplicative-structure,relative-distance-similarity,similarity,soft-constraint,software-evolvability,statistics},
  lccn = {INRMM-MiD:c-12825905},
  number = {52}
}
