Data intensive computing for bioinformatics. Qiu, J., Ekanayake, S., Ekanayake, J., Wu, S., Gunarathne, T., Beason, S., Choi, Y., Fox, G. C., Bae, S., Rho, M., Ruan, Y., & Tang, H. Volume 1, 2013.
© 2013, IGI Global. Data intensive computing, cloud computing, and multicore computing are converging as frontiers to address massive data problems with hybrid programming models and/or runtimes including MapReduce, MPI, and parallel threading on multicore platforms. A major challenge is to utilize these technologies and large-scale computing resources effectively to advance fundamental science discoveries such as those in Life Sciences. The recently developed next-generation sequencers have enabled large-scale genome sequencing in areas such as environmental sample sequencing leading to metagenomic studies of collections of genes. Metagenomic research is just one of the areas that present a significant computational challenge because of the amount and complexity of data to be processed. This chapter discusses the use of innovative data-mining algorithms and new programming models for several Life Sciences applications. The authors particularly focus on methods that are applicable to large data sets coming from high throughput devices of steadily increasing power. They show results for both clustering and dimension reduction algorithms, and the use of MapReduce on modest size problems. They identify two key areas where further research is essential, and propose to develop new O(NlogN) complexity algorithms suitable for the analysis of millions of sequences. They suggest Iterative MapReduce as a promising programming model combining the best features of MapReduce with those of high performance environments such as MPI.
@book{Qiu2013,
 title = {Data intensive computing for bioinformatics},
 type = {book},
 year = {2013},
 source = {Bioinformatics: Concepts, Methodologies, Tools, and Applications},
 volume = {1},
 id = {ebee85c8-f21a-3f76-87de-ce0c0d0620ce},
 created = {2017-11-28T17:32:48.600Z},
 file_attached = {false},
 profile_id = {42d295c0-0737-38d6-8b43-508cab6ea85d},
 last_modified = {2020-05-11T14:43:29.236Z},
 read = {false},
 starred = {false},
 authored = {true},
 confirmed = {false},
 hidden = {false},
 citation_key = {Qiu2013},
 folder_uuids = {36d8ccf4-7085-47fa-8ab9-897283d082c5},
 private_publication = {false},
 abstract = {© 2013, IGI Global. Data intensive computing, cloud computing, and multicore computing are converging as frontiers to address massive data problems with hybrid programming models and/or runtimes including MapReduce, MPI, and parallel threading on multicore platforms. A major challenge is to utilize these technologies and large-scale computing resources effectively to advance fundamental science discoveries such as those in Life Sciences. The recently developed next-generation sequencers have enabled large-scale genome sequencing in areas such as environmental sample sequencing leading to metagenomic studies of collections of genes. Metagenomic research is just one of the areas that present a significant computational challenge because of the amount and complexity of data to be processed. This chapter discusses the use of innovative data-mining algorithms and new programming models for several Life Sciences applications. The authors particularly focus on methods that are applicable to large data sets coming from high throughput devices of steadily increasing power. They show results for both clustering and dimension reduction algorithms, and the use of MapReduce on modest size problems. They identify two key areas where further research is essential, and propose to develop new O(NlogN) complexity algorithms suitable for the analysis of millions of sequences. They suggest Iterative MapReduce as a promising programming model combining the best features of MapReduce with those of high performance environments such as MPI.},
 bibtype = {book},
 author = {Qiu, J. and Ekanayake, S. and Ekanayake, J. and Wu, S. and Gunarathne, T. and Beason, S. and Choi, Y.J. and Fox, Geoffrey Charles and Bae, S.-H. and Rho, M. and Ruan, Y. and Tang, H.},
 doi = {10.4018/978-1-4666-3604-0.ch016}
}
