Efficient Big Data Processing in Hadoop MapReduce

Efficient Big Data Processing in Hadoop MapReduce. Dittrich, J. & Quiané-Ruiz, J.-A. In Proceedings of the VLDB Endowment, volume 5, 2012. Issue: 12. ISSN: 2150-8097.

This tutorial is motivated by the clear need of many organizations, companies, and researchers to deal with big data volumes efficiently. Examples include web analytics applications, scientific applications, and social networks. A popular data processing engine for big data is Hadoop MapReduce. Early versions of Hadoop MapReduce suffered from severe performance problems. Today, this is becoming history. There are many techniques that can be used with Hadoop MapReduce jobs to boost performance by orders of magnitude. In this tutorial we teach such techniques. First, we will briefly familiarize the audience with Hadoop MapReduce and motivate its use for big data processing. Then, we will focus on different data management techniques, going from job optimization to physical data organization like data layouts and indexes. Throughout this tutorial, we will highlight the similarities and differences between Hadoop MapReduce and Parallel DBMS. Furthermore, we will point out unresolved research problems and open issues.

@inproceedings{dittrich_efficient_2012,
	title = {Efficient {Big} {Data} {Processing} in {Hadoop} {MapReduce}},
	volume = {5},
	isbn = {2-9929639-8-4},
	doi = {10.14778/2367502.2367562},
	abstract = {This tutorial is motivated by the clear need of many organizations, companies, and researchers to deal with big data volumes efficiently. Examples include web analytics applications, scientific applications, and social networks. A popular data processing engine for big data is Hadoop MapReduce. Early versions of Hadoop MapReduce suffered from severe performance problems. Today, this is becoming history. There are many techniques that can be used with Hadoop MapReduce jobs to boost performance by orders of magnitude. In this tutorial we teach such techniques. First, we will briefly familiarize the audience with Hadoop MapReduce and motivate its use for big data processing. Then, we will focus on different data management techniques, going from job optimization to physical data organization like data layouts and indexes. Throughout this tutorial, we will highlight the similarities and differences between Hadoop MapReduce and Parallel DBMS. Furthermore, we will point out unresolved research problems and open issues.},
	booktitle = {Proceedings of the {VLDB} {Endowment}},
	author = {Dittrich, Jens and Quiané-Ruiz, Jorge-Arnulfo},
	year = {2012},
	note = {Issue: 12
ISSN: 21508097 (ISSN)}
}

