Replica Management in Data Intensive Distributed Science Applications

Chervenak, A. & Schuler, R. In Kosar, T., editor, Data Intensive Distributed Computing: Challenges and Solutions for Large-scale Information Management, chapter 9, pages 188–205. IGI Global, 2012.

Management of the large data sets produced by data-intensive scientific applications is complicated by the fact that participating institutions are often geographically distributed and separated by distinct administrative domains. A key data management problem in these distributed collaborations has been the creation and maintenance of replicated data sets. This chapter provides an overview of replica management schemes used in large, data-intensive, distributed scientific collaborations. Early replica management strategies focused on the development of robust, highly scalable catalogs for maintaining replica locations. In recent years, more sophisticated, application-specific replica management systems have been developed to support the requirements of scientific Virtual Organizations. These systems have motivated interest in application-independent, policy-driven schemes for replica management that can be tailored to meet the performance and reliability requirements of a range of scientific collaborations. The authors discuss the data replication solutions to meet the challenges associated with increasingly large data sets and the requirement to run data analysis at geographically distributed sites.

@incollection{Chervenak2012a,
	abstract = {Management of the large data sets produced by data-intensive scientific applications is complicated by the fact that participating institutions are often geographically distributed and separated by distinct administrative domains. A key data management problem in these distributed collaborations has been the creation and maintenance of replicated data sets. This chapter provides an overview of replica management schemes used in large, data-intensive, distributed scientific collaborations. Early replica management strategies focused on the development of robust, highly scalable catalogs for maintaining replica locations. In recent years, more sophisticated, application-specific replica management systems have been developed to support the requirements of scientific Virtual Organizations. These systems have motivated interest in application-independent, policy-driven schemes for replica management that can be tailored to meet the performance and reliability requirements of a range of scientific collaborations. The authors discuss the data replication solutions to meet the challenges associated with increasingly large data sets and the requirement to run data analysis at geographically distributed sites.},
	author = {Chervenak, Ann and Schuler, Robert},
	doi = {10.4018/978-1-61520-971-2.ch009},
	booktitle = {Data {{Intensive Distributed Computing}}: {{Challenges}} and {{Solutions}} for {{Large-scale Information Management}}},
	chapter = {9},
	date-added = {2018-09-12 15:47:54 -0700},
	date-modified = {2020-01-21 15:51:59 -0800},
	editor = {Kosar, Tevfik},
	isbn = {978-1-61520-971-2},
	pages = {188--205},
	publisher = {{IGI Global}},
	title = {Replica {{Management}} in {{Data Intensive Distributed Science Applications}}},
	year = {2012}}
