FT-Grid: A Fault-Tolerance System for e-Science. Townend, P., Groth, P., Looker, N., & Xu, J. In Proceedings of the UK OST e-Science Fourth All Hands Meeting (AHM05), September 2005.

The size and complexity of many e-Science applications suggest that they may be very prone to errors and failures; the cost of recovering from failures may also be high. The FT-Grid system, developed as part of the e-Demand project at the University of Leeds [1], introduces a replication-based fault tolerance scheme that allows faults occurring in service-based systems to be tolerated, thus increasing the dependability of such systems. This paper details the progress that has been made in the development of FT-Grid, including both a GUI client and an FT-Grid web service interface. We show empirical evidence of the dependability benefits offered by FT-Grid through a dependability analysis of the results of fault injection testing performed with the WS-FIT tool at the University of Durham. We then illustrate a potential problem with voting-based fault tolerance approaches in the service-oriented paradigm, namely that individual channels within fault-tolerant systems may invoke common services as part of their workflow, thus increasing the potential for common-mode failure. We propose a solution to this issue by using the technique of provenance to provide FT-Grid with topological awareness. We implement a large test system and, using the PreServ provenance system developed as part of the PASOA e-Science project at the University of Southampton, perform a large number of experiments which show that a provenance-aware FT-Grid results in a much more dependable system than any of the other configurations tested, whilst imposing a negligible timing overhead.
@inproceedings{ Townend2005a,
author = {Paul Townend and Paul Groth and Nik Looker and Jie Xu},
title = {FT-Grid: A Fault-Tolerance System for e-Science},
abstract = {The size and complexity of many e-Science applications suggest that they may be very prone to errors and failures; the cost of recovering from failures may also be high. The FT-Grid system, developed as part of the e-Demand project at the University of Leeds [1], introduces a replication-based fault tolerance scheme that allows faults occurring in service-based systems to be tolerated, thus increasing the dependability of such systems. This paper details the progress that has been made in the development of FT-Grid, including both a GUI client and an FT-Grid web service interface. We show empirical evidence of the dependability benefits offered by FT-Grid through a dependability analysis of the results of fault injection testing performed with the WS-FIT tool at the University of Durham. We then illustrate a potential problem with voting-based fault tolerance approaches in the service-oriented paradigm, namely that individual channels within fault-tolerant systems may invoke common services as part of their workflow, thus increasing the potential for common-mode failure. We propose a solution to this issue by using the technique of provenance to provide FT-Grid with topological awareness. We implement a large test system and, using the PreServ provenance system developed as part of the PASOA e-Science project at the University of Southampton, perform a large number of experiments which show that a provenance-aware FT-Grid results in a much more dependable system than any of the other configurations tested, whilst imposing a negligible timing overhead.},
booktitle = {Proceedings of the UK OST e-Science Fourth All Hands Meeting (AHM05)},
month = {September},
year = {2005}
}