Title: Performance tuning policies for application level fault tolerance in distributed object systems
Abstract: In distributed object systems, application level fault tolerance is often attained by appropriate object replication policies. These policies aim to increase service availability by masking potential faults that do not recur after recovery. Existing middleware support infrastructures allow object replication properties to be customized. However, since fault tolerance has a significant impact on perceived service performance, there is a need for a suitable quantitative design technique that allows different replication policies to be compared by trading off the overhead cost they incur against the fault-tolerance effectiveness they achieve. We are also interested in taking different concerns into account in a combined manner (e.g. fault tolerance combined with load balancing and multithreading). This paper presents experimental evidence for the most important performance tradeoffs revealed in a simulation-based study. We considered different cases of object request loss behavior for the faulty objects, as well as a number of request-retry strategies. The experiments were run at two different application workload levels under varied fault detection settings. We provide results for the combined effects of the studied replication policies with two specific load-balancing strategies. The presented results constitute a valuable experience report for performance tuning of object replication policies for application level fault tolerance.
Publication Year: 2007
Publication Date: 2007-04-19
Language: en
Type: article
Indexed In: ['crossref']