28-01-2010, 11:49 PM
please read
http://ieeexplore.ieeexpl/preabsprintf.j...er=4479488
Algorithm Description:
Logging can be classified as pessimistic, optimistic, or causal. It is based on the fact that the execution of a process can be modeled as a sequence of state intervals. The execution during a state interval is deterministic. However, each state interval is initiated by a nondeterministic event. Now assume that the system can capture and log sufficient information about the nondeterministic event that initiated each state interval. This is called the piecewise deterministic (PWD) assumption. Under this assumption, a crashed process can be recovered by 1) restoring it to its initial state and 2) replaying the logged events to it in the same order in which they occurred before the crash. To avoid a rollback to the initial state of a process and to limit the number of nondeterministic events that need to be replayed, each process periodically saves its local state. Log-based mechanisms in which the only nondeterministic events in a system are message receptions are usually referred to as message logging.
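The recovery steps above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the protocol from the paper: the class name LoggedProcess, its methods, and the arithmetic in _apply are all hypothetical, and the checkpoint and log are kept in memory here, whereas a real system would write them to stable storage.

```python
from dataclasses import dataclass, field


@dataclass
class LoggedProcess:
    """Toy process whose only nondeterministic events are message receptions."""
    state: int = 0                            # volatile local state
    checkpoint: int = 0                       # last saved local state (assumed to survive a crash)
    log: list = field(default_factory=list)   # messages received since the checkpoint (assumed to survive)

    def receive(self, message: int) -> None:
        # Log the nondeterministic event first, then start the state interval.
        self.log.append(message)
        self._apply(message)

    def _apply(self, message: int) -> None:
        # Deterministic state interval triggered by one message (arbitrary toy update).
        self.state = self.state * 31 + message

    def save_checkpoint(self) -> None:
        # Save the local state and drop the log entries the checkpoint already covers.
        self.checkpoint = self.state
        self.log.clear()

    def recover(self) -> None:
        # 1) restore the saved state, 2) replay logged messages in their original order.
        self.state = self.checkpoint
        for message in self.log:
            self._apply(message)


def demo() -> bool:
    p = LoggedProcess()
    p.receive(1)
    p.receive(2)
    p.save_checkpoint()        # periodic checkpoint limits how much must be replayed
    p.receive(3)
    state_before_crash = p.state
    p.state = -1               # simulate a crash that loses the volatile state
    p.recover()
    return p.state == state_before_crash
```

Because each state interval is deterministic, replaying only the message logged after the checkpoint is enough to reach the pre-crash state; without the checkpoint, all three messages would have to be replayed from the initial state.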
Existing System:
Communication-induced checkpointing protocols usually assume that any process can be checkpointed at any time. An alternative approach relaxes the constraint of always-checkpointable processes, without delaying any message reception or altering the message ordering enforced by the communication layer or by the application. This protocol has been implemented within ProActive, an open-source Java middleware for asynchronous and distributed objects implementing the ASP (Asynchronous Sequential Processes) model.
Proposed System:
This paper presents two fault-tolerance mechanisms called Theft-Induced Checkpointing and Systematic Event Logging. These are transparent protocols capable of overcoming problems associated with both benign faults, i.e., crash faults, and node or subnet volatility. Specifically, the protocols base the state of the execution on a dataflow graph, allowing for efficient recovery in dynamic heterogeneous systems as well as in multithreaded applications.
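To give a rough feel for what it means to base the execution state on a dataflow graph, here is a small Python sketch. It is not Theft-Induced Checkpointing or Systematic Event Logging themselves; it only assumes that a checkpoint records the results of completed tasks, so that recovery re-executes just the tasks whose results are missing. The graph layout, the run function, and the task names are all made up for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical representation: task name -> (function, names of the tasks it depends on).
Graph = Dict[str, Tuple[Callable[..., int], List[str]]]


def run(graph: Graph, results: Dict[str, int]) -> List[str]:
    """Execute every task not already in `results`; return the names actually executed."""
    executed: List[str] = []

    def evaluate(name: str) -> int:
        if name not in results:                 # skip work already covered by the checkpoint
            func, deps = graph[name]
            results[name] = func(*(evaluate(d) for d in deps))
            executed.append(name)
        return results[name]

    for name in graph:
        evaluate(name)
    return executed


graph: Graph = {
    "a": (lambda: 2, []),
    "b": (lambda: 3, []),
    "c": (lambda x, y: x + y, ["a", "b"]),
    "d": (lambda z: z * 10, ["c"]),
}

# Assumed checkpoint taken after tasks a, b and c finished but before d ran.
checkpoint = {"a": 2, "b": 3, "c": 5}

# Recovery, possibly on a different node: start from the checkpoint and finish the graph.
recovered = dict(checkpoint)
re_executed = run(graph, recovered)
```

Since the saved state is just task results rather than a machine-specific memory image, the same checkpoint could in principle be picked up by a node with a different architecture, which is the kind of portability the abstract points to for heterogeneous systems.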