BEGIN:VCALENDAR
PRODID:-//Microsoft Corporation//Outlook MIMEDIR//EN
VERSION:1.0
BEGIN:VEVENT
DTSTART:20131120T001500Z
DTEND:20131120T020000Z
LOCATION:Mile High Pre-Function
DESCRIPTION;ENCODING=QUOTED-PRINTABLE:ABSTRACT: As larger HPC systems are built, fault recovery becomes a fundamental capability. Traditional fault recovery approaches, such as checkpointing, may not be sufficient for future exascale systems. Retry-based recovery techniques have been proposed as an alternative. These techniques simply re-execute a code region when a fault occurs, and they require code annotations that mark those regions. However, no previous work has investigated the optimal placement of these annotations in a program. Via fault injection, we evaluate how to optimally place retry annotations in a hydrodynamics mini-application. We found that, contrary to our expectations, a simple scheme of protecting the main function works well for low fault rates: the slowdown is at most 1.25 at a rate of 3 faults per hour. We also found that the optimal recovery method is to roll back a few iterations in the application's main loop.
SUMMARY:Optimal Placement of Retry-Based Fault Recovery Annotations in HPC Applications
PRIORITY:3
END:VEVENT
END:VCALENDAR