•  13
    Checkpoint/restart-enabled parallel debugging
    with J. Hursey, C. January, P. H. Hargrove, D. Lecomber, J. M. Squyres, and A. Lumsdaine
    Debugging is often the most time-consuming part of software development. HPC applications prolong the debugging process by adding more processes interacting in dynamic ways for longer periods of time. Checkpoint/restart-enabled parallel debugging returns the developer to an intermediate state closer to the bug. This focuses the debugging process, saving developers considerable amounts of time, but requires parallel debuggers cooperating with MPI implementations and checkpointers. This paper pre…