BEGIN:VCALENDAR
PRODID:-//Microsoft Corporation//Outlook MIMEDIR//EN
VERSION:1.0
BEGIN:VEVENT
DTSTART:20131120T001500Z
DTEND:20131120T020000Z
LOCATION:Mile High Pre-Function
DESCRIPTION;ENCODING=QUOTED-PRINTABLE:ABSTRACT: We describe our recent work in optimizing the performance and scaling of Synergia and ART for multi-socket, multi-core architectures, including BlueGene/Q and GPUs. We show multiple hybridization and optimization options, including communication avoidance, interchangeable multi-threading kernels using OpenMP or CUDA for different hardware architectures, customized FFT, etc., each demonstrating much better scaling behavior than the pre-optimization code. By implementing these optimization techniques, we have extended strong scaling and peak performance by at least a factor of 2. We expect different optimization schemes to be optimal on different architectures. We have further tailored the code for BG/Q with an optimized communication divider, a redundant field solver, and FFT methods. The final Synergia code scales up to 128K cores with over 90% efficiency running on Mira (BG/Q at Argonne).
SUMMARY:Multi-Core Optimizations for Synergia and ART
PRIORITY:3
END:VEVENT
END:VCALENDAR