BEGIN:VCALENDAR
PRODID:-//Microsoft Corporation//Outlook MIMEDIR//EN
VERSION:1.0
BEGIN:VEVENT
DTSTART:20131121T183000Z
DTEND:20131121T190000Z
LOCATION:401/402/403
DESCRIPTION;ENCODING=QUOTED-PRINTABLE:ABSTRACT: The enormous gap between the high-performance capabilities of today's CPUs and off-chip communication has made the development of scalable, high-performance numerical software extremely challenging.=0A=0AIn this paper, we describe a successful methodology to address these challenges, from algorithm design through kernel optimization and tuning to our programming model, in the development of a scalable high-performance singular value decomposition (SVD) solver. We developed a set of leading-edge kernels combined with advanced optimization techniques featuring fine-grained, memory-aware kernels, a task-based approach, and hybrid execution and scheduling that significantly increase the performance of the SVD solver.=0A=0AOur results demonstrate an enormous performance boost compared to currently available software. In particular, our software is two times faster than the optimized Intel Math Kernel Library when all the singular vectors are required, achieves a 4x speedup when 20% of the vectors are computed, and is significantly faster (12x) when only the singular values are required.
SUMMARY:An Improved Parallel Singular Value Algorithm and Its Implementation for Multicore Hardware
PRIORITY:3
END:VEVENT
END:VCALENDAR