BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20181221T160731Z
LOCATION:D171/173
DTSTART;TZID=America/Chicago:20181116T085500
DTEND;TZID=America/Chicago:20181116T091000
UID:submissions.supercomputing.org_SC18_sess145_ws_p3hpc107@linklings.com
SUMMARY:An Empirical Roofline Methodology for Quantitatively Assessing Per
 formance Portability
DESCRIPTION:Workshop\nHeterogeneous Systems, Performance, Workshop Reg Pas
 s\n\nAn Empirical Roofline Methodology for Quantitatively Assessing Perfor
 mance Portability\n\nYang, Gayatri, Kurth, Basu, Ronaghi...\n\nSystem and 
 node architectures continue to diversify to better balance on-node computa
 tion, memory capacity, memory bandwidth, interconnect bandwidth, power, an
 d cost for specific computational workloads. For many application develop
 ers, however, achieving performance portability (effectively exploiting th
 e capabilities of multiple architectures) is a desired goal. Unfortunately
 , dramatically different per-node performance coupled with differences in 
 machine balance can lead to developers being unable to determine whether t
 hey have attained performance portability or simply written portable code.
  The Roofline model provides a means of quantitatively assessing how well 
 a given application makes use of a target platform's computational capabil
 ities. In this paper, we extend the Roofline model so that it 1) empirical
 ly captures a more realistic set of performance bounds for CPUs and GPUs, 
 2) factors in the true cost of different floating-point instructions when
  counting FLOPs, 3) incorporates the effects of different memory access pa
 tterns, and 4) with appropriate pairing of code performance and Roofline c
 eiling, facilitates the performance portability analysis.
URL:https://sc18.supercomputing.org/presentation/?id=ws_p3hpc107&sess=sess
 145
END:VEVENT
END:VCALENDAR

