BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20181221T160727Z
LOCATION:D165
DTSTART;TZID=America/Chicago:20181112T113000
DTEND;TZID=America/Chicago:20181112T115000
UID:submissions.supercomputing.org_SC18_sess161_ws_pmbss103@linklings.com
SUMMARY:Algorithm Selection of MPI Collectives Using Machine Learning Tech
 niques
DESCRIPTION:Workshop\nBenchmarks, Parallel Programming Languages, Librarie
 s, and Models, Performance, Simulation, Workshop Reg Pass\n\nAlgorithm Sel
 ection of MPI Collectives Using Machine Learning Techniques\n\nHunold, Car
 pen-Amarie\n\nAutotuning is a well-established method to improve software
  performance for a given system, and it is especially important in High
  Performance Computing. The goal of autotuning is to find the best
  possible al
 gorithm and its best parameter settings for a given instance. Autotuning c
 an also be applied to MPI libraries, such as Open MPI or Intel MPI. These
  MPI libraries provide numerous parameters that allow users to adapt them
  to a given system. Some of these tunable parameters enable users to
  select a specific algorithm that should be used internally by an MPI
  collective ope
 ration. For the purpose of automatically tuning MPI collectives on a given
  system, the Intel MPI library is shipped with mpitune. The drawback of to
 ols like mpitune is that results can only be applied to cases (e.g., numbe
 r of processes, message size) for which the tool has performed the optimiz
 ation.\n\nTo overcome this limitation, we present a first step towards
  tuning MPI libraries for previously unseen instances by applying machine
  learning techniques. Our goal is to create a classifier that takes the
  collective op
 eration, the message size, and communicator characteristics (number of
  compute nodes, number of processes per node) as input and gives the
  predicted best algorithm for this problem as output. We show how such a
  model can be constructed and what pitfalls should be avoided. We
  demonstrate by thorough experimentation that our proposed prediction
  model is able to outperform the default configuration.
URL:https://sc18.supercomputing.org/presentation/?id=ws_pmbss103&sess=sess
 161
END:VEVENT
END:VCALENDAR