BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20181221T160727Z
LOCATION:D167/174
DTSTART;TZID=America/Chicago:20181111T163000
DTEND;TZID=America/Chicago:20181111T170000
UID:submissions.supercomputing.org_SC18_sess221_ws_mlhpce109@linklings.com
SUMMARY:Automated Parallel Data Processing Engine with Application to Larg
 e-Scale Feature Extraction
DESCRIPTION:Workshop\nApplications, Deep Learning, Machine Learning, Works
 hop Reg Pass\n\nAutomated Parallel Data Processing Engine with Application
  to Large-Scale Feature Extraction\n\nXing, Dong, Ajo-Franklin, Wu\n\nAs n
 ew scientific instruments generate ever more data, we need to parallelize 
 advanced data analysis algorithms such as machine learning to harness the 
 available computing power. The success of commercial Big Data systems demo
 nstrated that it is possible to automatically parallelize these algorithms
 . However, these Big Data tools have trouble handling the complex analysis
  operations from scientific applications. To overcome this difficulty, we 
 have started to build an automated parallel data processing engine for sci
 ence, known as SystemA. This paper provides an overview of this data proc
 essing engine, and a use case involving a complex feature extraction task 
 from a large-scale seismic recording technology, called distributed acoust
 ic sensing (DAS). The key challenge associated with DAS is that it produce
 s a vast amount of noisy data. The existing methods used by the DAS team f
 or extracting useful signals like traveling seismic waves from this data a
 re very time-consuming. Our parallel data processing engine reduces the jo
 b execution time from 100s of hours to 10s of seconds, and achieves 95% pa
 rallelization efficiency. We are implementing more advanced techniques inc
 luding machine learning using SystemA, and plan to work with more scientif
 ic applications.
URL:https://sc18.supercomputing.org/presentation/?id=ws_mlhpce109&sess=ses
 s221
END:VEVENT
END:VCALENDAR

