ME 344: Introduction to High Performance Computing ~ Summer Session 2017

Offered to students enrolled in the Stanford Summer Session 2017

ME 344 is an introductory course on High Performance Computing (HPC), providing a solid foundation in parallel computer architectures, programming models, and essential optimization strategies. This course will discuss the fundamentals of what an HPC cluster consists of, and how we can take advantage of such systems to solve large-scale problems in wide-ranging applications like computational fluid dynamics, image processing, machine learning and analytics. The course will consist of lectures and practical hands-on homework assignments conducted on an Intel® Xeon Phi Processor based HPC Cluster using various software tools that are part of Parallel Studio XE. In addition to classroom instruction, experience with the latest cutting-edge hardware and interaction with industry experts, the course features hands-on projects that emphasize the application of High Performance Computing and enable students to build upon their knowledge. These include fundamental exercises in which students build an HPC cluster from the ground up, and applied projects in which students use HPC paradigms to build a Deep Learning application. This course is open to both computer scientists and computational scientists who are interested in learning about data parallelism, scaling to a large number of nodes, and performance tuning methodologies and tools for standards-driven languages and parallel models (C/C++/Fortran/MPI/OpenMP/Threading Building Blocks/Python). As it’s desirable to have such a mix of students, the course will not assume much background, though good programming skills will be needed to get the most out of the course.

http://summer.stanford.edu/course/me-344/

Stanford HPC Advisory Council Conference and Workshop 2018

The Stanford HPC Conference and Workshop is February 20th and 21st, 2018, at the Munger Conference Center on the Stanford University Campus. Great presentations, tutorials, workshops, and of course free of charge, thanks to our wonderful sponsors.

Registration is required: (coming soon)

Stanford HPC Advisory Council Conference Feb 7th & 8th 2017

The Stanford HPC Conference is February 7th and 8th, 2017, at the Munger Conference Center on the Stanford University Campus. Great presentations, tutorials, and of course free of charge, thanks to our wonderful sponsors.

Registration is required: http://hpcadvisorycouncil.com/events/2017/stanford-workshop/

Stanford HPC Summer Speaker Series: Building Faster Machine Learning Applications with Intel Performance Libraries – July 26@noon

Date for Tutorial: Tuesday, 26 July, 2016
Time: Noon
Duration: 1.5 hours
Location: d.school (Peterson Engineering Laboratory, 550 Panama Mall, Room 200)
 
Title: Building Faster Machine Learning Applications with Intel Performance Libraries

Presentation available here

Abstract:
The future of many industries, as well as many aspects of our lives, is being shaped by machine learning and related technologies. Intel software technologies are being used to enable solutions in these areas. This talk focuses on two Intel performance libraries, MKL and DAAL, which offer optimized building blocks for data analytics and machine learning algorithms. MKL is a collection of routines for linear algebra, FFT, vector math and statistics. It is used to speed up math processing in almost every kind of technical computing application. DAAL is more focused on data applications and provides higher-level, canned solutions for supervised and unsupervised learning. This session is an overview of the capabilities and performance advantages of these libraries in the context of machine learning and deep learning.

Speaker Bios:
Zhang Zhang is a Technical Consulting Engineer with the Software and Services Group at Intel. He provides technical support for Intel performance libraries, including MKL, DAAL, and IPP. He helps customers adopt Intel software tools and enjoys troubleshooting performance and usage problems in users’ code. Zhang comes from a background in high performance and parallel programming, cluster and distributed computing, and performance modeling and analysis. Zhang holds a Ph.D. in Computer Science from Michigan Technological University.

Shaojuan Zhu is a Technical Consulting Engineer at Intel supporting Intel performance libraries: DAAL, IPP and MKL. She has ten years of experience developing and supporting media products. Her expertise and interests include biologically inspired intelligent signal processing, machine learning and media. She holds a Ph.D. in Electrical and Computer Engineering from Oregon Health and Science University.

Stanford HPC Summer Speaker Series: Guided Code Vectorization with Intel® Advisor XE – July 19@noon

Date for Tutorial: Tuesday, 19 July, 2016
Time: Noon
Duration: 1.5 hours
Location: d.school (Peterson Engineering Laboratory, 550 Panama Mall, Room 200)
 
Title: Guided Code Vectorization with Intel® Advisor XE
 
Abstract:
In this topic we discuss the usage of an optimization tool called Intel® Advisor. The discussion is illustrated with an example workload that computes the electric potential at a set of points in 3-D space produced by a group of charged particles. The example workload runs on a many-core Intel Xeon Phi processor (formerly Knights Landing) with Intel AVX-512 instructions.

The application was originally parallelized across cores, but otherwise neither optimized nor vectorized. In the publication, we discuss three performance issues that Intel Advisor detected: vector dependence, type conversion and an inefficient memory access pattern. For each issue, we discuss how to interpret the data presented by Intel Advisor, and also how to optimize the application to resolve the issue. After the optimization, we observed a 27x performance boost compared to the original, non-optimized implementation.
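
As a rough illustration of the kind of workload described above, the direct-summation kernel could be sketched as follows. This is a minimal pure-Python sketch, not the speaker's code: the function names `potential_at` and `potential_on_points`, the `(x, y, z, q)` tuple layout, and the choice of units (physical constants dropped, so each charge contributes q / r) are all assumptions made for illustration. A tuned version would be written in a compiled language and vectorized.

```python
import math


def potential_at(point, particles):
    """Electric potential at `point` from point charges, by direct summation.

    `particles` is an iterable of (x, y, z, q) tuples (hypothetical layout).
    Physical constants are dropped, so each charge contributes q / r.
    """
    px, py, pz = point
    phi = 0.0
    for x, y, z, q in particles:
        dx, dy, dz = px - x, py - y, pz - z
        r = math.sqrt(dx * dx + dy * dy + dz * dz)
        phi += q / r  # assumes no particle sits exactly on the point
    return phi


def potential_on_points(points, particles):
    """Evaluate the potential at every point.

    Each point is independent of the others, so this outer loop is the
    part that parallelizes naturally across cores.
    """
    return [potential_at(p, particles) for p in points]
```

For example, `potential_at((0.0, 0.0, 2.0), [(0.0, 0.0, 0.0, 1.0)])` evaluates a unit charge at distance 2 and returns 0.5.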

Speaker Bio:
Ryo Asai is a Researcher at Colfax International. He develops optimization methods for scientific applications targeting emerging parallel computing platforms, computing accelerators and interconnect technologies. Ryo holds a B.A. degree in Physics from the University of California, Berkeley.

Stanford HPC Summer Speaker Series: HPC Workload Profiling Using VTune Amplifier XE – July 12th@noon

Date for Tutorial: Tuesday, 12 July, 2016
Time: Noon
Duration: 1.5 hours
Location: d.school (Peterson Engineering Laboratory, 550 Panama Mall, Room 200)
 
Title: HPC Workload Profiling Using VTune Amplifier XE
 
Abstract:
Hybrid programming models that use both OpenMP (OMP) and MPI for efficient parallel scalability are getting more complex. Adding to the complexity of software development, advances in hardware design such as Intel® Xeon Phi™ processors, with many cores, multiple vector processing units (VPUs) per core and a fast MCDRAM option, offer excellent vector performance to HPC workloads. For serial performance, workload developers need to make use of all core design features, including complex floating-point and integer SIMD instructions (SSE, AVX2, AVX512, …), to obtain the highest FLOPS and thus the shortest execution time. Learning the best compiler options for a particular workload, as well as the memory layout of the system (such as NUMA), is also important. For parallel performance tuning, scalability of OMP and MPI requires detailed OMP performance analysis and an MPI communication profile. OMP analysis may include overall load imbalance across OMP threads, lock and wait time, thread synchronization, … An MPI communication profile can help to reduce the cost of communication. The Intel Parallel Studio suite includes a comprehensive set of performance tools which can be effectively used for these tasks. In particular, the powerful Intel VTune Performance Analyzer tool is well suited to capturing a deep-dive performance characterization of HPC workloads. In this presentation, we will cover Intel VTune Performance Analyzer and give a hands-on demo of its usage to study HPC workload performance.
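
To make the idea of load imbalance across threads concrete, here is a minimal sketch of one way to summarize it from per-thread busy times. The function name `load_imbalance` and the two metrics (max minus mean, and max divided by mean) are assumptions chosen for illustration. They are one common definition and are not claimed to be what VTune reports.

```python
def load_imbalance(thread_times):
    """Summarize imbalance from per-thread busy times (in seconds).

    Returns (imbalance_time, imbalance_ratio), where
      imbalance_time  = max - mean  (how far the slowest thread runs past the average)
      imbalance_ratio = max / mean  (1.0 means perfectly balanced)
    Both definitions are illustrative assumptions.
    """
    if not thread_times:
        raise ValueError("need at least one thread time")
    slowest = max(thread_times)
    mean = sum(thread_times) / len(thread_times)
    if mean <= 0:
        raise ValueError("thread times must have a positive mean")
    return slowest - mean, slowest / mean
```

With four threads where one does three times the work of the others, `load_imbalance([1.0, 1.0, 1.0, 3.0])` reports that the slowest thread takes twice the average time.
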
Speaker Bio:
Thanh Phung is a senior HPC engineer at Intel leading HPC workload performance characterization and performance tuning. Thanh joined Intel in 1992, working for the Supercomputer System Division (SSD) as an on-site HPC scientist at NASA/Ames and Caltech. From 1998 to 2000 Thanh worked for Intel developing HPC tools for optical proximity correction (OPC) lithography. From 2000 to the present, Thanh has worked for Intel SSG/DPD/TCAR, specializing in employing performance tools like Intel VTune performance analyzer and ITAC for message profiling to do deep-dive HPC workload performance analysis, vectorization tuning using SIMD/AVX2/AVX512, and OMP/MPI/hybrid programming and scalability. Thanh received a Ph.D. in Chemical Engineering with an emphasis in CFD from Caltech in 1992.

Stanford HPC Summer Speaker Series: Intel® Distribution for Python: A Scalability Story in Production Environments – June 28@noon

Date for Tutorial: Tuesday, 28 June, 2016
Time: Noon
Duration: 1.5 hours
Location: d.school (Peterson Engineering Laboratory, 550 Panama Mall, Room 200)
 
Title: Intel® Distribution for Python: A Scalability Story in Production Environments
 
Abstract:
Python use continues to grow in many domains that require interactive prototyping. Quants develop trading algorithms, data scientists build analytics models, and researchers prototype numerical simulations. All too often, scaling the prototype code to production means a developer recoding the algorithm in a language such as C++ or Java. Rewriting takes time, reduces flexibility, and can lead to errors.

In this talk we will describe the tools, techniques and optimizations that Intel brings to the Python developer community to address this major challenge. We are developing high performance libraries and profilers as well as extending support for multi-core and SIMD parallelism across the Python toolchain so that developers can achieve near-native performance in Python, avoiding the need to rewrite.

Our case studies will show speedups of 100x and more from highly optimized libraries such as NumPy/SciPy, Intel® DAAL and Scikit-learn*, and how those scale across multiple cores and multiple nodes. We will also see how Intel® VTune™ Amplifier allows low-intrusion profiling of Python and native code to identify performance hotspots. We will also demonstrate how tools like Cython* and Numba* make it possible to obtain near-native performance in numerically intensive applications.
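
The hotspot-finding workflow mentioned above can be tried in miniature with Python's standard-library profiler. The sketch below is a stand-in only: it uses `cProfile` and `pstats` rather than VTune Amplifier, and the names `slow_dot`, `fast_dot` and `profile_hotspots` are hypothetical helpers written for this example.

```python
import cProfile
import io
import operator
import pstats


def slow_dot(xs, ys):
    """Dot product as an explicit interpreted loop (the deliberate hotspot)."""
    total = 0.0
    for i in range(len(xs)):
        total += xs[i] * ys[i]
    return total


def fast_dot(xs, ys):
    """Same dot product, with the loop pushed into built-in functions."""
    return sum(map(operator.mul, xs, ys))


def profile_hotspots(func, *args, top=5):
    """Run func(*args) under cProfile.

    Returns (result, report), where report is the text of the `top`
    entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buf.getvalue()
```

Calling `profile_hotspots(slow_dot, xs, ys)` returns the result together with a text report in which `slow_dot` appears among the top entries. Once the hotspot is known, it can be replaced with a version such as `fast_dot`, or with an optimized library routine.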

Goals of the Tutorial:
The participants will learn how to develop and optimize technical computing programs in Python utilizing Intel® Distribution for Python*, Intel’s performance libraries (Intel® Math Kernel Library and Intel® Data Analytics Acceleration Library), and Intel’s low-overhead, line-level profiling tool Intel® VTune™ Amplifier, which supports mixed-mode C/C++ and Python. Hands-on labs will be used to demonstrate key concepts for optimizing the performance of Python applications.
 
Prerequisites:
Beginner level knowledge of Python will be helpful
Bring your own laptop
Prior to the session, download the Intel® Distribution for Python* Tech Preview (soon to be beta), to follow along with the live examples.
 
Speaker Bio:
Sergey Maidanov leads the team of software engineers working on the optimized Intel® Distribution for Python*. He has over 15 years of experience in numerical analysis with a range of contributions to Intel software products such as Intel MKL, Intel IPP, Intel compilers, and others. Among his recently completed projects was the Intel® Data Analytics Acceleration Library. Sergey received a master’s degree in Mathematics from the State University of Nizhny Novgorod with specializations in number theory, random number generation, and its application in financial math. He was a staff member of the International Center of Studies in Financial Institutions at the State University of Nizhny Novgorod.

Stanford HPC Conference Feb 24th-25th, 2016 ~ Be sure to attend!

Stanford HPC Conference – February 24th-25th, 2016. Excellent presentations, a chance to network with peers, and great food! Be sure to register today!

http://www.hpcadvisorycouncil.com/events/2016/stanford-workshop/

Stanford HPC Seminar Series: Simulation for IoT ~ A multi-physics approach for a wearable IoT device

Join us on Thursday October 15th at Noon for a Free Lunch and Learn focused on Simulation and the IoT Revolution

Virtual prototyping and simulation are widely used to develop better products faster. ANSYS offers a complete range of simulation tools for structural mechanics, fluid dynamics, low- and high-frequency electromagnetics and systems.

During this Lunch and Learn, we will present the simulation offering from ANSYS and focus on simulation for the IoT revolution by demonstrating how simulation is key to developing wearable devices.

In the race to develop the next blockbuster wearable device, ANSYS Electronics, Thermal, Mechanical, and Systems Tools can help companies ensure that their feature-rich products have exceptional performance and user experience. Furthermore, the manufacturing process and durability of the product can also be improved using ANSYS simulations. In this presentation, examples of simulations that can lead to improved smart watch products will be discussed.

Thursday October 15th - Noon to 2PM

Mechanical Engineering Research Lab (MERL)

418 Panama Mall

Conference Room 203 (2nd floor)

Please register here: https://www.surveymonkey.com/r/ZP3JSKP

 

Stanford HPC Seminar Series: Introduction to Intel Trace Analyzer and Collector

Interested in better understanding the behavior of your MPI application? Want to quickly find bottlenecks, and achieve high performance for your parallel applications? Intel is coming to Stanford to host this seminar on Intel Trace Analyzer and Collector.
What:
Introduction to Intel Trace Analyzer and Collector,
Introduction to Intel MPI/OpenMP
Presented by Gergana Slavova, Intel
When:
August 20th, 1:30pm – 3:30pm
(pizza served starting at 1:00pm)
Where:
ME Design Building
416 Escondido Mall
Room 200 Teaching Studio
Agenda:
1. Intel MPI Library
 a. overview
 b. OpenMP/MPI hybrid support
2. Intel Trace Analyzer and Collector
 a. Tracing your MPI application with Intel Trace Collector
 b. Performance Analysis with Intel Trace Analyzer
 c. Demo (Poisson + application from Stanford)
 d. Ideal Interconnect Simulator & Imbalance Diagrams
 e. MPI Performance Snapshot
 f. MPI Correctness checking
Additional detail on location:
Building 550 (see map), also known as Peterson Laboratory. The street address is 416 Escondido Mall, but the building entrance closest to the Design Group offices is on Panama Mall, across from the Mechanical Engineering Research Lab (MERL).

Stanford HPC Seminar Series: Intel VTune™ Training

What: Stanford HPC Seminar Series kicks off with Intel VTune Training!
Where:  Building 300-300
When: July 1st 10:30am

VTune™ Amplifier XE training
1. Overview of the VTune™ Amplifier
a. Data collection methodologies
2. Collecting CPU Hotspot Data
a. Drill down to source code
b. Demo
3. Collecting broader CPU-based performance data
a. Cache misses, branch mispredictions, …
b. Demo
4. Collecting Thread Contention Data
a. Demo
5. Miscellaneous topics
a. VTune™ Amplifier with MPI, OpenMP applications
b. Recent features/improvements

Video Gallery: 2015 HPCAC Stanford HPC Conference

Stanford HPC Conference 2015

Thanks to our media sponsor insideHPC, the video gallery for the 2015 HPCAC Stanford HPC Conference is available here:

http://insidehpc.com/2015-stanford-hpc-conference-video-gallery/

HPC Advisory Council Stanford HPC Conference February 2-3 2015!

What: HPC Advisory Council Stanford HPC Conference
Where: Munger Conference Center/Paul Brest Hall
When: February 2-3, 2015
Register: http://www.hpcadvisorycouncil.com/events/2015/stanford-workshop/register.php
Agenda: http://www.hpcadvisorycouncil.com/events/2015/stanford-workshop/agenda.php
Conference highlights:
– HPC at Oak Ridge (Pavel Shamis, Oak Ridge National Lab)
– High Performance Computing Trends (Addison Snell, Intersect360 Research)
– HPC in an Hour: Tips and tricks to build your own HPC Cluster (Steve Jones, Stanford)
– The Evolution of HPC Usage at PayPal (Arno Kolster, eBay)
– A merit based priority scheme to optimize the use of computing infrastructure (Gowtham S., Michigan Technological University)
– Utilizing Cloud HPC Resources for CAE Simulations (Joris Poort, Rescale)

 

HPC Advisory Council Announces Stanford High-Performance Computing Conference 2015

Business Wire

SUNNYVALE, Calif.–(BUSINESS WIRE)–The HPC Advisory Council, a leading organization for high-performance computing research, outreach and education, today announced the HPC Advisory Council Stanford High-Performance Computing Conference 2015 on February 2-3, 2015 at Stanford, California. The conference will focus on High-Performance Computing (HPC) usage models and benefits, the future of supercomputing, latest technology developments, best practices and advanced HPC topics. In addition, there will be a strong focus on socially responsible computing, with advancements in solutions for the small to medium enterprise to have better use of power, cooling, hardware, and software. More details available here:

http://www.businesswire.com/news/home/20141029005160/en/HPC-Advisory-Council-Announces-Stanford-High-Performance-Computing#.VMbyhERUKcN

Video Gallery: 2014 HPCAC Stanford HPC & Exascale Conference

Thanks to our media sponsor insideHPC, the video gallery for the 2014 HPCAC Stanford HPC & Exascale Conference is available here:

http://insidehpc.com/video-gallery-2014-hpcac-stanford-hpc-exascale-conference/

2014 Press & Events


What: HPC Advisory Council Stanford Conference and Exascale Workshop
Where: Munger Conference Center/Paul Brest Hall
When: February 3-5, 2014
Agenda: http://www.hpcadvisorycouncil.com/events/2014/stanford-workshop//agenda.php

2013 Press & Events

What: HPC Advisory Council Stanford Conference 2013
Where: Munger Conference Center/Paul Brest Hall
When: February 7-8, 2013
Agenda: http://hpcadvisorycouncil.com/events/2013/Stanford-Workshop/agenda.php

2012 Press & Events

Stanford Researchers Break Million-core Supercomputer Barrier
Presented by: Joseph Nichols, Stanford University
Thursday February 7th – 8th

Location: Paul Brest Hall Bldg. 4555 Salvatierra Walk

Researchers at the Center for Turbulence Research set a new record in supercomputing, harnessing a million computing cores to model supersonic jet noise. Work was performed on the newly installed Sequoia IBM Blue Gene/Q system at Lawrence Livermore National Laboratory.

2011 Press & Events

What: HPC Advisory Council Stanford Workshop 2011
Where: Munger Conference Center/Paul Brest Hall
When: December 6-7, 2011

2010 Press & Events

What: Stanford High Performance Computing Conference
Where: Munger Conference Center/Paul Brest Hall
When: Dec 8-9, 2010
Agenda: http://hpcc.stanford.edu/conference/hpccV/agenda.html