|Introduction to the course: What is high-performance computing (HPC)? Why do we want to build high-performance systems? Why is high performance important to big data analytics and AI applications? How to design high-performance systems for big data analytics and AI applications?
|Overview of high-performance computing and basics of parallel computing: Why do we need parallel computing? What are the paradigms for parallel computing? How to pursue high performance with parallel computing in practice? Where are the performance bottlenecks and how to identify them? How to practice performance analysis?
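A classic starting point for reasoning about parallel-performance limits is Amdahl's Law: if a fraction of the program is inherently serial, that fraction caps the speedup no matter how many processors you add. A minimal sketch in Python (illustrative only; the function name is ours):

```python
def amdahl_speedup(serial_fraction, n_processors):
    """Amdahl's Law: the speedup achievable when only the parallel
    part of a program benefits from n_processors."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_processors)

# Even with 1000 processors, a 5% serial fraction holds speedup near 20x.
print(round(amdahl_speedup(0.05, 1000), 1))
```

This is why identifying the serial bottleneck usually matters more than adding hardware.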
|(Recorded Video) Big-data systems: concepts and implementation issues. How to store petabyte-scale big data in high-performance cluster filesystems such as HDFS? How to process big data in the datacenter with Hadoop MapReduce? Beyond parallel computing, the key is data locality and the trick is colocation. How to accelerate data processing with in-memory computing? Lots of open-source middleware projects are available for you to explore.
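The MapReduce programming model behind Hadoop can be sketched in a few lines of plain Python; this shows only the map/shuffle/reduce contract, not Hadoop's distributed runtime, and the function names are illustrative:

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.split():
            yield (word, 1)

def shuffle(pairs):
    """Shuffle: group values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

counts = reduce_phase(shuffle(map_phase(["big data", "big compute"])))
```

In the real system, map tasks are scheduled on the nodes that already hold the input blocks (data locality), and the shuffle is the expensive network step.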
|AI systems: Basics and implementation issues. Many AI applications contain lots of parallelism, and parallel computing can effectively accelerate them. Parallel algorithms were developed for search and expert systems before the last AI Winter. Datacenters and GPU clusters are the keys that opened the deep learning era. How to train deep learning models with thousands of GPUs in the datacenter?
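The most common way to train across many GPUs is data parallelism: each worker computes gradients on its own minibatch, the gradients are averaged by an all-reduce, and every worker applies the same update. A toy sketch with plain Python lists standing in for tensors (names are ours, not a real framework's API):

```python
def allreduce_mean(per_worker_grads):
    """Average gradients elementwise across workers, as an
    all-reduce does in synchronous data-parallel training."""
    n_workers = len(per_worker_grads)
    return [sum(g) / n_workers for g in zip(*per_worker_grads)]

def sgd_step(weights, grads, lr=0.1):
    """One synchronous SGD update using the averaged gradient."""
    return [w - lr * g for w, g in zip(weights, grads)]

grads = allreduce_mean([[1.0, 2.0], [3.0, 4.0]])
weights = sgd_step([0.0, 0.0], grads)
```

At datacenter scale, the engineering difficulty is in making the all-reduce fast (ring/tree algorithms, overlap with compute), not in the update rule itself.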
|Edge-cloud computing and system software: Cloud computing, mobile computing, Internet of Things (IoT), autonomous driving, robots... Everything is connected and needs better mechanisms (system software?) to work together via networks. How do things connect? How do they collaborate efficiently?
|Information security and data privacy: How to protect data? There are security protocols and cryptographic methods for this purpose. The real challenge today is to perform big data analytics and develop AI models under data protection. How to do it with techniques such as trusted computing hardware (SGX), federated learning, secure multiparty computation, and homomorphic encryption?
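One intuition behind secure aggregation in federated learning can be shown with pairwise additive masking: each pair of clients shares a random mask that one adds and the other subtracts, so the server can compute the sum of the updates without seeing any individual update. A toy sketch (integers instead of model weights; a real protocol also needs key agreement and dropout handling):

```python
import random

def masked_updates(updates, seed=0):
    """Each pair (i, j) shares a random mask; client i adds it and
    client j subtracts it, so the masks cancel in the total sum."""
    rng = random.Random(seed)
    n = len(updates)
    masked = list(updates)
    for i in range(n):
        for j in range(i + 1, n):
            r = rng.randint(-1000, 1000)
            masked[i] += r
            masked[j] -= r
    return masked

private = [5, 7, 9]                 # per-client updates
masked = masked_updates(private)
total = sum(masked)                 # equals sum(private); individuals stay hidden
```

The server learns only `total`, which is exactly what the aggregation step of federated averaging needs.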
|Domain specific accelerators and heterogeneous computing: How to estimate the performance for neural networks with or without deep learning accelerators? How to find good neural networks for your application with platform-aware neural architecture search (NAS)? How to compress a neural network to reduce its resource consumption?
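A first-order performance estimate for a neural network on an accelerator is just counting multiply-accumulate operations (MACs) and derating the hardware's peak throughput by an achievable utilization. A back-of-the-envelope sketch (the utilization figure and function names are our assumptions):

```python
def conv2d_macs(out_h, out_w, out_channels, in_channels, k_h, k_w):
    """MACs for a standard 2D convolution layer."""
    return out_h * out_w * out_channels * in_channels * k_h * k_w

def estimate_latency_s(macs, peak_macs_per_s, utilization=0.3):
    """Crude latency estimate: peak throughput derated by an
    assumed achievable utilization."""
    return macs / (peak_macs_per_s * utilization)

# e.g. a 3x3 convolution, 64 -> 128 channels, 56x56 output feature map:
macs = conv2d_macs(56, 56, 128, 64, 3, 3)   # about 231 million MACs
```

Platform-aware NAS methods build on exactly this kind of cost model, searching for architectures that meet a latency or energy budget on the target device.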
|Large language models: How to train large language models such as GPT-3? What are the performance issues, and which frameworks address them? Can we compress an LLM to run on a PC?
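A widely used compression technique for running an LLM on commodity hardware is post-training quantization: store the weights in 8 (or fewer) bits plus a scale factor, and dequantize on the fly. A minimal symmetric per-tensor int8 sketch in plain Python (real systems quantize per channel or per group, and handle outliers):

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

q, scale = quantize_int8([-1.0, 0.5, 0.25])
approx = dequantize(q, scale)   # close to the originals at 1/4 the storage
```

Going from fp32 to int8 cuts weight memory 4x; 4-bit schemes push this further at some accuracy cost, which is what makes PC-scale inference plausible.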
|Post-Moore - Neuromorphic computing and quantum computing: The growth of computing performance has relied on Moore's Law for the past 60 years, but Moore's Law is slowing down and will eventually end. How to continue improving the capability of big data analytics and AI in the post-Moore era?
|Final Project Proposal
|Advanced Topics (HPC) - TBD
|Advanced Topics (Big Data) - TBD
|Advanced Topics (AI) - TBD
|Final Project Presentation