Week |
Date |
Topic |
Week 1 |
2/20 |
Introduction to the course: What is high-performance computing (HPC)? Why do we want to build high-performance systems? Why is high performance important to big data analytics and AI applications? How to design high-performance systems for big data analytics and AI applications? |
Week 2 |
2/27 |
Holiday |
Week 3 |
3/6 |
Overview of high-performance computing and basics of parallel computing: Why do we need parallel computing? What are the paradigms for parallel computing? How to pursue high performance with parallel computing in practice? Where are the performance bottlenecks and how to identify them? How to practice performance analysis? |
Week 4 |
3/13 |
(Recorded Video) Big-data systems: concepts and implementation issues. How to store petabyte-scale big data in high-performance cluster filesystems such as HDFS? How to process big data in a datacenter with Hadoop MapReduce? Beyond parallel computing, the key is data locality, and the trick is colocation. How to accelerate data processing with in-memory computing? Many open-source middleware projects are available for you to explore. |
Week 5 |
3/20 |
AI systems: Basics and implementation issues. Many AI applications contain abundant parallelism, and parallel computing can effectively accelerate them. Parallel algorithms were developed for search and expert systems before the last AI winter. Datacenters and GPU clusters were key to opening the deep learning era. How to train deep learning models with thousands of GPUs in the datacenter? |
Week 6 |
3/27 |
Edge-cloud computing and system software: Cloud computing, mobile computing, Internet of Things (IoT), autonomous driving, robots... Everything is connected and needs better mechanisms (system software?) to work together via networks. How do things connect? How do they collaborate efficiently? |
Week 7 |
4/3 |
Holiday |
Week 8 |
4/10 |
Information security and data privacy: How to protect data? There are security protocols and cryptographic methods for this purpose. The real new challenges today are to perform big data analytics and develop AI models under data protection. How to do it with techniques such as trusted computing hardware (SGX), federated learning, secure multiparty computation, and homomorphic encryption? |
Week 9 |
4/17 |
Domain-specific accelerators and heterogeneous computing: How to estimate the performance of neural networks with or without deep learning accelerators? How to find good neural networks for your application with platform-aware neural architecture search (NAS)? How to compress a neural network to reduce its resource consumption? |
Week 10 |
4/24 |
Large language models: How to train large language models such as GPT-3? What are the performance issues, and which frameworks address them? Can we compress an LLM to run on a PC? |
Week 11 |
5/1 |
Midterm Exam
Post-Moore - Neuromorphic computing and quantum computing: Growth in computing performance has depended on Moore's Law for the past 60 years, but Moore's Law is slowing down and will eventually end. How to continue improving the capability of big data analytics and AI in the post-Moore era? |
Week 12 |
5/8 |
Final Project Proposal |
Week 13 |
5/15 |
Advanced Topics (HPC) - TBD |
Week 14 |
5/22 |
Advanced Topics (Big Data) - TBD |
Week 15 |
5/29 |
Advanced Topics (AI) - TBD |
Week 16 |
6/5 |
Final Project Presentation |