Course Syllabus

Course Information

Course Title

Deep Learning for Music Analysis and Generation

Semester

112-1 (Fall 2023)

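Designated For

Graduate Institute of Electrical Engineering, College of Electrical Engineering and Computer Science

Instructor

Yi-Hsuan Yang (楊奕軒)

Course Number

CommE5070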

Course Identifier

942 U0840

Class Section

Credits

3.0

Full/Half Year

Half year

Required/Elective

Elective

Time

Thursday, periods 7, 8, 9 (14:20-17:20)

Location

Room 229, Electrical Engineering Building II (電二229)

Notes

Enrollment limit: 80 students

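Course Syllabus

To protect everyone's rights, please respect intellectual property rights and do not make illegal photocopies.

Course Description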

“Music Information Research” (MIR) is an interdisciplinary research field concerned with the analysis, retrieval, processing, and generation of musical content or information. Researchers involved in MIR may have a background in signal processing, machine learning, information retrieval, human-computer interaction, musicology, psychoacoustics, psychology, or some combination of these.

In this course, we are mainly interested in the application of machine learning, in particular deep learning, to music-related problems. Specifically, the course is divided into two parts: analysis and generation.

The first part is about the analysis of musical audio signals, covering topics such as feature extraction and representation learning for musical audio, music audio classification, melody extraction, automatic music transcription, and musical source separation.

The second part is about the generation of musical material, including symbolic-domain data such as MIDI or tablatures, and audio-domain music signals such as singing voices and instrumental music. This part involves deep generative models such as generative adversarial networks (GANs), variational autoencoders (VAEs), Transformers, and diffusion models.

Here is a tentative schedule of the course:

W1. Introduction to the course
W2. Fundamentals & Music representation
W3. Analysis I (timbre): Automatic music classification and representation learning
(HW1: Singer classifier)
W4. Generation I: Source separation
W5. Generation II: GAN & Vocoders
W6. Generation III: Synthesis of notes and loops
(HW2: GAN-based Mel-Vocoder)
W7. Analysis II (pitch): Music transcription, Melody extraction, and Chord Recognition
W8. Generation IV: Symbolic MIDI generation
W9. Generation V: Symbolic MIDI generation: Advanced Topics
(HW3: Transformer-based pop piano MIDI generation)
W10. Generation VI: Singing voice generation
W11. Generation VII: Text-to-music generation
W12. Proposal of ideas of final projects
W13. Generation VIII: Differentiable DSP models and automatic mixing
W14. Miscellaneous Topics
W15. Break
W16. Oral presentation of final projects

Course Objectives

1. An understanding of different aspects of music (timbre, rhythm, pitch, harmony, and structure) and of how domain knowledge is used in the corresponding music signal analysis tasks.
2. An understanding of, and hands-on experience with, deep learning techniques for music audio signal analysis.
3. An understanding of, and hands-on experience with, deep generative models for both musical audio and text-like music data such as MIDI.
4. A taste of the fun of research.

Course Requirements

I assume that students taking this course:
* have a good background in machine learning and mathematics (e.g., have taken courses such as Machine Learning, Deep Learning, Signals and Systems, Digital Signal Processing, Linear Algebra, Probability and Statistics)
* have good coding experience in Python and a deep learning framework such as PyTorch
* have a great interest in music

Expected Weekly Study Hours After Class

Office Hours

Required Reading

Meinard Müller, Fundamentals of Music Processing: Using Python and Jupyter Notebooks, 2nd edition. ISBN: 978-3-030-69807-2. Springer, 2021.

References

Jakub M. Tomczak, Deep Generative Modeling. ISBN: 978-3-030-93158-2. Springer, 2022.

Grading
(for reference only)

No.	Item	Percentage	Description
1.	Coding assignments	60%	Three homework assignments (completed individually and at home; students must submit code, model, and report)
2.	Final project	40%	Teams of two or three; oral presentation + technical report.

Adjustments for Students with Difficulties

Class format	Flexible attendance options are provided
Assignment submission
Exam format
Other

Course Schedule

Week	Date	Topic
No data