mirror of
https://github.com/PKUFlyingPig/cs-self-learning.git
synced 2026-06-22 17:37:17 +08:00
add CMU11868
parent 2b4ba63b09
commit 26c058b1e5
3 changed files with 66 additions and 0 deletions
29
docs/深度生成模型/大语言模型/CMU11-868.en.md
Normal file
|
|
@@ -0,0 +1,29 @@
|
|||
# CMU 11868: Large Language Model System
|
||||
|
||||
## Course Overview
|
||||
|
||||
- University: Carnegie Mellon University (CMU)
|
||||
- Prerequisites: Basic knowledge of deep learning
|
||||
- Programming Languages: Python, CUDA
|
||||
- Difficulty: 🌟🌟🌟🌟
|
||||
- Class Hours: 60 hours
|
||||
|
||||
In recent years, progress in artificial intelligence has owed much to the rapid development of large language models (LLMs) and other generative methods. These models are usually enormous (for example, GPT-3 has 175 billion parameters), so building scalable LLM systems is crucial.
|
||||
In this course, students will learn the core skills of designing LLMs at the system level.
|
||||
A major difference between this course and similar ones is its heavy coverage of GPU acceleration techniques. The course introduces the well-known [FlashAttention](https://llmsystem.github.io/llmsystem2024spring/assets/files/Group-FlashAttention-0b70d553037a7729dd2a9af5e23d8b3e.pdf), and the assignments require you to implement operators that accelerate training.
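As a rough illustration of the kind of idea behind FlashAttention, the sketch below shows the streaming ("online") softmax trick in plain Python: the normalizer is accumulated in a single pass by rescaling a running sum whenever a larger maximum appears. The function names are hypothetical, the inputs are assumed to be finite floats, and this is not taken from the course materials.

```python
import math


def softmax_two_pass(xs):
    """Reference softmax: one pass for the max, one for the sum."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_online(xs):
    """Softmax whose max and normalizer are built in a single streaming pass."""
    running_max = float("-inf")
    running_sum = 0.0
    for x in xs:
        new_max = max(running_max, x)
        # Rescale the old sum to the new max, then add the new term.
        running_sum = running_sum * math.exp(running_max - new_max) + math.exp(x - new_max)
        running_max = new_max
    return [math.exp(x - running_max) / running_sum for x in xs]
```

Both functions should agree up to floating-point error. The streaming form matters on a GPU because the statistics can be updated block by block without first reading the whole row.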
|
||||
Overall, the course is well suited to students interested in the system design of large language models.
|
||||
|
||||
This course assumes some background in deep learning and is not suitable for complete beginners. The prerequisites are listed in the [FAQ](https://llmsystem.github.io/llmsystem2024spring/docs/FAQ).
|
||||
The assignments are generally challenging. Their main contents are:
|
||||
|
||||
1. Assignment1: Automatic differentiation framework + hand-written CUDA operators + basic neural network construction
|
||||
2. Assignment2: GPT2 model construction
|
||||
3. Assignment3: Speed up model training with hand-written CUDA Softmax and LayerNorm operators
|
||||
4. Assignment4: Distributed model training; the environment may be hard to set up for self-study
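
To give a feel for the operators named in Assignment3, here is a minimal pure-Python sketch of a LayerNorm forward pass over one vector, the sort of slow reference a hand-written CUDA kernel could be checked against. The function name, the `eps=1e-5` default, and the optional `gamma`/`beta` arguments are assumptions for illustration; this is not the course's starter code.

```python
import math


def layer_norm(xs, gamma=None, beta=None, eps=1e-5):
    """Normalize one vector to zero mean and unit variance, then scale and shift."""
    n = len(xs)
    mean = sum(xs) / n
    # Biased (population) variance, as is conventional for LayerNorm.
    var = sum((x - mean) ** 2 for x in xs) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    gamma = gamma if gamma is not None else [1.0] * n  # assumed default scale
    beta = beta if beta is not None else [0.0] * n     # assumed default shift
    return [(x - mean) * inv_std * g + b for x, g, b in zip(xs, gamma, beta)]
```

A GPU version would typically compute the mean and variance with a parallel reduction per row; the arithmetic per element stays the same as in this sketch.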
|
||||
|
||||
Like many other high-quality courses, the slides and assignments of this course are openly available, with fairly detailed local test code, which makes it suitable for self-study.
|
||||
|
||||
## Course Resources
|
||||
|
||||
- Course Website: [https://llmsystem.github.io](https://llmsystem.github.io)
|
||||
- Assignments: <https://github.com/llmsystem>
|
||||
34
docs/深度生成模型/大语言模型/CMU11-868.md
Normal file
|
|
@@ -0,0 +1,34 @@
|
|||
# CMU 11868: Large Language Model System
|
||||
|
||||
## Course Overview
|
||||
|
||||
- University: CMU
|
||||
- Prerequisites: Basic knowledge of deep learning
|
||||
- Programming Languages: Python, CUDA
|
||||
- Difficulty: 🌟🌟🌟🌟
|
||||
- Class Hours: 60 hours
|
||||
|
||||
|
||||
|
||||
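In recent years, progress in artificial intelligence has owed much to the rapid development of large language models (LLMs) and other generative methods. These models are usually enormous (for example, GPT-3 has 175 billion parameters), so building scalable LLM systems has become crucial.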
|
||||
In this course, students will learn the core skills of designing LLMs at the system level.
|
||||
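A major difference between this course and similar ones is its heavy coverage of GPU acceleration techniques. The course introduces the well-known [FlashAttention](https://llmsystem.github.io/llmsystem2024spring/assets/files/Group-FlashAttention-0b70d553037a7729dd2a9af5e23d8b3e.pdf), and the assignments require you to implement operators that accelerate training.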
|
||||
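In addition, the course covers system-level acceleration techniques such as [PagedAttention](https://llmsystem.github.io/llmsystem2024spring/assets/files/Group-vLLM-presentation-8fab23dec42abb93f4075b63f1cc9e83.pptx) and distributed training. Overall, it is well suited to students interested in the system-design side of large models.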
|
||||
|
||||
|
||||
This course assumes some background in deep learning and is not suitable for complete beginners. See the prerequisites in the [FAQ](https://llmsystem.github.io/llmsystem2024spring/docs/FAQ).
|
||||
The assignments are generally challenging. Their main contents are:
|
||||
|
||||
1. Assignment1: Automatic differentiation framework + hand-written CUDA operators + basic neural network construction
|
||||
2. Assignment2: GPT2 model construction
|
||||
3. Assignment3: Speed up model training with hand-written CUDA Softmax and LayerNorm operators
|
||||
4. Assignment4: Distributed model training; the environment may be hard to set up for self-study
|
||||
|
||||
|
||||
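Like many other high-quality courses, the slides and assignments of this course are openly available, with fairly detailed local test code, which makes it suitable for self-study.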
|
||||
|
||||
|
||||
## Course Resources
|
||||
|
||||
- Course Website: <https://llmsystem.github.io>
|
||||
- Assignments: <https://github.com/llmsystem>
|
||||
|
|
@@ -114,6 +114,7 @@ plugins:
|
|||
"国立台湾大学: 李宏毅机器学习": NTU Machine Learning
|
||||
深度生成模型: Deep Generative Models
|
||||
学习路线图: Roadmap
|
||||
"大语言模型": Large Language Models
|
||||
机器学习进阶: Advanced Machine Learning
|
||||
学习路线图: Roadmap
|
||||
后记: Postscript
|
||||
|
|
@@ -282,6 +283,8 @@ nav:
|
|||
- "UCB CS285: Deep Reinforcement Learning": "深度学习/CS285.md"
|
||||
- 深度生成模型:
|
||||
- "学习路线图": "深度生成模型/roadmap.md"
|
||||
- "大语言模型":
|
||||
- "CMU 11868: Large Language Model System": "深度生成模型/大语言模型/CMU11-868.md"
|
||||
- 机器学习进阶:
|
||||
- "学习路线图": "机器学习进阶/roadmap.md"
|
||||
- "CMU 10-708: Probabilistic Graphical Models": "机器学习进阶/CMU10-708.md"
|
||||
|
|
|
|||