update

2026-06-21 00:47:16 +08:00 · 2026-02-01 20:03:11 -08:00 · cefe6c8b13
commit cefe6c8b13
parent c0d6bac60f
2 changed files with 33 additions and 26 deletions
--- a/docs/机器学习系统/CSE234.en.md
+++ b/docs/机器学习系统/CSE234.en.md
@@ -20,23 +20,27 @@ This course focuses on the design of end-to-end large language model (LLM) syste
 The course can be more accurately divided into three parts (with several additional guest lectures):

 Part 1. Foundations: modern deep learning and computational representations  
-   - Modern deep learning and computation graphs (framework and system fundamentals)  
-   - Automatic differentiation and an overview of ML system architectures  
-   - Tensor formats, in-depth matrix multiplication, and hardware accelerators  
+- Modern deep learning and computation graphs (framework and system fundamentals)  
+- Automatic differentiation and an overview of ML system architectures  
+- Tensor formats, in-depth matrix multiplication, and hardware accelerators  
+
+

 Part 2. Systems and performance optimization: from GPU kernels to compilation and memory  
-   - GPUs and CUDA (including basic performance models)  
-   - GPU matrix multiplication and operator-level compilation  
-   - Triton programming, graph optimization, and compilation  
-   - Memory management (including practical issues and techniques in training and inference)  
-   - Quantization methods and system-level deployment  
+- GPUs and CUDA (including basic performance models)  
+- GPU matrix multiplication and operator-level compilation  
+- Triton programming, graph optimization, and compilation  
+- Memory management (including practical issues and techniques in training and inference)  
+- Quantization methods and system-level deployment  
+

 Part 3. LLM systems: training and inference  
-   - Parallelization strategies: model parallelism, collective communication, intra-/inter-op parallelism, and auto-parallelization  
-   - LLM fundamentals: Transformers, Attention, and MoE  
-   - LLM training optimizations (e.g., FlashAttention-style techniques)  
-   - LLM inference: continuous batching, paged attention, disaggregated prefill/decoding  
-   - Scaling laws
+- Parallelization strategies: model parallelism, collective communication, intra-/inter-op parallelism, and auto-parallelization  
+- LLM fundamentals: Transformers, Attention, and MoE  
+- LLM training optimizations (e.g., FlashAttention-style techniques)  
+- LLM inference: continuous batching, paged attention, disaggregated prefill/decoding  
+- Scaling laws
+

 (Guest lectures cover topics such as ML compilers, LLM pretraining and open science, fast inference, and tool use and agents, serving as complementary extensions.)

--- a/docs/机器学习系统/CSE234.md
+++ b/docs/机器学习系统/CSE234.md
@@ -22,23 +22,26 @@
 课程可以更准确地分为三个部分（外加若干 guest lecture）：

 Part 1. 基础：现代深度学习与计算表示
-   - Modern DL 与计算图（computational graph / framework 基础）
-   - Autodiff 与 ML system 架构概览
-   - Tensor format、MatMul 深入与硬件加速器（accelerators）
+- Modern DL 与计算图（computational graph / framework 基础）
+- Autodiff 与 ML system 架构概览
+- Tensor format、MatMul 深入与硬件加速器（accelerators）
+

 Part 2. 系统与性能优化：从 GPU Kernel 到编译与内存
-   - GPUs & CUDA（含基本性能模型）
-   - GPU MatMul 与算子编译（operator compilation）
-   - Triton 编程、图优化与编译（graph optimization & compilation）
-   - Memory（含训练/推理中的内存问题与技巧）
-   - Quantization（量化方法与系统落地）
+- GPUs & CUDA（含基本性能模型）
+- GPU MatMul 与算子编译（operator compilation）
+- Triton 编程、图优化与编译（graph optimization & compilation）
+- Memory（含训练/推理中的内存问题与技巧）
+- Quantization（量化方法与系统落地）
+

 Part 3. LLM系统：训练与推理
-   - 并行策略：模型并行、collective communication、intra-/inter-op、自动并行化
-   - LLM 基础：Transformer、Attention、MoE
-   - LLM 训练优化：FlashAttention 等
-   - LLM 推理：continuous batching、paged attention、disaggregated prefill/decoding
-   - Scaling law
+- 并行策略：模型并行、collective communication、intra-/inter-op、自动并行化
+- LLM 基础：Transformer、Attention、MoE
+- LLM 训练优化：FlashAttention 等
+- LLM 推理：continuous batching、paged attention、disaggregated prefill/decoding
+- Scaling law
+

 （Guest lectures：ML compiler、LLM pretraining/open science、fast inference、tool use & agents 等，作为补充与扩展。）