diff --git a/docs/_static/css/custom.css b/docs/_static/css/custom.css
index 8ac344f..7c0a031 100644
--- a/docs/_static/css/custom.css
+++ b/docs/_static/css/custom.css
@@ -3,3 +3,4 @@
 .rst-content a:visited{color: var(--ddpurple);}
 .wy-menu-vertical .caption-text{color: white;}
 .wy-menu-vertical a:active {background-color: var(--ddpurple);}
+figure img{border: 1px solid #ccc;}
diff --git a/docs/index.rst b/docs/index.rst
index 1b5a430..776aae4 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -14,8 +14,8 @@ Support this project by giving it a |:star:| on GitHub |ico1|
 .. toctree::
    :hidden:
 
-   profiling/index
    mental-model-for-go/index
+   profiling/index
 
 .. toctree::
    :maxdepth: 1
diff --git a/docs/mental-model-for-go/goroutine-scheduler.rst b/docs/mental-model-for-go/goroutine-scheduler.rst
index 4fcb86c..097badc 100644
--- a/docs/mental-model-for-go/goroutine-scheduler.rst
+++ b/docs/mental-model-for-go/goroutine-scheduler.rst
@@ -1,9 +1,15 @@
 Goroutine Scheduler
 ===================
 
-Let’s talk about the scheduler first using the example below:
+.. note::
+   The mental model presented here is intended to be as simple as possible, while still being useful to novice performance practitioners. In reality, scheduling is a lot more complicated, so please consider studying `More Information`_ after this introduction.
 
-.. code:: go
+HTTP Request Example
+--------------------
+
+The Go runtime includes a scheduler that manages how your code is executed on the CPUs of a system. Let’s learn about it using the example below:
+
+.. code-block:: go
 
    func main() {
        res, err := http.Get("https://example.org/")
@@ -13,40 +19,58 @@ Let’s talk about the scheduler first using the example below:
        fmt.Printf("%d\n", res.StatusCode)
    }
 
-Here we have a single goroutine, let’s call it ``G1``, that runs the
-``main`` function. The picture below shows a simplified timeline of how
-this goroutine might execute on a single CPU. Initially ``G1`` is
-running on the CPU to prepare the http request. Then the CPU becomes
-idle as the goroutine has to wait for the network. And finally it gets
-scheduled onto the CPU again to print out the status code.
+Here we have a single goroutine, let’s call it ``G1``, that runs the ``main`` function. :numref:`fig-timeline` below shows a simplified timeline of how this goroutine might execute on a single CPU. Initially ``G1`` is running on the CPU to prepare the HTTP request. Then the CPU becomes idle as the goroutine has to wait for the network. And finally it gets scheduled onto the CPU again to print out the status code.
 
-From the scheduler’s perspective, the program above executes like shown
-below. At first ``G1`` is ``Executing`` on ``CPU 1``. Then the goroutine
-is taken off the CPU while ``Waiting`` for the network. Once the
-scheduler notices that the network has replied (using non-blocking I/O,
-similar to Node.js), it marks the goroutine as ``Runnable``. And as soon
-as a CPU core becomes available, the goroutine starts ``Executing``
-again. In our case all cores are available, so ``G1`` can go back to
-``Executing`` the ``fmt.Printf()`` function on one of the CPUs
-immediately without spending any time in the ``Runnable`` state.
+.. figure:: /images/timeline.png
+   :name: fig-timeline
+   :width: 600
+   :align: center
 
-Most of the time, Go programs are running multiple goroutines, so you
-will have a few goroutines ``Executing`` on some of the CPU cores, a
-large number of goroutines ``Waiting`` for various reasons, and ideally
-no goroutines ``Runnable`` unless your program exhibits very high CPU
-load. An example of this can be seen below.
+   Simplified timeline showing the execution of an HTTP request.
 
-Of course the model above glosses over many details. In reality it’s
-turtles all the way down, and the Go scheduler works on top of threads
-managed by the operating system, and even CPUs themselves are capable of
-hyper-threading which can be seen as a form of scheduling. So if you’re
-interested, feel free to continue down this rabbit hole via Ardan labs
-series on `Scheduling in
-Go `__
-or similar material.
+From the scheduler’s perspective, the program above executes as shown in :numref:`fig-scheduler`. At first ``G1`` is ``Executing`` on ``CPU 1``. Then the goroutine is taken off the CPU while ``Waiting`` for the network. Once the scheduler notices that the network has replied (using non-blocking I/O, similar to Node.js), it marks the goroutine as ``Runnable``. And as soon as a CPU core becomes available, the goroutine starts ``Executing`` again. In our case all cores are available, so ``G1`` can go back to ``Executing`` the ``fmt.Printf()`` function on one of the CPUs immediately without spending any time in the ``Runnable`` state.
 
-However, the model above should be sufficient to understand the
-remainder of this guide. In particular it should become clear that the
-time measured by the various Go profilers is essentially the time your
-goroutines are spending in the ``Executing`` and ``Waiting`` states as
-illustrated by the diagram below.
+
+.. figure:: /images/scheduler.gif
+   :name: fig-scheduler
+   :width: 400
+   :align: center
+
+   Goroutine execution state changes for the timeline in :numref:`fig-timeline`.
+
+.. note::
+   In reality Go is scheduling goroutines on virtual processors that have OS threads assigned to them. From there on it's turtles all the way down, and it's actually the OS that schedules the threads on hardware threads that are scheduled by the CPUs themselves. But the truth is out there, so you should seek `More Information`_.
+
+Full Example
+------------
+
+Most of the time, Go programs are running multiple goroutines, so you will have a few goroutines ``Executing`` on some of the CPU cores, a large number of goroutines ``Waiting`` for various reasons, and ideally no goroutines ``Runnable`` unless your program exhibits very high CPU load. An example of this can be seen in :numref:`fig-scheduler-complete` below.
+
+.. figure:: /images/scheduler-complete.png
+   :name: fig-scheduler-complete
+   :width: 600
+   :align: center
+
+   Several goroutines in various scheduling states and the transition events between them.
+
+Profiling Time
+--------------
+
+Using the model above, we can now understand the output of time-based :doc:`/profiling/index` in Go. As illustrated by :numref:`fig-profiler-venn`, CPU time is the time goroutines spend in the ``Executing`` state, while mutex and block time accrues in channel or mutex ``Waiting`` states. Additionally there are ``Waiting`` states that are not covered by any profilers (e.g. I/O). And if there is more than one goroutine, the total amount of goroutine time will exceed the real time experienced by a user.
+
+.. figure:: /images/profiler-venn.png
+   :name: fig-profiler-venn
+   :align: center
+
+   Venn diagram showing the overlap between goroutine time and time-based :doc:`/profiling/index`.
+
+
+More Information
+----------------
+
+For more detailed information, check out the resources below.
+
+`Video: Go scheduler: Implementing language with lightweight concurrency `__ (2019)
+  Fantastic presentation by Dmitry Vyukov at Hydra 2019 – highly recommended.
+`Scheduling In Go `__ (2018)
+  Three-part series from William Kennedy with in-depth information on OS and Go scheduling.
diff --git a/docs/profiling/cpu-profiler.rst b/docs/profiling/cpu-profiler.rst
index 0f462c8..0194aba 100644
--- a/docs/profiling/cpu-profiler.rst
+++ b/docs/profiling/cpu-profiler.rst
@@ -109,10 +109,10 @@ main.work;main.directWork user:bob   4             40000000
 main.work;main.directWork user:alice 5             30000000
 ========================= ========== ============= ===============
 
-Viewing the same profile with pprof’s Graph view will also include the labels as shown in :numref:`cpu-profiler-labels`.
+Viewing the same profile with pprof’s Graph view will also include the labels as shown in :numref:`fig-cpu-profiler-labels`.
 
 .. figure:: /images/cpu-profiler-labels.png
-   :name: cpu-profiler-labels
+   :name: fig-cpu-profiler-labels
    :width: 400
    :align: center
 
@@ -171,7 +171,7 @@ More Information
 
 For even more detailed information, check out the resources below.
 
-`Inside the Go CPU profiler `__ (2022-09-26)
+`Inside the Go CPU profiler `__ (2022)
   A more in-depth look at the implementation details of the CPU profiler.
-`Profiling Improvements in Go 1.18 `__ (2022-02-11)
+`Profiling Improvements in Go 1.18 `__ (2022)
   Discusses the recent switch from ``setitimer(2)`` to ``timer_create(2)`` as well as improvements to pprof label recording.