finish scheduler doc

Felix Geisendörfer 2022-12-30 08:25:20 +01:00
parent 6342fc3259
commit 54508f023d
4 changed files with 65 additions and 40 deletions

@ -3,3 +3,4 @@
.rst-content a:visited{color: var(--ddpurple);}
.wy-menu-vertical .caption-text{color: white;}
.wy-menu-vertical a:active {background-color: var(--ddpurple);}
figure img{border: 1px solid #ccc;}

@ -14,8 +14,8 @@ Support this project by giving it a |:star:| on GitHub |ico1|
.. toctree::
:hidden:
profiling/index
mental-model-for-go/index
profiling/index
.. toctree::
:maxdepth: 1

@ -1,9 +1,15 @@
Goroutine Scheduler
===================
Let's talk about the scheduler first using the example below:
.. note::
The mental model presented here is intended to be as simple as possible while still being useful to novice performance practitioners. In reality, scheduling is a lot more complicated, so please consider studying `More Information`_ after this introduction.
.. code:: go
HTTP Request Example
--------------------
The Go runtime includes a scheduler that manages how your code is executed on the CPUs of a system. Let's learn about it using the example below:
.. code-block:: go
func main() {
res, err := http.Get("https://example.org/")
@ -13,40 +19,58 @@ Lets talk about the scheduler first using the example below:
fmt.Printf("%d\n", res.StatusCode)
}
Here we have a single goroutine, let's call it ``G1``, that runs the
``main`` function. The picture below shows a simplified timeline of how
this goroutine might execute on a single CPU. Initially ``G1`` is
running on the CPU to prepare the HTTP request. Then the CPU becomes
idle as the goroutine has to wait for the network. And finally it gets
scheduled onto the CPU again to print out the status code.
Here we have a single goroutine, let's call it ``G1``, that runs the ``main`` function. :numref:`fig-timeline` below shows a simplified timeline of how this goroutine might execute on a single CPU. Initially ``G1`` is running on the CPU to prepare the HTTP request. Then the CPU becomes idle as the goroutine has to wait for the network. And finally it gets scheduled onto the CPU again to print out the status code.
From the scheduler's perspective, the program above executes as shown
below. At first ``G1`` is ``Executing`` on ``CPU 1``. Then the goroutine
is taken off the CPU while ``Waiting`` for the network. Once the
scheduler notices that the network has replied (using non-blocking I/O,
similar to Node.js), it marks the goroutine as ``Runnable``. And as soon
as a CPU core becomes available, the goroutine starts ``Executing``
again. In our case all cores are available, so ``G1`` can go back to
``Executing`` the ``fmt.Printf()`` function on one of the CPUs
immediately without spending any time in the ``Runnable`` state.
.. figure:: /images/timeline.png
:name: fig-timeline
:width: 600
:align: center
Most of the time, Go programs are running multiple goroutines, so you
will have a few goroutines ``Executing`` on some of the CPU cores, a
large number of goroutines ``Waiting`` for various reasons, and ideally
no goroutines ``Runnable`` unless your program exhibits very high CPU
load. An example of this can be seen below.
Simplified timeline showing the execution of an HTTP request.
Of course the model above glosses over many details. In reality it's
turtles all the way down, and the Go scheduler works on top of threads
managed by the operating system, and even CPUs themselves are capable of
hyper-threading which can be seen as a form of scheduling. So if you're
interested, feel free to continue down this rabbit hole via Ardan Labs'
series on `Scheduling in
Go <https://www.ardanlabs.com/blog/2018/08/scheduling-in-go-part1.html>`__
or similar material.
From the scheduler's perspective, the program above executes as shown in :numref:`fig-scheduler`. At first ``G1`` is ``Executing`` on ``CPU 1``. Then the goroutine is taken off the CPU while ``Waiting`` for the network. Once the scheduler notices that the network has replied (using non-blocking I/O, similar to Node.js), it marks the goroutine as ``Runnable``. And as soon as a CPU core becomes available, the goroutine starts ``Executing`` again. In our case all cores are available, so ``G1`` can go back to ``Executing`` the ``fmt.Printf()`` function on one of the CPUs immediately without spending any time in the ``Runnable`` state.
However, the model above should be sufficient to understand the
remainder of this guide. In particular it should become clear that the
time measured by the various Go profilers is essentially the time your
goroutines are spending in the ``Executing`` and ``Waiting`` states as
illustrated by the diagram below.
.. figure:: /images/scheduler.gif
:name: fig-scheduler
:width: 400
:align: center
Goroutine execution state changes for the timeline in :numref:`fig-timeline`.
.. note::
In reality Go is scheduling goroutines on virtual processors that have OS threads assigned to them. From there on it's turtles all the way down, and it's actually the OS that schedules the threads on hardware threads that are scheduled by the CPUs themselves. But the truth is out there, so you should seek `More Information`_.
Full Example
------------
Most of the time, Go programs are running multiple goroutines, so you will have a few goroutines ``Executing`` on some of the CPU cores, a large number of goroutines ``Waiting`` for various reasons, and ideally no goroutines ``Runnable`` unless your program exhibits very high CPU load. An example of this can be seen in :numref:`fig-scheduler-complete` below.
.. figure:: /images/scheduler-complete.png
:name: fig-scheduler-complete
:width: 600
:align: center
Several goroutines in various scheduling states and the transition events between them.
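To get a feel for how many goroutines can sit in the ``Waiting`` state at once, the following sketch parks a number of goroutines on a channel and then asks the runtime how many goroutines exist. The ``countWhileWaiting`` helper is a hypothetical name chosen for this illustration; ``runtime.NumGoroutine`` and ``runtime.GOMAXPROCS`` are standard library functions.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// countWhileWaiting starts n goroutines that block on a channel and
// returns the number of goroutines that exist while they are parked.
// The helper name is an assumption made for this sketch.
func countWhileWaiting(n int) int {
	release := make(chan struct{})
	var started, done sync.WaitGroup

	for i := 0; i < n; i++ {
		started.Add(1)
		done.Add(1)
		go func() {
			defer done.Done()
			started.Done()
			<-release // Waiting until the channel is closed
		}()
	}

	started.Wait() // every goroutine has begun running at least once
	count := runtime.NumGoroutine()

	close(release) // all waiters become Runnable
	done.Wait()
	return count
}

func main() {
	count := countWhileWaiting(1000)
	fmt.Printf("goroutines=%d GOMAXPROCS=%d\n", count, runtime.GOMAXPROCS(0))
}
```

The goroutine count is typically far larger than ``GOMAXPROCS``, which limits how many goroutines can be ``Executing`` Go code at the same time.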
Profiling Time
--------------
Using the model above, we can now understand the output of time-based :doc:`/profiling/index` in Go. As illustrated by :numref:`fig-profiler-venn`, CPU time is the time goroutines spend in the ``Executing`` state, while mutex and block time accrue in channel or mutex ``Waiting`` states. Additionally, there are ``Waiting`` states that are not covered by any profilers (e.g. I/O). And if there is more than one goroutine, the total amount of goroutine time will exceed the real time experienced by a user.
.. figure:: /images/profiler-venn.png
:name: fig-profiler-venn
:align: center
Venn diagram showing the overlap between goroutine time and time-based :doc:`/profiling/index`.
More Information
----------------
For more detailed information, check out the resources below.
`Video: Go scheduler: Implementing language with lightweight concurrency <https://youtu.be/-K11rY57K7k>`__ (2019)
Fantastic presentation by Dmitry Vyukov at Hydra 2019. Highly recommended.
`Scheduling In Go <https://www.ardanlabs.com/blog/2018/08/scheduling-in-go-part1.html>`__ (2018)
Three-part series from William Kennedy with in-depth information on OS and Go scheduling.

@ -109,10 +109,10 @@ main.work;main.directWork user:bob 4 40000000
main.work;main.directWork user:alice 5 30000000
========================= ========== ============= ===============
Viewing the same profile with pprof's Graph view will also include the labels as shown in :numref:`cpu-profiler-labels`.
Viewing the same profile with pprof's Graph view will also include the labels as shown in :numref:`fig-cpu-profiler-labels`.
.. figure:: /images/cpu-profiler-labels.png
:name: cpu-profiler-labels
:name: fig-cpu-profiler-labels
:width: 400
:align: center
@ -171,7 +171,7 @@ More Information
For even more detailed information, check out the resources below.
`Inside the Go CPU profiler <https://sumercip.com/posts/inside-the-go-cpu-profiler/>`__ (2022-09-26)
`Inside the Go CPU profiler <https://sumercip.com/posts/inside-the-go-cpu-profiler/>`__ (2022)
A more in-depth look at the implementation details of the CPU profiler.
`Profiling Improvements in Go 1.18 <https://felixge.de/2022/02/11/profiling-improvements-in-go-1.18/>`__ (2022-02-11)
`Profiling Improvements in Go 1.18 <https://felixge.de/2022/02/11/profiling-improvements-in-go-1.18/>`__ (2022)
Discusses the recent switch from ``setitimer(2)`` to ``timer_create(2)`` as well as improvements to pprof label recording.