
Ceph peering state machine

Posted by jianshenliu

Ceph provides a script that generates a graph of the peering state machine, which is very helpful for studying the state transitions of placement groups.

Run the following commands to generate the graph (the last step requires the dot tool from Graphviz):

$ git clone https://github.com/ceph/ceph.git
$ cd ceph
$ cat src/osd/PeeringState.h src/osd/PeeringState.cc | doc/scripts/gen_state_diagram.py > doc/dev/peering_graph.generated.dot
$ sed -i 's/7,7/1080,1080/' doc/dev/peering_graph.generated.dot
$ dot -Tsvg doc/dev/peering_graph.generated.dot > doc/dev/peering_graph.generated.svg

This is the latest SVG file I generated from the source as of this writing:

Ceph Peering State Machine

There is a lot to learn from this diagram, and studying it will be my task for the next few weeks.


Update: Nov 24, 2020

Ceph uses the Boost Statechart Library to create the peering state machine. In a statechart diagram, there are three main components: state, event, and transition. A state can be a single node or a node with inner states.

In the Ceph peering state machine diagram, the squares, ovals, and diamonds are states. Squares are states with inner states. Ovals are states without inner states. States in diamonds are the initial states of their parents. A state with inner states must define an initial state. For example, the state Start is the initial state of state Started. Similarly, Activating is the initial state of state Active.

A state can transition to another state by defining a trigger event and a transition. In the statechart diagram, arrows are transitions, and their labels are the events that trigger the transitions.
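To make the three components concrete, here is a minimal Python sketch of a hierarchical state machine. It is only a toy model, not Ceph code and not the Boost Statechart API. The parent and initial-state relations for Started, Primary, and Active follow the diagram; the two transitions and the helper names (`enter`, `fire`) are assumptions made for illustration, and the inner states of Peering are left out to keep the example short.

```python
# Illustrative model only: not Ceph code and not the Boost Statechart API.

# Each state maps to its parent (None for the outermost state in this sketch).
PARENT = {
    "Started": None,
    "Start": "Started",
    "Primary": "Started",
    "Peering": "Primary",   # the real Peering has inner states, omitted here
    "Active": "Primary",
    "Activating": "Active",
    "Stray": "Started",
}

# A state with inner states must name its initial inner state.
INITIAL = {
    "Started": "Start",
    "Primary": "Peering",
    "Active": "Activating",
}

# (source state, event) -> target state. These two entries are assumptions
# chosen for illustration; read the real ones off the generated diagram.
TRANSITIONS = {
    ("Start", "MakePrimary"): "Primary",
    ("Start", "MakeStray"): "Stray",
}


def enter(state):
    """Follow initial states downward until a state has no inner states."""
    while state in INITIAL:
        state = INITIAL[state]
    return state


def fire(current, event):
    """Look for a transition on the current state, then on its ancestors."""
    state = current
    while state is not None:
        target = TRANSITIONS.get((state, event))
        if target is not None:
            return enter(target)
        state = PARENT[state]
    return current  # no state in the chain handles this event


leaf = enter("Started")
print(leaf)                       # Start
print(fire(leaf, "MakePrimary"))  # Peering
print(enter("Active"))            # Activating
```

Entering a state with inner states always lands in its initial state, which is why the diagram has to mark initial states in some visible way.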

On close inspection, the generated Ceph peering state machine diagram is hard to read. The arrows and labels often tangle with each other, making it difficult to tell which label belongs to which arrow. For example, for the MakeStray event in the top right corner of the diagram, how can you tell which arrow is associated with it? Things get worse when a state defines complex transitions between its inner states (e.g., state Active).

To fix this problem, I updated the diagram generation script to draw each arrow and its label in the same color, chosen from a rotating color palette. The change has been merged upstream into Ceph: https://github.com/ceph/ceph/pull/38146
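The idea can be sketched in a few lines of Python. This is not the code from the pull request, just a minimal illustration of a rotating palette: the palette values, the function name `colored_edges`, and the example transitions are all assumptions. The DOT edge attributes `color` and `fontcolor` are standard Graphviz attributes that set the arrow color and the label color, respectively.

```python
from itertools import cycle

# Hypothetical palette; the colors used by the real script may differ.
PALETTE = ["#1f77b4", "#d62728", "#2ca02c", "#9467bd"]


def colored_edges(transitions, palette=PALETTE):
    """Return DOT edge lines where each arrow and its label share one color."""
    colors = cycle(palette)  # wraps around to the first color when exhausted
    lines = []
    for source, event, target in transitions:
        color = next(colors)
        lines.append(
            f'{source} -> {target} '
            f'[label="{event}", color="{color}", fontcolor="{color}"];'
        )
    return lines


# Example transitions, assumed for illustration.
edges = colored_edges([
    ("Start", "MakePrimary", "Primary"),
    ("Start", "MakeStray", "Stray"),
])
print("\n".join(edges))
```

Because the palette rotates, two arrows can end up with the same color once the number of transitions exceeds the palette size, but neighboring arrows are usually emitted close together and so tend to get different colors.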

Here is the result:

I also colored all initial states light gray. The diamond shape can mark only those initial states that have no inner states. The exception in this diagram is Peering, which has inner states of its own and is also the initial state of Primary. Without the extra coloring, nothing in the diagram would show that.

Hopefully, this change makes the Ceph peering state model easier to understand.
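A rough Python sketch of why the fill color is needed: a state without inner states can be drawn as a diamond node, but a state with inner states is drawn as a box around its children, so it has no node shape to change. The function name `state_dot`, the use of a Graphviz cluster for the box, and the placeholder inner state are assumptions for illustration and are not taken from the actual script. `style`, `fillcolor`, and `shape` are standard Graphviz attributes.

```python
def state_dot(name, is_initial, inner=()):
    """Return DOT text for one state; initial states get a light gray fill."""
    if inner:
        # A state with inner states becomes a cluster (a box around its
        # children), so a diamond cannot mark it; the fill is the only cue.
        fill = ' style="filled"; fillcolor="lightgrey";' if is_initial else ""
        return f'subgraph cluster_{name} {{ label="{name}";{fill} {" ".join(inner)} }}'
    if is_initial:
        return f'{name} [shape="diamond", style="filled", fillcolor="lightgrey"];'
    return f'{name} [shape="ellipse"];'


print(state_dot("Start", True))
# "Inner;" is a placeholder for the real inner states of Peering.
print(state_dot("Peering", True, inner=["Inner;"]))
print(state_dot("Stray", False))
```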
