ouroboros-network-0.16.0.0: A networking layer for the Ouroboros blockchain protocol
Safe Haskell: Safe-Inferred
Language: Haskell2010

Ouroboros.Network.PeerSelection.Governor.Monitor

Description

This module contains governor decisions for monitoring tasks:

  • monitoring local root peer config changes
  • monitoring changes to the peer target numbers
  • monitoring the completion of asynchronous governor jobs
  • monitoring connections

Documentation

targetPeers ∷ (MonadSTM m, Ord peeraddr) ⇒ PeerSelectionActions peeraddr peerconn m → PeerSelectionState peeraddr peerconn → Guarded (STM m) (TimedDecision m peeraddr peerconn)

Monitor PeerSelectionTargets. If they change, we only need to update PeerSelectionState: because the new state is returned in a Decision, it will be picked up by the governor's peerSelectionGovernorLoop.

Note that if the node is in bootstrap mode (i.e. in a sensitive state), this monitoring action is disabled until the node reaches a clean state, and churning is therefore disabled as well.
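The blocking behaviour described above can be sketched with plain STM from base. This is a minimal model under stated assumptions, not the module's implementation: `Targets`, `State`, `monitorTargets` and `demo` are hypothetical names, `Maybe` stands in for `Guarded` (`Nothing` meaning the action is disabled), and the enabling condition is read off the prose above.

```haskell
import GHC.Conc (STM, TVar, atomically, newTVarIO, readTVar, retry)

-- Hypothetical, simplified stand-in for PeerSelectionTargets.
data Targets = Targets
  { targetKnown       :: Int
  , targetEstablished :: Int
  , targetActive      :: Int
  } deriving (Eq, Show)

-- Hypothetical, simplified stand-in for PeerSelectionState.
data State = State
  { stTargets               :: Targets
  , stInSensitiveState      :: Bool
  , stHasOnlyBootstrapPeers :: Bool
  } deriving (Eq, Show)

-- Nothing models a disabled guard. Just wraps an STM action that blocks
-- (via retry) until the published targets differ from those in the state,
-- and then returns the updated state.
monitorTargets :: TVar Targets -> State -> Maybe (STM State)
monitorTargets targetsVar st
  | stInSensitiveState st && not (stHasOnlyBootstrapPeers st) = Nothing
  | otherwise = Just $ do
      published <- readTVar targetsVar
      if published == stTargets st
        then retry
        else pure st { stTargets = published }

-- Small driver: the published targets already differ from the state's,
-- so running the monitor once returns immediately.
demo :: IO (Maybe State)
demo = do
  targetsVar <- newTVarIO (Targets 100 50 20)
  let st = State (Targets 80 40 15) False False
  traverse atomically (monitorTargets targetsVar st)
```

In this model, returning the new state plays the role of the Decision: the caller's loop is what stores it.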

jobs ∷ MonadSTM m ⇒ JobPool () m (Completion m peeraddr peerconn) → PeerSelectionState peeraddr peerconn → Guarded (STM m) (TimedDecision m peeraddr peerconn)

Wait for the first result from the JobPool and return its Decision.
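As a rough illustration of "wait for the first result, then turn it into a decision", the sketch below models the pool as a queue of finished results held in a `TVar`. `Pool`, `Completion`, `waitForFirst`, `monitorJobs` and `demo` are hypothetical names chosen for this sketch; the real `JobPool` and `Completion` types are richer than this.

```haskell
import GHC.Conc (STM, TVar, atomically, newTVarIO, readTVar, retry, writeTVar)

-- Hypothetical stand-in for a job pool: finished results queue up in a TVar.
newtype Pool a = Pool (TVar [a])

-- Hypothetical stand-in for a completion: maps the current state to the next.
newtype Completion s = Completion (s -> s)

-- Block until at least one result is available, then dequeue the oldest one.
waitForFirst :: Pool a -> STM a
waitForFirst (Pool doneVar) = do
  done <- readTVar doneVar
  case done of
    []       -> retry
    (r : rs) -> writeTVar doneVar rs >> pure r

-- Apply the first available completion to the state passed in.
monitorJobs :: Pool (Completion s) -> s -> STM s
monitorJobs pool st = do
  Completion k <- waitForFirst pool
  pure (k st)

-- Two jobs have finished; only the first one is consumed by a single run.
demo :: IO Int
demo = do
  doneVar <- newTVarIO [Completion (+ 1), Completion (* 10)]
  atomically (monitorJobs (Pool doneVar) 41)
```

Because the completion is applied to the state at the time the result is collected, a job that finishes late still acts on the current state rather than on a stale copy.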

connections ∷ ∀ m peeraddr peerconn. (MonadSTM m, Ord peeraddr) ⇒ PeerSelectionActions peeraddr peerconn m → PeerSelectionState peeraddr peerconn → Guarded (STM m) (TimedDecision m peeraddr peerconn)

Monitor connections.

localRoots ∷ ∀ peeraddr peerconn m. (MonadSTM m, Ord peeraddr) ⇒ PeerSelectionActions peeraddr peerconn m → PeerSelectionState peeraddr peerconn → Guarded (STM m) (TimedDecision m peeraddr peerconn)

Monitor local roots using the readLocalRootPeers STM action.

If the current ledger state is TooOld, we can only trust our trustable local root peers. This means that if we remove any local root peer, we might no longer abide by the invariant that we are only connected to trusted peers. For example, suppose the local root peers are A, B*, C* (where * marks a trusted peer), the node is in bootstrap mode, and the local root peers are reconfigured to the set A*, B, C*, D*, E. B is no longer trusted, but we keep the connection to it until the outbound governor notices and disconnects from it.
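The A, B*, C* example can be worked through with a small pure function. This is only an illustration of the set arithmetic: `Trust`, `LocalRoots`, `trusted` and `noLongerTrusted` are hypothetical names, and the real module tracks considerably more than a trusted flag per peer.

```haskell
type Peer = String

data Trust = Trusted | NotTrusted
  deriving (Eq, Show)

-- Hypothetical, simplified local root configuration.
type LocalRoots = [(Peer, Trust)]

-- Peers marked as trusted in a configuration.
trusted :: LocalRoots -> [Peer]
trusted roots = [p | (p, Trusted) <- roots]

-- Peers that were trusted under the old configuration but are not trusted
-- under the new one. In bootstrap mode these are the connections that
-- violate the "only trusted peers" invariant until they are dropped.
noLongerTrusted :: LocalRoots -> LocalRoots -> [Peer]
noLongerTrusted old new = [p | p <- trusted old, p `notElem` trusted new]

oldRoots, newRoots :: LocalRoots
oldRoots = [("A", NotTrusted), ("B", Trusted), ("C", Trusted)]
newRoots =
  [ ("A", Trusted), ("B", NotTrusted), ("C", Trusted)
  , ("D", Trusted), ("E", NotTrusted) ]
```

Here `noLongerTrusted oldRoots newRoots` evaluates to `["B"]`, matching the peer singled out in the example.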

monitorLedgerStateJudgement ∷ (MonadSTM m, Ord peeraddr) ⇒ PeerSelectionActions peeraddr peerconn m → PeerSelectionState peeraddr peerconn → Guarded (STM m) (TimedDecision m peeraddr peerconn)

Monitor LedgerStateJudgement. If it changes, we only need to update PeerSelectionTargets, depending on the new value. If the ledger state changed to TooOld, we set all other targets to 0; the governor then waits for all active connections to drop and sets the targets to sensible values for getting caught up again. If instead the state changes to YoungEnough, we reset the targets to their original values.

Note that if the node has bootstrap peers disabled, this monitoring action is disabled.

Note also that churning is ignored until the node converges to a clean state, i.e. the node disconnects from the targets source of truth.

monitorBootstrapPeersFlag ∷ (MonadSTM m, Ord peeraddr) ⇒ PeerSelectionActions peeraddr peerconn m → PeerSelectionState peeraddr peerconn → Guarded (STM m) (TimedDecision m peeraddr peerconn)

Monitor the UseBootstrapPeers flag.

The user might reconfigure the node at any point to change the value of UseBootstrapPeers, i.e. bootstrap peers can be enabled or disabled at any time. Since monitorLedgerStateJudgement acts on changes to the ledger state judgement value, this monitoring action is only responsible for either disabling bootstrap peers (if the user disables the flag) or enabling the monitorLedgerStateJudgement action to work correctly if the node finds itself in bootstrap mode. To achieve this, when bootstrap peers are disabled we update the ledger state judgement value to YoungEnough and set hasOnlyBootstrapPeers to False.

Here is a brief explanation of why this works. There are four scenarios to consider:

  1. The node is in YoungEnough state and the user
       1. Enables bootstrap peers: In this case, since the node is caught up, nothing should happen, so setting the LSJ to YoungEnough state is idempotent.
       2. Disables bootstrap peers: In this case, since the node is caught up, its functioning can't really be distinguished from that of a node that has bootstrap peers disabled. So changing the LSJ and hasOnlyBootstrapPeers flag is idempotent.
  2. The node is in TooOld state and the user
       1. Enables bootstrap peers: If the node is behind, enabling bootstrap peers will enable monitorLedgerStateJudgement. So if we set the LSJ to be in YoungEnough state, it is going to make sure monitorLedgerStateJudgement observes the TooOld state, triggering the right measures to be taken.
       2. Disables bootstrap peers: If this is the case, we want to let the peer connect to non-trusted peers, so just updating the bootstrap peers flag will enable the previously disabled monitoring actions.
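The interplay between the two monitors in these scenarios can be modelled with two small pure functions. This is a sketch of one reading of the scenarios, not the module's code: `St`, `flagChanged` and `judgementChanged` are hypothetical names, the flag is reduced to a `Bool`, and resetting the recorded judgement on every flag change is an assumption drawn from the scenario descriptions.

```haskell
data LSJ = YoungEnough | TooOld
  deriving (Eq, Show)

-- Hypothetical, simplified slice of the governor state.
data St = St
  { stUseBootstrap          :: Bool
  , stLSJ                   :: LSJ   -- judgement last recorded by the governor
  , stHasOnlyBootstrapPeers :: Bool
  } deriving (Eq, Show)

-- Flag monitor: fires only when the configured flag differs from the
-- recorded one, and resets the recorded judgement to YoungEnough.
flagChanged :: Bool -> St -> Maybe St
flagChanged configured st
  | configured == stUseBootstrap st = Nothing
  | otherwise = Just st
      { stUseBootstrap          = configured
      , stLSJ                   = YoungEnough
      , stHasOnlyBootstrapPeers = False
      }

-- Judgement monitor: disabled while bootstrap peers are off; otherwise it
-- fires when the actual judgement differs from the recorded one.
judgementChanged :: LSJ -> St -> Maybe St
judgementChanged actual st
  | not (stUseBootstrap st) = Nothing
  | actual == stLSJ st      = Nothing
  | otherwise               = Just st { stLSJ = actual }
```

For a node that is behind, with bootstrap peers just enabled, `flagChanged True` records YoungEnough, so a following `judgementChanged TooOld` sees a difference and fires. For a caught-up node, `judgementChanged YoungEnough` returns `Nothing` after the same flag change, which is the idempotent case.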

waitForSystemToQuiesce ∷ (MonadSTM m, Ord peeraddr) ⇒ PeerSelectionState peeraddr peerconn → Guarded (STM m) (TimedDecision m peeraddr peerconn)

If the node has just entered the TooOld state, its targets have just been adjusted to get rid of all peers. This job monitors the node state, and once the node has arrived at a clean (quiesced) state it sets the hasOnlyBootstrapPeers flag, which unblocks the localRoots and targetPeers monitoring actions and allows the node to make progress by connecting only to trusted peers.

Note that if the node is not in bootstrap mode (i.e. not in a sensitive state), this monitoring action is disabled.

If the node takes more than 15 minutes to converge to a clean state, it will crash itself so that it can be brought back up in a clean state. Taking that long means something is seriously wrong, such as a global network outage, DNS issues, or an actual bug in the code. In any case, we detect this and have a way to observe such cases.
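The 15-minute rule can be sketched as a pure check. This is an illustration only: `Outcome`, `checkQuiesce` and `quiesceTimeoutSeconds` are hypothetical names, times are plain seconds, and the "clean state" test is passed in as a `Bool` because its definition is not given here. Where and how the governor actually tracks the deadline is not covered by this sketch.

```haskell
-- Timeout taken from the description above: 15 minutes, in seconds.
quiesceTimeoutSeconds :: Int
quiesceTimeoutSeconds = 15 * 60

data Outcome
  = StillWaiting  -- not clean yet, deadline not reached
  | Quiesced      -- clean state reached; hasOnlyBootstrapPeers can be set
  | TimedOut      -- deadline exceeded; the node is expected to crash
  deriving (Eq, Show)

-- enteredAt: time (seconds) at which the node entered the sensitive state.
-- now:       current time (seconds).
-- isClean:   whether the node currently satisfies the clean-state check.
checkQuiesce :: Int -> Int -> Bool -> Outcome
checkQuiesce enteredAt now isClean
  | isClean                                 = Quiesced
  | now - enteredAt > quiesceTimeoutSeconds = TimedOut
  | otherwise                               = StillWaiting
```

In this sketch a clean state takes precedence over the deadline, so a node that quiesces late is still treated as quiesced; that ordering is a design choice of the sketch.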