Add some of my notes to README.md

James Andariese 2024-07-24 02:26:30 +00:00
parent 44677c8a90
commit 684cc85dc6

@@ -112,4 +112,67 @@ will lose its lock. Since it _must_ lose its lock and _must_ recognize that
this has happened, the health check will always time out after T and the lock
will be given up immediately.
#### Not init
This solution is intended to signal services to start or stop, not to act as an
init system in its own right. Thus, you should run something like
`systemctl start database.service`, not `redis-server`. Fencing should also be
handled via the systemd unit DAG.
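
As a rough illustration of "signal, don't supervise", the sketch below shells out to `systemctl` rather than spawning the service binary directly. The helper names `systemctl_command` and `signal_unit` are hypothetical and not taken from this project; only `std::process::Command` is assumed.

```rust
use std::process::Command;

/// Build (but do not run) a `systemctl <action> <unit>` invocation.
/// Hypothetical helper: the name is illustrative, not from this codebase.
fn systemctl_command(action: &str, unit: &str) -> Command {
    let mut cmd = Command::new("systemctl");
    cmd.arg(action).arg(unit);
    cmd
}

/// Run the invocation and report whether `systemctl` exited successfully.
/// systemd, not this program, remains responsible for supervising the unit.
fn signal_unit(action: &str, unit: &str) -> std::io::Result<bool> {
    let status = systemctl_command(action, unit).status()?;
    Ok(status.success())
}
```

For example, `signal_unit("start", "database.service")` would ask systemd to start the unit, and `signal_unit("stop", "database.service")` would ask it to stop.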
#### Pump
The pump runs in an explicit, stateless trampoline executor (in addition to the
tokio async executor) and appears in the code as the "pump" action. Each run
builds its state at the start and may loop a finite number of times, but it
_must_ return to the outer loop occasionally, both so that housekeeping can run
and to reset the stack depth in case tail call optimization is not applied
(reportedly unreliable in Rust currently). Stated differently, the pump action
is also a DAG with a single entry and single exit.
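
To make the trampoline idea concrete, here is a minimal sketch, not the project's actual executor. Each step returns a value describing what to do next instead of calling the next step itself, so the driver loop keeps the stack flat whether or not the compiler performs tail call optimization. `Step`, `pump_step`, and `run_stroke` are hypothetical names.

```rust
/// What one pump step asks the trampoline to do next (illustrative type).
enum Step {
    /// Run another step, with this many inner iterations still allowed.
    Again(u32),
    /// Hand control back to the outer loop.
    Yield,
}

/// One step of work. It never calls itself; it only describes the next step.
fn pump_step(remaining: u32) -> Step {
    if remaining == 0 {
        Step::Yield
    } else {
        Step::Again(remaining - 1)
    }
}

/// The trampoline: drive steps in a flat loop so stack depth stays constant.
/// Returns the number of steps executed before yielding.
fn run_stroke(budget: u32) -> u32 {
    let mut steps = 0;
    let mut next = Step::Again(budget);
    while let Step::Again(remaining) = next {
        next = pump_step(remaining);
        steps += 1;
    }
    steps
}
```

Because the budget is finite, `run_stroke` always returns to its caller, which is where the outer loop would do its housekeeping.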
A single pumping stroke happens from Start to End in the diagram.
Each stroke lasts _at least_ R. This is accomplished by awaiting a timer
running in parallel to pump at the end of the pump stroke (i.e., it is not
part of the pump function but is an important implementation detail of pump).
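
The "at least R" guarantee can be sketched as follows. This is a simplified, blocking stand-in: the notes describe awaiting a timer alongside pump under tokio, whereas this version uses only `std` and sleeps for whatever remains of R once pump returns. `stroke` is a hypothetical name.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Run `pump` once, then wait out whatever is left of `r`.
/// Returns the total time the stroke took, which is never less than `r`.
fn stroke<F: FnOnce()>(r: Duration, pump: F) -> Duration {
    let started = Instant::now();
    pump();
    // If pump finished early, sleep for the remainder of R.
    // If pump overran R, `checked_sub` yields None and we do not wait.
    if let Some(rest) = r.checked_sub(started.elapsed()) {
        thread::sleep(rest);
    }
    started.elapsed()
}
```

Keeping the wait outside `pump` matches the note above: the timer is an implementation detail of the stroke, not part of the pump function itself.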
### State Machine
```mermaid
flowchart TD
Start(((Start))) --> GetLock
GetLock[lk = Lock Value\nrev = Revision] --> Empty{"lk empty or mine?\n(or failed to retrieve\ntimeout = 1R)"}
Empty -- yes --> HCSunnyLoop
Empty -- no --> WaitKill
WaitKill[Kill Process] --> HCWait
HCWrite[Run Healthcheck] --> HCWriteSuccess{Success?}
HCWriteSuccess -- yes --> Write
HCWriteSuccess -- no --> WaitR
Write[Write Lock] --> WriteSuccess{Success?}
WriteSuccess -- yes --> WriteWaitCR
WriteSuccess -- no ---> WaitR
WriteWaitCR[["Wait R\n(C-1 times):\n->Write Lock\n->Wait R"]] --> WaitR
style WriteWaitCR text-align:left
HCWait[Run Healthcheck] --> HCWaitSuccess{Success?}
HCWaitSuccess -- yes --> WaitFR
HCWaitSuccess -- no --> WaitR
WaitR[Wait remainder of R, if any]
WaitFR[Wait F*R for takeover] --> HCWrite
HCSunnyLoop[Run Healthcheck] --> HCSunnyLoopSuccess{Success?}
HCSunnyLoopSuccess -- yes --> SunnyLoopWrite
SunnyLoopWrite[Renew lock\nMay attempt for\nup to F*R seconds] --> SunnyLoopWriteSuccess{Success?}
SunnyLoopWriteSuccess -- yes --> SunnyStartProcess
SunnyLoopWriteSuccess -- no --> SunnyLoopAbort
SunnyStartProcess[Unfence Self\nFence Others\nEnable Process] --> WaitR
HCSunnyLoopSuccess -- no --> SunnyLoopAbort
SunnyLoopAbort[Attempt to write blank to lock\nOne attempt only.] --> SunnyKill
SunnyKill[Kill Process] --> WaitR
WaitR --> End((End)) --> ToStart[Back to start]
```
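
The first decision and the "sunny" path of the diagram can be written as pure functions, which makes the branching easy to check in isolation. This is one reading of the flowchart, not the project's code: all type and function names are hypothetical, and the wait path (kill, health check, wait F*R, write lock) is deliberately left out.

```rust
/// Result of reading the lock at the start of a stroke (illustrative names).
#[derive(Debug, Clone, Copy, PartialEq)]
enum LockRead {
    Empty,
    Mine,
    HeldByOther,
    /// Retrieval failed or timed out; the diagram groups this with "yes".
    ReadFailed,
}

#[derive(Debug, PartialEq)]
enum Branch {
    Sunny,
    Wait,
}

#[derive(Debug, PartialEq)]
enum SunnyAction {
    /// Unfence self, fence others, enable the process.
    EnableProcess,
    /// Attempt once to blank the lock, then kill the process.
    ReleaseAndKill,
}

/// The "lk empty or mine?" decision node.
fn choose_branch(lock: LockRead) -> Branch {
    match lock {
        LockRead::Empty | LockRead::Mine | LockRead::ReadFailed => Branch::Sunny,
        LockRead::HeldByOther => Branch::Wait,
    }
}

/// The sunny path: renewal is only attempted after a successful health check.
fn sunny_action(healthy: bool, renew_lock: impl FnOnce() -> bool) -> SunnyAction {
    if healthy && renew_lock() {
        SunnyAction::EnableProcess
    } else {
        SunnyAction::ReleaseAndKill
    }
}
```

Passing the renewal as a closure mirrors the diagram: a failed health check goes straight to the abort node without touching the lock store.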