Add some of my notes to README.md
parent 44677c8a90
commit 684cc85dc6
63 README.md
@@ -112,4 +112,67 @@ will lose its lock. Since it _must_ lose its lock and _must_ recognize that
this has happened, the health check will always time out after T and the lock
will be given up immediately.

#### Not init

This solution is intended to signal services to start or stop, not to act as an
init system in its own right. Thus, you should be running something like
`systemctl start database.service`, not `redis-server`. Fencing should be
handled via the systemd unit DAG as well.

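As a rough illustration of "signal, don't supervise", the sketch below only builds a `systemctl` invocation and leaves supervision to systemd. The helper name `systemctl` and the unit name are placeholders for illustration, not names taken from this project's code.

```rust
use std::process::Command;

/// Build (but do not run) `systemctl <action> <unit>`.
/// Hypothetical helper for illustration; not taken from this repository.
fn systemctl(action: &str, unit: &str) -> Command {
    let mut cmd = Command::new("systemctl");
    cmd.arg(action).arg(unit);
    cmd
}
```

A caller would presumably run it with something like `systemctl("start", "database.service").status()` and treat a non-zero exit status as a failed start.
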
#### Pump

The pump runs in an explicit, stateless trampoline executor (in addition to the
tokio async executor); the action is named "pump" in the code. Each run builds
its state at the start and may loop a finite number of times, but it _must_
return to the outer loop occasionally so that the loop can do housekeeping and
so that the stack unwinds when tail call optimization does not happen
(reportedly unreliable in Rust currently). Stated differently, the pump action
is also a DAG with a single entry and a single exit.

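A minimal sketch of the trampoline idea, assuming nothing about the real executor: each step returns what to do next instead of calling the next step itself, so the stack stays flat and control returns to the outer loop after a bounded number of bounces. The names `Bounce`, `trampoline`, and `max_bounces` are invented for this example.

```rust
/// What one step asks the trampoline to do next (hypothetical type).
enum Bounce<S> {
    /// Run another step with this state.
    Again(S),
    /// The single exit was reached.
    Done(S),
}

/// Drive `step` until it reports `Done` or `max_bounces` steps have run.
/// At least one step always runs. Returns the last state and the number
/// of steps executed, so the outer loop regains control in finite time.
fn trampoline<S>(mut state: S, max_bounces: usize, step: impl Fn(S) -> Bounce<S>) -> (S, usize) {
    let mut bounces = 0;
    loop {
        bounces += 1;
        match step(state) {
            Bounce::Done(s) => return (s, bounces),
            // Still under the cap: bounce again without growing the stack.
            Bounce::Again(s) if bounces < max_bounces => state = s,
            // Cap reached: hand control back to the outer loop.
            Bounce::Again(s) => return (s, bounces),
        }
    }
}
```

Because every step returns to `trampoline` before the next one starts, the call depth does not depend on how many steps run, with or without tail call optimization.
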
A single pumping stroke happens from Start to End in the diagram.

Each stroke lasts _at least_ R. This is accomplished by awaiting, at the end of
the pump stroke, a timer that runs in parallel with the pump (i.e., the timer is
not part of the pump function, but it is an important implementation detail of
the pump).

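The "at least R" rule can be sketched with the standard library alone. This is a synchronous stand-in, not the project's async code: the deadline is fixed before the stroke starts, and whatever is left of R is slept off afterwards. The function name `timed_stroke` is made up for the example; in async code the same shape would presumably use a tokio timer created before the stroke and awaited after it.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Run one stroke and make sure at least `r` has elapsed before returning.
/// Hypothetical helper; the timer lives outside the pump function itself.
fn timed_stroke<T>(r: Duration, pump: impl FnOnce() -> T) -> T {
    // Fix the deadline before the stroke so the stroke's own runtime counts.
    let deadline = Instant::now() + r;
    let out = pump();
    // Wait for the remainder of R, if any; a slow stroke waits for nothing.
    let remaining = deadline.saturating_duration_since(Instant::now());
    if !remaining.is_zero() {
        thread::sleep(remaining);
    }
    out
}
```

A stroke that takes longer than R simply returns as soon as it finishes, so R is a lower bound on the stroke duration, not an upper one.
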
### State Machine

```mermaid
flowchart TD
    Start(((Start))) --> GetLock

    GetLock[lk = Lock Value\nrev = Revision] --> Empty{"lk empty or mine?\n(or failed to retrieve\ntimeout = 1R)"}
    Empty -- yes --> HCSunnyLoop
    Empty -- no --> WaitKill
    WaitKill[Kill Process] --> HCWait

    HCWrite[Run Healthcheck] --> HCWriteSuccess{Success?}
    HCWriteSuccess -- yes --> Write
    HCWriteSuccess -- no --> WaitR
    Write[Write Lock] --> WriteSuccess{Success?}
    WriteSuccess -- yes --> WriteWaitCR
    WriteSuccess -- no ---> WaitR
    WriteWaitCR[["Wait R\n(C-1 times):\n->Write Lock\n->Wait R"]] --> WaitR
    style WriteWaitCR text-align:left

    HCWait[Run Healthcheck] --> HCWaitSuccess{Success?}
    HCWaitSuccess -- yes --> WaitFR
    HCWaitSuccess -- no --> WaitR
    WaitR[Wait remainder of R, if any]

    WaitFR[Wait F*R for takeover] --> HCWrite

    HCSunnyLoop[Run Healthcheck] --> HCSunnyLoopSuccess{Success?}
    HCSunnyLoopSuccess -- yes --> SunnyLoopWrite
    SunnyLoopWrite[Renew lock\nMay attempt for\nup to F*R seconds] --> SunnyLoopWriteSuccess{Success?}
    SunnyLoopWriteSuccess -- yes --> SunnyStartProcess
    SunnyLoopWriteSuccess -- no --> SunnyLoopAbort
    SunnyStartProcess[Unfence Self\nFence Others\nEnable Process] --> WaitR
    HCSunnyLoopSuccess -- no --> SunnyLoopAbort
    SunnyLoopAbort[Attempt to write blank to lock\nOne attempt only.] --> SunnyKill
    SunnyKill[Kill Process] --> WaitR
    WaitR --> End((End)) --> ToStart[Back to start]

```
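
The branches of the flowchart can be restated as a pure function, which may be handy for testing the decision logic without a real lock store. This is a sketch under stated assumptions, not the project's code: `Action`, `Outcomes`, and `plan_stroke` are invented names, the "lk empty or mine?" node is collapsed into a single boolean, and the waits and retries (F*R, C-1 repetitions) are represented only as named actions.

```rust
/// Side effects a stroke may perform, named after the diagram nodes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Action {
    KillProcess,    // WaitKill / SunnyKill
    WaitTakeover,   // WaitFR: wait F*R for takeover
    WriteLock,      // Write
    RewriteAndWait, // WriteWaitCR: (C-1 times) write lock, wait R
    RenewLock,      // SunnyLoopWrite
    EnableProcess,  // SunnyStartProcess: unfence self, fence others, enable
    BlankLock,      // SunnyLoopAbort: one attempt to write blank to the lock
    WaitRemainder,  // WaitR: wait remainder of R, if any
}

/// Hypothetical, pre-computed outcomes of the checks made during one stroke.
struct Outcomes {
    lock_empty_or_mine: bool, // answer of the "lk empty or mine?" node
    healthcheck_ok: bool,     // HCSunnyLoop or HCWait, depending on the path
    recheck_ok: bool,         // HCWrite, only reached on the takeover path
    lock_write_ok: bool,      // SunnyLoopWrite or Write, depending on the path
}

/// List the actions of one stroke, in order, for the given outcomes.
fn plan_stroke(o: &Outcomes) -> Vec<Action> {
    use Action::*;
    let mut plan = Vec::new();
    if o.lock_empty_or_mine {
        // Sunny path: health check, then renew the lock.
        if o.healthcheck_ok {
            plan.push(RenewLock);
        }
        if o.healthcheck_ok && o.lock_write_ok {
            plan.push(EnableProcess);
        } else {
            // Failed health check or failed renewal: abort, then kill.
            plan.push(BlankLock);
            plan.push(KillProcess);
        }
    } else {
        // Someone else holds the lock: make sure our process is down.
        plan.push(KillProcess);
        if o.healthcheck_ok {
            plan.push(WaitTakeover);
            if o.recheck_ok {
                plan.push(WriteLock);
                if o.lock_write_ok {
                    plan.push(RewriteAndWait);
                }
            }
        }
    }
    // Every path leaves through WaitR (single exit).
    plan.push(WaitRemainder);
    plan
}
```

Every returned plan ends in `WaitRemainder`, which mirrors the single exit of the diagram.
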