Commit Graph

63 Commits

Author SHA1 Message Date
Hector Sanjuan
34fdc329fc Fix #24: Auto-join and auto-leave operations for Cluster
This is the third implementation attempt. This time, rather than
broadcasting PeerAdd/Join requests to the whole cluster, we use the
consensus log to broadcast new peers joining.

This makes it easier to recover from errors and to know exactly who
is a member of a cluster and who is not. The consensus is, after all,
meant to agree on things, and the list of cluster peers is something
everyone has to agree on.

Raft itself uses a special log operation to maintain the peer set.
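
A minimal sketch of the idea described above, not the code in this commit: if peer additions and removals travel through the consensus log, every node that replays the same committed entries ends up with the same peer set. The names `logOp`, `opAddPeer`, `opRmPeer` and `applyLog` are hypothetical and chosen only for illustration.

```go
package main

import "sort"

// opType distinguishes the two hypothetical membership operations.
type opType int

const (
	opAddPeer opType = iota // a peer joins the cluster
	opRmPeer                // a peer leaves the cluster
)

// logOp is an illustrative log entry carrying a membership change.
type logOp struct {
	Type opType
	Peer string
}

// applyLog replays committed entries in order and returns the
// resulting peer set, sorted so that every node reports it identically.
func applyLog(entries []logOp) []string {
	set := make(map[string]struct{})
	for _, e := range entries {
		switch e.Type {
		case opAddPeer:
			set[e.Peer] = struct{}{}
		case opRmPeer:
			delete(set, e.Peer)
		}
	}
	peers := make([]string, 0, len(set))
	for p := range set {
		peers = append(peers, p)
	}
	sort.Strings(peers)
	return peers
}
```

Because the result depends only on the ordered log, a node that was offline can catch up by replaying the entries it missed instead of asking each peer who is currently a member.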

The tests are almost unchanged from the previous attempts, so behavior
should be the same, except that it doesn't seem possible to bootstrap
several nodes at the same time using different bootstrap nodes. It works
when they all use the same bootstrap node. I'm not sure this worked
before either, but the code is simpler than recursively contacting
peers, and scales better for larger clusters.

Nodes have to be careful about joining clusters while keeping the state
from a different cluster (disjoint logs). This may cause problems with
Raft.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-02-07 18:46:09 +01:00
Hector Sanjuan
4e0407ff5c Add ascii diagram to ipfs-cluster-service help. Add explicit "run" command.
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-02-02 14:34:51 +01:00
Hector Sanjuan
6c18c02106 Issue #10: peers/add and peers/rm feature + tests
This commit adds PeerAdd() and PeerRemove() endpoints, CLI support and
tests. Peer management is a delicate issue because of how the consensus
works underneath and the number of places that need to track such peers.

When adding a peer the procedure is as follows:

* Try to open a connection to the new peer and abort if it is not reachable.
* Broadcast a PeerManagerAddPeer operation which tells all cluster members
  to add the new peer. The Raft leader will add it to Raft's peerset and
  the multiaddress will be saved in the ClusterPeers configuration key.
* If the above fails because some cluster node is not responding,
  broadcast a PeerRemove() and try to undo any damage.
* If the broadcast succeeds, send our ClusterPeers to the new peer along with
  the local multiaddress we are using in the connection opened in the
  first step (that is, the multiaddress through which the other peer can reach us).
* The new peer updates its configuration with the new list and joins
  the consensus.
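
The steps above can be sketched roughly as follows. This is an in-memory illustration under stated assumptions, not the implementation in this commit: `member`, `cluster`, `addPeer` and `errUnreachable` are hypothetical names, reachability is passed in as a flag instead of opening a real connection, and the returned slice stands in for the ClusterPeers list sent to the new peer.

```go
package main

import (
	"errors"
	"fmt"
)

// member is a hypothetical stand-in for an existing cluster member.
type member struct {
	addr  string          // multiaddress of this member
	up    bool            // whether it answers the broadcast
	peers map[string]bool // peers this member currently tracks
}

// cluster is a hypothetical view of all current members.
type cluster struct {
	members []*member
}

var errUnreachable = errors.New("new peer is not reachable")

// addPeer walks through the add procedure for newAddr. On success it
// returns the peer list that would be sent to the new peer.
func (c *cluster) addPeer(newAddr string, reachable bool) ([]string, error) {
	// Step 1: abort early if the new peer cannot be reached.
	if !reachable {
		return nil, errUnreachable
	}
	// Step 2: tell every member to add the new peer.
	for _, m := range c.members {
		if !m.up {
			// Step 3: a member did not respond, so undo the
			// additions already made on the other members.
			for _, n := range c.members {
				delete(n.peers, newAddr)
			}
			return nil, fmt.Errorf("member %s not responding, add of %s undone", m.addr, newAddr)
		}
		m.peers[newAddr] = true
	}
	// Step 4: collect the current peer list for the new peer, which
	// would then update its configuration and join the consensus.
	list := make([]string, 0, len(c.members))
	for _, m := range c.members {
		list = append(list, m.addr)
	}
	return list, nil
}
```

The rollback in step 3 is best effort here, just as the commit message says "try to undo any damage": a member that is down cannot be told to forget the new peer.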

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-02-02 13:51:49 +01:00
Hector Sanjuan
43dea68edb Update README, Captain log, fix logging.
Addresses some stuff in #19.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-01-27 13:30:15 +01:00
Hector Sanjuan
3f833a8c17 Use urfave/cli for ipfs-cluster-service too.
Adds consistency across the tools, plus it's worth the effort at this point.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-01-24 19:55:06 +01:00
Hector Sanjuan
81db084249 Make sure the commit string gets set. Fix PublicKey. Output JSON in cluster-ctl
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-01-24 16:56:14 +01:00
Hector Sanjuan
af177bfde6 Address formatting, misspellings, lint errors from goreportcard
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-01-24 12:39:08 +01:00
Hector Sanjuan
9efa2d063f Merge pull request #31 from ipfs/22-configs
Fix #22: Address feedback regarding configuration
2017-01-24 00:51:40 +01:00
Hector Sanjuan
7954556848 Improve error messages with fishy configurations
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-01-23 20:39:09 +01:00
Hector Sanjuan
74c494eb1e Fix configuration generation.
Add a custom RaftConfig section which is limited to whatever
values we want to leave to the user.

Make sure the consensus data is, by default, next to the service.json file.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-01-23 19:06:00 +01:00
Hector Sanjuan
d1731ebd28 Use multiaddresses in the configuration and rename JSON entries for clarity
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-01-23 18:38:59 +01:00
mateon1
51f87407c6 Fix typo and remove trailing whitespace.
License: MIT
Signed-off-by: Mateusz Naściszewski <matin1111@wp.pl>
2017-01-23 14:21:26 +01:00
Hector Sanjuan
365c549d7c Fix #5: Rename apps to ipfs-cluster-service and ipfs-cluster-ctl
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-01-23 13:34:22 +01:00