ipfs-cluster

Author	SHA1	Message	Date
Hector Sanjuan	d87c64b611	Unify the way of handling context and cancels in components. Cleaning up some ugly code License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2017-03-02 14:35:16 +01:00
Hector Sanjuan	2512ecb701	Issue #41 : Add Replication factor New PeerManager, Allocator, Informer components have been added along with a new "replication_factor" configuration option. First, cluster peers collect and push metrics (Informer) to the Cluster leader regularly. The Informer is an interface that can be implemented in custom ways to support custom metrics. Second, on a pin operation, using the information from the collected metrics, an Allocator can provide a list of preferences as to where the new pin should be assigned. The Allocator is an interface allowing to provide different allocation strategies. Both Allocator and Informer are Cluster Components, and have access to the RPC API. The allocations are kept in the shared state. Cluster peer failure detection is still missing and re-allocation is still missing, although re-pinning something when a node is down/metrics missing does re-allocate the pin somewhere else. License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2017-02-14 19:13:08 +01:00
Hector Sanjuan	1b3d04e18b	Move all API-related types to the /api subpackage. At the beginning we opted for native types which were serializable (PinInfo had a CidStr field instead of Cid). Now we provide types in two versions: native and serializable. Go methods use native. The rest of the APIs (REST/RPC) always use serializable versions. Methods are provided to convert between the two. The reason for moving these out of the way is to be able to re-use type definitions when parsing API responses in `ipfs-cluster-ctl` or any other clients that come up. API responses are just the serializable version of types in JSON encoding. This also reduces having duplicate type defs and parsing methods everywhere. License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2017-02-09 16:30:53 +01:00
Hector Sanjuan	34fdc329fc	Fix #24 : Auto-join and auto-leave operations for Cluster This is the third implementation attempt. This time, rather than broadcasting PeerAdd/Join requests to the whole cluster, we use the consensus log to broadcast new peers joining. This makes it easier to recover from errors and to know who exactly is a member of a cluster and who is not. The consensus is, after all, meant to agree on things, and the list of cluster peers is something everyone has to agree on. Raft itself uses a special log operation to maintain the peer set. The tests are almost unchanged from the previous attempts so it should be the same, except it doesn't seem possible to bootstrap a bunch of nodes at the same time using different bootstrap nodes. It works when using the same. I'm not sure this worked before either, but the code is simpler than recursively contacting peers, and scales better for larger clusters. Nodes have to be careful about joining clusters while keeping the state from a different cluster (disjoint logs). This may cause problems with Raft. License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2017-02-07 18:46:09 +01:00
Hector Sanjuan	89ecc1ce89	Encapsulate Raft functions better and simplify the Consensus component License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2017-02-02 13:51:49 +01:00
Hector Sanjuan	6c18c02106	Issue #10 : peers/add and peers/rm feature + tests This commit adds PeerAdd() and PeerRemove() endpoints, CLI support, tests. Peer management is a delicate issue because of how the consensus works underneath and the places that need to track such peers. When adding a peer the procedure is as follows: * Try to open a connection to the new peer and abort if not reachable * Broadcast a PeerManagerAddPeer operation which tells all cluster members to add the new Peer. The Raft leader will add it to Raft's peerset and the multiaddress will be saved in the ClusterPeers configuration key. * If the above fails because some cluster node is not responding, broadcast a PeerRemove() and try to undo any damage. * If the broadcast succeeds, send our ClusterPeers to the new Peer along with the local multiaddress we are using in the connection opened in the first step (that is the multiaddress through which the other peer can reach us) * The new peer updates its configuration with the new list and joins the consensus License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2017-02-02 13:51:49 +01:00
Hector Sanjuan	4c1e0068f5	Fix #15 : Peers() provides lots of information now I have renamed "members" to "peers". Added IPFS daemon ID and addresses to the ID object and have Peers() return the collection of ID() objects from the cluster. License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2017-01-26 20:24:00 +01:00
Hector Sanjuan	9a47e6dd1f	Update go-libp2p-gorpc Uses experimental version of multicodecs but should finally pin all deps License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2017-01-25 12:50:46 +01:00
Hector Sanjuan	af177bfde6	Address formatting, misspellings, lint errors from goreportcard License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2017-01-24 12:39:08 +01:00
Hector Sanjuan	9af863e3e0	Merge pull request #29 from ipfs/25-leader-comm Fix #25: Only the consensus layer should deal with leaders	2017-01-24 01:13:42 +01:00
Hector Sanjuan	b3039b85d5	global status and global sync should not error when a node is down License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2017-01-24 00:54:26 +01:00
Hector Sanjuan	afa8a5c33f	Improve startup messages and information License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2017-01-23 23:58:04 +01:00
Hector Sanjuan	031523f7bf	Fix #25 : Only the consensus layer should deal with leaders License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2017-01-23 14:01:49 +01:00
Hector Sanjuan	1308a3a32f	Increase leader timeout License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2017-01-23 12:27:24 +01:00
Hector Sanjuan	c2d9326715	Improve consensus startup by waiting for a leader. This should also fix some tests error-ing randomly when there is no leader. License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2017-01-23 12:09:29 +01:00
Hector Sanjuan	3243cfcccf	Make golint happy License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2016-12-28 16:29:07 +01:00
Hector Sanjuan	805b867651	Use go-libp2p-rpc. Tests updated. The former RPC stuff had become a monster, really hard to have an overview of the RPC api capabilities and with lots of magic. go-libp2p-rpc allows to have a clearly defined RPC api which shows which methods every component can use. A component to perform remote requests, and the convoluted LeaderRPC, BroadcastRPC methods are no longer necessary. Things are much simpler now, fewer goroutines are needed, the central channel handling bottleneck is gone, RPC requests are very streamlined in form. In the future, it would be immediate to have components living on different libp2p hosts and it is way clearer how to plug into the advanced cluster rpc api. License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2016-12-27 18:19:54 +01:00
Hector Sanjuan	34720465cd	ipfscluster-server executable License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2016-12-21 19:37:25 +01:00
Hector Sanjuan	5c41d69abc	Figured out globalSync and globalSync cid. Tests. More tests. License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2016-12-20 19:51:13 +01:00
Hector Sanjuan	8172b0ca61	Global pin status. /status /status/cid will now report pin tracker state from all cluster members. License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2016-12-19 18:35:24 +01:00
Hector Sanjuan	0422ceed16	Preliminary support for Remote RPC operations License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2016-12-16 12:40:28 +01:00
Hector Sanjuan	319c97585b	Renames everywhere removing redundant "Cluster" from "ClusterSomething". Start preparing syncs() and status() License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2016-12-15 19:08:46 +01:00
Hector Sanjuan	45c31846d1	Fix startup sync mechanism License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2016-12-15 15:08:43 +01:00
Hector Sanjuan	09cc7e9265	Lowercase error messages License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2016-12-15 14:19:41 +01:00
Hector Sanjuan	a655288fd6	Improve shutdown routines License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2016-12-15 14:07:19 +01:00
Hector Sanjuan	0f31995bd6	consensus tests License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2016-12-14 15:31:50 +01:00
Hector Sanjuan	98dc9f2289	Fix syncAll being triggered before cluster run(). Other small things. License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2016-12-12 13:54:58 +01:00
Hector Sanjuan	34b2b6cbd1	Sync between tracker and cluster state. go vet. tests. License: MIT Signed-off-by: Hector Sanjuan <hector@protocol.ai>	2016-12-09 20:54:46 +01:00
Hector Sanjuan	a9dcf57a90	Add MapPinTracker sync and recover capabilities. Sync checks that the CID status corresponds to what IPFS says. Recover retries pinning, unpinning on Error-ed cids.	2016-12-07 17:21:29 +01:00
Hector Sanjuan	9c1c256e33	Introduce the concept of PinTracker. Thin ClusterState to minimal. + Try to make RPC handling code cleaner.	2016-12-06 22:29:59 +01:00
Hector Sanjuan	2285f8d1a8	Make ipfs pinning async. Add Pin intermediary states	2016-12-05 15:30:11 +01:00
Hector Sanjuan	e0840df267	WIP: basic functionality	2016-12-02 19:33:39 +01:00

32 Commits