Commit Graph

1532 Commits

Author SHA1 Message Date
Hector Sanjuan
3c3341e491 Monitor: add PublishMetric() to component interface
The monitor component should be in charge of deciding how it is
best to send metrics to other peers and what that means.

This adds the PublishMetric() method to the component interface
and moves that functionality from Cluster main component to the
basic monitor.

There is a behaviour change. Before, the metrics were sent only to
the leader, while the leader was the only peer to broadcast them everywhere.
Now, all peers broadcast all metrics everywhere. This is mostly
because we should not rely on the consensus layer providing a Leader(), so
we are taking the chance to remove this dependency.

Note that in any case, pubsub monitoring should replace the
existing basic monitor. This is just paving the ground.

Additionally, in order to not duplicate the multiRPC code
in the monitor, I have moved that functionality to go-libp2p-gorpc
and added an rpcutil library to cluster which includes useful
methods to perform multiRPC requests (some of them existed in
util.go, others are new and help handling multiple contexts etc).

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-05-07 14:26:06 +02:00
Hector Sanjuan
1886782530 Feat pubsubmon: Extract MetricsWindow to utils module
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-05-07 14:24:49 +02:00
Hector Sanjuan
029cd77c27
Merge pull request #398 from ipfs/feat/promote-consensus
Emancipate the consensus component
2018-05-07 08:29:14 +02:00
Hector Sanjuan
33d9cdd3c4 Feat: emancipate Consensus from the Cluster component
This commit promotes the Consensus component (and Raft) to become a fully
independent thing like other components, passed to NewCluster during
initialization. Cluster (main component) no longer creates the consensus
layer internally. This has triggered a number of breaking changes
that I will explain below.

Motivation: Future work will require the possibility of running Cluster
with a consensus layer that is not Raft. The "consensus" layer is in charge
of maintaining two things:
  * The current cluster peerset, as required by the implementation
  * The current cluster pinset (shared state)

While the pinset maintenance has always been in the consensus layer, the
peerset maintenance was handled by the main component (starting by the "peers"
key in the configuration) AND the Raft component (internally)
and this generated lots of confusion: if the user edited the peers in the
configuration they would be greeted with an error.

The bootstrap process (adding a peer to an existing cluster) and configuration
key also complicated many things, since the main component did it, but only
when the consensus was initialized and in single peer mode.

In all this we also mixed the peerstore (list of peer addresses in the libp2p
host) with the peerset, when they need not be linked.

By initializing the consensus layer before calling NewCluster, all the
difficulties in maintaining the current implementation in the same way
have come to light. Thus, the following changes have been introduced:

* Remove "peers" and "bootstrap" keys from the configuration: we no longer
edit or save the configuration files. This was a very bad practice, requiring
write permissions by the process to the file containing the private key and
additionally made things like Puppet deployments of cluster difficult as
configuration would mutate from its initial version. Not to mention all the
maintenance associated with making sure peers and bootstrap had correct values
when peers are bootstrapped or removed. A loud and detailed error message has
been added when starting cluster with an old config, along with instructions on
how to move forward.

* Introduce a PeerstoreFile ("peerstore") which stores peer addresses: in
ipfs, the peerstore is not persisted because it can be re-built from the
network bootstrappers and the DHT. Cluster should probably also allow
discoverability of peers addresses (when not bootstrapping, as in that case
we have it), but in the meantime, we will read and persist the peerstore
addresses for cluster peers in this file, different from the configuration.
Note that dns multiaddresses are now fully supported and no IPs are saved
when we have DNS multiaddresses for a peer.

* The former "peer_manager" code is now a pstoremgr module, providing utilities
to parse, add, list and generally maintain the libp2p host peerstore, including
operations on the PeerstoreFile. This "pstoremgr" can now also be extended to
perform address autodiscovery and other things independently from Cluster.

* Create and initialize Raft outside of the main Cluster component: since we
can now launch Raft independently from Cluster, we have more degrees of
freedom. A new "staging" option when creating the object allows a raft peer to
be launched in Staging mode, waiting to be added to a running consensus, and
thus, not electing itself as leader or doing anything like we were doing
before. This additionally allows us to track when the peer has become a
Voter, which only happens when it's caught up with the state, something that
was wonky previously.

* The raft configuration now includes an InitPeerset key, which allows
providing a peerset for new peers and which is ignored when staging==true. The
whole Raft initialization code is way cleaner and stronger now.

* Cluster peer bootstrapping is now an ipfs-cluster-service feature. The
--bootstrap flag works as before (additionally allowing comma-separated-list
of entries). What bootstrap does, is to initialize Raft with staging == true,
and then call Join in the main cluster component. Only when the Raft peer
transitions to Voter does consensus become ready and cluster become Ready.
This is cleaner, works better and is less complex than before (supporting
both flags and config values). We also back up and clean the state whenever
we are bootstrapping, automatically.

* ipfs-cluster-service no longer runs the daemon. Starting cluster now
requires "ipfs-cluster-service daemon". The daemon specific flags (bootstrap,
alloc) are now flags for the daemon subcommand. Here we mimic ipfs ("ipfs"
does not start the daemon but prints help) and pave the path for merging both
service and ctl in the future.

While this brings some breaking changes, it significantly reduces the
complexity of the configuration, the code and most importantly, the
documentation. It should be easier now to explain to the user the
right way to launch a cluster peer, and more difficult to make mistakes.

As a side effect, the PR also:

* Fixes #381 - peers with dynamic addresses
* Fixes #371 - peers should be Raft configuration option
* Fixes #378 - waitForUpdates may return before state fully synced
* Fixes #235 - config option shadowing (no cfg saves, no need to shadow)

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-05-07 07:39:41 +02:00
Hector Sanjuan
bfcf700fa8
Merge pull request #383 from ipfs/feat/pintracker-revamp
Feat: pintracker revamp
2018-05-07 07:38:15 +02:00
Hector Sanjuan
f748cbcb03
Merge pull request #403 from LEonGAo1991/fix/unit-test-for-https-endpoint
Add unit test for HTTPS endpoint
2018-05-05 19:42:47 +02:00
LeonGGGG
50dd729a52 Fix #191: Add go HTTPs tests
Added https server and client in restapi_test.go, with a sample unit test in TestRestAPIIDEndpoint

License: MIT
Signed-off-by: Liang Gao lianggao91@hotmail.com
2018-05-04 18:31:29 -07:00
Adrian Lanzafame
19257ad8fd
finish maptracker tests
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2018-05-04 21:18:30 +10:00
Adrian Lanzafame
401eb408c4
rename operationCtx functions
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2018-05-04 21:18:12 +10:00
Hector Sanjuan
859cf75a01 Pintracker: improve tests
Avoid writing tests which will hang indefinitely on failure conditions.
Introduce TODOs.
Rename some vars to more explicit names.

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-05-02 16:13:18 +02:00
Hector Sanjuan
877e65a53d Pintracker: always cancel operation contexts
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-05-02 15:24:26 +02:00
Hector Sanjuan
e186dbe2c2 Undo extra delays
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-05-02 15:24:26 +02:00
Hector Sanjuan
8b08dfeed8 Pintracker: rename and fmting.
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-05-02 15:24:26 +02:00
Hector Sanjuan
9856bcdb94 Pintracker: remove timeouts
Pinning/unpinning timeouts are controlled by the ipfs connector component.

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-05-02 15:24:26 +02:00
Hector Sanjuan
5709e5d03c pintracker: do not register operation after putting it in channel
This creates a race condition where the items may have been
already pinned before the operation is registered in the tracker.

This may result in operations being left in the tracker and potentially
never completed.
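
The ordering issue described above can be sketched as follows. This is a minimal illustration under assumptions: the `operation` and `tracker` types and the `enqueue` function are hypothetical and do not reproduce the actual pintracker code. The point is only that the operation is registered before it is sent on the channel, so the worker can never finish it before it is known to the tracker.

```go
package main

import (
	"fmt"
	"sync"
)

// operation is a hypothetical stand-in for a pin/unpin request.
type operation struct {
	cid string
}

// tracker is a hypothetical registry of in-flight operations.
type tracker struct {
	mu  sync.Mutex
	ops map[string]*operation
}

func newTracker() *tracker {
	return &tracker{ops: make(map[string]*operation)}
}

func (t *tracker) register(op *operation) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.ops[op.cid] = op
}

func (t *tracker) finish(cid string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.ops, cid)
}

func (t *tracker) pending() int {
	t.mu.Lock()
	defer t.mu.Unlock()
	return len(t.ops)
}

// enqueue registers the operation first and only then hands it to the
// worker. Doing it the other way round would let the worker call finish
// before register, leaving a stale entry behind.
func enqueue(t *tracker, ch chan<- *operation, op *operation) {
	t.register(op)
	ch <- op
}

// runOnce pushes the given cids through a single worker and returns how
// many operations are still registered afterwards.
func runOnce(cids []string) int {
	t := newTracker()
	ch := make(chan *operation)
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for op := range ch {
			t.finish(op.cid) // stand-in for the actual pin work
		}
	}()
	for _, c := range cids {
		enqueue(t, ch, &operation{cid: c})
	}
	close(ch)
	wg.Wait()
	return t.pending()
}

func main() {
	fmt.Println(runOnce([]string{"a", "b", "c"}))
}
```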

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-05-02 15:24:26 +02:00
Adrian Lanzafame
f68c7f5354 ipfshttp: add pin/unpin specific timeouts
and get the tests passing and add Pin/UnpinQueued
tracker statuses back in.

License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2018-05-02 15:24:26 +02:00
Adrian Lanzafame
1eade86209 pintracker: add filtering of operationCtxs as they
come off the pin/unpin channels.

Also fix a race condition in the operationTracker.

License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2018-05-02 15:24:26 +02:00
Adrian Lanzafame
9e20e4e3b2 ipfsconn/ipfshttp: Pass ctx through from rpc_api
to the ipfscluster.IPFSConnector interface and then
to the implementation of that interface in ipfsconn/ipfshttp.
This allows calls from MapPinTracker to cancel requests made
to the local IPFS node.

License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2018-05-02 15:24:26 +02:00
Adrian Lanzafame
ab2a883a3d pintracker/mappintracker: separate status and operation concepts
The TrackerStatuses were starting to be used to convey the inflight
status of an 'operation', instead of just the status of the Pin.
I have separated out any thing related to 'operations' and
an operation's 'phases'.

License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2018-05-02 15:24:26 +02:00
Adrian Lanzafame
a2f59b26af ipfshttp/config: add ClientPostTimeout value
ipfshttp: cancel POST request when timeout reached

ipfshttp/config: fix config test

ipfshttp: use struct styling for multi-line func calls

ipfshttp/config: add general ClientTimeout

License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2018-05-02 15:24:26 +02:00
Adrian Lanzafame
5316c3bb4c typos and style nitpicks
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2018-05-02 15:24:26 +02:00
Hector Sanjuan
a0a0898719
Merge pull request #393 from ipfs/docs/move-to-website
Docs/move to website
2018-05-01 11:14:17 +02:00
Hector Sanjuan
b01ac8f063
Merge pull request #392 from ipfs/fix/ipfsconn/api-compat
ipfsconn/ipfshttp: handle cid args passed in url path correctly
2018-05-01 10:36:58 +02:00
Adrian Lanzafame
b50a05f898
ipfsconn/ipfshttp: go1.9 ServeMux doesn't redirect
query args correctly, requiring both a trailing slash and
non-trailing slash handle pattern to be defined for the
pin and unpin handlers.

License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2018-04-30 15:21:03 +10:00
Adrian Lanzafame
22ec210c25
ipfsconn/ipfshttp: handle cid args passed in url path correctly
The extractCid function was added to enable the extraction of
a cid argument from either the url path or query string.
This puts the proxy behaviour on par with the current IPFS API.
The function does rely on the fact that ipfs-cluster doesn't
intercept any command that has more than one subcommand.
If that changes, this function will have to be updated.
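
The idea of reading a cid from either the query string or the url path can be sketched as below. This is not the extractCid function from the commit: `cidFromURL` is a hypothetical name, and the sketch assumes an `/api/v0/<command>/<subcommand>/<cid>` path layout and gives the `arg` query parameter precedence, neither of which is confirmed by the commit message.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// cidFromURL returns the cid given either as an "arg" query parameter or
// as the path segment after /api/v0/<command>/<subcommand>/.
// Assumption: every handled command has exactly one subcommand, so the
// cid is always the fifth path segment.
func cidFromURL(u *url.URL) string {
	if arg := u.Query().Get("arg"); arg != "" {
		return arg
	}
	segs := strings.Split(strings.Trim(u.Path, "/"), "/")
	// Expected segments: api, v0, command, subcommand, cid.
	if len(segs) > 4 {
		return segs[4]
	}
	return ""
}

// mustParse is a small helper for the examples below.
func mustParse(raw string) *url.URL {
	u, err := url.Parse(raw)
	if err != nil {
		panic(err)
	}
	return u
}

func main() {
	fmt.Println(cidFromURL(mustParse("http://127.0.0.1:9095/api/v0/pin/add/QmPath")))
	fmt.Println(cidFromURL(mustParse("http://127.0.0.1:9095/api/v0/pin/add?arg=QmQuery")))
}
```

As the commit message notes for the real function, a fixed-depth approach like this would need updating if a command with more than one subcommand were ever intercepted.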

License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2018-04-30 15:21:03 +10:00
Hector Sanjuan
dbcc5c2fde
Merge pull request #396 from ipfs/fix/empty-config-option
Fix: do not generate "listen_multiaddress" deprecated option in config
2018-04-27 17:17:58 +02:00
Hector Sanjuan
921c17ca47
Merge pull request #397 from ipfs/feat/peer-add-hide
ipfs-cluster-ctl: do not provide "peers add"
2018-04-27 17:17:46 +02:00
Hector Sanjuan
1d20f3de36 ipfs-cluster-ctl: do not provide "peers add"
This uses the PeerAdd endpoint which should NOT be used as the current
workflow states that the way to add peers is bootstrapping.

Adding peers manually with this endpoint leads to split-head states very
easily. The fact that this operation is visible in ipfs-cluster-ctl
is only leading the users to bad places.

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-04-27 10:27:52 +02:00
Hector Sanjuan
cd32daf4d7 Fix: do not generate "listen_multiaddress" deprecated option in config
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-04-27 10:24:39 +02:00
Hector Sanjuan
b08b3aba64 Docs: Move to website.
Updated READMEs, removed docs and point everything to website documentation.

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-04-27 09:05:11 +02:00
Hector Sanjuan
01bf4c16be
Merge pull request #394 from ipfs/docker/run-daemon-default
Docker: Run with daemon --upgrade by default.
2018-04-27 08:19:25 +02:00
Hector Sanjuan
2fa41e72f9 Docker: Run with daemon --upgrade by default.
Plus add some warnings for users running the container randomly.

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-04-27 07:57:08 +02:00
Hector Sanjuan
695177a080
Merge pull request #395 from ipfs/fix/upgrade-empty-state
Fix: do not fail when running daemon --upgrade and no state exists
2018-04-27 07:54:33 +02:00
Hector Sanjuan
9e29e646ed Fix: do not fail when running daemon --upgrade and no state exists
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-04-26 20:19:47 +02:00
Hector Sanjuan
35427801da
Merge pull request #391 from ipfs/doc/guide/basic-auth-creds
doc: fix basic_auth_credentials format in config
2018-04-24 14:47:32 +02:00
Hector Sanjuan
7a826f8240
Merge pull request #390 from ipfs/log/cluster/version-difference-err
cluster: add version diff log to start errors
2018-04-24 14:46:12 +02:00
Sina Mahmoodi
2b02eac5c3 doc: fix basic_auth_credentials format in config
`basic_auth_credentials` in `api` part of the config accepts a map, with username as key and password as value.
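
The map format can be illustrated by decoding a small JSON fragment in Go. This is a sketch under assumptions: `apiAuthConfig` and `parseCredentials` are hypothetical and only cover this one key, the surrounding `api` section is omitted, and the usernames and passwords are made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// apiAuthConfig models only the basic_auth_credentials key.
// Hypothetical type for illustration; not the project's config struct.
type apiAuthConfig struct {
	BasicAuthCredentials map[string]string `json:"basic_auth_credentials"`
}

// parseCredentials decodes a JSON fragment into a username -> password map.
func parseCredentials(raw []byte) (map[string]string, error) {
	var cfg apiAuthConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, err
	}
	return cfg.BasicAuthCredentials, nil
}

func main() {
	// Usernames are keys, passwords are values.
	raw := []byte(`{"basic_auth_credentials": {"alice": "s3cret", "bob": "hunter2"}}`)
	creds, err := parseCredentials(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(creds), creds["alice"])
}
```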

License: MIT
Signed-off-by: Sina Mahmoodi <itz.s1na@gmail.com>
2018-04-24 14:44:25 +02:00
Sina Mahmoodi
d8f7a2adcc cluster: add version diff log to start errors
License: MIT
Signed-off-by: Sina Mahmoodi <itz.s1na@gmail.com>
2018-04-24 14:27:15 +02:00
Hector Sanjuan
a5710dd055
Merge pull request #387 from ipfs/fix/config/disable-repinnings
Add disable_repinning cluster option
2018-04-24 09:25:40 +02:00
Sina Mahmoodi
03cc809708 config: Add log and testcase for disable_repinning
* Test case creates a bunch of clusters, assigns a pin with replica factor
of n-1 to them, and removes one of the peers randomly. It then tests
to check that the number of clusters pinning the cid is n-2.
* Add warn log to let user know that due to disable_repinning option,
the cluster won't attempt to re-assign the pin.

License: MIT
Signed-off-by: Sina Mahmoodi <itz.s1na@gmail.com>
2018-04-23 22:01:52 +02:00
Sina Mahmoodi
0954c6d6fa Add disable_repinning cluster option
License: MIT
Signed-off-by: Sina Mahmoodi <itz.s1na@gmail.com>
2018-04-22 18:40:46 +02:00
Hector Sanjuan
da0915a098
Merge pull request #350 from ipfs/feat/339-faster-tests
Feat #339: faster tests
2018-04-05 18:16:41 +02:00
Hector Sanjuan
0069c0062f Fix metric expire type. Do not discard metrics in Allocate().
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-04-05 17:57:24 +02:00
Hector Sanjuan
f5f56f2d11 Add some clarifications about delays
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-04-05 16:49:26 +02:00
Hector Sanjuan
c73e540b7b Pre-create and pre-connect hosts in tests
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-04-05 16:49:26 +02:00
Hector Sanjuan
dd4128affc Fix #339: Reduce Sleeps in tests
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-04-05 16:49:26 +02:00
Hector Sanjuan
58acf16efa cluster: introduce PeerWatchInterval config option.
It should provide a way to speed up peer list updates when
peers join/part. It was hardcoded.

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-04-05 16:49:26 +02:00
Hector Sanjuan
95ae1746b7
Merge pull request #370 from ipfs/fix/svc/init/lock
cmd/svc/lock: check config dir and file existence
2018-04-05 09:23:50 +02:00
Hector Sanjuan
c651aed5f1
Merge pull request #358 from ipfs/feat/cmd/svc/force-quit
force quit ipfs-cluster-service on second ctrl-c
2018-04-04 12:22:19 +02:00
Adrian Lanzafame
271c743b51 cmd/svc/lock: check config dir and file existence
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2018-04-03 15:29:27 +10:00