508 Commits

Author SHA1 Message Date
Hector Sanjuan
d6a7caf7a4 Issue #259: Address CR comments
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2017-12-04 13:59:48 +01:00
Hector Sanjuan
4922c95589 Support --local parameter for Status[Local] and Sync[Local] operations
This allows calling the Rest API's status and sync endpoints with a
"?local=true" parameter. This triggers the operation on the local peer
only. Cluster *Local and RPC-*Local methods have been added accordingly,
although they are aliases for the PinTracker methods (but otherwise they
would not be exposed in external APIs). ipfs-cluster-ctl has been updated to
support the new flag.

The rationale behind this feature is that sometimes a single cluster peer
(or the ipfs daemon in it) is misbehaving. The user then wants to Sync,
Recover, or see Status for that single peer. This is especially relevant
when working with big pinsets in larger clusters, as a Status() call will
be considerably more expensive when broadcast everywhere.

Note that the Rest API keeps returning GlobalPinInfo objects even on local=true
calls. This ensures that the user always gets the same datatype from an endpoint.

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2017-12-01 12:56:26 +01:00
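The "?local=true" parameter described in 4922c95589 can be illustrated with a small client-side sketch. Only the query parameter comes from the commit message; the helper name buildStatusURL, the base address and the "/pins" path are assumptions chosen for illustration and may not match the real API.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildStatusURL returns the URL for a status request against a cluster
// peer's Rest API. The "/pins" path is an assumption made for this sketch;
// only the "local=true" query parameter is taken from the commit message.
func buildStatusURL(base string, local bool) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/pins"
	if local {
		// Ask the peer to run the operation locally instead of
		// broadcasting it to every cluster peer.
		q := u.Query()
		q.Set("local", "true")
		u.RawQuery = q.Encode()
	}
	return u.String(), nil
}

func main() {
	for _, local := range []bool{false, true} {
		s, err := buildStatusURL("http://127.0.0.1:9094", local)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(s)
	}
}
```

As the commit notes, the response type is the same in both cases, so a caller built this way would not need separate decoding paths for local and cluster-wide requests.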
Hector Sanjuan
e824aea55e RecoverAll: Implement RecoverAllLocal() which recovers all pins in a peer
This adds API, RPC calls to support RecoverAllLocal() (and expose RecoverLocal()
on the Rest API too). cluster-ctl is updated accordingly.

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2017-11-30 01:53:31 +01:00
Hector Sanjuan
11a8926236 MapPinTracker: support configuration section
This also generates a default configuration section when it
doesn't exist, so it's backwards compatible.

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2017-11-29 14:42:50 +01:00
Hector Sanjuan
39fb193eaf Peerstore: support dns multiaddresses
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2017-11-29 10:34:03 +01:00
Hector Sanjuan
d6800045b5 Dockerfile: Remove ipfs from container
The main container will now run only ipfs-cluster-service.

A new ipfs-cluster-bundle container is built by Dockerfile-bundle
which will provide ipfs-cluster+ipfs.

Fixes #197

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2017-11-29 10:34:03 +01:00
Hector Sanjuan
025fc95279 Dockerfile-test: use go-ipfs:master
Allows us to test against the latest ipfs (either it's broken and this
helps us find out, or it's not broken anymore and we don't lose time
trying to figure out why).

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2017-11-29 10:34:03 +01:00
Hector Sanjuan
cb5012c53b
Merge pull request #220 from ipfs/feat/backups-upgrade-path
Feat/backups upgrade path
2017-11-29 09:55:50 +01:00
Wyatt
47b744f1c0 ipfs-cluster-service state upgrade cli command
ipfs-cluster-service now has a migration subcommand that upgrades
persistent state snapshots with an out-of-date format version to the
newest version of raft state. If all cluster members shut down with
consistent state, upgrade ipfs-cluster, and run the state upgrade command,
the new version of cluster will be compatible with persistent storage.
ipfs-cluster now validates its persistent state upon loading it and exits
with a clear error in the case the state format version is not up to date.

Raft snapshotting is enforced on all shutdowns and the json backup is no
longer run. This commit makes use of recent changes to libp2p-raft
allowing raft states to implement their own marshaling strategies. Now
mapstate handles the logic for its (de)serialization. In the interest of
supporting various potential upgrade formats the state serialization
begins with a varint (right now one byte) describing the version.

Some go tests are modified and a go test is added to cover new ipfs-cluster
raft snapshot reading functions. Sharness tests are added to cover the
state upgrade command.
2017-11-28 22:35:48 -05:00
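The version-prefixed serialization described in 47b744f1c0 can be sketched with Go's encoding/binary package. This is a minimal illustration, not the mapstate code: marshalState, unmarshalState, currentVersion and the JSON payload are hypothetical, and the choice of an unsigned varint is an assumption (the commit only says "varint").

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// currentVersion is a hypothetical format version used only in this sketch.
const currentVersion = 1

// marshalState prefixes an already-serialized payload with its format
// version, encoded as an unsigned varint.
func marshalState(version uint64, payload []byte) []byte {
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, version)
	return append(buf[:n], payload...)
}

// unmarshalState reads the version prefix and returns it together with
// the remaining payload bytes.
func unmarshalState(data []byte) (uint64, []byte, error) {
	version, n := binary.Uvarint(data)
	if n <= 0 {
		return 0, nil, errors.New("cannot read state format version")
	}
	return version, data[n:], nil
}

func main() {
	payload := []byte(`{"pins":[]}`)
	data := marshalState(currentVersion, payload)

	version, rest, err := unmarshalState(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if version != currentVersion {
		// A loader would stop here and ask for a state upgrade.
		fmt.Println("state format version is out of date:", version)
		return
	}
	fmt.Println("version:", version, "payload intact:", bytes.Equal(rest, payload))
}
```

Small version numbers fit in a single byte with this encoding, which matches the "right now one byte" remark, while larger values grow the prefix without changing the reader.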
Hector Sanjuan
2bc7aec079
Merge pull request #242 from elopio/test-snap-in-travis
Move the snap deploy script to a separate file
2017-11-16 17:30:18 +01:00
Leo Arias
22c1d2401f Move back the docker call to travis 2017-11-16 14:36:42 +00:00
Leo Arias
980c812d20 Pass the multiarch environment variables 2017-11-16 07:05:19 +00:00
Leo Arias
58a884373b Move the snap deploy script to a separate file 2017-11-16 01:52:06 +00:00
Hector Sanjuan
c8dd79b235
Merge pull request #241 from elopio/snap-push-escape
Escape the variable in the snapcraft loop
2017-11-16 00:30:55 +01:00
Leo Arias
ce574a6849 Escape the variable in the snapcraft loop 2017-11-15 23:25:22 +00:00
Hector Sanjuan
ff698141f2
sign release commits too
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2017-11-15 23:54:31 +01:00
Hector Sanjuan
0c40d50497 gx publish 0.3.0
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2017-11-15 23:40:03 +01:00
Hector Sanjuan
16acaa67c0 Release 0.3.0
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2017-11-15 23:39:50 +01:00
Hector Sanjuan
3726ef8c9d Include a tag annotation, sign tags
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2017-11-15 23:39:22 +01:00
Leo Arias
81e1f3855e Enable the snap builds for i386, armhf and arm64 (#237)
Enable the snap builds for i386, armhf and arm64
2017-11-15 23:23:01 +01:00
Hector Sanjuan
1afde29fa3 Merge branch '0.3.0/changelog' 2017-11-15 23:21:18 +01:00
Hector Sanjuan
1d67cb3c66
Merge pull request #238 from ipfs/fix/start-panic
cluster: safeguard consensus not set when calling ID
2017-11-15 20:48:08 +01:00
Hector Sanjuan
081384fb7f cluster: Make peersFromMultiaddrs remove any duplicates.
Use it to find out the number of peers in the config and prevent
peerAdd test failures.

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2017-11-15 18:55:55 +01:00
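The de-duplication mentioned in 081384fb7f can be sketched as follows. This is not the real peersFromMultiaddrs: the sketch works on plain strings rather than multiaddr types, and it assumes the peer ID is whatever follows the last "/" in each address. The function name peersFromAddrs and the sample addresses are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// peersFromAddrs extracts the peer ID from each address and returns the
// unique IDs in order of first appearance. Treating the text after the
// last "/" as the peer ID is an assumption made for this sketch.
func peersFromAddrs(addrs []string) []string {
	seen := make(map[string]struct{}, len(addrs))
	peers := make([]string, 0, len(addrs))
	for _, a := range addrs {
		id := a[strings.LastIndex(a, "/")+1:]
		if id == "" {
			continue
		}
		if _, ok := seen[id]; ok {
			continue // same peer reachable through another address
		}
		seen[id] = struct{}{}
		peers = append(peers, id)
	}
	return peers
}

func main() {
	addrs := []string{
		"/ip4/192.168.1.2/tcp/9096/ipfs/QmPeerA",
		"/ip4/10.0.0.2/tcp/9096/ipfs/QmPeerA",
		"/ip4/192.168.1.3/tcp/9096/ipfs/QmPeerB",
	}
	peers := peersFromAddrs(addrs)
	fmt.Println(len(peers), peers)
}
```

Counting the returned slice rather than the raw address list is what keeps a peer with several addresses from being counted more than once.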
Hector Sanjuan
47116ab0ce
Merge pull request #239 from ipfs/elopio-patch-1
Get the snap version from git
2017-11-15 18:45:35 +01:00
Leo Arias
bfdfe5e584
Get the snap version from git 2017-11-15 11:22:22 -06:00
Hector Sanjuan
1f93662b3e cluster: get first peerset from configuration
make sure we save a new config if the new peerset
is different than the one in the configuration at
boot.

Hopefully this fixes a race condition in PeerAdd test

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2017-11-15 18:01:49 +01:00
Hector Sanjuan
a656e45375 cluster: safeguard consensus not set when calling ID
SwarmConnect on the ipfs connector calls rpc Peers() which
requests IDs for every peer member. If that peer member
is booting, it might get the request after RPC is setup
but before consensus is initialized, in which case a panic happens.
The probability that this happens is small, but still.

Also increase the connect swarms delay to 30 seconds, which
should be a bit longer than the default wait_for_leader timeout,
otherwise we might connect swarms while there's not even a leader.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-11-15 16:38:21 +01:00
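The kind of safeguard described in a656e45375 can be sketched with a nil check on the consensus component. The types below (Cluster, Consensus, ID) are simplified stand-ins invented for this example, not the project's real definitions, and the actual fix may look different.

```go
package main

import "fmt"

// Consensus is a stand-in for the consensus component; it only holds
// the current peerset in this sketch.
type Consensus struct {
	peers []string
}

// ID is a simplified version of the information returned to callers.
type ID struct {
	ID           string
	ClusterPeers []string
}

// Cluster holds a consensus pointer that stays nil until the component
// has been initialized during boot.
type Cluster struct {
	id        string
	consensus *Consensus
}

// ID can be called over RPC before boot has finished, so it must not
// dereference consensus unconditionally.
func (c *Cluster) ID() ID {
	var peers []string
	if c.consensus != nil {
		peers = c.consensus.peers
	}
	return ID{ID: c.id, ClusterPeers: peers}
}

func main() {
	booting := &Cluster{id: "QmPeerA"}
	fmt.Println(len(booting.ID().ClusterPeers)) // no panic while booting

	booting.consensus = &Consensus{peers: []string{"QmPeerA", "QmPeerB"}}
	fmt.Println(len(booting.ID().ClusterPeers))
}
```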
Hector Sanjuan
7230704dd2 Add one more bugfix
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-11-15 16:29:10 +01:00
Hector Sanjuan
baa6bc65c0 Update to captain's log
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-11-15 14:55:34 +01:00
Hector Sanjuan
133117e3e8 Changelog for v0.3.0
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-11-15 14:55:12 +01:00
Hector Sanjuan
a1f1ef15d8
Merge pull request #236 from ipfs/fix/minor-fixes
Fix/minor fixes
2017-11-15 12:09:30 +01:00
Hector Sanjuan
76fe62fae4 Fix error message when not enough candidates for pinning exist
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-11-15 03:17:08 +01:00
Hector Sanjuan
145dced3e8 Cluster: Fix libp2p host getting shutdown in the middle of peer removal
This is what was likely causing PeerRemove tests to fail randomly
but very often. We cancelled the Cluster context before shutting down
the Consensus component. This killed networking and aborted
the peer remove operations when the leader is removing itself.

As a result, it would error with "leadership lost", which would
trigger a retry which would set the final error to "context cancelled"
because the shutdown of the consensus component proceeds during the
retry, cancelling the consensus context.

This does not only affect tests; it might also affect operations when
running cluster.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-11-15 02:33:46 +01:00
Hector Sanjuan
41b6a114c1
Merge pull request #223 from ipfs/192-docs
Documentation: bring in line to 0.3.0
2017-11-14 23:58:33 +01:00
Hector Sanjuan
cc81ffe96b cluster: peerAdd: try to return an up-to-date new peer ID
Sometimes tests fail because the returned api.ID for a new peer
does not include the current cluster peers. This is because the
new peerset has not yet been committed in the new peer at the time
of the request.

This commit retries the request until the correct peerset
comes in, or gives up after two seconds of retrying.

Beyond the failing tests, note that the returned ID is very
user-facing and should contain the current cluster peers after adding,
and not the former peerset, at least while the peerAdd operation is allowed.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-11-14 23:54:23 +01:00
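The retry behaviour described in cc81ffe96b (keep asking until the expected peerset shows up, give up after two seconds) can be sketched as a polling loop with a deadline. The function waitForPeerset, its parameters and the polling interval are hypothetical; the commit only states the two-second limit.

```go
package main

import (
	"fmt"
	"time"
)

// waitForPeerset calls fetch until it reports at least want peers or the
// timeout expires. It returns the last peerset seen and whether the
// expected size was reached.
func waitForPeerset(fetch func() []string, want int, timeout, interval time.Duration) ([]string, bool) {
	deadline := time.Now().Add(timeout)
	for {
		peers := fetch()
		if len(peers) >= want {
			return peers, true
		}
		if time.Now().After(deadline) {
			return peers, false
		}
		time.Sleep(interval)
	}
}

func main() {
	calls := 0
	// Simulated new peer: it only reports the full peerset on the third call.
	fetch := func() []string {
		calls++
		if calls < 3 {
			return []string{"QmPeerA"}
		}
		return []string{"QmPeerA", "QmPeerB"}
	}

	peers, ok := waitForPeerset(fetch, 2, 2*time.Second, 100*time.Millisecond)
	fmt.Println(ok, len(peers), calls)
}
```

When the deadline passes, the sketch still returns the last peerset it saw, so a caller can answer with possibly stale data instead of failing outright.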
Hector Sanjuan
417f30c9ea Avoid shutting down consensus in the middle of a commit
I think this will prevent some random test failures
when we realize that we are no longer in the peerset
and trigger a shutdown but Raft has not finished fully
committing the operation, which then triggers an error
and a retry. But the contexts are cancelled in the retry,
so it won't find a leader and will finally error
with that message.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-11-14 23:29:56 +01:00
Hector Sanjuan
15bb953afd Issue #192: Update docs to the new peerset handling
Also, start removing mentions of `PeerAdd` operation, as things should
happen with `--bootstrap` (Join). PeerAdd should not be part of the user
workflow and might disappear or be hidden in the future.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-11-14 22:06:59 +01:00
Hector Sanjuan
0525403f8c cluster-service: make sure we wait for configuration to be saved on init.
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-11-14 21:06:10 +01:00
Hector Sanjuan
5e465c3f62 PeerAdd: send cluster multiaddresses to new peer before adding it to raft
It seems more logical that the new peer should know how to contact
everyone before it needs to start doing it, but it probably does not
matter much. Still, more logical.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-11-14 21:04:33 +01:00
Hector Sanjuan
398865f381 Issue #192: Fix typos and address feedback
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-11-14 19:26:49 +01:00
Hector Sanjuan
7baa4c9c6d Guide/Troubleshooting: add the meanings for some libp2p errors
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-11-14 19:26:49 +01:00
Hector Sanjuan
2c3085586c Documentation: bring in line to 0.3.0
Review documentation to be in line with latest updates to Raft and
any other feature introduced since 0.12.0.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-11-14 19:26:49 +01:00
Hector Sanjuan
5a7c1fc847 Re-enable snapcraft build on travis
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-11-14 19:06:05 +01:00
Hector Sanjuan
09cd86e183
Merge pull request #234 from ipfs/feat/snap-travis-ci
Enable travis for snapcraft builds
2017-11-14 18:24:32 +01:00
Hector Sanjuan
798571d3fc
Merge pull request #226 from ipfs/feat/219-peers
Fix #219: WIP: Remove duplicate peer accounting
2017-11-14 18:23:39 +01:00
Hector Sanjuan
4c3f16d7aa
Merge pull request #233 from ipfs/feat/raft-state-backups
Raft: do not ever remove state, rename it and leave it around
2017-11-14 18:13:15 +01:00
Hector Sanjuan
33c325b865 cluster-service: Fix too many arguments for panic
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-11-14 18:01:32 +01:00
Hector Sanjuan
18ff7a79cc cluster-service: Upstream snap patch
Snaps set a custom $HOME, but we were using /etc/passwd.
There might be other cases where using a custom $HOME might be
handy.

In UNIX systems, $HOME should always be set. For all the rest,
we fall back to the original os/user.HomeDir method.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-11-14 13:23:43 +01:00
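The lookup order described in 18ff7a79cc ($HOME first, os/user as a fallback) can be sketched as below. The helper names resolveHome and userHome are made up for this example, and splitting the logic into two functions is a choice made here for testability, not something taken from the commit.

```go
package main

import (
	"fmt"
	"os"
	"os/user"
)

// resolveHome prefers the value of $HOME and only calls fallback when
// that value is empty.
func resolveHome(envHome string, fallback func() (string, error)) (string, error) {
	if envHome != "" {
		return envHome, nil
	}
	return fallback()
}

// userHome looks up the home directory of the current user through
// the os/user package.
func userHome() (string, error) {
	u, err := user.Current()
	if err != nil {
		return "", err
	}
	return u.HomeDir, nil
}

func main() {
	home, err := resolveHome(os.Getenv("HOME"), userHome)
	if err != nil {
		fmt.Println("cannot determine home directory:", err)
		return
	}
	fmt.Println(home)
}
```

Because the environment value wins, an environment that sets its own $HOME, as the commit says snaps do, gets its custom directory without any special casing.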
Hector Sanjuan
dc3f5b202e
Merge pull request #231 from ipfs/fix/224-pin-progress
Fix #224: Better handling of progress updates when proxy-adding file
2017-11-14 12:33:09 +01:00
Hector Sanjuan
2837170d51 Raft: improve test error message
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-11-14 12:31:58 +01:00