Commit Graph

107 Commits

Author SHA1 Message Date
Kishan Sagathiya
295915272b Tests: multiple fixes to tests reliability (#943)
This makes a number of fixes to improve the reliability of tests.
2019-10-31 21:51:13 +01:00
Hector Sanjuan
1a0998f10d CRDT: update and increase timeout
* Update go-ds-crdt to 0.1.5 which adds a return statement in case of error fetching a node.
* Increase DAG-Get timeout to 2 minutes
* Downgrade go-bitswap to 0.1.6.
2019-09-12 19:22:52 +02:00
Hector Sanjuan
81ea1e76bc Merge branch 'master' into feat/sort-responses 2019-09-06 15:13:54 +02:00
Hector Sanjuan
d63a7fd641
Merge pull request #877 from ipfs/fix/ipfs-to-p2p
Use `p2p` protocol name over `ipfs` for multiaddr
2019-09-06 15:00:36 +02:00
Hector Sanjuan
8ca0f5781c Fix: trust-all remained enabled always 2019-08-27 19:18:50 +02:00
Kishan Mohanbhai Sagathiya
c109a01343 Sort peers for crdt consensus.Peers 2019-08-26 18:27:17 +05:30
Hector Sanjuan
df7621ba6e crdt: inform about trust all mode 2019-08-26 12:22:44 +02:00
Kishan Mohanbhai Sagathiya
6656b80a00 Some more occurences of /ipfs
and use  SwapToP2pMultiaddrs (very helpful since ipfs still send
addresses with `/ipfs` tag)
2019-08-16 11:56:09 +05:30
Hector Sanjuan
28ae394fa9 Fix #883: Tweak timeouts for better tests 2019-08-13 19:44:48 +02:00
Hector Sanjuan
676ad1b61e CRDT: TrustAll by default. 2019-08-12 10:25:04 +02:00
Kishan Sagathiya
0a5598a922 Fix #211: Remove commented code around LeaderObservation (#858)
* Remove 32bit safegaurd and remove LeaderObersvation
2019-07-29 19:11:24 +02:00
Kishan Sagathiya
7f52242f35 Fix #840: Removed Raft peers should dissapear from peerstore (#846)
With this commit, cluster peer will observe on events of peer removal
from cluster. On occurence of the event, the cluster peer will clear
the removed peer from its peerstore.
2019-07-25 14:40:05 +02:00
Kishan Sagathiya
e7b731e0e4 Fix #835: service: init --peers
* Init should take a list of peers

This commit adds `--peers` option to `ipfs-cluster-service init`

`ipfs-cluster-service init --peers <multiaddress,multiaddress>`

- Adds and writes the given peers to the peerstore file
- For raft config section, adds the peer IDs to the `init_peerset`
- For crdt config section, add the peer IDs to the `trusted_peers`
2019-07-25 10:47:44 +02:00
Hector Sanjuan
b804e61ef0 Update deps along with go-libp2p-core refactor
Lots of rewrites in imports...
2019-06-14 13:10:45 +02:00
Hector Sanjuan
83c4866100 Remove Leftover println 2019-06-13 17:27:31 +02:00
Hector Sanjuan
27368ab077 Fix: alert at most once PER METRIC
Before it would alert at most once per peer, which prevented some metrics
from alerting at all.
2019-06-11 11:44:12 +02:00
Hector Sanjuan
f1707e40f2
Merge pull request #811 from ipfs/rpi-fixes
Multiple fixes to CRDTs
2019-06-10 14:53:34 +01:00
Hector Sanjuan
b349aacc83 crdt: Allow to configure CRDT in "TrustAll" mode
Specifying "*" as part of "trusted_peers" in the configuration will
result in trusting all peers.

This is useful for private clusters where we don't want to list every
peer ID in the config.
2019-06-10 13:35:25 +02:00
Hector Sanjuan
f0b7d2a839 crdt: fix: Create the ipfslite peer before peermanager bootstrapping
Bitswap needs to exist before connections are opened!

Fixes #798
2019-06-09 15:12:19 +02:00
Hector Sanjuan
451b91d8d1 crdt: fix wrapping the ipfslite dag syncer
No need to wrap it anymore since we don't time out in it.

Also, it was preventing the use of bitswap sessions because it does
not implement them.
2019-06-07 23:33:52 +02:00
Hector Sanjuan
16a2a3611f Reorder imports 2019-06-05 19:31:51 +02:00
Adrian Lanzafame
21c2b6fbdd
add tracing to all crdt methods
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-05-31 12:58:26 +10:00
Adrian Lanzafame
4cc5182502
add tracing to crdt hooks
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-05-31 12:56:33 +10:00
Hector Sanjuan
2a2f8c0309 CRDT: Fix protecting of nodes before loading the peerstore
Addresses comments from review #792
2019-05-27 14:27:23 +02:00
Hector Sanjuan
196aa23f34 Fix #787: Connectivity fixes
Currently, unless doing Join() (--bootstrap), we do not connect to any peers on startup.

We however loaded up the peerstore file and Raft will automatically connect
older peers to figure out who is the leader etc. DHT bootstrap, after Raft
was working, did the rest.

For CRDTs we need to connect to people on a normal boot as otherwise, unless
bootstrapping, this does not happen, even if the peerstore contains known peers.

This introduces a number of changes:

* Move peerstore file management back inside the Cluster component, which was
already in charge of saving the peerstore file.
* We keep saving all "known addresses" but we load them with a non permanent
TTL, so that there will be clean up of peers we're not connected to for long.
* "Bootstrap" (connect) to a small number of peers during Cluster component creation.
* Bootstrap the DHT asap after this, so that other cluster components can
initialize with a working peer discovery mechanism.
* CRDT Trust() method will now:
  * Protect the trusted Peer ID in the conn manager
  * Give top priority in the PeerManager to that Peer (see below)
  * Mark addresses as permanent in the Peerstore

The PeerManager now attaches priorities to peers when importing them and is
able to order them according to that priority. The result is that peers with
high priority are saved first in the peerstore file. When we load the peerstore
file, the first entries in it are given the highest priority.

This means that during startup we will connect to "trusted peers" first
(because they have been tagged with priority in the previous run and saved at
the top of the list). Once connected to a small number of peers, we let the
DHT bootstrap process in the background do the rest and discover the network.

All this makes the peerstore file a "bootstrap" list for CRDTs and we will attempt
to connect to peers on that list until some of those connections succeed.
2019-05-27 14:27:23 +02:00
Hector Sanjuan
b46f022884 Raft: rewrite logger
New Raft update has changed the type of the logger
2019-05-25 00:24:30 +02:00
Hector Sanjuan
44d93d61e0 fix timeouts in crdt 2019-05-21 11:55:48 +02:00
Hector Sanjuan
8e6eefb714 Tests: multiple fixes
This fixes multiple issues in and around tests while
increasing ttls and delays in 100ms. Multiple issues, including
races, tests not running with consensus-crdt missing log messages
and better initialization have been fixed.

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2019-05-20 23:45:04 +02:00
Hector Sanjuan
21032f2101 Raft: remove TODO. Trust all peers. 2019-05-13 23:22:08 +02:00
Hector Sanjuan
d468ea5d31 crdt: add test for DistrustPeer 2019-05-13 23:22:08 +02:00
Hector Sanjuan
dbc52ae981 rpc auth: golint 2019-05-09 22:36:03 +02:00
Hector Sanjuan
949e6f2364 RPC auth: Support Trusted Peers in CRDT consensus component.
TrustedPeers are specified in the configuration. Additional peers
can be added at runtime with Trust/Distrust functions.

Unfortunately we cannot use consensus.PeerAdd as a way to trust a peer as
cluster.PeerAdd+Join can be called by any peer and this calls
consensus.PeerAdd.

The result is consensus.PeerAdd doing a lot in Raft while consensus.Trust does
nothing, while in CRDTs consensus.Trust does something but consensus.PeerAdd
does nothing. But this is more or less consistent.
2019-05-09 19:48:40 +02:00
Hector Sanjuan
70f4cad613 RPC Auth: start using the RPC policy in the RPC server. 2019-05-09 15:14:26 +02:00
Hector Sanjuan
3d49ac26a5 Feat: Split components into RPC Services
I had thought of this for a very long time but there were no compelling
reasons to do it. Specifying RPC endpoint permissions becomes however
significantly nicer if each Component is a different RPC Service. This also
fixes some naming issues like having to prefix methods with the component name
to separate them from methods named in the same way in some other component
(Pin and IPFSPin).
2019-05-04 21:36:10 +01:00
Hector Sanjuan
acbd7fda60 Consensus: add new "crdt" consensus component
This adds a new "crdt" consensus component using go-ds-crdt.

This implies several refactors to fully make cluster consensus-component
independent:

* Delete mapstate and fully adopt dsstate (after people have migrated).
* Return errors from state methods rather than ignoring them.
* Add a new "datastore" modules so that we can configure datastores in the
   main configuration like other components.
* Let the consensus components fully define the "state.State". Thus, they do
not receive the state, they receive the storage where we put the state (a
go-datastore).
* Allow to customize how the monitor component obtains Peers() (the current
  peerset), including avoiding using the current peerset. At the moment the
  crdt consensus uses the monitoring component to define the current peerset.
  Therefore the monitor component cannot rely on the consensus component to
  produce a peerset.
* Re-factor/re-implementation of "ipfs-cluster-service state"
  operations. Includes the dissapearance of the "migrate" one.

The CRDT consensus component defines creates a crdt-datastore (with ipfs-lite)
and uses it to intitialize a dssate. Thus the crdt-store is elegantly
wrapped. Any modifications to the state get automatically replicated to other
peers. We store all the CRDT DAG blocks in the local datastore.

The consensus components only expose a ReadOnly state, as any modifications to
the shared state should happen through them.

DHT and PubSub facilities must now be created outside of Cluster and passed in
so they can be re-used by different components.
2019-04-17 19:14:26 +02:00
Hector Sanjuan
ea85cf7805 Rename "test.Test*" to "test.*" (test.TestCid1 -> test.Cid1)
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2019-02-27 20:19:10 +00:00
Hector Sanjuan
9df6344a07 Avoid using string testing CIDs and use cid.Cids directly
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2019-02-27 20:09:31 +00:00
Hector Sanjuan
cbf51a2b66 Fix struct tags
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2019-02-27 18:50:46 +00:00
Hector Sanjuan
c4b18cd5f6 Address issues from self-review
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2019-02-27 18:43:29 +00:00
Hector Sanjuan
6447ea51d2 Remove *Serial types. Use pointers for all types.
This takes advantange of the latest features in go-cid, peer.ID and
go-multiaddr and makes the Go types serializable by default.

This means we no longer need to copy between Pin <-> PinSerial, or ID <->
IDSerial etc. We can now efficiently binary-encode these types using short
field keys and without parsing/stringifying (in many cases it just a cast).

We still get the same json output as before (with minor modifications for
Cids).

This should greatly improve Cluster performance and memory usage when dealing
with large collections of items.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2019-02-27 17:04:35 +00:00
Hector Sanjuan
0fed61192a Remove backwards compatibility hacks
The things removed here have been live for more than 2 releases.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2019-02-20 14:02:09 +00:00
Hector Sanjuan
d57b81490f State: Use go-datastore to implement the state interface
Since the beginning, we have used a Go map to store the shared state (pinset)
in memory. The mapstate knew how to serialize itself so that libp2p-raft would
know how to write to disk when it:

* Saved snapshots of the state on shutdown
* Sent the state to a newcomer peer

hashicorp.Raft assumes an in-memory state which is snapshotted from time to
time and read from disk on boot.

This commit adds a `dsstate` implementation of the state interface using
`go-datastore`. This allows to effortlessly switch to a disk-backed state in
the future (as we will need), and also have at our disposal the different
implementations and utilities of Datastore for fine-tuning (caching, batching
etc.).

`mapstate` has been reworked to use dsstate. Ideally, we would not even need
`mapstate`, as it would suffice to initialize `dsstate` with a
`MapDatastore`. BUT, we still need it separate to be able to auto-migrate to
the new format.

This will be the last migration with the current system. Once this has been
released and users have been able to upgrade we will just remove `mapstate` as
it is now.

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2019-02-19 18:31:14 +00:00
Hector Sanjuan
3059ab387a
Merge pull request #663 from roignpar/issue_656
Read config values from env on init command
2019-02-19 17:48:47 +00:00
Robert Ignat
bac982c5aa Add ApplyEnvVars test to raft config
License: MIT
Signed-off-by: Robert Ignat <robert.ignat91@gmail.com>
2019-02-18 17:43:54 +02:00
Robert Ignat
168cf76224 Change ApplyEnvVars strategy for all config components
Get jsonConfig from Config, apply env vars to it, load jsonConfig
back into Config.

License: MIT
Signed-off-by: Robert Ignat <robert.ignat91@gmail.com>
2019-02-15 19:07:20 +02:00
Hector Sanjuan
10d6a37304 Update deps: fix the things that need fixing
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2019-02-15 12:40:53 +00:00
Robert Ignat
032f02802f Implement ApplyEnvVars for all ComponentConfigs
License: MIT
Signed-off-by: Robert Ignat <robert.ignat91@gmail.com>
2019-02-08 23:57:16 +02:00
Robert Ignat
ed30ac1ab4 Add ApplyEnvVars() to ComponentConfig interface
* cluster and restapi configs can also get values from environment variables
* other config components don't read any values from the environment

License: MIT
Signed-off-by: Robert Ignat <robert.ignat91@gmail.com>
2019-02-07 20:51:20 +02:00
Adrian Lanzafame
3b3f786d68
add opencensus tracing and metrics
This commit adds support for OpenCensus tracing
and metrics collection. This required support for
context.Context propogation throughout the cluster
codebase, and in particular, the ipfscluster component
interfaces.

The tracing propogates across RPC and HTTP boundaries.
The current default tracing backend is Jaeger.

The metrics currently exports the metrics exposed by
the opencensus http plugin as well as the pprof metrics
to a prometheus endpoint for scraping.
The current default metrics backend is Prometheus.

Metrics are currently exposed by default due to low
overhead, can be turned off if desired, whereas tracing
is off by default as it has a much higher performance
overhead, though the extent of the performance hit can be
adjusted with smaller sampling rates.

License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-02-04 18:53:21 +10:00
Adrian Lanzafame
4f194f52d3
use DecodeCid in log_op
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2018-10-30 21:07:27 +10:00