Fixes: #1835.
If IPFS introduces a new multiaddress type/string that we have not compiled
in, we error out. This caused issues with the latest IPFS version (which we
fixed by upgrading libraries too). This makes cluster a bit more future-proof
against upcoming IPFS versions.
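As a rough illustration of the future-proofing idea, a minimal sketch (names and logging are illustrative, not the actual fix) that skips multiaddrs we cannot parse instead of failing the whole operation:

```go
package example

import (
	"log"

	ma "github.com/multiformats/go-multiaddr"
)

// parseKnownAddrs keeps the multiaddrs we can understand and skips the rest,
// so an unknown protocol introduced by a newer IPFS does not break us.
func parseKnownAddrs(raw []string) []ma.Multiaddr {
	var out []ma.Multiaddr
	for _, s := range raw {
		a, err := ma.NewMultiaddr(s)
		if err != nil {
			log.Printf("skipping unparsable multiaddr %q: %s", s, err)
			continue
		}
		out = append(out, a)
	}
	return out
}
```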
We were propagating the wrong context (mostly from the Cluster top-level
methods). This means that request cancellations (and cancellations of the
associated contexts) are not propagated to many methods, which can result in
deadlocks when an operation that is holding a lock is not aborted.
This affects, for example, the operation tracker. Getting all operations from
the tracker relies on someone reading from the out channel, or on the context
being cancelled. When a request is aborted in the middle of the response and
the context is not cancelled, everything that wants to list operations becomes
deadlocked, including operations that need write locks, like
TrackNewOperation.
This fixes it.
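For illustration, a minimal sketch of the pattern used to avoid such deadlocks (types and names are made up): every send to the out channel also selects on ctx.Done(), so a caller that goes away can no longer leave the tracker blocked while it holds locks.

```go
package example

import "context"

// Operation is a stand-in for the tracker's operation type.
type Operation struct{ Name string }

// listOperations writes all operations to out, but gives up as soon as the
// request context is cancelled instead of blocking forever on the channel.
func listOperations(ctx context.Context, ops []Operation, out chan<- Operation) error {
	defer close(out)
	for _, op := range ops {
		select {
		case out <- op:
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return nil
}
```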
When IPFS starts failing or doesn't respond (i.e. during a restart), cluster
is likely to start sending requests at very fast rates: if there are 100k
items to be pinned and pins start failing immediately, cluster will burn
through the pin queue very fast and it will all be failures. At the same time,
IPFS is hammered non-stop until it recovers, which may make recovery harder.
This commit introduces a rate limit for when requests to IPFS fail. After 10
failed requests, requests are sent at a rate of at most 1 req/s. Once a
request succeeds, the rate limit is lifted.
This should prevent hammering the IPFS daemon, and also avoid increased CPU
usage in cluster as it burns through the pinning queue while IPFS is offline,
which makes the situation on those machines worse (and emits far more logs).
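A minimal sketch of such a failure-triggered limiter (the threshold and rate follow the description above; the structure is illustrative, not the actual implementation):

```go
package example

import (
	"sync/atomic"
	"time"
)

// failureLimiter slows requests down to roughly 1 req/s after 10 consecutive
// failures and lifts the limit again after the first success.
type failureLimiter struct {
	failures int64
}

func (fl *failureLimiter) beforeRequest() {
	if atomic.LoadInt64(&fl.failures) >= 10 {
		time.Sleep(time.Second)
	}
}

func (fl *failureLimiter) afterRequest(err error) {
	if err != nil {
		atomic.AddInt64(&fl.failures, 1)
		return
	}
	atomic.StoreInt64(&fl.failures, 0)
}
```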
This fixes a bug in the API code that made it return 204 No Content when the
RPC methods failed with an error before any items were returned on the
channel.
This commit introduces unlimited waiting on start until a request to `ipfs id`
succeeds.
Waiting has some consequences:
* State watching (recover/sync) and metrics publishing do not start until ipfs is ready
* swarm/connect is not triggered until ipfs is ready.
Once the first request to ipfs succeeds, everything proceeds as it did before.
This avoids attempting operations like sending our IDs in metrics when IPFS is
simply not there.
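Conceptually, the waiting looks like the following sketch (the retry interval and the ipfsID helper are illustrative):

```go
package example

import (
	"context"
	"time"
)

// waitForIPFS blocks until a request to "ipfs id" succeeds, or until the
// context is cancelled.
func waitForIPFS(ctx context.Context, ipfsID func(context.Context) error) error {
	for {
		if err := ipfsID(ctx); err == nil {
			return nil // IPFS answered; proceed as before.
		}
		select {
		case <-time.After(5 * time.Second):
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}
```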
This fixes two bugs. First, the "blockPut response CID does not match the
multihash" warning was showing up when it shouldn't. In particular, the
multipart reader called Node() several times for the same block, resulting in
CIDs being removed from the Seen set and causing the warning when there were
several blocks (usually the empty dir block).
This also means we were counting BlockPuts (and total data added) wrong in the
metrics, double-counting some blocks, as these were recorded in Node() calls.
The fix moves the tracking to Next(), which is only called once for each
block. To avoid timing issues between block reads from the channel and
BlockPut responses, the Seen set now stores how many times we have seen a
block. Thus a duplicated block that gets two BlockPut responses will not
trigger a warning, regardless of when those responses arrive.
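A minimal sketch of the counting Seen set (keys and method names are illustrative):

```go
package example

// seen counts how many times each CID has been sent, so that a duplicated
// block may receive several BlockPut responses without triggering the
// mismatch warning.
type seen map[string]int

// sent records that a block with this CID was sent to IPFS.
func (s seen) sent(c string) { s[c]++ }

// confirm consumes one pending response for this CID and reports whether we
// were actually expecting it.
func (s seen) confirm(c string) bool {
	if s[c] == 0 {
		return false // unexpected response: the caller may warn.
	}
	s[c]--
	return true
}
```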
Fixes #1706.
* Gauge for total number of ipfs pins
* Counter for pin/add
* Counter for pin/add errors
* Counter for Block/Puts
* Counter for blocks added
* Counter for block/added size
* Counter for block/added errors
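As a sketch of how such metrics could be declared with OpenCensus (which cluster uses for observability); the metric names, units and aggregations below are illustrative, not the real ones:

```go
package example

import (
	"context"

	"go.opencensus.io/stats"
	"go.opencensus.io/stats/view"
)

var (
	ipfsPins     = stats.Int64("ipfs/pins", "Total number of ipfs pins", stats.UnitDimensionless)
	pinAdds      = stats.Int64("ipfs/pin_add", "Total number of pin/add calls", stats.UnitDimensionless)
	pinAddErrors = stats.Int64("ipfs/pin_add_errors", "Total number of pin/add errors", stats.UnitDimensionless)
)

// registerViews exposes the measures: the pin total behaves as a gauge
// (last value), the others as plain counters.
func registerViews() error {
	return view.Register(
		&view.View{Measure: ipfsPins, Aggregation: view.LastValue()},
		&view.View{Measure: pinAdds, Aggregation: view.Count()},
		&view.View{Measure: pinAddErrors, Aggregation: view.Count()},
	)
}

// recordPinAdd bumps the pin/add counter.
func recordPinAdd(ctx context.Context) {
	stats.Record(ctx, pinAdds.M(1))
}
```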
Adding kept retrying indefinitely when sending blocks to IPFS. When IPFS was down, this:
* Stored all errors
* Kept retrying
* Additionally, never closed response bodies
This resulted in memory leaks. The new behaviour does not keep retrying, and
response bodies are closed after reading.
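A minimal sketch of the new, non-retrying behaviour, where the response body is always drained and closed (the request shape is illustrative):

```go
package example

import (
	"fmt"
	"io"
	"net/http"
)

// putBlock performs a single attempt: any error is returned to the caller
// instead of retrying forever, and the response body is always closed.
func putBlock(client *http.Client, url string, body io.Reader) error {
	resp, err := client.Post(url, "application/octet-stream", body)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return err
	}
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("block/put returned %s", resp.Status)
	}
	return nil
}
```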
Because we are adding blocks in a single call, and we choose the format
parameter based on the prefix of the first block, IPFS will return block CIDs
based on that option.
This caused warnings when adding content that has multiple CID prefixes: for
example, any cid-version=1 file will include both dag-pb CIDs and raw
CIDs. Since the first block is usually a leaf, IPFS will only return raw CIDs,
causing a warning because of the CID mismatch.
This fixes things by comparing multihashes only.
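In code, the relaxed comparison boils down to something like this sketch:

```go
package example

import (
	"bytes"

	"github.com/ipfs/go-cid"
)

// sameMultihash treats two CIDs as referring to the same block when their
// multihashes match, even if version or codec (dag-pb vs raw) differ.
func sameMultihash(a, b cid.Cid) bool {
	return bytes.Equal(a.Hash(), b.Hash())
}
```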
But! We might be writing blocks with the wrong CID and then the good CID won't
work!
Correct, we might, in some corner cases.
In go-ipfs >= 0.12.0, all blocks are addressed by multihash, so CID prefixes
are irrelevant. This problem does not exist in that case.
In go-ipfs < 0.12.0, if a read for a CIDv1 dag-pb block fails, it is retried
as if it were raw. This means that if we wrote something with cidv1/format=raw
that should have been cidv1/format=dag-pb, the read will still work. That
covers some common cases (i.e. adding with cid-version=1) because the first
block should be a raw leaf. Default params (cidv0) are not affected since
everything is raw multihashes. However, there are still possible CAR layouts
etc. where cluster will write blocks wrongly to older IPFS versions.
This commit introduces an api.Cid type and replaces the usage of cid.Cid
everywhere.
The main motivation here is to override MarshalJSON so that CIDs are
JSON-ified as '"Qm...."' instead of '{ "/": "Qm....." }', as this "ipld"
representation of CIDs is horrible to work with, and our APIs are not issuing
IPLD objects to start with.
Unfortunately, there is no way to do this cleanly, and the best way is to just
switch everything to our own type.
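As an illustration, a minimal sketch of such a wrapper (the real api.Cid may differ in details):

```go
package example

import (
	"encoding/json"

	"github.com/ipfs/go-cid"
)

// Cid wraps cid.Cid so that its JSON form is the plain string "Qm..."
// instead of the IPLD {"/": "Qm..."} object.
type Cid struct {
	cid.Cid
}

func (c Cid) MarshalJSON() ([]byte, error) {
	return json.Marshal(c.Cid.String())
}

func (c *Cid) UnmarshalJSON(b []byte) error {
	var s string
	if err := json.Unmarshal(b, &s); err != nil {
		return err
	}
	dec, err := cid.Decode(s)
	if err != nil {
		return err
	}
	c.Cid = dec
	return nil
}
```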
The ipfsadder actually ends up DAG-putting some nodes several times
(i.e. non-leaves, the root)... but usually one right after the other. This
prevents that and avoids sending the same data multiple times over the wire
(it is not a good thing to 3x a small payload because of this).
This commit fixes #810 and adds block streaming to the final destinations when
adding. This should bring major performance gains when adding data to clusters.
Before, every time cluster issued a block, it was broadcast individually to
all destinations (a new libp2p stream), where it was block/put to IPFS (a
single block/put HTTP round trip per block).
Now, blocks are streamed all the way from the adder module to the ipfs daemon,
turning every block, as it arrives, into a single part of a multipart
block/put request.
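Conceptually, the streaming side looks like this sketch (the Block type, URL and query parameters are illustrative): an io.Pipe connects the block producer to the HTTP request, so nothing needs to be buffered in full.

```go
package example

import (
	"io"
	"mime/multipart"
	"net/http"
)

// Block is a stand-in for cluster's block type.
type Block interface {
	Cid() string
	RawData() []byte
}

// streamBlockPut turns every block arriving on the channel into one part of a
// single multipart block/put request.
func streamBlockPut(apiURL string, blocks <-chan Block) (*http.Response, error) {
	pr, pw := io.Pipe()
	mw := multipart.NewWriter(pw)

	go func() {
		for blk := range blocks {
			part, err := mw.CreateFormFile("file", blk.Cid())
			if err != nil {
				pw.CloseWithError(err)
				return
			}
			if _, err := part.Write(blk.RawData()); err != nil {
				pw.CloseWithError(err)
				return
			}
		}
		pw.CloseWithError(mw.Close())
	}()

	req, err := http.NewRequest(http.MethodPost, apiURL+"/api/v0/block/put", pr)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", mw.FormDataContentType())
	return http.DefaultClient.Do(req)
}
```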
Before, block-broadcast needed to wait for all destinations to finish in order
to process the next block. Now, buffers allow some destinations to be faster
than others while sending and receiving blocks.
Before, if a block/put request failed to be broadcast everywhere, an error
was raised at that moment.
Now, we keep streaming until the end and only then report any errors. The
operation succeeds as long as at least one stream finished successfully.
Errors while block/putting to IPFS do not abort streams. Instead, subsequent
blocks are retried with a new request, although the method will return an
error when the stream finishes if there were errors at any point.
This commit makes all the changes needed to turn Peers() into a streaming call.
While Peers() is usually not a problematic call, for consistency, all calls
returning collections assembled through broadcasts to cluster peers are now
streaming calls.
This commit continues the work of taking advantage of the streaming
capabilities in go-libp2p-gorpc by improving the ipfsconnector and pintracker
components.
StatusAll and RecoverAll methods are now streaming methods, with the REST API
output changing accordingly to produce a stream of GlobalPinInfos rather than
a JSON array.
pin/ls requests to the ipfs daemon now use ?stream=true and avoid having to
load the full pinset map in memory. StatusAllLocal and RecoverAllLocal
requests to the pin tracker stream all the way and no longer store the full
pinset, nor the full PinInfo status slice, before sending it out.
We have additionally switched to a pattern where streaming methods receive the
channel as an argument, allowing the caller to decide whether to launch a
goroutine, do buffering, etc.
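A minimal sketch of the caller side of this pattern (names are illustrative; the method is assumed to close the channel when it is done):

```go
package example

import "context"

// PinInfo is a stand-in for the real status type.
type PinInfo struct{ Cid string }

type statusSource interface {
	StatusAll(ctx context.Context, out chan<- PinInfo) error
}

// collect shows that the caller decides on buffering and on running the
// streaming method in its own goroutine.
func collect(ctx context.Context, src statusSource) ([]PinInfo, error) {
	out := make(chan PinInfo, 1024) // caller-chosen buffer size
	errCh := make(chan error, 1)
	go func() { errCh <- src.StatusAll(ctx, out) }()

	var res []PinInfo
	for pi := range out {
		res = append(res, pi)
	}
	return res, <-errCh
}
```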
This commit introduces the new go-libp2p-gorpc streaming capabilities for
Cluster. The main aim is to work towards heavily reducing memory usage when
working with very large pinsets.
As a side-effect, it takes the chance to revamp the types of all public
methods so that pointers to what should be static objects are no longer used.
This should heavily reduce heap allocations and GC activity.
The main change is that state.List now returns a channel from which to read
the pins, rather than pins being all loaded into a huge slice.
Everything that reads pins has been updated to iterate on the channel rather
than on a slice. The full pinset is no longer fully loaded into memory for
things that run regularly, like StateSync().
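For illustration, consuming the pinset through the channel returned by state.List might look like this sketch (types are stand-ins):

```go
package example

import "context"

// Pin is a stand-in for the real pin type.
type Pin struct{ Cid string }

// lister is an assumed interface: List returns a channel that is closed once
// all pins have been sent.
type lister interface {
	List(ctx context.Context) (<-chan Pin, error)
}

// countPins iterates the pinset without ever holding the full slice in memory.
func countPins(ctx context.Context, st lister) (int, error) {
	pins, err := st.List(ctx)
	if err != nil {
		return 0, err
	}
	n := 0
	for range pins {
		n++
	}
	return n, nil
}
```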
Additionally, the /allocations endpoint of the REST API no longer returns an
array of pins, but rather streams JSON-encoded pin objects directly. This
change has been extended to the restapi client (which puts pins into a channel
as they arrive) and to ipfs-cluster-ctl.
There are still pending improvements, like StatusAll() calls, which should
also stream responses, and especially BlockPut calls, which should stream
blocks directly into IPFS in a single call.
These are coming in future commits.
Given that every pin and block/put writes something to IPFS and thus increases
the repo size, a while ago we added a check to let the IPFS connector directly
trigger the sending of metrics every 10 such requests. This was meant to
update the metrics more often so that balancing happened more granularly
(particularly for the freespace metric).
In practice, on a cluster that receives several hundred pin/add operations in
a few seconds, this is just bad.
So:
* We disable the whole thing by default.
* We add a new InformerTriggerInterval configuration option to enable it.
* We fix a bug that made this always call the first informer, which may not
have been the freespace one.
Fixes #1554
Fixes: peer names unset for remote peers
This adds an IPFS field to pin status information (PinInfoShort).
It has not been easy to add this, given that the IPFS ID is something that
comes from outside of cluster (unlike the peer name). After several tries I
have settled on the following:
- Use the ping metric to send out peer names and IPFS IDs to the peers in the
cluster.
- Cache the latest known IPFS ID (if IPFS dies we should still be setting
the ID).
- Provide an RPC method for the Pintracker to obtain the IPFS ID from the cache.
- Given that we now know the peer names and IPFS IDs of other peers, we can use
that information even when requests to them error or when we do not contact
them at all (i.e. peers allocated as remote are not queried for status). We can
use the information from the last received ping metric.
- This means we should keep metrics around even after peers go away, at least
for a while, rather than deleting them as soon as we detect that they have
expired.
Putting it all together, we now have a system to gossip peer information
around on top of the ping metrics.
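A minimal sketch of the extra information gossiped on top of the ping metric (field names are illustrative, not the real ones):

```go
package example

// pingPayload carries the peer name and the cached IPFS ID so that other
// peers can fill in status information without contacting us.
type pingPayload struct {
	Peername string `json:"peer_name,omitempty"`
	IPFSID   string `json:"ipfs_id,omitempty"`
}
```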
The pin objects now include an Origin field whose type was
multiaddr.Multiaddr.
This type is an interface and has no idea how to deserialize itself. As a
result, a state JSON export cannot be deserialized.
We had already added a wrapper Multiaddr type to handle these issues. I
believe this fix does not affect anything else other than fixing
UnmarshalJSON. PB types are deserialized by hand, so it should not make a
difference.
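The relevant part of the wrapper type is the custom JSON deserialization, roughly like this sketch (the real api.Multiaddr may differ):

```go
package example

import (
	"encoding/json"

	ma "github.com/multiformats/go-multiaddr"
)

// Multiaddr wraps the multiaddr.Multiaddr interface with a concrete type
// that knows how to deserialize itself from a JSON string.
type Multiaddr struct {
	ma.Multiaddr
}

func (m *Multiaddr) UnmarshalJSON(b []byte) error {
	var s string
	if err := json.Unmarshal(b, &s); err != nil {
		return err
	}
	a, err := ma.NewMultiaddr(s)
	if err != nil {
		return err
	}
	m.Multiaddr = a
	return nil
}
```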
This allows the ipfshttp IPFS-connector to support pin origins.
When present, it will issue "swarm connect" requests to IPFS, in the
background, with 0 weight (not yet supported by go-ipfs), for a maximum of 10
peers. Errors are ignored.
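Sketched out, the background connect step could look like this (the post helper is a stand-in for the connector's IPFS API call; the cap of 10 follows the description above):

```go
package example

import (
	"context"
	"net/url"
)

// connectToOrigins fires best-effort swarm/connect requests for up to 10
// origins in the background, ignoring errors.
func connectToOrigins(ctx context.Context, origins []string, post func(ctx context.Context, path string) error) {
	go func() {
		for i, orig := range origins {
			if i >= 10 {
				return
			}
			_ = post(ctx, "swarm/connect?arg="+url.QueryEscape(orig))
		}
	}()
}
```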
Receive the full pin object so that it can decide whether to check
for recursive or direct pins directly.
Additionally, unpin no longer checks for the pin's presence and simply
triggers unpins (ignoring errors).
* Libp2p protectors no longer needed, use PSK directly
* Generate cluster 32-byte secret here (helper gone from pnet)
* Switch to go-log/v2 in all places
* DHT bootstrapping not needed. Adjust DHT options for tests.
* Do not rely on the disappeared CidToDsKey and DsKeyToCid functions from dshelp.
* Disable QUIC (does not support private networks)
* Fix tests: autodiscovery started working properly
Currently we were only specifying the block format. When adding with
a custom hash function, even though we produced the right CIDs, IPFS
did not know the hash function and ended up storing the blocks hashed with
SHA256.
Additionally, since NodeWithMeta serializes the CID, we do not need
to carry a Format parameter (which specifies the Codec): it is already
embedded.
Tests have been added, and BlockPut in ipfshttp now checks that the
response's CID matches the data sent. This will catch errors like the one
that was happening, but also any data corruption between cluster and
IPFS during the block upload.
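For illustration, building the block/put options from the block's own CID and verifying the response could look like this sketch (parameter handling is illustrative):

```go
package example

import (
	"fmt"
	"net/url"

	"github.com/ipfs/go-cid"
	"github.com/multiformats/go-multihash"
)

// blockPutQuery derives format, mhtype and mhlen from the CID prefix so that
// IPFS hashes the block with the intended function instead of SHA256.
func blockPutQuery(c cid.Cid) string {
	prefix := c.Prefix()
	q := url.Values{}
	q.Set("format", cid.CodecToStr[prefix.Codec])
	q.Set("mhtype", multihash.Codes[prefix.MhType])
	q.Set("mhlen", fmt.Sprintf("%d", prefix.MhLength))
	return "block/put?" + q.Encode()
}

// checkResponse errors when the CID returned by IPFS does not match the block
// we uploaded, catching wrong options as well as data corruption.
func checkResponse(sent, received cid.Cid) error {
	if !sent.Equals(received) {
		return fmt.Errorf("blockPut response CID %s does not match %s", received, sent)
	}
	return nil
}
```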
* add ipv6 listening addresses to the default config
* ipfsproxy: support multiple listeners. Add default ipv6.
* restapi: support multiple listen addresses. enable ipv6
* cluster_config: format default listen addresses
* commands: update for multiple listeners. Fix randomports for udp and ipv6.
* ipfs-cluster-service: fix randomports test
* multiple listeners: fix remaining tests
* golint
* Disable ipv6 in defaults
It is not supported by Docker by default. It is not supported in Travis CI
build environments. Users can enable it manually now.
* proxy: disable ipv6 in test
* ipfshttp: fix test
Co-authored-by: @RubenKelevra <cyrond@gmail.com>