The go-ds-crdt upgrade disables multi-head processing by default again. We have
seen that multi-head processing causes a lot of branching.
We do, however, increase the number of workers. With large deltas, all 5
workers may be busy downloading or processing a delta while hundreds of
children in the DAG are potentially waiting. It is therefore worthwhile to
attempt more work in parallel.
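
The effect of the worker count can be illustrated with a generic bounded worker
pool. This is only a sketch of the general pattern, not go-ds-crdt code:
`processAll`, the job type and the `handle` callback are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// processAll fans jobs out to numWorkers goroutines and returns how many
// jobs were handled. Hypothetical helper, not a go-ds-crdt API.
func processAll(jobs []string, numWorkers int, handle func(string)) int {
	queue := make(chan string)
	var wg sync.WaitGroup
	var mu sync.Mutex
	done := 0

	for i := 0; i < numWorkers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range queue {
				handle(j) // stands in for downloading or processing a delta
				mu.Lock()
				done++
				mu.Unlock()
			}
		}()
	}

	// With few workers and slow handlers, this loop blocks while the
	// remaining children wait in line.
	for _, j := range jobs {
		queue <- j
	}
	close(queue)
	wg.Wait()
	return done
}

func main() {
	children := []string{"a", "b", "c", "d", "e", "f", "g", "h"}
	n := processAll(children, 3, func(string) {})
	fmt.Println("processed:", n)
}
```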
This allows requesting status for specific CIDs, as provided in the "cids"
query parameter, instead of requesting status for all CIDs.
In this case, the filter is ignored.
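
A minimal sketch of how such a request could be parsed. It assumes that "cids"
holds a comma-separated list and that the filter arrives in a "filter"
parameter; the `statusQuery` type and `parseStatusQuery` function are
hypothetical and not the actual API code.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// statusQuery is a hypothetical holder for the parsed request options.
type statusQuery struct {
	CIDs   []string
	Filter string
}

// parseStatusQuery reads "cids" (assumed comma-separated) and "filter".
// When any CIDs are given, the filter is dropped.
func parseStatusQuery(rawQuery string) (statusQuery, error) {
	vals, err := url.ParseQuery(rawQuery)
	if err != nil {
		return statusQuery{}, err
	}
	q := statusQuery{Filter: vals.Get("filter")}
	for _, c := range strings.Split(vals.Get("cids"), ",") {
		if c = strings.TrimSpace(c); c != "" {
			q.CIDs = append(q.CIDs, c)
		}
	}
	if len(q.CIDs) > 0 {
		q.Filter = "" // explicit CIDs take precedence over the filter
	}
	return q, nil
}

func main() {
	q, err := parseStatusQuery("cids=Qm1,Qm2&filter=pin_error")
	if err != nil {
		panic(err)
	}
	fmt.Printf("cids=%v filter=%q\n", q.CIDs, q.Filter)
}
```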
It is not good to add something locally only to pin it somewhere else:
* The locally used space is not GCed automatically, so it is lost.
* We pay the penalty of having to copy things somewhere else.
Given that every pin and block/put writes something to IPFS and thus increases
the repo size, a while ago we added a check to let the IPFS connector directly
trigger the sending of metrics every 10 such requests. This was meant to
update the metrics more often so that balancing happened more granularly
(particularly the freespace one).
In practice, on a cluster that receives several hundred pin/add operations in
a few seconds, this is just bad: it triggers metric sending far too often.
So:
* We disable the whole thing by default.
* We add a new InformerTriggerInterval configuration option to enable it.
* We fix a bug that made this always call the first informer, which may not
have been the freespace one.
Fixes #1554
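
The counting logic can be sketched as follows. This is not the connector code:
`requestTrigger` and `pickInformer` are hypothetical names, and treating an
interval of 0 as "disabled" is an assumption about how
InformerTriggerInterval behaves.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// requestTrigger counts write requests and reports when metrics should be
// sent. Hypothetical type; interval 0 is assumed to mean "disabled".
type requestTrigger struct {
	interval uint64
	count    uint64
}

// shouldTrigger is called once per pin or block/put request and returns
// true on every interval-th call. It is safe for concurrent use.
func (t *requestTrigger) shouldTrigger() bool {
	if t.interval == 0 {
		return false
	}
	return atomic.AddUint64(&t.count, 1)%t.interval == 0
}

// pickInformer looks an informer up by name instead of assuming that the
// wanted one sits at index 0.
func pickInformer(names []string, want string) (int, bool) {
	for i, n := range names {
		if n == want {
			return i, true
		}
	}
	return -1, false
}

func main() {
	t := &requestTrigger{interval: 10}
	fired := 0
	for i := 0; i < 300; i++ {
		if t.shouldTrigger() {
			fired++
		}
	}
	fmt.Println("metric sends for 300 requests:", fired)

	idx, ok := pickInformer([]string{"other", "freespace"}, "freespace")
	fmt.Println("freespace informer index:", idx, ok)
}
```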
Fixes: peer names unset for remote peers
This adds an IPFS field to pin status information (PinInfoShort).
It has not been easy to add this, given that the IPFS ID is something that
comes from outside of the cluster (unlike the peer name). After several tries
I have settled on the following:
- Use the ping metric to send out peer names and IPFS IDs to the peers in the
cluster.
- Cache the latest known IPFS ID (if IPFS dies we should still be setting
the ID).
- Provide an RPC method for the Pintracker to obtain the IPFS ID from the cache.
- Given that we now know the peer names and IPFS IDs of other peers, we can
use that information even if requests to them fail or we do not contact them
at all (i.e. peers allocated as remote are not queried for status). We can
use the information from the last received ping metric.
- This means we should keep metrics around even if peers go away, at least for
a while rather than deleting them as soon as we detect that they have expired.
Putting it all together, we now have a system to gossip peer information around
on top of the ping metrics.
We call RecoverAll regularly and I noticed it was way slower than it should be.
After all, it should just loop over the pinset and enqueue items that are
unexpectedly unpinned or in pin error.
However, at some point we decided that RecoverAll would return information for
all pins, regardless of whether they were recovered or not. This ends up
resulting in a separate Status call for every pin that is already pinned, and
this call hits IPFS. This is pretty bad with big pinsets.
This commit fixes that: we return no state information for pins that are not
touched.
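
The intended behaviour can be sketched as a single pass over the pinset. The
types and the `recoverAll` signature below are hypothetical and much simpler
than the real pintracker; the point is that already-pinned items are skipped
without any extra lookup and are left out of the result.

```go
package main

import "fmt"

type status int

const (
	pinned status = iota
	pinError
	unexpectedlyUnpinned
)

// pinInfo is a simplified, hypothetical stand-in for per-pin state.
type pinInfo struct {
	CID    string
	Status status
}

// recoverAll enqueues only the items that need recovery and returns
// information for those items alone.
func recoverAll(pinset []pinInfo, enqueue func(cid string)) []pinInfo {
	var touched []pinInfo
	for _, p := range pinset {
		if p.Status == pinError || p.Status == unexpectedlyUnpinned {
			enqueue(p.CID)
			touched = append(touched, p)
		}
	}
	return touched
}

func main() {
	pinset := []pinInfo{
		{CID: "Qm1", Status: pinned},
		{CID: "Qm2", Status: pinError},
		{CID: "Qm3", Status: unexpectedlyUnpinned},
	}
	var queue []string
	touched := recoverAll(pinset, func(cid string) { queue = append(queue, cid) })
	fmt.Println("enqueued:", queue, "returned:", len(touched))
}
```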
This must have been an oversight. We added a special unexpectedly_unpinned
status so that we could just return things from the operation tracker when
filtering by pin_error. unexpectedly_unpinned items are things that we have no
operation for but are unpinned on IPFS.
Status, however, still returned a pin_error state for these, even though
StatusAll would not show them when filtering with pin_error, and would show
them as unexpectedly_unpinned otherwise.
Since Recover correctly repins pin_error and unexpectedly_unpinned, this
change has no further consequences.
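
The rule that Status now shares with StatusAll can be sketched roughly as
follows. `resolveStatus` and its parameters are hypothetical; the real tracker
works with richer types.

```go
package main

import "fmt"

// resolveStatus decides the reported state for an item in the pinset.
// If the operation tracker has an operation, its status wins. Otherwise
// the answer depends only on whether IPFS has the item pinned.
func resolveStatus(opStatus string, hasOp bool, pinnedOnIPFS bool) string {
	if hasOp {
		return opStatus
	}
	if pinnedOnIPFS {
		return "pinned"
	}
	// No operation and not pinned on IPFS: this is not a pin_error.
	return "unexpectedly_unpinned"
}

func main() {
	fmt.Println(resolveStatus("pin_error", true, false))
	fmt.Println(resolveStatus("", false, true))
	fmt.Println(resolveStatus("", false, false))
}
```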