This new component broadcasts metrics about the current size of the pinqueue,
which can in turn be used to inform allocations.
It has a weight_bucket_size option that divides the actual queue size by the
given factor, so that peers with similar queue sizes end up with the same
weight.
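As a rough sketch of the bucketing (the helper name and package here are hypothetical, not the actual implementation), the weight is just an integer division of the queue size:

```go
package example

// pinQueueWeight shows how weight_bucket_size groups queue sizes:
// with weight_bucket_size = 100, sizes 0-99 map to weight 0,
// 100-199 to weight 1, and so on.
func pinQueueWeight(queueSize, weightBucketSize int64) int64 {
	if weightBucketSize <= 0 {
		return queueSize // no bucketing configured
	}
	return queueSize / weightBucketSize
}
```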
Additionally, some changes have been made to the balanced allocator so that a
combination of tags, pinqueue sizes and free space can be used. When
allocating by [<tag>, pinqueue, freespace], the allocator will prioritize
peers with the smallest pin queue weight and, among those with the same
weight, allocate based on freespace.
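A minimal sketch of the resulting ordering, leaving tags aside (the real allocator partitions by tag first) and using made-up field names for the broadcast metrics:

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is a hypothetical view of the metrics a peer broadcasts.
type candidate struct {
	peer           string
	pinQueueWeight int64  // bucketed pinqueue size: smaller is better
	freeSpace      uint64 // larger is better
}

func main() {
	peers := []candidate{{"A", 2, 500}, {"B", 0, 100}, {"C", 0, 900}}
	// Smallest pin queue weight first; ties broken by free space.
	sort.Slice(peers, func(i, j int) bool {
		if peers[i].pinQueueWeight != peers[j].pinQueueWeight {
			return peers[i].pinQueueWeight < peers[j].pinQueueWeight
		}
		return peers[i].freeSpace > peers[j].freeSpace
	})
	fmt.Println(peers) // [{C 0 900} {B 0 100} {A 2 500}]
}
```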
* Gauge for total number of ipfs pins
* Counter for pin/add
* Counter for pin/add errors
* Counter for Block/Puts
* Counter for blocks added
* Counter for block/added size
* Counter for block/added errors
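These are exported via OpenCensus; purely as illustration (package, measure names and units are made up, the real declarations differ), counters of this kind are declared along these lines:

```go
package example

import (
	"go.opencensus.io/stats"
	"go.opencensus.io/stats/view"
)

// Hypothetical measures for illustration only.
var (
	pinAdds    = stats.Int64("pin/adds", "Number of pin/add calls", stats.UnitDimensionless)
	blocksSize = stats.Int64("blocks/added_size", "Size of added blocks", stats.UnitBytes)
)

func registerViews() error {
	return view.Register(
		// Counter: number of pin/add calls.
		&view.View{Measure: pinAdds, Aggregation: view.Count()},
		// Counter: total bytes of blocks added.
		&view.View{Measure: blocksSize, Aggregation: view.Sum()},
	)
}
```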
We were tracking this metric as a counter (SUM). The number is good, but
conceptually this is more of a gauge (LastValue), given that it can go down.
Thus we switch it by tracking the aggregation numbers directly in the
operation tracker.
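In OpenCensus terms that means switching the view aggregation and recording the current total directly; a hedged sketch with hypothetical names, not the actual tracker code:

```go
package example

import (
	"context"

	"go.opencensus.io/stats"
	"go.opencensus.io/stats/view"
)

// pinCount is a hypothetical measure for the number of tracked pins.
var pinCount = stats.Int64("pins/count", "Number of tracked pins", stats.UnitDimensionless)

func init() {
	// Previously the view aggregated with view.Sum() (counter semantics);
	// as a gauge it exports the last recorded value instead.
	_ = view.Register(&view.View{
		Measure:     pinCount,
		Aggregation: view.LastValue(),
	})
}

// recordPinTotal would be called from the operation tracker whenever the
// totals change, so the exported gauge reflects the current value.
func recordPinTotal(ctx context.Context, total int64) {
	stats.Record(ctx, pinCount.M(total))
}
```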
Adding keeps retrying indefinitely when sending blocks to IPFS. When IPFS is
down, this:
* Stores all errors
* Keeps retrying
* Additionally, never closes response bodies
This resulted in memory leaks. The new behaviour does not keep retrying, and
response bodies are closed after reading.
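A minimal sketch of the fixed pattern (the helper below is illustrative, not the actual client code): perform a single attempt, surface the error instead of looping, and always drain and close the body so connections can be reused and memory is not retained.

```go
package example

import (
	"fmt"
	"io"
	"net/http"
)

// postBlock is a hypothetical helper: one attempt, no endless retries,
// and the response body is always drained and closed.
func postBlock(c *http.Client, req *http.Request) error {
	resp, err := c.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	// Drain the body so the underlying connection can be reused.
	if _, err := io.Copy(io.Discard, resp.Body); err != nil {
		return err
	}
	if resp.StatusCode >= 400 {
		return fmt.Errorf("ipfs returned status %d", resp.StatusCode)
	}
	return nil
}
```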
When batching is enabled, the "batchingstate" is used to add/remove pins, but the
non-batching state is used as read-only state for doing List().
This means that both states will be writing pins metrics but the batching
state is never used for List(), so it never has the right total number.
This fixes that.
Because we are adding blocks in a single call, and we choose the format
parameter based on the prefix of the first block, IPFS will return block CIDs
based on that option.
This caused warnings when adding content that has multiple CID prefixes: for
example, any cid-version=1 file will include both dag-pb CIDs and raw
CIDs. Since the first block is usually a leaf, IPFS will only return
raw CIDs, causing a warning because of the CID mismatch.
This fixes things by comparing multihashes only.
But! We might be writing blocks with the wrong CID and then the good CID won't
work!
Correct, we might, in some corner cases.
In go-ipfs >= 0.12.0, all blocks are addressed by multihash so CID prefixes
are irrelevant. This problem does not exist in that case.
In go-ipfs < 0.12.0, if a read for a CIDv1 DAG-PB block fails, it is retried
as if it were raw. This means that if we wrote something with cidv1/format=raw
that should have been cidv1/format=dag-pb, the read will still work. That covers
some common cases (i.e. adding with cid-version=1) because the first block
should be a raw-leaf. Default-params (cidv0) is not affected since everything
is raw multihashes. However, there are still possible CAR layouts etc. where
cluster will write blocks wrongly to older IPFS versions.
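A sketch of what comparing by multihash amounts to with go-cid (the package and function names are made up):

```go
package example

import (
	"bytes"

	cid "github.com/ipfs/go-cid"
)

// sameBlock is a hypothetical check: two CIDs refer to the same block
// if their multihashes match, regardless of codec or CID version.
func sameBlock(expected, returned cid.Cid) bool {
	return bytes.Equal(expected.Hash(), returned.Hash())
}
```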
This commit introduces an api.Cid type and replaces the usage of cid.Cid
everywhere.
The main motivation here is to override MarshalJSON so that Cids are
JSON-ified as '"Qm...."' instead of '{ "/": "Qm....." }', as this "ipld"
representation of IDs is horrible to work with, and our APIs are not issuing
IPLD objects to start with.
Unfortunately, there is no way to do this cleanly, and the best way is to just
switch everything to our own type.
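Conceptually, the wrapper looks something like this (a sketch, not the exact code):

```go
package api

import (
	"encoding/json"

	cid "github.com/ipfs/go-cid"
)

// Cid wraps cid.Cid only to control its JSON representation.
type Cid struct {
	cid.Cid
}

// MarshalJSON emits the plain string form ("Qm...") instead of the
// IPLD-style {"/": "Qm..."} object.
func (c Cid) MarshalJSON() ([]byte, error) {
	return json.Marshal(c.String())
}

// UnmarshalJSON parses the plain string form back into a Cid.
func (c *Cid) UnmarshalJSON(b []byte) error {
	var s string
	if err := json.Unmarshal(b, &s); err != nil {
		return err
	}
	dec, err := cid.Decode(s)
	if err != nil {
		return err
	}
	c.Cid = dec
	return nil
}
```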
The ipfsadder actually ends up DAG-putting some nodes several times
(e.g. non-leaf nodes, the root)... but usually one right after the other. This
prevents that and avoids sending the same data multiple times over the wire
(not a good thing to send a small payload 3x because of this).
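Since those duplicate puts come one right after the other, a guard of roughly this shape is enough; this is a hedged sketch with invented names, not the actual ipfsadder code:

```go
package example

import (
	"context"

	cid "github.com/ipfs/go-cid"
	ipld "github.com/ipfs/go-ipld-format"
)

// dedupAdder is a hypothetical wrapper that remembers the last block it
// sent and skips an immediate repeat, avoiding back-to-back duplicate puts.
type dedupAdder struct {
	dest     ipld.NodeAdder
	lastSent cid.Cid
}

func (a *dedupAdder) Add(ctx context.Context, n ipld.Node) error {
	if a.lastSent.Defined() && a.lastSent.Equals(n.Cid()) {
		return nil // just sent this exact block, skip it
	}
	if err := a.dest.Add(ctx, n); err != nil {
		return err
	}
	a.lastSent = n.Cid()
	return nil
}
```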