This allows to specifically request status for several CIDs as
provided in the "cids" query parameter, instead of request status for
all CIDs.
In this case, the filter is ignored.
Fixes#1554
Fixes: peer names unset for remote peers
This adds an IPFS field to pin status information (PinInfoShort).
It has not been easy to add this, given that the IPFS ID is something that
comes from outside of cluster (unlike the peer name). After several tries I
have settled in the following things:
- Use the ping metric to send out peer names and IPFS IDs to the peers in the
cluster.
- Cache the latest known IPFS ID (if IPFS dies we should still be setting
the ID).
- Provide an RPC method for the Pintracker to obtain IPFS ID from the cache.
- Given we now know information for peernames and IPFS IDs from other peers,
we can use that information even if the requests to them error or we are not
contacting (i.e. peers allocated as remote are not queried for status). We can
use the information from the last received ping metric.
- This means we should keep metrics around even if peers go away, at least for
a while rather than deleting them as soon as we detect that they have expired.
Puting it all together we now have a system to gossip peer information around on top
of the ping metrics.
We call RecoverAll regularly and I noticed it was way slower than it should be.
After all, it should just loop the pinset and enqueued items that are
unexpectedly unpinned or in pin error.
However, at some point we decided that RecoverAll would return information for
all pins, regardless of whether they were recovered or not. This ends up
resulting in a separate Status call for every pin that is already pinned, and
this call hits IPFS. This is pretty bad with big pinsets.
This commit fixes that, we return no state information for pins that are not
touched.
This adds a Timestamp field to the pin objects. This allows to track when they were pinned.
This:
* Allows the pin-tracker to actually show accurate information on when the pin
entered the system for pins that are not part of ongoing operations
(currently it shows time.Now())
* Adds support for reporting timestamp on a pinning services api.
On large pinsets this may take a very long time and prevents metrics and
re-boostrapping from starting, among other things. See bug description.
This lets watchPinset trigger an immediate RecoverAllLocal instead, but this
happens in its own goroutine and should allow everything else to start.
This commit modifies the pintracker StatusAll call to take a status filter.
This allows to skip a PinLs call to ipfs when checking status for items that
are queued, pinning, unpinning or in error. Those status come directly from
the operation tracker. This should result in a significant performance
increase for those calls, particularly in nodes with several hundred thousand
pins and more, where the call to IPFS is very expensive.
A new TrackerStatusUnexpectedlyUnpinned status has been introduce to
differentiate between pin errors (tracked by the operation tracker) and "lost"
items (which before were pin errors too). This new status is handled by the
Recover() operation as before.
StatusCID() and Peers() are calls that should return relatively quickly.
If they don't, it means they are hanging for some reason. We cannot let the
whole Multicall request hang waiting on a single peer, therefore, set a
hardcoded 15 second deadline for both.
The Allocations of a pin that has been added with default replication factor
are kept even when the replication factor turns out to be -1.
This resulted in the Status(cid) code skipping calls to a number of peers
and setting the pin directly as REMOTE.
The fix, on one side makes sure Allocations is always nil when the replication
factor is -1. On the other size, lets the globalPinInfoCid method check the
replication factor value, rather than the number of allocations to decide if
any nodes are bound to be remote.
On the plus side, the pin tracker used the IsRemotePin method, which uses the
replication factor, so things were pinned even if the Status(cid) method shows
them as remote.
* feat: make MDNS failure on start non-fatal
- if discovery.NewMdnsService errors on start, show warning "error message, MDNS service will be disabled"
- same as setting MDNSInterval to 0: NewCluster is still created and daemon runs, but without MDNS
GlobalPinInfo objects carried redundant information (Cid, Peer) that takes
space and time to serialize.
This has been addressed by having GlobalPinInfo embed PinInfoShort rather than
PinInfo. This new types ommits redundant fields.
We cannot rely on current pin allocations anymore since the allocations
may be follower peers that do not even handle alerts.
Instead, we assume that we are a trusted peer (because we are not in follower
mode), get all other trusted peers and act if we are the XOR-closest to the
CID.
This also means that replication-factor = 1 pins can be recovered too.
i.e. a direct pin can be repinned as recursive, but a recursive
pin cannot be pinned as direct (this fails in IPFS too).
Additionally, save a couple of calls to the datastore by obtaining the
existing pin only once.
Add necessary validators.
There is currently no way to run a dynamic-mode DHT that supports
LAN-based peers. Therefore for the time-being we are setting the
dht.Mode to "server".
* Libp2p protectors no longer needed, use PSK directly
* Generate cluster 32-byte secret here (helper gone from pnet)
* Switch to go-log/v2 in all places
* DHT bootstrapping not needed. Adjust DHT options for tests.
* Do not rely on dissappeared CidToDsKey and DsKeyToCid functions fro dshelp.
* Disable QUIC (does not support private networks)
* Fix tests: autodiscovery started working properly
StateSync() used to take care of this by issuing Track() calls. But this
functionality was removed.
This starts returning items that are in the state but not on IPFS as
PIN_ERRORs. It ensures that the Recover methods see them so that they can
trigger repinnings for missing items. This covers cases where the user
modifies the ipfs state manually, or resets the ipfs daemon but keeps the
cluster state, and cases where cluster was stopped half-way through a pinning.
This removes mappintracker and sets stateless tracker as the default (and only) pintracker component.
Because the stateless tracker matches the cluster state with only ongoing operations being kept on memory, and additional information provided by ipfs-pin-ls, syncing operations are not necessary. Therefore the Sync/SyncAll operations are removed cluster-wide.
This introduces the possiblity of running Cluster with multiple informer components. The first one in the list is the used for "allocations". This is just groundwork for working with several informers in the future.
This adds a PeerAddresses entry to the main cluster configuration.
The peer will ingest and potentially connect to those peer addresses during
the start (similarly to the ones in the peerstore).
This allows to provide "bootstrap" (as in "peers we connect to") addresses
directly in the configuration, which is useful when distributing a single
configuration template that will allow a cluster peer to know where to connect
on the first boot.