Commit Graph

114 Commits

Author SHA1 Message Date
Hector Sanjuan
c871c85f98 stateless: cast empty peer.ID correctly 2022-03-11 16:41:22 +01:00
Hector Sanjuan
cd5f9c869d pinsvcapi: Test and fix bugs in Add Pin. Improve performance. 2022-03-11 14:01:15 +01:00
Hector Sanjuan
fe4c0f61c9 stateless: fix test mock rpc 2022-03-11 09:58:19 +01:00
Hector Sanjuan
5fed4a2c5e types: include IPFSAddresses in pinInfo objects.
pinsvcapi: do not cache peer information here as all the needed information is
in the status objects.

This adds ipfs_addresses as a field broadcasted with the ping metrics.
2022-03-10 23:49:01 +01:00
Hector Sanjuan
0787ffbe36 PinInfo type: include Allocations, Origins, Metadata
This will facilitate building outputs for the Pinning Services API, saving a
round trip to query the cluster State, since all the needed information
already comes from the PinTracker, which has already accessed the state.

Since the pintracker already included a state attribute (Name), we are simply
going down that path.
2022-02-02 00:52:38 +01:00
Hector Sanjuan
60c6b16ac6 pintracker: Remove unnecessary locking 2022-01-31 21:04:11 +01:00
Hector Sanjuan
5e89c0ba41 Pintracker: set Name in operation tracker. Fixes #1212. 2022-01-31 21:04:11 +01:00
Hector Sanjuan
809b7fbda5 Pintracker: add IPFS ID to Pin Information
Fixes #1554
Fixes: peer names unset for remote peers

This adds an IPFS field to pin status information (PinInfoShort).

It has not been easy to add this, given that the IPFS ID is something that
comes from outside of cluster (unlike the peer name). After several tries I
have settled in the following things:

- Use the ping metric to send out peer names and IPFS IDs to the peers in the
  cluster.
- Cache the latest known IPFS ID (if IPFS dies we should still be setting
  the ID).
- Provide an RPC method for the Pintracker to obtain IPFS ID from the cache.
- Given we now know information for peernames and IPFS IDs from other peers,
  we can use that information even if the requests to them error or we are not
  contacting (i.e. peers allocated as remote are not queried for status). We can
  use the information from the last received ping metric.
- This means we should keep metrics around even if peers go away, at least for
  a while rather than deleting them as soon as we detect that they have expired.

Puting it all together we now have a system to gossip peer information around on top
of the ping metrics.
2022-01-31 17:53:09 +01:00
Hector Sanjuan
af0cf8b106
Merge pull request #1538 from ipfs/pintracker/recover-speedup
pintracker: RecoverAll should only return status for recovered items
2022-01-11 17:09:20 +01:00
Hector Sanjuan
592ce450ce pintracker: RecoverAll should only return status for recovered items
We call RecoverAll regularly and I noticed it was way slower than it should be.

After all, it should just loop the pinset and enqueued items that are
unexpectedly unpinned or in pin error.

However, at some point we decided that RecoverAll would return information for
all pins, regardless of whether they were recovered or not. This ends up
resulting in a separate Status call for every pin that is already pinned, and
this call hits IPFS. This is pretty bad with big pinsets.

This commit fixes that, we return no state information for pins that are not
touched.
2022-01-11 16:22:03 +01:00
Hector Sanjuan
789b633366 pintracker: set status unexpectedly_unpinned correctly on Status()
This must has been an oversight. We added a special unexpectedly_unpinned
status so that we could just return things from the operation tracker when
filtering by pin_error. unexpectedly_unpinned are things that we have no
operation for but are unpinned on ipfs.

Status however still returned a pin_error state for these, even though,
StatusAll would not show them when filtering with pin_error, and would show
them as unexpectedly_unpinned otherwise.

Since Recover correctly repins pin_error and unexpectedly_unpinned, this
change has no further consequences.
2022-01-10 13:04:21 +01:00
Hector Sanjuan
67eeb45798 pintracker: clean exit during recover
Shutting down the cluster while a recover operation is ongoing resulted in
each remaining pin in the recover loop producing an error in the logs, as the
loop kept going even though compontents were already shutdown.

With 8 million items, this meant a lot of log messages, and a shutdown delay
that forced the killing of the process in most cases.

The recover loop now exits when the component's context is cancelled.
2021-12-17 11:47:50 +01:00
Hector Sanjuan
33343751b0 Stateless: add tests to make sure AttemptCounts are set correctly 2021-11-30 04:43:16 +01:00
Hector Sanjuan
c4ca5b7abe pintracker: carry over attempt account only for operations of the same type 2021-11-30 04:20:35 +01:00
Hector Sanjuan
7cf40de354 pintracker: Fix attempt count not increasing. Expose priority queueing.
"RetryCount" has been renamed to "AttemptCount", because it counts attempts.
2021-11-30 04:20:35 +01:00
Hector Sanjuan
be5c2d1569 pintracker: PriorityPinMaxAge and PriorityPinMaxRetries
These new configuration settings control whether pins are enqueued in the
priority queue or not.
2021-11-30 04:20:35 +01:00
Hector Sanjuan
0a146dae76 pintracker: support a priority channel for pinning 2021-11-30 04:20:35 +01:00
Hector Sanjuan
29c277b67f Pintracker: add and track retry counts in the operation manager.
Report retry count in the PinStatus
2021-11-30 04:20:35 +01:00
Hector Sanjuan
e9857652f2 Add a timestamp to Pins
This adds a Timestamp field to the pin objects. This allows to track when they were pinned.

This:

* Allows the pin-tracker to actually show accurate information on when the pin
  entered the system for pins that are not part of ongoing operations
  (currently it shows time.Now())
* Adds support for reporting timestamp on a pinning services api.
2021-10-20 16:55:57 +02:00
Hector Sanjuan
27569bdf88 pintracker: avoid listing the state unless necessary
This is a follow up to #1360 which further optimizes StatusAll calls by
avoiding listing and filtering the cluster state when requesting status for
operations that should be direclty in the operation tracker because they are
ongoing (queued, pinning, unpinning, error).
2021-07-08 01:01:25 +02:00
Hector Sanjuan
edfcfa3fb0 Fix #1360: Efficient pinset status with filters
This commit modifies the pintracker StatusAll call to take a status filter.

This allows to skip a PinLs call to ipfs when checking status for items that
are queued, pinning, unpinning or in error. Those status come directly from
the operation tracker. This should result in a significant performance
increase for those calls, particularly in nodes with several hundred thousand
pins and more, where the call to IPFS is very expensive.

A new TrackerStatusUnexpectedlyUnpinned status has been introduce to
differentiate between pin errors (tracked by the operation tracker) and "lost"
items (which before were pin errors too). This new status is handled by the
Recover() operation as before.
2021-07-06 11:34:19 +02:00
Hector Sanjuan
88cfcf62fc
Pintracker: use cid.Cid as map keys. (#1322)
* No reason not to.
* Should save a non trivial amount of time when doing
"Status" operation by not having to stringify Cids.
2021-03-16 15:47:44 +01:00
Hector Sanjuan
e967238848
Merge pull request #1129 from ipfs/fix/1013-follow-list
Improvements to ipfs-cluster-follow * list
2020-05-16 02:31:06 +02:00
Hector Sanjuan
c026299b95 Include Name as GlobalPinInfo key and consolidate redundant keys
GlobalPinInfo objects carried redundant information (Cid, Peer) that takes
space and time to serialize.

This has been addressed by having GlobalPinInfo embed PinInfoShort rather than
PinInfo. This new types ommits redundant fields.
2020-05-16 02:27:24 +02:00
Kishan Mohanbhai Sagathiya
ae8e74453b Fix #937: Print full working configuration at startup
Only when using debug mode

Co-authored-by: Hector Sanjuan <code@hector.link>
2020-05-15 01:33:04 +02:00
Hector Sanjuan
7e9cece29c Include pin Names in PinInfo objects
Fixes #1013 by avoiding to have to make a specific request to allocations.
2020-05-15 00:18:14 +02:00
Hector Sanjuan
fa762d78cf Improve PinLsCid to check for pinned items
Receive the full pin object so that it can decide whether to check
for recursive or direct pins directly.

Additionally, unpin will not check for the pin presence anymore and
simply trigger unpins (ignoring errors)
2020-04-21 17:23:55 +02:00
Hector Sanjuan
717ed85823 gofmt -s fixes 2020-04-14 23:44:18 +02:00
Hector Sanjuan
f83ff9b655 staticcheck: fix all staticcheck warnings in the project 2020-04-14 20:16:10 +02:00
Hector Sanjuan
b3853caf36 Dependency ugprade: changes needed
* Libp2p protectors no longer needed, use PSK directly
* Generate cluster 32-byte secret here (helper gone from pnet)
* Switch to go-log/v2 in all places
* DHT bootstrapping not needed. Adjust DHT options for tests.
* Do not rely on dissappeared CidToDsKey and DsKeyToCid functions fro dshelp.
* Disable QUIC (does not support private networks)
* Fix tests: autodiscovery started working properly
2020-03-22 14:50:25 +01:00
Hector Sanjuan
09d933fde1 pintracker: take care of tests
Simplify the tests, remove things that are not used at all, align the
behaviour of the mocks, add methods to test the correct behaviour of Status
etc.
2019-12-13 12:03:01 +01:00
Hector Sanjuan
8b6fd1fabe Fix: stateless: cluster should pin items that are in the state but not on ipfs
StateSync() used to take care of this by issuing Track() calls. But this
functionality was removed.

This starts returning items that are in the state but not on IPFS as
PIN_ERRORs. It ensures that the Recover methods see them so that they can
trigger repinnings for missing items. This covers cases where the user
modifies the ipfs state manually, or resets the ipfs daemon but keeps the
cluster state, and cases where cluster was stopped half-way through a pinning.
2019-12-13 12:00:03 +01:00
Kishan Sagathiya
5258a4d428 Remove map pintracker (#944)
This removes mappintracker and sets stateless tracker as the default (and only) pintracker component.

Because the stateless tracker matches the cluster state with only ongoing operations being kept on memory, and additional information provided by ipfs-pin-ls, syncing operations are not necessary. Therefore the Sync/SyncAll operations are removed cluster-wide.
2019-12-12 21:22:54 +01:00
Kishan Mohanbhai Sagathiya
19cde2e8cf Error queue is full for stateless pintracker
- increase max pin queue size to 1 million
- hide max_pin_queue_size from configuration
2019-09-11 12:53:51 +07:00
Hector Sanjuan
c2b28be6de
Merge pull request #901 from ipfs/fix/pin-queue-full
Error queue is full
2019-09-06 15:05:53 +02:00
Kishan Mohanbhai Sagathiya
2d9e6c1de8 Error queue is full
- abort if a Track() calls fails due to queue being full
- increase max pin queue size to 1 million
- hind max_pin_queue_size from configuration
- use an elaborated error message

Fixes #377
2019-08-26 13:23:02 +05:30
Hector Sanjuan
62b7054d31 Fix: pintrackers: Do not spam the logs when running recover
Currently logs every pin we call recover with. We call recover regularly.
So it will print all pins.
2019-08-14 14:10:44 +02:00
Hector Sanjuan
b804e61ef0 Update deps along with go-libp2p-core refactor
Lots of rewrites in imports...
2019-06-14 13:10:45 +02:00
Hector Sanjuan
d51c2a0377 Merge branch 'master' into feat/monitor-ring 2019-05-16 15:46:30 +02:00
Adrian Lanzafame
f1afce7644
add String method for Operation and OperationTracker types
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-05-07 19:09:05 +10:00
Adrian Lanzafame
8748c45600
go:generate stringer phase and operationtype
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-05-07 12:24:28 +10:00
Adrian Lanzafame
43fb2cf857
fix typo in comment
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-05-07 12:01:20 +10:00
Hector Sanjuan
3d49ac26a5 Feat: Split components into RPC Services
I had thought of this for a very long time but there were no compelling
reasons to do it. Specifying RPC endpoint permissions becomes however
significantly nicer if each Component is a different RPC Service. This also
fixes some naming issues like having to prefix methods with the component name
to separate them from methods named in the same way in some other component
(Pin and IPFSPin).
2019-05-04 21:36:10 +01:00
Adrian Lanzafame
c4b76619c1
Add failure_threshold monitors config
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-04-18 16:14:13 +10:00
Hector Sanjuan
acbd7fda60 Consensus: add new "crdt" consensus component
This adds a new "crdt" consensus component using go-ds-crdt.

This implies several refactors to fully make cluster consensus-component
independent:

* Delete mapstate and fully adopt dsstate (after people have migrated).
* Return errors from state methods rather than ignoring them.
* Add a new "datastore" modules so that we can configure datastores in the
   main configuration like other components.
* Let the consensus components fully define the "state.State". Thus, they do
not receive the state, they receive the storage where we put the state (a
go-datastore).
* Allow to customize how the monitor component obtains Peers() (the current
  peerset), including avoiding using the current peerset. At the moment the
  crdt consensus uses the monitoring component to define the current peerset.
  Therefore the monitor component cannot rely on the consensus component to
  produce a peerset.
* Re-factor/re-implementation of "ipfs-cluster-service state"
  operations. Includes the dissapearance of the "migrate" one.

The CRDT consensus component defines creates a crdt-datastore (with ipfs-lite)
and uses it to intitialize a dssate. Thus the crdt-store is elegantly
wrapped. Any modifications to the state get automatically replicated to other
peers. We store all the CRDT DAG blocks in the local datastore.

The consensus components only expose a ReadOnly state, as any modifications to
the shared state should happen through them.

DHT and PubSub facilities must now be created outside of Cluster and passed in
so they can be re-used by different components.
2019-04-17 19:14:26 +02:00
Hector Sanjuan
ea85cf7805 Rename "test.Test*" to "test.*" (test.TestCid1 -> test.Cid1)
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2019-02-27 20:19:10 +00:00
Hector Sanjuan
9df6344a07 Avoid using string testing CIDs and use cid.Cids directly
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2019-02-27 20:09:31 +00:00
Hector Sanjuan
6447ea51d2 Remove *Serial types. Use pointers for all types.
This takes advantange of the latest features in go-cid, peer.ID and
go-multiaddr and makes the Go types serializable by default.

This means we no longer need to copy between Pin <-> PinSerial, or ID <->
IDSerial etc. We can now efficiently binary-encode these types using short
field keys and without parsing/stringifying (in many cases it just a cast).

We still get the same json output as before (with minor modifications for
Cids).

This should greatly improve Cluster performance and memory usage when dealing
with large collections of items.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2019-02-27 17:04:35 +00:00
Robert Ignat
50844b9e5b Add ApplyEnvVars test to stateless config
License: MIT
Signed-off-by: Robert Ignat <robert.ignat91@gmail.com>
2019-02-18 17:51:04 +02:00
Robert Ignat
368f1de6bc Add ApplyEnvVars test to maptracker config
License: MIT
Signed-off-by: Robert Ignat <robert.ignat91@gmail.com>
2019-02-18 17:50:17 +02:00