Commit Graph

116 Commits

Author SHA1 Message Date
Hector Sanjuan
508791b547 Migrate from ipfs/ipfs-cluster to ipfs-cluster/ipfs-cluster
This performs the necessary renamings.
2022-06-16 17:43:30 +02:00
Hector Sanjuan
4daece2b98 Feat: add a new "pinqueue" informer component
This new component broadcasts metrics about the current size of the pinqueue,
which can in turn be used to inform allocations.

It has a weight_bucket_size option that serves to divide the actual size by a
given factor. This allows considering peers with similar queue sizes to have
the same weight.

Additionally, some changes have been made to the balanced allocator so that a
combination of tags, pinqueue sizes and free-spaces can be used. When
allocating by [<tag>, pinqueue, freespace], the allocator will prioritize
choosing peers with the smallest pin queue weight first, and of those with the
same weight, it will allocate based on freespace.
2022-06-16 17:43:29 +02:00
Hector Sanjuan
a97ed10d0b Adopt api.Cid type - replaces cid.Cid everwhere.
This commit introduces an api.Cid type and replaces the usage of cid.Cid
everywhere.

The main motivation here is to override MarshalJSON so that Cids are
JSON-ified as '"Qm...."' instead of '{ "/": "Qm....." }', as this "ipld"
representation of IDs is horrible to work with, and our APIs are not issuing
IPLD objects to start with.

Unfortunately, there is no way to do this cleanly, and the best way is to just
switch everything to our own type.
2022-04-07 14:27:39 +02:00
Hector Sanjuan
f07c1e6552 fix: BlockStream rpc: only cancel response context at the end 2022-03-28 17:55:31 +02:00
Hector Sanjuan
1d98538411 Adders: stream blocks to destinations
This commit fixes #810 and adds block streaming to the final destinations when
adding. This should add major performance gains when adding data to clusters.

Before, everytime cluster issued a block, it was broadcasted individually to
all destinations (new libp2p stream), where it was block/put to IPFS (a single
block/put http roundtrip per block).

Now, blocks are streamed all the way from the adder module to the ipfs daemon,
by making every block as it arrives a single part in a multipart block/put
request.

Before, block-broadcast needed to wait for all destinations to finish in order
to process the next block. Now, buffers allow some destinations to be faster
than others while sending and receiving blocks.

Before, if a block put request failed to be broadcasted everywhere, an error
would happen at that moment.

Now, we keep streaming until the end and only then report any errors. The
operation succeeds as long as at least one stream finished successfully.

Errors block/putting to IPFS will not abort streams. Instead, subsequent
blocks are retried with a new request, although the method will return an
error when the stream finishes if there were errors at any point.
2022-03-24 17:24:58 +01:00
Hector Sanjuan
eee53bfa4f Streaming Peers(): make Peers() a streaming call
This commit makes all the changes to make Peers() a streaming call.

While Peers is usually a non problematic call, for consistency, all calls
returning collections assembled through broadcast to cluster peers are now
streaming calls.
2022-03-23 01:27:57 +01:00
Hector Sanjuan
0d73d33ef5 Pintracker: streaming methods
This commit continues the work of taking advantage of the streaming
capabilities in go-libp2p-gorpc by improving the ipfsconnector and pintracker
components.

StatusAll and RecoverAll methods are now streaming methods, with the REST API
output changing accordingly to produce a stream of GlobalPinInfos rather than
a json array.

pin/ls request to the ipfs daemon now use ?stream=true and avoid having to
load the full pinset map on memory. StatusAllLocal and RecoverAllLocal
requests to the pin tracker stream all the way and no longer store the full
pinset, and the full PinInfo status slice before sending it out.

We have additionally switched to a pattern where streaming methods receive the
channel as an argument, allowing the caller to decide on whether to launch a
goroutine, do buffering etc.
2022-03-22 15:38:01 +01:00
Hector Sanjuan
9b9d76f92d Pinset streaming and method type revamp
This commit introduces the new go-libp2p-gorpc streaming capabilities for
Cluster. The main aim is to work towards heavily reducing memory usage when
working with very large pinsets.

As a side-effect, it takes the chance to revampt all types for all public
methods so that pointers to static what should be static objects are not used
anymore. This should heavily reduce heap allocations and GC activity.

The main change is that state.List now returns a channel from which to read
the pins, rather than pins being all loaded into a huge slice.

Things reading pins have been all updated to iterate on the channel rather
than on the slice. The full pinset is no longer fully loaded onto memory for
things that run regularly like StateSync().

Additionally, the /allocations endpoint of the rest API no longer returns an
array of pins, but rather streams json-encoded pin objects directly. This
change has extended to the restapi client (which puts pins into a channel as
they arrive) and to ipfs-cluster-ctl.

There are still pending improvements like StatusAll() calls which should also
stream responses, and specially BlockPut calls which should stream blocks
directly into IPFS on a single call.

These are coming up in future commits.
2022-03-19 03:02:55 +01:00
Hector Sanjuan
cd5f9c869d pinsvcapi: Test and fix bugs in Add Pin. Improve performance. 2022-03-11 14:01:15 +01:00
Hector Sanjuan
5fed4a2c5e types: include IPFSAddresses in pinInfo objects.
pinsvcapi: do not cache peer information here as all the needed information is
in the status objects.

This adds ipfs_addresses as a field broadcasted with the ping metrics.
2022-03-10 23:49:01 +01:00
Hector Sanjuan
5b0d9d68e3 Merge branch 'master' into feat/pinning-api 2022-03-10 13:41:54 +01:00
Hector Sanjuan
00dffe23b8 Adder: Add "no-pin" option.
This does 3 things:

- Add a NoPin option to the adder. When set to true, the adding process does not
send a pin in the end.

- When user-allocations are set and local=true happens, we do not overwrite
  the allocations returned by the allocator to include the local peer
  anymore, as this could alter user-allocations.

- Some code improvement (remove pointers).
2022-02-28 20:10:12 +01:00
Hector Sanjuan
d4073f9cfa cluster: add PeersWithFilter option that only requests info for certain peer IDs
(currently will be unused)
2022-02-02 00:43:00 +01:00
Hector Sanjuan
2c204968b8
Merge pull request #1559 from ipfs/fix/repo-stat-hammering
Fix: repo/stat gets hammered on busy cluster peers
2022-02-01 14:36:27 +01:00
Hector Sanjuan
acde3f16d0 Fix: repo/stat gets hammered on busy cluster peers
Given that every pin and block/put writes something to IPFS and thus increases
the repo size, a while ago we added a check to let the IPFS connector directly
trigger the sending of metrics every 10 of such requests. This was meant to
update the metrics more often so that balancing happened more granularly
(particularly the freespace one).

In practice, on a cluster that receives several hundreds of pin/adds
operations in a few seconds, this is just bad.

So:

* We disable by default the whole thing.
* We add a new InformerTriggerInterval configuration option to enable the thing.
* Fix a bug that made this always call the first informer, which may not
  have been the freespace one).
2022-02-01 01:34:17 +01:00
Hector Sanjuan
809b7fbda5 Pintracker: add IPFS ID to Pin Information
Fixes #1554
Fixes: peer names unset for remote peers

This adds an IPFS field to pin status information (PinInfoShort).

It has not been easy to add this, given that the IPFS ID is something that
comes from outside of cluster (unlike the peer name). After several tries I
have settled in the following things:

- Use the ping metric to send out peer names and IPFS IDs to the peers in the
  cluster.
- Cache the latest known IPFS ID (if IPFS dies we should still be setting
  the ID).
- Provide an RPC method for the Pintracker to obtain IPFS ID from the cache.
- Given we now know information for peernames and IPFS IDs from other peers,
  we can use that information even if the requests to them error or we are not
  contacting (i.e. peers allocated as remote are not queried for status). We can
  use the information from the last received ping metric.
- This means we should keep metrics around even if peers go away, at least for
  a while rather than deleting them as soon as we detect that they have expired.

Puting it all together we now have a system to gossip peer information around on top
of the ping metrics.
2022-01-31 17:53:09 +01:00
Hector Sanjuan
ea5e18078c Informers: GetMetric() -> GetMetrics()
Support returning multiple metrics per informer.
2021-09-15 20:07:37 +02:00
Hector Sanjuan
edfcfa3fb0 Fix #1360: Efficient pinset status with filters
This commit modifies the pintracker StatusAll call to take a status filter.

This allows to skip a PinLs call to ipfs when checking status for items that
are queued, pinning, unpinning or in error. Those status come directly from
the operation tracker. This should result in a significant performance
increase for those calls, particularly in nodes with several hundred thousand
pins and more, where the call to IPFS is very expensive.

A new TrackerStatusUnexpectedlyUnpinned status has been introduce to
differentiate between pin errors (tracked by the operation tracker) and "lost"
items (which before were pin errors too). This new status is handled by the
Recover() operation as before.
2021-07-06 11:34:19 +02:00
Hector Sanjuan
90208b45f9 health/alerts endpoint: brush up old PR 2021-01-13 22:09:21 +01:00
Hector Sanjuan
4bcb91ee2b Merge branch 'master' into feat/alerts 2021-01-13 21:08:49 +01:00
Hector Sanjuan
fa762d78cf Improve PinLsCid to check for pinned items
Receive the full pin object so that it can decide whether to check
for recursive or direct pins directly.

Additionally, unpin will not check for the pin presence anymore and
simply trigger unpins (ignoring errors)
2020-04-21 17:23:55 +02:00
Hector Sanjuan
99eb29a7d6 cluster: do not allow to repin recursive pins as something else.
i.e. a direct pin can be repinned as recursive, but a recursive
pin cannot be pinned as direct (this fails in IPFS too).

Additionally, save a couple of calls to the datastore by obtaining the
existing pin only once.
2020-04-21 17:23:55 +02:00
Hector Sanjuan
f83ff9b655 staticcheck: fix all staticcheck warnings in the project 2020-04-14 20:16:10 +02:00
Kishan Mohanbhai Sagathiya
d21860eee7 Merge branch 'master' into feat/alerts 2019-12-13 10:22:06 +05:30
Kishan Sagathiya
5258a4d428 Remove map pintracker (#944)
This removes mappintracker and sets stateless tracker as the default (and only) pintracker component.

Because the stateless tracker matches the cluster state with only ongoing operations being kept on memory, and additional information provided by ipfs-pin-ls, syncing operations are not necessary. Therefore the Sync/SyncAll operations are removed cluster-wide.
2019-12-12 21:22:54 +01:00
Kishan Mohanbhai Sagathiya
04069b8c81 Introduction of ctl alerts
- basic version, just alerts if peers are down
2019-12-13 00:21:28 +05:30
Kishan Sagathiya
30fd5ee6a0 Feat: Allow and run multiple informer components (#962)
This introduces the possiblity of running Cluster with multiple informer components. The first one in the list is the used for "allocations". This is just groundwork for working with several informers in the future.
2019-12-05 15:08:43 +01:00
Hector Sanjuan
249d9007d2 Merge branch 'master' into feat/cluster-gc 2019-11-07 18:35:42 +01:00
Kishan Sagathiya
31534a429b Fix #374: health metrics improvements
- Human-sizes for freespace metrics. Display whether if metric is
expires in something like "expires in 3m".
- When not passing metric name `ipfs-cluster-ctl health metrics` hits
the the metrics endpoint which returns a list of available metrics and
displays to user
- Humanize metrics output
- Sort metrics output
2019-10-24 16:37:26 +02:00
Kishan Mohanbhai Sagathiya
492b5612e7 Add ability to run Garbage Collector on all peers
- cluster method, ipfs connector method, rpc and rest apis,
command, etc for repo gc
    - Remove extra space from policy generator
    - Added special timeout for `/repo/gc` call to IPFS
    - Added `RepoGCLocal` cluster rpc method, which will be used to run gc
    on local IPFS daemon
    - Added peer name to the repo gc struct
    - Sorted with peer ids, while formatting(only affects cli
    results)
    - Special timeout setting where timeout gets checked from last update
    - Added `local` argument, which would run gc only on contacted peer
2019-10-22 11:13:19 +05:30
Kishan Mohanbhai Sagathiya
9cb1cdeaff Merge branch 'master' into feat/recover-all 2019-09-08 17:06:43 +07:00
Kishan Mohanbhai Sagathiya
ea7093aa73 BlockAllocate should respect user allocations 2019-08-31 16:24:20 +05:30
Kishan Mohanbhai Sagathiya
512bf6a13b Pin recover on all peers
- recover works without `--local` flag as well (recovers all pins on all
peers)
- remove extra space from rpc policy

Fixes #763
2019-08-21 11:19:07 +05:30
Hector Sanjuan
1eade4ae58 Fix #732: Introduce native pin/update
This introduces a pin/update operation which allows to Pin a new item to
cluster indicating that said pin is an update to an already-existing pin.

When this is the case, all the configuration for the existing pin is copied to
the new one (including allocations). The IPFS connector will then trigger
pin/update directly in IPFS, allowing an efficient pinning based on
DAG-differences. Since the allocations where the same for both pins,
the pin/update can proceed.

PinUpdate does not unpin the previous pin (it is not possible to do this
atomically in cluster like it happens in IPFS). The user can manually do it
after the pin/update is done.

Internally, after a lot of deliberations on what the optimal way for this is,
I opted for adding a `PinUpdate` option to the `PinOptions` type (carries the
CID to update from). In order to carry this option from the REST API to the
IPFS Connector, it is serialized in the Protobuf (and stored in the
datastore). There is no other way to do this in a simple fashion since the Pin
object is piece of information that is sent around.

Additionally, making it a PinOption plays well with the Pin/PinPath APIs which
need little changes. Effectively, you are pinning a new thing. You are just
indicating that it should be configured from an existing one.

Fixes #732
2019-08-09 16:11:52 +02:00
Hector Sanjuan
084e763468 Fix #803: Add "follower_mode" to the config
Peers configured with follower_mode = true fail to add/pin/unpin.

Additionally they do not contact other peers when doing Status, Sync or
Recover and report on themselves.

They still contact other peers when doing "peers ls", as this is an OpenRPC
endpoint.

This is merely improving user interaction with a cluster peer and avoids
getting into confusing places:

* pin/unpin seems to work even no one trusts them
* status will query all peers in the peerset only to get auth errors and
ignore them, becoming way slower than it could be

This is not a security feature.
2019-07-30 19:59:59 +02:00
Hector Sanjuan
7c636061bd
Improve pin/unpin method signatures (#843)
* Improve pin/unpin method signatures:

These changes the following Cluster Go API methods:

* -> Cluster.Pin(ctx, cid, options) (pin, error)
* -> Cluster.Unpin(ctx, cid) (pin, error)
* -> Cluster.PinPath(ctx, path, opts) (pin,error)

Pin and Unpin now return the pinned object.

The signature of the methods now matches that of the API Client, is clearer as
to what options the user can set and is aligned with PinPath, UnpinPath, which
returned pin methods.

The REST API now returns the Pinned/Unpinned object rather than 204-Accepted.

This was necessary for a cleaner pin/update approach, which I'm working on in
another branch.

Most of the changes here are updating tests to the new signatures

* Adapt load-balancing client to new Pin/Unpin signatures

* cluster.go: Fix typo

Co-Authored-By: Kishan Sagathiya <kishansagathiya@gmail.com>

* cluster.go: Fix typo

Co-Authored-By: Kishan Sagathiya <kishansagathiya@gmail.com>
2019-07-22 15:39:11 +02:00
Hector Sanjuan
b804e61ef0 Update deps along with go-libp2p-core refactor
Lots of rewrites in imports...
2019-06-14 13:10:45 +02:00
Hector Sanjuan
59fdff97f2 policygen: use format.Source() directly in code. 2019-05-15 10:46:03 +02:00
Hector Sanjuan
de2e64e1e0 RPC Auth: make policygen.go generate a full rpc_policy.go
So that the file can be replaced. Helpers: "make" and "make install"
2019-05-13 23:22:08 +02:00
Hector Sanjuan
a2d8ce2ab6
Avoid using Sprintf("%s.%s")
Co-Authored-By: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-05-10 11:52:37 +01:00
Hector Sanjuan
c5a2e7fdc5 RPC auth: Fix tests
I cannot have RPCAPIs expose a SvcID() method as gorpc will warn about it not
having the right signature. So I have created an RPCServiceID() method instead.
2019-05-09 16:33:59 +02:00
Hector Sanjuan
70f4cad613 RPC Auth: start using the RPC policy in the RPC server. 2019-05-09 15:14:26 +02:00
Hector Sanjuan
2ed48b6ac4 RPC auth: Rework PeerAdd and Join
PeerAdd called RPC endpoints for `LogMetric` and `ConnectSwarms`
remotely. However, I think similar effect can be achieved by calling
these from the Join() function locally.

In particular, ConnectSwarms was called when maybe the joining peer did not
even know about the other peers in the Cluster. Now this is delayed until some
ping metrics have come through.
2019-05-09 14:19:07 +02:00
Hector Sanjuan
3d49ac26a5 Feat: Split components into RPC Services
I had thought of this for a very long time but there were no compelling
reasons to do it. Specifying RPC endpoint permissions becomes however
significantly nicer if each Component is a different RPC Service. This also
fixes some naming issues like having to prefix methods with the component name
to separate them from methods named in the same way in some other component
(Pin and IPFSPin).
2019-05-04 21:36:10 +01:00
Hector Sanjuan
da24114ae0 Proxy: hijack pin/update
The IPFS pin/update endpoint takes two arguments and usually
unpins the first and pins the second. It is a bit more efficient
to do it in a single operation than two separate ones.

This will make the proxy endpoint hijack pin/update requests.

First, the FROM pin is fetched from the state. If present, we
set the options (replication factors, actual allocations) from
that pin to the new one. Then we pin the TO item and proceed
to unpin the FROM item when `unpin` is not false.

We need to support path resolving, just like IPFS, therefore
it was necessary to expose IPFSResolve() via RPC.
2019-04-29 16:36:40 +02:00
Hector Sanjuan
acbd7fda60 Consensus: add new "crdt" consensus component
This adds a new "crdt" consensus component using go-ds-crdt.

This implies several refactors to fully make cluster consensus-component
independent:

* Delete mapstate and fully adopt dsstate (after people have migrated).
* Return errors from state methods rather than ignoring them.
* Add a new "datastore" modules so that we can configure datastores in the
   main configuration like other components.
* Let the consensus components fully define the "state.State". Thus, they do
not receive the state, they receive the storage where we put the state (a
go-datastore).
* Allow to customize how the monitor component obtains Peers() (the current
  peerset), including avoiding using the current peerset. At the moment the
  crdt consensus uses the monitoring component to define the current peerset.
  Therefore the monitor component cannot rely on the consensus component to
  produce a peerset.
* Re-factor/re-implementation of "ipfs-cluster-service state"
  operations. Includes the dissapearance of the "migrate" one.

The CRDT consensus component defines creates a crdt-datastore (with ipfs-lite)
and uses it to intitialize a dssate. Thus the crdt-store is elegantly
wrapped. Any modifications to the state get automatically replicated to other
peers. We store all the CRDT DAG blocks in the local datastore.

The consensus components only expose a ReadOnly state, as any modifications to
the shared state should happen through them.

DHT and PubSub facilities must now be created outside of Cluster and passed in
so they can be re-used by different components.
2019-04-17 19:14:26 +02:00
Hector Sanjuan
23db807b87 ipfsproxy: use PinPath to match IPFS behaviour
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2019-03-04 15:54:34 +00:00
Hector Sanjuan
9df6344a07 Avoid using string testing CIDs and use cid.Cids directly
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2019-02-27 20:09:31 +00:00
Hector Sanjuan
c4b18cd5f6 Address issues from self-review
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2019-02-27 18:43:29 +00:00
Hector Sanjuan
6447ea51d2 Remove *Serial types. Use pointers for all types.
This takes advantange of the latest features in go-cid, peer.ID and
go-multiaddr and makes the Go types serializable by default.

This means we no longer need to copy between Pin <-> PinSerial, or ID <->
IDSerial etc. We can now efficiently binary-encode these types using short
field keys and without parsing/stringifying (in many cases it just a cast).

We still get the same json output as before (with minor modifications for
Cids).

This should greatly improve Cluster performance and memory usage when dealing
with large collections of items.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2019-02-27 17:04:35 +00:00