This new component broadcasts metrics about the current size of the pinqueue,
which can in turn be used to inform allocations.
It has a weight_bucket_size option that serves to divide the actual size by a
given factor. This allows considering peers with similar queue sizes to have
the same weight.
Additionally, some changes have been made to the balanced allocator so that a
combination of tags, pinqueue sizes and free-spaces can be used. When
allocating by [<tag>, pinqueue, freespace], the allocator will prioritize
choosing peers with the smallest pin queue weight first, and of those with the
same weight, it will allocate based on freespace.
This commit introduces an api.Cid type and replaces the usage of cid.Cid
everywhere.
The main motivation here is to override MarshalJSON so that Cids are
JSON-ified as '"Qm...."' instead of '{ "/": "Qm....." }', as this "ipld"
representation of IDs is horrible to work with, and our APIs are not issuing
IPLD objects to start with.
Unfortunately, there is no way to do this cleanly, and the best way is to just
switch everything to our own type.
This commit continues the work of taking advantage of the streaming
capabilities in go-libp2p-gorpc by improving the ipfsconnector and pintracker
components.
StatusAll and RecoverAll methods are now streaming methods, with the REST API
output changing accordingly to produce a stream of GlobalPinInfos rather than
a json array.
pin/ls request to the ipfs daemon now use ?stream=true and avoid having to
load the full pinset map on memory. StatusAllLocal and RecoverAllLocal
requests to the pin tracker stream all the way and no longer store the full
pinset, and the full PinInfo status slice before sending it out.
We have additionally switched to a pattern where streaming methods receive the
channel as an argument, allowing the caller to decide on whether to launch a
goroutine, do buffering etc.
This commit introduces the new go-libp2p-gorpc streaming capabilities for
Cluster. The main aim is to work towards heavily reducing memory usage when
working with very large pinsets.
As a side-effect, it takes the chance to revampt all types for all public
methods so that pointers to static what should be static objects are not used
anymore. This should heavily reduce heap allocations and GC activity.
The main change is that state.List now returns a channel from which to read
the pins, rather than pins being all loaded into a huge slice.
Things reading pins have been all updated to iterate on the channel rather
than on the slice. The full pinset is no longer fully loaded onto memory for
things that run regularly like StateSync().
Additionally, the /allocations endpoint of the rest API no longer returns an
array of pins, but rather streams json-encoded pin objects directly. This
change has extended to the restapi client (which puts pins into a channel as
they arrive) and to ipfs-cluster-ctl.
There are still pending improvements like StatusAll() calls which should also
stream responses, and specially BlockPut calls which should stream blocks
directly into IPFS on a single call.
These are coming up in future commits.
This is a preparatory PR to add additional APIs (Pinning Service API) easily
to cluster.
Instead of copy-pasting most of what the REST API does, I have refactored so
that the whole configuration, routing and request-handling utilities can be
re-used.
The worst part has been to divide the test between tests that test core
(common.API) functionality and tests that test specific REST API endpoint
functionality. I could not get away without an additional common/test package
to provide functions that are used from both places. This is a side effect of
testing both http and libp2p endpoints for every request etc.
Badger can take 1000x the amount of needed space if not GC'ed or compacted
(#1320), even for non heavy usage. Cluster has no provisions to run datastore
GC operations and while they could be added, they are not ensured to
help. Improvements on Badger v3 might help but would still need to GC
explicitally.
Cluster was however designed to support any go-datastore as backend.
This commit adds LevelDB support. LevelDB go-datastore wrapper is mature, does
not need GC and should work well for most cluster usecases, which are not
overly demanding.
A new `--datastore` flag has been added on init. The store backend is selected
based on the value in the configuration, similar to how raft/crdt is. The
default is set to leveldb. From now on it should be easier to add additional
backends, i.e. badgerv3.
GlobalPinInfo objects carried redundant information (Cid, Peer) that takes
space and time to serialize.
This has been addressed by having GlobalPinInfo embed PinInfoShort rather than
PinInfo. This new types ommits redundant fields.
For the online case we were unnecessarily loading the configuration from IPFS, a very
slow operation if IPFS is very busy (when syncing).
For the offline case, we required IPFS to be online (for remote
configurations) slow things down as well and is uncovenient. Instead, simply
open the database with default parameters and list it.
* add ipv6 listening addresses to the default config
* ipfsproxy: support multiple listeners. Add default ipv6.
* mm
* restapi: support multiple listen addresses. enable ipv6
* cluster_config: format default listen addresses
* commands: update for multiple listeners. Fix randomports for udp and ipv6.
* ipfs-cluster-service: fix randomports test
* multiple listeners: fix remaining tests
* golint
* Disable ipv6 in defaults
It is not supported by docker by default. It is not supported in travis-CI
build environments. User can enable it now manually.
* proxy: disable ipv6 in test
* ipfshttp: fix test
Co-authored-by: @RubenKelevra <cyrond@gmail.com>
This adds a new cluster command: ipfs-cluster-follow.
This command allows initializing and running follower peers as configured by a
remote-source configuration. The command can list configured peers
and obtain information for each of them.
Peers are launched with the rest API listening on a local unix socket. The
command can be run to list the items in the cluster pinset using this
endpoint. Alternatively, if no socket is present, the peer will be assumed to
be offline and the pin list will be directly read from the datastore.
Cluster peers launched with this command (and their configurations) are
compatible with ipfs-cluster-ctl and ipfs-cluster-service. We purposely do not
support most configuration options here. Using ipfs-cluster-ctl or launching
the peers using ipfs-cluster-service is always an option when the usecase
deviates from that supported by ipfs-cluster-follow.
Examples:
$ ipfs-cluster-follow -> list configured peers
$ ipfs-cluster-follow --help
$ ipfs-cluster-follow <clusterName> init <url>
$ ipfs-cluster-follow <clusterName> info
$ ipfs-cluster-follow <clusterName> run
$ ipfs-cluster-follow <clusterName> list