Long story: Since #1768 there has been a recurring repinning test failure with
Raft consensus.
Per the test, if a pin is allocated to a peer that has been shut down,
submitting the pin again should re-allocate it to a peer that is still
running.
Investigating why this test fails, and why it fails only with Raft, led to
realizing that this and other similar tests were passing by chance. The
needed re-allocations were made not by the new submission of the pin, but by
the automatic-repinning feature. The actual resubmitted pin carried the
same allocations (one of them being the peer that was down), but it was
silently failing because the RedirectToLeader() code path was using
cc.ctx and hitting the peer that had been shut down, which caused it to error.
Fixing the context propagation meant that we would overwrite the pin with the
old allocations again, so the actual behaviour did not pass the test.
So, on one side, this fixes a number of tests that had not disabled automatic
repinning, which was probably getting in the way of things. On the other side,
it removes a condition that prevented re-allocation of pins when they exist
and their options have not changed.
I don't fully understand why this condition was there, though, since the
Allocate() code returns the old allocations anyway when they are enough, so it
should not re-allocate randomly. I suspect it was preventing some misbehaviour
in the Allocate() code from the time before it was improved with multiple
allocators etc.
We were propagating the wrong context (mostly from the Cluster top-level
methods). This means that request cancellations (and cancellations of the
associated contexts) are not propagated to many methods, which can result in
deadlocks when an operation that is holding a lock is not aborted.
This affects, for example, the operation tracker. Getting all operations from
the tracker relies on someone reading from the out channel, or on the context
being cancelled. When a request is aborted in the middle of the response and
the context is not cancelled, everything that wants to list operations becomes
deadlocked, including operations that need write locks, like
TrackNewOperation.
This fixes it.
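As an illustration of the fix, a minimal sketch of the pattern (names here
are hypothetical, not the actual ipfs-cluster types): every send on the out
channel also selects on the request context, so an aborted caller unblocks
the tracker and its lock is released.

```go
package tracker

import (
	"context"
	"sync"
)

// Operation is a placeholder for the tracked operation type.
type Operation struct{}

// OperationTracker is a hypothetical minimal tracker used only to
// illustrate the pattern.
type OperationTracker struct {
	mu         sync.RWMutex
	operations map[string]*Operation
}

// GetAll streams all tracked operations to out. Every send also selects
// on ctx.Done(), so an aborted request releases the read lock instead of
// blocking forever and deadlocking writers like TrackNewOperation.
func (opt *OperationTracker) GetAll(ctx context.Context, out chan<- *Operation) error {
	defer close(out)

	opt.mu.RLock()
	defer opt.mu.RUnlock()

	for _, op := range opt.operations {
		select {
		case out <- op:
		case <-ctx.Done():
			// The caller went away: stop writing and let the lock go.
			return ctx.Err()
		}
	}
	return nil
}
```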
This implements committing batches on shutdown properly.
Now the batchWorker will only finish when there is nothing left queued for
inclusion in the final batch(es).
LogPin/Unpin operations will fail while we are shutting down when they cannot
be included in the batch.
This attempts to commit any pending batches when the crdt component is being
shut down. A commit should succeed if the new DAG node is created, and the
heads are replaced and broadcast.
The latest version of the CRDT library ensures that the datastore does not
unnecessarily get marked as dirty when a broadcasted head cannot be
fetched/processed, so the side effect of publishing a head right before
shutting down should be under control, at least.
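A rough sketch of the drain-on-shutdown behaviour described above (names and
structure are illustrative, not the actual crdt component code):

```go
package crdt

import (
	"context"
	"time"
)

type batchOp struct{} // placeholder for a queued pin/unpin

// consensus holds the hypothetical batching state; field and method
// names are illustrative only.
type consensus struct {
	batchCh      chan batchOp
	maxBatchAge  time.Duration
	maxBatchSize int
}

func (c *consensus) addToBatch(op batchOp) {} // stub
func (c *consensus) commitBatch()          {} // stub

// batchWorker groups queued operations into batches and, on shutdown,
// drains everything still queued into final batch(es) before returning.
func (c *consensus) batchWorker(ctx context.Context) {
	ticker := time.NewTicker(c.maxBatchAge)
	defer ticker.Stop()

	n := 0
	for {
		select {
		case <-ctx.Done():
			// Shutting down: only finish when nothing is left
			// to include in the final batch(es).
			for {
				select {
				case op := <-c.batchCh:
					c.addToBatch(op)
				default:
					c.commitBatch()
					return
				}
			}
		case op := <-c.batchCh:
			c.addToBatch(op)
			n++
			if n >= c.maxBatchSize {
				c.commitBatch()
				n = 0
			}
		case <-ticker.C:
			c.commitBatch()
			n = 0
		}
	}
}
```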
When batching is enabled, the "batchingstate" is used to add/remove pins,
while the non-batching state is used as read-only state for doing List().
This means that both states write pin metrics, but since the batching state is
never used for List(), it never has the right total number.
This fixes that.
This commit introduces an api.Cid type and replaces the usage of cid.Cid
everywhere.
The main motivation here is to override MarshalJSON so that CIDs are
JSON-ified as '"Qm...."' instead of '{ "/": "Qm....." }', as this "ipld"
representation of CIDs is horrible to work with, and our APIs are not issuing
IPLD objects to start with.
Unfortunately, there is no way to do this cleanly, and the best way is to just
switch everything to our own type.
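The core of the idea, as a minimal sketch (the real api.Cid carries more
methods than shown here):

```go
package api

import (
	"encoding/json"

	cid "github.com/ipfs/go-cid"
)

// Cid wraps cid.Cid so that its JSON form is a plain string.
type Cid struct {
	cid.Cid
}

// MarshalJSON encodes the Cid as "Qm..." instead of {"/":"Qm..."}.
func (c Cid) MarshalJSON() ([]byte, error) {
	return json.Marshal(c.String())
}

// UnmarshalJSON decodes a plain string back into a Cid.
func (c *Cid) UnmarshalJSON(b []byte) error {
	var s string
	if err := json.Unmarshal(b, &s); err != nil {
		return err
	}
	decoded, err := cid.Decode(s)
	if err != nil {
		return err
	}
	c.Cid = decoded
	return nil
}
```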
This commit continues the work of taking advantage of the streaming
capabilities in go-libp2p-gorpc by improving the ipfsconnector and pintracker
components.
StatusAll and RecoverAll methods are now streaming methods, with the REST API
output changing accordingly to produce a stream of GlobalPinInfos rather than
a JSON array.
pin/ls requests to the ipfs daemon now use ?stream=true and avoid having to
load the full pinset map in memory. StatusAllLocal and RecoverAllLocal
requests to the pin tracker stream all the way and no longer store the full
pinset, nor the full PinInfo status slice, before sending it out.
We have additionally switched to a pattern where streaming methods receive the
channel as an argument, allowing the caller to decide on whether to launch a
goroutine, do buffering etc.
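A small sketch of this channel-as-argument pattern (hypothetical names and
signatures, not the real pintracker methods):

```go
package main

import (
	"context"
	"fmt"
)

// PinInfo stands in for api.PinInfo.
type PinInfo struct{ Status string }

// statusAll writes results to the caller-provided channel and closes it
// when finished; it never buffers the full result set.
func statusAll(ctx context.Context, out chan<- *PinInfo) error {
	defer close(out)
	for i := 0; i < 3; i++ {
		select {
		case out <- &PinInfo{Status: "pinned"}:
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return nil
}

func main() {
	// The caller decides on buffering and whether to use a goroutine.
	out := make(chan *PinInfo, 16)
	go func() {
		if err := statusAll(context.Background(), out); err != nil {
			fmt.Println("error:", err)
		}
	}()
	for pi := range out {
		fmt.Println(pi.Status)
	}
}
```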
This commit introduces the new go-libp2p-gorpc streaming capabilities for
Cluster. The main aim is to work towards heavily reducing memory usage when
working with very large pinsets.
As a side-effect, it takes the chance to revamp the types of all public
methods so that pointers to what should be static objects are not used
anymore. This should heavily reduce heap allocations and GC activity.
The main change is that state.List now returns a channel from which to read
the pins, rather than loading all pins into a huge slice.
Everything that reads pins has been updated to iterate on the channel rather
than on the slice. The full pinset is no longer fully loaded into memory for
things that run regularly, like StateSync(). A consuming sketch follows.
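For illustration, consuming the channel-based List might look like this (a
hypothetical interface modeled on the description above, not the exact state
API):

```go
package example

import "context"

// Pin stands in for api.Pin.
type Pin struct{}

// lister models the new behaviour: List returns a channel of pins
// rather than a slice.
type lister interface {
	List(ctx context.Context) (<-chan Pin, error)
}

// forEachPin ranges over the channel, so the full pinset is never
// held in memory at once.
func forEachPin(ctx context.Context, st lister, f func(Pin)) error {
	pins, err := st.List(ctx)
	if err != nil {
		return err
	}
	for pin := range pins {
		f(pin)
	}
	return nil
}
```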
Additionally, the /allocations endpoint of the REST API no longer returns an
array of pins, but rather streams JSON-encoded pin objects directly. This
change extends to the restapi client (which puts pins into a channel as they
arrive) and to ipfs-cluster-ctl.
There are still pending improvements, like StatusAll() calls, which should
also stream responses, and especially BlockPut calls, which should stream
blocks directly into IPFS in a single call.
These are coming up in future commits.
The go-ds-crdt upgrade disables multi-head processing by default again, as we
saw it causing a lot of branching.
We do, however, increase the number of workers. With large deltas, all 5
workers may be busy downloading or processing a delta while we potentially
have hundreds of children pending in the DAG. Thus it is not bad to attempt
to do more things in parallel.
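For reference, the relevant go-ds-crdt options look roughly like this (the
exact worker count used here is illustrative):

```go
package example

import (
	crdt "github.com/ipfs/go-ds-crdt"
)

func options() *crdt.Options {
	opts := crdt.DefaultOptions()
	opts.MultiHeadProcessing = false // default again; avoids heavy branching
	opts.NumWorkers = 50             // fetch/process more DAG children in parallel
	return opts
}
```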
This adds batching support to the crdt consensus component per #1008. The
crdt component can now take advantage of the BatchingState, which uses the
batching-crdt datastore. In batching mode, the crdt datastore groups multiple
Add and Delete operations in a single delta (instead of one operation per
delta, as it does by default).
Batching is enabled in the crdt configuration section by setting MaxBatchSize
**and** MaxBatchAge. These two settings control when a batch is committed,
either by reaching a maximum number of pin/unpin operations, or by reaching a
maximum age.
Batching unlocks large pin-ingestion scalability for clusters, but should be
configured according to the expected workloads. An additional, hidden
MaxQueueSize parameter provides the ability to apply backpressure to Pin/Unpin
requests: when more than MaxQueueSize pin/unpins are waiting to be included in
a batch, the LogPin/LogUnpin operations will fail. If this happens, it means
the cluster cannot commit batches as fast as pins are arriving. Thus,
MaxQueueSize should be increased (to accommodate bursts), or the batch size
increased (to perform fewer commits and hopefully handle the requests faster).
Note that the underlying CRDT library will auto-commit when batch deltas reach
1MB of size.
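For illustration, a crdt configuration section with batching enabled might
look like the following (key names follow the service.json conventions; the
values are examples only):

```json
"crdt": {
  "cluster_name": "ipfs-cluster",
  "batching": {
    "max_batch_size": 500,
    "max_batch_age": "1m"
  }
}
```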
Previously, message validation was not looking at the signer of the message,
but at the peer that relayed it, to verify the validity of the message.
This meant that, under certain gossip graph configurations, some nodes would
only get messages via untrusted peers, and would thus be unable to sync the
chain.
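A hedged sketch of the distinction, expressed with the go-libp2p-pubsub
validator API (isTrusted is a hypothetical helper; the actual fix lives in
cluster's trust checks):

```go
package example

import (
	"context"

	peer "github.com/libp2p/go-libp2p-core/peer"
	pubsub "github.com/libp2p/go-libp2p-pubsub"
)

// makeValidator checks trust against the message signer (msg.GetFrom()),
// not the relaying peer (pid, which matches msg.ReceivedFrom).
func makeValidator(isTrusted func(peer.ID) bool) pubsub.Validator {
	return func(ctx context.Context, pid peer.ID, msg *pubsub.Message) bool {
		return isTrusted(msg.GetFrom())
	}
}
```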
trusted_peers and peer_addresses had the same info appended twice when
reparsing the config: once when parsing the JSON and once when re-parsing
with env vars.
This adds the necessary validators.
There is currently no way to run a dynamic-mode DHT that supports
LAN-based peers. Therefore, for the time being, we are setting
dht.Mode to "server".
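In code, this is the standard go-libp2p-kad-dht option (shown here out of
cluster context):

```go
package example

import (
	"context"

	"github.com/libp2p/go-libp2p-core/host"
	dht "github.com/libp2p/go-libp2p-kad-dht"
)

// newDHT constructs the DHT in server mode so that LAN-based peers keep
// being served, instead of relying on dynamic mode detection.
func newDHT(ctx context.Context, h host.Host) (*dht.IpfsDHT, error) {
	return dht.New(ctx, h, dht.Mode(dht.ModeServer))
}
```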
* Libp2p protectors are no longer needed; use the PSK directly
* Generate the cluster 32-byte secret here (helper gone from pnet)
* Switch to go-log/v2 in all places
* DHT bootstrapping is not needed. Adjust DHT options for tests.
* Do not rely on the disappeared CidToDsKey and DsKeyToCid functions from
  dshelp.
* Disable QUIC (it does not support private networks)
* Fix tests: autodiscovery started working properly
This removes mappintracker and sets the stateless tracker as the default (and only) pintracker component.
Because the stateless tracker matches the cluster state, keeping only ongoing operations in memory and relying on additional information provided by "ipfs pin ls", syncing operations are not necessary. Therefore, the Sync/SyncAll operations are removed cluster-wide.
* Update go-ds-crdt to 0.1.5 which adds a return statement in case of error fetching a node.
* Increase DAG-Get timeout to 2 minutes
* Downgrade go-bitswap to 0.1.6.
With this commit, cluster peers will observe events of peer removal from the
cluster. On occurrence of the event, the cluster peer will clear the removed
peer from its peerstore.
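The peerstore cleanup itself is a standard libp2p call; a minimal sketch
(the event subscription that triggers it is cluster-specific and omitted
here):

```go
package example

import (
	"github.com/libp2p/go-libp2p-core/host"
	"github.com/libp2p/go-libp2p-core/peer"
)

// clearRemovedPeer drops the removed peer's addresses from the peerstore.
func clearRemovedPeer(h host.Host, p peer.ID) {
	h.Peerstore().ClearAddrs(p)
}
```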