ipfs-cluster

Author	SHA1	Message	Date
Hector Sanjuan	592ce450ce	pintracker: RecoverAll should only return status for recovered items We call RecoverAll regularly and I noticed it was way slower than it should be. After all, it should just loop the pinset and enqueue items that are unexpectedly unpinned or in pin error. However, at some point we decided that RecoverAll would return information for all pins, regardless of whether they were recovered or not. This ends up resulting in a separate Status call for every pin that is already pinned, and this call hits IPFS. This is pretty bad with big pinsets. This commit fixes that: we return no state information for pins that are not touched.	2022-01-11 16:22:03 +01:00
Hector Sanjuan	37f9728f49	Avoid publishing invalid metrics Invalid metrics returned by informers should not be sent around	2021-12-17 11:57:01 +01:00
Hector Sanjuan	ff104a9220	mdns fix: ensure the new mdns service is started	2021-12-01 01:27:59 +01:00
Hector Sanjuan	4739ed9210	Changes pertaining to go-libp2p v0.16.0	2021-11-30 06:25:15 +01:00
Hector Sanjuan	3ddda1fb59	Merge branch 'master' into dependency-upgrades	2021-10-27 15:55:34 +02:00
Hector Sanjuan	32386d853a	Dependency upgrades	2021-10-20 16:56:24 +02:00
Hector Sanjuan	e9857652f2	Add a timestamp to Pins This adds a Timestamp field to the pin objects. This allows to track when they were pinned. This: * Allows the pin-tracker to actually show accurate information on when the pin entered the system for pins that are not part of ongoing operations (currently it shows time.Now()) * Adds support for reporting timestamp on a pinning services api.	2021-10-20 16:55:57 +02:00
Hector Sanjuan	6b31f44351	Address most comments from PR review	2021-10-05 14:04:28 +02:00
Hector Sanjuan	ea5e18078c	Informers: GetMetric() -> GetMetrics() Support returning multiple metrics per informer.	2021-09-15 20:07:37 +02:00
Hector Sanjuan	67497c4eb4	Fix #1436: Do not block peer startup waiting for RecoverAll On large pinsets this may take a very long time and prevents metrics and re-bootstrapping from starting, among other things. See bug description. This lets watchPinset trigger an immediate RecoverAllLocal instead, but this happens in its own goroutine and should allow everything else to start.	2021-08-06 11:30:29 +02:00
Ian Davis	cb4023855c	Ensure read of alerts slice is performed while holding lock Fixes data race: ================== WARNING: DATA RACE Read at 0x00c029ae7f10 by goroutine 4785: github.com/ipfs/ipfs-cluster.(Cluster).Alerts() /home/iand/wip/iand/ipfs-cluster/cluster.go:395 +0x64 github.com/ipfs/ipfs-cluster.TestClusterAlerts() /home/iand/wip/iand/ipfs-cluster/ipfscluster_test.go:2159 +0x238 testing.tRunner() /opt/go/src/testing/testing.go:1194 +0x202 Previous write at 0x00c029ae7f10 by goroutine 5062: github.com/ipfs/ipfs-cluster.(Cluster).alertsHandler() /home/iand/wip/iand/ipfs-cluster/cluster.go:429 +0x48c github.com/ipfs/ipfs-cluster.(Cluster).run.func5() /home/iand/wip/iand/ipfs-cluster/cluster.go:596 +0x76 Goroutine 4785 (running) created at: testing.(T).Run() /opt/go/src/testing/testing.go:1239 +0x5d7 testing.runTests.func1() /opt/go/src/testing/testing.go:1512 +0xa6 testing.tRunner() /opt/go/src/testing/testing.go:1194 +0x202 testing.runTests() /opt/go/src/testing/testing.go:1510 +0x612 testing.(M).Run() /opt/go/src/testing/testing.go:1418 +0x3b3 github.com/ipfs/ipfs-cluster.TestMain() /home/iand/wip/iand/ipfs-cluster/ipfscluster_test.go:134 +0x7dc main.main() _testmain.go:179 +0x271 Goroutine 5062 (running) created at: github.com/ipfs/ipfs-cluster.(Cluster).run() /home/iand/wip/iand/ipfs-cluster/cluster.go:594 +0x1f6 github.com/ipfs/ipfs-cluster.NewCluster.func1() /home/iand/wip/iand/ipfs-cluster/cluster.go:208 +0xa4 ================== --- FAIL: TestClusterAlerts (8.69s) testing.go:1093: race detected during execution of test	2021-07-28 13:08:56 +01:00
Hector Sanjuan	edfcfa3fb0	Fix #1360: Efficient pinset status with filters This commit modifies the pintracker StatusAll call to take a status filter. This allows skipping a PinLs call to ipfs when checking status for items that are queued, pinning, unpinning or in error. Those statuses come directly from the operation tracker. This should result in a significant performance increase for those calls, particularly in nodes with several hundred thousand pins and more, where the call to IPFS is very expensive. A new TrackerStatusUnexpectedlyUnpinned status has been introduced to differentiate between pin errors (tracked by the operation tracker) and "lost" items (which before were pin errors too). This new status is handled by the Recover() operation as before.	2021-07-06 11:34:19 +02:00
Hector Sanjuan	5419d0ff7c	Issue #1350: Ensure CID status and Peers do not take too long StatusCID() and Peers() are calls that should return relatively quickly. If they don't, it means they are hanging for some reason. We cannot let the whole Multicall request hang waiting on a single peer, therefore, set a hardcoded 15 second deadline for both.	2021-05-03 17:39:33 +02:00
Hector Sanjuan	d1700dbe81	Fixes #1319: Status wrongly shows pins as REMOTE The Allocations of a pin that has been added with default replication factor are kept even when the replication factor turns out to be -1. This resulted in the Status(cid) code skipping calls to a number of peers and setting the pin directly as REMOTE. The fix, on one side, makes sure Allocations is always nil when the replication factor is -1. On the other side, it lets the globalPinInfoCid method check the replication factor value, rather than the number of allocations, to decide if any nodes are bound to be remote. On the plus side, the pin tracker used the IsRemotePin method, which uses the replication factor, so things were pinned even if the Status(cid) method shows them as remote.	2021-03-24 00:47:15 +01:00
Sergei Udris	43fa2994ac	feat: make MDNS failure on start non-fatal (#1310) * feat: make MDNS failure on start non-fatal - if discovery.NewMdnsService errors on start, show warning "error message, MDNS service will be disabled" - same as setting MDNSInterval to 0: NewCluster is still created and daemon runs, but without MDNS	2021-02-19 09:47:46 +01:00
Hector Sanjuan	7ea11da75f	Fix linter problem	2021-01-14 00:18:16 +01:00
Hector Sanjuan	90208b45f9	health/alerts endpoint: brush up old PR	2021-01-13 22:09:21 +01:00
Hector Sanjuan	4bcb91ee2b	Merge branch 'master' into feat/alerts	2021-01-13 21:08:49 +01:00
Hector Sanjuan	e967238848	Merge pull request #1129 from ipfs/fix/1013-follow-list Improvements to ipfs-cluster-follow * list	2020-05-16 02:31:06 +02:00
Hector Sanjuan	c026299b95	Include Name as GlobalPinInfo key and consolidate redundant keys GlobalPinInfo objects carried redundant information (Cid, Peer) that takes space and time to serialize. This has been addressed by having GlobalPinInfo embed PinInfoShort rather than PinInfo. This new type omits redundant fields.	2020-05-16 02:27:24 +02:00
Hector Sanjuan	b0dcfe68c7	Merge pull request #1127 from ipfs/fix/1064-repin-followers Fix #1064: Make the peer closest to the CID in charge of repinning	2020-05-15 20:21:26 +02:00
Hector Sanjuan	2e49d522ec	Fix latest staticcheck errors and let travis test it	2020-05-14 23:54:11 +02:00
Hector Sanjuan	9bfbde8e76	Fix #1064: Make the peer closest to the CID in charge of repinning We cannot rely on current pin allocations anymore since the allocations may be follower peers that do not even handle alerts. Instead, we assume that we are a trusted peer (because we are not in follower mode), get all other trusted peers and act if we are the XOR-closest to the CID. This also means that replication-factor = 1 pins can be recovered too.	2020-05-14 23:47:56 +02:00
Hector Sanjuan	1499107835	Cluster: update docstring for setupPin()	2020-04-21 22:59:35 +02:00
Hector Sanjuan	99eb29a7d6	cluster: do not allow to repin recursive pins as something else. i.e. a direct pin can be repinned as recursive, but a recursive pin cannot be pinned as direct (this fails in IPFS too). Additionally, save a couple of calls to the datastore by obtaining the existing pin only once.	2020-04-21 17:23:55 +02:00
Hector Sanjuan	a6d8e00d20	Merge pull request #1065 from gargdeepak/fix/cluster/pinupdate Fixes #996 pin expiry is updated if set in options	2020-04-16 10:55:18 +02:00
Hector Sanjuan	7ffd18e41b	Feat: upgrade to dual DHT	2020-04-14 22:03:24 +02:00
Hector Sanjuan	f83ff9b655	staticcheck: fix all staticcheck warnings in the project	2020-04-14 20:16:10 +02:00
Hector Sanjuan	eba3246b98	chore: update deps: dht to cypress Add necessary validators. There is currently no way to run a dynamic-mode DHT that supports LAN-based peers. Therefore for the time-being we are setting the dht.Mode to "server".	2020-04-07 16:22:46 +02:00
Hector Sanjuan	65ad4bd632	cluster/daemons: Close the datastore AFTER the DHT. Avoids panics. This also removes the abnormality of cluster closing a datastore that it did not create.	2020-04-02 16:29:41 +02:00
deepakgarg	788ecff327	Fixes #996 pin expiry is updated if set in options	2020-03-26 20:07:23 -07:00
Hector Sanjuan	b3853caf36	Dependency upgrade: changes needed * Libp2p protectors no longer needed, use PSK directly * Generate cluster 32-byte secret here (helper gone from pnet) * Switch to go-log/v2 in all places * DHT bootstrapping not needed. Adjust DHT options for tests. * Do not rely on disappeared CidToDsKey and DsKeyToCid functions from dshelp. * Disable QUIC (does not support private networks) * Fix tests: autodiscovery started working properly	2020-03-22 14:50:25 +01:00
Yang Hau	7986d94242	fix: Fix typos (#1001) Fix typos in files	2020-02-03 10:30:04 +01:00
Kishan Mohanbhai Sagathiya	68abae9287	Merge branch 'master' into feat/alerts	2019-12-23 12:45:22 +05:30
Kishan Mohanbhai Sagathiya	a3b8767e87	Added tests for Alerts - tests for related cluster method, rest api, client method etc - clean expired alerts every time a new alert comes in	2019-12-23 12:42:38 +05:30
Hector Sanjuan	ad1e739bfb	Cluster: follower peers should ignore alerts/re-allocations.	2019-12-16 13:42:35 +01:00
Hector Sanjuan	9f660ba38e	Fix: stateless: cluster should pin items that are in the state but not on ipfs StateSync() used to take care of this by issuing Track() calls. But this functionality was removed. This starts returning items that are in the state but not on IPFS as PIN_ERRORs. It ensures that the Recover methods see them so that they can trigger repinnings for missing items. This covers cases where the user modifies the ipfs state manually, or resets the ipfs daemon but keeps the cluster state, and cases where cluster was stopped half-way through a pinning.	2019-12-16 13:42:35 +01:00
Hector Sanjuan	6ad8974bde	cluster: Shutdown informer components	2019-12-13 09:48:12 +01:00
Hector Sanjuan	31f4afadea	cluster: Improve Shutdown() docstring	2019-12-13 09:47:42 +01:00
Kishan Mohanbhai Sagathiya	618ebd23f4	Check expiry in alert	2019-12-13 12:25:28 +05:30
Kishan Mohanbhai Sagathiya	d21860eee7	Merge branch 'master' into feat/alerts	2019-12-13 10:22:06 +05:30
Kishan Sagathiya	5258a4d428	Remove map pintracker (#944) This removes mappintracker and sets stateless tracker as the default (and only) pintracker component. Because the stateless tracker matches the cluster state with only ongoing operations being kept in memory, and additional information provided by ipfs-pin-ls, syncing operations are not necessary. Therefore the Sync/SyncAll operations are removed cluster-wide.	2019-12-12 21:22:54 +01:00
Kishan Mohanbhai Sagathiya	04069b8c81	Introduction of `ctl alerts` - basic version, just alerts if peers are down	2019-12-13 00:21:28 +05:30
Kishan Sagathiya	30fd5ee6a0	Feat: Allow and run multiple informer components (#962) This introduces the possibility of running Cluster with multiple informer components. The first one in the list is the one used for "allocations". This is just groundwork for working with several informers in the future.	2019-12-05 15:08:43 +01:00
Hector Sanjuan	d2bf1bc7b6	config: Add PeerAddresses This adds a PeerAddresses entry to the main cluster configuration. The peer will ingest and potentially connect to those peer addresses during the start (similarly to the ones in the peerstore). This allows to provide "bootstrap" (as in "peers we connect to") addresses directly in the configuration, which is useful when distributing a single configuration template that will allow a cluster peer to know where to connect on the first boot.	2019-12-02 15:08:37 +01:00
Kishan Mohanbhai Sagathiya	95016492c3	Merge branch 'master' into fix/broadcast-ops	2019-11-08 20:50:19 +05:30
Kishan Mohanbhai Sagathiya	d42c0fd651	move comment with variables	2019-11-08 20:49:06 +05:30
Kishan Mohanbhai Sagathiya	0e7ed97e59	Make it clearer	2019-11-08 19:45:01 +05:30
Kishan Mohanbhai Sagathiya	108fcff8a9	temp	2019-11-08 12:43:01 +05:30
Hector Sanjuan	249d9007d2	Merge branch 'master' into feat/cluster-gc	2019-11-07 18:35:42 +01:00

308 Commits