ipfs-cluster

Author	SHA1	Message	Date
Hector Sanjuan	eee53bfa4f	Streaming Peers(): make Peers() a streaming call This commit makes all the changes needed to make Peers() a streaming call. While Peers is usually a non-problematic call, for consistency, all calls returning collections assembled through broadcast to cluster peers are now streaming calls.	2022-03-23 01:27:57 +01:00
Hector Sanjuan	0d73d33ef5	Pintracker: streaming methods This commit continues the work of taking advantage of the streaming capabilities in go-libp2p-gorpc by improving the ipfsconnector and pintracker components. StatusAll and RecoverAll methods are now streaming methods, with the REST API output changing accordingly to produce a stream of GlobalPinInfos rather than a json array. pin/ls requests to the ipfs daemon now use ?stream=true and avoid having to load the full pinset map into memory. StatusAllLocal and RecoverAllLocal requests to the pin tracker stream all the way and no longer store the full pinset and the full PinInfo status slice before sending it out. We have additionally switched to a pattern where streaming methods receive the channel as an argument, allowing the caller to decide on whether to launch a goroutine, do buffering etc.	2022-03-22 15:38:01 +01:00
Hector Sanjuan	9b9d76f92d	Pinset streaming and method type revamp This commit introduces the new go-libp2p-gorpc streaming capabilities for Cluster. The main aim is to work towards heavily reducing memory usage when working with very large pinsets. As a side-effect, it takes the chance to revamp all types for all public methods so that pointers to what should be static objects are not used anymore. This should heavily reduce heap allocations and GC activity. The main change is that state.List now returns a channel from which to read the pins, rather than pins being all loaded into a huge slice. Things reading pins have all been updated to iterate on the channel rather than on the slice. The full pinset is no longer fully loaded into memory for things that run regularly like StateSync(). Additionally, the /allocations endpoint of the rest API no longer returns an array of pins, but rather streams json-encoded pin objects directly. This change has extended to the restapi client (which puts pins into a channel as they arrive) and to ipfs-cluster-ctl. There are still pending improvements like StatusAll() calls which should also stream responses, and especially BlockPut calls which should stream blocks directly into IPFS on a single call. These are coming up in future commits.	2022-03-19 03:02:55 +01:00
Hector Sanjuan	d0e905babe	Fix critical log level The CRITICAL log level no longer exists and we were logging more than we wanted to.	2022-02-15 19:38:33 +01:00
Hector Sanjuan	5e89c0ba41	Pintracker: set Name in operation tracker. Fixes #1212.	2022-01-31 21:04:11 +01:00
Hector Sanjuan	809b7fbda5	Pintracker: add IPFS ID to Pin Information Fixes #1554 Fixes: peer names unset for remote peers This adds an IPFS field to pin status information (PinInfoShort). It has not been easy to add this, given that the IPFS ID is something that comes from outside of cluster (unlike the peer name). After several tries I have settled on the following things: - Use the ping metric to send out peer names and IPFS IDs to the peers in the cluster. - Cache the latest known IPFS ID (if IPFS dies we should still be setting the ID). - Provide an RPC method for the Pintracker to obtain IPFS ID from the cache. - Given we now know information for peernames and IPFS IDs from other peers, we can use that information even if the requests to them error or we are not contacting them (i.e. peers allocated as remote are not queried for status). We can use the information from the last received ping metric. - This means we should keep metrics around even if peers go away, at least for a while rather than deleting them as soon as we detect that they have expired. Putting it all together we now have a system to gossip peer information around on top of the ping metrics.	2022-01-31 17:53:09 +01:00
Hector Sanjuan	592ce450ce	pintracker: RecoverAll should only return status for recovered items We call RecoverAll regularly and I noticed it was way slower than it should be. After all, it should just loop the pinset and enqueue items that are unexpectedly unpinned or in pin error. However, at some point we decided that RecoverAll would return information for all pins, regardless of whether they were recovered or not. This ends up resulting in a separate Status call for every pin that is already pinned, and this call hits IPFS. This is pretty bad with big pinsets. This commit fixes that: we return no state information for pins that are not touched.	2022-01-11 16:22:03 +01:00
Hector Sanjuan	26e229df94	Rename allocator/metrics to allocator/balanced	2021-10-06 11:26:38 +02:00
Hector Sanjuan	b6a46cd8a4	allocator: rework the whole allocator system The new "metrics" allocator is able to partition metrics and distribute allocations among the partitions. For example: given a region, an availability zone and free space on disk, the allocator would be able to choose allocations by distributing among regions and availability zones as much as possible, and for those peers in the same region/az, selecting those with most free space first. This requires a major overhaul of the allocator component.	2021-09-13 12:24:00 +02:00
Hector Sanjuan	ce2490c64f	Merge pull request #1389 from ipfs/fix/db-close-tests Fix: tests: close datastore on cluster node shutdown.	2021-07-06 14:05:36 +02:00
Hector Sanjuan	8ce98ceae3	Fix: tests: close datastore on cluster node shutdown. This resulted in too-many-files-open when running with leveldb.	2021-07-06 12:28:03 +02:00
Hector Sanjuan	edfcfa3fb0	Fix #1360: Efficient pinset status with filters This commit modifies the pintracker StatusAll call to take a status filter. This allows skipping a PinLs call to ipfs when checking status for items that are queued, pinning, unpinning or in error. Those statuses come directly from the operation tracker. This should result in a significant performance increase for those calls, particularly in nodes with several hundred thousand pins and more, where the call to IPFS is very expensive. A new TrackerStatusUnexpectedlyUnpinned status has been introduced to differentiate between pin errors (tracked by the operation tracker) and "lost" items (which before were pin errors too). This new status is handled by the Recover() operation as before.	2021-07-06 11:34:19 +02:00
Hector Sanjuan	099e23cbd1	Run tests also using leveldb backend	2021-06-11 18:43:54 +02:00
Hector Sanjuan	39a7f5fdad	alerts: fix TestClusterAlerts	2021-01-13 22:23:51 +01:00
Hector Sanjuan	4bcb91ee2b	Merge branch 'master' into feat/alerts	2021-01-13 21:08:49 +01:00
Hector Sanjuan	4125f12f52	chore: update dependencies	2020-09-02 12:18:16 +02:00
Hector Sanjuan	c026299b95	Include Name as GlobalPinInfo key and consolidate redundant keys GlobalPinInfo objects carried redundant information (Cid, Peer) that takes space and time to serialize. This has been addressed by having GlobalPinInfo embed PinInfoShort rather than PinInfo. This new type omits redundant fields.	2020-05-16 02:27:24 +02:00
Hector Sanjuan	fe4a5b32f4	tests: improve TestClustersPinDirect mode Check that a recursive pin cannot be repinned as direct	2020-04-23 18:28:16 +02:00
Hector Sanjuan	84eddf4ac8	Direct pins: fix add tests for direct pinning	2020-04-23 18:05:05 +02:00
Hector Sanjuan	74110b9b96	Tests: improve explicitness of expiry time update check	2020-04-16 11:02:33 +02:00
Hector Sanjuan	a6d8e00d20	Merge pull request #1065 from gargdeepak/fix/cluster/pinupdate Fixes #996 pin expiry is updated if set in options	2020-04-16 10:55:18 +02:00
Hector Sanjuan	7ffd18e41b	Feat: upgrade to dual DHT	2020-04-14 22:03:24 +02:00
deepakgarg	bf9a089ae3	Fixes #996 Rounding expiry time to 2s in TestClusterPingUpdate test	2020-04-14 12:45:34 -07:00
Hector Sanjuan	f83ff9b655	staticcheck: fix all staticcheck warnings in the project	2020-04-14 20:16:10 +02:00
deepakgarg	4aae7080a8	Fixes #996. Updated the TestClustersPinUpdate test	2020-04-10 23:41:48 -07:00
Hector Sanjuan	eba3246b98	chore: update deps: dht to cypress Add necessary validators. There is currently no way to run a dynamic-mode DHT that supports LAN-based peers. Therefore for the time-being we are setting the dht.Mode to "server".	2020-04-07 16:22:46 +02:00
Hector Sanjuan	bebd1e168c	cluster: re-add quic to defaults, but keep transport disabled This still works.	2020-04-02 16:12:45 +02:00
Hector Sanjuan	b3853caf36	Dependency upgrade: changes needed * Libp2p protectors no longer needed, use PSK directly * Generate cluster 32-byte secret here (helper gone from pnet) * Switch to go-log/v2 in all places * DHT bootstrapping not needed. Adjust DHT options for tests. * Do not rely on disappeared CidToDsKey and DsKeyToCid functions from dshelp. * Disable QUIC (does not support private networks) * Fix tests: autodiscovery started working properly	2020-03-22 14:50:25 +01:00
Hector Sanjuan	531379b1d9	Feature: Support multiple listeners in configuration * add ipv6 listening addresses to the default config * ipfsproxy: support multiple listeners. Add default ipv6. * mm * restapi: support multiple listen addresses. enable ipv6 * cluster_config: format default listen addresses * commands: update for multiple listeners. Fix randomports for udp and ipv6. * ipfs-cluster-service: fix randomports test * multiple listeners: fix remaining tests * golint * Disable ipv6 in defaults It is not supported by docker by default. It is not supported in travis-CI build environments. User can enable it now manually. * proxy: disable ipv6 in test * ipfshttp: fix test Co-authored-by: @RubenKelevra <cyrond@gmail.com>	2020-02-28 11:16:16 -05:00
Kishan Mohanbhai Sagathiya	a3b8767e87	Added tests for Alerts - tests for related cluster method, rest api, client method etc - clean expired alerts every time a new alert comes in	2019-12-23 12:42:38 +05:30
Kishan Sagathiya	5258a4d428	Remove map pintracker (#944) This removes mappintracker and sets stateless tracker as the default (and only) pintracker component. Because the stateless tracker matches the cluster state with only ongoing operations being kept in memory, and additional information provided by ipfs-pin-ls, syncing operations are not necessary. Therefore the Sync/SyncAll operations are removed cluster-wide.	2019-12-12 21:22:54 +01:00
Hector Sanjuan	cf0afdcbc3	Tests: re-enable quic listen address on tests	2019-12-07 12:26:55 +01:00
Kishan Sagathiya	30fd5ee6a0	Feat: Allow and run multiple informer components (#962) This introduces the possibility of running Cluster with multiple informer components. The first one in the list is the one used for "allocations". This is just groundwork for working with several informers in the future.	2019-12-05 15:08:43 +01:00
Hector Sanjuan	7b4d647267	Tests: multiple fixes, increase timings	2019-11-08 12:46:11 +01:00
Hector Sanjuan	9649664dbd	Run tests by default with crdt and not with raft	2019-11-07 20:19:34 +01:00
Hector Sanjuan	249d9007d2	Merge branch 'master' into feat/cluster-gc	2019-11-07 18:35:42 +01:00
Kishan Mohanbhai Sagathiya	4d8ef92b3d	health/graph: Improve graph Mark local, trusted peers. Add peernames. Improve display.	2019-11-07 10:47:29 +01:00
Hector Sanjuan	e34bddd3ab	Do not start Autonat when not used Avoid chaining options as not necessary. Add functions for common options and protector creation.	2019-11-05 12:51:18 +01:00
Hector Sanjuan	8d7ff58787	Use fixed version of go-libp2p-tls	2019-11-05 12:51:15 +01:00
Hector Sanjuan	669e75aefc	libp2p host: add secio as alternative, do not rewrap host Only use QUIC for tests, as TCP+TLS has proven very unreliable.	2019-11-05 12:50:46 +01:00
Kishan Mohanbhai Sagathiya	56ef75b50c	Use TLS instead of secio for security	2019-11-05 12:50:46 +01:00
Kishan Mohanbhai Sagathiya	ce85bfc745	Added support for QUIC - Cluster peers will now be able to dial and listen using QUIC - By default QUIC is enabled, to disable it remove QUIC listen address from service.json - This commit also adds a config option for whether to act as relay or not, EnableRelayHop	2019-11-05 12:50:46 +01:00
Kishan Mohanbhai Sagathiya	e4e1cbea6e	Fix #481: Pin expiration This adds a new PinOption: ExpireAt. The StateSync ticker will check and unpin expired pins from the Cluster. ipfs-cluster-ctl supports an "expire-in" which gives a duration.	2019-11-05 10:40:48 +01:00
Kishan Sagathiya	295915272b	Tests: multiple fixes to tests reliability (#943) This makes a number of fixes to improve the reliability of tests.	2019-10-31 21:51:13 +01:00
Kishan Mohanbhai Sagathiya	492b5612e7	Add ability to run Garbage Collector on all peers - cluster method, ipfs connector method, rpc and rest apis, command, etc for repo gc - Remove extra space from policy generator - Added special timeout for `/repo/gc` call to IPFS - Added `RepoGCLocal` cluster rpc method, which will be used to run gc on local IPFS daemon - Added peer name to the repo gc struct - Sorted with peer ids, while formatting(only affects cli results) - Special timeout setting where timeout gets checked from last update - Added `local` argument, which would run gc only on contacted peer	2019-10-22 11:13:19 +05:30
Hector Sanjuan	05be173241	Fix test program for go1.13	2019-10-04 20:01:40 +02:00
Hector Sanjuan	a98292bfa6	Merge pull request #893 from ipfs/feat/recover-all Pin recover on all peers	2019-09-23 13:13:48 -04:00
Hector Sanjuan	18dad223b2	Merge pull request #912 from ipfs/fix/allocations Fix: handling allocations	2019-09-09 10:22:14 +02:00
Kishan Mohanbhai Sagathiya	9cb1cdeaff	Merge branch 'master' into feat/recover-all	2019-09-08 17:06:43 +07:00
Hector Sanjuan	96752e4e58	Fix: handling allocations * pin() should not allocate if allocations are already provided * pin() should not skip pinning if the exact same pin exists * Additionally this was unreliable as it allocated it before so the pin may have existed but the allocations may have been artificially changed. * pin() re-uses existing pin when pin options are the same and thus avoids changing the allocations of a pin. As a side effect, this fixes re-allocations which were broken: peers called `shouldPeerRepinCid()` and instead of repinning that single cid proceeded to repin the full state. For every pin. Additionally tests have been adapted. It may be that some re-alloc tests were very unreliable for the problems above.	2019-09-06 17:56:00 +02:00


184 Commits