Hector Sanjuan
b6a46cd8a4
allocator: rework the whole allocator system
...
The new "metrics" allocator is able to partition metrics and distribute
allocations among the partitions.
For example: given a region, an availability zone and free space on disk, the
allocator would be able to choose allocations by distributing among regions
and availability zones as much as possible, and for those peers in the same
region/az, selecting those with most free space first.
This requires a major overhaul of the allocator component.
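The selection order described above can be sketched roughly as follows. This is an illustration only, not the allocator's actual code: the `peerInfo` type and the `partition`, `roundRobin` and `rank` helpers are hypothetical names, and the sketch assumes exactly three metrics (region, availability zone, free space) applied in that order.

```go
package main

import (
	"fmt"
	"sort"
)

// peerInfo is a hypothetical, flattened view of the metrics a peer reports.
type peerInfo struct {
	ID        string
	Region    string
	AZ        string
	FreeSpace uint64
}

// partition groups peers by key and returns the groups ordered by key name.
func partition(peers []peerInfo, key func(peerInfo) string) [][]peerInfo {
	groups := map[string][]peerInfo{}
	var keys []string
	for _, p := range peers {
		k := key(p)
		if _, ok := groups[k]; !ok {
			keys = append(keys, k)
		}
		groups[k] = append(groups[k], p)
	}
	sort.Strings(keys)
	out := make([][]peerInfo, 0, len(keys))
	for _, k := range keys {
		out = append(out, groups[k])
	}
	return out
}

// roundRobin interleaves the lists: first element of each, then second, etc.
func roundRobin(lists [][]peerInfo) []peerInfo {
	var out []peerInfo
	for i := 0; ; i++ {
		added := false
		for _, l := range lists {
			if i < len(l) {
				out = append(out, l[i])
				added = true
			}
		}
		if !added {
			return out
		}
	}
}

// rank orders candidates so that consecutive picks spread across regions,
// then across availability zones, preferring the most free space within a zone.
func rank(peers []peerInfo) []peerInfo {
	var perRegion [][]peerInfo
	for _, region := range partition(peers, func(p peerInfo) string { return p.Region }) {
		var perAZ [][]peerInfo
		for _, az := range partition(region, func(p peerInfo) string { return p.AZ }) {
			sort.SliceStable(az, func(i, j int) bool { return az[i].FreeSpace > az[j].FreeSpace })
			perAZ = append(perAZ, az)
		}
		perRegion = append(perRegion, roundRobin(perAZ))
	}
	return roundRobin(perRegion)
}

func main() {
	peers := []peerInfo{
		{"a", "eu", "az1", 100},
		{"b", "eu", "az1", 500},
		{"c", "eu", "az2", 50},
		{"d", "us", "az1", 10},
	}
	// Allocating to the first N ranked peers distributes across partitions.
	for _, p := range rank(peers) {
		fmt.Println(p.ID, p.Region, p.AZ, p.FreeSpace)
	}
}
```

Taking the first N entries of the ranked list gives an allocation that favours distinct regions first, then distinct zones, and only then free space.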
2021-09-13 12:24:00 +02:00
Hector Sanjuan
ecf287c8e6
Fix #1409: Better support of metrics from older peers
...
Before: receiving a metric from a peer <= 0.13.3 caused a decoding error in the logs.
Now: the metric is correctly parsed and a warning message is printed once.
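One common way to print a warning only once in Go is `sync.Once`. The sketch below assumes the warning is emitted once per process rather than once per peer, which the commit message does not specify; `legacyWarner` and its fields are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// legacyWarner (hypothetical) emits the "old peer" warning a single time,
// no matter how many legacy-format metrics arrive afterwards.
type legacyWarner struct {
	once  sync.Once
	count int // number of times the warning was actually printed
}

// warn is called every time a legacy-format metric is parsed.
func (w *legacyWarner) warn(peer string) {
	w.once.Do(func() {
		w.count++
		fmt.Printf("warning: metric in legacy format received from %s\n", peer)
	})
}

func main() {
	w := &legacyWarner{}
	w.warn("peerA") // prints the warning
	w.warn("peerB") // silent: the warning was already shown
	fmt.Println("warnings printed:", w.count)
}
```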
2021-08-12 00:03:05 +02:00
Hector Sanjuan
adb15feb6e
Dependency upgrades (#1395)
...
* build(deps): bump github.com/multiformats/go-multiaddr-dns
Bumps [github.com/multiformats/go-multiaddr-dns](https://github.com/multiformats/go-multiaddr-dns) from 0.2.0 to 0.3.1.
- [Release notes](https://github.com/multiformats/go-multiaddr-dns/releases)
- [Commits](https://github.com/multiformats/go-multiaddr-dns/compare/v0.2.0...v0.3.1)
Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
* build(deps): bump github.com/hashicorp/go-hclog from 0.15.0 to 0.16.0
Bumps [github.com/hashicorp/go-hclog](https://github.com/hashicorp/go-hclog) from 0.15.0 to 0.16.0.
- [Release notes](https://github.com/hashicorp/go-hclog/releases)
- [Commits](https://github.com/hashicorp/go-hclog/compare/v0.15.0...v0.16.0)
Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
* build(deps): bump github.com/ipfs/go-unixfs from 0.2.4 to 0.2.5
Bumps [github.com/ipfs/go-unixfs](https://github.com/ipfs/go-unixfs) from 0.2.4 to 0.2.5.
- [Release notes](https://github.com/ipfs/go-unixfs/releases)
- [Commits](https://github.com/ipfs/go-unixfs/compare/v0.2.4...v0.2.5)
Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
* build(deps): bump github.com/libp2p/go-libp2p-peerstore
Bumps [github.com/libp2p/go-libp2p-peerstore](https://github.com/libp2p/go-libp2p-peerstore) from 0.2.6 to 0.2.7.
- [Release notes](https://github.com/libp2p/go-libp2p-peerstore/releases)
- [Commits](https://github.com/libp2p/go-libp2p-peerstore/compare/v0.2.6...v0.2.7)
Signed-off-by: dependabot[bot] <support@github.com>
* build(deps): bump go.uber.org/multierr from 1.6.0 to 1.7.0
Bumps [go.uber.org/multierr](https://github.com/uber-go/multierr) from 1.6.0 to 1.7.0.
- [Release notes](https://github.com/uber-go/multierr/releases)
- [Changelog](https://github.com/uber-go/multierr/blob/master/CHANGELOG.md)
- [Commits](https://github.com/uber-go/multierr/compare/v1.6.0...v1.7.0)
Signed-off-by: dependabot[bot] <support@github.com>
* Chore: update deps
* Update changelog
* Update to go1.16. Downgrade unixfs.
* go mod tidy
* travis: use go install
* golint no more
* Update configuration for dependabot
* Fix wrong dependabot config
* dependabot
* Revert update of go-unixfs
* Dependency upgrades
* Bump github.com/libp2p/go-libp2p-gorpc from 0.1.2 to 0.1.3
Bumps [github.com/libp2p/go-libp2p-gorpc](https://github.com/libp2p/go-libp2p-gorpc) from 0.1.2 to 0.1.3.
- [Release notes](https://github.com/libp2p/go-libp2p-gorpc/releases)
- [Commits](https://github.com/libp2p/go-libp2p-gorpc/compare/v0.1.2...v0.1.3)
---
updated-dependencies:
- dependency-name: github.com/libp2p/go-libp2p-gorpc
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
* Fix deprecated objects with prometheus
* chore: update dependencies
* monitor: remove dependency to go-multicodec
go-multicodec has been deprecated and it was just a wrapper.
This switches directly to ugorji/go/codec's msgpack for cluster metrics
serialization.
* Upgrade mfs so it works with latest go-unixfs
* Bump github.com/ugorji/go/codec from 1.2.5 to 1.2.6 (#1391)
Bumps [github.com/ugorji/go/codec](https://github.com/ugorji/go) from 1.2.5 to 1.2.6.
- [Release notes](https://github.com/ugorji/go/releases)
- [Commits](https://github.com/ugorji/go/compare/v1.2.5...v1.2.6)
---
updated-dependencies:
- dependency-name: github.com/ugorji/go/codec
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Bump github.com/hashicorp/go-hclog from 0.16.0 to 0.16.1 (#1392)
Bumps [github.com/hashicorp/go-hclog](https://github.com/hashicorp/go-hclog) from 0.16.0 to 0.16.1.
- [Release notes](https://github.com/hashicorp/go-hclog/releases)
- [Commits](https://github.com/hashicorp/go-hclog/compare/v0.16.0...v0.16.1)
---
updated-dependencies:
- dependency-name: github.com/hashicorp/go-hclog
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>
2021-07-06 16:47:04 +02:00
Hector Sanjuan
c02130245a
Revert "monitor: remove dependency to go-multicodec (#1313)" (#1330)
...
This reverts commit fab2ed8149.
2021-03-23 23:59:36 +01:00
Hector Sanjuan
fab2ed8149
monitor: remove dependency to go-multicodec (#1313)
...
go-multicodec has been deprecated and it was just a wrapper.
This switches directly to ugorji/go/codec's msgpack for cluster metrics
serialization.
2021-02-19 09:59:17 +01:00
Hector Sanjuan
12756e0220
monitor: fix panic in checker
...
An alert might be triggered when no metrics for a peer have been received at all.
2021-01-14 00:04:40 +01:00
Hector Sanjuan
90208b45f9
health/alerts endpoint: brush up old PR
2021-01-13 22:09:21 +01:00
Hector Sanjuan
4bcb91ee2b
Merge branch 'master' into feat/alerts
2021-01-13 21:08:49 +01:00
Hector Sanjuan
0dfa9ca185
(chore) Upgrade dependencies
...
Upgrade dependencies and bump to go1.15.
2020-08-27 14:10:58 +02:00
Kishan Mohanbhai Sagathiya
ae8e74453b
Fix #937: Print full working configuration at startup
...
Only when using debug mode
Co-authored-by: Hector Sanjuan <code@hector.link>
2020-05-15 01:33:04 +02:00
Hector Sanjuan
b513ec194d
Fix some misspellings
2020-04-14 23:47:09 +02:00
Hector Sanjuan
f83ff9b655
staticcheck: fix all staticcheck warnings in the project
2020-04-14 20:16:10 +02:00
Hector Sanjuan
b3853caf36
Dependency upgrade: changes needed
...
* Libp2p protectors no longer needed, use PSK directly
* Generate cluster 32-byte secret here (helper gone from pnet)
* Switch to go-log/v2 in all places
* DHT bootstrapping not needed. Adjust DHT options for tests.
* Do not rely on disappeared CidToDsKey and DsKeyToCid functions from dshelp.
* Disable QUIC (does not support private networks)
* Fix tests: autodiscovery started working properly
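For the 32-byte secret item in the list above, a minimal sketch using only the standard library might look like this. The function name `generateSecret` is hypothetical, and printing the secret as hex is an assumption about how it would be displayed or stored.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// generateSecret (hypothetical name) returns 32 cryptographically random
// bytes, suitable for use as a pre-shared key for a private network.
func generateSecret() ([]byte, error) {
	secret := make([]byte, 32)
	if _, err := rand.Read(secret); err != nil {
		return nil, err
	}
	return secret, nil
}

func main() {
	secret, err := generateSecret()
	if err != nil {
		panic(err)
	}
	// Hex encoding yields a 64-character string.
	fmt.Println(hex.EncodeToString(secret))
}
```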
2020-03-22 14:50:25 +01:00
Kishan Mohanbhai Sagathiya
618ebd23f4
Check expiry in alert
2019-12-13 12:25:28 +05:30
Kishan Sagathiya
31534a429b
Fix #374: health metrics improvements
...
- Human-readable sizes for freespace metrics. Display whether a metric has
expired or, if not, when it expires, in a form like "expires in 3m".
- When no metric name is passed, `ipfs-cluster-ctl health metrics` hits
the metrics endpoint, which returns a list of available metrics, and
displays it to the user
- Humanize metrics output
- Sort metrics output
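The kind of output formatting listed above could be sketched as follows. This is not the code used by `ipfs-cluster-ctl`: the `metric` type, `humanSize` and `expiresIn` are hypothetical, binary (KiB/MiB) units are an assumption, and the expiry string uses Go's default duration format ("3m0s") rather than the shorter "3m" form mentioned in the commit message.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// metric is a hypothetical, simplified stand-in for a cluster metric.
type metric struct {
	Peer  string
	Value uint64        // free space in bytes
	TTL   time.Duration // time left before the metric expires
}

// humanSize renders a byte count using binary units.
func humanSize(b uint64) string {
	const unit = 1024
	if b < unit {
		return fmt.Sprintf("%d B", b)
	}
	div, exp := uint64(unit), 0
	for n := b / unit; n >= unit; n /= unit {
		div *= unit
		exp++
	}
	return fmt.Sprintf("%.1f %ciB", float64(b)/float64(div), "KMGTPE"[exp])
}

// expiresIn describes the remaining lifetime of a metric.
func expiresIn(ttl time.Duration) string {
	if ttl <= 0 {
		return "expired"
	}
	return "expires in " + ttl.Round(time.Second).String()
}

func main() {
	metrics := []metric{
		{"peerB", 1536, 3 * time.Minute},
		{"peerA", 5 << 30, -time.Second},
	}
	// Sort by peer name so the output is stable between runs.
	sort.Slice(metrics, func(i, j int) bool { return metrics[i].Peer < metrics[j].Peer })
	for _, m := range metrics {
		fmt.Printf("%s | freespace: %s | %s\n", m.Peer, humanSize(m.Value), expiresIn(m.TTL))
	}
}
```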
2019-10-24 16:37:26 +02:00
Kishan Mohanbhai Sagathiya
76857112b2
Test that expired PeerMetrics gets deleted
2019-09-13 08:01:15 +07:00
Hector Sanjuan
e240c2a19f
Simplify failed peer detection
2019-06-27 16:55:51 +01:00
Adrian Lanzafame
2255ba737b
fix ttl expiration check
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-06-25 12:54:41 +02:00
Hector Sanjuan
563a0da9ae
Do alert for all metric types
2019-06-23 10:14:29 +01:00
Adrian Lanzafame
27295c10ac
fix check failed
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-06-23 10:14:29 +01:00
Adrian Lanzafame
5e09da9d63
address pr feedback
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-06-23 10:14:29 +01:00
Adrian Lanzafame
e1b40d49c1
fix how accrual fd treats ttls
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-06-23 10:14:29 +01:00
Hector Sanjuan
b804e61ef0
Update deps along with go-libp2p-core refactor
...
Lots of rewrites in imports...
2019-06-14 13:10:45 +02:00
Hector Sanjuan
27368ab077
Fix: alert at most once PER METRIC
...
Before it would alert at most once per peer, which prevented some metrics
from alerting at all.
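The difference between the two behaviours comes down to the key used for deduplication. A minimal sketch, assuming a simple in-memory set of alerts already sent (the `alerter` type and its methods are hypothetical, not the monitor's real API):

```go
package main

import (
	"fmt"
	"sync"
)

// alertKey identifies an alert by peer AND metric name. Keying by peer alone
// would suppress alerts for every other metric of that peer.
type alertKey struct {
	Peer   string
	Metric string
}

// alerter (hypothetical) remembers which alerts have already been sent.
type alerter struct {
	mu    sync.Mutex
	fired map[alertKey]bool
}

func newAlerter() *alerter {
	return &alerter{fired: make(map[alertKey]bool)}
}

// shouldAlert reports true only the first time a (peer, metric) pair fails.
func (a *alerter) shouldAlert(peer, metric string) bool {
	a.mu.Lock()
	defer a.mu.Unlock()
	k := alertKey{Peer: peer, Metric: metric}
	if a.fired[k] {
		return false
	}
	a.fired[k] = true
	return true
}

// reset clears the record, e.g. once a fresh metric arrives again.
func (a *alerter) reset(peer, metric string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	delete(a.fired, alertKey{Peer: peer, Metric: metric})
}

func main() {
	a := newAlerter()
	fmt.Println(a.shouldAlert("peer1", "ping"))      // first failure of this metric
	fmt.Println(a.shouldAlert("peer1", "ping"))      // repeated, suppressed
	fmt.Println(a.shouldAlert("peer1", "freespace")) // different metric, still alerts
}
```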
2019-06-11 11:44:12 +02:00
Hector Sanjuan
a0d93fc62c
Change MaxAlertThreshold to 1
2019-06-11 10:54:12 +02:00
Adrian Lanzafame
14841e4e24
address pr feedback
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-06-11 10:54:12 +02:00
Adrian Lanzafame
7459917275
alerting for peers stops after one alert
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-06-11 10:54:12 +02:00
Hector Sanjuan
6caf78a57b
monitor config: make threshold optional in the configuration
...
takes default when not set.
2019-05-16 12:52:40 +02:00
Adrian Lanzafame
a763560e0c
extend the initial size of metrics distribution to 5
...
This prevents accrual failure detection from kicking in too
soon after a cluster has started.
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
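To illustrate why a minimum number of samples matters, here is a rough sketch of a phi accrual failure detector that refuses to raise suspicion until it has seen at least five inter-arrival intervals. It is a generic textbook-style version using a normal distribution and the standard library only; it is not the project's prob.go, and `minSamples`, `minStdDev`, `meanStd` and `phi` are hypothetical names.

```go
package main

import (
	"fmt"
	"math"
)

const (
	minSamples = 5    // below this, the distribution is considered unreliable
	minStdDev  = 0.01 // floor to avoid dividing by zero on identical samples
)

// meanStd returns the mean and population standard deviation of xs.
func meanStd(xs []float64) (mean, sd float64) {
	for _, x := range xs {
		mean += x
	}
	mean /= float64(len(xs))
	for _, x := range xs {
		sd += (x - mean) * (x - mean)
	}
	return mean, math.Sqrt(sd / float64(len(xs)))
}

// phi returns the suspicion level for a peer whose last metric arrived
// sinceLast seconds ago, given previously observed arrival intervals.
// Higher values mean the silence is less likely to be normal jitter.
func phi(sinceLast float64, intervals []float64) float64 {
	if len(intervals) < minSamples {
		return 0 // too early after startup to judge
	}
	mean, sd := meanStd(intervals)
	if sd < minStdDev {
		sd = minStdDev
	}
	// Probability that a metric would still arrive later than sinceLast.
	cdf := 0.5 * (1 + math.Erf((sinceLast-mean)/(sd*math.Sqrt2)))
	return -math.Log10(1 - cdf)
}

func main() {
	intervals := []float64{0.8, 0.9, 1.0, 1.1, 1.2}
	fmt.Println("few samples:", phi(10, intervals[:3]))
	fmt.Println("on time:    ", phi(1.0, intervals))
	fmt.Println("late:       ", phi(2.0, intervals))
}
```

A caller would compare the returned value against a configured threshold and treat the peer as failed once phi exceeds it.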
2019-05-07 19:07:11 +10:00
Adrian Lanzafame
9464759ae6
remove hard timeout limits and use only accrual failure detection
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-04-30 12:06:01 +10:00
Adrian Lanzafame
42693eb06d
fix passing ctx from daemon to pubsub
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-04-29 17:58:28 +10:00
Adrian Lanzafame
32ca9167d6
use accrual instead of metric expiration
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-04-26 17:58:29 +10:00
Adrian Lanzafame
3c09ebcc71
add Alerts measure
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-04-26 17:56:44 +10:00
Adrian Lanzafame
b0dbcbaa8d
add reference to original prob.go
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-04-26 12:20:31 +10:00
Adrian Lanzafame
d5ecd9ef6a
WIP
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-04-23 20:30:26 +10:00
Adrian Lanzafame
bf1b5eff90
comment config value
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-04-18 16:18:19 +10:00
Adrian Lanzafame
eae4329cb3
address pr feedback
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-04-18 16:18:19 +10:00
Adrian Lanzafame
31af640e33
use allocations list to choose peer to repin
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-04-18 16:16:40 +10:00
Adrian Lanzafame
1349e99c1e
fix time taken by tests
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-04-18 16:16:39 +10:00
Adrian Lanzafame
4338ea6905
refactor prob to use gonum and pass []float64
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-04-18 16:16:39 +10:00
Adrian Lanzafame
bcbe7b453f
refactor from big.Float to float64 and add prob tests
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-04-18 16:14:13 +10:00
Adrian Lanzafame
e187b800cf
rename TS to ReceivedAt
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-04-18 16:14:13 +10:00
Adrian Lanzafame
c4b76619c1
Add failure_threshold monitors config
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-04-18 16:14:13 +10:00
Adrian Lanzafame
3d6eb64db6
Add accrual failure detection method
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-04-18 16:14:13 +10:00
Adrian Lanzafame
13ed78786c
fix distribution test and general clean up
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-04-18 16:09:19 +10:00
Hector Sanjuan
4e61935379
Use defer for locks. Move to Prev() in All()
...
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2019-04-18 16:09:19 +10:00
Hector Sanjuan
da3c543ce2
Revert "attempt copying slice"
...
This reverts commit 0d4d40513fccd31b9cdc4db369aa87e87c529be4.
2019-04-18 16:09:19 +10:00
Adrian Lanzafame
46d6cb155d
attempt copying slice
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-04-18 16:09:19 +10:00
Adrian Lanzafame
2b1b8a4389
remove use of last
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-04-18 16:09:18 +10:00
Adrian Lanzafame
ebcf40cf7d
rename TS to ReceivedAt
...
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-04-18 16:09:18 +10:00