Commit Graph

26 Commits

Author SHA1 Message Date
Adrian Lanzafame
c4b76619c1
Add failure_threshold monitors config
License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-04-18 16:14:13 +10:00
Hector Sanjuan
6447ea51d2 Remove *Serial types. Use pointers for all types.
This takes advantange of the latest features in go-cid, peer.ID and
go-multiaddr and makes the Go types serializable by default.

This means we no longer need to copy between Pin <-> PinSerial, or ID <->
IDSerial etc. We can now efficiently binary-encode these types using short
field keys and without parsing/stringifying (in many cases it just a cast).

We still get the same json output as before (with minor modifications for
Cids).

This should greatly improve Cluster performance and memory usage when dealing
with large collections of items.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2019-02-27 17:04:35 +00:00
Robert Ignat
e38ceab222 Add ApplyEnvVars test to numpin config
License: MIT
Signed-off-by: Robert Ignat <robert.ignat91@gmail.com>
2019-02-18 17:46:15 +02:00
Robert Ignat
08580c3d5c Add ApplyEnvVars test to disk config
License: MIT
Signed-off-by: Robert Ignat <robert.ignat91@gmail.com>
2019-02-18 17:45:42 +02:00
Robert Ignat
168cf76224 Change ApplyEnvVars strategy for all config components
Get jsonConfig from Config, apply env vars to it, load jsonConfig
back into Config.

License: MIT
Signed-off-by: Robert Ignat <robert.ignat91@gmail.com>
2019-02-15 19:07:20 +02:00
Robert Ignat
032f02802f Implement ApplyEnvVars for all ComponentConfigs
License: MIT
Signed-off-by: Robert Ignat <robert.ignat91@gmail.com>
2019-02-08 23:57:16 +02:00
Robert Ignat
ed30ac1ab4 Add ApplyEnvVars() to ComponentConfig interface
* cluster and restapi configs can also get values from environment variables
* other config components don't read any values from the environment

License: MIT
Signed-off-by: Robert Ignat <robert.ignat91@gmail.com>
2019-02-07 20:51:20 +02:00
Adrian Lanzafame
3b3f786d68
add opencensus tracing and metrics
This commit adds support for OpenCensus tracing
and metrics collection. This required support for
context.Context propogation throughout the cluster
codebase, and in particular, the ipfscluster component
interfaces.

The tracing propogates across RPC and HTTP boundaries.
The current default tracing backend is Jaeger.

The metrics currently exports the metrics exposed by
the opencensus http plugin as well as the pprof metrics
to a prometheus endpoint for scraping.
The current default metrics backend is Prometheus.

Metrics are currently exposed by default due to low
overhead, can be turned off if desired, whereas tracing
is off by default as it has a much higher performance
overhead, though the extent of the performance hit can be
adjusted with smaller sampling rates.

License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-02-04 18:53:21 +10:00
Hector Sanjuan
7d16108751 Start using libp2p/go-libp2p-gorpc
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-10-17 15:28:03 +02:00
Hector Sanjuan
2ffa3d80de Fix 466: Hijack repo/stat in the proxy and return aggregates from the cluster.
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-08-20 20:43:27 +02:00
Hector Sanjuan
a9d6fe3479 Types: rename metric.SetTTLDuration to metric.SetTTL
GetTTL returns duration. SetTTL should take duration too, not seconds.
This removes the original SetTTL method which used seconds.

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-05-07 14:26:06 +02:00
Hector Sanjuan
a746255719 Fix tests
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-03-13 16:30:47 +01:00
Hector Sanjuan
c231fe8060 Update libp2p and deps to correct versions
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-03-13 15:32:56 +01:00
Hector Sanjuan
5831b251fe Issue #202: Fix mock informer
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-10-26 16:01:41 +02:00
Hector Sanjuan
00e871ddec Fix #202: Fix informers and allocators for 32-bit architectures 2017-10-26 16:01:41 +02:00
Hector Sanjuan
fb8fdb94c5 Issue #162: Improve Config.ToJSON() tests
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-10-20 10:42:41 +02:00
Hector Sanjuan
89364fd671 Issue #162: Add tests for informer Configs
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-10-19 21:09:03 +02:00
Hector Sanjuan
8f06baa1bf Issue #162: Rework configuration format
The following commit reimplements ipfs-cluster configuration under
the following premises:

  * Each component is initialized with a configuration object
  defined by its module
  * Each component decides how the JSON representation of its
  configuration looks like
  * Each component parses and validates its own configuration
  * Each component exposes its own defaults
  * Component configurations are make the sections of a
  central JSON configuration file (which replaces the current
  JSON format)
  * Component configurations implement a common interface
  (config.ComponentConfig) with a set of common operations
  * The central configuration file is managed by a
  config.ConfigManager which:
    * Registers ComponentConfigs
    * Assigns the correspondent sections from the JSON file to each
    component and delegates the parsing
    * Delegates the JSON generation for each section
    * Can be notified when the configuration is updated and must be
    saved to disk

The new service.json would then look as follows:

```json
{
  "cluster": {
    "id": "QmTVW8NoRxC5wBhV7WtAYtRn7itipEESfozWN5KmXUQnk2",
    "private_key": "<...>",
    "secret": "00224102ae6aaf94f2606abf69a0e278251ecc1d64815b617ff19d6d2841f786",
    "peers": [],
    "bootstrap": [],
    "leave_on_shutdown": false,
    "listen_multiaddress": "/ip4/0.0.0.0/tcp/9096",
    "state_sync_interval": "1m0s",
    "ipfs_sync_interval": "2m10s",
    "replication_factor": -1,
    "monitor_ping_interval": "15s"
  },
  "consensus": {
    "raft": {
      "heartbeat_timeout": "1s",
      "election_timeout": "1s",
      "commit_timeout": "50ms",
      "max_append_entries": 64,
      "trailing_logs": 10240,
      "snapshot_interval": "2m0s",
      "snapshot_threshold": 8192,
      "leader_lease_timeout": "500ms"
    }
  },
  "api": {
    "restapi": {
      "listen_multiaddress": "/ip4/127.0.0.1/tcp/9094",
      "read_timeout": "30s",
      "read_header_timeout": "5s",
      "write_timeout": "1m0s",
      "idle_timeout": "2m0s"
    }
  },
  "ipfs_connector": {
    "ipfshttp": {
      "proxy_listen_multiaddress": "/ip4/127.0.0.1/tcp/9095",
      "node_multiaddress": "/ip4/127.0.0.1/tcp/5001",
      "connect_swarms_delay": "7s",
      "proxy_read_timeout": "10m0s",
      "proxy_read_header_timeout": "5s",
      "proxy_write_timeout": "10m0s",
      "proxy_idle_timeout": "1m0s"
    }
  },
  "monitor": {
    "monbasic": {
      "check_interval": "15s"
    }
  },
  "informer": {
    "disk": {
      "metric_ttl": "30s",
      "metric_type": "freespace"
    },
    "numpin": {
      "metric_ttl": "10s"
    }
  }
}
```

This new format aims to be easily extensible per component. As such,
it already surfaces quite a few new options which were hardcoded
before.

Additionally, since Go API have changed, some redundant methods have been
removed and small refactoring has happened to take advantage of the new
way.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-10-18 00:00:12 +02:00
dgrisham
b1356cd33a removing name arg from NewInformer 2017-09-01 15:02:15 -06:00
dgrisham
456603f7a8 adding error output to NewInformerWithMetric 2017-09-01 09:48:15 -06:00
dgrisham
407fd9f68a Informer impl refactored; SortNumeric added for allocators. 2017-08-28 08:51:01 -06:00
dgrisham
2d2a9da793 FreeSpace metric impl (including descendalloc) refactored and tested. 2017-08-11 13:57:42 -06:00
dgrisham
a46ab3dda9 freespace metric partial impl 2017-08-03 11:24:19 -06:00
Hector Sanjuan
2bbbea79cc Issue #49: Add disk informer
The disk informer uses "ipfs repo stat" to fetch the RepoSize value and
uses it as a metric.

The numpinalloc allocator is now a generalized ascendalloc which
sorts metrics in ascending order and return the ones with lowest
values.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-03-27 20:40:49 +02:00
Hector Sanjuan
8e45ce64c2 Documentation: captain log, readme and golint fixes
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-02-15 15:46:51 +01:00
Hector Sanjuan
2512ecb701 Issue #41: Add Replication factor
New PeerManager, Allocator, Informer components have been added along
with a new "replication_factor" configuration option.

First, cluster peers collect and push metrics (Informer) to the Cluster
leader regularly. The Informer is an interface that can be implemented
in custom wayts to support custom metrics.

Second, on a pin operation, using the information from the collected metrics,
an Allocator can provide a list of preferences as to where the new pin
should be assigned. The Allocator is an interface allowing to provide
different allocation strategies.

Both Allocator and Informer are Cluster Componenets, and have access
to the RPC API.

The allocations are kept in the shared state. Cluster peer failure
detection is still missing and re-allocation is still missing, although
re-pinning something when a node is down/metrics missing does re-allocate
the pin somewhere else.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2017-02-14 19:13:08 +01:00