ipfs-cluster

Collective pinning and composition for IPFS.

WORK IN PROGRESS

DO NOT USE IN PRODUCTION

ipfs-cluster is a tool which groups a number of IPFS nodes together, allowing them to collectively perform operations such as pinning.

In order to do so IPFS Cluster nodes use a libp2p-based consensus algorithm (currently Raft) to agree on a log of operations and build a consistent state across the cluster. The state represents which objects should be pinned by which nodes.
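To make the idea of "a log of operations that builds a consistent state" concrete, here is a minimal sketch in Go. The names `LogOp`, `OpPin`, `OpUnpin` and `applyLog` are hypothetical and exist only for this illustration; they are not the types ipfs-cluster uses, and the sketch leaves out Raft itself entirely. It only shows why peers that replay the same agreed log end up with the same set of pins.

```go
package main

import (
	"fmt"
	"sort"
)

// OpType and LogOp are hypothetical names used only in this sketch.
type OpType int

const (
	OpPin OpType = iota
	OpUnpin
)

// LogOp is one entry of the agreed-upon operation log.
type LogOp struct {
	Type OpType
	CID  string
}

// applyLog replays the log in order and returns the resulting
// set of pinned CIDs as a sorted slice.
func applyLog(log []LogOp) []string {
	pinned := make(map[string]struct{})
	for _, op := range log {
		switch op.Type {
		case OpPin:
			pinned[op.CID] = struct{}{}
		case OpUnpin:
			delete(pinned, op.CID)
		}
	}
	cids := make([]string, 0, len(pinned))
	for cid := range pinned {
		cids = append(cids, cid)
	}
	sort.Strings(cids)
	return cids
}

func main() {
	log := []LogOp{
		{OpPin, "QmA"},
		{OpPin, "QmB"},
		{OpUnpin, "QmA"},
	}
	// Any peer replaying this same log computes the same state.
	fmt.Println(applyLog(log))
}
```

The consensus layer is what guarantees that every peer sees the same log in the same order; the replay step itself can then stay deterministic and simple.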

Additionally, cluster nodes act as a proxy/wrapper to the IPFS API, so they can be used as a regular node, with the difference that pin add, pin rm and pin ls requests are handled by the Cluster.
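The proxy behaviour described above can be pictured as a routing decision on the request path. The sketch below is an assumption about how such a check could look, not the code used by ipfs-cluster: `handledByCluster` is a hypothetical helper, and the paths are the pin endpoints of the IPFS daemon HTTP API (`/api/v0/pin/...`).

```go
package main

import (
	"fmt"
	"strings"
)

// handledByCluster is a hypothetical helper. It reports whether a
// request path would be answered by the cluster itself (pin add,
// pin rm, pin ls) instead of being forwarded to the IPFS daemon.
func handledByCluster(path string) bool {
	switch strings.TrimSuffix(path, "/") {
	case "/api/v0/pin/add", "/api/v0/pin/rm", "/api/v0/pin/ls":
		return true
	}
	return false
}

func main() {
	for _, p := range []string{"/api/v0/pin/add", "/api/v0/cat"} {
		if handledByCluster(p) {
			fmt.Println(p, "-> handled by the cluster")
		} else {
			fmt.Println(p, "-> forwarded to the IPFS daemon")
		}
	}
}
```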

IPFS Cluster provides a cluster-node application (ipfs-cluster-service), a Go API, an HTTP API and a command-line tool (ipfs-cluster-ctl).

Current functionality only allows pinning in all cluster peers, but more strategies (like setting a replication factor for each pin) will be developed.

Background

Since the start of IPFS it was clear that a tool to coordinate a number of different nodes (and the content they are supposed to store) would add great value to the IPFS ecosystem. Naïve approaches are possible, but they present some weaknesses, especially when dealing with error handling, recovery and the implementation of advanced pinning strategies.

ipfs-cluster aims to address these issues by providing an IPFS node wrapper which coordinates multiple cluster peers via a consensus algorithm. This ensures that the desired state of the system is always agreed upon and can be easily maintained by the cluster peers. Thus, every cluster node knows which content is tracked, can decide whether to ask IPFS to pin it and can react to contingencies like node reboots.

Maintainers and roadmap

This project is captained by @hsanjuan. See the captain's log for a written summary of current status and upcoming features. You can also check out the project's Roadmap for a high level overview of what's coming and the project's Waffle Board to see what issues are being worked on at the moment.

Install

To install ipfs-cluster-service and the ipfs-cluster-ctl tool, download this repository and run make as follows:

$ go get -u -d github.com/ipfs/ipfs-cluster
$ cd $GOPATH/src/github.com/ipfs/ipfs-cluster
$ make install

This will install ipfs-cluster-service and ipfs-cluster-ctl in your $GOPATH/bin folder.

Usage

ipfs-cluster-service

ipfs-cluster-service runs a cluster peer. Usage information can be obtained by running:

$ ipfs-cluster-service -h

Before running ipfs-cluster-service for the first time, initialize a configuration file with:

$ ipfs-cluster-service init

The configuration will be placed in ~/.ipfs-cluster/service.json by default.

You can add the multiaddresses of the other cluster peers to the cluster_peers variable. For example, here is a valid configuration for a cluster of 4 peers (this peer plus the three listed):

{
    "id": "QmSGCzHkz8gC9fNndMtaCZdf9RFtwtbTEEsGo4zkVfcykD",
    "private_key" : "<redacted>",
    "cluster_peers" : [
          "/ip4/192.168.1.2/tcp/9096/ipfs/QmcQ5XvrSQ4DouNkQyQtEoLczbMr6D9bSenGy6WQUCQUBt",
          "/ip4/192.168.1.3/tcp/9096/ipfs/QmdFBMf9HMDH3eCWrc1U11YCPenC3Uvy9mZQ2BedTyKTDf",
          "/ip4/192.168.1.4/tcp/9096/ipfs/QmYY1ggjoew5eFrvkenTR3F4uWqtkBkmgfJk8g9Qqcwy51"
          ],
    "cluster_multiaddress": "/ip4/0.0.0.0/tcp/9096",
    "api_listen_multiaddress": "/ip4/127.0.0.1/tcp/9094",
    "ipfs_proxy_listen_multiaddress": "/ip4/127.0.0.1/tcp/9095",
    "ipfs_node_multiaddress": "/ip4/127.0.0.1/tcp/5001",
    "consensus_data_folder": "/home/user/.ipfs-cluster/data"
}

The configuration file should probably be identical among all cluster peers, except for the id and private_key fields. To facilitate configuration, cluster_peers may include its own address, but it will be removed on boot. For additional information about the configuration format, see the JSONConfig documentation.
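As an illustration of the "own address is removed on boot" behaviour, the following Go sketch reads the `id` and `cluster_peers` fields from a configuration like the one above and drops any multiaddress that ends in the peer's own ID. `serviceConfig`, `parseConfig` and `otherPeers` are hypothetical names for this example, the `192.168.1.1` address for the local peer is made up, and the real implementation may match addresses differently.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// serviceConfig mirrors only the two fields this sketch needs.
type serviceConfig struct {
	ID           string   `json:"id"`
	ClusterPeers []string `json:"cluster_peers"`
}

// parseConfig decodes the JSON configuration.
func parseConfig(raw []byte) (serviceConfig, error) {
	var cfg serviceConfig
	err := json.Unmarshal(raw, &cfg)
	return cfg, err
}

// otherPeers returns cluster_peers without entries whose final
// /ipfs/<id> component is this peer's own ID.
func otherPeers(cfg serviceConfig) []string {
	peers := make([]string, 0, len(cfg.ClusterPeers))
	for _, addr := range cfg.ClusterPeers {
		if strings.HasSuffix(addr, "/ipfs/"+cfg.ID) {
			continue // our own address: skip it
		}
		peers = append(peers, addr)
	}
	return peers
}

func main() {
	// The first entry is a made-up address for the local peer.
	raw := []byte(`{
  "id": "QmSGCzHkz8gC9fNndMtaCZdf9RFtwtbTEEsGo4zkVfcykD",
  "cluster_peers": [
    "/ip4/192.168.1.1/tcp/9096/ipfs/QmSGCzHkz8gC9fNndMtaCZdf9RFtwtbTEEsGo4zkVfcykD",
    "/ip4/192.168.1.2/tcp/9096/ipfs/QmcQ5XvrSQ4DouNkQyQtEoLczbMr6D9bSenGy6WQUCQUBt"
  ]
}`)
	cfg, err := parseConfig(raw)
	if err != nil {
		fmt.Println("invalid configuration:", err)
		return
	}
	for _, addr := range otherPeers(cfg) {
		fmt.Println(addr)
	}
}
```

Because of this filtering, the same cluster_peers list can be copied to every peer without editing it per host.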

Once every cluster peer has the configuration in place, you can run ipfs-cluster-service to start the cluster.

Debugging

ipfs-cluster-service offers two debugging options:

  • --debug enables debug logging from the ipfs-cluster, go-libp2p-raft and go-libp2p-rpc layers. This will be a very verbose log output, but at the same time it is the most informative.
  • --loglevel sets the log level ([error, warning, info, debug]) for ipfs-cluster only, giving an overview of what the cluster is doing. The default log level is info.

ipfs-cluster-ctl

ipfs-cluster-ctl is the client application to manage the cluster nodes and perform actions. ipfs-cluster-ctl uses the HTTP API provided by the nodes and it is completely separate from the cluster service. It can talk to any cluster peer (--host) and uses localhost by default.

After installing, you can run ipfs-cluster-ctl --help to display a general description and the available options, or ipfs-cluster-ctl help [cmd] to display information about a supported command.

In summary, it works as follows:

$ ipfs-cluster-ctl id                                                       # show cluster peer and ipfs daemon information
$ ipfs-cluster-ctl peers ls                                                 # list cluster peers
$ ipfs-cluster-ctl peers add /ip4/1.2.3.4/tcp/1234/ipfs/<peerid>            # add a new cluster peer
$ ipfs-cluster-ctl peers rm <peerid>                                        # remove a cluster peer
$ ipfs-cluster-ctl pin add Qma4Lid2T1F68E3Xa3CpE6vVJDLwxXLD8RfiB9g1Tmqp58   # pins a CID in the cluster
$ ipfs-cluster-ctl pin rm Qma4Lid2T1F68E3Xa3CpE6vVJDLwxXLD8RfiB9g1Tmqp58    # unpins a CID from the cluster
$ ipfs-cluster-ctl status                                                   # display tracked CIDs information
$ ipfs-cluster-ctl sync Qma4Lid2T1F68E3Xa3CpE6vVJDLwxXLD8RfiB9g1Tmqp58      # sync information from the IPFS daemon
$ ipfs-cluster-ctl recover Qma4Lid2T1F68E3Xa3CpE6vVJDLwxXLD8RfiB9g1Tmqp58   # attempt to re-pin/unpin CIDs in error state

Debugging

ipfs-cluster-ctl provides a --debug flag which lets you inspect request paths and raw response bodies.

Quick start: Building and updating an IPFS Cluster

Step 0: Run your first cluster node

This step creates a single-node IPFS Cluster.

First initialize the configuration:

node0> ipfs-cluster-service init
ipfs-cluster-service configuration written to /home/user/.ipfs-cluster/service.json

Then run the cluster peer:

node0> ipfs-cluster-service
12:38:34.470  INFO    cluster: IPFS Cluster v0.0.1 listening on: cluster.go:59
12:38:34.472  INFO    cluster:         /ip4/127.0.0.1/tcp/9096/ipfs/QmWM4knzVuWU5utXqkD2JeQ9zYT82f4s9s196bJ8PekStF cluster.go:61
12:38:34.472  INFO    cluster:         /ip4/192.168.1.58/tcp/9096/ipfs/QmWM4knzVuWU5utXqkD2JeQ9zYT82f4s9s196bJ8PekStF cluster.go:61
12:38:34.472  INFO    cluster: This is a single-node cluster peer_manager.go:141
12:38:34.569  INFO    cluster: starting Consensus and waiting for a leader... consensus.go:124
12:38:34.591  INFO    cluster: REST API: /ip4/127.0.0.1/tcp/9094 rest_api.go:309
12:38:34.591  INFO    cluster: IPFS Proxy: /ip4/127.0.0.1/tcp/9095 -> /ip4/127.0.0.1/tcp/5001 ipfs_http_connector.go:168
12:38:34.592  INFO    cluster: PinTracker ready map_pin_tracker.go:71
12:38:36.092  INFO    cluster: Raft Leader elected: QmWM4knzVuWU5utXqkD2JeQ9zYT82f4s9s196bJ8PekStF raft.go:146
12:38:36.092  INFO    cluster: Consensus state is up to date consensus.go:170
12:38:36.092  INFO    cluster: IPFS Cluster is ready cluster.go:526

Step 1: Add new members to the cluster

Initialize and run the cluster service on one or more other nodes:

node1> ipfs-cluster-service init
ipfs-cluster-service configuration written to /home/user/.ipfs-cluster/service.json
node1> ipfs-cluster-service
12:39:24.818  INFO    cluster: IPFS Cluster v0.0.1 listening on: cluster.go:59
12:39:24.819  INFO    cluster:         /ip4/127.0.0.1/tcp/9096/ipfs/QmQn6aaWJNvyZnLh4soqjXQXiXGZM8VTuZW4B6AGaWxeNK cluster.go:61
12:39:24.820  INFO    cluster:         /ip4/192.168.1.57/tcp/9096/ipfs/QmQn6aaWJNvyZnLh4soqjXQXiXGZM8VTuZW4B6AGaWxeNK cluster.go:61
12:39:24.820  INFO    cluster: This is a single-node cluster peer_manager.go:141
12:39:24.850  INFO    cluster: starting Consensus and waiting for a leader... consensus.go:124
12:39:24.876  INFO    cluster: IPFS Proxy: /ip4/127.0.0.1/tcp/9095 -> /ip4/127.0.0.1/tcp/5001 ipfs_http_connector.go:168
12:39:24.876  INFO    cluster: REST API: /ip4/127.0.0.1/tcp/9094 rest_api.go:309
12:39:24.876  INFO    cluster: PinTracker ready map_pin_tracker.go:71
12:39:26.877  INFO    cluster: Raft Leader elected: QmQn6aaWJNvyZnLh4soqjXQXiXGZM8VTuZW4B6AGaWxeNK raft.go:146
12:39:26.877  INFO    cluster: Consensus state is up to date consensus.go:170
12:39:26.878  INFO    service: IPFS Cluster is ready main.go:184

Add them to the original cluster with ipfs-cluster-ctl peers add <multiaddr>. The command will return the ID information of the newly added member:

node0> ipfs-cluster-ctl peers add /ip4/192.168.1.57/tcp/9096/ipfs/QmQn6aaWJNvyZnLh4soqjXQXiXGZM8VTuZW4B6AGaWxeNK
{
  "id": "QmQn6aaWJNvyZnLh4soqjXQXiXGZM8VTuZW4B6AGaWxeNK",
  "public_key": "CAASpgIwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQDQUN0iWAdbYEfQFAYcORsd0XnBvR9dk1QrJbzyqwDEHebP/wYqjeK73cyzBrpTYzxyd205ZSrpImL1GvVl+iLONMlz0CHsQ2YL0zzYHy55Y+1LhGGZY5R14MqvrjSq8pWo8U9nF8aenXSxhNvVeErnE5voVUU7YTjXSaXYmsK0cT7erKHZooJ16dzL/UmRTYlirMuGcOv/4WmgYX5fikH1mtw1Ln2xew76qxL5MeCu7v7NNugbsachJFiC/0DewxPClS03Nv6TvW2FsN4iis961EoBH7qTI3E1gUS89s7xp2njfgD/hsyk6YUbEEbOJUNihPFJ3Wlx6ogbis3cdX3tAgMBAAE=",
  "addresses": [
    "/ip4/127.0.0.1/tcp/9096/ipfs/QmQn6aaWJNvyZnLh4soqjXQXiXGZM8VTuZW4B6AGaWxeNK",
    "/ip4/192.168.1.57/tcp/9096/ipfs/QmQn6aaWJNvyZnLh4soqjXQXiXGZM8VTuZW4B6AGaWxeNK"
  ],
  "cluster_peers": [
    "/ip4/192.168.1.58/tcp/9096/ipfs/QmWM4knzVuWU5utXqkD2JeQ9zYT82f4s9s196bJ8PekStF"
  ],
  "version": "0.0.1",
  "commit": "",
  "rpc_protocol_version": "/ipfscluster/0.0.1/rpc",
  "ipfs": {
    "id": "QmRbn14eEDGEDf9d6mW64W6KsinkmMXqZaToWVbRANT8gY",
    "addresses": [
      "/ip4/127.0.0.1/tcp/4001/ipfs/QmRbn14eEDGEDf9d6mW64W6KsinkmMXqZaToWVbRANT8gY",
      "/ip4/192.168.1.57/tcp/4001/ipfs/QmRbn14eEDGEDf9d6mW64W6KsinkmMXqZaToWVbRANT8gY"
    ]
  }
}
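If you want to check the result from a program rather than by eye, the JSON printed above can be decoded with the standard library. The sketch below is only an assumption about how a caller might consume that output: `peerInfo`, `ipfsInfo`, `parsePeerInfo` and `knowsPeer` are hypothetical names, and only a subset of the fields shown above is mapped.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// ipfsInfo and peerInfo map a subset of the fields in the output above.
type ipfsInfo struct {
	ID        string   `json:"id"`
	Addresses []string `json:"addresses"`
}

type peerInfo struct {
	ID           string   `json:"id"`
	Addresses    []string `json:"addresses"`
	ClusterPeers []string `json:"cluster_peers"`
	Version      string   `json:"version"`
	IPFS         ipfsInfo `json:"ipfs"`
}

// parsePeerInfo decodes the ID information of a cluster peer.
func parsePeerInfo(raw []byte) (peerInfo, error) {
	var info peerInfo
	err := json.Unmarshal(raw, &info)
	return info, err
}

// knowsPeer reports whether peerID appears in the cluster_peers list.
func knowsPeer(info peerInfo, peerID string) bool {
	for _, addr := range info.ClusterPeers {
		if strings.HasSuffix(addr, "/ipfs/"+peerID) {
			return true
		}
	}
	return false
}

func main() {
	// Trimmed copy of the output shown above.
	raw := []byte(`{
  "id": "QmQn6aaWJNvyZnLh4soqjXQXiXGZM8VTuZW4B6AGaWxeNK",
  "cluster_peers": [
    "/ip4/192.168.1.58/tcp/9096/ipfs/QmWM4knzVuWU5utXqkD2JeQ9zYT82f4s9s196bJ8PekStF"
  ],
  "version": "0.0.1"
}`)
	info, err := parsePeerInfo(raw)
	if err != nil {
		fmt.Println("cannot parse output:", err)
		return
	}
	// The new member should list node0 among its cluster peers.
	fmt.Println(knowsPeer(info, "QmWM4knzVuWU5utXqkD2JeQ9zYT82f4s9s196bJ8PekStF"))
}
```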

You can repeat the process with any other nodes.

Step 2: Remove no longer needed nodes

You can use ipfs-cluster-ctl peers rm <peerid> to remove and disconnect any node from your cluster. Removed nodes shut down automatically. They can be restarted manually and re-added to the cluster at any time:

node0> ipfs-cluster-ctl peers rm QmQn6aaWJNvyZnLh4soqjXQXiXGZM8VTuZW4B6AGaWxeNK
OK

node1 is then disconnected and shuts down:

12:41:13.693 WARNI    cluster: this peer has been removed from the Cluster and will shutdown itself peer_manager.go:80
12:41:13.695  INFO    cluster: stopping Consensus component consensus.go:231
12:41:14.695  INFO    cluster: shutting down IPFS Cluster cluster.go:135
12:41:14.696  INFO    cluster: stopping Cluster API rest_api.go:327
12:41:14.696  INFO    cluster: stopping IPFS Proxy ipfs_http_connector.go:332
12:41:14.697  INFO    cluster: stopping MapPinTracker map_pin_tracker.go:87

Go

IPFS Cluster nodes can be launched directly from Go. The Cluster object provides methods to interact with the cluster and perform actions.

Documentation and examples on how to use IPFS Cluster from Go can be found in godoc.org/github.com/ipfs/ipfs-cluster.

API

TODO: Swagger

This is a quick summary of the API endpoints offered by the REST API component (these may change before 1.0):

| Method | Endpoint | Comment |
|--------|----------|---------|
| GET | /id | Cluster peer information |
| GET | /version | Cluster version |
| GET | /peers | Cluster peers |
| POST | /peers | Add new peer |
| DELETE | /peers/{peerID} | Remove a peer |
| GET | /pinlist | List of pins in the consensus state |
| GET | /pins | Status of all tracked CIDs |
| POST | /pins/sync | Sync all |
| GET | /pins/{cid} | Status of single CID |
| POST | /pins/{cid} | Pin CID |
| DELETE | /pins/{cid} | Unpin CID |
| POST | /pins/{cid}/sync | Sync CID |
| POST | /pins/{cid}/recover | Recover CID |
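
As a rough sketch of how a client could address these endpoints from Go, the helpers below build (but do not send) requests with the standard `net/http` package. The function names are hypothetical, and `apiBase` assumes plain HTTP on the `api_listen_multiaddress` from the example configuration (`/ip4/127.0.0.1/tcp/9094`). Request and response bodies are not documented here, so the sketch does not attempt to encode or decode them.

```go
package main

import (
	"fmt"
	"net/http"
)

// apiBase is an assumption: plain HTTP on the address given by
// api_listen_multiaddress in the example configuration.
const apiBase = "http://127.0.0.1:9094"

// newPinRequest builds the "POST /pins/{cid}" request.
func newPinRequest(cid string) (*http.Request, error) {
	return http.NewRequest(http.MethodPost, apiBase+"/pins/"+cid, nil)
}

// newUnpinRequest builds the "DELETE /pins/{cid}" request.
func newUnpinRequest(cid string) (*http.Request, error) {
	return http.NewRequest(http.MethodDelete, apiBase+"/pins/"+cid, nil)
}

// newRemovePeerRequest builds the "DELETE /peers/{peerID}" request.
func newRemovePeerRequest(peerID string) (*http.Request, error) {
	return http.NewRequest(http.MethodDelete, apiBase+"/peers/"+peerID, nil)
}

func main() {
	req, err := newPinRequest("Qma4Lid2T1F68E3Xa3CpE6vVJDLwxXLD8RfiB9g1Tmqp58")
	if err != nil {
		fmt.Println("cannot build request:", err)
		return
	}
	// Print what would be sent; use http.DefaultClient.Do(req) to send it.
	fmt.Println(req.Method, req.URL.String())
}
```

Since the endpoints may change before 1.0, keeping the method and path construction in small helpers like these limits how much client code has to change later.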

Contribute

PRs accepted.

Small note: If editing the README, please conform to the standard-readme specification.

License

MIT © Protocol Labs, Inc.