Commit Graph

61 Commits

Author SHA1 Message Date
Hector Sanjuan
1d98538411 Adders: stream blocks to destinations
This commit fixes #810 and adds block streaming to the final destinations when
adding. This should add major performance gains when adding data to clusters.

Before, everytime cluster issued a block, it was broadcasted individually to
all destinations (new libp2p stream), where it was block/put to IPFS (a single
block/put http roundtrip per block).

Now, blocks are streamed all the way from the adder module to the ipfs daemon,
by making every block as it arrives a single part in a multipart block/put
request.

Before, block-broadcast needed to wait for all destinations to finish in order
to process the next block. Now, buffers allow some destinations to be faster
than others while sending and receiving blocks.

Before, if a block put request failed to be broadcasted everywhere, an error
would happen at that moment.

Now, we keep streaming until the end and only then report any errors. The
operation succeeds as long as at least one stream finished successfully.

Errors block/putting to IPFS will not abort streams. Instead, subsequent
blocks are retried with a new request, although the method will return an
error when the stream finishes if there were errors at any point.
2022-03-24 17:24:58 +01:00
Hector Sanjuan
0d73d33ef5 Pintracker: streaming methods
This commit continues the work of taking advantage of the streaming
capabilities in go-libp2p-gorpc by improving the ipfsconnector and pintracker
components.

StatusAll and RecoverAll methods are now streaming methods, with the REST API
output changing accordingly to produce a stream of GlobalPinInfos rather than
a json array.

pin/ls request to the ipfs daemon now use ?stream=true and avoid having to
load the full pinset map on memory. StatusAllLocal and RecoverAllLocal
requests to the pin tracker stream all the way and no longer store the full
pinset, and the full PinInfo status slice before sending it out.

We have additionally switched to a pattern where streaming methods receive the
channel as an argument, allowing the caller to decide on whether to launch a
goroutine, do buffering etc.
2022-03-22 15:38:01 +01:00
Hector Sanjuan
9b9d76f92d Pinset streaming and method type revamp
This commit introduces the new go-libp2p-gorpc streaming capabilities for
Cluster. The main aim is to work towards heavily reducing memory usage when
working with very large pinsets.

As a side-effect, it takes the chance to revampt all types for all public
methods so that pointers to static what should be static objects are not used
anymore. This should heavily reduce heap allocations and GC activity.

The main change is that state.List now returns a channel from which to read
the pins, rather than pins being all loaded into a huge slice.

Things reading pins have been all updated to iterate on the channel rather
than on the slice. The full pinset is no longer fully loaded onto memory for
things that run regularly like StateSync().

Additionally, the /allocations endpoint of the rest API no longer returns an
array of pins, but rather streams json-encoded pin objects directly. This
change has extended to the restapi client (which puts pins into a channel as
they arrive) and to ipfs-cluster-ctl.

There are still pending improvements like StatusAll() calls which should also
stream responses, and specially BlockPut calls which should stream blocks
directly into IPFS on a single call.

These are coming up in future commits.
2022-03-19 03:02:55 +01:00
Hector Sanjuan
81d5759cc8 Fix #1366: IPFS Proxy API serializes wrong error types
The errors returned by the IPFS Proxy API are not understood by IPFS.

This was caused by go-ipfs-cmds setting an API error format which requires the
errors to have a type: "error" field.

This commit brings this up to speed.
2021-06-28 22:42:47 +02:00
Hector Sanjuan
a39ee13834 ipfsproxy: fix 'undefined cid' string appeareance in progress output
Fixes #1286. Some AddedOutput objects carry an undefined CID. This was getting
stringified as 'b', making the proxy responses include this rather than
skipping setting the field.
2021-01-13 16:32:08 +01:00
Hector Sanjuan
0dfa9ca185
(chore) Upgrade dependencies
Upgrade dependencies and bump to go1.15.
2020-08-27 14:10:58 +02:00
Kishan Mohanbhai Sagathiya
ae8e74453b Fix #937: Print full working configuration at startup
Only when using debug mode

Co-authored-by: Hector Sanjuan <code@hector.link>
2020-05-15 01:33:04 +02:00
Hector Sanjuan
6c60097421 ipfsproxy: support direct pin mode 2020-04-21 17:23:55 +02:00
Hector Sanjuan
717ed85823 gofmt -s fixes 2020-04-14 23:44:18 +02:00
Hector Sanjuan
f83ff9b655 staticcheck: fix all staticcheck warnings in the project 2020-04-14 20:16:10 +02:00
Hector Sanjuan
b3853caf36 Dependency ugprade: changes needed
* Libp2p protectors no longer needed, use PSK directly
* Generate cluster 32-byte secret here (helper gone from pnet)
* Switch to go-log/v2 in all places
* DHT bootstrapping not needed. Adjust DHT options for tests.
* Do not rely on dissappeared CidToDsKey and DsKeyToCid functions fro dshelp.
* Disable QUIC (does not support private networks)
* Fix tests: autodiscovery started working properly
2020-03-22 14:50:25 +01:00
Hector Sanjuan
531379b1d9
Feature: Support multiple listeners in configuration
* add ipv6 listening addresses to the default config

* ipfsproxy: support multiple listeners. Add default ipv6.

* mm

* restapi: support multiple listen addresses. enable ipv6

* cluster_config: format default listen addresses

* commands: update for multiple listeners. Fix randomports for udp and ipv6.

* ipfs-cluster-service: fix randomports test

* multiple listeners: fix remaining tests

* golint

* Disable ipv6 in defaults

It is not supported by docker by default. It is not supported in travis-CI
build environments. User can enable it now manually.

* proxy: disable ipv6 in test

* ipfshttp: fix test

Co-authored-by: @RubenKelevra <cyrond@gmail.com>
2020-02-28 11:16:16 -05:00
Kishan Sagathiya
e1faf12bae ipfsproxy: hijack repo/gc and trigger cluster-wide GC
This adds hijacking of the repo/gc endpoint to the proxy to do cluster-wide gc.
2019-12-06 13:08:57 +01:00
Kishan Mohanbhai Sagathiya
bc357400f6 API logs for IPFS Proxy 2019-09-10 13:59:03 +07:00
Hector Sanjuan
1eade4ae58 Fix #732: Introduce native pin/update
This introduces a pin/update operation which allows to Pin a new item to
cluster indicating that said pin is an update to an already-existing pin.

When this is the case, all the configuration for the existing pin is copied to
the new one (including allocations). The IPFS connector will then trigger
pin/update directly in IPFS, allowing an efficient pinning based on
DAG-differences. Since the allocations where the same for both pins,
the pin/update can proceed.

PinUpdate does not unpin the previous pin (it is not possible to do this
atomically in cluster like it happens in IPFS). The user can manually do it
after the pin/update is done.

Internally, after a lot of deliberations on what the optimal way for this is,
I opted for adding a `PinUpdate` option to the `PinOptions` type (carries the
CID to update from). In order to carry this option from the REST API to the
IPFS Connector, it is serialized in the Protobuf (and stored in the
datastore). There is no other way to do this in a simple fashion since the Pin
object is piece of information that is sent around.

Additionally, making it a PinOption plays well with the Pin/PinPath APIs which
need little changes. Effectively, you are pinning a new thing. You are just
indicating that it should be configured from an existing one.

Fixes #732
2019-08-09 16:11:52 +02:00
Hector Sanjuan
7c636061bd
Improve pin/unpin method signatures (#843)
* Improve pin/unpin method signatures:

These changes the following Cluster Go API methods:

* -> Cluster.Pin(ctx, cid, options) (pin, error)
* -> Cluster.Unpin(ctx, cid) (pin, error)
* -> Cluster.PinPath(ctx, path, opts) (pin,error)

Pin and Unpin now return the pinned object.

The signature of the methods now matches that of the API Client, is clearer as
to what options the user can set and is aligned with PinPath, UnpinPath, which
returned pin methods.

The REST API now returns the Pinned/Unpinned object rather than 204-Accepted.

This was necessary for a cleaner pin/update approach, which I'm working on in
another branch.

Most of the changes here are updating tests to the new signatures

* Adapt load-balancing client to new Pin/Unpin signatures

* cluster.go: Fix typo

Co-Authored-By: Kishan Sagathiya <kishansagathiya@gmail.com>

* cluster.go: Fix typo

Co-Authored-By: Kishan Sagathiya <kishansagathiya@gmail.com>
2019-07-22 15:39:11 +02:00
Hector Sanjuan
b804e61ef0 Update deps along with go-libp2p-core refactor
Lots of rewrites in imports...
2019-06-14 13:10:45 +02:00
Hector Sanjuan
99be078548 Fix: ipfsproxy: fix test failing with empty multiaddresses
This is a recent change in the multiaddress library to disallow
empty addresses.
2019-05-27 14:27:23 +02:00
Hector Sanjuan
a86c7cae2b rpc auth: handle some auth errors gracefully
particuarly we will ignore authorization errors for some broadcasts and somply
not include those responses in the assembled one.
2019-05-09 21:23:49 +02:00
Hector Sanjuan
3d49ac26a5 Feat: Split components into RPC Services
I had thought of this for a very long time but there were no compelling
reasons to do it. Specifying RPC endpoint permissions becomes however
significantly nicer if each Component is a different RPC Service. This also
fixes some naming issues like having to prefix methods with the component name
to separate them from methods named in the same way in some other component
(Pin and IPFSPin).
2019-05-04 21:36:10 +01:00
Hector Sanjuan
036e3da7f1 Proxy pin/update: Respond with BadRequest when arguments missing 2019-05-02 10:32:13 +01:00
Hector Sanjuan
da24114ae0 Proxy: hijack pin/update
The IPFS pin/update endpoint takes two arguments and usually
unpins the first and pins the second. It is a bit more efficient
to do it in a single operation than two separate ones.

This will make the proxy endpoint hijack pin/update requests.

First, the FROM pin is fetched from the state. If present, we
set the options (replication factors, actual allocations) from
that pin to the new one. Then we pin the TO item and proceed
to unpin the FROM item when `unpin` is not false.

We need to support path resolving, just like IPFS, therefore
it was necessary to expose IPFSResolve() via RPC.
2019-04-29 16:36:40 +02:00
Hector Sanjuan
acbd7fda60 Consensus: add new "crdt" consensus component
This adds a new "crdt" consensus component using go-ds-crdt.

This implies several refactors to fully make cluster consensus-component
independent:

* Delete mapstate and fully adopt dsstate (after people have migrated).
* Return errors from state methods rather than ignoring them.
* Add a new "datastore" modules so that we can configure datastores in the
   main configuration like other components.
* Let the consensus components fully define the "state.State". Thus, they do
not receive the state, they receive the storage where we put the state (a
go-datastore).
* Allow to customize how the monitor component obtains Peers() (the current
  peerset), including avoiding using the current peerset. At the moment the
  crdt consensus uses the monitoring component to define the current peerset.
  Therefore the monitor component cannot rely on the consensus component to
  produce a peerset.
* Re-factor/re-implementation of "ipfs-cluster-service state"
  operations. Includes the dissapearance of the "migrate" one.

The CRDT consensus component defines creates a crdt-datastore (with ipfs-lite)
and uses it to intitialize a dssate. Thus the crdt-store is elegantly
wrapped. Any modifications to the state get automatically replicated to other
peers. We store all the CRDT DAG blocks in the local datastore.

The consensus components only expose a ReadOnly state, as any modifications to
the shared state should happen through them.

DHT and PubSub facilities must now be created outside of Cluster and passed in
so they can be re-used by different components.
2019-04-17 19:14:26 +02:00
Alexey Novikov
a586548e62 fix #636: review nitpicks
License: MIT
Signed-off-by: Alexey Novikov <alexey@novikov.io>
2019-03-11 05:25:26 +00:00
Alexey Novikov
53d624e701 fix #636: mitingate long header attack
License: MIT
Signed-off-by: Alexey Novikov <alexey@novikov.io>
2019-03-10 21:16:26 +00:00
Hector Sanjuan
0008f69162 Types: make AddedOutput carry a cid.Cid
This was a leftover. For consisency, all CIDs coming out of the API
should have the canonical cid form ( { "/": "cid" } ), otherwise
we are inconsistent.
2019-03-06 18:48:25 +00:00
Kishan Mohanbhai Sagathiya
27a59991f6 Hide extract_headers_path, extract_headers_ttl
proxy: "extract_headers_path" and "extract_headers_ttl" should be
hidden options in the config
2019-03-06 17:55:06 +05:30
Hector Sanjuan
23db807b87 ipfsproxy: use PinPath to match IPFS behaviour
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2019-03-04 15:54:34 +00:00
Hector Sanjuan
ea85cf7805 Rename "test.Test*" to "test.*" (test.TestCid1 -> test.Cid1)
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2019-02-27 20:19:10 +00:00
Hector Sanjuan
9df6344a07 Avoid using string testing CIDs and use cid.Cids directly
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2019-02-27 20:09:31 +00:00
Hector Sanjuan
c4b18cd5f6 Address issues from self-review
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2019-02-27 18:43:29 +00:00
Hector Sanjuan
6447ea51d2 Remove *Serial types. Use pointers for all types.
This takes advantange of the latest features in go-cid, peer.ID and
go-multiaddr and makes the Go types serializable by default.

This means we no longer need to copy between Pin <-> PinSerial, or ID <->
IDSerial etc. We can now efficiently binary-encode these types using short
field keys and without parsing/stringifying (in many cases it just a cast).

We still get the same json output as before (with minor modifications for
Cids).

This should greatly improve Cluster performance and memory usage when dealing
with large collections of items.

License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2019-02-27 17:04:35 +00:00
Hector Sanjuan
666e370987 ipfsproxy: Remove additional backwards compatibility things
License: MIT
Signed-off-by: Hector Sanjuan <hector@protocol.ai>
2019-02-20 16:14:22 +00:00
Robert Ignat
36ee0f84eb Add ApplyEnvVars test to ipfsproxy config
License: MIT
Signed-off-by: Robert Ignat <robert.ignat91@gmail.com>
2019-02-18 17:40:51 +02:00
Robert Ignat
168cf76224 Change ApplyEnvVars strategy for all config components
Get jsonConfig from Config, apply env vars to it, load jsonConfig
back into Config.

License: MIT
Signed-off-by: Robert Ignat <robert.ignat91@gmail.com>
2019-02-15 19:07:20 +02:00
Robert Ignat
da58eae2c9 Add ipfsproxy NodeHTTPS config field to ToJSON
License: MIT
Signed-off-by: Robert Ignat <robert.ignat91@gmail.com>
2019-02-11 18:08:55 +02:00
Robert Ignat
032f02802f Implement ApplyEnvVars for all ComponentConfigs
License: MIT
Signed-off-by: Robert Ignat <robert.ignat91@gmail.com>
2019-02-08 23:57:16 +02:00
Robert Ignat
ed30ac1ab4 Add ApplyEnvVars() to ComponentConfig interface
* cluster and restapi configs can also get values from environment variables
* other config components don't read any values from the environment

License: MIT
Signed-off-by: Robert Ignat <robert.ignat91@gmail.com>
2019-02-07 20:51:20 +02:00
Adrian Lanzafame
3b3f786d68
add opencensus tracing and metrics
This commit adds support for OpenCensus tracing
and metrics collection. This required support for
context.Context propogation throughout the cluster
codebase, and in particular, the ipfscluster component
interfaces.

The tracing propogates across RPC and HTTP boundaries.
The current default tracing backend is Jaeger.

The metrics currently exports the metrics exposed by
the opencensus http plugin as well as the pprof metrics
to a prometheus endpoint for scraping.
The current default metrics backend is Prometheus.

Metrics are currently exposed by default due to low
overhead, can be turned off if desired, whereas tracing
is off by default as it has a much higher performance
overhead, though the extent of the performance hit can be
adjusted with smaller sampling rates.

License: MIT
Signed-off-by: Adrian Lanzafame <adrianlanzafame92@gmail.com>
2019-02-04 18:53:21 +10:00
Hector Sanjuan
6bda1e340b ipfsproxy: fix typos in comments
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2019-01-11 13:36:56 +01:00
Hector Sanjuan
80bb66e029 ipfsproxy: fix tests for new configuration keys
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2019-01-11 13:34:03 +01:00
Hector Sanjuan
bfbc652713 Fix typos in comments
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2019-01-11 11:50:28 +01:00
Hector Sanjuan
2a1eb3c2f9 Fix #382: Add TTL for cached headers
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2019-01-11 11:36:44 +01:00
Hector Sanjuan
66525fe0c8 ipfsproxy: add ExtractHeaderTTL option to config
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2019-01-11 11:01:48 +01:00
Hector Sanjuan
a0185fac2a Fix #382 (again): A better strategy for handling proxy headers
This changes the current strategy to extract headers from the IPFS daemon to
use them for hijacked endpoints in the proxy. The ipfs daemon is a bit of a
mess and what we were doing is not really reliable, specially when it comes to
setting CORS headers right (which we were not doing).

The new approach is:

* For every hijacked request, make an OPTIONS request to the same path, with
the given Origin, to the IPFS daemon and extract some CORS headers from
that. Use those in the hijacked response

* Avoid hijacking OPTIONS request, they should always go through so the IPFS
daemon controls all the CORS-preflight things as it wants.

* Similar to before, have a only-once-triggered request to extract other
interesting or custom headers from a fixed IPFS endpoint.  This allows us to
have the proxy forward other custom headers and to catch
`Access-Control-Expose-Methods`. The difference is that the endpoint use for
this and the additional headers are configurable by the user (but with hidden
configuration options because this is quite exotic from regular usage).

Now the implementation:

* Replaced the standard Muxer with gorilla/mux (I have also taken the change
to update the gxed version to the latest tag). This gives us much better
matching control over routes and allows us to not handle OPTIONS requests.

* This allows also to remove the extractArgument code and have proper handlers
for the endpoints passing command arguments as the last segment of the URL. A
very simple handler that wraps the default ones can be used to extract the
argument from the url and put it in the query.  Overall much cleaner this way.

* No longer capture interesting headers from any random proxied request.  This
made things complicated with a wrapping handler. We will just trigger the one
request to do it when we need it.

* When preparing the headers for the hijacked responses:
  * Trigger the OPTIONS request and figure out which CORS things we should set
  * Set the additional headers (perhaps triggering a POST request to fetch them)
  * Set our own headers.

* Moved all the headers stuff to a new headers.go file.

* Added configuration options (hidden by default) to:
  * Customize the extract headers endpoint
  * Customize what additional headers are extracted
  * Use HTTPs when talking to the IPFS API
    * I haven't tested this, but I did not want to have hardcoded 'http://' urls
      around, as before.

* Added extra testing for this, and tested manually a lot comparing the
daemon original output with our hijacked endpoint outputs while looking
at the API traffic with ngrep and making sure the requets happen as expected.
Also tested with IPFS companion in FF and Chrome.

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2019-01-10 21:35:44 +01:00
Hector Sanjuan
159243d845 proxy: Check all headers to decide if we need to request
And do it by only loading them from the sync.Map once.

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-12-20 14:44:32 +01:00
Hector Sanjuan
3896858a83 proxy: Rename helper to ipfsHeadersKnown()
License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-12-20 14:24:06 +01:00
Hector Sanjuan
862c1eb3ea Fix #382: Extract headers from IPFS API requests & apply them to hijacked ones.
This commit makes the proxy extract useful fixed headers (like CORS) from
the IPFS daemon API responses and then apply them to the responses
from hijacked endpoints like /add or /repo/stat.

It does this by caching a list of headers from the first IPFS API
response which has them. If we have not performed any proxied request or
managed to obtain the headers we're interested in, this will try triggering a
request to "/api/v0/version" to obtain them first.

This should fix the issues with using Cluster proxy with IPFS Companion and
Chrome.

License: MIT
Signed-off-by: Hector Sanjuan <code@hector.link>
2018-12-18 16:05:12 +01:00
Kishan Sagathiya
3b86a94418 Put a note for deprecating json fields
Put a note for deprecating json fields that they are only here to
maintain compatibility and they will be removed in future

Start using env vars starting with `CLUSTER_IPFSPROXY`

License: MIT
Signed-off-by: Kishan Mohanbhai Sagathiya <kishansagathiya@gmail.com>
2018-12-17 08:24:17 +05:30
Kishan Sagathiya
8b33dbec03 Remove proxy_ and Proxy from proxy config
Remove proxy_ and Proxy from proxy config objects without breaking
compatibility with previous revisions

Fixes #616

License: MIT
Signed-off-by: Kishan Mohanbhai Sagathiya <kishansagathiya@gmail.com>
2018-12-13 00:21:21 +05:30