sv-dependency-builds: src/capnproto-0.6.0/doc/rpc.md annotate

annotate src/capnproto-0.6.0/doc/rpc.md @ 169:223a55898ab9 tip default

Add null config files

author	Chris Cannam <cannam@all-day-breakfast.com>
date	Mon, 02 Mar 2020 14:03:47 +0000
parents	45360b968bf4
children

rev	line source
cannam@147	1 ---
cannam@147	2 layout: page
cannam@147	3 title: RPC Protocol
cannam@147	4 ---
cannam@147	5
cannam@147	6 # RPC Protocol
cannam@147	7
cannam@147	8 ## Introduction
cannam@147	9
cannam@147	10 ### Time Travel! _(Promise Pipelining)_
cannam@147	11
cannam@147	12 <img src='images/time-travel.png' style='max-width:639px'>
cannam@147	13
cannam@147	14 Cap'n Proto RPC employs TIME TRAVEL! The results of an RPC call are returned to the client
cannam@147	15 instantly, before the server even receives the initial request!
cannam@147	16
cannam@147	17 There is, of course, a catch: The results can only be used as part of a new request sent to the
cannam@147	18 same server. If you want to use the results for anything else, you must wait.
cannam@147	19
cannam@147	20 This is useful, however: Say that, as in the picture, you want to call `foo()`, then call `bar()`
cannam@147	21 on its result, i.e. `bar(foo())`. Or -- as is very common in object-oriented programming -- you
cannam@147	22 want to call a method on the result of another call, i.e. `foo().bar()`. With any traditional RPC
cannam@147	23 system, this will require two network round trips. With Cap'n Proto, it takes only one. In fact,
cannam@147	24 you can chain any number of such calls together -- with diamond dependencies and everything -- and
cannam@147	25 Cap'n Proto will collapse them all into one round trip.
cannam@147	26
cannam@147	27 By now you can probably imagine how it works: if you execute `bar(foo())`, the client sends two
cannam@147	28 messages to the server, one saying "Please execute foo()", and a second saying "Please execute
cannam@147	29 bar() on the result of the first call". These messages can be sent together -- there's no need
cannam@147	30 to wait for the first call to actually return.
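
The two-message exchange described above can be sketched as a toy simulation. This is not the Cap'n Proto wire format or API: the `Call` and `Server` classes and the `on_result_of` field are hypothetical names invented for illustration. The point is only that the second message refers to the first call's answer by number, so both can travel in one batch.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Call:
    """One request message (hypothetical shape, for illustration only).

    `on_result_of` is None for a plain call, or the question_id of an earlier
    call whose not-yet-returned result this call should operate on.
    """
    question_id: int
    method: str
    on_result_of: Optional[int] = None


class Server:
    """Toy server: runs a batch of calls in order, resolving pipelined references."""

    def __init__(self, methods: Dict[str, Callable[[Any], Any]]):
        self.methods = methods
        self.round_trips = 0

    def handle_batch(self, batch: List[Call]) -> Dict[int, Any]:
        self.round_trips += 1  # the whole batch arrives in a single round trip
        answers: Dict[int, Any] = {}
        for call in batch:
            # A pipelined call names an earlier answer instead of a concrete value.
            operand = None if call.on_result_of is None else answers[call.on_result_of]
            answers[call.question_id] = self.methods[call.method](operand)
        return answers


server = Server({
    "foo": lambda _operand: "foo-result",
    "bar": lambda operand: f"bar({operand})",
})

# "Please execute foo()" and "please execute bar() on the result of call 1",
# sent together without waiting for call 1 to return.
answers = server.handle_batch([
    Call(question_id=1, method="foo"),
    Call(question_id=2, method="bar", on_result_of=1),
])
print(answers[2], server.round_trips)
```

In this model the client never sees the intermediate result of `foo()` before `bar()` runs; the server substitutes it when it processes the second message.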
cannam@147	31
cannam@147	32 To make programming to this model easy, in your code, each call returns a "promise". Promises
cannam@147	33 work much like JavaScript promises or promises/futures in other languages: the promise is returned
cannam@147	34 immediately, but you must later call `wait()` on it, or call `then()` to register an asynchronous
cannam@147	35 callback.
cannam@147	36
cannam@147	37 However, Cap'n Proto promises support an additional feature:
cannam@147	38 [pipelining](http://www.erights.org/elib/distrib/pipeline.html). The promise
cannam@147	39 actually has methods corresponding to whatever methods the final result would have, except that
cannam@147	40 these methods may only be used for the purpose of calling back to the server. Moreover, a
cannam@147	41 pipelined promise can be used in the parameters to another call without waiting.
cannam@147	42
cannam@147	43 _But isn't that just syntactic sugar?_
cannam@147	44
cannam@147	45 OK, fair enough. In a traditional RPC system, we might solve our problem by introducing a new
cannam@147	46 method `foobar()` which combines `foo()` and `bar()`. Now we've eliminated the round trip, without
cannam@147	47 inventing a whole new RPC protocol.
cannam@147	48
cannam@147	49 The problem is, this kind of arbitrary combining of orthogonal features quickly turns elegant
cannam@147	50 object-oriented protocols into ad-hoc messes.
cannam@147	51
cannam@147	52 For example, consider the following interface:
cannam@147	53
cannam@147	54 {% highlight capnp %}
cannam@147	55 # A happy, object-oriented interface!
cannam@147	56
cannam@147	57 interface Node {}
cannam@147	58
cannam@147	59 interface Directory extends(Node) {
cannam@147	60   list @0 () -> (list :List(Entry));
cannam@147	61   struct Entry {
cannam@147	62     name @0 :Text;
cannam@147	63     file @1 :Node;
cannam@147	64   }
cannam@147	65
cannam@147	66   create @1 (name :Text) -> (node :Node);
cannam@147	67   open @2 (name :Text) -> (node :Node);
cannam@147	68   delete @3 (name :Text);
cannam@147	69   link @4 (name :Text, node :Node);
cannam@147	70 }
cannam@147	71
cannam@147	72 interface File extends(Node) {
cannam@147	73   size @0 () -> (size :UInt64);
cannam@147	74   read @1 (startAt :UInt64, amount :UInt64) -> (data :Data);
cannam@147	75   write @2 (startAt :UInt64, data :Data);
cannam@147	76   truncate @3 (size :UInt64);
cannam@147	77 }
cannam@147	78 {% endhighlight %}
cannam@147	79
cannam@147	80 This is a very clean interface for interacting with a file system. But say you are using this
cannam@147	81 interface over a satellite link with 1000ms latency. Now you have a problem: simply reading the
cannam@147	82 file `foo` in directory `bar` takes four round trips!
cannam@147	83
cannam@147	84 {% highlight python %}
cannam@147	85 # pseudocode
cannam@147	86 bar = root.open("bar")     # 1
cannam@147	87 foo = bar.open("foo")      # 2
cannam@147	88 size = foo.size()          # 3
cannam@147	89 data = foo.read(0, size)   # 4
cannam@147	90 # The above is four calls, each costing a round trip in a traditional
cannam@147	91 # RPC system, but takes only one network round trip with Cap'n Proto!
cannam@147	92 {% endhighlight %}
cannam@147	93
cannam@147	94 In such a high-latency scenario, making your interface elegant is simply not worth 4x the latency.
cannam@147	95 So now you're going to change it. You'll probably do something like:
cannam@147	96
cannam@147	97 * Introduce a notion of path strings, so that you can specify "bar/foo" rather than make two
cannam@147	98 separate calls.
cannam@147	99 * Merge the `File` and `Directory` interfaces into a single `Filesystem` interface, where every
cannam@147	100 call takes a path as an argument.
cannam@147	101
cannam@147	102 {% highlight capnp %}
cannam@147	103 # A sad, singleton-ish interface.
cannam@147	104
cannam@147	105 interface Filesystem {
cannam@147	106   list @0 (path :Text) -> (list :List(Text));
cannam@147	107   create @1 (path :Text, data :Data);
cannam@147	108   delete @2 (path :Text);
cannam@147	109   link @3 (path :Text, target :Text);
cannam@147	110
cannam@147	111   fileSize @4 (path :Text) -> (size :UInt64);
cannam@147	112   read @5 (path :Text, startAt :UInt64, amount :UInt64)
cannam@147	113       -> (data :Data);
cannam@147	114   readAll @6 (path :Text) -> (data :Data);
cannam@147	115   write @7 (path :Text, startAt :UInt64, data :Data);
cannam@147	116   truncate @8 (path :Text, size :UInt64);
cannam@147	117 }
cannam@147	118 {% endhighlight %}
cannam@147	119
cannam@147	120 We've now solved our latency problem... but at what cost?
cannam@147	121
cannam@147	122 * We now have to implement path string manipulation, which is always a headache.
cannam@147	123 * If someone wants to perform multiple operations on a file or directory, we now either have to
cannam@147	124 re-allocate resources for every call or we have to implement some sort of cache, which tends to
cannam@147	125 be complicated and error-prone.
cannam@147	126 * We can no longer give someone a specific `File` or a `Directory` -- we have to give them a
cannam@147	127 `Filesystem` and a path.
cannam@147	128 * But what if they are buggy and have hard-coded some path other than the one we specified?
cannam@147	129 * Or what if we don't trust them, and we really want them to access only one particular `File` or
cannam@147	130 `Directory` and not have permission to anything else? Now we have to implement authentication
cannam@147	131 and authorization systems! Arrgghh!
cannam@147	132
cannam@147	133 Essentially, in our quest to avoid latency, we've resorted to using a singleton-ish design, and
cannam@147	134 [singletons are evil](http://www.object-oriented-security.org/lets-argue/singletons).
cannam@147	135
cannam@147	136 Promise Pipelining solves all of this!
cannam@147	137
cannam@147	138 With pipelining, our 4-step example can be automatically reduced to a single round trip with no
cannam@147	139 need to change our interface at all. We keep our simple, elegant, singleton-free interface, we
cannam@147	140 don't have to implement path strings, caching, authentication, or authorization, and yet everything
cannam@147	141 performs as well as we can possibly hope for.
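
To make the claim concrete, here is a minimal sketch of how a client-side promise can queue calls instead of sending them one at a time. It is a toy model, not the Cap'n Proto API: `ToyServer`, `Client`, `Promise`, and the in-memory `Directory` and `File` classes are all hypothetical. The toy server simply substitutes earlier results for any promise it finds among the arguments; which kinds of results a real implementation lets you pipeline is outside the scope of this sketch.

```python
class Promise:
    """Placeholder for a result that has not come back yet."""

    def __init__(self, client, index):
        self._client = client
        self.index = index  # position of the call this promise stands for

    def __getattr__(self, method):
        # Calling a method on a promise queues a pipelined call; nothing is sent yet.
        def call(*args):
            return self._client.queue(self.index, method, args)
        return call

    def wait(self):
        # Only now is the queued batch sent to the server.
        return self._client.flush()[self.index]


class Client:
    def __init__(self, server):
        self.server = server
        self.pending = []  # queued (target index or None, method name, args)
        self.results = []

    def queue(self, target, method, args):
        self.pending.append((target, method, list(args)))
        return Promise(self, len(self.pending) - 1)

    def flush(self):
        if len(self.results) < len(self.pending):
            self.results = self.server.run(self.pending)
        return self.results


class ToyServer:
    """Stands in for the remote side and counts network round trips."""

    def __init__(self, root):
        self.root = root
        self.round_trips = 0

    def run(self, batch):
        self.round_trips += 1
        results = []
        for target, method, args in batch:
            obj = self.root if target is None else results[target]
            # Replace promise arguments with the results they stand for.
            args = [results[a.index] if isinstance(a, Promise) else a for a in args]
            results.append(getattr(obj, method)(*args))
        return results


class File:
    def __init__(self, data):
        self.data = data

    def size(self):
        return len(self.data)

    def read(self, start_at, amount):
        return self.data[start_at:start_at + amount]


class Directory:
    def __init__(self, entries):
        self.entries = entries

    def open(self, name):
        return self.entries[name]


server = ToyServer(Directory({"bar": Directory({"foo": File(b"hello")})}))
client = Client(server)

bar = client.queue(None, "open", ["bar"])  # 1
foo = bar.open("foo")                      # 2
size = foo.size()                          # 3
data = foo.read(0, size)                   # 4
contents = data.wait()                     # the only point where we block
print(contents, server.round_trips)
```

The interface objects stay exactly as object-oriented as before; only the client-side plumbing changes.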
cannam@147	142
cannam@147	143 #### Example code
cannam@147	144
cannam@147	145 [The calculator example](https://github.com/sandstorm-io/capnproto/blob/master/c++/samples/calculator-client.c++)
cannam@147	146 uses promise pipelining. Take a look at the client side in particular.
cannam@147	147
cannam@147	148 ### Distributed Objects
cannam@147	149
cannam@147	150 As you've noticed by now, Cap'n Proto RPC is a distributed object protocol. Interface references --
cannam@147	151 or, as we more commonly call them, capabilities -- are a first-class type. You can pass a
cannam@147	152 capability as a parameter to a method or embed it in a struct or list. This is a huge difference
cannam@147	153 from many modern RPC-over-HTTP protocols that only let you address global URLs, or other RPC
cannam@147	154 systems like Protocol Buffers and Thrift that only let you address singleton objects exported at
cannam@147	155 startup. The ability to dynamically introduce new objects and pass around references to them
cannam@147	156 allows you to use the same design patterns over the network that you use locally in object-oriented
cannam@147	157 programming languages. Many kinds of interactions become vastly easier to express given the
cannam@147	158 richer vocabulary.
cannam@147	159
cannam@147	160 _Didn't CORBA prove this doesn't work?_
cannam@147	161
cannam@147	162 No!
cannam@147	163
cannam@147	164 CORBA failed for many reasons, with the usual problems of design-by-committee being a big one.
cannam@147	165
cannam@147	166 However, the biggest reason for CORBA's failure is that it tried to make remote calls look the
cannam@147	167 same as local calls. Cap'n Proto does NOT do this -- remote calls have a different kind of API,
cannam@147	168 one that involves promises and accounts for the latency and unreliability that a network
cannam@147	169 introduces.
cannam@147	170
cannam@147	171 As shown above, promise pipelining is absolutely critical to making object-oriented interfaces work
cannam@147	172 in the presence of latency. If remote calls look the same as local calls, there is no opportunity
cannam@147	173 to introduce promise pipelining, and latency is inevitable. Any distributed object protocol which
cannam@147	174 does not support promise pipelining cannot -- and should not -- succeed. Thus the failure of CORBA
cannam@147	175 (and DCOM, etc.) was inevitable, but Cap'n Proto is different.
cannam@147	176
cannam@147	177 ### Handling disconnects
cannam@147	178
cannam@147	179 Networks are unreliable. Occasionally, connections will be lost. When this happens, all
cannam@147	180 capabilities (object references) served by the connection will become disconnected. Any further
cannam@147	181 calls addressed to these capabilities will throw "disconnected" exceptions. When this happens, the
cannam@147	182 client will need to create a new connection and try again. All Cap'n Proto applications with
cannam@147	183 long-running connections (and probably short-running ones too) should be prepared to catch
cannam@147	184 "disconnected" exceptions and respond appropriately.
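
A reconnect-and-retry wrapper along these lines might look as follows. This is a sketch under assumptions: `DisconnectedError` is a hypothetical stand-in for whatever exception a given implementation raises, and `connect` is assumed to return a fresh capability obtained over a new connection.

```python
import time


class DisconnectedError(Exception):
    """Hypothetical stand-in for a library's "disconnected" exception."""


def call_with_reconnect(connect, operation, max_attempts=3, delay_seconds=0.0):
    """Run operation(capability), opening a new connection after each disconnect."""
    last_error = None
    for attempt in range(max_attempts):
        try:
            capability = connect()  # capabilities from a dead connection are useless
            return operation(capability)
        except DisconnectedError as error:
            last_error = error
            time.sleep(delay_seconds * (attempt + 1))  # simple linear back-off
    raise last_error


# Demo: a fake service whose first two connections drop mid-call.
remaining_failures = {"count": 2}
connections = []


class FlakyService:
    def ping(self):
        if remaining_failures["count"] > 0:
            remaining_failures["count"] -= 1
            raise DisconnectedError("connection lost")
        return "pong"


def connect():
    connections.append(FlakyService())
    return connections[-1]


result = call_with_reconnect(connect, lambda service: service.ping())
print(result, len(connections))
```

Blind retries are only appropriate when repeating the operation is harmless; if the call may already have taken effect before the connection dropped, the application has to decide how to recover.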
cannam@147	185
cannam@147	186 On the server side, when all references to an object have been "dropped" (either because the
cannam@147	187 clients explicitly dropped them or because they became disconnected), the object will be closed
cannam@147	188 (in C++, the destructor is called; in GC'd languages, a `close()` method is called). This allows
cannam@147	189 servers to easily allocate per-client resources without having to clean up on a timeout or risk
cannam@147	190 leaking memory.
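
The bookkeeping behind this can be pictured with a small sketch. The `ExportTable` class below is hypothetical and simplified (it tracks at most one reference per connection), but it shows both paths to cleanup: an explicit drop and a lost connection.

```python
class ExportTable:
    """Toy server-side table of which connections hold a reference to each object."""

    def __init__(self):
        self._holders = {}  # object -> set of connection ids holding a reference

    def add_ref(self, obj, connection_id):
        self._holders.setdefault(obj, set()).add(connection_id)

    def drop_ref(self, obj, connection_id):
        holders = self._holders.get(obj)
        if holders is None:
            return
        holders.discard(connection_id)
        if not holders:
            # Last reference is gone: release the object's resources.
            del self._holders[obj]
            obj.close()

    def connection_lost(self, connection_id):
        # A disconnect implicitly drops every reference held over that connection.
        for obj in list(self._holders):
            self.drop_ref(obj, connection_id)


class Session:
    """Example per-client resource."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


table = ExportTable()
session = Session()
table.add_ref(session, "conn-A")
table.add_ref(session, "conn-B")

table.drop_ref(session, "conn-A")  # explicit drop; conn-B still holds a reference
still_open = not session.closed

table.connection_lost("conn-B")    # the disconnect drops the last reference
print(still_open, session.closed)
```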
cannam@147	191
cannam@147	192 ### Security
cannam@147	193
cannam@147	194 Cap'n Proto interface references are
cannam@147	195 [capabilities](http://en.wikipedia.org/wiki/Capability-based_security). That is, they both
cannam@147	196 designate an object to call and confer permission to call it. When a new object is created, only
cannam@147	197 the creator is initially able to call it. When the object is passed over a network connection,
cannam@147	198 the receiver gains permission to make calls -- but no one else does. In fact, it is impossible
cannam@147	199 for others to access the capability without consent of either the host or the receiver because
cannam@147	200 the host only assigns it an ID specific to the connection over which it was sent.
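
The effect of connection-specific IDs can be illustrated with a toy model. The `Connection` class here is hypothetical and ignores everything about the real protocol except the one property described above: an ID only has meaning on the connection it was issued for.

```python
class Connection:
    """Toy connection with its own export table."""

    def __init__(self):
        self._exports = {}
        self._next_id = 0

    def export(self, obj):
        # The ID is just an index into this connection's table.
        export_id = self._next_id
        self._next_id += 1
        self._exports[export_id] = obj
        return export_id

    def call(self, export_id, method, *args):
        if export_id not in self._exports:
            raise PermissionError(f"no capability {export_id} on this connection")
        return getattr(self._exports[export_id], method)(*args)


class Secret:
    def reveal(self):
        return "42"


alice = Connection()    # the host's connection to Alice
mallory = Connection()  # a separate connection to someone else

cap_id = alice.export(Secret())
revealed = alice.call(cap_id, "reveal")

try:
    mallory.call(cap_id, "reveal")  # same number, different connection
    denied = False
except PermissionError:
    denied = True

print(revealed, denied)
```

Because the ID is scoped to a connection, it does not need to be secret or hard to guess: knowing the number gives a third party nothing to send it to.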
cannam@147	201
cannam@147	202 Capability-based design patterns -- which largely boil down to object-oriented design patterns --
cannam@147	203 work great with Cap'n Proto. Such patterns tend to be much more adaptable than traditional
cannam@147	204 ACL-based security, making it easy to keep security tight and avoid confused-deputy attacks while
cannam@147	205 minimizing pain for legitimate users. That said, you can of course implement ACLs or any other
cannam@147	206 pattern on top of capabilities.
cannam@147	207
cannam@147	208 For an extended discussion of what capabilities are and why they are often easier and more powerful
cannam@147	209 than ACLs, see Mark Miller's
cannam@147	210 ["An Ode to the Granovetter Diagram"](http://www.erights.org/elib/capability/ode/index.html) and
cannam@147	211 [Capability Myths Demolished](http://zesty.ca/capmyths/usenix.pdf).
cannam@147	212
cannam@147	213 ## Protocol Features
cannam@147	214
cannam@147	215 Cap'n Proto's RPC protocol has the following notable features. Since the protocol is complicated,
cannam@147	216 the feature set has been divided into numbered "levels", so that implementations may declare which
cannam@147	217 features they have covered by advertising a level number.
cannam@147	218
cannam@147	219 * Level 1: Object references and promise pipelining, as described above.
cannam@147	220 * Level 2: Persistent capabilities. You may request to "save" a capability, receiving a
cannam@147	221 persistent token which can be used to "restore" it in the future (on a new connection). Not
cannam@147	222 all capabilities can be saved; the host app must implement support for it. Building this into
cannam@147	223 the protocol makes it possible for a Cap'n-Proto-based data store to transparently save
cannam@147	224 structures containing capabilities without knowledge of the particular capability types or the
cannam@147	225 application built on them, as well as potentially enabling more powerful analysis and
cannam@147	226 visualization of stored data.
cannam@147	227 * Level 3: Three-way interactions. A network of Cap'n Proto vats (nodes) can pass object
cannam@147	228 references to each other and automatically form direct connections as needed. For instance, if
cannam@147	229 Alice (on machine A) sends Bob (on machine B) a reference to Carol (on machine C), then machine B
cannam@147	230 will form a new connection to machine C so that Bob can call Carol directly without proxying
cannam@147	231 through machine A.
cannam@147	232 * Level 4: Reference equality / joining. If you receive a set of capabilities from different
cannam@147	233 parties which should all point to the same underlying object, you can verify securely that they
cannam@147	234 in fact do. This is subtle, but enables many security patterns that rely on one party being able
cannam@147	235 to verify that two or more other parties agree on something (imagine a digital escrow agent).
cannam@147	236 See [E's page on equality](http://erights.org/elib/equality/index.html).
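
As a rough picture of the Level 2 save/restore cycle, consider the sketch below. Everything in it is an assumption made for illustration: the `PersistentStore` class, the `persistable` flag, and the choice of a random unguessable string as the token are not taken from the protocol specification, and a real host would write object state to durable storage rather than keep it in memory.

```python
import secrets


class PersistentStore:
    """Toy host-side registry mapping tokens to saved objects."""

    def __init__(self):
        self._saved = {}

    def save(self, obj):
        # Not all capabilities can be saved; the host app must opt in.
        if not getattr(obj, "persistable", False):
            raise ValueError("this capability does not support being saved")
        token = secrets.token_urlsafe(32)
        self._saved[token] = obj
        return token

    def restore(self, token):
        # Called later, typically on a new connection.
        return self._saved[token]


class Document:
    persistable = True

    def __init__(self, text):
        self.text = text


class LiveStream:
    persistable = False


store = PersistentStore()
token = store.save(Document("hello"))
restored = store.restore(token)

try:
    store.save(LiveStream())
    refused = False
except ValueError:
    refused = True

print(restored.text, refused)
```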
cannam@147	237
cannam@147	238 ## Encryption
cannam@147	239
cannam@147	240 At this time, Cap'n Proto does not specify an encryption scheme, but as it is a simple byte
cannam@147	241 stream protocol, it can easily be layered on top of SSL/TLS or other such protocols.
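
One way to do that layering, sketched with Python's standard `ssl` module, is to establish the TLS session first and then hand the resulting stream to the RPC layer. The function names, host, and port are hypothetical, and how the encrypted stream is passed to a particular Cap'n Proto implementation is not shown.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Default client settings: certificate and hostname verification enabled."""
    return ssl.create_default_context()


def open_tls_stream(host: str, port: int) -> ssl.SSLSocket:
    """Connect over TCP, then wrap the socket so every RPC byte travels inside TLS."""
    raw = socket.create_connection((host, port))
    try:
        return make_client_context().wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()  # do not leak the TCP socket if the handshake fails
        raise


context = make_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

The RPC layer then reads and writes the returned socket exactly as it would a plain TCP socket, which is what makes this kind of layering straightforward for a byte-stream protocol.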
cannam@147	242
cannam@147	243 ## Specification
cannam@147	244
cannam@147	245 The Cap'n Proto RPC protocol is defined in terms of Cap'n Proto serialization schemas. The
cannam@147	246 documentation is inline. See
cannam@147	247 [rpc.capnp](https://github.com/sandstorm-io/capnproto/blob/master/c++/src/capnp/rpc.capnp).
cannam@147	248
cannam@147	249 Cap'n Proto's RPC protocol is based heavily on
cannam@147	250 [CapTP](http://www.erights.org/elib/distrib/captp/index.html), the distributed capability protocol
cannam@147	251 used by the [E programming language](http://www.erights.org/index.html). Lots of useful material
cannam@147	252 for understanding capabilities can be found at those links.
cannam@147	253
cannam@147	254 The protocol is complex, but the functionality it supports is conceptually simple. Just as TCP
cannam@147	255 is a complex protocol that implements the simple concept of a byte stream, Cap'n Proto RPC is a
cannam@147	256 complex protocol that implements the simple concept of objects with callable methods.
