comparison src/capnproto-git-20161025/doc/rpc.md @ 133:1ac99bfc383d
Add Cap'n Proto source
author | Chris Cannam <cannam@all-day-breakfast.com> |
---|---|
date | Tue, 25 Oct 2016 11:17:01 +0100 |
132:42a73082be24 | 133:1ac99bfc383d |
---|---|
1 --- | |
2 layout: page | |
3 title: RPC Protocol | |
4 --- | |
5 | |
6 # RPC Protocol | |
7 | |
8 ## Introduction | |
9 | |
10 ### Time Travel! _(Promise Pipelining)_ | |
11 | |
12 <img src='images/time-travel.png' style='max-width:639px'> | |
13 | |
14 Cap'n Proto RPC employs TIME TRAVEL! The results of an RPC call are returned to the client | |
15 instantly, before the server even receives the initial request! | |
16 | |
17 There is, of course, a catch: The results can only be used as part of a new request sent to the | |
18 same server. If you want to use the results for anything else, you must wait. | |
19 | |
20 This is useful, however: Say that, as in the picture, you want to call `foo()`, then call `bar()` | |
21 on its result, i.e. `bar(foo())`. Or -- as is very common in object-oriented programming -- you | |
22 want to call a method on the result of another call, i.e. `foo().bar()`. With any traditional RPC | |
23 system, this will require two network round trips. With Cap'n Proto, it takes only one. In fact, | |
24 you can chain any number of such calls together -- with diamond dependencies and everything -- and | |
25 Cap'n Proto will collapse them all into one round trip. | |
26 | |
27 By now you can probably imagine how it works: if you execute `bar(foo())`, the client sends two | |
28 messages to the server, one saying "Please execute foo()", and a second saying "Please execute | |
29 bar() on the result of the first call". These messages can be sent together -- there's no need | |
30 to wait for the first call to actually return. | |
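
The exchange described above can be sketched as a toy simulation. This is a minimal Python sketch under stated assumptions, not Cap'n Proto's actual wire format: the message dictionaries, the `PromisedAnswer` placeholder class, and `serve_batch()` are illustrative names, and `foo()` and `bar()` are given arbitrary bodies so the outcome can be checked.

```python
# Toy model of promise pipelining at the message level (not the real wire format).

def foo():
    return 20           # arbitrary body, chosen only for illustration


def bar(x):
    return x + 1        # arbitrary body, chosen only for illustration


METHODS = {"foo": foo, "bar": bar}


class PromisedAnswer:
    """Placeholder meaning 'the result of the earlier call with this id'."""

    def __init__(self, question_id):
        self.question_id = question_id


def serve_batch(messages):
    """Server side: run every call in the batch, resolving placeholders locally."""
    answers = {}
    for msg in messages:
        args = [answers[a.question_id] if isinstance(a, PromisedAnswer) else a
                for a in msg["args"]]
        answers[msg["id"]] = METHODS[msg["method"]](*args)
    return answers


# Client side: both messages are built before anything is sent.
batch = [
    {"id": 0, "method": "foo", "args": []},
    {"id": 1, "method": "bar", "args": [PromisedAnswer(0)]},
]
answers = serve_batch(batch)    # one simulated round trip carries both calls
result = answers[1]             # value of bar(foo())
```

The second message never contains the value returned by `foo()`; it only names the earlier call, and the server substitutes the value once it is available.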
31 | |
32 To make programming to this model easy, in your code, each call returns a "promise". Promises | |
33 work much like JavaScript promises or promises/futures in other languages: the promise is returned | |
34 immediately, but you must later call `wait()` on it, or call `then()` to register an asynchronous | |
35 callback. | |
36 | |
37 However, Cap'n Proto promises support an additional feature: | |
38 [pipelining](http://www.erights.org/elib/distrib/pipeline.html). The promise | |
39 actually has methods corresponding to whatever methods the final result would have, except that | |
40 these methods may only be used for the purpose of calling back to the server. Moreover, a | |
41 pipelined promise can be used in the parameters to another call without waiting. | |
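
One way to picture a promise that exposes the methods of its eventual result is a proxy object that queues calls instead of performing them. The following is a minimal Python sketch, not the API of any Cap'n Proto library: `Connection`, `PipelinedPromise`, `Root`, and `Foo` are hypothetical names, and the "server" is simply a local object.

```python
class Connection:
    """Hypothetical transport that queues calls and counts round trips."""

    def __init__(self, root):
        self.root = root          # object the simulated server exposes
        self.queue = []           # calls not yet sent
        self.results = {}         # call id -> result, filled in by flush()
        self.next_id = 0
        self.round_trips = 0

    def flush(self):
        """Send all queued calls at once; the server resolves them in order."""
        if not self.queue:
            return
        self.round_trips += 1
        for call_id, target_id, method, args in self.queue:
            target = self.root if target_id is None else self.results[target_id]
            self.results[call_id] = getattr(target, method)(*args)
        self.queue.clear()


class PipelinedPromise:
    """Stands in for a result that has not arrived yet."""

    def __init__(self, conn, call_id=None):
        self._conn = conn
        self._call_id = call_id   # None designates the server's root object

    def __getattr__(self, method):
        # Only reached for names not defined on the class, i.e. remote methods.
        def call(*args):
            call_id = self._conn.next_id
            self._conn.next_id += 1
            self._conn.queue.append((call_id, self._call_id, method, args))
            return PipelinedPromise(self._conn, call_id)
        return call

    def wait(self):
        """Block until the result is available (here: flush the queue)."""
        self._conn.flush()
        return self._conn.results[self._call_id]


class Foo:
    def bar(self):
        return "bar result"


class Root:
    def foo(self):
        return Foo()


conn = Connection(Root())
root = PipelinedPromise(conn)
value = root.foo().bar().wait()   # two calls, one simulated round trip
```

Nothing is sent until `wait()` is called, so `foo().bar()` reaches the server as a single batch.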
42 | |
43 **_But isn't that just syntax sugar?_** | |
44 | |
45 OK, fair enough. In a traditional RPC system, we might solve our problem by introducing a new | |
46 method `foobar()` which combines `foo()` and `bar()`. Now we've eliminated the round trip, without | |
47 inventing a whole new RPC protocol. | |
48 | |
49 The problem is, this kind of arbitrary combining of orthogonal features quickly turns elegant | |
50 object-oriented protocols into ad-hoc messes. | |
51 | |
52 For example, consider the following interface: | |
53 | |
54 {% highlight capnp %} | |
55 # A happy, object-oriented interface! | |
56 | |
57 interface Node {} | |
58 | |
59 interface Directory extends(Node) { | |
60 list @0 () -> (list :List(Entry)); | |
61 struct Entry { | |
62 name @0 :Text; | |
63 file @1 :Node; | |
64 } | |
65 | |
66 create @1 (name :Text) -> (node :Node); | |
67 open @2 (name :Text) -> (node :Node); | |
68 delete @3 (name :Text); | |
69 link @4 (name :Text, node :Node); | |
70 } | |
71 | |
72 interface File extends(Node) { | |
73 size @0 () -> (size :UInt64); | |
74 read @1 (startAt :UInt64, amount :UInt64) -> (data :Data); | |
75 write @2 (startAt :UInt64, data :Data); | |
76 truncate @3 (size :UInt64); | |
77 } | |
78 {% endhighlight %} | |
79 | |
80 This is a very clean interface for interacting with a file system. But say you are using this | |
81 interface over a satellite link with 1000ms latency. Now you have a problem: simply reading the | |
82 file `foo` in directory `bar` takes four round trips! | |
83 | |
84 {% highlight python %} | |
85 # pseudocode | |
86 bar = root.open("bar"); # 1 | |
87 foo = bar.open("foo"); # 2 | |
88 size = foo.size(); # 3 | |
89 data = foo.read(0, size); # 4 | |
90 # The above is four calls but takes only one network | |
91 # round trip with Cap'n Proto! | |
92 {% endhighlight %} | |
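
As a rough illustration of the cost, the arithmetic for the four calls above can be written out. This sketch assumes the 1000ms figure is the cost of one full round trip and ignores server processing time; the variable names are illustrative only.

```python
ROUND_TRIP_MS = 1000  # assumption: the 1000ms latency is per round trip

calls = ["open bar", "open foo", "size", "read"]

# Without pipelining, each call waits for the previous result to come back.
sequential_ms = len(calls) * ROUND_TRIP_MS

# With pipelining, all four calls travel together in a single round trip.
pipelined_ms = 1 * ROUND_TRIP_MS

slowdown = sequential_ms // pipelined_ms
```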
93 | |
94 In such a high-latency scenario, making your interface elegant is simply not worth 4x the latency. | |
95 So now you're going to change it. You'll probably do something like: | |
96 | |
97 * Introduce a notion of path strings, so that you can specify "bar/foo" rather than make two | |
98 separate calls. | |
99 * Merge the `File` and `Directory` interfaces into a single `Filesystem` interface, where every | |
100 call takes a path as an argument. | |
101 | |
102 {% highlight capnp %} | |
103 # A sad, singleton-ish interface. | |
104 | |
105 interface Filesystem { | |
106 list @0 (path :Text) -> (list :List(Text)); | |
107 create @1 (path :Text, data :Data); | |
108 delete @2 (path :Text); | |
109 link @3 (path :Text, target :Text); | |
110 | |
111 fileSize @4 (path :Text) -> (size :UInt64); | |
112 read @5 (path :Text, startAt :UInt64, amount :UInt64) | |
113 -> (data :Data); | |
114 readAll @6 (path :Text) -> (data :Data); | |
115 write @7 (path :Text, startAt :UInt64, data :Data); | |
116 truncate @8 (path :Text, size :UInt64); | |
117 } | |
118 {% endhighlight %} | |
119 | |
120 We've now solved our latency problem... but at what cost? | |
121 | |
122 * We now have to implement path string manipulation, which is always a headache. | |
123 * If someone wants to perform multiple operations on a file or directory, we now either have to | |
124 re-allocate resources for every call or we have to implement some sort of cache, which tends to | |
125 be complicated and error-prone. | |
126 * We can no longer give someone a specific `File` or a `Directory` -- we have to give them a | |
127 `Filesystem` and a path. | |
128 * But what if they are buggy and have hard-coded some path other than the one we specified? | |
129 * Or what if we don't trust them, and we really want them to access only one particular `File` or | |
130 `Directory` and not have permission to anything else? Now we have to implement authentication | |
131 and authorization systems! Arrgghh! | |
132 | |
133 Essentially, in our quest to avoid latency, we've resorted to using a singleton-ish design, and | |
134 [singletons are evil](http://www.object-oriented-security.org/lets-argue/singletons). | |
135 | |
136 **Promise Pipelining solves all of this!** | |
137 | |
138 With pipelining, our 4-step example can be automatically reduced to a single round trip with no | |
139 need to change our interface at all. We keep our simple, elegant, singleton-free interface, we | |
140 don't have to implement path strings, caching, authentication, or authorization, and yet everything | |
141 performs as well as we can possibly hope for. | |
142 | |
143 #### Example code | |
144 | |
145 [The calculator example](https://github.com/sandstorm-io/capnproto/blob/master/c++/samples/calculator-client.c++) | |
146 uses promise pipelining. Take a look at the client side in particular. | |
147 | |
148 ### Distributed Objects | |
149 | |
150 As you've noticed by now, Cap'n Proto RPC is a distributed object protocol. Interface references -- | |
151 or, as we more commonly call them, capabilities -- are a first-class type. You can pass a | |
152 capability as a parameter to a method or embed it in a struct or list. This is a huge difference | |
153 from many modern RPC-over-HTTP protocols that only let you address global URLs, or other RPC | |
154 systems like Protocol Buffers and Thrift that only let you address singleton objects exported at | |
155 startup. The ability to dynamically introduce new objects and pass around references to them | |
156 allows you to use the same design patterns over the network that you use locally in object-oriented | |
157 programming languages. Many kinds of interactions become vastly easier to express given the | |
158 richer vocabulary. | |
159 | |
160 **_Didn't CORBA prove this doesn't work?_** | |
161 | |
162 No! | |
163 | |
164 CORBA failed for many reasons, with the usual problems of design-by-committee being a big one. | |
165 | |
166 However, the biggest reason for CORBA's failure is that it tried to make remote calls look the | |
167 same as local calls. Cap'n Proto does NOT do this -- remote calls have a different kind of API, | |
168 one that involves promises and accounts for the latency and unreliability that a network | |
169 introduces. | |
170 | |
171 As shown above, promise pipelining is absolutely critical to making object-oriented interfaces work | |
172 in the presence of latency. If remote calls look the same as local calls, there is no opportunity | |
173 to introduce promise pipelining, and latency is inevitable. Any distributed object protocol which | |
174 does not support promise pipelining cannot -- and should not -- succeed. Thus the failure of CORBA | |
175 (and DCOM, etc.) was inevitable, but Cap'n Proto is different. | |
176 | |
177 ### Handling disconnects | |
178 | |
179 Networks are unreliable. Occasionally, connections will be lost. When this happens, all | |
180 capabilities (object references) served by the connection will become disconnected. Any further | |
181 calls addressed to these capabilities will throw "disconnected" exceptions. To recover, the | |
182 client will need to create a new connection and try again. All Cap'n Proto applications with | |
183 long-running connections (and probably short-running ones too) should be prepared to catch | |
184 "disconnected" exceptions and respond appropriately. | |
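
A reconnect-and-retry loop of the kind described above might look like this. It is a minimal Python sketch under stated assumptions: `Disconnected`, `call_with_reconnect()`, and `FlakyConnection` are hypothetical names rather than part of any Cap'n Proto library, and blindly retrying is only appropriate for calls that are safe to repeat.

```python
class Disconnected(Exception):
    """Hypothetical stand-in for an RPC library's 'disconnected' exception."""


def call_with_reconnect(connect, do_call, max_attempts=3):
    """Run do_call(conn); on disconnect, build a fresh connection and retry."""
    conn = connect()
    for attempt in range(1, max_attempts + 1):
        try:
            return do_call(conn)
        except Disconnected:
            if attempt == max_attempts:
                raise
            # Capabilities from the old connection are dead; start over.
            conn = connect()


class FlakyConnection:
    """Test double: the first connection ever created fails, later ones work."""

    created = 0

    def __init__(self):
        FlakyConnection.created += 1
        self.number = FlakyConnection.created

    def ping(self):
        if self.number == 1:
            raise Disconnected("connection lost")
        return "pong"


reply = call_with_reconnect(FlakyConnection, lambda conn: conn.ping())
```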
185 | |
186 On the server side, when all references to an object have been "dropped" (either because the | |
187 clients explicitly dropped them or because they became disconnected), the object will be closed | |
188 (in C++, the destructor is called; in GC'd languages, a `close()` method is called). This allows | |
189 servers to easily allocate per-client resources without having to clean up on a timeout or risk | |
190 leaking memory. | |
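
The cleanup rule can be sketched as simple bookkeeping on the server. This is a simplified Python model, not how any particular implementation tracks references: `Server`, `ClientSession`, and the connection names are hypothetical, and each connection is assumed to hold at most one reference to an object.

```python
class Server:
    """Tracks which connections hold a reference to each exported object."""

    def __init__(self):
        self.exports = {}   # object -> set of connection names holding it

    def export(self, obj, conn):
        self.exports.setdefault(obj, set()).add(conn)

    def drop(self, obj, conn):
        """A client explicitly dropped its reference."""
        holders = self.exports[obj]
        holders.discard(conn)
        if not holders:
            del self.exports[obj]
            obj.close()     # last reference gone: release resources

    def disconnect(self, conn):
        """A connection was lost: treat all of its references as dropped."""
        for obj in list(self.exports):
            if conn in self.exports[obj]:
                self.drop(obj, conn)


class ClientSession:
    """Example per-client resource with a close() hook."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


server = Server()
session = ClientSession()
server.export(session, "conn-A")
server.export(session, "conn-B")

server.drop(session, "conn-A")        # explicit drop; conn-B still holds it
still_open = not session.closed

server.disconnect("conn-B")           # the last holder disappears
```

Both paths, explicit drop and disconnect, end in the same `close()` call, which is why no timeout-based cleanup is needed in this model.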
191 | |
192 ### Security | |
193 | |
194 Cap'n Proto interface references are | |
195 [capabilities](http://en.wikipedia.org/wiki/Capability-based_security). That is, they both | |
196 designate an object to call and confer permission to call it. When a new object is created, only | |
197 the creator is initially able to call it. When the object is passed over a network connection, | |
198 the receiver gains permission to make calls -- but no one else does. In fact, it is impossible | |
199 for others to access the capability without consent of either the host or the receiver because | |
200 the host only assigns it an ID specific to the connection over which it was sent. | |
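
The effect of connection-specific IDs can be illustrated with one export table per connection. This is a toy Python sketch, not the real protocol's data structures: `ConnectionTable`, `Document`, and the use of `PermissionError` are assumptions made for the example.

```python
class ConnectionTable:
    """Per-connection export table: IDs only mean something on this connection."""

    def __init__(self):
        self._exports = {}
        self._next_id = 0

    def export(self, obj):
        """Hand a capability to the peer on this connection; return its ID."""
        export_id = self._next_id
        self._next_id += 1
        self._exports[export_id] = obj
        return export_id

    def call(self, export_id, method, *args):
        if export_id not in self._exports:
            raise PermissionError(
                f"no capability with id {export_id} on this connection")
        return getattr(self._exports[export_id], method)(*args)


class Document:
    def read(self):
        return "contents"


alice = ConnectionTable()      # the host's table for Alice's connection
mallory = ConnectionTable()    # the host's table for Mallory's connection

doc_id = alice.export(Document())
alice_result = alice.call(doc_id, "read")

# Guessing the same numeric ID on another connection designates nothing.
try:
    mallory.call(doc_id, "read")
    mallory_blocked = False
except PermissionError:
    mallory_blocked = True
```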
201 | |
202 Capability-based design patterns -- which largely boil down to object-oriented design patterns -- | |
203 work great with Cap'n Proto. Such patterns tend to be much more adaptable than traditional | |
204 ACL-based security, making it easy to keep security tight and avoid confused-deputy attacks while | |
205 minimizing pain for legitimate users. That said, you can of course implement ACLs or any other | |
206 pattern on top of capabilities. | |
207 | |
208 For an extended discussion of what capabilities are and why they are often easier and more powerful | |
209 than ACLs, see Mark Miller's | |
210 ["An Ode to the Granovetter Diagram"](http://www.erights.org/elib/capability/ode/index.html) and | |
211 [Capability Myths Demolished](http://zesty.ca/capmyths/usenix.pdf). | |
212 | |
213 ## Protocol Features | |
214 | |
215 Cap'n Proto's RPC protocol has the following notable features. Since the protocol is complicated, | |
216 the feature set has been divided into numbered "levels", so that implementations may declare which | |
217 features they have covered by advertising a level number. | |
218 | |
219 * **Level 1:** Object references and promise pipelining, as described above. | |
220 * **Level 2:** Persistent capabilities. You may request to "save" a capability, receiving a | |
221 persistent token which can be used to "restore" it in the future (on a new connection). Not | |
222 all capabilities can be saved; the host app must implement support for it. Building this into | |
223 the protocol makes it possible for a Cap'n-Proto-based data store to transparently save | |
224 structures containing capabilities without knowledge of the particular capability types or the | |
225 application built on them, as well as potentially enabling more powerful analysis and | |
226 visualization of stored data. | |
227 * **Level 3:** Three-way interactions. A network of Cap'n Proto vats (nodes) can pass object | |
228 references to each other and automatically form direct connections as needed. For instance, if | |
229 Alice (on machine A) sends Bob (on machine B) a reference to Carol (on machine C), then machine B | |
230 will form a new connection to machine C so that Bob can call Carol directly without proxying | |
231 through machine A. | |
232 * **Level 4:** Reference equality / joining. If you receive a set of capabilities from different | |
233 parties which should all point to the same underlying object, you can verify securely that they | |
234 in fact do. This is subtle, but enables many security patterns that rely on one party being able | |
235 to verify that two or more other parties agree on something (imagine a digital escrow agent). | |
236 See [E's page on equality](http://erights.org/elib/equality/index.html). | |
237 | |
238 ## Encryption | |
239 | |
240 At this time, Cap'n Proto does not specify an encryption scheme, but because the RPC protocol | |
241 runs over a plain byte stream, it can easily be layered on top of SSL/TLS or other such protocols. | |
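
Layering over TLS might look like the following, using only Python's standard `ssl` and `socket` modules. `make_tls_context()` and `open_tls_stream()` are hypothetical helper names; handing the resulting stream to an RPC implementation is left out because that step depends on the library in use.

```python
import socket
import ssl


def make_tls_context():
    """Client-side TLS context with certificate and hostname checks enabled."""
    return ssl.create_default_context()


def open_tls_stream(host, port):
    """Open a TCP connection and wrap it in TLS.

    The returned socket behaves like an ordinary byte stream, which is all
    a byte-stream RPC protocol needs from its transport.
    """
    raw = socket.create_connection((host, port))
    return make_tls_context().wrap_socket(raw, server_hostname=host)


context = make_tls_context()
```

Because encryption happens below the byte stream, the RPC messages themselves do not change.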
242 | |
243 ## Specification | |
244 | |
245 The Cap'n Proto RPC protocol is defined in terms of Cap'n Proto serialization schemas. The | |
246 documentation is inline. See | |
247 [rpc.capnp](https://github.com/sandstorm-io/capnproto/blob/master/c++/src/capnp/rpc.capnp). | |
248 | |
249 Cap'n Proto's RPC protocol is based heavily on | |
250 [CapTP](http://www.erights.org/elib/distrib/captp/index.html), the distributed capability protocol | |
251 used by the [E programming language](http://www.erights.org/index.html). Lots of useful material | |
252 for understanding capabilities can be found at those links. | |
253 | |
254 The protocol is complex, but the functionality it supports is conceptually simple. Just as TCP | |
255 is a complex protocol that implements the simple concept of a byte stream, Cap'n Proto RPC is a | |
256 complex protocol that implements the simple concept of objects with callable methods. |