comparison src/capnproto-git-20161025/doc/language.md @ 48:9530b331f8c1
Add Cap'n Proto source
author | Chris Cannam <cannam@all-day-breakfast.com> |
---|---|
date | Tue, 25 Oct 2016 11:17:01 +0100 |
---
layout: page
title: Schema Language
---

# Schema Language

Like Protocol Buffers and Thrift (but unlike JSON or MessagePack), Cap'n Proto messages are
strongly-typed and not self-describing. You must define your message structure in a special
language, then invoke the Cap'n Proto compiler (`capnp compile`) to generate source code to
manipulate that message type in your desired language.

For example:

{% highlight capnp %}
@0xdbb9ad1f14bf0b36;  # unique file ID, generated by `capnp id`

struct Person {
  name @0 :Text;
  birthdate @3 :Date;

  email @1 :Text;
  phones @2 :List(PhoneNumber);

  struct PhoneNumber {
    number @0 :Text;
    type @1 :Type;

    enum Type {
      mobile @0;
      home @1;
      work @2;
    }
  }
}

struct Date {
  year @0 :Int16;
  month @1 :UInt8;
  day @2 :UInt8;
}
{% endhighlight %}

Some notes:

* Types come after names. The name is by far the most important thing to see, especially when
  quickly skimming, so we put it up front where it is most visible. Sorry, C got it wrong.
* The `@N` annotations show how the protocol evolved over time, so that the system can make sure
  to maintain compatibility with older versions. Fields (and enumerants, and interface methods)
  must be numbered consecutively starting from zero in the order in which they were added. In this
  example, it looks like the `birthdate` field was added to the `Person` structure recently -- its
  number is higher than the `email` and `phones` fields. Unlike Protobufs, you cannot skip numbers
  when defining fields -- but there was never any reason to do so anyway.

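As a sketch of how the numbering rule plays out, suppose a later revision of `Person` adds one
more field. The `nickname` field below is hypothetical and not part of the example above; it
simply takes the next unused number, wherever it is placed in the source:

{% highlight capnp %}
struct Person {
  name @0 :Text;
  nickname @4 :Text;
  # Hypothetical field added after `birthdate`, so it takes number 4.
  # Its position in the source is free; only the number records the order.

  birthdate @3 :Date;

  email @1 :Text;
  phones @2 :List(PhoneNumber);

  # The nested PhoneNumber struct is unchanged and omitted here.
}
{% endhighlight %}
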
## Language Reference

### Comments

Comments are indicated by hash signs and extend to the end of the line:

{% highlight capnp %}
# This is a comment.
{% endhighlight %}

Comments meant as documentation should appear _after_ the declaration, either on the same line, or
on a subsequent line. Doc comments for aggregate definitions should appear on the line after the
opening brace.

{% highlight capnp %}
struct Date {
  # A standard Gregorian calendar date.

  year @0 :Int16;
  # The year. Must include the century.
  # Negative value indicates BC.

  month @1 :UInt8;   # Month number, 1-12.
  day @2 :UInt8;     # Day number, 1-31.
}
{% endhighlight %}

Placing the comment _after_ the declaration rather than before makes the code more readable,
especially when doc comments grow long. You almost always need to see the declaration before you
can start reading the comment.

### Built-in Types

The following types are automatically defined:

* **Void:** `Void`
* **Boolean:** `Bool`
* **Integers:** `Int8`, `Int16`, `Int32`, `Int64`
* **Unsigned integers:** `UInt8`, `UInt16`, `UInt32`, `UInt64`
* **Floating-point:** `Float32`, `Float64`
* **Blobs:** `Text`, `Data`
* **Lists:** `List(T)`

Notes:

* The `Void` type has exactly one possible value, and thus can be encoded in zero bits. It is
  rarely used, but can be useful as a union member.
* `Text` is always UTF-8 encoded and NUL-terminated.
* `Data` is a completely arbitrary sequence of bytes.
* `List` is a parameterized type, where the parameter is the element type. For example,
  `List(Int32)`, `List(Person)`, and `List(List(Text))` are all valid.

### Structs

A struct has a set of named, typed fields, numbered consecutively starting from zero.

{% highlight capnp %}
struct Person {
  name @0 :Text;
  email @1 :Text;
}
{% endhighlight %}

Fields can have default values:

{% highlight capnp %}
foo @0 :Int32 = 123;
bar @1 :Text = "blah";
baz @2 :List(Bool) = [ true, false, false, true ];
qux @3 :Person = (name = "Bob", email = "bob@example.com");
corge @4 :Void = void;
grault @5 :Data = 0x"a1 40 33";
{% endhighlight %}

### Unions

A union is two or more fields of a struct which are stored in the same location. Only one of
these fields can be set at a time, and a separate tag is maintained to track which one is
currently set. Unlike in C, unions are not types; they are simply properties of fields, so
union declarations do not look like types.

{% highlight capnp %}
struct Person {
  # ...

  employment :union {
    unemployed @4 :Void;
    employer @5 :Company;
    school @6 :School;
    selfEmployed @7 :Void;
    # We assume that a person is only one of these.
  }
}
{% endhighlight %}

Additionally, unions can be unnamed. Each struct can contain no more than one unnamed union. Use
unnamed unions in cases where you would struggle to think of an appropriate name for the union,
because the union represents the main body of the struct.

{% highlight capnp %}
struct Shape {
  area @0 :Float64;

  union {
    circle @1 :Float64;      # radius
    square @2 :Float64;      # width
  }
}
{% endhighlight %}

Notes:

* Union members are numbered in the same number space as fields of the containing struct.
  Remember that the purpose of the numbers is to indicate the evolution order of the
  struct. The system needs to know when the union fields were declared relative to the non-union
  fields.

* Notice that we used the "useless" `Void` type here. We don't have any extra information to store
  for the `unemployed` or `selfEmployed` cases, but we still want the union to distinguish these
  states from others.

* By default, when a struct is initialized, the lowest-numbered field in the union is "set". If
  you do not want any field set by default, simply declare a field called "unset" and make it the
  lowest-numbered field.

* You can move an existing field into a new union without breaking compatibility with existing
  data, as long as all of the other fields in the union are new. Since the existing field is
  necessarily the lowest-numbered in the union, it will be the union's default field.

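The last two notes can be sketched as follows. Both structs and all of their field names are
hypothetical, chosen only for illustration. `Job` uses an explicit `unset` member so that no
meaningful case is selected by default; `Contact` assumes an earlier revision that had only
`name @0` and `email @1` as plain fields, and wraps `email` in a union after the fact:

{% highlight capnp %}
struct Job {
  title @0 :Text;

  union {
    unset @1 :Void;        # lowest-numbered member, so this is the default
    salaried @2 :UInt32;
    hourly @3 :UInt32;
  }
}

struct Contact {
  name @0 :Text;

  union {
    email @1 :Text;        # existed before the union; lowest-numbered, so it is the default
    phone @2 :Text;        # new, mutually exclusive with `email`
  }
}
{% endhighlight %}

Data written against the assumed older `Contact` should still read back with `email` as the
active member, because `email` keeps its number and is the union's default.
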
**Wait, why aren't unions first-class types?**

Requiring unions to be declared inside a struct, rather than living as free-standing types, has
some important advantages:

* If unions were first-class types, then union members would clearly have to be numbered separately
  from the containing type's fields. This means that the compiler, when deciding how to position
  the union in its containing struct, would have to conservatively assume that any kind of new
  field might be added to the union in the future. To support this, all unions would have to
  be allocated as separate objects embedded by pointer, wasting space.

* A free-standing union would be a liability for protocol evolution, because no additional data
  can be attached to it later on. Consider, for example, a type which represents a parser token.
  This type is naturally a union: it may be a keyword, identifier, numeric literal, quoted string,
  etc. So the author defines it as a union, and the type is used widely. Later on, the developer
  wants to attach information to the token indicating its line and column number in the source
  file. Unfortunately, this is impossible without updating all users of the type, because the new
  information ought to apply to _all_ token instances, not just specific members of the union. On
  the other hand, if unions must be embedded within structs, it is always possible to add new
  fields to the struct later on.

* When evolving a protocol it is common to discover that some existing field really should have
  been enclosed in a union, because new fields being added are mutually exclusive with it. With
  Cap'n Proto's unions, it is actually possible to "retroactively unionize" such a field without
  changing its layout. This allows you to continue being able to read old data without wasting
  space when writing new data. This is only possible when unions are declared within their
  containing struct.

Cap'n Proto's unconventional approach to unions provides these advantages without any real
downside: where you would conventionally define a free-standing union type, in Cap'n Proto you
may simply define a struct type that contains only that union (probably unnamed), and you have
achieved the same effect. Thus, aside from being slightly unintuitive, it is strictly superior.

### Groups

A group is a set of fields that are encapsulated in their own scope.

{% highlight capnp %}
struct Person {
  # ...

  # Note:  This is a terrible way to use groups, and meant
  #   only to demonstrate the syntax.
  address :group {
    houseNumber @8 :UInt32;
    street @9 :Text;
    city @10 :Text;
    country @11 :Text;
  }
}
{% endhighlight %}

Interface-wise, the above group behaves as if you had defined a nested struct called `Address` and
then a field `address :Address`. However, a group is _not_ a separate object from its containing
struct: the fields are numbered in the same space as the containing struct's fields, and are laid
out exactly the same as if they hadn't been grouped at all. Essentially, a group is just a
namespace.

Groups on their own (as in the above example) are useless, almost as much so as the `Void` type.
They become interesting when used together with unions.

{% highlight capnp %}
struct Shape {
  area @0 :Float64;

  union {
    circle :group {
      radius @1 :Float64;
    }
    rectangle :group {
      width @2 :Float64;
      height @3 :Float64;
    }
  }
}
{% endhighlight %}

There are two main reasons to use groups with unions:

1. They are often more self-documenting. Notice that `radius` is now a member of `circle`, so
   we don't need a comment to explain that the value of `circle` is its radius.
2. You can add additional members later on, without breaking compatibility. Notice how we upgraded
   `square` to `rectangle` above, adding a `height` field. This definition is actually
   wire-compatible with the previous version of the `Shape` example from the "union" section
   (aside from the fact that `height` will always be zero when reading old data -- hey, it's not
   a perfect example). In real-world use, it is common to realize after the fact that you need to
   add some information to a struct that only applies when one particular union field is set.
   Without the ability to upgrade to a group, you would have to define the new field separately,
   and have it waste space when not relevant.

Note that a named union is actually exactly equivalent to a named group containing an unnamed
union.

**Wait, weren't groups considered a misfeature in Protobufs? Why did you do this again?**

They are useful in unions, which Protobufs did not have. Meanwhile, you cannot have a "repeated
group" in Cap'n Proto, which was the case that got into the most trouble with Protobufs.

### Dynamically-typed Fields

A struct may have a field with type `AnyPointer`. This field's value can be of any pointer type --
i.e. any struct, interface, list, or blob. This is essentially like a `void*` in C.

See also [generics](#generic-types).

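A minimal sketch of one way such a field might be used. The `Envelope` struct and its field names
are hypothetical; pairing the pointer with a type ID is a design choice made here for
illustration, not something the language requires:

{% highlight capnp %}
struct Envelope {
  typeId @0 :UInt64;
  # Hypothetical tag: the unique ID of the struct type stored in `payload`,
  # so that the reader knows how to interpret it.

  payload @1 :AnyPointer;
  # May hold any struct, interface, list, or blob.
}
{% endhighlight %}
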
### Enums

An enum is a type with a small finite set of symbolic values.

{% highlight capnp %}
enum Rfc3092Variable {
  foo @0;
  bar @1;
  baz @2;
  qux @3;
  # ...
}
{% endhighlight %}

Like fields, enumerants must be numbered sequentially starting from zero. In languages where
enums have numeric values, these numbers will be used, but in general Cap'n Proto enums should not
be considered numeric.

### Interfaces

An interface has a collection of methods, each of which takes some parameters and returns some
results. Like struct fields, methods are numbered. Interfaces support inheritance, including
multiple inheritance.

{% highlight capnp %}
interface Node {
  isDirectory @0 () -> (result :Bool);
}

interface Directory extends(Node) {
  list @0 () -> (list :List(Entry));
  struct Entry {
    name @0 :Text;
    node @1 :Node;
  }

  create @1 (name :Text) -> (file :File);
  mkdir @2 (name :Text) -> (directory :Directory);
  open @3 (name :Text) -> (node :Node);
  delete @4 (name :Text);
  link @5 (name :Text, node :Node);
}

interface File extends(Node) {
  size @0 () -> (size :UInt64);
  read @1 (startAt :UInt64 = 0, amount :UInt64 = 0xffffffffffffffff)
       -> (data :Data);
  # Default params = read entire file.

  write @2 (startAt :UInt64, data :Data);
  truncate @3 (size :UInt64);
}
{% endhighlight %}

Notice something interesting here: `Node`, `Directory`, and `File` are interfaces, but several
methods take these types as parameters or return them as results. `Directory.Entry` is a struct,
but it contains a `Node`, which is an interface. Structs (and primitive types) are passed over RPC
by value, but interfaces are passed by reference. So when `Directory.list` is called remotely, the
content of a `List(Entry)` (including the text of each `name`) is transmitted back, but for the
`node` field, only a reference to some remote `Node` object is sent.

When an address of an object is transmitted, the RPC system automatically manages making sure that
the recipient gets permission to call the addressed object -- because if the recipient wasn't
meant to have access, the sender shouldn't have sent the reference in the first place. This makes
it very easy to develop secure protocols with Cap'n Proto -- you almost don't need to think about
access control at all. This feature is what makes Cap'n Proto a "capability-based" RPC system -- a
reference to an object inherently represents a "capability" to access it.

### Generic Types

A struct or interface type may be parameterized, making it "generic". For example, this is useful
for defining type-safe containers:

{% highlight capnp %}
struct Map(Key, Value) {
  entries @0 :List(Entry);
  struct Entry {
    key @0 :Key;
    value @1 :Value;
  }
}

struct People {
  byName @0 :Map(Text, Person);
  # Maps names to Person instances.
}
{% endhighlight %}

Cap'n Proto generics work very similarly to Java generics or C++ templates. Some notes:

* Only pointer types (structs, lists, blobs, and interfaces) can be used as generic parameters,
  much like in Java. This is a pragmatic limitation: allowing parameters to have non-pointer types
  would mean that different parameterizations of a struct could have completely different layouts,
  which would excessively complicate the Cap'n Proto implementation.

* A type declaration nested inside a generic type may use the type parameters of the outer type,
  as you can see in the example above. This differs from Java, but matches C++. If you want to
  refer to a nested type from outside the outer type, you must specify the parameters on the outer
  type, not the inner. For example, `Map(Text, Person).Entry` is a valid type;
  `Map.Entry(Text, Person)` is NOT valid. (Of course, an inner type may declare additional generic
  parameters.)

* If you refer to a generic type but omit its parameters (e.g. declare a field of type `Map` rather
  than `Map(T, U)`), it is as if you specified `AnyPointer` for each parameter. Note that such
  a type is wire-compatible with any specific parameterization, so long as you interpret the
  `AnyPointer`s as the correct type at runtime.

* Relatedly, it is safe to cast a generic interface of a specific parameterization to a generic
  interface where all parameters are `AnyPointer` and vice versa, as long as the `AnyPointer`s are
  treated as the correct type at runtime. This means that e.g. you can implement a server in a
  generic way that is correct for all parameterizations but call it from clients using a specific
  parameterization.

* The encoding of a generic type is exactly the same as the encoding of a type produced by
  substituting the type parameters manually. For example, `Map(Text, Person)` is encoded exactly
  the same as:

  <div>{% highlight capnp %}
  struct PersonMap {
    # Encoded the same as Map(Text, Person).
    entries @0 :List(Entry);
    struct Entry {
      key @0 :Text;
      value @1 :Person;
    }
  }
  {% endhighlight %}
  </div>

  Therefore, it is possible to upgrade non-generic types to generic types while retaining
  backwards-compatibility.

* Similarly, a generic interface's protocol is exactly the same as the interface obtained by
  manually substituting the generic parameters.

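The notes on nested types and omitted parameters can be sketched together. `Map` is repeated from
the example above so the fragment stands on its own; the cut-down `Person` and the `Registry`
struct are hypothetical:

{% highlight capnp %}
struct Map(Key, Value) {
  entries @0 :List(Entry);
  struct Entry {
    key @0 :Key;
    value @1 :Value;
  }
}

struct Person {
  name @0 :Text;
}

struct Registry {
  newest @0 :Map(Text, Person).Entry;
  # Parameters are bound on the outer type, then the nested type is named.

  untyped @1 :Map;
  # Parameters omitted: treated as if each one were AnyPointer.
}
{% endhighlight %}
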
### Generic Methods

Interface methods may also have "implicit" generic parameters that apply to a particular method
call. This commonly applies to "factory" methods. For example:

{% highlight capnp %}
interface Assignable(T) {
  # A generic interface, with non-generic methods.
  get @0 () -> (value :T);
  set @1 (value :T) -> ();
}

interface AssignableFactory {
  newAssignable @0 [T] (initialValue :T)
      -> (assignable :Assignable(T));
  # A generic method.
}
{% endhighlight %}

Here, the method `newAssignable()` is generic. The return type of the method depends on the input
type.

Ideally, calls to a generic method should not have to explicitly specify the method's type
parameters, because they should be inferred from the types of the method's regular parameters.
However, this may not always be possible; it depends on the programming language and API details.

Note that if a method's generic parameter is used only in its returns, not its parameters, then
this implies that the returned value is appropriate for any parameterization. For example:

{% highlight capnp %}
newUnsetAssignable @1 [T] () -> (assignable :Assignable(T));
# Create a new assignable. `get()` on the returned object will
# throw an exception until `set()` has been called at least once.
{% endhighlight %}

Because of the way this method is designed, the returned `Assignable` is initially valid for any
`T`. Effectively, it doesn't take on a type until the first time `set()` is called, and then `T`
retroactively becomes the type of value passed to `set()`.

In contrast, if it's the case that the returned type is unknown, then you should NOT declare it
as generic. Instead, use `AnyPointer`, or omit a type's parameters (since they default to
`AnyPointer`). For example:

{% highlight capnp %}
getNamedAssignable @2 (name :Text) -> (assignable :Assignable);
# Get the `Assignable` with the given name. It is the
# responsibility of the caller to keep track of the type of each
# named `Assignable` and cast the returned object appropriately.
{% endhighlight %}

Here, we omitted the parameters to `Assignable` in the return type, because the returned object
has a specific type parameterization but it is not locally knowable.

### Constants

You can define constants in Cap'n Proto. These don't affect what is sent on the wire, but they
will be included in the generated code, and can be [evaluated using the `capnp`
tool](capnp-tool.html#evaluating-constants).

{% highlight capnp %}
const pi :Float32 = 3.14159;
const bob :Person = (name = "Bob", email = "bob@example.com");
const secret :Data = 0x"9f98739c2b53835e 6720a00907abd42f";
{% endhighlight %}

Additionally, you may refer to a constant inside another value (e.g. another constant, or a default
value of a field).

{% highlight capnp %}
const foo :Int32 = 123;
const bar :Text = "Hello";
const baz :SomeStruct = (id = .foo, message = .bar);
{% endhighlight %}

Note that when substituting a constant into another value, the constant's name must be qualified
with its scope. E.g. if a constant `qux` is declared nested in a type `Corge`, it would need to
be referenced as `Corge.qux` rather than just `qux`, even when used within the `Corge` scope.
Constants declared at the top-level scope are prefixed just with `.`. This rule helps to make it
clear that the name refers to a user-defined constant, rather than a literal value (like `true` or
`inf`) or an enum value.

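A short sketch of the qualification rule, using hypothetical names. The top-level constant is
referenced with a leading `.`, while the nested one is qualified with its enclosing type even
though the reference appears inside that same type:

{% highlight capnp %}
const defaultPort :UInt16 = 8080;

struct Server {
  const maxConnections :UInt32 = 64;

  port @0 :UInt16 = .defaultPort;
  # Top-level constant: prefixed with `.`.

  limit @1 :UInt32 = Server.maxConnections;
  # Nested constant: qualified as `Server.maxConnections`, not just `maxConnections`.
}
{% endhighlight %}
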
### Nesting, Scope, and Aliases

You can nest constant, alias, and type definitions inside structs and interfaces (but not enums).
This has no effect on any definition involved except to define the scope of its name. So in Java
terms, inner classes are always "static". To name a nested type from another scope, separate the
path with `.`s.

{% highlight capnp %}
struct Foo {
  struct Bar {
    #...
  }
  bar @0 :Bar;
}

struct Baz {
  bar @0 :Foo.Bar;
}
{% endhighlight %}

If typing long scopes becomes cumbersome, you can use `using` to declare an alias.

{% highlight capnp %}
struct Qux {
  using Foo.Bar;
  bar @0 :Bar;
}

struct Corge {
  using T = Foo.Bar;
  bar @0 :T;
}
{% endhighlight %}

### Imports

An `import` expression names the scope of some other file:

{% highlight capnp %}
struct Foo {
  # Use type "Baz" defined in bar.capnp.
  baz @0 :import "bar.capnp".Baz;
}
{% endhighlight %}

Of course, typically it's more readable to define an alias:

{% highlight capnp %}
using Bar = import "bar.capnp";

struct Foo {
  # Use type "Baz" defined in bar.capnp.
  baz @0 :Bar.Baz;
}
{% endhighlight %}

Or even:

{% highlight capnp %}
using import "bar.capnp".Baz;

struct Foo {
  baz @0 :Baz;
}
{% endhighlight %}

The above imports specify relative paths. If the path begins with a `/`, it is absolute -- in
this case, the `capnp` tool searches for the file in each of the search path directories specified
with `-I`.

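For comparison, here is a sketch of an absolute import. `/capnp/c++.capnp` is the annotation file
that ships with Cap'n Proto for the C++ code generator; the namespace `example`, the struct, and
the `schemas` directory in the comment are hypothetical:

{% highlight capnp %}
using Cxx = import "/capnp/c++.capnp";
# Absolute path: resolved against the import search path rather than
# relative to this file.

$Cxx.namespace("example");

struct Ping {
  seq @0 :UInt32;
}

# An absolute import of your own file, e.g. import "/common/types.capnp",
# would need its root directory on the search path, for instance:
#   capnp compile -Ischemas -oc++ ping.capnp
{% endhighlight %}
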
### Annotations

Sometimes you want to attach extra information to parts of your protocol that isn't part of the
Cap'n Proto language. This information might control details of a particular code generator, or
you might even read it at run time to assist in some kind of dynamic message processing. For
example, you might create a field annotation which means "hide from the public", and when you send
a message to an external user, you might invoke some code first that iterates over your message and
removes all of these hidden fields.

You may declare annotations and use them like so:

{% highlight capnp %}
# Declare an annotation 'foo' which applies to struct and enum types.
annotation foo(struct, enum) :Text;

# Apply 'foo' to MyType.
struct MyType $foo("bar") {
  # ...
}
{% endhighlight %}

The possible targets for an annotation are: `file`, `struct`, `field`, `union`, `enum`, `enumerant`,
`interface`, `method`, `parameter`, `annotation`, `const`. You may also specify `*` to cover them
all.

{% highlight capnp %}
# 'baz' can annotate anything!
annotation baz(*) :Int32;

$baz(1);  # Annotate the file.

struct MyStruct $baz(2) {
  myField @0 :Text = "default" $baz(3);
  myUnion :union $baz(4) {
    # ...
  }
}

enum MyEnum $baz(5) {
  myEnumerant @0 $baz(6);
}

interface MyInterface $baz(7) {
  myMethod @0 (myParam :Text $baz(9)) -> () $baz(8);
}

annotation myAnnotation(struct) :Int32 $baz(10);
const myConst :Int32 = 123 $baz(11);
{% endhighlight %}

`Void` annotations can omit the value. Struct-typed annotations are also allowed. Tip: If
you want an annotation to have a default value, declare it as a struct with a single field with
a default value.

{% highlight capnp %}
annotation qux(struct, field) :Void;

struct MyStruct $qux {
  string @0 :Text $qux;
  number @1 :Int32 $qux;
}

annotation corge(file) :MyStruct;

$corge(string = "hello", number = 123);

struct Grault {
  value @0 :Int32 = 123;
}

annotation grault(file) :Grault;

$grault();  # value defaults to 123
$grault(value = 456);
{% endhighlight %}

### Unique IDs

A Cap'n Proto file must have a unique 64-bit ID, and each type and annotation defined therein may
also have an ID. Use `capnp id` to generate a new ID randomly. ID specifications begin with `@`:

{% highlight capnp %}
# file ID
@0xdbb9ad1f14bf0b36;

struct Foo @0x8db435604d0d3723 {
  # ...
}

enum Bar @0xb400f69b5334aab3 {
  # ...
}

interface Baz @0xf7141baba3c12691 {
  # ...
}

annotation qux @0xf8a1bedf44c89f00 (field) :Text;
{% endhighlight %}

If you omit the ID for a type or annotation, one will be assigned automatically. This default
ID is derived by taking the first 8 bytes of the MD5 hash of the parent scope's ID concatenated
with the declaration's name (where the "parent scope" is the file for top-level declarations, or
the outer type for nested declarations). You can see the automatically-generated IDs by "compiling"
your file with the `-ocapnp` flag, which echoes the schema back to the terminal annotated with
extra information, e.g. `capnp compile -ocapnp myschema.capnp`. In general, you would only specify
an explicit ID for a declaration if that declaration has been renamed or moved and you want the ID
to stay the same for backwards-compatibility.

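A sketch of the rename case. The type names are hypothetical and the ID below is a made-up
placeholder; in practice you would paste the ID that `capnp compile -ocapnp` reported for the
type under its old name:

{% highlight capnp %}
struct Customer @0x9e3a5c7b1d2f4860 {
  # Formerly named `Client`. Declaring the old ID explicitly keeps the type's
  # identity stable even though its name (and thus its implicit ID) changed.
  # The ID shown is a placeholder, not a real derived value.

  name @0 :Text;
}
{% endhighlight %}
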
IDs exist to provide a relatively short yet unambiguous way to refer to a type or annotation from
another context. They may be used for representing schemas, for tagging dynamically-typed fields,
etc. Most languages prefer instead to define a symbolic global namespace e.g. full of "packages",
but this would have some important disadvantages in the context of Cap'n Proto:

* Programmers often feel the need to change symbolic names and organization in order to make their
  code cleaner, but the renamed code should still work with existing encoded data.
* It's easy for symbolic names to collide, and these collisions could be hard to detect in a large
  distributed system with many different binaries using different versions of protocols.
* Fully-qualified type names may be large and waste space when transmitted on the wire.

Note that IDs are 64-bit (actually, 63-bit, as the first bit is always 1). Random collisions
are possible, but unlikely -- there would have to be on the order of a billion types before this
becomes a real concern. Collisions from misuse (e.g. copying an example without changing the ID)
are much more likely.

## Evolving Your Protocol

A protocol can be changed in the following ways without breaking backwards-compatibility, and
without changing the [canonical](encoding.html#canonicalization) encoding of a message:

* New types, constants, and aliases can be added anywhere, since they obviously don't affect the
  encoding of any existing type.

* New fields, enumerants, and methods may be added to structs, enums, and interfaces, respectively,
  as long as each new member's number is larger than all previous members. Similarly, new fields
  may be added to existing groups and unions.

* New parameters may be added to a method. The new parameters must be added to the end of the
  parameter list and must have default values.

* Members can be re-arranged in the source code, so long as their numbers stay the same.

* Any symbolic name can be changed, as long as the type ID / ordinal numbers stay the same. Note
  that type declarations have an implicit ID generated based on their name and parent's ID, but
  you can use `capnp compile -ocapnp myschema.capnp` to find out what that number is, and then
  declare it explicitly after your rename.

* Type definitions can be moved to different scopes, as long as the type ID is declared
  explicitly.

* A field can be moved into a group or a union, as long as the group/union and all other fields
  within it are new. In other words, a field can be replaced with a group or union containing an
  equivalent field and some new fields.

* A non-generic type can be made [generic](#generic-types), and new generic parameters may be
  added to an existing generic type. Other types used inside the body of the newly-generic type can
  be replaced with the new generic parameter so long as all existing users of the type are updated
  to bind that generic parameter to the type it replaced. For example:

  <div>{% highlight capnp %}
  struct Map {
    entries @0 :List(Entry);
    struct Entry {
      key @0 :Text;
      value @1 :Text;
    }
  }
  {% endhighlight %}
  </div>

  Can change to:

  <div>{% highlight capnp %}
  struct Map(Key, Value) {
    entries @0 :List(Entry);
    struct Entry {
      key @0 :Key;
      value @1 :Value;
    }
  }
  {% endhighlight %}
  </div>

  As long as all existing uses of `Map` are replaced with `Map(Text, Text)` (and any uses of
  `Map.Entry` are replaced with `Map(Text, Text).Entry`).

  (This rule applies analogously to generic methods.)

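The rules for adding methods and parameters can be sketched with a hypothetical interface. Assume
the first revision declared only `greet @0 (name :Text) -> (greeting :Text)`:

{% highlight capnp %}
interface Greeter {
  greet @0 (name :Text, language :Text = "en") -> (greeting :Text);
  # `language` was appended to the end of the parameter list in a later
  # revision, with a default value, so older callers remain valid.

  farewell @1 (name :Text) -> (message :Text);
  # New method: its number is larger than that of every existing method.
}
{% endhighlight %}
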
The following changes are backwards-compatible but may change the canonical encoding of a message.
Apps that rely on canonicalization (such as some cryptographic protocols) should avoid changes in
this list, but most apps can safely use them:

* A field of type `List(T)`, where `T` is a primitive type, blob, or list, may be changed to type
  `List(U)`, where `U` is a struct type whose `@0` field is of type `T`. This rule is useful when
  you realize too late that you need to attach some extra data to each element of your list.
  Without this rule, you would be stuck defining parallel lists, which are ugly and error-prone.
  As a special exception to this rule, `List(Bool)` may **not** be upgraded to a list of structs,
  because implementing this for bit lists has proven unreasonably expensive.

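A sketch of this list upgrade, with hypothetical names. Assume the earlier revision declared
`tracks @0 :List(Text)`, holding only titles:

{% highlight capnp %}
struct Playlist {
  tracks @0 :List(Track);
  # Previously `tracks @0 :List(Text)`.

  struct Track {
    title @0 :Text;               # same type as the old list element
    durationSeconds @1 :UInt32;   # extra per-element data added by the upgrade
  }
}
{% endhighlight %}

The field keeps its number; only the element type changes, and the old element type survives as
the `@0` field of the new struct.
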
Any change not listed above should be assumed NOT to be safe. In particular:

* You cannot change a field, method, or enumerant's number.
* You cannot change a field or method parameter's type or default value.
* You cannot change a type's ID.
* You cannot change the name of a type that doesn't have an explicit ID, as the implicit ID is
  generated based in part on the type name.
* You cannot move a type to a different scope or file unless it has an explicit ID, as the implicit
  ID is based in part on the scope's ID.
* You cannot move an existing field into or out of an existing union, nor can you form a new union
  containing more than one existing field.

Also, these rules only apply to the Cap'n Proto native encoding. It is sometimes useful to
transcode Cap'n Proto types to other formats, like JSON, which may have different rules (e.g.,
field names cannot change in JSON).