sv-dependency-builds: src/capnproto-0.6.0/doc/language.md annotate

annotate src/capnproto-0.6.0/doc/language.md @ 169:223a55898ab9 tip default

Add null config files

author	Chris Cannam <cannam@all-day-breakfast.com>
date	Mon, 02 Mar 2020 14:03:47 +0000
parents	45360b968bf4

rev	line source
cannam@147	1 ---
cannam@147	2 layout: page
cannam@147	3 title: Schema Language
cannam@147	4 ---
cannam@147	5
cannam@147	6 # Schema Language
cannam@147	7
cannam@147	8 Like Protocol Buffers and Thrift (but unlike JSON or MessagePack), Cap'n Proto messages are
cannam@147	9 strongly-typed and not self-describing. You must define your message structure in a special
cannam@147	10 language, then invoke the Cap'n Proto compiler (`capnp compile`) to generate source code to
cannam@147	11 manipulate that message type in your desired language.
cannam@147	12
cannam@147	13 For example:
cannam@147	14
cannam@147	15 {% highlight capnp %}
cannam@147	16 @0xdbb9ad1f14bf0b36; # unique file ID, generated by `capnp id`
cannam@147	17
cannam@147	18 struct Person {
cannam@147	19   name @0 :Text;
cannam@147	20   birthdate @3 :Date;
cannam@147	21
cannam@147	22   email @1 :Text;
cannam@147	23   phones @2 :List(PhoneNumber);
cannam@147	24
cannam@147	25   struct PhoneNumber {
cannam@147	26     number @0 :Text;
cannam@147	27     type @1 :Type;
cannam@147	28
cannam@147	29     enum Type {
cannam@147	30       mobile @0;
cannam@147	31       home @1;
cannam@147	32       work @2;
cannam@147	33     }
cannam@147	34   }
cannam@147	35 }
cannam@147	36
cannam@147	37 struct Date {
cannam@147	38   year @0 :Int16;
cannam@147	39   month @1 :UInt8;
cannam@147	40   day @2 :UInt8;
cannam@147	41 }
cannam@147	42 {% endhighlight %}
cannam@147	43
cannam@147	44 Some notes:
cannam@147	45
cannam@147	46 * Types come after names. The name is by far the most important thing to see, especially when
cannam@147	47 quickly skimming, so we put it up front where it is most visible. Sorry, C got it wrong.
cannam@147	48 * The `@N` annotations show how the protocol evolved over time, so that the system can make sure
cannam@147	49 to maintain compatibility with older versions. Fields (and enumerants, and interface methods)
cannam@147	50 must be numbered consecutively starting from zero in the order in which they were added. In this
cannam@147	51 example, it looks like the `birthdate` field was added to the `Person` structure recently -- its
cannam@147	52 number is higher than the `email` and `phones` fields. Unlike Protobufs, you cannot skip numbers
cannam@147	53 when defining fields -- but there was never any reason to do so anyway.
cannam@147	54
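As a sketch of the numbering rule, suppose a later revision adds a `nickname` field to `Person` (the field is hypothetical, not part of the example above; `Date` and `PhoneNumber` are assumed to be unchanged). The new field takes the next unused number, `@4`, wherever it is placed in the source, and the existing numbers stay untouched:

{% highlight capnp %}
struct Person {
  name @0 :Text;
  nickname @4 :Text;    # hypothetical new field: added last, so numbered last
  birthdate @3 :Date;

  email @1 :Text;
  phones @2 :List(PhoneNumber);

  # Nested PhoneNumber declaration as before...
}
{% endhighlight %}
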
cannam@147	55 ## Language Reference
cannam@147	56
cannam@147	57 ### Comments
cannam@147	58
cannam@147	59 Comments are indicated by hash signs and extend to the end of the line:
cannam@147	60
cannam@147	61 {% highlight capnp %}
cannam@147	62 # This is a comment.
cannam@147	63 {% endhighlight %}
cannam@147	64
cannam@147	65 Comments meant as documentation should appear _after_ the declaration, either on the same line, or
cannam@147	66 on a subsequent line. Doc comments for aggregate definitions should appear on the line after the
cannam@147	67 opening brace.
cannam@147	68
cannam@147	69 {% highlight capnp %}
cannam@147	70 struct Date {
cannam@147	71   # A standard Gregorian calendar date.
cannam@147	72
cannam@147	73   year @0 :Int16;
cannam@147	74   # The year. Must include the century.
cannam@147	75   # Negative value indicates BC.
cannam@147	76
cannam@147	77   month @1 :UInt8;  # Month number, 1-12.
cannam@147	78   day @2 :UInt8;    # Day number, 1-31.
cannam@147	79 }
cannam@147	80 {% endhighlight %}
cannam@147	81
cannam@147	82 Placing the comment _after_ the declaration rather than before makes the code more readable,
cannam@147	83 especially when doc comments grow long. You almost always need to see the declaration before you
cannam@147	84 can start reading the comment.
cannam@147	85
cannam@147	86 ### Built-in Types
cannam@147	87
cannam@147	88 The following types are automatically defined:
cannam@147	89
cannam@147	90 * Void: `Void`
cannam@147	91 * Boolean: `Bool`
cannam@147	92 * Integers: `Int8`, `Int16`, `Int32`, `Int64`
cannam@147	93 * Unsigned integers: `UInt8`, `UInt16`, `UInt32`, `UInt64`
cannam@147	94 * Floating-point: `Float32`, `Float64`
cannam@147	95 * Blobs: `Text`, `Data`
cannam@147	96 * Lists: `List(T)`
cannam@147	97
cannam@147	98 Notes:
cannam@147	99
cannam@147	100 * The `Void` type has exactly one possible value, and thus can be encoded in zero bits. It is
cannam@147	101 rarely used, but can be useful as a union member.
cannam@147	102 * `Text` is always UTF-8 encoded and NUL-terminated.
cannam@147	103 * `Data` is a completely arbitrary sequence of bytes.
cannam@147	104 * `List` is a parameterized type, where the parameter is the element type. For example,
cannam@147	105 `List(Int32)`, `List(Person)`, and `List(List(Text))` are all valid.
cannam@147	106
cannam@147	107 ### Structs
cannam@147	108
cannam@147	109 A struct has a set of named, typed fields, numbered consecutively starting from zero.
cannam@147	110
cannam@147	111 {% highlight capnp %}
cannam@147	112 struct Person {
cannam@147	113   name @0 :Text;
cannam@147	114   email @1 :Text;
cannam@147	115 }
cannam@147	116 {% endhighlight %}
cannam@147	117
cannam@147	118 Fields can have default values:
cannam@147	119
cannam@147	120 {% highlight capnp %}
cannam@147	121 foo @0 :Int32 = 123;
cannam@147	122 bar @1 :Text = "blah";
cannam@147	123 baz @2 :List(Bool) = [ true, false, false, true ];
cannam@147	124 qux @3 :Person = (name = "Bob", email = "bob@example.com");
cannam@147	125 corge @4 :Void = void;
cannam@147	126 grault @5 :Data = 0x"a1 40 33";
cannam@147	127 {% endhighlight %}
cannam@147	128
cannam@147	129 ### Unions
cannam@147	130
cannam@147	131 A union is two or more fields of a struct which are stored in the same location. Only one of
cannam@147	132 these fields can be set at a time, and a separate tag is maintained to track which one is
cannam@147	133 currently set. Unlike in C, unions are not types; they are simply properties of fields, so
cannam@147	134 union declarations do not look like types.
cannam@147	135
cannam@147	136 {% highlight capnp %}
cannam@147	137 struct Person {
cannam@147	138   # ...
cannam@147	139
cannam@147	140   employment :union {
cannam@147	141     unemployed @4 :Void;
cannam@147	142     employer @5 :Company;
cannam@147	143     school @6 :School;
cannam@147	144     selfEmployed @7 :Void;
cannam@147	145     # We assume that a person is only one of these.
cannam@147	146   }
cannam@147	147 }
cannam@147	148 {% endhighlight %}
cannam@147	149
cannam@147	150 Additionally, unions can be unnamed. Each struct can contain no more than one unnamed union. Use
cannam@147	151 unnamed unions in cases where you would struggle to think of an appropriate name for the union,
cannam@147	152 because the union represents the main body of the struct.
cannam@147	153
cannam@147	154 {% highlight capnp %}
cannam@147	155 struct Shape {
cannam@147	156   area @0 :Float64;
cannam@147	157
cannam@147	158   union {
cannam@147	159     circle @1 :Float64;  # radius
cannam@147	160     square @2 :Float64;  # width
cannam@147	161   }
cannam@147	162 }
cannam@147	163 {% endhighlight %}
cannam@147	164
cannam@147	165 Notes:
cannam@147	166
cannam@147	167 * Union members are numbered in the same number space as fields of the containing struct.
cannam@147	168 Remember that the purpose of the numbers is to indicate the evolution order of the
cannam@147	169 struct. The system needs to know when the union fields were declared relative to the non-union
cannam@147	170 fields.
cannam@147	171
cannam@147	172 * Notice that we used the "useless" `Void` type here. We don't have any extra information to store
cannam@147	173 for the `unemployed` or `selfEmployed` cases, but we still want the union to distinguish these
cannam@147	174 states from others.
cannam@147	175
cannam@147	176 * By default, when a struct is initialized, the lowest-numbered field in the union is "set". If
cannam@147	177 you do not want any field set by default, simply declare a field called "unset" and make it the
cannam@147	178 lowest-numbered field.
cannam@147	179
cannam@147	180 * You can move an existing field into a new union without breaking compatibility with existing
cannam@147	181 data, as long as all of the other fields in the union are new. Since the existing field is
cannam@147	182 necessarily the lowest-numbered in the union, it will be the union's default field.
cannam@147	183
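The last two notes can be sketched as follows; the struct and field names are hypothetical. `Payment` declares an explicit `unset` member as the lowest-numbered field of its union, so a freshly initialized struct reports `unset`. `Contact` is assumed to have started out with the plain fields `name @0` and `email @1`; a later revision wraps `email` in an unnamed union together with the new member `phone @2`, and `email` remains the default member:

{% highlight capnp %}
struct Payment {
  amount @0 :UInt64;

  method :union {
    unset @1 :Void;         # lowest-numbered member, so it is the default
    card @2 :Text;
    bankTransfer @3 :Text;
  }
}

struct Contact {
  name @0 :Text;

  union {
    email @1 :Text;         # existing field, moved into the union unchanged
    phone @2 :Text;         # new field, mutually exclusive with email
  }
}
{% endhighlight %}
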
cannam@147	184 Wait, why aren't unions first-class types?
cannam@147	185
cannam@147	186 Requiring unions to be declared inside a struct, rather than living as free-standing types, has
cannam@147	187 some important advantages:
cannam@147	188
cannam@147	189 * If unions were first-class types, then union members would clearly have to be numbered separately
cannam@147	190 from the containing type's fields. This means that the compiler, when deciding how to position
cannam@147	191 the union in its containing struct, would have to conservatively assume that any kind of new
cannam@147	192 field might be added to the union in the future. To support this, all unions would have to
cannam@147	193 be allocated as separate objects embedded by pointer, wasting space.
cannam@147	194
cannam@147	195 * A free-standing union would be a liability for protocol evolution, because no additional data
cannam@147	196 can be attached to it later on. Consider, for example, a type which represents a parser token.
cannam@147	197 This type is naturally a union: it may be a keyword, identifier, numeric literal, quoted string,
cannam@147	198 etc. So the author defines it as a union, and the type is used widely. Later on, the developer
cannam@147	199 wants to attach information to the token indicating its line and column number in the source
cannam@147	200 file. Unfortunately, this is impossible without updating all users of the type, because the new
cannam@147	201 information ought to apply to _all_ token instances, not just specific members of the union. On
cannam@147	202 the other hand, if unions must be embedded within structs, it is always possible to add new
cannam@147	203 fields to the struct later on.
cannam@147	204
cannam@147	205 * When evolving a protocol it is common to discover that some existing field really should have
cannam@147	206 been enclosed in a union, because new fields being added are mutually exclusive with it. With
cannam@147	207 Cap'n Proto's unions, it is actually possible to "retroactively unionize" such a field without
cannam@147	208 changing its layout. This allows you to continue being able to read old data without wasting
cannam@147	209 space when writing new data. This is only possible when unions are declared within their
cannam@147	210 containing struct.
cannam@147	211
cannam@147	212 Cap'n Proto's unconventional approach to unions provides these advantages without any real
cannam@147	213 downside: where you would conventionally define a free-standing union type, in Cap'n Proto you
cannam@147	214 may simply define a struct type that contains only that union (probably unnamed), and you have
cannam@147	215 achieved the same effect. Thus, aside from being slightly unintuitive, it is strictly superior.
cannam@147	216
cannam@147	217 ### Groups
cannam@147	218
cannam@147	219 A group is a set of fields that are encapsulated in their own scope.
cannam@147	220
cannam@147	221 {% highlight capnp %}
cannam@147	222 struct Person {
cannam@147	223   # ...
cannam@147	224
cannam@147	225   # Note: This is a terrible way to use groups, and meant
cannam@147	226   # only to demonstrate the syntax.
cannam@147	227   address :group {
cannam@147	228     houseNumber @8 :UInt32;
cannam@147	229     street @9 :Text;
cannam@147	230     city @10 :Text;
cannam@147	231     country @11 :Text;
cannam@147	232   }
cannam@147	233 }
cannam@147	234 {% endhighlight %}
cannam@147	235
cannam@147	236 Interface-wise, the above group behaves as if you had defined a nested struct called `Address` and
cannam@147	237 then a field `address :Address`. However, a group is _not_ a separate object from its containing
cannam@147	238 struct: the fields are numbered in the same space as the containing struct's fields, and are laid
cannam@147	239 out exactly the same as if they hadn't been grouped at all. Essentially, a group is just a
cannam@147	240 namespace.
cannam@147	241
cannam@147	242 Groups on their own (as in the above example) are useless, almost as much so as the `Void` type.
cannam@147	243 They become interesting when used together with unions.
cannam@147	244
cannam@147	245 {% highlight capnp %}
cannam@147	246 struct Shape {
cannam@147	247   area @0 :Float64;
cannam@147	248
cannam@147	249   union {
cannam@147	250     circle :group {
cannam@147	251       radius @1 :Float64;
cannam@147	252     }
cannam@147	253     rectangle :group {
cannam@147	254       width @2 :Float64;
cannam@147	255       height @3 :Float64;
cannam@147	256     }
cannam@147	257   }
cannam@147	258 }
cannam@147	259 {% endhighlight %}
cannam@147	260
cannam@147	261 There are two main reasons to use groups with unions:
cannam@147	262
cannam@147	263 1. They are often more self-documenting. Notice that `radius` is now a member of `circle`, so
cannam@147	264 we don't need a comment to explain that the value of `circle` is its radius.
cannam@147	265 2. You can add additional members later on, without breaking compatibility. Notice how we upgraded
cannam@147	266 `square` to `rectangle` above, adding a `height` field. This definition is actually
cannam@147	267 wire-compatible with the previous version of the `Shape` example from the "union" section
cannam@147	268 (aside from the fact that `height` will always be zero when reading old data -- hey, it's not
cannam@147	269 a perfect example). In real-world use, it is common to realize after the fact that you need to
cannam@147	270 add some information to a struct that only applies when one particular union field is set.
cannam@147	271 Without the ability to upgrade to a group, you would have to define the new field separately,
cannam@147	272 and have it waste space when not relevant.
cannam@147	273
cannam@147	274 Note that a named union is actually exactly equivalent to a named group containing an unnamed
cannam@147	275 union.
cannam@147	276
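To make that equivalence concrete, the two hypothetical structs below spell the same thing in both ways. Going by the statement above, they should have the same layout and the same accessors; only the declaration differs:

{% highlight capnp %}
struct WithNamedUnion {
  employment :union {
    unemployed @0 :Void;
    employer @1 :Text;
  }
}

struct WithGroupedUnion {
  employment :group {
    union {
      unemployed @0 :Void;
      employer @1 :Text;
    }
  }
}
{% endhighlight %}
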
cannam@147	277 Wait, weren't groups considered a misfeature in Protobufs? Why did you do this again?
cannam@147	278
cannam@147	279 They are useful in unions, which Protobufs did not have. Meanwhile, you cannot have a "repeated
cannam@147	280 group" in Cap'n Proto, which was the case that got into the most trouble with Protobufs.
cannam@147	281
cannam@147	282 ### Dynamically-typed Fields
cannam@147	283
cannam@147	284 A struct may have a field with type `AnyPointer`. This field's value can be of any pointer type --
cannam@147	285 i.e. any struct, interface, list, or blob. This is essentially like a `void*` in C.
cannam@147	286
cannam@147	287 See also [generics](#generic-types).
cannam@147	288
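A minimal sketch of such a field, using a hypothetical `Envelope` struct. The `kind` field is an assumption of this example: since messages are not self-describing, the reader needs some other way to know which type to interpret `payload` as.

{% highlight capnp %}
struct Envelope {
  kind @0 :Text;
  # Tells the receiver which type to read `payload` as.

  payload @1 :AnyPointer;
  # May hold any struct, interface, list, or blob.
}
{% endhighlight %}
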
cannam@147	289 ### Enums
cannam@147	290
cannam@147	291 An enum is a type with a small finite set of symbolic values.
cannam@147	292
cannam@147	293 {% highlight capnp %}
cannam@147	294 enum Rfc3092Variable {
cannam@147	295   foo @0;
cannam@147	296   bar @1;
cannam@147	297   baz @2;
cannam@147	298   qux @3;
cannam@147	299   # ...
cannam@147	300 }
cannam@147	301 {% endhighlight %}
cannam@147	302
cannam@147	303 Like fields, enumerants must be numbered sequentially starting from zero. In languages where
cannam@147	304 enums have numeric values, these numbers will be used, but in general Cap'n Proto enums should not
cannam@147	305 be considered numeric.
cannam@147	306
cannam@147	307 ### Interfaces
cannam@147	308
cannam@147	309 An interface has a collection of methods, each of which takes some parameters and returns some
cannam@147	310 results. Like struct fields, methods are numbered. Interfaces support inheritance, including
cannam@147	311 multiple inheritance.
cannam@147	312
cannam@147	313 {% highlight capnp %}
cannam@147	314 interface Node {
cannam@147	315   isDirectory @0 () -> (result :Bool);
cannam@147	316 }
cannam@147	317
cannam@147	318 interface Directory extends(Node) {
cannam@147	319   list @0 () -> (list :List(Entry));
cannam@147	320   struct Entry {
cannam@147	321     name @0 :Text;
cannam@147	322     node @1 :Node;
cannam@147	323   }
cannam@147	324
cannam@147	325   create @1 (name :Text) -> (file :File);
cannam@147	326   mkdir @2 (name :Text) -> (directory :Directory);
cannam@147	327   open @3 (name :Text) -> (node :Node);
cannam@147	328   delete @4 (name :Text);
cannam@147	329   link @5 (name :Text, node :Node);
cannam@147	330 }
cannam@147	331
cannam@147	332 interface File extends(Node) {
cannam@147	333   size @0 () -> (size :UInt64);
cannam@147	334   read @1 (startAt :UInt64 = 0, amount :UInt64 = 0xffffffffffffffff)
cannam@147	335       -> (data :Data);
cannam@147	336   # Default params = read entire file.
cannam@147	337
cannam@147	338   write @2 (startAt :UInt64, data :Data);
cannam@147	339   truncate @3 (size :UInt64);
cannam@147	340 }
cannam@147	341 {% endhighlight %}
cannam@147	342
cannam@147	343 Notice something interesting here: `Node`, `Directory`, and `File` are interfaces, but several
cannam@147	344 methods take these types as parameters or return them as results. `Directory.Entry` is a struct,
cannam@147	345 but it contains a `Node`, which is an interface. Structs (and primitive types) are passed over RPC
cannam@147	346 by value, but interfaces are passed by reference. So when `Directory.list` is called remotely, the
cannam@147	347 content of a `List(Entry)` (including the text of each `name`) is transmitted back, but for the
cannam@147	348 `node` field, only a reference to some remote `Node` object is sent.
cannam@147	349
cannam@147	350 When a reference to an object is transmitted, the RPC system automatically ensures that
cannam@147	351 the recipient gets permission to call the referenced object -- because if the recipient wasn't
cannam@147	352 meant to have access, the sender shouldn't have sent the reference in the first place. This makes
cannam@147	353 it very easy to develop secure protocols with Cap'n Proto -- you almost don't need to think about
cannam@147	354 access control at all. This feature is what makes Cap'n Proto a "capability-based" RPC system -- a
cannam@147	355 reference to an object inherently represents a "capability" to access it.
cannam@147	356
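The file system schema above uses only single inheritance. As a sketch of multiple inheritance, with hypothetical interface names, the base interfaces are listed together inside `extends(...)`. As in the schema above, each interface numbers its own methods starting from zero:

{% highlight capnp %}
interface Readable {
  read @0 (amount :UInt64) -> (data :Data);
}

interface Writable {
  write @0 (data :Data);
}

interface Stream extends(Readable, Writable) {
  # Inherits `read` and `write`, and adds a method of its own.
  close @0 ();
}
{% endhighlight %}
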
cannam@147	357 ### Generic Types
cannam@147	358
cannam@147	359 A struct or interface type may be parameterized, making it "generic". For example, this is useful
cannam@147	360 for defining type-safe containers:
cannam@147	361
cannam@147	362 {% highlight capnp %}
cannam@147	363 struct Map(Key, Value) {
cannam@147	364   entries @0 :List(Entry);
cannam@147	365   struct Entry {
cannam@147	366     key @0 :Key;
cannam@147	367     value @1 :Value;
cannam@147	368   }
cannam@147	369 }
cannam@147	370
cannam@147	371 struct People {
cannam@147	372   byName @0 :Map(Text, Person);
cannam@147	373   # Maps names to Person instances.
cannam@147	374 }
cannam@147	375 {% endhighlight %}
cannam@147	376
cannam@147	377 Cap'n Proto generics work very similarly to Java generics or C++ templates. Some notes:
cannam@147	378
cannam@147	379 * Only pointer types (structs, lists, blobs, and interfaces) can be used as generic parameters,
cannam@147	380 much like in Java. This is a pragmatic limitation: allowing parameters to have non-pointer types
cannam@147	381 would mean that different parameterizations of a struct could have completely different layouts,
cannam@147	382 which would excessively complicate the Cap'n Proto implementation.
cannam@147	383
cannam@147	384 * A type declaration nested inside a generic type may use the type parameters of the outer type,
cannam@147	385 as you can see in the example above. This differs from Java, but matches C++. If you want to
cannam@147	386 refer to a nested type from outside the outer type, you must specify the parameters on the outer
cannam@147	387 type, not the inner. For example, `Map(Text, Person).Entry` is a valid type;
cannam@147	388 `Map.Entry(Text, Person)` is NOT valid. (Of course, an inner type may declare additional generic
cannam@147	389 parameters.)
cannam@147	390
cannam@147	391 * If you refer to a generic type but omit its parameters (e.g. declare a field of type `Map` rather
cannam@147	392 than `Map(T, U)`), it is as if you specified `AnyPointer` for each parameter. Note that such
cannam@147	393 a type is wire-compatible with any specific parameterization, so long as you interpret the
cannam@147	394 `AnyPointer`s as the correct type at runtime.
cannam@147	395
cannam@147	396 * Relatedly, it is safe to cast a generic interface of a specific parameterization to a generic
cannam@147	397 interface where all parameters are `AnyPointer` and vice versa, as long as the `AnyPointer`s are
cannam@147	398 treated as the correct type at runtime. This means that e.g. you can implement a server in a
cannam@147	399 generic way that is correct for all parameterizations but call it from clients using a specific
cannam@147	400 parameterization.
cannam@147	401
cannam@147	402 * The encoding of a generic type is exactly the same as the encoding of a type produced by
cannam@147	403 substituting the type parameters manually. For example, `Map(Text, Person)` is encoded exactly
cannam@147	404 the same as:
cannam@147	405
cannam@147	406 <div>{% highlight capnp %}
cannam@147	407 struct PersonMap {
cannam@147	408   # Encoded the same as Map(Text, Person).
cannam@147	409   entries @0 :List(Entry);
cannam@147	410   struct Entry {
cannam@147	411     key @0 :Text;
cannam@147	412     value @1 :Person;
cannam@147	413   }
cannam@147	414 }
cannam@147	415 {% endhighlight %}
cannam@147	416 </div>
cannam@147	417
cannam@147	418 Therefore, it is possible to upgrade non-generic types to generic types while retaining
cannam@147	419 backwards-compatibility.
cannam@147	420
cannam@147	421 * Similarly, a generic interface's protocol is exactly the same as the interface obtained by
cannam@147	422 manually substituting the generic parameters.
cannam@147	423
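The notes on nested types and omitted parameters can be sketched as follows. `Map` is the generic struct defined above, repeated so the fragment stands alone; `Person` is cut down to a single field, and `Lookup` is a hypothetical struct added only for illustration:

{% highlight capnp %}
struct Map(Key, Value) {
  entries @0 :List(Entry);
  struct Entry {
    key @0 :Key;
    value @1 :Value;
  }
}

struct Person {
  name @0 :Text;
}

struct Lookup {
  first @0 :Map(Text, Person).Entry;
  # Parameters go on the outer type; `Map.Entry(Text, Person)` is not valid.

  untyped @1 :Map;
  # Parameters omitted: treated as Map(AnyPointer, AnyPointer).
}
{% endhighlight %}
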
cannam@147	424 ### Generic Methods
cannam@147	425
cannam@147	426 Interface methods may also have "implicit" generic parameters that apply to a particular method
cannam@147	427 call. This commonly applies to "factory" methods. For example:
cannam@147	428
cannam@147	429 {% highlight capnp %}
cannam@147	430 interface Assignable(T) {
cannam@147	431   # A generic interface, with non-generic methods.
cannam@147	432   get @0 () -> (value :T);
cannam@147	433   set @1 (value :T) -> ();
cannam@147	434 }
cannam@147	435
cannam@147	436 interface AssignableFactory {
cannam@147	437   newAssignable @0 [T] (initialValue :T)
cannam@147	438       -> (assignable :Assignable(T));
cannam@147	439   # A generic method.
cannam@147	440 }
cannam@147	441 {% endhighlight %}
cannam@147	442
cannam@147	443 Here, the method `newAssignable()` is generic. The return type of the method depends on the input
cannam@147	444 type.
cannam@147	445
cannam@147	446 Ideally, calls to a generic method should not have to explicitly specify the method's type
cannam@147	447 parameters, because they should be inferred from the types of the method's regular parameters.
cannam@147	448 However, this may not always be possible; it depends on the programming language and API details.
cannam@147	449
cannam@147	450 Note that if a method's generic parameter is used only in its returns, not its parameters, then
cannam@147	451 this implies that the returned value is appropriate for any parameterization. For example:
cannam@147	452
cannam@147	453 {% highlight capnp %}
cannam@147	454 newUnsetAssignable @1 [T] () -> (assignable :Assignable(T));
cannam@147	455 # Create a new assignable. `get()` on the returned object will
cannam@147	456 # throw an exception until `set()` has been called at least once.
cannam@147	457 {% endhighlight %}
cannam@147	458
cannam@147	459 Because of the way this method is designed, the returned `Assignable` is initially valid for any
cannam@147	460 `T`. Effectively, it doesn't take on a type until the first time `set()` is called, and then `T`
cannam@147	461 retroactively becomes the type of value passed to `set()`.
cannam@147	462
cannam@147	463 In contrast, if it's the case that the returned type is unknown, then you should NOT declare it
cannam@147	464 as generic. Instead, use `AnyPointer`, or omit a type's parameters (since they default to
cannam@147	465 `AnyPointer`). For example:
cannam@147	466
cannam@147	467 {% highlight capnp %}
cannam@147	468 getNamedAssignable @2 (name :Text) -> (assignable :Assignable);
cannam@147	469 # Get the `Assignable` with the given name. It is the
cannam@147	470 # responsibility of the caller to keep track of the type of each
cannam@147	471 # named `Assignable` and cast the returned object appropriately.
cannam@147	472 {% endhighlight %}
cannam@147	473
cannam@147	474 Here, we omitted the parameters to `Assignable` in the return type, because the returned object
cannam@147	475 has a specific type parameterization but it is not locally knowable.
cannam@147	476
cannam@147	477 ### Constants
cannam@147	478
cannam@147	479 You can define constants in Cap'n Proto. These don't affect what is sent on the wire, but they
cannam@147	480 will be included in the generated code, and can be [evaluated using the `capnp`
cannam@147	481 tool](capnp-tool.html#evaluating-constants).
cannam@147	482
cannam@147	483 {% highlight capnp %}
cannam@147	484 const pi :Float32 = 3.14159;
cannam@147	485 const bob :Person = (name = "Bob", email = "bob@example.com");
cannam@147	486 const secret :Data = 0x"9f98739c2b53835e 6720a00907abd42f";
cannam@147	487 {% endhighlight %}
cannam@147	488
cannam@147	489 Additionally, you may refer to a constant inside another value (e.g. another constant, or a default
cannam@147	490 value of a field).
cannam@147	491
cannam@147	492 {% highlight capnp %}
cannam@147	493 const foo :Int32 = 123;
cannam@147	494 const bar :Text = "Hello";
cannam@147	495 const baz :SomeStruct = (id = .foo, message = .bar);
cannam@147	496 {% endhighlight %}
cannam@147	497
cannam@147	498 Note that when substituting a constant into another value, the constant's name must be qualified
cannam@147	499 with its scope. E.g. if a constant `qux` is declared nested in a type `Corge`, it would need to
cannam@147	500 be referenced as `Corge.qux` rather than just `qux`, even when used within the `Corge` scope.
cannam@147	501 Constants declared at the top-level scope are prefixed just with `.`. This rule helps to make it
cannam@147	502 clear that the name refers to a user-defined constant, rather than a literal value (like `true` or
cannam@147	503 `inf`) or an enum value.
cannam@147	504
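A sketch of the qualification rule, reusing the placeholder names from the paragraph above. The constant `topLevel`, the values, and the two fields are made up for illustration:

{% highlight capnp %}
const topLevel :Int32 = 7;

struct Corge {
  const qux :Int32 = 42;

  count @0 :Int32 = Corge.qux;
  # Qualified with the enclosing type, even though we are inside `Corge`.

  limit @1 :Int32 = .topLevel;
  # A top-level constant is prefixed with just `.`.
}
{% endhighlight %}
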
cannam@147	505 ### Nesting, Scope, and Aliases
cannam@147	506
cannam@147	507 You can nest constant, alias, and type definitions inside structs and interfaces (but not enums).
cannam@147	508 This has no effect on any definition involved except to define the scope of its name. So in Java
cannam@147	509 terms, inner classes are always "static". To name a nested type from another scope, separate the
cannam@147	510 path with `.`s.
cannam@147	511
cannam@147	512 {% highlight capnp %}
cannam@147	513 struct Foo {
cannam@147	514   struct Bar {
cannam@147	515     #...
cannam@147	516   }
cannam@147	517   bar @0 :Bar;
cannam@147	518 }
cannam@147	519
cannam@147	520 struct Baz {
cannam@147	521   bar @0 :Foo.Bar;
cannam@147	522 }
cannam@147	523 {% endhighlight %}
cannam@147	524
cannam@147	525 If typing long scopes becomes cumbersome, you can use `using` to declare an alias.
cannam@147	526
cannam@147	527 {% highlight capnp %}
cannam@147	528 struct Qux {
cannam@147	529   using Foo.Bar;
cannam@147	530   bar @0 :Bar;
cannam@147	531 }
cannam@147	532
cannam@147	533 struct Corge {
cannam@147	534   using T = Foo.Bar;
cannam@147	535   bar @0 :T;
cannam@147	536 }
cannam@147	537 {% endhighlight %}
cannam@147	538
cannam@147	539 ### Imports
cannam@147	540
cannam@147	541 An `import` expression names the scope of some other file:
cannam@147	542
cannam@147	543 {% highlight capnp %}
cannam@147	544 struct Foo {
cannam@147	545   # Use type "Baz" defined in bar.capnp.
cannam@147	546   baz @0 :import "bar.capnp".Baz;
cannam@147	547 }
cannam@147	548 {% endhighlight %}
cannam@147	549
cannam@147	550 Of course, typically it's more readable to define an alias:
cannam@147	551
cannam@147	552 {% highlight capnp %}
cannam@147	553 using Bar = import "bar.capnp";
cannam@147	554
cannam@147	555 struct Foo {
cannam@147	556   # Use type "Baz" defined in bar.capnp.
cannam@147	557   baz @0 :Bar.Baz;
cannam@147	558 }
cannam@147	559 {% endhighlight %}
cannam@147	560
cannam@147	561 Or even:
cannam@147	562
cannam@147	563 {% highlight capnp %}
cannam@147	564 using import "bar.capnp".Baz;
cannam@147	565
cannam@147	566 struct Foo {
cannam@147	567   baz @0 :Baz;
cannam@147	568 }
cannam@147	569 {% endhighlight %}
cannam@147	570
cannam@147	571 The above imports specify relative paths. If the path begins with a `/`, it is absolute -- in
cannam@147	572 this case, the `capnp` tool searches for the file in each of the search path directories specified
cannam@147	573 with `-I`.
cannam@147	574
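For instance, assuming a hypothetical layout in which shared schemas live under a directory passed with `-I` (the directory, file, and type names below are all made up):

{% highlight capnp %}
# Compiled with something like:
#   capnp compile -Ischemas -oc++ app.capnp
# so that "/common/types.capnp" is looked up as schemas/common/types.capnp.

using Types = import "/common/types.capnp";

struct Foo {
  id @0 :Types.Id;
}
{% endhighlight %}
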
cannam@147	575 ### Annotations
cannam@147	576
cannam@147	577 Sometimes you want to attach extra information to parts of your protocol that isn't part of the
cannam@147	578 Cap'n Proto language. This information might control details of a particular code generator, or
cannam@147	579 you might even read it at run time to assist in some kind of dynamic message processing. For
cannam@147	580 example, you might create a field annotation which means "hide from the public", and when you send
cannam@147	581 a message to an external user, you might invoke some code first that iterates over your message and
cannam@147	582 removes all of these hidden fields.
cannam@147	583
cannam@147	584 You may declare annotations and use them like so:
cannam@147	585
cannam@147	586 {% highlight capnp %}
cannam@147	587 # Declare an annotation 'foo' which applies to struct and enum types.
cannam@147	588 annotation foo(struct, enum) :Text;
cannam@147	589
cannam@147	590 # Apply 'foo' to MyType.
cannam@147	591 struct MyType $foo("bar") {
cannam@147	592   # ...
cannam@147	593 }
cannam@147	594 {% endhighlight %}
cannam@147	595
cannam@147	596 The possible targets for an annotation are: `file`, `struct`, `field`, `union`, `enum`, `enumerant`,
cannam@147	597 `interface`, `method`, `parameter`, `annotation`, `const`. You may also specify `*` to cover them
cannam@147	598 all.
cannam@147	599
cannam@147	600 {% highlight capnp %}
cannam@147	601 # 'baz' can annotate anything!
cannam@147	602 annotation baz(*) :Int32;
cannam@147	603
cannam@147	604 $baz(1); # Annotate the file.
cannam@147	605
cannam@147	606 struct MyStruct $baz(2) {
cannam@147	607   myField @0 :Text = "default" $baz(3);
cannam@147	608   myUnion :union $baz(4) {
cannam@147	609     # ...
cannam@147	610   }
cannam@147	611 }
cannam@147	612
cannam@147	613 enum MyEnum $baz(5) {
cannam@147	614   myEnumerant @0 $baz(6);
cannam@147	615 }
cannam@147	616
cannam@147	617 interface MyInterface $baz(7) {
cannam@147	618   myMethod @0 (myParam :Text $baz(9)) -> () $baz(8);
cannam@147	619 }
cannam@147	620
cannam@147	621 annotation myAnnotation(struct) :Int32 $baz(10);
cannam@147	622 const myConst :Int32 = 123 $baz(11);
cannam@147	623 {% endhighlight %}
cannam@147	624
cannam@147	625 `Void` annotations can omit the value. Struct-typed annotations are also allowed. Tip: If
cannam@147	626 you want an annotation to have a default value, declare it as a struct with a single field with
cannam@147	627 a default value.
cannam@147	628
cannam@147	629 {% highlight capnp %}
cannam@147	630 annotation qux(struct, field) :Void;
cannam@147	631
cannam@147	632 struct MyStruct $qux {
cannam@147	633   string @0 :Text $qux;
cannam@147	634   number @1 :Int32 $qux;
cannam@147	635 }
cannam@147	636
cannam@147	637 annotation corge(file) :MyStruct;
cannam@147	638
cannam@147	639 $corge(string = "hello", number = 123);
cannam@147	640
cannam@147	641 struct Grault {
cannam@147	642   value @0 :Int32 = 123;
cannam@147	643 }
cannam@147	644
cannam@147	645 annotation grault(file) :Grault;
cannam@147	646
cannam@147	647 $grault(); # value defaults to 123
cannam@147	648 $grault(value = 456);
cannam@147	649 {% endhighlight %}
cannam@147	650
cannam@147	651 ### Unique IDs
cannam@147	652
cannam@147	653 A Cap'n Proto file must have a unique 64-bit ID, and each type and annotation defined therein may
cannam@147	654 also have an ID. Use `capnp id` to generate a new ID randomly. ID specifications begin with `@`:
cannam@147	655
cannam@147	656 {% highlight capnp %}
cannam@147	657 # file ID
cannam@147	658 @0xdbb9ad1f14bf0b36;
cannam@147	659
cannam@147	660 struct Foo @0x8db435604d0d3723 {
cannam@147	661   # ...
cannam@147	662 }
cannam@147	663
cannam@147	664 enum Bar @0xb400f69b5334aab3 {
cannam@147	665   # ...
cannam@147	666 }
cannam@147	667
cannam@147	668 interface Baz @0xf7141baba3c12691 {
cannam@147	669   # ...
cannam@147	670 }
cannam@147	671
cannam@147	672 annotation qux @0xf8a1bedf44c89f00 (field) :Text;
cannam@147	673 {% endhighlight %}
cannam@147	674
cannam@147	675 If you omit the ID for a type or annotation, one will be assigned automatically. This default
cannam@147	676 ID is derived by taking the first 8 bytes of the MD5 hash of the parent scope's ID concatenated
cannam@147	677 with the declaration's name (where the "parent scope" is the file for top-level declarations, or
cannam@147	678 the outer type for nested declarations). You can see the automatically-generated IDs by "compiling"
cannam@147	679 your file with the `-ocapnp` flag, which echoes the schema back to the terminal annotated with
cannam@147	680 extra information, e.g. `capnp compile -ocapnp myschema.capnp`. In general, you would only specify
cannam@147	681 an explicit ID for a declaration if that declaration has been renamed or moved and you want the ID
cannam@147	682 to stay the same for backwards-compatibility.
cannam@147	683
cannam@147	684 IDs exist to provide a relatively short yet unambiguous way to refer to a type or annotation from
cannam@147	685 another context. They may be used for representing schemas, for tagging dynamically-typed fields,
cannam@147	686 etc. Most languages prefer instead to define a symbolic global namespace e.g. full of "packages",
cannam@147	687 but this would have some important disadvantages in the context of Cap'n Proto:
cannam@147	688
cannam@147	689 * Programmers often feel the need to change symbolic names and organization in order to make their
cannam@147	690 code cleaner, but the renamed code should still work with existing encoded data.
cannam@147	691 * It's easy for symbolic names to collide, and these collisions could be hard to detect in a large
cannam@147	692 distributed system with many different binaries using different versions of protocols.
cannam@147	693 * Fully-qualified type names may be large and waste space when transmitted on the wire.
cannam@147	694
cannam@147	695 Note that IDs are 64-bit (actually, 63-bit, as the first bit is always 1). Random collisions
cannam@147	696 are possible, but unlikely -- there would have to be on the order of a billion types before this
cannam@147	697 becomes a real concern. Collisions from misuse (e.g. copying an example without changing the ID)
cannam@147	698 are much more likely.
cannam@147	699
cannam@147	700 ## Evolving Your Protocol
cannam@147	701
cannam@147	702 A protocol can be changed in the following ways without breaking backwards-compatibility, and
cannam@147	703 without changing the [canonical](encoding.html#canonicalization) encoding of a message:
cannam@147	704
cannam@147	705 * New types, constants, and aliases can be added anywhere, since they obviously don't affect the
cannam@147	706 encoding of any existing type.
cannam@147	707
cannam@147	708 * New fields, enumerants, and methods may be added to structs, enums, and interfaces, respectively,
cannam@147	709 as long as each new member's number is larger than all previous members. Similarly, new fields
cannam@147	710 may be added to existing groups and unions.
cannam@147	711
cannam@147	712 * New parameters may be added to a method. The new parameters must be added to the end of the
cannam@147	713 parameter list and must have default values.
cannam@147	714
cannam@147	715 * Members can be re-arranged in the source code, so long as their numbers stay the same.
cannam@147	716
cannam@147	717 * Any symbolic name can be changed, as long as the type ID / ordinal numbers stay the same. Note
cannam@147	718 that type declarations have an implicit ID generated based on their name and parent's ID, but
cannam@147	719 you can use `capnp compile -ocapnp myschema.capnp` to find out what that number is, and then
cannam@147	720 declare it explicitly after your rename.
cannam@147	721
cannam@147	722 * Type definitions can be moved to different scopes, as long as the type ID is declared
cannam@147	723 explicitly.
cannam@147	724
cannam@147	725 * A field can be moved into a group or a union, as long as the group/union and all other fields
cannam@147	726 within it are new. In other words, a field can be replaced with a group or union containing an
cannam@147	727 equivalent field and some new fields.
cannam@147	728
cannam@147	729 * A non-generic type can be made [generic](#generic-types), and new generic parameters may be
cannam@147	730 added to an existing generic type. Other types used inside the body of the newly-generic type can
cannam@147	731 be replaced with the new generic parameter so long as all existing users of the type are updated
cannam@147	732 to bind that generic parameter to the type it replaced. For example:
cannam@147	733
cannam@147	734 <div>{% highlight capnp %}
cannam@147	735 struct Map {
cannam@147	736   entries @0 :List(Entry);
cannam@147	737   struct Entry {
cannam@147	738     key @0 :Text;
cannam@147	739     value @1 :Text;
cannam@147	740   }
cannam@147	741 }
cannam@147	742 {% endhighlight %}
cannam@147	743 </div>
cannam@147	744
cannam@147	745 This can be changed to:
cannam@147	746
cannam@147	747 <div>{% highlight capnp %}
cannam@147	748 struct Map(Key, Value) {
cannam@147	749   entries @0 :List(Entry);
cannam@147	750   struct Entry {
cannam@147	751     key @0 :Key;
cannam@147	752     value @1 :Value;
cannam@147	753   }
cannam@147	754 }
cannam@147	755 {% endhighlight %}
cannam@147	756 </div>
cannam@147	757
cannam@147	758 This change is safe as long as all existing uses of `Map` are replaced with `Map(Text, Text)` (and
cannam@147	759 any uses of `Map.Entry` are replaced with `Map(Text, Text).Entry`).
cannam@147	760
cannam@147	761 (This rule applies analogously to generic methods.)
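
As a rough sketch of several of the rules above applied together, suppose a schema originally
declared the following struct. All names here are hypothetical and exist only for illustration:

{% highlight capnp %}
struct Person {
  name @0 :Text;
  email @1 :Text;
}
{% endhighlight %}

Under the rules above, it could plausibly evolve into the following. The explicit ID shown is a
placeholder, standing in for whatever `capnp compile -ocapnp` reported for `Person` before the
rename:

{% highlight capnp %}
struct Contact @0xd1a2b3c4d5e6f708 {
  # Renamed from `Person`. The ID above is a placeholder for the implicit ID
  # the type had before the rename, now declared explicitly.

  birthYear @2 :Int16;  # New field: its number is larger than all previous ones.
                        # Source order is free to change; only numbers matter.

  fullName @0 :Text;    # Renamed from `name`; the number @0 is unchanged.

  contactInfo :union {  # New union wrapping one existing field.
    email @1 :Text;     # Existing field, same number as before.
    phone @3 :Text;     # New field, so the union and its other member are new.
  }
}
{% endhighlight %}

Data written with the old `Person` definition never touches `birthYear` or `phone`, so a reader
using the new definition should see their default values and find `email` as the active union
member.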
cannam@147	762
cannam@147	763 The following changes are backwards-compatible but may change the canonical encoding of a message.
cannam@147	764 Apps that rely on canonicalization (such as some cryptographic protocols) should avoid changes in
cannam@147	765 this list, but most apps can safely use them:
cannam@147	766
cannam@147	767 * A field of type `List(T)`, where `T` is a primitive type, blob, or list, may be changed to type
cannam@147	768 `List(U)`, where `U` is a struct type whose `@0` field is of type `T`. This rule is useful when
cannam@147	769 you realize too late that you need to attach some extra data to each element of your list.
cannam@147	770 Without this rule, you would be stuck defining parallel lists, which are ugly and error-prone.
cannam@147	771 As a special exception to this rule, `List(Bool)` may not be upgraded to a list of structs,
cannam@147	772 because implementing this for bit lists has proven unreasonably expensive.
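
As an illustration of this rule, consider a hypothetical `Playlist` type (the type and field names
are invented for this sketch) that starts out with a plain list of titles:

{% highlight capnp %}
struct Playlist {
  tracks @0 :List(Text);
}
{% endhighlight %}

If each track later needs extra data, the list could be upgraded to a list of structs whose `@0`
field has the old element type:

{% highlight capnp %}
struct Playlist {
  tracks @0 :List(Track);

  struct Track {
    title @0 :Text;           # Same type as the old list element.
    durationSecs @1 :UInt32;  # New per-element data; reads as 0 in old messages.
  }
}
{% endhighlight %}

Because the two forms are laid out differently on the wire, this is one of the changes that can
alter the canonical encoding.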
cannam@147	773
cannam@147	774 Any change not listed above should be assumed NOT to be safe. In particular:
cannam@147	775
cannam@147	776 * You cannot change a field, method, or enumerant's number.
cannam@147	777 * You cannot change a field or method parameter's type or default value.
cannam@147	778 * You cannot change a type's ID.
cannam@147	779 * You cannot change the name of a type that doesn't have an explicit ID, as the implicit ID is
cannam@147	780 generated based in part on the type name.
cannam@147	781 * You cannot move a type to a different scope or file unless it has an explicit ID, as the implicit
cannam@147	782 ID is based in part on the scope's ID.
cannam@147	783 * You cannot move an existing field into or out of an existing union, nor can you form a new union
cannam@147	784 containing more than one existing field.
cannam@147	785
cannam@147	786 Also, these rules only apply to the Cap'n Proto native encoding. It is sometimes useful to
cannam@147	787 transcode Cap'n Proto types to other formats, like JSON, which may have different rules (e.g.,
cannam@147	788 field names cannot change in JSON).
