---
layout: page
title: Schema Language
---

# Schema Language

Like Protocol Buffers and Thrift (but unlike JSON or MessagePack), Cap'n Proto messages are
strongly-typed and not self-describing. You must define your message structure in a special
language, then invoke the Cap'n Proto compiler (`capnp compile`) to generate source code to
manipulate that message type in your desired language.

For example:

{% highlight capnp %}
@0xdbb9ad1f14bf0b36;  # unique file ID, generated by `capnp id`

struct Person {
  name @0 :Text;
  birthdate @3 :Date;

  email @1 :Text;
  phones @2 :List(PhoneNumber);

  struct PhoneNumber {
    number @0 :Text;
    type @1 :Type;

    enum Type {
      mobile @0;
      home @1;
      work @2;
    }
  }
}

struct Date {
  year @0 :Int16;
  month @1 :UInt8;
  day @2 :UInt8;
}
{% endhighlight %}

Some notes:

* Types come after names. The name is by far the most important thing to see, especially when
  quickly skimming, so we put it up front where it is most visible. Sorry, C got it wrong.
* The `@N` annotations show how the protocol evolved over time, so that the system can make sure
  to maintain compatibility with older versions. Fields (and enumerants, and interface methods)
  must be numbered consecutively starting from zero in the order in which they were added.
  In this example, it looks like the `birthdate` field was added to the `Person` structure
  recently -- its number is higher than the `email` and `phones` fields. Unlike Protobufs, you
  cannot skip numbers when defining fields -- but there was never any reason to do so anyway.

## Language Reference

### Comments

Comments are indicated by hash signs and extend to the end of the line:

{% highlight capnp %}
# This is a comment.
{% endhighlight %}

Comments meant as documentation should appear _after_ the declaration, either on the same line, or
on a subsequent line. Doc comments for aggregate definitions should appear on the line after the
opening brace.

{% highlight capnp %}
struct Date {
  # A standard Gregorian calendar date.

  year @0 :Int16;
  # The year. Must include the century.
  # Negative value indicates BC.

  month @1 :UInt8;   # Month number, 1-12.
  day @2 :UInt8;     # Day number, 1-31.
}
{% endhighlight %}

Placing the comment _after_ the declaration rather than before makes the code more readable,
especially when doc comments grow long. You almost always need to see the declaration before you
can start reading the comment.
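
The same placement rules carry over to other kinds of declarations. The sketch below is
illustrative only: `Weekday` and `Appointment` are hypothetical types invented for this example,
not part of Cap'n Proto. It shows a doc comment on the line after an opening brace, a same-line
comment, and a comment on the line following a member.

{% highlight capnp %}
enum Weekday {
  # A working day. Hypothetical type, for illustration only.

  monday @0;   # Same-line doc comment for an enumerant.
  tuesday @1;
  # Doc comment placed on the line after the enumerant.
}

struct Appointment {
  # A calendar entry. Hypothetical type, for illustration only.

  day @0 :Weekday;  # The day on which the appointment falls.

  note @1 :Text;
  # Free-form description of the appointment.
}
{% endhighlight %}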

### Built-in Types

The following types are automatically defined:

* **Void:** `Void`
* **Boolean:** `Bool`
* **Integers:** `Int8`, `Int16`, `Int32`, `Int64`
* **Unsigned integers:** `UInt8`, `UInt16`, `UInt32`, `UInt64`
* **Floating-point:** `Float32`, `Float64`
* **Blobs:** `Text`, `Data`
* **Lists:** `List(T)`

Notes:

* The `Void` type has exactly one possible value, and thus can be encoded in zero bits. It is
  rarely used, but can be useful as a union member.
* `Text` is always UTF-8 encoded and NUL-terminated.
* `Data` is a completely arbitrary sequence of bytes.
* `List` is a parameterized type, where the parameter is the element type. For example,
  `List(Int32)`, `List(Person)`, and `List(List(Text))` are all valid.

### Structs

A struct has a set of named, typed fields, numbered consecutively starting from zero.

{% highlight capnp %}
struct Person {
  name @0 :Text;
  email @1 :Text;
}
{% endhighlight %}

Fields can have default values:

{% highlight capnp %}
foo @0 :Int32 = 123;
bar @1 :Text = "blah";
baz @2 :List(Bool) = [ true, false, false, true ];
qux @3 :Person = (name = "Bob", email = "bob@example.com");
corge @4 :Void = void;
grault @5 :Data = 0x"a1 40 33";
{% endhighlight %}

### Unions

A union is two or more fields of a struct which are stored in the same location. Only one of
these fields can be set at a time, and a separate tag is maintained to track which one is
currently set.
Unlike in C, unions are not types; they are simply properties of fields, so union declarations
do not look like types.

{% highlight capnp %}
struct Person {
  # ...

  employment :union {
    unemployed @4 :Void;
    employer @5 :Company;
    school @6 :School;
    selfEmployed @7 :Void;
    # We assume that a person is only one of these.
  }
}
{% endhighlight %}

Additionally, unions can be unnamed. Each struct can contain no more than one unnamed union. Use
unnamed unions in cases where you would struggle to think of an appropriate name for the union,
because the union represents the main body of the struct.

{% highlight capnp %}
struct Shape {
  area @0 :Float64;

  union {
    circle @1 :Float64;      # radius
    square @2 :Float64;      # width
  }
}
{% endhighlight %}

Notes:

* Union members are numbered in the same number space as fields of the containing struct.
  Remember that the purpose of the numbers is to indicate the evolution order of the
  struct. The system needs to know when the union fields were declared relative to the non-union
  fields.

* Notice that we used the "useless" `Void` type here. We don't have any extra information to store
  for the `unemployed` or `selfEmployed` cases, but we still want the union to distinguish these
  states from others.

* By default, when a struct is initialized, the lowest-numbered field in the union is "set". If
  you do not want any field set by default, simply declare a field called "unset" and make it the
  lowest-numbered field.

* You can move an existing field into a new union without breaking compatibility with existing
  data, as long as all of the other fields in the union are new. Since the existing field is
  necessarily the lowest-numbered in the union, it will be the union's default field.

**Wait, why aren't unions first-class types?**

Requiring unions to be declared inside a struct, rather than living as free-standing types, has
some important advantages:

* If unions were first-class types, then union members would clearly have to be numbered separately
  from the containing type's fields. This means that the compiler, when deciding how to position
  the union in its containing struct, would have to conservatively assume that any kind of new
  field might be added to the union in the future. To support this, all unions would have to
  be allocated as separate objects embedded by pointer, wasting space.

* A free-standing union would be a liability for protocol evolution, because no additional data
  can be attached to it later on. Consider, for example, a type which represents a parser token.
  This type is naturally a union: it may be a keyword, identifier, numeric literal, quoted string,
  etc. So the author defines it as a union, and the type is used widely. Later on, the developer
  wants to attach information to the token indicating its line and column number in the source
  file. Unfortunately, this is impossible without updating all users of the type, because the new
  information ought to apply to _all_ token instances, not just specific members of the union. On
  the other hand, if unions must be embedded within structs, it is always possible to add new
  fields to the struct later on.

* When evolving a protocol it is common to discover that some existing field really should have
  been enclosed in a union, because new fields being added are mutually exclusive with it. With
  Cap'n Proto's unions, it is actually possible to "retroactively unionize" such a field without
  changing its layout. This allows you to continue being able to read old data without wasting
  space when writing new data. This is only possible when unions are declared within their
  containing struct.

Cap'n Proto's unconventional approach to unions provides these advantages without any real
downside: where you would conventionally define a free-standing union type, in Cap'n Proto you
may simply define a struct type that contains only that union (probably unnamed), and you have
achieved the same effect. Thus, aside from being slightly unintuitive, it is strictly superior.

### Groups

A group is a set of fields that are encapsulated in their own scope.

{% highlight capnp %}
struct Person {
  # ...

  # Note:  This is a terrible way to use groups, and meant
  #   only to demonstrate the syntax.
  address :group {
    houseNumber @8 :UInt32;
    street @9 :Text;
    city @10 :Text;
    country @11 :Text;
  }
}
{% endhighlight %}

Interface-wise, the above group behaves as if you had defined a nested struct called `Address` and
then a field `address :Address`. However, a group is _not_ a separate object from its containing
struct: the fields are numbered in the same space as the containing struct's fields, and are laid
out exactly the same as if they hadn't been grouped at all.
Essentially, a group is just a namespace.

Groups on their own (as in the above example) are useless, almost as much so as the `Void` type.
They become interesting when used together with unions.

{% highlight capnp %}
struct Shape {
  area @0 :Float64;

  union {
    circle :group {
      radius @1 :Float64;
    }
    rectangle :group {
      width @2 :Float64;
      height @3 :Float64;
    }
  }
}
{% endhighlight %}

There are two main reasons to use groups with unions:

1. They are often more self-documenting. Notice that `radius` is now a member of `circle`, so
   we don't need a comment to explain that the value of `circle` is its radius.
2. You can add additional members later on, without breaking compatibility. Notice how we upgraded
   `square` to `rectangle` above, adding a `height` field. This definition is actually
   wire-compatible with the previous version of the `Shape` example from the "union" section
   (aside from the fact that `height` will always be zero when reading old data -- hey, it's not
   a perfect example). In real-world use, it is common to realize after the fact that you need to
   add some information to a struct that only applies when one particular union field is set.
   Without the ability to upgrade to a group, you would have to define the new field separately,
   and have it waste space when not relevant.

Note that a named union is actually exactly equivalent to a named group containing an unnamed
union.

**Wait, weren't groups considered a misfeature in Protobufs?
Why did you do this again?**

They are useful in unions, which Protobufs did not have. Meanwhile, you cannot have a "repeated
group" in Cap'n Proto, which was the case that got into the most trouble with Protobufs.

### Dynamically-typed Fields

A struct may have a field with type `AnyPointer`. This field's value can be of any pointer type --
i.e. any struct, interface, list, or blob. This is essentially like a `void*` in C.

See also [generics](#generic-types).

### Enums

An enum is a type with a small finite set of symbolic values.

{% highlight capnp %}
enum Rfc3092Variable {
  foo @0;
  bar @1;
  baz @2;
  qux @3;
  # ...
}
{% endhighlight %}

Like fields, enumerants must be numbered sequentially starting from zero. In languages where
enums have numeric values, these numbers will be used, but in general Cap'n Proto enums should not
be considered numeric.

### Interfaces

An interface has a collection of methods, each of which takes some parameters and returns some
results. Like struct fields, methods are numbered. Interfaces support inheritance, including
multiple inheritance.

{% highlight capnp %}
interface Node {
  isDirectory @0 () -> (result :Bool);
}

interface Directory extends(Node) {
  list @0 () -> (list :List(Entry));
  struct Entry {
    name @0 :Text;
    node @1 :Node;
  }

  create @1 (name :Text) -> (file :File);
  mkdir @2 (name :Text) -> (directory :Directory);
  open @3 (name :Text) -> (node :Node);
  delete @4 (name :Text);
  link @5 (name :Text, node :Node);
}

interface File extends(Node) {
  size @0 () -> (size :UInt64);
  read @1 (startAt :UInt64 = 0, amount :UInt64 = 0xffffffffffffffff)
       -> (data :Data);
  # Default params = read entire file.

  write @2 (startAt :UInt64, data :Data);
  truncate @3 (size :UInt64);
}
{% endhighlight %}

Notice something interesting here: `Node`, `Directory`, and `File` are interfaces, but several
methods take these types as parameters or return them as results. `Directory.Entry` is a struct,
but it contains a `Node`, which is an interface. Structs (and primitive types) are passed over RPC
by value, but interfaces are passed by reference. So when `Directory.list` is called remotely, the
content of a `List(Entry)` (including the text of each `name`) is transmitted back, but for the
`node` field, only a reference to some remote `Node` object is sent.

When an address of an object is transmitted, the RPC system automatically ensures that
the recipient gets permission to call the addressed object -- because if the recipient wasn't
meant to have access, the sender shouldn't have sent the reference in the first place.
This makes it very easy to develop secure protocols with Cap'n Proto -- you almost don't need to
think about access control at all. This feature is what makes Cap'n Proto a "capability-based"
RPC system -- a reference to an object inherently represents a "capability" to access it.

### Generic Types

A struct or interface type may be parameterized, making it "generic". For example, this is useful
for defining type-safe containers:

{% highlight capnp %}
struct Map(Key, Value) {
  entries @0 :List(Entry);
  struct Entry {
    key @0 :Key;
    value @1 :Value;
  }
}

struct People {
  byName @0 :Map(Text, Person);
  # Maps names to Person instances.
}
{% endhighlight %}

Cap'n Proto generics work very similarly to Java generics or C++ templates. Some notes:

* Only pointer types (structs, lists, blobs, and interfaces) can be used as generic parameters,
  much like in Java. This is a pragmatic limitation: allowing parameters to have non-pointer types
  would mean that different parameterizations of a struct could have completely different layouts,
  which would excessively complicate the Cap'n Proto implementation.

* A type declaration nested inside a generic type may use the type parameters of the outer type,
  as you can see in the example above. This differs from Java, but matches C++. If you want to
  refer to a nested type from outside the outer type, you must specify the parameters on the outer
  type, not the inner. For example, `Map(Text, Person).Entry` is a valid type;
  `Map.Entry(Text, Person)` is NOT valid.
  (Of course, an inner type may declare additional generic parameters.)

* If you refer to a generic type but omit its parameters (e.g. declare a field of type `Map` rather
  than `Map(T, U)`), it is as if you specified `AnyPointer` for each parameter. Note that such
  a type is wire-compatible with any specific parameterization, so long as you interpret the
  `AnyPointer`s as the correct type at runtime.

* Relatedly, it is safe to cast a generic interface of a specific parameterization to a generic
  interface where all parameters are `AnyPointer` and vice versa, as long as the `AnyPointer`s are
  treated as the correct type at runtime. This means that e.g. you can implement a server in a
  generic way that is correct for all parameterizations but call it from clients using a specific
  parameterization.

* The encoding of a generic type is exactly the same as the encoding of a type produced by
  substituting the type parameters manually. For example, `Map(Text, Person)` is encoded exactly
  the same as:

  {% highlight capnp %}
  struct PersonMap {
    # Encoded the same as Map(Text, Person).
    entries @0 :List(Entry);
    struct Entry {
      key @0 :Text;
      value @1 :Person;
    }
  }
  {% endhighlight %}

  Therefore, it is possible to upgrade non-generic types to generic types while retaining
  backwards-compatibility.

* Similarly, a generic interface's protocol is exactly the same as the interface obtained by
  manually substituting the generic parameters.

### Generic Methods

Interface methods may also have "implicit" generic parameters that apply to a particular method
call. This commonly applies to "factory" methods. For example:

{% highlight capnp %}
interface Assignable(T) {
  # A generic interface, with non-generic methods.
  get @0 () -> (value :T);
  set @1 (value :T) -> ();
}

interface AssignableFactory {
  newAssignable @0 [T] (initialValue :T)
      -> (assignable :Assignable(T));
  # A generic method.
}
{% endhighlight %}

Here, the method `newAssignable()` is generic. The return type of the method depends on the input
type.

Ideally, calls to a generic method should not have to explicitly specify the method's type
parameters, because they should be inferred from the types of the method's regular parameters.
However, this may not always be possible; it depends on the programming language and API details.

Note that if a method's generic parameter is used only in its returns, not its parameters, then
this implies that the returned value is appropriate for any parameterization. For example:

{% highlight capnp %}
newUnsetAssignable @1 [T] () -> (assignable :Assignable(T));
# Create a new assignable. `get()` on the returned object will
# throw an exception until `set()` has been called at least once.
{% endhighlight %}

Because of the way this method is designed, the returned `Assignable` is initially valid for any
`T`. Effectively, it doesn't take on a type until the first time `set()` is called, and then `T`
retroactively becomes the type of value passed to `set()`.

In contrast, if it's the case that the returned type is unknown, then you should NOT declare it
as generic. Instead, use `AnyPointer`, or omit a type's parameters (since they default to
`AnyPointer`). For example:

{% highlight capnp %}
getNamedAssignable @2 (name :Text) -> (assignable :Assignable);
# Get the `Assignable` with the given name. It is the
# responsibility of the caller to keep track of the type of each
# named `Assignable` and cast the returned object appropriately.
{% endhighlight %}

Here, we omitted the parameters to `Assignable` in the return type, because the returned object
has a specific type parameterization but it is not locally knowable.

### Constants

You can define constants in Cap'n Proto. These don't affect what is sent on the wire, but they
will be included in the generated code, and can be [evaluated using the `capnp`
tool](capnp-tool.html#evaluating-constants).

{% highlight capnp %}
const pi :Float32 = 3.14159;
const bob :Person = (name = "Bob", email = "bob@example.com");
const secret :Data = 0x"9f98739c2b53835e 6720a00907abd42f";
{% endhighlight %}

Additionally, you may refer to a constant inside another value (e.g. another constant, or a default
value of a field).

{% highlight capnp %}
const foo :Int32 = 123;
const bar :Text = "Hello";
const baz :SomeStruct = (id = .foo, message = .bar);
{% endhighlight %}

Note that when substituting a constant into another value, the constant's name must be qualified
with its scope. E.g. if a constant `qux` is declared nested in a type `Corge`, it would need to
be referenced as `Corge.qux` rather than just `qux`, even when used within the `Corge` scope.
Constants declared at the top-level scope are prefixed just with `.`. This rule helps to make it
clear that the name refers to a user-defined constant, rather than a literal value (like `true` or
`inf`) or an enum value.

### Nesting, Scope, and Aliases

You can nest constant, alias, and type definitions inside structs and interfaces (but not enums).
This has no effect on any definition involved except to define the scope of its name. So in Java
terms, inner classes are always "static". To name a nested type from another scope, separate the
path with `.`s.

{% highlight capnp %}
struct Foo {
  struct Bar {
    #...
  }
  bar @0 :Bar;
}

struct Baz {
  bar @0 :Foo.Bar;
}
{% endhighlight %}

If typing long scopes becomes cumbersome, you can use `using` to declare an alias.

{% highlight capnp %}
struct Qux {
  using Foo.Bar;
  bar @0 :Bar;
}

struct Corge {
  using T = Foo.Bar;
  bar @0 :T;
}
{% endhighlight %}

### Imports

An `import` expression names the scope of some other file:

{% highlight capnp %}
struct Foo {
  # Use type "Baz" defined in bar.capnp.
  baz @0 :import "bar.capnp".Baz;
}
{% endhighlight %}

Of course, typically it's more readable to define an alias:

{% highlight capnp %}
using Bar = import "bar.capnp";

struct Foo {
  # Use type "Baz" defined in bar.capnp.
  baz @0 :Bar.Baz;
}
{% endhighlight %}

Or even:

{% highlight capnp %}
using import "bar.capnp".Baz;

struct Foo {
  baz @0 :Baz;
}
{% endhighlight %}

The above imports specify relative paths. If the path begins with a `/`, it is absolute -- in
this case, the `capnp` tool searches for the file in each of the search path directories specified
with `-I`.

### Annotations

Sometimes you want to attach extra information to parts of your protocol that isn't part of the
Cap'n Proto language. This information might control details of a particular code generator, or
you might even read it at run time to assist in some kind of dynamic message processing.
For example, you might create a field annotation which means "hide from the public", and when you
send a message to an external user, you might invoke some code first that iterates over your
message and removes all of these hidden fields.

You may declare annotations and use them like so:

{% highlight capnp %}
# Declare an annotation 'foo' which applies to struct and enum types.
annotation foo(struct, enum) :Text;

# Apply 'foo' to MyType.
struct MyType $foo("bar") {
  # ...
}
{% endhighlight %}

The possible targets for an annotation are: `file`, `struct`, `field`, `union`, `enum`, `enumerant`,
`interface`, `method`, `parameter`, `annotation`, `const`. You may also specify `*` to cover them
all.

{% highlight capnp %}
# 'baz' can annotate anything!
annotation baz(*) :Int32;

$baz(1);  # Annotate the file.

struct MyStruct $baz(2) {
  myField @0 :Text = "default" $baz(3);
  myUnion :union $baz(4) {
    # ...
  }
}

enum MyEnum $baz(5) {
  myEnumerant @0 $baz(6);
}

interface MyInterface $baz(7) {
  myMethod @0 (myParam :Text $baz(9)) -> () $baz(8);
}

annotation myAnnotation(struct) :Int32 $baz(10);
const myConst :Int32 = 123 $baz(11);
{% endhighlight %}

`Void` annotations can omit the value. Struct-typed annotations are also allowed. Tip: If
you want an annotation to have a default value, declare it as a struct with a single field with
a default value.

{% highlight capnp %}
annotation qux(struct, field) :Void;

struct MyStruct $qux {
  string @0 :Text $qux;
  number @1 :Int32 $qux;
}

annotation corge(file) :MyStruct;

$corge(string = "hello", number = 123);

struct Grault {
  value @0 :Int32 = 123;
}

annotation grault(file) :Grault;

$grault();  # value defaults to 123
$grault(value = 456);
{% endhighlight %}

### Unique IDs

A Cap'n Proto file must have a unique 64-bit ID, and each type and annotation defined therein may
also have an ID. Use `capnp id` to generate a new ID randomly. ID specifications begin with `@`:

{% highlight capnp %}
# file ID
@0xdbb9ad1f14bf0b36;

struct Foo @0x8db435604d0d3723 {
  # ...
}

enum Bar @0xb400f69b5334aab3 {
  # ...
}

interface Baz @0xf7141baba3c12691 {
  # ...
}

annotation qux @0xf8a1bedf44c89f00 (field) :Text;
{% endhighlight %}

If you omit the ID for a type or annotation, one will be assigned automatically. This default
ID is derived by taking the first 8 bytes of the MD5 hash of the parent scope's ID concatenated
with the declaration's name (where the "parent scope" is the file for top-level declarations, or
the outer type for nested declarations). You can see the automatically-generated IDs by "compiling"
your file with the `-ocapnp` flag, which echoes the schema back to the terminal annotated with
extra information, e.g. `capnp compile -ocapnp myschema.capnp`.
In general, you would only specify an explicit ID for a declaration if that declaration has been
renamed or moved and you want the ID to stay the same for backwards-compatibility.

IDs exist to provide a relatively short yet unambiguous way to refer to a type or annotation from
another context. They may be used for representing schemas, for tagging dynamically-typed fields,
etc. Most languages prefer instead to define a symbolic global namespace e.g. full of "packages",
but this would have some important disadvantages in the context of Cap'n Proto:

* Programmers often feel the need to change symbolic names and organization in order to make their
  code cleaner, but the renamed code should still work with existing encoded data.
* It's easy for symbolic names to collide, and these collisions could be hard to detect in a large
  distributed system with many different binaries using different versions of protocols.
* Fully-qualified type names may be large and waste space when transmitted on the wire.

Note that IDs are 64-bit (actually, 63-bit, as the first bit is always 1). Random collisions
are possible, but unlikely -- there would have to be on the order of a billion types before this
becomes a real concern. Collisions from misuse (e.g. copying an example without changing the ID)
are much more likely.

## Evolving Your Protocol

A protocol can be changed in the following ways without breaking backwards-compatibility, and
without changing the [canonical](encoding.html#canonicalization) encoding of a message:

* New types, constants, and aliases can be added anywhere, since they obviously don't affect the
  encoding of any existing type.

* New fields, enumerants, and methods may be added to structs, enums, and interfaces, respectively,
  as long as each new member's number is larger than all previous members. Similarly, new fields
  may be added to existing groups and unions.

* New parameters may be added to a method. The new parameters must be added to the end of the
  parameter list and must have default values.

* Members can be re-arranged in the source code, so long as their numbers stay the same.

* Any symbolic name can be changed, as long as the type ID / ordinal numbers stay the same. Note
  that type declarations have an implicit ID generated based on their name and parent's ID, but
  you can use `capnp compile -ocapnp myschema.capnp` to find out what that number is, and then
  declare it explicitly after your rename.

* Type definitions can be moved to different scopes, as long as the type ID is declared
  explicitly.

* A field can be moved into a group or a union, as long as the group/union and all other fields
  within it are new. In other words, a field can be replaced with a group or union containing an
  equivalent field and some new fields.

* A non-generic type can be made [generic](#generic-types), and new generic parameters may be
  added to an existing generic type. Other types used inside the body of the newly-generic type can
  be replaced with the new generic parameter so long as all existing users of the type are updated
  to bind that generic parameter to the type it replaced. For example:

  {% highlight capnp %}
  struct Map {
    entries @0 :List(Entry);
    struct Entry {
      key @0 :Text;
      value @1 :Text;
    }
  }
  {% endhighlight %}

  Can change to:

  {% highlight capnp %}
  struct Map(Key, Value) {
    entries @0 :List(Entry);
    struct Entry {
      key @0 :Key;
      value @1 :Value;
    }
  }
  {% endhighlight %}

  As long as all existing uses of `Map` are replaced with `Map(Text, Text)` (and any uses of
  `Map.Entry` are replaced with `Map(Text, Text).Entry`).

  (This rule applies analogously to generic methods.)

The following changes are backwards-compatible but may change the canonical encoding of a message.
Apps that rely on canonicalization (such as some cryptographic protocols) should avoid changes in
this list, but most apps can safely use them:

* A field of type `List(T)`, where `T` is a primitive type, blob, or list, may be changed to type
  `List(U)`, where `U` is a struct type whose `@0` field is of type `T`. This rule is useful when
  you realize too late that you need to attach some extra data to each element of your list.
  Without this rule, you would be stuck defining parallel lists, which are ugly and error-prone.
  As a special exception to this rule, `List(Bool)` may **not** be upgraded to a list of structs,
  because implementing this for bit lists has proven unreasonably expensive.

Any change not listed above should be assumed NOT to be safe. In particular:

* You cannot change a field, method, or enumerant's number.
* You cannot change a field or method parameter's type or default value.
* You cannot change a type's ID.
* You cannot change the name of a type that doesn't have an explicit ID, as the implicit ID is
  generated based in part on the type name.
* You cannot move a type to a different scope or file unless it has an explicit ID, as the implicit
  ID is based in part on the scope's ID.
* You cannot move an existing field into or out of an existing union, nor can you form a new union
  containing more than one existing field.
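
The union rules are the easiest to get wrong, so here is a minimal sketch of the difference between
the allowed and the disallowed case. The `Shape` struct and its field names are hypothetical and
chosen only for illustration. Suppose the original definition is:

{% highlight capnp %}
struct Shape {
  area @0 :Float64;
  radius @1 :Float64;
}
{% endhighlight %}

Wrapping `radius` in a brand-new union whose only other members are new fields should be safe,
because the existing field keeps its number and type:

{% highlight capnp %}
struct Shape {
  area @0 :Float64;

  union {
    radius @1 :Float64;      # existing field, number and type unchanged
    sideLength @2 :Float64;  # new field, numbered after all existing members
  }
}
{% endhighlight %}

By contrast, a new union containing both `area @0` and `radius @1` would hold more than one
existing field, which the last rule above forbids.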

Also, these rules only apply to the Cap'n Proto native encoding. It is sometimes useful to
transcode Cap'n Proto types to other formats, like JSON, which may have different rules (e.g.,
field names cannot change in JSON).
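
As a rough sketch of how several of the safe changes combine, consider a hypothetical `Person`
struct that is renamed, gains a field, and has its members re-ordered in the source. All names
below are assumptions made for illustration, and the explicit ID is a placeholder, not a real
derived value. The original definition:

{% highlight capnp %}
struct Person {
  name @0 :Text;
  email @1 :Text;
}
{% endhighlight %}

A revision that should remain compatible with data written using the original:

{% highlight capnp %}
struct Contact @0xb8f3a1c2d4e5f607 {
  # Renamed from `Person`. The ID above is a placeholder; substitute the implicit ID that
  # `capnp compile -ocapnp myschema.capnp` reported for `Person` before the rename.

  fullName @0 :Text;  # renamed from `name`; the number is unchanged
  phone @2 :Text;     # new field, numbered after all existing members
  email @1 :Text;     # moved in the source only; the number is unchanged
}
{% endhighlight %}

Pinning the old implicit ID is what keeps the rename safe: without it, the compiler would derive a
different ID from the new name. The field renames are only safe for the native encoding; a JSON
transcoding keyed on field names would see `fullName` as a different field from `name`.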