annotate src/capnproto-0.6.0/doc/index.md @ 147:45360b968bf4

Cap'n Proto v0.6 + build for OSX
author Chris Cannam <cannam@all-day-breakfast.com>
date Mon, 22 May 2017 10:01:37 +0100
---
layout: page
title: Introduction
---

# Introduction

<img src='images/infinity-times-faster.png' style='width:334px; height:306px; float: right;'>

Cap'n Proto is an insanely fast data interchange format and capability-based RPC system. Think
JSON, except binary. Or think [Protocol Buffers](http://protobuf.googlecode.com), except faster.
In fact, in benchmarks, Cap'n Proto is INFINITY TIMES faster than Protocol Buffers.

This benchmark is, of course, unfair. It only measures the time to encode and decode a message
in memory. Cap'n Proto gets a perfect score because _there is no encoding/decoding step_. The Cap'n
Proto encoding works both as a data interchange format and as an in-memory representation, so
once your structure is built, you can simply write the bytes straight out to disk!

**_But doesn't that mean the encoding is platform-specific?_**

NO! The encoding is defined byte-for-byte independent of any platform. However, it is designed to
be efficiently manipulated on common modern CPUs. Data is arranged like a compiler would arrange a
struct -- with fixed widths, fixed offsets, and proper alignment. Variable-sized elements are
embedded as pointers. Pointers are offset-based rather than absolute so that messages are
position-independent. Integers use little-endian byte order because most CPUs are little-endian,
and even big-endian CPUs usually have instructions for reading little-endian data.
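
To make the ideas of fixed offsets, little-endian integers, and offset-based pointers concrete, here is a minimal Python sketch. The record layout, `HEADER`, `build`, and `read_text` are all made up for illustration; this is NOT the real Cap'n Proto wire format, which has its own pointer and segment encoding.

```python
import struct

# Hypothetical record layout (an assumption for this sketch, not Cap'n Proto's):
#   offset 0:  uint32 id
#   offset 4:  uint16 flags, then 2 bytes of padding for alignment
#   offset 8:  int32 offset of the text, relative to the end of the header
#   offset 12: uint32 text length
# "<" means little-endian with no implicit padding.
HEADER = struct.Struct("<IH2xiI")

def build(record_id: int, flags: int, text: bytes) -> bytes:
    # The text is placed directly after the header, so its relative offset is 0.
    return HEADER.pack(record_id, flags, 0, len(text)) + text

def read_text(buf: bytes, base: int) -> bytes:
    # Every field lives at a fixed offset from `base`; nothing is parsed.
    _, _, rel, length = HEADER.unpack_from(buf, base)
    start = base + HEADER.size + rel
    return bytes(buf[start:start + length])

record = build(7, 1, b"hello")

# Because the pointer is relative, the same bytes stay valid wherever they land.
moved = b"\x00" * 32 + record
text_at_0 = read_text(record, 0)
text_at_32 = read_text(moved, 32)
```

Both reads return the same text even though the record moved by 32 bytes, which is the property an absolute pointer would not have.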

**_Doesn't that make backwards-compatibility hard?_**

Not at all! New fields are always added to the end of a struct (or replace padding space), so
existing field positions are unchanged. The recipient simply needs to do a bounds check when
reading each field. Fields are numbered in the order in which they were added, so Cap'n Proto
always knows how to arrange them for backwards-compatibility.
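
The bounds check mentioned above might look roughly like the following sketch. The helper `read_u32` and the two message layouts are hypothetical, and the struct's size is taken from the length of the byte string; a real implementation would get the size from the message itself.

```python
import struct

def read_u32(data: bytes, offset: int, default: int = 0) -> int:
    """Read a little-endian uint32 at `offset`, or return `default` when the
    sender's struct is too short to contain that field."""
    if offset + 4 > len(data):
        return default  # field was added after the sender's schema version
    return struct.unpack_from("<I", data, offset)[0]

# Version 1 of a hypothetical schema has two fields; version 2 appends a third.
v1_message = struct.pack("<II", 1, 2)
v2_message = struct.pack("<III", 1, 2, 3)

# A v2 reader asks for the third field (offset 8) in both messages.
third_from_v1 = read_u32(v1_message, 8)
third_from_v2 = read_u32(v2_message, 8)

# A v1 reader only ever looks at offsets 0 and 4, which never moved.
second_from_v2 = read_u32(v2_message, 4)
```

New readers see the default for fields an old sender never wrote, and old readers simply never look past the fields they know about.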

**_Won't fixed-width integers, unset optional fields, and padding waste space on the wire?_**

Yes. However, since all these extra bytes are zeros, when bandwidth matters, we can apply an
extremely fast Cap'n-Proto-specific compression scheme to remove them. Cap'n Proto calls this
"packing" the message; it achieves message sizes similar to (or even smaller than) protobuf
encoding, and it's still faster.
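
As a rough illustration of how dropping zero bytes can work, here is a sketch of a packer in the spirit of the scheme described in Cap'n Proto's encoding documentation: each 8-byte word becomes a tag byte whose bits mark the nonzero bytes, followed by only those bytes. This is my reading of that scheme, simplified (it never emits verbatim runs after a full word), so treat `pack` and `unpack` as illustrative helpers rather than the library's implementation.

```python
ZERO_WORD = bytes(8)

def pack(data: bytes) -> bytes:
    """Pack a message whose length is a whole number of 8-byte words."""
    if len(data) % 8:
        raise ValueError("message must be a whole number of 8-byte words")
    out = bytearray()
    i = 0
    while i < len(data):
        word = data[i:i + 8]
        i += 8
        # Tag byte: bit n is set when byte n of the word is nonzero.
        tag = 0
        for bit, byte in enumerate(word):
            if byte:
                tag |= 1 << bit
        out.append(tag)
        out += bytes(b for b in word if b)  # only the nonzero bytes follow
        if tag == 0x00:
            # A zero word is followed by a count of further zero words.
            run = 0
            while run < 255 and data[i:i + 8] == ZERO_WORD:
                run += 1
                i += 8
            out.append(run)
        elif tag == 0xFF:
            # A full word is followed by a count of words copied verbatim.
            # This sketch never uses that option, so the count is always 0.
            out.append(0)
    return bytes(out)

def unpack(packed: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(packed):
        tag = packed[i]
        i += 1
        for bit in range(8):
            if tag & (1 << bit):
                out.append(packed[i])
                i += 1
            else:
                out.append(0)
        if tag == 0x00:
            out += bytes(8 * packed[i])  # expand the run of zero words
            i += 1
        elif tag == 0xFF:
            count = 8 * packed[i]        # verbatim bytes that follow
            i += 1
            out += packed[i:i + count]
            i += count
    return bytes(out)

example = bytes.fromhex("0800000003000200" "19000000aa010000")
packed = pack(example)
```

In this example, two mostly-zero words (16 bytes) shrink to 8 bytes, and `unpack(packed)` restores the original.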

When bandwidth really matters, you should apply general-purpose compression, like
[zlib](http://www.zlib.net/) or [LZ4](https://github.com/Cyan4973/lz4), regardless of your
encoding format.

**_Isn't this all horribly insecure?_**

No no no! To be clear, we're NOT just casting a buffer pointer to a struct pointer and calling it a day.

Cap'n Proto generates classes with accessor methods that you use to traverse the message. These accessors validate pointers before following them. If a pointer is invalid (e.g. out-of-bounds), the library can throw an exception or simply replace the value with a default / empty object (your choice).

Thus, Cap'n Proto checks the structural integrity of the message just like any other serialization protocol would. And, just like any other protocol, it is up to the app to check the validity of the content.
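
To give a feel for what "validate before following" means, here is a small sketch of a bounds-checked accessor. The pointer format (a signed 32-bit relative offset plus an unsigned 32-bit length), the `InvalidPointer` exception, and `read_blob` are all assumptions invented for this example; they do not mirror the generated Cap'n Proto classes.

```python
import struct

class InvalidPointer(ValueError):
    """Raised when a pointer refers to memory outside the message."""

def read_blob(buf: bytes, ptr_offset: int, *, strict: bool = True,
              default: bytes = b"") -> bytes:
    """Follow a hypothetical 8-byte pointer stored at `ptr_offset`.

    In strict mode an invalid pointer raises; otherwise `default` is returned.
    """
    try:
        if ptr_offset < 0 or ptr_offset + 8 > len(buf):
            raise InvalidPointer("pointer itself is out of bounds")
        rel, length = struct.unpack_from("<iI", buf, ptr_offset)
        start = ptr_offset + 8 + rel  # offset is relative to the pointer's end
        if start < 0 or start + length > len(buf):
            raise InvalidPointer("pointer target is out of bounds")
        return bytes(buf[start:start + length])
    except InvalidPointer:
        if strict:
            raise
        return default

good = struct.pack("<iI", 0, 3) + b"abc"
evil = struct.pack("<iI", 1000, 3) + b"abc"  # claims data far past the end

good_value = read_blob(good, 0)
evil_value = read_blob(evil, 0, strict=False)
```

The malicious message never causes an out-of-bounds read: depending on the mode, the caller gets either an exception or an empty value.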

Cap'n Proto was built to be used in [Sandstorm.io](https://sandstorm.io), where security is a major concern. As of this writing, Cap'n Proto has not undergone a security review, so we suggest caution when handling messages from untrusted sources. That said, our response to security issues was once described by security guru Ben Laurie as ["the most awesome response I've ever had."](https://twitter.com/BenLaurie/status/575079375307153409) (Please report all security issues to [security@sandstorm.io](mailto:security@sandstorm.io).)

**_Are there other advantages?_**

Glad you asked!

* **Incremental reads:** It is easy to start processing a Cap'n Proto message before you have
  received all of it since outer objects appear entirely before inner objects (as opposed to most
  encodings, where outer objects encompass inner objects).
* **Random access:** You can read just one field of a message without parsing the whole thing.
* **mmap:** Read a large Cap'n Proto file by memory-mapping it. The OS won't even read in the
  parts that you don't access.
* **Inter-language communication:** Calling C++ code from, say, Java or Python tends to be painful
  or slow. With Cap'n Proto, the two languages can easily operate on the same in-memory data
  structure.
* **Inter-process communication:** Multiple processes running on the same machine can share a
  Cap'n Proto message via shared memory. No need to pipe data through the kernel. Calling another
  process can be just as fast and easy as calling another thread.
* **Arena allocation:** Manipulating Protobuf objects tends to be bogged down by memory
  allocation, unless you are very careful about object reuse. Cap'n Proto objects are always
  allocated in an "arena" or "region" style, which is faster and promotes cache locality.
* **Tiny generated code:** Protobuf generates dedicated parsing and serialization code for every
  message type, and this code tends to be enormous. Cap'n Proto generated code is smaller by an
  order of magnitude or more. In fact, usually it's no more than some inline accessor methods!
* **Tiny runtime library:** Due to the simplicity of the Cap'n Proto format, the runtime library
  can be much smaller.
* **Time-traveling RPC:** Cap'n Proto features an RPC system that implements [time travel](rpc.html)
  such that call results are returned to the client before the request even arrives at the server!

<a href="rpc.html"><img src='images/time-travel.png' style='max-width:639px'></a>
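
The random-access and mmap points in the list above can be sketched with nothing but the Python standard library. The file of fixed-width `(key, value)` records, `RECORD`, `write_records`, and `read_value` are hypothetical stand-ins for a real Cap'n Proto file; the point is only that one field can be read from a large file without parsing the rest.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-width record: two little-endian uint64 values (key, value).
RECORD = struct.Struct("<QQ")

def write_records(path: str, count: int) -> None:
    with open(path, "wb") as f:
        for i in range(count):
            f.write(RECORD.pack(i, i * i))

def read_value(path: str, index: int) -> int:
    """Read the value of one record by mapping the file, not loading it."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            # Fixed offsets mean the record's position is simple arithmetic.
            _, value = RECORD.unpack_from(view, index * RECORD.size)
            return value

with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "records.bin")
    write_records(path, 100_000)
    result = read_value(path, 54_321)
```

Only the page containing record 54,321 needs to be touched; the operating system is free to leave the rest of the file on disk.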

**_Why do you pick on Protocol Buffers so much?_**

Because it's easy to pick on myself. :) I, Kenton Varda, was the primary author of Protocol Buffers
version 2, which is the version that Google released open source. Cap'n Proto is the result of
years of experience working on Protobufs, listening to user feedback, and thinking about how
things could be done better.

Note that I no longer work for Google. Cap'n Proto is not, and never has been, affiliated with Google; in fact, it is a property of [Sandstorm.io](https://sandstorm.io), of which I am co-founder.

**_OK, how do I get started?_**

To install Cap'n Proto, head over to the [installation page](install.html). If you'd like to help
hack on Cap'n Proto, such as by writing bindings in other languages, let us know on the
[discussion group](https://groups.google.com/group/capnproto). If you'd like to receive e-mail
updates about future releases, add yourself to the
[announcement list](https://groups.google.com/group/capnproto-announce).

{% include buttons.html %}