
Protocol Buffers, Part 2 — The Untold Parts Of Using “Any”

Tomer Rothschild
Published in codeburst
7 min read · Aug 1, 2018


The nice thing about Protobuf is that it is well documented... at least for the most part ;) One of the topics covered in less detail is how to wrap arbitrary messages within an “envelope” message. This post aims to fill that gap.

This is the second post in my series on Protocol Buffers. If you haven’t read the first one yet I recommend you check it out here.

Motivation: DDD Aggregates

For me, the most common use case for wrapping messages is Aggregates (from Domain Driven Design). In short, aggregates act as boundaries that ensure the integrity of operations on an entity (along with its sub-entities).

For example, most virtual machine hypervisors won’t let you edit a VM’s spec while its state is in transition. One way to keep a VM’s integrity intact is to manage it within an Aggregate, i.e. to funnel all of the commands for a given VM through a single VM Aggregate component. This way there is a single place (per VM) that can verify that integrity conditions are met before performing the operation.

Hence, the aggregate component would need to be able to process different types of messages without knowing in advance “which is which”.

Let’s see an example use case, followed by some code.

Use case: VM service


As mentioned earlier, the use case is that a given service may accept several different types of messages via a single channel. The frame story is the same as in the previous post, so let’s assume you are building a backend for a cloud platform (much like my team does at CloudShare).

So, suppose we want our service to support 3 operations:

  1. Provision a VM.
  2. Edit a VM’s spec.
  3. Stop a VM.

Per the Domain Driven Design approach, each of the above operations would have a command message and an event message. The commands are funneled via a single service — the VmService — in order to deal with conflicting operations.

Back to wrapping Protobuf messages

So how do we go about parsing Protobuf messages without knowing their type in advance?

Attempt no. 1 — Polymorphic Messages:

Referencing a message of a specific type from another message is a built-in part of Protobuf. But can we have polymorphic references via message inheritance?


“ Don’t go looking for facilities similar to class inheritance, though — protocol buffers don’t do that.”

Attempt no. 2 — Composition:

It turns out there’s a mechanism for packing an arbitrary message inside another “envelope” message — using the Any message type.


The full code example can be found at — https://github.com/rotomer/protobuf-blogpost-2. We will focus on the Provision VM flow as an example.

Message Definitions

The ProvisionVmCommand and its enclosing envelope message definitions may look like so:

As we can see, the VmCommandEnvelope has two fields:

  1. An inner message of type Any.
  2. VM id in order to enable routing the envelope to the appropriate VM aggregate.
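
To make the role of these two fields concrete, here is a minimal sketch in plain Java (JDK only, no Protobuf dependency). OpaquePayload and EnvelopeStub are hypothetical stand-ins for the generated Any and VmCommandEnvelope classes, and VmRouter is a name invented for this illustration. The only point is that routing needs the VM id and never has to look inside the inner message.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for Any: a type URL plus the serialized inner message. */
final class OpaquePayload {
    final String typeUrl;
    final byte[] value;

    OpaquePayload(String typeUrl, byte[] value) {
        this.typeUrl = typeUrl;
        this.value = value;
    }
}

/** Hypothetical stand-in for VmCommandEnvelope: the VM id plus the opaque inner message. */
final class EnvelopeStub {
    final String vmId;
    final OpaquePayload inner;

    EnvelopeStub(String vmId, OpaquePayload inner) {
        this.vmId = vmId;
        this.inner = inner;
    }
}

/** Groups envelopes per VM using only the envelope's vmId field. */
final class VmRouter {
    private final Map<String, List<EnvelopeStub>> inboxByVmId = new HashMap<>();

    /** Appends the envelope to the inbox of its VM; the payload is never decoded. */
    void route(EnvelopeStub envelope) {
        inboxByVmId.computeIfAbsent(envelope.vmId, id -> new ArrayList<>()).add(envelope);
    }

    /** Returns the envelopes routed so far for the given VM (empty if none). */
    List<EnvelopeStub> inboxOf(String vmId) {
        return inboxByVmId.getOrDefault(vmId, new ArrayList<>());
    }
}
```

Each inbox here would correspond to one VM aggregate, which is then the only component that has to know the concrete command types.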

Packing Into Any

Packing a message into an Any message is done by: Any.pack(message)

The resulting Any message is actually quite simple. You can see its message definition here. It is composed of just two fields: an arbitrary serialized message as bytes, along with a type URL that acts as a globally unique identifier for resolving that message's type. In our example the type URL would be: type.googleapis.com/rotomer.simplevm.messages.ProvisionVmCommand.

The pack methods provided by protobuf library will by default use
'type.googleapis.com/full.type.name' as the type URL and the unpack
methods only use the fully qualified type name after the last '/'
in the type URL, for example "foo.bar.com/x/y.z" will yield type
name "y.z". (from the Any message definition)
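
The convention in that comment is simple enough to reproduce by hand. Below is a minimal sketch in plain Java, assuming nothing beyond the JDK. The class and method names (TypeUrls, defaultTypeUrl, typeNameOf) are made up for illustration and are not part of the Protobuf library; the library does this for you inside its pack and unpack methods.

```java
/** Hypothetical helper mirroring the type URL convention described in the Any definition. */
final class TypeUrls {
    /** Default prefix used by the library's pack methods, per the quoted comment. */
    static final String DEFAULT_PREFIX = "type.googleapis.com";

    private TypeUrls() {}

    /** Builds "<prefix>/<full.type.name>" using the default prefix. */
    static String defaultTypeUrl(String fullTypeName) {
        return DEFAULT_PREFIX + "/" + fullTypeName;
    }

    /** Returns everything after the last '/', i.e. the fully qualified type name. */
    static String typeNameOf(String typeUrl) {
        // lastIndexOf returns -1 when there is no '/', so the whole string is returned.
        return typeUrl.substring(typeUrl.lastIndexOf('/') + 1);
    }
}
```

For the ProvisionVmCommand type URL shown earlier, typeNameOf yields rotomer.simplevm.messages.ProvisionVmCommand, and that name is the only part the unpack side actually relies on.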

Sharing Message Definitions

Theoretically, the type URL prefix can be used to specify a schema repository and look up the message schema there. However, in practice there is no built-in / open source schema repository for Protobuf. Admittedly, Avro has an edge there with Confluent’s schema registry.

What worked for me in practice is to simply share the generated message classes as a binary package. Or as Udi Dahan puts it:

Encoding For Textual Formats

Once packed, we would like to send the serialized message over the wire. If you are using a textual protocol such as HTTP, you need to encode the message appropriately. In this example we use AWS SQS as the asynchronous message transport, so we have to encode the messages before sending them. Simply use Guava’s / Apache Commons’ Base64 encoders to get the job done.
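
As a rough illustration of the encoding step, here is a sketch that swaps in java.util.Base64 from the JDK (Java 8+) for the Guava / Apache Commons encoders mentioned above; it does the same job without an extra dependency. WireEncoding is a hypothetical name, and the byte array stands for whatever envelope.toByteArray() returns.

```java
import java.util.Base64;

/** Hypothetical helper that makes serialized Protobuf bytes safe for a text-only transport. */
final class WireEncoding {
    private WireEncoding() {}

    /** Encodes the serialized message bytes as a Base64 string (e.g. for a text message body). */
    static String encode(byte[] serializedMessage) {
        return Base64.getEncoder().encodeToString(serializedMessage);
    }

    /** Decodes a Base64 body back into the original serialized message bytes. */
    static byte[] decode(String body) {
        return Base64.getDecoder().decode(body);
    }
}
```

The receiving side calls decode first and only then hands the bytes to the Protobuf parser.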

Another option, which will be covered in the next post, is to serialize the protobuf message to JSON instead of binary.

Full Example: Sending Any Wrapped Messages

Unpacking Using The Type URL — It’s Up To You

As you can see in the class below, there’s no magic involved in unpacking Any messages. You must specify the type of the message you wish to unpack to, i.e. it’s up to you to map from the type URL to the appropriate generated message class.

Let’s have a look at one possible implementation for doing just that in the VmMessageUnpacker class:
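
For readers without the repository at hand, one possible shape for such a mapping is sketched below in plain Java. This is not the VmMessageUnpacker from the repository: TypeUrlRegistry and ProvisionVmCommandStub are hypothetical names, and the registered functions stand in for what would be calls such as any.unpack(SomeMessage.class) or a generated parseFrom(bytes) in real code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-in for a generated message class. */
final class ProvisionVmCommandStub {
    final String payload;

    ProvisionVmCommandStub(String payload) {
        this.payload = payload;
    }
}

/** Hypothetical registry mapping a fully qualified type name to the parser for that type. */
final class TypeUrlRegistry {
    private final Map<String, Function<byte[], Object>> parsersByTypeName = new HashMap<>();

    /** Registers the parser to use for messages whose type URL ends with fullTypeName. */
    void register(String fullTypeName, Function<byte[], Object> parser) {
        parsersByTypeName.put(fullTypeName, parser);
    }

    /** Resolves the parser from the type URL and applies it to the serialized bytes. */
    Object unpack(String typeUrl, byte[] value) {
        // Only the part after the last '/' identifies the type.
        String typeName = typeUrl.substring(typeUrl.lastIndexOf('/') + 1);
        Function<byte[], Object> parser = parsersByTypeName.get(typeName);
        if (parser == null) {
            throw new IllegalArgumentException("No mapping for type URL: " + typeUrl);
        }
        return parser.apply(value);
    }
}
```

Failing loudly on an unknown type URL is a deliberate choice in this sketch: a message nobody registered is more likely a deployment problem than something to skip silently.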

This is pretty much it for the demo part dealing with Any messages. For completeness of the use case, let’s see the VmService and the ProvisionVmOperation classes:

Side note: The VmService in this demo is over-simplified for the sake of focusing on the Protobuf aspects of this post (it is a single instance, and the processing of the messages is performed in-process & synchronously). In real life our services are modelled as aggregates and implemented using Akka actors (The Akka toolkit facilitates concurrency and asynchronous message passing between services). A future blog post will cover that in detail.

The VmService processes an incoming message by decoding it, unpacking the inner Any message and invoking the appropriate handler:

The above example uses Vavr’s pattern matching to dynamically dispatch the appropriate operation based on the command type, but any other implementation will do (map, switch, etc.).
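
As an example of the map alternative, here is a minimal sketch in plain Java. It is not the repository's VmService: CommandDispatcher, ProvisionVmCmd and StopVmCmd are hypothetical names, and the handlers simply return a string so the effect is easy to see.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical stand-ins for two unpacked command messages. */
final class ProvisionVmCmd {
    final String vmId;
    ProvisionVmCmd(String vmId) { this.vmId = vmId; }
}

final class StopVmCmd {
    final String vmId;
    StopVmCmd(String vmId) { this.vmId = vmId; }
}

/** Dispatches a command to the handler registered for its concrete class. */
final class CommandDispatcher {
    private final Map<Class<?>, Function<Object, String>> handlers = new HashMap<>();

    /** Registers a typed handler; the cast is safe because lookup is by exact class. */
    <T> void on(Class<T> type, Function<T, String> handler) {
        handlers.put(type, command -> handler.apply(type.cast(command)));
    }

    /** Looks up the handler by the command's runtime class and invokes it. */
    String dispatch(Object command) {
        Function<Object, String> handler = handlers.get(command.getClass());
        if (handler == null) {
            throw new IllegalStateException("Unhandled command type: " + command.getClass().getName());
        }
        return handler.apply(command);
    }
}
```

Registering a handler per command type keeps the dispatch table in one place, at the cost of losing the compile-time exhaustiveness check a switch over a sealed hierarchy could give.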

The ProvisionVmOperation processes an incoming message by calling the hypervisor service to provision a VM, creating an event with the result, and sending it back via the response channel.


Pitfalls of Any

Last but not least, it’s worth mentioning some of the pitfalls of using Any:

  • As noted in this excellent blog post from the Envoy team — changing the package namespace of a message packed into Any breaks wire compatibility. Note that this case isn’t covered in the official Protobuf guidelines on how to evolve message schema without breaking existing code.
  • JSON formatting causes deep deserialization of Any packed messages. This has a profound effect on the use cases of Any. i.e. it forces the very first component that deserializes the envelope message to have message type mappings for each of the possible types that can be packed into the Any field. For example, if we were to transform the above code into something more production grade, then we could consider adding a “router” component that would send the incoming message into the appropriate VmAggregate (thus lifting the constraint of a singleton serving all VMs). In this case, the router component would have to import each of the possible generated message classes in order to be able to parse just the envelope part. In summary, when using JSON formatting the Any type is no longer opaque. We’ll cover ways to deal with this case in the next post which deals with serializing Protobuf messages to JSON.
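
The first pitfall follows directly from the package being part of the type URL. Here is a small sketch in plain Java, where KnownTypes is a hypothetical name and rotomer.simplevm.v2 is an invented package used only to simulate a rename:

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of a consumer that only knows the type names it was built against. */
final class KnownTypes {
    private final Set<String> fullTypeNames = new HashSet<>();

    /** Records a fully qualified type name the consumer has a generated class for. */
    void add(String fullTypeName) {
        fullTypeNames.add(fullTypeName);
    }

    /** True when the type name after the last '/' of the URL is one the consumer knows. */
    boolean canResolve(String typeUrl) {
        String typeName = typeUrl.substring(typeUrl.lastIndexOf('/') + 1);
        return fullTypeNames.contains(typeName);
    }
}
```

A consumer that registered rotomer.simplevm.messages.ProvisionVmCommand resolves the original type URL, but not one ending in rotomer.simplevm.v2.ProvisionVmCommand, even if the message fields did not change at all.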

Recap

The Any construct provides an effective way to send arbitrary messages in Protobuf. There is zero magic involved, which leaves room for different implementation options, but also very little documentation to help evaluate those options.

I hope this post helped shed some light on how one can go about using Any. Please feel free to comment and share your experience with using Any if you found other techniques to be useful :)

Next post in the series

Be sure to check the next post on Protobuf & JSON formatting.
