Serialization

In computer science, serialization is the conversion of data structures or objects into a binary or human-readable form that is suitable for storage and can later be restored in the same or a different computing environment. The reverse process is called deserialization.

Serialization allows data to be stored on external storage devices or transported over a network.
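The round trip described above can be sketched in a few lines. This is only an illustration, not STSdb code: the Point type, the function names and the byte layout are assumptions made for the example, and Python is used purely for brevity.

```python
import json
import struct
from dataclasses import dataclass, asdict


@dataclass
class Point:
    # Hypothetical user type used only for this illustration.
    x: int
    y: int
    label: str


_HEADER = "<iiH"  # assumed layout: two little-endian int32 values, then a uint16 length


def to_binary(p: Point) -> bytes:
    """Serialize to a compact binary form: fixed header plus UTF-8 label bytes."""
    raw = p.label.encode("utf-8")
    return struct.pack(_HEADER, p.x, p.y, len(raw)) + raw


def from_binary(data: bytes) -> Point:
    """Deserialize: read the header, then slice out the label bytes."""
    x, y, n = struct.unpack_from(_HEADER, data, 0)
    start = struct.calcsize(_HEADER)
    return Point(x, y, data[start:start + n].decode("utf-8"))


def to_text(p: Point) -> str:
    """Serialize to a human-readable form (JSON)."""
    return json.dumps(asdict(p))


def from_text(s: str) -> Point:
    """Deserialize from the JSON form."""
    return Point(**json.loads(s))
```

Either form can be written to a file or sent over a socket. Calling from_binary(to_binary(p)) or from_text(to_text(p)) gives back an object equal to p, which is the property that makes a serialized form usable for storage and transport.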

Existing solutions

During one of the development stages of STSdb, a need arose for a binary serialization tool that could not only serialize complex user types, but also do so quickly and without additional development effort.

Three of the most popular solutions for the task were:

  • BinaryFormatter by Microsoft.
  • Protocol Buffers by Google.
  • MessagePack by Sadayuki Furuhashi.

Microsoft’s BinaryFormatter is a class built directly into the .NET Framework. It looks like the most logical choice, but it proves to be one of the slowest. The reason is simple: for every object being serialized, type reflection is used to obtain its type and members (if it has any). Repeating this reflection for every single object slows serialization down.

Protocol Buffers are Google’s language-neutral, platform-neutral, extensible mechanism for serializing structured data. They provide decent performance, but are not very friendly to use. You define how your data is structured once and then use specially generated source code to write and read that data, but this means that you have to learn the language used for type descriptions and then generate and compile additional code by hand.

MessagePack is an object serialization specification similar to JSON. It provides better performance than Protocol Buffers but, just like Google’s solution, requires additional attributes to be added to the serialized types, which makes it difficult to persist an object whose type comes from an external assembly.