µSer
Tutorial

This page explains all concepts of using µSer.

Data Structures

µSer's main task is to deal with the different kinds of data structures designed by the library's user. The object passed to uSer::serialize or uSer::deserialize is the root of a tree-like data structure. The leaves are always integral or enum types, and the inner nodes are various kinds of containers - similar to e.g. JSON, but the structure is fixed by the application's source code.

All standard integral types are supported by µSer: bool, char, char16_t, char32_t, wchar_t, short, int, long, long long and signed/unsigned variants. All extension types for which std::numeric_limits, std::is_signed_v and std::is_unsigned_v are specialized and for which std::is_integral_v<T> is true work as well. Enumeration types are cast to/from their underlying integral type for (de)serialization.

Integers may be stored inside container types, which in turn can be stored in more containers. µSer supports different kinds of containers: C-Arrays, std::array and all other homogeneous containers supporting iterators as well as structs that have non-static member variables, std::tuple and std::pair, which can be considered heterogeneous containers. µSer never resizes containers - before deserialization, the user has to allocate elements that µSer will write to. Structs have to be annotated to make their members known to µSer; all other containers can be directly used. The contained elements are (de)serialized consecutively.

We have already seen how to serialize an integer:

#include <cstdint>
#include <iostream>
#include <uSer.hh>
int main () {
// Define an array to receive the raw binary data.
std::uint8_t raw [4];
// Define the data to be serialized.
const std::uint32_t x = 0x54534554;
// Perform the actual serialization.
uSer::serialize (raw, x);
// Write the binary data to standard output.
std::cout.write (reinterpret_cast<char*> (raw), 4) << std::endl;
}
Remarks
µSer only supports unsigned integers in the raw data, as bit operations on signed integers would be largely platform-dependent. Unfortunately, C++'s I/O functions usually work on "char", which might be signed. Converting unsigned to signed data is platform-dependent (see paragraph [conv.integral] in the C++ Standard), so converting each element of the raw data individually is not a solution. Using reinterpret_cast to convert the pointer to unsigned data into a pointer to signed data is implementation-defined as well. Since it keeps the bit patterns intact, it can be expected to transfer the bits of the unsigned data directly to/from the file and works on all major platforms. This method will be used in all further examples; the user code has to make sure that the (de)serialized unsigned integers are correctly read/written on the target platform.

Serializing a container essentially works the same way:

#include <cstdint>
#include <iostream>
#include <array>
#include <uSer.hh>
int main () {
// Define an array to receive the raw binary data.
std::uint8_t raw [4];
// Define the data to be serialized.
const std::array<std::uint16_t, 2> x {{ 0x4554, 0x5453 }};
// Perform the actual serialization.
uSer::serialize (raw, x);
// Write the binary data to standard output.
std::cout.write (reinterpret_cast<char*> (raw), 4) << std::endl;
}

This example outputs "TEST" too, but this time two bytes are packed into each 16-bit integer.

Tuples allow us to combine values of different types:

#include <cstdint>
#include <iostream>
#include <tuple>
#include <uSer.hh>
int main () {
// Define an array to receive the raw binary data.
std::uint8_t raw [4];
// Define the data to be serialized.
const std::tuple<std::uint16_t, std::uint8_t, std::uint8_t> x { 0x4554, 0x53, 0x54 };
// Perform the actual serialization.
uSer::serialize (raw, x);
// Write the binary data to standard output.
std::cout.write (reinterpret_cast<char*> (raw), 4) << std::endl;
}

The output still stays the same.

Until now, µSer was able to check at compile time whether the data fits into the raw array, since the sizes of both are fixed. This is not possible when a container of dynamic size, such as std::vector, is used. In that case the container might be too large to fit into the raw data array. µSer then automatically implements a size check to prevent buffer overflow. To react to an error condition appropriately, we need to implement some error handling. The easiest way to do that is to define the macro USER_EXCEPTIONS before including the uSer.hh file to enable exception support, and catch the exceptions:

#include <cstdint>
#include <iostream>
#include <vector>
#define USER_EXCEPTIONS
#include <uSer.hh>
int main () {
// Define an array to receive the raw binary data.
std::uint8_t raw [4];
// Define the data to be serialized.
const std::vector<std::uint16_t> x { 0x4554, 0x5453 };
try {
// Perform the actual serialization.
uSer::serialize (raw, x);
// Write the binary data to standard output.
std::cout.write (reinterpret_cast<char*> (raw), 4) << std::endl;
} catch (const uSer::Exception& ex) {
// In case of error, print it
std::cerr << "uSer Error: " << ex.what () << std::endl;
}
}

Adding another element to the vector will provoke the error. See Error Handling for more information. As serializing structs needs some extra work, it is documented in Defining and Annotating structs.

Raw Data

Before we dive further into the API documentation, we need to think about how data is mapped from the raw binary stream to the C++ data structures. This is merely conceptual; µSer does calculations on whole integers and not individual bits.

As stated previously, µSer (de)serializes data to/from a stream of unsigned integers of equal type. We will call each of these integers a "serialization word", or "SWord" for short. Typically, std::uint8_t or unsigned char will be used here. Depending on the architecture, using a larger type may improve performance, particularly if the size of a SWord corresponds to the register size of the processor. For example, on a 32-bit architecture std::uint32_t may yield the best results. However, since the size of the elements of dynamic data structures must be a multiple of the SWord size, the SWord size cannot be chosen freely; it is constrained by the data structures in use.

The raw binary in/out data is seen as a stream of bits which is made up of the sequence of SWords. The least significant bit of each SWord always comes first, and the most significant one comes last. If you serialize some data into a std::uint8_t stream, then combine pairs of those into std::uint16_t integers in a little endian fashion and deserialize the resulting stream, you should end up with the same data. If some communication channel combines 8-bit integers into 16-bit integers in a big endian fashion, the above rule would be violated - bits 8-15 of the first integer would come first, and then bits 0-7. To deserialize such a data stream, you have to swap the bytes manually or by calling uSer::deserialize a second time (see below for explanation of the API usage):

#include <cstdint>
#include <iostream>
#include <tuple>
#include <uSer.hh>
int main () {
// Define an array to receive the raw binary data.
std::uint8_t raw [4];
// Define the data to be serialized.
std::tuple<std::uint16_t, std::uint8_t, std::uint8_t> x { 0x4554, 0x53, 0x54 };
// Perform the actual serialization.
uSer::serialize (raw, x);
// Simulate a communication channel which combines the 8-Bit-Integers into 16-Bit-Integers
// in a big endian way:
std::uint16_t raw2 [2];
for (std::size_t i = 0; i < sizeof(raw)/2 /* uint8_t always has size 1 */; ++i) {
raw2[i] = static_cast<std::uint16_t> ((std::uint16_t { raw[i*2] } << 8) | raw [i*2+1]);
}
// Turn the swapped 16-Bit-Words back into a sequence of words where the least significant bit comes first
std::uint16_t raw3 [2];
uSer::deserialize<uSer::ByteOrder::BE> (raw2, raw3);
// Actually deserialize the data
uSer::deserialize (raw3, x);
// Write the data to standard output. We need to convert the uint8_t elements into integers,
// because else they will be interpreted as individual characters.
std::cout << std::hex << "0x" << std::get<0> (x) << ", " << "0x" << int { std::get<1> (x) } << ", " << "0x" << int { std::get<2> (x) } << std::endl;
}

The number of bits in one serialization word can be configured by the uSer::RawInfo attribute. This can be used to e.g. serialize data into 7-bit words or other word sizes for which the platform offers no integer type. The serialization word type must be large enough to store those bits.

Of the tree formed by the C++ data structure, only the leaves, i.e. the integers (or enums converted to integers) are mapped to the bit stream. No meta information about the types or container sizes is written or read such that the application can implement most given formats. In essence, all integers and enums in the data structure are mapped to a "flat" sequence of integers of different size whose bits are then mapped to the raw data stream. An integer may be split into bytes whose order can be configured as the byte order (usually little or big endian). Of each byte the least significant bit is serialized first, and the most significant one last. This means that the bytes are always stored "forwards" in the raw data stream and never reversed. If individual integers or all SWords need to be reversed, this has to be done manually.

Attributes

µSer's behaviour can be controlled by attributes. These are a set of types, some of them templates, that are passed to µSer via template parameters. These types are never instantiated, and they provide no member functions for application code. Attributes are valid for one object. If they are used on a container type, they are applied to the contained elements. Structs are an exception: attributes applied to a struct are only used on the members if they are marked as inheritable. The easiest way to specify attributes is by passing them as template parameters to the serialize and deserialize functions:

#include <cstdint>
#include <iostream>
#include <uSer.hh>
int main () {
// Define an array to receive the raw binary data.
std::uint8_t raw [4];
// Define the data to be serialized.
const std::uint_least32_t x = 0x54534554;
// Perform the actual serialization in Big Endian order.
uSer::serialize<uSer::ByteOrder::BE, uSer::Width<32>> (raw, x);
// Write the binary data to standard output.
std::cout.write (reinterpret_cast<char*> (raw), 4) << std::endl;
}

The uSer::ByteOrder::BE attribute configures the byte order to big endian, which reverses the string written to standard output. The uSer::Width attribute sets the size of the integer explicitly, see Integer Sizes. Entire structs and their members can be annotated with attributes as well, see below for details.

The following attributes are available:

µSer Attributes

Category | Name | Description
Byte Order | uSer::ByteOrder::LE | Serializes integers in little endian byte order, i.e. the least significant byte first.
Byte Order | uSer::ByteOrder::BE | Serializes integers in big endian byte order, i.e. the most significant byte first.
Byte Order | uSer::ByteOrder::PDP | Serializes integers in PDP-endian byte order, i.e. serializes 32-bit integers as two 16-bit integers, the most significant one first, and each internally as little endian.
Sign Format | uSer::SignFormat::TwosComplement | Stores signed integers in 2's complement, the standard for most architectures. The top half of the unsigned integer range is mapped onto the negative numbers up to -1, keeping the order.
Sign Format | uSer::SignFormat::SignedMagnitude | Stores signed integers in Signed-Magnitude format, i.e. an absolute value and a sign bit that defines whether the value is positive or negative.
Sign Format | uSer::SignFormat::OnesComplement | Stores signed integers in 1's complement, similar to Signed-Magnitude but the absolute value is bitwise negated for negative values.
Padding | uSer::Padding::None | No additional dummy bits after integers.
Padding | uSer::Padding::Fixed | Stores a given fixed amount of bits after an integer, e.g. to accommodate alignment requirements.
Dynamic Data | uSer::Dyn::Size | Defines the size of a container depending on runtime information.
Dynamic Data | uSer::Dyn::Optional | Defines optional data objects depending on runtime information.
Hooks | uSer::Hook::SerPre | Invokes a user-provided callback function before serializing an object.
Hooks | uSer::Hook::SerPost | Invokes a user-provided callback function after serializing an object.
Hooks | uSer::Hook::DeSerPre | Invokes a user-provided callback function before deserializing an object.
Hooks | uSer::Hook::DeSerPost | Invokes a user-provided callback function after deserializing an object.
Integer Width | uSer::Width | Manually defines the size of an integer in bits.
SWord information | uSer::RawInfo | Explicitly defines the serialization word and optionally its size; useful if an iterator is used for which std::iterator_traits<>::value_type is not defined, e.g. std::back_insert_iterator. Only valid in the argument list of serialize and deserialize.

The attributes belonging to one category are mutually exclusive; at most one of them may be defined on an object. If a sub-object has an attribute defined that conflicts with an inheritable attribute of a surrounding object, the attribute of the sub-object takes precedence. For example, if a struct is annotated with uSer::ByteOrder::LE but a member is annotated with uSer::ByteOrder::BE, the latter is effective for that member and its sub-objects, if any. A complete reference of all attributes is part of the API documentation.

The serialize and deserialize functions

The main entry points of µSer's API are the functions serialize and deserialize which have several overloads to accommodate different use cases. The two functions have almost identical signatures that only differ in const-ness of the first two parameters. Because simply listing all overloads wouldn't be much help, we'll look at the functions on a more abstract level:

void|uSer_ErrorCode serialize (raw, const obj [, size] , std::size_t* sizeUsed = nullptr)
void|uSer_ErrorCode deserialize (const raw, obj [, size] , std::size_t* sizeUsed = nullptr)

The meaning of the parameters is:

  • The "raw" parameter refers to the raw binary data stream. It can be a reference to a C-Array, std::array, a container providing a ".size()" member function, or an iterator containing unsigned integers, the serialization words. For deserialize, the array/container or the target for the iterator may be const. µSer never allocates elements in output containers; the user has to make sure enough elements are available for writing, or use an iterator that appends as needed, such as std::back_insert_iterator.
  • The "obj" parameter is a reference to the C++ data structure to be (de)serialized, i.e. an integer, enum, struct, std::pair, std::tuple or container supporting iterators. For serialize, the object may be const.
  • The "size" parameter specifies the size of the raw data stream, i.e. how many serialization words are available. It may be omitted if "raw" refers to a C-Array (an actual array, not a pointer to the first element), a std::array, or any container providing a ".size()" member function, in which case its size is used. The size may be specified in different ways:

    • An integer of type std::size_t specifying the number of available serialization words.
    • An instance of uSer::DynSize containing a std::size_t; this is equivalent to specifying a naked std::size_t.
    • An instance of uSer::FixedSize, conveniently obtained by writing "uSer::fixedSize<N>", signifying a compile-time known fixed buffer size.
    • An instance of uSer::InfSize, conveniently obtained by writing "uSer::infSize", signifying an infinite buffer size, e.g. a socket or file handle.

    When using uSer::FixedSize for the size parameter or when using an array type for "raw", µSer can check the buffer size at compile time. If it is too small (excluding dynamic data structures), µSer will emit an error via static_assert.

  • The sizeUsed parameter is a pointer to std::size_t. If it is not a null pointer (the default), µSer will write the number of actually read/written serialization words to its target. This is useful when dynamic data structures are used.

We have already seen how to serialize to arrays:

#include <cstdint>
#include <iostream>
#include <uSer.hh>
int main () {
// Define an array to receive the raw binary data.
std::uint8_t raw [4];
// Define the data to be serialized.
const std::uint32_t x = 0x54534554;
// Perform the actual serialization.
uSer::serialize (raw, x);
// Write the binary data to standard output.
std::cout.write (reinterpret_cast<char*> (raw), 4) << std::endl;
}

Serializing into a vector works similarly:

#include <cstdint>
#include <iostream>
#include <vector>
#define USER_EXCEPTIONS
#include <uSer.hh>
int main () {
// Define a vector to receive the raw binary data.
std::vector<std::uint8_t> raw (4);
// Define the data to be serialized.
const std::uint32_t x = 0x54534554;
try {
// Perform the actual serialization. Implicitly uses raw.size() to obtain the raw buffer size.
uSer::serialize (raw, x);
// Write the binary data to standard output.
std::cout.write (reinterpret_cast<char*> (raw.data ()), 4) << std::endl;
} catch (const uSer::Exception& ex) {
// In case of error, print it
std::cerr << "uSer Error: " << ex.what () << std::endl;
}
}

Note that the vector was initialized with 4 elements which are overwritten by serialize. If a number smaller than 4 were passed to std::vector's constructor, an exception would be thrown.

When "uSer::fixedSize<N>" is passed as the "size" parameter and the data structure contains no dynamic data, no runtime buffer range checks are performed, and no error handling is necessary:

#include <cstdint>
#include <iostream>
#include <vector>
#include <uSer.hh>
int main () {
// Define a vector to receive the raw binary data.
std::vector<std::uint8_t> raw (4);
// Define the data to be serialized.
const std::uint32_t x = 0x54534554;
// Perform the actual serialization.
uSer::serialize (raw, x, uSer::fixedSize<4>);
// Write the binary data to standard output.
std::cout.write (reinterpret_cast<char*> (raw.data ()), 4) << std::endl;
}
Note
You have to ensure that the parameter to uSer::fixedSize matches the actually available data. If the buffer is actually smaller, undetected buffer overflows and security holes might occur!

We can pass a std::size_t or a uSer::DynSize as the size parameter to only (de)serialize a part of a container with a runtime-known size:

#include <cstdint>
#include <iostream>
#include <vector>
#define USER_EXCEPTIONS
#include <uSer.hh>
int main () {
// Define a vector with the raw binary data.
const std::vector<std::uint8_t> raw { 0xBE, 0xBA, 0xFE, 0xC0, 0xEF, 0xBE, 0xAD, 0xDE };
try {
// Define a variable to receive the deserialized data
std::uint32_t x;
// Perform the actual deserialization while using only the first half of the buffer.
uSer::deserialize<uSer::RawInfo<uint16_t, 8>> (raw, x, raw.size () / 2);
// uSer::deserialize (raw, x, uSer::DynSize { raw.size () / 2 }); // Same functionality made explicit
// Write the integer to standard output.
std::cout << std::hex << x << std::endl;
} catch (const uSer::Exception& ex) {
// In case of error, print it
std::cerr << "uSer Error: " << ex.what () << std::endl;
}
}

Since the size is not known in advance, error handling is required. When there is no explicit limit to the raw buffer size, we can pass "uSer::infSize" as the size argument. This can e.g. be used in combination with std::back_inserter to automatically grow the container as needed. Since µSer doesn't check the buffer size in this case, error handling is not needed unless other things require error handling (dynamic data structures). Unfortunately, the C++ Standard Library offers no way to determine the serialization word type from a given std::back_insert_iterator type. Therefore, we have to explicitly pass the serialization word to the serialize function via the uSer::RawInfo attribute:

#include <cstdint>
#include <iostream>
#include <vector>
#include <iterator>
#include <uSer.hh>
int main () {
// Define a vector to receive the raw binary data.
std::vector<std::uint8_t> raw;
// Define the data to be serialized.
const std::uint32_t x = 0x54534554;
// Perform the actual serialization; automatically append elements
// to the raw vector without an explicit size limit.
uSer::serialize<uSer::RawInfo<std::uint8_t>> (std::back_inserter (raw), x, uSer::infSize);
// Write the binary data to standard output.
std::cout.write (reinterpret_cast<const char*> (raw.data ()), static_cast<std::streamsize> (raw.size ())) << std::endl;
}

This is also the first example where an iterator instead of a reference to a container is passed to serialize.

Error Handling

The return type of serialize and deserialize depends on the parameters and is used to signal error conditions. If USER_EXCEPTIONS is defined before including the uSer.hh header, exceptions are used to signal errors, and the return type will always be void. If it was not defined, and an error is possible (e.g. when using dynamic data structures or dynamic buffer sizes), the return type will be uSer_ErrorCode to signal success (uSer_EOK) or failure. If no errors can happen, the return type will be void. The uSer_ErrorCode enum is defined as "[[nodiscard]]" - if you ignore the returned value, the compiler will emit a warning. If you use the return value even if no errors are possible, you will get a compiler error. This prevents both forgetting to include error handling and superfluous error handling code.

We have already seen how to use exceptions:

#include <cstdint>
#include <iostream>
#include <vector>
#define USER_EXCEPTIONS
#include <uSer.hh>
int main () {
// Define an array to receive the raw binary data.
std::uint8_t raw [4];
// Define the data to be serialized.
const std::vector<std::uint16_t> x { 0x4554, 0x5453 };
try {
// Perform the actual serialization.
uSer::serialize (raw, x);
// Write the binary data to standard output.
std::cout.write (reinterpret_cast<char*> (raw), 4) << std::endl;
} catch (const uSer::Exception& ex) {
// In case of error, print it
std::cerr << "uSer Error: " << ex.what () << std::endl;
}
}

When using return codes, you can use uSer_getErrorMessage to retrieve a string describing the error:

#include <cstdint>
#include <iostream>
#include <vector>
#include <uSer.hh>
int main () {
// Define an array to receive the raw binary data.
std::uint8_t raw [4];
// Define the data to be serialized.
const std::vector<std::uint16_t> x { 0x4554, 0x5453 };
// Perform the actual serialization.
uSer_ErrorCode ec = uSer::serialize (raw, x);
// Check for success
if (ec == uSer_EOK) {
// Write the binary data to standard output.
std::cout.write (reinterpret_cast<char*> (raw), 4) << std::endl;
} else {
// Print error message
std::cerr << "uSer Error: " << uSer_getErrorMessage(ec) << std::endl;
}
}

If you were to ignore the returned value in this example, the compiler would issue a warning.

Remarks
Exceptions are generally the better way to handle errors, as they allow resources to be freed automatically ("RAII") and errors to be passed through multiple levels of independently developed software components. They can also actually perform better than error codes, because program execution will directly jump to the catch-handlers in case of an error, while return codes have to be checked repeatedly to decide whether execution should continue or cancel. However, on small embedded systems, the runtime library for handling exceptions might simply be too large for the limited program memory, which is why you can disable exception handling in µSer. If you are programming for a PC or an embedded system running a "full" operating system like Linux/Android, using exceptions is probably the best way.

Defining and Annotating structs

In C++, struct and class are actually almost the same thing: Both keywords declare classes (technically, there are no structs in C++) with the only difference that members and base classes of classes declared by "struct" are public by default, and private for those declared by "class". However, it is customary to use "struct" for simple classes that only contain "flat" data without encapsulation mechanisms such as getter/setter functions, similar to how they are used in C. Therefore, we will call these classes "structs". Typically, most serializable data in applications is declared in such structs, e.g. to contain data for network packets or file headers. Therefore, structs play a central role in µSer.

Unlike with containers, tuples and arrays, there is no way to automatically get a list of the members of a struct. This makes it necessary to explicitly make the list of members known to µSer.

Defining and annotating struct members separately

The easiest way to do this is as follows:

#include <cstdint>
#include <iomanip>
#include <iostream>
#include <uSer.hh>
// Define a serializable struct
struct A {
// Begin declaration
USER_STRUCT (A)
// With a strict compiler, pass a dummy attribute instead:
// USER_STRUCT (A, uSer::AttrNone)
// Define arbitrary members
std::uint16_t a, b;
std::uint8_t c;
std::uint32_t d;
// Make members known to µSer
USER_ENUM_MEM (a, b, c, d)
};
int main () {
// Define an array to receive the raw binary data.
std::uint8_t raw [9];
// Define the data to be serialized.
A x { 0x1234, 0x4321, 0xAF, 0xDEADBEEF };
// Perform the actual serialization.
uSer::serialize (raw, x);
// Write the binary data to standard output.
for (std::uint8_t r : raw)
std::cout << std::hex << std::setw(2) << "0x" << int { r } << ", ";
std::cout << std::endl;
}

First, the struct is defined as usual. Then a call to the USER_STRUCT macro is placed at the beginning. It requires one parameter, which is the name of the struct. There may be further parameters that define attributes valid for the whole struct. In this example, we don't provide any attributes. A strict compiler might emit a warning or error message because empty variadic macro parameters are not permitted by the C++ standard; in that case, we can pass uSer::AttrNone as a dummy meaning "no attributes".

Annotating a whole struct

Declaring attributes for a whole struct looks this way:

#include <cstdint>
#include <iomanip>
#include <iostream>
#include <uSer.hh>
// Define a serializable struct
struct A {
// Begin declaration; the attribute applies to the whole struct
USER_STRUCT (A, uSer::ByteOrder::BE)
// Define arbitrary members
std::uint16_t a, b;
std::uint8_t c;
std::uint32_t d;
// Make members known to µSer
USER_ENUM_MEM (a, b, c, d)
};
int main () {
// Define an array to receive the raw binary data.
std::uint8_t raw [9];
// Define the data to be serialized.
A x { 0x1234, 0x4321, 0xAF, 0xDEADBEEF };
// Perform the actual serialization.
uSer::serialize (raw, x);
// Write the binary data to standard output.
for (std::uint8_t r : raw)
std::cout << std::hex << std::setw(2) << "0x" << int { r } << ", ";
std::cout << std::endl;
}

Now, all members of the struct are serialized as big endian.

Annotating a single struct member

A single member can be annotated by using USER_MEM_ANNOT :

#include <cstdint>
#include <iomanip>
#include <iostream>
#include <uSer.hh>
// Define a serializable struct
struct A {
// Begin declaration
USER_STRUCT (A)
// Define arbitrary members
std::uint16_t a, b;
std::uint8_t c;
std::uint32_t d;
// Annotate a member
USER_MEM_ANNOT(d, uSer::ByteOrder::BE)
// Make members known to µSer
USER_ENUM_MEM (a, b, c, d)
};
int main () {
// Define an array to receive the raw binary data.
std::uint8_t raw [9];
// Define the data to be serialized.
A x { 0x1234, 0x4321, 0xAF, 0xDEADBEEF };
// Perform the actual serialization.
uSer::serialize (raw, x);
// Write the binary data to standard output.
for (std::uint8_t r : raw)
std::cout << std::hex << std::setw(2) << "0x" << int { r } << ", ";
std::cout << std::endl;
}

Defining and annotating a single member in one step

The definition and annotation of individual members can be combined by using USER_MEM :

#include <cstdint>
#include <iomanip>
#include <iostream>
#include <uSer.hh>
// Define a serializable struct
struct A {
// Begin declaration
USER_STRUCT (A)
// Define arbitrary members
std::uint16_t a, b;
std::uint8_t c;
// Define and annotate a member in one step
USER_MEM(std::uint32_t, d, uSer::ByteOrder::BE)
// Make members known to µSer
USER_ENUM_MEM (a, b, c, d)
};
int main () {
// Define an array to receive the raw binary data.
std::uint8_t raw [9];
// Define the data to be serialized.
A x { 0x1234, 0x4321, 0xAF, 0xDEADBEEF };
// Perform the actual serialization.
uSer::serialize (raw, x);
// Write the binary data to standard output.
for (std::uint8_t r : raw)
std::cout << std::hex << std::setw(2) << "0x" << int { r } << ", ";
std::cout << std::endl;
}

This method does not support arrays since their syntax requires parts of the type to be placed after the member name.

Defining and annotating all struct members in one step

The most compact method to declare structs is provided by USER_DEF_MEM :

#include <cstdint>
#include <iomanip>
#include <iostream>
#include <uSer.hh>
// Define a serializable struct
struct A {
// Begin declaration
USER_STRUCT (A)
// Define and annotate multiple members
USER_DEF_MEM (
(std::uint16_t,a,uSer::AttrNone),
(std::uint16_t,b),
(std::uint8_t,c),
(std::uint32_t,d,uSer::ByteOrder::BE)
)
};
int main () {
// Define an array to receive the raw binary data.
std::uint8_t raw [9];
// Define the data to be serialized.
A x { 0x1234, 0x4321, 0xAF, 0xDEADBEEF };
// Perform the actual serialization.
uSer::serialize (raw, x);
// Write the binary data to standard output.
for (std::uint8_t r : raw)
std::cout << std::hex << std::setw(2) << "0x" << int { r } << ", ";
std::cout << std::endl;
}

This variant defines annotations for all members in one step. Again, we may have to use uSer::AttrNone to prevent an empty variadic argument list. This variant does not support arrays either, but avoids typing every variable name twice.

Non-Intrusive struct annotation

The previously presented methods require the modification of the actual struct definition. They all add some member types and static member functions to the struct whose names start with "uSer_". If the struct definition cannot be changed (e.g. because it belongs to an external library) or adding those members is not desirable (even though they have no influence on the struct's runtime behaviour), structs can be defined in a non-intrusive fashion. The macros used to achieve this have to be used in the global namespace. If the struct is defined in some namespace, its name has to be fully qualified.

Attributes valid for the whole struct can optionally be defined by USER_EXT_ANNOT . If no attributes are needed, this macro can be omitted. Individual members can be annotated by USER_EXT_MEM_ANNOT . The only required macro is USER_EXT_ENUM_MEM, which defines the list of members.

#include <cstdint>
#include <iomanip>
#include <iostream>
#include <uSer.hh>
namespace N {
// Normal definition of a struct
struct A {
// Define arbitrary members
std::uint16_t a, b;
std::uint8_t c;
std::uint32_t d;
};
}
// These macro calls need to be in the global scope.
// Optional: Define attributes for the whole struct with USER_EXT_ANNOT (not used here).
// Optional: Annotate a struct member
USER_EXT_MEM_ANNOT(N::A, d, uSer::ByteOrder::BE)
// List all struct members
USER_EXT_ENUM_MEM(N::A, a, b, c, d)
int main () {
// Define an array to receive the raw binary data.
std::uint8_t raw [9];
// Define the data to be serialized.
N::A x { 0x1234, 0x4321, 0xAF, 0xDEADBEEF };
// Perform the actual serialization.
uSer::serialize (raw, x);
// Write the binary data to standard output.
for (std::uint8_t r : raw)
std::cout << std::hex << std::setw(2) << "0x" << int { r } << ", ";
std::cout << std::endl;
}

This can also be done in a more compact way by using USER_EXT_DEF_MEM, which defines and annotates all members at once. Attributes for the whole struct can again optionally be defined by USER_EXT_ANNOT.

#include <cstdint>
#include <iomanip>
#include <iostream>
#include <uSer.hh>
namespace N {
// Normal definition of a struct
struct A {
// Define arbitrary members
std::uint16_t a, b;
std::uint8_t c;
std::uint32_t d;
};
}
// These macro calls need to be in the global scope.
// Optional: Define attributes for the whole struct with USER_EXT_ANNOT (not used here).
// List and annotate all struct members
USER_EXT_DEF_MEM(N::A,
(a,uSer::AttrNone),
(b),
(c),
(d,uSer::ByteOrder::BE))
int main () {
// Define an array to receive the raw binary data.
std::uint8_t raw [9];
// Define the data to be serialized.
N::A x { 0x1234, 0x4321, 0xAF, 0xDEADBEEF };
// Perform the actual serialization.
uSer::serialize (raw, x);
// Write the binary data to standard output.
for (std::uint8_t r : raw)
std::cout << std::hex << std::setw(2) << "0x" << int { r } << ", ";
std::cout << std::endl;
}

It is actually possible to annotate any serializable type with USER_EXT_ANNOT. Annotating general types such as "int" or "std::vector<short>" is discouraged, however, since the annotation will be valid for any serialization process and might affect the behaviour of other parts of the program.

Byte Order

An important aspect of serializing integers is the byte order. Integers that have more bits than a byte need to be split into multiple bytes for storage. Most platforms enforce a particular byte order by storing integers in memory in a certain way. If the data is copied from memory and written to a file directly, that byte order is applied to the stored data as well. If that file is copied to a computer with a different byte order, loaded into memory directly, and processed by arithmetic operations, unexpected results may occur.

The most popular byte order is little endian, where the least significant byte is stored first and the most significant one last. This byte order is used e.g. by x86 and most ARM platforms. The reverse case is big endian, where the most significant byte is stored first. This is used by PowerPC and in various internet protocols. PDP endian is a hybrid variant, where 32-bit integers are split into two 16-bit integers, of which the most significant one is stored first. Each of the two 16-bit integers is internally stored as little endian. This order stems from the PDP architecture.

For example, the number 305419896, or hexadecimal 0x12345678, is stored in the following orders (all numbers hexadecimal):

Byte Order examples
Byte Order      Address:  0   1   2   3
Little Endian             78  56  34  12
Big Endian                12  34  56  78
PDP Endian                34  12  78  56

µSer offers functionality to convert the data from the local platform's byte order into a specific defined order and back. The application code defines a fixed order desired for raw binary data, i.e. the network protocol or file format. µSer automatically converts the C++ data in the local order to/from that order. Since this is done via bit-operations, the order of the local platform need not be explicitly known and could actually be an entirely different one.
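The idea of converting via bit operations can be sketched independently of µSer. The following is a minimal illustration, not µSer's implementation; the names Bytes4, byteOf, toLittleEndian, toBigEndian and toPdpEndian are hypothetical. Because only shifts and masks are used, the result does not depend on how the host stores integers in memory.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper type: four raw bytes, index = address offset.
using Bytes4 = std::array<std::uint8_t, 4>;

// Extracts byte number i (0 = least significant) using only shifts and masks.
constexpr std::uint8_t byteOf (std::uint32_t v, unsigned i) {
    return static_cast<std::uint8_t> ((v >> (8 * i)) & 0xFF);
}
// Least significant byte first.
constexpr Bytes4 toLittleEndian (std::uint32_t v) {
    return {{ byteOf (v, 0), byteOf (v, 1), byteOf (v, 2), byteOf (v, 3) }};
}
// Most significant byte first.
constexpr Bytes4 toBigEndian (std::uint32_t v) {
    return {{ byteOf (v, 3), byteOf (v, 2), byteOf (v, 1), byteOf (v, 0) }};
}
// More significant 16-bit half first, each half stored as little endian.
constexpr Bytes4 toPdpEndian (std::uint32_t v) {
    return {{ byteOf (v, 2), byteOf (v, 3), byteOf (v, 0), byteOf (v, 1) }};
}
```

Applied to 0x12345678, these three functions yield the three rows of the table above.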

We have already seen how to use attributes to request a certain byte order:

#include <cstdint>
#include <iomanip>
#include <iostream>
#include <uSer.hh>
// Define a serializable struct
struct A {
// Begin declaration
// Define arbitrary members
std::uint16_t a, b;
std::uint8_t c;
std::uint32_t d;
// Make members known to µSer
USER_ENUM_MEM (a, b, c, d)
};
int main () {
// Define an array to receive the raw binary data.
std::uint8_t raw [9];
// Define the data to be serialized.
A x { 0x1234, 0x4321, 0xAF, 0xDEADBEEF };
// Perform the actual serialization.
uSer::serialize (raw, x);
// Write the binary data to standard output.
for (std::uint8_t r : raw)
std::cout << std::hex << std::setw(2) << "0x" << int { r } << ", ";
std::cout << std::endl;
}

If no byte order is defined, µSer assumes little endian. If we serialize this struct on one platform and then deserialize it on another, µSer guarantees that the data in the integers is correct, regardless of the byte orders of the two platforms.

When the Width attribute is used to define an integer size that is not a multiple of a byte, the "incomplete" byte is assumed to contain the remaining most significant bits. In big endian order, this incomplete byte then comes first. Consider this example:

#include <cstdint>
#include <iomanip>
#include <iostream>
#include <uSer.hh>
int main () {
// Define an array to receive the raw binary data.
std::uint8_t raw [2];
// Define the data to be serialized.
std::uint16_t x = 0x765;
// Serialize as little endian
uSer::serialize<uSer::ByteOrder::LE, uSer::Width<11>> (raw, x);
// Write the binary data to standard output.
for (std::uint8_t r : raw)
std::cout << std::hex << std::setw(2) << "0x" << int { r } << ", ";
std::cout << std::endl;
// Serialize as big endian
uSer::serialize<uSer::ByteOrder::BE, uSer::Width<11>> (raw, x);
// Write the binary data to standard output.
for (std::uint8_t r : raw)
std::cout << std::hex << std::setw(2) << "0x" << int { r } << ", ";
std::cout << std::endl;
}

The output is:

0x65, 0x7,
0x2f, 0x3,
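The second line may look surprising at first. It can be reproduced by hand under one assumption inferred from this output: the bits of each (possibly incomplete) byte are appended to the raw stream starting at the least significant bit of the first raw byte. The function pack11 below is a hypothetical sketch of that packing for an 11-bit value, not µSer code.

```cpp
#include <array>
#include <cstdint>

// Hypothetical sketch: pack an 11-bit value into two raw bytes.
// The value is split into a complete byte (low 8 bits) and an incomplete
// byte (high 3 bits); the byte order decides which of them is emitted first.
constexpr std::array<std::uint8_t, 2> pack11 (std::uint16_t x, bool bigEndian) {
    const std::uint16_t low8  = x & 0xFF;        // complete byte
    const std::uint16_t high3 = (x >> 8) & 0x7;  // incomplete byte (3 bits)
    // Build the 11-bit stream; bit 0 of the stream is bit 0 of raw[0].
    const std::uint16_t stream = bigEndian
        ? (high3 | (low8 << 3))   // 3 bits first, then 8 bits
        : (low8 | (high3 << 8));  // 8 bits first, then 3 bits
    return {{ static_cast<std::uint8_t> (stream & 0xFF),
              static_cast<std::uint8_t> (stream >> 8) }};
}
```

For x = 0x765 this gives 0x65, 0x07 in little endian and 0x2f, 0x03 in big endian: in the latter case the three most significant bits land in the low bits of the first raw byte, followed by the lower five bits of 0x65.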

Sign Formats

As with byte orders, different platforms have different ways to store negative values. µSer supports three of those: two's complement, one's complement and signed-magnitude.

The two's complement format takes the top half of the corresponding unsigned integer's range and translates it into the negative numbers. For example, for a 16-bit integer, the numbers 0-32767 stay as they are. The binary representation of the unsigned number 32768 means -32768 in two's complement, 32769 means -32767, and 65535 means -1. This is the most popular format, used on most platforms; the other two are rare.

The signed-magnitude format stores an absolute value and a sign bit that defines whether the value is positive or negative. This format corresponds to the usual decimal notation, where a sign bit of "1" means "-" and "0" means "+". Inverting the sign bit negates the value. This format has two representations for zero, i.e. +0 and -0.

The one's complement format is similar to signed-magnitude, but the bits of the absolute value are inverted for negative values. This format has two representations for zero as well. Inverting all bits negates the number.

Just like the byte order, the application defines the desired sign format, and µSer converts between the local format and the desired one. This works independently of the host's format. The ranges of the different formats are not equal: for example, 16-bit two's complement numbers have the range -32768 to 32767, while 16-bit integers in the other two formats have the range -32767 to 32767. µSer uses static assertions to make sure the integer in the raw data fits into the local format. For example, if the local platform uses signed-magnitude, and you try to deserialize a 16-bit integer in two's complement into an int16_t, you will get a compiler error, since -32768 cannot be represented on the host platform.

This example serializes a signed integer in all three sign formats:

#include <cstdint>
#include <iomanip>
#include <iostream>
#include <uSer.hh>
int main () {
// Define an array to receive the raw binary data.
std::uint8_t raw [2];
// Define the data to be serialized.
const std::int16_t x = -291;
// Serialize as 2's complement
uSer::serialize<uSer::SignFormat::TwosComplement> (raw, x);
// Write the binary data to standard output.
for (std::uint8_t r : raw)
std::cout << std::hex << std::setw(2) << "0x" << int { r } << ", ";
std::cout << std::endl;
// Serialize as 1's complement
uSer::serialize<uSer::SignFormat::OnesComplement> (raw, x);
// Write the binary data to standard output.
for (std::uint8_t r : raw)
std::cout << std::hex << std::setw(2) << "0x" << int { r } << ", ";
std::cout << std::endl;
// Serialize as signed-magnitude
uSer::serialize<uSer::SignFormat::SignedMagnitude> (raw, x);
// Write the binary data to standard output.
for (std::uint8_t r : raw)
std::cout << std::hex << std::setw(2) << "0x" << int { r } << ", ";
std::cout << std::endl;
}

The output is:

0xdd, 0xfe,
0xdc, 0xfe,
0x23, 0x81,
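These values can be checked by hand. The sketch below computes the 16-bit raw pattern of a value in each of the three formats using unsigned arithmetic only; the function names are hypothetical and the code is an illustration of the formats, not of µSer's internals. It assumes the value lies in the range -32767 to 32767, which all three formats share.

```cpp
#include <cstdint>

// Two's complement: reduce the value modulo 2^16.
constexpr std::uint16_t twosComplement (std::int32_t v) {
    return static_cast<std::uint16_t> (static_cast<std::uint32_t> (v) & 0xFFFF);
}
// One's complement: invert all bits of the absolute value for negative numbers.
constexpr std::uint16_t onesComplement (std::int32_t v) {
    return v < 0
        ? static_cast<std::uint16_t> (~static_cast<std::uint32_t> (-v) & 0xFFFF)
        : static_cast<std::uint16_t> (v);
}
// Signed-magnitude: absolute value plus a sign bit in the top position.
constexpr std::uint16_t signedMagnitude (std::int32_t v) {
    return v < 0
        ? static_cast<std::uint16_t> (0x8000u | static_cast<std::uint32_t> (-v))
        : static_cast<std::uint16_t> (v);
}
```

For -291 (absolute value 0x0123) the results are 0xFEDD, 0xFEDC and 0x8123. Written with the low byte first, as in the little endian output above, these are exactly the three printed lines.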

Integer Sizes

Different platforms support different integer sizes, but a protocol might require integers of a size for which no type exists. µSer allows setting a specific fixed size for an integer by using the Width attribute. The next integer will be serialized right after the previous one; integers need not start at a byte boundary. Essentially, this simulates bitfields in the binary data stream, without the need for actual C++ bitfields, which behave in a non-portable way.

A simple example is to convert RGB565 color values into a single 16bit-Integer:

#include <cstdint>
#include <iomanip>
#include <iostream>
#include <uSer.hh>
// Define a serializable struct for storing colors in RGB565 format.
struct Color {
// Define color components.
std::uint8_t r, g, b;
// Explicitly set the sizes of the integers
USER_MEM_ANNOT(r, uSer::Width<5>)
USER_MEM_ANNOT(g, uSer::Width<6>)
USER_MEM_ANNOT(b, uSer::Width<5>)
// Make members known to µSer
USER_ENUM_MEM (r, g, b)
};
int main () {
// Store the binary data in a 16bit-Integer
std::uint16_t raw [1];
// Define a color to be serialized.
Color c1 { 24, 55, 12 };
// Convert RGB565-Color into 16bit-Value
uSer::serialize (raw, c1);
// Output the raw value
std::cout << std::hex << std::setw(4) << "0x" << raw[0] << std::endl;
}
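For comparison, the same packing can be written by hand with shifts and masks. This sketch assumes that the members are placed in declaration order starting at the least significant bit of the 16-bit word (so "r" occupies bits 0-4, "g" bits 5-10 and "b" bits 11-15); that layout is an assumption of the example, and the function names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical hand-written equivalent: r in bits 0-4, g in bits 5-10, b in bits 11-15.
constexpr std::uint16_t packRgb565 (std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    return static_cast<std::uint16_t> ((r & 0x1F) | ((g & 0x3F) << 5) | ((b & 0x1F) << 11));
}
// Inverse operations: extract a single component again.
constexpr std::uint8_t redOf (std::uint16_t raw)   { return static_cast<std::uint8_t> (raw & 0x1F); }
constexpr std::uint8_t greenOf (std::uint16_t raw) { return static_cast<std::uint8_t> ((raw >> 5) & 0x3F); }
constexpr std::uint8_t blueOf (std::uint16_t raw)  { return static_cast<std::uint8_t> ((raw >> 11) & 0x1F); }
```

Under the stated layout, the color { 24, 55, 12 } packs to 0x66F8. The annotated struct expresses the same layout declaratively, without any shift constants in application code.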

Padding

Some formats require unused bits between the individual values. These are ignored during reading, and typically written as zero. These unused bits are called padding bits. While it is possible to define "dummy" integers (possibly using uSer::Width to achieve a specific number of bits) that receive the padding bits, this wastes memory. Therefore, µSer provides the Padding::Fixed attribute, which can only be applied to integers and specifies a fixed number of padding bits that come after that integer.

Assuming we want to skip the "green" component from the previous color example, we can add 6 padding bits after the "red" one:

#include <cstdint>
#include <iomanip>
#include <iostream>
#include <uSer.hh>
// Define a serializable struct for storing colors in RGB565 format.
struct Color {
// Define color components.
std::uint8_t r, b;
// Explicitly set the sizes of the integers
USER_MEM_ANNOT(r, uSer::Width<5>, uSer::Padding::Fixed<6>)
USER_MEM_ANNOT(b, uSer::Width<5>)
// Make members known to µSer
USER_ENUM_MEM (r, b)
};
int main () {
// Store the binary data in a 16bit-Integer
std::uint16_t raw [1];
// Define a color to be serialized.
Color c1 { 24, 12 };
// Convert RGB565-Color into 16bit-Value
uSer::serialize (raw, c1);
// Output the raw value
std::cout << std::hex << std::setw(4) << "0x" << raw[0] << std::endl;
}

The remaining two values retain their position in the raw data, and the "gap" is filled with zero bits.
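What "ignored during reading, written as zero" means at the bit level can be sketched as follows. As before, the layout (first member in the least significant bits) is an assumption of this example, and RedBlue, packSkippingGreen and unpackSkippingGreen are hypothetical names rather than µSer functions.

```cpp
#include <cstdint>

// Hypothetical two-component color; the 6 bits in between are padding.
struct RedBlue {
    std::uint8_t r, b;
};

// Writing: r in bits 0-4, six zero padding bits in bits 5-10, b in bits 11-15.
constexpr std::uint16_t packSkippingGreen (std::uint8_t r, std::uint8_t b) {
    return static_cast<std::uint16_t> ((r & 0x1F) | ((b & 0x1F) << 11));
}
// Reading: whatever the padding bits contain is simply masked away.
constexpr RedBlue unpackSkippingGreen (std::uint16_t raw) {
    return RedBlue { static_cast<std::uint8_t> (raw & 0x1F),
                     static_cast<std::uint8_t> ((raw >> 11) & 0x1F) };
}
```

With these definitions, { 24, 12 } packs to 0x6018, and reading a word whose padding bits are not zero (for example 0x66F8) still yields r = 24 and b = 12.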

Raw Iterators

µSer accesses the raw data stream via iterators. In C++, an iterator is a small class that "refers" to an element in a container or an input/output stream. Unlike a simple index, an iterator instance knows which container it refers to. A pointer is the simplest kind of iterator, to be used with C-Arrays. The first parameter to the serialize and deserialize functions is an iterator to the first raw data element (or a container/C-Array, of which the iterator to the first element will be queried using std::begin). µSer supports all standard library iterators that refer to unsigned integers, but it is possible to implement your own iterators to e.g. directly read/write the raw serialization words from/to some communication interface, network or file.

We can use std::istream_iterator and std::ostream_iterator to read/write raw data directly from/to files:

#include <cstdint>
#include <iostream>
#include <ios>
#include <fstream>
#include <iterator>
#include <sstream>
#include <stdexcept>
#define USER_EXCEPTIONS
#include <uSer.hh>
int main () {
try {
// Open input file
std::ifstream input ("rawdata.be", std::ios::binary);
if (!input)
throw std::runtime_error ("File could not be opened");
// Determine file size
input.seekg (0, std::ios::end);
std::size_t fSize = static_cast<std::size_t> (input.tellg ());
input.seekg (0, std::ios::beg);
std::uint32_t x;
// Read a 32-Bit-Integer as big endian
uSer::deserialize<uSer::ByteOrder::BE> (std::istream_iterator<std::uint8_t> (input), x, fSize);
// Open output file
std::ofstream output ("rawdata.le", std::ios::binary);
output.exceptions (std::ofstream::failbit);
// Write the integer as little endian
uSer::serialize<uSer::RawInfo<std::uint8_t>, uSer::ByteOrder::LE> (std::ostream_iterator<std::uint8_t> (output), x, uSer::infSize);
} catch (const std::exception& ex) {
// Handle errors
std::cerr << "Error: " << ex.what () << std::endl;
}
}

Note that RawInfo is needed for the output iterator, since µSer can't automatically determine the serialization word type from std::ostream_iterator (its value_type is void). The iterators are instantiated with "std::uint8_t" since µSer needs unsigned integers in the raw stream. The file size is queried ahead of deserialization, as the input iterator doesn't signal end-of-file to µSer.
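One property of the standard library is worth keeping in mind here: std::istream_iterator reads through the formatted "operator >>", which skips whitespace characters by default, so raw bytes such as 0x20 or 0x0A would silently be dropped unless std::noskipws is set on the stream. The following sketch demonstrates the effect on an in-memory stream; readBytes is a hypothetical helper and does not involve µSer.

```cpp
#include <cassert>
#include <cstdint>
#include <ios>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: read all bytes of `data` through std::istream_iterator.
std::vector<std::uint8_t> readBytes (const std::string& data, bool keepWhitespace) {
    std::istringstream in (data);
    // Without noskipws, formatted extraction discards whitespace bytes.
    if (keepWhitespace)
        in >> std::noskipws;
    std::istream_iterator<std::uint8_t> first (in), last;
    std::vector<std::uint8_t> bytes (first, last);
    // Character extraction only fails at the end of the data.
    assert (in.eof ());
    return bytes;
}
```

For the four input bytes 0x20, 0x41, 0x0A, 0x42, the call with keepWhitespace = false returns only two bytes (0x41 and 0x42), while keepWhitespace = true returns all four.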

µSer's requirements for iterators are actually weaker than the standard library's, making it easy to write your own. With "T" being the iterator type, "iter" an instance of "T", and "x" an instance of the serialization word, the basic requirements imposed by µSer are:

  • T needs to be move-constructible (i.e. have a constructor of the form "T::T (T&&)")
  • T needs to be move-assignable (i.e. have an assignment operator of the form "T& operator = (T&&)")
  • T does not have to (but may) be default-constructible, copy-constructible or copy-assignable.
  • T needs the following member type definitions:
    • "T::iterator_category" must be one of the standard iterator tag types; see the seven cases below for details.
    • "T::value_type" may be an alias for the serialization word type. If it is void, you must use RawInfo to make that type known to µSer.
    • "T::difference_type" must be an integral type large enough to store the difference between two positions
    • "T::pointer" and "T::reference" must be a pointer/reference type to the serialization word or "void"; not used by µSer but required for std::iterator_traits<T> to work
  • For deserialize, "x = (*iter)" should return the current serialization word from the raw stream
  • For serialize, "(*iter) = x;" should write the next serialization word to the raw stream
  • "++iter" should advance the iterator to the next element
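Most of these requirements can be verified at compile time with standard type traits. The sketch below defines a hypothetical move-only output iterator (CountingIter, which merely counts the words written to it) and checks the listed properties with static assertions; it is an illustration of the requirements, not a type provided by µSer.

```cpp
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <type_traits>

// Hypothetical move-only output iterator that counts the words written to it.
class CountingIter {
public:
    CountingIter () = default;
    CountingIter (CountingIter&&) = default;
    CountingIter& operator = (CountingIter&&) = default;
    // Member types needed by std::iterator_traits
    using iterator_category = std::output_iterator_tag;
    using value_type = std::uint8_t;
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = void;
    // "(*iter) = x" counts one word; "++iter" does nothing.
    CountingIter& operator * () { return *this; }
    void operator = (std::uint8_t) { ++m_count; }
    CountingIter& operator ++ () { return *this; }
    std::size_t count () const { return m_count; }
private:
    std::size_t m_count = 0;
};

// Move operations are required; copy operations are optional and absent here.
static_assert (std::is_move_constructible_v<CountingIter>, "must be move-constructible");
static_assert (std::is_move_assignable_v<CountingIter>, "must be move-assignable");
static_assert (!std::is_copy_constructible_v<CountingIter>, "this sketch is move-only");
// The serialization word type is visible through std::iterator_traits.
static_assert (std::is_same_v<std::iterator_traits<CountingIter>::value_type, std::uint8_t>, "word type");
```

Declaring the move operations suppresses the implicit copy operations, which is why the type is move-only without any explicitly deleted functions.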

If the raw binary stream contains bytes that consist entirely of padding bits, µSer is able to skip over those efficiently. There are seven cases for different kinds of iterators depending on whether padding bytes exist, each with their own set of guarantees and requirements:

Input Iterators with no padding bytes

In this case, "T::iterator_category" is not used. µSer will call "*iter" and "++iter" in strictly alternating order, e.g.:

x = *iter;
++iter;
x = *iter;
++iter;
x = *iter;
++iter;

Output Iterators with no padding bytes

In this case, "T::iterator_category" is not used. µSer will call "(*iter)=x" and "++iter" in strictly alternating order, e.g.:

(*iter) = x;
++iter;
(*iter) = x;
++iter;
(*iter) = x;
++iter;

This makes it easy to implement both input/output iterators, since the actual read/write can be done via the "operator *" while not doing anything in "operator ++". An example for both is:

#include <cstdint>
#include <iostream>
#include <ios>
#include <fstream>
#include <iterator>
#include <sstream>
#include <stdexcept>
#define USER_EXCEPTIONS
#include <uSer.hh>
class InIter {
public:
InIter (std::istream& os) : m_stream (&os) {}
InIter (InIter&& src) = default;
InIter& operator = (InIter&&) = default;
// Type Definitions for std::iterator_traits
using value_type = std::uint8_t;
using iterator_category = std::input_iterator_tag;
using pointer = std::uint8_t*;
using reference = std::uint8_t&;
using difference_type = std::ptrdiff_t;
// Operator * can be used to perform the actual reading
std::uint8_t operator * () {
std::uint8_t x;
*m_stream >> x;
return x;
}
// The increment operator can be a NOP
InIter& operator ++ () { return *this; }
private:
std::istream* m_stream;
};
class OutIter {
public:
OutIter (std::ostream& os) : m_stream (&os) {}
OutIter (OutIter&& src) = default;
OutIter& operator = (OutIter&&) = default;
// Type Definitions for std::iterator_traits
using value_type = std::uint8_t;
using iterator_category = std::output_iterator_tag;
using pointer = std::uint8_t*;
using reference = std::uint8_t&;
using difference_type = std::ptrdiff_t;
// To allow "(*iter)=x", simply return *this.
OutIter& operator * () {
return *this;
}
// Do the actual writing
void operator = (std::uint8_t x) {
*m_stream << x;
}
// The increment operator can be a NOP
OutIter& operator ++ () {
return *this;
}
private:
std::ostream* m_stream;
};
int main () {
try {
// Open input file
std::ifstream input ("rawdata.be", std::ios::binary);
if (!input)
throw std::runtime_error ("File could not be opened");
// Determine file size
input.seekg (0, std::ios::end);
std::size_t fSize = static_cast<std::size_t> (input.tellg ());
input.seekg (0, std::ios::beg);
std::uint32_t x;
// Read a 32-Bit-Integer as big endian
uSer::deserialize<uSer::ByteOrder::BE> (InIter { input }, x, fSize);
// Open output file
std::ofstream output ("rawdata.le", std::ios::binary);
output.exceptions (std::ofstream::failbit);
// Write the integer as little endian
uSer::serialize<uSer::ByteOrder::LE> (OutIter (output), x, uSer::infSize);
} catch (const std::exception& ex) {
// Handle errors
std::cerr << "Error: " << ex.what () << std::endl;
}
}

Input Iterators with padding bytes and without operator +=

In this case, "T::iterator_category" is not used. Two calls to "*iter" will never occur in direct succession, but multiple calls to "++iter" might directly follow each other to skip over padding bytes:

x = *iter;
++iter;
x = *iter;
++iter;
++iter;
x = *iter;
++iter;

Input Iterators with padding bytes and with operator +=

In this case, "T::iterator_category" is not used. With "n" being an instance of std::iterator_traits<T>::difference_type, "iter += n" should skip over n bytes. Two calls to "*iter" will never occur in direct succession, "++iter" will always follow "*iter", "iter += n" can optionally follow a "++iter", and "*iter" will follow either "++iter" or "iter += n", e.g.:

x = *iter;
++iter;
x = *iter;
++iter;
iter += 2;
x = *iter;
++iter;

Output Iterator with padding bytes and T::iterator_category = std::output_iterator_tag

In this case, µSer will call "(*iter)=x" and "++iter" in strictly alternating order and padding bytes will be written as zero, e.g.:

*iter = x;
++iter;
*iter = 0;
++iter;
*iter = x;
++iter;

Output Iterator with padding bytes, without operator += and T::iterator_category = std::forward_iterator_tag

This section applies for iterators where std::iterator_traits<T>::iterator_category is exactly or convertible to std::forward_iterator_tag. In this case, two calls to "*iter" will never occur in direct succession, but multiple calls to "++iter" might directly follow each other to skip over padding bytes:

*iter = x;
++iter;
*iter = x;
++iter;
++iter;
*iter = x;
++iter;

Output Iterator with padding bytes, with operator += and T::iterator_category = std::forward_iterator_tag

This section applies for iterators where std::iterator_traits<T>::iterator_category is exactly or convertible to std::forward_iterator_tag. With "n" being an instance of std::iterator_traits<T>::difference_type, "iter += n" should skip over n bytes. Two calls to "(*iter)=x" will never occur in direct succession, "++iter" will always follow "(*iter)=x", "iter += n" can optionally follow a "++iter", and "(*iter)=x" will follow either "++iter" or "iter += n", e.g.:

*iter = x;
++iter;
*iter = x;
++iter;
iter += 2;
*iter = x;
++iter;
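An iterator for this last case can be very small. The sketch below wraps a pointer into a hypothetical class BufIter that offers exactly the three operations used, and replays the call sequence shown above on a buffer pre-filled with 0xEE. That the skipped bytes keep their previous contents is a property of this sketch; the replay function stands in for µSer and is not part of the library.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <iterator>

// Hypothetical iterator over a byte buffer with "*iter", "++iter" and "iter += n".
class BufIter {
public:
    using iterator_category = std::forward_iterator_tag;
    using value_type = std::uint8_t;
    using difference_type = std::ptrdiff_t;
    using pointer = std::uint8_t*;
    using reference = std::uint8_t&;
    constexpr explicit BufIter (std::uint8_t* p) : m_p (p) {}
    constexpr std::uint8_t& operator * () const { return *m_p; }
    constexpr BufIter& operator ++ () { ++m_p; return *this; }
    // Skips n elements at once, e.g. a run of padding bytes.
    constexpr BufIter& operator += (difference_type n) { m_p += n; return *this; }
private:
    std::uint8_t* m_p;
};

// Replays the call sequence from the listing above on a 5-byte buffer.
constexpr std::array<std::uint8_t, 5> replay () {
    std::array<std::uint8_t, 5> buf {{ 0xEE, 0xEE, 0xEE, 0xEE, 0xEE }};
    BufIter iter (buf.data ());
    *iter = 0x11; ++iter;
    *iter = 0x22; ++iter;
    iter += 2;              // two padding bytes are skipped, not written
    *iter = 0x33; ++iter;
    return buf;
}
```

After the replay, the buffer contains 0x11, 0x22, 0xEE, 0xEE, 0x33. If the padding bytes must have a defined value, the buffer can be cleared beforehand, or an iterator with std::output_iterator_tag can be used so that zeros are written explicitly.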

Dynamic data structures

Data structures are considered dynamic if their size is not known at compile time, but is determined at runtime based on data available only then. For deserialization, the size of a dynamic data structure might depend on other data objects that have just been deserialized. The serialized size of any dynamic data structure must be equal to or a multiple of the size of a serialization word. This restriction ensures that data that comes after the dynamic data always starts at the same bit in one serialization word, which greatly improves performance. For example, if you want to serialize a std::vector<std::uint16_t>, the serialization word can only be std::uint8_t or std::uint16_t, but not std::uint32_t. Since dynamic data structures always have the potential to overflow the raw buffer, error handling is required unless InfSize is specified as the buffer size.
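The word-size restriction can be expressed as a simple compile-time predicate. Since the number of elements is unknown at compile time, the rule has to hold for a single element already. The function elementFitsWordGrid below is a hypothetical illustration of the rule for unsigned integer elements; µSer performs its own checks internally.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical check: does one element occupy a whole number of serialization words?
template <typename Elem, typename Word>
constexpr bool elementFitsWordGrid () {
    return std::numeric_limits<Elem>::digits % std::numeric_limits<Word>::digits == 0;
}

// The example from the text: 16-bit elements work with 8-bit or 16-bit words...
static_assert (elementFitsWordGrid<std::uint16_t, std::uint8_t> (), "16 bits = 2 words of 8 bits");
static_assert (elementFitsWordGrid<std::uint16_t, std::uint16_t> (), "16 bits = 1 word of 16 bits");
// ...but not with 32-bit words, since one element would fill only half a word.
static_assert (!elementFitsWordGrid<std::uint16_t, std::uint32_t> (), "16 bits is not a multiple of 32");
```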

µSer supports three different kinds of dynamic data:

Containers with dynamic size

When serializing a container that is not std::array or a C-Array, µSer will assume it to be dynamic, and query its size by using its ".size()" member function. µSer will serialize or deserialize exactly as many elements. This means that prior to deserialization, the container needs to be set to its desired size, as µSer will never resize containers.

We have already seen how to serialize std::vector:

#include <cstdint>
#include <iostream>
#include <vector>
#define USER_EXCEPTIONS
#include <uSer.hh>
int main () {
// Define an array to receive the raw binary data.
std::uint8_t raw [4];
// Define the data to be serialized.
const std::vector<std::uint16_t> x { 0x4554, 0x5453 };
try {
// Perform the actual serialization.
uSer::serialize (raw, x);
// Write the binary data to standard output.
std::cout.write (reinterpret_cast<char*> (raw), 4) << std::endl;
} catch (const uSer::Exception& ex) {
// In case of error, print it
std::cerr << "uSer Error: " << ex.what () << std::endl;
}
}

Deserialization then works like this:

#include <cstdint>
#include <iostream>
#include <vector>
#define USER_EXCEPTIONS
#include <uSer.hh>
int main () {
// Define an array with the raw binary data.
const std::uint8_t raw [4] { 0x54, 0x45, 0x53, 0x54 };
// Define a vector to receive the deserialized data and allocate it to the desired size.
std::vector<std::uint16_t> x (2);
try {
// Perform the actual deserialization.
uSer::deserialize (raw, x);
// Write the data to standard output.
std::cout << std::hex << "0x" << x [0] << ", 0x" << x [1] << std::endl;
} catch (const uSer::Exception& ex) {
// In case of error, print it
std::cerr << "uSer Error: " << ex.what () << std::endl;
}
}

Note how the vector is initialized to have 2 elements. Passing a greater integer will cause an exception as the buffer would overflow.
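The principle "the container's size decides how much is read, and the raw buffer must be large enough" can be illustrated without µSer. The function readU16LE below is a hypothetical hand-written equivalent for little endian 16-bit values; it never resizes the vector and reports an overflow via std::length_error, which is a choice of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: fill an already-sized vector from little endian raw bytes.
void readU16LE (const std::uint8_t* raw, std::size_t rawSize, std::vector<std::uint16_t>& out) {
    assert (raw != nullptr);
    // The vector's current size determines how many elements are read.
    if (out.size () * 2 > rawSize)
        throw std::length_error ("raw buffer too small for requested element count");
    for (std::size_t i = 0; i < out.size (); ++i)
        out [i] = static_cast<std::uint16_t> (raw [2 * i] | (raw [2 * i + 1] << 8));
}
```

With the four raw bytes from the example above, a vector of size 2 receives 0x4554 and 0x5453, while a vector of size 3 triggers the error because six bytes would be needed.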

Dynamic size of struct members via Dyn::Size

By annotating a container member of a struct with the Dyn::Size attribute, you can make its size depend on various kinds of runtime data. The sole argument to Dyn::Size is a reference which tells µSer what size the container has. The reference can be:

  1. A non-static member function of the surrounding struct
  2. A free function (=function pointer)
  3. A static member function of any class (=function pointer)
  4. A pointer to a global object that has operator () overloaded
  5. A member variable pointer to a member of the surrounding struct, which must be convertible to std::size_t.

In the first four cases, the return type must be convertible to std::size_t. For cases 2-4, a reference to the struct is passed as an argument.

This example shows four variants:

#include <cstdint>
#include <algorithm>
#include <iostream>
#include <iterator>
#define USER_EXCEPTIONS
#include <uSer.hh>
struct A;
std::size_t g_getSize (const struct A&);
// Functional for determining the dynamic size (class with operator () overloaded)
struct Functional {
std::size_t operator () (const struct A&);
} functional;
// Define a serializable struct
struct A {
// Member function for determining the dynamic size
std::size_t m_getSize () const {
return N2;
}
// Begin declaration
// Size of array1
std::uint16_t N1;
// Define an array
std::uint8_t array1 [20];
// Define the array size dynamically using a member variable
USER_MEM_ANNOT (array1, uSer::Dyn::Size<&A::N1>)
std::uint16_t N2;
// Define an array
std::uint8_t array2 [20];
// Define the array size dynamically using a member function
USER_MEM_ANNOT (array2, uSer::Dyn::Size<&A::m_getSize>)
std::uint16_t N3;
// Define an array
std::uint8_t array3 [20];
// Define the array size dynamically using a free function
USER_MEM_ANNOT (array3, uSer::Dyn::Size<&g_getSize>)
std::uint16_t N4;
// Define an array
std::uint8_t array4 [20];
// Define the array size dynamically using a functional (class with operator () overloaded)
USER_MEM_ANNOT (array4, uSer::Dyn::Size<&functional>)
// Make members known to µSer
USER_ENUM_MEM (N1, array1, N2, array2, N3, array3, N4, array4)
};
// Free function for determining the dynamic size
std::size_t g_getSize (const struct A& a) {
return a.N3;
}
std::size_t Functional::operator () (const struct A& a) {
return a.N4;
}
int main () {
// Define an array with binary data
const std::uint8_t raw [26] = { 0x03, 0x00, 0x01, 0x02, 0x03,
0x04, 0x00, 0x01, 0x02, 0x03, 0x04,
0x05, 0x00, 0x01, 0x02, 0x03, 0x04, 0x05,
0x06, 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06 };
// Define a struct to receive the deserialized data
A x;
try {
// Perform the actual deserialization.
uSer::deserialize (raw, x);
// Write the read data to standard output.
std::copy (x.array1, x.array1 + x.N1, std::ostream_iterator<int> (std::cout, ", ")); std::cout << std::endl;
std::copy (x.array2, x.array2 + x.N2, std::ostream_iterator<int> (std::cout, ", ")); std::cout << std::endl;
std::copy (x.array3, x.array3 + x.N3, std::ostream_iterator<int> (std::cout, ", ")); std::cout << std::endl;
std::copy (x.array4, x.array4 + x.N4, std::ostream_iterator<int> (std::cout, ", ")); std::cout << std::endl;
} catch (const uSer::Exception& ex) {
// In case of error, print it
std::cerr << "uSer Error: " << ex.what () << std::endl;
}
}
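The raw data of this example follows a common pattern: a length field followed by that many elements. A hand-written parser for one such block looks roughly like the sketch below, which shows the bookkeeping that the Dyn::Size annotation takes care of. Block, readBlock and parseSample are hypothetical names; the length is assumed to be a 16-bit little endian value as in the sample data.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical record: 16-bit length followed by up to 20 data bytes.
struct Block {
    std::uint16_t n = 0;
    std::array<std::uint8_t, 20> data {};
};

// Reads one block at raw[pos], advances pos, and returns false on any overflow.
constexpr bool readBlock (const std::uint8_t* raw, std::size_t size, std::size_t& pos, Block& out) {
    if (pos > size || size - pos < 2)
        return false;
    out.n = static_cast<std::uint16_t> (raw [pos] | (raw [pos + 1] << 8));
    pos += 2;
    // The length must fit both the fixed array and the remaining raw data.
    if (out.n > out.data.size () || out.n > size - pos)
        return false;
    for (std::size_t i = 0; i < out.n; ++i)
        out.data [i] = raw [pos + i];
    pos += out.n;
    return true;
}

// Parses the four blocks of the sample data; returns the bytes consumed, or 0 on error.
constexpr std::size_t parseSample () {
    const std::uint8_t raw [26] = { 0x03, 0x00, 0x01, 0x02, 0x03,
                                    0x04, 0x00, 0x01, 0x02, 0x03, 0x04,
                                    0x05, 0x00, 0x01, 0x02, 0x03, 0x04, 0x05,
                                    0x06, 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06 };
    Block blocks [4] {};
    std::size_t pos = 0;
    for (Block& b : blocks)
        if (!readBlock (raw, sizeof (raw), pos, b))
            return 0;
    return pos;
}
```

The four blocks have 3, 4, 5 and 6 data bytes, so together with the four 2-byte length fields all 26 raw bytes are consumed.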

Optional struct members via Dyn::Optional

By annotating a member of a struct with the Dyn::Optional attribute, you can make its presence depend on various kinds of runtime data. The sole argument to Dyn::Optional is a reference which tells µSer whether the object exists. For the reference, the same rules as for Dyn::Size apply, but the return type has to be convertible to bool.

This example shows four variants:

#include <cstdint>
#include <iostream>
#define USER_EXCEPTIONS
#include <uSer.hh>
struct A;
bool g_getSize (const struct A&);
// Functional for determining the existence (class with operator () overloaded)
struct Functional {
bool operator () (const struct A&);
} functional;
// Define a serializable struct
struct A {
// Member function for determining whether an object is present
bool m_getSize () const {
return N2;
}
// Begin declaration
// 1 if v1 exists, 0 else
std::uint8_t N1;
// Define an optional object
std::uint8_t v1;
// Define the presence of an object to be optional using a member variable
USER_MEM_ANNOT (v1, uSer::Dyn::Optional<&A::N1>)
// 1 if v2 exists, 0 else
std::uint8_t N2;
// Define an optional object
std::uint8_t v2;
// Define the presence of an object to be optional using a member function
USER_MEM_ANNOT (v2, uSer::Dyn::Optional<&A::m_getSize>)
// 1 if v3 exists, 0 else
std::uint8_t N3;
// Define an optional object
std::uint8_t v3;
// Define the presence of an object to be optional using a free function
USER_MEM_ANNOT (v3, uSer::Dyn::Optional<&g_getSize>)
// 1 if v4 exists, 0 else
std::uint8_t N4;
// Define an optional object
std::uint8_t v4;
// Define the presence of an object to be optional using a functional (class with operator () overloaded)
USER_MEM_ANNOT (v4, uSer::Dyn::Optional<&functional>)
// Make members known to µSer
USER_ENUM_MEM (N1, v1, N2, v2, N3, v3, N4, v4)
};
// Free function for determining the presence
bool g_getSize (const struct A& a) {
return a.N3 != 0;
}
bool Functional::operator () (const struct A& a) {
return a.N4 != 0;
}
int main () {
// Define an array with binary data
const std::uint8_t raw [6] = { 0x01, 0x01,
0x00,
0x01, 0x2A,
0x00 };
// Define a struct to receive the deserialized data
A x;
try {
// Perform the actual deserialization.
uSer::deserialize (raw, x);
// Write the read data to standard output.
if (x.N1) std::cout << "v1: " << std::hex << int (x.v1) << std::endl;
if (x.N2) std::cout << "v2: " << std::hex << int (x.v2) << std::endl;
if (x.N3) std::cout << "v3: " << std::hex << int (x.v3) << std::endl;
if (x.N4) std::cout << "v4: " << std::hex << int (x.v4) << std::endl;
} catch (const uSer::Exception& ex) {
// In case of error, print it
std::cerr << "uSer Error: " << ex.what () << std::endl;
}
}

The variables N1-N4 are declared as std::uint8_t to avoid odd data in the example.
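The six raw bytes of this example can be decoded by hand to see which of them belong to which member. The sketch below reads a flag byte and, only if it is non-zero, one value byte; OptByte, readOptional and parseSampleFlags are hypothetical names used for illustration only.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical result of reading one optional byte.
struct OptByte {
    bool present = false;
    std::uint8_t value = 0;
};

// Reads a flag byte and, only if it is non-zero, the value byte that follows it.
constexpr bool readOptional (const std::uint8_t* raw, std::size_t size, std::size_t& pos, OptByte& out) {
    if (pos >= size)
        return false;
    out.present = raw [pos++] != 0;
    if (!out.present)
        return true;        // absent: no further byte is consumed
    if (pos >= size)
        return false;
    out.value = raw [pos++];
    return true;
}

// Decodes the sample data of the example above.
constexpr std::array<OptByte, 4> parseSampleFlags () {
    const std::uint8_t raw [6] = { 0x01, 0x01, 0x00, 0x01, 0x2A, 0x00 };
    std::array<OptByte, 4> result {};
    std::size_t pos = 0;
    for (std::size_t i = 0; i < result.size (); ++i)
        if (!readOptional (raw, sizeof (raw), pos, result [i]))
            break;
    return result;
}
```

In the sample data, the first and third values are present (0x01 and 0x2A), while the second and fourth flags are zero, so no value bytes follow them.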

Hooks

Sometimes it is necessary to do some application-specific calculations and checks during (de)serialization. µSer accommodates this by allowing the application to specify hook functions which will be called before or after an object is (de)serialized. This can be achieved by the four attributes Hook::SerPre, Hook::SerPost, Hook::DeSerPre, Hook::DeSerPost. These attributes take a reference to a function which will be called before/after the annotated object is (de)serialized. The argument may reference:

  1. A non-static constant (for serialization) or non-constant (for deserialization) member function of the surrounding struct
  2. A free function (=function pointer)
  3. A static member function of any class (=function pointer)
  4. A pointer to a global object that has operator () overloaded
  5. If the annotated object is a struct, and the annotation is not given as a member annotation of a surrounding struct, the reference may be a non-static constant (for serialization) or non-constant (for deserialization) member function of that struct.

The return type can be:

  1. void, in which case the function may indicate errors via exceptions, but not fail otherwise
  2. uSer_ErrorCode, in which case the function may indicate errors by the returned value or via exceptions (if enabled). If the function indicates an error via a return value and exceptions are enabled, µSer will throw an exception that forwards the error code.

A constant (for serialization) or non-constant (for deserialization) reference to the annotated object is passed as the first argument. For cases 2-4, a constant (for serialization) or non-constant (for deserialization) reference to the struct is passed as a second argument, if the attribute was applied to a struct member.

An example for the different reference types and hook types is:

#include <cstdint>
#include <iostream>
#define USER_EXCEPTIONS
#include <uSer.hh>
struct A;
void serPost (const uint8_t&, const A&);
// Functional to be called before deserialization
struct DeSerPre {
void operator () (uint8_t&, A&);
} deSerPre;
// Define a serializable struct
struct A {
// Member function to be called before serialization
void serPre (const uint8_t& a) const {
std::cout << "SerPre\n";
// Simulate an error condition
if (a != 42)
}
// Member function to be called after deserialization
uSer_ErrorCode deSerPost () {
std::cout << "DeSerPost\n";
// Simulate an error condition
return a == 42 ? uSer_EOK : uSer_EHOOK;
}
// Begin declaration. Define hook to be called after deserialization of the whole struct.
// Define some members
std::uint8_t a, b, c;
// Define hooks to be called before/after (de)serialization of the individual members
USER_MEM_ANNOT(a, uSer::Hook::SerPre<&A::serPre>)
USER_MEM_ANNOT(b, uSer::Hook::SerPost<&serPost>)
USER_MEM_ANNOT(c, uSer::Hook::DeSerPre<&deSerPre>)
// Make members known to µSer
USER_ENUM_MEM (a, b, c)
};
// Hook function to be called after serialization
void serPost (const uint8_t&, const A&) {
std::cout << "SerPost\n";
}
void DeSerPre::operator () (uint8_t&, A&) {
std::cout << "DeSerPre\n";
}
int main () {
// Define an array to receive the raw binary data.
std::uint8_t raw [3];
// Define the data to be serialized.
A x { 42, 2, 3 };
try {
// Serialize & deserialize
uSer::serialize (raw, x);
uSer::deserialize (raw, x);
} catch (const uSer::Exception& ex) {
// In case of error, print it
std::cerr << "uSer Error: " << ex.what () << std::endl;
}
}

Note that during the pre-deserialization hook, the object might not contain any meaningful data.
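The control flow around hooks can be pictured with a small stand-alone sketch. It assumes that an error reported by the pre-hook aborts the operation before anything is written, and it uses a hypothetical enum Err in place of uSer_ErrorCode; writeWithHooks and the two hook functions are likewise illustrative names, not µSer API.

```cpp
#include <cstdint>

// Hypothetical stand-in for an error code type.
enum class Err { Ok, Hook };

// Calls pre (value), then writes the value, then calls post (value).
template <typename Pre, typename Post>
constexpr Err writeWithHooks (std::uint8_t value, std::uint8_t& dest, Pre pre, Post post) {
    const Err e = pre (value);
    if (e != Err::Ok)
        return e;           // assumed behaviour: nothing is written on failure
    dest = value;
    return post (value);
}

// Example hooks: the pre-hook accepts only the value 42.
constexpr Err requireAnswer (std::uint8_t v) { return v == 42 ? Err::Ok : Err::Hook; }
constexpr Err alwaysOk (std::uint8_t) { return Err::Ok; }

// Runs the sketch for one value and reports the resulting error code.
constexpr Err demo (std::uint8_t v) {
    std::uint8_t dest = 0;
    return writeWithHooks (v, dest, requireAnswer, alwaysOk);
}
```

Here demo (42) yields Err::Ok, whereas demo (7) yields Err::Hook because the pre-hook rejects the value.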

Calculating buffer sizes

µSer allows you to determine the size of the raw buffer based on the C++ data structure during compilation. This can be useful to allocate buffers of appropriate size.

  • rawStaticBits<RawIter, T, Attr...> returns the number of fixed bits belonging to the non-dynamic part of T in the raw data stream referred to by the raw iterator type RawIter when serializing with the attributes Attr.
  • RawMaxDynamic<RawIter, T, Attr...> is an alias to one of MaxSize, Unlimited or NoDyn. MaxSize provides the upper limit of dynamic serialization words in its member value, Unlimited indicates that there is no upper limit, and NoDyn indicates that there is no dynamic data.
  • minBufSize<RawIter, T, Attr...> is an integer denoting the minimum number of serialization words needed to (de)serialize the type T.
  • maxBufSize<RawIter, T, Attr...> is an integer denoting the maximum number of serialization words needed to (de)serialize the type T. Using it may cause a compiler error if the dynamic size is unlimited. Note that, depending on the data structure, the maximum size might never actually occur.
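The relationship between a bit count and a number of serialization words can be illustrated without µSer. The following is a minimal sketch, not µSer's implementation: the helper wordsForBits and the constants are hypothetical names chosen for illustration, and it assumes that the raw bits are packed contiguously into words of a fixed width. It rounds the bit count up to whole words and uses the result to size a buffer at compile time.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of µSer): number of words of width
// wordBits needed to hold `bits` bits, rounding up.
constexpr std::size_t wordsForBits (std::size_t bits, std::size_t wordBits) {
	return (bits + wordBits - 1) / wordBits;
}

// Assumed layout for illustration: three 8-bit members, like struct A above.
constexpr std::size_t staticBits = 3 * 8;

// Size a buffer of 8-bit words at compile time.
constexpr std::size_t bufWords8 = wordsForBits (staticBits, 8);
std::array<std::uint8_t, bufWords8> raw8 {};

// The same 24 bits stored in 16-bit words need two words; the second
// word is only half used.
constexpr std::size_t bufWords16 = wordsForBits (staticBits, 16);

static_assert (bufWords8 == 3, "24 bits occupy three 8-bit words");
static_assert (bufWords16 == 2, "24 bits occupy two 16-bit words");
```

Because everything is constexpr, a wrong size shows up as a compiler error rather than as a buffer overflow at run time, which is the main benefit of computing buffer sizes during compilation.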

Using µSer on resource-constrained systems

µSer is designed to work on resource-constrained systems, such as small embedded systems. To this end, µSer never does any dynamic memory allocation. There are also some macros that can be defined before including the uSer.hh header (or via the compiler's command line) to configure µSer:

  • USER_EXCEPTIONS Define this macro to enable exception support. Leave it undefined to disable exceptions, which typically consume a large amount of program memory.
  • USER_NO_PRINT Define this macro to disable support for printing serializable data via the uSer::print function and prevent inclusion of the <ostream> header.
  • USER_NO_SMARTPTR Define this macro to disable support for smart pointers and prevent inclusion of the <memory> header.

When using GCC, compile with "-fdata-sections -ffunction-sections -flto", and link with "-Wl,--gc-sections -flto". These options can reduce the amount of program memory needed. µSer also relies on optimization to be turned on (e.g. -O2 for GCC and Clang) to generate efficient code. µSer employs deeply nested call stacks to statically build algorithms specifically adapted to the user-defined data structures. With optimization enabled, the compiler collapses those into short and efficient algorithms. If some code does not work with optimizations enabled, this is most probably the result of relying on some undefined behaviour, i.e. programming errors. Using µSer instead of e.g. pointer casts to serialize data already avoids some of those problems (specifically padding, data alignment, aliasing rules).

C-Compatibility

µSer can be used in C-based projects if a C++17-compatible compiler is available. First, define the desired data structures in a common header for both C and C++ (here: packet.h):

#include <uSer.hh>
// Define data structures compatible with both C and C++.
// Define a simple struct containing integers
struct PacketA {
// Begin struct definition
USER_STRUCT (PacketA, uSer::AttrNone)
uint32_t a;
// Encode a in Big-Endian in the raw data
USER_MEM_ANNOT(a, uSer::ByteOrder::BE)
uint16_t b;
int8_t c;
// Encode c in signed-magnitude format in the raw data
USER_MEM_ANNOT(c, uSer::SignFormat::SignedMagnitude)
// List members
USER_ENUM_MEM (a, b, c)
};
// Define a struct containing a dynamic data structure. Use the "typedef struct"-Trick to make usage more convenient.
typedef struct PacketB_ {
// Begin struct definition
USER_STRUCT (PacketB_, uSer::AttrNone)
uint8_t N;
struct PacketA packets [8];
// Let N denote the size of the array packets.
USER_MEM_ANNOT (packets, uSer::Dyn::Size<&PacketB_::N>)
USER_ENUM_MEM (N, packets)
} PacketB;
// Declare functions for (de)serialization of the data structure. These will be implemented as C++.
USER_EXTERN_C uSer_ErrorCode serializePacketB (uint8_t* raw, const PacketB* pk, size_t bufferSize);
USER_EXTERN_C uSer_ErrorCode deserializePacketB (const uint8_t* raw, PacketB* pk, size_t bufferSize);

Define the structs by using the µSer-provided macros as explained before. Also declare serialize/deserialize functions as needed, i.e. for the structs we want to explicitly serialize from C code. In this example, we only want to (de)serialize PacketB from C, and have the contained PacketA instances (de)serialized automatically. Therefore, we don't need serialization functions for PacketA. Adjust the signature as needed (with or without error codes in the return type, the desired raw buffer type, with or without a buffer size parameter). In order to call a C++ function from C, it has to be annotated with 'extern "C"', but only when the C++ compiler sees it. The USER_EXTERN_C macro can be used for this, which evaluates to 'extern "C"' when compiling as C++, and to nothing when compiling as C.
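The conditional behaviour described for USER_EXTERN_C follows a common idiom for headers shared between C and C++. The following is a minimal sketch of that idiom, assuming nothing about how µSer actually defines its macro: MY_EXTERN_C and sumBytes are hypothetical names used only for illustration.

```cpp
#include <stdint.h>

// Hypothetical stand-in for USER_EXTERN_C: expands to extern "C" only
// when the translation unit is compiled as C++, and to nothing in C.
#ifdef __cplusplus
#	define MY_EXTERN_C extern "C"
#else
#	define MY_EXTERN_C
#endif

// Declaration as it would appear in a header shared by C and C++.
MY_EXTERN_C int sumBytes (const uint8_t* data, unsigned int count);

// Definition in a C++ source file. The C language linkage suppresses
// C++ name mangling, so the C object files can link against it.
MY_EXTERN_C int sumBytes (const uint8_t* data, unsigned int count) {
	int sum = 0;
	for (unsigned int i = 0; i < count; ++i)
		sum += data [i];
	return sum;
}
```

The signature only uses types that exist in both languages (pointers, fixed-width integers), which is also why the serialization functions above take pointers instead of C++ references.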

Then, create a C++ source file for implementing the serialization functions:

#include "packet.h"
USER_EXTERN_C uSer_ErrorCode serializePacketB (uint8_t* raw, const PacketB* pk, size_t bufferSize) {
// Just call the µSer serialization function
return uSer::serialize (raw, *pk, bufferSize);
}
USER_EXTERN_C uSer_ErrorCode deserializePacketB (const uint8_t* raw, PacketB* pk, size_t bufferSize) {
// Just call the µSer deserialization function
return uSer::deserialize (raw, *pk, bufferSize);
}

We have to include the "packet.h" file, and just call serialize/deserialize in the function body. Remember to switch on the compiler's C++17 support (e.g. -std=c++17 for GCC and Clang). We can then use these functions from C code, e.g.:

#include <stdio.h>
#include <inttypes.h>
#include "packet.h"
int main () {
// Define a packet instance and fill it with some data
PacketB pk;
pk.N = 2;
pk.packets [0].a = 0xDEADBEEF;
pk.packets [0].b = 0xAA55;
pk.packets [0].c = -42;
pk.packets [1].a = 0xC0FEBABE;
pk.packets [1].b = 0x1234;
pk.packets [1].c = 35;
// Define an array to receive the raw data
uint8_t raw [15];
// Serialize the packet into the raw data
uSer_ErrorCode ec = serializePacketB (raw, &pk, sizeof (raw) /* uint8_t always has size 1 */);
// Check for errors
if (ec != uSer_EOK) {
fprintf (stderr, "Serialization error: %s\n", uSer_getErrorMessage (ec));
return 1;
} else {
// Print the raw data
puts ("Serialization result:");
for (size_t i = 0; i < sizeof(raw); ++i) {
printf ("%02x, ", raw [i]);
}
puts ("");
}
// Deserialize the raw data back into the struct
ec = deserializePacketB (raw, &pk, sizeof (raw));
// Check for errors
if (ec != uSer_EOK) {
fprintf (stderr, "Deserialization error: %s\n", uSer_getErrorMessage (ec));
return 1;
} else {
// Print the deserialized data structure
puts ("Deserialization result:");
printf ("pk.N=%" PRIu8 "\n", pk.N);
for (size_t i = 0; i < pk.N; ++i) {
printf ("pk.packets[%zu].a=%" PRIx32 ", .b=%" PRIx16 ", .c=%" PRId8 "\n", i, pk.packets [i].a, pk.packets [i].b, pk.packets [i].c);
}
}
}
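
To make the annotations used in PacketA more concrete, the following sketch reproduces by hand what a big-endian encoding and a signed-magnitude encoding mean at the byte level. It does not call µSer and makes no claim about the complete raw output for PacketB; the helper names encodeBE32 and encodeSignMag8 are hypothetical and exist only for this illustration.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: split a 32-bit value into four bytes, most
// significant byte first (big endian).
constexpr std::array<std::uint8_t, 4> encodeBE32 (std::uint32_t v) {
	return {{
		static_cast<std::uint8_t> (v >> 24),
		static_cast<std::uint8_t> (v >> 16),
		static_cast<std::uint8_t> (v >> 8),
		static_cast<std::uint8_t> (v)
	}};
}

// Hypothetical helper: encode a value in the range -127..127 as 8-bit
// signed magnitude. Bit 7 holds the sign, bits 0..6 the absolute value.
constexpr std::uint8_t encodeSignMag8 (int v) {
	return v < 0 ? static_cast<std::uint8_t> (0x80u | static_cast<unsigned> (-v))
	             : static_cast<std::uint8_t> (v);
}

// The value of pk.packets [0].a from the example above.
constexpr auto beBytes = encodeBE32 (0xDEADBEEFu);
static_assert (beBytes [0] == 0xDE && beBytes [3] == 0xEF, "most significant byte comes first");

// The value of pk.packets [0].c from the example above: sign bit plus magnitude 42.
static_assert (encodeSignMag8 (-42) == 0xAA, "0x80 | 0x2A");
```

Note that signed magnitude differs from the two's complement representation used by most CPUs, where -42 would be stored as 0xD6 in 8 bits.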