Odysseia - What's next?


In the last posts we went over what has changed so far compared to stock ARK Core, but what's next for Odysseia? Development will continue with a variety of changes in the pipeline that will improve performance, but there is no timeline for their completion because each of them still needs research, development and testing.

  • Implement CBOR for P2P Communication
  • Replace joi with ajv or fastest-validator
  • Replace JSON configuration files with single TOML file
  • Improve serialisation and deserialisation performance
  • Standardise port conventions

These are just some of the upcoming changes, but they are the most noteworthy as of now. For the time being most changes will be about standardising tooling and internals, followed by performance, before the focus eventually shifts to security.

CBOR for P2P Communication

MessagePack is an efficient binary serialisation format. Like JSON, it lets you exchange data between multiple languages, but it's faster and smaller. The same applies to CBOR, which is likewise schemaless but is also standardised as an IETF RFC (RFC 8949) and outperforms MessagePack, especially as payloads get bigger.

Protocol Buffers by Google are another alternative, but unlike MessagePack and CBOR they require you to explicitly specify the structure of your data. Generally this is what you would want to do to reduce the amount of data that is sent as much as possible, but Odysseia is built with plugins as first-class citizens. This means that as many parts of Odysseia as possible need to be open for extension, including the P2P communication layer.

For now Odysseia will make use of cbor-x, a high-performance implementation of CBOR for JavaScript. This will keep the P2P communication layer open for extension by plugins while still producing much smaller payloads than raw JSON. This is a good compromise between performance and extensibility, but if traffic ever becomes a real issue, Protocol Buffers would be a viable choice.
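To give a feel for where the size savings come from, here is a minimal sketch of a CBOR encoder for a small subset of RFC 8949 (unsigned integers below 2^32, ASCII text, arrays and maps). It is hand-rolled for illustration only and is not the cbor-x API; the function names `encodeHead` and `encodeCbor` are made up for this example, and the sample message is the `{"a": 1, "b": [2, 3]}` example from the RFC's appendix rather than a real Odysseia payload.

```typescript
// Subset of values this sketch can encode.
type CborValue = number | string | CborValue[] | { [key: string]: CborValue };

// Encode the initial byte(s): major type (0-7) plus its length/value argument.
function encodeHead(majorType: number, arg: number): number[] {
  const mt = majorType << 5;
  if (arg < 24) return [mt | arg];
  if (arg < 0x100) return [mt | 24, arg];
  if (arg < 0x10000) return [mt | 25, arg >> 8, arg & 0xff];
  return [mt | 26, (arg >>> 24) & 0xff, (arg >>> 16) & 0xff, (arg >>> 8) & 0xff, arg & 0xff];
}

function encodeCbor(value: CborValue): number[] {
  if (typeof value === "number") {
    if (Math.floor(value) !== value || value < 0 || value > 0xffffffff) {
      throw new RangeError("this sketch only supports integers in [0, 2^32)");
    }
    return encodeHead(0, value); // major type 0: unsigned integer
  }
  if (typeof value === "string") {
    const bytes: number[] = [];
    for (let i = 0; i < value.length; i++) {
      const code = value.charCodeAt(i);
      if (code > 0x7f) throw new RangeError("this sketch only supports ASCII text");
      bytes.push(code);
    }
    return encodeHead(3, bytes.length).concat(bytes); // major type 3: text string
  }
  if (Array.isArray(value)) {
    let out = encodeHead(4, value.length); // major type 4: array
    for (const item of value) out = out.concat(encodeCbor(item));
    return out;
  }
  const keys = Object.keys(value);
  let out = encodeHead(5, keys.length); // major type 5: map
  for (const key of keys) out = out.concat(encodeCbor(key), encodeCbor(value[key]));
  return out;
}

// Example from RFC 8949, Appendix A: encodes to the 9 bytes a2 61 61 01 61 62 82 02 03.
const message: CborValue = { a: 1, b: [2, 3] };
const cborBytes = encodeCbor(message);
const jsonText = JSON.stringify(message); // {"a":1,"b":[2,3]} is 17 characters
console.log(`CBOR: ${cborBytes.length} bytes, JSON: ${jsonText.length} bytes`);
```

No schema is declared anywhere: every value carries its own type tag, which is what allows plugins to add new message shapes without touching a shared definition file.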

joi vs. Ajv vs. fastest-validator

ARK Core uses joi for internal validations and Ajv for anything that is performance critical, like validating blocks and transactions before performing any further processing. Both are among the most popular validation libraries in the JavaScript ecosystem and each comes with its own pros and cons depending on your use case. fastest-validator is another contender that we'll look at as a potential standard validator because its schemas and usage are less verbose than Ajv's while it also achieves better performance in a lot of cases.

While joi offers the best developer experience out of the box, it sacrifices a lot of performance to get there, which can become an issue if your application runs validation frequently or on large payloads.

Working with Ajv is more verbose because there's less syntactic sugar. You'll need to write longer schemas (at least compared to joi) and manually compile them before using them. The manual compilation shouldn't be an issue or annoyance because you can do it once when booting your application and store the compiled validators in a global object or a container, or abuse the Node.js require cache to keep them cached for you.
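The compile-once approach can be sketched as follows. This does not use Ajv itself: `compile` is a hypothetical stand-in for an expensive compile step, and the tiny schema format, the `registerSchema` and `validate` helpers and the `transfer` schema are all made up for illustration.

```typescript
// A validator returns a list of error messages (empty when the data is valid).
type Validator = (data: Record<string, unknown>) => string[];

// Hypothetical, deliberately tiny schema format: property name -> expected typeof.
type Schema = Record<string, "string" | "number" | "boolean">;

// Counts how often the (pretend) expensive compile step runs.
let compileCount = 0;

// Stand-in for a real schema compiler: turns a schema into a reusable function.
function compile(schema: Schema): Validator {
  compileCount++;
  const keys = Object.keys(schema);
  return (data) => {
    const errors: string[] = [];
    for (const key of keys) {
      if (typeof data[key] !== schema[key]) {
        errors.push(`${key} must be a ${schema[key]}`);
      }
    }
    return errors;
  };
}

// Compiled validators, keyed by schema name and filled once at boot.
const validators: Record<string, Validator> = {};

function registerSchema(name: string, schema: Schema): void {
  validators[name] = compile(schema);
}

function validate(name: string, data: Record<string, unknown>): string[] {
  const validator = validators[name];
  if (!validator) throw new Error(`unknown schema: ${name}`);
  return validator(data);
}

// Boot: compile every schema exactly once.
registerSchema("transfer", { recipientId: "string", amount: "number" });

// Hot path: only the cached validator function runs, no recompilation.
const okErrors = validate("transfer", { recipientId: "abc", amount: 1 });
const badErrors = validate("transfer", { recipientId: "abc", amount: "1" });
console.log(okErrors, badErrors, compileCount);
```

A plain object is enough as a cache here; a dependency injection container would serve the same purpose in a larger application.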

fastest-validator is a direct competitor to both joi and Ajv because it provides top-notch performance while sitting between joi and Ajv in terms of verbosity. Writing schemas for it feels more like joi, but without all of the syntactic sugar that costs joi its performance. From initial testing, its performance seems to be at least slightly above Ajv and in a lot of cases far better.

JSON vs. TOML

JSON was designed as a data interchange format, but the JavaScript ecosystem adopted it as a general-purpose format for API responses, for storing any kind of data on disk and for configuration. Articles like Why JSON isn’t a Good Configuration Language do a good job of explaining why JSON isn't great for configuration and why formats like YAML or TOML are better suited for configuration files, especially if they are being modified by humans.

TOML aims to be a minimal configuration file format that's designed for humans and easy to read due to obvious semantics. The first time you use it you'll realise that it has a lot in common with INI files, but it has a proper specification that defines how it has to be parsed. It also supports comments, which makes it very easy to explain complex configuration properties right inside an example file rather than referring a user to external documentation for further details.
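As a rough idea of what a single configuration file could look like, here is a small TOML fragment. The table and key names are hypothetical and only meant to show sections and inline comments; the port numbers are the ones from the port conventions later in this post.

```toml
# Hypothetical Odysseia configuration; table and key names are illustrative only.

[p2p]
# Port for general peer-to-peer traffic.
port = 26500

[p2p.internal]
port = 26501

[p2p.consensus]
port = 26502

[api]
# Comments can document a setting right where it is configured.
enabled = true
port = 26503
```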

(De)Serialisation Performance

(De)Serialisation in ARK Core (according to AIP11) is fairly efficient but still wastes space in the header of a transaction, which is shared across all types. It is also all or nothing: if you want the value of a specific property you need to deserialise the whole transaction instead of being able to skip N bytes and read just that property from the header.

Resolving this issue is fairly simple because it doesn't require any changes to the specification, just the implementation. Methods like transaction.deserialise("senderPublicKey") will be introduced to quickly access the value of a specific property without having to waste valuable time on deserialising a full transaction.
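A minimal sketch of the idea, assuming a header in which every field sits at a fixed byte offset. The offsets in `HEADER_LAYOUT` and the `readHeaderField` helper are assumptions made up for this example; they are not taken from AIP11 or from the actual Odysseia implementation.

```typescript
// ASSUMED header layout for illustration only: field -> [byte offset, byte length].
const HEADER_LAYOUT = {
  version: [1, 1],
  network: [2, 1],
  typeGroup: [3, 4],
  type: [7, 2],
  nonce: [9, 8],
  senderPublicKey: [17, 33],
  fee: [50, 8],
} as const;

type HeaderField = keyof typeof HEADER_LAYOUT;

// Return the raw bytes of one header field without touching the rest of the transaction.
function readHeaderField(serialised: Uint8Array, field: HeaderField): Uint8Array {
  const [offset, length] = HEADER_LAYOUT[field];
  if (serialised.length < offset + length) {
    // Fail loudly instead of returning truncated or malformed data.
    throw new RangeError(`buffer too short to contain ${field}`);
  }
  return serialised.subarray(offset, offset + length);
}

function toHex(bytes: Uint8Array): string {
  let hex = "";
  for (let i = 0; i < bytes.length; i++) {
    hex += (bytes[i] < 16 ? "0" : "") + bytes[i].toString(16);
  }
  return hex;
}

// Build a fake 58-byte header whose public key bytes are all 0xab.
const serialisedTx = new Uint8Array(58);
serialisedTx[0] = 0xff;
for (let i = 17; i < 50; i++) serialisedTx[i] = 0xab;

// Skip straight to the sender public key: one bounds check and one slice.
const senderPublicKey = toHex(readHeaderField(serialisedTx, "senderPublicKey"));
console.log(senderPublicKey.length); // 66 hex characters for 33 bytes
```

Because `subarray` returns a view rather than a copy, reading a single property costs the same no matter how large the rest of the transaction is.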

Beyond that, the biggest space inefficiency and inconsistency in how AIP11 serialisation has been implemented is that the vendor field, its length and the amount of a transaction are all stored at the top level. A transaction that doesn't use a vendor field or amount therefore still has to fill that range of bytes with placeholders, because the deserialisation process breaks otherwise.

Trying to read N bytes but encountering an unexpected number of bytes will cause an exception to be thrown or, worse, the reader will continue and return malformed data from that point on. This can compromise the integrity of a node because it could end up producing wrong data from transactions it has received and then attempt to store it or broadcast it to other nodes on the network.

Type-specific properties like the vendor field (now called memo) and amount will also be moved into the asset. This will allow for smaller serialised transactions because placeholder values no longer have to be filled in.

Port Conventions

Since its beginning, ARK Core has used ports that are also common defaults for other applications, such as Express servers. This can be annoying if you are already running other applications on the same ports on your server.

In an attempt to standardise ports, Odysseia will move to the following port conventions, which apply to both livenet and testnet.

  • General P2P will listen on port 26500
  • Internal P2P will listen on port 26501
  • Consensus P2P will listen on port 26502
  • API will listen on port 26503

These ports are within the unassigned 26490-26999 range. This should mean that no other applications are using these ports, unless you are running some uncommon ones.

Functionality, Performance & Security

Odysseia is still undergoing some significant changes as we speak. Once all of these breaking changes have been completed, they will be thoroughly tested for functionality.

This is the first step, confirming that the changes work as expected. It will be followed by performance testing to ensure there have been no major regressions. This is especially important because the change in storage requires more manual work in the form of iterating over large sets of data.

Last but not least, the focus will shift to security. This will be the most important phase, ensuring that major changes like the new consensus or P2P communication aren't released with major flaws.