Design & Architecture

Fundamentally, any query solution is solving a data transformation task: raw data, available via RPC event logs, must be decoded and transformed into a more useful format.

PostgreSQL Plugin

One of the most powerful parts of the design is that the engine functions as a plugin to the open-source PostgreSQL database. This gives us full SQL support with no custom query language or artificial limitations, and because SQL tooling and expertise are already ubiquitous, building APIs and bots on top of the data is much easier.

Low-latency queries

Once the event log data has been fetched from the chain, no additional network access is required and all queries run locally (with the exception of data not available in events).

This allows for sub-10-millisecond query latency, compared to alternative solutions: The Graph advertises latency in the 150 to 300 millisecond range. On L2s like Arbitrum, with block times measured in fractions of a second, such latency would be unacceptable.

Stateless

The Query Engine follows a stateless design, in contrast to other solutions. Its only job is to fetch events, decode them, and store the raw event data. Each query runs after syncing has already completed, so a single failing query affects that query alone. Syncing fails only if the RPC node fails.

This significantly decreases the risk of becoming out of sync due to errors in a query. On The Graph or Goldsky (which builds on the same design), an error in the “subgraph” script stops the entire indexer from syncing. Workarounds for this are experimental.

Derived data in our solution is generated by writing SQL views on top of the raw event data. If there is a bug in a query, the source event data is unaffected and the view can simply be fixed and re-run; there is no need to resync from scratch, unlike other solutions. You can think of this roughly like using layers in Photoshop instead of editing the photo directly.
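To make the layering concrete, here is a minimal sketch of the idea (using SQLite in place of PostgreSQL, with hypothetical table and column names): derived data lives in a view over the raw event rows, so dropping and recreating the view never touches the source data.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Raw decoded event rows (illustrative schema, not the engine's actual tables).
con.execute("""
    CREATE TABLE raw_transfer_events (
        block_number INTEGER,
        sender TEXT,
        recipient TEXT,
        amount INTEGER
    )
""")
con.executemany(
    "INSERT INTO raw_transfer_events VALUES (?, ?, ?, ?)",
    [(1, "0xaa", "0xbb", 100), (2, "0xbb", "0xcc", 40)],
)

# Derived data is just a view; a bug here never corrupts the raw events.
# A buggy view is fixed with DROP VIEW + CREATE VIEW -- no resync needed.
con.execute("""
    CREATE VIEW balances AS
    SELECT recipient AS account, SUM(amount) AS received
    FROM raw_transfer_events
    GROUP BY recipient
""")
rows = con.execute("SELECT * FROM balances ORDER BY account").fetchall()
print(rows)  # [('0xbb', 100), ('0xcc', 40)]
```

The raw table is the "photo"; each view is a non-destructive "layer" that can be redone at any time.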

Cross-chain queries “out-of-the-box”

The Query Engine is built to be network agnostic (like our dex middleware layer) so that event data from any chain can be retrieved and queried. There are no limitations on querying from any combination of networks.

Many query solutions are designed so that a node syncs only one network, so each network requires an additional deployment. Query data can easily become inconsistent if one node falls behind another in terms of block timestamps. The developer must also manually combine the data across networks outside of the query solution, which largely defeats the purpose of a query solution in the first place.
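As a sketch of what "out-of-the-box" cross-chain querying looks like (again with SQLite and illustrative names): when every event row carries a chain identifier, a single SQL statement can span any combination of networks, with no per-chain deployments and no manual merging in application code.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Event rows from multiple networks in one table (hypothetical schema).
con.execute("CREATE TABLE swap_events (chain TEXT, pool TEXT, volume INTEGER)")
con.executemany(
    "INSERT INTO swap_events VALUES (?, ?, ?)",
    [("ethereum", "0xp1", 500), ("arbitrum", "0xp1", 300), ("arbitrum", "0xp2", 200)],
)

# One query over any combination of networks.
rows = con.execute("""
    SELECT chain, SUM(volume) FROM swap_events
    WHERE chain IN ('ethereum', 'arbitrum')
    GROUP BY chain ORDER BY chain
""").fetchall()
print(rows)  # [('arbitrum', 500), ('ethereum', 500)]
```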

Extract-Load-Transform (ELT) architecture

The Query Engine follows an extract-load-transform (ELT) design, as opposed to a “traditional” extract-transform-load (ETL) process. In short, ELT loads the raw data into the database and uses the database itself to perform transformations, while ETL transforms the raw data before loading it into the database.

Using an ELT approach has a number of advantages:

  • The original raw event data is always available. Investigating or debugging queries is more straightforward since you can see the raw events and how the query is generating the derived data.
  • There is no need for two separate systems to be in charge of the import process. The data is already in the database and can be transformed the same way as application data.
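The advantages above can be sketched in a few lines (SQLite stands in for PostgreSQL; the schema is illustrative). The load step writes raw events into the database untouched; the transform step is ordinary SQL inside the same database, so the originals remain available for debugging.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Load step: raw decoded events go straight into the database, untouched.
con.execute("CREATE TABLE raw_events (block INTEGER, kind TEXT, amount INTEGER)")
con.executemany(
    "INSERT INTO raw_events VALUES (?, ?, ?)",
    [(1, "deposit", 10), (2, "withdraw", 4)],
)

# Transform step: runs inside the database, not in a separate
# pre-load pipeline as ETL would require.
net = con.execute("""
    SELECT SUM(CASE kind WHEN 'deposit' THEN amount ELSE -amount END)
    FROM raw_events
""").fetchone()[0]
print(net)  # 6

# The raw rows are still there to compare against the derived number.
raw = con.execute("SELECT * FROM raw_events ORDER BY block").fetchall()
```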

Human-readable contract ABI

To decode the raw binary event log data, a query solution needs to know the structure of the event emitted by the Solidity contract: the data types, the field names, and their order. This structure is described by the smart contract’s ABI (application binary interface). There are a few ways a contract’s ABI can be represented and managed.

Usually, a contract's ABI takes the form of a large JSON file generated when the contracts are compiled. Each contract produces a separate JSON file that must be managed. Any external contract still needs a JSON file, which can sometimes be downloaded from Etherscan but often must be compiled by a smart contract engineer. These compilation artifacts need to be shared with the query solution and kept in sync across future redeploys during development.

We find that the easiest and most lightweight approach is to use the original Solidity definitions. These can be copy-and-pasted directly from any contract and sidestep the issues above. Often you only need a small subset of a contract's events, not the entire JSON ABI.
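As an illustration of why the Solidity definition is sufficient (this is a hypothetical helper for simple cases, not the engine's actual parser), a human-readable event definition carries everything needed to recover the field names and the canonical signature, which is what gets keccak256-hashed to produce the log's topic0:

```python
import re

def parse_event(solidity_def: str):
    """Parse a Solidity event definition (illustrative; simple cases only)."""
    m = re.match(r"event\s+(\w+)\s*\((.*)\)\s*;?\s*$", solidity_def.strip())
    name, params = m.group(1), m.group(2)
    types, fields = [], []
    for p in filter(None, (s.strip() for s in params.split(","))):
        parts = p.split()          # e.g. ["address", "indexed", "from"]
        types.append(parts[0])
        fields.append(parts[-1])
    # Canonical signature: event name plus comma-joined parameter types.
    return f"{name}({','.join(types)})", fields

sig, fields = parse_event(
    "event Transfer(address indexed from, address indexed to, uint256 value);"
)
print(sig)     # Transfer(address,address,uint256)
print(fields)  # ['from', 'to', 'value']
```

Compare this one copy-pasted line to shipping an entire compiled JSON artifact per contract.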