This is a reading note on Mastering Ethereum, Chapter 11: Oracles. Oracles are systems that provide external data to Ethereum smart contracts. Ideally, oracles are trustless, meaning that they do not need to be trusted because they operate on decentralized principles.

1 Why Oracles Are Needed

In order to maintain consensus, EVM execution must be totally deterministic and based only on the shared context of the Ethereum state and signed transactions. This has two particularly important consequences: the first is that there can be no intrinsic source of randomness for the EVM and smart contracts to work with; the second is that extrinsic data can only be introduced as the data payload of a transaction.

Oracles, ideally, provide a trustless (or at least near-trustless) way of getting extrinsic (i.e., “real-world” or off-chain) information, such as the results of football games, the price of gold, or truly random numbers, onto the Ethereum platform for smart contracts to use. They can also be used to relay data securely to DApp frontends directly. Oracles can therefore be thought of as a mechanism for bridging the gap between the off-chain world and smart contracts. However, this also introduces external risks to Ethereum’s security model: a contract that relies on an oracle is only as trustworthy as the oracle’s data source and operator, so a compromised or dishonest oracle can feed it false data.

Note that some oracles provide data that is particular to a specific private data source, such as academic certificates or government IDs. These data sources still count as “oracles” because they also provide a data bridge for smart contracts. The data they provide generally takes the form of attestations, such as passports or records of achievement. Attestations will become a big part of the success of blockchain platforms in the future, particularly for verifying identity or reputation, so it is important to explore how they can be served by blockchain platforms.

Oracles can also be used to perform arbitrary computation, a function that can be especially useful given Ethereum’s inherent block gas limit and comparatively expensive computation costs.

2 Oracle Design Patterns

The key functions of an oracle are:

  • Collect data from an off-chain source.
  • Transfer the data on-chain with a signed message.
  • Make the data available by putting it in a smart contract’s storage.

Once the data is available in a smart contract’s storage, it can be accessed by other smart contracts via message calls that invoke a “retrieve” function of the oracle’s smart contract; it can also be accessed by Ethereum nodes or network-enabled clients directly by “looking into” the oracle’s storage.
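The three functions above can be sketched as a toy model. This is only an illustration in Python, not contract code: the class `OracleContract`, the address strings, the key `"XAU/USD"`, and the function `fetch_off_chain_price` are all hypothetical names, and the `sender` argument stands in for the account that a signed transaction would authenticate on a real chain.

```python
class OracleContract:
    """In-memory stand-in for an oracle's on-chain contract (hypothetical)."""

    def __init__(self, oracle_address: str) -> None:
        self.oracle_address = oracle_address
        self._storage: dict[str, int] = {}

    def update(self, sender: str, key: str, value: int) -> None:
        # On a real chain the sender is proven by the transaction signature;
        # here it is simply passed in as an argument.
        if sender != self.oracle_address:
            raise PermissionError("only the registered oracle may write")
        self._storage[key] = value

    def retrieve(self, key: str) -> int:
        # Other contracts would reach this through a message call.
        return self._storage[key]


def fetch_off_chain_price() -> int:
    # Placeholder for a query to a real off-chain source; the value is made up.
    return 195000


# 1. collect off-chain, 2. send as the oracle's transaction, 3. store on-chain
contract = OracleContract(oracle_address="0xORACLE")
contract.update(sender="0xORACLE", key="XAU/USD", value=fetch_off_chain_price())
```

After this runs, any consumer can call `contract.retrieve("XAU/USD")`, while an `update` from any other sender is rejected.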

The three main ways to set up an oracle can be categorized as immediate-read, publish–subscribe, and request–response.

  • Immediate-read oracles provide data that is only needed for an immediate decision.
  • Publish–subscribe oracles provide a broadcast service for data that is expected to change (perhaps both regularly and frequently); the data is either polled by a smart contract on-chain or watched by an off-chain daemon for updates.
  • Request–response oracles might be implemented as a system of on-chain smart contracts and off-chain infrastructure used to monitor requests and retrieve and return data.
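Of the three patterns, request–response has the most moving parts, so here is a minimal sketch of it in Python. It assumes a simple design in which the contract queues requests together with a callback and an off-chain worker fulfills them; the names `RequestResponseOracle`, `off_chain_worker`, and the query string `"match:final"` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


class RequestResponseOracle:
    """Toy model of the on-chain side: records requests, delivers results."""

    def __init__(self) -> None:
        self._next_id = 0
        self.pending: Dict[int, Tuple[str, Callable[[str], None]]] = {}

    def request(self, query: str, callback: Callable[[str], None]) -> int:
        # A real contract would emit an event here for off-chain listeners.
        request_id = self._next_id
        self._next_id += 1
        self.pending[request_id] = (query, callback)
        return request_id

    def fulfill(self, request_id: int, result: str) -> None:
        # Called by the oracle operator; hands the result back to the requester.
        _query, callback = self.pending.pop(request_id)
        callback(result)


def off_chain_worker(oracle: RequestResponseOracle, data_source: Dict[str, str]) -> None:
    # Off-chain infrastructure: watch pending requests, fetch data, respond.
    for request_id, (query, _callback) in list(oracle.pending.items()):
        oracle.fulfill(request_id, data_source[query])


received: List[str] = []
oracle = RequestResponseOracle()
oracle.request("match:final", received.append)
off_chain_worker(oracle, {"match:final": "2-1"})
```

The requester never talks to the data source itself; it only sees the value delivered through its callback.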

3 Data Authentication

There is a distinct possibility that data may be tampered with in transit, so it is critical that off-chain methods are able to attest to the returned data’s integrity. Two common approaches to data authentication are authenticity proofs and trusted execution environments (TEEs).

Authenticity proofs are cryptographic guarantees that data has not been tampered with. Based on a variety of attestation techniques (e.g., digitally signed proofs), they effectively shift the trust from the data carrier to the attestor (i.e., the provider of the attestation). By verifying the authenticity proof on-chain, smart contracts are able to verify the integrity of the data before operating upon it.
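To make the idea of checking a proof before using the data concrete, here is a small Python sketch. It is an assumption-heavy stand-in: it uses an HMAC from the standard library in place of a real digital signature, so the verifier shares the attestor's key, which a real authenticity proof would avoid by using public-key signatures. The key, the payload, and the function names `attest` and `verify` are hypothetical.

```python
import hashlib
import hmac

# Hypothetical shared key; a real attestor would sign with a private key instead.
ATTESTOR_KEY = b"hypothetical-attestor-key"


def attest(data: bytes, key: bytes = ATTESTOR_KEY) -> bytes:
    """Attestor side: produce a proof bound to the exact bytes of the data."""
    return hmac.new(key, data, hashlib.sha256).digest()


def verify(data: bytes, proof: bytes, key: bytes = ATTESTOR_KEY) -> bool:
    """Consumer side: accept the data only if the proof matches it."""
    return hmac.compare_digest(attest(data, key), proof)


data = b'{"pair": "XAU/USD", "price": 195000}'
proof = attest(data)

# A carrier that alters the payload in transit cannot produce a matching proof.
tampered = data.replace(b"195000", b"100000")
```

Here `verify(data, proof)` succeeds and `verify(tampered, proof)` fails, which is the sense in which trust shifts from the data carrier to the attestor.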

TEE methods utilize hardware-based secure enclaves to ensure data integrity.

4 Decentralized Oracles

ChainLink has proposed a decentralized oracle network consisting of three key smart contracts (a reputation contract, an order-matching contract, and an aggregation contract) and an off-chain registry of data providers. The reputation contract is used to keep track of data providers' performance. Scores in the reputation contract are used to populate the off-chain registry. The order-matching contract selects bids from oracles using the reputation contract. It then finalizes a service-level agreement, which includes query parameters and the number of oracles required. This means that the purchaser needn’t transact with the individual oracles directly. The aggregation contract collects responses (submitted using a commit–reveal scheme) from multiple oracles, calculates the final collective result of the query, and finally feeds the results back into the reputation contract.
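The commit–reveal step of the aggregation contract can be sketched in Python as follows. This is not ChainLink's actual implementation: the class `AggregationContract`, the salted SHA-256 commitment, and the use of the median as the "final collective result" are all assumptions made for illustration.

```python
import hashlib
import secrets
from statistics import median


def commit(value: int, salt: bytes) -> bytes:
    # The salt keeps other oracles from guessing the value behind the hash.
    return hashlib.sha256(salt + str(value).encode()).digest()


class AggregationContract:
    """Toy aggregation contract: collect commitments, then check reveals."""

    def __init__(self) -> None:
        self.commitments: dict[str, bytes] = {}
        self.revealed: dict[str, int] = {}

    def submit_commitment(self, oracle_id: str, commitment: bytes) -> None:
        self.commitments[oracle_id] = commitment

    def reveal(self, oracle_id: str, value: int, salt: bytes) -> None:
        # A reveal only counts if it matches what was committed earlier.
        if commit(value, salt) != self.commitments.get(oracle_id):
            raise ValueError("reveal does not match commitment")
        self.revealed[oracle_id] = value

    def result(self) -> float:
        # Median is an assumed aggregation rule; it resists single outliers.
        return median(self.revealed.values())


answers = {"oracle-a": 100, "oracle-b": 102, "oracle-c": 101}
salts = {oid: secrets.token_bytes(16) for oid in answers}

agg = AggregationContract()
for oid, value in answers.items():   # phase 1: everyone commits
    agg.submit_commitment(oid, commit(value, salts[oid]))
for oid, value in answers.items():   # phase 2: everyone reveals
    agg.reveal(oid, value, salts[oid])
```

Because every oracle commits before anyone reveals, no oracle can copy another's answer; with the three answers above, `agg.result()` is 101.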