This is a read note of Mastering Bitcoin Chapter 06: Transactions. Bitcoin is designed to ensure that transactions can be created, propagated on the network, validated, and finally added to the global ledger of transactions (the blockchain).

1 Introduction

The fundamental building block of a bitcoin transaction is a transaction output. Transaction outputs are indivisible chunks of bitcoin currency, recorded on the blockchain, and recognized as valid by the entire network. Bitcoin full nodes track all available and spendable outputs, known as unspent transaction outputs, or UTXO. The collection of all UTXO is known as the UTXO set and currently numbers in the millions of UTXO. The UTXO set grows as new UTXO is created and shrinks when UTXO is consumed. Every transaction represents a change (state transition) in the UTXO set.

When we say that a user’s wallet has “received” bitcoin, what we mean is that the wallet has detected on the blockchain an UTXO that can be spent with one of the keys controlled by that wallet. Thus, a user’s bitcoin “balance” is the sum of all UTXO that user’s wallet can spend and which may be scattered among hundreds of transactions and hundreds of blocks. The concept of a balance is created by the wallet application. The wallet calculates the user’s balance by scanning the blockchain and aggregating the value of any UTXO the wallet can spend with the keys it controls.

Outputs are discrete and indivisible units of value, denominated in integer satoshis (1 bitcoin = 10 ** 8 satoshis). An unspent output can only be consumed in its entirety by a transaction.

The exception to the output and input chain is a special type of transaction called the coinbase transaction, which is the first transaction in each block. This transaction is placed there by the “winning” miner and creates brand-new bitcoin payable to that miner as a reward for mining. This special coinbase transaction does not consume UTXO; instead, it has a special type of input called the “coinbase.” This is how bitcoin’s money supply is created during the mining process.

When a UTXO is used as an input in a transaction, it becomes a spent transaction output (STXO).

2 Transaction Outputs

Every bitcoin transaction creates outputs, which are recorded on the bitcoin ledger. Almost all of these outputs, with one exception (op_return) create spendable chunks of bitcoin called UTXO, which are tracked by every full-node Bitcoin client in the UTXO set. New transactions consume (spend) one or more of these outputs from the UTXO set.

Transaction outputs consist of two parts:

  • An amount of bitcoin, denominated in satoshis, the smallest bitcoin unit.
  • A cryptographic puzzle that determines the conditions required to spend the output.

The cryptographic puzzle is also known as a locking script, a witness script, or a scriptPubKey. It is coded in transaction script language. It sets the spending conditions.

When transaction outputs are transmitted over the network or exchanged between applications, they are serialized with the following fields:

  • Amount: it is 8 bytes in little-endian representing Bitcoin value in satohis. A bitcoin has 100 million, i.e. 10 ** 8, satoshis.
  • Locking-Script Size: it is 1-9 bytes giving the locking script size in bytes.
  • Locking-Script: it is the locking script that has a size given by the previous field.

3 Transaction Inputs

The transaction inputs are an array (list) called vin. Each vin represents a transaction. Transaction inputs identify (by reference) which UTXO will be consumed and provide proof of ownership through an unlocking script.

An input serialization has five fields:

  • A transaction ID in reversed byte order, 32 bytes, referencing the UTXO being spent.
  • An output index (vout), 4 bytes, identifying the specific UTXO in that transaction.
  • Unlocking script size, 1-9 bytes, giving the unlocking script length in byte.
  • The unlocking script, variable bytes, a script that fulfills the conditions of the UTXO locking script. The unlocking script is constructed by a wallet by first retrieving the referenced UTXO, examining its locking script, and then using it to build the necessary unlocking script to satisfy it. Most often, the unlocking script is a digital signature and public key proving ownership of the bitcoin. However, not all unlocking scripts contain signatures. The third part is a sequence number.
  • A sequence number, 4 bytes, used for locktime or disabled (0xFFFFFFFF).

The most common unlocking script is ScriptSig, a signature script. It has foure fields:

  • Signature size: 1-9 bytes, signature length in bytes.
  • Signature: a signature that inlucde Signature Hash Type (SIGHASH) with the specified size.
  • Public key size: 1-9 bytes, public key length in bytes.
  • Public key: the unhashed public key with the specified size.

4 Transaction Fees

Most transactions include transaction fees, which compensate the bitcoin miners for securing the network. Fees also serve as a security mechanism themselves, by making it economically infeasible for attackers to flood the network with transactions.

Transaction fees are calculated based on the size of the transaction in kilobytes, not the value of the transaction in bitcoin. Transaction fees affect the processing priority, meaning that a transaction with sufficient fees is likely to be included in the next block mined, whereas a transaction with insufficient or no fees might be delayed, processed on a best-effort basis after a few blocks, or not processed at all.

Over time, the way transaction fees are calculated and the effect they have on transaction prioritization has evolved. At first, transaction fees were fixed and constant across the network. Gradually, the fee structure relaxed and may be influenced by market forces, based on network capacity and transaction volume. Since at least the beginning of 2016, capacity limits in bitcoin have created competition between transactions, resulting in higher fees and effectively making free transactions a thing of the past. Zero fee or very low fee transactions rarely get mined and sometimes will not even be propagated across the network.

In Bitcoin Core, fee relay policies are set by the minrelaytxfee option. The current default minrelaytxfee is 0.00001 bitcoin. Any bitcoin service that creates transactions, including wallets, exchanges, retail applications, etc., must implement dynamic fees. Dynamic fees can be implemented through a third-party fee estimation service or with a built-in fee estimation algorithm. One popular service is bitcoinfees. It has an HTTP REST API.

The data structure of transactions does not have a field for fees. Instead, fees are implied as the difference between the sum of inputs and the sum of outputs. Any excess amount that remains after all outputs have been deducted from all inputs is the fee that is collected by the miners.

5 Transaction Script Language

The bitcoin transaction script language, called Script, is a Forth-like reverse-polish notation stack-based execution language. Both the locking script placed on an UTXO and the unlocking script are written in this scripting language. When a transaction is validated, the unlocking script in each input is executed alongside the corresponding locking script to see if it satisfies the spending condition.

Script is a very simple, not a gnearl-purpose language that was designed to be limited in scope and executable on a range of hardware. It has the following features:

  • It is deliberately limited in one important way—​there are no loops or complex flow control capabilities other than conditional flow control. This ensures that the language is not Turing Complete, meaning that scripts have limited complexity and predictable execution times.
  • It is stateless. There is no state prior to execution of the script, or state saved after execution of the script. Therefore, all the information needed to execute a script is contained within the script. A script will predictably execute the same way on any system.
  • Bitcoin’s transaction validation engine relies on two types of scripts to validate transactions: a locking script and an unlocking script.
    • Locking scripts can be written to express a vast variety of complex conditions. Historically, the locking script was called a scriptPubKey, because it usually contained a public key or Bitcoin address (public key hash). It is also called witness script or cryptographic puzzle.
    • The unlocking script was called scriptSig, because it usually contained a digital signature. Unlocking script is also called witness.

In the original Bitcoin client, the unlocking and locking scripts were concatenated and executed in sequence. For security reason, since 2010, the scripts are executed separately with the stack transferred between the two executions. First, the unlocking script is executed, using the stack execution engine. If the unlocking script is executed without errors (e.g., it has no “dangling” pointers left over), the main stack is copied and the locking script is executed.

The scripting language executes the script by processing each item from left to right. Numbers (data constants) are pushed onto the stack. Operators push or pop one or more parameters from the stack, act on them, and might push a result onto the stack.

If any result other than “TRUE” remains after execution of the combined script, the input is invalid because it has failed to satisfy the spending conditions placed on the UTXO. Only a valid transaction that correctly satisfies the conditions of the output results in the output being considered as “spent” and removed from the set of unspent transaction outputs (UTXO set).

6 Pay-to-Public-Key-Hash (P2PKH)

Today, most transactions processed through the Bitcoin network have the form “Payment to Bob’s Bitcoin address” and are based on a script called a Pay-to-Public-Key-Hash script (“P2PKH”).

The unlocking script (scriptSig) is <Signature><Public Key> and teh locking script (scriptPubKey) is DUP HASH160 <Publick Key Hash> OP_EQUALVERIFY OP_CHECKSIG.

It executes as the following:

  • The unlocking script pushes two items into stack from bottom to top: <Signature> and <Public Key>.
  • The DUP operator duplicates the top item in the stack. The stack becomes <Signature> <Public Key> <Public Key>.
  • The HASH160 hashes the top element. The stack is <Signature> <Public Key> <Public Key Hash>
  • The <Publick Key Hash> is pushed into stack: <Signature> <Public Key> <Public Key Hash> <Public Key Hash>.
  • The OP_EQUALVERIFY removes top two elements if they match: <Signature> <Public Key>.
  • The OP_CHECKSIG checks the signature matches the public key and pushes result to the top of stack if true.

7 Digital Signature (ECDSA)

The digital signature algorithm used in bitcoin is the Elliptic Curve Digital Signature Algorithm, or ECDSA. ECDSA is used by the script functions OP_CHECKSIG, OP_CHECKSIGVERIFY, OP_CHECKMULTISIG, and OP_CHECKMULTISIGVERIFY. Any time you see those in a locking script, the unlocking script must contain an ECDSA signature.

A digital signature serves three purposes in bitcoin.

  1. the signature proves that the owner of the private key, who is by implication the owner of the funds, has authorized the spending of those funds.
  2. Secondly, the proof of authorization is undeniable (nonrepudiation).
  3. Thirdly, the signature proves that the transaction (or specific parts of the transaction) have not and cannot be modified by anyone after it has been signed.

In Bitcoin, the digital signature funciton is applied on the hash of transaction data. The result is a pair of values: (R, S). The result is serialized into a byte-stream using an international standard encoding scheme called the Distinguished Encoding Rules (DER). For example, the signature 3045022100884d142d86652a3f47ba4746ec719bbfbd040a570b1deccbb6498c75c4ae24cb02204b9f039ff08df09cbe9f6addac960298cad530a863ea8f53982c09db8f6e381301 has nine fields:

  • 0x30: the start of a DER sequence
  • 0x45: the length of the sequence (69 bytes)
  • 0x02: an integer value follows
  • 0x21: the length of the interger (33 bytes)
  • R: 00884d142d86652a3f47ba4746ec719bbfbd040a570b1deccbb6498c75c4ae24cb
  • 0x02: another integer value follows
  • 0x20: the length of the interger (32 bytes)
  • S: 4b9f039ff08df09cbe9f6addac960298cad530a863ea8f53982c09db8f6e3813
  • 0x01: a suffix indicating the type of hash used SIGHASH_ALL.

The signature verification algorithm takes the message (a hash of the transaction or parts of it), the signer’s public key and the signature (R, S), and returns TRUE if the signature is valid for this message and public key.

8 Signature Hash Types (SIGHASH)

In the simplest form, the signature applies to the entire transaction, thereby committing all the inputs, outputs, and other transaction fields. However, a signature can commit to only a subset of the data in a transaction, which is useful for a number of scenarios.

Bitcoin signatures have a way of indicating which part of a transaction’s data is included in the hash signed by the private key using a SIGHASH flag. The SIGHASH flag is a single byte that is appended to the signature. Every signature has a SIGHASH flag and the flag can be different from input to input. A transaction with three signed inputs may have three signatures with different SIGHASH flags, each signature signing (committing) different parts of the transaction. Note also that bitcoin transactions may contain inputs from different “owners,” who may sign only one input in a partially constructed (and invalid) transaction, collaborating with others to gather all the necessary signatures to make a valid transaction. Many of the SIGHASH flag types only make sense if you think of multiple participants collaborating outside the Bitcoin network and updating a partially signed transaction.

There are three SIGHASH flags:

  • ALL: 0x01, applies to all inputs and outputs.
  • NONE: 0x02, applies to all inputs, none of the outputs. This construction can be used to create a “bearer check” or “blank check” of a specific amount. It commits to the input, but allows the output locking script to be changed. Anyone can write their own Bitcoin address into the output locking script and redeem the transaction. However, the output value itself is locked by the signature.
  • SINGLE: 0x03, applies to all inputs but the only one output with the same index number as the signed input.

In addition, there is a modifier flag SIGHASH_ANYONECANPAY, which can be combined with each of the preceding flags. When ANYONECANPAY is set, only one input is signed, leaving the rest (and their sequence numbers) open for modification. The ANYONECANPAY has the value 0x80 and is applied by bitwise OR, resulting in the combined flags

  • ALL|ANYONECANPAY: 0x81, one input and all outputs. This construction can be used to make a “crowdfunding”-style transaction. Someone attempting to raise funds can construct a transaction with a single output. The single output pays the “goal” amount to the fundraiser. Such a transaction is obviously not valid, as it has no inputs. However, others can now amend it by adding an input of their own, as a donation. They sign their own input with ALL|ANYONECANPAY. Unless enough inputs are gathered to reach the value of the output, the transaction is invalid. Each donation is a “pledge,” which cannot be collected by the fundraiser until the entire goal amount is raised.
  • NONE|ANYONECANPAY: 0x82, one input and none of the outputs. This construction can be used to build a “dust collector.” Users who have tiny UTXO in their wallets can’t spend these because the cost in fees exceeds the value of the dust. With this type of signature, the dust UTXO can be donated for anyone to aggregate and spend whenever they want.
  • SINGLE|ANYONECANPAY: 0x83, one input and one output.

All SIGHASH types sign the transaction nLocktime field. In addition, the SIGHASH type itself is appended to the transaction before it is signed, so that it can’t be modified once signed.

9 ECDSA Math

The signature algorithm first generates an ephemeral (temporary) private public key pair. This temporary key pair is used in the calculation of the R and S values, after a transformation involving the signing private key and the transaction hash.

The temporary key pair is based on a random number k, which is used as the temporary private key. From k, we generate the corresponding temporary public key P. The R value of the digital signature is then the x coordinate of the ephemeral public key P.

The the algorithm calculates the value S from message hash, the R, the ephemeral prviate key, and the signing private key.

Verification is the inverse of the signature generation function, using the R, S values and the public key to calculate a value P, which is a point on the elliptic curve.

The value of the random number (private key) k is not important, as long as it is random. If the same value k is used to produce two signatures on different messages (transactions), then the signing private key can be calculated by anyone. Reuse of the same value for k in a signature algorithm leads to exposure of the private key! To avoid this vulnerability, the industry best practice RFC 6979 is to not generate k with a random-number generator seeded with entropy, but instead to use a deterministic-random process seeded with the transaction data itself.