This is a reading note on Programming Bitcoin, Chapter 6: Script. Script is the smart contract language used to implement a transaction’s locking and unlocking mechanism.

1 Mechanics of Script

Script is a stack-based language that is intentionally limited: it omits certain features, such as loops, for safety and reliability reasons. It is therefore not Turing complete.

Transactions assign bitcoins to a locking script, which is specified in the ScriptPubKey field. Unlocking is done in the ScriptSig field, which proves ownership of the locked funds and thereby authorizes spending them.

Script is a programming language, and like most programming languages, it processes one command at a time. There are two possible types of commands: elements and operations. Commands operate on a processing stack of elements.

  • Elements are data. Technically, processing an element pushes that element onto the stack. Elements are byte strings of length 1 to 520. A typical element might be a DER signature or a SEC pubkey.
  • Operations do something to the data. They consume zero or more elements from the processing stack and push zero or more elements back to the stack.

After all the commands are evaluated, the top element of the stack must be nonzero for the script to resolve as valid. Having no elements in the stack or the top element being 0 would resolve as invalid. Resolving as invalid means that the transaction that includes the unlocking script is not accepted on the network.
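The evaluation rule above can be sketched as a toy interpreter. This is a minimal illustration, not the book's implementation: elements are Python bytes, operations are plain Python functions, and only two operations (modeled on OP_DUP and OP_EQUAL) are included. The helper names is_true and evaluate are assumptions made for this sketch.

```python
def is_true(element: bytes) -> bool:
    """Return True if a stack element counts as nonzero.

    Zero is the empty byte string or any run of 0x00 bytes,
    optionally ending in 0x80 (negative zero).
    """
    for i, byte in enumerate(element):
        if byte != 0:
            return not (i == len(element) - 1 and byte == 0x80)
    return False


def op_dup(stack: list) -> bool:
    """Duplicate the top element; fail on an empty stack."""
    if len(stack) < 1:
        return False
    stack.append(stack[-1])
    return True


def op_equal(stack: list) -> bool:
    """Pop two elements and push 1 if they are equal, 0 (empty) otherwise."""
    if len(stack) < 2:
        return False
    a = stack.pop()
    b = stack.pop()
    stack.append(b"\x01" if a == b else b"")
    return True


def evaluate(commands: list) -> bool:
    """Process commands one at a time and apply the final validity check."""
    stack = []
    for cmd in commands:
        if isinstance(cmd, bytes):
            stack.append(cmd)      # an element is pushed onto the stack
        elif not cmd(stack):       # an operation consumes and pushes elements
            return False
    # Valid only if the stack is non-empty and its top element is nonzero.
    return len(stack) > 0 and is_true(stack[-1])


print(evaluate([b"\x02", op_dup, op_equal]))   # element duplicated, then compared
print(evaluate([b"\x02", b"\x03", op_equal]))  # different elements
```

The first script leaves a 1 on top of the stack and resolves as valid; the second leaves a 0 and resolves as invalid, as does an empty command list.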

2 Parsing the Script Fields

Both the ScriptPubKey and ScriptSig are parsed the same way, one byte at a time. If a byte is between 0x01 and 0x4b (a value we call n), we read the next n bytes as an element. Otherwise, the byte represents an operation, which is looked up by its opcode.

For an element longer than 0x4b (75 in decimal) bytes, there are three specific opcodes: OP_PUSHDATA1 (code value 0x4c), OP_PUSHDATA2 (code value 0x4d), and OP_PUSHDATA4 (code value 0x4e). OP_PUSHDATA1 means that the next byte contains the number of bytes to read for the element. OP_PUSHDATA2 means that the next 2 bytes contain that length, and OP_PUSHDATA4 means that the next 4 bytes contain it. The multi-byte lengths are little-endian integers.

Any element longer than 520 bytes is invalid.
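The parsing rules above might be sketched as follows. This is an illustrative parser under some assumptions: the input is the raw script body with the leading varint length already stripped, elements are returned as bytes and operations as integer opcodes, and the 520-byte limit is checked during parsing for simplicity. The names parse_script and read_exact are made up for this sketch.

```python
from io import BytesIO

MAX_ELEMENT_SIZE = 520  # longest element allowed, per the rule above


def read_exact(stream: BytesIO, n: int) -> bytes:
    """Read exactly n bytes or raise if the script is truncated."""
    data = stream.read(n)
    if len(data) != n:
        raise ValueError("script ended in the middle of a push")
    return data


def parse_script(raw: bytes) -> list:
    """Split a raw script into elements (bytes) and opcodes (int)."""
    stream = BytesIO(raw)
    commands = []
    while True:
        first = stream.read(1)
        if not first:
            break  # end of script
        opcode = first[0]
        if 0x01 <= opcode <= 0x4b:
            length = opcode                                   # direct push of n bytes
        elif opcode == 0x4c:                                  # OP_PUSHDATA1
            length = read_exact(stream, 1)[0]
        elif opcode == 0x4d:                                  # OP_PUSHDATA2
            length = int.from_bytes(read_exact(stream, 2), "little")
        elif opcode == 0x4e:                                  # OP_PUSHDATA4
            length = int.from_bytes(read_exact(stream, 4), "little")
        else:
            commands.append(opcode)                           # an operation
            continue
        if length > MAX_ELEMENT_SIZE:
            raise ValueError("element longer than 520 bytes")
        commands.append(read_exact(stream, length))
    return commands


# A 2-byte element followed by OP_DUP (0x76).
print(parse_script(bytes([0x02, 0xAA, 0xBB, 0x76])))
```

Returning opcodes as integers keeps the parser independent of any particular table of operation handlers; an evaluator can map them to functions later.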

To evaluate a script, we need to combine the ScriptSig field of a transaction input with the ScriptPubKey field of the previous transaction output that the input spends. The ScriptSig is evaluated separately from the ScriptPubKey so that operations in the ScriptSig cannot affect the ScriptPubKey commands.

3 Standard Scripts

There are many types of standard scripts in Bitcoin, including the following:

  • p2pk: pay-to-pubkey
  • p2pkh: pay-to-pubkey-hash
  • p2sh: pay-to-script-hash
  • p2wpkh: pay-to-witness-pubkey-hash
  • p2wsh: pay-to-witness-script-hash

3.1 p2pk

Pay-to-pubkey (p2pk) was used largely during the early days of Bitcoin. Most coins thought to belong to Satoshi are in p2pk UTXOs—that is, transaction outputs whose ScriptPubKeys have the p2pk form.

For p2pk, the ScriptPubKey has three parts: the length of the public key, the public key, and the OP_CHECKSIG opcode.

For p2pk, the ScriptSig required to unlock the corresponding ScriptPubKey has two parts: the length of the signature and the signature.

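After combination, the commands are processed in the order signature, pubkey, OP_CHECKSIG. The signature and the pubkey are elements, so they are pushed onto the stack, leaving the pubkey on top of the signature. OP_CHECKSIG then consumes these two stack elements and determines whether the signature is valid for this transaction. OP_CHECKSIG pushes a 1 onto the stack if the signature is valid, and a 0 if not.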
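The p2pk evaluation can be traced with a small sketch. Real OP_CHECKSIG verifies an ECDSA signature over the transaction's signature hash; to keep the example self-contained, that check is replaced here by a hypothetical stand-in, toy_verify, which is an assumption of this sketch and has no cryptographic meaning. The commands run in the order signature, pubkey, OP_CHECKSIG.

```python
def toy_verify(pubkey: bytes, signature: bytes) -> bool:
    """Stand-in for ECDSA verification (NOT real cryptography)."""
    return signature == b"signed-by:" + pubkey


def op_checksig(stack: list) -> bool:
    """Pop pubkey and signature, push 1 if the signature checks out, else 0."""
    if len(stack) < 2:
        return False
    pubkey = stack.pop()       # pushed last, so it is on top
    signature = stack.pop()
    stack.append(b"\x01" if toy_verify(pubkey, signature) else b"")
    return True


def run_p2pk(script_sig: list, script_pubkey: list) -> bool:
    """Evaluate ScriptSig commands followed by ScriptPubKey commands."""
    stack = []
    for cmd in script_sig + script_pubkey:
        if isinstance(cmd, bytes):
            stack.append(cmd)
        elif not cmd(stack):
            return False
    # In this script the top element is whatever op_checksig pushed:
    # b"\x01" for valid, b"" (zero) for invalid.
    return len(stack) > 0 and stack[-1] != b""


pubkey = b"\x04" + bytes(64)            # placeholder 65-byte uncompressed key
script_pubkey = [pubkey, op_checksig]   # the locking script
good_sig = b"signed-by:" + pubkey       # accepted by the toy verifier
print(run_p2pk([good_sig], script_pubkey))
print(run_p2pk([b"forged"], script_pubkey))
```

With the matching toy signature the script leaves a 1 on the stack and resolves as valid; with any other signature it leaves a 0 and resolves as invalid.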

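p2pk has several problems. The public keys are long (65 bytes in uncompressed SEC format, 33 bytes compressed). The UTXO set becomes bigger because the public keys have to be kept around and indexed in it. And because the public key is stored in the ScriptPubKey, it is known to everyone, so the coins would be at risk if ECDSA were someday broken.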

3.2 p2pkh

Pay-to-pubkey-hash (p2pkh) is an alternative script format that has two key advantages over p2pk:

  1. The addresses are shorter.
  2. It’s additionally protected by sha256 and ripemd160.

The p2pkh ScriptPubKey has six parts: OP_DUP, OP_HASH160, length of hash, hash value, OP_EQUALVERIFY, and OP_CHECKSIG. It contains the 20-byte hash160 of the public key rather than the public key itself.

The p2pkh ScriptSig has four parts: length of signature, signature, length of pubkey, and pubkey.
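The two p2pkh layouts can be written out as raw bytes. The sketch below assumes the 20-byte hash160, the signature, and the SEC pubkey are already available (the inputs used here are placeholders, not real keys or signatures), and it omits the varint length that prefixes a script inside a serialized transaction. The function names are made up for this illustration; the opcode values are the standard ones.

```python
OP_DUP = 0x76
OP_HASH160 = 0xA9
OP_EQUALVERIFY = 0x88
OP_CHECKSIG = 0xAC


def push(data: bytes) -> bytes:
    """Serialize an element of 1 to 75 bytes as <length byte><data>."""
    if not 1 <= len(data) <= 0x4B:
        raise ValueError("this sketch only handles direct pushes (1 to 75 bytes)")
    return bytes([len(data)]) + data


def p2pkh_script_pubkey(h160: bytes) -> bytes:
    """OP_DUP OP_HASH160 <len> <hash> OP_EQUALVERIFY OP_CHECKSIG."""
    if len(h160) != 20:
        raise ValueError("hash160 must be 20 bytes")
    return (
        bytes([OP_DUP, OP_HASH160])
        + push(h160)
        + bytes([OP_EQUALVERIFY, OP_CHECKSIG])
    )


def p2pkh_script_sig(signature: bytes, sec_pubkey: bytes) -> bytes:
    """<len> <signature> <len> <pubkey>."""
    return push(signature) + push(sec_pubkey)


placeholder_hash = bytes(range(20))           # stands in for a real hash160
placeholder_sig = b"\x30" + bytes(70)         # stands in for a DER signature
placeholder_pubkey = b"\x02" + bytes(32)      # stands in for a compressed SEC pubkey

print(p2pkh_script_pubkey(placeholder_hash).hex())
print(p2pkh_script_sig(placeholder_sig, placeholder_pubkey).hex())
```

The ScriptPubKey always comes out at 25 bytes: two opcodes, a length byte, the 20-byte hash, and two more opcodes.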