These are reading notes on the book “IPFS 原理与实践” (IPFS: Principles and Practice). IPFS is a kind of serverless system.

1 Introduction

IPFS (InterPlanetary File System) is a new distributed hypermedia transport protocol based on content addressing. It is a distributed file system whose goal is to connect all computing devices into a single global file system. It uses a BitTorrent-like protocol to exchange Git-like data objects, and it integrates BitTorrent, DHT, Git and SFS into a P2P hypermedia protocol. IPFS is both a protocol and a P2P network.

Filecoin (FIL) is an incentive layer running on top of IPFS. It is a distributed storage network based on blockchain. Filecoin has two markets: data retrieval and data storage.

In a Distributed Hash Table (DHT), each client stores and searches a portion of the data. A Kademlia network maintains one large distributed hash table that allows a value to be accessed by its key. Git supports versioning. A Self-Certifying File System (SFS) stores all files under a single directory and creates each path from the original path and a hash of the public key.

IPFS stores the data itself, while the permanent addresses of that data are put in blockchain transactions. IPFS works with different chains. Filecoin uses Proof of Replication (PoRep), which requires the data to actually be stored.

IPFS has the following modules:

  • Filecoin: value data for incentive and exchange.
  • IPFS: application data for application.
  • IPLD: data structures for name, object and file.
  • libP2P: data transport for routing, network and exchange. libP2P is commonly used as the network layer of blockchains.
  • Multiformats: encryption and metadata, for identification.

Application domains:

  • Permanent archive
  • Low cost, safe P2P CDN
  • Data storage for blockchain
  • Decentralized free speech

2 IPFS Foundation

2.1 DHT

A DHT is a global hash table: a collection of <key, value> pairs, where each key is a hash value that a specific hash algorithm computes from a file name or file content, and the value is the address of the node that stores the file. The global hash table is distributed across all nodes, and each node holds a part of the table. Kademlia, Coral and S/Kademlia are three popular DHT implementations.
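
The key-to-node mapping described above can be sketched as follows. This is a minimal toy, not the Kademlia protocol itself: it assumes every caller knows all node IDs, whereas a real Kademlia node keeps k-buckets and looks up peers iteratively. The names `ToyDHT`, `key_for` and `responsible_node`, as well as the node names and the address, are made up for illustration. Only the XOR distance metric is taken from Kademlia.

```python
import hashlib


def key_for(content: bytes) -> int:
    """Derive a 160-bit key from content (SHA-1 is an assumption of this sketch)."""
    return int.from_bytes(hashlib.sha1(content).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia measures closeness between two IDs as their bitwise XOR."""
    return a ^ b


class ToyDHT:
    def __init__(self, node_names):
        # Each node ID maps to that node's local slice of the global table.
        self.nodes = {key_for(name.encode()): {} for name in node_names}

    def responsible_node(self, key: int) -> int:
        # The node whose ID is XOR-closest to the key stores the entry.
        return min(self.nodes, key=lambda node_id: xor_distance(node_id, key))

    def put(self, content: bytes, location: str) -> int:
        key = key_for(content)
        self.nodes[self.responsible_node(key)][key] = location
        return key

    def get(self, key: int):
        return self.nodes[self.responsible_node(key)].get(key)


dht = ToyDHT(["node-a", "node-b", "node-c", "node-d"])
key = dht.put(b"hello ipfs", "10.0.0.7:4001")
print(dht.get(key))
```

Only one of the four nodes holds the entry, yet any caller can find it again from the key alone.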

2.2 BitTorrent: Block Exchange Protocol

First, a seed initiates the sharing with a .torrent metadata file. A tracker saves the metadata file and the seed information. BT breaks a file into many 256 KB blocks.
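
As a rough illustration of the block splitting, the sketch below cuts a byte string into 256 KB blocks and hashes each one with SHA-1, the hash that BitTorrent v1 uses for its pieces. The function names and the 600 KB sample payload are assumptions made for this example.

```python
import hashlib

BLOCK_SIZE = 256 * 1024  # 256 KB, the block size mentioned in the notes


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Cut the payload into fixed-size blocks; the last one may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def piece_hashes(blocks):
    """Hash every block so that a downloader can verify it independently."""
    return [hashlib.sha1(block).hexdigest() for block in blocks]


data = bytes(600 * 1024)  # hypothetical 600 KB file filled with zero bytes
blocks = split_into_blocks(data)
hashes = piece_hashes(blocks)
print(len(blocks), [len(block) for block in blocks])
```

Because every block carries its own hash, a peer can fetch blocks from different sources and check each of them on arrival.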

2.3 Git Version Control

Git stores a snapshot of all files. If a file has not changed, the new snapshot stores a reference to its previously stored version.
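
A minimal sketch of this idea: each file is stored as a blob named by the hash of its content, and a snapshot is only a mapping from path to blob ID. The blob hashing follows Git's format (SHA-1 over a `blob <size>\0` header plus the content). The in-memory `object_store` and the flat `snapshot` mapping are simplifications assumed here; real Git also has tree and commit objects.

```python
import hashlib

object_store = {}  # blob ID -> content; stands in for Git's object database


def store_blob(content: bytes) -> str:
    """Store content under its Git-style blob ID and return that ID."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    blob_id = hashlib.sha1(header + content).hexdigest()
    object_store[blob_id] = content  # storing the same content twice is a no-op
    return blob_id


def snapshot(files: dict) -> dict:
    """A snapshot maps every path to the ID of the blob holding its content."""
    return {path: store_blob(content) for path, content in files.items()}


v1 = snapshot({"README": b"hello\n", "main.py": b"print(1)\n"})
v2 = snapshot({"README": b"hello\n", "main.py": b"print(2)\n"})

print(v1["README"] == v2["README"])  # unchanged file: both snapshots share one blob
print(len(object_store))             # number of distinct blobs actually stored
```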

2.4 Self-Certifying File Systems (SFS)

SFS puts the public key into the file name. It implements a global file system. Any Internet server can be an SFS server.

An SFS file path has three parts: the server address, the HostID (a hash value of the machine name and the public key), and the file path on the server.
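
The three-part path can be sketched as below. The `/sfs/address:HostID/path` layout follows the SFS convention, but the exact hash input, the use of SHA-1 with base32, the sample host name and the key bytes are assumptions of this sketch rather than the real SFS encoding.

```python
import base64
import hashlib


def host_id(location: str, public_key: bytes) -> str:
    """Hash the machine name together with the public key (assumed input format)."""
    digest = hashlib.sha1(location.encode() + public_key).digest()
    return base64.b32encode(digest).decode().lower()


def sfs_path(location: str, public_key: bytes, file_path: str) -> str:
    """Combine server address, HostID and server-side file path."""
    return f"/sfs/{location}:{host_id(location, public_key)}/{file_path.lstrip('/')}"


def verify_server(path: str, presented_key: bytes) -> bool:
    """The path certifies the server: the key it presents must hash to the HostID."""
    location, claimed_id = path.split("/")[2].rsplit(":", 1)
    return host_id(location, presented_key) == claimed_id


server_key = b"hypothetical-public-key-bytes"
path = sfs_path("sfs.example.org", server_key, "/home/alice/notes.txt")
print(path)
print(verify_server(path, server_key))
print(verify_server(path, b"some-other-key"))
```

A client that only knows the path can therefore reject a server that presents a different key, without consulting any certificate authority.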

2.5 Merkle DAG and Merkle Tree

Merkle DAG is the data structure of IPFS objects. Merkle DAG is a generalized Merkle Tree. Merkle DAG has three goals:

  • content addressing: use a multihash to identify a data block
  • data integrity
  • deduplication
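
The three goals can be illustrated with a toy Merkle DAG. This is a sketch under simplifying assumptions: nodes are serialized as JSON and identified by a plain SHA-256 hex digest, whereas IPFS uses IPLD encodings and multihash-based identifiers. The names `put_node`, `verify` and `store` are hypothetical.

```python
import hashlib
import json

store = {}  # node ID -> serialized node; stands in for a block store


def put_node(data: bytes, links=()) -> str:
    """Serialize a node (payload plus child IDs) and store it under its hash."""
    payload = json.dumps({"data": data.hex(), "links": list(links)},
                         sort_keys=True).encode()
    node_id = hashlib.sha256(payload).hexdigest()
    store[node_id] = payload  # identical nodes collapse into one entry
    return node_id


def verify(node_id: str) -> bool:
    """Recompute hashes from this node downward to check integrity."""
    payload = store[node_id]
    if hashlib.sha256(payload).hexdigest() != node_id:
        return False
    return all(verify(child) for child in json.loads(payload)["links"])


chunk_a = put_node(b"chunk A")
chunk_b = put_node(b"chunk B")
file_1 = put_node(b"", [chunk_a, chunk_b])
file_2 = put_node(b"", [chunk_a, chunk_b, chunk_a])  # reuses existing chunks

print(len(store))      # two chunks plus two file nodes
print(verify(file_2))
```

Since a parent's ID covers the IDs of its children, changing any chunk changes every ancestor's ID, which is what makes a root hash sufficient to verify a whole file.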

3 IPFS Protocol Stack

  • Identity: create a NodeId from public key hash
  • Network: multiple transport protocols over an overlay network
  • Routing: distributed hash table
  • Exchange: BitSwap supports data exchange with a built-in incentive system
  • Object: Merkle DAG
  • File
  • Naming
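
For the Identity layer, a NodeId derived from a public key hash might be sketched as follows. The sketch wraps a SHA-256 digest in a multihash (code 0x12, length 0x20) and encodes it in base58, which gives the familiar `Qm...` form. Hashing the raw key bytes is an assumption made here: libp2p hashes a serialized key structure and treats some key types differently. The key bytes are placeholders.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(raw: bytes) -> str:
    """Encode bytes with the Bitcoin base58 alphabet."""
    number = int.from_bytes(raw, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded
    # Leading zero bytes are represented by leading '1' characters.
    leading_zeros = len(raw) - len(raw.lstrip(b"\0"))
    return "1" * leading_zeros + encoded


def node_id(public_key: bytes) -> str:
    """Build a sha2-256 multihash of the key and encode it in base58."""
    digest = hashlib.sha256(public_key).digest()
    multihash = bytes([0x12, 0x20]) + digest  # hash function code, digest length
    return base58_encode(multihash)


peer = node_id(b"hypothetical-public-key-bytes")
print(peer)
```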

4 Modules

5 Filecoin

Filecoin is the first blockchain project that is based on a physical economy: the sharing of storage and bandwidth. Its two markets are the storage market and the retrieval market.

Filecoin will optimize Internet traffic, distributed storage and distributed applications.

Filecoin has a total of 2 billion coins. Miners stake coins and earn mining rewards. The halving period is 6.5 years.

Mining rewards: ico from storage and bandwidth, storage fee, transaction validation fee, retrieval fee.

There are three types of participants: retrieval miners (RM), storage miners (SM) and users. They work together as follows:

  • Storage Market (on-chain)
    • Ask: an SM offers storage space for sale.
    • Bid: a user places an order with FIL.
    • Match: the blockchain matches the ask and the bid. The user sends the data and the SM provides a Proof of Replication.
    • Pay and verify: the order is committed to the ledger. The SM keeps providing Proofs of Spacetime to show that the data is available for retrieval.
  • Retrieval Market (off-chain)
    • RMs and users broadcast bids and asks.
    • An RM sends the data to the user if there is a match. The user pays the RM.
    • The transaction is completed and recorded in the ledger.
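
The ask, bid and match steps of the storage market can be sketched as a tiny order book. This is only an illustration: the `Ask` and `Bid` types, the cheapest-first matching rule, the integer price units and the sample miners are all assumptions, and the proofs (Proof of Replication, Proof of Spacetime) are left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Ask:
    miner: str
    space_gb: int
    price_per_gb: int  # hypothetical integer price units


@dataclass
class Bid:
    user: str
    size_gb: int
    max_price_per_gb: int


def match(bid: Bid, asks):
    """Pick the cheapest ask that has enough space and fits the user's price."""
    candidates = [ask for ask in asks
                  if ask.space_gb >= bid.size_gb
                  and ask.price_per_gb <= bid.max_price_per_gb]
    if not candidates:
        return None
    best = min(candidates, key=lambda ask: ask.price_per_gb)
    best.space_gb -= bid.size_gb  # the sold space is no longer on offer
    return {"miner": best.miner, "user": bid.user,
            "size_gb": bid.size_gb, "fee": best.price_per_gb * bid.size_gb}


asks = [Ask("sm-1", 100, 5), Ask("sm-2", 500, 3), Ask("sm-3", 20, 1)]
ledger = []

deal = match(Bid("alice", 50, 4), asks)
if deal is not None:
    ledger.append(deal)  # committing the order to the ledger
print(ledger)
```

Here the cheapest ask (`sm-3`) is skipped because it lacks space, so the order goes to `sm-2`.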