This is a study note of PostgreSQL 修炼之道.
1 Installation and start service
In Mac computer, for consitency, use
brew to install PostgreSQL and pgAdmin. The installation also runs the
initdb for the postgres database, no need to run it again.
After installation, there are one database
postgres and two template databases
template. Template database can be customized to custome cloned databases.
PSQL can use the following environment variables:
psql -E will show actual SQL statements for psql commands.
\?: list commands
\h [name]: help syntax of a command
\q: quit psql
\l: list all databases.
\c name: connect to a dabased.
\d: list of relations.
\d[S+] name: describe table, view, sequnce, or index.
Sshow system objects.
+show additional details
\timing on: display command execution time.
\encoding: display or set coding set.
\x: set display borders and expanded display.
\i filename: execute external file
\set ECHO_HIDDEN on | off: switch on/off showing SQL statements for a command.
Psql automatically commit. To change it, use
\set AUTOCOMMIT off or
begin; commands; rollback;/commit;.
3 Data Types
3.1 Literal Tranform
Simple types don’t need tranform, from example,
'heoll world'. For other types,
- Use type name and single-quoted string. Ex:
CAST('5' as int),
CAST('2014-07-01' as date).
3.2 Boolean Type
- True values:
- False values:
NOT. Any value operates with
IS to compare.
3.3 Numeric Types
decimal (variable length, accuate, good for money),
serial (4 bytes),
bigserial (8 bytes).
Floating numbers include three special values:
serial is auto-incremental type with 4-byte values.
bigserial has 8 bytes.
money type is a 8 bytes type whose display depends on
|/: square root
||/: cube root
@: absolute value
~: and, or, xor, not for binary
>>: binary left and right shift
varchar(n): variable length, maximum 1GB.
char(n): fixed length filled with spaces, the first is 1 char.
text: variable length without length limit.
char doesn’t perform better in pg.
bit_length: binary length
length: number of chars
convert: convert to an encoding
lower\upper: convert ot lowercase/uppercase
overlay: replace substring
positon: find the index of a substring
substring: extract substring
trim([leading|trailing|both]): trim a string
4 Logic Structure
One pg system can have multiple db. A connection connects to a single db.
When create a db, specify the following args:
OWNER: db onwer, default is the current login user.
TEMPLATE: template, default is
ENCODING: if it is different from
template1, then use
template0has no encoding. The common one is
UTF8, as defined in
TABLESPACE: table space name is used to define locations in the file system.
CONNECTION LIMIT: the max number of concurrent connections. The default is
A database can have multiple schema. A schema works as a namespace to organize db objects such as tables or functions. A db connection can access all schema defined in the db. There is a default schema
public when a database is created. It is a good idea to not use the
default schema and explicitly specify the db object using
schem_name.object_name. The SQL standard uses the username as schema name. In application, we can use a subsystem name as schema name and create a user with the same name as the owner.
If schema is not specified, pg use schema names defined in
search_path to search a db object. The default value is
"$user", public, it include the current username and
Schema privileges include
PUBLIC represents all users.
TOAST stands for
The Oversized-Attribute Storage Technique. It is used to store large variable length value. A value can be compressed and stored in external TOAST table.
Temporary tables can be sessional or transactional.
Table constrains include
A table can inherit another table’s definition to extend it. Query parent will show child’s data. Query child doesn’t show parent’s data. A table can inherit multiple tables.
Use inheritance and
RULE to partition a table: a logic table (parent)can be partitioned into several physical tables (children).
4.4 Trigger and Event Trigger
Trigger can be
FOR STATEMENT or
FOR ROW, and can be
AFTER. Statement triggers alwasy return
NULL. Row triggers will use the return value as new value in
BEFORE trigger, no operation for
NULL. The return value of
AFTER row tigger is ignored. Tiggers are executed by the order of its name.
pg defines the following variables:
NEW: the new value of type
OLD: the old value of type
TG_XXX: the trigger name of type
XXXis other trigger data such as
Event trigger is used to support DDL events such as
The system view
pg_event_trigger has all defined event triggers.
pg support the following indexes:
- B-tree: best for equal or range search
- HASH: best for simple equal search
- GiST: suport index strategy
- SP-GiST: space based GiST
- GIN: best for multi-value search, ex: array.
4.6 User and Role
pg uses the role concept to control db access. A role is a set of privileges. A user is a role with an additional
LOGIN privilege in pg. Roles are visible glabally in multiple databases of the same database system. There is a super user for each db system.
Some privileges are only created during user creation or
- db creation
privilege controlled by
- SCHEMA creation
- Temp Table
- db connect
- creation of db objects such as table, vew and function etc
- select/update/insert/delete on tables or columns
- operation on sequence
GRANT is used to grant a user some roles or grant privileges to a role.
pg uses MVCC that alllows parallel write and read. DDL operation is in a transaction. In psql, the
AUTOCOMMIT is on. pg supports
SAVEPOINT to enable partial transaction.
pg has two real transaction isolation levels: read commited and serialization. In read commited level, two selects in one transaction may read different data.
pg has two types of locks: table lock and row low.
5 pg Architecture
There are two types of pg drivers: libpq-based C interface and pure language drivers such as JDBC that doesn’t use
libpq. Inter-computer connections use TCP/IP while intra-computer conntections use domain socket.
When pg starts, it runs
Postmager (the same
postgres command ) process that forks session service processes
Posgtres processes and several hlep processes such as log, journal, archive and vacuum processes.
Each connect is served by a process.
pg_ctl is a utility to initialize database system, start/stop/re-start postgresql, check status, dynamic config.
postgresql.conf in the
$PGDATA folder. All configurations are in
pg_settings view. Detail connection configurations are in
7 Advanced Topics
Every table has system columns:
oid: row object id, it is generated globally, for system tables
tableoid: table oid (parent or child table)
cmax: control visibility in MVCC
ctid: row location in a table, maybe change after vacuum
pg uses one or more files to store data. Data is stoed in pages that has a default size of 8KB, max is 32KB. One page can store multiple tuples (rows).
7.2 pg Features
- Rule system: used to rewrite SQL statements
- Pattern matching:
- Listen and notfiy: communcation channel.
- Index features: function index, partial index, GiST, and more.
- Advisory lock
- SQL/MED: connect different dbs. There are four db objects: foreign data wrapper (FDW), server, user mapping, foreign table.