This is a study note of PostgreSQL 修炼之道.
1 Installation and start service
In Mac computer, for consitency, use brew
to install PostgreSQL and pgAdmin. The installation also runs the initdb
for the postgres database, no need to run it again.
|
|
After installation, there are one database postgres
and two template databases template0
and template
. Template database can be customized to custome cloned databases.
2 PSQL
PSQL can use the following environment variables: PGDATABASE
, PGHOST
, PGPORT
, and PGUSER
.
psql -E
will show actual SQL statements for psql commands.
Common commands:
\?
: list commands\h [name]
: help syntax of a command\q
: quit psql\l
: list all databases.\c name
: connect to a dabased.\d
: list of relations.\d[S+] name
: describe table, view, sequnce, or index.S
show system objects.+
show additional details\timing on
: display command execution time.\encoding
: display or set coding set.pset
,\x
: set display borders and expanded display.\i filename
: execute external file\set ECHO_HIDDEN on | off
: switch on/off showing SQL statements for a command.
Psql automatically commit. To change it, use \set AUTOCOMMIT off
or begin; commands; rollback;/commit;
.
3 Data Types
3.1 Literal Tranform
Simple types don’t need tranform, from example, 1
, 3.14
, 'heoll world'
. For other types,
- Use type name and single-quoted string. Ex:
bit '1101'
,int '1'
. - Use
CAST
. Ex:CAST('5' as int)
,CAST('2014-07-01' as date)
. - Use
::
. Ex:'5'::int
,'2014-07-01'::data
.
3.2 Boolean Type
Values:
- True values:
TRUE
,'true'
,'t'
,'y'
,'tRue'
,'yes'
,'1'
- False values:
FALSE
,'false'
,'f'
,'n'
,'falSE'
,'no'
,'0'
Operatos: AND
, OR
, NOT
. Any value operates with NULL
is NULL
. Use IS
to compare. NULL
in IS
is UNKNOWN
.
3.3 Numeric Types
smallint
, int
/integer
, bigint
, numeric
/decimal
(variable length, accuate, good for money), real
, double precision
, serial
(4 bytes), bigserial
(8 bytes).
Floating numbers include three special values: 'Infinity'
, '-Infinity'
and 'NaN'
.
serial
is auto-incremental type with 4-byte values. bigserial
has 8 bytes.
money
type is a 8 bytes type whose display depends on lc_monetary
.
Opeartors:
^
: power|/
: square root||/
: cube root|
: factorial@
: absolute value&
,|
,#
,~
: and, or, xor, not for binary<<
,>>
: binary left and right shift
3.4 Chars
character varying(n)
/varchar(n)
: variable length, maximum 1GB.character
/character(n)
/char(n)
: fixed length filled with spaces, the first is 1 char.text
: variable length without length limit.
Use text
or varchar
because char
doesn’t perform better in pg.
Functions:
||
: concatenationbit_length
: binary lengthchar_length
/length
: number of charsconvert
: convert to an encodinglower\upper
: convert ot lowercase/uppercaseoverlay
: replace substringpositon
: find the index of a substringsubstring
: extract substringtrim([leading|trailing|both])
: trim a string
4 Logic Structure
One pg system can have multiple db. A connection connects to a single db.
4.1 Database
When create a db, specify the following args:
OWNER
: db onwer, default is the current login user.TEMPLATE
: template, default istemplate1
.ENCODING
: if it is different fromtemplate1
, then usetemplate0
becaausetemplate0
has no encoding. The common one isUTF8
, as defined intemplate1
.TABLESPACE
: table space name is used to define locations in the file system.CONNECTION LIMIT
: the max number of concurrent connections. The default is-1
4.2 Schema
A database can have multiple schema. A schema works as a namespace to organize db objects such as tables or functions. A db connection can access all schema defined in the db. There is a default schema public
when a database is created. It is a good idea to not use the default
schema and explicitly specify the db object using schem_name.object_name
. The SQL standard uses the username as schema name. In application, we can use a subsystem name as schema name and create a user with the same name as the owner.
If schema is not specified, pg use schema names defined in search_path
to search a db object. The default value is "$user", public
, it include the current username and public
.
Schema privileges include USAGE
, CREATE
. PUBLIC
represents all users.
4.3 Table
TOAST
stands for The Oversized-Attribute Storage Technique
. It is used to store large variable length value. A value can be compressed and stored in external TOAST table.
Temporary tables can be sessional or transactional.
Table constrains include CHECK
, not null
, UNIQUE
, and REFERENCES
.
A table can inherit another table’s definition to extend it. Query parent will show child’s data. Query child doesn’t show parent’s data. A table can inherit multiple tables.
Use inheritance and Trigger
/RULE
to partition a table: a logic table (parent)can be partitioned into several physical tables (children).
4.4 Trigger and Event Trigger
Trigger can be FOR STATEMENT
or FOR ROW
, and can be BEFORE
or AFTER
. Statement triggers alwasy return NULL
. Row triggers will use the return value as new value in BEFORE
trigger, no operation for NULL
. The return value of AFTER
row tigger is ignored. Tiggers are executed by the order of its name.
pg defines the following variables:
NEW
: the new value of typeRECORD
forINSERT
/UPDATE
.OLD
: the old value of typeRECORD
forUPDATE
/DELETE
.TG_NAME
/TG_XXX
: the trigger name of typenam
,XXX
is other trigger data such asWHEN
,OP
,TABLE_NAME
etc.
Event trigger is used to support DDL events such as ddl_command_start
, ddl_command_end
, and sql_drop
.
The system view pg_event_trigger
has all defined event triggers.
4.5 Indexes
pg support the following indexes:
- B-tree: best for equal or range search
- HASH: best for simple equal search
- GiST: suport index strategy
- SP-GiST: space based GiST
- GIN: best for multi-value search, ex: array.
4.6 User and Role
pg uses the role concept to control db access. A role is a set of privileges. A user is a role with an additional LOGIN
privilege in pg. Roles are visible glabally in multiple databases of the same database system. There is a super user for each db system.
Some privileges are only created during user creation or ALTER ROLE
:
- superuser
- db creation
- login
privilege controlled by GRANT
and REVOKE
:
- SCHEMA creation
- Temp Table
- db connect
- creation of db objects such as table, vew and function etc
- select/update/insert/delete on tables or columns
- operation on sequence
GRANT
is used to grant a user some roles or grant privileges to a role.
4.7 Transaction
pg uses MVCC that alllows parallel write and read. DDL operation is in a transaction. In psql, the AUTOCOMMIT
is on. pg supports SAVEPOINT
to enable partial transaction.
pg has two real transaction isolation levels: read commited and serialization. In read commited level, two selects in one transaction may read different data.
pg has two types of locks: table lock and row low.
5 pg Architecture
There are two types of pg drivers: libpq-based C interface and pure language drivers such as JDBC that doesn’t use libpq
. Inter-computer connections use TCP/IP while intra-computer conntections use domain socket.
When pg starts, it runs Postmager
(the same postgres
command ) process that forks session service processes Posgtres
processes and several hlep processes such as log, journal, archive and vacuum processes.
Each connect is served by a process.
6 Management
pg_ctl
is a utility to initialize database system, start/stop/re-start postgresql, check status, dynamic config.
pg use postgresql.conf
in the $PGDATA
folder. All configurations are in pg_settings
view. Detail connection configurations are in pg_hba.conf
file.
7 Advanced Topics
7.1 Internals
Every table has system columns:
oid
: row object id, it is generated globally, for system tablestableoid
: table oid (parent or child table)xmin
,xmax
,cmin
,cmax
: control visibility in MVCCctid
: row location in a table, maybe change after vacuum
pg uses one or more files to store data. Data is stoed in pages that has a default size of 8KB, max is 32KB. One page can store multiple tuples (rows).
7.2 pg Features
- Rule system: used to rewrite SQL statements
- Pattern matching:
LIKE
,SIMILAR TO
,POSIX
.substring
. - Listen and notfiy: communcation channel.
- Index features: function index, partial index, GiST, and more.
- Sequence
- Advisory lock
- SQL/MED: connect different dbs. There are four db objects: foreign data wrapper (FDW), server, user mapping, foreign table.