Dec 28, 2016

PostgreSQL Architecture

Based on PostgreSQL documentation and various articles, I comprehend the PostgreSQL Architecture as below:

PostgreSQL Architecture


PostgreSQL instance consists of set of Process and Memory. PostgreSQL uses a simple “process per-user” client/server model. The major processes are:
  1. The ‘postmaster’ which is:
    • supervisory daemon process,
    • ‘postmaster’ is attached to shmmem segment but refrains from accessing to it.
    • always running waiting for connection requests
  2. Utility processes ( bgwriter, walwriter, syslogger, archiver, statscollector and autovacuum launcher ) and
  3. User Backend process ( postgres process itself, Server Process )
When a client request for connection to the database, firstly request is hit to Postmaster daemon process. After performing authentication and authorization it forks one new backend server process (postgres). Henceforth, the frontend process and the backend server communicate directly without intervention by the postmaster. The postmaster is always running, waiting for connection requests, whereas frontend and backend processes come and go. The libpq library allows a single frontend to make multiple connections to backend processes.
However, each backend process is a single-threaded process that can only execute one query at a time; so the communication over any one frontend-to-backend connection is single-threaded.
Postmaster and postgres servers run with the user ID of the PostgreSQL “superuser”.
One postgres process exists for every open database session. Once authenticated with user connection, it directly connects (with who and for what purpose) with shared memory.

POSTGRESQL SHARED MEMORY

Shared Buffers:

Sets the amount of memory the database server uses for shared memory buffers. The default is typically 32MB. Larger settings for shared_buffers usually require a corresponding increase in checkpoint_segments, in order to spread out the process of writing large quantities of new or changed data over a longer period of time. Below 3 parameters should be discussed:
  • bgwriter_delay
  • bgwriter_lru_maxpages
  • bgwriter_lru_multiplier

WAL Buffers:

The amount of shared memory used for WAL data that has not yet been written to disk. The default setting of -1 selects a size equal to 1/32nd (about 3%) of shared_buffers, but not less than 64kB nor more than the size of one WAL segment, typically 16MB. This value can be set manually if the automatic choice is too large or too small, but any positive value less than 32kB will be treated as 32kB. This parameter can only be set at server start. The contents of the WAL buffers are written out to disk at every transaction commit, so extremely large values are unlikely to provide a significant benefit. However, setting this value to at least a few megabytes can improve write performance on a busy server where many clients are committing at once. The auto-tuning selected by the default setting of -1 should give reasonable results in most cases.

CLOG Buffers:

$PGDATA/pg_clog contains a log of transaction metadata. This log tells PostgreSQL which transactions completed and which did not. The clog is small and never has any reason to become bloated, so you should never have any reason to touch it.

POSTGRESQL PER BACKEND MEMORY

work_mem:

Specifies the amount of memory to be used by internal sort operations and hash tables before writing to temporary disk files. Default is 1M. Note that for a complex query, several sort or hash operations might be running in parallel; each operation will be allowed to use as much memory as this value specifies before it starts to write data into temporary files. Also, several running sessions could be doing such operations concurrently. Therefore, the total memory used could be many times the value of work_mem; it is necessary to keep this fact in mind when choosing the value.

temp_buffers:

Sets the maximum number of temporary buffers used by each database session. Default is 8M. The setting can be changed within individual sessions, but only before the first use of temporary tables within the session; subsequent attempts to change the value will have no effect on that session.

maintenance_work_mem:

Specifies the maximum amount of memory to be used by maintenance operations, such as VACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN KEY. Default is 16M. Since only one of these operations can be executed at a time by a database session, and an installation normally doesn’t have many of them running concurrently, it’s safe to set this value significantly larger than work_mem.

UTILITY PROCESSES:

Mandatory processes: We cannot Enable/Disable these processes.
  • BGWriter
  • WAL Writer
Optional Processes:
  • Stats-collector
  • Autovacuum launcher
  • Archiver
  • Syslogger
  • WAL Sender
  • WAL Receiver

BGWriter

The function of background writer is to issue writes of “dirty” (new or modified) shared buffers. It writes shared buffers so that server processes handling user queries seldom or never need to wait for a write to occur. However, the background writer does cause a net overall increase in I/O load, because while a repeatedly-dirtied page might otherwise be written only once per checkpoint interval, the background writer might write it several times as it is dirtied in the same interval. Below parameters can be used to tune the behavior for local needs.
bgwriter_delay: Specifies the delay between activity rounds for the background writer. Default is 200ms.
bgwriter_lru_maxpages: In each round, no more than this many buffers will be written by the background writer. Setting this to zero disables background writing (except for checkpoint activity). The default value is 100 buffers.
bgwriter_lru_multiplier: The number of dirty buffers written in each round is based on the number of new buffers that have been needed by server processes during recent rounds. The default is 2.0.
Smaller values of bgwriter_lru_maxpages and bgwriter_lru_multiplier reduce the extra I/O load caused by the background writer, but make it more likely that server processes will have to issue writes for themselves, delaying interactive queries.

WAL Writer

WAL writer process writes and fsync WAL at convenient Intervals. In order to guarantee transaction security; WAL buffers holds the changes made to the database in the transaction logs. WAL buffers are written out to the disk at every transaction commit, as WAL writer process is responsible to write on to the disk. WAL_WRITER_DELAY parameter for invoking the WAL Writer Process, however there are other parameters which also keeps the WAL Writer busy.

Stats Collector

Stats collector process will collect the information about the server activity. It count number of access to the tables and indexes in both disk-block and individual row items. It also tracks the total number of rows in each table, and information about VACUUM and ANALYZE actions for each table. Collection of statistics adds some overhead to query execution, whether to collect or not collect information. Some of the parameter in the postgresql.conf file will control the collection activity of the stats collector process. Click here to know more about the stats collector process and its related parameters.

Autovacuum Launcher

For automating the execution of VACUUM and ANALYZE command, Autovacuum Launcher is a daemon process consists of multiple processes called autovacuum workers. Autovacuum launcher is a charge of starting autovacuum worker processes for all databases. Launcher will distribute the work across time, attempting to start one worker on each database for every interval, set by the parameter autovacuum_naptime. One worker will be launched for each database, set by the parameter autovacuum_max_workers. Each worker process will check each table within its database and execute VACUUM or ANALYZE as needed. Following will breif about the AUTOVACUUM LAUNCHER PROCESS parameters.

Syslogger Process / Logger Process:

Syslogger

As per the figure, it is clearly understood that all – the utility processes + user backends + Postmaster Daemon are attached to syslogger process for logging the information about their activities. Every process information is logged under $PGDATA/pg_log with the file .log.
Debugging more on the process information will cause overhead on the server. Minimal tuning is always recommended. However, increasing the debug level when required. Click Here for further on logging parameters.

Archiver

Achiver process is optional process, default is OFF.
Setting up the database in archive mode means to capture the WAL data of each segment file once it is filled and save that data somewhere before the segment file is recycled for reuse.
  1. On Database Archivelog mode, once the WAL data is filled in the WAL Segment, that filled segment named file is created under $PGDATA/pg_xlog/archive_status by the WAL Writer naming the file as “.ready”. File naming will be “segment-filename.ready”.
  2. Archiver Process triggers on finding the files which are in “.ready” state created by the WAL Writer process. Archiver process picks the ‘segment-file_number’ of .ready file and copies the file from $PGDATA/pg_xlog location to its concerned Archive destination given in ‘archive_command’ parameter(postgresql.conf).
  3. On successful completion of copy from source to destination, archiver process renames the “segment-filename.ready” to “segment-filename.done”. This completes the archiving process.
It is understood that, if any files named “segement-filename.ready” found in $PGDATA/pg_xlog/archive_status. They are the pending files still to be copied to Archive destination.
Click Here for more information on parameters and Archiving,
Comments/suggestion would be greatly appreciated.