Installing the PostgreSQL software

Installing the PostgreSQL packages

The dev-db/postgresql-server package, like all other packages in a Gentoo Linux environment, will automatically install any required dependencies. One of these dependencies is a correctly installed and configured logging daemon. If you have not already installed one, now is a good time to follow the System logging with syslog-ng guide before proceeding with the PostgreSQL installation.

If you intend to implement the Logging to a database section of the System logging with syslog-ng guide, you will need to return to this guide and install PostgreSQL before continuing with that section. You should also make sure that you read and understand the sections on log priorities for both the app-admin/syslog-ng and dev-db/postgresql-server packages so that log-message loops can be avoided.

Before we install any packages we should ensure that the correct use-flags are set, so that all required functionality is made available and unnecessary functionality is not included. The dev-db/postgresql-server package and its dependencies provide a variety of use-flags, only some of which will be discussed further here. As usual, feel free to add and remove use-flags at will; the minimum set required for using this guide in its entirety is shown below.

lisa ~ # emerge -pv postgresql-server:9.0

These are the packages that would be merged, in order:

Calculating dependencies... done!

[ebuild   N    ] app-admin/eselect-postgresql-0.3
[ebuild   N    ] dev-db/postgresql-base-9.0.3 USE="ldap nls pam readline ssl zlib -doc -kerberos -pg_legacytimestamp -threads"
[ebuild   N    ] dev-db/postgresql-server-9.0.3 USE="nls perl python -doc -pg_legacytimestamp -tcl -uuid -xml"
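
If the flags reported on your system differ from those above, they can be set per package rather than globally. The following is only a minimal sketch: it assumes the conventional /etc/portage/package.use location, the helper name pg_use_flags is purely illustrative, and the flag lists simply mirror the output above.

```shell
# Print package.use entries matching the USE flags shown in the emerge output.
# pg_use_flags is a hypothetical helper name, not part of Portage.
pg_use_flags() {
    cat <<'EOF'
dev-db/postgresql-base ldap nls pam readline ssl zlib -doc -kerberos -pg_legacytimestamp -threads
dev-db/postgresql-server nls perl python -doc -pg_legacytimestamp -tcl -uuid -xml
EOF
}

# Usage (as root), assuming package.use is a single file on this system:
#   pg_use_flags >> /etc/portage/package.use
```

Running emerge -pv again afterwards should show whether the flags now match.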

Once you are confident that the correct use-flags are set for the dev-db/postgresql-server package, and any dependencies it may require, you can proceed with the installation by issuing the emerge command shown below.

lisa ~ # emerge postgresql-server:9.0

As you can see from the above examples, we have specified that we are only interested in versions of the dev-db/postgresql-server package which fill a specific slot, in this case the slot for version 9.0. This ensures that our active version of PostgreSQL will never be removed when an emerge --depclean action is performed. It also ensures that when a future major version of the package is released it will not be installed automatically; we stay on 9.0 until we decide to upgrade. Minor versions, such as 9.0.3, will still be installed automatically as they are compatible with our currently installed version.

Creating a basic PostgreSQL configuration

The PostgreSQL ebuild creates a default daemon configuration file at /etc/conf.d/postgresql-9.0. It defines several variables that are worth reviewing before we issue the command to run the configuration part of the installation process. These variables are briefly described below, along with some more sensible defaults where appropriate.

PGDATA: This variable contains the path to the configuration data used by this PostgreSQL cluster. By default this configuration data will be located at /etc/postgresql-9.0/ which is probably acceptable.
DATA_DIR: This variable contains the path to the database storage location. By default the databases will be located under /var/lib/postgresql/ which, on anything other than a hobbyist system at least, is less than ideal. We recommend creating a separate partition or logical volume to contain the databases, both to protect them from other files filling the file system, which can result in database corruption, and to ensure that the disk space they occupy will not be fragmented by existing files. This approach also makes backups easier, especially when using LVM or EVMS, and can help improve overall system security and stability. Unless this partition or volume will be mounted at /var/lib/postgresql this value will need to be changed.
PG_INITDB_OPTS: This variable passes options to the initial database creation script. The most common option here is to set the value "-E UTF8" so that initial databases, and those subsequently created with no encoding specified, will use the UTF-8 encoding for storing character data. Another common use is to modify the default superuser name, for example using "-U pgadmin" to change the superuser name to pgadmin instead of the default postgres.
PGUSER, PGGROUP: These variables contain the user and group setting which the PostgreSQL daemon process will be started with. The defaults are both postgres which makes sense for most installations and should not need to be changed.
PGOPTS: The PGOPTS variable, as its name suggests, is used to pass additional startup options to the PostgreSQL daemon process. The default value is an empty string indicating that no additional options are specified. Additional configuration options can also be specified in ${PGDATA}/postgresql.conf.
WAIT_FOR_START, WAIT_FOR_STOP: These variables specify whether the pg_ctl application should wait for the database server to start or stop before returning. The default value for both is -w, which tells pg_ctl to wait for the daemon to start or stop. Changing the appropriate value to -W will reverse this behaviour.
START_TIMEOUT: This variable indicates how long the pg_ctl application should wait for the server process to start before assuming that it failed and returning an error. The default value specifies a wait period of up to sixty seconds, which should be more than sufficient.
NICE_QUIT, NICE_TIMEOUT: The NICE_QUIT variable specifies whether the server process should wait for any clients with open connections to close them after the server has received a SIGTERM signal. The NICE_TIMEOUT variable sets how long this wait may last; the default of thirty seconds is probably a reasonable amount of time unless you have any particularly slow clients.
RUDE_QUIT, RUDE_TIMEOUT: The RUDE_QUIT variable specifies whether the server should forcibly disconnect clients, rolling back any open transactions in the process, after the NICE_TIMEOUT period has passed. If this value is set to NO the server process will continue to run until all clients have voluntarily disconnected. The RUDE_TIMEOUT variable sets how long to allow for the forced disconnection; the default of thirty seconds is probably a reasonable amount of time.
FORCE_QUIT, FORCE_TIMEOUT: The FORCE_QUIT variable specifies whether the server should be forcibly terminated after the NICE_TIMEOUT and RUDE_TIMEOUT periods have expired. The default value is NO, indicating that the server should be allowed to shut down normally. If the server process is killed as a result of specifying YES here, a recovery run will be executed when the server is next started.

A complete example configuration for the PostgreSQL daemon is provided below for your convenience.

/etc/conf.d/postgresql-9.0

PGDATA="/etc/postgresql-9.0"
DATA_DIR="/mnt/databases/postgres/9.0"
PG_INITDB_OPTS="-E UTF8 -U pgadmin"
PGUSER="postgres"
PGGROUP="postgres"
WAIT_FOR_START="-w"
WAIT_FOR_STOP="-w"
START_TIMEOUT=60
NICE_QUIT="YES"
NICE_TIMEOUT=60
RUDE_QUIT="YES"
RUDE_TIMEOUT=30
FORCE_QUIT="NO"
FORCE_TIMEOUT=2
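
Before moving on, it can be worth confirming that the file still defines the variables the later steps depend on. The following is a minimal sketch rather than anything shipped with the ebuild: check_pg_confd is a hypothetical helper name, and the list of variables it checks is simply taken from the example above.

```shell
# Report any of the listed variables that a conf.d-style file leaves unset.
# check_pg_confd is a hypothetical helper, not part of the PostgreSQL ebuild.
# Exit status: 0 if all variables are set, 1 if any is missing.
check_pg_confd() {
    conf_file="$1"
    (
        # Work in a subshell so the caller's environment is left untouched,
        # and clear inherited values so only the file's settings count.
        unset PGDATA DATA_DIR PGUSER PGGROUP
        . "$conf_file" || exit 2
        missing=0
        for name in PGDATA DATA_DIR PGUSER PGGROUP; do
            eval "value=\${$name:-}"
            if [ -z "$value" ]; then
                echo "unset: $name"
                missing=1
            fi
        done
        exit "$missing"
    )
}

# Usage:
#   check_pg_confd /etc/conf.d/postgresql-9.0 && echo "configuration looks complete"
```

Because the file is sourced as shell, this check should only be run against configuration files you trust.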

Once you are happy with the configuration you can install a default configuration and set of internal databases for the PostgreSQL server by issuing the following command.

lisa ~ # emerge postgresql-server --config
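
One way to check that this step produced a usable cluster is to look for the PG_VERSION file which initdb writes into the data directory. The sketch below assumes that the cluster was created under the DATA_DIR configured earlier; cluster_initialised is a hypothetical helper name.

```shell
# Return success if the directory looks like an initialised cluster of the
# expected major version. initdb records the major version in PG_VERSION.
# cluster_initialised is a hypothetical helper name.
cluster_initialised() {
    data_dir="$1"
    expected="${2:-9.0}"
    [ -f "$data_dir/PG_VERSION" ] &&
        [ "$(cat "$data_dir/PG_VERSION")" = "$expected" ]
}

# Usage, with the DATA_DIR from the example configuration:
#   cluster_initialised /mnt/databases/postgres/9.0 && echo "cluster ready"
```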

Configuring basic security settings

While the ebuild creates a default configuration for you, including the postgres user and group, it does not create a user account for administrative purposes. We want to perform local administrative duties using an account other than root or the postgres account under which the PostgreSQL daemon process runs, so we need to create such an account as shown below.

lisa ~ # useradd -d /home/pgadmin -N -g users -G postgres -m pgadmin
lisa ~ # passwd pgadmin

New UNIX password: *******
Retype new UNIX password: *******
passwd: password updated successfully

When the ebuild created the default system databases several other configuration files were created which are specific to the installation just performed. Among these was pg_hba.conf which contains configuration information for Postgres Host Based Authentication. By default all local users are trusted completely, which is probably not what we want. The example below shows the modifications required to restrict local access to the pgadmin user, who will be trusted completely, and restrict remote IPv4 access to users who authenticate using an MD5 hash of the appropriate Postgres password.

/etc/postgresql-9.0/pg_hba.conf

# TYPE  DATABASE    USER        CIDR-ADDRESS          METHOD

# "local" is for Unix domain socket connections only
#local   all         all                               trust
local   all         pgadmin                           trust
local   all         all                               md5

# IPv4 local connections:
#host    all         all         127.0.0.1/32          trust
host    all         all         127.0.0.1/32          md5

# IPv6 local connections:
#host    all         all         ::1/128               trust
host    all         all         ::1/128               md5

# IPv4 remote connections:
host    all         all         10.0.0.0/8            md5
host    all         all         192.168.0.0/16        md5
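
PostgreSQL reads pg_hba.conf from top to bottom and applies the first rule that matches a connection, so a broad trust rule placed ahead of the pgadmin rule would shadow everything after it. The sketch below mimics that lookup for Unix socket connections only. It is a deliberate simplification: hba_local_method is a hypothetical helper name, the DATABASE column is ignored, and the USER column must be either all or an exact name.

```shell
# Print the METHOD of the first "local" rule that matches a user, mimicking
# the top-down, first-match lookup PostgreSQL performs on pg_hba.conf.
# Simplified: only "local" lines are read, the DATABASE column is ignored,
# and USER must be "all" or an exact name (no groups or lists).
# hba_local_method is a hypothetical helper name.
hba_local_method() {
    awk -v user="$2" '
        $1 == "local" && ($3 == "all" || $3 == user) { print $4; exit }
    ' "$1"
}

# Usage:
#   hba_local_method /etc/postgresql-9.0/pg_hba.conf pgadmin
#   hba_local_method /etc/postgresql-9.0/pg_hba.conf someotheruser
```

With the intended rules in place, the first call should report trust and the second md5.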

Configuring basic network settings

Now that some basic security options have been set, all that remains for our basic configuration is to modify the listen_addresses setting in postgresql.conf so that remote clients can connect to the server over IP. Without this modification the server will only bind to the local loopback address, and therefore only local clients would be able to connect.

/etc/postgresql-9.0/postgresql.conf

# - Connection Settings -

#listen_addresses = 'localhost'
listen_addresses = '*'
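
When a parameter appears more than once in postgresql.conf, the last uncommented entry is the one the server uses. The following sketch prints that effective value so an edit can be checked before the daemon is started. It makes simplifying assumptions: conf_setting is a hypothetical helper name, every setting is written as name = value on a single line, and values contain no # character.

```shell
# Print the effective value of a parameter in a postgresql.conf-style file.
# When a parameter appears more than once, the last uncommented entry wins.
# Simplified: expects "name = value" on one line and no '#' inside the value.
# conf_setting is a hypothetical helper name.
conf_setting() {
    awk -v name="$2" '
        /^[ \t]*#/ { next }                 # skip commented-out lines
        {
            line = $0
            sub(/#.*/, "", line)            # drop any trailing comment
            eq = index(line, "=")
            if (eq == 0) next
            key = substr(line, 1, eq - 1)
            val = substr(line, eq + 1)
            gsub(/[ \t]/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            if (key == name) last = val
        }
        END { if (last != "") print last }
    ' "$1"
}

# Usage:
#   conf_setting /etc/postgresql-9.0/postgresql.conf listen_addresses
```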

Configuring syslog settings

The default configuration created during the PostgreSQL installation process instructs the postgres daemon to send any log output to stderr. This may be acceptable in a development environment but in a production environment we almost certainly want the logs sent to a file or other persistent storage. The following example shows the modifications required to instruct the postgres daemon to send any log output to the syslog using facility code LOCAL0 and postgres as the source identity.

/etc/postgresql-9.0/postgresql.conf

# - Where to Log -

#log_destination = 'stderr'
log_destination = 'syslog'
syslog_facility = 'LOCAL0'
syslog_ident    = 'postgres'

# - When to Log -

#log_min_messages = warning
log_min_messages = notice

#log_error_verbosity = default
log_error_verbosity = verbose

As we shall be experimenting with various configurations, we have also lowered the minimum severity that is logged from warning to notice, so that more messages are recorded, and changed the log verbosity to provide more information to assist in debugging any problems we may encounter.

If you intend to implement the Logging to a database section of the System logging with syslog-ng guide then you should make sure that you read and understand the sections on log priorities for both the syslog-ng and postgresql-server packages so that log-message loops can be avoided.

Starting the PostgreSQL daemon

Now that the default configuration file along with a set of internal databases has been created and the basic security, networking and logging options have been configured you can start the PostgreSQL daemon and add it to the default run-level by issuing the following commands.

lisa ~ # /etc/init.d/postgresql-9.0 start
lisa ~ # rc-update add postgresql-9.0 default
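
Once the daemon is running, a quick query over the local Unix socket confirms that the server accepts connections. The sketch below assumes the pgadmin superuser and the pg_hba.conf rules from the earlier examples; pg_smoke_test and the PSQL override variable are hypothetical names introduced here for illustration.

```shell
# Ask the running server for its version over the local Unix socket.
# pg_smoke_test is a hypothetical helper name; PSQL may be set to point at a
# specific psql binary, otherwise the one found in PATH is used.
pg_smoke_test() {
    "${PSQL:-psql}" -U "${1:-pgadmin}" -d postgres -At -c 'SELECT version();'
}

# Usage, run as the pgadmin user:
#   pg_smoke_test pgadmin
```

If the connection is refused, the messages sent to syslog are usually the best place to start looking.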