Question

Operations Automation in its day-to-day work relies on the integrated database. What actions could be done by system administrators to keep database in good shape? How often DB should be maintained

Answer

Despite that OA database uses modern PostgreSQL engine, it still requires some maintenance work to be performed in order to keep performance at the top level. Otherwise, poorly maintained database could lead to bad performance of queries which will proportionally slow down all interactions with OA interface and even may interfere with services provisioning.

There are two types of maintenance operations:

  1. Routine autovacuum - it is performed automatically by DB engine in order to reclaim bloated space in database files. Autovacuum is configured in PostgreSQL settings by odin-pg-tune utility either automatically (since OA 7.1) during OA upgrade/hotfix installation or by launching odin-pg-tune manually.

  2. Quarterly checks and maintenance - these are executed by DB administrator each 3-6 months in order to ensure that current auto-vacuum settings are sufficient for keeping database in good shape. Also, these checks will provide useful information in case of any issues with OA platform performance.

Universal solution for the database maintenance is Vacuuming. It is standard procedure during which obsolete records (tuples) are being removed and database space is reclaimed. During routine vacuum only data inside of existing table files is affected and services operation is not interrupted. In contrary, full vacuum rebuilds table files anew, so indexes are recreated too. That is the most effective way to fix performance issues on highly bloated database, but it requires stop of OA services.

Since OA 7.1 with PostgreSQL 9.6 autovacuum routine is enabled. Autovacuum performs periodic cleanup of DB tables basing on bloat level and resources required for vacuuming. While autovacuum tries to be 'invisible' in terms of DB performance and does not eliminate all obsolete tuples on sight, it allows to maintain reasonable level of dead records across the database. So if DB became bloated before autovacuum has been enabled, additional one-time manual maintenance is required.

In order to check whether DB is well maintained, gather maintenance metrics by executing the following commands. These queries do not require stop of service and could be executed anytime.

Preparation procedures

  1. Run odin-pg-info utility on the database host:

    # odin-pg-info --db-name='oss' --db-host='/tmp/' --db-user='postgres'
    
  2. In the output examine the data under All tables sorted by BLOAT_SZ header (some columns are skipped for brevity):

                                                                 ACTUAL    CLEAR
    TABLE                                           *BLOAT_SZ    TBL_SZ   TBL_SZ  BLOAT%
    ---------------------------------------------  ----------  --------  -------  ------
    aps_resource                                       145 MB    507 MB   363 MB  28.50%
    on_screen_notification_activity_log_params         144 MB   3280 MB  3136 MB   4.39%
    aps_resource_link                                  127 MB   2235 MB  2108 MB   5.69%
    

    Any table of 1 GB size or more with 30+% of BLOAT% should be added to maintenance list. Tables less than 1GB size should be maintained if bloat exceeds 50+%.

  3. Examine All indexes sorted by BLOAT% table (some columns are skipped for brevity):

                                                                                      ACTUAL                    
    TABLE                     INDEX                                     BLOAT_SZ      IDX_SZ     TBL_SZ  *BLOAT%
    ------------------------  --------------------------------------------------  ---------- ----------  -------
    aps_property_value        IX_aps_property_value_vstring              4951 MB     6465 MB    7202 MB     76.6
    aps_property_value        IX_aps_property_value_aps_type_property    2546 MB     3087 MB    7202 MB     82.5
    aps_resource_link         IX_aps_resource_link_target_node           1306 MB     1758 MB    2234 MB     74.3
    aps_resource_link         IX_aps_resource_list_edge                   873 MB     1427 MB    2234 MB     61.2
    

    If an index bloat is more than 30% and table size is exceeds 1GB, add this table to maintenance list.

Maintenance procedures

All procedures below should be performed during the maintenance window since some operations require stop of OA services and database performance will degrade during vacuuming.

  1. Stop OA services on the management node as described in KB.

    It is required to stop any services which use database since VACUUM FULL will lock processed tables blocking any queries.

  2. Connect of OA database and run VACUUM FULL VERBOSE ANALYZE for all tables from previously prepared maintenance list one by one:

    oss=> VACUUM FULL VERBOSE ANALYZE aps_property_value;
    oss=> VACUUM FULL VERBOSE ANALYZE aps_resource_link;
    

    Warning: VACUUM FULL copies all 'alive' records from each table to temporary one. So this step requires amount of free diskspace equal to the biggest table of the database.

    Duration of full vacuum scales with size of the table(s) and mostly depends on disks I/O performance. Our tests on virtual container based on fast HDDs, required 10-11s per 100MB of table. So vacuuming of the table of 2.7 GB size took 280 seconds. ANALYZE will update query planner statistics to adapt to how data is distributed after vacuum.

    Notice: VACUUM could be canceled at runtime with to harmful consequences.

  3. Start OA services as described in KB.

External references:

  1. vacuum command
  2. routine vacuuming

Internal content

Link on internal Article