Symptoms
The issue affects APS application-driven services. Typical symptoms are:
- An attempt to place a new order for a service plan with APS services fails after 5 minutes;
- Switching to the APS application tab in UX1 times out after 5 minutes.
The following can be observed in /var/log/pa/core.log on the OA management node while reproducing the issue:
Jan 17 12:08:26.786 : DBG [rest:5110524 1:16519:7fbe02fb5700 SAAS ]: [ RDBMS] Direct execute [0x7fbeac58e260_0]: SET SESSION statement_timeout = 300000
...
Jan 17 12:08:26.841 : DBG [rest:5110524 1:16519:7fbe02fb5700 SAAS ]: [RDBMS] Prepare [0x7fbea01366d0]: <very long select>
...
Jan 17 12:13:26.850 : DBG [rest:5110524 1:16519:7fbe02fb5700 SAAS ]: [ RDBMS] can't retry. Transaction is in progress
Jan 17 12:13:26.850 : DBG [rest:5110524 1:16519:7fbe02fb5700 SAAS ]: [ APSC] [APS::Controller::RQLQuery::execute] <=== EXIT (by exception) [300.009838]
Jan 17 12:13:26.851 : ERR [rest:5110524 1:16519:7fbe02fb5700 SAAS ]: [ APSC] RQL internal error: ERROR: canceling statement due to statement timeout
Jan 17 12:13:26.851 : DBG [rest:5110524 1:16519:7fbe02fb5700 SAAS ]: [ APSC] [APS::Controller::obtainResourceList] <=== EXIT (by exception) [300.027572]
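Note that both symptoms fail after exactly 5 minutes, which matches the session statement timeout set in the first log line (`SET SESSION statement_timeout = 300000`, in milliseconds). A quick conversion confirms the match:

```shell
# statement_timeout from the log is given in milliseconds
timeout_ms=300000
# 300000 ms / 1000 = 300 s; 300 s / 60 = 5 minutes
echo "$((timeout_ms / 1000 / 60)) minutes"
# prints "5 minutes"
```

So the query is not failing on its own: PostgreSQL cancels it once it exceeds the configured 5-minute statement timeout.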
Cause
The database backend on the OSS MN processes queries very slowly. The most probable root cause is bloated database tables/indices.
Resolution
Check for bloated tables using the odin-pg-info tool on the OA DB host:
- Install the RPM package as described in this KB article;
- Check the output of:
# odin-pg-info
- Note the tables in the sections where the BLOAT keyword is mentioned.
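If odin-pg-info cannot be installed, a rough cross-check is possible with standard PostgreSQL statistics. This is a hedged sketch only: pg_stat_user_tables is a standard PostgreSQL view, but the database name (`oss` here) and connection details are assumptions that depend on your deployment:

```shell
# List the 20 tables with the most dead tuples; a high dead-tuple
# share is a common indicator of table bloat.
# The database name "oss" is an illustrative assumption.
su - postgres -c 'psql -d oss -c "
SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;"'
```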
Perform DB maintenance as per the following guide: OA database maintenance procedures.
NOTE: Maintenance can be performed for individual tables by specifying the table name after the command itself.
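As a hedged illustration of per-table maintenance: with stock PostgreSQL tooling it looks like the command below. The actual command to use is given in the maintenance guide above; `vacuumdb`, the database name `oss`, and the table name `dummy_table` are illustrative assumptions, not values from this article.

```shell
# Vacuum and analyze a single table instead of the whole database.
# All names below are illustrative assumptions.
vacuumdb --analyze --table dummy_table oss
```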