Symptoms

Random "Server is temporarily closed for maintenance" errors appear in the Parallels Business Automation - Enterprise (PBA-E) Control Panel (CP). The top path does not appear in the Provider Control Panel after the session has been idle for a couple of minutes. Some screens do not load correctly.

The BM and www containers (and probably others) are restarted about every 5 minutes. The restarts are visible in the corresponding container logs. For example, in www.log:

[root@pba ~]# grep "Container WWW_Container" /usr/local/bm/log/www.log
[14-04-12 11:21:18.521 WWW_Contain TH01784 NTE] Container WWW_Container <41:0:0> started.
[14-04-12 11:26:23.446 WWW_Contain TH01836 NTE] Container WWW_Container <44:0:0> started.
[14-04-12 11:31:28.466 WWW_Contain TH01871 NTE] Container WWW_Container <45:0:0> started.
[14-04-12 11:36:33.544 WWW_Contain TH01909 NTE] Container WWW_Container <47:0:0> started.
[14-04-12 11:41:39.259 WWW_Contain TH01952 NTE] Container WWW_Container <49:0:0> started.

[14-04-12 11:36:33.545 UIWorker RQ00041 ERR] error: connection closed by other party 
[14-04-12 11:41:31.546 UIWorker RQ00041 ERR] connection closed by other party

In BM.log:

[root@pba ~]# grep "Server BM_Container started" /usr/local/bm/log/BM.log
[14-04-12 11:21:18.592 BM          TH01779 NTE] Server BM_Container started.
[14-04-12 11:26:23.419 BM          TH01825 NTE] Server BM_Container started.
[14-04-12 11:31:28.486 BM          TH01866 NTE] Server BM_Container started.
[14-04-12 11:36:33.572 BM          TH01904 NTE] Server BM_Container started.
[14-04-12 11:41:39.305 BM          TH01947 NTE] Server BM_Container started.

Watchdog is installed and started on the PBA-E server. In wd.log, you can see regular service restarts:

[root@pba ~]# grep "Dumping/restarting service(s)" /usr/local/bm/log/wd.log
[Sat Apr 12 11:21:14 2014] [HTTP] [INFO] Dumping/restarting service(s)..
[Sat Apr 12 11:26:19 2014] [HTTP] [INFO] Dumping/restarting service(s)..
[Sat Apr 12 11:31:24 2014] [HTTP] [INFO] Dumping/restarting service(s)..
[Sat Apr 12 11:36:29 2014] [HTTP] [INFO] Dumping/restarting service(s)..
[Sat Apr 12 11:41:35 2014] [HTTP] [INFO] Dumping/restarting service(s)..

Cause

Containers are restarted by Watchdog because of a misconfiguration in the Watchdog configuration file.

Resolution

Before each service restart, wd.log records the failed check that triggered it, for example:

[Sat Apr 12 11:46:39 2014] [HTTP] [DEBUG] Checking server availability..
[Sat Apr 12 11:46:39 2014] [HTTP] [WARN] Server response status '500 Can't connect to 10.39.87.31:5221 (connect: Connection refused)' is not valid.
[Sat Apr 12 11:46:40 2014] [HTTP] [INFO] Dumping/restarting service(s)..
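
To list the warning that precedes each restart in one pass, a filter along the following lines may help. It is a minimal sketch that assumes the wd.log line format shown above; the function name wd_restart_reasons is hypothetical and not part of PBA-E.

```shell
# Hypothetical helper: print the [WARN] line logged immediately before
# each "Dumping/restarting service(s)" record in a Watchdog log.
# Assumes the warning is the line directly above the restart record,
# as in the excerpt above.
wd_restart_reasons() {
    grep -B 1 'Dumping/restarting service(s)' "${1:-/usr/local/bm/log/wd.log}" \
        | grep '\[WARN\]'
}
```

Called without arguments, the function reads /usr/local/bm/log/wd.log. If every restart is preceded by the same "Connection refused" warning, the availability check itself is the likely trigger.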

Check the /usr/local/bm/etc/ssm.conf.d/wd.conf file:

[environment]
MAIL_PREFIX     = @@CUSTOMER_NAME@
MAIL_FROM       = @@CUSTOMER_EMAIL@
MAIL_TO         = bugreports@stellart.net
MODULES         = http service process
WWW_URL         = http://10.39.87.31:5221
CP_PATH         = $(CpUrlPrefix)/nologin/act/AHRC/Language_AvailableLanguagesList/
DUMP_SERVERS    = BM www

# Process settings
VS_TRESHOLD     = 4000000
RS_TRESHOLD     = 1500000

[options]
bin             = $(_name).pl
executor        = /usr/bin/perl
cwd             = $(_root)
summary         = Stellart Watchdog
arguments       =

In the example above, Watchdog restarts the BM and www containers whenever the availability check fails: DUMP_SERVERS = BM www

The following URL is used to check the www container: WWW_URL = http://10.39.87.31:5221
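
The configured URL can be read from wd.conf and probed by hand. The sketch below is an illustration only: the function names wd_www_url and wd_probe are hypothetical, curl is assumed to be installed on the PBA-E server, and the probe requests the bare WWW_URL rather than the exact request Watchdog sends.

```shell
# Hypothetical helper: print the value of WWW_URL from a Watchdog
# configuration file (assumes the "KEY = value" layout shown above).
wd_www_url() {
    sed -n 's/^WWW_URL[[:space:]]*=[[:space:]]*//p' \
        "${1:-/usr/local/bm/etc/ssm.conf.d/wd.conf}"
}

# Hypothetical helper: request the given URL and print the HTTP status code.
# -k skips certificate verification, --max-time limits the wait to 5 seconds.
# A non-zero exit status (7 for a refused connection) means that no HTTP
# response was received at all.
wd_probe() {
    curl -sk -o /dev/null --max-time 5 -w '%{http_code}\n' "$1"
}
```

For example, wd_probe "$(wd_www_url)" prints 000 and returns a non-zero status when nothing is listening on the configured address and port.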

In this example the configured URL is inaccessible, so every availability check fails and Watchdog restarts the services each time. Change WWW_URL in the wd.conf file to the default value:

WWW_URL = https://127.0.0.1
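
The change can be made in any text editor. As one possible way to script it, the sketch below keeps a backup copy and rewrites only the WWW_URL line; the function name wd_set_www_url and the .bak suffix are hypothetical choices.

```shell
# Hypothetical helper: set WWW_URL in a Watchdog configuration file.
# Usage: wd_set_www_url /usr/local/bm/etc/ssm.conf.d/wd.conf https://127.0.0.1
# A copy of the original file is kept next to it with a .bak suffix.
# The new URL must not contain the characters | or &, which are special
# in the sed expression used here.
wd_set_www_url() {
    cp -p "$1" "$1.bak" &&
    sed "s|^\(WWW_URL[[:space:]]*=[[:space:]]*\).*|\1$2|" "$1.bak" > "$1"
}
```

The original spacing around the equals sign is preserved, and all other lines are copied unchanged.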

Restart Watchdog to apply the changes:

/etc/init.d/pba restart wd
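
To confirm that the restarts have stopped, the number of restart records in wd.log can be compared before and after the change. This is a minimal sketch; the function name wd_restart_count is hypothetical.

```shell
# Hypothetical helper: count the "Dumping/restarting service(s)" records
# in a Watchdog log (defaults to /usr/local/bm/log/wd.log).
wd_restart_count() {
    grep -c 'Dumping/restarting service(s)' "${1:-/usr/local/bm/log/wd.log}"
}
```

Run wd_restart_count once after restarting Watchdog and again about ten minutes later. If the two numbers are equal, no further restarts were logged in that interval.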

For more details on Watchdog configuration, refer to the article Watchdog Installation/Upgrade for PBA 5.4.
