Symptoms
Many tasks handled by the VPSManager service controller (part of the Virtuozzo Containers OA module), such as backups, migrations, and user-triggered tasks, fail.
Among others, a Set user password in VPS task is rescheduled every 5 minutes.
The log output of the task:
Feb 27 19:04:35.491 : DBG [task:57864628:690778 p:-default-threadpool;-w:-Idle:1088 pau]: c.p.p.s.h.e.HostMngmtBean CORBA::TRANSIENT|COMM_FAILURE while contacting host 1010, marking it as unmanageable
The VPSManager service controller crashes on each task run, and many core dump files accumulate in the /usr/local/pem/var/cores directory.
The last entry in the log files before the error:
Feb 27 19:04:34.533 : DBG [task:57864628:690778 1:32573:7fa22ebfd700 VPSManager ]: [ VEManager_impl::doPerformHostOperations] ===> ENTRY
Stack trace of the crash dumps:
(gdb) bt
#0 0x00007fe3a3678694 in vfprintf () from /lib64/libc.so.6
#1 0x00007fe3a36a4179 in vsnprintf () from /lib64/libc.so.6
#2 0x00007fe39ef0f920 in _log_ap(char const*, char const*, LogPriority, char const*, __va_list_tag*) () from /usr/local/pem/libexec/VPSManager.so.7.2.0.16
#3 0x00007fe39ec62a54 in pem_log () from /usr/local/pem/libexec/VPSManager.so.7.2.0.16
#4 0x00007fe39ec643c3 in Plesk::VEManager_impl::doPerformHostOperations(SDK::PerformHostOperations::Args const&) () from /usr/local/pem/libexec/VPSManager.so.7.2.0.16
#5 0x00007fe3a85a6c2e in SDK::PerformHostOperations::ServantImpl::doPerformHostOperations(Plesk::PerformHostOperations::Args const&) () from /usr/local/pem/lib/libplatform.so
#6 0x00007fe3a84b5fba in POA_Plesk::doPerformHostOperations_PerformHostOperations::execute() () from /usr/local/pem/lib/libplatform.so
#7 0x00007fe3a6c582c3 in TAO::ServerRequestInterceptor_Adapter_Impl::execute_command(TAO_ServerRequest&, TAO::Upcall_Command&) () from /lib64/libTAO_PI_Server.so.2.2.1
#8 0x00007fe3a69ebe33 in TAO::Upcall_Wrapper::upcall(TAO_ServerRequest&, TAO::Argument* const*, unsigned long, TAO::Upcall_Command&, TAO::Portable_Server::Servant_Upcall*, CORBA::TypeCode* const*, unsigned int) () from /lib64/libTAO_PortableServer.so.2.2.1
#9 0x00007fe3a84b5a49 in POA_Plesk::PerformHostOperations::doPerformHostOperations_skel(TAO_ServerRequest&, TAO::Portable_Server::Servant_Upcall*, TAO_ServantBase*) () from /usr/local/pem/lib/libplatform.so
OA runs with TRACE log level:
[root@osscore ~]# grep loglevel /usr/local/pem/etc/pleskd.props
loglevel=TRACE
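The two checks above (configured log level and presence of core dumps) can be combined into one quick test. The following is only a minimal sketch: the function name check_trace_crash_conditions is hypothetical, and the default paths are the ones quoted in this article.

```shell
#!/bin/sh
# Hypothetical helper (not part of OA): report the configured log level
# and the number of files in the core dump directory.
# Both default paths are taken from this article; pass other paths as arguments.
check_trace_crash_conditions() {
    props="${1:-/usr/local/pem/etc/pleskd.props}"
    cores="${2:-/usr/local/pem/var/cores}"
    # Take the value of the last "loglevel=" line in the props file.
    level=$(sed -n 's/^loglevel=//p' "$props" | tail -n 1)
    # Count regular files directly inside the cores directory.
    count=$(find "$cores" -maxdepth 1 -type f | wc -l | tr -d ' ')
    echo "loglevel=$level cores=$count"
}
# Usage on the management node (as root):
#   check_trace_crash_conditions
```

A result showing loglevel=TRACE together with a non-zero core count matches the symptoms described here, although it does not by itself prove that the crashes have this cause.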
Cause
The issue is recognized as POA-114870: VPSManager SC crashes with TRACE log level enabled.
Resolution
To avoid the crashes, lower the log level to DEBUG in /usr/local/pem/etc/pleskd.props. After the change, the check should show:
[root@osscore ~]# grep loglevel /usr/local/pem/etc/pleskd.props
loglevel=DEBUG
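One possible way to make this change from the shell is sketched below. It is not an official procedure: the function name set_loglevel_debug is hypothetical, the default path is the one shown in this article, and the sketch assumes GNU sed (for the -i option) and a line that reads exactly loglevel=TRACE.

```shell
#!/bin/sh
# Hypothetical helper (not part of OA): switch loglevel from TRACE to DEBUG.
# The default path is the one shown in this article; pass another path as $1.
set_loglevel_debug() {
    props="${1:-/usr/local/pem/etc/pleskd.props}"
    # Keep a backup copy next to the original before editing.
    cp -p "$props" "$props.bak" || return 1
    # Rewrite only a line that is exactly "loglevel=TRACE" (GNU sed -i assumed).
    sed -i 's/^loglevel=TRACE$/loglevel=DEBUG/' "$props" || return 1
    # Print the resulting setting for verification.
    grep '^loglevel=' "$props"
}
# Usage on the management node (as root):
#   set_loglevel_debug
```

The backup file pleskd.props.bak allows the previous setting to be restored if needed. Editing the file manually in a text editor achieves the same result.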
Then restart the OA core services as described in the article:
How to restart OA system services: UI, Management Node, Agents