Symptoms
Provider's Control Panel hangs and shows Service Unavailable after HTTP session timeout.
OA UI log (/var/log/poa-ui.log or /var/log/pa/pui/pui.log) contains Java OOM errors:
2016-11-29 16:26:53,886 0b49b3d5a2 8373) DEBUG ERR java.lang.OutOfMemoryError: Java heap space
2016-11-29 16:26:53,894 8354) DEBUG ERR java.lang.OutOfMemoryError: Java heap
Before the outage, the UI log is filled with a large number of "fetch usage info for account_id" calls for the same account, for example:
14:23:30.039 [default task-164] DEBUG c.p.p2.cp.core.pcp.objects.ResCache - [ResCache] fetch usage info for account_id=1010101
14:23:30.063 [default task-93] DEBUG c.p.p2.cp.core.pcp.objects.ResCache - [ResCache] fetch usage info for account_id=1010101
14:23:30.153 [default task-86] DEBUG c.p.p2.cp.core.pcp.objects.ResCache - [ResCache] fetch usage info for account_id=1010101
14:23:40.608 [default task-216] DEBUG c.p.p2.cp.core.pcp.objects.ResCache - [ResCache] fetch usage info for account_id=1010101
14:23:45.871 [default task-147] DEBUG c.p.p2.cp.core.pcp.objects.ResCache - [ResCache] fetch usage info for account_id=1010101
...
and these calls are failing:
14:25:45.579 [default task-147] INFO CORBA - #11663900: Plesk.ResourceManagement._SubscriptionResourceManagerStub.getSubsWithRTTreeList(user_id=10101 (su_user_id=110011)) <<< exit [119708 ms] by exception: IDL:Plesk/ExSystem:1.0
14:25:52.887 [default task-172] INFO CORBA - #11663974: Plesk.ResourceManagement._SubscriptionResourceManagerStub.getSubsWithRTTreeList(user_id=10101 (su_user_id=110011)) <<< exit [125113 ms] by exception: IDL:Plesk/ExSystem:1.0
14:25:53.047 [default task-93] INFO CORBA - #11663912: Plesk.ResourceManagement._SubscriptionResourceManagerStub.getSubsWithRTTreeList(user_id=10101 (su_user_id=110011)) <<< exit [127089 ms] by exception: IDL:Plesk/ExSystem:1.0
The account (1010101 in this example) owns more than a thousand subscriptions.
Restarting the pemui service on the UI host brings the CP back.
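To check whether a UI log matches these symptoms, the per-account fetch calls and OOM lines can be counted with a short script. The following is a minimal sketch, not a supported tool: the function name summarize_ui_log and the regular expression are assumptions derived only from the log excerpts above, and the actual log format may differ between versions.

```python
import re
import sys
from collections import Counter

# Pattern assumed from the ResCache lines quoted in this article.
FETCH_RE = re.compile(r"\[ResCache\] fetch usage info for account_id=(\d+)")
OOM_MARKER = "java.lang.OutOfMemoryError"


def summarize_ui_log(lines):
    """Return (fetch count per account_id, number of OOM lines)."""
    fetches = Counter()
    oom_count = 0
    for line in lines:
        match = FETCH_RE.search(line)
        if match:
            fetches[match.group(1)] += 1
        if OOM_MARKER in line:
            oom_count += 1
    return fetches, oom_count


# Sample lines copied from the excerpts in this article.
SAMPLE = [
    "14:23:30.039 [default task-164] DEBUG c.p.p2.cp.core.pcp.objects.ResCache - [ResCache] fetch usage info for account_id=1010101",
    "14:23:30.063 [default task-93] DEBUG c.p.p2.cp.core.pcp.objects.ResCache - [ResCache] fetch usage info for account_id=1010101",
    "14:23:30.153 [default task-86] DEBUG c.p.p2.cp.core.pcp.objects.ResCache - [ResCache] fetch usage info for account_id=1010101",
    "2016-11-29 16:26:53,886 0b49b3d5a2 8373) DEBUG ERR java.lang.OutOfMemoryError: Java heap space",
]

if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Path to a UI log file, e.g. /var/log/pa/pui/pui.log
        with open(sys.argv[1], errors="replace") as log_file:
            fetches, oom_count = summarize_ui_log(log_file)
    else:
        fetches, oom_count = summarize_ui_log(SAMPLE)
    print("OOM lines:", oom_count)
    for account_id, count in fetches.most_common(5):
        print("account_id", account_id, "fetch calls:", count)
```

A single account with a disproportionately high fetch count shortly before the OOM lines would be consistent with the pattern described above.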
Cause
This behavior is acknowledged as software issue POA-108794: simultaneous ResCache creation in several threads can cause java.lang.OutOfMemoryError: Java heap space.
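To illustrate why simultaneous creation is costly, consider the general "single-flight" pattern, in which only one thread builds a cache entry for a given key while the others wait for the result. The sketch below is purely illustrative and is not the product's code or the vendor's fix: SingleFlightCache, slow_loader, and the returned data shape are all hypothetical names introduced for this example.

```python
import threading
import time


class SingleFlightCache:
    """Cache in which at most one thread loads the value for a given key."""

    def __init__(self, loader):
        self._loader = loader
        self._values = {}
        self._key_locks = {}
        self._guard = threading.Lock()  # protects the two dicts above

    def get(self, key):
        with self._guard:
            if key in self._values:
                return self._values[key]
            key_lock = self._key_locks.setdefault(key, threading.Lock())
        # Only one thread per key passes this point at a time.
        with key_lock:
            with self._guard:
                if key in self._values:
                    return self._values[key]
            value = self._loader(key)  # expensive load happens once
            with self._guard:
                self._values[key] = value
            return value


load_calls = []


def slow_loader(account_id):
    """Hypothetical stand-in for an expensive usage-info fetch."""
    load_calls.append(account_id)
    time.sleep(0.05)
    return {"account_id": account_id, "subscriptions": []}


cache = SingleFlightCache(slow_loader)
threads = [threading.Thread(target=cache.get, args=(1010101,)) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print(len(load_calls))  # prints 1: eight concurrent requests, one load
```

Without the per-key lock, each of the eight threads would run the loader and hold its own copy of the result at the same time, which is the kind of memory multiplication the issue description points to.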
Resolution
To minimize the risk of similar outages, apply the recommendations listed in the following article to the UI service:
OA UI is down: java.lang.OutOfMemoryError: GC overhead limit exceeded
If the outage keeps recurring, contact PTA/TAM to obtain the permanent fix for the issue.