Symptoms

The Provider Control Panel hangs and shows "Service Unavailable" after the HTTP session times out.

The OA UI log (/var/log/poa-ui.log or /var/log/pa/pui/pui.log) contains Java out-of-memory (OOM) errors:

2016-11-29 16:26:53,886 0b49b3d5a2 8373) DEBUG  ERR                  java.lang.OutOfMemoryError: Java heap space
2016-11-29 16:26:53,894            8354) DEBUG  ERR                  java.lang.OutOfMemoryError: Java heap 

Before the outage, the UI log fills up with a large number of "fetch usage info for account_id" calls for the same account, for example:

14:23:30.039 [default task-164] DEBUG c.p.p2.cp.core.pcp.objects.ResCache - [ResCache] fetch usage info for account_id=1010101
14:23:30.063 [default task-93] DEBUG c.p.p2.cp.core.pcp.objects.ResCache - [ResCache] fetch usage info for account_id=1010101
14:23:30.153 [default task-86] DEBUG c.p.p2.cp.core.pcp.objects.ResCache - [ResCache] fetch usage info for account_id=1010101
14:23:40.608 [default task-216] DEBUG c.p.p2.cp.core.pcp.objects.ResCache - [ResCache] fetch usage info for account_id=1010101
14:23:45.871 [default task-147] DEBUG c.p.p2.cp.core.pcp.objects.ResCache - [ResCache] fetch usage info for account_id=1010101
...

and these calls are failing:

14:25:45.579 [default task-147] INFO  CORBA - #11663900: Plesk.ResourceManagement._SubscriptionResourceManagerStub.getSubsWithRTTreeList(user_id=10101 (su_user_id=110011)) <<< exit [119708 ms] by exception: IDL:Plesk/ExSystem:1.0
14:25:52.887 [default task-172] INFO  CORBA - #11663974: Plesk.ResourceManagement._SubscriptionResourceManagerStub.getSubsWithRTTreeList(user_id=10101 (su_user_id=110011)) <<< exit [125113 ms] by exception: IDL:Plesk/ExSystem:1.0
14:25:53.047 [default task-93] INFO  CORBA - #11663912: Plesk.ResourceManagement._SubscriptionResourceManagerStub.getSubsWithRTTreeList(user_id=10101 (su_user_id=110011)) <<< exit [127089 ms] by exception: IDL:Plesk/ExSystem:1.0

The account (1010101 in this example) owns more than a thousand subscriptions.
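
To check whether a log matches this pattern, the fetch calls can be tallied per account and the OOM errors counted. The following is a minimal Python sketch, not an official tool: the function name summarize and the sample list are hypothetical, and the regular expression assumes the log lines have the same format as the excerpts quoted above.

```python
import re
from collections import Counter

# Assumption: fetch lines look like the ResCache excerpts quoted in this article.
FETCH_RE = re.compile(r"\[ResCache\] fetch usage info for account_id=(\d+)")
OOM_MARKER = "java.lang.OutOfMemoryError"


def summarize(lines):
    """Return (fetch calls per account_id, number of OOM error lines)."""
    fetches = Counter()
    oom_errors = 0
    for line in lines:
        match = FETCH_RE.search(line)
        if match:
            fetches[match.group(1)] += 1
        if OOM_MARKER in line:
            oom_errors += 1
    return fetches, oom_errors


# Sample lines copied from the excerpts above.
sample = [
    "14:23:30.039 [default task-164] DEBUG c.p.p2.cp.core.pcp.objects.ResCache - [ResCache] fetch usage info for account_id=1010101",
    "14:23:30.063 [default task-93] DEBUG c.p.p2.cp.core.pcp.objects.ResCache - [ResCache] fetch usage info for account_id=1010101",
    "2016-11-29 16:26:53,886 0b49b3d5a2 8373) DEBUG  ERR                  java.lang.OutOfMemoryError: Java heap space",
]

fetches, oom_errors = summarize(sample)
# List the accounts with the most fetch calls first.
for account_id, count in fetches.most_common(5):
    print(account_id, count)
print("OOM errors:", oom_errors)
```

To run it against a real log, pass an open file instead of the sample list, for example summarize(open("/var/log/pa/pui/pui.log", errors="replace")). A single account that dominates the tally shortly before the OOM errors is consistent with the symptoms described here.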

Restarting the pemui service on the UI host brings the CP back.

Cause

The behavior is acknowledged as a software issue POA-108794: simultaneous ResCache creation in several threads can cause java.lang.OutOfMemoryError: Java heap space.

Resolution

To minimize the risk of similar outages, apply the recommendations listed in the following article to the UI service:

OA UI is down: java.lang.OutOfMemoryError: GC overhead limit exceeded

If the outage keeps recurring, contact PTA/TAM to obtain the permanent fix for the issue.
