Symptoms

Scheduled backups for VEs that use a particular schedule stop being performed after a specific date. Numerous exceptions of the following kind appear in vps.log:

ERROR ClusterImpl [QuartzScheduler_Worker-7] - Failed to save pipeline #4880 [(VE_BACKUP for: VeId[1308432.example]), created at node [im1], step 0 (<start>), mode: EXEC, state:SHARED_RUN, reqIs: null] java.lang.StackOverflowError: null
        at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:44) ~[kryo-3.0.0.jar:na]
        at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:44) ~[kryo-3.0.0.jar:na]

Scheduled backups for VEs on other schedules are performed correctly.
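
To check how many VEs are affected, the log can be scanned for the error shown above. The following is a minimal sketch, not an official tool: the regular expression is derived only from the single log line quoted in this article, and the function name `count_failed_backups` is a hypothetical helper. Other log entries may be laid out differently.

```python
import re
from collections import Counter

# Pattern derived from the one log line quoted in this article.
# Assumption: other failed VE_BACKUP entries in vps.log use the same layout.
FAILED_SAVE = re.compile(
    r"Failed to save pipeline #\d+ \[\(VE_BACKUP for: VeId\[([^\]]+)\]\)"
    r".*StackOverflowError"
)


def count_failed_backups(lines):
    """Return a Counter mapping VE id to the number of matching error lines."""
    hits = Counter()
    for line in lines:
        match = FAILED_SAVE.search(line)
        if match:
            hits[match.group(1)] += 1
    return hits
```

The function accepts any iterable of lines, so an open file object for vps.log can be passed to it directly. VEs that appear in the result are candidates for the problem described here.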

Cause

CCU-14645 damaged the database caching layer during handoff of VEs. This resulted in exceptions of the same kind during the VE_BACKUP procedure, which led to entries in the database that are not cleaned up automatically but are still considered active backups.

Therefore, the task that schedules backups does not place any more backups in the queue.
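
The blocking effect can be illustrated with a toy model. This is only a sketch of the reasoning above, not the product's code: the class `BackupScheduler`, its methods, and the in-memory set standing in for the database entries are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class BackupScheduler:
    # Hypothetical stand-in for database entries marking a backup as active.
    active: set = field(default_factory=set)
    queue: list = field(default_factory=list)

    def schedule(self, ve_id):
        """Queue a backup unless an active entry already exists for the VE."""
        if ve_id in self.active:
            return False  # scheduler believes a backup is still running
        self.active.add(ve_id)
        self.queue.append(ve_id)
        return True

    def finish(self, ve_id, failed_before_cleanup=False):
        """Take the backup off the queue; a failure skips the clean-up step."""
        if ve_id in self.queue:
            self.queue.remove(ve_id)
        if not failed_before_cleanup:
            self.active.discard(ve_id)
```

In this model, a backup that fails before clean-up leaves its entry in `active`, so every later `schedule` call for the same VE returns `False`, while other VEs continue to be scheduled normally.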

Resolution

CCU-14645 was fixed in OAP version 7.2. An update is required to resolve the issue permanently.

To apply a workaround without updating, please contact Odin Technical Support.
