The add_environment task stays in the rescheduled state after the upgrade to Cloud Infrastructure 18.0.
/var/log/pa/core.log on the Management node contains only a 202 Accepted response:
Sep 26 01:59:41.022 : DBG [task:171895968:50074 Thread-276-(ActiveMQ-client-global-threads):4835 pau]: c.p.p.s.j.e.JobTrackManagerBean Starting job 171895968 name 'Execute operation 'add_environment' with ID 11453bab-883f-4b4d-a6d4-f3fb0b632850 on resource ab4cfa6b-32c3-4eb9-9b7a-48c8678b5c08' queue APSAsyncOperations
...
Sep 26 01:59:41.030 : DBG [task:171895968:50074 Thread-276-(ActiveMQ-client-global-threads):4835 pau]: c.p.p.s.c.e.EndpointExecutorBean Begin request: 'PUT https://192.168.1.99:9000/api/payg/ab4cfa6b-32c3-4eb9-9b7a-48c8678b5c08/add_environment' TLSv1.2
...
Sep 26 01:59:41.034 : DBG [task:171895968:50074 Thread-276-(ActiveMQ-client-global-threads):4835 pau]: c.p.p.s.c.e.EndpointExecutorBean End request: 'PUT https://192.168.1.99:9000/api/payg/ab4cfa6b-32c3-4eb9-9b7a-48c8678b5c08/add_environment' '202 Accepted'
The same response can be seen in /var/log/pa/vps.aps.log on the CIA endpoint:
2019-09-26 01:59:41,032 aps_11453bab-883f-4b4d-a6d4-f3fb0b632850_49643 INFO ImplicitAPSIdInjectorWebFilter [https-jsse-nio-9000-exec-17] - HTTP Request PUT /api/payg/ab4cfa6b-32c3-4eb9-9b7a-48c8678b5c08/add_environment
2019-09-26 01:59:41,033 aps_11453bab-883f-4b4d-a6d4-f3fb0b632850_49643 INFO HttpLoggingFeature [https-jsse-nio-9000-exec-17] - Server has received a request on thread https-jsse-nio-9000-exec-17
* [APSC->PACI]
> PUT https://192.168.1.99:9000/api/payg/ab4cfa6b-32c3-4eb9-9b7a-48c8678b5c08/add_environment
...
2019-09-26 01:59:41,034 aps_11453bab-883f-4b4d-a6d4-f3fb0b632850_49643 INFO HttpLoggingFeature [https-jsse-nio-9000-exec-17] - Server responded with a response on thread https-jsse-nio-9000-exec-17
* [APSC<-PACI]
< 202
< APS-Retry-Timeout: 10
< APS-Info: Creating Server
< Content-Type: application/xml
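To confirm that the endpoint never answered with anything other than 202, the 'End request' lines can be filtered out of core.log. The snippet below is only a sketch: the function name end_request_statuses is hypothetical, and it assumes the log format shown above, where the response status is the last single-quoted field on the line.

```shell
# Hypothetical helper: print the response status of every 'End request'
# line that belongs to an add_environment call in a core.log-style file.
# Usage: end_request_statuses LOGFILE
end_request_statuses() {
    grep 'End request:' "$1" \
        | grep '/add_environment' \
        | sed -n "s/.*' '\([^']*\)'[[:space:]]*\$/\1/p"
}
```

For example, end_request_statuses /var/log/pa/core.log prints one status per request. If every printed line is 202 Accepted, the endpoint is still waiting for the operation to finish.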
Nothing is received on Instance Manager side in
All required ports are open according to the CIA Endpoint Deployment Guide.
The beginning of the request contains a java.net.ConnectException: Connection timed out exception:
2019-09-23 04:21:32,376 (aps_11453bab-883f-4b4d-a6d4-f3fb0b632850_6) ERROR AsyncClientImpl [AsyncClientResponseThread-1] - unable to process response: java.net.ConnectException: Connection timed out
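To see how often the callback has already failed for one operation, the timeout entries can be counted per APS operation ID. This is a minimal sketch under assumptions: count_connect_timeouts is a hypothetical name, the log path is passed in by the caller because it depends on where the exception is logged, and the ID is assumed to appear with the aps_ prefix as in the line above.

```shell
# Hypothetical helper: count 'Connection timed out' entries logged
# for a single APS operation ID.
# Usage: count_connect_timeouts LOGFILE OPERATION_ID
count_connect_timeouts() {
    grep -F "aps_$2" "$1" \
        | grep -c -F 'java.net.ConnectException: Connection timed out'
}
```

A count that keeps growing between runs suggests that the callback is still being retried against an unreachable address.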
IM tries to communicate with the Endpoint over port 4477 using a non-backnet IP address, because of a misconfiguration at the Endpoint setup step during the upgrade.
The VE was successfully created on the IM side, but the Endpoint did not receive a callback from IM, so the Endpoint considers the operation unfinished.
Correct the callbackHost communication IP address on the CIA endpoint. Specify the backnet address that has port 4477 open in the firewall:
# cat /usr/local/share/PACI-aps/paci-config.xml | grep callbackHost
<callbackHost>192.168.2.99</callbackHost></im>
Restart the PACI-aps endpoint service:
# systemctl restart PACI-aps
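The configured address can be verified with a short script. This is a sketch, not a product tool: the function names get_callback_host and probe_port are hypothetical, and the probe assumes that bash (for its /dev/tcp redirection) and the coreutils timeout command are available on the node where it runs.

```shell
# Hypothetical helper: print the first callbackHost value found in a
# paci-config.xml-style file.
# Usage: get_callback_host CONFIGFILE
get_callback_host() {
    sed -n 's:.*<callbackHost>\([^<]*\)</callbackHost>.*:\1:p' "$1" | head -n 1
}

# Hypothetical helper: succeed if a TCP connection to HOST:PORT can be
# opened within 5 seconds (assumes bash and timeout are installed).
# Usage: probe_port HOST PORT
probe_port() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2"
}
```

On the CIA endpoint, get_callback_host /usr/local/share/PACI-aps/paci-config.xml prints the configured address. Running probe_port with that address and 4477 from the IM node then shows whether IM can reach the callback port: a zero exit status means the connection was opened.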
However, correcting callbackHost may not fix existing failed add_environment tasks that have been stuck in the rescheduled state for too long.
IM retries the callback only a limited number of times, and once the retry limit is exhausted there is no way to recover the task.
Such tasks can only be cancelled.