tree 8390254c70043a6ba3636851c83092aaa3f947b1
parent 5a91f5fb742d4a6fb95def0ad632c7c36c139a5e
author pierventre <pier@opennetworking.org> 1643111199 +0100
committer Pier Luigi Ventre <pier@opennetworking.org> 1645743971 +0000

[SDFAB-954] Non-leader instance can mark the pipeline UNKNOWN.

The probe task is not performed atomically: between the initial
mastership check and the actual probe, execution can block many
times and the mastership can change. Recheck the mastership after
the pipeline probe returns.
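
For illustration only, the recheck can be sketched as below. This is
a minimal sketch, not the code in this patch: the class, enum and
method names are hypothetical, and the mastership check and the probe
are passed in as plain suppliers.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch: none of these names come from the ONOS code base.
final class ProbeRecheckSketch {

    enum Outcome {
        SKIPPED_NOT_MASTER,        // initial check failed, probe never ran
        DISCARDED_MASTERSHIP_LOST, // probe ran, but mastership changed meanwhile
        MARKED_AVAILABLE,          // still master, probe succeeded
        MARKED_UNKNOWN             // still master, probe failed
    }

    static Outcome probeTask(BooleanSupplier isMaster,
                             BooleanSupplier probePipeline) {
        // Initial mastership check.
        if (!isMaster.getAsBoolean()) {
            return Outcome.SKIPPED_NOT_MASTER;
        }
        // The probe may block for a long time (e.g. network partition).
        boolean ready = probePipeline.getAsBoolean();
        // Recheck: act on the probe result only if we are still master.
        if (!isMaster.getAsBoolean()) {
            return Outcome.DISCARDED_MASTERSHIP_LOST;
        }
        return ready ? Outcome.MARKED_AVAILABLE : Outcome.MARKED_UNKNOWN;
    }
}
```

With the second check in place, a probe that fails after mastership
has moved elsewhere no longer changes the pipeline state.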

We are seeing an issue when a network partition occurs: the watchdog
is stuck for 60s before returning and then marks the device offline.
In the meantime, however, the mastership has passed to another
instance, which is already connected and has already marked the
device online. A harmless side effect of this change is that when we
return from the pipeline config we might no longer be the master,
which in the worst case delays marking the device online by 15s
(the next reconcile interval).

Additionally, this patch simplifies the Manager by removing the
executor lock and by using only one worker per device. This change
also prevents the exhaustion of all workers, which can easily happen
if a network partition prevents the probe from returning
immediately.
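
One possible reading of "one worker per device" is sketched below.
It assumes a single-threaded executor per device id; the class and
method names are hypothetical and the actual Manager may be
structured differently.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: not the Manager implementation from this patch.
final class PerDeviceWorkersSketch {

    // One single-threaded executor per device id (assumed to be a String).
    private final Map<String, ExecutorService> workers =
            new ConcurrentHashMap<>();

    // Tasks for the same device run one at a time, in submission order,
    // so no per-device lock is needed around them.
    Future<?> submit(String deviceId, Runnable task) {
        return workers
                .computeIfAbsent(deviceId,
                                 id -> Executors.newSingleThreadExecutor())
                .submit(task);
    }

    // Interrupts running tasks and stops all workers.
    void shutdown() {
        workers.values().forEach(ExecutorService::shutdownNow);
    }
}
```

In this arrangement a probe that blocks on one device occupies only
that device's worker, so tasks for other devices keep running.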

Change-Id: I3429cd0598c95589e50f35139f6087f83ceb60f2
