tree 4eef2553df0b330f2cc84319be0c305d5d99a8cb
parent 25c627864175b2eaa432682d92f53914fcf53d5e
author pierventre <pier@opennetworking.org> 1643111199 +0100
committer pierventre <pier@opennetworking.org> 1645743722 -0800

[SDFAB-954] Non-leader instance can mark the pipeline UNKNOWN.

The probe task is not performed atomically: between the initial
mastership check and the actual probe, execution can block many
times and the mastership can change. Recheck the mastership after
the pipeline probe returns.
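
A minimal sketch of the recheck pattern described above, not the
actual patch. The class, enum and parameter names are hypothetical,
and the mastership lookup and the probe are stand-in suppliers. The
probe result is dropped when mastership was lost while it blocked:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical illustration only; none of these names come from the patch.
public class ProbeRecheckSketch {

    public enum Outcome {
        SKIPPED_NOT_MASTER,
        DISCARDED_MASTERSHIP_LOST,
        MARKED_ONLINE,
        MARKED_OFFLINE
    }

    // isMaster: stand-in for the mastership lookup.
    // probe: stand-in for the pipeline probe, which may block for a long time.
    public static Outcome probeAndUpdate(BooleanSupplier isMaster,
                                         BooleanSupplier probe) {
        // Initial mastership check, before starting the probe.
        if (!isMaster.getAsBoolean()) {
            return Outcome.SKIPPED_NOT_MASTER;
        }
        // Mastership can move to another instance while this call blocks.
        boolean available = probe.getAsBoolean();
        // Recheck: a stale result must not overwrite the new master's state.
        if (!isMaster.getAsBoolean()) {
            return Outcome.DISCARDED_MASTERSHIP_LOST;
        }
        return available ? Outcome.MARKED_ONLINE : Outcome.MARKED_OFFLINE;
    }
}
```

In this sketch a failed probe that returns after mastership has moved
yields DISCARDED_MASTERSHIP_LOST rather than MARKED_OFFLINE, which is
the case the network partition scenario hits.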

We are seeing an issue when a network partition occurs: the watchdog
is stuck for 60s before returning and then marks the device offline.
Meanwhile, however, the mastership has passed to another instance,
which is already connected and has already marked the device online.
A harmless side effect of this change is that when we return from
the pipeline config we might no longer be the master; in the worst
case this delays marking the device online by 15s (the next
reconcile interval).

Additionally, this patch simplifies the Manager by removing the
executor lock and by using only one worker per device. This change
also prevents the exhaustion of all workers, which can easily happen
if a network partition prevents the probe from returning
immediately.
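
One possible reading of "one worker per device", sketched under
assumptions: each device id maps to its own single-threaded executor,
so a probe stuck on one device cannot consume workers needed by other
devices. The class and method names are hypothetical and the real
Manager may be structured differently:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; not the Manager's actual code.
public class PerDeviceWorkerSketch {

    // One single-threaded executor per device id, created lazily.
    private final Map<String, ExecutorService> workers =
            new ConcurrentHashMap<>();

    // Tasks for the same device run one at a time, in submission order.
    // A task blocked on one device does not delay tasks of other devices.
    public Future<?> submit(String deviceId, Runnable task) {
        return workers
                .computeIfAbsent(deviceId,
                        id -> Executors.newSingleThreadExecutor())
                .submit(task);
    }

    // Interrupts running tasks and stops all per-device workers.
    public void shutdown() {
        workers.values().forEach(ExecutorService::shutdownNow);
    }
}
```

Because tasks for a given device are serialized by its single thread,
no extra lock is needed in this sketch to order them.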

Change-Id: I3429cd0598c95589e50f35139f6087f83ceb60f2
