Supervisory Control under Local Mean Payoff Constraints
Abstract
The paper investigates quantitative supervisory control with a local mean payoff objective for weighted discrete event system. The supervisor is designed to ensure that the mean payoff of weights over a fixed number of transitions never drops below a given threshold, aka the system stability. The algorithms transfer the supervisory control problem to a biparty game between the supervisor and the environment.
Automaton Model
The weighted discrete event system can be modeled as an automaton. Here shows an example. The state with “!” is unsafe ($x_8$). The set of controllable events is $E_c=\{a,b,c,d,e,f\}$, and the set of uncontrollable events is $E_{uc}=\{u_1,u_2,u_3,u_4,u_5\}$. The localhost website gui is supported by the WebMachine from transitions_gui package.
1 | def visualizeMachine(states, transitions, initial, name, ordered_transitions=False, ignore_invalid_triggers=True, auto_transitions=False): |
Algorithm 1
The first algorithm is to transform the automaton model to a biparty game. It has two steps: insert transition states into the original automaton, and remove deadlocking or unsafe states. We can use DFS to find out all transitions, and generally the inserted transition states of one original state are the combination of all controllable events at that state plus the uncontrollable ones.
1 | def DoDFS(y, X, x0, f, Ec, Euc, w, Qy, Qz, G, fyz, fzy): |
Algorithm 2
The second algorithm is to find the supervisor’s winning region. It first calculates the window mean payoff function which tracks the best worst-case accumulative weights that the supervisor may achieve from a state within N event occurrences (window size). Then the goal equivalents to adopting a control strategy to obtain a nonnegative accumulative payoff while the environment aims to spoil that goal by enforcing negative payoffs. We can use divide-and-conquer to find out all stable windows where the local window mean payoffs are all nonnegative. The example has a window size of three.
1 | def StableWindow(Qy, Qz, fyz, fzy, w, N): |
Algorithm 3
The third algorithm is to reconstruct and unfold the result of the second algorithm. It has two steps. The first one is to merge all stable windows in the second algorithm together, and add original states back to the automaton so that each original state has only one transit-out arrow. The second one is to remove all the inserted environment states to simplify the automaton and finally give a stable workflow.
1 | def Unfold(y0u, Qyu, Qzu, fyzu, fzyu, Qy, Qz, fyz, fzy): |