Zubr model

To do its job, Zubr needs a model of the world: it must know how its decisions affect the world, i.e. how the future world state depends on the current state and on the optimizer's choices.

Imagine a world described by a single input variable, alpha, whose value is either FALSE or TRUE. Let us introduce an output variable, beta, that selects one of two actions:

  • DO_NOTHING - alpha remains as it is now
  • MOVE - alpha flips to its negation (FALSE becomes TRUE and TRUE becomes FALSE)

Let us further assume that the optimizer "likes" alpha=TRUE, and "dislikes" alpha=FALSE. For this simple world our Zubr program will look as follows:


       %option getmodelprobability own // method getModelProbability
       %option getpayoff own // method getPayoff
       %%
       values
       {
            value FALSE, TRUE, DO_NOTHING, MOVE;
       }
       variables
       {
            input variable alpha:{FALSE, TRUE};
            output variable beta:{DO_NOTHING, MOVE};
       }
       %%
       // Probability of moving from state (vs1, s1) to state (vs2, s2)
       // when action a is taken.
       protected float getModelProbability(VisibleState vs1, State s1,
                 Action a, VisibleState vs2, State s2) {
            switch (a.getVariableValue("beta")) {
                 case DO_NOTHING:
                      // alpha must stay the same
                      if (vs1.getVariableValue("alpha") ==
                       vs2.getVariableValue("alpha"))
                           return 1.0f;
                      else
                           return 0.0f;
                 case MOVE:
                      // alpha must flip
                      if (vs1.getVariableValue("alpha")
                      != vs2.getVariableValue("alpha"))
                           return 1.0f;
                      else
                           return 0.0f;
            }
            return 0.0f;
       }
       protected float getPayoff(VisibleState vs) {
            switch (vs.getVariableValue("alpha"))
            {
                 case FALSE:
                      return 0.0f;
                 case TRUE:
                      return 100.0f; // TRUE is "better" than FALSE
            }
            return 0.0f;
       }
       // end of example

In the method getModelProbability you can query the parameters vs1 and vs2 (VisibleState) for input variable values, s1 and s2 (State) for hidden variable values, and a (Action) for output variable values. Each of these classes defines the method getVariableValue, which returns a Value (an enum containing all the Zubr values).
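
To sanity-check a model like the one above outside of Zubr, the same logic can be reproduced in plain Java. The following is only a minimal sketch: the class ToyWorldModel, its Value enum and all of its methods are hypothetical stand-ins written for this illustration, not Zubr's generated classes. It assumes that, for a fixed current state and action, the probabilities over all successor values of alpha should add up to 1, and it also computes the expected payoff of the next state for each action.

```java
import java.util.EnumSet;
import java.util.Set;

// Stand-alone illustration only; none of these types are Zubr's own classes.
final class ToyWorldModel {
    // Hypothetical stand-in for the Value enum that holds all Zubr values.
    enum Value { FALSE, TRUE, DO_NOTHING, MOVE }

    // Domains of the input variable alpha and the output variable beta.
    static final Set<Value> ALPHA_DOMAIN = EnumSet.of(Value.FALSE, Value.TRUE);
    static final Set<Value> BETA_DOMAIN = EnumSet.of(Value.DO_NOTHING, Value.MOVE);

    // Probability that alpha changes from alpha1 to alpha2 under action beta.
    static float modelProbability(Value alpha1, Value beta, Value alpha2) {
        switch (beta) {
            case DO_NOTHING:
                return alpha1 == alpha2 ? 1.0f : 0.0f; // alpha stays as it is
            case MOVE:
                return alpha1 != alpha2 ? 1.0f : 0.0f; // alpha flips
            default:
                return 0.0f; // not an action
        }
    }

    // Payoff of a state: TRUE is "better" than FALSE.
    static float payoff(Value alpha) {
        return alpha == Value.TRUE ? 100.0f : 0.0f;
    }

    // Sum of the transition probabilities over every successor value of alpha.
    static float totalProbability(Value alpha1, Value beta) {
        float sum = 0.0f;
        for (Value alpha2 : ALPHA_DOMAIN) {
            sum += modelProbability(alpha1, beta, alpha2);
        }
        return sum;
    }

    // Expected payoff of the next state when beta is chosen in state alpha1.
    static float expectedPayoff(Value alpha1, Value beta) {
        float expected = 0.0f;
        for (Value alpha2 : ALPHA_DOMAIN) {
            expected += modelProbability(alpha1, beta, alpha2) * payoff(alpha2);
        }
        return expected;
    }

    public static void main(String[] args) {
        // Print the probability total and expected payoff for each (state, action) pair.
        for (Value alpha1 : ALPHA_DOMAIN) {
            for (Value beta : BETA_DOMAIN) {
                System.out.println("alpha=" + alpha1 + " beta=" + beta
                        + " total=" + totalProbability(alpha1, beta)
                        + " expectedPayoff=" + expectedPayoff(alpha1, beta));
            }
        }
    }
}
```

A check of this kind can catch a model in which some action leaves part of the probability mass unassigned. It says nothing about how Zubr itself chooses actions.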