Custom Agents
In addition to FQI, users can also define custom decision-making agents and use them in PyCFRL. A custom agent often represents a pre-specified decision rule (e.g., making decisions randomly) or implements a reinforcement learning algorithm not provided by PyCFRL. To ensure a custom agent is compatible with PyCFRL, it must inherit from the Agent class provided by the agents module. That is,

- The custom agent should be a subclass of Agent.
- The custom agent should have an act() method whose name, parameter names, parameter data types, parameter default values, and return type are exactly as defined in the Agent class, except that it may take some additional arguments. The input and output lists or arrays should also follow the same Trajectory Array format or have the same shapes as those defined in Agent.
For example, the following RandomAgent, though simple, is a valid custom agent that is compatible with PyCFRL.
import numpy as np
from pycfrl.agents import Agent  # assuming Agent is importable from PyCFRL's agents module

class RandomAgent(Agent):
    def __init__(self, p: int | float = 0.5) -> None:
        self.p = p

    def act(
        self,
        z: list | np.ndarray,
        xt: list | np.ndarray,
        xtm1: list | np.ndarray | None = None,
        atm1: list | np.ndarray | None = None,
        uat: list | np.ndarray | None = None,
        verbose: bool = False
    ) -> np.ndarray:
        if verbose:
            print("RandomAgent taking actions...")
        # Draw one Bernoulli(p) action per individual in the batch.
        N = np.array(z).shape[0]
        u = np.random.uniform(0, 1, size=N)
        actions = (u < self.p).astype(int)
        return actions
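As a quick sanity check, a custom agent can be called directly before handing it to PyCFRL. The snippet below is a minimal sketch that assumes z and xt are arrays with one row per individual; the specific shapes and values are illustrative, not required by PyCFRL.

# Hypothetical batch of 5 individuals: z holds sensitive attributes,
# xt holds current states. Shapes here are for illustration only.
z = np.zeros((5, 1))
xt = np.random.normal(size=(5, 2))

agent = RandomAgent(p=0.5)
actions = agent.act(z=z, xt=xt, verbose=True)
print(actions)  # one binary action per row of z, e.g. [1 0 1 1 0]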
On the other hand, the following agent is not compatible with PyCFRL because its act() method does not include uat in its argument list. uat should be included here to ensure compatibility, even though it is not used inside the function.
class RandomAgent(Agent):
    def __init__(self, p: int | float = 0.5) -> None:
        self.p = p

    # Incompatible: uat is missing from the argument list.
    def act(
        self,
        z: list | np.ndarray,
        xt: list | np.ndarray,
        xtm1: list | np.ndarray | None = None,
        atm1: list | np.ndarray | None = None,
        verbose: bool = False
    ) -> np.ndarray:
        if verbose:
            print("RandomAgent taking actions...")
        N = np.array(z).shape[0]
        u = np.random.uniform(0, 1, size=N)
        actions = (u < self.p).astype(int)
        return actions
A valid custom agent can be used wherever an FQI agent can be used. For example, we can pass it to evaluate_fairness_through_model() to calculate its counterfactual fairness metric.
# Suppose e is a SimulatedEnvironment object that has already been trained.
# Also suppose zs, states, actions is a trajectory from the environment
# on which e is trained.
agent = RandomAgent(p=0.2)
evaluate_fairness_through_model(
    env=e,
    zs=zs,
    states=states,
    actions=actions,
    policy=agent
)