.. _example_workflows:

Example Workflows
=================

This section demonstrates some major workflows that can be performed by PyCFRL.

Preprocessing Only
------------------

.. image:: ../supps/preprocessing_only_workflow.png

In this workflow, PyCFRL takes in an offline trajectory and preprocesses it using :code:`SyntheticPreprocessor`. The final output of the workflow is the preprocessed (debiased) offline trajectory. This workflow is appropriate when the user does not want to train policies using PyCFRL. Instead, the user can take the preprocessed trajectory and train a counterfactually fair policy with another reinforcement learning library or application that better fits their needs.

**Code:** A detailed code demonstration of this workflow can be found `here `_.

Preprocessing + Policy Learning
-------------------------------

.. image:: ../supps/preprocessing_policy_learning_workflow.png

In this workflow, PyCFRL takes in an offline trajectory and preprocesses it using :code:`SequentialPreprocessor`. The preprocessed trajectory is then passed into :code:`FQI` to train a counterfactually fair policy, which is the final output of the workflow. This workflow is appropriate if the user wants to train a policy using PyCFRL. The trained policy can be further evaluated on its value and counterfactual fairness, as discussed in detail in the "Assessing Policies Using Real Data" workflow later in this section.

**Code:** A detailed code demonstration of this workflow can be found `here `_.

Assessing Preprocessors Using Synthetic Data
--------------------------------------------

.. image:: ../supps/synthetic_data_workflow.png

In this workflow, PyCFRL first uses :code:`sample_trajectory()` to sample a trajectory from a :code:`SyntheticEnvironment` whose transition rules are pre-specified. It then preprocesses the sampled trajectory using some :ref:`custom preprocessor ` defined by the user. The preprocessed trajectory is then passed into :code:`FQI` to train a policy, which is assessed using synthetic data via :code:`evaluate_reward_through_simulation()` and :code:`evaluate_fairness_through_simulation()`. The final output of the workflow is the policy trained on the preprocessed data, together with its estimated value and counterfactual fairness metric. This workflow is appropriate when the user wants to examine the impact of a trajectory preprocessing method on the value and counterfactual fairness of the trained policy. A minimal illustrative sketch of this workflow appears at the end of this section.

**Code:** A detailed code demonstration of this workflow can be found `here `_.

Assessing Policies Using Real Data
----------------------------------

.. image:: ../supps/real_data_workflow.png

In this workflow, PyCFRL takes in an offline trajectory and preprocesses it using :code:`SequentialPreprocessor`. The preprocessed trajectory is then passed into :code:`FQI` to train a counterfactually fair policy, which is assessed using :code:`evaluate_reward_through_fqe()` and :code:`evaluate_fairness_through_model()` based on a :code:`SimulatedEnvironment` that mimics the transition rules of the true environment underlying the training trajectory. The final output of the workflow is the policy trained on the preprocessed data, together with its estimated value and counterfactual fairness metric. This workflow is appropriate when the user is interested in the value and counterfactual fairness achieved by the trained policy when interacting with the true underlying environment.
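The outline below is a minimal, non-authoritative sketch of this workflow. The class and function names are taken from the description above, but the import paths, constructor arguments, method calls, and array shapes are assumptions made for illustration and may not match the actual PyCFRL API; the linked code demonstration and the API reference give the exact usage.

.. code-block:: python

   # Sketch only: import paths, constructor arguments, and method names below are
   # assumptions for illustration; consult the API reference for exact signatures.
   import numpy as np
   from pycfrl.preprocessor import SequentialPreprocessor
   from pycfrl.agents import FQI
   from pycfrl.environment import SimulatedEnvironment
   from pycfrl.evaluation import (
       evaluate_reward_through_fqe,
       evaluate_fairness_through_model,
   )

   # Placeholder offline trajectory with hypothetical shapes: N trajectories of
   # T time steps, with zs the sensitive attribute.
   N, T = 500, 10
   zs = np.random.binomial(1, 0.5, size=(N, 1))
   states = np.random.normal(size=(N, T + 1, 1))
   actions = np.random.binomial(1, 0.5, size=(N, T))
   rewards = np.random.normal(size=(N, T))

   # Debias the offline trajectory.
   preprocessor = SequentialPreprocessor()                                # arguments omitted
   states_tilde = preprocessor.preprocess(zs, states, actions, rewards)  # assumed method

   # Train a counterfactually fair policy on the preprocessed trajectory.
   agent = FQI()                                                          # arguments omitted
   agent.train(zs, states_tilde, actions, rewards)                        # assumed method

   # Fit a model of the true environment from the training trajectory, then
   # estimate the policy's value and counterfactual fairness metric.
   env_model = SimulatedEnvironment()                                     # arguments omitted
   env_model.fit(zs, states, actions, rewards)                            # assumed method
   value = evaluate_reward_through_fqe(
       zs, states, actions, rewards, policy=agent
   )  # assumed signature
   fairness = evaluate_fairness_through_model(
       env_model, policy=agent, preprocessor=preprocessor
   )  # assumed signature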
**Code:** A detailed code demonstration of this workflow can be found `here `_.

**Conceptual Explanation:** A step-by-step conceptual explanation of this workflow can be found `here `_.
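As referenced under "Assessing Preprocessors Using Synthetic Data", the following is a minimal sketch of that workflow. Only the class and function names come from this section; the import paths, constructor arguments, method names, and the :code:`IdentityPreprocessor` stub are illustrative assumptions and may not match the actual PyCFRL API.

.. code-block:: python

   # Sketch only: import paths, arguments, and method names are assumptions; the
   # identity preprocessor below is a stand-in for a user-defined custom preprocessor.
   from pycfrl.environment import SyntheticEnvironment, sample_trajectory
   from pycfrl.agents import FQI
   from pycfrl.evaluation import (
       evaluate_reward_through_simulation,
       evaluate_fairness_through_simulation,
   )

   class IdentityPreprocessor:
       """Hypothetical custom preprocessor that returns the states unchanged."""
       def preprocess(self, zs, states, actions, rewards):
           return states

   # Sample a trajectory from an environment with pre-specified transition rules.
   env = SyntheticEnvironment()                             # transition rules omitted
   zs, states, actions, rewards = sample_trajectory(env)    # assumed signature

   # Preprocess with the user-defined preprocessor, then train a policy via FQI.
   preprocessor = IdentityPreprocessor()
   states_tilde = preprocessor.preprocess(zs, states, actions, rewards)
   agent = FQI()                                            # arguments omitted
   agent.train(zs, states_tilde, actions, rewards)          # assumed method

   # Estimate the trained policy's value and counterfactual fairness by
   # simulating new data from the known synthetic environment.
   value = evaluate_reward_through_simulation(
       env, policy=agent
   )  # assumed signature
   fairness = evaluate_fairness_through_simulation(
       env, policy=agent, preprocessor=preprocessor
   )  # assumed signature

Running this pipeline with different preprocessors and comparing the resulting value and fairness estimates is one way to carry out the comparison described in that workflow.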