.. _trajectory_arrays:

Trajectory Arrays
===============================

In PyCFRL, a trajectory refers to the set of observed tuples
:math:`\{(z_i, s_{i0}, a_{i0}, r_{i0}, s_{i1}, \dots, s_{i,T-1}, a_{i,T-1}, r_{i,T-1}, s_{iT}): i=1,\dots,N\}`
describing the sensitive attribute, state, action, and reward of each individual (or subject) at each time step. Here :math:`N` is the total number of individuals and :math:`T` is the total number of transitions of each individual. Each :math:`t=0,\dots,T` is called a time step, and for :math:`t=0,\dots,T-1` the observed tuple :math:`(s_{it}, a_{it}, r_{it}, s_{i,t+1})` is called the transition for individual :math:`i` at time step :math:`t`.

This section introduces Trajectory Arrays, the representation PyCFRL uses for trajectories. Any trajectory satisfying the :ref:`data requirements ` can be represented by Trajectory Arrays. The trajectory inputs and outputs of PyCFRL functions and classes are all in the form of Trajectory Arrays. To convert trajectory data from a tabular format to Trajectory Arrays or from Trajectory Arrays to a tabular format, see :ref:`Tabular Trajectory Data `.

The sensitive attributes, states, actions, and rewards in a trajectory are represented by Trajectory Arrays of different formats, which are introduced below. In the examples, superscripts index vector components, and the state :math:`s_{it}` is written :math:`x_{it}` to match the :code:`xdim` shape parameter.

Sensitive Attributes Format
--------------------------------

A Trajectory Array in the Sensitive Attributes Format stores the observed sensitive attributes of each individual in the trajectory. It is a 2D list or array with shape :code:`(N, zdim)`, where :code:`zdim` is the number of components in the sensitive attribute vector. The (i, j)-th entry of the list or array represents the j-th component of the observed sensitive attribute of the i-th individual. Note that if the sensitive attribute is univariate, then a Trajectory Array in the Sensitive Attributes Format should have shape :code:`(N, 1)` rather than :code:`(N,)`.

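
The shape requirement for a univariate sensitive attribute can be illustrated with NumPy. The following is a minimal sketch, not part of PyCFRL itself: the values and the variable names :code:`z_flat`, :code:`z`, and :code:`z_list` are hypothetical, and only standard NumPy calls are used.

```python
import numpy as np

# Hypothetical univariate sensitive attribute for N = 3 individuals.
z_flat = np.array([0, 1, 1])   # shape (3,): lacks the component axis

# Add the component axis so that zdim = 1 is explicit.
z = z_flat.reshape(-1, 1)      # shape (3, 1): one row per individual

print(z_flat.shape)
print(z.shape)

# The same data as a 2D nested list, one inner list per individual.
z_list = z.tolist()
print(z_list)
```

Reshaping with :code:`reshape(-1, 1)` keeps one row per individual and makes the single component an explicit second axis.
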
For example, consider a trajectory dataset with 3 individuals where the sensitive attribute is bivariate. Then the sensitive attributes of this trajectory can be represented in the Sensitive Attributes Format as

+---------------+---------------+
| :math:`z_1^1` | :math:`z_1^2` |
+---------------+---------------+
| :math:`z_2^1` | :math:`z_2^2` |
+---------------+---------------+
| :math:`z_3^1` | :math:`z_3^2` |
+---------------+---------------+

Single-time States Format
--------------------------------

A Trajectory Array in the Single-time States Format stores the state of each individual in the trajectory at a single time step. It is a 2D list or array with shape :code:`(N, xdim)`, where :code:`xdim` is the number of components in the state vector. The (i, j)-th entry of the list or array represents the j-th component of the state variable of the i-th individual at the given time step. Note that if the state vector is univariate, then a Trajectory Array in the Single-time States Format should have shape :code:`(N, 1)` rather than :code:`(N,)`.

For example, consider a trajectory dataset with 3 individuals where the state variable is bivariate. Then the states of this trajectory at some time step :math:`t` can be represented in the Single-time States Format as

+------------------+------------------+
| :math:`x_{1t}^1` | :math:`x_{1t}^2` |
+------------------+------------------+
| :math:`x_{2t}^1` | :math:`x_{2t}^2` |
+------------------+------------------+
| :math:`x_{3t}^1` | :math:`x_{3t}^2` |
+------------------+------------------+

Full-trajectory States Format
--------------------------------

A Trajectory Array in the Full-trajectory States Format stores the state of each individual in the trajectory at all time steps. It is a 3D list or array with shape :code:`(N, T+1, xdim)`, where :code:`xdim` is the number of components in the state vector.
The (i, j, k)-th entry of the list or array represents the k-th component of the state variable of the i-th individual at the j-th time step. Note that if the state vector is univariate, then a Trajectory Array in the Full-trajectory States Format should have shape :code:`(N, T+1, 1)` rather than :code:`(N, T+1)`.

For example, consider a trajectory dataset with 3 individuals and 3 transitions where the state variable is bivariate. Then the states of this trajectory at all time steps can be represented in the Full-trajectory States Format as

+------------------------------+------------------------------+------------------------------+------------------------------+
| :math:`[x_{10}^1, x_{10}^2]` | :math:`[x_{11}^1, x_{11}^2]` | :math:`[x_{12}^1, x_{12}^2]` | :math:`[x_{13}^1, x_{13}^2]` |
+------------------------------+------------------------------+------------------------------+------------------------------+
| :math:`[x_{20}^1, x_{20}^2]` | :math:`[x_{21}^1, x_{21}^2]` | :math:`[x_{22}^1, x_{22}^2]` | :math:`[x_{23}^1, x_{23}^2]` |
+------------------------------+------------------------------+------------------------------+------------------------------+
| :math:`[x_{30}^1, x_{30}^2]` | :math:`[x_{31}^1, x_{31}^2]` | :math:`[x_{32}^1, x_{32}^2]` | :math:`[x_{33}^1, x_{33}^2]` |
+------------------------------+------------------------------+------------------------------+------------------------------+

Single-time Actions Format
--------------------------------

A Trajectory Array in the Single-time Actions Format stores the action of each individual in the trajectory at a single time step. It is a 1D list or array with shape :code:`(N,)`. The i-th entry of the list or array represents the action of the i-th individual at the given time step.

For example, consider a trajectory dataset with 3 individuals.
Then the actions of this trajectory at some time step :math:`t` can be represented in the Single-time Actions Format as

+----------------+----------------+----------------+
| :math:`a_{1t}` | :math:`a_{2t}` | :math:`a_{3t}` |
+----------------+----------------+----------------+

Full-trajectory Actions Format
--------------------------------

A Trajectory Array in the Full-trajectory Actions Format stores the action of each individual in the trajectory at all time steps. It is a 2D list or array with shape :code:`(N, T)`. The (i, j)-th entry of the list or array represents the action of the i-th individual at the j-th time step.

For example, consider a trajectory dataset with 3 individuals and 3 transitions. Then the actions of this trajectory at all time steps can be represented in the Full-trajectory Actions Format as

+----------------+----------------+----------------+
| :math:`a_{10}` | :math:`a_{11}` | :math:`a_{12}` |
+----------------+----------------+----------------+
| :math:`a_{20}` | :math:`a_{21}` | :math:`a_{22}` |
+----------------+----------------+----------------+
| :math:`a_{30}` | :math:`a_{31}` | :math:`a_{32}` |
+----------------+----------------+----------------+

Single-time Rewards Format
--------------------------------

A Trajectory Array in the Single-time Rewards Format stores the reward of each individual in the trajectory at a single time step. It is a 1D list or array with shape :code:`(N,)`. The i-th entry of the list or array represents the reward of the i-th individual at the given time step.

For example, consider a trajectory dataset with 3 individuals.
Then the rewards of this trajectory at some time step :math:`t` can be represented in the Single-time Rewards Format as

+----------------+----------------+----------------+
| :math:`r_{1t}` | :math:`r_{2t}` | :math:`r_{3t}` |
+----------------+----------------+----------------+

Full-trajectory Rewards Format
--------------------------------

A Trajectory Array in the Full-trajectory Rewards Format is used to store the reward of each individual in the trajectory at all time steps. It is a 2D list or array with shape :code:`(N, T)`. The (i, j)-th entry of the list or array represents the reward of the i-th individual at the j-th time step.

For example, consider a trajectory dataset with 3 individuals and 3 transitions. Then the rewards of this trajectory at all time steps can be represented in the Full-trajectory Rewards Format as

+----------------+----------------+----------------+
| :math:`r_{10}` | :math:`r_{11}` | :math:`r_{12}` |
+----------------+----------------+----------------+
| :math:`r_{20}` | :math:`r_{21}` | :math:`r_{22}` |
+----------------+----------------+----------------+
| :math:`r_{30}` | :math:`r_{31}` | :math:`r_{32}` |
+----------------+----------------+----------------+
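
The shapes described in this section can be collected in one NumPy sketch. It is only an illustration under stated assumptions: the values are random placeholders, the binary coding of actions and sensitive attributes is assumed for the example, all variable names are hypothetical, and no PyCFRL function is called. Reading the single-time arrays off the full-trajectory arrays by slicing follows from the entry definitions above rather than from a documented PyCFRL routine.

```python
import numpy as np

# Toy dimensions matching the examples: 3 individuals, 3 transitions,
# bivariate sensitive attribute and bivariate state.
N, T, zdim, xdim = 3, 3, 2, 2
rng = np.random.default_rng(0)  # placeholder values only

zs = rng.integers(0, 2, size=(N, zdim))     # Sensitive Attributes Format
states = rng.normal(size=(N, T + 1, xdim))  # Full-trajectory States Format
actions = rng.integers(0, 2, size=(N, T))   # Full-trajectory Actions Format
rewards = rng.normal(size=(N, T))           # Full-trajectory Rewards Format

# Single-time arrays at time step t, obtained by slicing the time axis.
t = 1
states_t = states[:, t, :]         # Single-time States Format, shape (N, xdim)
actions_t = actions[:, t]          # Single-time Actions Format, shape (N,)
rewards_t = rewards[:, t]          # Single-time Rewards Format, shape (N,)
next_states_t = states[:, t + 1, :]  # states at t + 1, completing each transition

for name, arr in [("zs", zs), ("states", states), ("actions", actions),
                  ("rewards", rewards), ("states_t", states_t),
                  ("actions_t", actions_t), ("rewards_t", rewards_t)]:
    print(name, arr.shape)
```

Slicing with :code:`states[:, t, :]` keeps the trailing component axis, so a univariate state still yields shape :code:`(N, 1)` rather than :code:`(N,)`.
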