ddql_optimal_execution.trainer.Trainer

class ddql_optimal_execution.trainer.Trainer(agent: ddql_optimal_execution.agent._ddql.DDQL, env: ddql_optimal_execution.environnement._env.MarketEnvironnement, **kwargs)[source]

This class is used to train a DDQL agent in a given environment.

agent

The agent attribute is an instance of the DDQL class, a reinforcement learning algorithm used for decision making in an environment.

Type

DDQL
env

The env attribute is an instance of the MarketEnvironnement class, which represents the environment in which the agent will interact and learn. It provides the agent with information about the current state of the market and allows it to take actions based on that information.

Type

MarketEnvironnement
exp_replay

The exp_replay attribute is an instance of the ExperienceReplay class, a memory buffer that stores and retrieves experiences for reinforcement learning agents.

Type

ExperienceReplay
pretrain(max_steps: int = 1000, batch_size: int = 32)[source]

This function pretrains a DDQL agent by running random episodes, taking limit actions (selling everything at either the beginning or the end of the episode) and storing the experiences in an experience replay buffer.
train(max_steps: int = 1000, batch_size: int = 32)[source]

This function trains a DDQL agent by running episodes, taking actions based on the current state of the environment, and storing the experiences in an experience replay buffer.
__init__(agent: ddql_optimal_execution.agent._ddql.DDQL, env: ddql_optimal_execution.environnement._env.MarketEnvironnement, **kwargs)[source]

This function initializes the trainer with an agent, an environment, and an experience replay capacity.

Parameters
  • agent (DDQL) – The agent parameter is an instance of the DDQL class, a reinforcement learning algorithm used for decision making in an environment.

  • env (MarketEnvironnement) – The env parameter is an instance of the MarketEnvironnement class, which represents the environment in which the agent will interact and learn. It provides the agent with information about the current state of the market and allows it to take actions based on that information.
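
A minimal construction sketch, assuming both classes can be instantiated with default arguments (their constructor signatures are not shown on this page) and that the replay capacity travels through **kwargs under a hypothetical capacity keyword:

    from ddql_optimal_execution.trainer import Trainer
    from ddql_optimal_execution.agent._ddql import DDQL
    from ddql_optimal_execution.environnement._env import MarketEnvironnement

    # Assumption: default construction works; the real classes may
    # require market data, horizons, or network hyperparameters.
    agent = DDQL()
    env = MarketEnvironnement()

    # `capacity` is a hypothetical keyword name illustrating how the
    # experience replay capacity could be forwarded via **kwargs.
    trainer = Trainer(agent, env, capacity=10_000)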

Methods

__init__(agent, env, **kwargs)

This function initializes the trainer with an agent, an environment, and an experience replay capacity.

fill_exp_replay([max_steps, verbose])

This function fills an experience replay buffer with experiences from random episodes.

pretrain([max_steps, batch_size])

This function pretrains a DDQL agent by running random episodes, taking limit actions (selling everything at either the beginning or the end of the episode) and storing the experiences in an experience replay buffer.

test([max_steps])

train([max_steps, batch_size])

This function trains an agent using the DDQL algorithm and an experience replay buffer.

__random_border_actions(p_bar: Optional[tqdm.tqdm] = None)

This function runs a random episode, taking limit actions (selling everything at either the beginning or the end of the episode) and storing the experiences in an experience replay buffer.

fill_exp_replay(max_steps: int = 1000, verbose: bool = True)[source]

This function fills an experience replay buffer with experiences from random episodes.

Parameters
  • max_steps (int) – The max_steps parameter is the maximum number of steps that the function will take before stopping. It is used to prevent the function from running indefinitely if the experience replay buffer is full.
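
A brief usage sketch, reusing the trainer constructed above:

    # Populate the replay buffer from random episodes before any
    # learning; stop after at most 500 environment steps.
    trainer.fill_exp_replay(max_steps=500, verbose=False)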

pretrain(max_steps: int = 1000, batch_size: int = 32)[source]

This function pretrains a DDQL agent by running random episodes, taking limit actions (selling everything at either the beginning or the end of the episode) and storing the experiences in an experience replay buffer.

Parameters
  • max_steps (int, optional) – The maximum number of steps to pretrain the agent for.

  • batch_size (int, optional) – The number of experiences to sample from the experience replay buffer at each training step.
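
A usage sketch under the same assumptions as the constructor example:

    # Pretrain on random episodes restricted to limit actions,
    # sampling 32 stored experiences per update step.
    trainer.pretrain(max_steps=1000, batch_size=32)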

test(max_steps: int = 1000)[source]

train(max_steps: int = 1000, batch_size: int = 32)[source]

This function trains an agent using the DDQL algorithm and an experience replay buffer.

Parameters
  • max_steps (int, optional) – max_steps is an optional integer parameter that specifies the maximum number of steps to train the agent for. If the number of steps taken during training exceeds this value, the training process will stop.

  • batch_size (int, optional) – batch_size is an optional integer parameter that specifies the number of experiences to sample from the experience replay buffer at each training step.
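
A hedged end-to-end sketch. The calling order below (fill the buffer, pretrain, train, then test) is one plausible workflow, not a sequence this page prescribes:

    trainer.fill_exp_replay(max_steps=500)  # seed the replay buffer
    trainer.pretrain(max_steps=1000, batch_size=32)
    trainer.train(max_steps=1000, batch_size=32)
    # test() is undocumented here; presumably it evaluates the agent.
    trainer.test(max_steps=1000)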