ddql_optimal_execution.trainer.Trainer

class ddql_optimal_execution.trainer.Trainer(agent: ddql_optimal_execution.agent._ddql.DDQL, env: ddql_optimal_execution.environnement._env.MarketEnvironnement, **kwargs)[source]

This class is used to train a DDQL agent in a given environment.

agent

The agent attribute is an instance of the DDQL class, a reinforcement learning algorithm used for decision making in an environment.

Type

DDQL
env

The env attribute is an instance of the MarketEnvironnement class, which represents the environment in which the agent will interact and learn. It provides the agent with information about the current state of the market and allows it to take actions based on that information.

Type

MarketEnvironnement
exp_replay

The exp_replay attribute is an instance of the ExperienceReplay class, a memory buffer that stores and retrieves experiences for reinforcement learning agents.

Type

ExperienceReplay
pretrain(max_steps: int = 1000, batch_size: int = 32)[source]

This function pretrains a DDQL agent by running random episodes, taking limit actions (selling everything at either the beginning or the end of the episode) and storing the experiences in an experience replay buffer.
train(max_steps: int = 1000, batch_size: int = 32)[source]

This function trains a DDQL agent by running episodes, taking actions based on the current state of the environment, and storing the experiences in an experience replay buffer.
__init__(agent: ddql_optimal_execution.agent._ddql.DDQL, env: ddql_optimal_execution.environnement._env.MarketEnvironnement, **kwargs)[source]

This function initializes the trainer with an agent, an environment, and an experience replay capacity.

Parameters
  • agent (DDQL) – The agent parameter is an instance of the DDQL class, a reinforcement learning algorithm used for decision making in an environment.

  • env (MarketEnvironnement) – The env parameter is an instance of the MarketEnvironnement class, which represents the environment in which the agent will interact and learn. It provides the agent with information about the current state of the market and allows it to take actions based on that information.
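
A minimal construction sketch, assuming both classes can be instantiated with default arguments (their constructor signatures are not shown on this page) and that the replay capacity travels through **kwargs under a hypothetical capacity keyword:

    from ddql_optimal_execution.trainer import Trainer
    from ddql_optimal_execution.agent._ddql import DDQL
    from ddql_optimal_execution.environnement._env import MarketEnvironnement

    # Assumption: default construction works; the real classes may
    # require market data, horizons, or network hyperparameters.
    agent = DDQL()
    env = MarketEnvironnement()

    # `capacity` is a hypothetical keyword name illustrating how the
    # experience replay capacity could be forwarded via **kwargs.
    trainer = Trainer(agent, env, capacity=10_000)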

Methods

__init__(agent, env, **kwargs)

This function initializes the trainer with an agent, an environment, and an experience replay capacity.

fill_exp_replay([max_steps, verbose])

This function fills an experience replay buffer with experiences from random episodes.

pretrain([max_steps, batch_size])

This function pretrains a DDQL agent by running random episodes, taking limit actions (selling everything at either the beginning or the end of the episode) and storing the experiences in an experience replay buffer.

test([max_steps])

train([max_steps, batch_size])

This function trains an agent using the DDQL algorithm and an experience replay buffer.

__random_border_actions(p_bar: Optional[tqdm.tqdm] = None)

This function runs a random episode, taking limit actions (selling everything at either the beginning or the end of the episode) and storing the experiences in an experience replay buffer.

fill_exp_replay(max_steps: int = 1000, verbose: bool = True)[source]

This function fills an experience replay buffer with experiences from random episodes.

Parameters
  • max_steps (int) – The max_steps parameter is the maximum number of steps that the function will take before stopping. It is used to prevent the function from running indefinitely if the experience replay buffer is full.
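
A brief usage sketch, reusing the trainer constructed above:

    # Populate the replay buffer from random episodes before any
    # learning; stop after at most 500 environment steps.
    trainer.fill_exp_replay(max_steps=500, verbose=False)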

pretrain(max_steps: int = 1000, batch_size: int = 32)[source]

This function pretrains a DDQL agent by running random episodes, taking limit actions (selling everything at either the beginning or the end of the episode) and storing the experiences in an experience replay buffer.

Parameters
  • max_steps (int, optional) – The maximum number of steps to pretrain the agent for.

  • batch_size (int, optional) – The number of experiences to sample from the experience replay buffer at each training step.
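
A usage sketch under the same assumptions as the constructor example:

    # Pretrain on random episodes restricted to limit actions,
    # sampling 32 stored experiences per update step.
    trainer.pretrain(max_steps=1000, batch_size=32)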

test(max_steps: int = 1000)[source]

train(max_steps: int = 1000, batch_size: int = 32)[source]

This function trains an agent using the DDQL algorithm and an experience replay buffer.

Parameters
  • max_steps (int, optional) – max_steps is an optional integer parameter that specifies the maximum number of steps to train the agent for. If the number of steps taken during training exceeds this value, the training process will stop.

  • batch_size (int, optional) – batch_size is an optional integer parameter that specifies the number of experiences to sample from the experience replay buffer at each training step.
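
A hedged end-to-end sketch. The calling order below (fill the buffer, pretrain, train, then test) is one plausible workflow, not a sequence this page prescribes:

    trainer.fill_exp_replay(max_steps=500)  # seed the replay buffer
    trainer.pretrain(max_steps=1000, batch_size=32)
    trainer.train(max_steps=1000, batch_size=32)
    # test() is undocumented here; presumably it evaluates the agent.
    trainer.test(max_steps=1000)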