Environment Classes

Overview

FinRL provides several pre-built trading environments that are compatible with OpenAI Gym. These environments handle the trading logic, state representation, and reward calculation.

StockTradingEnv

The primary environment for stock and cryptocurrency trading.

Class Definition

class StockTradingEnv(gym.Env):
    def __init__(
        self,
        df: pd.DataFrame,
        stock_dim: int,
        hmax: int,
        initial_amount: int,
        num_stock_shares: list[int],
        buy_cost_pct: list[float],
        sell_cost_pct: list[float],
        reward_scaling: float,
        state_space: int,
        action_space: int,
        tech_indicator_list: list[str],
        turbulence_threshold=None,
        risk_indicator_col="turbulence",
        make_plots: bool = False,
        print_verbosity=10,
        day=0,
        initial=True,
        previous_state=[],
        model_name="",
        mode="",
        iteration=""
    )

Parameters

Parameter | Type | Description
--- | --- | ---
df | pd.DataFrame | Price data with OHLCV + indicators
stock_dim | int | Number of assets to trade
hmax | int | Maximum shares per trade
initial_amount | int | Starting cash amount
num_stock_shares | list[int] | Initial holdings per asset
buy_cost_pct | list[float] | Buy transaction costs per asset
sell_cost_pct | list[float] | Sell transaction costs per asset
reward_scaling | float | Reward normalization factor
state_space | int | State vector dimensions
action_space | int | Action vector dimensions
tech_indicator_list | list[str] | Technical indicators to include
turbulence_threshold | float | Risk management threshold (optional, default: None)
risk_indicator_col | str | Column name for risk metric (default: "turbulence")
make_plots | bool | Enable automatic plotting (default: False)
print_verbosity | int | Logging frequency (default: 10)
day | int | Starting day index (default: 0)
initial | bool | Initial state flag (default: True)
previous_state | list | Previous state for continuity (default: [])
model_name | str | Model identifier (default: "")
mode | str | Operating mode (default: "")
iteration | str | Iteration identifier (default: "")

State Space

The state space consists of:

State = [cash, prices[], holdings[], technical_indicators[]]

Dimensions: 1 + 2*stock_dim + len(tech_indicators)*stock_dim

Example: For 3 stocks with 5 technical indicators:

- Cash: 1 dimension
- Prices: 3 dimensions
- Holdings: 3 dimensions
- Technical indicators: 15 dimensions (5 indicators × 3 stocks)
- Total: 22 dimensions
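
The dimension formula can be checked with a few lines of Python. This is a minimal sketch; `state_dim` is a hypothetical helper name used for illustration and is not part of FinRL.

```python
def state_dim(stock_dim: int, num_indicators: int) -> int:
    """Length of the state vector: cash + prices + holdings + indicators."""
    cash = 1
    prices = stock_dim
    holdings = stock_dim
    indicators = num_indicators * stock_dim
    return cash + prices + holdings + indicators


# 3 stocks with 5 technical indicators, as in the example above
print(state_dim(stock_dim=3, num_indicators=5))  # 22
```

The result is the value to pass as state_space when constructing the environment.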

Action Space

Actions represent the number of shares to buy or sell for each asset:

- Positive values: Buy shares
- Negative values: Sell shares
- Range: [-hmax, hmax] for each asset
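
One common way to produce share counts in that range is to have the agent emit normalized values and scale them by hmax. The sketch below assumes the raw actions lie in [-1, 1] and that fractional shares are truncated; `to_share_counts` is a hypothetical helper, not a FinRL function.

```python
def to_share_counts(raw_actions, hmax):
    """Scale normalized actions in [-1, 1] to whole-share trade sizes."""
    shares = []
    for a in raw_actions:
        a = max(-1.0, min(1.0, a))    # clip to the assumed normalized range
        shares.append(int(a * hmax))  # int() truncates toward zero
    return shares


print(to_share_counts([0.5, -0.25, 0.0], hmax=100))  # [50, -25, 0]
```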

Reward Function

The default reward is the change in portfolio value:

reward = (end_portfolio_value - begin_portfolio_value) * reward_scaling

Where portfolio value = cash + sum(holdings * current_prices)
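
As a worked illustration of the reward formula, the sketch below values a two-asset portfolio before and after a price move. The function names and the numbers are made up for this example.

```python
def portfolio_value(cash, holdings, prices):
    """cash + sum(holdings * current_prices)"""
    return cash + sum(h * p for h, p in zip(holdings, prices))


def step_reward(begin_value, end_value, reward_scaling):
    """Scaled change in portfolio value over one step."""
    return (end_value - begin_value) * reward_scaling


begin = portfolio_value(1000.0, [10, 5], [20.0, 40.0])  # 1000 + 200 + 200
end = portfolio_value(1000.0, [10, 5], [22.0, 38.0])    # 1000 + 220 + 190
print(begin, end)                                       # 1400.0 1410.0
print(step_reward(begin, end, reward_scaling=1e-4))     # roughly 0.001
```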

Key Methods

reset()

Resets the environment to its initial state and returns the first observation along with an info dictionary.

state, info = env.reset()

step(actions)

Executes the actions and returns the next state, the reward, the terminated and truncated flags, and an info dictionary.

state, reward, terminated, truncated, info = env.step(actions)

render()

Returns the current state for inspection.

current_state = env.render()
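
Taken together, reset() and step() support a standard rollout loop. The sketch below assumes the return signatures shown above; `run_episode` and `CountdownEnv` are hypothetical names, and `CountdownEnv` is only a tiny stand-in so the loop can run without market data.

```python
def run_episode(env, policy):
    """Roll out one episode and return the summed reward."""
    state, info = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        actions = policy(state)
        state, reward, terminated, truncated, info = env.step(actions)
        total_reward += reward
        done = terminated or truncated
    return total_reward


class CountdownEnv:
    """Stand-in environment: pays reward 1.0 per step for 3 steps."""

    def reset(self):
        self.steps_left = 3
        return [0.0], {}

    def step(self, actions):
        self.steps_left -= 1
        return [0.0], 1.0, self.steps_left == 0, False, {}


print(run_episode(CountdownEnv(), policy=lambda state: [0]))  # 3.0
```

With a real StockTradingEnv, the policy would be a trained model's predict call or any function that maps a state to one action per asset.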

Risk Management

Turbulence Control

When market turbulence exceeds the threshold, the environment:

1. Liquidates all positions (sells everything)
2. Prevents new purchases
3. Only allows selling

# Enable turbulence control
env = StockTradingEnv(
    df=df,
    turbulence_threshold=140,  # Liquidate when turbulence > 140
    **other_params
)
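
The turbulence rule can be sketched as a filter on the planned trades. This is an illustration of the behavior described in this section, not the library's implementation; `apply_turbulence_rule` is a hypothetical name, and the strict "greater than" comparison is an assumption taken from the wording "exceeds".

```python
def apply_turbulence_rule(planned_trades, holdings, turbulence, threshold):
    """Return the trades to execute given the current turbulence reading.

    planned_trades: shares to buy (+) or sell (-) per asset
    holdings: shares currently held per asset
    """
    if threshold is None or turbulence <= threshold:
        return list(planned_trades)  # calm market: trade as planned
    # Turbulent market: sell every open position and buy nothing.
    return [-h for h in holdings]


print(apply_turbulence_rule([10, -5], [30, 20], turbulence=150, threshold=140))
# [-30, -20]
print(apply_turbulence_rule([10, -5], [30, 20], turbulence=90, threshold=140))
# [10, -5]
```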

Example Usage

from finrl.meta.env_stock_trading.env_stocktrading import StockTradingEnv
from stable_baselines3.common.vec_env import DummyVecEnv

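# Pass only parameters that StockTradingEnv accepts (unknown kwargs raise TypeError).
# tickers, state_dimensions, indicators, and processed_data are assumed to be defined earlier.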
def create_env(data):
    return StockTradingEnv(
        df=data,
        stock_dim=len(tickers),
        hmax=100,
        initial_amount=1000000,
        num_stock_shares=[0] * len(tickers),  # Start with no holdings
        buy_cost_pct=[0.001] * len(tickers),
        sell_cost_pct=[0.001] * len(tickers),
        reward_scaling=1e-4,
        state_space=state_dimensions,
        action_space=len(tickers),
        tech_indicator_list=indicators,
        turbulence_threshold=140
    )

# Create and wrap environment
vec_env = DummyVecEnv([lambda: create_env(processed_data)])

CryptoEnv

Specialized environment for cryptocurrency trading with advanced features.

Class Definition

class CryptoEnv:
    def __init__(
        self,
        config,
        lookback=1,
        initial_capital=1e6,
        buy_cost_pct=1e-3,
        sell_cost_pct=1e-3,
        gamma=0.99
    )

Key Features

  • Lookback Window: Uses historical data for state representation
  • Action Normalization: Automatically scales actions based on price levels
  • Multi-Crypto Support: Handles multiple cryptocurrency pairs
  • Continuous Trading: Designed for 24/7 crypto markets

State Representation

state = [
    cash_normalized,
    holdings_normalized, 
    technical_indicators[t-lookback:t]
]
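
To make the layout concrete, the sketch below flattens cash, holdings, and the last `lookback` rows of indicators into one list. `build_state` is a hypothetical helper, and the scale factors are illustrative assumptions rather than the values CryptoEnv uses.

```python
def build_state(cash, holdings, tech_rows, t, lookback,
                cash_scale=1e-6, holdings_scale=1e-3):
    """Flatten cash, holdings, and the last `lookback` indicator rows.

    tech_rows[i] is the list of indicator values at time step i.
    """
    if t < lookback:
        raise ValueError("need at least `lookback` rows before time t")
    state = [cash * cash_scale]                      # cash_normalized
    state += [h * holdings_scale for h in holdings]  # holdings_normalized
    for row in tech_rows[t - lookback:t]:            # technical_indicators[t-lookback:t]
        state += list(row)
    return state


tech_rows = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
state = build_state(cash=1e6, holdings=[0, 0], tech_rows=tech_rows, t=3, lookback=2)
print(len(state))  # 7: 1 cash + 2 holdings + 2 rows x 2 indicators
```

A larger lookback grows the state linearly, so it trades richer context against a bigger input for the agent.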

Example Configuration

# Prepare data arrays
price_array = crypto_data.pivot_table(
    index='date', 
    columns='symbol', 
    values='close'
).values

tech_array = get_technical_indicators(crypto_data)

# Environment config
config = {
    "price_array": price_array,
    "tech_array": tech_array
}

# Create crypto environment
crypto_env = CryptoEnv(
    config=config,
    lookback=24,  # 24-step lookback (24 hours for hourly data)
    initial_capital=100000,
    buy_cost_pct=0.001,
    sell_cost_pct=0.001
)

Environment Variants

StockTradingEnvCashPenalty

Penalizes holding too much cash.

StockTradingEnvStopLoss

Includes automatic stop-loss functionality.

PortfolioOptimizationEnv

Optimizes portfolio weights rather than individual trades.

Best Practices

Environment Configuration

Reward Scaling

Use reward_scaling=1e-4 for typical stock prices so that per-step rewards stay in a small numeric range.

Transaction Costs

Set realistic transaction costs: 0.1% (0.001) for stocks, 0.25% (0.0025) for crypto.
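
To see what these percentages mean in cash terms, the sketch below assumes a cost proportional to trade value, added on buys and deducted on sells. The function names are hypothetical and the prices are made up.

```python
def buy_cash_outflow(price, shares, buy_cost_pct):
    """Cash spent to buy `shares`, including the proportional cost."""
    return price * shares * (1 + buy_cost_pct)


def sell_cash_inflow(price, shares, sell_cost_pct):
    """Cash received for selling `shares`, net of the proportional cost."""
    return price * shares * (1 - sell_cost_pct)


# 10 shares at 100.0 with a 0.1% cost on each side
spent = buy_cash_outflow(100.0, 10, 0.001)     # about 1001.0
received = sell_cash_inflow(100.0, 10, 0.001)  # about 999.0
print(round(spent - received, 6))              # round-trip cost, about 2.0
```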

Common Parameter Errors

  • TypeError with unexpected keyword arguments: Ensure you're not passing invalid parameters like day_trading, lookback_window, etc.
  • All required parameters must be provided: df, stock_dim, hmax, initial_amount, num_stock_shares, buy_cost_pct, sell_cost_pct, reward_scaling, state_space, action_space, tech_indicator_list
  • List parameters must match stock_dim: num_stock_shares, buy_cost_pct, sell_cost_pct should have length equal to stock_dim
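
A small pre-flight check can catch the list-length mistake before the environment is constructed. This is a sketch; `check_env_lists` is a hypothetical helper, not part of FinRL.

```python
def check_env_lists(stock_dim, **list_params):
    """Return error messages for list parameters whose length != stock_dim."""
    errors = []
    for name, values in list_params.items():
        if len(values) != stock_dim:
            errors.append(f"{name}: expected length {stock_dim}, got {len(values)}")
    return errors


problems = check_env_lists(
    3,
    num_stock_shares=[0, 0, 0],
    buy_cost_pct=[0.001, 0.001],  # one entry short
    sell_cost_pct=[0.001, 0.001, 0.001],
)
print(problems)  # ['buy_cost_pct: expected length 3, got 2']
```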

State Space Calculation

Always verify state space dimensions match your data:

expected_dims = 1 + 2*stock_dim + len(indicators)*stock_dim

Data Requirements

Ensure your DataFrame has these required columns:

- date: Trading date (YYYY-MM-DD format)
- tic: Asset ticker/symbol
- close: Closing price
- Technical indicators as specified in tech_indicator_list
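
The column requirements can be verified before building the environment. In this sketch `missing_columns` is a hypothetical helper; it accepts any iterable of column names, so `df.columns` from a DataFrame can be passed directly.

```python
def missing_columns(columns, tech_indicator_list):
    """Return the required column names that are absent, in a stable order."""
    required = ["date", "tic", "close", *tech_indicator_list]
    present = set(columns)
    return [name for name in required if name not in present]


print(missing_columns(
    ["date", "tic", "close", "rsi_14"],
    tech_indicator_list=["close_14_sma", "rsi_14"],
))  # ['close_14_sma']
```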

Memory Management

For large datasets, consider:

# Use fewer technical indicators
MINIMAL_INDICATORS = ["close_14_sma", "rsi_14"]

# Reduce data frequency
daily_data = resample_to_daily(minute_data)

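# Use data chunking for training (env_kwargs holds the remaining constructor arguments)
def create_env_chunks(df, chunk_size=10000):
    # groupby yields (key, chunk) pairs, so unpack the chunk
    for _, chunk in df.groupby(df.index // chunk_size):
        # Assumption: the environment looks rows up by a zero-based day index
        chunk = chunk.copy()
        chunk.index = chunk.index - chunk.index.min()
        yield StockTradingEnv(df=chunk, **env_kwargs)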