Reinforcement Learning for Quantitative US Equity Trading
From raw market data to autonomous live execution — in four phases.
Architecture
Each phase builds on the previous. Start from Phase 1, go as far as you need.
**Phase 1: Data Pipeline.** Multi-source market data ingestion with disk caching. 88 features per stock, including technical indicators, macro regime signals, and FinBERT sentiment. Rolling z-score normalization with zero look-ahead bias.
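The zero look-ahead property comes from normalizing each value against a window of strictly earlier observations, so the current value never contributes to its own statistics. A minimal sketch, not the repo's actual implementation:

```python
from collections import deque
from math import sqrt

def rolling_zscore(values, window=20):
    """Rolling z-score with no look-ahead: the value at time t is
    normalized using only observations t-window .. t-1."""
    history = deque(maxlen=window)
    scores = []
    for x in values:
        if len(history) < window:
            scores.append(None)  # warm-up: not enough history yet
        else:
            mean = sum(history) / window
            var = sum((h - mean) ** 2 for h in history) / window
            std = sqrt(var) or 1e-8  # guard against zero variance
            scores.append((x - mean) / std)
        history.append(x)  # appended AFTER scoring, so x never leaks into its own stats
    return scores
```

Appending to the history only after scoring is what guarantees the no-leakage property; the warm-up `None`s should be dropped before training.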
**Phase 2: RL Training.** PPO with an MLP baseline or a Transformer policy. Per-stock temporal encoder with CLS-token aggregation, plus cross-asset attention for correlation modeling. 4-stage curriculum from calm bull markets to full crisis.
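A staged curriculum can be expressed as a schedule that maps training progress to the active stage. The stage names, regime sets, and volatility scales below are illustrative assumptions, not the repo's actual configuration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CurriculumStage:
    name: str
    regimes: tuple      # hypothetical: market regimes sampled at this stage
    vol_scale: float    # hypothetical: volatility stress multiplier

# Illustrative 4-stage schedule from calm bull markets to full crisis.
CURRICULUM = (
    CurriculumStage("calm_bull",   ("bull",),                  vol_scale=0.5),
    CurriculumStage("mixed",       ("bull", "sideways"),       vol_scale=1.0),
    CurriculumStage("stressed",    ("sideways", "bear"),       vol_scale=1.5),
    CurriculumStage("full_crisis", ("bull", "bear", "crisis"), vol_scale=2.0),
)

def stage_for_progress(progress: float) -> CurriculumStage:
    """Map training progress in [0, 1] to the active curriculum stage."""
    idx = min(int(progress * len(CURRICULUM)), len(CURRICULUM) - 1)
    return CURRICULUM[idx]
```

The trainer would call `stage_for_progress(steps_done / total_steps)` each rollout and sample environments from the returned stage's regimes.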
**Phase 3: Live Execution.** Alpaca paper and live trading via a unified broker interface. 7-layer hard risk constraints enforced independently of the agent. FastAPI inference server with 15 endpoints. Prometheus + Grafana monitoring stack.
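The key design point is that hard risk checks run after the agent's decision and can override it unconditionally, so a misbehaving policy can never bypass them. A minimal sketch with a few placeholder constraints; the actual seven layers and their limits will differ:

```python
def apply_risk_layer(order_qty, price, state):
    """Clamp or veto an agent-proposed order. The specific limits
    below are illustrative placeholders, not the repo's real constraints."""
    notional = abs(order_qty) * price
    # Layer 1 (illustrative): kill switch past a max daily drawdown.
    if state["daily_drawdown"] <= -0.05:
        return 0
    # Layer 2 (illustrative): per-order notional cap.
    if notional > state["max_order_notional"]:
        order_qty = int(state["max_order_notional"] / price) * (1 if order_qty > 0 else -1)
    # Layer 3 (illustrative): single-name concentration cap at 10% of equity.
    max_shares = int(0.10 * state["equity"] / price)
    if abs(state["position"] + order_qty) > max_shares:
        order_qty = (max_shares - state["position"]) if order_qty > 0 \
                    else (-max_shares - state["position"])
    return order_qty
```

Because the layer is a pure function of broker state, it can be unit-tested in isolation from the policy network.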
**Phase 4: Autonomous Operations.** CUSUM + KS-test drift detection triggers Champion/Challenger retraining, with Welch's t-test gating promotion. UCB1 ensemble voting with disagreement-aware position scaling. 7 scheduled jobs; Slack and email alerts.
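CUSUM drift detection accumulates deviations of a monitored statistic (e.g. standardized prediction residuals) and fires once the cumulative sum crosses a threshold. A minimal one-sided-pair sketch, with illustrative default parameters rather than tuned ones:

```python
class CusumDetector:
    """Two one-sided CUSUM statistics on a stream of standardized values.
    `threshold` and `drift` are illustrative defaults, not tuned settings."""

    def __init__(self, threshold=5.0, drift=0.5):
        self.threshold = threshold  # fire when a cumulative sum exceeds this
        self.drift = drift          # slack that absorbs normal fluctuation
        self.pos = 0.0
        self.neg = 0.0

    def update(self, x: float) -> bool:
        """Feed one observation; return True when drift is detected."""
        self.pos = max(0.0, self.pos + x - self.drift)
        self.neg = max(0.0, self.neg - x - self.drift)
        if self.pos > self.threshold or self.neg > self.threshold:
            self.pos = self.neg = 0.0  # reset after firing
            return True
        return False
```

In an autonomy loop, a `True` from `update()` would be the event that kicks off Challenger retraining; the KS-test would serve as a complementary distribution-level check.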
Demo
Illustrative results using synthetic data. Replace with your real backtest output.
| Metric | RL Agent | SPY B&H |
|---|---|---|
| Annual Return | +23.4% | +11.8% |
| Sharpe Ratio | 1.34 | 0.72 |
| Max Drawdown | -12.3% | -33.9% |
| Calmar Ratio | 1.90 | 0.35 |
| Alpha vs SPY | +11.6% | — |
| Win Rate | 57.2% | 53.1% |
* SIMULATED RESULTS USING SYNTHETIC DATA — NOT REAL TRADING PERFORMANCE
Quickstart
Free API keys are available for all data sources. No paid subscriptions are required for Phases 1–2.
1. Clone the repo, create a virtual environment, and install dependencies (Python 3.10+ required).
2. Register free accounts at Polygon.io, Finnhub, and Alpaca, then copy `.env.example` and fill in your keys.
3. Start with Phase 1 validation, train a PPO agent in Phase 2, then launch the full autonomous system.
Tech Stack
Built on production-grade libraries; no reinventing the wheel.