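- Storage layer refactor: HDF5 → DuckDB (UPSERT mode, thread-safe storage)
- Sync class migration: DataSync moved from sync.py to api_daily.py (separation of responsibilities)
- Model module refactor: src/models → src/pipeline (clearer naming)
- New factor modules: factors/momentum (MA, return rank), factors/financial
- New API interfaces: api_namechange, api_bak_basic
- New training entry point: training module (main.py, pipeline configuration)
- Utility functions unified: get_today_date and others moved to utils.py
- Documentation update: AGENTS.md now includes the architecture change history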
47 lines
1.1 KiB
Python
"""ProStock training pipeline module

This module provides the complete model training workflow:
1. Data processing: Fillna(0) -> Dropna
2. Model training: LightGBM classification model
3. Prediction and stock selection: daily top-5 stock pool

Usage example:
    from src.training import run_training

    # Run the full training workflow
    result = run_training(
        train_start="20180101",
        train_end="20230101",
        test_start="20230101",
        test_end="20240101",
        top_n=5,
        output_path="output/top_stocks.tsv"
    )

Factor usage:
    from src.factors import MovingAverageFactor, ReturnRankFactor

    ma5 = MovingAverageFactor(period=5)    # 5-day moving average
    ma10 = MovingAverageFactor(period=10)  # 10-day moving average
    ret5 = ReturnRankFactor(period=5)      # 5-day return rank
"""

from src.training.pipeline import (
    create_pipeline,
    predict_top_stocks,
    prepare_data,
    run_training,
    save_top_stocks,
    train_model,
)

__all__ = [
    # Pipeline functions
    "prepare_data",
    "create_pipeline",
    "train_model",
    "predict_top_stocks",
    "save_top_stocks",
    "run_training",
]
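The module docstring names three steps (fill then drop, train, pick the daily top N) and a moving-average factor, but does not show what they compute. The sketch below illustrates those ideas with the standard library only. It is not ProStock's implementation: `moving_average`, `clean_rows`, and `top_n_per_day` are hypothetical names, the LightGBM training step is omitted and model scores are assumed to be given, and reading "Fillna(0) -> Dropna" as "fill missing features with 0, then drop rows whose label is still missing" is an assumption.

```python
from collections import defaultdict
from typing import Optional


def moving_average(prices: list[float], period: int) -> list[Optional[float]]:
    """Trailing mean of the last `period` prices; None until the window fills."""
    out: list[Optional[float]] = []
    for i in range(len(prices)):
        if i + 1 < period:
            out.append(None)  # not enough history yet
        else:
            out.append(sum(prices[i + 1 - period : i + 1]) / period)
    return out


def clean_rows(rows: list[dict], feature_keys: list[str], label_key: str) -> list[dict]:
    """Assumed reading of Fillna(0) -> Dropna: fill features, then drop unlabeled rows."""
    filled = []
    for row in rows:
        fixed = dict(row)  # copy so the caller's rows are not mutated
        for key in feature_keys:
            if fixed.get(key) is None:
                fixed[key] = 0.0
        filled.append(fixed)
    return [row for row in filled if row.get(label_key) is not None]


def top_n_per_day(scored, top_n: int = 5) -> dict[str, list[str]]:
    """Group (trade_date, code, score) tuples by date and keep the top_n codes by score."""
    by_day = defaultdict(list)
    for trade_date, code, score in scored:
        by_day[trade_date].append((score, code))
    result = {}
    for trade_date in sorted(by_day):
        ranked = sorted(by_day[trade_date], key=lambda pair: pair[0], reverse=True)
        result[trade_date] = [code for _, code in ranked[:top_n]]
    return result
```

In this sketch, a day with fewer than `top_n` scored stocks simply returns all of them, and ties keep their input order because Python's sort is stable.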