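refactor: migrate storage layer to DuckDB + restructure modules

- Storage layer refactor: HDF5 → DuckDB (UPSERT mode, thread-safe storage)
- Sync class migration: DataSync moved from sync.py to api_daily.py (separation of responsibilities)
- Model module refactor: src/models → src/pipeline (clearer naming)
- New factor modules: factors/momentum (MA, return rank), factors/financial
- New API interfaces: api_namechange, api_bak_basic
- New training entry point: training module (main.py, pipeline configuration)
- Utility functions unified: get_today_date and similar helpers moved to utils.py
- Documentation update: AGENTS.md now records the architecture change history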
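The UPSERT mode named in the commit message can be illustrated with a minimal sketch. It is not the project's storage code: the `daily` table, its columns, and the helpers `upsert_daily` and `read_daily` are hypothetical. The sketch uses DuckDB when it is installed and otherwise falls back to the standard-library `sqlite3` purely as a stand-in, since both accept the same `INSERT ... ON CONFLICT ... DO UPDATE` statement.

```python
try:
    import duckdb as db  # the engine named in the commit message
except ImportError:
    import sqlite3 as db  # stand-in only: accepts the same ON CONFLICT syntax

# Hypothetical schema: one row per (stock code, trade date).
CREATE_SQL = """
CREATE TABLE IF NOT EXISTS daily (
    ts_code VARCHAR,
    trade_date VARCHAR,
    close DOUBLE,
    PRIMARY KEY (ts_code, trade_date)
)
"""

# Insert a row, or overwrite `close` when the key already exists.
UPSERT_SQL = """
INSERT INTO daily (ts_code, trade_date, close) VALUES (?, ?, ?)
ON CONFLICT (ts_code, trade_date) DO UPDATE SET close = excluded.close
"""


def upsert_daily(conn, rows):
    """Write rows so that re-syncing the same day updates instead of duplicating."""
    conn.executemany(UPSERT_SQL, [list(row) for row in rows])


def read_daily(conn):
    """Return all stored rows in a stable order."""
    query = "SELECT ts_code, trade_date, close FROM daily ORDER BY ts_code, trade_date"
    return [tuple(row) for row in conn.execute(query).fetchall()]


conn = db.connect(":memory:")
conn.execute(CREATE_SQL)

# First sync writes two rows; the second sync revises one of them.
upsert_daily(conn, [("000001.SZ", "20240102", 10.5), ("000002.SZ", "20240102", 20.0)])
upsert_daily(conn, [("000001.SZ", "20240102", 10.8)])
rows = read_daily(conn)
```

With this pattern a repeated sync of the same trading day leaves one row per key, which is the property that makes re-running a download safe.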
46
src/training/__init__.py
Normal file
@@ -0,0 +1,46 @@
"""ProStock training pipeline module

This module provides the complete model training workflow:
1. Data processing: Fillna(0) -> Dropna
2. Model training: LightGBM classification model
3. Prediction and stock selection: daily top-5 stock pool

Usage example:
    from src.training import run_training

    # Run the full training workflow
    result = run_training(
        train_start="20180101",
        train_end="20230101",
        test_start="20230101",
        test_end="20240101",
        top_n=5,
        output_path="output/top_stocks.tsv"
    )

Using factors:
    from src.factors import MovingAverageFactor, ReturnRankFactor

    ma5 = MovingAverageFactor(period=5)    # 5-day moving average
    ma10 = MovingAverageFactor(period=10)  # 10-day moving average
    ret5 = ReturnRankFactor(period=5)      # 5-day return rank
"""

from src.training.pipeline import (
    create_pipeline,
    predict_top_stocks,
    prepare_data,
    run_training,
    save_top_stocks,
    train_model,
)

__all__ = [
    # Pipeline functions
    "prepare_data",
    "create_pipeline",
    "train_model",
    "predict_top_stocks",
    "save_top_stocks",
    "run_training",
]
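The third step in the module docstring, building a daily top-N stock pool from model predictions, could look roughly like the sketch below. This is not the body of `predict_top_stocks`, which the diff does not show. The function `select_top_n` and the `(trade_date, ts_code, score)` record layout are assumptions made for illustration.

```python
import heapq
from collections import defaultdict


def select_top_n(scores, top_n=5):
    """Pick the top_n highest-scoring stock codes for each trading day.

    scores: iterable of (trade_date, ts_code, score) tuples (hypothetical layout).
    Returns a dict mapping trade_date to a list of codes, best score first.
    """
    by_date = defaultdict(list)
    for trade_date, ts_code, score in scores:
        by_date[trade_date].append((score, ts_code))

    pool = {}
    for trade_date in sorted(by_date):
        best = heapq.nlargest(top_n, by_date[trade_date])
        pool[trade_date] = [ts_code for _, ts_code in best]
    return pool


# Made-up scores for two trading days.
example_scores = [
    ("20230103", "A", 0.9),
    ("20230103", "B", 0.2),
    ("20230103", "C", 0.7),
    ("20230104", "A", 0.1),
    ("20230104", "B", 0.8),
]
pool = select_top_n(example_scores, top_n=2)
```

Grouping by date before ranking keeps the selection cross-sectional: each day's pool depends only on that day's scores.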
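The two factor classes named in the module docstring are not part of this diff, so the following is only a guess at the arithmetic they stand for: a trailing moving average, and a period return ranked across stocks on one day. The functions `moving_average`, `period_return`, and `cross_sectional_rank` are hypothetical and do not reproduce the `MovingAverageFactor` or `ReturnRankFactor` interfaces.

```python
def moving_average(closes, period):
    """Trailing simple moving average; None until `period` prices are available."""
    out = []
    window_sum = 0.0
    for i, price in enumerate(closes):
        window_sum += price
        if i >= period:
            window_sum -= closes[i - period]  # drop the price leaving the window
        out.append(window_sum / period if i >= period - 1 else None)
    return out


def period_return(closes, period):
    """Return over `period` days: close[t] / close[t - period] - 1."""
    return [
        None if i < period else closes[i] / closes[i - period] - 1.0
        for i in range(len(closes))
    ]


def cross_sectional_rank(returns_by_code):
    """Rank one day's returns across stocks as a fraction in (0, 1].

    Assumption: a higher return gets a higher rank.
    """
    ordered = sorted(returns_by_code, key=returns_by_code.get)
    n = len(ordered)
    return {code: (i + 1) / n for i, code in enumerate(ordered)}


ma3 = moving_average([1.0, 2.0, 3.0, 4.0, 5.0], period=3)
ret1 = period_return([10.0, 20.0, 30.0], period=1)
ranks = cross_sectional_rank({"A": 0.05, "B": -0.02, "C": 0.10})
```

Returning None for the warm-up days leaves missing values explicit, so a later fill or drop step decides how they are handled.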