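refactor: code review fixes - date filtering, performance optimization, data leakage prevention

- Fix date filtering of financial data in data_loader.py so that data can be loaded by date range
- Optimize MADClipper to use a window function instead of a join, improving performance
- Fix the training date boundary issue by adding a 1-day gap between splits to avoid data leakage
- Add .gitignore rules to ignore the training output directory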
@@ -12,13 +12,16 @@ from src.training.pipeline import run_training
 
 if __name__ == "__main__":
     # Run the full training pipeline
-    # Training set: 20180101 - 20230101
-    # Test set: 20230101 - 20240101
+    # Training set: 20190101 - 20231231
+    # Validation set: 20240102 - 20240531 (1-day gap after the training set to avoid data leakage)
+    # Test set: 20240602 - 20241231 (1-day gap after the validation set to avoid data leakage)
     result = run_training(
         train_start="20190101",
-        train_end="20250101",
-        test_start="20250101",
-        test_end="20260101",
+        train_end="20231231",
+        val_start="20240102",
+        val_end="20240531",
+        test_start="20240602",
+        test_end="20241231",
         top_n=5,
         output_path="output/top_stocks.tsv",
     )
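The boundary rule this change applies (each split starts at least one full calendar day after the previous split ends) can be checked mechanically before training starts. The following is a minimal sketch only: `parse_day` and `check_split_gaps` are hypothetical helpers, not part of this repository, and nothing here assumes how `run_training` validates its arguments internally.

```python
from datetime import date, datetime


def parse_day(text: str) -> date:
    """Parse a YYYYMMDD string such as "20240102" into a date."""
    return datetime.strptime(text, "%Y%m%d").date()


def check_split_gaps(splits, min_gap_days=1):
    """Validate an ordered list of (name, start, end) date splits.

    Hypothetical helper for illustration. Returns the number of full
    calendar days lying strictly between each pair of consecutive splits.
    Raises ValueError if a split ends before it starts, or if a gap is
    smaller than min_gap_days (an overlap counts as a negative gap).
    """
    gaps = []
    previous_name, previous_end = None, None
    for name, start_text, end_text in splits:
        start, end = parse_day(start_text), parse_day(end_text)
        if end < start:
            raise ValueError(f"{name}: end {end_text} is before start {start_text}")
        if previous_end is not None:
            # Days strictly between the previous end and this start.
            gap = (start - previous_end).days - 1
            if gap < min_gap_days:
                raise ValueError(
                    f"{previous_name} -> {name}: gap of {gap} day(s), "
                    f"need at least {min_gap_days}"
                )
            gaps.append(gap)
        previous_name, previous_end = name, end
    return gaps


if __name__ == "__main__":
    new_splits = [
        ("train", "20190101", "20231231"),
        ("val", "20240102", "20240531"),
        ("test", "20240602", "20241231"),
    ]
    print(check_split_gaps(new_splits))
```

Under this check, the previous arguments (`train_end="20250101"` followed by `test_start="20250101"`) would be rejected, because the two splits share a boundary day.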