228 lines
6.8 KiB
Markdown
228 lines
6.8 KiB
Markdown
|
|
# 代码审查报告 - Factor 框架
|
|||
|
|
|
|||
|
|
**审查日期**: 2026-02-22
|
|||
|
|
**审查范围**: `src/factors/` 模块及测试代码
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 变更概述
|
|||
|
|
|
|||
|
|
| 类型 | 文件 |
|
|||
|
|
|------|------|
|
|||
|
|
| **已暂存** | `.kilocode/rules/project_rules.md` - Git 提交规范文档 |
|
|||
|
|
| | `.kilocode/rules/python-development-guidelines.md` - Python 开发规范扩展 |
|
|||
|
|
| **未跟踪** | `src/factors/` - 新增因子计算框架 |
|
|||
|
|
| | `tests/factors/` - 对应测试文件 |
|
|||
|
|
| | `docs/` - 文档目录 |
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 文档变更(已暂存)
|
|||
|
|
|
|||
|
|
✅ 无问题。Git 提交规范添加符合项目风格,格式清晰。
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 新增代码审查(factors 模块)
|
|||
|
|
|
|||
|
|
### 1. 严重问题
|
|||
|
|
|
|||
|
|
#### 1.1 `engine.py:306-323` - 交易日偏移实现假设错误
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
def _get_trading_date_offset(self, date: str, offset: int) -> str:
|
|||
|
|
from datetime import datetime, timedelta
|
|||
|
|
dt = datetime.strptime(date, "%Y%m%d")
|
|||
|
|
new_dt = dt + timedelta(days=offset)
|
|||
|
|
return new_dt.strftime("%Y%m%d")
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
**问题**:简单使用日历日偏移,假设每天都是交易日。A股市场有周末和节假日,这会导致:
|
|||
|
|
- 偏移计算不准确
|
|||
|
|
- `lookback_days` 实际不等于交易天数
|
|||
|
|
- 可能加载过多或过少的历史数据
|
|||
|
|
|
|||
|
|
**建议**:使用真实的交易日历,或至少跳过周末。
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
#### 1.2 `base.py:137-147` - 乘法运算符类型检查不完整
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
def __mul__(self, other):
|
|||
|
|
if isinstance(other, (int, float)):
|
|||
|
|
from src.factors.composite import ScalarFactor
|
|||
|
|
return ScalarFactor(self, float(other), "*")
|
|||
|
|
elif isinstance(other, BaseFactor):
|
|||
|
|
from src.factors.composite import CompositeFactor
|
|||
|
|
return CompositeFactor(self, other, "*")
|
|||
|
|
return NotImplemented
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
**问题**:`float` 类型的负数会匹配 `int` 分支,但 `bool` 是 `int` 的子类,会被错误匹配:
|
|||
|
|
```python
|
|||
|
|
factor * True # 返回 ScalarFactor(factor, 1.0, "*") - 可能不是预期行为
|
|||
|
|
factor * False # 返回 ScalarFactor(factor, 0.0, "*") - 可能不是预期行为
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
**建议**:显式排除 `bool` 类型:
|
|||
|
|
```python
|
|||
|
|
if isinstance(other, (int, float)) and not isinstance(other, bool):
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
#### 1.3 `engine.py:60-72` - compute 方法缺少必需参数验证 ✅ 已修复
|
|||
|
|
|
|||
|
|
**修复内容**:
|
|||
|
|
- 为截面因子添加了 `start_date` 和 `end_date` 必填参数验证
|
|||
|
|
- 为时序因子添加了 `stock_codes`、`start_date`、`end_date` 必填参数验证
|
|||
|
|
- 参数缺失时抛出明确的 `ValueError`,指出缺少哪些参数
|
|||
|
|
|
|||
|
|
**修复代码**:
|
|||
|
|
```python
|
|||
|
|
if factor.factor_type == "cross_sectional":
|
|||
|
|
if "start_date" not in kwargs or "end_date" not in kwargs:
|
|||
|
|
raise ValueError(
|
|||
|
|
"cross_sectional factor requires 'start_date' and 'end_date' parameters"
|
|||
|
|
)
|
|||
|
|
elif factor.factor_type == "time_series":
|
|||
|
|
missing = []
|
|||
|
|
if "stock_codes" not in kwargs:
|
|||
|
|
missing.append("stock_codes")
|
|||
|
|
if "start_date" not in kwargs:
|
|||
|
|
missing.append("start_date")
|
|||
|
|
if "end_date" not in kwargs:
|
|||
|
|
missing.append("end_date")
|
|||
|
|
if missing:
|
|||
|
|
raise ValueError(
|
|||
|
|
f"time_series factor requires parameters: {', '.join(missing)}"
|
|||
|
|
)
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
### 2. 中等问题
|
|||
|
|
|
|||
|
|
#### 2.1 `base.py:92-101` - 参数验证时机问题 ✅ 已修复
|
|||
|
|
|
|||
|
|
**修复内容**:
|
|||
|
|
- 移除了 `__init__` 中自动调用的 `self._validate_params()`
|
|||
|
|
- 更新了 `_validate_params()` 的文档,明确说明子类如需自定义验证,需自行在子类 `__init__` 中调用
|
|||
|
|
- 添加了关于 `data_specs` 必须在类级别定义的说明
|
|||
|
|
|
|||
|
|
**修复代码**:
|
|||
|
|
```python
|
|||
|
|
def __init__(self, **params):
|
|||
|
|
"""初始化因子参数
|
|||
|
|
|
|||
|
|
注意:data_specs 必须在类级别定义(类属性),
|
|||
|
|
而非在 __init__ 中设置。data_specs 的验证在
|
|||
|
|
__init_subclass__ 中完成(类创建时)。
|
|||
|
|
"""
|
|||
|
|
self.params = params
|
|||
|
|
|
|||
|
|
def _validate_params(self):
|
|||
|
|
"""验证参数有效性
|
|||
|
|
|
|||
|
|
子类可覆盖此方法进行自定义验证(需自行在子类 __init__ 中调用)。
|
|||
|
|
基类实现为空,表示不执行任何验证。
|
|||
|
|
|
|||
|
|
注意:由于 data_specs 在类创建时通过 __init_subclass__ 验证,
|
|||
|
|
不应在实例级别修改。如需动态 data_specs,请使用参数化模式:
|
|||
|
|
...
|
|||
|
|
"""
|
|||
|
|
pass
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
#### 2.2 `engine.py:161-169` - 静默修复长度不匹配
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
if len(factor_values) != len(today_stocks):
|
|||
|
|
# 尝试从 factor_data 重新提取
|
|||
|
|
cs_data = factor_data.get_cross_section()
|
|||
|
|
if len(cs_data) > 0:
|
|||
|
|
today_stocks = cs_data["ts_code"]
|
|||
|
|
if len(factor_values) != len(today_stocks):
|
|||
|
|
factor_values = pl.Series([None] * len(today_stocks)) # 静默填充 null!
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
**问题**:静默返回 null 值可能掩盖因子计算中的逻辑错误。开发者难以发现因子实现问题。
|
|||
|
|
|
|||
|
|
**建议**:
|
|||
|
|
- 至少记录警告日志
|
|||
|
|
- 或在开发环境抛出异常
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
#### 2.3 `data_spec.py:14` - 使用 `frozen=True` 但通过 `object.__setattr__` 绕过 ✅ 已修复
|
|||
|
|
|
|||
|
|
**修复内容**:
|
|||
|
|
- 更新了 `__post_init__` 的注释,准确说明 `frozen=True` 的含义
|
|||
|
|
- 说明本类仅做验证,无需修改字段,因此直接 raise ValueError 即可
|
|||
|
|
- 补充说明如需在 `__post_init__` 中修改字段,可使用 `object.__setattr__`
|
|||
|
|
|
|||
|
|
**修复代码**:
|
|||
|
|
```python
|
|||
|
|
def __post_init__(self):
|
|||
|
|
"""验证约束条件
|
|||
|
|
|
|||
|
|
验证项:
|
|||
|
|
1. lookback_days >= 1(至少包含当日)
|
|||
|
|
2. columns 必须包含 ts_code 和 trade_date
|
|||
|
|
3. source 不能为空字符串
|
|||
|
|
|
|||
|
|
注意:由于 frozen=True,实例创建后不可修改。
|
|||
|
|
若需要在 __post_init__ 中修改字段(如有),可使用 object.__setattr__。
|
|||
|
|
本类仅做验证,无需修改字段,因此直接 raise ValueError 即可。
|
|||
|
|
"""
|
|||
|
|
if self.lookback_days < 1:
|
|||
|
|
raise ValueError(f"lookback_days must be >= 1, got {self.lookback_days}")
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
### 3. 轻微问题 / 优化建议
|
|||
|
|
|
|||
|
|
#### 3.1 `engine.py` - 缺少类型注解
|
|||
|
|
|
|||
|
|
`compute()` 方法返回类型注解为 `pl.DataFrame`,但 `_compute_cross_sectional` 和 `_compute_time_series` 返回类型未标注。
|
|||
|
|
|
|||
|
|
#### 3.2 `data_spec.py:42` - 默认值 `lookback_days=1` 语义
|
|||
|
|
|
|||
|
|
注释说 "包含当日",但 `lookback_days=1` 实际只包含 `[T]`,这与注释中 `lookback_days=5` 表示 `[T-4, T]` 一致。
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 测试覆盖
|
|||
|
|
|
|||
|
|
测试覆盖良好,共 97 个测试用例,覆盖:
|
|||
|
|
- 因子基类验证
|
|||
|
|
- 组合因子运算
|
|||
|
|
- 数据加载器
|
|||
|
|
- 数据规格定义
|
|||
|
|
- 引擎执行逻辑
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## 总结
|
|||
|
|
|
|||
|
|
| 状态 | 严重 | 中等 | 轻微 |
|
|||
|
|
|------|------|------|------|
|
|||
|
|
| 修复前 | 3 | 3 | 2 |
|
|||
|
|
| 修复后 | 2 | 1 | 2 |
|
|||
|
|
|
|||
|
|
**已修复问题**:
|
|||
|
|
- ✅ 1.3 compute 方法参数验证
|
|||
|
|
- ✅ 2.1 参数验证时机问题
|
|||
|
|
- ✅ 2.3 frozen dataclass 注释
|
|||
|
|
|
|||
|
|
**待修复问题**:
|
|||
|
|
- ⚠️ 1.1 交易日偏移实现(严重)
|
|||
|
|
- ⚠️ 1.2 乘法运算符 bool 边界(严重)
|
|||
|
|
- ⚠️ 2.2 静默修复长度不匹配(中等)
|
|||
|
|
|
|||
|
|
Factor 框架整体设计良好,测试覆盖全面。建议修复剩余 2 个严重问题后再合并代码。
|