ProStock/docs/plans/2026-04-08-step3-llm-prompt-local-dsl.md

# Step 3: LLM Prompt Rework (Generating the Local DSL Directly) Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

**Goal:** Rework FactorMiner's LLM prompt and output parser so that they generate the local snake_case DSL directly instead of the CamelCase, `$`-prefixed DSL, and remove the runtime translation layer.

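**Architecture:** The prompt uses the snake_case function and field names supported by the local `FactorEngine`; `OutputParser` only extracts strings and applies light cleaning, and no longer calls FactorMiner's `ExpressionTree` parser; `factor_generator.py` is adapted to return the raw DSL string.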

**Tech Stack:** Python, ProStock `src.factors` local DSL (`FactorEngine`)

---

## Task 1: Rewrite `src/factorminer/agent/prompt_builder.py`

**Files:**
- Modify: `src/factorminer/agent/prompt_builder.py`
- Test: `tests/test_factorminer_prompt.py`

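**Step 1: Rewrite the field-list function `_format_feature_list()`**

Replace the `$`-prefixed fields with local field names and add notes on how the derived ones are computed: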

```python
def _format_feature_list() -> str:
    descriptions = {
        "open": "opening price",
        "high": "highest price",
        "low": "lowest price",
        "close": "closing price",
        "vol": "trading volume (shares)",
        "amount": "turnover (currency amount)",
        "vwap": "compute as amount / vol",
        "returns": "compute as close / ts_delay(close, 1) - 1",
    }
    lines = []
    for feat, desc in descriptions.items():
        lines.append(f"  {feat}: {desc}")
    return "\n".join(lines)
```
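
The two derived entries can be checked by hand. The sketch below uses plain Python lists and made-up numbers; `ts_delay` here is a hypothetical list-based stand-in written for this illustration, not the `FactorEngine` implementation.

```python
from typing import List, Optional


def ts_delay(series: List[float], periods: int) -> List[Optional[float]]:
    """Hypothetical stand-in: lag a series by `periods`, padding the start with None."""
    return ([None] * periods + series)[: len(series)]


# Made-up sample data for three consecutive bars of one stock
close = [10.0, 11.0, 9.9]
vol = [100.0, 200.0, 400.0]
amount = [1000.0, 2100.0, 4000.0]

# vwap = amount / vol
vwap = [a / v for a, v in zip(amount, vol)]

# returns = close / ts_delay(close, 1) - 1 (undefined on the first bar)
returns = [
    None if prev is None else round(c / prev - 1, 4)
    for c, prev in zip(close, ts_delay(close, 1))
]

print(vwap)     # [10.0, 10.5, 10.0]
print(returns)  # [None, 0.1, -0.1]
```

Rounding to four decimals is only there to keep the printed values readable.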

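**Step 2: Define the local DSL operator table**

Add a `LOCAL_OPERATOR_TABLE` constant to `prompt_builder.py` that lists the locally available operators to show in the prompt, grouped by category, instead of iterating over `OPERATOR_REGISTRY`: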

```python
LOCAL_OPERATOR_TABLE = {
    "ARITHMETIC": [
        ("+", "binary", "x + y"),
        ("-", "binary/unary", "x - y or -x"),
        ("*", "binary", "x * y"),
        ("/", "binary", "x / y"),
        ("**", "binary", "x ** y (power)"),
        (">", "binary", "x > y (comparison, returns 0/1)"),
        ("<", "binary", "x < y (comparison, returns 0/1)"),
        ("abs(x)", "unary", "absolute value"),
        ("sign(x)", "unary", "sign function"),
        ("max_(x, y)", "binary", "element-wise maximum"),
        ("min_(x, y)", "binary", "element-wise minimum"),
        ("clip(x, lower, upper)", "unary with parameters", "clip to [lower, upper]"),
        ("log(x)", "unary", "natural logarithm"),
        ("sqrt(x)", "unary", "square root"),
        ("exp(x)", "unary", "exponential function"),
    ],
    "TIMESERIES": [
        ("ts_mean(x, window)", "unary + window", "rolling mean"),
        ("ts_std(x, window)", "unary + window", "rolling standard deviation"),
        ("ts_var(x, window)", "unary + window", "rolling variance"),
        ("ts_max(x, window)", "unary + window", "rolling maximum"),
        ("ts_min(x, window)", "unary + window", "rolling minimum"),
        ("ts_sum(x, window)", "unary + window", "rolling sum"),
        ("ts_delay(x, periods)", "unary + periods", "lag by N periods"),
        ("ts_delta(x, periods)", "unary + periods", "N-period difference"),
        ("ts_corr(x, y, window)", "binary + window", "rolling correlation"),
        ("ts_cov(x, y, window)", "binary + window", "rolling covariance"),
        ("ts_pct_change(x, periods)", "unary + periods", "N-period percentage change"),
        ("ts_ema(x, window)", "unary + window", "exponential moving average"),
        ("ts_wma(x, window)", "unary + window", "weighted moving average"),
        ("ts_skew(x, window)", "unary + window", "rolling skewness"),
        ("ts_kurt(x, window)", "unary + window", "rolling kurtosis"),
        ("ts_rank(x, window)", "unary + window", "rolling percentile rank"),
    ],
    "CROSS_SECTIONAL": [
        ("cs_rank(x)", "unary", "cross-sectional rank (percentile)"),
        ("cs_zscore(x)", "unary", "cross-sectional z-score standardization"),
        ("cs_demean(x)", "unary", "cross-sectional demeaning"),
        ("cs_neutralize(x, group)", "unary", "industry / market-cap neutralization"),
        ("cs_winsorize(x, lower, upper)", "unary", "cross-sectional winsorization"),
    ],
    "CONDITIONAL": [
        ("if_(condition, true_val, false_val)", "ternary", "conditional selection"),
        ("where(condition, true_val, false_val)", "ternary", "alias of if_"),
    ],
}
```

Then rewrite `_format_operator_table()`:

```python
def _format_operator_table() -> str:
    lines = []
    for cat_name, ops in LOCAL_OPERATOR_TABLE.items():
        lines.append(f"\n### {cat_name} operators")
        for op_sig, arity, desc in ops:
            lines.append(f"- {op_sig}: {desc} ({arity})")
    return "\n".join(lines)
```
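
To see what the model actually receives, the same formatting loop can be run on a trimmed table. `SAMPLE_TABLE` and the standalone `format_operator_table` are names made up for this sketch; the loop body mirrors the function above.

```python
# Trimmed, illustrative table with one operator per category
SAMPLE_TABLE = {
    "TIMESERIES": [("ts_mean(x, window)", "unary + window", "rolling mean")],
    "CROSS_SECTIONAL": [("cs_rank(x)", "unary", "cross-sectional rank")],
}


def format_operator_table(table: dict) -> str:
    """Render each category as a '###' heading followed by one bullet per operator."""
    lines = []
    for cat_name, ops in table.items():
        lines.append(f"\n### {cat_name} operators")
        for op_sig, arity, desc in ops:
            lines.append(f"- {op_sig}: {desc} ({arity})")
    return "\n".join(lines)


rendered = format_operator_table(SAMPLE_TABLE)
print(rendered)
```

The rendered string starts with a newline, so interpolating it under `## OPERATOR LIBRARY` leaves one blank line before the first category heading.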

**Step 3: Rewrite `SYSTEM_PROMPT`**

Replace the syntax-rules section and the examples:

```python
SYSTEM_PROMPT = f"""You are a quantitative researcher mining formulaic alpha factors for stock selection.

Your goal is to generate novel, predictive factor expressions using the local ProStock DSL. Each factor is a composition of operators applied to raw market features.

## RAW FEATURES (leaf nodes)
{_format_feature_list()}

## OPERATOR LIBRARY
{_format_operator_table()}

## EXPRESSION SYNTAX RULES
1. Expressions use Python-style infix operators: +, -, *, /, **, >, <
2. Function calls use snake_case names with comma-separated arguments: ts_mean(close, 20)
3. Window sizes and periods are numeric arguments placed last in function calls.
4. Valid window sizes are integers, typically in range [2, 250].
5. Cross-sectional operators (cs_rank, cs_zscore, cs_demean) operate across all stocks at each time step -- they are crucial for making factors comparable.
6. Do NOT use $ prefix for features. Use `close`, `vol`, `amount`, etc. directly.
7. `vwap` is not a raw feature; use `amount / vol` if you need it.
8. `returns` is not a raw feature; use `close / ts_delay(close, 1) - 1` if you need returns.

## EXAMPLES OF WELL-FORMED FACTORS
- -cs_rank(ts_delta(close, 5))
  Short-term reversal: rank of 5-day price change, negated.
- cs_zscore((vol - ts_mean(vol, 20)) / ts_std(vol, 20))
  Volume surprise: standardized deviation from 20-day mean volume.
- cs_rank((close - amount / vol) / (amount / vol))
  Intraday deviation from VWAP, cross-sectionally ranked.
- -ts_corr(vol, close, 10)
  Negative price-volume correlation over 10 days.
- if_(close / ts_delay(close, 1) - 1 > 0, ts_std(close / ts_delay(close, 1) - 1, 10), -ts_std(close / ts_delay(close, 1) - 1, 10))
  Conditional volatility: positive for up-moves, negative for down-moves.
- cs_rank((close - ts_min(low, 20)) / (ts_max(high, 20) - ts_min(low, 20)))
  Position within 20-day price range, ranked.

## KEY PRINCIPLES FOR HIGH-QUALITY FACTORS
- Always wrap the outermost expression with a cross-sectional operator (cs_rank, cs_zscore) for comparability.
- Combine DIFFERENT operator types for novelty (e.g., time-series + cross-sectional + arithmetic).
- Use diverse window sizes; avoid always defaulting to 10.
- Explore uncommon feature combinations (amount, amount/vol are underused).
- Factors with depth 3-7 tend to be best: deep enough to capture non-trivial patterns but not so deep they overfit.
- Prefer economically meaningful combinations over random nesting.
- IMPORTANT: Avoid operators that are NOT listed above (e.g., Decay, TsLinRegSlope, HMA, DEMA, Resid). If you use them, the factor will be rejected.
"""
```
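
The plan keeps parsing lightweight, but the last rule in the prompt (unlisted operators get rejected) can be pre-checked cheaply if one assumes that every DSL expression is also a syntactically valid Python expression, which rule 1 suggests but this plan does not guarantee. The helper `find_unknown_names` and the abbreviated `ALLOWED_NAMES` set below are hypothetical and not part of the planned changes.

```python
import ast

# Assumed allow-list, abbreviated from the feature and operator lists in the prompt
ALLOWED_NAMES = {
    "open", "high", "low", "close", "vol", "amount",
    "cs_rank", "cs_zscore", "ts_mean", "ts_std",
    "ts_delta", "ts_delay", "ts_corr", "if_",
}


def find_unknown_names(expr: str) -> set:
    """Return identifiers in `expr` that are not on the allow-list.

    Raises SyntaxError when `expr` is not a valid Python expression,
    which is what happens for a `$close`-style token.
    """
    tree = ast.parse(expr, mode="eval")
    return {
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and node.id not in ALLOWED_NAMES
    }


print(find_unknown_names("-cs_rank(ts_delta(close, 5))"))             # set()
print(sorted(find_unknown_names("cs_rank(Decay(close, 5) / vwap)")))  # ['Decay', 'vwap']
```

This only inspects names; it does not check argument counts or window ranges, which remain the job of `FactorEngine`.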

**Step 4: Update all output-format examples**

In `build_user_prompt` (around line 333), replace the example formulas with local DSL:

```
1. momentum_reversal: -cs_rank(ts_delta(close, 5))
2. volume_surprise: cs_zscore((vol - ts_mean(vol, 20)) / ts_std(vol, 20))
```

Make the same replacement in `build_specialist_prompt` (around line 529):

```
Example: 1. momentum_reversal: -cs_rank(ts_delta(close, 5))
```
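
How the parser recognises this numbered format is not shown in this plan. Purely as an illustration, a line of the form `<number>. <name>: <formula>` could be split with a pattern like the hypothetical `NUMBERED_LINE` below; the actual regex in `output_parser.py` may differ.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern for "<number>. <name>: <formula>"
NUMBERED_LINE = re.compile(
    r"^\s*\d+\.\s*([A-Za-z_][A-Za-z0-9_]*)\s*:\s*(.+?)\s*$"
)


def split_numbered_line(line: str) -> Optional[Tuple[str, str]]:
    """Return (name, formula) for a numbered line, or None if the line does not fit."""
    m = NUMBERED_LINE.match(line)
    return (m.group(1), m.group(2)) if m else None


print(split_numbered_line("1. momentum_reversal: -cs_rank(ts_delta(close, 5))"))
# ('momentum_reversal', '-cs_rank(ts_delta(close, 5))')
print(split_numbered_line("Here are my factors:"))
# None
```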

**Step 5: Run the prompt_builder tests (if they already exist)**

```bash
uv run pytest tests/test_factorminer_prompt.py -v -k prompt
```

---

## Task 2: Modify `src/factorminer/agent/output_parser.py`

**Files:**
- Modify: `src/factorminer/agent/output_parser.py`
- Test: `tests/test_factorminer_prompt.py`

**Step 1: Remove the FactorMiner parser dependency**

Delete the following imports:

```python
from src.factorminer.core.expression_tree import ExpressionTree
from src.factorminer.core.parser import parse, try_parse
from src.factorminer.core.types import OperatorType, OPERATOR_REGISTRY
```

**Step 2: Modify `CandidateFactor`**

```python
@dataclass
class CandidateFactor:
    """A candidate factor parsed from LLM output.

    Attributes
    ----------
    name : str
        Descriptive snake_case name.
    formula : str
        DSL formula string.
    category : str
        Inferred category based on outermost operators.
    parse_error : str
        Error message if formula failed basic validation.
    """

    name: str
    formula: str
    category: str = "unknown"
    parse_error: str = ""

    @property
    def is_valid(self) -> bool:
        return not self.parse_error and bool(self.formula.strip())
```
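
With the parsed tree gone, validity becomes a purely string-level property. The following standalone copy of the dataclass (docstring omitted, sample values invented) shows the three cases `is_valid` distinguishes.

```python
from dataclasses import dataclass


@dataclass
class CandidateFactor:
    name: str
    formula: str
    category: str = "unknown"
    parse_error: str = ""

    @property
    def is_valid(self) -> bool:
        # Valid means: no recorded error and a non-blank formula
        return not self.parse_error and bool(self.formula.strip())


ok = CandidateFactor("momentum", "-cs_rank(ts_delta(close, 5))", "cross_sectional_momentum")
blank = CandidateFactor("empty", "   ")
broken = CandidateFactor("bad", "cs_rank(close", parse_error="unbalanced parentheses")

print(ok.is_valid, blank.is_valid, broken.is_valid)  # True False False
```

A candidate that passes here can still fail later inside `FactorEngine`, since no operator or argument checking happens at this stage.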

**Step 3: Modify `_infer_category()`**

Replace every CamelCase operator name with its snake_case equivalent:

```python
def _infer_category(formula: str) -> str:
    """Infer a rough category from the operators that appear anywhere in the formula."""
    if any(op in formula for op in ("cs_rank", "cs_zscore", "cs_demean", "cs_neutralize", "cs_winsorize")):
        if any(op in formula for op in ("ts_corr", "ts_cov")):
            return "cross_sectional_regression"
        if any(op in formula for op in ("ts_delta", "ts_delay", "ts_pct_change")):
            return "cross_sectional_momentum"
        if any(op in formula for op in ("ts_std", "ts_var", "ts_skew", "ts_kurt")):
            return "cross_sectional_volatility"
        if any(op in formula for op in ("ts_mean", "ts_sum", "ts_ema", "ts_wma")):
            return "cross_sectional_smoothing"
        return "cross_sectional"
    if any(op in formula for op in ("ts_corr", "ts_cov")):
        return "regression"
    if any(op in formula for op in ("ts_delta", "ts_delay", "ts_pct_change")):
        return "momentum"
    if any(op in formula for op in ("ts_std", "ts_var", "ts_skew", "ts_kurt")):
        return "volatility"
    if any(op in formula for op in ("if_", "where", ">", "<")):
        return "conditional"
    return "general"
```
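
Because the checks are substring tests applied in a fixed order, the label depends on which operator group is hit first, not on which operator is outermost. The condensed restatement below (`infer_category_sketch` and its constants are names made up for this sketch) is intended to follow the same order and makes that precedence easier to see.

```python
CS_OPS = ("cs_rank", "cs_zscore", "cs_demean", "cs_neutralize", "cs_winsorize")
SMOOTHING_OPS = ("ts_mean", "ts_sum", "ts_ema", "ts_wma")
# Checked in order; the first group with a hit decides the label
TS_RULES = [
    (("ts_corr", "ts_cov"), "regression"),
    (("ts_delta", "ts_delay", "ts_pct_change"), "momentum"),
    (("ts_std", "ts_var", "ts_skew", "ts_kurt"), "volatility"),
]


def infer_category_sketch(formula: str) -> str:
    """Condensed restatement of the substring rules in _infer_category()."""
    has_cs = any(op in formula for op in CS_OPS)
    for ops, label in TS_RULES:
        if any(op in formula for op in ops):
            return f"cross_sectional_{label}" if has_cs else label
    if has_cs:
        smoothing = any(op in formula for op in SMOOTHING_OPS)
        return "cross_sectional_smoothing" if smoothing else "cross_sectional"
    if any(op in formula for op in ("if_", "where", ">", "<")):
        return "conditional"
    return "general"


# if_ is outermost, but ts_std is matched first
print(infer_category_sketch("if_(close > open, ts_std(close, 10), -ts_std(close, 10))"))  # volatility
# Smoothing operators only get their own label inside a cross-sectional wrapper
print(infer_category_sketch("ts_mean(close, 20)"))           # general
print(infer_category_sketch("cs_rank(ts_mean(close, 20))"))  # cross_sectional_smoothing
```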

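**Step 4: Modify `_FORMULA_ONLY_PATTERN`**

A local DSL formula may start with `cs_` or `ts_`, with `-` (as in `-cs_rank(...)`), or with a field name or digit. The pattern below covers function calls, a leading `-`, and a leading digit; an expression led by a bare field name, such as `close / ts_delay(close, 1) - 1`, does not match it, because the first alternative requires `(` directly after the identifier: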

```python
_FORMULA_ONLY_PATTERN = re.compile(
    r"^\s*([a-zA-Z_][a-zA-Z0-9_]*\s*\(.*\)|-.*|\d.*)\s*$"
)
```
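
A quick way to see what this pattern accepts is to run it over a few sample lines. The samples are invented for this sketch; the regex is copied unchanged from the block above.

```python
import re

_FORMULA_ONLY_PATTERN = re.compile(
    r"^\s*([a-zA-Z_][a-zA-Z0-9_]*\s*\(.*\)|-.*|\d.*)\s*$"
)

samples = [
    "cs_rank(close / ts_delay(close, 5) - 1)",  # function call
    "-cs_rank(ts_delta(close, 5))",             # leading unary minus
    "close / ts_delay(close, 1) - 1",           # bare field first, no "(" after the name
    "(close - open) / open",                    # leading parenthesis
    "- a markdown bullet",                      # any line starting with "-"
]

results = {s: bool(_FORMULA_ONLY_PATTERN.match(s)) for s in samples}
for line, matched in results.items():
    print(f"{matched!s:5}  {line}")
```

The first two and the last sample match; the field-led and parenthesis-led expressions do not. Since any line beginning with `-` matches, bullet-style prose can slip through and has to be caught by later validation.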

**Step 5: Modify `_clean_formula()`**

Remove the `$` cleaning logic (the `$` prefix no longer needs replacing); keep the comment, punctuation, and backtick cleanup:

```python
def _clean_formula(formula: str) -> str:
    """Clean up a formula string before parsing."""
    formula = formula.strip()
    # Remove trailing comments
    if " #" in formula:
        formula = formula[: formula.index(" #")]
    if " //" in formula:
        formula = formula[: formula.index(" //")]
    # Drop any whitespace left in front of a removed comment; without this,
    # "x;  # note" (two spaces) would keep its ";" in the next step
    formula = formula.rstrip()
    # Remove trailing punctuation
    formula = formula.rstrip(";,.")
    # Remove surrounding backticks
    formula = formula.strip("`")
    return formula.strip()

**Step 6: Rewrite `_try_build_candidate()`**

Stop calling `try_parse(formula)` or `ExpressionTree`; perform basic validation only:

```python
def _try_build_candidate(name: str, formula: str) -> CandidateFactor:
    """Attempt to validate a formula and build a CandidateFactor."""
    # Basic validation: parenthesis balance
    if formula.count("(") != formula.count(")"):
        return CandidateFactor(
            name=name,
            formula=formula,
            category="unknown",
            parse_error="括号不匹配",
        )
    category = _infer_category(formula)
    return CandidateFactor(name=name, formula=formula, category=category)
```
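
Comparing counts accepts strings whose parentheses are equal in number but wrongly ordered, such as `)close(`. If a stricter check is wanted, one option is to track nesting depth; `parens_balanced` below is a hypothetical helper, not part of the planned change.

```python
def parens_balanced(formula: str) -> bool:
    """Return True only if every ")" closes an earlier "(" and none stay open."""
    depth = 0
    for ch in formula:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ")" appeared before its "("
                return False
    return depth == 0


print(parens_balanced("cs_rank(ts_delta(close, 5))"))  # True
print(parens_balanced("cs_rank(ts_delta(close, 5)"))   # False
print(parens_balanced(")close("))                      # False, although the counts match
```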

**Step 7: Modify `_generate_name_from_formula()`**

Adjust the regex extraction to fit snake_case function names (the part before the first parenthesis):

```python
def _generate_name_from_formula(formula: str, index: int) -> str:
    """Generate a descriptive name from a formula."""
    # Extract the outermost operator (snake_case)
    m = re.match(r"([a-zA-Z_][a-zA-Z0-9_]*)\s*\(", formula)
    if m:
        outer_op = m.group(1).lower()
        return f"{outer_op}_factor_{index + 1}"
    # Handle unary minus
    m = re.match(r"-([a-zA-Z_][a-zA-Z0-9_]*)\s*\(", formula)
    if m:
        outer_op = m.group(1).lower()
        return f"neg_{outer_op}_factor_{index + 1}"
    return f"factor_{index + 1}"
```

---

## Task 3: Adapt `src/factorminer/agent/factor_generator.py`

**Files:**
- Modify: `src/factorminer/agent/factor_generator.py`

**Step 1: Update the DSL rules described in the retry prompt**

In the `_retry_failed_parses` method (around line 199), change the wording of `repair_prompt` to describe the local DSL rules:

```python
repair_prompt = (
    "The following factor formulas failed to parse. "
    "Fix each one so it uses ONLY valid local DSL operators and features "
    "from the library. Return them in the same numbered format:\n"
    "<number>. <name>: <corrected_formula>\n\n"
    "Broken formulas:\n"
    + "\n".join(f"  {i+1}. {f}" for i, f in enumerate(failed))
    + "\n\nFix all syntax errors, unknown operators, and invalid "
    "feature names. Use snake_case functions (e.g., ts_mean, cs_rank), "
    "infix operators (+, -, *, /, **, >, <), and raw features without $ prefix. "
    "Every formula must be valid in the local DSL."
)
```

**Step 2: Confirm that `generate_batch` needs no changes**

Because `CandidateFactor.is_valid` is now based on string-level validation, the filtering logic in `generate_batch` remains compatible as is.

---

## Task 4: Write the tests in `tests/test_factorminer_prompt.py`

**Files:**
- Create: `tests/test_factorminer_prompt.py`

**Step 1: Test that the system prompt uses the local DSL**

```python
import pytest
from src.factorminer.agent.prompt_builder import SYSTEM_PROMPT

def test_system_prompt_uses_local_dsl():
    assert "$close" not in SYSTEM_PROMPT
    assert "CsRank(" not in SYSTEM_PROMPT
    assert "cs_rank(" in SYSTEM_PROMPT
    assert "close / ts_delay(close, 1) - 1" in SYSTEM_PROMPT
    assert "ts_mean(close, 20)" in SYSTEM_PROMPT
```

**Step 2: Test that OutputParser extracts local DSL correctly**

```python
from src.factorminer.agent.output_parser import parse_llm_output, CandidateFactor

def test_parse_local_dsl_numbered_list():
    raw = (
        "1. momentum: -cs_rank(ts_delta(close, 5))\n"
        "2. volume: cs_zscore((vol - ts_mean(vol, 20)) / ts_std(vol, 20))\n"
        "3. vwap_dev: cs_rank((close - amount / vol) / (amount / vol))\n"
    )
    candidates, failed = parse_llm_output(raw)
    assert len(candidates) == 3
    assert candidates[0].name == "momentum"
    assert candidates[0].formula == "-cs_rank(ts_delta(close, 5))"
    assert candidates[0].is_valid
    assert candidates[1].name == "volume"
    assert candidates[1].formula == "cs_zscore((vol - ts_mean(vol, 20)) / ts_std(vol, 20))"
    assert candidates[2].name == "vwap_dev"
    assert not failed
```

**Step 3: Test formula-only lines**

```python
def test_parse_local_dsl_formula_only():
    raw = "cs_rank(close / ts_delay(close, 5) - 1)"
    candidates, failed = parse_llm_output(raw)
    assert len(candidates) == 1
    assert candidates[0].formula == "cs_rank(close / ts_delay(close, 5) - 1)"
    assert not failed
```

**Step 4: Test that unbalanced parentheses are marked invalid**

```python
def test_parse_invalid_parentheses():
    candidates, failed = parse_llm_output("1. bad: cs_rank(ts_delta(close, 5)")
    assert len(candidates) == 1
    assert not candidates[0].is_valid
    assert "括号" in candidates[0].parse_error
```

**Step 5: Test category inference**

```python
def test_infer_category_local_dsl():
    from src.factorminer.agent.output_parser import _infer_category
    assert _infer_category("cs_rank(ts_delta(close, 5))") == "cross_sectional_momentum"
    assert _infer_category("ts_corr(vol, close, 10)") == "regression"
    assert _infer_category("ts_std(close, 20)") == "volatility"
    assert _infer_category("if_(close > open, 1, -1)") == "conditional"
```

**Step 6: Run the tests**

```bash
uv run pytest tests/test_factorminer_prompt.py -v
```

Expected: all tests pass.

---

## Command Summary

```bash
# Install dependencies (if not already installed)
uv pip install -e .

# Run the new tests
uv run pytest tests/test_factorminer_prompt.py -v

# Run the factorminer-related tests
uv run pytest tests/test_factorminer_* -v
```

---

## Commit Suggestions

Once the changes are complete, split them into two commits:

1. `refactor(factorminer): rewrite LLM prompts to output local snake_case DSL`
2. `test(factorminer): add prompt and output parser tests for local DSL`