Factor Names and Expressions
Data sources
- Analysis notebook: main/train/Classify2_load_model.ipynb
- Factor modules: main/factor/factor.py, main/factor/money_factor.py, main/factor/utils.py
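I. Financial Factors
1. add_financial_factor family
Factor names: undist_profit_ps, ocfps, roa, roe
Expression:
- Use merge_asof to match financial-indicator data to each trading day by stock code and announcement date.
- Matching logic: backward lookup (the most recent financial record whose announcement date is ≤ trade_date).
- Formula: factor_value is used directly as the factor value.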
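The backward as-of match described above can be sketched with pandas as follows. This is a minimal illustration, not the code in main/factor/factor.py: the column names ts_code and ann_date and the sample values are assumptions made for the example.

```python
import pandas as pd

# Hypothetical daily bars: one row per stock per trading day.
daily = pd.DataFrame({
    "ts_code": ["000001.SZ"] * 3,
    "trade_date": pd.to_datetime(["2024-04-29", "2024-04-30", "2024-05-06"]),
})

# Hypothetical financial indicators keyed by announcement date (assumed column: ann_date).
fina = pd.DataFrame({
    "ts_code": ["000001.SZ"] * 2,
    "ann_date": pd.to_datetime(["2024-03-15", "2024-04-30"]),
    "roe": [2.5, 3.1],
})

# merge_asof requires both frames to be sorted by their time keys.
daily = daily.sort_values("trade_date")
fina = fina.sort_values("ann_date")

merged = pd.merge_asof(
    daily,
    fina,
    left_on="trade_date",
    right_on="ann_date",
    by="ts_code",            # match within the same stock only
    direction="backward",    # latest record with ann_date <= trade_date
)
```

Each trading day picks up the most recently announced value, so a figure never appears in the factor before its announcement date.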
2. calculate_cashflow_to_ev_factor
Factor name: cashflow_to_ev_factor
Expression:
Enterprise Value = total_mv * 10000 + total_liab - money_cap
cashflow_to_ev_factor = n_cashflow_act / Enterprise Value
3. caculate_book_to_price_ratio
Factor name: book_to_price_ratio
Expression:
book_to_price_ratio = bps / close
II. ARBR Factors
4. calculate_arbr
Factor names: AR, BR, AR_BR
Expression:
# Intermediate quantities
h_minus_o = high - open
o_minus_l = open - low
prev_close = close.shift(1)
h_minus_pc_pos = max(0, high - prev_close)
pc_minus_l_pos = max(0, prev_close - low)
# AR and BR (N is the rolling window length)
AR = sum(h_minus_o, N) / sum(o_minus_l, N) * 100
BR = sum(h_minus_pc_pos, N) / sum(pc_minus_l_pos, N) * 100
AR_BR = AR - BR
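A minimal pandas sketch of the AR/BR expressions above, for a single stock sorted by date. The function name calculate_arbr_sketch is hypothetical, and the window n is left as a required argument because the source does not state the value of N.

```python
import pandas as pd

def calculate_arbr_sketch(df: pd.DataFrame, n: int) -> pd.DataFrame:
    """AR, BR and AR_BR for one stock; df needs open/high/low/close columns."""
    prev_close = df["close"].shift(1)

    h_minus_o = df["high"] - df["open"]
    o_minus_l = df["open"] - df["low"]
    # max(0, x) is expressed as clip(lower=0); NaN on the first row stays NaN.
    h_minus_pc_pos = (df["high"] - prev_close).clip(lower=0)
    pc_minus_l_pos = (prev_close - df["low"]).clip(lower=0)

    out = pd.DataFrame(index=df.index)
    out["AR"] = h_minus_o.rolling(n).sum() / o_minus_l.rolling(n).sum() * 100
    out["BR"] = h_minus_pc_pos.rolling(n).sum() / pc_minus_l_pos.rolling(n).sum() * 100
    out["AR_BR"] = out["AR"] - out["BR"]
    return out
```

Because BR depends on the previous close, its first valid value appears one row later than AR's.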
III. Technical Indicator Factors
5. turnover_rate_n
Factor name: turnover_rate_mean_5
Expression:
turnover_rate_mean_5 = mean(turnover_rate, window=5)
6. variance_n
Factor name: variance_20
Expression:
variance_20 = var(pct_chg, window=20)
7. bbi_ratio_factor
Factor name: bbi_ratio_factor
Expression:
SMA3 = mean(close, 3)
SMA6 = mean(close, 6)
SMA12 = mean(close, 12)
SMA24 = mean(close, 24)
BBI = (SMA3 + SMA6 + SMA12 + SMA24) / 4
bbi_ratio_factor = BBI / close
IV. Deviation Factors
8. daily_deviation
Factor name: daily_deviation
Expression:
# Daily momentum benchmarks (computed per trade_date)
daily_positive_benchmark = mean(pct_chg[pct_chg > 0])  # mean pct_chg of stocks that rose that day
daily_negative_benchmark = mean(pct_chg[pct_chg < 0])  # mean pct_chg of stocks that fell that day
# Deviation
if pct_chg > 0 and daily_positive_benchmark > 0:
    daily_deviation = pct_chg - daily_positive_benchmark
elif pct_chg < 0 and daily_negative_benchmark < 0:
    daily_deviation = pct_chg - daily_negative_benchmark
else:
    daily_deviation = 0
9. daily_industry_deviation
Factor name: daily_industry_deviation
Expression:
# Daily industry momentum benchmarks
daily_industry_positive_benchmark = mean(pct_chg[pct_chg > 0])  # grouped by trade_date + cat_l2_code
daily_industry_negative_benchmark = mean(pct_chg[pct_chg < 0])  # grouped by trade_date + cat_l2_code
# Industry deviation
if pct_chg > 0 and daily_industry_positive_benchmark > 0:
    daily_industry_deviation = pct_chg - daily_industry_positive_benchmark
elif pct_chg < 0 and daily_industry_negative_benchmark < 0:
    daily_industry_deviation = pct_chg - daily_industry_negative_benchmark
else:
    daily_industry_deviation = 0
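The two deviation factors differ only in the grouping keys, so both can be sketched with one vectorized helper. The function name deviation_sketch and its group_keys parameter are hypothetical; the logic follows the if/elif/else rules above.

```python
import numpy as np
import pandas as pd

def deviation_sketch(df: pd.DataFrame, group_keys: list[str]) -> pd.Series:
    """Deviation of pct_chg from the group's gainer/loser benchmark."""
    pct = df["pct_chg"]
    keys = [df[k] for k in group_keys]

    # Mean over gainers only / losers only; masked rows become NaN and are
    # skipped by the group mean, then broadcast back to every row.
    pos_bench = pct.where(pct > 0).groupby(keys).transform("mean")
    neg_bench = pct.where(pct < 0).groupby(keys).transform("mean")

    dev = np.select(
        [(pct > 0) & (pos_bench > 0), (pct < 0) & (neg_bench < 0)],
        [pct - pos_bench, pct - neg_bench],
        default=0.0,  # flat stocks, or groups without a usable benchmark
    )
    return pd.Series(dev, index=df.index)
```

Under these assumptions, deviation_sketch(df, ["trade_date"]) corresponds to daily_deviation and deviation_sketch(df, ["trade_date", "cat_l2_code"]) to daily_industry_deviation.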
V. Rolling & Simple Factors
10. Factors generated by get_rolling_factor
Money flow factors
| Factor name | Expression |
|---|---|
| lg_elg_net_buy_vol | (buy_lg_vol + buy_elg_vol - sell_lg_vol - sell_elg_vol) |
| flow_lg_elg_intensity | lg_elg_net_buy_vol / (vol + epsilon) |
| sm_net_buy_vol | buy_sm_vol - sell_sm_vol |
| flow_divergence_diff | sm_net_buy_vol - lg_elg_net_buy_vol |
| flow_divergence_ratio | sm_net_buy_vol / (lg_elg_net_buy_vol + sign(lg_elg_net_buy_vol) * epsilon + epsilon) |
| total_buy_vol | buy_sm_vol + buy_lg_vol + buy_elg_vol |
| lg_elg_buy_prop | (buy_lg_vol + buy_elg_vol) / (total_buy_vol + epsilon) |
| flow_struct_buy_change | diff(lg_elg_buy_prop, 1) |
| lg_elg_net_buy_vol_change | diff(lg_elg_net_buy_vol, 1) |
| flow_lg_elg_accel | diff(lg_elg_net_buy_vol_change, 1) |
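The money flow expressions in the table can be sketched in pandas as below, for one stock sorted by trade date. The function name flow_factors_sketch is hypothetical; the epsilon value is the 1e-10 given in the notation appendix.

```python
import numpy as np
import pandas as pd

EPSILON = 1e-10  # small constant that prevents division by zero

def flow_factors_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Money flow factors for ONE stock, rows sorted by trade date."""
    net = df["buy_lg_vol"] + df["buy_elg_vol"] - df["sell_lg_vol"] - df["sell_elg_vol"]
    sm_net = df["buy_sm_vol"] - df["sell_sm_vol"]
    total_buy = df["buy_sm_vol"] + df["buy_lg_vol"] + df["buy_elg_vol"]

    out = pd.DataFrame(index=df.index)
    out["lg_elg_net_buy_vol"] = net
    out["flow_lg_elg_intensity"] = net / (df["vol"] + EPSILON)
    out["sm_net_buy_vol"] = sm_net
    out["flow_divergence_diff"] = sm_net - net
    # Denominator as written in the table: sign-aware epsilon plus epsilon.
    out["flow_divergence_ratio"] = sm_net / (net + np.sign(net) * EPSILON + EPSILON)
    out["total_buy_vol"] = total_buy
    out["lg_elg_buy_prop"] = (df["buy_lg_vol"] + df["buy_elg_vol"]) / (total_buy + EPSILON)
    out["flow_struct_buy_change"] = out["lg_elg_buy_prop"].diff(1)
    out["lg_elg_net_buy_vol_change"] = net.diff(1)
    out["flow_lg_elg_accel"] = out["lg_elg_net_buy_vol_change"].diff(1)
    return out
```

With the denominator as written, the two epsilon terms cancel when the net large-order flow is negative, add up to 2 * epsilon when it is positive, and leave a single epsilon when it is exactly zero. On a multi-stock frame the diff-based columns would need to be computed per stock, for example through a groupby on the stock code.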
Chip distribution factors
| Factor name | Expression |
|---|---|
| chip_concentration_range | (cost_95pct - cost_5pct) / (close + epsilon) |
| chip_skewness | (weight_avg - cost_50pct) / (cost_50pct + epsilon) |
| floating_chip_proxy | winner_rate * max(0, (close - cost_15pct) / (close + epsilon)) |
| cost_support_15pct_change | pct_change(cost_15pct, 1) * 100 |
| cat_winner_price_zone | categorical: 1 = high-risk zone, 2 = low-potential zone, 3 = upper-middle profit zone, 4 = lower-middle loss zone |
| flow_chip_consistency | lg_elg_net_buy_vol * price_near_low_support |
| profit_taking_vs_absorb | lg_elg_net_buy_vol * (winner_rate > 0.7) |
Volatility factors
| Factor name | Expression |
|---|---|
| upside_vol | std(pos_returns, window=20) |
| downside_vol | std(neg_returns, window=20) |
| vol_ratio | upside_vol / downside_vol |
| return_skew | skew(pct_chg, window=5) |
| return_kurtosis | kurt(pct_chg, window=5) |
Volume factors
| Factor name | Expression |
|---|---|
| volume_change_rate | mean(vol, 2) / mean(vol, 10) - 1 |
| cat_volume_breakout | vol > max(vol, 5) |
| turnover_deviation | (turnover_rate - mean(turnover_rate, 3)) / std(turnover_rate, 3) |
| cat_turnover_spike | turnover_rate > mean(turnover_rate, 3) + 2 * std(turnover_rate, 3) |
| avg_volume_ratio | mean(volume_ratio, 3) |
| cat_volume_ratio_breakout | volume_ratio > max(volume_ratio, 5) |
| vol_spike | mean(vol, 20) |
| vol_std_5 | std(pct_change(vol), 5) |
Technical indicators
| Factor name | Expression |
|---|---|
| atr_14 | ATR(high, low, close, 14) (TA-Lib) |
| atr_6 | ATR(high, low, close, 6) (TA-Lib) |
| obv | OBV(close, vol) (TA-Lib) |
| maobv_6 | SMA(obv, 6) (TA-Lib) |
| rsi_3 | RSI(close, 3) (TA-Lib) |
Return factors
| Factor name | Expression |
|---|---|
| return_5 | close / close.shift(5) - 1 |
| return_20 | close / close.shift(20) - 1 |
| std_return_5 | std(pct_change(close), 5) |
| std_return_90 | std(pct_change(close), 90) |
| std_return_90_2 | std(pct_change(close.shift(10)), 90) |
EMA factors
| Factor name | Expression |
|---|---|
| act_factor1 | atan((EMA(close,5)/EMA(close,5).shift(1)-1)*100) * 57.3 / 50 |
| act_factor2 | atan((EMA(close,13)/EMA(close,13).shift(1)-1)*100) * 57.3 / 40 |
| act_factor3 | atan((EMA(close,20)/EMA(close,20).shift(1)-1)*100) * 57.3 / 21 |
| act_factor4 | atan((EMA(close,60)/EMA(close,60).shift(1)-1)*100) * 57.3 / 10 |
| rank_act_factor1 | rank(act_factor1, pct=True) |
| rank_act_factor2 | rank(act_factor2, pct=True) |
| rank_act_factor3 | rank(act_factor3, pct=True) |
| log_circ_mv | log(circ_mv) |
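The act_factor expressions share one shape: the one-day percentage change of an EMA is passed through atan, and the factor 57.3 (approximately 180/π) converts the resulting angle from radians to degrees before a per-factor divisor is applied. A minimal sketch follows; it uses the pandas ewm call as a stand-in for the TA-Lib EMA named in the document, which is an assumption and may differ in how the first values are seeded. The function name act_factor_sketch is hypothetical.

```python
import numpy as np
import pandas as pd

def act_factor_sketch(close: pd.Series, span: int, scale: float) -> pd.Series:
    """atan of the EMA's daily percentage slope, in degrees, divided by scale."""
    # Stand-in for EMA(close, span); TA-Lib may seed the average differently.
    ema = close.ewm(span=span, adjust=False).mean()
    slope_pct = (ema / ema.shift(1) - 1) * 100
    return np.arctan(slope_pct) * 57.3 / scale

# Example for one stock's close series, using the (span, scale) pairs from the table.
close = pd.Series([10.0, 11.0, 12.0, 12.5, 12.0])
act_factor1 = act_factor_sketch(close, span=5, scale=50)
act_factor2 = act_factor_sketch(close, span=13, scale=40)
```

Since atan is bounded by ±90 degrees, each factor is bounded by ±90 divided by its scale.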
Alpha factors
| Factor name | Expression |
|---|---|
| cov | cov(high, vol, window=5) |
| delta_cov | diff(cov, 5) |
| _stddev_close | std(close, 20) |
| _rank_stddev | rank(_stddev_close, pct=True) |
| alpha_22_improved | -1 * delta_cov * _rank_stddev |
| alpha_003 | (close - open) / (high - low) (if high != low else 0) |
| alpha_007 | rank(rolling_corr(close, vol, 5), pct=True) |
| alpha_013 | rank(sum(close, 5) - sum(close, 20), pct=True) |
Chip factors
| Factor name | Expression |
|---|---|
| vol_break | 1 if (close > cost_85pct) & (volume_ratio > 2) else 0 |
| weight_roc5 | pct_change(weight_avg, 5) |
| price_cost_divergence | corr(pct_change(close), pct_change(weight_avg), 10) |
| smallcap_concentration | (1 / log_circ_mv) * (cost_85pct - cost_15pct) |
| cost_stability | std(weight_avg, 20) / mean(weight_avg, 20) |
| high_cost_break_days | sum(close > cost_95pct, 5) |
| liquidity_risk | (cost_95pct - cost_5pct) / mean(vol, 10) |
| turnover_std | std(turnover_rate, 20) |
| mv_volatility | turnover_std / log_circ_mv |
| volume_growth | pct_change(vol, 20) |
| mv_growth | volume_growth / log_circ_mv |
| momentum_factor | volume_change_rate + 0.5 * turnover_deviation |
| resonance_factor | vol_ratio * pct_chg |
| log_close | log(close) |
| cat_vol_spike | vol > 2 * vol_spike |
| up | (high - max(close, open)) / close |
| down | (min(close, open) - low) / close |
| obv_maobv_6 | obv - maobv_6 |
| std_return_5_over_std_return_90 | std_return_5 / std_return_90 |
| std_return_90_minus_std_return_90_2 | std_return_90 - std_return_90_2 |
| cat_af2 | act_factor2 > act_factor1 |
| cat_af3 | act_factor3 > act_factor2 |
| cat_af4 | act_factor4 > act_factor3 |
| act_factor5 | act_factor1 + act_factor2 + act_factor3 + act_factor4 |
| act_factor6 | (act_factor1 - act_factor2) / sqrt(act_factor1^2 + act_factor2^2) |
| active_buy_volume_large | buy_lg_vol / net_mf_vol |
| active_buy_volume_big | buy_elg_vol / net_mf_vol |
| active_buy_volume_small | buy_sm_vol / net_mf_vol |
| buy_lg_vol_minus_sell_lg_vol | (buy_lg_vol - sell_lg_vol) / net_mf_vol |
| buy_elg_vol_minus_sell_elg_vol | (buy_elg_vol - sell_elg_vol) / net_mf_vol |
| ctrl_strength | (cost_85pct - cost_15pct) / (his_high - his_low) |
| low_cost_dev | (close - cost_5pct) / (cost_50pct - cost_5pct) |
| asymmetry | (cost_95pct - cost_50pct) / (cost_50pct - cost_5pct) |
| lock_factor | turnover_rate * (1 - (cost_95pct - cost_5pct) / (his_high - his_low)) |
| cat_vol_break | (close > cost_85pct) & (volume_ratio > 2) |
| cost_atr_adj | (cost_95pct - cost_5pct) / atr_14 |
| cat_golden_resonance | (close > weight_avg) & (volume_ratio > 1.5) & (winner_rate > 0.7) |
| mv_turnover_ratio | turnover_rate / log_circ_mv |
| mv_adjusted_volume | vol / log_circ_mv |
| mv_weighted_turnover | turnover_rate / log_circ_mv |
| nonlinear_mv_volume | vol / log_circ_mv |
| mv_volume_ratio | volume_ratio / log_circ_mv |
| mv_momentum | turnover_rate * volume_ratio / log_circ_mv |
VI. Money Flow Factors
11. lg_flow_mom_corr_20_60
Expression:
net_lg_flow_val = (buy_lg_vol + buy_elg_vol - sell_lg_vol - sell_elg_vol) * close
rolling_net_lg_flow = sum(net_lg_flow_val, 20)
price_mom = pct_change(close, 20)
lg_flow_mom_corr_20_60 = corr(rolling_net_lg_flow, price_mom, 60)
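The 20-day sum, 20-day momentum and 60-day rolling correlation above can be sketched as follows for a single stock. The function name lg_flow_mom_corr and its window parameters are hypothetical; the defaults mirror the 20 and 60 in the factor name.

```python
import pandas as pd

def lg_flow_mom_corr(df: pd.DataFrame, sum_window: int = 20, corr_window: int = 60) -> pd.Series:
    """Rolling correlation between summed large-order net flow value and price momentum."""
    net_lg_flow_val = (
        df["buy_lg_vol"] + df["buy_elg_vol"] - df["sell_lg_vol"] - df["sell_elg_vol"]
    ) * df["close"]
    rolling_net_lg_flow = net_lg_flow_val.rolling(sum_window).sum()
    price_mom = df["close"].pct_change(sum_window)
    # Pairwise rolling correlation; windows with too few valid pairs give NaN.
    return rolling_net_lg_flow.rolling(corr_window).corr(price_mom)
```

With the default windows the first non-NaN value needs roughly 80 rows of history, because the momentum series only becomes valid after 20 rows and the correlation then needs 60 valid pairs.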
12. lg_flow_accel
Expression:
net_lg_flow_vol = buy_lg_vol + buy_elg_vol - sell_lg_vol - sell_elg_vol
lg_flow_accel = diff(diff(net_lg_flow_vol, 1), 1)
13. profit_pressure
Expression:
profit_margin_85 = close / cost_85pct - 1
profit_margin_95 = close / cost_95pct - 1
profit_pressure = winner_rate * 0.5 * (profit_margin_85 + profit_margin_95)
14. underwater_resistance
Expression:
underwater_ratio = 1.0 - winner_rate
dist_to_cost_15 = max(0, cost_15pct - close) / (close + epsilon)
underwater_resistance = underwater_ratio * dist_to_cost_15
15. cost_conc_std_20
Expression:
cost_range_norm = (cost_85pct - cost_15pct) / (weight_avg + epsilon)
cost_conc_std_20 = std(cost_range_norm, 20)
16. profit_decay_20
Expression:
ret_20 = close / close.shift(20) - 1
winner_rate_change_20 = diff(winner_rate, 20)
profit_decay_20 = ret_20 / winner_rate_change_20
17. vol_amp_loss_20
Expression:
vol_20 = std(pct_chg, 20)
loss_degree = max(0, weight_avg - close) / (close + epsilon)
vol_amp_loss_20 = vol_20 * loss_degree
18. vol_drop_profit_cnt_5
Expression:
is_profitable = close > weight_avg * (1 + 0.1)
is_dropping = pct_chg < -0.03
rolling_mean_vol = mean(vol, 20)
rolling_std_vol = std(vol, 20)
is_high_vol = vol > (rolling_mean_vol + 2 * rolling_std_vol)
event = is_profitable & is_dropping & is_high_vol
vol_drop_profit_cnt_5 = sum(event, 5)
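The event definition above combines three boolean conditions and then counts occurrences over a rolling window. A minimal single-stock sketch follows; the function name vol_drop_profit_cnt and the window parameters are hypothetical. The -0.03 threshold is copied verbatim from the expression; the source does not say whether pct_chg is stored as a fraction or as a percentage, so the effective cutoff depends on that unit.

```python
import pandas as pd

def vol_drop_profit_cnt(df: pd.DataFrame, vol_window: int = 20, cnt_window: int = 5) -> pd.Series:
    """Count of high-volume down days while price sits well above average cost."""
    is_profitable = df["close"] > df["weight_avg"] * (1 + 0.1)
    is_dropping = df["pct_chg"] < -0.03  # threshold as given; unit of pct_chg not stated

    rolling_mean_vol = df["vol"].rolling(vol_window).mean()
    rolling_std_vol = df["vol"].rolling(vol_window).std()
    # Comparisons against NaN are False, so warm-up rows never trigger an event.
    is_high_vol = df["vol"] > (rolling_mean_vol + 2 * rolling_std_vol)

    event = is_profitable & is_dropping & is_high_vol
    # Cast to int so the rolling sum counts events.
    return event.astype(int).rolling(cnt_window).sum()
```

The current day's volume is part of its own rolling window here, as in the expression, which makes the 2-sigma test harder to pass than a comparison against the previous window would be.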
19. lg_flow_vol_interact_20
Expression:
vol_20 = std(pct_chg, 20)
net_lg_flow_val = (buy_lg_vol + buy_elg_vol - sell_lg_vol - sell_elg_vol) * close
total_val = vol * close
abs_net_lg_flow_ratio = abs(net_lg_flow_val) / (total_val + epsilon)
abs_net_lg_flow_ratio_20 = mean(abs_net_lg_flow_ratio, 20)
lg_flow_vol_interact_20 = vol_20 * abs_net_lg_flow_ratio_20
20. cost_break_confirm_cnt_5
Expression:
prev_cost_85 = cost_85pct.shift(1)
prev_cost_15 = cost_15pct.shift(1)
break_up = close > prev_cost_85
break_down = close < prev_cost_15
net_lg_flow_vol = buy_lg_vol + buy_elg_vol - sell_lg_vol - sell_elg_vol
confirm_up = break_up & (net_lg_flow_vol > 0)
confirm_down = break_down & (net_lg_flow_vol < 0)
net_confirm = confirm_up - confirm_down
cost_break_confirm_cnt_5 = sum(net_confirm, 5)
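A minimal single-stock sketch of the break-confirmation count above. One practical detail: confirm_up and confirm_down are boolean, and NumPy does not define subtraction for boolean arrays, so the sketch casts both to integers before taking the difference. The function name cost_break_confirm_cnt and the window parameter are hypothetical.

```python
import pandas as pd

def cost_break_confirm_cnt(df: pd.DataFrame, window: int = 5) -> pd.Series:
    """Net count of flow-confirmed breaks above cost_85pct / below cost_15pct."""
    prev_cost_85 = df["cost_85pct"].shift(1)
    prev_cost_15 = df["cost_15pct"].shift(1)

    break_up = df["close"] > prev_cost_85
    break_down = df["close"] < prev_cost_15

    net_lg_flow_vol = (
        df["buy_lg_vol"] + df["buy_elg_vol"] - df["sell_lg_vol"] - df["sell_elg_vol"]
    )
    confirm_up = break_up & (net_lg_flow_vol > 0)
    confirm_down = break_down & (net_lg_flow_vol < 0)

    # +1 for a confirmed upside break, -1 for a confirmed downside break.
    net_confirm = confirm_up.astype(int) - confirm_down.astype(int)
    return net_confirm.rolling(window).sum()
```

The result ranges from -window to +window; values near zero can mean either no breaks or offsetting breaks in both directions.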
21. atr_norm_channel_pos_14
Expression:
tr = max(high - low, abs(high - prev_close), abs(low - prev_close))
atr_14 = mean(tr, 14)
roll_low_14 = min(low, 14)
atr_norm_channel_pos_14 = (close - roll_low_14) / atr_14
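A minimal single-stock sketch of the expression above. Note that atr_14 here is a simple rolling mean of the true range, as written in this expression, which is not necessarily identical to the TA-Lib ATR used for the atr_14 column elsewhere in the document. The function name atr_norm_channel_pos and the handling of the first row (where prev_close is missing) are assumptions.

```python
import pandas as pd

def atr_norm_channel_pos(df: pd.DataFrame, n: int = 14) -> pd.Series:
    """Distance of close above the n-day low, measured in units of mean true range."""
    prev_close = df["close"].shift(1)
    tr = pd.concat(
        [
            df["high"] - df["low"],
            (df["high"] - prev_close).abs(),
            (df["low"] - prev_close).abs(),
        ],
        axis=1,
    ).max(axis=1)  # row-wise max skips NaN, so the first row falls back to high - low

    atr = tr.rolling(n).mean()
    roll_low = df["low"].rolling(n).min()
    return (df["close"] - roll_low) / atr
```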
22. turnover_diff_skew_20
Expression:
turnover_diff = diff(turnover_rate, 1)
turnover_diff_skew_20 = skew(turnover_diff, 20)
23. lg_sm_flow_diverge_20
Expression:
lg_flow_ratio = (buy_lg_vol + buy_elg_vol - sell_lg_vol - sell_elg_vol) / vol
sm_flow_ratio = (buy_sm_vol - sell_sm_vol) / vol
lg_flow_ratio_20 = mean(lg_flow_ratio, 20)
sm_flow_ratio_20 = mean(sm_flow_ratio, 20)
lg_sm_flow_diverge_20 = lg_flow_ratio_20 - sm_flow_ratio_20
24. pullback_strong_20_20
Expression:
high_20 = max(high, 20)
pullback_depth = (high_20 - close) / high_20
recent_gain_20 = close / close.shift(20) - 1
pullback_strong_20_20 = pullback_depth / recent_gain_20
25. vol_wgt_hist_pos_20
Expression:
hist_pos = (close - his_low) / (his_high - his_low)
rolling_mean_vol_20 = mean(vol, 20)
vol_rel_strength = vol / rolling_mean_vol_20
vol_wgt_hist_pos_20 = hist_pos * vol_rel_strength
26. vol_adj_roc_20
Expression:
roc_20 = close / close.shift(20) - 1
vol_20 = std(pct_chg, 20)
vol_adj_roc_20 = roc_20 / vol_20
VII. Cross-Sectional Rank Factors
27. cs_rank_net_lg_flow_val
Expression:
net_lg_flow_val = (buy_lg_vol + buy_elg_vol - sell_lg_vol - sell_elg_vol) * close
cs_rank_net_lg_flow_val = rank(net_lg_flow_val, pct=True)
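In this section rank(x, pct=True) is cross-sectional: stocks are ranked against each other within the same trading day. A minimal pandas sketch, assuming a long-format frame with a trade_date column and one row per stock per day:

```python
import pandas as pd

def cs_rank_net_lg_flow_val(df: pd.DataFrame) -> pd.Series:
    """Percentile rank of large-order net flow value within each trade_date."""
    net_lg_flow_val = (
        df["buy_lg_vol"] + df["buy_elg_vol"] - df["sell_lg_vol"] - df["sell_elg_vol"]
    ) * df["close"]
    # Group by date so ranks are computed across stocks, not across time.
    return net_lg_flow_val.groupby(df["trade_date"]).rank(pct=True)
```

The other cs_rank_* factors in this section follow the same pattern with a different input series.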
28. cs_rank_flow_divergence
Expression:
lg_ratio = (buy_lg_vol + buy_elg_vol - sell_lg_vol - sell_elg_vol) / vol
sm_ratio = (buy_sm_vol - sell_sm_vol) / vol
divergence = lg_ratio - sm_ratio
cs_rank_flow_divergence = rank(divergence, pct=True)
29. cs_rank_ind_adj_lg_flow
Expression:
net_lg_flow_vol = (buy_lg_vol + buy_elg_vol - sell_lg_vol - sell_elg_vol) * close
industry_avg_flow = mean(net_lg_flow_vol) by trade_date, cat_l2_code
deviation = net_lg_flow_vol - industry_avg_flow
cs_rank_ind_adj_lg_flow = rank(deviation, pct=True)
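The industry adjustment above is a two-level grouping: subtract the mean of each (trade_date, cat_l2_code) group, then rank the deviations across all stocks on the same day. A minimal sketch, with a hypothetical helper name and a value_col parameter standing in for the precomputed flow column:

```python
import pandas as pd

def cs_rank_industry_adjusted(df: pd.DataFrame, value_col: str) -> pd.Series:
    """Rank of the deviation from the same-day industry mean, within each trade_date."""
    industry_avg = df.groupby(["trade_date", "cat_l2_code"])[value_col].transform("mean")
    deviation = df[value_col] - industry_avg
    return deviation.groupby(df["trade_date"]).rank(pct=True)
```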
30. cs_rank_elg_buy_ratio
Expression:
elg_buy_ratio = buy_elg_vol / vol
cs_rank_elg_buy_ratio = rank(elg_buy_ratio, pct=True)
31. cs_rank_rel_profit_margin
Expression:
profit_margin = (close - weight_avg) / close
cs_rank_rel_profit_margin = rank(profit_margin, pct=True)
32. cs_rank_cost_breadth
Expression:
cost_breadth = (cost_85pct - cost_15pct) / weight_avg
cs_rank_cost_breadth = rank(cost_breadth, pct=True)
33. cs_rank_dist_to_upper_cost
Expression:
dist_to_95 = close / cost_95pct
cs_rank_dist_to_upper_cost = rank(dist_to_95, pct=True)
34. cs_rank_winner_rate
Expression:
cs_rank_winner_rate = rank(winner_rate, pct=True)
35. cs_rank_intraday_range
Expression:
norm_range = (high - low) / close
cs_rank_intraday_range = rank(norm_range, pct=True)
36. cs_rank_close_pos_in_range
Expression:
close_pos = (close - low) / (high - low)
cs_rank_close_pos_in_range = rank(close_pos, pct=True)
37. cs_rank_opening_gap
Expression:
gap = open / pre_close - 1
cs_rank_opening_gap = rank(gap, pct=True)
38. cs_rank_pos_in_hist_range
Expression:
hist_pos = (close - his_low) / (his_high - his_low)
cs_rank_pos_in_hist_range = rank(hist_pos, pct=True)
39. cs_rank_vol_x_profit_margin
Expression:
daily_vol = abs(pct_chg)
profit_margin = (close - weight_avg) / close
interaction = daily_vol * profit_margin
cs_rank_vol_x_profit_margin = rank(interaction, pct=True)
40. cs_rank_lg_flow_price_concordance
Expression:
net_lg_flow_vol = buy_lg_vol + buy_elg_vol - sell_lg_vol - sell_elg_vol
concordance = net_lg_flow_vol * pct_chg
cs_rank_lg_flow_price_concordance = rank(concordance, pct=True)
41. cs_rank_turnover_per_winner
Expression:
turnover_per_winner = turnover_rate / winner_rate
cs_rank_turnover_per_winner = rank(turnover_per_winner, pct=True)
42. cs_rank_ind_cap_neutral_pe
Expression: placeholder; requires a statsmodels-based implementation
43. cs_rank_volume_ratio
Expression:
cs_rank_volume_ratio = rank(volume_ratio, pct=True)
44. cs_rank_elg_buy_sell_sm_ratio
Expression:
ratio = buy_elg_vol / sell_sm_vol
cs_rank_elg_buy_sell_sm_ratio = rank(ratio, pct=True)
45. cs_rank_cost_dist_vol_ratio
Expression:
dist = abs(close - weight_avg) / (close + epsilon)
interaction = dist * volume_ratio
cs_rank_cost_dist_vol_ratio = rank(interaction, pct=True)
46. cs_rank_size
Expression:
log_circ_mv = log1p(circ_mv)
cs_rank_size = rank(log_circ_mv, pct=True)
VIII. Industry Factors
47. get_act_factor (from main.utils.factor)
Generated factors: act_factor1, act_factor2, act_factor3, act_factor4
Expression:
obv = OBV(close, vol)
return_5 = close / close.shift(5) - 1
return_20 = close / close.shift(20) - 1
return_5_percentile = rank(return_5, pct=True)
return_20_percentile = rank(return_20, pct=True)
Column renaming: industry factor columns are prefixed with industry_
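Appendix: Notation
| Symbol | Meaning |
|---|---|
| epsilon | Small constant (1e-10) that prevents division by zero |
| mean(x, N) | Rolling mean over N periods |
| std(x, N) | Rolling standard deviation over N periods |
| var(x, N) | Rolling variance over N periods |
| sum(x, N) | Rolling sum over N periods |
| max(x, N) | Rolling maximum over N periods |
| min(x, N) | Rolling minimum over N periods |
| diff(x, N) | N-period difference |
| pct_change(x, N) | N-period percentage change |
| shift(x, N) | Shift by N periods |
| rank(x, pct=True) | Cross-sectional rank (percentile) |
| corr(x, y, N) | Rolling correlation coefficient over N periods |
| cov(x, y, N) | Rolling covariance over N periods |
| skew(x, N) | Rolling skewness over N periods |
| kurt(x, N) | Rolling kurtosis over N periods |
| ATR | Average True Range (TA-Lib) |
| OBV | On-Balance Volume (TA-Lib) |
| RSI | Relative Strength Index (TA-Lib) |
| SMA | Simple Moving Average (TA-Lib) |
| EMA | Exponential Moving Average (TA-Lib) |
| atan | Arctangent function |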
Document generated: 2026-03-06. 180+ factors covered in total.