褚宏光 95d13b2cce Enhance converging triangle analysis with detailed mode and outlier removal algorithm

- Added `--show-details` parameter to `pipeline_converging_triangle.py` for generating detailed charts that display all pivot points and fitting lines.
- Implemented an iterative outlier removal algorithm in `fit_pivot_line` to improve the accuracy of pivot point fitting by eliminating weak points.
- Updated `USAGE.md` to include new command examples for the detailed mode.
- Revised multiple documentation files to reflect recent changes and improvements in the pivot detection and visualization processes.

2026-01-26 18:43:18 +08:00


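Iterative Outlier Removal for Fitted Lines

Problem Description

The current segment-based selection algorithm picks some "weak" pivot points for fitting:

Upper line: a high can be the highest point within its time segment and still sit clearly below the other highs (e.g. the second point in the chart, at 5.8 yuan)
Such points pull the fitted line down (upper line) or up (lower line), so the result no longer matches a subjective reading of the chart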

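Solution: Iterative Outlier Removal

Core Logic

1. Initial fit: run a linear regression over all pivot points
2. Compute residuals: the deviation of each point from the fitted line
3. Identify outliers:
   - Upper line: points priced clearly below the fitted line = weak highs
   - Lower line: points priced clearly above the fitted line = weak lows
4. Remove the worst outlier
5. Refit
6. Repeat until convergence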

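Algorithm Flowchart

flowchart TD
    A[Input pivot points] --> B[Initial linear regression]
    B --> C[Compute residuals]
    C --> D{Any outliers?}
    D -->|Yes| E[Remove the worst outlier]
    E --> F{Remaining points >= 3?}
    F -->|Yes| G{Iterations < 3?}
    G -->|Yes| B
    G -->|No| H[Return current fit]
    F -->|No| H
    D -->|No| H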

Code Changes

File: src/converging_triangle.py

Rewrite the fit_pivot_line function (lines 230-350):

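def fit_pivot_line(
    pivot_indices: np.ndarray,
    pivot_values: np.ndarray,
    mode: str = "upper",
    min_points: int = 2,
    outlier_threshold: float = 1.5,  # New: outlier threshold (multiple of the standard deviation)
    max_iterations: int = 3,          # New: maximum number of iterations
) -> Tuple[float, float, np.ndarray]:
    """
    Pivot-point fitting with iterative outlier removal.

    Strategy:
    1. Start with an initial fit over all points
    2. Identify and remove "weak" points that deviate from the fitted line
    3. Iterate until convergence

    Upper line: remove points priced clearly below the fitted line
    Lower line: remove points priced clearly above the fitted line
    """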

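Key Parameters

| Parameter | Default | Description |
|------|--------|------|
| outlier_threshold | 1.5 | A residual larger than 1.5 standard deviations marks an outlier |
| max_iterations | 3 | At most 3 iterations, to avoid over-filtering |
| min_points | 3 | Keep at least 3 points in the fit (the signature above lists a default of 2) |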

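Outlier Detection Logic

Upper line (upper):

# residual = fitted value - actual value
# a positive residual means the point lies below the fitted line (weak high)
residuals = fitted_values - actual_values
outliers = residuals > threshold  # weak highs

Lower line (lower):

# residual = actual value - fitted value
# a positive residual means the point lies above the fitted line (weak low)
residuals = actual_values - fitted_values
outliers = residuals > threshold  # weak lows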
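The snippets above leave `threshold` undefined. As a hedged illustration, the sketch below assumes it is `outlier_threshold` times the standard deviation of the signed residuals, in line with the parameter table; the helper name `weak_point_mask` is hypothetical. It also shows that the test is one-sided: a point far above the upper line is never flagged in `upper` mode, but the same point is flagged in `lower` mode.

```python
import numpy as np


def weak_point_mask(fitted_values, actual_values, mode="upper", outlier_threshold=1.5):
    """Return a boolean mask marking weak pivots (hypothetical helper)."""
    fitted = np.asarray(fitted_values, dtype=float)
    actual = np.asarray(actual_values, dtype=float)
    # Sign convention from the text: positive residual = weak point.
    residuals = fitted - actual if mode == "upper" else actual - fitted
    threshold = outlier_threshold * residuals.std()  # assumed definition
    return residuals > threshold


# A flat fitted line at 6.0 with one point sitting well above it.
fitted = [6.0, 6.0, 6.0, 6.0, 6.0]
actual = [6.0, 6.0, 6.0, 7.0, 6.0]

upper_mask = weak_point_mask(fitted, actual, mode="upper")
lower_mask = weak_point_mask(fitted, actual, mode="lower")
print(upper_mask)  # the high point is not a weak high
print(lower_mask)  # as a low, the same point is a weak low
```

Because the standard deviation is taken over the current residuals, the threshold shifts every time a point is removed, which is one reason to cap the number of iterations.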

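Expected Effect

Taking SZ300278 in the chart as an example:

The second point (5.8 yuan) is clearly below the fitted line
It will be identified as an outlier and removed after the first iteration
The final fitted line uses only the remaining 3, more representative, highs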

Test Plan

Use SZ300278 to verify the fix
Compare the charts before and after the change
Make sure normal pivot points are not over-filtered

Documentation Update

Update docs/枢轴点分段选择算法详解.md to add a description of iterative outlier removal.
