/Users/yuchi/PycharmProjects/PsyMl_ISI/.venv/bin/python /Users/yuchi/PycharmProjects/PsyMl_ISI/ML/PAC/batch_brute_v1.py
================================================================================
Brute-force single-feature test (batch_brute_v1) [optimized]
================================================================================
DB: /Users/yuchi/PycharmProjects/PsyMl_ISI/data/psy_ml_isi.db
Table: isi_raw_data_recalc_5s10
Added groups: ['EEG'] (prefixes=['EEG_'])
EEG type filter: ['BRAIN_AVG', 'REGION_AVG', 'FAA']
PAC JOIN: not joined
Specified samples: none
Skipped samples: ['S112008', 'S112043', 'S112074', 'S112104', 'S112120', 'S112194', 'S112203', 'S112268']
Ranking metric: AUC
CV seed: 42
Workers: 9 / 10 cores
Output directory: /Users/yuchi/PycharmProjects/PsyMl_ISI/ML/PAC/outputs
================================================================================
[1/4] Loading data...
[SKIP] ID exclusions: 8 samples skipped (['S112008', 'S112043', 'S112074', 'S112104', 'S112120', 'S112194', 'S112203', 'S112268'])
[EEG type filter] 1002 → 326 EEG columns (kept types: ['BRAIN_AVG', 'REGION_AVG', 'FAA'])
Samples: 88
Target distribution: {0: 60, 1: 28}
Added features: 326
Squared terms: ON
Parallel tasks: 2608 (326 features × 8 classifiers)
Baseline cache: ON (features with the same complete-case mask share a baseline)
Unique masks: 2 / 326 features (saves 2592 redundant baseline computations)
[2/4] Preparing temporary data store...
Temp path: /var/folders/w3/0b378zts7xn2dmlg9rw77lc00000gn/T/brute_v1_3pk03s77
[3/4] Running 2608 tasks in parallel (9 workers)...
feature×classifier: 100%|██████████| 2608/2608 [04:24<00:00, 9.85task/s]
Fine-grained results → /Users/yuchi/PycharmProjects/PsyMl_ISI/ML/PAC/outputs/batch_brute_v1_results.csv
[4/4] Computing ranking (mean_delta, metric=AUC)...
Ranking results → /Users/yuchi/PycharmProjects/PsyMl_ISI/ML/PAC/outputs/batch_brute_v1_ranking.csv
────────────────────────────────────────────────────────────────────────────────
Matched baseline AUC per model (mean over each feature's matched subset)
────────────────────────────────────────────────────────────────────────────────
DecisionTree F1=0.5294 AUC=0.6511
KNN F1=0.6316 AUC=0.8383
LogisticRegression F1=0.7568 AUC=0.9053
MLP F1=0.6111 AUC=0.8617
NaiveBayes F1=0.7429 AUC=0.9128
RandomForest F1=0.5556 AUC=0.8376
SVM F1=0.7619 AUC=0.8632
XGBoost F1=0.7179 AUC=0.8617
────────────────────────────────────────────────────────────────────────────────
Top 30 added features (sorted by mean_delta_AUC)
────────────────────────────────────────────────────────────────────────────────
feature_combination mean_delta positive_ratio delta_DecisionTree delta_KNN delta_LogisticRegression delta_MLP delta_NaiveBayes delta_RandomForest delta_SVM delta_XGBoost
rank
1 +EEG_FAA_ABS_ALPHA1_O2O1+EEG_FAA_ABS_ALPHA1_O2O1² 0.0617 1.00 0.1195 0.0353 0.0466 0.0797 0.0286 0.0579 0.0692 0.0571
2 +EEG_FAA_ABS_ALPHA1_O2O1 0.0521 0.88 0.0910 0.0579 -0.0180 0.0737 0.0301 0.0556 0.0662 0.0602
3 +EEG_PWR_REL_ALPHA2_OCCIPITAL_AVG 0.0443 0.75 0.1579 0.0707 -0.0120 0.0481 -0.0030 0.0398 0.0436 0.0090
4 +EEG_PWR_ABS_ALPHA_TEMPORAL_AVG+EEG_PWR_ABS_ALPHA_TEMPORAL_AVG² 0.0439 0.75 0.1459 0.0812 0.0045 0.0316 -0.0015 0.0519 0.0436 -0.0060
5 +EEG_PWR_ABS_ALPHA_TEMPORAL_AVG 0.0366 1.00 0.1053 0.0459 0.0090 0.0286 0.0090 0.0436 0.0346 0.0165
6 +EEG_PWR_REL_BETA_BRAIN_AVG+EEG_PWR_REL_BETA_BRAIN_AVG² 0.0353 0.75 0.0910 0.0857 -0.0316 0.0090 -0.0180 0.0549 0.0722 0.0195
7 +EEG_FAA_REL_BETA1_O2O1 0.0320 0.88 0.1053 0.0586 -0.0211 0.0000 0.0165 0.0496 0.0241 0.0226
8 +EEG_FAA_REL_ALPHA2_FP2FP1+EEG_FAA_REL_ALPHA2_FP2FP1² 0.0318 0.75 0.1338 0.0429 -0.0120 0.0045 -0.0346 0.0609 0.0376 0.0211
9 +EEG_PWR_ABS_ALPHA_OCCIPITAL_AVG+EEG_PWR_ABS_ALPHA_OCCIPITAL_AVG² 0.0315 1.00 0.1030 0.0451 0.0180 0.0286 0.0045 0.0256 0.0180 0.0090
10 +EEG_FAA_REL_ALPHA1_P4P3+EEG_FAA_REL_ALPHA1_P4P3² 0.0313 0.75 0.0549 0.0669 -0.0060 0.0662 -0.0203 0.0406 0.0226 0.0256
11 +EEG_PWR_REL_ALPHA2_OCCIPITAL_AVG+EEG_PWR_REL_ALPHA2_OCCIPITAL_AVG² 0.0306 0.62 0.0910 0.0842 0.0000 0.0556 -0.0286 0.0308 0.0316 -0.0195
12 +EEG_FAA_ABS_BETA3_FP2FP1 0.0303 0.62 0.1195 0.0571 -0.0195 0.0436 -0.0090 0.0406 0.0218 -0.0120
13 +EEG_FAA_REL_ALPHA2_FP2FP1 0.0300 0.75 0.1481 0.0241 -0.0256 0.0316 -0.0165 0.0451 0.0271 0.0060
14 +EEG_FAA_REL_GAMMA1_F8F7 0.0291 0.75 0.0812 0.0571 -0.0060 0.0556 -0.0331 0.0406 0.0195 0.0180
15 +EEG_FAA_ABS_HBETA_FP2FP1 0.0289 0.75 0.0932 0.0436 -0.0180 0.0436 -0.0030 0.0496 0.0211 0.0015
16 +EEG_FAA_ABS_ALPHA_O2O1+EEG_FAA_ABS_ALPHA_O2O1² 0.0275 0.75 0.0526 -0.0008 0.0195 0.0301 0.0000 0.0511 0.0376 0.0301
17 +EEG_PWR_REL_BETA_BRAIN_AVG 0.0270 0.75 0.0286 0.0850 -0.0060 0.0466 -0.0045 0.0331 0.0165 0.0165
18 +EEG_FAA_ABS_ALPHA_O2O1 0.0262 0.75 0.0526 -0.0053 -0.0105 0.0361 0.0180 0.0376 0.0496 0.0316
19 +EEG_FAA_REL_GAMMA1_F4F3 0.0259 0.88 0.0789 0.0286 0.0120 0.0075 -0.0271 0.0549 0.0346 0.0180
20 +EEG_FAA_ABS_THETA_P4P3+EEG_FAA_ABS_THETA_P4P3² 0.0257 0.75 0.1338 -0.0030 0.0015 -0.0105 0.0008 0.0391 0.0165 0.0271
21 +EEG_FAA_ABS_BETA3_FP2FP1+EEG_FAA_ABS_BETA3_FP2FP1² 0.0254 0.75 0.1195 0.0398 -0.0030 0.0602 -0.0782 0.0316 0.0195 0.0135
22 +EEG_FAA_ABS_GAMMA1_P4P3 0.0250 0.62 0.1316 0.0331 0.0045 -0.0045 -0.0105 0.0398 0.0195 -0.0135
23 +EEG_PWR_REL_BETA_PARIETAL_AVG 0.0246 0.62 0.1195 -0.0075 -0.0030 0.0090 -0.0045 0.0534 0.0105 0.0195
24 +EEG_FAA_REL_GAMMA1_F8F7+EEG_FAA_REL_GAMMA1_F8F7² 0.0239 0.75 0.1053 0.0639 0.0000 0.0301 -0.0541 0.0218 0.0150 0.0090
25 +EEG_FAA_ABS_GAMMA_FP2FP1 0.0235 0.75 0.1075 0.0338 -0.0165 0.0165 -0.0120 0.0346 0.0090 0.0150
26 +EEG_PWR_REL_HBETA_BRAIN_AVG+EEG_PWR_REL_HBETA_BRAIN_AVG² 0.0232 0.75 0.0910 0.0925 -0.0211 0.0015 -0.0301 0.0173 0.0286 0.0060
27 +EEG_PWR_ABS_ALPHA_OCCIPITAL_AVG 0.0231 0.88 0.0647 0.0338 0.0090 0.0000 0.0256 0.0248 0.0120 0.0150
28 +EEG_PWR_REL_ALPHA_BRAIN_AVG+EEG_PWR_REL_ALPHA_BRAIN_AVG² 0.0230 0.75 0.1316 0.0015 -0.0361 0.0195 -0.0316 0.0586 0.0256 0.0150
29 +EEG_FAA_REL_ALPHA_FP2FP1 0.0227 0.62 0.1338 0.0248 -0.0211 0.0000 -0.0150 0.0278 0.0376 -0.0060
30 +EEG_FAA_REL_GAMMA1_F4F3+EEG_FAA_REL_GAMMA1_F4F3² 0.0225 0.88 0.0789 0.0338 0.0090 0.0030 -0.0767 0.0654 0.0436 0.0226
────────────────────────────────────────────────────────────────────────────────
Top 10 by positive_ratio (helps the most models)
────────────────────────────────────────────────────────────────────────────────
feature_combination mean_delta positive_ratio delta_DecisionTree delta_KNN delta_LogisticRegression delta_MLP delta_NaiveBayes delta_RandomForest delta_SVM delta_XGBoost
rank
1 +EEG_FAA_ABS_ALPHA1_O2O1+EEG_FAA_ABS_ALPHA1_O2O1² 0.0617 1.00 0.1195 0.0353 0.0466 0.0797 0.0286 0.0579 0.0692 0.0571
5 +EEG_PWR_ABS_ALPHA_TEMPORAL_AVG 0.0366 1.00 0.1053 0.0459 0.0090 0.0286 0.0090 0.0436 0.0346 0.0165
9 +EEG_PWR_ABS_ALPHA_OCCIPITAL_AVG+EEG_PWR_ABS_ALPHA_OCCIPITAL_AVG² 0.0315 1.00 0.1030 0.0451 0.0180 0.0286 0.0045 0.0256 0.0180 0.0090
42 +EEG_FAA_REL_DELTA_F4F3 0.0207 1.00 0.0669 0.0090 0.0180 0.0135 0.0045 0.0263 0.0135 0.0135
49 +EEG_FAA_REL_ALPHA2_P4P3 0.0198 1.00 0.0143 0.0241 0.0045 0.0511 0.0015 0.0301 0.0120 0.0211
2 +EEG_FAA_ABS_ALPHA1_O2O1 0.0521 0.88 0.0910 0.0579 -0.0180 0.0737 0.0301 0.0556 0.0662 0.0602
7 +EEG_FAA_REL_BETA1_O2O1 0.0320 0.88 0.1053 0.0586 -0.0211 0.0000 0.0165 0.0496 0.0241 0.0226
19 +EEG_FAA_REL_GAMMA1_F4F3 0.0259 0.88 0.0789 0.0286 0.0120 0.0075 -0.0271 0.0549 0.0346 0.0180
27 +EEG_PWR_ABS_ALPHA_OCCIPITAL_AVG 0.0231 0.88 0.0647 0.0338 0.0090 0.0000 0.0256 0.0248 0.0120 0.0150
30 +EEG_FAA_REL_GAMMA1_F4F3+EEG_FAA_REL_GAMMA1_F4F3² 0.0225 0.88 0.0789 0.0338 0.0090 0.0030 -0.0767 0.0654 0.0436 0.0226
Readable report → /Users/yuchi/PycharmProjects/PsyMl_ISI/ML/PAC/outputs/batch_brute_v1_report.txt
================================================================================
Done. Total time: 265.8s (4.4 min)
================================================================================
EEG_PWR_REL_BETA_BRAIN_AVG
BAI_T1+BDI_T1
/Users/yuchi/PycharmProjects/PsyMl_ISI/.venv/bin/python /Users/yuchi/PycharmProjects/PsyMl_ISI/ML/ml_benchmark_modular.py
[Filter] ID exclusions: 8 rows removed.
============================================================
[BASE] source=isi_raw_data_recalc_5s target=3TP rows=54 features=2
[BASE] columns used: ['BDI_T1', 'BAI_T1']
[CV] Stratified 10-fold, seed=42 | class_weight=balanced
[Aggregation] K-fold = weighted | LOSO = weighted
[Preprocessing] HRV log transform=False | outlier handling=iqr
[Leakage check] Class balance
count percent%
3TP
0 35 64.8
1 19 35.2
[Leakage check] No columns found with |r| ≥ 0.95 against the target.
=== [BASE] ML Benchmark (Stratified 10-fold CV) ===
model AUC AUC_overall F1_pos(=1) Prec_pos Rec_pos F1_neg(=0) Prec_neg Rec_neg MCC Accuracy Pred1_mean Pred0_mean
NaiveBayes 0.913 0.913 0.743 0.812 0.684 0.877 0.842 0.914 0.626 0.833 1.600 3.800
LogisticRegression 0.905 0.905 0.757 0.778 0.737 0.873 0.861 0.886 0.631 0.833 1.800 3.600
XGBoost 0.868 0.868 0.718 0.700 0.737 0.841 0.853 0.829 0.559 0.796 2.000 3.400
SVM 0.863 0.863 0.762 0.696 0.842 0.848 0.903 0.800 0.620 0.815 2.300 3.100
MLP 0.860 0.860 0.571 0.625 0.526 0.795 0.763 0.829 0.371 0.722 1.600 3.800
KNN 0.838 0.838 0.632 0.632 0.632 0.800 0.800 0.800 0.432 0.741 1.900 3.500
RandomForest 0.835 0.835 0.571 0.625 0.526 0.795 0.763 0.829 0.371 0.722 1.600 3.800
DecisionTree 0.651 0.651 0.529 0.600 0.474 0.784 0.744 0.829 0.322 0.704 1.500 3.900
--- [BASE] Aggregated Confusion Matrix ---
model TN_sum FP_sum FN_sum TP_sum FP_FN_IDS
NaiveBayes 32 3 6 13 [S112271, S112019, S112240 | S112002, S112169, S112222, S112036, S112183, S112257]
LogisticRegression 31 4 5 14 [S112271, S112201, S112019, S112240 | S112169, S112222, S112036, S112183, S112257]
XGBoost 29 6 5 14 [S112012, S112271, S112159, S112042, S112019, S112240 | S112029, S112222, S112036, S112183, S112039]
SVM 28 7 3 16 [S112271, S112159, S112119, S112184, S112201, S112019, S112240 | S112169, S112036, S112183]
MLP 29 6 9 10 [S112012, S112271, S112159, S112184, S112019, S112240 | S112029, S112169, S112003, S112209, S112036, S112183, S112023, S112086, S112039]
KNN 28 7 7 12 [S112012, S112271, S112184, S112042, S112201, S112019, S112240 | S112002, S112169, S112222, S112036, S112183, S112257, S112039]
RandomForest 29 6 9 10 [S112012, S112271, S112159, S112042, S112019, S112240 | S112002, S112029, S112169, S112222, S112036, S112183, S112023, S112086, S112039]
DecisionTree 29 6 10 9 [S112012, S112271, S112159, S112042, S112019, S112240 | S112029, S112055, S112169, S112003, S112222, S112036, S112183, S112023, S112086, S112039]
[Filter] ID exclusions: 8 rows removed.
============================================================
[BASE + ADDED] source=isi_raw_data_recalc_5s target=3TP rows=54 features=3
[BASE + ADDED] columns used: ['BDI_T1', 'BAI_T1', 'EEG_PWR_REL_BETA_BRAIN_AVG']
[CV] Stratified 10-fold, seed=42 | class_weight=balanced
[Aggregation] K-fold = weighted | LOSO = weighted
[Preprocessing] HRV log transform=False | outlier handling=iqr
[Leakage check] Class balance
count percent%
3TP
0 35 64.800
1 19 35.200
[Leakage check] No columns found with |r| ≥ 0.95 against the target.
=== [BASE + ADDED] ML Benchmark (Stratified 10-fold CV) ===
model AUC AUC_overall F1_pos(=1) Prec_pos Rec_pos F1_neg(=0) Prec_neg Rec_neg MCC Accuracy Pred1_mean Pred0_mean
KNN 0.923 0.923 0.778 0.824 0.737 0.889 0.865 0.914 0.670 0.852 1.700 3.700
MLP 0.911 0.911 0.769 0.750 0.789 0.870 0.882 0.857 0.639 0.833 2.000 3.400
NaiveBayes 0.908 0.908 0.743 0.812 0.684 0.877 0.842 0.914 0.626 0.833 1.600 3.800
LogisticRegression 0.899 0.899 0.757 0.778 0.737 0.873 0.861 0.886 0.631 0.833 1.800 3.600
SVM 0.880 0.880 0.718 0.700 0.737 0.841 0.853 0.829 0.559 0.796 2.000 3.400
XGBoost 0.874 0.874 0.684 0.684 0.684 0.829 0.829 0.829 0.513 0.778 1.900 3.500
RandomForest 0.871 0.871 0.647 0.733 0.579 0.838 0.795 0.886 0.495 0.778 1.500 3.900
DecisionTree 0.623 0.623 0.500 0.529 0.474 0.750 0.730 0.771 0.252 0.667 1.700 3.700
--- [BASE + ADDED] Aggregated Confusion Matrix ---
model TN_sum FP_sum FN_sum TP_sum FP_FN_IDS
KNN 32 3 5 14 [S112271, S112019, S112240 | S112222, S112183, S112023, S112257, S112039]
MLP 30 5 4 15 [S112271, S112119, S112070, S112019, S112240 | S112003, S112222, S112023, S112257]
NaiveBayes 32 3 6 13 [S112271, S112019, S112240 | S112002, S112169, S112222, S112036, S112183, S112257]
LogisticRegression 31 4 5 14 [S112271, S112201, S112019, S112240 | S112169, S112222, S112036, S112183, S112257]
SVM 29 6 5 14 [S112271, S112159, S112119, S112201, S112019, S112240 | S112222, S112036, S112183, S112023, S112257]
XGBoost 29 6 6 13 [S112012, S112271, S112042, S112201, S112019, S112240 | S112169, S112222, S112036, S112183, S112257, S112039]
RandomForest 31 4 8 11 [S112012, S112271, S112019, S112240 | S112169, S112003, S112222, S112036, S112183, S112023, S112257, S112039]
DecisionTree 27 8 10 9 [S112012, S112271, S112070, S112266, S112042, S112201, S112019, S112240 | S112055, S112169, S112003, S112222, S112036, S112183, S112023, S112257, S112075, S112039]
[Filter] ID exclusions: 8 rows removed.
============================================================
[BASE + ADDED + ADDED²] source=isi_raw_data_recalc_5s target=3TP rows=54 features=3
[BASE + ADDED + ADDED²] columns used: ['BDI_T1', 'BAI_T1', 'EEG_PWR_REL_BETA_BRAIN_AVG']
[BASE + ADDED + ADDED²] polynomial features (generated inside the Pipeline): ['EEG_PWR_REL_BETA_BRAIN_AVG²']
[CV] Stratified 10-fold, seed=42 | class_weight=balanced
[Aggregation] K-fold = weighted | LOSO = weighted
[Preprocessing] HRV log transform=False | outlier handling=iqr
[Leakage check] Class balance
count percent%
3TP
0 35 64.800
1 19 35.200
[Leakage check] No columns found with |r| ≥ 0.95 against the target.
=== [BASE + ADDED + ADDED²] ML Benchmark (Stratified 10-fold CV) ===
model AUC AUC_overall F1_pos(=1) Prec_pos Rec_pos F1_neg(=0) Prec_neg Rec_neg MCC Accuracy Pred1_mean Pred0_mean
SVM 0.935 0.935 0.821 0.800 0.842 0.899 0.912 0.886 0.720 0.870 2.000 3.400
KNN 0.924 0.924 0.800 0.875 0.737 0.904 0.868 0.943 0.711 0.870 1.600 3.800
NaiveBayes 0.895 0.895 0.686 0.750 0.632 0.849 0.816 0.886 0.541 0.796 1.600 3.800
RandomForest 0.889 0.889 0.727 0.857 0.632 0.880 0.825 0.943 0.626 0.833 1.400 4.000
XGBoost 0.881 0.881 0.757 0.778 0.737 0.873 0.861 0.886 0.631 0.833 1.800 3.600
LogisticRegression 0.874 0.874 0.722 0.765 0.684 0.861 0.838 0.886 0.586 0.815 1.700 3.700
MLP 0.868 0.868 0.769 0.750 0.789 0.870 0.882 0.857 0.639 0.833 2.000 3.400
DecisionTree 0.754 0.754 0.683 0.636 0.737 0.806 0.844 0.771 0.494 0.759 2.200 3.200
--- [BASE + ADDED + ADDED²] Aggregated Confusion Matrix ---
model TN_sum FP_sum FN_sum TP_sum FP_FN_IDS
SVM 31 4 3 16 [S112271, S112159, S112119, S112019 | S112222, S112183, S112023]
KNN 33 2 5 14 [S112271, S112019 | S112222, S112183, S112023, S112257, S112039]
NaiveBayes 31 4 7 12 [S112012, S112271, S112019, S112240 | S112002, S112169, S112222, S112036, S112183, S112257, S112039]
RandomForest 33 2 7 12 [S112271, S112240 | S112169, S112003, S112222, S112036, S112183, S112023, S112039]
XGBoost 31 4 5 14 [S112271, S112042, S112019, S112240 | S112169, S112003, S112222, S112183, S112039]
LogisticRegression 31 4 6 13 [S112271, S112201, S112019, S112240 | S112169, S112222, S112036, S112183, S112023, S112257]
MLP 30 5 4 15 [S112271, S112119, S112070, S112019, S112240 | S112003, S112222, S112023, S112257]
DecisionTree 27 8 5 14 [S112012, S112271, S112070, S112266, S112042, S112201, S112019, S112240 | S112003, S112222, S112036, S112183, S112075]
============================================================
=== Feature Comparison ===
BASE: ['BAI_T1', 'BDI_T1']
ADDED: ['EEG_PWR_REL_BETA_BRAIN_AVG']
ADDED sq: ['EEG_PWR_REL_BETA_BRAIN_AVG²']
============================================================
model AUC_base AUC_added delta(+ADDED) AUC_sq delta(+ADDED²) delta(sq-linear)
NaiveBayes 0.9128 0.9083 -0.0045 0.8947 -0.0180 -0.0135
LogisticRegression 0.9053 0.8992 -0.0060 0.8737 -0.0316 -0.0256
XGBoost 0.8677 0.8737 0.0060 0.8812 0.0135 0.0075
SVM 0.8632 0.8797 0.0165 0.9353 0.0722 0.0556
MLP 0.8602 0.9113 0.0511 0.8677 0.0075 -0.0436
KNN 0.8383 0.9233 0.0850 0.9241 0.0857 0.0008
RandomForest 0.8353 0.8714 0.0361 0.8895 0.0541 0.0180
DecisionTree 0.6511 0.6226 -0.0286 0.7541 0.1030 0.1316
8-model mean AUC delta:
+ADDED (linear): +0.0195
+ADDED² (quad): +0.0358
sq vs linear: +0.0164
============================================================
Process finished with exit code 0
BDI_T1
/Users/yuchi/PycharmProjects/PsyMl_ISI/.venv/bin/python /Users/yuchi/PycharmProjects/PsyMl_ISI/ML/ml_benchmark_modular.py
[Filter] ID exclusions: 8 rows removed.
============================================================
[BASE] source=isi_raw_data_recalc_5s target=3TP rows=54 features=1
[BASE] columns used: ['BDI_T1']
[CV] Stratified 10-fold, seed=42 | class_weight=balanced
[Aggregation] K-fold = weighted | LOSO = weighted
[Preprocessing] HRV log transform=False | outlier handling=iqr
[Leakage check] Class balance
count percent%
3TP
0 35 64.8
1 19 35.2
[Leakage check] No columns found with |r| ≥ 0.95 against the target.
=== [BASE] ML Benchmark (Stratified 10-fold CV) ===
model AUC AUC_overall F1_pos(=1) Prec_pos Rec_pos F1_neg(=0) Prec_neg Rec_neg MCC Accuracy Pred1_mean Pred0_mean
LogisticRegression 0.874 0.874 0.684 0.684 0.684 0.829 0.829 0.829 0.513 0.778 1.900 3.500
NaiveBayes 0.862 0.862 0.688 0.846 0.579 0.868 0.805 0.943 0.583 0.815 1.300 4.100
XGBoost 0.853 0.853 0.750 0.714 0.789 0.853 0.879 0.829 0.605 0.815 2.100 3.300
KNN 0.845 0.845 0.703 0.722 0.684 0.845 0.833 0.857 0.548 0.796 1.800 3.600
MLP 0.836 0.836 0.684 0.684 0.684 0.829 0.829 0.829 0.513 0.778 1.900 3.500
RandomForest 0.824 0.824 0.737 0.737 0.737 0.857 0.857 0.857 0.594 0.815 1.900 3.500
SVM 0.789 0.789 0.727 0.640 0.842 0.812 0.897 0.743 0.560 0.778 2.500 2.900
DecisionTree 0.786 0.786 0.737 0.737 0.737 0.857 0.857 0.857 0.594 0.815 1.900 3.500
--- [BASE] Aggregated Confusion Matrix ---
model TN_sum FP_sum FN_sum TP_sum FP_FN_IDS
LogisticRegression 29 6 6 13 [S112271, S112159, S112119, S112201, S112019, S112240 | S112002, S112169, S112222, S112036, S112183, S112257]
NaiveBayes 33 2 8 11 [S112271, S112240 | S112002, S112029, S112169, S112222, S112036, S112183, S112257, S112214]
XGBoost 29 6 4 15 [S112012, S112271, S112159, S112042, S112105, S112240 | S112029, S112003, S112222, S112036]
KNN 30 5 6 13 [S112012, S112271, S112042, S112105, S112240 | S112002, S112029, S112003, S112222, S112036, S112183]
MLP 29 6 6 13 [S112012, S112271, S112159, S112042, S112105, S112240 | S112002, S112029, S112003, S112222, S112036, S112183]
RandomForest 30 5 5 14 [S112012, S112271, S112159, S112070, S112240 | S112029, S112003, S112222, S112036, S112086]
SVM 26 9 3 16 [S112012, S112271, S112159, S112119, S112042, S112201, S112105, S112019, S112240 | S112002, S112222, S112183]
DecisionTree 30 5 5 14 [S112012, S112271, S112159, S112070, S112240 | S112029, S112003, S112222, S112036, S112086]
[Filter] ID exclusions: 8 rows removed.
============================================================
[BASE + ADDED] source=isi_raw_data_recalc_5s target=3TP rows=54 features=2
[BASE + ADDED] columns used: ['BDI_T1', 'EEG_PWR_REL_BETA_BRAIN_AVG']
[CV] Stratified 10-fold, seed=42 | class_weight=balanced
[Aggregation] K-fold = weighted | LOSO = weighted
[Preprocessing] HRV log transform=False | outlier handling=iqr
[Leakage check] Class balance
count percent%
3TP
0 35 64.800
1 19 35.200
[Leakage check] No columns found with |r| ≥ 0.95 against the target.
=== [BASE + ADDED] ML Benchmark (Stratified 10-fold CV) ===
model AUC AUC_overall F1_pos(=1) Prec_pos Rec_pos F1_neg(=0) Prec_neg Rec_neg MCC Accuracy Pred1_mean Pred0_mean
MLP 0.907 0.907 0.789 0.789 0.789 0.886 0.886 0.886 0.675 0.852 1.900 3.500
KNN 0.901 0.901 0.778 0.824 0.737 0.889 0.865 0.914 0.670 0.852 1.700 3.700
SVM 0.880 0.880 0.732 0.682 0.789 0.836 0.875 0.800 0.573 0.796 2.200 3.200
NaiveBayes 0.866 0.866 0.647 0.733 0.579 0.838 0.795 0.886 0.495 0.778 1.500 3.900
LogisticRegression 0.863 0.863 0.700 0.667 0.737 0.824 0.848 0.800 0.526 0.778 2.100 3.300
RandomForest 0.863 0.863 0.757 0.778 0.737 0.873 0.861 0.886 0.631 0.833 1.800 3.600
XGBoost 0.833 0.833 0.829 0.773 0.895 0.896 0.938 0.857 0.731 0.870 2.200 3.200
DecisionTree 0.744 0.744 0.667 0.706 0.632 0.833 0.811 0.857 0.503 0.778 1.700 3.700
--- [BASE + ADDED] Aggregated Confusion Matrix ---
model TN_sum FP_sum FN_sum TP_sum FP_FN_IDS
MLP 31 4 4 15 [S112271, S112070, S112266, S112240 | S112222, S112023, S112257, S112075]
KNN 32 3 5 14 [S112271, S112070, S112240 | S112002, S112222, S112183, S112023, S112257]
SVM 28 7 4 15 [S112271, S112159, S112119, S112070, S112201, S112019, S112240 | S112002, S112222, S112183, S112023]
NaiveBayes 31 4 8 11 [S112012, S112271, S112201, S112240 | S112002, S112029, S112169, S112222, S112036, S112183, S112257, S112214]
LogisticRegression 28 7 5 14 [S112012, S112271, S112159, S112119, S112201, S112019, S112240 | S112002, S112222, S112036, S112183, S112257]
RandomForest 31 4 5 14 [S112012, S112271, S112070, S112240 | S112003, S112222, S112023, S112257, S112039]
XGBoost 30 5 2 17 [S112012, S112271, S112042, S112201, S112240 | S112003, S112222]
DecisionTree 30 5 7 12 [S112271, S112070, S112176, S112266, S112042 | S112003, S112222, S112036, S112183, S112023, S112257, S112075]
[Filter] ID exclusions: 8 rows removed.
============================================================
[BASE + ADDED + ADDED²] source=isi_raw_data_recalc_5s target=3TP rows=54 features=2
[BASE + ADDED + ADDED²] columns used: ['BDI_T1', 'EEG_PWR_REL_BETA_BRAIN_AVG']
[BASE + ADDED + ADDED²] polynomial features (generated inside the Pipeline): ['EEG_PWR_REL_BETA_BRAIN_AVG²']
[CV] Stratified 10-fold, seed=42 | class_weight=balanced
[Aggregation] K-fold = weighted | LOSO = weighted
[Preprocessing] HRV log transform=False | outlier handling=iqr
[Leakage check] Class balance
count percent%
3TP
0 35 64.800
1 19 35.200
[Leakage check] No columns found with |r| ≥ 0.95 against the target.
=== [BASE + ADDED + ADDED²] ML Benchmark (Stratified 10-fold CV) ===
model AUC AUC_overall F1_pos(=1) Prec_pos Rec_pos F1_neg(=0) Prec_neg Rec_neg MCC Accuracy Pred1_mean Pred0_mean
SVM 0.911 0.911 0.789 0.789 0.789 0.886 0.886 0.886 0.675 0.852 1.900 3.500
MLP 0.904 0.904 0.842 0.842 0.842 0.914 0.914 0.914 0.756 0.889 1.900 3.500
KNN 0.888 0.888 0.688 0.846 0.579 0.868 0.805 0.943 0.583 0.815 1.300 4.100
XGBoost 0.883 0.883 0.821 0.800 0.842 0.899 0.912 0.886 0.720 0.870 2.000 3.400
RandomForest 0.873 0.873 0.706 0.800 0.632 0.865 0.821 0.914 0.582 0.815 1.500 3.900
LogisticRegression 0.859 0.859 0.769 0.750 0.789 0.870 0.882 0.857 0.639 0.833 2.000 3.400
NaiveBayes 0.854 0.854 0.606 0.714 0.526 0.827 0.775 0.886 0.449 0.759 1.400 4.000
DecisionTree 0.730 0.730 0.649 0.667 0.632 0.817 0.806 0.829 0.466 0.759 1.800 3.600
--- [BASE + ADDED + ADDED²] Aggregated Confusion Matrix ---
model TN_sum FP_sum FN_sum TP_sum FP_FN_IDS
SVM 31 4 4 15 [S112271, S112159, S112119, S112019 | S112002, S112222, S112183, S112023]
MLP 32 3 3 16 [S112271, S112070, S112240 | S112222, S112023, S112075]
KNN 33 2 8 11 [S112271, S112070 | S112002, S112003, S112209, S112222, S112183, S112023, S112257, S112039]
XGBoost 31 4 3 16 [S112271, S112042, S112201, S112240 | S112169, S112003, S112222]
RandomForest 32 3 7 12 [S112271, S112070, S112240 | S112169, S112003, S112222, S112023, S112257, S112075, S112039]
LogisticRegression 30 5 4 15 [S112271, S112119, S112201, S112019, S112240 | S112002, S112222, S112183, S112023]
NaiveBayes 31 4 9 10 [S112012, S112271, S112201, S112240 | S112002, S112029, S112169, S112003, S112222, S112036, S112183, S112257, S112214]
DecisionTree 29 6 7 12 [S112012, S112271, S112070, S112176, S112266, S112042 | S112003, S112222, S112036, S112183, S112075, S112039, S112087]
============================================================
=== Feature Comparison ===
BASE: ['BDI_T1']
ADDED: ['EEG_PWR_REL_BETA_BRAIN_AVG']
ADDED sq: ['EEG_PWR_REL_BETA_BRAIN_AVG²']
============================================================
model AUC_base AUC_added delta(+ADDED) AUC_sq delta(+ADDED²) delta(sq-linear)
LogisticRegression 0.8737 0.8632 -0.0105 0.8586 -0.0150 -0.0045
NaiveBayes 0.8617 0.8662 0.0045 0.8541 -0.0075 -0.0120
XGBoost 0.8534 0.8331 -0.0203 0.8827 0.0293 0.0496
KNN 0.8451 0.9008 0.0556 0.8880 0.0429 -0.0128
MLP 0.8361 0.9068 0.0707 0.9038 0.0677 -0.0030
RandomForest 0.8241 0.8632 0.0391 0.8729 0.0489 0.0098
SVM 0.7895 0.8797 0.0902 0.9113 0.1218 0.0316
DecisionTree 0.7857 0.7444 -0.0414 0.7301 -0.0556 -0.0143
8-model mean AUC delta:
+ADDED (linear): +0.0235
+ADDED² (quad): +0.0290
sq vs linear: +0.0055
============================================================
Process finished with exit code 0
PAC
/Users/yuchi/PycharmProjects/PsyMl_ISI/.venv/bin/python /Users/yuchi/PycharmProjects/PsyMl_ISI/ML/PAC/batch_brute_v1.py
================================================================================
Brute-force single-feature test (batch_brute_v1) [optimized]
================================================================================
DB: /Users/yuchi/PycharmProjects/PsyMl_ISI/data/psy_ml_isi.db
Table: isi_raw_data_recalc_5s10
Added groups: ['PAC'] (prefixes=['EEG_PAC_'])
PAC JOIN: ['MI', 'MVL']
Skipped samples: ['S112008', 'S112043', 'S112074', 'S112104', 'S112120', 'S112194', 'S112203', 'S112268']
Ranking metric: AUC
CV seed: 42
Workers: 9 / 10 cores
Output directory: /Users/yuchi/PycharmProjects/PsyMl_ISI/ML/PAC/outputs
================================================================================
[1/4] Loading data...
[PAC JOIN] isi_raw_data_recalc_5s10_pac_mi: 58/96 samples matched
[PAC JOIN] isi_raw_data_recalc_5s10_pac_mvl: 58/96 samples matched
[SKIP] ID exclusions: 8 samples skipped (['S112008', 'S112043', 'S112074', 'S112104', 'S112120', 'S112194', 'S112203', 'S112268'])
Samples: 88
Target distribution: {0: 60, 1: 28}
Added features: 2090
Squared terms: OFF
Parallel tasks: 16720 (2090 features × 8 classifiers)
Baseline cache: ON (features with the same complete-case mask share a baseline)
Unique masks: 1 / 2090 features (saves 16712 redundant baseline computations)
[2/4] Preparing temporary data store...
Temp path: /var/folders/w3/0b378zts7xn2dmlg9rw77lc00000gn/T/brute_v1_mfihprux
[3/4] Running 16720 tasks in parallel (9 workers)...
feature×classifier: 100%|██████████| 16720/16720 [24:24<00:00, 11.42task/s]
Fine-grained results → /Users/yuchi/PycharmProjects/PsyMl_ISI/ML/PAC/outputs/batch_brute_v1_results.csv
[4/4] Computing ranking (mean_delta, metric=AUC)...
Ranking results → /Users/yuchi/PycharmProjects/PsyMl_ISI/ML/PAC/outputs/batch_brute_v1_ranking.csv
────────────────────────────────────────────────────────────────────────────────
Matched baseline AUC per model (mean over each feature's matched subset)
────────────────────────────────────────────────────────────────────────────────
DecisionTree F1=0.5033 AUC=0.6708
KNN F1=0.6033 AUC=0.8833
LogisticRegression F1=0.7100 AUC=0.9625
MLP F1=0.6067 AUC=0.9250
NaiveBayes F1=0.7000 AUC=0.9625
RandomForest F1=0.5467 AUC=0.8750
SVM F1=0.7267 AUC=0.9125
XGBoost F1=0.4900 AUC=0.9000
────────────────────────────────────────────────────────────────────────────────
Top 30 added features (sorted by mean_delta_AUC)
────────────────────────────────────────────────────────────────────────────────
feature_combination mean_delta positive_ratio delta_DecisionTree delta_KNN delta_LogisticRegression delta_MLP delta_NaiveBayes delta_RandomForest delta_SVM delta_XGBoost
rank
1 +EEG_PAC_DELTA_ALPHA1_MVL_T8 0.0664 1.00 0.1375 0.0437 0.0250 0.0750 0.0125 0.1125 0.0375 0.0875
2 +EEG_PAC_ALPHA_BETA1_MI_FP2 0.0586 0.88 0.1958 0.0437 -0.0208 0.0417 0.0125 0.0917 0.0708 0.0333
3 +EEG_PAC_DELTA_ALPHA_MVL_C4 0.0583 0.88 0.2167 0.0417 0.0000 0.0500 0.0125 0.1000 0.0125 0.0333
4 +EEG_PAC_DELTA_ALPHA2_MVL_O1 0.0544 0.75 0.1625 0.0479 -0.0250 0.0250 0.0000 0.0875 0.0625 0.0750
5 +EEG_PAC_DELTA_ALPHA_MVL_PZ 0.0529 0.75 0.1375 0.0979 0.0000 0.0042 0.0000 0.0583 0.0500 0.0750
6 +EEG_PAC_ALPHA2_GAMMA1_MI_C3 0.0503 0.88 0.1333 0.0604 0.0000 0.0500 0.0125 0.1000 0.0375 0.0083
7 +EEG_PAC_ALPHA_ALPHA_MVL_C4 0.0477 0.75 0.1208 0.0792 -0.0167 0.0375 0.0000 0.0813 0.0500 0.0292
8 +EEG_PAC_ALPHA_BETA1_MI_F7 0.0461 0.75 0.1583 0.0500 -0.0208 0.0125 -0.0208 0.0646 0.0583 0.0667
9 +EEG_PAC_ALPHA_GAMMA1_MI_CZ 0.0458 0.62 0.1750 0.0542 0.0000 0.0000 0.0125 0.0750 0.0000 0.0500
10 +EEG_PAC_DELTA_GAMMA_MI_CZ 0.0451 0.75 0.1458 0.0312 -0.0125 0.0500 0.0000 0.0667 0.0208 0.0583
11 +EEG_PAC_ALPHA_ALPHA1_MI_O1 0.0443 0.75 0.1500 0.0333 0.0000 0.0375 0.0000 0.0542 0.0333 0.0458
12 +EEG_PAC_ALPHA_BETA1_MI_FP1 0.0437 0.75 0.1167 0.0500 -0.0125 0.0167 0.0000 0.0833 0.0375 0.0583
13 +EEG_PAC_ALPHA_BETA_MVL_P8 0.0430 0.75 0.1250 0.0812 0.0000 0.0500 0.0125 0.0250 0.0500 0.0000
14 +EEG_PAC_DELTA_ALPHA2_MVL_P4 0.0406 0.75 0.1125 0.0500 -0.0375 0.0625 -0.0167 0.0458 0.0500 0.0583
15 +EEG_PAC_ALPHA1_BETA2_MI_FP1 0.0406 0.75 0.1083 0.0583 -0.0125 0.0167 0.0000 0.0875 0.0250 0.0417
16 +EEG_PAC_ALPHA2_GAMMA1_MI_T8 0.0393 0.75 0.1375 0.0250 -0.0292 0.0125 -0.0375 0.0938 0.0500 0.0625
17 +EEG_PAC_ALPHA1_ALPHA1_MVL_FP2 0.0391 0.50 0.1917 0.0458 0.0000 -0.0292 0.0000 0.0583 0.0500 -0.0042
18 +EEG_PAC_DELTA_BETA2_MVL_O1 0.0378 0.75 0.1458 0.0604 0.0000 0.0042 -0.0125 0.0417 0.0333 0.0292
19 +EEG_PAC_ALPHA1_HBETA_MI_FP2 0.0375 0.75 0.0792 0.0542 -0.0208 0.0375 -0.0333 0.0750 0.0375 0.0708
20 +EEG_PAC_DELTA_ALPHA2_MVL_PZ 0.0372 0.62 0.0708 0.0771 -0.0125 0.0000 0.0000 0.0583 0.0333 0.0708
21 +EEG_PAC_ALPHA_BETA_MI_P3 0.0367 0.75 0.1125 0.0437 0.0000 0.0250 -0.0375 0.0875 0.0375 0.0250
22 +EEG_PAC_DELTA_BETA2_MVL_FP1 0.0367 0.62 0.1375 0.0542 0.0000 0.0208 0.0000 0.0563 0.0000 0.0250
23 +EEG_PAC_DELTA_ALPHA_MVL_T8 0.0359 1.00 0.0500 0.0542 0.0125 0.0208 0.0125 0.0750 0.0125 0.0500
24 +EEG_PAC_THETA_BETA2_MI_C3 0.0354 0.50 0.1375 0.1042 -0.0125 -0.0167 0.0000 0.0542 0.0750 -0.0583
25 +EEG_PAC_THETA_ALPHA1_MI_T7 0.0349 0.62 0.1375 0.0604 0.0000 0.0250 0.0000 0.0563 0.0250 -0.0250
26 +EEG_PAC_ALPHA_BETA2_MI_FP1 0.0346 0.75 0.0750 0.0604 -0.0250 0.0083 0.0000 0.0500 0.0500 0.0583
27 +EEG_PAC_THETA_ALPHA2_MVL_F7 0.0341 0.75 0.0500 0.0646 -0.0208 0.0375 -0.0167 0.0750 0.0333 0.0500
28 +EEG_PAC_ALPHA1_GAMMA1_MI_CZ 0.0341 0.62 0.1625 -0.0042 -0.0292 0.0375 -0.0292 0.0437 0.0208 0.0708
29 +EEG_PAC_ALPHA_BETA1_MI_F8 0.0341 0.88 0.0875 0.0167 0.0125 0.0125 -0.0042 0.0938 0.0083 0.0458
30 +EEG_PAC_DELTA_BETA2_MVL_FP2 0.0341 0.75 0.1208 0.0396 0.0000 0.0250 -0.0167 0.0500 0.0333 0.0208
────────────────────────────────────────────────────────────────────────────────
Top 10 by positive_ratio (helps the most models)
────────────────────────────────────────────────────────────────────────────────
feature_combination mean_delta positive_ratio delta_DecisionTree delta_KNN delta_LogisticRegression delta_MLP delta_NaiveBayes delta_RandomForest delta_SVM delta_XGBoost
rank
1 +EEG_PAC_DELTA_ALPHA1_MVL_T8 0.0664 1.00 0.1375 0.0437 0.0250 0.0750 0.0125 0.1125 0.0375 0.0875
23 +EEG_PAC_DELTA_ALPHA_MVL_T8 0.0359 1.00 0.0500 0.0542 0.0125 0.0208 0.0125 0.0750 0.0125 0.0500
2 +EEG_PAC_ALPHA_BETA1_MI_FP2 0.0586 0.88 0.1958 0.0437 -0.0208 0.0417 0.0125 0.0917 0.0708 0.0333
3 +EEG_PAC_DELTA_ALPHA_MVL_C4 0.0583 0.88 0.2167 0.0417 0.0000 0.0500 0.0125 0.1000 0.0125 0.0333
6 +EEG_PAC_ALPHA2_GAMMA1_MI_C3 0.0503 0.88 0.1333 0.0604 0.0000 0.0500 0.0125 0.1000 0.0375 0.0083
29 +EEG_PAC_ALPHA_BETA1_MI_F8 0.0341 0.88 0.0875 0.0167 0.0125 0.0125 -0.0042 0.0938 0.0083 0.0458
32 +EEG_PAC_ALPHA_GAMMA1_MVL_CZ 0.0336 0.88 0.0917 0.0437 0.0000 0.0125 0.0125 0.0500 0.0125 0.0458
35 +EEG_PAC_ALPHA_GAMMA_MVL_CZ 0.0323 0.88 0.0833 0.0417 0.0000 0.0125 0.0125 0.0708 0.0125 0.0250
42 +EEG_PAC_ALPHA1_BETA2_MVL_P8 0.0297 0.88 0.1125 0.0125 0.0125 0.0458 0.0125 0.0375 0.0208 -0.0167
50 +EEG_PAC_ALPHA2_GAMMA1_MVL_CZ 0.0279 0.88 0.0875 0.0354 0.0125 -0.0417 0.0125 0.0583 0.0375 0.0208
================================================================================
Done. Total time: 1468.1s (24.5 min)
================================================================================
Process finished with exit code 0