Recomputing from the raw data

/Users/yuchi/PycharmProjects/PsyMl_ISI/.venv/bin/python /Users/yuchi/PycharmProjects/PsyMl_ISI/ML/tools/lasso_ranking.py 
[Mode] Exclusion mode

[Data] source=isi_raw_data  target=3TP  samples=80  features=37
  Excluded groups: ['ACS', 'CPT', 'EEG', 'IGT', 'ISI', 'PSQI', 'WM']
  Overall missing-value ratio: 17.20%

[CV results] (score = neg_log_loss; higher is better, i.e. lower log_loss)
  lambda_min: C = 0.330177  |  lambda = 3.02868
  lambda_1SE: C = 0.242212  |  lambda = 4.12862
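
Note: both lambda values printed above equal 1/C, so the script appears to report lambda as the inverse of C. The sketch below shows that conversion together with one common form of the one-standard-error rule (take the strongest penalty whose mean CV score is within one SE of the best). The function name, the grid, and the scores are hypothetical; only the two C values are taken from the log, and lasso_ranking.py may implement the rule differently.

```python
def pick_c_min_and_1se(c_grid, mean_scores, se_scores):
    """Return (C_min, C_1se) for CV scores where higher is better."""
    # Index of the best mean CV score (e.g. neg_log_loss).
    best = max(range(len(c_grid)), key=lambda i: mean_scores[i])
    # Anything scoring at least (best mean - its standard error) is eligible.
    threshold = mean_scores[best] - se_scores[best]
    eligible = [i for i in range(len(c_grid)) if mean_scores[i] >= threshold]
    # Smallest C means the strongest L1 penalty, hence the sparsest model.
    one_se = min(eligible, key=lambda i: c_grid[i])
    return c_grid[best], c_grid[one_se]


# Hypothetical grid and scores; only the two middle C values come from the log.
c_grid = [0.1, 0.242212, 0.330177, 1.0]
mean_scores = [-0.60, -0.52, -0.50, -0.55]
se_scores = [0.04, 0.03, 0.03, 0.05]

c_min, c_1se = pick_c_min_and_1se(c_grid, mean_scores, se_scores)
lambda_min = 1.0 / c_min   # inverse-of-C convention inferred from the log
lambda_1se = 1.0 / c_1se
print(f"lambda_min: C = {c_min}  |  lambda = {lambda_min:.5f}")
print(f"lambda_1SE: C = {c_1se}  |  lambda = {lambda_1se:.5f}")
```

A smaller C (larger lambda) shrinks more coefficients to zero, which is consistent with the 1SE solution keeping fewer variables than lambda_min.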

[Selected variables (1SE)] sorted by |coefficient| (top 30)
                   coef  abs_coef
BDI_T1         1.475406  1.475406
BAI_T1         0.266486  0.266486
EF_MOTIVATION  0.091748  0.091748
HRV_VLF        0.017864  0.017864

[Comparison] nonzero variables at lambda_min: 5; nonzero variables at lambda_1SE: 4

[Selected variables (lambda_min)] sorted by |coefficient| (top 30)
                   coef  abs_coef
BDI_T1         1.595091  1.595091
BAI_T1         0.382922  0.382922
HRV_VLF        0.131116  0.131116
EF_MOTIVATION  0.097169  0.097169
HRV_RESP_RATE -0.015412  0.015412

[Top 10 (path peak)] not tied to a single C
               BDI_T1
      WCST_PERS_ERR_T
WCST_PCT_CONCEPTUAL_T
         HRV_RMSSD_MS
               BAI_T1
              HRV_VLF
               ERQ_ES
 EF_SOCIAL_INHIBITION
        HRV_RESP_RATE
               HRV_LF

[Data] source=isi_raw_data  target=3TP  rows=51  features=2
[Features] columns used (first 15): ['BDI_T1', 'BAI_T1']
[CV] Stratified 5-fold, seed=42  |  class_weight=balanced
[Aggregation] K-fold = weighted | LOSO = weighted

[Leakage check] Class balance
     count  percent%
3TP                 
0       35      68.6
1       16      31.4

[Leakage check] No column found with |r| ≥ 0.95 against the target.
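
Note: the check above reports whether any feature correlates with the target at |r| ≥ 0.95. A minimal sketch of such a screen follows, assuming r is the Pearson correlation between each feature column and the 0/1 target; the log does not say which correlation the script uses. The function names and the toy data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no correlation signal
    return sxy / math.sqrt(sxx * syy)


def flag_leaky_columns(columns, target, threshold=0.95):
    """Return {column name: r} for columns with |r| >= threshold."""
    flagged = {}
    for name, values in columns.items():
        r = pearson_r(values, target)
        if abs(r) >= threshold:
            flagged[name] = r
    return flagged


# Toy data only; none of these values come from isi_raw_data.
target = [0, 0, 1, 1, 0, 1]
columns = {
    "near_copy_of_target": [0.0, 0.1, 0.9, 1.0, 0.0, 1.0],
    "unrelated": [3, 1, 2, 1, 2, 3],
}
flagged = flag_leaky_columns(columns, target)
print(sorted(flagged))
```

A screen like this only catches single columns that nearly duplicate the label; it would not detect leakage through combinations of features or through preprocessing fitted on the full data.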

=== Basic ML Benchmark (Stratified 5-fold CV) ===
             model   AUC  F1_pos(=1)  Prec_pos  Rec_pos  F1_neg(=0)  Prec_neg  Rec_neg   MCC  Accuracy  Pred1_mean  Pred0_mean
LogisticRegression 0.878       0.688     0.688    0.688       0.857     0.857    0.857 0.545     0.804       3.200       7.000
        NaiveBayes 0.871       0.759     0.846    0.688       0.904     0.868    0.943 0.671     0.863       2.600       7.600
               MLP 0.844       0.684     0.591    0.812       0.812     0.897    0.743 0.520     0.765       4.400       5.800
               KNN 0.789       0.690     0.769    0.625       0.877     0.842    0.914 0.574     0.824       2.600       7.600
               SVM 0.776       0.686     0.632    0.750       0.836     0.875    0.800 0.528     0.784       3.800       6.400
      RandomForest 0.774       0.606     0.588    0.625       0.812     0.824    0.800 0.418     0.745       3.400       6.800
           XGBoost 0.762       0.647     0.611    0.688       0.824     0.848    0.800 0.473     0.765       3.600       6.600
      DecisionTree 0.721       0.606     0.588    0.625       0.812     0.824    0.800 0.418     0.745       3.400       6.800

--- Aggregated Confusion Matrix Sums (across all folds' test parts) ---
             model  TN_sum  FP_sum  FN_sum  TP_sum                                                                                                              FP_FN_IDS
LogisticRegression      30       5       5      11                            [S112194, S112019, S112047, S112159, S112119 | S112002, S112008, S112036, S112169, S112183]
        NaiveBayes      33       2       5      11                                                       [S112194, S112019 | S112002, S112008, S112036, S112169, S112183]
               MLP      26       9       3      13          [S112194, S112019, S112047, S112159, S112012, S112042, S112105, S112119, S112104 | S112008, S112115, S112183]
               KNN      32       3       6      10                                     [S112194, S112019, S112042 | S112002, S112008, S112036, S112169, S112183, S112029]
               SVM      28       7       4      12                   [S112194, S112019, S112047, S112159, S112012, S112119, S112104 | S112008, S112036, S112169, S112183]
      RandomForest      28       7       6      10 [S112194, S112019, S112047, S112159, S112012, S112042, S112176 | S112003, S112008, S112036, S112169, S112183, S112029]
           XGBoost      28       7       5      11          [S112194, S112019, S112047, S112159, S112012, S112042, S112105 | S112008, S112036, S112169, S112183, S112029]
      DecisionTree      28       7       6      10 [S112019, S112047, S112159, S112012, S112042, S112105, S112176 | S112003, S112008, S112036, S112169, S112183, S112029]
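
Note: the benchmark rows can be cross-checked against the pooled confusion counts. The sketch below recomputes the LogisticRegression row of this 2-feature run from TN=30, FP=5, FN=5, TP=11. Reading Pred1_mean and Pred0_mean as the average number of predicted positives and negatives per fold is an inference from the numbers, not something the log states, and the function name is hypothetical.

```python
import math


def metrics_from_counts(tn, fp, fn, tp, n_folds=5):
    """Recompute benchmark metrics from pooled confusion-matrix counts."""
    prec_pos = tp / (tp + fp)
    rec_pos = tp / (tp + fn)
    prec_neg = tn / (tn + fn)
    rec_neg = tn / (tn + fp)
    f1_pos = 2 * prec_pos * rec_pos / (prec_pos + rec_pos)
    f1_neg = 2 * prec_neg * rec_neg / (prec_neg + rec_neg)
    accuracy = (tp + tn) / (tn + fp + fn + tp)
    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "F1_pos": f1_pos, "Prec_pos": prec_pos, "Rec_pos": rec_pos,
        "F1_neg": f1_neg, "Prec_neg": prec_neg, "Rec_neg": rec_neg,
        "MCC": mcc, "Accuracy": accuracy,
        # Assumed meaning: mean count of predicted 1s / 0s per test fold.
        "Pred1_mean": (tp + fp) / n_folds,
        "Pred0_mean": (tn + fn) / n_folds,
    }


# LogisticRegression, 2-feature run: counts copied from the table above.
m = metrics_from_counts(tn=30, fp=5, fn=5, tp=11)
for key, value in m.items():
    print(f"{key:>10s} {value:.3f}")
```

For this row the recomputed values agree with the table (Prec_pos 0.688, MCC 0.545, Accuracy 0.804, Pred1_mean 3.200, Pred0_mean 7.000). AUC cannot be recovered this way because it depends on the predicted scores rather than on the thresholded counts.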

[Data] source=isi_raw_data  target=3TP  rows=51  features=5
[Features] columns used (first 15): ['BDI_T1', 'BAI_T1', 'HRV_VLF', 'HRV_LF', 'HRV_RESP_RATE']
[CV] Stratified 5-fold, seed=42  |  class_weight=balanced
[Aggregation] K-fold = weighted | LOSO = weighted

[Leakage check] Class balance
     count  percent%
3TP                 
0       35      68.6
1       16      31.4

[Leakage check] No column found with |r| ≥ 0.95 against the target.

=== Basic ML Benchmark (Stratified 5-fold CV) ===
             model   AUC  F1_pos(=1)  Prec_pos  Rec_pos  F1_neg(=0)  Prec_neg  Rec_neg   MCC  Accuracy  Pred1_mean  Pred0_mean
        NaiveBayes 0.857       0.800     0.857    0.750       0.917     0.892    0.943 0.720     0.882       2.800       7.400
LogisticRegression 0.827       0.710     0.733    0.688       0.873     0.861    0.886 0.584     0.824       3.000       7.200
               KNN 0.816       0.692     0.900    0.562       0.895     0.829    0.971 0.624     0.843       2.000       8.200
               SVM 0.814       0.688     0.688    0.688       0.857     0.857    0.857 0.545     0.804       3.200       7.000
           XGBoost 0.791       0.588     0.556    0.625       0.794     0.818    0.771 0.385     0.725       3.600       6.600
      RandomForest 0.750       0.552     0.615    0.500       0.822     0.789    0.857 0.380     0.745       2.600       7.600
      DecisionTree 0.638       0.514     0.474    0.562       0.746     0.781    0.714 0.266     0.667       3.800       6.400
               MLP 0.609       0.438     0.438    0.438       0.743     0.743    0.743 0.180     0.647       3.200       7.000

--- Aggregated Confusion Matrix Sums (across all folds' test parts) ---
             model  TN_sum  FP_sum  FN_sum  TP_sum                                                                                                                                                           FP_FN_IDS
        NaiveBayes      33       2       4      12                                                                                                             [S112194, S112019 | S112008, S112036, S112169, S112183]
LogisticRegression      31       4       5      11                                                                                  [S112194, S112019, S112159, S112119 | S112002, S112008, S112036, S112169, S112183]
               KNN      34       1       7       9                                                                                           [S112173 | S112002, S112003, S112008, S112039, S112036, S112169, S112183]
               SVM      30       5       5      11                                                                         [S112079, S112173, S112194, S112019, S112047 | S112002, S112008, S112036, S112169, S112183]
           XGBoost      27       8       6      10                                     [S112079, S112194, S112019, S112047, S112159, S112012, S112042, S112105 | S112002, S112003, S112008, S112036, S112169, S112183]
      RandomForest      30       5       8       8                                              [S112194, S112019, S112047, S112159, S112042 | S112002, S112003, S112008, S112039, S112036, S112169, S112183, S112029]
      DecisionTree      25      10       7       9          [S112079, S112173, S112194, S112019, S112047, S112159, S112012, S112042, S112105, S112119 | S112003, S112008, S112055, S112036, S112169, S112183, S112029]
               MLP      26       9       9       7 [S112194, S112019, S112012, S112105, S112119, S112158, S112177, S112160, S112180 | S112002, S112008, S112086, S112055, S112087, S112036, S112183, S112023, S112029]

[Data] source=isi_raw_data  target=3TP  rows=51  features=6
[Features] columns used (first 15): ['BDI_T1', 'BAI_T1', 'EF_MOTIVATION', 'HRV_VLF', 'HRV_LF', 'HRV_RESP_RATE']
[CV] Stratified 5-fold, seed=42  |  class_weight=balanced
[Aggregation] K-fold = weighted | LOSO = weighted

[Leakage check] Class balance
     count  percent%
3TP                 
0       35      68.6
1       16      31.4

[Leakage check] No column found with |r| ≥ 0.95 against the target.

=== Basic ML Benchmark (Stratified 5-fold CV) ===
             model   AUC  F1_pos(=1)  Prec_pos  Rec_pos  F1_neg(=0)  Prec_neg  Rec_neg   MCC  Accuracy  Pred1_mean  Pred0_mean
        NaiveBayes 0.859       0.800     0.857    0.750       0.917     0.892    0.943 0.720     0.882       2.800       7.400
           XGBoost 0.843       0.629     0.579    0.688       0.806     0.844    0.771 0.440     0.745       3.800       6.400
LogisticRegression 0.839       0.688     0.688    0.688       0.857     0.857    0.857 0.545     0.804       3.200       7.000
               KNN 0.821       0.667     1.000    0.500       0.897     0.814    1.000 0.638     0.843       1.600       8.600
               MLP 0.821       0.578     0.448    0.812       0.667     0.864    0.543 0.333     0.627       5.800       4.400
               SVM 0.816       0.667     0.714    0.625       0.861     0.838    0.886 0.531     0.804       2.800       7.400
      RandomForest 0.783       0.690     0.769    0.625       0.877     0.842    0.914 0.574     0.824       2.600       7.600
      DecisionTree 0.713       0.606     0.588    0.625       0.812     0.824    0.800 0.418     0.745       3.400       6.800

--- Aggregated Confusion Matrix Sums (across all folds' test parts) ---
             model  TN_sum  FP_sum  FN_sum  TP_sum                                                                                                                                                                    FP_FN_IDS
        NaiveBayes      33       2       4      12                                                                                              [S112194, S112019 | S112008, S112036, S112169, S112183]
           XGBoost      27       8       5      11                               [S112079, S112194, S112019, S112047, S112159, S112012, S112042, S112105 | S112002, S112008, S112036, S112169, S112183]
LogisticRegression      30       5       5      11                                                          [S112194, S112019, S112159, S112119, S112043 | S112002, S112008, S112036, S112169, S112183]
               KNN      35       0       8       8                                   [- | S112002, S112003, S112008, S112039, S112036, S112169, S112183, S112029]
               MLP      19      16       3      13 [S112194, S112019, S112047, S112159, S112012, S112042, S112105, S112119, S112158, S112177, S112077, S112160, S112164, S112180, S112043, S112104 | S112002, S112008, S112036]
               SVM      31       4       6      10                                                       [S112079, S112173, S112019, S112047 | S112002, S112008, S112039, S112036, S112169, S112183]
      RandomForest      32       3       6      10                                                         [S112019, S112047, S112159 | S112002, S112008, S112039, S112036, S112169, S112183]
      DecisionTree      28       7       6      10                                          [S112079, S112173, S112194, S112047, S112012, S112042, S112105 | S112003, S112008, S112115, S112036, S112169, S112183]
