Fit logistic regression model#
based on different imputation methods
baseline: reference
model: any other selected imputation method
Parameters#
Default and set parameters for the notebook.
folder_data: str = '' # specify data directory if needed
fn_clinical_data = "data/ALD_study/processed/ald_metadata_cli.csv"
folder_experiment = "runs/appl_ald_data/plasma/proteinGroups"
model_key = 'VAE'
target = 'kleiner'
sample_id_col = 'Sample ID'
cutoff_target: int = 2 # => for binarization target >= cutoff_target
file_format = "csv"
out_folder = 'diff_analysis'
fn_qc_samples = '' # 'data/ALD_study/processed/qc_plasma_proteinGroups.pkl'
baseline = 'RSN' # default is RSN, as this was used in the original ALD study (Niu et al. 2022)
template_pred = 'pred_real_na_{}.csv' # fixed, do not change
# Parameters
cutoff_target = 0.5
folder_experiment = "runs/alzheimer_study"
target = "AD"
baseline = "PI"
model_key = "QRILC"
out_folder = "diff_analysis"
fn_clinical_data = "runs/alzheimer_study/data/clinical_data.csv"
root - INFO Removed from global namespace: folder_data
root - INFO Removed from global namespace: fn_clinical_data
root - INFO Removed from global namespace: folder_experiment
root - INFO Removed from global namespace: model_key
root - INFO Removed from global namespace: target
root - INFO Removed from global namespace: sample_id_col
root - INFO Removed from global namespace: cutoff_target
root - INFO Removed from global namespace: file_format
root - INFO Removed from global namespace: out_folder
root - INFO Removed from global namespace: fn_qc_samples
root - INFO Removed from global namespace: baseline
root - INFO Removed from global namespace: template_pred
root - INFO Already set attribute: folder_experiment has value runs/alzheimer_study
root - INFO Already set attribute: out_folder has value diff_analysis
{'baseline': 'PI',
'cutoff_target': 0.5,
'data': PosixPath('runs/alzheimer_study/data'),
'file_format': 'csv',
'fn_clinical_data': 'runs/alzheimer_study/data/clinical_data.csv',
'fn_qc_samples': '',
'folder_data': '',
'folder_experiment': PosixPath('runs/alzheimer_study'),
'model_key': 'QRILC',
'out_figures': PosixPath('runs/alzheimer_study/figures'),
'out_folder': PosixPath('runs/alzheimer_study/diff_analysis/AD/PI_vs_QRILC'),
'out_metrics': PosixPath('runs/alzheimer_study'),
'out_models': PosixPath('runs/alzheimer_study'),
'out_preds': PosixPath('runs/alzheimer_study/preds'),
'sample_id_col': 'Sample ID',
'target': 'AD',
'template_pred': 'pred_real_na_{}.csv'}
Load data#
Load target#
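# read only the sample ID and target columns; the sample ID becomes the index
target = pd.read_csv(args.fn_clinical_data,
                     index_col=0,
                     usecols=[args.sample_id_col, args.target])
target = target.dropna()  # keep only samples with a known target value
target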
| Sample ID | AD |
|---|---|
| Sample_000 | 0 |
| Sample_001 | 1 |
| Sample_002 | 1 |
| Sample_003 | 1 |
| Sample_004 | 1 |
| ... | ... |
| Sample_205 | 1 |
| Sample_206 | 0 |
| Sample_207 | 0 |
| Sample_208 | 0 |
| Sample_209 | 0 |
210 rows × 1 columns
MS proteomics or specified omics data#
Aggregated from data splits of the imputation workflow run before.
pimmslearn.io.datasplits - INFO Loaded 'train_X' from file: runs/alzheimer_study/data/train_X.csv
pimmslearn.io.datasplits - INFO Loaded 'val_y' from file: runs/alzheimer_study/data/val_y.csv
pimmslearn.io.datasplits - INFO Loaded 'test_y' from file: runs/alzheimer_study/data/test_y.csv
Sample ID protein groups
Sample_176 Q9BT88 14.411
Sample_075 E9PHN6;E9PHN7;F6XZQ7;P28161;P28161-2 12.112
Sample_112 Q96KN2 22.343
Sample_181 H0YAC1;P03952 18.099
Sample_048 Q9NZ53 17.320
Name: intensity, dtype: float64
Get overlap between independent features and target
Select by ALD criteria#
Use the parameters as specified in the ALD study.
root - INFO Initally: N samples: 210, M feat: 1421
root - INFO Dropped features quantified in less than 126 samples.
root - INFO After feat selection: N samples: 210, M feat: 1213
root - INFO Min No. of Protein-Groups in single sample: 754
root - INFO Finally: N samples: 210, M feat: 1213
| Sample ID | A0A024QZX5;A0A087X1N8;P35237 | A0A024R0T9;K7ER74;P02655 | A0A024R3W6;A0A024R412;O60462;O60462-2;O60462-3;O60462-4;O60462-5;Q7LBX6;X5D2Q8 | A0A024R644;A0A0A0MRU5;A0A1B0GWI2;O75503 | A0A075B6H9 | A0A075B6I0 | A0A075B6I1 | A0A075B6I6 | A0A075B6I9 | A0A075B6J9 | ... | Q9Y653;Q9Y653-2;Q9Y653-3 | Q9Y696 | Q9Y6C2 | Q9Y6N6 | Q9Y6N7;Q9Y6N7-2;Q9Y6N7-4 | Q9Y6R7 | Q9Y6X5 | Q9Y6Y8;Q9Y6Y8-2 | Q9Y6Y9 | S4R3U6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sample_000 | 15.912 | 16.852 | 15.570 | 16.481 | 20.246 | 16.764 | 17.584 | 16.988 | 20.054 | NaN | ... | 16.012 | 15.178 | NaN | 15.050 | 16.842 | 19.863 | NaN | 19.563 | 12.837 | 12.805 |
| Sample_001 | 15.936 | 16.874 | 15.519 | 16.387 | 19.941 | 18.786 | 17.144 | NaN | 19.067 | 16.188 | ... | 15.528 | 15.576 | NaN | 14.833 | 16.597 | 20.299 | 15.556 | 19.386 | 13.970 | 12.442 |
| Sample_002 | 16.111 | 14.523 | 15.935 | 16.416 | 19.251 | 16.832 | 15.671 | 17.012 | 18.569 | NaN | ... | 15.229 | 14.728 | 13.757 | 15.118 | 17.440 | 19.598 | 15.735 | 20.447 | 12.636 | 12.505 |
| Sample_003 | 16.107 | 17.032 | 15.802 | 16.979 | 19.628 | 17.852 | 18.877 | 14.182 | 18.985 | 13.438 | ... | 15.495 | 14.590 | 14.682 | 15.140 | 17.356 | 19.429 | NaN | 20.216 | 12.627 | 12.445 |
| Sample_004 | 15.603 | 15.331 | 15.375 | 16.679 | 20.450 | 18.682 | 17.081 | 14.140 | 19.686 | 14.495 | ... | 14.757 | 15.094 | 14.048 | 15.256 | 17.075 | 19.582 | 15.328 | 19.867 | 13.145 | 12.235 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| Sample_205 | 15.682 | 16.886 | 14.910 | 16.482 | 17.705 | 17.039 | NaN | 16.413 | 19.102 | 16.064 | ... | 15.235 | 15.684 | 14.236 | 15.415 | 17.551 | 17.922 | 16.340 | 19.928 | 12.929 | 11.802 |
| Sample_206 | 15.798 | 17.554 | 15.600 | 15.938 | 18.154 | 18.152 | 16.503 | 16.860 | 18.538 | 15.288 | ... | 15.422 | 16.106 | NaN | 15.345 | 17.084 | 18.708 | 14.249 | 19.433 | NaN | NaN |
| Sample_207 | 15.739 | 16.877 | 15.469 | 16.898 | 18.636 | 17.950 | 16.321 | 16.401 | 18.849 | 17.580 | ... | 15.808 | 16.098 | 14.403 | 15.715 | 16.586 | 18.725 | 16.138 | 19.599 | 13.637 | 11.174 |
| Sample_208 | 15.477 | 16.779 | 14.995 | 16.132 | 14.908 | 17.530 | NaN | 16.119 | 18.368 | 15.202 | ... | 15.157 | 16.712 | NaN | 14.640 | 16.533 | 19.411 | 15.807 | 19.545 | 13.216 | NaN |
| Sample_209 | 15.727 | 17.261 | 15.175 | 16.235 | 17.893 | 17.744 | 16.371 | 15.780 | 18.806 | 16.532 | ... | 15.237 | 15.652 | 15.211 | 14.205 | 16.749 | 19.275 | 15.732 | 19.577 | 11.042 | 11.791 |
210 rows × 1213 columns
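The log line "Dropped features quantified in less than 126 samples" corresponds to a prevalence filter on the columns. A minimal sketch of such a filter follows, assuming a minimum fraction of 0.6 (inferred from 126 of 210 samples, not stated in the notebook); the function name `select_features_by_prevalence` and the toy data are hypothetical and not part of the pimmslearn API.

```python
import numpy as np
import pandas as pd


def select_features_by_prevalence(df: pd.DataFrame, min_frac: float = 0.6) -> pd.DataFrame:
    """Keep features (columns) quantified in at least `min_frac` of the samples (rows)."""
    # assumption: the threshold is a fraction of all samples, rounded up
    min_samples = int(np.ceil(len(df) * min_frac))
    n_quantified = df.notna().sum(axis=0)  # non-missing values per feature
    return df.loc[:, n_quantified >= min_samples]


# toy data: 5 samples, 3 protein groups
toy = pd.DataFrame(
    {
        "P1": [15.9, 16.1, 15.6, 15.7, 15.8],      # quantified in 5 samples
        "P2": [np.nan, 16.2, np.nan, np.nan, 15.3],  # quantified in 2 samples
        "P3": [12.8, np.nan, 12.5, 12.4, 12.2],     # quantified in 4 samples
    },
    index=[f"Sample_{i:03d}" for i in range(5)],
)

selected = select_features_by_prevalence(toy, min_frac=0.6)
print(selected.columns.tolist())
```

With five samples and a fraction of 0.6, a feature needs at least three quantified values, so only `P2` is dropped in the toy data.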
Number of complete cases which can be used:
Samples available both in proteomics data and for target: 210
Load imputations from specified model#
missing values pred. by QRILC: runs/alzheimer_study/preds/pred_real_na_QRILC.csv
Sample ID protein groups
Sample_047 P36269;P36269-2;P36269-3 8.475
Sample_107 P08582 11.640
Sample_163 P05026;P05026-2 13.860
Name: intensity, dtype: float64
Load imputations from baseline model#
Sample ID protein groups
Sample_000 A0A075B6J9 11.915
A0A075B6Q5 13.301
A0A075B6R2 11.133
A0A075B6S5 12.923
A0A087WSY4 14.332
...
Sample_209 Q9P1W8;Q9P1W8-2;Q9P1W8-4 11.945
Q9UI40;Q9UI40-2 12.911
Q9UIW2 13.315
Q9UMX0;Q9UMX0-2;Q9UMX0-4 11.697
Q9UP79 13.540
Name: intensity, Length: 46401, dtype: float64
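The imputations are stored in long format, indexed by sample and protein group, while the observed data is a wide sample-by-feature table. One way to combine both is sketched below; this is an illustration on toy values under the assumption that observed values are kept and only missing cells are filled, and it is not necessarily how the notebook merges the two.

```python
import numpy as np
import pandas as pd

# observed intensities in wide format (samples x protein groups), NaN = missing
observed = pd.DataFrame(
    {"PG_A": [15.9, np.nan], "PG_B": [np.nan, 16.9]},
    index=pd.Index(["Sample_000", "Sample_001"], name="Sample ID"),
)
observed.columns.name = "protein groups"

# imputed values in long format, indexed by (Sample ID, protein groups)
pred_real_na = pd.Series(
    [11.9, 12.4],
    index=pd.MultiIndex.from_tuples(
        [("Sample_000", "PG_B"), ("Sample_001", "PG_A")],
        names=["Sample ID", "protein groups"],
    ),
    name="intensity",
)

# pivot the predictions to wide format and fill only the missing cells;
# fillna aligns on index and columns, so observed values are never overwritten
completed = observed.fillna(pred_real_na.unstack())
print(completed)
```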
Modeling setup#
General approach:
use one train, test split of the data
select the best 10 features from the training data X_train, y_train before binarization of the target
dichotomize (binarize) the data into two groups (zero and one)
evaluate the model on the test data X_test, y_test
Repeat the general approach for
all original ALD data: all features used in the original ALD study
all model data: all features available by using the self-supervised deep learning model
newly available features only: the subset of features available from the self-supervised deep learning model which were newly retained using the new approach
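The general approach can be sketched with scikit-learn as follows. This is an illustration on synthetic data, not the notebook's implementation: the feature selection method is not named in the text, so the univariate `SelectKBest` with `f_classif` is a swapped-in assumption, as are the 80/20 split, the scaling step and the already binary target.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# synthetic stand-in for the imputed data: 210 samples, 50 features
rng = np.random.default_rng(42)
X = rng.normal(size=(210, 50))
# binary target driven by the first two features plus noise
y = ((X[:, 0] + X[:, 1] + rng.normal(scale=0.5, size=210)) > 0).astype(int)

# one stratified train/test split (split size is an assumption)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# select the 10 best features on the training data only, then fit the classifier
model = make_pipeline(
    StandardScaler(),
    SelectKBest(f_classif, k=10),
    LogisticRegression(),
)
model.fit(X_train, y_train)

# evaluate on the held-out test data
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
n_selected = int(model.named_steps["selectkbest"].get_support().sum())
print(f"selected features: {n_selected}, test ROC AUC: {auc:.3f}")
```

Keeping the selection step inside the pipeline ensures that the test samples never influence which features are chosen.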
All data:
| Sample ID | A0A024QZX5;A0A087X1N8;P35237 | A0A024R0T9;K7ER74;P02655 | A0A024R3W6;A0A024R412;O60462;O60462-2;O60462-3;O60462-4;O60462-5;Q7LBX6;X5D2Q8 | A0A024R644;A0A0A0MRU5;A0A1B0GWI2;O75503 | A0A075B6H7 | A0A075B6H9 | A0A075B6I0 | A0A075B6I1 | A0A075B6I6 | A0A075B6I9 | ... | Q9Y653;Q9Y653-2;Q9Y653-3 | Q9Y696 | Q9Y6C2 | Q9Y6N6 | Q9Y6N7;Q9Y6N7-2;Q9Y6N7-4 | Q9Y6R7 | Q9Y6X5 | Q9Y6Y8;Q9Y6Y8-2 | Q9Y6Y9 | S4R3U6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sample_000 | 15.912 | 16.852 | 15.570 | 16.481 | 17.301 | 20.246 | 16.764 | 17.584 | 16.988 | 20.054 | ... | 16.012 | 15.178 | 12.574 | 15.050 | 16.842 | 19.863 | 13.101 | 19.563 | 12.837 | 12.805 |
| Sample_001 | 15.936 | 16.874 | 15.519 | 16.387 | 13.796 | 19.941 | 18.786 | 17.144 | 12.813 | 19.067 | ... | 15.528 | 15.576 | 11.717 | 14.833 | 16.597 | 20.299 | 15.556 | 19.386 | 13.970 | 12.442 |
| Sample_002 | 16.111 | 14.523 | 15.935 | 16.416 | 18.175 | 19.251 | 16.832 | 15.671 | 17.012 | 18.569 | ... | 15.229 | 14.728 | 13.757 | 15.118 | 17.440 | 19.598 | 15.735 | 20.447 | 12.636 | 12.505 |
| Sample_003 | 16.107 | 17.032 | 15.802 | 16.979 | 15.963 | 19.628 | 17.852 | 18.877 | 14.182 | 18.985 | ... | 15.495 | 14.590 | 14.682 | 15.140 | 17.356 | 19.429 | 13.733 | 20.216 | 12.627 | 12.445 |
| Sample_004 | 15.603 | 15.331 | 15.375 | 16.679 | 15.473 | 20.450 | 18.682 | 17.081 | 14.140 | 19.686 | ... | 14.757 | 15.094 | 14.048 | 15.256 | 17.075 | 19.582 | 15.328 | 19.867 | 13.145 | 12.235 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| Sample_205 | 15.682 | 16.886 | 14.910 | 16.482 | 11.560 | 17.705 | 17.039 | 14.926 | 16.413 | 19.102 | ... | 15.235 | 15.684 | 14.236 | 15.415 | 17.551 | 17.922 | 16.340 | 19.928 | 12.929 | 11.802 |
| Sample_206 | 15.798 | 17.554 | 15.600 | 15.938 | 14.714 | 18.154 | 18.152 | 16.503 | 16.860 | 18.538 | ... | 15.422 | 16.106 | 11.369 | 15.345 | 17.084 | 18.708 | 14.249 | 19.433 | 10.788 | 8.575 |
| Sample_207 | 15.739 | 16.877 | 15.469 | 16.898 | 13.033 | 18.636 | 17.950 | 16.321 | 16.401 | 18.849 | ... | 15.808 | 16.098 | 14.403 | 15.715 | 16.586 | 18.725 | 16.138 | 19.599 | 13.637 | 11.174 |
| Sample_208 | 15.477 | 16.779 | 14.995 | 16.132 | 13.392 | 14.908 | 17.530 | 13.190 | 16.119 | 18.368 | ... | 15.157 | 16.712 | 12.941 | 14.640 | 16.533 | 19.411 | 15.807 | 19.545 | 13.216 | 6.479 |
| Sample_209 | 15.727 | 17.261 | 15.175 | 16.235 | 13.467 | 17.893 | 17.744 | 16.371 | 15.780 | 18.806 | ... | 15.237 | 15.652 | 15.211 | 14.205 | 16.749 | 19.275 | 15.732 | 19.577 | 11.042 | 11.791 |
210 rows × 1421 columns
Subset of data by ALD criteria#
| Sample ID | A0A024QZX5;A0A087X1N8;P35237 | A0A024R0T9;K7ER74;P02655 | A0A024R3W6;A0A024R412;O60462;O60462-2;O60462-3;O60462-4;O60462-5;Q7LBX6;X5D2Q8 | A0A024R644;A0A0A0MRU5;A0A1B0GWI2;O75503 | A0A075B6H9 | A0A075B6I0 | A0A075B6I1 | A0A075B6I6 | A0A075B6I9 | A0A075B6K4 | ... | O14793 | O95479;R4GMU1 | P01282;P01282-2 | P10619;P10619-2;X6R5C5;X6R8A1 | P21810 | Q14956;Q14956-2 | Q6ZMP0;Q6ZMP0-2 | Q9HBW1 | Q9NY15 | P17050 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sample_000 | 15.912 | 16.852 | 15.570 | 16.481 | 20.246 | 16.764 | 17.584 | 16.988 | 20.054 | 16.148 | ... | 12.796 | 12.072 | 12.224 | 13.002 | 13.754 | 13.997 | 12.142 | 13.610 | 11.158 | 14.105 |
| Sample_001 | 15.936 | 16.874 | 15.519 | 16.387 | 19.941 | 18.786 | 17.144 | 13.300 | 19.067 | 16.127 | ... | 12.937 | 13.228 | 13.141 | 12.546 | 12.103 | 12.394 | 13.008 | 13.295 | 13.054 | 12.728 |
| Sample_002 | 16.111 | 14.523 | 15.935 | 16.416 | 19.251 | 16.832 | 15.671 | 17.012 | 18.569 | 15.387 | ... | 12.996 | 11.668 | 13.147 | 12.657 | 12.306 | 12.865 | 11.113 | 12.154 | 13.995 | 14.586 |
| Sample_003 | 16.107 | 17.032 | 15.802 | 16.979 | 19.628 | 17.852 | 18.877 | 14.182 | 18.985 | 16.565 | ... | 13.793 | 12.744 | 13.246 | 12.475 | 12.893 | 12.075 | 11.584 | 13.531 | 12.526 | 14.343 |
| Sample_004 | 15.603 | 15.331 | 15.375 | 16.679 | 20.450 | 18.682 | 17.081 | 14.140 | 19.686 | 16.418 | ... | 13.331 | 12.979 | 13.325 | 13.773 | 12.813 | 14.709 | 13.847 | 13.119 | 12.045 | 13.734 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| Sample_205 | 15.682 | 16.886 | 14.910 | 16.482 | 17.705 | 17.039 | 12.955 | 16.413 | 19.102 | 15.350 | ... | 14.269 | 14.064 | 16.826 | 18.182 | 15.225 | 15.044 | 14.192 | 16.605 | 14.995 | 14.257 |
| Sample_206 | 15.798 | 17.554 | 15.600 | 15.938 | 18.154 | 18.152 | 16.503 | 16.860 | 18.538 | 16.582 | ... | 14.273 | 17.700 | 16.802 | 20.202 | 15.280 | 15.086 | 13.978 | 18.086 | 15.557 | 14.171 |
| Sample_207 | 15.739 | 16.877 | 15.469 | 16.898 | 18.636 | 17.950 | 16.321 | 16.401 | 18.849 | 15.768 | ... | 14.473 | 16.882 | 16.917 | 20.105 | 15.690 | 15.135 | 13.138 | 17.066 | 15.706 | 15.690 |
| Sample_208 | 15.477 | 16.779 | 14.995 | 16.132 | 14.908 | 17.530 | 13.800 | 16.119 | 18.368 | 17.560 | ... | 15.234 | 17.175 | 16.521 | 18.859 | 15.305 | 15.161 | 13.006 | 17.917 | 15.396 | 14.371 |
| Sample_209 | 15.727 | 17.261 | 15.175 | 16.235 | 17.893 | 17.744 | 16.371 | 15.780 | 18.806 | 16.338 | ... | 14.556 | 16.656 | 16.954 | 18.493 | 15.823 | 14.626 | 13.385 | 17.767 | 15.687 | 13.573 |
210 rows × 1213 columns
Features which would not have been included using ALD criteria:
Index(['A0A075B6H7', 'A0A075B6Q5', 'A0A075B7B8', 'A0A087WSY4',
'A0A087WTT8;A0A0A0MQX5;O94779;O94779-2', 'A0A087WXB8;Q9Y274',
'A0A087WXE9;E9PQ70;Q6UXH9;Q6UXH9-2;Q6UXH9-3',
'A0A087X1Z2;C9JTV4;H0Y4Y4;Q8WYH2;Q96C19;Q9BUP0;Q9BUP0-2',
'A0A0A0MQS9;A0A0A0MTC7;Q16363;Q16363-2', 'A0A0A0MSN4;P12821;P12821-2',
...
'Q9NZ94;Q9NZ94-2;Q9NZ94-3', 'Q9NZU1', 'Q9P1W8;Q9P1W8-2;Q9P1W8-4',
'Q9UHI8', 'Q9UI40;Q9UI40-2',
'Q9UIB8;Q9UIB8-2;Q9UIB8-3;Q9UIB8-4;Q9UIB8-5;Q9UIB8-6',
'Q9UKZ4;Q9UKZ4-2', 'Q9UMX0;Q9UMX0-2;Q9UMX0-4', 'Q9Y281;Q9Y281-3',
'Q9Y490'],
dtype='object', name='protein groups', length=208)
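The 208 listed features match the difference between the 1421 features of the full data and the 1213 features retained by the ALD criteria. A set difference on the column indices is one way to obtain such a list; the sketch below uses hypothetical toy names rather than the notebook's variables.

```python
import pandas as pd

# toy column indices standing in for the full and the ALD-filtered feature sets
all_features = pd.Index(["PG_A", "PG_B", "PG_C", "PG_D"], name="protein groups")
ald_features = pd.Index(["PG_A", "PG_C"], name="protein groups")

# features present in the full data but absent after the ALD selection
new_features = all_features.difference(ald_features)
print(new_features)
```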
Binarize targets, but also keep groups for stratification
| AD (binarized) | AD = 0 | AD = 1 |
|---|---|---|
| False | 122 | 0 |
| True | 0 | 88 |
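A cross-tabulation like the one above can be produced by comparing the target with the cutoff and counting against the original values. The following is a minimal sketch on toy values, assuming the `target >= cutoff_target` rule from the parameter comment; the series name `AD >= cutoff` is introduced here for readability only.

```python
import pandas as pd

cutoff_target = 0.5  # value set in the parameters of this run

# toy target values standing in for the AD column
target = pd.Series(
    [0, 1, 1, 0, 1],
    index=[f"Sample_{i:03d}" for i in range(5)],
    name="AD",
)

# binarize: True if the target reaches the cutoff
y = (target >= cutoff_target).rename("AD >= cutoff")

# rows: binarized target, columns: original groups (kept for stratification)
counts = pd.crosstab(y, target)
print(counts)
```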
Determine best number of parameters by cross validation procedure#
Using subset of data by ALD criteria:
| n_features | fit_time mean | fit_time std | score_time mean | score_time std | test_precision mean | test_precision std | test_recall mean | test_recall std | test_f1 mean | test_f1 std | test_balanced_accuracy mean | test_balanced_accuracy std | test_roc_auc mean | test_roc_auc std | test_average_precision mean | test_average_precision std | n_observations mean | n_observations std |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.005 | 0.002 | 0.046 | 0.018 | 0.766 | 0.340 | 0.120 | 0.083 | 0.201 | 0.127 | 0.552 | 0.041 | 0.854 | 0.063 | 0.824 | 0.087 | 210.000 | 0.000 |
| 2 | 0.005 | 0.002 | 0.056 | 0.023 | 0.609 | 0.128 | 0.465 | 0.118 | 0.517 | 0.100 | 0.619 | 0.067 | 0.699 | 0.083 | 0.648 | 0.098 | 210.000 | 0.000 |
| 3 | 0.004 | 0.001 | 0.041 | 0.013 | 0.803 | 0.078 | 0.748 | 0.094 | 0.770 | 0.069 | 0.806 | 0.054 | 0.919 | 0.040 | 0.906 | 0.042 | 210.000 | 0.000 |
| 4 | 0.007 | 0.004 | 0.074 | 0.028 | 0.799 | 0.074 | 0.758 | 0.093 | 0.774 | 0.066 | 0.808 | 0.053 | 0.918 | 0.040 | 0.905 | 0.042 | 210.000 | 0.000 |
| 5 | 0.007 | 0.003 | 0.077 | 0.027 | 0.807 | 0.083 | 0.816 | 0.102 | 0.807 | 0.071 | 0.835 | 0.059 | 0.923 | 0.040 | 0.911 | 0.043 | 210.000 | 0.000 |
| 6 | 0.004 | 0.002 | 0.044 | 0.016 | 0.813 | 0.080 | 0.813 | 0.104 | 0.808 | 0.070 | 0.836 | 0.059 | 0.922 | 0.042 | 0.911 | 0.044 | 210.000 | 0.000 |
| 7 | 0.003 | 0.001 | 0.037 | 0.015 | 0.817 | 0.080 | 0.815 | 0.102 | 0.812 | 0.070 | 0.839 | 0.059 | 0.922 | 0.042 | 0.911 | 0.043 | 210.000 | 0.000 |
| 8 | 0.003 | 0.000 | 0.037 | 0.000 | 0.815 | 0.083 | 0.825 | 0.094 | 0.815 | 0.066 | 0.842 | 0.056 | 0.920 | 0.041 | 0.911 | 0.041 | 210.000 | 0.000 |
| 9 | 0.003 | 0.000 | 0.033 | 0.002 | 0.816 | 0.082 | 0.826 | 0.093 | 0.816 | 0.061 | 0.842 | 0.052 | 0.919 | 0.041 | 0.909 | 0.042 | 210.000 | 0.000 |
| 10 | 0.003 | 0.000 | 0.034 | 0.002 | 0.813 | 0.072 | 0.828 | 0.096 | 0.817 | 0.065 | 0.843 | 0.056 | 0.925 | 0.042 | 0.915 | 0.042 | 210.000 | 0.000 |
| 11 | 0.006 | 0.003 | 0.060 | 0.017 | 0.815 | 0.075 | 0.824 | 0.100 | 0.816 | 0.069 | 0.843 | 0.059 | 0.924 | 0.044 | 0.914 | 0.044 | 210.000 | 0.000 |
| 12 | 0.004 | 0.002 | 0.045 | 0.017 | 0.817 | 0.066 | 0.836 | 0.091 | 0.823 | 0.061 | 0.849 | 0.052 | 0.924 | 0.044 | 0.911 | 0.047 | 210.000 | 0.000 |
| 13 | 0.005 | 0.002 | 0.049 | 0.017 | 0.831 | 0.072 | 0.816 | 0.085 | 0.820 | 0.058 | 0.846 | 0.048 | 0.925 | 0.040 | 0.912 | 0.042 | 210.000 | 0.000 |
| 14 | 0.004 | 0.001 | 0.042 | 0.013 | 0.832 | 0.070 | 0.820 | 0.084 | 0.823 | 0.059 | 0.848 | 0.050 | 0.923 | 0.041 | 0.910 | 0.044 | 210.000 | 0.000 |
| 15 | 0.005 | 0.002 | 0.050 | 0.018 | 0.831 | 0.073 | 0.815 | 0.081 | 0.820 | 0.058 | 0.846 | 0.049 | 0.920 | 0.041 | 0.906 | 0.045 | 210.000 | 0.000 |
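A table of this shape (mean and standard deviation of each metric per number of features) could be produced with scikit-learn's `cross_validate`. The sketch below is an assumption-laden illustration on synthetic data: the cross-validation scheme (5 folds, 2 repeats), the `SelectKBest` selector and the range of 1 to 5 features are choices made here, not taken from the notebook, and the `n_observations` column is omitted.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# metrics matching the test_* columns of the table
scoring = ["precision", "recall", "f1", "balanced_accuracy",
           "roc_auc", "average_precision"]

# synthetic data: 120 samples, 20 features, two informative features
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 20))
y = ((X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=120)) > 0).astype(int)

cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=0)

rows = {}
for n_features in range(1, 6):
    model = make_pipeline(
        StandardScaler(),
        SelectKBest(f_classif, k=n_features),  # selection is refit in every fold
        LogisticRegression(),
    )
    res = cross_validate(model, X, y, cv=cv, scoring=scoring)
    # aggregate the per-fold results into (metric, statistic) pairs
    rows[n_features] = pd.DataFrame(res).agg(["mean", "std"]).unstack()

cv_summary = pd.DataFrame(rows).T
cv_summary.index.name = "n_features"
print(cv_summary[[("test_roc_auc", "mean"), ("test_roc_auc", "std")]])
```

The number of features would then be chosen by comparing a metric such as the mean `test_roc_auc` across rows.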
Using all data:
| n_features | fit_time mean | fit_time std | score_time mean | score_time std | test_precision mean | test_precision std | test_recall mean | test_recall std | test_f1 mean | test_f1 std | test_balanced_accuracy mean | test_balanced_accuracy std | test_roc_auc mean | test_roc_auc std | test_average_precision mean | test_average_precision std | n_observations mean | n_observations std |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.004 | 0.002 | 0.046 | 0.018 | 0.776 | 0.339 | 0.120 | 0.084 | 0.201 | 0.127 | 0.553 | 0.041 | 0.848 | 0.065 | 0.823 | 0.088 | 210.000 | 0.000 |
| 2 | 0.005 | 0.003 | 0.055 | 0.021 | 0.702 | 0.092 | 0.606 | 0.106 | 0.644 | 0.080 | 0.707 | 0.060 | 0.782 | 0.069 | 0.760 | 0.086 | 210.000 | 0.000 |
| 3 | 0.005 | 0.002 | 0.048 | 0.021 | 0.719 | 0.086 | 0.674 | 0.109 | 0.689 | 0.075 | 0.738 | 0.057 | 0.801 | 0.069 | 0.768 | 0.088 | 210.000 | 0.000 |
| 4 | 0.005 | 0.002 | 0.050 | 0.019 | 0.713 | 0.083 | 0.673 | 0.112 | 0.685 | 0.077 | 0.735 | 0.057 | 0.793 | 0.067 | 0.760 | 0.085 | 210.000 | 0.000 |
| 5 | 0.004 | 0.002 | 0.043 | 0.015 | 0.746 | 0.086 | 0.732 | 0.103 | 0.734 | 0.070 | 0.773 | 0.057 | 0.861 | 0.062 | 0.849 | 0.062 | 210.000 | 0.000 |
| 6 | 0.004 | 0.001 | 0.040 | 0.011 | 0.751 | 0.090 | 0.743 | 0.101 | 0.741 | 0.071 | 0.779 | 0.058 | 0.863 | 0.061 | 0.855 | 0.058 | 210.000 | 0.000 |
| 7 | 0.003 | 0.002 | 0.035 | 0.011 | 0.766 | 0.078 | 0.717 | 0.111 | 0.734 | 0.067 | 0.776 | 0.052 | 0.867 | 0.055 | 0.861 | 0.052 | 210.000 | 0.000 |
| 8 | 0.004 | 0.001 | 0.037 | 0.007 | 0.766 | 0.078 | 0.730 | 0.112 | 0.741 | 0.068 | 0.781 | 0.054 | 0.868 | 0.055 | 0.853 | 0.060 | 210.000 | 0.000 |
| 9 | 0.004 | 0.001 | 0.040 | 0.008 | 0.801 | 0.074 | 0.759 | 0.102 | 0.775 | 0.068 | 0.809 | 0.055 | 0.882 | 0.049 | 0.862 | 0.061 | 210.000 | 0.000 |
| 10 | 0.005 | 0.002 | 0.048 | 0.018 | 0.800 | 0.078 | 0.761 | 0.099 | 0.776 | 0.068 | 0.810 | 0.057 | 0.881 | 0.049 | 0.860 | 0.062 | 210.000 | 0.000 |
| 11 | 0.004 | 0.001 | 0.038 | 0.006 | 0.803 | 0.081 | 0.780 | 0.097 | 0.786 | 0.064 | 0.818 | 0.054 | 0.884 | 0.054 | 0.867 | 0.060 | 210.000 | 0.000 |
| 12 | 0.004 | 0.001 | 0.044 | 0.013 | 0.815 | 0.077 | 0.807 | 0.098 | 0.806 | 0.063 | 0.834 | 0.053 | 0.901 | 0.049 | 0.883 | 0.052 | 210.000 | 0.000 |
| 13 | 0.005 | 0.002 | 0.049 | 0.018 | 0.823 | 0.082 | 0.824 | 0.091 | 0.818 | 0.057 | 0.844 | 0.049 | 0.908 | 0.047 | 0.894 | 0.049 | 210.000 | 0.000 |
| 14 | 0.004 | 0.002 | 0.041 | 0.010 | 0.833 | 0.082 | 0.812 | 0.092 | 0.817 | 0.057 | 0.843 | 0.048 | 0.911 | 0.046 | 0.899 | 0.047 | 210.000 | 0.000 |
| 15 | 0.004 | 0.002 | 0.034 | 0.014 | 0.832 | 0.079 | 0.822 | 0.096 | 0.822 | 0.062 | 0.848 | 0.053 | 0.911 | 0.048 | 0.902 | 0.046 | 210.000 | 0.000 |
Using only new features:
| n_features | fit_time mean | fit_time std | score_time mean | score_time std | test_precision mean | test_precision std | test_recall mean | test_recall std | test_f1 mean | test_f1 std | test_balanced_accuracy mean | test_balanced_accuracy std | test_roc_auc mean | test_roc_auc std | test_average_precision mean | test_average_precision std | n_observations mean | n_observations std |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.004 | 0.002 | 0.046 | 0.019 | 0.205 | 0.390 | 0.016 | 0.031 | 0.030 | 0.057 | 0.503 | 0.019 | 0.735 | 0.063 | 0.652 | 0.078 | 210.000 | 0.000 |
| 2 | 0.003 | 0.001 | 0.036 | 0.007 | 0.598 | 0.123 | 0.259 | 0.090 | 0.349 | 0.083 | 0.560 | 0.035 | 0.648 | 0.077 | 0.584 | 0.081 | 210.000 | 0.000 |
| 3 | 0.004 | 0.002 | 0.041 | 0.014 | 0.608 | 0.124 | 0.471 | 0.109 | 0.526 | 0.105 | 0.623 | 0.078 | 0.693 | 0.090 | 0.610 | 0.098 | 210.000 | 0.000 |
| 4 | 0.003 | 0.000 | 0.036 | 0.002 | 0.611 | 0.130 | 0.479 | 0.114 | 0.530 | 0.105 | 0.624 | 0.078 | 0.688 | 0.090 | 0.610 | 0.097 | 210.000 | 0.000 |
| 5 | 0.004 | 0.002 | 0.048 | 0.016 | 0.621 | 0.116 | 0.496 | 0.105 | 0.547 | 0.098 | 0.635 | 0.073 | 0.687 | 0.091 | 0.603 | 0.097 | 210.000 | 0.000 |
| 6 | 0.003 | 0.001 | 0.035 | 0.007 | 0.617 | 0.108 | 0.508 | 0.103 | 0.552 | 0.091 | 0.637 | 0.069 | 0.679 | 0.085 | 0.598 | 0.092 | 210.000 | 0.000 |
| 7 | 0.005 | 0.002 | 0.050 | 0.018 | 0.592 | 0.116 | 0.474 | 0.112 | 0.518 | 0.094 | 0.613 | 0.070 | 0.672 | 0.081 | 0.613 | 0.093 | 210.000 | 0.000 |
| 8 | 0.003 | 0.001 | 0.026 | 0.011 | 0.578 | 0.116 | 0.464 | 0.111 | 0.507 | 0.096 | 0.605 | 0.070 | 0.664 | 0.082 | 0.601 | 0.093 | 210.000 | 0.000 |
| 9 | 0.004 | 0.002 | 0.043 | 0.021 | 0.564 | 0.117 | 0.455 | 0.120 | 0.498 | 0.107 | 0.598 | 0.077 | 0.666 | 0.082 | 0.605 | 0.090 | 210.000 | 0.000 |
| 10 | 0.004 | 0.001 | 0.037 | 0.009 | 0.548 | 0.112 | 0.447 | 0.115 | 0.486 | 0.102 | 0.588 | 0.074 | 0.654 | 0.083 | 0.603 | 0.090 | 210.000 | 0.000 |
| 11 | 0.005 | 0.002 | 0.054 | 0.018 | 0.567 | 0.111 | 0.484 | 0.119 | 0.515 | 0.101 | 0.605 | 0.074 | 0.653 | 0.084 | 0.601 | 0.091 | 210.000 | 0.000 |
| 12 | 0.004 | 0.001 | 0.041 | 0.011 | 0.561 | 0.117 | 0.478 | 0.115 | 0.510 | 0.104 | 0.600 | 0.080 | 0.651 | 0.087 | 0.597 | 0.092 | 210.000 | 0.000 |
| 13 | 0.005 | 0.002 | 0.052 | 0.017 | 0.548 | 0.116 | 0.468 | 0.120 | 0.499 | 0.105 | 0.591 | 0.081 | 0.643 | 0.089 | 0.588 | 0.092 | 210.000 | 0.000 |
| 14 | 0.004 | 0.001 | 0.043 | 0.013 | 0.556 | 0.114 | 0.472 | 0.129 | 0.505 | 0.113 | 0.599 | 0.079 | 0.665 | 0.083 | 0.595 | 0.093 | 210.000 | 0.000 |
| 15 | 0.005 | 0.002 | 0.047 | 0.015 | 0.556 | 0.105 | 0.471 | 0.116 | 0.505 | 0.102 | 0.598 | 0.074 | 0.660 | 0.082 | 0.593 | 0.092 | 210.000 | 0.000 |
Best number of features by subset of the data:#
| | ald | all | new |
|---|---|---|---|
| fit_time | 4 | 2 | 13 |
| score_time | 5 | 2 | 11 |
| test_precision | 14 | 14 | 5 |
| test_recall | 12 | 13 | 6 |
| test_f1 | 12 | 15 | 6 |
| test_balanced_accuracy | 12 | 15 | 6 |
| test_roc_auc | 10 | 14 | 1 |
| test_average_precision | 10 | 15 | 1 |
| n_observations | 1 | 1 | 1 |
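Each entry in the table above appears to be the number of features at which the mean cross-validated value of that metric is highest. The following is a minimal sketch of that selection, not the notebook's own code: it assumes the means are held in a plain dictionary, and `best_n_features` is a hypothetical helper. The values are the `test_roc_auc` means from the "Using only new features" table.

```python
# Mean test_roc_auc per number of features, copied from the
# "Using only new features" cross-validation table.
roc_auc_mean_new = {
    1: 0.735, 2: 0.648, 3: 0.693, 4: 0.688, 5: 0.687,
    6: 0.679, 7: 0.672, 8: 0.664, 9: 0.666, 10: 0.654,
    11: 0.653, 12: 0.651, 13: 0.643, 14: 0.665, 15: 0.660,
}


def best_n_features(mean_scores):
    """Return the key (number of features) with the highest mean score.

    Hypothetical helper for illustration only.
    """
    return max(mean_scores, key=mean_scores.get)


best_new = best_n_features(roc_auc_mean_new)
print(best_new)
```

The same argmax applied to `fit_time`, `score_time` or `n_observations` does not yield a meaningful selection criterion, so those rows of the table are best ignored when choosing the number of features.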
Train, test split#
Show number of cases in train and test data
| | train | test |
|---|---|---|
| False | 98 | 24 |
| True | 70 | 18 |
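The counts above can be checked with a few lines of arithmetic. This sketch only re-derives split sizes and class shares from the table; it does not reproduce the splitting code, which is not shown here. The similar positive share in both splits would be consistent with a stratified split, but that is an assumption.

```python
# Counts copied from the train/test table (rows: binarized target).
counts = {
    "train": {False: 98, True: 70},
    "test": {False: 24, True: 18},
}


def summarize(split_counts):
    """Return (number of samples, share of positive cases) for one split."""
    n = sum(split_counts.values())
    return n, split_counts[True] / n


summary = {name: summarize(c) for name, c in counts.items()}
n_total = sum(n for n, _ in summary.values())
test_fraction = summary["test"][0] / n_total

for name, (n, prevalence) in summary.items():
    print(f"{name}: n={n}, positive share={prevalence:.3f}")
print(f"test fraction: {test_fraction:.2f}")
```

The positive share in the test split (18 of 42) is also the precision of a classifier that labels every test sample as positive, which is the value a precision-recall curve takes at a recall of 1.0.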
Results#
- `run_model` returns dataclasses with the further needed results
- add mrmr selection of data (select best number of features to use instead of fixing it)
Save results for final model on entire data, new features and ALD study criteria selected data.
ROC-AUC on test split#
pimmslearn.plotting - INFO Saved Figures to runs/alzheimer_study/diff_analysis/AD/PI_vs_QRILC/auc_roc_curve.pdf
Data used to plot ROC:
| | ALD study all: fpr | ALD study all: tpr | QRILC all: fpr | QRILC all: tpr | QRILC new: fpr | QRILC new: tpr |
|---|---|---|---|---|---|---|
| 0 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| 1 | 0.000 | 0.056 | 0.000 | 0.056 | 0.042 | 0.000 |
| 2 | 0.000 | 0.389 | 0.000 | 0.500 | 0.042 | 0.111 |
| 3 | 0.042 | 0.389 | 0.042 | 0.500 | 0.125 | 0.111 |
| 4 | 0.042 | 0.444 | 0.042 | 0.611 | 0.125 | 0.222 |
| 5 | 0.083 | 0.444 | 0.167 | 0.611 | 0.167 | 0.222 |
| 6 | 0.083 | 0.778 | 0.167 | 0.667 | 0.167 | 0.667 |
| 7 | 0.167 | 0.778 | 0.208 | 0.667 | 0.250 | 0.667 |
| 8 | 0.167 | 0.833 | 0.208 | 0.722 | 0.250 | 0.722 |
| 9 | 0.542 | 0.833 | 0.250 | 0.722 | 0.417 | 0.722 |
| 10 | 0.542 | 0.889 | 0.250 | 0.778 | 0.417 | 0.778 |
| 11 | 0.583 | 0.889 | 0.417 | 0.778 | 0.500 | 0.778 |
| 12 | 0.583 | 1.000 | 0.417 | 0.833 | 0.500 | 0.889 |
| 13 | 1.000 | 1.000 | 0.458 | 0.833 | 0.583 | 0.889 |
| 14 | NaN | NaN | 0.458 | 0.889 | 0.583 | 0.944 |
| 15 | NaN | NaN | 0.542 | 0.889 | 0.833 | 0.944 |
| 16 | NaN | NaN | 0.542 | 0.944 | 0.833 | 1.000 |
| 17 | NaN | NaN | 0.792 | 0.944 | 1.000 | 1.000 |
| 18 | NaN | NaN | 0.792 | 1.000 | NaN | NaN |
| 19 | NaN | NaN | 1.000 | 1.000 | NaN | NaN |
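The area under a ROC curve can be approximated directly from tabulated points like these with the trapezoidal rule. The sketch below uses the "ALD study all" columns; it is an illustration under the assumption that the rounded table values are close enough to the plotted ones, and `trapezoid_auc` is a hypothetical helper, not a function from the notebook.

```python
# (fpr, tpr) pairs of the "ALD study all" model, copied from the table
# above. Values are rounded to three decimals; NaN padding rows are omitted.
roc_points = [
    (0.000, 0.000), (0.000, 0.056), (0.000, 0.389), (0.042, 0.389),
    (0.042, 0.444), (0.083, 0.444), (0.083, 0.778), (0.167, 0.778),
    (0.167, 0.833), (0.542, 0.833), (0.542, 0.889), (0.583, 0.889),
    (0.583, 1.000), (1.000, 1.000),
]


def trapezoid_auc(points):
    """Area under a piecewise-linear curve given as (x, y) pairs sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


auc = trapezoid_auc(roc_points)
print(f"approximate ROC-AUC: {auc:.3f}")
```

Because the inputs are rounded, the result may differ slightly from the value computed on the unrounded curve.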
Features selected for final models#
| rank | ALD study all | QRILC all | QRILC new |
|---|---|---|---|
| 0 | P10636-2;P10636-6 | Q9Y2T3;Q9Y2T3-3 | A6PVN5;F6WIT2;Q15257;Q15257-2;Q15257-3 |
| 1 | Q8NCL4 | P60709;P63261 | None |
| 2 | J3KNE3;P68402 | A0A0C4DH07;Q8N2S1;Q8N2S1-2;Q8N2S1-3 | None |
| 3 | Q02818 | P10636-2;P10636-6 | None |
| 4 | P61981 | P04430 | None |
| 5 | P04075 | P61981 | None |
| 6 | P14174 | P04075 | None |
| 7 | Q9Y2T3;Q9Y2T3-3 | P14174 | None |
| 8 | P00338;P00338-3 | A6PVN5;F6WIT2;Q15257;Q15257-2;Q15257-3 | None |
| 9 | C9JF17;P05090 | P00338;P00338-3 | None |
| 10 | None | P63104 | None |
| 11 | None | C9JF17;P05090 | None |
| 12 | None | P05413;S4R371 | None |
| 13 | None | P14618 | None |
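To see how much the selected feature sets overlap, the lists above can be compared as sets. This is a small illustrative sketch using the protein group identifiers exactly as they appear in the table; the variable names are hypothetical.

```python
# Selected features per final model, copied from the table above
# (None padding entries omitted).
ald_all = [
    "P10636-2;P10636-6", "Q8NCL4", "J3KNE3;P68402", "Q02818", "P61981",
    "P04075", "P14174", "Q9Y2T3;Q9Y2T3-3", "P00338;P00338-3",
    "C9JF17;P05090",
]
qrilc_all = [
    "Q9Y2T3;Q9Y2T3-3", "P60709;P63261",
    "A0A0C4DH07;Q8N2S1;Q8N2S1-2;Q8N2S1-3", "P10636-2;P10636-6", "P04430",
    "P61981", "P04075", "P14174",
    "A6PVN5;F6WIT2;Q15257;Q15257-2;Q15257-3", "P00338;P00338-3", "P63104",
    "C9JF17;P05090", "P05413;S4R371", "P14618",
]

ald_set, qrilc_set = set(ald_all), set(qrilc_all)

# Keep the original rank order when listing the groups.
shared = [f for f in ald_all if f in qrilc_set]
only_ald = [f for f in ald_all if f not in qrilc_set]
only_qrilc = [f for f in qrilc_all if f not in ald_set]

print(f"shared: {len(shared)}, only ALD study all: {len(only_ald)}, "
      f"only QRILC all: {len(only_qrilc)}")
```

The single feature of the "QRILC new" model, `A6PVN5;F6WIT2;Q15257;Q15257-2;Q15257-3`, is among the features that only the "QRILC all" model selected.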
Precision-Recall plot on test data#
pimmslearn.plotting - INFO Saved Figures to runs/alzheimer_study/diff_analysis/AD/PI_vs_QRILC/prec_recall_curve.pdf
Data used to plot PRC:
| | ALD study all: precision | ALD study all: tpr | QRILC all: precision | QRILC all: tpr | QRILC new: precision | QRILC new: tpr |
|---|---|---|---|---|---|---|
| 0 | 0.429 | 1.000 | 0.429 | 1.000 | 0.429 | 1.000 |
| 1 | 0.439 | 1.000 | 0.439 | 1.000 | 0.439 | 1.000 |
| 2 | 0.450 | 1.000 | 0.450 | 1.000 | 0.450 | 1.000 |
| 3 | 0.462 | 1.000 | 0.462 | 1.000 | 0.462 | 1.000 |
| 4 | 0.474 | 1.000 | 0.474 | 1.000 | 0.474 | 1.000 |
| 5 | 0.486 | 1.000 | 0.486 | 1.000 | 0.459 | 0.944 |
| 6 | 0.500 | 1.000 | 0.472 | 0.944 | 0.472 | 0.944 |
| 7 | 0.514 | 1.000 | 0.486 | 0.944 | 0.486 | 0.944 |
| 8 | 0.529 | 1.000 | 0.500 | 0.944 | 0.500 | 0.944 |
| 9 | 0.545 | 1.000 | 0.515 | 0.944 | 0.515 | 0.944 |
| 10 | 0.562 | 1.000 | 0.531 | 0.944 | 0.531 | 0.944 |
| 11 | 0.548 | 0.944 | 0.548 | 0.944 | 0.548 | 0.944 |
| 12 | 0.533 | 0.889 | 0.567 | 0.944 | 0.533 | 0.889 |
| 13 | 0.552 | 0.889 | 0.552 | 0.889 | 0.552 | 0.889 |
| 14 | 0.536 | 0.833 | 0.571 | 0.889 | 0.571 | 0.889 |
| 15 | 0.556 | 0.833 | 0.593 | 0.889 | 0.556 | 0.833 |
| 16 | 0.577 | 0.833 | 0.577 | 0.833 | 0.538 | 0.778 |
| 17 | 0.600 | 0.833 | 0.600 | 0.833 | 0.560 | 0.778 |
| 18 | 0.625 | 0.833 | 0.583 | 0.778 | 0.583 | 0.778 |
| 19 | 0.652 | 0.833 | 0.609 | 0.778 | 0.565 | 0.722 |
| 20 | 0.682 | 0.833 | 0.636 | 0.778 | 0.591 | 0.722 |
| 21 | 0.714 | 0.833 | 0.667 | 0.778 | 0.619 | 0.722 |
| 22 | 0.750 | 0.833 | 0.700 | 0.778 | 0.650 | 0.722 |
| 23 | 0.789 | 0.833 | 0.684 | 0.722 | 0.684 | 0.722 |
| 24 | 0.778 | 0.778 | 0.722 | 0.722 | 0.667 | 0.667 |
| 25 | 0.824 | 0.778 | 0.706 | 0.667 | 0.706 | 0.667 |
| 26 | 0.875 | 0.778 | 0.750 | 0.667 | 0.750 | 0.667 |
| 27 | 0.867 | 0.722 | 0.733 | 0.611 | 0.733 | 0.611 |
| 28 | 0.857 | 0.667 | 0.786 | 0.611 | 0.714 | 0.556 |
| 29 | 0.846 | 0.611 | 0.846 | 0.611 | 0.692 | 0.500 |
| 30 | 0.833 | 0.556 | 0.917 | 0.611 | 0.667 | 0.444 |
| 31 | 0.818 | 0.500 | 0.909 | 0.556 | 0.636 | 0.389 |
| 32 | 0.800 | 0.444 | 0.900 | 0.500 | 0.600 | 0.333 |
| 33 | 0.889 | 0.444 | 1.000 | 0.500 | 0.556 | 0.278 |
| 34 | 0.875 | 0.389 | 1.000 | 0.444 | 0.500 | 0.222 |
| 35 | 1.000 | 0.389 | 1.000 | 0.389 | 0.571 | 0.222 |
| 36 | 1.000 | 0.333 | 1.000 | 0.333 | 0.500 | 0.167 |
| 37 | 1.000 | 0.278 | 1.000 | 0.278 | 0.400 | 0.111 |
| 38 | 1.000 | 0.222 | 1.000 | 0.222 | 0.500 | 0.111 |
| 39 | 1.000 | 0.167 | 1.000 | 0.167 | 0.667 | 0.111 |
| 40 | 1.000 | 0.111 | 1.000 | 0.111 | 0.500 | 0.056 |
| 41 | 1.000 | 0.056 | 1.000 | 0.056 | 0.000 | 0.000 |
| 42 | 1.000 | 0.000 | 1.000 | 0.000 | 1.000 | 0.000 |
Train data plots#
pimmslearn.plotting - INFO Saved Figures to runs/alzheimer_study/diff_analysis/AD/PI_vs_QRILC/prec_recall_curve_train.pdf
pimmslearn.plotting - INFO Saved Figures to runs/alzheimer_study/diff_analysis/AD/PI_vs_QRILC/auc_roc_curve_train.pdf
Output files:
{'results_QRILC all.pkl': PosixPath('runs/alzheimer_study/diff_analysis/AD/PI_vs_QRILC/results_QRILC all.pkl'),
'results_QRILC new.pkl': PosixPath('runs/alzheimer_study/diff_analysis/AD/PI_vs_QRILC/results_QRILC new.pkl'),
'results_ALD study all.pkl': PosixPath('runs/alzheimer_study/diff_analysis/AD/PI_vs_QRILC/results_ALD study all.pkl'),
'auc_roc_curve.pdf': PosixPath('runs/alzheimer_study/diff_analysis/AD/PI_vs_QRILC/auc_roc_curve.pdf'),
'mrmr_feat_by_model.xlsx': PosixPath('runs/alzheimer_study/diff_analysis/AD/PI_vs_QRILC/mrmr_feat_by_model.xlsx'),
'prec_recall_curve.pdf': PosixPath('runs/alzheimer_study/diff_analysis/AD/PI_vs_QRILC/prec_recall_curve.pdf'),
'prec_recall_curve_train.pdf': PosixPath('runs/alzheimer_study/diff_analysis/AD/PI_vs_QRILC/prec_recall_curve_train.pdf'),
'auc_roc_curve_train.pdf': PosixPath('runs/alzheimer_study/diff_analysis/AD/PI_vs_QRILC/auc_roc_curve_train.pdf')}