Fit logistic regression model#
based on different imputation methods
baseline: reference
model: any other selected imputation method
Parameters#
Default and set parameters for the notebook.
folder_data: str = '' # specify data directory if needed
fn_clinical_data = "data/ALD_study/processed/ald_metadata_cli.csv"
folder_experiment = "runs/appl_ald_data/plasma/proteinGroups"
model_key = 'VAE'
target = 'kleiner'
sample_id_col = 'Sample ID'
cutoff_target: int = 2 # => for binarization target >= cutoff_target
file_format = "csv"
out_folder = 'diff_analysis'
fn_qc_samples = '' # 'data/ALD_study/processed/qc_plasma_proteinGroups.pkl'
baseline = 'RSN' # default is RSN, as this was used in the original ALD study (Niu et al. 2022)
template_pred = 'pred_real_na_{}.csv' # fixed, do not change
# Parameters
cutoff_target = 0.5
folder_experiment = "runs/alzheimer_study"
target = "AD"
baseline = "PI"
model_key = "TRKNN"
out_folder = "diff_analysis"
fn_clinical_data = "runs/alzheimer_study/data/clinical_data.csv"
root - INFO Removed from global namespace: folder_data
root - INFO Removed from global namespace: fn_clinical_data
root - INFO Removed from global namespace: folder_experiment
root - INFO Removed from global namespace: model_key
root - INFO Removed from global namespace: target
root - INFO Removed from global namespace: sample_id_col
root - INFO Removed from global namespace: cutoff_target
root - INFO Removed from global namespace: file_format
root - INFO Removed from global namespace: out_folder
root - INFO Removed from global namespace: fn_qc_samples
root - INFO Removed from global namespace: baseline
root - INFO Removed from global namespace: template_pred
root - INFO Already set attribute: folder_experiment has value runs/alzheimer_study
root - INFO Already set attribute: out_folder has value diff_analysis
{'baseline': 'PI',
'cutoff_target': 0.5,
'data': PosixPath('runs/alzheimer_study/data'),
'file_format': 'csv',
'fn_clinical_data': 'runs/alzheimer_study/data/clinical_data.csv',
'fn_qc_samples': '',
'folder_data': '',
'folder_experiment': PosixPath('runs/alzheimer_study'),
'model_key': 'TRKNN',
'out_figures': PosixPath('runs/alzheimer_study/figures'),
'out_folder': PosixPath('runs/alzheimer_study/diff_analysis/AD/PI_vs_TRKNN'),
'out_metrics': PosixPath('runs/alzheimer_study'),
'out_models': PosixPath('runs/alzheimer_study'),
'out_preds': PosixPath('runs/alzheimer_study/preds'),
'sample_id_col': 'Sample ID',
'target': 'AD',
'template_pred': 'pred_real_na_{}.csv'}
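The derived paths in the listing above appear to be composed from the plain parameters. A minimal sketch of that composition, assuming the layout `<folder_experiment>/<out_folder>/<target>/<baseline>_vs_<model_key>` and a `preds` subfolder (both inferred from the printed values, not taken from the workflow code):

```python
from pathlib import Path

# Values copied from the parameter listing above.
folder_experiment = Path("runs/alzheimer_study")
out_folder_name = "diff_analysis"
target = "AD"
baseline = "PI"
model_key = "TRKNN"
template_pred = "pred_real_na_{}.csv"

# Assumed composition rule, inferred from the printed 'out_folder' value.
out_folder = folder_experiment / out_folder_name / target / f"{baseline}_vs_{model_key}"

# Assumed location of the prediction file: the template filled with the model key.
fname_pred = folder_experiment / "preds" / template_pred.format(model_key)

print(out_folder.as_posix())
print(fname_pred.as_posix())
```

Keeping the comparison name (`PI_vs_TRKNN`) in the folder path means that runs with different baselines or models do not overwrite each other.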
Load data#
Load target#
target = pd.read_csv(args.fn_clinical_data,
index_col=0,
usecols=[args.sample_id_col, args.target])
target = target.dropna()
target
| Sample ID | AD |
|---|---|
| Sample_000 | 0 |
| Sample_001 | 1 |
| Sample_002 | 1 |
| Sample_003 | 1 |
| Sample_004 | 1 |
| ... | ... |
| Sample_205 | 1 |
| Sample_206 | 0 |
| Sample_207 | 0 |
| Sample_208 | 0 |
| Sample_209 | 0 |
210 rows × 1 columns
MS proteomics or specified omics data#
Aggregated from data splits of the imputation workflow run before.
pimmslearn.io.datasplits - INFO Loaded 'train_X' from file: runs/alzheimer_study/data/train_X.csv
pimmslearn.io.datasplits - INFO Loaded 'val_y' from file: runs/alzheimer_study/data/val_y.csv
pimmslearn.io.datasplits - INFO Loaded 'test_y' from file: runs/alzheimer_study/data/test_y.csv
Sample ID protein groups
Sample_066 Q6NUJ2 16.621
Sample_056 P62993 15.059
Sample_145 Q6ZRP7 14.820
Sample_076 Q8N428 15.374
Sample_181 P35527 16.396
Name: intensity, dtype: float64
Get overlap between independent features and target
Select by ALD criteria#
Use parameters as specified in ALD study.
root - INFO Initally: N samples: 210, M feat: 1421
root - INFO Dropped features quantified in less than 126 samples.
root - INFO After feat selection: N samples: 210, M feat: 1213
root - INFO Min No. of Protein-Groups in single sample: 754
root - INFO Finally: N samples: 210, M feat: 1213
| Sample ID / protein groups | A0A024QZX5;A0A087X1N8;P35237 | A0A024R0T9;K7ER74;P02655 | A0A024R3W6;A0A024R412;O60462;O60462-2;O60462-3;O60462-4;O60462-5;Q7LBX6;X5D2Q8 | A0A024R644;A0A0A0MRU5;A0A1B0GWI2;O75503 | A0A075B6H9 | A0A075B6I0 | A0A075B6I1 | A0A075B6I6 | A0A075B6I9 | A0A075B6J9 | ... | Q9Y653;Q9Y653-2;Q9Y653-3 | Q9Y696 | Q9Y6C2 | Q9Y6N6 | Q9Y6N7;Q9Y6N7-2;Q9Y6N7-4 | Q9Y6R7 | Q9Y6X5 | Q9Y6Y8;Q9Y6Y8-2 | Q9Y6Y9 | S4R3U6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sample_000 | 15.912 | 16.852 | 15.570 | 16.481 | 20.246 | 16.764 | 17.584 | 16.988 | 20.054 | NaN | ... | 16.012 | 15.178 | NaN | 15.050 | 16.842 | 19.863 | NaN | 19.563 | 12.837 | 12.805 |
| Sample_001 | 15.936 | 16.874 | 15.519 | 16.387 | 19.941 | 18.786 | 17.144 | NaN | 19.067 | 16.188 | ... | 15.528 | 15.576 | NaN | 14.833 | 16.597 | 20.299 | 15.556 | 19.386 | 13.970 | 12.442 |
| Sample_002 | 16.111 | 14.523 | 15.935 | 16.416 | 19.251 | 16.832 | 15.671 | 17.012 | 18.569 | NaN | ... | 15.229 | 14.728 | 13.757 | 15.118 | 17.440 | 19.598 | 15.735 | 20.447 | 12.636 | 12.505 |
| Sample_003 | 16.107 | 17.032 | 15.802 | 16.979 | 19.628 | 17.852 | 18.877 | 14.182 | 18.985 | 13.438 | ... | 15.495 | 14.590 | 14.682 | 15.140 | 17.356 | 19.429 | NaN | 20.216 | 12.627 | 12.445 |
| Sample_004 | 15.603 | 15.331 | 15.375 | 16.679 | 20.450 | 18.682 | 17.081 | 14.140 | 19.686 | 14.495 | ... | 14.757 | 15.094 | 14.048 | 15.256 | 17.075 | 19.582 | 15.328 | 19.867 | 13.145 | 12.235 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| Sample_205 | 15.682 | 16.886 | 14.910 | 16.482 | 17.705 | 17.039 | NaN | 16.413 | 19.102 | 16.064 | ... | 15.235 | 15.684 | 14.236 | 15.415 | 17.551 | 17.922 | 16.340 | 19.928 | 12.929 | 11.802 |
| Sample_206 | 15.798 | 17.554 | 15.600 | 15.938 | 18.154 | 18.152 | 16.503 | 16.860 | 18.538 | 15.288 | ... | 15.422 | 16.106 | NaN | 15.345 | 17.084 | 18.708 | 14.249 | 19.433 | NaN | NaN |
| Sample_207 | 15.739 | 16.877 | 15.469 | 16.898 | 18.636 | 17.950 | 16.321 | 16.401 | 18.849 | 17.580 | ... | 15.808 | 16.098 | 14.403 | 15.715 | 16.586 | 18.725 | 16.138 | 19.599 | 13.637 | 11.174 |
| Sample_208 | 15.477 | 16.779 | 14.995 | 16.132 | 14.908 | 17.530 | NaN | 16.119 | 18.368 | 15.202 | ... | 15.157 | 16.712 | NaN | 14.640 | 16.533 | 19.411 | 15.807 | 19.545 | 13.216 | NaN |
| Sample_209 | 15.727 | 17.261 | 15.175 | 16.235 | 17.893 | 17.744 | 16.371 | 15.780 | 18.806 | 16.532 | ... | 15.237 | 15.652 | 15.211 | 14.205 | 16.749 | 19.275 | 15.732 | 19.577 | 11.042 | 11.791 |
210 rows × 1213 columns
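The log above reports that features quantified in fewer than 126 samples were dropped, which reduced 1421 features to 1213. A minimal sketch of such a prevalence filter on a toy table; the function name `select_features_by_prevalence` and the toy data are hypothetical and not part of the workflow:

```python
import numpy as np
import pandas as pd


def select_features_by_prevalence(df: pd.DataFrame, min_samples: int) -> pd.DataFrame:
    """Keep features (columns) quantified in at least `min_samples` samples (rows)."""
    n_quantified = df.notna().sum(axis=0)  # non-missing count per feature
    return df.loc[:, n_quantified >= min_samples]


# Toy data: 5 samples, 3 features with different amounts of missing values.
toy = pd.DataFrame(
    {
        "P1": [15.9, 16.1, 15.7, 15.6, 15.8],         # quantified in 5 samples
        "P2": [np.nan, np.nan, np.nan, 14.2, 14.1],   # quantified in 2 samples
        "P3": [20.2, np.nan, 19.3, 19.6, np.nan],     # quantified in 3 samples
    },
    index=[f"Sample_{i:03d}" for i in range(5)],
)

selected = select_features_by_prevalence(toy, min_samples=3)
print(selected.columns.tolist())  # ['P1', 'P3']
```

With the numbers from the log, the same call would use `min_samples=126` on the 210 × 1421 table.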
Number of complete cases which can be used:
Samples available both in proteomics data and for target: 210
Load imputations from specified model#
missing values pred. by TRKNN: runs/alzheimer_study/preds/pred_real_na_TRKNN.csv
Sample ID protein groups
Sample_204 P08581;P08581-2 13.271
Sample_040 P40121;P40121-2 14.895
Sample_060 B1AJZ9;B1AJZ9-4;H0YE38;Q5JYW6 13.983
Name: intensity, dtype: float64
Load imputations from baseline model#
Sample ID protein groups
Sample_000 A0A075B6J9 13.412
A0A075B6Q5 13.967
A0A075B6R2 12.053
A0A075B6S5 13.419
A0A087WSY4 14.256
...
Sample_209 Q9P1W8;Q9P1W8-2;Q9P1W8-4 12.540
Q9UI40;Q9UI40-2 11.810
Q9UIW2 12.704
Q9UMX0;Q9UMX0-2;Q9UMX0-4 11.626
Q9UP79 12.553
Name: intensity, Length: 46401, dtype: float64
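The imputed values are stored in long format, indexed by sample and protein group, while the observed data is a wide samples × features table. One way to combine the two, sketched here on toy data with hypothetical values (the workflow itself may do this differently), is to pivot the predictions to wide format and fill only the missing entries:

```python
import numpy as np
import pandas as pd

# Observed intensities in wide format (toy values), with missing entries.
observed = pd.DataFrame(
    {"P1": [15.9, np.nan], "P2": [np.nan, 14.1]},
    index=pd.Index(["Sample_000", "Sample_001"], name="Sample ID"),
)
observed.columns.name = "protein groups"

# Imputed values in long format, as in the Series printed above.
pred = pd.Series(
    [13.4, 12.5],
    index=pd.MultiIndex.from_tuples(
        [("Sample_000", "P2"), ("Sample_001", "P1")],
        names=["Sample ID", "protein groups"],
    ),
    name="intensity",
)

# Long -> wide, then fill only the gaps; observed values stay untouched.
pred_wide = pred.unstack("protein groups")
completed = observed.fillna(pred_wide)
print(completed)
```

Because `fillna` aligns on both index and columns, an imputed value can never overwrite a measured intensity.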
Modeling setup#
General approach:
use one train/test split of the data
select the best 10 features from the training data X_train, y_train before binarization of the target
dichotomize (binarize) the target into two groups (0 and 1)
evaluate the model on the test data X_test, y_test
Repeat general approach for
all original ALD data: all features used in the original ALD study
all model data: all features available by using the self-supervised deep learning model
newly available feat only: the subset of features available from the self-supervised deep learning model which were newly retained using the new approach
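The general approach can be sketched with scikit-learn on synthetic data. This is an illustration under assumptions: the feature selection method is not named in this section, so `SelectKBest` with an ANOVA F-score is swapped in, the target is already binary, and the data and split size are made up.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n_samples, n_features = 210, 50
X = rng.normal(size=(n_samples, n_features))
# Synthetic binary target driven by the first three features (toy data only).
y = ((X[:, :3].sum(axis=1) + rng.normal(scale=0.5, size=n_samples)) > 0).astype(int)

# One stratified train/test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Feature selection is part of the pipeline, so it only ever sees training data.
model = make_pipeline(
    StandardScaler(),
    SelectKBest(score_func=f_classif, k=10),  # assumed selector, see lead-in
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)

# Evaluate on the held-out test split.
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"ROC AUC on held-out test split: {auc:.3f}")
```

The same pipeline would then be fitted three times, once per feature set listed above.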
All data:
| Sample ID / protein groups | A0A024QZX5;A0A087X1N8;P35237 | A0A024R0T9;K7ER74;P02655 | A0A024R3W6;A0A024R412;O60462;O60462-2;O60462-3;O60462-4;O60462-5;Q7LBX6;X5D2Q8 | A0A024R644;A0A0A0MRU5;A0A1B0GWI2;O75503 | A0A075B6H7 | A0A075B6H9 | A0A075B6I0 | A0A075B6I1 | A0A075B6I6 | A0A075B6I9 | ... | Q9Y653;Q9Y653-2;Q9Y653-3 | Q9Y696 | Q9Y6C2 | Q9Y6N6 | Q9Y6N7;Q9Y6N7-2;Q9Y6N7-4 | Q9Y6R7 | Q9Y6X5 | Q9Y6Y8;Q9Y6Y8-2 | Q9Y6Y9 | S4R3U6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sample_000 | 15.912 | 16.852 | 15.570 | 16.481 | 17.301 | 20.246 | 16.764 | 17.584 | 16.988 | 20.054 | ... | 16.012 | 15.178 | 13.770 | 15.050 | 16.842 | 19.863 | 15.931 | 19.563 | 12.837 | 12.805 |
| Sample_001 | 15.936 | 16.874 | 15.519 | 16.387 | 13.796 | 19.941 | 18.786 | 17.144 | 16.954 | 19.067 | ... | 15.528 | 15.576 | 13.938 | 14.833 | 16.597 | 20.299 | 15.556 | 19.386 | 13.970 | 12.442 |
| Sample_002 | 16.111 | 14.523 | 15.935 | 16.416 | 18.175 | 19.251 | 16.832 | 15.671 | 17.012 | 18.569 | ... | 15.229 | 14.728 | 13.757 | 15.118 | 17.440 | 19.598 | 15.735 | 20.447 | 12.636 | 12.505 |
| Sample_003 | 16.107 | 17.032 | 15.802 | 16.979 | 15.963 | 19.628 | 17.852 | 18.877 | 14.182 | 18.985 | ... | 15.495 | 14.590 | 14.682 | 15.140 | 17.356 | 19.429 | 16.006 | 20.216 | 12.627 | 12.445 |
| Sample_004 | 15.603 | 15.331 | 15.375 | 16.679 | 15.473 | 20.450 | 18.682 | 17.081 | 14.140 | 19.686 | ... | 14.757 | 15.094 | 14.048 | 15.256 | 17.075 | 19.582 | 15.328 | 19.867 | 13.145 | 12.235 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| Sample_205 | 15.682 | 16.886 | 14.910 | 16.482 | 16.035 | 17.705 | 17.039 | 15.261 | 16.413 | 19.102 | ... | 15.235 | 15.684 | 14.236 | 15.415 | 17.551 | 17.922 | 16.340 | 19.928 | 12.929 | 11.802 |
| Sample_206 | 15.798 | 17.554 | 15.600 | 15.938 | 15.820 | 18.154 | 18.152 | 16.503 | 16.860 | 18.538 | ... | 15.422 | 16.106 | 14.543 | 15.345 | 17.084 | 18.708 | 14.249 | 19.433 | 12.319 | 11.307 |
| Sample_207 | 15.739 | 16.877 | 15.469 | 16.898 | 15.779 | 18.636 | 17.950 | 16.321 | 16.401 | 18.849 | ... | 15.808 | 16.098 | 14.403 | 15.715 | 16.586 | 18.725 | 16.138 | 19.599 | 13.637 | 11.174 |
| Sample_208 | 15.477 | 16.779 | 14.995 | 16.132 | 15.361 | 14.908 | 17.530 | 15.748 | 16.119 | 18.368 | ... | 15.157 | 16.712 | 14.371 | 14.640 | 16.533 | 19.411 | 15.807 | 19.545 | 13.216 | 10.901 |
| Sample_209 | 15.727 | 17.261 | 15.175 | 16.235 | 15.840 | 17.893 | 17.744 | 16.371 | 15.780 | 18.806 | ... | 15.237 | 15.652 | 15.211 | 14.205 | 16.749 | 19.275 | 15.732 | 19.577 | 11.042 | 11.791 |
210 rows × 1421 columns
Subset of data by ALD criteria#
| Sample ID / protein groups | A0A024QZX5;A0A087X1N8;P35237 | A0A024R0T9;K7ER74;P02655 | A0A024R3W6;A0A024R412;O60462;O60462-2;O60462-3;O60462-4;O60462-5;Q7LBX6;X5D2Q8 | A0A024R644;A0A0A0MRU5;A0A1B0GWI2;O75503 | A0A075B6H9 | A0A075B6I0 | A0A075B6I1 | A0A075B6I6 | A0A075B6I9 | A0A075B6K4 | ... | O14793 | O95479;R4GMU1 | P01282;P01282-2 | P10619;P10619-2;X6R5C5;X6R8A1 | P21810 | Q14956;Q14956-2 | Q6ZMP0;Q6ZMP0-2 | Q9HBW1 | Q9NY15 | P17050 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sample_000 | 15.912 | 16.852 | 15.570 | 16.481 | 20.246 | 16.764 | 17.584 | 16.988 | 20.054 | 16.148 | ... | 12.449 | 13.051 | 13.220 | 12.867 | 13.424 | 13.300 | 13.211 | 14.187 | 13.768 | 11.609 |
| Sample_001 | 15.936 | 16.874 | 15.519 | 16.387 | 19.941 | 18.786 | 17.144 | 13.309 | 19.067 | 16.127 | ... | 12.691 | 12.670 | 13.043 | 12.063 | 12.694 | 12.414 | 12.431 | 11.873 | 13.834 | 12.740 |
| Sample_002 | 16.111 | 14.523 | 15.935 | 16.416 | 19.251 | 16.832 | 15.671 | 17.012 | 18.569 | 15.387 | ... | 14.611 | 11.361 | 14.329 | 12.444 | 13.242 | 13.526 | 12.610 | 12.567 | 12.622 | 12.551 |
| Sample_003 | 16.107 | 17.032 | 15.802 | 16.979 | 19.628 | 17.852 | 18.877 | 14.182 | 18.985 | 16.565 | ... | 12.778 | 12.245 | 14.173 | 12.847 | 12.345 | 13.331 | 13.855 | 13.287 | 11.289 | 13.354 |
| Sample_004 | 15.603 | 15.331 | 15.375 | 16.679 | 20.450 | 18.682 | 17.081 | 14.140 | 19.686 | 16.418 | ... | 13.803 | 14.433 | 12.759 | 13.537 | 13.614 | 12.396 | 14.186 | 14.139 | 11.627 | 11.973 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| Sample_205 | 15.682 | 16.886 | 14.910 | 16.482 | 17.705 | 17.039 | 13.688 | 16.413 | 19.102 | 15.350 | ... | 14.269 | 14.064 | 16.826 | 18.182 | 15.225 | 15.044 | 14.192 | 16.605 | 14.995 | 14.257 |
| Sample_206 | 15.798 | 17.554 | 15.600 | 15.938 | 18.154 | 18.152 | 16.503 | 16.860 | 18.538 | 16.582 | ... | 14.273 | 17.700 | 16.802 | 20.202 | 15.280 | 15.086 | 13.978 | 18.086 | 15.557 | 14.171 |
| Sample_207 | 15.739 | 16.877 | 15.469 | 16.898 | 18.636 | 17.950 | 16.321 | 16.401 | 18.849 | 15.768 | ... | 14.473 | 16.882 | 16.917 | 20.105 | 15.690 | 15.135 | 13.138 | 17.066 | 15.706 | 15.690 |
| Sample_208 | 15.477 | 16.779 | 14.995 | 16.132 | 14.908 | 17.530 | 13.150 | 16.119 | 18.368 | 17.560 | ... | 15.234 | 17.175 | 16.521 | 18.859 | 15.305 | 15.161 | 13.006 | 17.917 | 15.396 | 14.371 |
| Sample_209 | 15.727 | 17.261 | 15.175 | 16.235 | 17.893 | 17.744 | 16.371 | 15.780 | 18.806 | 16.338 | ... | 14.556 | 16.656 | 16.954 | 18.493 | 15.823 | 14.626 | 13.385 | 17.767 | 15.687 | 13.573 |
210 rows × 1213 columns
Features which would not have been included using ALD criteria:
Index(['A0A075B6H7', 'A0A075B6Q5', 'A0A075B7B8', 'A0A087WSY4',
'A0A087WTT8;A0A0A0MQX5;O94779;O94779-2', 'A0A087WXB8;Q9Y274',
'A0A087WXE9;E9PQ70;Q6UXH9;Q6UXH9-2;Q6UXH9-3',
'A0A087X1Z2;C9JTV4;H0Y4Y4;Q8WYH2;Q96C19;Q9BUP0;Q9BUP0-2',
'A0A0A0MQS9;A0A0A0MTC7;Q16363;Q16363-2', 'A0A0A0MSN4;P12821;P12821-2',
...
'Q9NZ94;Q9NZ94-2;Q9NZ94-3', 'Q9NZU1', 'Q9P1W8;Q9P1W8-2;Q9P1W8-4',
'Q9UHI8', 'Q9UI40;Q9UI40-2',
'Q9UIB8;Q9UIB8-2;Q9UIB8-3;Q9UIB8-4;Q9UIB8-5;Q9UIB8-6',
'Q9UKZ4;Q9UKZ4-2', 'Q9UMX0;Q9UMX0-2;Q9UMX0-4', 'Q9Y281;Q9Y281-3',
'Q9Y490'],
dtype='object', name='protein groups', length=208)
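The 208 features listed above are consistent with the difference between the two column sets (1421 − 1213 = 208). A minimal sketch of that set difference using `pandas.Index.difference`, with made-up feature names:

```python
import pandas as pd

# Toy stand-ins for the column indices of the two tables above.
all_features = pd.Index(["P1", "P2", "P3", "P4"], name="protein groups")
ald_features = pd.Index(["P1", "P3"], name="protein groups")

# Features present in the full table but not in the ALD-criteria subset.
new_features = all_features.difference(ald_features)
print(new_features.tolist())  # ['P2', 'P4']
```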
Binarize targets, but also keep groups for stratification
| binarized AD (rows) / original AD (columns) | 0 | 1 |
|---|---|---|
| False | 122 | 0 |
| True | 0 | 88 |
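The binarization follows the rule given in the parameters (`target >= cutoff_target`). A small sketch on toy labels, keeping a copy of the original values for stratification; the variable names and the toy data are illustrative assumptions:

```python
import pandas as pd

cutoff_target = 0.5

# Toy target with the same 0/1 coding as the AD column above.
target = pd.Series(
    [0, 1, 1, 0, 1],
    index=[f"Sample_{i:03d}" for i in range(5)],
    name="AD",
)

target_to_group = target.copy()  # original values, kept for stratification
y = (target >= cutoff_target).rename("AD >= cutoff")  # boolean target

# Cross-tabulate binarized against original values as a sanity check.
counts = pd.crosstab(y, target_to_group)
print(counts)
```

All counts should sit on the diagonal, as in the table above; any off-diagonal count would indicate a wrong cutoff.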
Determine best number of parameters by cross validation procedure#
Using subset of data by ALD criteria:
| n_features | fit_time (mean) | fit_time (std) | score_time (mean) | score_time (std) | test_precision (mean) | test_precision (std) | test_recall (mean) | test_recall (std) | test_f1 (mean) | test_f1 (std) | test_balanced_accuracy (mean) | test_balanced_accuracy (std) | test_roc_auc (mean) | test_roc_auc (std) | test_average_precision (mean) | test_average_precision (std) | n_observations (mean) | n_observations (std) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.005 | 0.003 | 0.058 | 0.027 | 0.899 | 0.158 | 0.169 | 0.089 | 0.274 | 0.124 | 0.576 | 0.043 | 0.856 | 0.060 | 0.823 | 0.086 | 210.000 | 0.000 |
| 2 | 0.005 | 0.002 | 0.049 | 0.019 | 0.629 | 0.134 | 0.431 | 0.141 | 0.497 | 0.115 | 0.618 | 0.071 | 0.693 | 0.083 | 0.633 | 0.095 | 210.000 | 0.000 |
| 3 | 0.004 | 0.001 | 0.043 | 0.012 | 0.663 | 0.098 | 0.611 | 0.122 | 0.631 | 0.094 | 0.691 | 0.072 | 0.789 | 0.070 | 0.736 | 0.099 | 210.000 | 0.000 |
| 4 | 0.005 | 0.002 | 0.059 | 0.020 | 0.662 | 0.097 | 0.620 | 0.126 | 0.635 | 0.096 | 0.694 | 0.073 | 0.780 | 0.073 | 0.725 | 0.101 | 210.000 | 0.000 |
| 5 | 0.004 | 0.001 | 0.041 | 0.010 | 0.754 | 0.102 | 0.716 | 0.099 | 0.728 | 0.077 | 0.768 | 0.065 | 0.857 | 0.057 | 0.836 | 0.063 | 210.000 | 0.000 |
| 6 | 0.004 | 0.001 | 0.045 | 0.014 | 0.810 | 0.078 | 0.829 | 0.095 | 0.815 | 0.062 | 0.842 | 0.054 | 0.908 | 0.046 | 0.889 | 0.053 | 210.000 | 0.000 |
| 7 | 0.004 | 0.001 | 0.042 | 0.012 | 0.810 | 0.079 | 0.827 | 0.100 | 0.814 | 0.066 | 0.841 | 0.057 | 0.906 | 0.048 | 0.887 | 0.055 | 210.000 | 0.000 |
| 8 | 0.004 | 0.001 | 0.041 | 0.010 | 0.825 | 0.087 | 0.825 | 0.098 | 0.820 | 0.065 | 0.846 | 0.055 | 0.909 | 0.047 | 0.882 | 0.063 | 210.000 | 0.000 |
| 9 | 0.004 | 0.001 | 0.036 | 0.005 | 0.817 | 0.083 | 0.807 | 0.104 | 0.806 | 0.066 | 0.835 | 0.055 | 0.908 | 0.048 | 0.885 | 0.059 | 210.000 | 0.000 |
| 10 | 0.004 | 0.001 | 0.039 | 0.010 | 0.843 | 0.088 | 0.824 | 0.105 | 0.828 | 0.072 | 0.853 | 0.061 | 0.919 | 0.049 | 0.906 | 0.056 | 210.000 | 0.000 |
| 11 | 0.004 | 0.002 | 0.038 | 0.007 | 0.835 | 0.087 | 0.817 | 0.108 | 0.821 | 0.075 | 0.848 | 0.064 | 0.920 | 0.049 | 0.907 | 0.056 | 210.000 | 0.000 |
| 12 | 0.005 | 0.002 | 0.048 | 0.018 | 0.829 | 0.085 | 0.830 | 0.098 | 0.825 | 0.073 | 0.851 | 0.063 | 0.920 | 0.050 | 0.909 | 0.055 | 210.000 | 0.000 |
| 13 | 0.006 | 0.003 | 0.057 | 0.025 | 0.831 | 0.089 | 0.828 | 0.099 | 0.825 | 0.073 | 0.850 | 0.063 | 0.919 | 0.050 | 0.908 | 0.055 | 210.000 | 0.000 |
| 14 | 0.004 | 0.000 | 0.040 | 0.007 | 0.821 | 0.086 | 0.825 | 0.092 | 0.819 | 0.066 | 0.845 | 0.057 | 0.918 | 0.049 | 0.908 | 0.053 | 210.000 | 0.000 |
| 15 | 0.004 | 0.000 | 0.038 | 0.002 | 0.828 | 0.089 | 0.825 | 0.092 | 0.822 | 0.069 | 0.848 | 0.059 | 0.919 | 0.049 | 0.911 | 0.051 | 210.000 | 0.000 |
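A table like the one above can be built by repeating cross-validation for an increasing number of selected features and aggregating each metric into mean and standard deviation. The sketch below uses scikit-learn's `cross_validate`, whose result keys (`fit_time`, `score_time`, `test_<metric>`) match the column names shown. The feature selector (`SelectKBest`), the CV scheme and the synthetic data are assumptions, and the `n_observations` column is not reproduced.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 20))
# Synthetic binary target driven by the first three features (toy data only).
y = ((X[:, :3].sum(axis=1) + rng.normal(scale=0.5, size=120)) > 0).astype(int)

scoring = ["precision", "recall", "f1", "balanced_accuracy",
           "roc_auc", "average_precision"]
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=0)  # assumed scheme

rows = {}
for n_features in range(1, 6):
    pipe = make_pipeline(
        StandardScaler(),
        SelectKBest(score_func=f_classif, k=n_features),  # assumed selector
        LogisticRegression(max_iter=1000),
    )
    res = pd.DataFrame(cross_validate(pipe, X, y, cv=cv, scoring=scoring))
    # One row per n_features: (metric, mean/std) pairs over all CV folds.
    rows[n_features] = res.agg(["mean", "std"]).unstack()

summary = pd.DataFrame(rows).T
summary.index.name = "n_features"
print(summary.round(3))
```

The number of features would then be chosen where the metric of interest, for example `test_roc_auc`, stops improving.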
Using all data:
| n_features | fit_time (mean) | fit_time (std) | score_time (mean) | score_time (std) | test_precision (mean) | test_precision (std) | test_recall (mean) | test_recall (std) | test_f1 (mean) | test_f1 (std) | test_balanced_accuracy (mean) | test_balanced_accuracy (std) | test_roc_auc (mean) | test_roc_auc (std) | test_average_precision (mean) | test_average_precision (std) | n_observations (mean) | n_observations (std) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.004 | 0.002 | 0.049 | 0.016 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.496 | 0.008 | 0.860 | 0.060 | 0.828 | 0.086 | 210.000 | 0.000 |
| 2 | 0.005 | 0.002 | 0.053 | 0.020 | 0.615 | 0.124 | 0.431 | 0.125 | 0.496 | 0.107 | 0.614 | 0.066 | 0.715 | 0.073 | 0.654 | 0.091 | 210.000 | 0.000 |
| 3 | 0.003 | 0.000 | 0.035 | 0.006 | 0.815 | 0.084 | 0.748 | 0.091 | 0.774 | 0.060 | 0.809 | 0.048 | 0.881 | 0.050 | 0.849 | 0.071 | 210.000 | 0.000 |
| 4 | 0.004 | 0.001 | 0.035 | 0.007 | 0.806 | 0.090 | 0.733 | 0.105 | 0.761 | 0.072 | 0.799 | 0.057 | 0.877 | 0.049 | 0.841 | 0.071 | 210.000 | 0.000 |
| 5 | 0.003 | 0.000 | 0.029 | 0.004 | 0.788 | 0.091 | 0.734 | 0.098 | 0.754 | 0.066 | 0.792 | 0.053 | 0.874 | 0.046 | 0.839 | 0.069 | 210.000 | 0.000 |
| 6 | 0.003 | 0.000 | 0.032 | 0.003 | 0.783 | 0.091 | 0.726 | 0.102 | 0.747 | 0.067 | 0.786 | 0.054 | 0.870 | 0.046 | 0.835 | 0.068 | 210.000 | 0.000 |
| 7 | 0.003 | 0.001 | 0.026 | 0.006 | 0.781 | 0.088 | 0.731 | 0.118 | 0.748 | 0.076 | 0.788 | 0.060 | 0.882 | 0.049 | 0.839 | 0.075 | 210.000 | 0.000 |
| 8 | 0.003 | 0.001 | 0.031 | 0.005 | 0.797 | 0.082 | 0.735 | 0.117 | 0.759 | 0.079 | 0.798 | 0.063 | 0.900 | 0.045 | 0.870 | 0.064 | 210.000 | 0.000 |
| 9 | 0.002 | 0.000 | 0.022 | 0.003 | 0.807 | 0.091 | 0.772 | 0.098 | 0.785 | 0.073 | 0.817 | 0.061 | 0.905 | 0.047 | 0.873 | 0.068 | 210.000 | 0.000 |
| 10 | 0.003 | 0.001 | 0.027 | 0.006 | 0.809 | 0.086 | 0.789 | 0.109 | 0.794 | 0.075 | 0.825 | 0.063 | 0.912 | 0.046 | 0.880 | 0.062 | 210.000 | 0.000 |
| 11 | 0.002 | 0.000 | 0.021 | 0.000 | 0.809 | 0.084 | 0.784 | 0.109 | 0.792 | 0.076 | 0.823 | 0.063 | 0.913 | 0.046 | 0.882 | 0.062 | 210.000 | 0.000 |
| 12 | 0.002 | 0.000 | 0.021 | 0.000 | 0.846 | 0.084 | 0.803 | 0.096 | 0.820 | 0.067 | 0.846 | 0.057 | 0.925 | 0.042 | 0.899 | 0.058 | 210.000 | 0.000 |
| 13 | 0.002 | 0.000 | 0.021 | 0.000 | 0.855 | 0.080 | 0.809 | 0.090 | 0.828 | 0.064 | 0.853 | 0.054 | 0.925 | 0.041 | 0.899 | 0.057 | 210.000 | 0.000 |
| 14 | 0.002 | 0.000 | 0.021 | 0.000 | 0.846 | 0.074 | 0.817 | 0.079 | 0.828 | 0.058 | 0.853 | 0.049 | 0.929 | 0.039 | 0.907 | 0.052 | 210.000 | 0.000 |
| 15 | 0.002 | 0.000 | 0.021 | 0.000 | 0.845 | 0.076 | 0.811 | 0.080 | 0.825 | 0.061 | 0.850 | 0.052 | 0.930 | 0.039 | 0.908 | 0.051 | 210.000 | 0.000 |
Using only new features:
| n_features | fit_time mean | fit_time std | score_time mean | score_time std | test_precision mean | test_precision std | test_recall mean | test_recall std | test_f1 mean | test_f1 std | test_balanced_accuracy mean | test_balanced_accuracy std | test_roc_auc mean | test_roc_auc std | test_average_precision mean | test_average_precision std | n_observations mean | n_observations std |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.002 | 0.000 | 0.021 | 0.001 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.500 | 0.000 | 0.731 | 0.071 | 0.676 | 0.083 | 210.000 | 0.000 |
| 2 | 0.002 | 0.000 | 0.021 | 0.000 | 0.565 | 0.121 | 0.422 | 0.110 | 0.476 | 0.098 | 0.590 | 0.066 | 0.689 | 0.074 | 0.668 | 0.070 | 210.000 | 0.000 |
| 3 | 0.002 | 0.000 | 0.021 | 0.001 | 0.560 | 0.118 | 0.412 | 0.106 | 0.467 | 0.095 | 0.585 | 0.063 | 0.682 | 0.076 | 0.661 | 0.071 | 210.000 | 0.000 |
| 4 | 0.002 | 0.000 | 0.021 | 0.000 | 0.613 | 0.118 | 0.451 | 0.092 | 0.514 | 0.089 | 0.618 | 0.066 | 0.688 | 0.070 | 0.687 | 0.062 | 210.000 | 0.000 |
| 5 | 0.002 | 0.000 | 0.021 | 0.000 | 0.630 | 0.114 | 0.460 | 0.089 | 0.526 | 0.082 | 0.628 | 0.059 | 0.706 | 0.070 | 0.698 | 0.068 | 210.000 | 0.000 |
| 6 | 0.002 | 0.000 | 0.021 | 0.000 | 0.615 | 0.098 | 0.463 | 0.091 | 0.521 | 0.076 | 0.623 | 0.053 | 0.710 | 0.068 | 0.689 | 0.068 | 210.000 | 0.000 |
| 7 | 0.002 | 0.000 | 0.021 | 0.000 | 0.625 | 0.101 | 0.499 | 0.101 | 0.549 | 0.084 | 0.638 | 0.061 | 0.709 | 0.065 | 0.696 | 0.067 | 210.000 | 0.000 |
| 8 | 0.002 | 0.000 | 0.021 | 0.001 | 0.623 | 0.102 | 0.492 | 0.103 | 0.544 | 0.086 | 0.635 | 0.061 | 0.703 | 0.063 | 0.688 | 0.064 | 210.000 | 0.000 |
| 9 | 0.002 | 0.000 | 0.021 | 0.000 | 0.615 | 0.105 | 0.492 | 0.108 | 0.541 | 0.092 | 0.632 | 0.063 | 0.697 | 0.063 | 0.680 | 0.064 | 210.000 | 0.000 |
| 10 | 0.002 | 0.000 | 0.021 | 0.000 | 0.613 | 0.103 | 0.480 | 0.102 | 0.531 | 0.087 | 0.627 | 0.058 | 0.694 | 0.070 | 0.683 | 0.070 | 210.000 | 0.000 |
| 11 | 0.002 | 0.000 | 0.021 | 0.000 | 0.648 | 0.108 | 0.508 | 0.104 | 0.561 | 0.079 | 0.648 | 0.058 | 0.735 | 0.061 | 0.714 | 0.067 | 210.000 | 0.000 |
| 12 | 0.002 | 0.000 | 0.021 | 0.000 | 0.669 | 0.105 | 0.525 | 0.102 | 0.580 | 0.079 | 0.664 | 0.058 | 0.735 | 0.061 | 0.716 | 0.068 | 210.000 | 0.000 |
| 13 | 0.002 | 0.000 | 0.021 | 0.000 | 0.655 | 0.106 | 0.522 | 0.107 | 0.573 | 0.084 | 0.658 | 0.061 | 0.732 | 0.061 | 0.711 | 0.068 | 210.000 | 0.000 |
| 14 | 0.002 | 0.000 | 0.021 | 0.001 | 0.648 | 0.108 | 0.510 | 0.109 | 0.563 | 0.088 | 0.651 | 0.063 | 0.726 | 0.061 | 0.702 | 0.071 | 210.000 | 0.000 |
| 15 | 0.002 | 0.000 | 0.021 | 0.000 | 0.653 | 0.103 | 0.536 | 0.097 | 0.582 | 0.079 | 0.660 | 0.062 | 0.744 | 0.062 | 0.716 | 0.074 | 210.000 | 0.000 |
Best number of features by subset of the data:#
| | ald | all | new |
|---|---|---|---|
| fit_time | 13 | 2 | 14 |
| score_time | 4 | 2 | 8 |
| test_precision | 1 | 13 | 12 |
| test_recall | 12 | 14 | 15 |
| test_f1 | 10 | 14 | 15 |
| test_balanced_accuracy | 10 | 13 | 12 |
| test_roc_auc | 11 | 15 | 15 |
| test_average_precision | 15 | 15 | 15 |
| n_observations | 1 | 1 | 1 |
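The selection code itself is not shown in this output. As a minimal sketch, assuming "best" means the number of features with the highest mean cross-validated score, the `new` column can be reproduced for two metrics from the rounded means in the "Using only new features" table. The helper name `best_n_features` and the tie-breaking rule are assumptions made for illustration.

```python
# Mean cross-validated scores per number of features, copied from the
# "Using only new features" table (values rounded to three decimals).
roc_auc_mean = {
    1: 0.731, 2: 0.689, 3: 0.682, 4: 0.688, 5: 0.706,
    6: 0.710, 7: 0.709, 8: 0.703, 9: 0.697, 10: 0.694,
    11: 0.735, 12: 0.735, 13: 0.732, 14: 0.726, 15: 0.744,
}
precision_mean = {
    1: 0.000, 2: 0.565, 3: 0.560, 4: 0.613, 5: 0.630,
    6: 0.615, 7: 0.625, 8: 0.623, 9: 0.615, 10: 0.613,
    11: 0.648, 12: 0.669, 13: 0.655, 14: 0.648, 15: 0.653,
}


def best_n_features(mean_scores):
    """Return the number of features with the highest mean score.

    Hypothetical helper: ties go to the smaller number of features,
    which is an assumption and not taken from the notebook.
    """
    return max(sorted(mean_scores), key=mean_scores.get)


best_roc_auc = best_n_features(roc_auc_mean)
best_precision = best_n_features(precision_mean)
print(best_roc_auc, best_precision)
```

Under this assumption the result agrees with the `new` column for `test_roc_auc` and `test_precision`. Rows such as `fit_time` and `n_observations` are not quality metrics, so their "best" values carry no modelling meaning.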
Train, test split#
Show number of cases in train and test data
| | train | test |
|---|---|---|
| False | 98 | 24 |
| True | 70 | 18 |
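A quick check of the class balance follows directly from these counts. The sketch below only does the arithmetic on the numbers in the table; the variable and function names are chosen for illustration.

```python
# Case counts copied from the train/test table above.
counts = {
    "train": {False: 98, True: 70},
    "test": {False: 24, True: 18},
}


def positive_fraction(split_counts):
    """Share of positive (True) cases within one split."""
    total = split_counts[True] + split_counts[False]
    return split_counts[True] / total


n_train = sum(counts["train"].values())
n_test = sum(counts["test"].values())
test_share = n_test / (n_train + n_test)

for name, split_counts in counts.items():
    print(f"{name}: {positive_fraction(split_counts):.3f} positive")
print(f"test share of all samples: {test_share:.2f}")
```

The positive fraction is close in both splits (about 0.417 in train and 0.429 in test), and the test split holds 20% of the 210 samples.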
Results#
- run_model returns dataclasses with the further needed results
- add mrmr selection of data (select best number of features to use instead of fixing it)
Save results for final model on entire data, new features and ALD study criteria selected data.
ROC-AUC on test split#
pimmslearn.plotting - INFO Saved Figures to runs/alzheimer_study/diff_analysis/AD/PI_vs_TRKNN/auc_roc_curve.pdf
Data used to plot ROC:
| | ALD study all fpr | ALD study all tpr | TRKNN all fpr | TRKNN all tpr | TRKNN new fpr | TRKNN new tpr |
|---|---|---|---|---|---|---|
| 0 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| 1 | 0.000 | 0.056 | 0.000 | 0.056 | 0.042 | 0.000 |
| 2 | 0.000 | 0.611 | 0.000 | 0.222 | 0.042 | 0.222 |
| 3 | 0.042 | 0.611 | 0.042 | 0.222 | 0.125 | 0.222 |
| 4 | 0.042 | 0.833 | 0.042 | 0.389 | 0.125 | 0.333 |
| 5 | 0.167 | 0.833 | 0.125 | 0.389 | 0.167 | 0.333 |
| 6 | 0.167 | 0.889 | 0.125 | 0.556 | 0.167 | 0.500 |
| 7 | 0.542 | 0.889 | 0.292 | 0.556 | 0.250 | 0.500 |
| 8 | 0.542 | 0.944 | 0.292 | 0.611 | 0.250 | 0.556 |
| 9 | 0.583 | 0.944 | 0.333 | 0.611 | 0.292 | 0.556 |
| 10 | 0.583 | 1.000 | 0.333 | 0.778 | 0.292 | 0.611 |
| 11 | 1.000 | 1.000 | 0.417 | 0.778 | 0.333 | 0.611 |
| 12 | NaN | NaN | 0.417 | 0.833 | 0.333 | 0.667 |
| 13 | NaN | NaN | 0.458 | 0.833 | 0.458 | 0.667 |
| 14 | NaN | NaN | 0.458 | 0.944 | 0.458 | 0.722 |
| 15 | NaN | NaN | 0.500 | 0.944 | 0.500 | 0.722 |
| 16 | NaN | NaN | 0.500 | 1.000 | 0.500 | 0.778 |
| 17 | NaN | NaN | 1.000 | 1.000 | 0.583 | 0.778 |
| 18 | NaN | NaN | NaN | NaN | 0.583 | 0.944 |
| 19 | NaN | NaN | NaN | NaN | 0.708 | 0.944 |
| 20 | NaN | NaN | NaN | NaN | 0.708 | 1.000 |
| 21 | NaN | NaN | NaN | NaN | 1.000 | 1.000 |
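The area under a ROC curve can be approximated from such points with the trapezoidal rule. The sketch below applies it to the `ALD study all` columns, leaving out the NaN padding rows. Because the table values are rounded to three decimals, the result is only an approximation of the AUC the notebook would report; the function name `trapezoid_auc` is chosen for illustration.

```python
# (fpr, tpr) points of the "ALD study all" curve, copied from the table
# above; the NaN rows only pad the shorter curve and are left out.
fpr = [0.000, 0.000, 0.000, 0.042, 0.042, 0.167,
       0.167, 0.542, 0.542, 0.583, 0.583, 1.000]
tpr = [0.000, 0.056, 0.611, 0.611, 0.833, 0.833,
       0.889, 0.889, 0.944, 0.944, 1.000, 1.000]


def trapezoid_auc(x, y):
    """Area under the piecewise-linear curve through (x, y), x non-decreasing."""
    area = 0.0
    for i in range(1, len(x)):
        width = x[i] - x[i - 1]
        mean_height = (y[i] + y[i - 1]) / 2
        area += width * mean_height
    return area


auc_ald_all = trapezoid_auc(fpr, tpr)
print(f"approximate ROC-AUC (ALD study all): {auc_ald_all:.3f}")
```

The same function can be applied to the `TRKNN all` and `TRKNN new` columns to compare the three models on the test split.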
Features selected for final models#
| rank | ALD study all | TRKNN all | TRKNN new |
|---|---|---|---|
| 0 | P10636-2;P10636-6 | P10636-2;P10636-6 | P31321 |
| 1 | K7ER15;Q9H0R4;Q9H0R4-2 | P08670 | P61088 |
| 2 | P02741 | P01011 | Q14894 |
| 3 | P61981 | Q9Y2T3;Q9Y2T3-3 | F8WBF9;Q5TH30;Q9UGV2;Q9UGV2-2;Q9UGV2-3 |
| 4 | P04075 | P10909-3 | Q9NUQ9 |
| 5 | P14174 | P61981 | Q9GZT8;Q9GZT8-2 |
| 6 | Q9Y2T3;Q9Y2T3-3 | P15151-2 | J3KSJ8;Q9UD71;Q9UD71-2 |
| 7 | P08294 | P04075 | A0A0C4DGV4;E9PLX3;O43504;R4GMU8 |
| 8 | P00338;P00338-3 | P25189;P25189-2 | Q96GD0 |
| 9 | P14618 | P14174 | A0A0J9YW36;Q9NZ72;Q9NZ72-2 |
| 10 | Q6EMK4 | P63104 | Q9H741 |
| 11 | None | P00492 | P51688 |
| 12 | None | P00338;P00338-3 | P01743 |
| 13 | None | Q6EMK4 | A0A1W2PQ94;B4DS77;B4DS77-2;B4DS77-3 |
| 14 | None | Q14894 | P31150 |
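To see how much the selected features overlap between the models, the columns can be compared as sets. This is a small sketch on the protein group identifiers listed above; the variable names are chosen for illustration, and the `None` entries of the shorter `ALD study all` column are omitted.

```python
# Selected protein groups per model, copied from the table above.
ald_all = {
    "P10636-2;P10636-6", "K7ER15;Q9H0R4;Q9H0R4-2", "P02741", "P61981",
    "P04075", "P14174", "Q9Y2T3;Q9Y2T3-3", "P08294", "P00338;P00338-3",
    "P14618", "Q6EMK4",
}
trknn_all = {
    "P10636-2;P10636-6", "P08670", "P01011", "Q9Y2T3;Q9Y2T3-3", "P10909-3",
    "P61981", "P15151-2", "P04075", "P25189;P25189-2", "P14174", "P63104",
    "P00492", "P00338;P00338-3", "Q6EMK4", "Q14894",
}
trknn_new = {
    "P31321", "P61088", "Q14894",
    "F8WBF9;Q5TH30;Q9UGV2;Q9UGV2-2;Q9UGV2-3", "Q9NUQ9", "Q9GZT8;Q9GZT8-2",
    "J3KSJ8;Q9UD71;Q9UD71-2", "A0A0C4DGV4;E9PLX3;O43504;R4GMU8", "Q96GD0",
    "A0A0J9YW36;Q9NZ72;Q9NZ72-2", "Q9H741", "P51688", "P01743",
    "A0A1W2PQ94;B4DS77;B4DS77-2;B4DS77-3", "P31150",
}

shared_ald_trknn = ald_all & trknn_all    # selected by both "all" models
shared_all_new = trknn_all & trknn_new    # selected in both TRKNN models
shared_ald_new = ald_all & trknn_new      # selected by ALD study all and TRKNN new

print(sorted(shared_ald_trknn))
print(sorted(shared_all_new))
print(sorted(shared_ald_new))
```

Seven protein groups are selected by both `ALD study all` and `TRKNN all`, `Q14894` appears in both TRKNN models, and `ALD study all` shares no feature with `TRKNN new`.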
Precision-Recall plot on test data#
pimmslearn.plotting - INFO Saved Figures to runs/alzheimer_study/diff_analysis/AD/PI_vs_TRKNN/prec_recall_curve.pdf
Data used to plot PRC:
| | ALD study all precision | ALD study all tpr | TRKNN all precision | TRKNN all tpr | TRKNN new precision | TRKNN new tpr |
|---|---|---|---|---|---|---|
| 0 | 0.429 | 1.000 | 0.429 | 1.000 | 0.429 | 1.000 |
| 1 | 0.439 | 1.000 | 0.439 | 1.000 | 0.439 | 1.000 |
| 2 | 0.450 | 1.000 | 0.450 | 1.000 | 0.450 | 1.000 |
| 3 | 0.462 | 1.000 | 0.462 | 1.000 | 0.462 | 1.000 |
| 4 | 0.474 | 1.000 | 0.474 | 1.000 | 0.474 | 1.000 |
| 5 | 0.486 | 1.000 | 0.486 | 1.000 | 0.486 | 1.000 |
| 6 | 0.500 | 1.000 | 0.500 | 1.000 | 0.500 | 1.000 |
| 7 | 0.514 | 1.000 | 0.514 | 1.000 | 0.514 | 1.000 |
| 8 | 0.529 | 1.000 | 0.529 | 1.000 | 0.500 | 0.944 |
| 9 | 0.545 | 1.000 | 0.545 | 1.000 | 0.515 | 0.944 |
| 10 | 0.562 | 1.000 | 0.562 | 1.000 | 0.531 | 0.944 |
| 11 | 0.548 | 0.944 | 0.581 | 1.000 | 0.548 | 0.944 |
| 12 | 0.567 | 0.944 | 0.600 | 1.000 | 0.533 | 0.889 |
| 13 | 0.552 | 0.889 | 0.586 | 0.944 | 0.517 | 0.833 |
| 14 | 0.571 | 0.889 | 0.607 | 0.944 | 0.500 | 0.778 |
| 15 | 0.593 | 0.889 | 0.593 | 0.889 | 0.519 | 0.778 |
| 16 | 0.615 | 0.889 | 0.577 | 0.833 | 0.538 | 0.778 |
| 17 | 0.640 | 0.889 | 0.600 | 0.833 | 0.520 | 0.722 |
| 18 | 0.667 | 0.889 | 0.583 | 0.778 | 0.542 | 0.722 |
| 19 | 0.696 | 0.889 | 0.609 | 0.778 | 0.522 | 0.667 |
| 20 | 0.727 | 0.889 | 0.636 | 0.778 | 0.545 | 0.667 |
| 21 | 0.762 | 0.889 | 0.619 | 0.722 | 0.571 | 0.667 |
| 22 | 0.800 | 0.889 | 0.600 | 0.667 | 0.600 | 0.667 |
| 23 | 0.789 | 0.833 | 0.579 | 0.611 | 0.579 | 0.611 |
| 24 | 0.833 | 0.833 | 0.611 | 0.611 | 0.611 | 0.611 |
| 25 | 0.882 | 0.833 | 0.588 | 0.556 | 0.588 | 0.556 |
| 26 | 0.938 | 0.833 | 0.625 | 0.556 | 0.625 | 0.556 |
| 27 | 0.933 | 0.778 | 0.667 | 0.556 | 0.600 | 0.500 |
| 28 | 0.929 | 0.722 | 0.714 | 0.556 | 0.643 | 0.500 |
| 29 | 0.923 | 0.667 | 0.769 | 0.556 | 0.692 | 0.500 |
| 30 | 0.917 | 0.611 | 0.750 | 0.500 | 0.667 | 0.444 |
| 31 | 1.000 | 0.611 | 0.727 | 0.444 | 0.636 | 0.389 |
| 32 | 1.000 | 0.556 | 0.700 | 0.389 | 0.600 | 0.333 |
| 33 | 1.000 | 0.500 | 0.778 | 0.389 | 0.667 | 0.333 |
| 34 | 1.000 | 0.444 | 0.875 | 0.389 | 0.625 | 0.278 |
| 35 | 1.000 | 0.389 | 0.857 | 0.333 | 0.571 | 0.222 |
| 36 | 1.000 | 0.333 | 0.833 | 0.278 | 0.667 | 0.222 |
| 37 | 1.000 | 0.278 | 0.800 | 0.222 | 0.800 | 0.222 |
| 38 | 1.000 | 0.222 | 1.000 | 0.222 | 0.750 | 0.167 |
| 39 | 1.000 | 0.167 | 1.000 | 0.167 | 0.667 | 0.111 |
| 40 | 1.000 | 0.111 | 1.000 | 0.111 | 0.500 | 0.056 |
| 41 | 1.000 | 0.056 | 1.000 | 0.056 | 0.000 | 0.000 |
| 42 | 1.000 | 0.000 | 1.000 | 0.000 | 1.000 | 0.000 |
Train data plots#
pimmslearn.plotting - INFO Saved Figures to runs/alzheimer_study/diff_analysis/AD/PI_vs_TRKNN/prec_recall_curve_train.pdf
pimmslearn.plotting - INFO Saved Figures to runs/alzheimer_study/diff_analysis/AD/PI_vs_TRKNN/auc_roc_curve_train.pdf
Output files:
{'results_TRKNN all.pkl': PosixPath('runs/alzheimer_study/diff_analysis/AD/PI_vs_TRKNN/results_TRKNN all.pkl'),
'results_TRKNN new.pkl': PosixPath('runs/alzheimer_study/diff_analysis/AD/PI_vs_TRKNN/results_TRKNN new.pkl'),
'results_ALD study all.pkl': PosixPath('runs/alzheimer_study/diff_analysis/AD/PI_vs_TRKNN/results_ALD study all.pkl'),
'auc_roc_curve.pdf': PosixPath('runs/alzheimer_study/diff_analysis/AD/PI_vs_TRKNN/auc_roc_curve.pdf'),
'mrmr_feat_by_model.xlsx': PosixPath('runs/alzheimer_study/diff_analysis/AD/PI_vs_TRKNN/mrmr_feat_by_model.xlsx'),
'prec_recall_curve.pdf': PosixPath('runs/alzheimer_study/diff_analysis/AD/PI_vs_TRKNN/prec_recall_curve.pdf'),
'prec_recall_curve_train.pdf': PosixPath('runs/alzheimer_study/diff_analysis/AD/PI_vs_TRKNN/prec_recall_curve_train.pdf'),
'auc_roc_curve_train.pdf': PosixPath('runs/alzheimer_study/diff_analysis/AD/PI_vs_TRKNN/auc_roc_curve_train.pdf')}