nf-core/drugresponseeval
Pipeline for testing drug response prediction models in a statistically and biologically sound way.
cell-lines, cross-validation, deep-learning, drug-response, drug-response-prediction, drugs, fair-principles, generalization, hyperparameter-tuning, machine-learning, randomization-tests, robustness-assessment, training
Version history
What’s Changed
Added
- #43 Preprint is out now! Linking it in the documentation.
- #42 Added authors and licenses to the python scripts.
- #43 Added `--no_hyperparameter_tuning` flag for quick runs without hyperparameter tuning: hpam_split takes this as argument
- #43 Added `--final_model_on_full_data` flag: if True, a final/production model is saved in the results directory. If hyperparameter_tuning is true, the final model is tuned, too. The model can later be loaded using the implemented load functions of the drevalpy models.
  - New process `FINAL_SPLIT`: splits the full dataset for each model class into train, validation, and optionally early stopping. This is done per model class and not overall because here, we no longer need across-model compatibility but want to train on the maximum amount of data (which might vary between models due to different feature availability)
  - New process `TUNE_FINAL_MODEL`: trains the final model(s) with all hyperparameter combinations
  - Added process `EVALUATE_AND_FIND_MAX_FINAL`: re-uses the `EVALUATE_AND_FIND_MAX` process to find the best hpam combination (evaluated on the validation dataset)
  - New process `TRAIN_FINAL_MODEL`: uses the best hpam combination to train the final model and save it
- #43 Added ProteomicsElasticNet, SingleDrugProteomicsRandomForest to list of known models
- #38 Reporting all package versions
- #38 Added `UNZIP` module for loading and unzipping the drug response datasets instead of handling this in `LOAD_RESPONSE`: `UNZIP_RESPONSE`, `UNZIP_CS_RESPONSE` (for cross-study datasets).
- #38 Added icon
- #30 Added the possibility of a leave-tissue-out (LTO) split
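The final-model steps listed above (split, tune over all hyperparameter combinations, pick the best on the validation set, train and keep the final model) can be sketched in plain Python. This is only a toy illustration, not the pipeline's code: the functions `final_split`, `fit`, `mse`, and `train_final_model`, the `shrinkage` hyperparameter, and the choice to fall back to the first combination when tuning is switched off are all assumptions made for the example.

```python
import random
from itertools import product
from statistics import mean


def final_split(rows, val_frac=0.2, seed=0):
    """FINAL_SPLIT analogue (hypothetical): shuffle, then split into train/validation."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_val = max(1, int(len(rows) * val_frac))
    return rows[n_val:], rows[:n_val]


def fit(train, shrinkage):
    """Toy model: predict the training mean of the response, shrunk toward zero."""
    return mean(y for _, y in train) * (1.0 - shrinkage)


def mse(prediction, rows):
    """Mean squared error of a constant prediction on the given rows."""
    return mean((y - prediction) ** 2 for _, y in rows)


def train_final_model(rows, grid, tune=True):
    train, val = final_split(rows)
    # Enumerate every hyperparameter combination (TUNE_FINAL_MODEL analogue).
    combos = [dict(zip(grid, values)) for values in product(*grid.values())]
    if tune:
        # Pick the combination with the lowest validation error
        # (EVALUATE_AND_FIND_MAX_FINAL analogue).
        scores = [mse(fit(train, **combo), val) for combo in combos]
        best = combos[scores.index(min(scores))]
    else:
        # Assumption: without tuning, simply use the first combination.
        best = combos[0]
    # TRAIN_FINAL_MODEL analogue: train once more with the chosen combination.
    return best, fit(train, **best)


rows = [(i, 5.0 + (i % 3)) for i in range(30)]  # (feature, response) toy data
grid = {"shrinkage": [0.5, 0.0, 0.9]}
best_tuned, model_tuned = train_final_model(rows, grid, tune=True)
best_quick, model_quick = train_final_model(rows, grid, tune=False)
```

In this sketch the final model is trained on the training portion only; whether the pipeline also folds the validation data back in is not stated in these notes.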
Changed
- #53 Changed to large runner for the GitHub Actions because of Docker → Singularity conversion.
- #42 Moved all publishDir directives to modules.config.
- #44 Pinned the drevalpy version in conda and docker to 1.3.5: now supporting Python 3.13
- #38 Support for AWS: changed the structure of load response and parameter check to conform more to Nextflow best practices.
- #44 Since drevalpy 1.3.5, the split_early_stopping function is no longer private.
- #39 Template update to version 3.3.1
- #38 Changed the defaults for `test_mode` from LPO to LCO and `dataset_name` from GDSC to CTRPv2 to better match the publication.
- #35, #38 Introducing `assets/NO_FILE` for empty file handling in the visualization process.
- #30 Changed pipeline overview svg to Figure 1 from paper
Removed
- #30 Simplified visualization: multiple short processes were creating overhead → more efficient in one process.
- #44 Removed the `--no_refitting` parameter in load_response. It was no longer needed because of the new, more nextflow-y preprocess workflow
- #44 Removed redundant code in the visualization python script. Possible because of a new wrapper function in drevalpy 1.3.5.
- #38 Removed `PARAMS_CHECK` process: now handled by the schema and the `utils_nfcore_drugresponseeval_pipeline` subworkflow.
- #38 Removed the `--curve_curator` flag which was true by default. It is now the `no_refitting` flag which is false by default.
Fixed
- #44 Casting a path to a string in `bin/consolidate_results.py` for drevalpy 1.3.5 compatibility.
- #43 Casting drug to str in `bin/collect_results.py` because there were issues if all drugs were PubChem IDs and were treated as numeric values.
- #43 Added the missing `dataset_name` in `bin/load_response.py` and made the tissue identifier optional. This was causing problems for custom datasets.
- #38 Passing rand_modes in quotes to `bin/consolidate_results.py` because otherwise, if more than one mode was passed, it was not recognized as a list.
- #30 Added the path to the data directory to `COLLECT_RESULTS` because from there, we get the drug and cell line names for visualization.
- #30 Fixed handling of when ‘None’ was passed as randomization mode to `CONSOLIDATE_RESULTS`.
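The quoting fix for rand_modes can be illustrated with a small, self-contained sketch. It does not reproduce the interface of `bin/consolidate_results.py`: the `--rand_modes` argument definition, the helper `parse_modes`, and the mode names `SVCC` and `SVRC` are placeholders assumed for the example. It shows why an unquoted, space-separated value is split by the shell before the script sees it, and how a literal `None` might be mapped to an empty list.

```python
import argparse
import shlex


def parse_modes(command_line):
    """Parse a hypothetical --rand_modes argument from a shell-style command line."""
    parser = argparse.ArgumentParser()
    # Assumption: the script receives all modes as ONE space-separated string.
    parser.add_argument("--rand_modes", type=str, default="None")
    # shlex.split mimics how a POSIX shell tokenises the command line.
    args, leftover = parser.parse_known_args(shlex.split(command_line))
    # Treat the literal string "None" as "no randomization modes".
    modes = [] if args.rand_modes == "None" else args.rand_modes.split()
    return modes, leftover


# Unquoted: the shell splits the value, so only the first mode is captured.
unquoted = parse_modes("--rand_modes SVCC SVRC")
# Quoted: the whole string arrives as one argument and can be split into a list.
quoted = parse_modes("--rand_modes 'SVCC SVRC'")
# 'None' is handled explicitly instead of being treated as a mode name.
none_case = parse_modes("--rand_modes None")
```

With the unquoted form, the second mode ends up among the unrecognized arguments, which matches the symptom described above of multiple modes not being recognized as a list.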
Dependencies
| Dependency | Old version | New version |
|---|---|---|
| drevalpy | 1.1.3 | 1.3.5 |
Parameters
| Params | Status |
|---|---|
| `--no_hyperparameter_tuning` | New |
| `--final_model_on_full_data` | New |
| `--no_refitting` | New (replaces `--curve_curator`) |
| `--curve_curator` | Removed |
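For parameter files written for the previous release, the rename in the table above could be handled with a small helper like the following. This is a sketch under assumptions: `migrate_params` is a hypothetical function, not part of the pipeline, and it reads the change as a plain negation (`--curve_curator` defaulted to true, `--no_refitting` defaults to false).

```python
def migrate_params(old_params):
    """Translate a 1.0.0-style parameter dict to the renamed flag (hypothetical helper)."""
    new_params = dict(old_params)  # copy so the caller's dict is left untouched
    if "curve_curator" in new_params:
        curve_curator = new_params.pop("curve_curator")
        # Assumption: no_refitting is the logical negation of curve_curator.
        # An explicitly set no_refitting value takes precedence.
        new_params.setdefault("no_refitting", not curve_curator)
    return new_params


migrated_default = migrate_params({"curve_curator": True})
migrated_custom = migrate_params({"curve_curator": False, "test_mode": "LCO"})
```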
Full Changelog: https://github.com/nf-core/drugresponseeval/compare/1.0.0...1.1.0
What’s Changed
- Important! Template update for nf-core/tools v3.0.1 by @nf-core-bot in https://github.com/nf-core/drugresponseeval/pull/10
- Merge branch ‘dev’ of github.com:nf-core/drugresponseeval into dev by @JudithBernett in https://github.com/nf-core/drugresponseeval/pull/11
- Global checkpoint dir by @PascalIversen in https://github.com/nf-core/drugresponseeval/pull/16
- Important! Template update for nf-core/tools v3.1.1 by @nf-core-bot in https://github.com/nf-core/drugresponseeval/pull/17
- Fix/datapath by @JudithBernett in https://github.com/nf-core/drugresponseeval/pull/18
- Important! Template update for nf-core/tools v3.2.0 by @nf-core-bot in https://github.com/nf-core/drugresponseeval/pull/20
- Feature/curvecurator module by @picciama in https://github.com/nf-core/drugresponseeval/pull/14
- Update env.yml by @JudithBernett in https://github.com/nf-core/drugresponseeval/pull/21
- First release! by @JudithBernett in https://github.com/nf-core/drugresponseeval/pull/22
New Contributors
- @nf-core-bot made their first contribution in https://github.com/nf-core/drugresponseeval/pull/10
- @JudithBernett made their first contribution in https://github.com/nf-core/drugresponseeval/pull/11
- @PascalIversen made their first contribution in https://github.com/nf-core/drugresponseeval/pull/16
- @picciama made their first contribution in https://github.com/nf-core/drugresponseeval/pull/14
Full Changelog: https://github.com/nf-core/drugresponseeval/commits/1.0.0