nf-core/proteinfold
Protein 3D structure prediction pipeline
Define where the pipeline should find input data and save output data.
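Two values in this section have a constrained format: the samplesheet path must match ^\S+\.csv$, and the mode string is a comma-separated list of supported methods with no spaces. The following is a minimal sketch of a pre-flight check for both, written in Python for illustration only. The helper names (parse_modes, is_valid_samplesheet) are hypothetical and are not part of the pipeline.

```python
import re

# Modes listed in the description of the mode parameter.
ALLOWED_MODES = {
    "alphafold2", "alphafold3", "colabfold", "esmfold",
    "rosettafold_all_atom", "boltz", "helixfold3",
}

# Pattern documented for the samplesheet path.
SAMPLESHEET_PATTERN = re.compile(r"^\S+\.csv$")


def parse_modes(mode: str) -> list[str]:
    """Split a comma-separated mode string; reject unknown or space-padded entries."""
    modes = mode.split(",")
    for m in modes:
        # " esmfold" (with a leading space) is not in the set, so it is rejected too.
        if m not in ALLOWED_MODES:
            raise ValueError(f"unsupported mode: {m!r}")
    return modes


def is_valid_samplesheet(path: str) -> bool:
    """Return True if the path contains no whitespace and ends in .csv."""
    return SAMPLESHEET_PATTERN.match(path) is not None
```

Because the check compares each entry verbatim, a value such as "alphafold2, esmfold" fails on the second entry, which mirrors the "no spaces" requirement.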
Path to comma-separated file containing information about the samples in the experiment.
  Type: string. Pattern: ^\S+\.csv$
The output directory where the results will be saved. Use absolute paths when writing to storage on cloud infrastructure.
  Type: string
Specifies the mode in which the pipeline will be run. The value can be any combination of ‘alphafold2’, ‘alphafold3’, ‘colabfold’, ‘esmfold’, ‘rosettafold_all_atom’, ‘boltz’ and ‘helixfold3’, separated by commas (‘,’) with no spaces.
  Type: string. Default: alphafold2
Run on CPUs (default) or GPUs
  Type: boolean
Split an input multi-FASTA file into separate FASTA files, each containing one sequence to be folded
  Type: boolean
Email address for completion summary.
  Type: string. Pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
MultiQC report title. Printed as page header, used for filename if not otherwise specified.
  Type: string
The directory where reference data is stored. Individual methods can be overwritten with method-specific paths.
  Type: string
Global toggle for full database usage.
  Type: boolean
UniRef major release
  Type: string
Use the cloud MSA server
  Type: boolean
Specify your custom MMSeqs2 API server URL
  Type: string

AlphaFold2 options.
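The maximum template date in this section defaults to 2038-01-19 and must match ^\d{4}-\d{2}-\d{2}$. The pattern only checks the shape of the string, so a value such as 2023-13-01 would pass it. The sketch below, in Python and purely illustrative, combines the documented pattern with a real calendar check. The function names are hypothetical, and the "on or before" comparison is an assumption about how a maximum date is applied.

```python
import re
from datetime import date

# Pattern documented for the maximum template date.
DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def parse_template_date(value: str) -> date:
    """Check the documented pattern, then parse into a real calendar date."""
    if DATE_PATTERN.match(value) is None:
        raise ValueError(f"expected YYYY-MM-DD, got {value!r}")
    # fromisoformat raises ValueError for impossible dates such as 2023-13-01.
    return date.fromisoformat(value)


def template_allowed(release: date, max_template_date: date) -> bool:
    """Assumption: a template is usable if released on or before the maximum date."""
    return release <= max_template_date
```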
Maximum date of the PDB templates used by ‘AlphaFold2’ mode
  Type: string. Default: 2038-01-19. Pattern: ^\d{4}-\d{2}-\d{2}$
If true, uses the full version of the BFD database; otherwise, uses its reduced version (small BFD)
  Type: boolean
Specifies the mode in which AlphaFold2 will be run
  Type: string
Model preset for ‘AlphaFold2’ mode
  Type: string
Random seed to control stochastic AlphaFold inference.
  Type: string
AlphaFold2 parameters version
  Type: string

ColabFold options.
Model preset for ‘colabfold’ mode
  Type: string
Number of recycles for ColabFold
  Type: integer. Default: 3
Use Amber minimization to refine the predicted structures
  Type: boolean. Default: true
Specify the way that MMSeqs2 will load the required databases in memory
  Type: integer
Use PDB templates
  Type: boolean. Default: true
Create databases indexes when running colabfold_local mode
  Type: boolean

ESMFold options.
Specifies the number of recycles used by ESMFold
  Type: integer. Default: 4
Specifies whether the prediction is a ‘monomer’ or ‘multimer’ prediction
  Type: string

Boltz options.
Run Boltz-2 using inference-time potentials
  Type: boolean
Sets the model to use for prediction. Default is boltz2
  Type: string

Foldseek options.
Specifies the mode of Foldseek search.
  Type: string
The ID of Foldseek databases
  Type: string
Specifies the path to Foldseek databases used by ‘foldseek’.
  Type: string
Specifies the arguments to be passed to the foldseek easy-search command
  Type: string

HelixFold3 options
The numerical precision used by the HelixFold3 model.
  Type: string
Number of independent predictions made with the HelixFold3 model
  Type: integer. Default: 4
No PDB template released after this date will be used to guide predictions.
  Type: string. Default: 2038-01-19

Options to skip various steps within the workflow.
Skip MultiQC.
  Type: boolean
Skip visualisation reports.
  Type: boolean

Parameters used to describe centralised config profiles. These should not be edited.
Git commit id for Institutional configs.
  Type: string. Default: master
Base directory for Institutional configs.
  Type: string. Default: https://raw.githubusercontent.com/nf-core/configs/master
Institutional config name.
  Type: string
Institutional config description.
  Type: string
Institutional config contact information.
  Type: string
Institutional config URL link.
  Type: string

Parameters used to provide the links to the public DB and parameter resources for AlphaFold2.
Link to the BFD database
  Type: string. Default: https://storage.googleapis.com/alphafold-databases/casp14_versions/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt.tar.gz
Link to a reduced version of the BFD database
  Type: string. Default: https://storage.googleapis.com/alphafold-databases/reduced_dbs/bfd-first_non_consensus_sequences.fasta.gz
Link to the AlphaFold2 parameters
  Type: string. Default: https://storage.googleapis.com/alphafold/alphafold_params_2022-12-06.tar
Link to the MGnify database
  Type: string. Default: https://ftp.ebi.ac.uk/pub/databases/metagenomics/peptide_database/2024_04/mgy_clusters.fa.gz
Link to the PDB70 database
  Type: string. Default: https://wwwuser.gwdguser.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/pdb70_from_mmcif_220313.tar.gz
Link to the PDB mmCIF database
  Type: string. Default: rsync.rcsb.org::ftp_data/structures/divided/mmCIF/
Link to the PDB obsolete database
  Type: string. Default: https://files.wwpdb.org/pub/pdb/data/status/obsolete.dat
Link to the Uniclust30 database
  Type: string. Default: https://wwwuser.gwdguser.de/~compbiol/uniclust/2023_02/UniRef30_2023_02_hhsuite.tar.gz
Link to the UniRef90 database
  Type: string. Default: https://ftp.ebi.ac.uk/pub/databases/uniprot/uniref/uniref90/uniref90.fasta.gz
Link to the PDB SEQRES database
  Type: string. Default: https://files.wwpdb.org/pub/pdb/derived_data/pdb_seqres.txt
Link to the SwissProt UniProt database
  Type: string. Default: https://ftp.ebi.ac.uk/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_sprot.fasta.gz
Link to the TrEMBL UniProt database
  Type: string. Default: https://ftp.ebi.ac.uk/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_trembl.fasta.gz

Parameters used to provide the paths to the DBs and parameters for AlphaFold2.
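Every path default in this section starts with the literal text null (for example null/bfd/* and null/pdb_mmcif/obsolete.dat). A plausible reading, which is an assumption and not stated in the source, is that each default is the AlphaFold2 base DB directory joined to a fixed sub-path, and that null is how an unset base directory is rendered. The Python sketch below illustrates that reading. The dictionary keys and the function name are hypothetical labels; only the sub-paths are taken from the documented defaults.

```python
from typing import Optional

# Sub-paths copied from the documented defaults, without the leading "null/".
# The keys are hypothetical labels, not pipeline parameter names.
ALPHAFOLD2_SUBPATHS = {
    "bfd": "bfd/*",
    "small_bfd": "small_bfd/*",
    "params": "params/alphafold_params_2022-12-06/*",
    "mgnify": "mgnify/*",
    "pdb70": "pdb70/**",
    "pdb_mmcif": "pdb_mmcif/mmcif_files",
    "pdb_obsolete": "pdb_mmcif/obsolete.dat",
    "uniref30": "uniref30/*",
    "uniref90": "uniref90/*",
    "pdb_seqres": "pdb_seqres/*",
    "uniprot": "uniprot/*",
}


def resolve_paths(base_dir: Optional[str]) -> dict[str, str]:
    """Join a base DB directory with each sub-path.

    An unset base directory is rendered as "null" to mimic the documented defaults.
    """
    prefix = "null" if base_dir is None else base_dir.rstrip("/")
    return {name: f"{prefix}/{sub}" for name, sub in ALPHAFOLD2_SUBPATHS.items()}
```

Under this reading, setting only the base directory is enough to give every method-specific path a usable value, while any individual path can still be overridden.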
Specifies the DB and PARAMS path used by ‘AlphaFold2’ mode
  Type: string
Path to the BFD database
  Type: string. Default: null/bfd/*
Path to a reduced version of the BFD database
  Type: string. Default: null/small_bfd/*
Path to the AlphaFold2 parameters
  Type: string. Default: null/params/alphafold_params_2022-12-06/*
Path to the MGnify database
  Type: string. Default: null/mgnify/*
Path to the PDB70 database
  Type: string. Default: null/pdb70/**
Path to the PDB mmCIF database
  Type: string. Default: null/pdb_mmcif/mmcif_files
Path to the PDB obsolete file
  Type: string. Default: null/pdb_mmcif/obsolete.dat
Path to the UniRef30 database
  Type: string. Default: null/uniref30/*
Path to the UniRef90 database
  Type: string. Default: null/uniref90/*
Path to the PDB SEQRES database
  Type: string. Default: null/pdb_seqres/*
Path to UniProt database containing the SwissProt and the TrEMBL databases
  Type: string. Default: null/uniprot/*

Parameters used to provide the links to the public DB and parameter resources for AlphaFold3.
Specifies the DB and PARAMS path used by ‘AlphaFold3’ mode
  Type: string
Link to a reduced version of the BFD database
  Type: string. Default: https://storage.googleapis.com/alphafold-databases/v3.0/bfd-first_non_consensus_sequences.fasta.zst
Link to the MGnify database
  Type: string. Default: https://storage.googleapis.com/alphafold-databases/v3.0/mgy_clusters_2022_05.fa.zst
Link to the PDB mmCIF database
  Type: string. Default: https://storage.googleapis.com/alphafold-databases/v3.0/pdb_2022_09_28_mmcif_files.tar.zst
Link to the UniRef90 database
  Type: string. Default: https://storage.googleapis.com/alphafold-databases/v3.0/uniref90_2022_05.fa.zst
Link to the PDB SEQRES database
  Type: string. Default: https://storage.googleapis.com/alphafold-databases/v3.0/pdb_seqres_2022_09_28.fasta.zst
Link to the UniProt database
  Type: string. Default: https://storage.googleapis.com/alphafold-databases/v3.0/uniprot_all_2021_04.fa.zst
Link to the RNAcentral database
  Type: string. Default: https://storage.googleapis.com/alphafold-databases/v3.0/rnacentral_active_seq_id_90_cov_80_linclust.fasta.zst
Link to the nt_rna database
  Type: string. Default: https://storage.googleapis.com/alphafold-databases/v3.0/nt_rna_2023_02_23_clust_seq_id_90_cov_80_rep_seq.fasta.zst
Link to the Rfam database
  Type: string. Default: https://storage.googleapis.com/alphafold-databases/v3.0/rfam_14_9_clust_seq_id_90_cov_80_rep_seq.fasta.zst

Parameters used to provide the paths to the DBs and parameters for AlphaFold3.
Path to the reduced version of the BFD database
  Type: string. Default: null/small_bfd/*
Path to the AlphaFold3 parameters
  Type: string. Default: null/params/*
Path to the MGnify database
  Type: string. Default: null/mgnify/*
Path to the PDB mmCIF database
  Type: string. Default: null/pdb_mmcif/mmcif_files
Path to the UniRef90 database
  Type: string. Default: null/uniref90/*
Path to the PDB SEQRES database
  Type: string. Default: null/pdb_seqres/*
Path to UniProt database containing the SwissProt and the TrEMBL databases
  Type: string. Default: null/uniprot/*
Path to the RNAcentral database
  Type: string. Default: null/rnacentral/*
Path to the nt_rna database
  Type: string. Default: null/nt_rna/*
Path to the Rfam database
  Type: string. Default: null/rfam/*

Parameters used to provide the links to the public DB and parameter resources for ColabFold.
Link to the ColabFold database
  Type: string. Default: https://opendata.mmseqs.org/colabfold/colabfold_envdb_202108.db.tar.gz
Link to the UniRef30 database
  Type: string. Default: https://opendata.mmseqs.org/colabfold/uniref30_2302.db.tar.gz
Link to the AlphaFold2 parameters for ColabFold
  Type: string

Parameters used to provide the paths to the DBs and parameters for ColabFold.
Specifies the PARAMS and DB path used by ‘colabfold’ mode
  Type: string
Path to the ColabFold database
  Type: string. Default: null/colabfold_envdb/*
Path to the UniRef30 database
  Type: string. Default: null/colabfold_uniref30/*
Path to the AlphaFold2 parameters for ColabFold
  Type: string
Dictionary with AlphaFold2 parameters tags
  Type: object

Parameters used to provide the links to the public parameter resources for ESMFold.
Link to the ESMFold 3B-v1 model
  Type: string. Default: https://dl.fbaipublicfiles.com/fair-esm/models/esmfold_3B_v1.pt
Link to the ESMFold t36-3B-UR50D model
  Type: string. Default: https://dl.fbaipublicfiles.com/fair-esm/models/esm2_t36_3B_UR50D.pt
Link to the ESMFold t36-3B-UR50D-contact-regression model
  Type: string. Default: https://dl.fbaipublicfiles.com/fair-esm/regression/esm2_t36_3B_UR50D-contact-regression.pt

Parameters used to provide the paths to the parameters for ESMFold.
Specifies the PARAMS path used by ‘esmfold’ mode
  Type: string
Path to the ESMFold parameters
  Type: string. Default: null/params/*

Links used to download the model and weight files for Boltz
Link to download CCD file
  Type: string. Default: https://huggingface.co/boltz-community/boltz-1/resolve/main/ccd.pkl
Link to download model file
  Type: string. Default: https://huggingface.co/boltz-community/boltz-1/resolve/main/boltz1_conf.ckpt
Link to download Boltz affinity file
  Type: string. Default: https://huggingface.co/boltz-community/boltz-2/resolve/main/boltz2_aff.ckpt
Link to download Boltz-2 conf file
  Type: string. Default: https://huggingface.co/boltz-community/boltz-2/resolve/main/boltz2_conf.ckpt
Link to download Boltz-2 mols
  Type: string. Default: https://huggingface.co/boltz-community/boltz-2/resolve/main/mols.tar

Paths to the model and weight files for Boltz
Path to Boltz databases
  Type: string
Path to CCD file
  Type: string. Default: null/params/ccd.pkl
Path to Boltz model file
  Type: string. Default: null/params/boltz1_conf.ckpt
Path to Boltz affinity file
  Type: string. Default: null/params/boltz2_aff.ckpt
Path to Boltz-2 conf file
  Type: string. Default: null/params/boltz2_conf.ckpt
Path to Boltz-2 mols
  Type: string. Default: null/params/mols/

Less common options for the pipeline, typically set in a config file.
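Two options in this section carry validation patterns: the failure-notification email address and the MultiQC attachment size limit (default 25.MB). As a rough illustration in Python, the sketch below applies both documented patterns. The function names are hypothetical; the regular expressions are copied verbatim from the parameter descriptions.

```python
import re

# Patterns copied from the parameter descriptions.
EMAIL_PATTERN = re.compile(r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$")
SIZE_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")


def is_valid_email(value: str) -> bool:
    """True if the address matches the documented email pattern."""
    return EMAIL_PATTERN.match(value) is not None


def is_valid_size(value: str) -> bool:
    """True if the value matches the documented size pattern, e.g. 25.MB or 1.5GB."""
    return SIZE_PATTERN.match(value) is not None
```

Note that the email pattern limits the final domain label to between two and five letters, so addresses under longer top-level domains would be rejected by it.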
Display version and exit.
  Type: boolean
Method used to save pipeline results to output directory.
  Type: string
Email address for completion summary, only when pipeline fails.
  Type: string. Pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
Send plain-text email instead of HTML.
  Type: boolean
File size limit when attaching MultiQC reports to summary emails.
  Type: string. Default: 25.MB. Pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
Do not use coloured log outputs.
  Type: boolean
Incoming hook URL for messaging service
  Type: string
Custom config file to supply to MultiQC.
  Type: string
Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file
  Type: string
Custom MultiQC YAML file containing HTML including a methods description.
  Type: string
Whether to validate parameters against the schema at runtime
  Type: boolean. Default: true
Base URL or local path to location of pipeline test dataset files
  Type: string. Default: https://raw.githubusercontent.com/nf-core/test-datasets/
Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.
  Type: string
Display the help message.
  Type: boolean or string
Display the full detailed help message.
  Type: boolean
Display hidden parameters in the help message (only works when --help or --help_full are provided).
  Type: boolean

Parameters used to provide the links to the DBs and parameters for RoseTTAFold All Atom.
Link to the UniRef30 database for RoseTTAFold All Atom
  Type: string. Default: https://wwwuser.gwdguser.de/~compbiol/uniclust/2023_02/UniRef30_2023_02_hhsuite.tar.gz
Link to the BFD database for RoseTTAFold All Atom
  Type: string. Default: https://bfd.mmseqs.com/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt.tar.gz
Link to the PDB100 database for RoseTTAFold All Atom
  Type: string. Default: https://files.ipd.uw.edu/pub/RoseTTAFold/pdb100_2021Mar03.tar.gz
Link to the RoseTTAFold All Atom paper weights
  Type: string. Default: http://files.ipd.uw.edu/pub/RF-All-Atom/weights/RFAA_paper_weights.pt

Parameters used to provide the paths to the DBs and parameters for RoseTTAFold All Atom.
Path to RoseTTAFold All Atom database
  Type: string
Path to UniRef30 database for RoseTTAFold All Atom
  Type: string. Default: null/uniref30/*
Path to BFD database for RoseTTAFold All Atom
  Type: string. Default: null/bfd/*
Path to PDB100 database for RoseTTAFold All Atom
  Type: string. Default: null/pdb100/*
Path to RoseTTAFold All Atom paper weights
  Type: string. Default: null/params/RFAA_paper_weights.pt

Parameters used to provide the paths to the DBs and parameters for HelixFold3.
Path to HelixFold3 database
  Type: string
Path to HelixFold3 init models
  Type: string. Default: null/params/HelixFold3-240814.pdparams
Path to UniRef30 database for HelixFold3
  Type: string. Default: null/uniref30/*
Path to CCD preprocessed file for HelixFold3
  Type: string. Default: null/params/ccd_preprocessed_etkdg.pkl.gz
Path to Rfam database for HelixFold3
  Type: string. Default: null/rfam/Rfam-14.9_rep_seq.fasta
Path to BFD database for HelixFold3
  Type: string. Default: null/bfd/*
Path to reduced BFD database for HelixFold3
  Type: string. Default: null/small_bfd/*
Path to UniProt database for HelixFold3
  Type: string. Default: null/uniprot/*
Path to PDB SEQRES database for HelixFold3
  Type: string. Default: null/pdb_seqres/*
Path to UniRef90 database for HelixFold3
  Type: string. Default: null/uniref90/*
Path to MGnify database for HelixFold3
  Type: string. Default: null/mgnify/*
Path to PDB mmCIF database for HelixFold3
  Type: string. Default: null/pdb_mmcif/mmcif_files
Path to Maxit Suite for HelixFold3
  Type: string. Default: null/maxit-v11.200-prod-src
Path to obsolete PDB file for HelixFold3
  Type: string. Default: null/pdb_mmcif/obsolete.dat

Parameters used to provide the links to the public DB and parameter resources for HelixFold3.
Link to HelixFold3 init models
  Type: string. Default: https://paddlehelix.bd.bcebos.com/HelixFold3/params/HelixFold3-params-240814.zip
Link to UniRef30 database for HelixFold3
  Type: string. Default: https://wwwuser.gwdguser.de/~compbiol/uniclust/2023_02/UniRef30_2023_02_hhsuite.tar.gz
Link to CCD preprocessed file for HelixFold3
  Type: string. Default: https://paddlehelix.bd.bcebos.com/HelixFold3/CCD/ccd_preprocessed_etkdg.pkl.gz
Link to Rfam database for HelixFold3
  Type: string. Default: https://paddlehelix.bd.bcebos.com/HelixFold3/MSA/Rfam-14.9_rep_seq.fasta
Link to BFD database for HelixFold3
  Type: string. Default: https://storage.googleapis.com/alphafold-databases/casp14_versions/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt.tar.gz
Link to reduced BFD database for HelixFold3
  Type: string. Default: https://storage.googleapis.com/alphafold-databases/reduced_dbs/bfd-first_non_consensus_sequences.fasta.gz
Link to PDB SEQRES database for HelixFold3
  Type: string. Default: https://files.wwpdb.org/pub/pdb/derived_data/pdb_seqres.txt
Link to UniRef90 database for HelixFold3
  Type: string. Default: ftp://ftp.uniprot.org/pub/databases/uniprot/uniref/uniref90/uniref90.fasta.gz
Link to MGnify database for HelixFold3
  Type: string. Default: https://ftp.ebi.ac.uk/pub/databases/metagenomics/peptide_database/2024_04/mgy_clusters.fa.gz
Link to PDB mmCIF database for HelixFold3
  Type: string. Default: rsync.rcsb.org::ftp_data/structures/divided/mmCIF/
Link to UniProt SwissProt database for HelixFold3
  Type: string. Default: ftp://ftp.ebi.ac.uk/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_sprot.fasta.gz
Link to UniProt TrEMBL database for HelixFold3
  Type: string. Default: ftp://ftp.ebi.ac.uk/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_trembl.fasta.gz
Link to obsolete PDB file for HelixFold3
  Type: string. Default: https://files.rcsb.org/pub/pdb/data/status/obsolete.dat
Link to Maxit Suite for HelixFold3
  Type: string. Default: https://proteinfold-dataset.s3.amazonaws.com/test-data/db/helixfold3/maxit-v11.200-prod-src.tar.gz
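The parameters documented on this page are usually collected in a parameters file and not typed on the command line. The Python sketch below assembles a small JSON document for that purpose. The parameter names used as keys (input, outdir, mode, use_gpu) are assumptions, since this page lists descriptions without names; check them against the pipeline schema before use. The function name and example paths are hypothetical.

```python
import json


def build_params(samplesheet: str, outdir: str, modes: list[str], use_gpu: bool = False) -> dict:
    """Assemble a parameter dictionary; key names are assumed, not taken from this page."""
    return {
        "input": samplesheet,     # comma-separated samplesheet (must end in .csv)
        "outdir": outdir,         # absolute path when writing to cloud storage
        "mode": ",".join(modes),  # comma-separated, no spaces
        "use_gpu": use_gpu,       # CPUs are the documented default
    }


params = build_params("samplesheet.csv", "/abs/path/results", ["alphafold2", "esmfold"])
params_json = json.dumps(params, indent=2)
```

The resulting text can be saved to a file such as params.json and supplied through Nextflow's -params-file option.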