nf-core/phageannotator
Pipeline for identifying, annotating, and quantifying phage sequences in (meta)-genomic sequences.
Define where the pipeline should find input data and save output data.
Path to comma-separated file containing information about the samples in the experiment.
    Type: string; pattern: ^\S+\.csv$
The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.
    Type: string
Email address for completion summary.
    Type: string; pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
MultiQC report title. Printed as page header, used for filename if not otherwise specified.
    Type: string

Filter assemblies at the beginning of the workflow
Minimum assembly length
    Type: integer; default: 1000
Run ViromeQC to estimate viral enrichment
    Type: boolean

Identify reference viruses contained in reads
Run MASH screen to identify external viruses contained in reads
    Type: boolean
Path to FASTA file containing reference virus sequences
    Type: string
Path to mash sketch file for reference virus sequences
    Type: string
Save reference virus sketch, if it was created.
    Type: boolean
Minimum mash screen score to consider a genome contained
    Type: number; default: 0.95
Hashes present in multiple references are assigned only to top sequence
    Type: boolean

Classify viral sequences using geNomad
Skip running geNomad to classify viral/non-viral sequences
    Type: boolean
Path to directory containing geNomad’s database
    Type: string
Save geNomad’s database, if it was downloaded.
    Type: boolean
Minimum virus score for a sequence to be considered viral
    Type: number; default: 0.7
Maximum FDR for a sequence to be considered viral (will include --enable-score-calibration)
    Type: number; default: 0.1
Number of splits for running geNomad (more splits lower memory requirements)
    Type: integer; default: 5

Extend viral contigs
Run COBRA to extend viral contigs
    Type: boolean
The assembler that was used to assemble viral contigs
    Type: string
Minimum kmer value used during assembly
    Type: string
Maximum kmer value used during assembly
    Type: string

Assess virus quality and filter
Skip running CheckV to assess virus quality and filter sequences
    Type: boolean
Path to directory containing CheckV database
    Type: string
Save CheckV’s database, if it was downloaded
    Type: boolean
Minimum virus length to pass filtering
    Type: integer; default: 3000
Minimum CheckV completeness to pass filtering
    Type: integer; default: 50
Remove viruses labeled as provirus by geNomad or CheckV
    Type: boolean
Remove viruses with CheckV warnings
    Type: boolean

Cluster virus genomes based on nucleotide/protein similarity
Skip ANI-based virus clustering
    Type: boolean
Minimum percent identity for BLAST hits
    Type: integer; default: 90
Maximum number of BLAST hits to record for each sequence
    Type: integer; default: 25000
Minimum average nucleotide identity (ANI) for sequences to be clustered together
    Type: integer; default: 95
Minimum query coverage for sequences to be clustered together
    Type: integer
Minimum test coverage for sequences to be clustered together
    Type: integer; default: 85

Align reads to virus database
Skip read alignment to viral sequences
    Type: boolean
Minimum length of reads aligned to references
    Type: integer
Minimum percent identity of aligned reads
    Type: integer
Minimum percent of read aligned to references
    Type: integer
Abundance calculation metrics
    Type: string; default: mean
Assign taxonomy to virus sequences
    Type: boolean

Predict host genus for phage sequences
Run iPHoP to predict phage hosts
    Type: boolean
Path to local iPHoP database
    Type: string
Save downloaded iPHoP database
    Type: boolean
Minimum confidence score to provide host prediction
    Type: integer; default: 90

Predict the lifestyle of viral sequences
Run BACPHLIP to predict virus lifestyle
    Type: boolean

Functionally annotate viral genomes using a variety of approaches
Run pharokka to predict and annotate phage ORFs
    Type: boolean
Path to predownloaded pharokka db
    Type: string

Analyze virus diversity at the strain level
Bypass microdiversity analysis with inStrain
    Type: boolean
Minimum identity for read alignment to be considered
    Type: number
Minimum MAPQ for a read to be considered
    Type: integer
Minimum coverage for a variant to be considered
    Type: integer
Minimum allele frequency for an SNP to be considered
    Type: number
Maximum FDR for an SNP to be considered
    Type: integer
Minimum number of reads mapping to a genome to consider profiling
    Type: number
Minimum identity for genomes to be considered in the same strain
    Type: number
Minimum percent of genomes compared for comparison to be considered
    Type: number
Minimum breadth of coverage for a genome to be considered present
    Type: number

Arguments for running pipeline tests with custom arguments/databases.
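(no description)
    Type: boolean
(no description)
    Type: number
(no description)
    Type: boolean
Download test database rather than full database?
    Type: boolean
(no description)
    Type: boolean

Parameters used to describe centralised config profiles. These should not be edited.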
Git commit id for Institutional configs.
    Type: string; default: master
Base directory for Institutional configs.
    Type: string; default: https://raw.githubusercontent.com/nf-core/configs/master
Institutional config name.
    Type: string
Institutional config description.
    Type: string
Institutional config contact information.
    Type: string
Institutional config URL link.
    Type: string

Set the top limit for requested resources for any single job.
Maximum number of CPUs that can be requested for any single job.
    Type: integer; default: 16
Maximum amount of memory that can be requested for any single job.
    Type: string; default: 128.GB; pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
Maximum amount of time that can be requested for any single job.
    Type: string; default: 240.h; pattern: ^(\d+\.?\s*(s|m|h|d|day)\s*)+$

Less common options for the pipeline, typically set in a config file.
Display help text.
    Type: boolean
Display version and exit.
    Type: boolean
Method used to save pipeline results to output directory.
    Type: string
Email address for completion summary, only when pipeline fails.
    Type: string; pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
Send plain-text email instead of HTML.
    Type: boolean
File size limit when attaching MultiQC reports to summary emails.
    Type: string; default: 25.MB; pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
Do not use coloured log outputs.
    Type: boolean
Incoming hook URL for messaging service
    Type: string
Custom config file to supply to MultiQC.
    Type: string
Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file
    Type: string
Custom MultiQC yaml file containing HTML including a methods description.
    Type: string
Boolean whether to validate parameters against the schema at runtime
    Type: boolean; default: true
Use logo in initialise subworkflow
    Type: boolean; default: true
Show all params when using --help
    Type: boolean
Validation of parameters fails when an unrecognised parameter is found.
    Type: boolean
Validation of parameters in lenient mode.
    Type: boolean
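
Several of the string parameters in this reference carry a regular-expression pattern (sample sheet path, email addresses, memory and file-size limits, time limits). The following is a minimal Python sketch of how a value could be checked against those patterns before launching a run. The regular expressions are copied from the entries in this reference; the dictionary keys and the `matches` helper are hypothetical names chosen for illustration, not the pipeline's actual parameter names or its validation code.

```python
import re

# Patterns copied from the parameter reference. The keys are hypothetical
# labels for illustration only, not the pipeline's real parameter names.
PATTERNS = {
    "samplesheet_path": r"^\S+\.csv$",
    "email": r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$",
    "memory_or_file_size": r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$",
    "time": r"^(\d+\.?\s*(s|m|h|d|day)\s*)+$",
}


def matches(label, value):
    """Return True if `value` satisfies the pattern stored under `label`."""
    return re.search(PATTERNS[label], value) is not None


if __name__ == "__main__":
    # Documented defaults and a few deliberately malformed values.
    checks = [
        ("samplesheet_path", "samples.csv"),
        ("samplesheet_path", "my samples.csv"),  # whitespace is rejected
        ("email", "user@example.org"),
        ("memory_or_file_size", "128.GB"),
        ("memory_or_file_size", "25.MB"),
        ("memory_or_file_size", "128"),  # missing unit
        ("time", "240.h"),
    ]
    for label, value in checks:
        status = "ok" if matches(label, value) else "invalid"
        print(f"{label}: {value!r} -> {status}")
```

A check like this only confirms that a value has the expected shape; it does not confirm that a file exists or that the requested resources are available on the target system.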