nf-core/readsimulator
A pipeline to simulate sequencing reads, such as Amplicon, Target Capture, Metagenome, and Whole genome data.
This page documents version 1.0.0. The latest stable release is 1.0.1.

Define where the pipeline should find input data and save output data.

Path to comma-separated file containing information about the samples in the experiment.
Type: string; pattern: ^\S+\.csv$

The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.
Type: string

Email address for completion summary.
Type: string; pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

MultiQC report title. Printed as page header, used for filename if not otherwise specified.
Type: string

Choose the data types that should be simulated by the pipeline.

Option to simulate amplicon sequencing reads.
Type: boolean

Option to simulate target capture sequencing reads.
Type: boolean

Option to simulate metagenomic sequencing reads.
Type: boolean

Option to simulate whole-genome sequencing reads.
Type: boolean

Options for simulating amplicon sequencing reads.

Forward primer to use with crabs_insilicopcr.
Type: string; default: GTCGGTAAAACTCGTGCCAGC

Reverse primer to use with crabs_insilicopcr.
Type: string; default: CATAGTGGGGTATCTAATCCCAGTTTG

Number of reads to be simulated per amplicon.
Type: integer; default: 500

Length of reads to be simulated.
Type: integer; default: 130

Sequencing system of reads to be simulated.
Type: string

Maximum number of errors allowed in CRABS insilicoPCR primer sequences.
Type: number; default: 4.5

Options for simulating target capture sequencing reads.

Path to bait/probe file. Can be a FASTA file or a BED file.
Type: string

Name of supported probe. Mandatory if not using --probes parameter.
Type: string

Simulate ‘illumina’ or ‘pacbio’ reads.
Type: string

Median of fragment size at shearing.
Type: integer; default: 500

Shape parameter of the fragment size distribution.
Type: number; default: 6

Median of fragment size distribution.
Type: integer; default: 1300

Shape parameter of the fragment size distribution.
Type: number; default: 6

Median of target fragment size (the fragment size of the data). If specified, will override ‘--fmedian’ and ‘--smedian’. Otherwise it will be estimated.
Type: integer

Shape parameter of the effective fragment size distribution.
Type: number

Number of fragments.
Type: integer; default: 500000

Illumina: read length.
Type: integer; default: 150

PacBio: Average (polymerase) read length.
Type: integer; default: 30000

Illumina: Sequencing mode.
Type: string

Options for simulating metagenomic sequencing reads.

Abundance distribution.
Type: string

Path to tab-separated file containing abundance distribution.
Type: string; pattern: ^\S+\.tsv$

Coverage distribution.
Type: string

Path to tab-separated file containing coverage information.
Type: string; pattern: ^\S+\.tsv$

Number of reads to generate.
Type: string; default: 1M

Can be ‘kde’ or ‘basic’.
Type: string

Can be ‘HiSeq’, ‘NovaSeq’, or ‘MiSeq’.
Type: string

Use this option to prevent simulating reads that have abnormal GC content.
Type: boolean

Options for simulating whole-genome sequencing reads.

The base error rate.
Type: number; default: 0.02

The outer distance between the two ends.
Type: integer; default: 500

The standard deviation.
Type: integer; default: 50

The number of read pairs.
Type: integer; default: 1000000

The length of the first reads.
Type: integer; default: 70

The length of the second reads.
Type: integer; default: 70

The rate of mutations.
Type: number; default: 0.001

The fraction of indels.
Type: number; default: 0.15

The probability that an indel is extended.
Type: number; default: 0.3

Reference genome related files and options required for the workflow.

Name of iGenomes reference.
Type: string

Path to reference FASTA file.
Type: string; pattern: ^\S+\.fn?a(sta)?(\.gz)?$

Do not load the iGenomes reference config.
Type: boolean

Path to text file containing accession ids (one accession per row).
Type: string

Path to text file containing taxids (one taxid per row).
Type: string

The NCBI taxonomic groups to download. Options include ‘all’, ‘archaea’, ‘bacteria’, ‘fungi’, ‘invertebrate’, ‘metagenomes’, ‘plant’, ‘protozoa’, ‘vertebrate_mammalian’, ‘vertebrate_other’, and ‘viral’. A comma-separated list is also valid (e.g., ‘bacteria,viral’).
Type: string

The NCBI section to download. ‘refseq’ or ‘genbank’.
Type: string

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.
Type: string; default: master

Base directory for Institutional configs.
Type: string; default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional config name.
Type: string

Institutional config description.
Type: string

Institutional config contact information.
Type: string

Institutional config URL link.
Type: string

Set the top limit for requested resources for any single job.

Maximum number of CPUs that can be requested for any single job.
Type: integer; default: 16

Maximum amount of memory that can be requested for any single job.
Type: string; default: 128.GB; pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$

Maximum amount of time that can be requested for any single job.
Type: string; default: 240.h; pattern: ^(\d+\.?\s*(s|m|h|d|day)\s*)+$

Less common options for the pipeline, typically set in a config file.

Display help text.
Type: boolean

Display version and exit.
Type: boolean

Method used to save pipeline results to output directory.
Type: string

Email address for completion summary, only when pipeline fails.
Type: string; pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.
Type: boolean

File size limit when attaching MultiQC reports to summary emails.
Type: string; default: 25.MB; pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$

Do not use coloured log outputs.
Type: boolean

Incoming hook URL for messaging service.
Type: string

Custom config file to supply to MultiQC.
Type: string

Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file.
Type: string

Custom MultiQC yaml file containing HTML including a methods description.
Type: string

Whether to validate parameters against the schema at runtime.
Type: boolean; default: true

Show all params when using --help.
Type: boolean

Validation of parameters fails when an unrecognised parameter is found.
Type: boolean

Validation of parameters in lenient mode.
Type: boolean
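
Several of the string parameters above are constrained by regular-expression patterns. The sketch below shows one way to check candidate values against those patterns before launching a run. The patterns are copied verbatim from this listing. The dictionary keys (`samplesheet`, `email`, and so on) and the helper `matches` are hypothetical labels chosen for illustration, not parameter names taken from the pipeline. It also assumes that Python's `re` module treats these simple patterns the same way as the pipeline's own schema validator, which has not been verified here.

```python
import re

# Patterns copied verbatim from the parameter listing.
# The keys are illustrative labels only (an assumption), not pipeline parameter names.
PATTERNS = {
    "samplesheet": r"^\S+\.csv$",
    "email": r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$",
    "tsv_file": r"^\S+\.tsv$",
    "fasta": r"^\S+\.fn?a(sta)?(\.gz)?$",
    "memory": r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$",
    "time": r"^(\d+\.?\s*(s|m|h|d|day)\s*)+$",
}


def matches(kind: str, value: str) -> bool:
    """Return True if `value` satisfies the pattern stored under `kind`."""
    return re.search(PATTERNS[kind], value) is not None


if __name__ == "__main__":
    # The documented defaults for the memory and time limits satisfy their patterns.
    print(matches("memory", "128.GB"))
    print(matches("time", "240.h"))
    # A path containing whitespace is rejected by the sample sheet pattern.
    print(matches("samplesheet", "sample sheet.csv"))
```

A check like this only confirms that a value has the expected shape. It does not confirm that a file exists or that an email address is deliverable.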