3.7. reports.py - produce various metrics and reports
Functions to create reports from genomics pipeline data.
usage: reports.py subcommand
- Sub-commands:
- assembly_stats
Fetch assembly-level statistics for a given sample
usage: reports.py assembly_stats [-h] [--cov_thresholds COV_THRESHOLDS [COV_THRESHOLDS ...]] [--assembly_dir ASSEMBLY_DIR] [--assembly_tmp ASSEMBLY_TMP] [--align_dir ALIGN_DIR] [--reads_dir READS_DIR] [--raw_reads_dir RAW_READS_DIR] samples [samples ...] outFile
- Positional arguments:
samples    Sample names.
outFile    Output report file.
- Options:
--cov_thresholds=(1, 5, 20, 100)    Genome coverage thresholds to report on.
--assembly_dir=data/02_assembly    Directory with assembly outputs.
--assembly_tmp=tmp/02_assembly    Directory with assembly temp files.
--align_dir=data/02_align_to_self    Directory with reads aligned to own assembly.
--reads_dir=data/01_per_sample    Directory with unaligned filtered read BAMs.
--raw_reads_dir=data/00_raw    Directory with unaligned raw read BAMs.
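The usage line above mixes a variable-length positional (samples), a trailing positional (outFile), and a multi-value option (--cov_thresholds). The following is a minimal argparse sketch reconstructed only from that usage line and the listed defaults; it is not the actual reports.py source, and the function name build_parser is hypothetical.

```python
import argparse


def build_parser():
    # Hypothetical reconstruction from the documented usage line,
    # not the real reports.py parser.
    parser = argparse.ArgumentParser(prog="reports.py assembly_stats")
    parser.add_argument("samples", nargs="+", help="Sample names.")
    parser.add_argument("outFile", help="Output report file.")
    parser.add_argument("--cov_thresholds", nargs="+", type=int,
                        default=(1, 5, 20, 100),
                        help="Genome coverage thresholds to report on.")
    parser.add_argument("--assembly_dir", default="data/02_assembly")
    parser.add_argument("--assembly_tmp", default="tmp/02_assembly")
    parser.add_argument("--align_dir", default="data/02_align_to_self")
    parser.add_argument("--reads_dir", default="data/01_per_sample")
    parser.add_argument("--raw_reads_dir", default="data/00_raw")
    return parser


if __name__ == "__main__":
    # Positionals come first here: a multi-value option placed before them
    # would try to consume the sample names as thresholds.
    args = build_parser().parse_args(
        ["sampleA", "sampleB", "stats.tsv", "--cov_thresholds", "1", "10"])
    print(args.samples, args.outFile, args.cov_thresholds)
```

With this layout, the last positional token is always taken as outFile and everything before it as sample names.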
- alignment_summary
Write or print pairwise alignment summary information for sequences in two FASTA files, including SNPs, ambiguous bases, and indels.
usage: reports.py alignment_summary [-h] [--outfileName OUTFILENAME] [--printCounts] inFastaFileOne inFastaFileTwo
- Positional arguments:
inFastaFileOne    First FASTA file for an alignment.
inFastaFileTwo    Second FASTA file for an alignment.
- Options:
--outfileName    Output file for counts in TSV format.
--printCounts=False    Undocumented
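To illustrate the three categories named in the description (SNPs, ambiguous bases, and indels), here is a minimal sketch that tallies them column by column for two already-aligned sequences. The function name summarize_pair, the category definitions, and the choice to count gap columns rather than indel events are all assumptions made for illustration; the counting rules used by alignment_summary itself may differ.

```python
UNAMBIGUOUS = set("ACGT")


def summarize_pair(seq_one, seq_two):
    """Count differing columns between two aligned sequences ('-' marks a gap)."""
    if len(seq_one) != len(seq_two):
        raise ValueError("aligned sequences must have equal length")
    counts = {"snps": 0, "indels": 0, "ambiguous": 0}
    for a, b in zip(seq_one.upper(), seq_two.upper()):
        if a == "-" and b == "-":
            continue  # gap in both sequences: nothing to compare
        if a == "-" or b == "-":
            counts["indels"] += 1  # gap in exactly one sequence
        elif a not in UNAMBIGUOUS or b not in UNAMBIGUOUS:
            counts["ambiguous"] += 1  # e.g. N or another IUPAC ambiguity code
        elif a != b:
            counts["snps"] += 1  # two unambiguous bases that differ
    return counts


if __name__ == "__main__":
    print(summarize_pair("ACGT-AN", "ACCTTAA"))
```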
- consolidate_fastqc
Consolidate multiple FASTQC reports into one.
usage: reports.py consolidate_fastqc [-h] inDirs [inDirs ...] outFile
- Positional arguments:
inDirs    Input FASTQC directories.
outFile    Output report file.
- consolidate_spike_count
Consolidate multiple spike count reports into one.
usage: reports.py consolidate_spike_count [-h] inDir outFile
- Positional arguments:
inDir    Input spike count directory.
outFile    Output report file.
- plot_coverage
Generate a coverage plot from an aligned BAM file.
usage: reports.py plot_coverage [-h] [--plotFormat] [--plotDataStyle] [--plotStyle] [--plotWidth PLOT_WIDTH] [--plotHeight PLOT_HEIGHT] [--plotDPI PLOT_DPI] [--plotTitle PLOT_TITLE] [-q BASE_Q_THRESHOLD] [-Q MAPPING_Q_THRESHOLD] [-m MAX_COVERAGE_DEPTH] [-l READ_LENGTH_THRESHOLD] [--outSummary OUT_SUMMARY] [--plotOnlyNonDuplicates] [--loglevel {DEBUG,INFO,WARNING,ERROR,CRITICAL,EXCEPTION}] [--version] [--tmp_dir TMP_DIR] [--tmp_dirKeep] in_bam out_plot_file
- Positional arguments:
in_bam    Input reads, BAM format.
out_plot_file    The generated chart file.
- Options:
--plotFormat    File format of the coverage plot. By default it is inferred from the file extension of out_plot_file, but it can be set explicitly via --plotFormat.
Possible choices: svgz, pdf, tif, raw, tiff, svg, jpg, png, ps, pgf, rgba, eps, jpeg
--plotDataStyle=filled    The plot data display style.
Possible choices: filled, line, dots
--plotStyle=ggplot    The plot visual style.
Possible choices: seaborn-notebook, seaborn-paper, seaborn-colorblind, seaborn-dark, seaborn-deep, ggplot, seaborn-talk, seaborn-whitegrid, seaborn-white, seaborn-muted, dark_background, seaborn-darkgrid, seaborn-dark-palette, seaborn-ticks, seaborn-pastel, seaborn-bright, bmh, classic, grayscale, seaborn-poster, fivethirtyeight
--plotWidth=1024    Width of the plot in pixels.
--plotHeight=768    Height of the plot in pixels.
--plotDPI=80.0    Dots per inch for rendered output, more useful for vector modes.
--plotTitle=Coverage Plot    The title displayed on the coverage plot.
-q    The minimum base quality threshold.
-Q    The minimum mapping quality threshold.
-m=1000000    The maximum coverage depth.
-l    Read length threshold.
--outSummary    Coverage summary TSV file. Default is to write to temp.
--plotOnlyNonDuplicates=False    Plot only non-duplicates (samtools -F 1024).
--loglevel=DEBUG    Verbosity of output.
Possible choices: DEBUG, INFO, WARNING, ERROR, CRITICAL, EXCEPTION
--version, -V    Show program's version number and exit.
--tmp_dir=/tmp    Base directory for temp files.
--tmp_dirKeep=False    Keep the tmp_dir if an exception occurs while running. Default is to delete all temp files at the end, even if there's a failure.
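The plot dimensions are given in pixels while the resolution is given in dots per inch. Plotting libraries that size figures in inches need the pixel values divided by the DPI. The sketch below shows that arithmetic for the documented defaults; the helper name pixels_to_inches is hypothetical, and whether reports.py performs exactly this conversion internally is an assumption.

```python
def pixels_to_inches(width_px, height_px, dpi):
    """Convert a pixel size to a (width, height) figure size in inches."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return (width_px / dpi, height_px / dpi)


if __name__ == "__main__":
    # Documented defaults: --plotWidth=1024, --plotHeight=768, --plotDPI=80.0
    print(pixels_to_inches(1024, 768, 80.0))  # (12.8, 9.6)
```

Under this assumption, raising the DPI while keeping the pixel size fixed shrinks the figure in inches, which matters mainly for vector formats where text and line widths are sized in physical units.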
- align_and_plot_coverage
Take reads, align them to a reference with BWA-MEM, and generate a coverage plot.
usage: reports.py align_and_plot_coverage [-h] [--plotFormat] [--plotDataStyle] [--plotStyle] [--plotWidth PLOT_WIDTH] [--plotHeight PLOT_HEIGHT] [--plotDPI PLOT_DPI] [--plotTitle PLOT_TITLE] [-q BASE_Q_THRESHOLD] [-Q MAPPING_Q_THRESHOLD] [-m MAX_COVERAGE_DEPTH] [-l READ_LENGTH_THRESHOLD] [--outSummary OUT_SUMMARY] [--outBam OUT_BAM] [--sensitive] [--excludeDuplicates] [--JVMmemory JVMMEMORY] [--picardOptions [PICARDOPTIONS [PICARDOPTIONS ...]]] [-T MIN_SCORE_TO_OUTPUT] [--loglevel {DEBUG,INFO,WARNING,ERROR,CRITICAL,EXCEPTION}] [--version] [--tmp_dir TMP_DIR] [--tmp_dirKeep] in_bam out_plot_file ref_fasta
- Positional arguments:
in_bam    Input reads, BAM format.
out_plot_file    The generated chart file.
ref_fasta    Reference genome, FASTA format.
- Options:
--plotFormat    File format of the coverage plot. By default it is inferred from the file extension of out_plot_file, but it can be set explicitly via --plotFormat.
Possible choices: svgz, pdf, tif, raw, tiff, svg, jpg, png, ps, pgf, rgba, eps, jpeg
--plotDataStyle=filled    The plot data display style.
Possible choices: filled, line, dots
--plotStyle=ggplot    The plot visual style.
Possible choices: seaborn-notebook, seaborn-paper, seaborn-colorblind, seaborn-dark, seaborn-deep, ggplot, seaborn-talk, seaborn-whitegrid, seaborn-white, seaborn-muted, dark_background, seaborn-darkgrid, seaborn-dark-palette, seaborn-ticks, seaborn-pastel, seaborn-bright, bmh, classic, grayscale, seaborn-poster, fivethirtyeight
--plotWidth=1024    Width of the plot in pixels.
--plotHeight=768    Height of the plot in pixels.
--plotDPI=80.0    Dots per inch for rendered output, more useful for vector modes.
--plotTitle=Coverage Plot    The title displayed on the coverage plot.
-q    The minimum base quality threshold.
-Q    The minimum mapping quality threshold.
-m=1000000    The maximum coverage depth.
-l    Read length threshold.
--outSummary    Coverage summary TSV file. Default is to write to temp.
--outBam    Output aligned, indexed BAM file. Default is to write to temp.
--sensitive=False    Equivalent to giving bwa: '-k 12 -A 1 -B 1 -O 1 -E 1'.
--excludeDuplicates=False    MarkDuplicates with Picard and only plot non-duplicates.
--JVMmemory=2g    JVM virtual memory size.
--picardOptions=[]    Optional arguments to Picard's MarkDuplicates, OPTIONNAME=value ...
-T=30    The minimum score to output during alignment.
--loglevel=DEBUG    Verbosity of output.
Possible choices: DEBUG, INFO, WARNING, ERROR, CRITICAL, EXCEPTION
--version, -V    Show program's version number and exit.
--tmp_dir=/tmp    Base directory for temp files.
--tmp_dirKeep=False    Keep the tmp_dir if an exception occurs while running. Default is to delete all temp files at the end, even if there's a failure.
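The --sensitive flag is documented as equivalent to passing '-k 12 -A 1 -B 1 -O 1 -E 1' to bwa, and -T sets the minimum alignment score to output. A rough sketch of how those two settings could translate into a bwa mem option list follows. The helper name bwa_mem_options is hypothetical, and the assumption that -T is forwarded to bwa unchanged is not confirmed by the text above.

```python
def bwa_mem_options(sensitive=False, min_score_to_output=30):
    """Build a list of bwa mem options from the two documented settings."""
    # Assumption: the -T value is passed straight through to bwa mem.
    opts = ["-T", str(min_score_to_output)]
    if sensitive:
        # Option string quoted in the --sensitive help text.
        opts += ["-k", "12", "-A", "1", "-B", "1", "-O", "1", "-E", "1"]
    return opts


if __name__ == "__main__":
    print(" ".join(bwa_mem_options(sensitive=True)))
```

Returning a list rather than a single string keeps each token separate, which is the safer form to hand to a process launcher that does not go through a shell.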