posterior

Compute footprint posterior probabilities

Applies an empirical Bayesian approach to compute the posterior probability that a nucleotide is protected by jointly analyzing many datasets.

INTERVAL_FILE is a BED-formatted file containing the genomic regions to be analyzed. SAMPLE_DATA_FILE is a file specifying sample metadata.

SAMPLE_DATA_FILE is tab-delimited with the following columns:

id          Sample identifier (unique)
tabix_file  Path to TABIX-format cleavage statistics file
dm_file     Path to dataset JSON-encoded dispersion model file
beta_a      α parameter (see ‘learn_beta’ command)
beta_b      β parameter

Note: The file must contain a header row. Lines whose first character is ‘#’ are ignored.
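As an illustration of the layout described above, the following sketch writes and re-reads a sample data file with Python's standard `csv` module. The column names come from the table; the sample identifiers, file paths, and beta values are made-up placeholders, and the helper names `write_sample_data` and `read_sample_data` are hypothetical and not part of the tool.

```python
import csv
import io

# Column names as listed in the documentation above.
COLUMNS = ["id", "tabix_file", "dm_file", "beta_a", "beta_b"]


def write_sample_data(rows, handle):
    """Write sample metadata as a tab-delimited table with a header row."""
    writer = csv.DictWriter(
        handle, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


def read_sample_data(handle):
    """Read the table back, skipping lines whose first character is '#'."""
    lines = (line for line in handle if not line.startswith("#"))
    return list(csv.DictReader(lines, delimiter="\t"))


# Placeholder rows: the paths and parameter values are invented for illustration.
rows = [
    {"id": "sample_1", "tabix_file": "sample_1.bed.gz",
     "dm_file": "sample_1.dm.json", "beta_a": 1.5, "beta_b": 20.0},
    {"id": "sample_2", "tabix_file": "sample_2.bed.gz",
     "dm_file": "sample_2.dm.json", "beta_a": 2.0, "beta_b": 18.0},
]

buffer = io.StringIO()  # stands in for a file opened with open(path, "w")
write_sample_data(rows, buffer)
buffer.seek(0)
samples = read_sample_data(buffer)
```

Values read back this way are strings, so numeric columns such as `beta_a` would need an explicit `float(...)` conversion before use.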

The output is a bedGraph-like file ({N_samples}+3 columns) with the following columns:

contig start start+1 -log(1-p)_1 … -log(1-p)_N

where N is the total number of samples. Note that the values are derived from 1 minus the posterior probability (1-p), not from p itself. Columns (samples) are in the same order as in the sample data file.
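A minimal sketch of reading one row of this output follows. It assumes whitespace-separated fields, and the function names `parse_posterior_line` and `to_posterior` are hypothetical. The logarithm base is not stated here, so the natural logarithm used as the default below is an assumption that should be checked against the tool before relying on recovered probabilities.

```python
import math


def parse_posterior_line(line, sample_ids):
    """Split one output row into its position and per-sample values."""
    fields = line.split()
    if len(fields) != len(sample_ids) + 3:
        raise ValueError(
            f"expected {len(sample_ids) + 3} columns, found {len(fields)}"
        )
    contig, start, end = fields[0], int(fields[1]), int(fields[2])
    # Columns follow the order of the sample data file.
    values = dict(zip(sample_ids, (float(v) for v in fields[3:])))
    return contig, start, end, values


def to_posterior(value, base=math.e):
    """Invert value = -log_base(1 - p) to recover p.

    The base is an assumption; pass base=10 if the values are log10-scaled.
    """
    return 1.0 - base ** (-value)


# Invented example row for two samples.
contig, start, end, values = parse_posterior_line(
    "chr1\t100\t101\t0.0\t2.5\n", ["sample_1", "sample_2"]
)
```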

posterior [OPTIONS] SAMPLE_DATA_FILE INTERVAL_FILE

Options

--fdr_cutoff <fdr_cutoff>

FDR cutoff to use when computing priors

Default: 0.05

--post_cutoff <post_cutoff>

Print only positions where the maximum posterior across all samples meets this threshold. Used to limit output file size.

Default: 0.2
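The filtering rule can be sketched as follows. This is an illustration only: it reads "meets this threshold" as greater than or equal, compares the cutoff against plain posterior probabilities, and uses invented positions and values. The function name `passes_cutoff` is hypothetical.

```python
def passes_cutoff(posteriors, post_cutoff=0.2):
    """Return True when the largest per-sample posterior reaches the cutoff."""
    return max(posteriors) >= post_cutoff


# Invented per-position posteriors for three samples.
rows = {
    ("chr1", 100): [0.05, 0.10, 0.15],  # maximum 0.15, below the cutoff
    ("chr1", 101): [0.01, 0.45, 0.02],  # maximum 0.45, kept
}

kept = [position for position, posteriors in rows.items()
        if passes_cutoff(posteriors)]
```

Under this reading, a position is written out as soon as any single sample reaches the cutoff, so raising the cutoff shrinks the output without changing the values at the positions that remain.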

--outprefix <outprefix>

Output prefix

Default: out

--n_threads <n_threads>

Number of processors to use

Default: 2

--batch_size <batch_size>

Batch size of intervals to process

Default: 100

Arguments

SAMPLE_DATA_FILE: Required argument

INTERVAL_FILE: Required argument
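To show how the options and arguments fit together, the sketch below assembles an argument list in the order given by the usage line. The option names and defaults are those documented above; the helper `build_posterior_command` is hypothetical, the file names are placeholders, and the executable name is taken from the usage line, so how the command is actually installed or invoked on a given system is an assumption.

```python
import shlex


def build_posterior_command(sample_data_file, interval_file,
                            executable=("posterior",), **options):
    """Assemble: posterior [OPTIONS] SAMPLE_DATA_FILE INTERVAL_FILE."""
    args = list(executable)
    for name, value in options.items():
        args += [f"--{name}", str(value)]
    # Positional arguments come last, in the documented order.
    args += [sample_data_file, interval_file]
    return args


# Placeholder file names; option values are the documented defaults.
cmd = build_posterior_command(
    "samples.tsv", "regions.bed",
    fdr_cutoff=0.05, post_cutoff=0.2, outprefix="out",
    n_threads=2, batch_size=100,
)
command_line = shlex.join(cmd)
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`; passing a list rather than a single string avoids shell quoting issues with file paths.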
