posterior
Compute footprint posterior probabilities
Applies an empirical Bayesian approach to compute the posterior probability that a nucleotide is protected, by jointly analyzing many datasets.
INTERVAL_FILE is a BED-formatted file containing the genomic regions to be analyzed. SAMPLE_DATA_FILE is a file that specifies sample metadata.
SAMPLE_DATA_FILE is tab-delimited with the following columns:
id          Sample identifier (unique)
tabix_file  Path to TABIX-format cleavage statistics file
dm_file     Path to dataset JSON-encoded dispersion model file
beta_a      α parameter (see ‘learn_beta’ command)
beta_b      β parameter
Note: The file must contain a header row. Lines whose first character is ‘#’ are ignored.
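As an illustration of the format described above, the following is a minimal sketch of a reader for such a file. It is not the tool's own parser: the function name `read_sample_data`, the validation rules, and the example paths and parameter values are all assumptions made for this example.

```python
import csv
import io

# Column names as listed in the description above.
REQUIRED_COLUMNS = ["id", "tabix_file", "dm_file", "beta_a", "beta_b"]


def read_sample_data(handle):
    """Parse a tab-delimited sample data file (hypothetical helper).

    Lines whose first character is '#' are skipped; the first remaining
    line is treated as the header row.
    """
    lines = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(lines, delimiter="\t")

    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    samples, seen = [], set()
    for row in reader:
        if row["id"] in seen:  # ids are documented as unique
            raise ValueError(f"duplicate sample id: {row['id']}")
        seen.add(row["id"])
        row["beta_a"] = float(row["beta_a"])
        row["beta_b"] = float(row["beta_b"])
        samples.append(row)
    return samples


# Example content; the paths and parameter values are made up.
example = io.StringIO(
    "# hypothetical sample sheet\n"
    "id\ttabix_file\tdm_file\tbeta_a\tbeta_b\n"
    "s1\t/data/s1.bed.gz\t/data/s1.dm.json\t1.5\t20.0\n"
    "s2\t/data/s2.bed.gz\t/data/s2.dm.json\t1.2\t18.0\n"
)
samples = read_sample_data(example)
```

Checking the header and the uniqueness of `id` up front is a design choice of this sketch; whether the command itself performs the same checks is not stated here.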
The output is a bedGraph-like file ({N_samples}+3 columns) with the following annotations:
contig start start+1 -log(1-p)_1 … -log(1-p)_N
where N is the total number of samples. Note that values are derived from 1 minus the posterior probability, as in the column labels above, not the posterior itself. Columns (samples) are in the same order as the sample data file.
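A reader of this output may want the posterior p itself. The sketch below parses one output line and inverts the -log(1-p) transform. It rests on two assumptions that the text does not confirm: that the logarithm is the natural log (the base is exposed as a parameter for that reason) and that fields are whitespace-separated. The function name `parse_posterior_line` is hypothetical.

```python
import math


def parse_posterior_line(line, log_base=math.e):
    """Split one output line and recover per-sample posteriors.

    Assumes each value column holds -log(1-p) in the given base
    (natural log by default, which is an assumption).
    """
    fields = line.split()
    contig, start, end = fields[0], int(fields[1]), int(fields[2])
    # Invert v = -log_b(1 - p)  =>  p = 1 - b**(-v)
    posteriors = [1.0 - log_base ** (-float(v)) for v in fields[3:]]
    return contig, start, end, posteriors


# Made-up line with two sample columns.
contig, start, end, posteriors = parse_posterior_line(
    "chr1\t100\t101\t0.0\t2.302585093\n"
)
```

If the values turn out to use base 10, passing `log_base=10` applies the same inversion.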
posterior [OPTIONS] SAMPLE_DATA_FILE INTERVAL_FILE
Options
- --fdr_cutoff <fdr_cutoff>
FDR cutoff to use when computing priors
- Default
0.05
- --post_cutoff <post_cutoff>
Print only positions where the maximum posterior across all samples meets this threshold. Used to limit output file size.
- Default
0.2
- --outprefix <outprefix>
Output prefix
- Default
out
- --n_threads <n_threads>
Number of processors to use
- Default
2
- --batch_size <batch_size>
Batch size of intervals to process
- Default
100
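To make the --post_cutoff rule concrete, here is a minimal sketch of the filter it describes: a position is kept when the largest posterior across samples reaches the cutoff. Reading "meets" as greater than or equal is an assumption, and the function name and input values are invented for illustration; this is not the command's actual implementation.

```python
def positions_passing_cutoff(posteriors, post_cutoff=0.2):
    """Return indices of positions whose maximum posterior across
    samples is >= post_cutoff (hypothetical helper).

    `posteriors` holds one row per position and one value per sample.
    """
    return [
        i for i, per_sample in enumerate(posteriors)
        if max(per_sample) >= post_cutoff
    ]


# Three positions, three samples; values are made up.
posteriors = [
    [0.01, 0.05, 0.10],  # max 0.10, dropped at the default cutoff
    [0.01, 0.25, 0.10],  # max 0.25, kept
    [0.90, 0.00, 0.00],  # max 0.90, kept
]
kept = positions_passing_cutoff(posteriors)
```

Because the test is on the maximum, a single sample with a high posterior is enough to keep the whole row, including the low values of the other samples.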
Arguments
- SAMPLE_DATA_FILE
Required argument
- INTERVAL_FILE
Required argument