`modeling` submodule#

class footprint_tools.modeling.bias.bias_model#

offset()#

predict(probs, n=100)#

Compute cleavage propensities from sequence

Parameters

probs: numpy.ndarray: An array of probabilities (relative values)
n: int: Total number of tags to distribute
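
The idea of distributing a fixed number of tags according to relative values can be sketched in plain Python. This is only an illustration under the assumption that the tags are allocated in proportion to probs; the helper name distribute_tags is hypothetical and is not part of footprint_tools.

```python
def distribute_tags(probs, n=100):
    """Scale relative preferences so that they sum to n tags.

    Hypothetical helper for illustration; not the library's implementation.
    """
    total = sum(probs)
    if total <= 0:
        # No usable signal: return zeros instead of dividing by zero.
        return [0.0 for _ in probs]
    return [n * p / total for p in probs]


# Three positions with relative preferences 1 : 3 : 4 share 16 tags.
expected = distribute_tags([1.0, 3.0, 4.0], n=16)
print(expected)
```

The result keeps the relative shape of probs and only changes its scale, so the values sum to n.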

shuffle()#

Randomly shuffle the bias model

Returns

model: bias_model: A shuffled bias model

class footprint_tools.modeling.bias.kmer_model(filepath)#

probs(seq)#

Generate cleavage preference array from DNA sequence

Parameters

seq: str: A DNA sequence for which to compute the relative sequence preference

Returns

out: numpy.ndarray: Relative sequence preference
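
A k-mer lookup of this kind can be sketched as follows. The table contents, the parameters k and offset, and the fallback value of 0.0 are all assumptions made for the example; the actual file format and edge handling of kmer_model are not described here.

```python
def kmer_preferences(seq, table, k, offset):
    """Look up a relative preference for every position of seq.

    table maps each k-mer to a relative cleavage preference; offset is the
    index within the k-mer that corresponds to the scored position.
    All names here are hypothetical, for illustration only.
    """
    seq = seq.upper()
    out = [0.0] * len(seq)
    # Only positions with a complete k-mer around them are scored.
    for i in range(offset, len(seq) - (k - offset) + 1):
        kmer = seq[i - offset:i - offset + k]
        # Unknown k-mers (for example those containing N) fall back to 0.0.
        out[i] = table.get(kmer, 0.0)
    return out


toy_table = {"AC": 0.5, "CG": 2.0, "GT": 1.0}  # made-up values
prefs = kmer_preferences("ACGT", toy_table, k=2, offset=1)
print(prefs)
```

Positions too close to the ends of the sequence to hold a full k-mer are left at 0.0 in this sketch.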

read_model(filepath)#

Read the k-mer model from a file.

Parameters

filepath: str: Path to a k-mer model file

class footprint_tools.modeling.bias.uniform_model#

probs(seq)#

This module contains classes and functions that implement a dispersion model.

footprint_tools.modeling.dispersion.base64decode()#

footprint_tools.modeling.dispersion.base64encode()#

class footprint_tools.modeling.dispersion.dispersion_model#

Dispersion model class

fit_mu()#

Computes the fitted mu term for the negative binomial from a piece-wise linear fit.

Parameters

x: float

Returns

mu: float: mu computed from the regression fit
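
Evaluating a piece-wise linear fit at a value x can be sketched as below. The breakpoints and fitted values are invented for the example, and the way footprint_tools stores its fit in mu_params is not shown here, so treat this as a generic illustration rather than the library's code.

```python
from bisect import bisect_right


def piecewise_linear(x, xs, ys):
    """Evaluate a continuous piece-wise linear function at x.

    xs are sorted breakpoints and ys the fitted values at those breakpoints
    (both hypothetical here). Outside the breakpoints the nearest segment
    is extended linearly.
    """
    # Index of the segment [xs[i], xs[i + 1]] that contains x, clamped so
    # that values beyond either end reuse the first or last segment.
    i = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
    slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
    return ys[i] + slope * (x - xs[i])


breaks = [0.0, 10.0, 50.0]   # made-up breakpoints
values = [0.0, 8.0, 28.0]    # made-up fitted values at the breakpoints
mu_at_5 = piecewise_linear(5.0, breaks, values)
mu_at_30 = piecewise_linear(30.0, breaks, values)
print(mu_at_5, mu_at_30)
```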

fit_r()#

Computes the dispersion term for the negative binomial from a piece-wise linear fit. Note that the model parameters estimate the inverse.

Parameters

x: float

Returns

r: float: r computed from the regression fit

h#: Histogram of observed cleavages at each predicted cleavage rate

log_pmf_values()#

Compute the log probability mass function

Parameters

exp: numpy.ndarray: Expected cleavage counts
obs: numpy.ndarray: Observed cleavage counts

Returns

logp: numpy.ndarray: Array of log probability mass function values computed from the expected cleavage distributions
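
For reference, the log probability mass function of a negative binomial can be written with log-gamma terms, which stays numerically stable for large counts. The sketch below assumes the (r, p) convention used by scipy.stats.nbinom; how an expected count is mapped to r and p inside dispersion_model is not covered, and the parameter values are made up.

```python
import math


def nbinom_log_pmf(k, r, p):
    """Log PMF of a negative binomial with size r and success probability p.

    Assumed convention (same as scipy.stats.nbinom):
    P(k) = Gamma(k + r) / (k! * Gamma(r)) * p**r * (1 - p)**k
    """
    return (
        math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
        + r * math.log(p) + k * math.log1p(-p)
    )


# Sanity check with made-up parameters: the probabilities should sum to ~1.
log_p = [nbinom_log_pmf(k, r=2.5, p=0.4) for k in range(200)]
total = sum(math.exp(v) for v in log_p)
print(total)
```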

log_pmf_values_0()#

Compute the log probability mass function, storing the values to a pointer

Parameters

exp: numpy.ndarray: Expected cleavage counts
obs: numpy.ndarray: Observed cleavage counts

Returns

logp: numpy.ndarray (memoryview): Array pointer to log probability mass function values computed from the expected cleavage distributions

Notes

This function is equivalent to log_pmf_values, except it stores values to a matrix pointer

metadata#

mu_params#

p#: Array of the negative binomial MLE fit parameters p

p_values()#

Compute cumulative distribution (lower-tail p-value) from negative binomial

Parameters

exp: numpy.ndarray: Expected cleavage counts
obs: numpy.ndarray: Observed cleavage counts

Returns

pvals: numpy.ndarray: Array of p-values
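
A lower-tail p-value is the probability of observing a count at or below obs. A minimal sketch, assuming the same (r, p) convention as scipy.stats.nbinom and using a term-by-term recurrence instead of the library's own routine:

```python
def nbinom_lower_tail(obs, r, p):
    """P(X <= obs) for a negative binomial with size r and success prob p.

    Illustrative only; the parameter convention is an assumption.
    """
    term = p ** r          # P(X = 0)
    total = term
    for k in range(obs):
        # Recurrence: P(k + 1) = P(k) * (k + r) / (k + 1) * (1 - p)
        term *= (k + r) / (k + 1) * (1.0 - p)
        total += term
    # Guard against rounding pushing the sum slightly above 1.
    return min(total, 1.0)


# With r = 1 the distribution is geometric, so P(X <= 2) = 1 - 0.5**3.
pval = nbinom_lower_tail(2, r=1.0, p=0.5)
print(pval)
```

Small values indicate that the observed count is unusually low relative to the distribution implied by the expected count.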

pmf_values()#

Compute the probability mass function

Parameters

exp: numpy.ndarray: Expected cleavage counts
obs: numpy.ndarray: Observed cleavage counts

Returns

p: numpy.ndarray: Array of probability mass function values computed from the expected cleavage distributions

pmf_values_0()#

Compute the probability mass function, storing the values to a pointer

Parameters

exp: numpy.ndarray: Expected cleavage counts
obs: numpy.ndarray: Observed cleavage counts

Returns

p: numpy.ndarray (memoryview): Array pointer to probability mass function values computed from the expected cleavage distributions

Notes

This function is equivalent to pmf_values, except it stores values to a matrix pointer

r#: Array of the negative binomial MLE fit parameters r

r_params#

sample()#

Sample counts from negative binomial distribution and compute p-values

Parameters

x: numpy.ndarray: Count values specifying the distribution from which to resample. These are typically expected count values.
times: int: Number of times to sample (per element)

Returns

sampled_counts: numpy.ndarray: Array of sampled counts (2-D array - positions by number of samples)
sampled_pvals: numpy.ndarray: Array of p-values for the sampled counts (2-D array - positions by number of samples)

footprint_tools.modeling.dispersion.learn_dispersion_model()#

Learn a dispersion model from the expected vs. observed histogram

Parameters

h: numpy.ndarray: A 2-dimensional array containing the distribution of observed cleavages at each expected cleavage rate
cutoff: int: Minimum number of observed cleavages required to perform the ML negative binomial fit at each value of expected cleavages
trim: tuple (float): Percent of data to trim from the observed cleavage counts (to mitigate outlier effects)

Returns

model: dispersion_model: A dispersion model learned from observed and expected counts
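
To make the shape of h concrete, the sketch below tallies observed counts against rounded expected counts. The axis order (rows indexed by expected count, columns by observed count), the rounding of expected values, and the helper name build_histogram are assumptions for illustration; the library may construct this array differently.

```python
def build_histogram(expected, observed, max_exp, max_obs):
    """Tally how often each observed count occurs at each expected count.

    Hypothetical helper: rows are expected counts, columns observed counts.
    """
    h = [[0] * (max_obs + 1) for _ in range(max_exp + 1)]
    for e, o in zip(expected, observed):
        e = int(round(e))  # expected values are binned to whole counts
        # Pairs outside the table bounds are skipped.
        if 0 <= e <= max_exp and 0 <= o <= max_obs:
            h[e][o] += 1
    return h


h = build_histogram(
    expected=[1.2, 0.9, 3.0, 1.0],   # made-up per-nucleotide values
    observed=[0, 2, 3, 2],
    max_exp=3,
    max_obs=3,
)
print(h[1])
```

Each row of such a table is the empirical distribution of observed cleavages at one expected cleavage rate, which is the input a per-row negative binomial fit would need.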

footprint_tools.modeling.dispersion.load_dispersion_model()#

Load a dispersion model encoded in JSON format

Parameters

filename: str: Path to JSON-format dispersion model

Returns

model: dispersion_model: A dispersion model loaded from file

footprint_tools.modeling.dispersion.piecewise_five()#

footprint_tools.modeling.dispersion.piecewise_four()#

footprint_tools.modeling.dispersion.piecewise_three()#

footprint_tools.modeling.dispersion.write_dispersion_model()#

Write a JSON format dispersion model

Parameters

model: dispersion_model: An instance of dispersion_model

Returns

out: str: JSON-formatted dump of dispersion model
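
One way numeric arrays can be carried inside JSON is to base64-encode their raw bytes. The round trip below uses only the Python standard library; the dictionary layout and parameter values are invented for the example and do not describe the actual file format written by write_dispersion_model.

```python
import base64
import json
from array import array


def encode_floats(values):
    """Pack floats as native doubles and return a base64 text string."""
    raw = array("d", values).tobytes()
    return base64.b64encode(raw).decode("ascii")


def decode_floats(text):
    """Inverse of encode_floats (assumes the same machine byte order)."""
    out = array("d")
    out.frombytes(base64.b64decode(text))
    return out.tolist()


# Hypothetical model contents; the values are made up.
model = {"mu_params": [0.1, 0.9, 2.0], "r_params": [1.5, 0.02]}
dumped = json.dumps({k: encode_floats(v) for k, v in model.items()})
restored = {k: decode_floats(v) for k, v in json.loads(dumped).items()}
print(restored == model)
```

Encoding the bytes directly preserves every float exactly, which a decimal text representation does not always guarantee.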

class footprint_tools.modeling.predict.prediction(read_func, fasta_func, bm, half_win_width=5, smoothing_half_win_width=0, smoothing_clip=0.01)#

Class that holds a wrapper function to compute the expected cleavage counts

Attributes

bm: bias.bias_model: Sequence bias model to apply
read_func: cutcounts.bamfile: Cut-counts reader
fasta_func: pysam.FastaFile: FASTA-file reader
half_win_width: int: Half-width W of the window used to apply the bias model (final window size = 2W+1)
padding: int: Padding applied to region when retrieving per-nucleotide data
smoothing_clip: float: Fraction of nucleotides to trim when computing smoothed mean
smoothing_half_win_width: int: Half-width of window used to compute windowed tag counts
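
The two smoothing attributes suggest a trimmed mean over a sliding window. The sketch below shows one plausible reading, in which the clip fraction is removed from each tail of the sorted window and windows are shortened at the edges. Both choices, and the helper name windowed_trimmed_mean, are assumptions rather than the library's documented behavior.

```python
def windowed_trimmed_mean(counts, half_width, clip):
    """Trimmed mean of counts in a window of 2 * half_width + 1 positions.

    Hypothetical helper: clip is the fraction dropped from EACH tail of
    the sorted window; windows are truncated at the array edges.
    """
    out = []
    for i in range(len(counts)):
        lo = max(0, i - half_width)
        hi = min(len(counts), i + half_width + 1)
        window = sorted(counts[lo:hi])
        drop = int(len(window) * clip)  # values trimmed from each tail
        # Fall back to the full window if trimming would remove everything.
        kept = window[drop:len(window) - drop] or window
        out.append(sum(kept) / len(kept))
    return out


# The spike of 50 is trimmed away at the central position.
smoothed = windowed_trimmed_mean([0, 1, 50, 2, 1], half_width=2, clip=0.2)
print(smoothed[2])
```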

compute(x)#

Compute expected cleavage counts

Parameters

x: genome_tools.genomic_interval: Genomic region for which to generate predicted cleavages

Returns

out: tuple of dict: Observed, expected and windowed cleavage counts

footprint_tools.modeling.predict.reverse_complement()#

Computes reverse complement of a DNA sequence

Parameters

seq: str: DNA sequence string

Returns

out: str: Reverse complement of seq
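
For readers unfamiliar with the operation, a reverse complement can be computed with a translation table followed by a reversal. This standalone sketch uses the hypothetical name revcomp and handles only A, C, G, T and N; how the library function treats other characters is not specified here.

```python
# Map each base to its complement, preserving case; N stays N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def revcomp(seq):
    """Return the reverse complement of a DNA string (illustrative only)."""
    return seq.translate(_COMPLEMENT)[::-1]


rc = revcomp("ACCGT")
print(rc)
```

Applying the operation twice returns the original sequence, which is a convenient sanity check.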
