This repo contains a set of R and Matlab scripts to accompany the following two related publications:
- Bayesian Correlation Analysis for Sequence Count Data, PLoS ONE, 2016.
The code for this paper is in BayesianCorrelation.R.
To use, simply `source` the file, and then run whichever functions you like on the data. The first function allows you to specify your own condition-specific priors, while the subsequent functions implement the three priors described in the paper.
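The function names defined in BayesianCorrelation.R are not listed here, so one way to see them is to source the file into a separate environment and list what it defines. The following is a minimal sketch: the helper `list_sourced_functions` is hypothetical (not part of this repo), and the path assumes the working directory is the repo root.

```r
# Hypothetical helper (not part of this repo): source an R script into a
# fresh environment and return the names of the functions it defines.
list_sourced_functions <- function(path) {
  env <- new.env()
  sys.source(path, envir = env)  # evaluate the script inside env only
  objs <- ls(env)
  is_fun <- vapply(objs, function(nm) is.function(get(nm, envir = env)),
                   logical(1))
  objs[is_fun]
}

# Assumes the working directory is the repo root.
if (file.exists("BayesianCorrelation.R")) {
  print(list_sourced_functions("BayesianCorrelation.R"))
}
```

For normal use, a plain `source("BayesianCorrelation.R")` is enough; the separate environment above only keeps the listing from cluttering the global workspace.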
- Uncovering robust patterns of microRNA co-expression across cancers using Bayesian Relevance Networks, PLoS ONE, 2017 (also appeared at GLBIO 2017, where it won the "Outstanding Presentation" prize).
All scripts pertaining to this paper are in the folder BayesianRelevanceNetworks.
Sourcing the R code will create five functions: `BayesianCorrelation_Grouped`, `BayesianPermutation_Grouped`, `PearsonCorrelation_Grouped`, `PearsonPermutation_Grouped`, and `FDRAnalysis`.
The first and third functions are two ways of computing correlations between entities in a count matrix from a set of sequencing experiments. Typically, these entities would be genes or microRNAs whose expression is assessed by RNA-seq or single-cell RNA-seq, but they could be other things as well. The second and fourth functions estimate a null distribution of correlations under the hypothesis of no true correlation. The final function calculates empirical false discovery rates based on the outputs of the first and second functions, or of the third and fourth functions.
The basic steps of a complete analysis are to
- Compute Bayesian (or Pearson) correlations between entities (genes, microRNAs, etc.),
- Compute a null distribution of those correlations by permuting the data, and
- Threshold the correlations based on the permuted distribution and a desired false discovery rate threshold.
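The three steps above can be sketched with base R alone. This is an illustration of the general workflow, not the repo's code: it uses plain Pearson correlations on simulated counts, and the names `pairwise_cor`, `empirical_fdr`, `n_perm`, and the 5% FDR target are all assumptions made for the example.

```r
# Illustrative sketch only: plain Pearson correlations and base R, NOT the
# repo's functions. Rows of `counts` are entities, columns are samples.
set.seed(1)
counts <- matrix(rpois(20 * 30, lambda = 50), nrow = 20,
                 dimnames = list(paste0("miR", 1:20), NULL))
counts[2, ] <- counts[1, ] + rpois(30, lambda = 5)  # plant one correlated pair

# Correlations between all pairs of rows (upper triangle only).
pairwise_cor <- function(m) {
  r <- cor(t(m))
  r[upper.tri(r)]
}

# Step 1: observed correlations.
obs <- pairwise_cor(counts)

# Step 2: null distribution; shuffle each row independently across samples.
n_perm <- 50
null_cor <- unlist(lapply(seq_len(n_perm), function(i) {
  pairwise_cor(t(apply(counts, 1, sample)))
}))

# Step 3: empirical FDR at a threshold =
#   (average null hits per permutation) / (observed hits).
empirical_fdr <- function(thr, obs, null_cor, n_perm) {
  n_obs <- sum(abs(obs) >= thr)
  if (n_obs == 0) return(0)
  min(1, (sum(abs(null_cor) >= thr) / n_perm) / n_obs)
}

thresholds <- seq(0, 1, by = 0.01)
fdr <- vapply(thresholds, empirical_fdr, numeric(1),
              obs = obs, null_cor = null_cor, n_perm = n_perm)
cutoff <- thresholds[which(fdr <= 0.05)[1]]  # smallest threshold meeting 5% FDR
significant <- abs(obs) >= cutoff
```

With the repo's functions, the same three steps would be carried out by a correlation function, the matching permutation function, and `FDRAnalysis`; consult the function definitions for their actual arguments, which are not reproduced here.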
For the Matlab scripts, the basic steps of a complete analysis are the same as above. See the top comment headers of the .m files for descriptions of functionality. The file `TestScript.m` demonstrates how the code would typically be used to perform an analysis of a matrix of count data.