Help for package IntegratedJM

Type:

Package

Title:

Joint Modeling of the Gene-Expression and Bioassay Data, Taking Care of the Effect Due to a Fingerprint Feature

Version:

1.6

Date:

2017-08-03

Author:

Rudradev Sengupta, Nolen Joy Perualila

Maintainer:

Rudradev Sengupta <rudradev.sengupta@uhasselt.be>

Description:

Models the association between gene-expression and bioassay data while taking care of the effect due to a fingerprint feature, and provides several plots to help interpret the analysis.

Depends:

R (≥ 3.0.0), grid

Imports:

ggplot2, nlme, Biobase

License:

GPL-3

LazyLoad:

yes

Repository:

CRAN

Repository/R-Forge/Project:

integratedjm

Repository/R-Forge/Revision:

Repository/R-Forge/DateTimeStamp:

2017-08-03 19:03:01

Date/Publication:

2017-08-03 22:37:38 UTC

NeedsCompilation:

Packaged:

2017-08-03 19:25:12 UTC; rforge

Joint Modelling of Gene Expression and Bio-activity data, taking care of the effect due to Fingerprint Feature.

Description

The IntegratedJM package contains functions to fit the joint model, to classify the genes based on different criteria, and to produce the necessary plots.

fitJM

Description

The fitJM function fits the model for all the genes for a specific bio-activity vector and a particular fingerprint feature.

Usage

fitJM(dat, responseVector, covariate = NULL, methodMultTest)

Arguments

dat

Contains the gene expression data matrix for all the genes - can be a matrix or an expression set.

responseVector

Vector containing the bio-activity data.

covariate

Vector of 0's and 1's, containing data about the fingerprint feature.

methodMultTest

Character string to specify the multiple testing method. Default is the BH-FDR method.

Details

The default for the covariate parameter is NULL. If no covariate is specified, fitJM calls the nullcov function internally and returns a data frame containing 5 variables, named "Pearson", "Spearman", "p", "adj-p" and "logratio". This data frame is ordered by the column "p", which is the p-value obtained from the Log-Ratio Test.

If a covariate is specified, fitJM calls the non_nullcov function instead and returns a data frame containing 13 variables for all the genes, named "adjPearson", "adjSpearman", "pPearson", "Pearson", "Spearman", "pAdjR", "CovEffect1", "adjPeffect1", "CovEffect2", "adjPeffect2", "rawP1", "rawP2" and "logratio". This data frame is sorted by "rawP1" and "pPearson". Here "rawP1" is the p-value for the effect of the fingerprint feature on the gene expression data, taken from the t-table after fitting the model using gls, and "pPearson" is the p-value obtained from the Log-Ratio Test.
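As a rough illustration of the layout of the no-covariate result, the following base R sketch builds a table with the "Pearson", "Spearman", "p" and "adj-p" columns and orders it by "p". It is not the package's implementation: the data are simulated, the p-value comes from cor.test rather than from the gls fit and Log-Ratio Test, and the "logratio" column is left out.

```r
# Simulated stand-in data (hypothetical): 5 genes measured on 12 compounds.
set.seed(1)
n_comp   <- 12
activity <- rnorm(n_comp)
gene_mx  <- matrix(rnorm(5 * n_comp), nrow = 5,
                   dimnames = list(paste0("Gene", 1:5), NULL))
gene_mx[1, ] <- activity + rnorm(n_comp, sd = 0.2)  # one strongly associated gene

# Per-gene association with the bio-activity vector.
# Assumption: cor.test stands in for the model-based test used by the package.
res <- data.frame(
  Pearson   = apply(gene_mx, 1, cor, y = activity),
  Spearman  = apply(gene_mx, 1, cor, y = activity, method = "spearman"),
  p         = apply(gene_mx, 1, function(g) cor.test(g, activity)$p.value),
  row.names = rownames(gene_mx)
)

# Multiple testing adjustment with the BH-FDR method, then order by raw p-value.
res[["adj-p"]] <- p.adjust(res$p, method = "fdr")
res <- res[order(res$p), ]
print(res)
```

The string passed to p.adjust here is the same kind of value that the examples pass as methodMultTest ('fdr').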

Value

A data frame, containing the results of the model, to be used later for plots or to identify the top genes.

Examples

## Not run: 
jmRes <- fitJM(dat=gene_eset,responseVector=activity,methodMultTest='fdr')
jmRes <- fitJM(dat=gene_eset,responseVector=activity,covariate = fp,methodMultTest='fdr')

## End(Not run)

getCorrUnad

Description

The getCorrUnad function is a support function for the function plot1gene.

Usage

getCorrUnad(geneName, fp, fpName, responseVector, dat, resPlot)

Arguments

geneName

Character string, specifying the name of the gene.

fp

Vector containing 0's and 1's - the data about the fingerprint feature.

fpName

Character string, used to make the title of the plots. If not specified, the plot title will be blank.

responseVector

Vector containing the bio-activity data.

dat

Contains the gene expression data matrix for all the genes - can be a matrix or an expression set.

resPlot

Logical. If TRUE, creates the plot data for the residual plot.

Details

Works as a support function for plot1gene.

Value

A list containing the data to create the respective plots and the unadjusted association between the gene expression and bio-activity data.

Examples

## Not run: 
getCorrUnad(geneName="Gene21",fp=fp,fpName="Fingerprint",
responseVector=activity,dat=gene_eset,resPlot=TRUE)

## End(Not run)

multiplot

Description

The multiplot function plots multiple ggplots in the same window.

Usage

multiplot(..., cols = 1)

Arguments

...

ggplot2 objects, separated by comma.

cols

Integer, specifying the number of plots in one row in the layout.

Details

Plots multiple ggplots in the same window - multiplot(p1,p2,p3,p4, cols=2) is similar to the standard R notation par(mfrow=c(2,2)).
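The layout idea can be sketched with the grid package alone. The helper cell_positions below is hypothetical and not part of IntegratedJM, and filling the cells row by row (as par(mfrow) does) is an assumption about the ordering. Rectangles with text labels stand in for the ggplot2 objects.

```r
library(grid)

# Hypothetical helper: row and column of each plot when cells are filled row by row.
cell_positions <- function(n, cols = 1) {
  idx <- seq_len(n)
  data.frame(row = (idx - 1) %/% cols + 1,
             col = (idx - 1) %% cols + 1)
}

pos <- cell_positions(4, cols = 2)

pdf(NULL)  # null device so the sketch also runs non-interactively
grid.newpage()
pushViewport(viewport(layout = grid.layout(max(pos$row), max(pos$col))))
for (i in seq_len(nrow(pos))) {
  # Move into cell i of the layout, draw a placeholder, and move back out.
  pushViewport(viewport(layout.pos.row = pos$row[i],
                        layout.pos.col = pos$col[i]))
  grid.rect()
  grid.text(paste("plot", i))
  popViewport()
}
popViewport()
invisible(dev.off())
```

With real ggplot2 objects, the placeholder drawing would typically be replaced by print(p, vp = viewport(layout.pos.row = ..., layout.pos.col = ...)).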

Value

Creates multiple ggplots in the same window.

Examples

## Not run: 
multiplot(p1,p2,p3,cols=3)

## End(Not run)

non_nullcov

Description

The non_nullcov function is called while fitting the model when the covariate is specified in the fitJM function. It returns a data.frame containing the results after fitting the model. The output of this function is also the output of the fitJM function.

Usage

non_nullcov(dat, responseVector, covariate, methodMultTest, data_type)

Arguments

dat

Contains the gene expression data matrix for all the genes - can be a matrix or an expression set.

responseVector

Vector containing the bio-activity data.

covariate

Vector of 0's and 1's, containing data about the fingerprint feature.

methodMultTest

Character string to specify the multiple testing method.

data_type

Binary, specifying the type of the parameter dat: 0 - expressionSet, 1 - matrix.

Details

Fits the model, adjusting for the covariate effect, using gls, calculates the correlation, p-values, adjusted p-values (based on the multiple testing method) and logratio from LRT and returns the required results.
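To give an intuition for what "adjusting for the covariate effect" means, the sketch below uses a simpler residual-based approach for a single simulated gene: both the gene expression and the bio-activity are regressed on the fingerprint feature with lm, and the residuals are correlated. This is a swapped-in approximation, not the gls-based joint model that non_nullcov fits, and all data and object names are hypothetical.

```r
# Simulated stand-in data (hypothetical): one gene, 100 compounds.
set.seed(2)
n  <- 100
fp <- rep(c(0, 1), each = n / 2)   # binary fingerprint feature
shared   <- rnorm(n)               # variation shared by gene and activity
gene     <- 2.0 * fp + shared + rnorm(n, sd = 0.5)
activity <- 2.0 * fp + shared + rnorm(n, sd = 0.5)

# Separate linear fits, each adjusting one response for the fingerprint feature.
fit_gene <- lm(gene ~ fp)
fit_act  <- lm(activity ~ fp)

# Fingerprint effect on gene expression and its raw p-value from the t-table.
cov_effect <- coef(summary(fit_gene))["fp", "Estimate"]
raw_p      <- coef(summary(fit_gene))["fp", "Pr(>|t|)"]

# Unadjusted association versus association after removing the fingerprint effect.
unadj <- cor(gene, activity)
adj   <- cor(resid(fit_gene), resid(fit_act))

print(c(CovEffect = cov_effect, rawP = raw_p, Pearson = unadj, adjPearson = adj))
```

Comparing the two correlations shows how much of the gene-activity association is carried by the fingerprint feature, which is the comparison that plotAsso draws for all genes.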

Value

A data frame, containing the results of the model - same as the output of the fitJM function.

Examples

## Not run: 
non_nullcov(dat=gene_eset,responseVector=activity,covariate=fp,methodMultTest='fdr',data_type=0)

## End(Not run)

nullcov

Description

The nullcov function is called while fitting the model when no covariate is specified in the fitJM function. It returns a data.frame containing the results after fitting the model. The output of this function is also the output of the fitJM function.

Usage

nullcov(dat, responseVector, methodMultTest, data_type)

Arguments

dat

Contains the gene expression data matrix for all the genes - can be a matrix or an expression set.

responseVector

Vector containing the bio-activity data.

methodMultTest

Character string to specify the multiple testing method. Default is the BH-FDR method.

data_type

Binary, specifying the type of the parameter dat: 0 - expressionSet, 1 - matrix.

Details

Fits the model using gls, calculates the correlation, p-values, adjusted p-values (based on the multiple testing method) and logratio from LRT and returns the required results.

Value

A data frame, containing the results of the model - same as the output of the fitJM function.

Examples

## Not run: 
nullcov(dat=gene_eset,responseVector=activity,methodMultTest='fdr',data_type=0)

## End(Not run)

plot1gene

Description

The plot1gene function plots the data for a single gene.

Usage

plot1gene(geneName, fp, fpName = "", responseVector, dat, resPlot = TRUE,
  colP = "blue", colA = "white")

Arguments

geneName

Character string, specifying the name of the gene.

fp

Vector containing 0's and 1's - the data about the fingerprint feature.

fpName

Character string, used to make the title of the plots. If not specified, the plot title will be blank.

responseVector

Vector containing the bio-activity data.

dat

Contains the gene expression data matrix for all the genes - can be a matrix or an expression set.

resPlot

Logical. If TRUE, also plots the residual from the gls fit. Default is TRUE.

colP

Character string, specifying the colour for the 1's in the fp parameter. Default is blue.

colA

Character string, specifying the colour for the 0's in the fp parameter. Default is white.

Details

Calls the getCorrUnad function and creates the plot(s) accordingly.

Value

Creates a plot

Examples

## Not run: 
plot1gene(geneName="Gene21",fp=fp,fpName="Fingerprint",responseVector=activity,dat=gene_eset)

## End(Not run)

plotAsso

Description

The plotAsso function is used to plot the unadjusted association vs the adjusted association for all the genes.

Usage

plotAsso(jointModelResult, type)

Arguments

jointModelResult

Data frame, containing the results from the fitJM function.

type

Character string, specifying the type of association - Pearson or Spearman.

Details

Plots the unadjusted association vs the adjusted association for all the genes.

Value

Creates a plot

Examples

## Not run: 
plotAsso(jointModelResult=jmRes,type="Pearson")

## End(Not run)

plotEff

Description

The plotEff function is used to plot the fingerprint effect on gene expression vs the adjusted association for all the genes.

Usage

plotEff(jointModelResult, type)

Arguments

jointModelResult

Data frame, containing the results from the fitJM function.

type

Character string, specifying the type of association - Pearson or Spearman.

Details

Plots the fingerprint effect on gene expression vs the specified type of adjusted association for all the genes.

Value

Creates a plot

Examples

## Not run: 
plotEff(jointModelResult=jmRes,type="Pearson")

## End(Not run)

Sample Data Example

Description

sample_data: sample data included in the package. It contains gene expression data for 500 genes and 20 compounds, together with data on the bio-activity and the fingerprint feature.

Usage

data(sampleData)

Format

A list containing one gene expression data matrix and one vector each for the bio-activity data and the fingerprint feature data.

Examples


data(sampleData)
gene_mx <- sample_data[[1]]
activity <- sample_data[[2]]
fp <- sample_data[[3]]
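When the package is not at hand, a simulated stand-in with the same list layout can be used to try out code. The sketch below is hypothetical: the values are random, the names sample_like and "Cmpd" are invented, and placing genes in rows and compounds in columns is an assumption.

```r
# Simulated stand-in (hypothetical) mirroring the documented list layout:
# element 1 = expression matrix, element 2 = bio-activity, element 3 = fingerprint.
set.seed(3)
sample_like <- list(
  matrix(rnorm(500 * 20), nrow = 500,
         dimnames = list(paste0("Gene", 1:500), paste0("Cmpd", 1:20))),
  rnorm(20),
  rbinom(20, size = 1, prob = 0.5)
)

gene_mx  <- sample_like[[1]]
activity <- sample_like[[2]]
fp       <- sample_like[[3]]

# Basic consistency check: one activity value and one fingerprint value per compound.
stopifnot(ncol(gene_mx) == length(activity),
          length(activity) == length(fp))
```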

topkGenes

Description

The topkGenes function identifies the top genes based on different criteria.

Usage

topkGenes(jointModelResult, subset_type, ranking, k = 10, sigLevel = 0.01)

Arguments

jointModelResult

Data frame, containing the results from the fitJM function.

subset_type

Character string to specify the set of genes. It can have four values: "Effect" for only differentially expressed genes, "Correlation" for only correlated genes, "Effect and Correlation" for genes which are both differentially expressed & correlated and "Other" for the genes which are neither differentially expressed nor correlated.

ranking

Character string, specifying one of the columns of the jointModelResult data frame, based on which the genes will be ranked within the selected subset.

k

Integer, specifying the number of genes, to be returned from the list of top genes. Default is 10.

sigLevel

Numeric between 0 and 1, specifying the level of significance used to select the subset of genes. Default is 0.01.

Details

Returned data frame contains 6 columns, named as "Genes","FP-Effect", "p-adj(Effect)", "Unadj.Asso.","Adj.Asso.", "p-adj(Adj.Asso.)".
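The selection logic can be pictured with a small base R sketch on a mock result table. The function top_k_sketch is hypothetical and is not the package's code. Several choices in it are assumptions: that "adjPeffect1" and "pAdjR" are the columns compared with sigLevel, that "Effect" and "Correlation" exclude genes significant on the other criterion, and that genes are ranked by decreasing absolute value of the ranking column.

```r
# Mock result table (hypothetical values) using column names documented for fitJM.
res <- data.frame(
  CovEffect1  = c(2.1, -1.8, 0.1, 0.05, 1.2),
  adjPeffect1 = c(0.001, 0.002, 0.60, 0.80, 0.004),
  Pearson     = c(0.2, 0.9, 0.85, 0.1, 0.3),
  pAdjR       = c(0.50, 0.001, 0.003, 0.70, 0.40),
  row.names   = paste0("Gene", 1:5)
)

top_k_sketch <- function(res, subset_type, ranking, k = 10, sigLevel = 0.01) {
  effect <- res$adjPeffect1 < sigLevel  # assumed test for differential expression
  corr   <- res$pAdjR < sigLevel        # assumed test for adjusted association
  keep <- switch(subset_type,
    "Effect"                 = effect & !corr,
    "Correlation"            = corr & !effect,
    "Effect and Correlation" = effect & corr,
    "Other"                  = !effect & !corr,
    stop("unknown subset_type: ", subset_type)
  )
  sub <- res[keep, , drop = FALSE]
  # Assumed ranking rule: largest absolute value of the chosen column first.
  sub <- sub[order(abs(sub[[ranking]]), decreasing = TRUE), , drop = FALSE]
  head(sub, k)
}

top_effect <- top_k_sketch(res, "Effect", ranking = "CovEffect1", k = 2)
print(top_effect)
```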

Value

A data frame containing top k genes according to the specified criteria from the specified set of genes.

Examples

## Not run: 
topkGenes(jointModelResult=jmRes,subset_type="Effect",ranking="Pearson",k=10,sigLevel = 0.05)

## End(Not run)

volcano

Description

The volcano function produces the volcano plot for logratio / fp-effect vs corresponding p-values.

Usage

volcano(x, pValue, pointLabels, topPValues = 10, topXvalues = 10,
  smoothScatter = TRUE, xlab = NULL, ylab = NULL, main = NULL,
  newpage = TRUE, additionalPointsToLabel = NULL,
  additionalLabelColor = "red", dir = TRUE)

Arguments

x

Numeric vector of logratios or covariate effect values to be plotted.

pValue

Numeric vector of corresponding p-values obtained from some statistical test.

pointLabels

Character vector providing the texts for the points to be labelled in the plot.

topPValues

Number of top p-values to be labelled. Default value is 10.

topXvalues

Number of top logratios or covariate effect values to be labelled. Default value is 10.

smoothScatter

Logical parameter to decide if a smooth plot is expected or not. Default is TRUE.

xlab

Text for the x-axis of the plot. Default is NULL.

ylab

Text for the y-axis of the plot. Default is NULL.

main

Text for the main title of the plot. Default is NULL.

newpage

Logical parameter. Default is TRUE.

additionalPointsToLabel

Set of points other than the top values to be labelled in the plot. Default is NULL.

additionalLabelColor

Colour of the additionally labelled points. Default colour is red.

dir

Logical parameter deciding if the top values should be in decreasing (= TRUE) or increasing (= FALSE) order. Default is TRUE.

Details

Creates a plot which looks like a volcano with the interesting points labelled within the plot.
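The idea behind the plot can be sketched in base R graphics: effect values on the x-axis, negative log p-values on the y-axis, and labels on the points with the smallest p-values or the largest absolute effects. This is not the volcano function itself. The data are simulated, the use of -log10 and of absolute values for ranking x are assumptions, and smoothing and the dir argument are not reproduced.

```r
# Simulated stand-in data (hypothetical): 200 effect values with p-values.
set.seed(4)
x      <- rnorm(200)
pValue <- runif(200)
labels <- paste0("Gene", seq_along(x))

# Points to label: the 10 smallest p-values and the 10 largest |x| values.
top_p    <- order(pValue)[1:10]
top_x    <- order(abs(x), decreasing = TRUE)[1:10]
to_label <- union(top_p, top_x)

pdf(NULL)  # null device so the sketch also runs non-interactively
plot(x, -log10(pValue), pch = 16, col = "grey50",
     xlab = "FP Effect (alpha)", ylab = "-log10(p-value)")
text(x[to_label], -log10(pValue[to_label]), labels[to_label],
     pos = 3, cex = 0.7)
invisible(dev.off())
```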

Value

A plot which looks like a volcano.

Examples

## Not run: 
volcano(x=jmRes$CovEffect1,pValue=jmRes$rawP1,pointLabels=rownames(jmRes),
topPValues = 10, topXvalues = 10,xlab="FP Effect (alpha)",ylab="-log(p-values)")

## End(Not run)