Type: Package
Title: Detection of Statistically Significant Combinations of SNPs in Association Mapping
Version: 0.6.1
Description: A significant pattern mining-based toolbox for region-based genome-wide association studies and higher-order epistasis analyses, implementing the methods described in Llinares-López et al. (2017) <doi:10.1093/bioinformatics/btx071>.
Depends: R (≥ 3.0.2)
Imports: methods, Rcpp
LinkingTo: Rcpp
Encoding: UTF-8
LazyData: true
License: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
NeedsCompilation: yes
RoxygenNote: 6.0.1
SystemRequirements: C++11
Suggests: testthat, knitr, rmarkdown
Author: Felipe Llinares-López [aut, cph], Laetitia Papaxanthos [aut, cph], Damian Roqueiro [aut, cph], Matthew Baker [ctr], Mikołaj Rybiński [ctr], Uwe Schmitt [ctr], Dean Bodenham [aut, cre, cph], Karsten Borgwardt [aut, fnd, cph]
Maintainer: Dean Bodenham <deanbodenhambsse@gmail.com>
VignetteBuilder: knitr
Packaged: 2020-05-04 18:15:36 UTC; dean
Repository: CRAN
Date/Publication: 2020-05-05 18:10:02 UTC
Constructor for CASMAP class object.
Description
Constructor for CASMAP class object.
Details
Constructor for CASMAP class object, which needs the mode
parameter to be set by the user. Please see the examples.
Fields
mode
Either 'regionGWAS' or 'higherOrderEpistasis'.
alpha
A numeric value setting the Family-wise Error Rate (FWER). Must be strictly between 0 and 1. Default value is 0.05.
max_comb_size
A numeric specifying the maximum length of combinations. For example, if set to 4, then only combinations of size between 1 and 4 (inclusive) will be considered. To consider combinations of arbitrary (maximal) length, use value 0, which is the default value.
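The fields above can be set when the object is created. The following is a minimal sketch, assuming that the constructor accepts alpha and max_comb_size as named arguments alongside mode (the field list does not show the constructor signature) and that the CASMAP package is installed. The list name casmap_args is only illustrative.

```r
# Settings chosen to respect the constraints in the field descriptions.
casmap_args <- list(
  mode = "higherOrderEpistasis",  # or "regionGWAS"
  alpha = 0.01,                   # target FWER, strictly between 0 and 1
  max_comb_size = 3               # combinations of size 1 to 3; 0 means no limit
)

# Only attempt construction when the package is available.
if (requireNamespace("CASMAP", quietly = TRUE)) {
  # Assumption: the fields can be passed by name to the constructor.
  obj <- CASMAP::CASMAP(mode = casmap_args$mode,
                        alpha = casmap_args$alpha,
                        max_comb_size = casmap_args$max_comb_size)
}
```

Leaving alpha and max_comb_size out would fall back to the documented defaults of 0.05 and 0.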
Base methods, for both modes
readFiles
Read the data, label and (optionally) covariates files. Parameters are genotype_file for the data, phenotype_file for the labels and (optional) covariate_file for the covariates. The option plink_file_root is not supported in the current version, but will be supported in future versions.
setMode
Can set/change the mode, but note that any data files will need to be read in again using the readFiles command.
setTargetFWER
Can set/change the Family-wise Error Rate (FWER). Takes a numeric parameter alpha, strictly between 0 and 1.
execute
Once the data files have been read, executes the algorithm. Please note that, depending on the size of the data files, this could take a long time.
getSummary
Returns a data frame with a summary of the results from the execution, but not any significant regions/itemsets. See getSignificantRegions, getSignificantInteractions, and getSignificantClusterRepresentatives.
writeSummary
Directly writes the information from getSummary to file.
regionGWAS
Methods
getSignificantRegions
Returns a data frame with the significant regions. Only valid when mode='regionGWAS'.
getSignificantClusterRepresentatives
Returns a data frame with the representatives of the significant clusters. This will be a subset of the regions returned from getSignificantRegions. Only valid when mode='regionGWAS'.
writeSignificantRegions
Writes the data from getSignificantRegions to file, which must be specified in the parameter path. Only valid when mode='regionGWAS'.
writeSignificantClusterRepresentatives
Writes the data from getSignificantClusterRepresentatives to file, which must be specified in the parameter path. Only valid when mode='regionGWAS'.
higherOrderEpistasis
Methods
getSignificantInteractions
Returns a data frame with the significant interactions. Only valid when mode='higherOrderEpistasis'.
writeSignificantInteractions
Writes the data frame from getSignificantInteractions to file, which must be specified in the parameter path. Only valid when mode='higherOrderEpistasis'.
References
A. Terada, M. Okada-Hatakeyama, K. Tsuda and J. Sese, Statistical significance of combinatorial regulations, Proceedings of the National Academy of Sciences (2013) 110 (32): 12996-13001.
F. Llinares-Lopez, D. G. Grimm, D. Bodenham, U. Gieraths, M. Sugiyama, B. Rowan and K. Borgwardt, Genome-wide detection of intervals of genetic heterogeneity associated with complex traits, ISMB 2015, Bioinformatics (2015) 31 (12): i240-i249.
L. Papaxanthos, F. Llinares-Lopez, D. Bodenham and K. Borgwardt, Finding significant combinations of features in the presence of categorical covariates, Advances in Neural Information Processing Systems 29 (NIPS 2016), 2271-2279.
F. Llinares-Lopez, L. Papaxanthos, D. Bodenham, D. Roqueiro and K. Borgwardt, Genome-wide genetic heterogeneity discovery with categorical covariates, Bioinformatics (2017) 33 (12): 1820-1828.
Examples
## An example using the "regionGWAS" mode
fastcmh <- CASMAP(mode="regionGWAS") # initialise object
datafile <- getExampleDataFilename() # file name of example data
labelsfile <- getExampleLabelsFilename() # file name of example labels
covfile <- getExampleCovariatesFilename() # file name of example covariates
# read the data, labels and covariate files
fastcmh$readFiles(genotype_file=getExampleDataFilename(),
phenotype_file=getExampleLabelsFilename(),
covariate_file=getExampleCovariatesFilename() )
# execute the algorithm (this may take some time)
fastcmh$execute()
#get the summary results
summary_results <- fastcmh$getSummary()
#get the significant regions
sig_regions <- fastcmh$getSignificantRegions()
#get the clustered representatives for the significant regions
sig_cluster_rep <- fastcmh$getSignificantClusterRepresentatives()
## Another example of regionGWAS
fais <- CASMAP(mode="regionGWAS") # initialise object
# read the data and labels, but no covariates
fais$readFiles(genotype_file=getExampleDataFilename(),
               phenotype_file=getExampleLabelsFilename())
## Another example, doing higher order epistasis search
facs <- CASMAP(mode="higherOrderEpistasis") # initialise object
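The higher-order epistasis example above stops after initialisation. The following is a sketch of how it might continue, using only methods listed under "higherOrderEpistasis". It assumes that the example data and labels files can also be read in this mode, that max_comb_size can be passed to the constructor (used here only to keep the search short), and that the CASMAP package is installed. The variable search_mode and the temporary output file are illustrative.

```r
search_mode <- "higherOrderEpistasis"

if (requireNamespace("CASMAP", quietly = TRUE)) {
  library(CASMAP)
  # Assumption: max_comb_size is accepted as a constructor argument.
  facs <- CASMAP(mode = search_mode, max_comb_size = 2)
  # read the data and labels, but no covariates
  facs$readFiles(genotype_file = getExampleDataFilename(),
                 phenotype_file = getExampleLabelsFilename())
  # execute the algorithm (this may take some time)
  facs$execute()
  # get the significant interactions as a data frame
  sig_interactions <- facs$getSignificantInteractions()
  # write the same information to a file given by the parameter path
  out_file <- tempfile(fileext = ".txt")
  facs$writeSignificantInteractions(path = out_file)
}
```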
Global variables environment
Description
An environment to store a few global variables. Internal.
Usage
CASMAPenv
Format
An object of class environment of length 3.
Approximate fast significant interval search
Description
Class for approximate significant intervals search with Tarone correction for bounding intermediate FWERs.
Internal class for search for significant regions
Description
Please use the CASMAP constructor.
Fast significant interval search with categorical covariates
Description
Internal class, please use the CASMAP constructor.
Significant itemsets search with categorical covariates
Description
Internal class, please use the CASMAP constructor.
Check if a variable is boolean or not
Description
Checks whether a variable is boolean. If it is not, an error is thrown; otherwise its boolean value is returned.
Usage
checkIsBoolean(var, name)
Arguments
var
The variable to be checked (if boolean).
name
The name of the variable to appear in any error message.
Value
If var is neither boolean nor NA, throws an error. If var is NA, returns FALSE. Otherwise returns the boolean value of var.
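The rules in the Value section can be sketched as follows. This is a hypothetical re-implementation for illustration, not the package's source; the function name check_is_boolean_sketch, the wording of the error message and the requirement that var has length one are assumptions.

```r
# Hypothetical sketch of the behaviour described for checkIsBoolean.
check_is_boolean_sketch <- function(var, name) {
  # Anything that is not a single logical value (TRUE, FALSE or NA) is rejected.
  if (!is.logical(var) || length(var) != 1) {
    stop(paste0("'", name, "' must be TRUE or FALSE"), call. = FALSE)
  }
  # NA is mapped to FALSE; TRUE and FALSE pass through unchanged.
  if (is.na(var)) FALSE else var
}

check_is_boolean_sketch(TRUE, "verbose")  # TRUE
check_is_boolean_sketch(NA, "verbose")    # FALSE
```

Passing a non-logical value such as "yes" would stop with an error that mentions the supplied name.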
Get the path to the example covariates file for regionGWAS mode
Description
Path to CASMAP_example_covariates_1.txt in inst/extdata.
The covariate categories for the data set CASMAP_example_data_1.txt, the path to which is given by getExampleDataFilename.
Usage
getExampleCovariatesFilename()
Format
A single column vector of 100 labels, each of which is 0 or 1 (same format as labels file).
Details
Path to the file containing the covariates, for reading into a CASMAP object using the readFiles function.
See Also
getExampleDataFilename, getExampleLabelsFilename
Examples
covfile <- getExampleCovariatesFilename()
Get the path to the example data file for regionGWAS mode
Description
Path to CASMAP_example_data_1.txt in inst/extdata.
A dataset containing binary samples for the regionGWAS method.
There are accompanying labels and covariates datasets.
Usage
getExampleDataFilename()
Format
A matrix of 0s and 1s, with 1000 rows (features) and 100 columns (samples). In other words, each column is a sample, and each sample has 1000 binary features.
Details
Path to the file containing the data, for reading into a CASMAP object using the readFiles function.
Note that the significant region is [99, 102].
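To make the layout concrete, the sketch below builds a synthetic matrix with the same shape (features in rows, samples in columns) and a matching label vector. The values are randomly generated for illustration and are not the package's example data. It also collapses a region into one indicator per sample by asking whether any feature in the region equals 1, which is how the interval-based methods cited in the References summarise an interval. Treating [99, 102] as 1-based row indices is an assumption.

```r
set.seed(42)
n_features <- 1000
n_samples <- 100

# Synthetic binary data: one row per feature, one column per sample.
X <- matrix(rbinom(n_features * n_samples, size = 1, prob = 0.3),
            nrow = n_features, ncol = n_samples)
# Synthetic binary labels: one per sample (column of X).
y <- rbinom(n_samples, size = 1, prob = 0.5)

# Assumption: region bounds are 1-based row indices.
region <- 99:102
# One value per sample: 1 if any feature in the region is 1, else 0.
region_indicator <- as.integer(colSums(X[region, , drop = FALSE]) > 0)

# Contingency table of the region indicator against the labels.
table(region_indicator, y)
```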
See Also
getExampleLabelsFilename, getExampleCovariatesFilename
Examples
datafile <- getExampleDataFilename()
Get the path to the example labels file for regionGWAS mode
Description
Path to CASMAP_example_labels_1.txt in inst/extdata.
A dataset containing the binary labels for the data in the file CASMAP_example_data_1.txt, the path to which is given by getExampleDataFilename.
Usage
getExampleLabelsFilename()
Format
A single column of 100 labels, each of which is either 0 or 1.
Details
Path to the file containing the labels, for reading into a CASMAP object using the readFiles function.
See Also
getExampleDataFilename, getExampleCovariatesFilename
Examples
labelsfile <- getExampleLabelsFilename()
Get the path to the example significant intervals file
Description
Path to CASMAP_example_covariates_1.txt in inst/extdata.
Usage
getExampleSignificantRegionsFilename()
Examples
sigregfile <- getExampleSignificantRegionsFilename()
Gets the higherOrderEpistasis string
Description
A getter for the global higherOrderEpistasis value, a string for the mode parameter.
Usage
getHigherOrderEpistasisString()
Gets the minModeLength
Description
A getter for the global minModeLength value, the minimum character length of the mode parameter (should be 3).
Usage
getMinModeLength()
Get the function name
Description
Uses match.call and as.character.
Usage
getParentFunctionName()
Gets the regionGWAS string
Description
A getter for the global regionGWAS value, a string for the mode parameter.
Usage
getRegionGWASString()
Checks if substring is part of higherOrderEpistasis
Description
Uses grep to search through a vector of strings.
Usage
isHigherOrderEpistasisString(x)
Arguments
x
The string which will be compared to 'higherOrderEpistasis'.
Details
Uses grep to search for exact match.
Value
TRUE if the string is a substring of 'higherOrderEpistasis', otherwise returns FALSE.
A method to check value is numeric and in open interval
Description
Checks if a value is numeric and strictly between two other values.
Usage
isInOpenInterval(x, lower = 0, upper = 1)
Arguments
x
Value to be checked. Needs to be numeric.
lower
Lower bound. Default value is 0.
upper
Upper bound. Default value is 1.
Value
If numeric, and strictly greater than lower and strictly smaller than upper, then return TRUE. Else return FALSE.
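The check described above can be sketched in a few lines. This is a hypothetical re-implementation for illustration, not the package's source; the name is_in_open_interval_sketch and the handling of NA and of vectors longer than one (both return FALSE) are assumptions.

```r
# Hypothetical sketch of an open-interval check with the documented defaults.
is_in_open_interval_sketch <- function(x, lower = 0, upper = 1) {
  # && short-circuits, so the comparisons only run on a single non-NA number.
  is.numeric(x) && length(x) == 1 && !is.na(x) && x > lower && x < upper
}

is_in_open_interval_sketch(0.05)   # TRUE: strictly inside (0, 1)
is_in_open_interval_sketch(1)      # FALSE: the bounds are excluded
is_in_open_interval_sketch("0.5")  # FALSE: not numeric
```

A check of this kind matches the requirement that alpha lies strictly between 0 and 1.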
Checks if substring is part of regionGWAS
Description
Uses grepl to compare strings, ignoring case.
Usage
isRegionGWASString(x)
Arguments
x
The string which will be compared to 'regionGWAS'.
Details
Uses grepl to search for exact match. Case will be ignored.
Value
TRUE if the string is a substring of 'regionGWAS', otherwise returns FALSE.
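A case-insensitive substring test of this kind can be sketched with grepl. This is a hypothetical re-implementation for illustration, not the package's source; the name is_region_gwas_string_sketch and the choice to lower-case both strings and match literally are assumptions.

```r
# Hypothetical sketch: is x a substring of "regionGWAS", ignoring case?
is_region_gwas_string_sketch <- function(x) {
  # fixed = TRUE treats x as a literal string rather than a regular expression;
  # lower-casing both sides makes the comparison case-insensitive.
  grepl(tolower(x), tolower("regionGWAS"), fixed = TRUE)
}

is_region_gwas_string_sketch("region")     # TRUE
is_region_gwas_string_sketch("GWAS")       # TRUE
is_region_gwas_string_sketch("epistasis")  # FALSE
```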
Internal function
Description
Internal function
Usage
lib_delete_search_chi(inst)
Internal function
Description
Internal function
Usage
lib_delete_search_e(inst)
Internal function
Description
Internal function
Usage
lib_delete_search_facs(inst)
Internal function
Description
Internal function
Usage
lib_delete_search_fastcmh(inst)
Internal function
Description
Internal function
Usage
lib_execute_int(inst, alpha, l_max)
Internal function
Description
Internal function
Usage
lib_execute_iset(inst, alpha, l_max)
Internal function
Description
Internal function
Usage
lib_filter_intervals_write_to_file(inst, output_file)
Internal function
Description
Internal function
Usage
lib_get_filtered_intervals(inst)
Internal function
Description
Internal function
Usage
lib_get_result_facs(inst)
Internal function
Description
Internal function
Usage
lib_get_result_fais(inst)
Internal function
Description
Internal function
Usage
lib_get_result_int(inst)
Internal function
Description
Internal function
Usage
lib_get_result_iset(inst)
Internal function
Description
Internal function
Usage
lib_get_significant_intervals(inst)
Internal function
Description
Internal function
Usage
lib_get_significant_itemsets(inst)
Internal function
Description
Internal function
Usage
lib_new_search_chi()
Internal function
Description
Internal function
Usage
lib_new_search_e()
Internal function
Description
Internal function
Usage
lib_new_search_facs()
Internal function
Description
Internal function
Usage
lib_new_search_fastcmh()
Internal function
Description
Internal function
Usage
lib_profiler_write_to_file(inst, output_file)
Internal function
Description
Internal function
Usage
lib_pvals_significant_ints_write_to_file(inst, output_file)
Internal function
Description
Internal function
Usage
lib_pvals_significant_isets_write_to_file(inst, output_file)
Internal function
Description
Internal function
Usage
lib_pvals_testable_ints_write_to_file(inst, output_file)
Internal function
Description
Internal function
Usage
lib_pvals_testable_isets_write_to_file(inst, output_file)
Internal function
Description
Internal function
Usage
lib_read_covariates_file_facs(inst, cov_filename)
Internal function
Description
Internal function
Usage
lib_read_covariates_file_fastcmh(inst, cov_filename)
Internal function
Description
Internal function
Usage
lib_read_eth_files(inst, x_filename, y_filename, encoding)
Internal function
Description
Internal function
Usage
lib_read_eth_files_with_cov_facs(inst, x_filename, y_filename, covfilename,
encoding)
Internal function
Description
Internal function
Usage
lib_read_eth_files_with_cov_fastcmh(inst, x_filename, y_filename, covfilename,
encoding)
Internal function
Description
Internal function
Usage
lib_read_plink_files(inst, base_filename, encoding)
Internal function
Description
Internal function
Usage
lib_read_plink_files_with_cov_facs(inst, base_filename, covfilename, encoding)
Internal function
Description
Internal function
Usage
lib_read_plink_files_with_cov_fastcmh(inst, base_filename, covfilename,
encoding)
Internal function
Description
Internal function
Usage
lib_summary_write_to_file_facs(inst, output_file)
Internal function
Description
Internal function
Usage
lib_summary_write_to_file_fais(inst, output_file)
Internal function
Description
Internal function
Usage
lib_summary_write_to_file_fastcmh(inst, output_file)
Internal function
Description
Internal function
Usage
lib_write_eth_files_int(inst, x_filename, y_filename)
Internal function
Description
Internal function
Usage
lib_write_eth_files_iset(inst, x_filename, y_filename)
Internal function
Description
Internal function
Usage
lib_write_eth_files_with_cov_facs(inst, x_filename, y_filename, covfilename)
Internal function
Description
Internal function
Usage
lib_write_eth_files_with_cov_fastcmh(inst, x_filename, y_filename, covfilename)
Internal class
Description
An internal class.
Internal class
Description
Internal class
Internal class
Description
An internal class
Internal class
Description
An internal class.
Internal class
Description
Internal class
Error message for mode
Description
Return the appropriate error message for incorrect mode input
Usage
modeErrorMessage()
Error message for mode, if too short
Description
Return the appropriate error message for incorrect mode input
Usage
modeLengthErrorMessage()
Checks mode string is long enough
Description
Checks mode string is at least minimum length
Usage
modeNeedsMoreChars(mode)
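A length check of this kind can be sketched as follows. This is a hypothetical re-implementation for illustration, not the package's source; the name mode_needs_more_chars_sketch and the convention of returning TRUE when the string is too short are assumptions, while the minimum length of 3 is the value given in the getMinModeLength description.

```r
# Minimum number of characters, as stated for getMinModeLength.
min_mode_length <- 3

# Hypothetical sketch: TRUE if the mode string is shorter than the minimum.
mode_needs_more_chars_sketch <- function(mode) {
  nchar(mode) < min_mode_length
}

mode_needs_more_chars_sketch("re")   # TRUE: only 2 characters
mode_needs_more_chars_sketch("reg")  # FALSE: meets the minimum of 3
```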