Title: | Automatically Builds 20 Classification Models |
Version: | 0.5.0 |
Description: | Automatically builds 20 classification models from data. The package returns 26 plots, 5 tables and a summary report. The package automatically builds 12 individual classification models, including error (RMSE) and predictions. That data is used to create an ensemble, which is then modeled using 8 methods. The process is repeated as many times as the user requests. The mean of the results is presented in a summary table. The package returns the confusion matrices for all 20 models, tables of the correlation of the numeric data, the results of the variance inflation process, the head of the ensemble and the head of the data frame. |
License: | MIT + file LICENSE |
Depends: | C50, car, caret, corrplot, doParallel, dplyr, e1071, ggplot2, gt, ipred, MachineShop, magrittr, parallel, pls, purrr, R (≥ 2.10), randomForest, ranger, reactable, reactablefmtr, scales, tidyr, tree |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
LazyData: | true |
Suggests: | knitr, rmarkdown |
VignetteBuilder: | knitr |
URL: | https://github.com/InfiniteCuriosity/ClassificationEnsembles |
BugReports: | https://github.com/InfiniteCuriosity/ClassificationEnsembles/issues |
NeedsCompilation: | no |
Packaged: | 2025-03-30 22:25:56 UTC; russellconte |
Author: | Russ Conte [aut, cre, cph] |
Maintainer: | Russ Conte <russconte@mac.com> |
Repository: | CRAN |
Date/Publication: | 2025-04-01 16:10:05 UTC |
Carseats data
Description
This is the Carseats data as shown in the ISLR package.
Usage
Carseats
Format
Carseats A simulated data set with 400 observations on 11 variables
- Sales
Unit sales (in thousands) at each location
- CompPrice
Price charged by competitor at each location
- Income
Community income level (in thousands of dollars)
- Advertising
Local advertising budget for company at each location (in thousands of dollars)
- Population
Population size in region (in thousands)
- Price
Price company charges for car seats at each site
- ShelveLoc
A factor with levels Bad, Good and Medium indicating the quality of the shelving location for the car seats at each site
- Age
Average age of the local population
- Urban
A factor with levels No and Yes to indicate whether the store is in an urban or rural location
- US
A factor with levels No and Yes to indicate whether the store is in the US or not
Source
ISLR data set, https://www.rdocumentation.org/packages/ISLR/versions/1.4/topics/Carseats
Classification: a function that performs classification analysis and returns the results to the user
Description
Classification: a function that performs classification analysis and returns the results to the user.
Usage
Classification(
data,
colnum,
numresamples,
predict_on_new_data = c("Y", "N"),
remove_VIF_above,
scale_all_numeric_predictors_in_data,
how_to_handle_strings = c(0("No strings"), 1("Strings as factors")),
save_all_trained_models = c("Y", "N"),
save_all_plots,
use_parallel = c("Y", "N"),
train_amount,
test_amount,
validation_amount
)
Arguments
data |
a data set that includes classification data. For example, the Carseats data in the ISLR package |
colnum |
the column number of the target (class) variable. For example, in the Carseats data this is column 7, ShelveLoc, which has three values: Good, Medium and Bad |
numresamples |
the number of times to resample the analysis |
predict_on_new_data |
Gives the user the option to use the trained models to make predictions on new data that the models have not seen |
remove_VIF_above |
Removes columns with Variance Inflation Factors above the level chosen by the user |
scale_all_numeric_predictors_in_data |
Scales all numeric predictors in the original data |
how_to_handle_strings |
Converts strings to factor levels |
save_all_trained_models |
Gives the user the option to save all trained models in the Environment |
save_all_plots |
Saves all plots in the user's chosen format |
use_parallel |
"Y" or "N" for parallel processing |
train_amount |
set the amount for the training data |
test_amount |
set the amount for the testing data |
validation_amount |
Set the amount for the validation data |
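The effect of remove_VIF_above can be illustrated with a minimal base-R sketch. This is not the package's own code: the package depends on car and may compute variance inflation factors differently. The helper names vif_one and drop_high_vif, the threshold of 5, the one-column-at-a-time removal and the use of mtcars are all assumptions made for illustration.

```r
# Base-R sketch of a variance inflation factor (VIF) filter.
# Assumptions: all predictors are numeric, and `threshold` is a hypothetical
# stand-in for the value passed as remove_VIF_above.
vif_one <- function(df, j) {
  # Regress predictor j on the remaining predictors, then convert R^2 to a VIF.
  fit <- lm(reformulate(setdiff(names(df), j), response = j), data = df)
  1 / (1 - summary(fit)$r.squared)
}

drop_high_vif <- function(df, threshold) {
  repeat {
    if (ncol(df) < 2) break  # a VIF needs at least one other predictor
    vifs <- vapply(names(df), function(j) vif_one(df, j), numeric(1))
    if (max(vifs) <= threshold) break
    # Drop the single worst column and recompute, since VIFs change
    # whenever a predictor is removed.
    df <- df[, names(df) != names(which.max(vifs)), drop = FALSE]
  }
  df
}

predictors <- mtcars[, c("disp", "hp", "wt", "drat", "qsec")]
kept <- drop_high_vif(predictors, threshold = 5)
names(kept)
```

Removing one column per pass is a common convention because dropping a collinear predictor usually lowers the VIFs of the others.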
Value
a full analysis, including data visualizations, statistical summaries, and a full report on the results of the 20 models on the data
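A call might look like the sketch below. Only the argument names are taken from the Usage section; every value is an assumption chosen for illustration, including the "N" settings for scale_all_numeric_predictors_in_data and save_all_plots, and the reading of the three amounts as proportions that sum to 1. The call is guarded so that it runs only in an interactive session with the package installed.

```r
# Hypothetical settings; the three amounts are assumed to be proportions.
amounts <- c(train_amount = 0.60, test_amount = 0.20, validation_amount = 0.20)
stopifnot(isTRUE(all.equal(sum(amounts), 1)))

if (interactive() && requireNamespace("ClassificationEnsembles", quietly = TRUE)) {
  library(ClassificationEnsembles)
  Classification(
    data = Carseats,
    colnum = 7,                                  # ShelveLoc, per the colnum entry
    numresamples = 2,
    predict_on_new_data = "N",
    remove_VIF_above = 5,                        # assumed threshold
    scale_all_numeric_predictors_in_data = "N",  # assumed accepted value
    how_to_handle_strings = 0,
    save_all_trained_models = "N",
    save_all_plots = "N",                        # assumed accepted value
    use_parallel = "N",
    train_amount = amounts[["train_amount"]],
    test_amount = amounts[["test_amount"]],
    validation_amount = amounts[["validation_amount"]]
  )
}
```

A small numresamples keeps a first run short; a larger value gives more stable means in the summary table.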
Maternal Health Risk
Description
Data were collected from hospitals, community clinics and maternal health care centers in rural areas of Bangladesh through an IoT-based risk monitoring system.
Usage
Maternal_Health_Risk
Format
Maternal_Health_Risk A data set with the variables Age, SystolicBP (systolic blood pressure), DiastolicBP (diastolic blood pressure), BS (blood sugar), BodyTemp (body temperature), HeartRate and RiskLevel. These are significant risk factors for maternal mortality, which is one of the main concerns of the UN Sustainable Development Goals (SDGs).
- Age
Age in years of the woman during pregnancy.
- SystolicBP
Upper value of Blood Pressure in mmHg, another significant attribute during pregnancy.
- DiastolicBP
Lower value of Blood Pressure in mmHg, another significant attribute during pregnancy.
- BS
Blood glucose level, expressed as a molar concentration
- BodyTemp
Body temperature in Fahrenheit
- HeartRate
A normal resting heart rate
- RiskLevel
Predicted risk intensity level during pregnancy, based on the preceding attributes.
Dry Beans small
Description
This is a stratified sample of the full dry beans data set, containing about 6 percent of its rows (813 of 13,611).
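How a stratified subset of this kind can be drawn is sketched below in base R. This is not the procedure used to build dry_beans_small, which the documentation does not describe. The function stratified_sample, the fraction of 0.2 and the use of iris as stand-in data are assumptions made for illustration.

```r
# Sketch of stratified sampling: draw the same fraction from every class
# so that class proportions in the subset match the full data.
set.seed(1)  # reproducible draw

stratified_sample <- function(df, class_col, frac) {
  rows_by_class <- split(seq_len(nrow(df)), df[[class_col]])
  keep <- unlist(lapply(rows_by_class, function(idx) {
    n <- max(1, round(length(idx) * frac))  # keep at least one row per class
    idx[sample.int(length(idx), n)]         # sample positions, then map to rows
  }), use.names = FALSE)
  df[sort(keep), , drop = FALSE]
}

small <- stratified_sample(iris, "Species", frac = 0.2)
table(small$Species)  # 10 rows per species, since each class has 50 rows
```

Sampling positions with sample.int avoids the surprise that sample(x, n) treats a length-one numeric x as the range 1:x.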
Usage
dry_beans_small
Format
dry_beans_small A reduced version with 813 rows and 17 columns of the full data set available on UCI: https://archive.ics.uci.edu/dataset/602/dry+bean+dataset
- Area
The area of a bean zone and the number of pixels within its boundaries
- Perimeter
Bean circumference is defined as the length of its border
- MajorAxisLength
The distance between the ends of the longest line that can be drawn from a bean
- MinorAxisLength
The longest line that can be drawn from the bean while standing perpendicular to the main axis
- AspectRatio
Defines the relationship between MajorAxisLength and MinorAxisLength
- Eccentricity
Eccentricity of the ellipse having the same moments as the region
- ConvexArea
Number of pixels in the smallest convex polygon that can contain the area of a bean seed
- EquivDiameter
Equivalent diameter: The diameter of a circle having the same area as a bean seed area
- Extent
The ratio of the pixels in the bounding box to the bean area
- Solidity
Also known as convexity. The ratio of the pixels in the convex shell to those found in beans.
- Roundness
Calculated with the following formula: (4 * pi * A) / P^2, where A is the Area and P is the Perimeter
- Compactness
Measures the roundness of an object
- ShapeFactor1
Continuous value
- ShapeFactor2
Continuous value
- ShapeFactor3
Continuous value
- ShapeFactor4
Continuous value
- Class
One of seven bean varieties: Seker, Barbunya, Bombay, Cali, Dermason, Horoz or Sira
Source
https://archive.ics.uci.edu/dataset/602/dry+bean+dataset
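Some of the derived shape features listed above can be checked by hand. The sketch below applies the Roundness formula and the EquivDiameter definition to a perfect circle, where both have known values. The function names are hypothetical, and reading AspectRatio as MajorAxisLength divided by MinorAxisLength is an assumption, since the entry above only says it relates the two.

```r
# Shape features computed from the definitions in the Format list.
roundness      <- function(area, perimeter) (4 * pi * area) / perimeter^2
equiv_diameter <- function(area) sqrt(4 * area / pi)  # circle with the same area
aspect_ratio   <- function(major, minor) major / minor  # assumed definition

# A circle of radius 10 (in pixels) as a sanity check.
r <- 10
area <- pi * r^2
perimeter <- 2 * pi * r

roundness(area, perimeter)   # 1 for a perfect circle, up to floating-point error
equiv_diameter(area)         # 20, the circle's own diameter
aspect_ratio(2 * r, 2 * r)   # 1, both axes are equal
```

Elongated beans have a longer perimeter for the same area, so their Roundness falls below 1.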