Help for package ISCA

Version:

0.1.0

Maintainer:

Lucas Drouhot <l.g.m.drouhot@uu.nl>

License:

GPL (≥ 3)

Title:

Compare Heterogeneous Social Groups

Description:

The Inductive Subgroup Comparison Approach ('ISCA') offers a way to compare groups that are internally differentiated and heterogeneous. It starts by identifying the social structure of a reference group against which a minority or another group is to be compared, yielding empirical subgroups to which minority members are then matched based on how similar they are. The modelling of specific outcomes then occurs within specific subgroups in which majority and minority members are matched. 'ISCA' is characterized by its data-driven, probabilistic, and iterative approach and combines fuzzy clustering, Monte Carlo simulation, and regression analysis. ISCA_random_assignments() assigns subjects probabilistically to subgroups. ISCA_clustertable() provides summary statistics of each cluster across iterations. ISCA_modeling() provides Ordinary Least Squares regression results for each cluster across iterations. For further details please see Drouhot (2021) <doi:10.1086/712804>.

Encoding:

UTF-8

RoxygenNote:

7.3.1

Depends:

R (≥ 4.3)

Imports:

data.table (≥ 1.16.0), stringr (≥ 1.5.1), e1071 (≥ 1.7-16), Hmisc (≥ 5.1-3), broom (≥ 1.0.7), dplyr (≥ 1.1.4), tidyselect (≥ 1.2.1), stats (≥ 4.3.1), tibble (≥ 3.2.1), plyr (≥ 1.8.9), magrittr (≥ 2.0.3)

Suggests:

knitr, rmarkdown, tidyr (≥ 1.3.0), testthat (≥ 3.0.0)

Config/testthat/edition:

VignetteBuilder:

knitr

LazyData:

true

NeedsCompilation:

Packaged:

2024-09-30 20:09:01 UTC; Marion Späth

Author:

Lucas Drouhot [aut, cre], Marion Späth [aut]

Repository:

CRAN

Date/Publication:

2024-10-02 13:10:10 UTC

Pipe operator

Description

See magrittr::%>% for details.

Usage

lhs %>% rhs

Arguments

lhs

A value or the magrittr placeholder.

rhs

A function call using the magrittr semantics.

Value

The result of calling 'rhs(lhs)'.
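
As a small illustration of these semantics, the sketch below shows that piping a value into a function call gives the same result as the nested call. It assumes the magrittr package is installed; the vector `x` is made up for the example.

```r
library(magrittr)  # provides the %>% operator documented here

x <- c(4, 9, 16)   # hypothetical input, for illustration only

# lhs %>% rhs passes lhs as the first argument of rhs
piped  <- x %>% sqrt() %>% sum()
nested <- sum(sqrt(x))

identical(piped, nested)
```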

ISCA Cluster Tables

Description

Function to create a cluster or descriptive table across iterations.

Usage

ISCA_clustertable(data, cluster_vars, draws = 500)

Arguments

data

The dataset including all relevant variables and the random assignments produced by ISCA_random_assignments() in the first step.

cluster_vars

A vector specifying the variables of interest.

draws

Specification of the number of probabilistic draws. The number of draws should be equal to the number of draws specified in the first step. If not specified, the default is 500.

Value

The output is a table containing the grand mean, grand standard deviation, and cluster error for each variable and cluster. No cluster error is calculated for dichotomous variables.
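
This help page does not give the formulas behind the table. As a rough sketch of what a grand mean across iterations could look like, the example below assumes it is the average of the per-draw cluster means. The toy data and the column names `draw_1` and `draw_2` are hypothetical and need not match the package's output format.

```r
# Toy data: one variable and two hypothetical assignment columns.
# Column names and values are illustrative only.
toy <- data.frame(
  age    = c(20, 30, 40, 50),
  draw_1 = c(1, 1, 2, 2),
  draw_2 = c(1, 2, 2, 2)
)
draw_cols <- c("draw_1", "draw_2")

# For each draw, compute the mean of `age` within each cluster.
# Result: rows are clusters, columns are draws.
per_draw_means <- sapply(draw_cols, function(d) tapply(toy$age, toy[[d]], mean))

# Average the per-draw cluster means across draws (the assumed "grand mean").
grand_mean <- rowMeans(per_draw_means)
grand_mean
```

The cluster error is left out of the sketch because its definition is not given here.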

Examples

data(sim_data)
# Step 1: probabilistic cluster assignments (5 draws instead of the default 500)
ISCA_step1 <- ISCA_random_assignments(
  data = sim_data, filter = native, majority_group = 1, minority_group = c(0),
  fuzzifier = 1.5, n_clusters = 4, draws = 5,
  cluster_vars = c("female", "age", "education", "income")
)
# Step 2: summary table; draws must match step 1
result_ISCA_clustertable <- ISCA_clustertable(
  data = ISCA_step1, draws = 5,
  cluster_vars = c("native", "education", "age", "female",
                   "discrimination", "religiosity")
)

ISCA Modeling

Description

Function to compute an OLS regression across all clusters and iterations.

Usage

ISCA_modeling(data, model_spec, weights = NULL, n_clusters, draws = 500)

Arguments

data

The dataset including all relevant variables and the random assignments produced by ISCA_random_assignments() in the first step.

model_spec

A model specification in the style of an lm() formula, supplied as a character string (for example, "y ~ x1 + x2").

weights

A vector specifying the variable in which the weights are stored. The default is NULL.

n_clusters

Specification of the number of clusters. This value should be equal to the number of clusters specified in the first step (ISCA_random_assignments()).

draws

Specification of the number of probabilistic draws. The number of draws should be equal to the number of draws specified in the first and second step. If not specified, the default is 500.

Value

The output is a table containing the regression coefficient, standard error, and p-value for each regression term and cluster across all iterations. It also contains the regression coefficient, standard error, and p-value for a pooled model, that is, a model with all clusters combined.
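
How the per-draw estimates are combined is not described on this page. The sketch below illustrates the general idea with base R only: fit the same model within each cluster for each draw, then summarize across draws. Simple averaging of the slopes is an assumption, and the toy data and the column names `draw_1` and `draw_2` are hypothetical.

```r
set.seed(1)  # reproducible toy data

# Toy data with two hypothetical assignment columns (names are illustrative).
n <- 200
toy <- data.frame(
  x      = rnorm(n),
  draw_1 = sample(1:2, n, replace = TRUE),
  draw_2 = sample(1:2, n, replace = TRUE)
)
toy$y <- 1 + 2 * toy$x + rnorm(n)

spec <- as.formula("y ~ x")  # model given as a string, as with model_spec
draw_cols <- c("draw_1", "draw_2")

# Fit the model within each cluster for each draw and keep the slope on x.
# Result: rows are clusters, columns are draws.
slopes <- sapply(draw_cols, function(d) {
  sapply(1:2, function(k) {
    fit <- lm(spec, data = toy[toy[[d]] == k, ])
    coef(fit)[["x"]]
  })
})

# Average over draws for each cluster (assumed way of combining estimates).
cluster_slopes <- rowMeans(slopes)

# Pooled model: all clusters combined.
pooled_slope <- coef(lm(spec, data = toy))[["x"]]
```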

Examples

data(sim_data)
# Step 1: probabilistic cluster assignments (5 draws instead of the default 500)
ISCA_step1 <- ISCA_random_assignments(
  data = sim_data, filter = native, majority_group = 1, minority_group = c(0),
  fuzzifier = 1.5, n_clusters = 4, draws = 5,
  cluster_vars = c("female", "age", "education", "income")
)
# Step 3: OLS per cluster and draw; n_clusters and draws must match step 1
ISCA_modeling_res <- ISCA_modeling(
  data = ISCA_step1,
  model_spec = "religiosity ~ native + female + age + education + discrimination",
  n_clusters = 4, draws = 5
)

ISCA Random Assignments per Subgroup

Description

Function that calculates membership scores for each subgroup and assigns a cluster for a number of random draws.

Usage

ISCA_random_assignments(
  data,
  filter,
  majority_group,
  minority_group,
  cluster_vars,
  fuzzifier = 1.5,
  n_clusters,
  draws = 500
)

Arguments

data

A dataset containing all relevant variables.

filter

Specification of the variable name that contains information on majority / minority status.

majority_group

The value of the filter variable that indicates majority status. This can be either a numeric value or a character string.

minority_group

The value(s) of the filter variable that indicate minority status. Each can be either a numeric value or a character string. This can be a single minority group or a vector of several minority groups.

cluster_vars

Vector specifying the variables that should be used to create the clusters.

fuzzifier

A value larger than 1 that determines the extent of overlap between clusters. Larger values give fuzzier, more overlapping clusters, while values approaching 1 make fuzzy c-means behave like hard k-means. The default is 1.5.

n_clusters

Specification of the number of clusters to be created.

draws

Specification of the number of probabilistic draws. If not specified, the default is 500.
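
To make the role of the fuzzifier concrete, the sketch below evaluates the standard fuzzy c-means membership formula for a single point at three fuzzifier values. It is a base R illustration, not the package's internal code; the helper `fcm_membership()` and the distances in `d` are hypothetical.

```r
# Standard fuzzy c-means membership of one point, given its distances
# to each cluster centre and a fuzzifier m > 1.
# Assumes all distances are strictly positive.
fcm_membership <- function(dist_to_centres, m) {
  ratio <- outer(dist_to_centres, dist_to_centres, "/")  # entry [k, j] is d_k / d_j
  1 / rowSums(ratio^(2 / (m - 1)))
}

d <- c(1, 2, 4)  # hypothetical distances to three cluster centres

u_fuzzy   <- fcm_membership(d, m = 3.0)   # memberships spread across clusters
u_default <- fcm_membership(d, m = 1.5)   # the default fuzzifier
u_hard    <- fcm_membership(d, m = 1.05)  # close to a hard assignment

round(rbind(u_fuzzy, u_default, u_hard), 3)
```

Each row of memberships sums to 1, and the share given to the nearest centre grows as the fuzzifier moves toward 1.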

Value

The output is a data frame containing all original variables plus one new column per draw, each holding one random cluster assignment. This data frame is the input to the subsequent functions in the ISCA package.
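
One way to picture the probabilistic assignment is sketched below, assuming each draw samples a cluster with probability equal to the subject's membership score. The membership matrix and the `draw_` column names are made up for illustration and need not match the package's actual output.

```r
set.seed(42)  # reproducible draws

# Hypothetical membership scores: rows are subjects, columns are clusters.
membership <- rbind(
  c(0.90, 0.05, 0.05),
  c(0.40, 0.35, 0.25),
  c(0.10, 0.10, 0.80)
)
n_draws <- 5

# For each subject, draw a cluster n_draws times with probability
# given by that subject's membership scores.
assignments <- t(apply(membership, 1, function(p) {
  sample(seq_along(p), size = n_draws, replace = TRUE, prob = p)
}))
colnames(assignments) <- paste0("draw_", seq_len(n_draws))  # illustrative names

assignments
```

Subjects with one dominant membership score land in the same cluster in most draws, while subjects with balanced scores move between clusters from draw to draw.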

Examples

data(sim_data)
# Natives (native == 1) form the majority; immigrants (native == 0) are the minority
ISCA_step1 <- ISCA_random_assignments(
  data = sim_data, filter = native, majority_group = 1, minority_group = c(0),
  fuzzifier = 1.5, n_clusters = 4, draws = 5,
  cluster_vars = c("female", "age", "education", "income")
)

Cross-sectional, artificial data on 1000 individuals

Description

Small, artificially created dataset in a cross-sectional format. Provides information on 1000 individuals to illustrate the use of the package.

Usage

data(sim_data)

Format

A data frame with 1,000 rows and 7 columns:

female: Dichotomous variable (0/1) indicating a person's sex
age: value indicating a person's age, range 18-80
education: value indicating a person's level of education, range 1-9
income: value indicating a person's income
religiosity: value indicating a person's level of religiosity, range 1-10
discrimination: value indicating a person's level of experienced discrimination, range 0-8
native: Dichotomous variable (0/1) indicating whether a person is a native (1) or an immigrant (0)

References

The data was artificially created for the ISCA package.

Examples


data(sim_data)
head(sim_data)