library(manymodelr)

agg_by_group

As the name suggests, this function provides an easy way to manipulate
grouped data. We can, for instance, count the number of observations in
each group of the yields data set. The formula takes the form x~y, where
y is the grouping variable (in this case normal). One can supply a
formula as shown next.
# Load the yields dataset
data("yields")
head(agg_by_group(yields,.~normal,length))
#> Grouped By[1]:   normal 
#> 
#>   normal height weight yield
#> 1     No    500    500   500
#> 2    Yes    500    500   500
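For comparison, the same kind of grouped summary can be sketched with base R's aggregate(), which accepts a formula of the same x~y shape. This is only an illustration of the formula convention on the built-in mtcars data; it is not a statement about how agg_by_group is implemented.

```r
# Base R sketch: count observations per group with a formula.
# mtcars ships with R, so no extra data set is assumed.
counts <- aggregate(. ~ cyl, data = mtcars, FUN = length)
counts[, c("cyl", "mpg")]

# A single response with two grouping variables on the right-hand side.
sums <- aggregate(cyl ~ hp + vs, data = mtcars, FUN = sum)
head(sums)
```

The dot on the left-hand side stands for every column not named on the right, which is why each remaining column receives its own count.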
head(agg_by_group(mtcars,cyl~hp+vs,sum))
#> Grouped By[2]:   hp vs 
#> 
#>    hp vs cyl
#> 1  91  0   4
#> 2 110  0  12
#> 3 150  0  16
#> 4 175  0  22
#> 5 180  0  24
#> 6 205  0   8

rowdiff

This is useful when trying to find differences between rows. The
direction argument specifies how the subtractions are made,
while the exclude argument specifies the classes of columns that
should be removed before the calculations are made. Using
direction="reverse" performs a subtraction akin to
x-(x-1), where x is the row number, so the first row, which has no
preceding row, is returned as NA.
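The subtraction described above can be sketched in base R on a small made-up data frame. The names df, num and diffs are hypothetical and used only for illustration. The sketch assumes that the reverse direction means "current row minus previous row"; it is not the package's own code.

```r
# Toy data: two numeric columns and one factor column.
df <- data.frame(
  a = c(1, 4, 9),
  b = c(10, 7, 1),
  g = factor(c("x", "y", "z"))
)

# Drop non-numeric columns, similar in spirit to excluding a class.
num <- df[vapply(df, is.numeric, logical(1))]

# Current row minus previous row; the first row has no predecessor, so it is NA.
diffs <- as.data.frame(lapply(num, function(col) c(NA, diff(col))))
diffs
```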
head(rowdiff(yields,exclude = "factor",direction = "reverse"))
#>        height      weight      yield
#> 1          NA          NA         NA
#> 2 -0.04212634  0.24042659 -15.808303
#> 3  0.01516059  0.09649856  11.170825
#> 4  0.25961718  0.03008764   6.578424
#> 5 -0.11495811 -0.02971837 -19.584090
#> 6  0.57638627 -0.42979818   6.825719

na_replace

This allows the user to conveniently replace missing values. Current
options are ffill, which replaces with the next non-missing
value; samples, which samples the data and does replacement; and
value, which allows one to fill NAs with a
specific value. Other common mathematical methods like min,
max, get_mode, sd, etc. are no
longer supported. They are now available with more flexibility in
the standalone mde package.
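As a rough base R illustration of the "fill with a specific value" idea, the sketch below replaces NAs in a small made-up data frame. The names df, filled and as_text are hypothetical. It also shows that filling a numeric column with a string coerces that column to character, which is worth keeping in mind when choosing the fill value.

```r
# Toy data with missing values in both columns.
df <- data.frame(x = c(1, NA, 3), y = c(NA, 5, 6))

# Fill every NA with a fixed numeric value.
filled <- df
filled[is.na(filled)] <- 0

# Filling a numeric column with a string turns the whole column into character.
as_text <- df
as_text$x[is.na(as_text$x)] <- "Missing"
class(as_text$x)
```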
head(na_replace(airquality, how="value", value="Missing"),8)
#>     Ozone Solar.R Wind Temp Month Day
#> 1      41     190  7.4   67     5   1
#> 2      36     118  8.0   72     5   2
#> 3      12     149 12.6   74     5   3
#> 4      18     313 11.5   62     5   4
#> 5 Missing Missing 14.3   56     5   5
#> 6      28 Missing 14.9   66     5   6
#> 7      23     299  8.6   65     5   7
#> 8      19      99 13.8   59     5   8

na_replace_grouped

This provides a convenient way to replace missing values by group.
test_df <- data.frame(A=c(NA,1,2,3), B=c(1,5,6,NA),groups=c("A","A","B","B"))
# Replace NAs by group, using the next non-NA value within each group.
na_replace_grouped(df=test_df,group_by_cols = "groups",how="ffill")
#>   groups A B
#> 1      A 1 1
#> 2      A 1 5
#> 3      B 2 6
#> 4      B 3 6

The use of mean, sd, etc. is no longer
supported. Use mde
instead, which is focused on missingness.
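For readers who only need a quick group-wise statistic, the idea can be sketched in base R with ave(). The helper fill_group_mean is a hypothetical name introduced here for illustration; this is neither the mde API nor part of manymodelr.

```r
test_df <- data.frame(
  A = c(NA, 1, 2, 3),
  B = c(1, 5, 6, NA),
  groups = c("A", "A", "B", "B")
)

# Hypothetical helper: replace NAs in x with the mean of the
# non-missing values of x within each group g.
fill_group_mean <- function(x, g) {
  ave(x, g, FUN = function(v) {
    v[is.na(v)] <- mean(v, na.rm = TRUE)
    v
  })
}

test_df$A <- fill_group_mean(test_df$A, test_df$groups)
test_df$B <- fill_group_mean(test_df$B, test_df$groups)
test_df
```

Because each group here has a single non-missing value next to its NA, the group mean equals that value; on real data the two approaches will generally differ.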