Type: | Package |
Title: | GIS Integration |
Version: | 1.0 |
Date: | 2024-01-07 |
Description: | Designed to facilitate the preprocessing and linking of GIS (Geographic Information System) databases https://www.sciencedirect.com/topics/computer-science/gis-database, the R package 'GISINTEGRATION' offers a robust solution for efficiently preparing GIS data for advanced spatial analyses. This package excels in simplifying intrica procedures like data cleaning, normalization, and format conversion, ensuring that the data are optimally primed for precise and thorough analysis. |
License: | GPL-3 |
Depends: | shapefiles, tm, syn, RecordLinkage |
Imports: | stringr, sf |
Maintainer: | Leila Marvian Mashhad <Leila.marveian@gmail.com> |
Author: | Hossein Hassani [aut], Leila Marvian Mashhad [aut, cre], Sara Stewart [aut], Steve Macfeelys [aut] |
NeedsCompilation: | no |
Packaged: | 2024-01-09 19:45:43 UTC; ne_da |
Repository: | CRAN |
Date/Publication: | 2024-01-10 20:03:03 UTC |
Consulting User
Description
After the pre processing of the data sets by preproc
function, a series of changes were made on the names of the two variables for uniformity. Sometimes these changes of names based on synonyms are not desired by the user. In this function, according to the output of the preproc
function, the user is asked to tell the program that any change in the name of the variables that he does not want.
Usage
chzInput(d1, d2, chz = "NULL")
Arguments
d1 |
A data frame. |
d2 |
A data frame. |
chz |
the number of the name of the variable that the user does not want to change based on the output of the |
Details
For more details about this function, refer to preproc
function manual.
Value
A vector of characters. It is a vector of characters that shows the names of the variables of the second data set based on the opinion of the user who said which variable name should not be changed.
Author(s)
Hossein Hassani and and Leila Marvian Mashhad and Sara Stewart and Steve Macfeelys.
See Also
Examples
d1 = RLdata500
d2 = RLdata10000
chzInput(d1, d2)
Final Data Preprocessing
Description
First, after calling the two data sets, preliminary dat preprocessing is done using preproc
function. Then, according to its output, the user decides which variables should not be renamed. Then this function performs complementary data preprocessing such as sorting the names of the variables, matching the gender variable with different formats, etc. and produces two new data frames.
Usage
create_new_data(d1, d2, chz = "NULL")
Arguments
d1 |
A data frame. |
d2 |
A data frame. |
chz |
the number of the name of the variable that the user does not want to change based on the output of the |
Value
Two data frames.
Author(s)
Hossein Hassani and and Leila Marvian Mashhad and Sara Stewart and Steve Macfeelys.
See Also
Examples
d1 = RLdata500
d2 = RLdata10000
create_new_data(d1, d2)
Preprocessing and Unification of Variable Names of Two Input Data Sets
Description
In this function data preprocessing has been meticulously executed to cover a wide range of datasets, ensuring that variable names are standardized using synonyms.
Usage
preproc(d1, d2)
## S3 method for class 'explain'
print(x,...)
Arguments
d1 |
A data frame. |
d2 |
A data frame. |
x |
an object of class |
... |
further arguments passed to preproc function. |
Details
Because we want users to be able to change their names. The output of this function gives the names and classes that have changed in the new version and the previous version, as well as the number of changes in both datasets. Returns the corresponding number for the chz
argument in the chzInput
function.
Value
preproc
an object of class 'explain'
.
An object of class 'explain'
is a list containing the following components:
Changed variable's names |
Character. |
Changed variable's classes |
Character. |
Initial variable's names |
Character. |
Initial variable's classes |
Character. |
A number of changed variable values for the first dataset are |
Data frame. |
A number of changed variable values for the second dataset |
Data frame. |
Number of changed variable's names |
Vector. |
Note
This function has a comprehensible output if changes have been made on the names of the variables for equalization, otherwise it has no specific output and everything is zero.
In addition, it should be noted that the names of the variables of the second data set are matched and the necessary changes are made based on the first data set.
Author(s)
Hossein Hassani and and Leila Marvian Mashhad and Sara Stewart and Steve Macfeelys.
See Also
Examples
d1 = RLdata500
d2 = RLdata10000
preproc(d1, d2)
GIS Integration
Description
This function enables users to effectively preprocess GIS data before conducting complex spatial analyses. By automating complex processes like data cleaning, normalization, and format transformation, GeoLinkR ensures that data are prepared for precise and reliable analysis.
Usage
preprocLinkageDBF(d1,d2,chz="NULL",var="area",threshold=0.9)
Arguments
d1 |
A data frame. |
d2 |
A data frame. |
chz |
the number of the name of the variable that the user does not want to change based on the output of the |
var |
The vector of the names of the blocked variables that the user chooses based on the output of the |
threshold |
A numeric value between 0 and 1. |
Details
The results are stored in the .dbf file in the system default path.
Value
dbf file.
Author(s)
Hossein Hassani and and Leila Marvian Mashhad and Sara Stewart and Steve Macfeelys.
See Also
Examples
library(sf)
nc1 <- system.file("shape/nc.shp", package="sf")
nc2 <- system.file("shape/nc.shp", package="sf")
nc1 <- st_read(nc1, stringsAsFactors = FALSE)
nc2 <- st_read(nc2, stringsAsFactors = FALSE)
d1 <- data.frame(nc1)
d2 <- data.frame(nc2)
preprocLinkageDBF(d1, d2, var='area')
Display the names of common variables
Description
This function displays the names of common variables based on the create_new_data
function so that the user can give any variable he/she wants as a blocked variable in the preprocLinkage
function.
Usage
selVar(d1, d2, chz = "NULL")
Arguments
d1 |
A data frame. |
d2 |
A data frame. |
chz |
the number of the name of the variable that the user does not want to change based on the output of the |
Value
Character.
Author(s)
Hossein Hassani and and Leila Marvian Mashhad and Sara Stewart and Steve Macfeelys.
See Also
Examples
d1 = RLdata500
d2 = RLdata10000
selVar(d1, d2)