ProductPromotion
Logo

R Programming

made by https://0x3d.site

GitHub - boxuancui/DataExplorer: Automate Data Exploration and Treatment
Automate Data Exploration and Treatment. Contribute to boxuancui/DataExplorer development by creating an account on GitHub.
Visit Site

GitHub - boxuancui/DataExplorer: Automate Data Exploration and Treatment

GitHub - boxuancui/DataExplorer: Automate Data Exploration and Treatment

DataExplorer

CRAN
Version Downloads Total
Downloads GitHub
Stars R-CMD-check codecov CII Best
Practices Contributor
Covenant

Background

Exploratory Data Analysis (EDA) is the initial and an important phase of data analysis/predictive modeling. During this process, analysts/modelers will have a first look of the data, and thus generate relevant hypotheses and decide next steps. However, the EDA process could be a hassle at times. This R package aims to automate most of data handling and visualization, so that users could focus on studying the data and extracting insights.

Installation

The package can be installed directly from CRAN.

install.packages("DataExplorer")

However, the latest stable version (if any) could be found on GitHub, and installed using devtools package.

if (!require(devtools)) install.packages("devtools")
devtools::install_github("boxuancui/DataExplorer")

If you would like to install the latest development version, you may install the develop branch.

if (!require(devtools)) install.packages("devtools")
devtools::install_github("boxuancui/DataExplorer", ref = "develop")

Examples

The package is extremely easy to use. Almost everything could be done in one line of code. Please refer to the package manuals for more information. You may also find the package vignettes here.

Report

To get a report for the airquality dataset:

library(DataExplorer)
create_report(airquality)

To get a report for the diamonds dataset with response variable price:

library(ggplot2)
create_report(diamonds, y = "price")

Visualization

Instead of running create_report, you may also run each function individually for your analysis, e.g.,

## View basic description for airquality data
introduce(airquality)
rows 153
columns 6
discrete_columns 0
continuous_columns 6
all_missing_columns 0
total_missing_values 44
complete_rows 111
total_observations 918
memory_usage 6,376
## Plot basic description for airquality data
plot_intro(airquality)

## View missing value distribution for airquality data
plot_missing(airquality)

## Left: frequency distribution of all discrete variables
plot_bar(diamonds)
## Right: `price` distribution of all discrete variables
plot_bar(diamonds, with = "price")

## View frequency distribution by a discrete variable
plot_bar(diamonds, by = "cut")

## View histogram of all continuous variables
plot_histogram(diamonds)

## View estimated density distribution of all continuous variables
plot_density(diamonds)

## View quantile-quantile plot of all continuous variables
plot_qq(diamonds)

## View quantile-quantile plot of all continuous variables by feature `cut`
plot_qq(diamonds, by = "cut")

## View overall correlation heatmap
plot_correlation(diamonds)

## View bivariate continuous distribution based on `cut`
plot_boxplot(diamonds, by = "cut")

## Scatterplot `price` with all other continuous features
plot_scatterplot(split_columns(diamonds)$continuous, by = "price", sampled_rows = 1000L)

## Visualize principal component analysis
plot_prcomp(diamonds, maxcat = 5L)
#> 2 features with more than 5 categories ignored!
#> color: 7 categories
#> clarity: 8 categories

Feature Engineering

To make quick updates to your data:

## Group bottom 20% `clarity` by frequency
group_category(diamonds, feature = "clarity", threshold = 0.2, update = TRUE)

## Group bottom 20% `clarity` by `price`
group_category(diamonds, feature = "clarity", threshold = 0.2, measure = "price", update = TRUE)

## Dummify diamonds dataset
dummify(diamonds)
dummify(diamonds, select = "cut")

## Set values for missing observations
df <- data.frame("a" = rnorm(260), "b" = rep(letters, 10))
df[sample.int(260, 50), ] <- NA
set_missing(df, list(0L, "unknown"))

## Update columns
update_columns(airquality, c("Month", "Day"), as.factor)
update_columns(airquality, 1L, function(x) x^2)

## Drop columns
drop_columns(diamonds, 8:10)
drop_columns(diamonds, "clarity")

Articles

See article wiki page.

More Resources
to explore the angular.

mail [email protected] to add your project or resources here ๐Ÿ”ฅ.

Related Articles
to learn about angular.

FAQ's
to learn more about Angular JS.

mail [email protected] to add more queries here ๐Ÿ”.

More Sites
to check out once you're finished browsing here.

0x3d
https://www.0x3d.site/
0x3d is designed for aggregating information.
NodeJS
https://nodejs.0x3d.site/
NodeJS Online Directory
Cross Platform
https://cross-platform.0x3d.site/
Cross Platform Online Directory
Open Source
https://open-source.0x3d.site/
Open Source Online Directory
Analytics
https://analytics.0x3d.site/
Analytics Online Directory
JavaScript
https://javascript.0x3d.site/
JavaScript Online Directory
GoLang
https://golang.0x3d.site/
GoLang Online Directory
Python
https://python.0x3d.site/
Python Online Directory
Swift
https://swift.0x3d.site/
Swift Online Directory
Rust
https://rust.0x3d.site/
Rust Online Directory
Scala
https://scala.0x3d.site/
Scala Online Directory
Ruby
https://ruby.0x3d.site/
Ruby Online Directory
Clojure
https://clojure.0x3d.site/
Clojure Online Directory
Elixir
https://elixir.0x3d.site/
Elixir Online Directory
Elm
https://elm.0x3d.site/
Elm Online Directory
Lua
https://lua.0x3d.site/
Lua Online Directory
C Programming
https://c-programming.0x3d.site/
C Programming Online Directory
C++ Programming
https://cpp-programming.0x3d.site/
C++ Programming Online Directory
R Programming
https://r-programming.0x3d.site/
R Programming Online Directory
Perl
https://perl.0x3d.site/
Perl Online Directory
Java
https://java.0x3d.site/
Java Online Directory
Kotlin
https://kotlin.0x3d.site/
Kotlin Online Directory
PHP
https://php.0x3d.site/
PHP Online Directory
React JS
https://react.0x3d.site/
React JS Online Directory
Angular
https://angular.0x3d.site/
Angular JS Online Directory