github.com-tesselle-dimensio_-_2021-05-24_16-00-49
- Publication date
- 2021-05-24
Multivariate Data Analysis
dimensio 
Simple Principal Components Analysis (PCA) and Correspondence Analysis (CA) based on the Singular Value Decomposition (SVD). This package provides S4 classes and methods to compute, extract, summarize and visualize results of multivariate data analysis. It also includes methods for partial bootstrap validation.
There are many very good packages for multivariate data analysis (such as FactoMineR, ade4 or ca, all extended by FactoExtra). dimensio is designed to be as simple as possible, providing all the necessary tools to explore the results of the analysis.
Installation
You can install the released version of dimensio from CRAN with:
install.packages("dimensio")
And the development version from GitHub with:
``` r
install.packages("remotes")
remotes::install_github("tesselle/dimensio")
```
Usage
``` r
# Load packages
library(dimensio)
library(ggplot2)
library(ggrepel)
library(khroma)
```
``` r
# Load data
data(iris)

# Compute PCA
# (non numeric variables are automatically removed)
X <- pca(iris, center = TRUE, scale = TRUE)
#> 1 qualitative variable was removed: Species.
```
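To see what an SVD-based PCA amounts to, here is a minimal base R sketch. It does not use dimensio and is not meant to reproduce its internals; the variable names (`num`, `Z`, `s`, `eig`) are illustrative only. Under these assumptions, the leading eigenvalues should be close to those printed in the summary of `X`.

``` r
# Keep the numeric columns only (Species is dropped, as pca() reports doing)
num <- iris[, vapply(iris, is.numeric, logical(1))]

# Center and scale each variable (base R uses the sample standard deviation)
Z <- scale(num, center = TRUE, scale = TRUE)

# Singular value decomposition of the standardized table
s <- svd(Z)

# Eigenvalues of the correlation matrix are the squared singular values
# divided by (n - 1)
eig <- s$d^2 / (nrow(Z) - 1)
round(eig, 3)
```

Because the variables are standardized, the eigenvalues sum to the number of variables (four here).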
Summarize
``` r
# Summarize results for the individuals (first two components)
summary(X, margin = 1, rank = 2)
#> --- Principal Components Analysis (PCA) -----------------------------------------
#>
#> Eigenvalues:
#>    eigenvalues variance cumulative
#> F1       2.918   73.342     73.342
#> F2       0.914   22.970     96.312
#> F3       0.147    3.688    100.000
#>
#> Active individuals:
#>     dist PC1_coord PC1_contrib PC1_cos2 PC2_coord PC2_contrib PC2_cos2
#>  1 2.319    -2.265       1.172    0.954    -0.480       0.168    0.043
#>  2 2.202    -2.081       0.989    0.893     0.674       0.331    0.094
#>  3 2.389    -2.364       1.277    0.979     0.342       0.085    0.020
#>  4 2.378    -2.299       1.208    0.935     0.597       0.260    0.063
#>  5 2.476    -2.390       1.305    0.932    -0.647       0.305    0.068
#>  6 2.555    -2.076       0.984    0.660    -1.489       1.617    0.340
#>  7 2.468    -2.444       1.364    0.981    -0.048       0.002    0.000
#>  8 2.246    -2.233       1.139    0.988    -0.223       0.036    0.010
#>  9 2.592    -2.335       1.245    0.812     1.115       0.907    0.185
#> 10 2.249    -2.184       1.090    0.943     0.469       0.160    0.043
#> (150 more)
```
Extract
dimensio provides several methods to extract the results:
- get_data() returns the original data.
- get_contributions() returns the contributions to the definition of the principal dimensions.
- get_coordinates() returns the principal coordinates.
- get_correlations() returns the correlations between variables and dimensions.
- get_cos2() returns the cos2 values (i.e. the quality of the representation of the points on the factor map).
- get_eigenvalues() returns the eigenvalues, the percentages of variance and the cumulative percentages of variance.
``` r
# Eigenvalues
get_eigenvalues(X)
#>    eigenvalues  variance cumulative
#> F1   2.9184978 73.342264   73.34226
#> F2   0.9140305 22.969715   96.31198
#> F3   0.1467569  3.688021  100.00000
```
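The contributions and cos2 values can also be worked out by hand, which may help when reading the summary table. The following base R sketch assumes that every individual has weight 1/n and that variables are scaled with the population standard deviation (divisor n). Both conventions are assumptions chosen for illustration, and the helper `pop_sd()` is hypothetical, not part of dimensio.

``` r
num <- iris[, vapply(iris, is.numeric, logical(1))]
n <- nrow(num)

# Population standard deviation (divisor n); an assumed convention
pop_sd <- function(x) sqrt(mean((x - mean(x))^2))
Z <- scale(num, center = TRUE, scale = vapply(num, pop_sd, numeric(1)))

s <- svd(Z)
coord <- Z %*% s$v   # principal coordinates of the individuals
eig <- s$d^2 / n     # eigenvalues, each individual weighing 1/n

# Contribution (%) of each individual to the inertia of each axis
contrib <- 100 * sweep(coord^2 / n, 2, eig, "/")

# cos2: share of an individual's squared distance to the origin
# that is carried by each axis
dist2 <- rowSums(Z^2)
cos2 <- coord^2 / dist2

# First individual, first two axes: compare with the contribution
# and cos2 columns of the summary of X
round(contrib[1, 1:2], 3)
round(cos2[1, 1:2], 3)
```

Contributions sum to 100 over the individuals for each axis, and the cos2 values of one individual sum to 1 over all axes. The sign of a coordinate is arbitrary in an SVD, so only squared quantities are compared here.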
Visualize
dimensio uses ggplot2 for plotting. Visualization methods produce graphics with as few elements as possible: this makes it easy to customize diagrams (e.g. using extra layers, themes and scales).
``` r
# Plot active individuals by group
plot_rows(X, group = iris$Species) +
  ggplot2::stat_ellipse() + # Add ellipses
  ggplot2::theme_bw() + # Change theme
  khroma::scale_color_contrast() # Custom color scale

# Plot all individuals by cos2
plot_rows(X, highlight = "cos2") +
  ggplot2::theme_bw() + # Change theme
  ggplot2::scale_size_continuous(range = c(1, 3)) + # Custom size scale
  khroma::scale_color_iridescent() # Custom color scale
```
``` r
# Plot variables factor map
plot_columns(X) +
  ggrepel::geom_label_repel() + # Add repelling labels
  ggplot2::theme_bw() # Change theme

# Highlight contributions
plot_columns(X, highlight = "contrib") +
  ggrepel::geom_label_repel() + # Add repelling labels
  ggplot2::theme_bw() + # Change theme
  khroma::scale_color_YlOrBr(range = c(0.5, 1)) # Custom color scale
```
``` r
# Plot eigenvalues
plot_variance(X, variance = FALSE, cumulative = FALSE) +
  ggplot2::theme_bw() # Change theme

# Plot percentages of variance
plot_variance(X, variance = TRUE, cumulative = TRUE) +
  ggplot2::geom_text(nudge_y = 3) + # Add labels
  ggplot2::theme_bw() # Change theme

# Plot variables contributions to the definition of the first component
plot_contributions(X, margin = 2, axes = 1) +
  ggplot2::geom_text(nudge_y = 2) + # Add labels
  ggplot2::theme_bw() + # Change theme
  ggplot2::theme( # Edit theme
    # Rotate x axis labels
    axis.text.x = ggplot2::element_text(angle = 45, hjust = 1, vjust = 1)
  )

# Plot cos2
plot_cos2(X, margin = 2, axes = c(1, 2)) +
  ggplot2::geom_text(nudge_y = 0.05) + # Add labels
  ggplot2::theme_bw() + # Change theme
  ggplot2::theme( # Edit theme
    # Rotate x axis labels
    axis.text.x = ggplot2::element_text(angle = 45, hjust = 1, vjust = 1)
  )
```
Validation
``` r
# Partial bootstrap
Y <- bootstrap(X, n = 30)

# Plot with ellipses
plot_columns(Y) +
  ggplot2::stat_ellipse() + # Add ellipses
  ggplot2::theme_bw() # Change theme
```
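As a rough illustration of the idea behind a partial bootstrap, the base R sketch below resamples the individuals and projects each replicate onto the axes of the original analysis, without recomputing the decomposition. This is one common reading of the technique and may differ from what `bootstrap()` does internally; `boot_coord()` and the other names are hypothetical.

``` r
set.seed(42)  # arbitrary seed, for reproducibility only

num <- iris[, vapply(iris, is.numeric, logical(1))]
n <- nrow(num)
Z <- scale(num, center = TRUE, scale = TRUE)

s <- svd(Z)
scores <- Z %*% s$v         # individuals on the original axes
sdev <- s$d / sqrt(n - 1)   # standard deviation of each component

# Original variable coordinates (correlations with the components)
var_coord <- s$v %*% diag(sdev)

# Variable coordinates for one resample of the individuals,
# expressed on the ORIGINAL axes (no new SVD is computed)
boot_coord <- function(idx) {
  (crossprod(Z[idx, , drop = FALSE], scores[idx, , drop = FALSE]) / (n - 1)) %*%
    diag(1 / sdev)
}

# 30 replicates, keeping the first two axes: a 4 x 2 x 30 array
reps <- replicate(
  30,
  boot_coord(sample(n, replace = TRUE))[, 1:2],
  simplify = "array"
)
dim(reps)
```

The spread of the 30 replicated points around each variable gives a sense of how stable its position on the factor map is, which is what the ellipses drawn above are meant to convey.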
Contributing
Please note that the dimensio project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
To restore the repository, download the bundle
wget https://archive.org/download/github.com-tesselle-dimensio_-_2021-05-24_16-00-49/tesselle-dimensio_-_2021-05-24_16-00-49.bundle
and run: git clone tesselle-dimensio_-_2021-05-24_16-00-49.bundle
Source: https://github.com/tesselle/dimensio
Uploader: tesselle
Upload date: 2021-05-24
- Addeddate
- 2021-07-06 16:47:34
- Identifier
- github.com-tesselle-dimensio_-_2021-05-24_16-00-49
- Originalurl
- https://github.com/tesselle/dimensio
- Pushed_date
- 2021-05-24 16:00:49
- Scanner
- Internet Archive Python library 1.9.9
- Uploaded_with
- iagitup - v1.6.2
- Year
- 2021