TOC

  • splitTools: Tools for Data Splitting
  • matlab2r: Translation Layer from MATLAB to R
  • grafify: Easy Graphs for Data Visualisation and Linear Models for ANOVA
  • DiagrammeR: Graph/Network Visualization
  • optedr: Calculating Optimal and D-Augmented Designs
  • ReDaMoR: Relational Data Modeler
  • gridpattern: 'grid' Pattern Grobs
  • PairViz: Visualization using Graph Traversal
  • explore: Simplifies Exploratory Data Analysis
  • outForest: Multivariate Outlier Detection and Replacement

Introduction

Each month I describe the packages that I've discovered or rediscovered and the ones that I've used the most. I start with the packages used in my work, then move on to the ones that I would like to try or did not have time to use, for work and also for fun.

Each card is organised as follows:

Name of the package: short description

mytags: #example tag

links
[cran package link]
[cran vignette link]
[github link]

description from the author/vignette

mynotes

splitTools: Tools for Data Splitting

mytags: #data splitting

links
[cran package link] https://cran.r-project.org/package=splitTools
[cran vignette link] https://cran.r-project.org/web/packages/splitTools/vignettes/splitTools.html

description from the author/vignette

Fast, lightweight toolkit for data splitting. Data sets can be partitioned into disjoint groups (e.g. into training, validation, and test) or into (repeated) k-folds for subsequent cross-validation. Besides basic splits, the package supports stratified, grouped as well as blocked splitting. Furthermore, cross-validation folds for time series data can be created. See e.g. Hastie et al. (2001) doi:10.1007/978-0-387-84858-7 for the basic background on data partitioning and cross-validation.

mynotes
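
A minimal sketch of the two kinds of split mentioned in the description, a stratified train/valid/test partition followed by stratified k-folds on the training rows. The built-in iris data, the 60/20/20 proportions and the seed are arbitrary choices for illustration.

```r
library(splitTools)

# Stratified split of row indices into train/valid/test (60/20/20),
# stratifying on the Species column of the built-in iris data.
inds <- partition(
  iris$Species,
  p = c(train = 0.6, valid = 0.2, test = 0.2),
  type = "stratified",
  seed = 1
)
train <- iris[inds$train, ]
valid <- iris[inds$valid, ]
test  <- iris[inds$test, ]

# Stratified 5-fold cross-validation on the training rows.
# Each list element holds the in-sample row positions of one fold.
folds <- create_folds(train$Species, k = 5, type = "stratified", seed = 1)
for (fold in folds) {
  fit_rows  <- train[fold, ]   # rows used to fit the model in this fold
  eval_rows <- train[-fold, ]  # held-out rows used to evaluate it
  # fit a model on fit_rows and evaluate it on eval_rows here
}
```

Both functions return row indices rather than data, so the same split can be reused across several data frames or model matrices.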

matlab2r: Translation Layer from MATLAB to R

mytags: #R #Matlab

links
[cran package link] https://cran.r-project.org/package=matlab2r

description from the author/vignette

Allows users familiar with MATLAB to use MATLAB-named functions in R. Several basic MATLAB functions are written in this package to mimic the behavior of their original counterparts, with more to come as this package grows.

mynotes
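
A small sketch of what "MATLAB-named functions" looks like in practice. It assumes the installed version exports zeros(), ones() and size(); the set of available functions may differ between releases, so check the package index first.

```r
library(matlab2r)

A <- zeros(2, 3)  # 2 x 3 array filled with 0, like MATLAB's zeros(2, 3)
B <- ones(2, 3)   # 2 x 3 array filled with 1, like MATLAB's ones(2, 3)
size(A)           # dimensions of A, MATLAB style
```

Attaching the package masks a few base R names, so calling functions as matlab2r::zeros() may be the safer habit in longer scripts.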

grafify: Easy Graphs for Data Visualisation and Linear Models for ANOVA

mytags: #multivariate #Inference #tests #statistics

links
[cran package link] https://cran.r-project.org/package=energy

description from the author/vignette

E-statistics (energy) tests and statistics for multivariate and univariate inference, including distance correlation, one-sample, two-sample, and multi-sample tests for comparing multivariate distributions, are implemented. Measuring and testing multivariate independence based on distance correlation, partial distance correlation, multivariate goodness-of-fit tests, k-groups and hierarchical clustering based on energy distance, testing for multivariate normality, distance components (disco) for non-parametric analysis of structured data, and other energy statistics/methods are implemented.

mynotes

DiagrammeR: Graph/Network Visualization

mytags: #graph #networks

links
[cran package link] https://cran.r-project.org/package=DiagrammeR

description from the author/vignette

Build graph/network structures using functions for stepwise addition and deletion of nodes and edges. Work with data available in tables for bulk addition of nodes, edges, and associated metadata. Use graph selections and traversals to apply changes to specific nodes or edges. A wide selection of graph algorithms allow for the analysis of graphs. Visualize the graphs and take advantage of any aesthetic properties assigned to nodes and edges.

mynotes
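
A minimal sketch of the two ways of building a graph mentioned in the description: bulk creation from node and edge tables, then stepwise addition. The node labels are made up for illustration.

```r
library(DiagrammeR)

# Bulk creation from tables: three nodes and two edges.
nodes <- create_node_df(n = 3, label = c("load", "clean", "model"))
edges <- create_edge_df(from = c(1, 2), to = c(2, 3))
g <- create_graph(nodes_df = nodes, edges_df = edges)

# Stepwise addition: one more node (it gets id 4), then an edge pointing to it.
g <- add_node(g, label = "report")
g <- add_edge(g, from = 3, to = 4)

count_nodes(g)  # 4
count_edges(g)  # 3

# Draw the graph as an HTML widget.
render_graph(g)
```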

optedr: Calculating Optimal and D-Augmented Designs

mytags: #DoE #Chemometrics #optimal-design

links
[cran package link] https://cran.r-project.org/package=optedr

description from the author/vignette

Calculates D-, Ds-, A- and I-optimal designs for non-linear models, via an implementation of the cocktail algorithm (Yu, 2011, doi:10.1007/s11222-010-9183-2). Compares designs via their efficiency, and D-augments any design with a controlled efficiency. An efficient rounding function has been provided to transform approximate designs to exact designs.

mynotes

ReDaMoR: Relational Data Modeler

mytags: #database #relational #data

links
[cran package link] https://cran.r-project.org/package=ReDaMoR
[vignette link] https://cran.r-project.org/web/packages/ReDaMoR/vignettes/ReDaMoR.html

description from the author/vignette

The aim of this package is to manipulate relational data models in R. It provides functions to create, modify and export data models in json format. It also allows importing models created with 'MySQL Workbench' (https://www.mysql.com/products/workbench/). These functions are accessible through a graphical user interface made with 'shiny'. Constraints such as types, keys, uniqueness and mandatory fields are automatically checked and corrected when editing a model. Finally, real data can be confronted to a model to check their compatibility.

gridpattern: 'grid' Pattern Grobs

mytags: #grid #patterns #graphics

links
[cran package link] https://cran.r-project.org/package=gridpattern
[vignette link] https://cran.r-project.org/web/packages/gridpattern/vignettes/developing-patterns.html,
https://cran.r-project.org/web/packages/gridpattern/vignettes/tiling.html

description from the author/vignette

Provides 'grid' grobs that fill in a user-defined area with various patterns. Includes enhanced versions of the geometric and image-based patterns originally contained in the 'ggpattern' package as well as original 'pch', 'polygon_tiling', 'regular_polygon', 'rose', 'text', 'wave', and 'weave' patterns plus support for custom user-defined patterns.
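
A minimal sketch of filling an area with one of the geometric patterns, here stripes inside the unit square. The colours, density and angle are arbitrary choices for illustration.

```r
library(grid)
library(gridpattern)

# Corners of the area to fill, in npc units (the whole viewport here).
x <- c(0, 0, 1, 1)
y <- c(0, 1, 1, 0)

# Build the stripe pattern as a grob without drawing it yet.
stripes <- grid.pattern_stripe(
  x, y,
  colour = "black",
  fill = c("grey80", "white"),
  density = 0.4,
  angle = 45,
  draw = FALSE
)

# Draw it on a fresh page.
grid.newpage()
grid.draw(stripes)
```

Because the result is an ordinary grob, it can be combined with other 'grid' output in the usual way.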

PairViz: Visualization using Graph Traversal

mytags: #graphs #visualization

links
[cran package link] https://cran.r-project.org/package=PairViz

description from the author/vignette

Improving graphics by ameliorating order effects, using Eulerian tours and Hamiltonian decompositions of graphs. References for the methods presented here are C.B. Hurley and R.W. Oldford (2010) doi:10.1198/jcgs.2010.09136 and C.B. Hurley and R.W. Oldford (2011) doi:10.1007/s00180-011-0229-5.
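
A small sketch of the idea of "ameliorating order effects": eseq() builds an Eulerian sequence on the complete graph, so that every pair of variables ends up next to each other somewhere in the ordering. The five mtcars columns are an arbitrary choice for illustration.

```r
library(PairViz)

vars <- c("mpg", "disp", "hp", "wt", "qsec")

# Eulerian sequence on the complete graph with 5 nodes.
o <- eseq(length(vars))
vars[o]  # variable order to use for e.g. a parallel coordinate plot

# Check the defining property: every pair of variables is adjacent at least once.
adjacent <- cbind(head(o, -1), tail(o, -1))
adjacent <- unique(t(apply(adjacent, 1, sort)))
nrow(adjacent) == choose(length(vars), 2)
```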

explore: Simplifies Exploratory Data Analysis

mytags: #graphs #visualization

links
[cran package link] https://cran.r-project.org/package=explore
[vignette link] https://cran.r-project.org/web/packages/explore/vignettes/explore.html,
https://cran.r-project.org/web/packages/explore/vignettes/explore_mtcars.html,
https://cran.r-project.org/web/packages/explore/vignettes/explore_penguins.html,
https://cran.r-project.org/web/packages/explore/vignettes/explore_titanic.html

description from the author/vignette

Interactive data exploration with one line of code or use an easy to remember set of tidy functions for exploratory data analysis. Introduces three main verbs. explore() to graphically explore a variable or table, describe() to describe a variable or table and report() to create an automated report.
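
A minimal sketch of the three verbs on the built-in iris data. The interactive app and the report are guarded because they need an interactive session and pandoc respectively. The output directory is an arbitrary choice for illustration.

```r
library(explore)

# describe(): one row of summary information per variable.
overview <- describe(iris)
overview

# explore(): plot a single variable, optionally split by a target.
explore(iris, Sepal.Length)
explore(iris, Sepal.Length, target = Species)

# explore() on a whole table opens the interactive shiny app.
if (interactive()) {
  explore(iris)
}

# report(): automated HTML report, written to a temporary folder here.
if (rmarkdown::pandoc_available()) {
  report(iris, output_dir = tempdir())
}
```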

outForest: Multivariate Outlier Detection and Replacement

mytags: #random forest #outliers

links

[cran package link] https://cran.r-project.org/package=outForest
[vignette link] https://cran.r-project.org/web/packages/outForest/vignettes/outForest.html

description from the author/vignette

Provides a random forest based implementation of the method described in Chapter 7.1.2 (Regression model based anomaly detection) of Chandola et al. (2009) doi:10.1145/1541880.1541882. It works as follows: Each numeric variable is regressed onto all other variables by a random forest. If the scaled absolute difference between observed value and out-of-bag prediction of the corresponding random forest is suspiciously large, then a value is considered an outlier. The package offers different options to replace such outliers, e.g. by realistic values found via predictive mean matching. Once the method is trained on reference data, it can be applied to new data.
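
A minimal sketch of the workflow described above: detect and replace outliers on reference data, then apply the fitted object to new rows. The artificial outliers, the seed and the ten "new" rows are arbitrary choices for illustration.

```r
library(outForest)

# Inject a few artificial outliers into iris so there is something to find.
iris_out <- generateOutliers(iris, p = 0.02, seed = 1)

# One random forest per numeric column; outliers are replaced via
# predictive mean matching. allow_predictions = TRUE keeps what is
# needed to apply the fitted object to new data later.
fit <- outForest(
  iris_out,
  replace = "pmm",
  allow_predictions = TRUE,
  seed = 1,
  verbose = 0
)

outliers(fit)         # table of detected outliers
cleaned <- Data(fit)  # data with outliers replaced

# Apply the trained object to new data.
new_res <- predict(fit, newdata = iris[1:10, ])
outliers(new_res)
```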