R - Intro to ggplot2


Before plotting data with ggplot2, it helps to understand the kinds of data we want to plot. Data can be classified into the following categories.

1. Qualitative Data - Non-numeric data, i.e. data which cannot be expressed with numbers

2. Quantitative Data - Numeric data, i.e. data which can be quantified

We can further classify qualitative data as follows -

1. Nominal - the order of the data elements is not important, e.g. eye color - black, blue, gray

2. Ordinal - the order is important, e.g. temperature - low, medium, high

We can further classify Numeric or quantitative data as follows -

1. Discrete - counts or integers, i.e. data which cannot be divided further, e.g. the number of animals, which cannot be a fraction

2. Continuous - measurement data e.g. height, weight
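As a rough illustration, the four categories above can be represented with base R types. The variable names and values below are made up for this sketch and are not taken from any real dataset.

```r
# Nominal: unordered categories stored as a plain factor
eye_color <- factor(c("black", "blue", "gray", "blue"))

# Ordinal: categories with a meaningful order stored as an ordered factor
temperature <- factor(c("low", "high", "medium", "low"),
                      levels = c("low", "medium", "high"),
                      ordered = TRUE)

# Discrete: whole-number counts stored as integers
animal_count <- c(3L, 7L, 2L, 5L)

# Continuous: measurements stored as doubles
height_cm <- c(162.5, 175.2, 180.0, 168.9)

# Inspect how R sees each variable
is.ordered(eye_color)
is.ordered(temperature)
levels(temperature)
is.integer(animal_count)
is.double(height_cm)
```

Storing ordinal data as an ordered factor matters for plotting later on, because the level order decides the order in which categories appear on an axis or in a legend.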

For plotting purposes, all of these categories reduce to two types: numerical and categorical data. Pairing two variables therefore gives three possible combinations, i.e. Numeric - Numeric, Numeric - Categorical and Categorical - Categorical, and these are typically drawn as scatter plots, bar charts and mosaic / tree-map charts respectively.
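A minimal sketch of the three combinations, assuming ggplot2 is installed and using the mtcars dataset that ships with R (the choice of dataset and columns is only for illustration). ggplot2 itself does not provide a mosaic geom, so the categorical - categorical case falls back on base R's mosaicplot() here.

```r
library(ggplot2)

# Treat cylinder count and transmission type as categorical variables
cars <- mtcars
cars$cyl <- factor(cars$cyl)
cars$am <- factor(cars$am, labels = c("automatic", "manual"))

# Numeric - Numeric: scatter plot of weight against fuel efficiency
p_scatter <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point()

# Numeric - Categorical: bar chart of mean fuel efficiency per cylinder group
means <- aggregate(mpg ~ cyl, data = cars, FUN = mean)
p_bar <- ggplot(means, aes(x = cyl, y = mpg)) +
  geom_col()

# Printing a ggplot object draws it on the current graphics device
print(p_scatter)
print(p_bar)

# Categorical - Categorical: base R mosaic plot of a two-way table
mosaicplot(table(cars$cyl, cars$am), main = "Cylinders by transmission")
```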

Introduction :

In R, we can use the base graphics system or the lattice package for graphics. ggplot2 is a plotting system for R, based on the grammar of graphics, which tries to take the good parts of base and lattice graphics and none of the bad parts.

Installation :

install.packages("ggplot2")

Documentation :

http://docs.ggplot2.org/current/

What is the grammar of graphics ?

Just as the grammar of a language defines rules for constructing sentences, the grammar of graphics defines rules for constructing graphs and charts.

What are the major components ?

  • Data

      • Variables to be portrayed on the graph

      • Variables are mapped to aesthetics, the perceivable features of the graph

  • Geoms

      • Objects/shapes on the graph (e.g. points, bars)

  • Stats

      • Statistical transformation, usually to summarize (e.g. mean, variance)

  • Scales

      • Define which aesthetic values are mapped to variables (e.g. which colors are mapped to which values)

  • Coordinate Systems

      • Define how data are mapped to the plane of the graphic (e.g. Cartesian)

  • Faceting

      • Splits data into subsets to create multiple variations of the same graph (paneling)
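The components listed above can be seen together in a single plot. The following is only a sketch, assuming ggplot2 is installed; the mtcars dataset, the chosen columns and the colors are arbitrary picks for illustration.

```r
library(ggplot2)

p <- ggplot(data = mtcars,                                  # Data
            aes(x = wt, y = mpg)) +                         # variables mapped to aesthetics
  geom_point(aes(colour = factor(gear))) +                  # Geom: points
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE) + # Stat: fitted linear trend
  scale_colour_manual(name = "Gears",                       # Scale: which color for which value
                      values = c("3" = "steelblue",
                                 "4" = "darkorange",
                                 "5" = "forestgreen")) +
  coord_cartesian(xlim = c(1, 6)) +                         # Coordinate system
  facet_wrap(~ am)                                          # Faceting: one panel per transmission type

# Printing the object draws the plot
print(p)
```

Each component is added as a separate layer with `+`, so any one of them can be swapped or dropped without touching the others.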

Cheatsheet :

Download link

Examples :

Please refer to this post to create different types of elegant graphs and charts with the help of ggplot2.


About Author

Dattatray Shinde has over 12 years of experience in software design, development and maintenance of web-based applications, having worked in the healthcare, insurance, e-commerce and learning management system domains. He has spent over 6 years as a data scientist, working mainly on predictive analytics, survey analytics and risk analytics platforms.
