Check If Data Is Normally Distributed Using R - QQ Plots
A common first step in checking whether your data is normally distributed is to plot a histogram and look at its shape. If it looks bell-shaped and symmetric around the mean, a normal distribution is plausible. However, histograms can be misleading, especially with a small dataset, because their appearance depends heavily on the choice of bins.
A better way to check if your data is normally distributed is to create a quantile-quantile (QQ) plot, which is easy to produce in R or Python.
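As a rough illustration of why a histogram of a small dataset can be hard to read, the sketch below draws the same sample with two different bin settings. The returns vector is simulated and purely hypothetical; it is not data from this article.

```r
# Hypothetical data: 50 simulated daily returns (an assumption for illustration).
set.seed(42)
returns <- rnorm(50, mean = 0, sd = 0.01)

# The same sample drawn with a coarse and a fine bin suggestion.
# 'breaks' given as a single number is only a suggestion to hist().
h_coarse <- hist(returns, breaks = 5, main = "About 5 bins", xlab = "Return")
h_fine <- hist(returns, breaks = 20, main = "About 20 bins", xlab = "Return")
```

With only 50 observations, the two pictures can suggest quite different shapes even though the underlying data is identical.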
QQ Plots
The idea of a quantile-quantile plot is to compare the distributions of two datasets by matching a common set of quantiles in each.
In R, a QQ plot can be constructed using the qqplot() function, which takes two datasets as its arguments. When you create a QQ plot, this is what happens. First, the data in each dataset is sorted. The sorted values are then plotted against each other in a scatter chart (if the datasets differ in length, the longer one is interpolated down to the length of the shorter). This is the QQ plot. Note that qqplot() does not draw a reference line itself; a 45 degree line is usually added, for example with abline(0, 1), to make the interpretation easier.
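The sorting step can be checked directly. The following is a minimal sketch with two simulated samples of equal length (the names a and b are hypothetical). It shows that the points drawn by qqplot() are the sorted values of each sample, and adds the 45 degree line by hand with abline().

```r
# Two hypothetical samples of equal length.
set.seed(1)
a <- rnorm(100)
b <- rnorm(100, mean = 0.5)

# qqplot() draws the plot and invisibly returns the plotted coordinates.
qq <- qqplot(a, b, xlab = "Quantiles of a", ylab = "Quantiles of b")

# Add the 45 degree reference line (intercept 0, slope 1).
abline(0, 1, lty = 2)

# For equal-length samples the coordinates are the sorted data.
same_x <- isTRUE(all.equal(qq$x, sort(a)))
same_y <- isTRUE(all.equal(qq$y, sort(b)))
```

Here the points should sit roughly parallel to the dashed line but shifted upwards, because b was simulated with a higher mean than a.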
In finance, qq plots are used to determine if the distribution of returns is normal. They are also used to detect fat tails of the distribution.
To check for normality, instead of comparing two sample datasets, you compare your returns dataset with a theoretical sample that is normally distributed. To do so, you can first create a normally distributed sample dataset and use the qqplot() function to create the QQ plot of the two datasets. Or you can use a special function called qqnorm().
The qqnorm() function in R compares sample data (in this case returns) against the quantiles of a normal distribution. The sample you want to plot goes in as the first argument of the qqnorm() function. The resulting plot shows how closely the sample follows a theoretical normal distribution. It works as a visual counterpart to a formal normality test.
QQ stands for quantile-quantile: the quantiles of your data are compared with the quantiles of a normal distribution (in the qqnorm() function) using a scatter plot. A quantile is the value below which a given fraction of the data falls. For example, the 0.4 (or 40%) quantile is the point below which 40% of the data fall and above which 60% fall.
The qqline() function is used in conjunction with qqnorm() to add a theoretical reference line for the normal distribution. By default this line passes through the first and third quartiles of the sample and of the normal distribution, so it coincides with the 45 degree line only when the data is standardized. If most of the points of the sample data fall along this line, it is likely that your sample data has a normal distribution. If the points depart significantly from it, especially in the tails, the sample data probably doesn't follow a normal distribution.
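The first approach, comparing against a simulated normal sample, might look like the sketch below. The returns are simulated from a t distribution to mimic fat tails, and matching the mean and standard deviation of the normal sample to the returns is an assumption made here so that the 45 degree line is a sensible reference.

```r
# Hypothetical fat-tailed returns (simulated, not real market data).
set.seed(123)
returns <- rt(250, df = 4) * 0.01

# Simulated normal sample with the same size, mean and standard deviation.
normal_sample <- rnorm(length(returns), mean = mean(returns), sd = sd(returns))

# QQ plot of the returns against the simulated normal sample.
qq <- qqplot(normal_sample, returns,
             xlab = "Simulated normal sample", ylab = "Returns")
abline(0, 1, lty = 2)  # 45 degree reference line
```

Because the reference sample is itself random, the plot changes slightly with each seed, which is one reason qqnorm() is usually more convenient.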
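To make the definition concrete, here is a small sketch using the base R functions quantile() for sample quantiles and qnorm() for the quantiles of the standard normal distribution. The data vector x is hypothetical.

```r
# Hypothetical data: the integers 1 to 100.
x <- 1:100

# Sample 0.4 quantile: the value below which about 40% of the data fall.
q40 <- quantile(x, probs = 0.4)
frac_below <- mean(x < q40)   # fraction of observations below q40

# Theoretical 0.4 quantile of the standard normal distribution.
z40 <- qnorm(0.4)
prob_below <- pnorm(z40)      # probability of falling below z40
```

A QQ plot against the normal distribution pairs sample quantiles like q40 with theoretical quantiles like z40 across a whole range of probabilities.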
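A minimal sketch of qqnorm() and qqline() used together is shown below, with one roughly normal sample and one fat-tailed sample side by side. Both samples are simulated and their names are hypothetical.

```r
# Hypothetical samples: one roughly normal, one fat-tailed (t distribution).
set.seed(7)
normal_returns <- rnorm(500, sd = 0.01)
fat_returns <- rt(500, df = 3) * 0.01

old_par <- par(mfrow = c(1, 2))  # two plots side by side

# Normal sample: points should lie close to the reference line.
qn_normal <- qqnorm(normal_returns, main = "Normal sample")
qqline(normal_returns, col = "red")

# Fat-tailed sample: points tend to bend away from the line at both ends.
qn_fat <- qqnorm(fat_returns, main = "Fat-tailed sample")
qqline(fat_returns, col = "red")

par(old_par)  # restore the previous plotting layout
```

In the fat-tailed panel the lowest points typically fall below the line and the highest points above it, which is the usual visual signature of fat tails.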

