Normal Distribution

A normally distributed random variable \(X\) has a symmetric, bell-shaped density with a single peak. Mathematically, its density is given by the following function.

\(f(x)=\frac{1}{\sigma \sqrt{2 \pi}} \exp \left( -\frac{1}{2} \left( \frac{x-\mu}{\sigma }\right)^{2} \right)\)

where \(\mu\) is the mean and \(\sigma\) is the standard deviation, with \(-\infty < \mu < \infty\) and \(0< \sigma < \infty\).
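
As a quick sanity check, the density can be written by hand and compared with R's built-in `dnorm`. This is only a sketch: the function name `normal_pdf` and the parameter values \(\mu = 1\), \(\sigma = 2\) are illustrative assumptions, and the formula used is the standard one with \(1/(\sigma\sqrt{2\pi})\) in front.

```r
# Hand-written normal density (normal_pdf is an illustrative name, not part of R)
normal_pdf <- function(x, mu = 0, sigma = 1) {
  stopifnot(sigma > 0)
  1 / (sigma * sqrt(2 * pi)) * exp(-0.5 * ((x - mu) / sigma)^2)
}

x <- seq(-4, 4, length.out = 200)

# Compare with the built-in density for an arbitrary parameter choice
same_as_dnorm <- all.equal(normal_pdf(x, mu = 1, sigma = 2),
                           dnorm(x, mean = 1, sd = 2))

# A density must integrate to 1 over the whole real line
area <- integrate(normal_pdf, lower = -Inf, upper = Inf, mu = 1, sigma = 2)$value

print(same_as_dnorm)
print(area)
```

If the hand-written function is correct, the comparison should report agreement and the numerical area should be very close to 1.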

Example:

Here we plot the normal density function with \(\mu = 0\) and \(\sigma = 1\).

x <- seq(-4, 4, length.out = 200)
y <- 1 / sqrt(2 * pi) * exp(-x^2 / 2)
plot(x, y, type = "l", lwd = 2, col = "red")

We can plot the same normal density using R's built-in function dnorm.

x <- seq(-4, 4, length.out = 200)
y <- dnorm(x, mean = 0, sd = 1)
plot(x, y, type = "l", lwd = 2, col = "red")

What happens when we change the standard deviation? Look at the shape of the curves: the normal curve gets flatter and more spread out as the standard deviation increases. The standard deviation controls the spread of the curve, so it is called the “scale parameter.”

x <- seq(-8, 8, length.out = 500)

# Start the plot with sd = 1, then overlay the wider curves
y1 <- dnorm(x, mean = 0, sd = 1)
plot(x, y1, type = "l", lwd = 2, col = "red", ylab = "density")

y2 <- dnorm(x, mean = 0, sd = 2)
lines(x, y2, lwd = 2, col = "blue")
y3 <- dnorm(x, mean = 0, sd = 3)
lines(x, y3, lwd = 2, col = "green")
y4 <- dnorm(x, mean = 0, sd = 4)
lines(x, y4, lwd = 2, col = "black")
y5 <- dnorm(x, mean = 0, sd = 5)
lines(x, y5, lwd = 2, col = "yellow")

# Use line symbols in the legend so they match the plotted curves
legend("topright",
       legend = c("mean=0,sd=1", "mean=0,sd=2", "mean=0,sd=3", "mean=0,sd=4", "mean=0,sd=5"),
       col = c("red", "blue", "green", "black", "yellow"), lty = 1, lwd = 2)
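
The flattening seen in these curves can also be checked numerically. The sketch below is an added illustration, not part of the original analysis: it uses the fact that the peak of a normal density is \(1/(\sigma\sqrt{2\pi})\), so the peak drops as \(\sigma\) grows, while the area within one standard deviation of the mean stays the same.

```r
sds <- 1:5

# Height of each curve at its mean (dnorm is vectorised over sd)
peak <- dnorm(0, mean = 0, sd = sds)

# The peak height should equal 1 / (sd * sqrt(2 * pi))
peak_matches <- all.equal(peak, 1 / (sds * sqrt(2 * pi)))

# Area within one standard deviation of the mean, for each sd
within_one_sd <- pnorm(sds, mean = 0, sd = sds) - pnorm(-sds, mean = 0, sd = sds)

print(round(peak, 4))
print(round(within_one_sd, 4))
```

The peak heights shrink in proportion to 1/sd, whereas the area within one standard deviation is identical for every curve: the curve is stretched, not changed in form.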

Now let the standard deviation be fixed and change the mean. The normal curve moves horizontally: the curves sit at different locations, but all of them have the same shape. The mean controls the location of the curve, so it is called the “location parameter.”

x <- seq(-10, 10, length.out = 500)

# Start the plot with mean = 0, then overlay the shifted curves
y6 <- dnorm(x, mean = 0, sd = 2)
plot(x, y6, type = "l", lwd = 2, col = "black", ylab = "density")

y7 <- dnorm(x, mean = 1, sd = 2)
lines(x, y7, lwd = 2, col = "blue")
y8 <- dnorm(x, mean = 2, sd = 2)
lines(x, y8, lwd = 2, col = "green")
y9 <- dnorm(x, mean = 3, sd = 2)
lines(x, y9, lwd = 2, col = "yellow")
y10 <- dnorm(x, mean = 4, sd = 2)
lines(x, y10, lwd = 2, col = "red")
y11 <- dnorm(x, mean = -3, sd = 2)
lines(x, y11, lwd = 2, col = "grey")

# Use line symbols in the legend so they match the plotted curves
legend("topright",
       legend = c("mean=0,sd=2", "mean=1,sd=2", "mean=2,sd=2", "mean=3,sd=2", "mean=4,sd=2", "mean=-3,sd=2"),
       col = c("black", "blue", "green", "yellow", "red", "grey"), lty = 1, lwd = 2)
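
The horizontal shift can be confirmed numerically as well. The following sketch is an added illustration using the same means and sd = 2 as in the plot: a normal density with mean m is the mean-0 density evaluated at x - m, and its peak sits at (approximately, on a finite grid) x = m.

```r
x <- seq(-10, 10, length.out = 500)
means <- c(-3, 0, 1, 2, 3, 4)

# Shifting the mean is the same as shifting the x-axis
shift_ok <- sapply(means, function(m) {
  isTRUE(all.equal(dnorm(x, mean = m, sd = 2),
                   dnorm(x - m, mean = 0, sd = 2)))
})

# Grid point at which each curve is highest; should be close to the mean
peak_location <- sapply(means, function(m) {
  x[which.max(dnorm(x, mean = m, sd = 2))]
})

print(shift_ok)
print(round(peak_location, 2))
```

Because the grid has a finite step, the reported peak locations agree with the means only up to the grid spacing.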

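Area under normal curve

The probability that a normal variable falls in an interval equals the area under its density curve over that interval. The plot below shades the area between 70 and 90 for a normal distribution with mean 100 and standard deviation 10.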

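# Density curve for mean = 100, sd = 10
x <- seq(70, 130, length.out = 200)
y <- dnorm(x, mean = 100, sd = 10)
plot(x, y, type = "l", lwd = 2, col = "red")

# Shade the region between 70 and 90
x <- seq(70, 90, length.out = 100)
y <- dnorm(x, mean = 100, sd = 10)
polygon(c(70, x, 90), c(0, y, 0), col = "gray")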

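The area between 70 and 130, that is, within three standard deviations of the mean, is the difference of two cumulative probabilities from pnorm. Note that this interval is wider than the region shaded by polygon, which covers only 70 to 90.

pnorm(130, mean = 100, sd = 10) - pnorm(70, mean = 100, sd = 10)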
## [1] 0.9973
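
The same approach gives the area of the shaded region between 70 and 90. The sketch below is an added illustration; it cross-checks `pnorm` against numerical integration of `dnorm` and against the standardised form, where 70 and 90 correspond to z = -3 and z = -1.

```r
# Shaded area: P(70 < X < 90) for mean = 100, sd = 10
shaded <- pnorm(90, mean = 100, sd = 10) - pnorm(70, mean = 100, sd = 10)

# Cross-check 1: integrate the density numerically over the same interval
shaded_integral <- integrate(dnorm, lower = 70, upper = 90, mean = 100, sd = 10)$value

# Cross-check 2: standardise with z = (x - mean) / sd and use the standard normal
shaded_z <- pnorm((90 - 100) / 10) - pnorm((70 - 100) / 10)

print(round(shaded, 4))  # 0.1573
```

All three routes should agree, so about 15.7% of the distribution lies between 70 and 90.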

Inverse of Normal Distribution

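Find the 95th percentile of normally distributed data with mean 100 and standard deviation 10. The quantile function qnorm is the inverse of pnorm: it returns the value below which the given proportion of the distribution lies.

qnorm(0.95, mean = 100, sd = 10)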
## [1] 116.4
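
To see that `qnorm` really undoes `pnorm`, the percentile can be fed back into the cumulative distribution function. This sketch is an added check, and it also shows the equivalent calculation from the standard normal quantile.

```r
# 95th percentile for mean = 100, sd = 10
q95 <- qnorm(0.95, mean = 100, sd = 10)

# Feeding the quantile back into pnorm recovers the probability
p_back <- pnorm(q95, mean = 100, sd = 10)

# Same quantile from the standard normal: mean + sd * z
q95_from_z <- 100 + 10 * qnorm(0.95)

print(p_back)
print(q95_from_z)
```

The round trip should return 0.95, and both ways of computing the percentile should give the same value.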