Introduction
Statistics is a key component of data science; it deals with gathering and analyzing data and drawing conclusions from it. One aspect of statistics is the probability distribution, which describes how likely each possible outcome of an event is. For example, a forecast might say there is an 80% chance of rain tonight.
The common notation for probability is p(X = x), which means the probability that a random variable X is equal to a particular value x. In the example given, if X denotes tonight's weather, then p(X = rain) = 0.8 indicates that there's an 80% chance of rain. Every probability lies between 0 and 1, and the probabilities of all possible outcomes sum to 1; therefore, if there's a 0.8 chance of rain, there's a 0.2 chance of no rain. There are two types of probability distributions:
- Discrete probability distribution
- Continuous probability distribution
The following sections say a bit more about both types of distribution.
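The two probability rules stated above (each probability lies between 0 and 1, and the probabilities of all outcomes sum to 1) can be checked in a few lines of Python. This is only a minimal sketch: the helper name `is_valid_distribution` and the `forecast` dictionary are made up for illustration, and the 0.8 value is the rain example from the text.

```python
import math


def is_valid_distribution(probs):
    """Return True if every value is in [0, 1] and the values sum to 1."""
    values = list(probs.values())
    in_range = all(0.0 <= p <= 1.0 for p in values)
    # isclose() tolerates the tiny rounding error of floating-point sums.
    return in_range and math.isclose(sum(values), 1.0)


# Hypothetical forecast based on the example: 80% chance of rain.
p_rain = 0.8
forecast = {"rain": p_rain, "no rain": 1 - p_rain}  # complement rule

print(is_valid_distribution(forecast))
print(is_valid_distribution({"rain": 0.8, "no rain": 0.5}))
```

The first call should report a valid distribution, while the second should not, because its probabilities add up to 1.3.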
Visualization of different types of distribution
There are various types of discrete probability distributions, some of which are:
- Poisson, for counting situations, such as the number of televisions sold at a video store per week
- Binomial, for binary situations, such as whether or not it will rain
- Uniform, for situations in which every outcome has the same probability, such as a die roll
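To make the three discrete distributions listed above a bit more concrete, here is a rough sketch of their probability mass functions using only Python's standard library. The function names and the example parameters (an average of 4 televisions per week, a 0.8 chance of rain on each of 7 days, a six-sided die) are assumptions chosen for illustration, not values from a real dataset.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson count with average rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def binomial_pmf(k, n, p):
    """P(X = k) successes in n independent yes/no trials, each with probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def uniform_pmf(k, low, high):
    """P(X = k) when every integer from low to high is equally likely."""
    return 1 / (high - low + 1) if low <= k <= high else 0.0


# Assumed average of 4 televisions sold per week: chance of selling exactly 3.
print(poisson_pmf(3, 4.0))

# Assumed 0.8 chance of rain per day: chance of rain on exactly 5 of 7 days.
print(binomial_pmf(5, 7, 0.8))

# Fair six-sided die: chance of rolling a 3.
print(uniform_pmf(3, 1, 6))
```

In each case, summing the function over every possible value of `k` should give 1, which matches the rule that all probabilities in a distribution add up to 1.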
The following plot is a visualization of the normal distribution, which is a continuous probability distribution.
The normal distribution has certain characteristics that make it a bit easier to spot, some of which are:
- The mean, median, and mode are equal
- There is no skew (whether left or right), meaning 50% of the values are on the left of the mean and the other 50% are on the right
- The mean and standard deviation are the two parameters that characterize the distribution
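The characteristics listed above can be checked with `statistics.NormalDist` from Python's standard library (available from Python 3.8). This is only a sketch: the mean of 170 and the standard deviation of 10 are arbitrary values chosen for illustration.

```python
from statistics import NormalDist

# A normal distribution is fully specified by its mean and standard deviation.
# The values 170 and 10 are arbitrary, chosen only for this example.
dist = NormalDist(mu=170.0, sigma=10.0)

# The mean, median, and mode coincide.
print(dist.mean, dist.median, dist.mode)

# No skew: half of the probability mass lies to the left of the mean.
print(dist.cdf(dist.mean))

# Symmetry: the density is the same at equal distances on either side of the mean.
offset = 7.5
print(dist.pdf(dist.mean - offset), dist.pdf(dist.mean + offset))
```

Changing `mu` shifts the curve left or right, and changing `sigma` makes it narrower or wider, but the three properties above should hold for any valid choice of the two parameters.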
You can find more about the Gaussian distribution in the following article.