Frequency Polygons
A frequency polygon is a graph obtained by joining the points plotted at the class marks (class midpoints) of a histogram, with the two end points lying on the horizontal axis. It gives an idea of the shape of the distribution. It can be superimposed on the histogram by placing a dot above each class mark, at the height of its bar, as shown below.
You probably learned how to do this in college. However, wouldn't it be nice to have something that could do it for us, especially when we have a large data set? Who doesn't want that? Alright, keep reading.
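Steps to plot a frequency polygon
- Get your data and plot the histogram using the hist() function, providing the necessary parameters.
- Get the vector of the midpoints of all the bars.
- Get the vector of the frequencies (counts) of all the bars, and the vector of all the break points of your data.
- Draw a line using the lines() function, with the midpoints on the x-axis and the frequencies on the y-axis. Put the smallest and largest break points at the two ends of the x-axis vector, with a frequency of 0 for each, so that the polygon touches the horizontal axis.
- Make any adjustments if necessary.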
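The steps above can be sketched as one small reusable function. This is only a minimal sketch: the name freq_polygon and the demo vector are made up for illustration and are not part of base R or of the examples in this article.

```r
# Hypothetical helper (not a base R function): draws a histogram and
# overlays its frequency polygon.
freq_polygon <- function(x, breaks = "Sturges", ...) {
  # Plot the histogram and keep the returned object
  h <- hist(x, breaks = breaks, ...)
  # x values: bar midpoints, closed with the first and last break points
  x.axis <- c(min(h$breaks), h$mids, max(h$breaks))
  # y values: bar frequencies, closed with 0 at both ends
  y.axis <- c(0, h$counts, 0)
  # Draw the polygon on top of the histogram
  lines(x.axis, y.axis, type = "l")
  # Return the plotted points invisibly so they can be inspected
  invisible(list(x = x.axis, y = y.axis))
}

# Made-up demo data, only to show the call
demo <- c(2, 3, 3, 4, 4, 4, 5, 5, 6)
pts <- freq_polygon(demo, breaks = 4, main = "Demo histogram")
```

Any extra arguments such as main, col or border are passed straight to hist(), so the function can be called the same way as hist() in the examples that follow.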
Construction of histogram using R
Here we are going to use the following example, so we need to plot the histogram first.
Example 1
Use the following data, showing the scores of 50 students, to construct a frequency polygon.
22 17 26 27 14 15 21 18 8 19 26 14 20 12 11 17 20 16
26 12 21 15 18 21 10 16 20 18 21 22 21 15 19 10 25 15
16 31 24 21 14 24 23 20 14 15 16 29 20 21.
Solution
Step 1: Input all the data and plot the histogram.
codes>>
# Scores of the 50 students
score = c(22, 17, 26, 27, 14, 15, 21, 18, 8,
          19, 26, 14, 20, 12, 11, 17, 20, 16, 26, 12,
          21, 15, 18, 21, 10, 16, 20, 18, 21, 22, 21,
          15, 19, 10, 25, 15, 16, 31, 24, 21, 14, 24,
          23, 20, 14, 15, 16, 29, 20, 21)
# Plot the histogram and keep the returned object for later use
hist.score = hist(score, main = 'Histogram of the scores',
                  breaks = 8, col = 'lightblue', border = 'red')
Now let's look at some information about our histogram. Notice that the histogram was assigned to a variable, so printing that variable shows what it contains.
codes>>
score = c(22, 17, 26, 27, 14, 15, 21, 18, 8,
          19, 26, 14, 20, 12, 11, 17, 20, 16, 26, 12,
          21, 15, 18, 21, 10, 16, 20, 18, 21, 22, 21,
          15, 19, 10, 25, 15, 16, 31, 24, 21, 14, 24,
          23, 20, 14, 15, 16, 29, 20, 21)
hist.score = hist(score, main = 'Histogram of the scores',
                  breaks = 8, col = 'lightblue', border = 'red')
# Print the histogram object to see its components
hist.score
Result>>
$breaks
[1] 5 10 15 20 25 30 35
$counts
[1] 3 12 16 13 5 1
$density
[1] 0.012 0.048 0.064 0.052 0.020 0.004
$mids
[1] 7.5 12.5 17.5 22.5 27.5 32.5
$xname
[1] "score"
$equidist
[1] TRUE
attr(,"class")
[1] "histogram"
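The components in the output above are related to each other, which helps in understanding what the polygon will use. As a small check (typing in the breaks and counts printed above by hand), the midpoints and densities can be recomputed like this:

```r
# Values copied from the printed histogram object above
breaks <- c(5, 10, 15, 20, 25, 30, 35)
counts <- c(3, 12, 16, 13, 5, 1)

# Each midpoint is the average of the two break points of its bar
mids <- (head(breaks, -1) + tail(breaks, -1)) / 2

# Each density is the count divided by (total count * bar width)
widths <- diff(breaks)
density <- counts / (sum(counts) * widths)

print(mids)     # same values as $mids above
print(density)  # same values as $density above
```

The counts add up to 50, the number of students, and mids and counts are the two vectors the frequency polygon is built from.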
As you can see above, breaks gives the break points of our bars, mids gives the midpoint of each bar, and counts gives the frequency of each bar. Now we can use that information to draw the frequency polygon by building the vectors for the x-axis and the y-axis. See the following code:
codes>>
score = c(22, 17, 26, 27, 14, 15, 21, 18, 8,
          19, 26, 14, 20, 12, 11, 17, 20, 16, 26, 12,
          21, 15, 18, 21, 10, 16, 20, 18, 21, 22, 21,
          15, 19, 10, 25, 15, 16, 31, 24, 21, 14, 24,
          23, 20, 14, 15, 16, 29, 20, 21)
hist.score = hist(score, main = 'Histogram of the scores',
                  breaks = 8, col = 'lightblue', border = 'red')
# x values: midpoints, closed with the smallest and largest break points
x.axis = c(min(hist.score$breaks), hist.score$mids, max(hist.score$breaks))
# y values: frequencies, closed with 0 at both ends
y.axis = c(0, hist.score$counts, 0)
# Draw the polygon on top of the histogram
lines(x.axis, y.axis, type = 'l')
Result>>
Now we have constructed the polygon, right? Well, you might not like it this way. Why? You may be someone like me who wants the polygon without the histogram. To do that, all you have to do is change the color and the border of the histogram to transparent, like the following:
Codes>>
score = c(22, 17, 26, 27, 14, 15, 21, 18, 8,
          19, 26, 14, 20, 12, 11, 17, 20, 16, 26, 12,
          21, 15, 18, 21, 10, 16, 20, 18, 21, 22, 21,
          15, 19, 10, 25, 15, 16, 31, 24, 21, 14, 24,
          23, 20, 14, 15, 16, 29, 20, 21)
# Transparent fill and border hide the bars but keep the axes and title
hist.score = hist(score, main = 'Histogram of the scores',
                  breaks = 8, col = 'transparent', border = 'transparent')
x.axis = c(min(hist.score$breaks), hist.score$mids, max(hist.score$breaks))
y.axis = c(0, hist.score$counts, 0)
lines(x.axis, y.axis, type = 'l')
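As an alternative to the transparent histogram, here is a minimal sketch that computes the bins without drawing them and then plots only the polygon. It assumes the same score data and breaks = 8 as above; the plot title and axis labels are my own choice.

```r
score <- c(22, 17, 26, 27, 14, 15, 21, 18, 8,
           19, 26, 14, 20, 12, 11, 17, 20, 16, 26, 12,
           21, 15, 18, 21, 10, 16, 20, 18, 21, 22, 21,
           15, 19, 10, 25, 15, 16, 31, 24, 21, 14, 24,
           23, 20, 14, 15, 16, 29, 20, 21)

# plot = FALSE returns the histogram object without drawing anything
h <- hist(score, breaks = 8, plot = FALSE)

# Same x and y vectors as before
x.axis <- c(min(h$breaks), h$mids, max(h$breaks))
y.axis <- c(0, h$counts, 0)

# plot() opens a new figure, so lines() is not needed here
plot(x.axis, y.axis, type = "l",
     main = "Frequency polygon of the scores",
     xlab = "score", ylab = "Frequency")
```

With this approach the title and axis labels are set directly in plot(), so they no longer have to mention the histogram.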
Example 2
Use the following data, showing the scores of 100 students in a statistics exam, to construct a frequency polygon.
Codes>>
# scan(text = ...) reads the numbers from the string, so the whole
# block can be run at once without typing the data at the console
examScores = scan(text = "
63 60 57 55 60 56 58 61 62 60 60 60 58 60 59 60 60 60 58 62 58 63 60 62 61
61 58 59 61 61 61 61 59 61 61 57 58 57 62 59 60 59 59 57 58 58 62 59 58 60
64 59 60 57 60 61 62 61 61 61 60 57 57 58 60 56 58 65 63 63 58 61 60 59 60
60 61 61 59 57 58 57 58 56 60 58 60 58 58 56 64 57 61 57 57 59 57 63 60 61
")
hist.Score = hist(examScores, main = 'Histogram of the scores',
                  breaks = 8, col = 'lightblue', border = 'red')
x.axis = c(min(hist.Score$breaks), hist.Score$mids, max(hist.Score$breaks))
y.axis = c(0, hist.Score$counts, 0)
lines(x.axis, y.axis, type = 'l')
Result>>
You can remove the histogram part using the same method, like the following:
Codes>>
# scan(text = ...) reads the numbers from the string, so the whole
# block can be run at once without typing the data at the console
examScores = scan(text = "
63 60 57 55 60 56 58 61 62 60 60 60 58 60 59 60 60 60 58 62 58 63 60 62 61
61 58 59 61 61 61 61 59 61 61 57 58 57 62 59 60 59 59 57 58 58 62 59 58 60
64 59 60 57 60 61 62 61 61 61 60 57 57 58 60 56 58 65 63 63 58 61 60 59 60
60 61 61 59 57 58 57 58 56 60 58 60 58 58 56 64 57 61 57 57 59 57 63 60 61
")
# Transparent fill and border hide the bars but keep the axes and title
hist.Score = hist(examScores, main = 'Histogram of the scores',
                  breaks = 8, col = 'transparent', border = 'transparent')
x.axis = c(min(hist.Score$breaks), hist.Score$mids, max(hist.Score$breaks))
y.axis = c(0, hist.Score$counts, 0)
lines(x.axis, y.axis, type = 'l')
I hope you find this article helpful. Consider sharing it with someone else who might be interested. Please support and like to motivate me to write more. Chat me up if you have any problems.