Francisco Carrillo Pérez

Analyzing sugar in McDonald's menu items

I recently found a dataset on Kaggle containing the nutrition facts for every item on McDonald's menu (Dataset).

I've known for a while that our sugar consumption is well above what health organizations recommend, so I wanted to analyze how much sugar is in McDonald's menu items and which of them contain no sugar at all.

For this I used a Jupyter Notebook with two libraries: plotly and pandas. Here I'll explain the steps I followed; the complete notebook is available in my GitHub repository.

The data comes as a CSV file. First, let's load it to see how it is structured:

import pandas as pd

menu = pd.read_csv('./menu.csv')
menu.head(10)

This shows the first ten rows of the dataset along with all of its columns.
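Beyond `head`, the shape, column names, and missing-value count give a quick picture of what the file contains. A minimal sketch, assuming a tiny stand-in frame with the two columns used later; `menu_demo` and its two sugary rows are invented purely for illustration, and with the real data the same calls would go on `menu`:

```python
import pandas as pd

# Stand-in for `menu`: two zero-sugar items that appear in the dataset plus
# two invented rows (hypothetical names and values, for illustration only).
menu_demo = pd.DataFrame({
    'Item': ['Hash Brown', 'Coffee (Small)', 'Hypothetical Shake', 'Hypothetical Pie'],
    'Sugars': [0, 0, 60, 13],
})

n_rows, n_cols = menu_demo.shape                 # (rows, columns)
column_names = list(menu_demo.columns)           # names of the columns
missing = int(menu_demo['Sugars'].isna().sum())  # number of missing sugar values

print(n_rows, n_cols, column_names, missing)
```

Checking for missing values first is worthwhile because a later comparison such as `Sugars == 0` silently skips rows where the value is absent.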

The column I'm interested in is Sugars, so I'm going to create a new pandas DataFrame with just the item name and the amount of sugar, and also sort it in ascending order:

# Keep only the item name and its sugar content
df_sugars = menu[['Item', 'Sugars']].copy()
print("Let's sort them by the amount of sugar they have in ascending order: ")
df_sugars = df_sugars.sort_values('Sugars', ascending=True)
print(df_sugars.head(10))
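The selection and the sort can also be written as one chained expression. A minimal sketch on a stand-in frame (`menu_demo` and its two invented rows are hypothetical, used only so the snippet runs without the CSV):

```python
import pandas as pd

# Hypothetical stand-in for `menu`; the extra 'Calories' column only shows
# that unrelated columns are dropped by the selection.
menu_demo = pd.DataFrame({
    'Item': ['Hypothetical Shake', 'Hash Brown', 'Hypothetical Pie'],
    'Calories': [500, 150, 250],
    'Sugars': [60, 0, 13],
})

# Select the two columns of interest, then sort by sugar content (ascending
# is the default). The original row labels are kept, as in the post's output.
df_sugars_demo = menu_demo[['Item', 'Sugars']].sort_values('Sugars')

print(df_sugars_demo)
```

Selecting with a list of column names returns a new DataFrame, so the original `menu_demo` is left untouched.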

Now that I have this, I want to check which menu items contain no sugar at all:

print("Number of items in the menu: "+str(len(menu.index)))
print("Number of items without sugar in the menu: "+str(len(df_sugars.loc[df_sugars['Sugars'] == 0])))
print(df_sugars.loc[df_sugars['Sugars'] == 0])

And I obtain the following result:

Number of items in the menu: 260
Number of items without sugar in the menu: 25
                             Item  Sugars
145                Coffee (Small)       0
99              Kids French Fries       0
96             Small French Fries       0
81   Chicken McNuggets (20 piece)       0
114             Diet Coke (Small)       0
115            Diet Coke (Medium)       0
116             Diet Coke (Large)       0
117             Diet Coke (Child)       0
122        Diet Dr Pepper (Small)       0
123       Diet Dr Pepper (Medium)       0
124        Diet Dr Pepper (Large)       0
98             Large French Fries       0
80   Chicken McNuggets (10 piece)       0
79    Chicken McNuggets (6 piece)       0
136           Dasani Water Bottle       0
137              Iced Tea (Small)       0
138             Iced Tea (Medium)       0
139              Iced Tea (Large)       0
140              Iced Tea (Child)       0
78    Chicken McNuggets (4 piece)       0
146               Coffee (Medium)       0
38                     Hash Brown       0
147                Coffee (Large)       0
125        Diet Dr Pepper (Child)       0
97            Medium French Fries       0
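The share of sugar-free items follows directly from the two counts printed above. A small sketch of the arithmetic, with both counts copied from that output (the variable names are just for illustration):

```python
total_items = 260       # "Number of items in the menu" from the output above
sugar_free_items = 25   # "Number of items without sugar in the menu"

# Percentage of menu items with zero grams of sugar
share = sugar_free_items / total_items * 100

print(f"{share:.2f}% of the items have no sugar")  # prints: 9.62% of the items have no sugar
```

With the DataFrame at hand, the same figure could presumably be obtained as `(df_sugars['Sugars'] == 0).mean() * 100`, since the mean of a boolean column is the fraction of `True` values.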

So only 25 of the 260 items, about 9.6%, contain no sugar at all. Now let's plot this to see it graphically. For this I'm going to use the Plotly library:

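# Assumption: the post never shows its imports. `go` and `py` are taken here
# to be plotly's graph objects and offline modules.
import plotly.graph_objs as go
import plotly.offline as py
py.init_notebook_mode(connected=True)

print("Let's start with the bar chart")

data = [go.Bar(
    y=df_sugars['Sugars'].values,
    x=df_sugars['Item'].values,
)]

py.iplot(data, filename='basic-bar')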

Bar Chart

I'm also going to draw a scatter plot:

# Now let's plot a scatter plot
# This plot is based on the one made by Anisotropic:
# https://www.kaggle.com/arthurtok/super-sized-we-mcdonald-s-nutritional-metrics

trace = go.Scatter(
    y = df_sugars['Sugars'].values,
    x = df_sugars['Item'].values,
    mode='markers',
    marker=dict(
        size= df_sugars['Sugars'].values,
        #color = np.random.randn(500), #set color equal to a variable
        color = df_sugars['Sugars'].values,
        colorscale='Portland',
        showscale=True
    ),
    # Hover labels must follow the sorted order of df_sugars, not the unsorted menu
    text = df_sugars['Item'].values
)
data = [trace]

layout= go.Layout(
    autosize= True,
    title= 'Scatter plot of Sugars per Item on the Menu',
    hovermode= 'closest',
    xaxis=dict(
        showgrid=False,
        zeroline=False,
        showline=False
    ),
    yaxis=dict(
        title= 'Sugars(g)',
        ticklen= 5,
        gridwidth= 2,
        showgrid=False,
        zeroline=False,
        showline=False
    ),
    showlegend= False
)
fig = go.Figure(data=data, layout=layout)
py.iplot(fig,filename='scatterChol')

Scatter Plot

The WHO (OMS in Spanish) says the maximum amount of sugar per day should be 50 g. Let's see which menu items go over this threshold:

# First let's add a new column to the dataframe, all equal to 50
df_sugars['Amount of Sugar recommended (g)'] = 50

# Let's plot them

trace1 = go.Bar(
    y = df_sugars['Sugars'].values,
    x = df_sugars['Item'].values,
    name='Sugars(g)'
)
trace2 = go.Bar(
    y = df_sugars['Amount of Sugar recommended (g)'].values,
    x = df_sugars['Item'].values,
    name='Recommended value of sugar OMS (g)'
)

data = [trace1, trace2]

# A single layout: the earlier barmode-only layout was overwritten by this one
layout= go.Layout(
    barmode='group',
    autosize= True,
    title= 'Relation between the OMS recommendation and Sugars per Item on the Menu',
    hovermode= 'closest',
    xaxis=dict(
        showgrid=False,
        zeroline=False,
        showline=False
    ),
    yaxis=dict(
        title= 'Sugars(g)',
        ticklen= 5,
        gridwidth= 2,
        showgrid=False,
        zeroline=False,
        showline=False
    ),
    showlegend= False
)
fig = go.Figure(data=data, layout=layout)
graph = py.iplot(fig, filename='grouped-bar')

Relation with the OMS threshold
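Besides reading them off the chart, the items above the threshold can be listed with a boolean filter. A minimal sketch, assuming a stand-in frame; `df_demo`, `DAILY_SUGAR_LIMIT_G`, and the two invented rows are hypothetical, and with the real data the filter would be applied to `df_sugars`:

```python
import pandas as pd

DAILY_SUGAR_LIMIT_G = 50  # daily threshold quoted in the post, in grams

# Hypothetical stand-in for `df_sugars` (invented values, illustration only)
df_demo = pd.DataFrame({
    'Item': ['Hash Brown', 'Hypothetical Pie', 'Hypothetical Shake'],
    'Sugars': [0, 13, 60],
})

# Keep only the rows whose sugar content exceeds the daily limit
over_limit = df_demo[df_demo['Sugars'] > DAILY_SUGAR_LIMIT_G]
n_over = len(over_limit)

print(f"{n_over} of {len(df_demo)} items exceed {DAILY_SUGAR_LIMIT_G} g of sugar")
print(over_limit)
```

A filter like this gives an exact count of the items that go over the limit on their own, which the grouped bar chart only shows visually.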

As you can see, a lot of items on the menu are bad for our health, if not all of them. To look at the items in more detail, check the notebook, where the plots are interactive.

I hope you like this short analysis. Check out the repository on GitHub! :)

Top comments (4)

Aga Zaboklicka

Data analysis of McDonald's. I'm impressed :D Good job ;)

Francisco Carrillo Pérez

Thank you very much!

Andrzej Brandt

Great one! I really enjoyed going through the process, especially as I'm beginning to learn a bit of Python.

Francisco Carrillo Pérez

Thank you very much! I'm glad it helps!
