DEV Community

Charles Landau

Posted on Aug 1, 2019 • Edited on Aug 2, 2019

Visualizing the Primary Debates

#dataviz #showdev #python #nlp

Over the past couple of days in the US, the Democrats have been debating who should get to run against Trump. The transcripts seemed like a fun subject for dataviz.

All the code for these visualizations is posted here in various commits.

First, I thought it would be helpful to make a simple bar chart showing how much each candidate spoke.
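A minimal sketch of how such a chart could be built, assuming the transcript has already been parsed into (speaker, text) pairs. The sample turns, the function names, and the output filename are all hypothetical, and "how much" is measured here as a whitespace-split word count, which may differ from what the original notebook did.

```python
from collections import Counter

# Hypothetical sample turns; the real transcripts are not reproduced here.
SAMPLE_TURNS = [
    ("SPEAKER_A", "we need a plan for health care"),
    ("SPEAKER_B", "thank you"),
    ("SPEAKER_A", "and for climate"),
]


def words_per_speaker(turns):
    """Total whitespace-separated words spoken by each speaker."""
    counts = Counter()
    for speaker, text in turns:
        counts[speaker] += len(text.split())
    return counts


def plot_word_counts(counts, path="words_per_speaker.png"):
    """Save a horizontal bar chart of the counts (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    # barh draws the first item at the bottom, so sorting ascending
    # puts the speaker with the most words at the top.
    speakers, totals = zip(*sorted(counts.items(), key=lambda kv: kv[1]))
    fig, ax = plt.subplots()
    ax.barh(speakers, totals)
    ax.set_xlabel("Words spoken")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)


counts = words_per_speaker(SAMPLE_TURNS)
```

Calling `plot_word_counts(counts)` would then write the chart to disk. Counting words rather than turns keeps a speaker with many short interjections from looking more talkative than one with a few long answers.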

Note: as you'll see, I didn't take time to ensure a perfect cleanse of the data. There are some artifacts and errors, which will be obvious in the word clouds.

I was also surprised to find that if you create a TF-IDF based distance matrix...

... The speakers sort themselves out nicely. The lowest-polling candidate I've seen described as top tier (T1) is Mayor Pete, and the pattern holds whether or not you count him as T1.
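For readers who want to try this, here is a rough, dependency-free sketch of a TF-IDF based distance matrix. It assumes one "document" per speaker (all of their remarks joined together), a smoothed IDF, and cosine distance; the post does not say which weighting or metric was actually used, and the sample texts and function names are hypothetical. In practice a library such as scikit-learn (`TfidfVectorizer` plus `cosine_distances`) would be the more typical route.

```python
import math
import re
from collections import Counter

# Hypothetical per-speaker documents, not taken from the transcripts.
DOCS = {
    "SPEAKER_A": "health care health care plan",
    "SPEAKER_B": "health care for all",
    "SPEAKER_C": "climate change is urgent",
}


def tokenize(text):
    """Lowercase the text and pull out runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(docs):
    """Map each speaker to a sparse {term: tf-idf weight} vector."""
    term_counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    n_docs = len(docs)
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())  # each speaker counts once per term
    vectors = {}
    for name, counts in term_counts.items():
        total = sum(counts.values()) or 1
        vectors[name] = {
            # term frequency times a smoothed inverse document frequency
            term: (n / total) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, n in counts.items()
        }
    return vectors


def cosine_distance(u, v):
    """1 minus cosine similarity for two sparse vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # treat an empty document as maximally distant
    return 1.0 - dot / (norm_u * norm_v)


def distance_matrix(docs):
    """Return sorted speaker names and their pairwise distance matrix."""
    vectors = tfidf_vectors(docs)
    names = sorted(vectors)
    matrix = [[cosine_distance(vectors[a], vectors[b]) for b in names] for a in names]
    return names, matrix


names, matrix = distance_matrix(DOCS)
```

Speakers who share distinctive vocabulary end up close together (distance near 0), while speakers with no words in common sit at distance 1. Feeding the matrix into a heatmap or a hierarchical clustering is one way to see whether the speakers group together.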

Does this mean anything? I don't think so, at least not all on its own.

Finally, here are some word clouds:
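Clouds like these can be produced along the following lines. This is only a sketch: the tiny stop list, the function names, and the default colors are assumptions for illustration, and the rendering step relies on the third-party `wordcloud` package being installed.

```python
import re
from collections import Counter

# A deliberately tiny stop list for illustration; a real run needs a fuller one.
STOPWORDS = {"the", "a", "and", "to", "of", "we", "is", "in", "that", "for"}


def word_frequencies(text, stopwords=STOPWORDS):
    """Count lowercase word tokens, skipping stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)


def save_word_cloud(frequencies, path, background_color="white", colormap="Dark2"):
    """Render a frequency table to an image file (requires wordcloud)."""
    from wordcloud import WordCloud

    cloud = WordCloud(
        width=800,
        height=400,
        background_color=background_color,
        colormap=colormap,  # any matplotlib colormap name
    )
    cloud.generate_from_frequencies(frequencies)
    cloud.to_file(path)


# Hypothetical sample sentence, not a quote from the debates.
freqs = word_frequencies("We need a plan, and the plan is bold")
```

Something like `save_word_cloud(freqs, "speaker_a.png")` would write one image per speaker. Building the frequency table separately makes it easy to inspect for leftover transcript artifacts before anything is drawn.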

Overall I think this was a fun little exercise, but I don't suspect that it says too much about the race.

Let me know what you think! Especially if you notice a mistake.

Top comments (6)

Thao Thanh Luu • Aug 2 '19

The word clouds are really difficult to read due to the colors, but regardless, this is great!

Charles Landau • Aug 2 '19

Thanks for the feedback! I also did a light-themed version but in my quick testing I felt that it looked worse. You can check it out below:

Let me know if you think that's better

Kevin K. Johnson • Aug 2 '19

Probably easier to read, but both versions have contrast issues. The purples on the dark background, the yellows on the light background.

Additionally, word clouds are a bit difficult to read generally.

Charles Landau • Aug 2 '19

Yea I agree with that. The Python wordcloud package does the best it can.

David J Eddy • Aug 2 '19

Any chance we could get links to large format images? They look interesting but are small (when I click on them). I esp. like the word cloud concept.

Charles Landau • Aug 2 '19

Thanks! The source images are in the kaggle link I shared at the top.

kaggle.com/charleslandau/democrati...
