Dictionaries are good for storing structured data. They map keys to values, so you can look up the value stored under a given key. The author provides some ways to initialize a dictionary, with comments about what is more or less pythonic (I'll take the author's word for it, but I'm open to other perspectives).
Some of the things you can do with dictionaries are query keys and values, assign new key/value pairs, check for the existence of keys, or retrieve certain values.
empty_dict = {} # most pythonic
empty_dict2 = dict() # less pythonic
grades = {"Joel": 80, "Grus": 99} # dictionary literal
type(grades) # type check, dict
# use bracket to look up values
grades["Grus"] # 99
grades["Joel"] # 80
# KeyError for looking up non-existent keys
try:
    kate_grades = grades["Kate"]
except KeyError:
    print("That key doesn't exist")
# use the 'in' operator to check for the existence of a key
joe_has_grade = "Joel" in grades
joe_has_grade # True
kate_does_not = "Kate" in grades
kate_does_not # False
# use 'get' method to get values in dictionaries
grades.get("Joel") # 80
grades.get("Grus") # 99
grades.get("Kate") # default: None
# assign new key/value pair using brackets
grades["Tim"] = 93
grades # {'Joel': 80, 'Grus': 99, 'Tim': 93}
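One thing worth trying alongside the lookups above is the optional second argument to get(), which replaces the default None. Here is a minimal sketch, assuming the same two-student grades dictionary; the fallback grade of 0 and the use of setdefault() are my own additions for illustration, not something from the book.

```python
# Hypothetical grade book, mirroring the example above
grades = {"Joel": 80, "Grus": 99}

# get() accepts a second argument that is returned when the key is missing
kate_grade = grades.get("Kate", 0)

# get() only reads; it does not add the missing key to the dictionary
kate_added = "Kate" in grades

# setdefault() returns the existing value, or stores and returns the default
tim_grade = grades.setdefault("Tim", 93)
joel_grade = grades.setdefault("Joel", 0)  # key exists, so 80 is kept

print(kate_grade, kate_added, tim_grade, joel_grade)
print(grades)
```

The difference to keep in mind is that get() leaves the dictionary untouched, while setdefault() inserts the key when it is missing.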
Dictionaries are good for representing structured data that can be queried. The key take-away here is that in order to iterate through dictionaries to get either keys, values or both, we'll use specific methods like keys(), values() or items(). Looping over the dictionary itself yields only its keys.
tweet = {
    "user": "paulapivat",
    "text": "Reading Data Science from Scratch",
    "retweet_count": 100,
    "hashtags": ["#66daysofdata", "datascience", "machinelearning", "python", "R"]
}
# query specific values
tweet["retweet_count"] # 100
# query values within a list
tweet["hashtags"] # ['#66daysofdata', 'datascience', 'machinelearning', 'python', 'R']
tweet["hashtags"][2] # 'machinelearning'
# retrieve ALL keys
tweet_keys = tweet.keys()
tweet_keys # dict_keys(['user', 'text', 'retweet_count', 'hashtags'])
type(tweet_keys) # different data type: dict != dict_keys
# retrieve ALL values
tweet_values = tweet.values()
tweet_values # dict_values(['paulapivat', 'Reading Data Science from Scratch', 100, ['#66daysofdata', 'datascience', 'machinelearning', 'python', 'R']])
type(tweet_values) # different data type: dict != dict_values
# create iterable for Key-Value pairs (in tuple)
tweet_items = tweet.items()
# iterate through tweet_items
for key, value in tweet_items:
    print("These are the keys:", key)
    print("These are the values:", value)
# iterating over the original tweet dictionary yields only its keys,
# so unpacking each key into (key, value) fails with
# ValueError: too many values to unpack (expected 2)
for key, value in tweet:
    print(key)
# 'enumerate' does not help either: it only provides index and key (no value)
for key, value in enumerate(tweet):
    print(key) # prints 0 1 2 3 - index values
    print(value) # user text retweet_count hashtags (prints the keys, not the values)
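To pull the iteration options together, here is a small sketch of the patterns that do work. It assumes a trimmed-down, two-key version of the tweet dictionary, and the variable names (keys_seen, pairs_seen, indexed) are just mine for illustration.

```python
# Trimmed-down stand-in for the tweet dictionary above (assumption)
tweet = {"user": "paulapivat", "retweet_count": 100}

# Iterating over the dictionary itself yields its keys
keys_seen = [key for key in tweet]

# values() yields only the values
values_seen = list(tweet.values())

# items() yields (key, value) tuples that can be unpacked in the loop header
pairs_seen = []
for key, value in tweet.items():
    pairs_seen.append(f"{key}={value}")

# If an index is needed too, enumerate() can wrap items()
indexed = [(i, key, value) for i, (key, value) in enumerate(tweet.items())]

print(keys_seen)
print(values_seen)
print(pairs_seen)
print(indexed)
```

Since Python 3.7, dictionaries preserve insertion order, so each of these loops visits the entries in the order they were added.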
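Just like in lists and tuples, you can use the in operator to check membership. Applied to a dictionary, in checks the keys; to search the values, apply it to the result of values(). The one caveat is that in does not look inside values that are lists, unless you use bracket notation to reach the list first.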
# search keys
"user" in tweet # True
"bball" in tweet # False
# search values
"paulapivat" in tweet_values # True
'python' in tweet_values # False (python is nested in 'hashtags')
"hashtags" in tweet # True
# finding values inside a list requires brackets to help
'python' in tweet['hashtags'] # True
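If you don't know in advance which key holds the list, one option is to loop over the values and check each one. This is only a rough sketch: appears_in_values is a hypothetical helper I made up for illustration, and it only looks one level deep into list, tuple or set values.

```python
# Trimmed-down stand-in for the tweet dictionary above (assumption)
tweet = {
    "user": "paulapivat",
    "retweet_count": 100,
    "hashtags": ["datascience", "python"],
}

def appears_in_values(d, target):
    """Return True if target equals a value, or is an element of a
    list/tuple/set value (one level deep only)."""
    for value in d.values():
        if value == target:
            return True
        if isinstance(value, (list, tuple, set)) and target in value:
            return True
    return False

print("python" in tweet.values())            # plain membership misses nested items
print(appears_in_values(tweet, "python"))     # the helper looks inside the list
print(appears_in_values(tweet, "bball"))
```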
What is or is not hashable?
Dictionary keys must be hashable. Strings are hashable, so we can use strings as dictionary keys, but we cannot use lists because they are not hashable.
paul = "paul"
type(paul) # check type, str
hash(paul) # e.g. -3897810863245179227 ; strings are hashable (the exact value changes between Python sessions)
paul.__hash__() # same value as hash(paul) ; another way to find the hash
jake = ['jake'] # this is a list
type(jake) # check type, list
# lists are not hashable - cannot be used as dictionary keys
try:
    hash(jake)
except TypeError:
    print('lists are not hashable')
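Strings are not the only hashable type. As a quick sketch of the same idea, tuples of hashable items and frozensets can also serve as keys. The is_hashable helper and the coords dictionary below are hypothetical names I'm using for illustration only.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

checks = {
    "str": is_hashable("paul"),
    "tuple": is_hashable((1, 2)),
    "frozenset": is_hashable(frozenset({"jake"})),
    "list": is_hashable(["jake"]),
    # a tuple is only hashable if everything inside it is hashable
    "tuple_with_list": is_hashable((1, [2])),
}
print(checks)

# Tuples of hashable items work as dictionary keys
coords = {(0, 0): "origin", (1, 2): "point A"}
origin_label = coords[(0, 0)]
print(origin_label)
```

So if you need a list-like key, converting the list to a tuple first is a common workaround.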
For more content on data science, machine learning, R, Python, SQL and more, find me on Twitter.