⚠️ I like to use Pipenv to manage dependencies and virtual environments for Python applications – you can use any other dependency manager that suits you best.
The libraries I am going to use are:
- geopandas
- matplotlib
- mind-the-gap
- pandas
- seaborn
- twython
Getting the information
The first thing I need to do is download the information about the London bike stations. For this I will use mind-the-gap, a library that I wrote myself (I can tell you about it in the future), to query the TfL API. I will put this code in a separate file to keep things modular. With mind-the-gap, querying the bike points looks like this:
from tfl.api import bike_point
all_bike_points = bike_point.all()
# Now we can take a single element and verify its content
place = all_bike_points[0]
print(f"{place.commonName} (LAT: {place.lat}, LON: {place.lon})")
# out: Vicarage Gate, Kensington (LAT: 51.504723, LON: -0.192538)
Additionally, each of those elements, like place, contains a list of additional properties (AdditionalProperties) from which we can extract information such as the total number of docks, how many of those docks are empty, and how many bikes are available. To extract this additional information, I created this helper function:
from typing import List

def get_number(additional_properties: List[AdditionalProperties], key: str) -> int:
    # Expects exactly one property with the given key
    [nb] = [prop.value for prop in additional_properties if prop.key == key]
    return int(nb)
# Then we can use it as:
bikes = get_number(place.additionalProperties, "NbBikes")
empty_docks = get_number(place.additionalProperties, "NbEmptyDocks")
docks = get_number(place.additionalProperties, "NbDocks")
print(f"{place.commonName} has {bikes} bikes available and {docks} docks in total")
# out: Vicarage Gate, Kensington has 3 bikes available and 18 docks in total
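One detail of get_number worth spelling out is the `[nb] = [...]` unpacking: it only succeeds when exactly one property matches the key. The sketch below shows that behaviour in isolation. The AdditionalProperties dataclass here is a hypothetical stand-in with just the two fields the helper reads, and the string values are an assumption (suggested by the int(nb) conversion), not something taken from the TfL API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AdditionalProperties:
    """Hypothetical stand-in: only the two fields the helper reads."""

    key: str
    value: str


def get_number(additional_properties: List[AdditionalProperties], key: str) -> int:
    # Single-element unpacking raises ValueError unless exactly one property matches
    [nb] = [prop.value for prop in additional_properties if prop.key == key]
    return int(nb)


# Made-up sample values, for illustration only
props = [
    AdditionalProperties("NbBikes", "3"),
    AdditionalProperties("NbEmptyDocks", "15"),
    AdditionalProperties("NbDocks", "18"),
]

print(get_number(props, "NbBikes"))

try:
    get_number(props, "NbScooters")  # no property has this key
except ValueError as error:
    print(f"Could not read the property: {error}")
```

Failing loudly like this is handy for a bot: if the API ever stops returning one of the keys, the script stops instead of tweeting a map built from bad numbers.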
Then we can create a data frame using a for loop:
from datetime import datetime

import pandas as pd

def download_cycles_info() -> pd.DataFrame:
    all_bike_points = bike_point.all()
    query_time = datetime.now()
    data = []

    for place in all_bike_points:
        bikes = get_number(place.additionalProperties, "NbBikes")
        empty_docks = get_number(place.additionalProperties, "NbEmptyDocks")
        docks = get_number(place.additionalProperties, "NbDocks")
        data.append(
            (
                place.id, place.commonName,
                place.lat, place.lon,
                bikes, empty_docks, docks,
            )
        )

    data_df = pd.DataFrame(
        data, columns=["id", "name", "lat", "lon", "bikes", "empty_docks", "docks"]
    ).set_index("id")
    # Same timestamp for every row, rounded down to the minute
    data_df["query_time"] = pd.to_datetime(query_time).floor("Min")
    # Share of docks that are not empty
    data_df["proportion"] = (data_df["docks"] - data_df["empty_docks"]) / data_df["docks"]
    return data_df
bike_info_data_frame = download_cycles_info()
bike_info_data_frame.head()
| id | name | lat | lon | bikes | empty_docks | docks | query_time | proportion |
|:---------------|:--------------------------|--------:|----------:|--------:|--------------:|--------:|:--------------------|-------------:|
| BikePoints_103 | Vicarage Gate, Kensingt | 51.5047 | -0.192538 | 1 | 17 | 18 | 2022-01-28 16:18:00 | 0.0555556 |
| BikePoints_105 | Westbourne Grove, Baysw | 51.5155 | -0.19024 | 14 | 11 | 26 | 2022-01-28 16:18:00 | 0.576923 |
| BikePoints_106 | Woodstock Street, Mayfa | 51.5141 | -0.147301 | 13 | 8 | 21 | 2022-01-28 16:18:00 | 0.619048 |
| BikePoints_107 | Finsbury Leisure Centre's | 51.526 | -0.096317 | 8 | 12 | 20 | 2022-01-28 16:18:00 | 0.4 |
| BikePoints_108 | Abbey Orchard Street, W | 51.4981 | -0.132102 | 21 | 8 | 29 | 2022-01-28 16:18:00 | 0.724138 |
I have placed these two functions in a file named download.py at the root of my repository; I will use it later.
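To make the proportion column concrete, the sketch below recomputes it for the first three rows of the table above. The "unaccounted" column is my own addition for illustration, not part of download.py: it counts docks that are neither empty nor holding an available bike.

```python
import pandas as pd

# Values copied from the first three rows of the table above
sample = pd.DataFrame(
    {
        "bikes": [1, 14, 13],
        "empty_docks": [17, 11, 8],
        "docks": [18, 26, 21],
    },
    index=["BikePoints_103", "BikePoints_105", "BikePoints_106"],
)

# Same formula as in download_cycles_info: share of docks that are not empty
sample["proportion"] = (sample["docks"] - sample["empty_docks"]) / sample["docks"]

# Illustrative extra column: docks that are neither empty nor holding an available bike
sample["unaccounted"] = sample["docks"] - sample["empty_docks"] - sample["bikes"]

print(sample)
```

Note that the proportion is based on occupied docks rather than on available bikes. For BikePoints_105 the two differ by one dock (15 occupied docks, 14 available bikes), which is why its proportion is 15/26 and not 14/26.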
Plotting the information
There are about 750 bike stations in London. As I want to make this information as accessible as possible, it occurred to me that the best way to do it was through an image showing the occupancy of each of these stations.
Getting a map
Before I start, I need a map of London in a format that the computer can interpret, and I found one that I can even download programmatically from the city government's website. To make my life easier, I created a Makefile with a task called shapefiles that downloads and moves the necessary files:
shapefiles:
	wget https://data.london.gov.uk/download/statistical-gis-boundary-files-london/9ba8c833-6370-4b11-abdc-314aa020d5e0/statistical-gis-boundaries-london.zip
	unzip statistical-gis-boundaries-london.zip
	mv statistical-gis-boundaries-london/ESRI shapefiles/
	rm -rf statistical-gis-boundaries-london statistical-gis-boundaries-london.zip
This should leave us with a folder called shapefiles with the following contents:
shapefiles
├── London_Borough_Excluding_MHW.GSS_CODE.atx
├── London_Borough_Excluding_MHW.NAME.atx
├── London_Borough_Excluding_MHW.dbf
├── London_Borough_Excluding_MHW.prj
├── London_Borough_Excluding_MHW.sbn
├── London_Borough_Excluding_MHW.sbx
├── London_Borough_Excluding_MHW.shp
├── London_Borough_Excluding_MHW.shp.xml
└── London_Borough_Excluding_MHW.shx
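Before plotting, it can be worth checking that the download left the important pieces in place. A shapefile is really a set of files: .shp, .shx and .dbf are the mandatory parts, and .prj holds the coordinate reference system that a reprojection relies on. The following is a minimal sketch of such a check; missing_components and REQUIRED_SUFFIXES are names I made up for this example and are not part of the project.

```python
from pathlib import Path
from typing import List

# Files a shapefile reader needs next to the .shp;
# .prj carries the coordinate reference system
REQUIRED_SUFFIXES = (".shp", ".shx", ".dbf", ".prj")


def missing_components(folder: str, stem: str = "London_Borough_Excluding_MHW") -> List[str]:
    """Return the required file names that are absent from `folder`."""
    base = Path(folder)
    return [
        f"{stem}{suffix}"
        for suffix in REQUIRED_SUFFIXES
        if not (base / f"{stem}{suffix}").is_file()
    ]


if __name__ == "__main__":
    missing = missing_components("shapefiles")
    print("All good" if not missing else f"Missing: {missing}")
```

An empty list means the folder is ready to be read; anything else suggests running the shapefiles task again.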
Plotting a map
This function is fairly simple. There is a follow-up post where I go into further detail about how I created the map; for the time being, I will just post the code and talk generally about what is going on:
import geopandas as gpd
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

def plot_map(cycles_info: pd.DataFrame) -> str:
    # Reproject the boroughs to EPSG:4326 so they share lat/lon coordinates with the stations
    london_map = gpd.read_file("shapefiles/London_Borough_Excluding_MHW.shp").to_crs(epsg=4326)
    fig = plt.figure(figsize=(6, 4), dpi=170)
    ax = fig.gca()
    london_map.plot(ax=ax)
    sns.scatterplot(y="lat", x="lon", hue="proportion", palette="Blues", data=cycles_info, s=25, ax=ax)
    prepare_axes(ax, cycles_info)
    map_file = save_fig(fig)
    return map_file
The first thing is to read the .shp file of the map that we are going to use. Then we create a figure and take its axes to draw on. We draw the map using the plot method of GeoDataFrame. We use seaborn to put the stations on the map: keep in mind that we are specifying the location (lat, lon) for each point, the colour of each point is defined by the proportion column, and the size of each point is 25. To finish, we make some adjustments to the axes and save the figure to a temporary location, returning the path where the generated image is saved.
I have saved this function in a separate file named plot.py.
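The two helpers called by plot_map, prepare_axes and save_fig, are not shown in this post. Purely as an illustration of the "adjust the axes" and "save to a temporary path" steps described above, they could look something like the sketch below. Both bodies are my guess rather than the actual code, and the margin parameter is hypothetical.

```python
import tempfile

import matplotlib

matplotlib.use("Agg")  # headless backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd


def prepare_axes(ax: plt.Axes, cycles_info: pd.DataFrame, margin: float = 0.01) -> None:
    """Hypothetical version: zoom in on the stations and hide the axis decorations."""
    ax.set_xlim(cycles_info["lon"].min() - margin, cycles_info["lon"].max() + margin)
    ax.set_ylim(cycles_info["lat"].min() - margin, cycles_info["lat"].max() + margin)
    ax.set_axis_off()


def save_fig(fig: plt.Figure) -> str:
    """Hypothetical version: write the figure to a temporary PNG and return its path."""
    # delete=False keeps the file on disk after the handle is closed
    with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as handle:
        map_file = handle.name
    fig.savefig(map_file, bbox_inches="tight")
    plt.close(fig)  # release the figure's memory between runs
    return map_file
```

Returning a plain path keeps plot_map decoupled from whatever consumes the image next.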
Tweeting the information
We already have the image; it's time to tweet it using Twython. We are going to need a few secrets that we got from Twitter in the previous post. Let's use those secrets to create a Twython client:
import os

from twython import Twython

app_key = os.environ["API_KEY"]
app_secret = os.environ["API_SECRET"]
oauth_token = os.environ["ACCESS_TOKEN"]
oauth_token_secret = os.environ["ACCESS_TOKEN_SECRET"]
twitter = Twython(app_key, app_secret, oauth_token, oauth_token_secret)
The Twitter API requires that we first upload the image to their service and then tweet it. For both steps we are going to use the newly created twitter variable; the trick is to use the media_id we get back from uploading the image:
with open(image_path, "rb") as cycles_png:
    image = twitter.upload_media(media=cycles_png)

now = datetime.now().strftime("%m/%d/%Y, %H:%M")
twitter.update_status(
    status=f'London Cycles update at {now}',
    media_ids=[image['media_id']]
)
To keep the code modular, I put it inside a function and placed that function in its own tweeter.py file.
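The wrapped function is not shown here, so the following is only a sketch of one way it could be arranged. It reuses the Twython calls from above; the build_client helper and the optional twitter parameter are assumptions of mine, added so the function can be exercised with a fake client without touching the real API.

```python
import os
from datetime import datetime


def build_client():
    """Create a Twython client from the four environment variables used above."""
    from twython import Twython  # imported lazily: only needed when really tweeting

    return Twython(
        os.environ["API_KEY"],
        os.environ["API_SECRET"],
        os.environ["ACCESS_TOKEN"],
        os.environ["ACCESS_TOKEN_SECRET"],
    )


def tweet(image_path: str, twitter=None) -> None:
    """Upload the image, then post a status that references its media_id."""
    if twitter is None:  # hypothetical hook: callers may pass their own client
        twitter = build_client()

    with open(image_path, "rb") as cycles_png:
        image = twitter.upload_media(media=cycles_png)

    now = datetime.now().strftime("%m/%d/%Y, %H:%M")
    twitter.update_status(
        status=f"London Cycles update at {now}",
        media_ids=[image["media_id"]],
    )
```

Called as tweet(map_image), it reads the secrets from the environment; passing a stand-in object as twitter makes it possible to check the upload-then-status order locally.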
Conclusion
We already have everything in place. Now we can combine all our functions so that a single script downloads the information, generates a map and tweets it:
from download import download_cycles_info
from plot import plot_map
from tweeter import tweet
def execute():
    information = download_cycles_info()
    map_image = plot_map(information)
    tweet(map_image)
I saved this code in a file called app.py. And this is what it looks like by the end of this post.
Remember that you can find me on Twitter at @feregri_no to ask me about this post – if something is not so clear or you found a typo. The final code for this series is on GitHub and the account tweeting the status of the bike network is @CyclesLondon.