Antonio Feregrino

Posted on Feb 14, 2022 • Edited on May 29, 2022

Programming the lambda – Tweeting from a lambda

#python #github #aws #lambda

⚠️ I like to use Pipenv to manage dependencies and virtual environments for Python applications – you can use any other dependency manager that suits you best.

The libraries I am going to use are:

geopandas
matplotlib
mind-the-gap
pandas
seaborn
twython

Getting the information

The first thing I need to do is download the information about London's bike stations. For this I will use a library that I created myself (I can tell you about it in the future) to query the TfL API, and I will put the code in a separate file to keep things modular. This is how to do it with mind-the-gap:

from tfl.api import bike_point

all_bike_points = bike_point.all()

# Now we can take a single element and verify its content
place = all_bike_points[0]
print(f"{place.commonName} (LAT: {place.lat}, LON: {place.lon})")
# out: Vicarage Gate, Kensington (LAT: 51.504723, LON: -0.192538)

Additionally, each of those elements, like place, contains a list of AdditionalProperties from which we can extract information such as the total number of docks, how many of those docks are empty, and how many bikes are available. To extract this additional information, I created this helper function:

def get_number(additional_properties: List[AdditionalProperties], key: str) -> int:
    [nb] = [prop.value for prop in additional_properties if prop.key == key]
    return int(nb)

# Then we can use it as:
bikes = get_number(place.additionalProperties, "NbBikes")
empty_docks = get_number(place.additionalProperties, "NbEmptyDocks")
docks = get_number(place.additionalProperties, "NbDocks")

print(f"{place.commonName} has {bikes} bikes available and {docks} docks in total")
# out: Vicarage Gate, Kensington has 3 bikes available and 18 docks in total
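
The list unpacking in get_number does double duty: it only succeeds when exactly one property matches the key, so a missing or duplicated key raises a ValueError instead of silently returning a wrong number. Here is a minimal sketch of that behaviour. The Prop class is a hypothetical stand-in for the library's AdditionalProperties, used only so the snippet runs without calling the API:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Prop:
    """Hypothetical stand-in for AdditionalProperties (only key and value)."""
    key: str
    value: str


def get_number(additional_properties: List[Prop], key: str) -> int:
    # Unpacking into a one-element list fails unless exactly one match exists
    [nb] = [prop.value for prop in additional_properties if prop.key == key]
    return int(nb)


props = [Prop("NbBikes", "3"), Prop("NbEmptyDocks", "15"), Prop("NbDocks", "18")]

print(get_number(props, "NbBikes"))  # 3

try:
    get_number(props, "NbScooters")  # no such key in the list
except ValueError as error:
    print(f"Missing key: {error}")
```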

Then we can build a data frame using a for loop:

def download_cycles_info() -> pd.DataFrame:
    all_bike_points = bike_point.all()
    query_time = datetime.now()
    data = []

    for place in all_bike_points:
        bikes = get_number(place.additionalProperties,"NbBikes")
        empty_docks = get_number(place.additionalProperties,"NbEmptyDocks")
        docks = get_number(place.additionalProperties,"NbDocks")
        data.append(
            (
                place.id, place.commonName,
                place.lat, place.lon,
                bikes, empty_docks, docks,
            )
        )

    data_df = pd.DataFrame(
        data, columns=["id","name","lat","lon","bikes","empty_docks","docks"]
    ).set_index("id")
    data_df["query_time"] = pd.to_datetime(query_time).floor("Min")
    data_df["proportion"] = (data_df["docks"] - data_df["empty_docks"]) / data_df["docks"]

    return data_df

bike_info_data_frame = download_cycles_info()
bike_info_data_frame.head()

| id             | name                      |     lat |       lon |   bikes |   empty_docks |   docks | query_time          |   proportion |
|:---------------|:--------------------------|--------:|----------:|--------:|--------------:|--------:|:--------------------|-------------:|
| BikePoints_103 | Vicarage Gate, Kensingt   | 51.5047 | -0.192538 |       1 |            17 |      18 | 2022-01-28 16:18:00 |    0.0555556 |
| BikePoints_105 | Westbourne Grove, Baysw   | 51.5155 | -0.19024  |      14 |            11 |      26 | 2022-01-28 16:18:00 |    0.576923  |
| BikePoints_106 | Woodstock Street, Mayfa   | 51.5141 | -0.147301 |      13 |             8 |      21 | 2022-01-28 16:18:00 |    0.619048  |
| BikePoints_107 | Finsbury Leisure Centre's | 51.526  | -0.096317 |       8 |            12 |      20 | 2022-01-28 16:18:00 |    0.4       |
| BikePoints_108 | Abbey Orchard Street, W   | 51.4981 | -0.132102 |      21 |             8 |      29 | 2022-01-28 16:18:00 |    0.724138  |
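
To double-check the proportion column, here is the same arithmetic in plain Python for two of the rows above. Note that proportion is computed from docks and empty_docks rather than from bikes: for Westbourne Grove, 14 bikes plus 11 empty docks add up to 25, not 26, so counting bikes would give a slightly different result (presumably one dock is out of service, although the data itself does not say so). The occupancy function name is only for illustration:

```python
def occupancy(docks: int, empty_docks: int) -> float:
    """Share of docks that are not empty, as in the proportion column."""
    return (docks - empty_docks) / docks


# Values copied from the table above
vicarage_gate = occupancy(docks=18, empty_docks=17)
westbourne_grove = occupancy(docks=26, empty_docks=11)

print(round(vicarage_gate, 7))     # 0.0555556
print(round(westbourne_grove, 6))  # 0.576923
```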

I have placed these two functions in a file named download.py in the root of my repository; I will use it later.

Plotting the information

There are about 750 bike stations in London. Since I want to make this information as accessible as possible, it occurred to me that the best way to do it was through an image showing the occupancy of each of these stations.

Getting a map

Before I start, I need a map of London in a format that the computer can interpret, and I just found one that I can even download programmatically from the city government's website. To make my life easier, I created a Makefile with a task called shapefiles that downloads and moves the necessary files:

shapefiles:
    wget https://data.london.gov.uk/download/statistical-gis-boundary-files-london/9ba8c833-6370-4b11-abdc-314aa020d5e0/statistical-gis-boundaries-london.zip
    unzip statistical-gis-boundaries-london.zip
    mv statistical-gis-boundaries-london/ESRI shapefiles/
    rm -rf statistical-gis-boundaries-london statistical-gis-boundaries-london.zip

This should leave us with a folder called shapefiles with the following content:

shapefiles
├── London_Borough_Excluding_MHW.GSS_CODE.atx
├── London_Borough_Excluding_MHW.NAME.atx
├── London_Borough_Excluding_MHW.dbf
├── London_Borough_Excluding_MHW.prj
├── London_Borough_Excluding_MHW.sbn
├── London_Borough_Excluding_MHW.sbx
├── London_Borough_Excluding_MHW.shp
├── London_Borough_Excluding_MHW.shp.xml
└── London_Borough_Excluding_MHW.shx
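
Since the plotting code depends on these files, it can be handy to fail early if the make task has not been run yet. This is a small sketch and not part of the original project: missing_shapefile_parts is a name I made up, and it only checks for the .shp, .shx and .dbf files, which are the mandatory parts of a shapefile:

```python
from pathlib import Path
from typing import List

# The three mandatory components of an ESRI shapefile
REQUIRED_SUFFIXES = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(folder: str, stem: str) -> List[str]:
    """Return the names of the mandatory files that are not in folder."""
    base = Path(folder)
    expected = [f"{stem}{suffix}" for suffix in REQUIRED_SUFFIXES]
    return [name for name in expected if not (base / name).is_file()]


missing = missing_shapefile_parts("shapefiles", "London_Borough_Excluding_MHW")
if missing:
    print(f"Run `make shapefiles` first, missing: {missing}")
```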

Plotting a map

This function is fairly simple. There is a follow-up post where I go into further detail about how I created the map; for the time being, I will just post the code and talk generally about what is going on:

def plot_map(cycles_info: pd.DataFrame) -> str:
    london_map = gpd.read_file("shapefiles/London_Borough_Excluding_MHW.shp").to_crs(epsg=4326)

    fig = plt.figure(figsize=(6, 4), dpi=170)
    ax = fig.gca()

    london_map.plot(ax=ax)
    sns.scatterplot(y="lat", x="lon", hue="proportion", palette="Blues", data=cycles_info, s=25, ax=ax)

    prepare_axes(ax, cycles_info)

    map_file = save_fig(fig)

    return map_file

The first step is to read the .shp file of the map we are going to use and reproject it to latitude and longitude coordinates (EPSG:4326), so that it lines up with the station positions. Then we create a figure and take its axes to draw on. We draw the map using the plot method of GeoDataFrame. We use seaborn to put the stations on the map: we specify the location (lat, lon) of each point, the color of each point is defined by the proportion column, and the size of each point is 25. To finish, we make some adjustments to the axes and save the figure to a temporary location, returning the path where the generated image is saved.
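
The save_fig helper is not shown in this post. As a rough idea of what the "save to a temporary location and return the path" step could look like, here is a sketch under my own assumptions (it is not necessarily the code used in the project). It only relies on the figure having a savefig method, which matplotlib figures do:

```python
import tempfile


def save_fig(fig) -> str:
    """Save fig as a PNG in the system's temporary folder and return its path."""
    # delete=False keeps the file around after the handle is closed
    with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as handle:
        path = handle.name

    # Write once the handle is closed, so the file is not open twice
    fig.savefig(path)
    return path
```

This fits the lambda setting reasonably well, because on Linux tempfile normally places files under /tmp, which is the writable part of the file system on AWS Lambda.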

I have saved this function in a separate file named plot.py.

Tweeting the information

We already have the image; it's time to tweet it using Twython. We're going to need a few secrets that we got from Twitter in the previous post. Let's use those secrets to create a Twython client:

import os

from twython import Twython

app_key = os.environ["API_KEY"]
app_secret = os.environ["API_SECRET"]
oauth_token = os.environ["ACCESS_TOKEN"]
oauth_token_secret = os.environ["ACCESS_TOKEN_SECRET"]

twitter = Twython(app_key, app_secret, oauth_token, oauth_token_secret)
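
Reading from os.environ with square brackets raises a KeyError on the first missing variable, which is fine but not very descriptive. If you prefer a single message listing everything that is missing, something along these lines could work. The load_twitter_credentials function is hypothetical; only the four variable names come from the code above:

```python
import os
from typing import Dict, Mapping

REQUIRED_VARIABLES = ("API_KEY", "API_SECRET", "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")


def load_twitter_credentials(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the four secrets, or raise listing every missing variable."""
    missing = [name for name in REQUIRED_VARIABLES if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in REQUIRED_VARIABLES}
```

The returned dictionary keeps the same order as the positional arguments used above (app key, app secret, token, token secret), so its values can be handed to Twython in that order.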

The Twitter API requires that we first upload the image to their service and then tweet it. For both steps we use the newly created twitter variable; the trick is to pass along the media_id we get back from uploading the image:

with open(image_path, "rb") as cycles_png:
    image = twitter.upload_media(media=cycles_png)

now = datetime.now().strftime("%m/%d/%Y, %H:%M")
twitter.update_status(
    status=f'London Cycles update at {now}',
    media_ids=[image['media_id']]
)

To keep the code modular, I put this code inside a function, and this function in its own tweeter.py file.
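
The function itself is not shown here, so this is only a sketch of how it might be arranged. Unlike the tweet(map_image) call used later in app.py, this version receives the client as a second argument (my own choice for the example), which makes it possible to try it out with a fake client instead of the real Twitter API:

```python
from datetime import datetime


def tweet(image_path: str, twitter) -> None:
    """Upload the image at image_path and tweet it with a timestamp."""
    # Step 1: upload the picture; the response carries its media_id
    with open(image_path, "rb") as cycles_png:
        image = twitter.upload_media(media=cycles_png)

    # Step 2: publish the status, attaching the uploaded picture
    now = datetime.now().strftime("%m/%d/%Y, %H:%M")
    twitter.update_status(
        status=f"London Cycles update at {now}",
        media_ids=[image["media_id"]],
    )
```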

Conclusion

We already have everything in place. Now we can combine all our functions so that a single script downloads the information, generates a map and tweets it:

from download import download_cycles_info
from plot import plot_map
from tweeter import tweet

def execute():
    information = download_cycles_info()
    map_image = plot_map(information)
    tweet(map_image)

I saved this code in a file called app.py. And this is what it looks like by the end of this post.

Remember that you can find me on Twitter at @feregri_no to ask me about this post – if something is not so clear or you found a typo. The final code for this series is on GitHub and the account tweeting the status of the bike network is @CyclesLondon.
