In this article, I explain how I got frustrated with writing the redundant code needed to extend a service every time a new requirement came up, and how I automated that work with code generation.
You can skip to one of the sections:
0) Backstory / Lore Time
1) The Redundant Work 🥱
2) Writing a Code Generator 🧑‍💻
3) The Result ⚡
Backstory / Lore Time
A new customer onboarded at Middleware with 100 times the data of our previous customers, which caused our data pipelines to choke. We had to ship a hotfix ASAP.
After mitigation, we discovered that most of their data was irrelevant bot-generated content, which we could filter out during data sync. We implemented a hotfix to filter data at ingestion, and it worked well.
Next, I was tasked with adding a new setting so that bot-generated data could be filtered without code changes. While this would give us more control over what we sync, it was a boring 🥱, redundant task that required understanding the Setting Service context.
The Redundant Work 🥱
We have a Setting Service in our codebase that handles all the settings in our product. The code follows a great structure; it's easy to breeze through and handles any type of setting needed by the product.
Adding a new setting requires a developer to find a previous PR where someone added one, regain context, make code changes across multiple files (from adapters to validators) that all depend on the new setting's schema, and verify that the APIs still work.
The code changes are straightforward; they just feel like a lot of manual work with little gain and are heavily dependent on the class schema of the new setting type. It's easily half a day of work for any developer.
When I got the task to add a setting, I thought to myself: 🤔💡 If some work feels redundant and follows changes based on a set structure, I should try to automate it.
Writing a Code Generator 🧑‍💻
I had no idea how I could automate code generation based on some rules. I have often heard my friends working in bigger organizations speak about how they have whole services that build out APIs and layouts for them based on certain schemas, so I knew it was possible.
Research
I googled whether a solution already existed. People have built similar things, but nothing that fit my use case.
I asked ChatGPT for a quick solution, but none of its suggestions worked. My next idea was to just send all the files to GPT using a script and get updated files back, but that had two problems:
1) LLMs are not reliable enough with code generation.
2) Sharing proprietary code would get me fired.
Breaking Down the Problem 🛠️
The next step was to verify if a solution was even possible for our use case, so I did the following:
1) Gained all the context needed to add a new setting to the codebase.
2) Tracked all the files and specific classes and functions where the changes would go.
3) Mapped the nature of changes needed in each file with the new setting schema.
So far, the function/variable/class naming was consistent with the setting type name, and the logic for adapters and validators could be coded out and automated based on the primitive data types used in my new setting type schema.
The idea: add placeholder comments wherever new code has to go. If my script could work out which generated code belongs at which comment, insert it there, and move the comment down so it can be reused next time, the problem would be solved.
The Solution
The first challenge was to create pure functions that could generate code based on the new setting name and schema. Generating new enums, classes, and dictionaries was straightforward, but creating handlers, adapters, and validators was more complex and time-consuming. I used GPT and Claude to help develop a generic solution for this.
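To make the idea of "pure functions that generate code" concrete, here is a minimal sketch, not the actual generator from the PR. The names `PRIMITIVES`, `class_name_for`, `generate_dataclass`, and `generate_validator`, the `...Setting` naming convention, and the shape of the schema (a non-empty dict of field name to primitive type name) are all assumptions made for illustration.

```python
from typing import Dict

# Hypothetical set of primitive type names the generator understands.
PRIMITIVES = {"str", "int", "bool", "float"}


def class_name_for(setting_type: str) -> str:
    """'bot_filter' -> 'BotFilterSetting' (the naming convention is an assumption)."""
    return "".join(p.capitalize() for p in setting_type.split("_")) + "Setting"


def generate_dataclass(setting_type: str, schema: Dict[str, str]) -> str:
    """Return the source of a dataclass with one field per schema entry.

    Assumes a non-empty schema; the target file must import `dataclass`.
    """
    lines = ["@dataclass", f"class {class_name_for(setting_type)}:"]
    for field, type_name in schema.items():
        if type_name not in PRIMITIVES:
            raise ValueError(f"unsupported type {type_name!r} for field {field!r}")
        lines.append(f"    {field}: {type_name}")
    return "\n".join(lines) + "\n"


def generate_validator(setting_type: str, schema: Dict[str, str]) -> str:
    """Return the source of a function that type-checks a raw dict."""
    lines = [f"def validate_{setting_type}(data: dict) -> None:"]
    for field, type_name in schema.items():
        lines.append(f"    if not isinstance(data.get({field!r}), {type_name}):")
        lines.append(f"        raise ValueError({field + ' must be ' + type_name!r})")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    schema = {"include_bots": "bool", "max_age_days": "int"}
    print(generate_dataclass("bot_filter", schema))
    print(generate_validator("bot_filter", schema))
```

Because both functions only take a name and a schema and return a string, they are easy to test in isolation before any file is touched.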
The next challenge was locating placeholder comments across files and inserting the generated code while handling Python's indentation issues. This was particularly painful as I had little experience with regex and had to learn it on the go.
Here is how I managed the code population bit, explained with an example because it was new for me and might be for you as well:
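import re

# [ \t]* (rather than \s*) keeps the captured indent on the placeholder's own line.
enum_pattern = r"(?P<indent>[ \t]*)# ADD NEW SETTING TYPE ENUM HERE\n"
match = re.search(enum_pattern, content)
if match:
    indent = match.group('indent')
    new_enum_entry = f'{indent}{setting_type.upper()} = "{setting_type.upper()}"\n'
    # Re-emit the placeholder after the new entry so it can be reused next time.
    content = re.sub(enum_pattern, new_enum_entry + match.group(0), content)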
- enum_pattern: The regex pattern to find the placeholder comment.
- re.search: Searches for the pattern in the file content.
- match.group('indent'): Captures the indentation level.
- re.sub: Replaces the placeholder with the new enum entry followed by the placeholder itself, preserving the indentation.
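The same trick can be generalized to multi-line blocks such as handlers or validators. The helper below is a sketch under assumptions, not the code from the PR: `insert_above_placeholder` is a hypothetical name, and it assumes each placeholder is a comment on its own line. It uses string slicing instead of `re.sub`, which sidesteps backslash-escape processing in the replacement string.

```python
import re
import textwrap


def insert_above_placeholder(content: str, placeholder: str, code: str) -> str:
    """Insert `code` above the first `# <placeholder>` line, indented to match it.

    The placeholder line itself is left in place so it can be reused.
    """
    pattern = re.compile(
        r"^(?P<indent>[ \t]*)# " + re.escape(placeholder) + r"[ \t]*$",
        re.MULTILINE,
    )
    match = pattern.search(content)
    if match is None:
        raise ValueError(f"placeholder not found: {placeholder}")
    # Normalize the block's own indentation, then shift it to the placeholder's level.
    block = textwrap.indent(textwrap.dedent(code).strip("\n"), match.group("indent"))
    return content[: match.start()] + block + "\n" + content[match.start():]


if __name__ == "__main__":
    source = (
        "class SettingType(Enum):\n"
        '    EXISTING = "EXISTING"\n'
        "    # ADD NEW SETTING TYPE ENUM HERE\n"
    )
    print(insert_above_placeholder(
        source, "ADD NEW SETTING TYPE ENUM HERE", 'BOT_FILTER = "BOT_FILTER"'
    ))
```

Running the helper twice on the same file appends the second entry below the first, because the placeholder always stays at the bottom of the block.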
And all this effort paid off ✨
The Result ⚡
The hours of struggling to build the code generator script were worth it.
I was able to build a script that would prompt the user for the new setting name and the required fields along with their types and make changes across files. All the developers would need to do is add the imports (because they were too messy to handle) and handle any complex data types (which would be rather simple as 90% of the code is generated).
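The prompting step could look roughly like the sketch below. The `name:type` input format, the `ALLOWED_TYPES` set, and the function names `parse_fields` and `prompt_for_setting` are assumptions for illustration; the real script in the PR may collect its input differently.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical set of primitive types the generator can handle on its own.
ALLOWED_TYPES = {"str", "int", "bool", "float"}


def parse_fields(specs: Iterable[str]) -> Dict[str, str]:
    """Turn ['include_bots:bool', 'max_age:int'] into {'include_bots': 'bool', ...}."""
    schema: Dict[str, str] = {}
    for spec in specs:
        name, sep, type_name = (part.strip() for part in spec.partition(":"))
        if not sep or not name.isidentifier():
            raise ValueError(f"expected 'name:type', got {spec!r}")
        if type_name not in ALLOWED_TYPES:
            raise ValueError(f"unsupported type {type_name!r} for field {name!r}")
        if name in schema:
            raise ValueError(f"duplicate field {name!r}")
        schema[name] = type_name
    return schema


def prompt_for_setting() -> Tuple[str, Dict[str, str]]:
    """Ask for the setting name and its fields on the command line."""
    setting_type = input("Setting name (snake_case): ").strip()
    raw_fields = input("Fields as name:type, comma separated: ")
    return setting_type, parse_fields(raw_fields.split(","))


if __name__ == "__main__":
    print(prompt_for_setting())
```

Validating the input up front means the generator fails before it edits any file, rather than leaving half-generated code behind.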
This is the Pull Request that adds the script to our Middleware Open Source Codebase. I would recommend going through the code if you wish to implement a similar solution for your use case: https://github.com/middlewarehq/middleware/pull/433
This brought down the development time of adding a setting from a few hours to less than 10 minutes and, most importantly, helped me and other developers escape the boring work!
Thanks for sticking till the end.
If you liked the article, please take a moment to star the open-source repositories I maintain:
⭐ https://github.com/middlewarehq/middleware
⭐ https://github.com/RocketChat/Apps.Github22
You can follow me on socials:
GitHub/samad-yar-khan
LinkedIn/samad-yar-khan
X/samadnotyouryar