Editor's note: Jesus Martinez joined Lob's Print and Mail API team for a 3-month internship. He reworked our redactions and contributed to overall improvements within our APIs. This article was written about his experience improving user data redaction at Lob.
Because Lob's customer base includes companies in the healthcare and insurance industries, our Engineering teams are keenly aware of the need to manage data privacy. Whether driven by contractual and compliance obligations (e.g., GDPR and CCPA) or by customer requests and retention policies, engineers need to be able to redact personally identifiable information (PII).
Improving user data redaction at Lob
User data redaction involves obfuscating sensitive user data. In Lob’s case, this includes, but isn't limited to, first name, last name, and email address.
At Lob, we have a way to programmatically handle redaction for resources like letters and postcards, but our redaction system needed updating to include user data. User data was redacted manually by querying the production users table; to scale, that process needed to be automated. In this post, we'll walk through how we implemented our solution.
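To make the idea concrete, here is a minimal sketch of field-level redaction. It is not Lob's implementation: the field names (`first_name`, `last_name`, `email`), the placeholder value, and the `redact_user` helper are all assumptions chosen for illustration.

```python
from typing import Any, Dict

# Hypothetical field names and placeholder; the real schema may differ.
REDACTED_FIELDS = ("first_name", "last_name", "email")
PLACEHOLDER = "[REDACTED]"


def redact_user(user: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the user record with sensitive fields obfuscated."""
    redacted = dict(user)  # shallow copy so the caller's record is untouched
    for field in REDACTED_FIELDS:
        if field in redacted:
            redacted[field] = PLACEHOLDER
    return redacted
```

Non-sensitive fields such as an internal id pass through unchanged, so the record can still be referenced after redaction.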
Diving into the problem
While scoping this project, I reviewed our redaction codebase to determine whether rewriting it to include user data made sense. To avoid disrupting existing functionality, I opted to create an endpoint designed for user data redactions and make it work within the current redaction system.
One of my favorite parts of being an engineer is putting on my detective cap and digging into a problem. Working on the redactions project necessitated the use of a magnifying glass.
The solution I landed on was to store user data redactions in a database table. By making POST requests to a new redactions endpoint, Lob can store information about the user to redact, while a worker runs in the background and checks daily for new users to redact. The worker does the following:
- Selects new records with the status 'created' from the redactions table
- Assigns the record a unique id
- Updates the status of the record from 'created' to 'ongoing'
- Calls an AWS step function, which retrieves the necessary information and fields
- The AWS step function makes a POST request to the {resource}/redact endpoint, which handles the actual redaction
- Updates the status of the record to 'success' or 'failed'
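The worker steps above can be sketched as a small status state machine. This is an illustrative sketch under stated assumptions, not Lob's code: `RedactionRecord`, `run_worker`, and the `start_redaction` callback are hypothetical names, an in-memory list stands in for the redactions table, and the callback stands in for the AWS step function call.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class RedactionRecord:
    """Hypothetical in-memory stand-in for a row in the redactions table."""
    resource: str                       # e.g. "users"
    resource_id: str                    # id of the resource to redact
    status: str = "created"
    redaction_id: Optional[str] = None


def run_worker(
    records: List[RedactionRecord],
    start_redaction: Callable[[RedactionRecord], bool],
) -> None:
    """Process every 'created' record once, mirroring the steps listed above."""
    for record in records:
        if record.status != "created":
            continue                               # only pick up new records
        record.redaction_id = str(uuid.uuid4())    # assign a unique id
        record.status = "ongoing"
        try:
            # Assumption: the callback wraps the step function call and
            # returns True when the redaction went through.
            succeeded = start_redaction(record)
        except Exception:
            succeeded = False
        record.status = "success" if succeeded else "failed"
```

Passing the step function call in as a callback keeps the status transitions testable without any AWS dependency. A real worker would read from and write back to the database rather than mutate objects in memory.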
In addition to the endpoint, I created two migration scripts. The first script updated constraints on the redactions table to allow users to be a redactable resource. The second script added ‘date_redacted’ and ‘redaction_id’ properties to the users table, and the model and validators were updated so users could be treated as a redactable resource.
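As a rough sketch of what the second migration does, the snippet below adds the two columns to a users table. It uses Python's built-in `sqlite3` module so it can run anywhere, which is a deliberate swap: the real migration targets Postgres, where the column types (for example a timestamp type for `date_redacted`) and the migration tooling would differ. The function names and the column types are assumptions.

```python
import sqlite3
from typing import List


def migrate_users_table(conn: sqlite3.Connection) -> None:
    """Add the redaction columns to an existing users table (types assumed)."""
    conn.execute("ALTER TABLE users ADD COLUMN date_redacted TEXT")
    conn.execute("ALTER TABLE users ADD COLUMN redaction_id TEXT")
    conn.commit()


def column_names(conn: sqlite3.Connection, table: str) -> List[str]:
    """List a table's column names; PRAGMA table_info rows are (cid, name, ...)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
```

Both columns are nullable, so existing user rows remain valid and simply read as not yet redacted.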
After my initial investigation, I proceeded to define a user redaction endpoint. Testing the new endpoint locally worked fine, but an issue surfaced when I deployed it to our staging environment. I discovered I needed to enable the step function to access the new redaction endpoint.
Lob uses a node-wrapper to make HTTP requests. I had to publish a new version of this wrapper to include the new user redaction capabilities. After writing many more tests, I deployed it to staging and everything worked as expected.
One of Lob’s core values is Own the Outcome. To help future Lob developers, I updated our internal documentation with implementation details on user data redaction and how to extend Lob’s redaction system to support future use cases.
Takeaways
This project was a success for several reasons:
- A solution was implemented to automate the redaction of user data
- The project contributed to making the redactions process easier
- The solution has been used to redact user data for customers
- Not only did I deliver an important new feature, but I gained experience with Docker, AWS LocalStack, AWS Lambda, AWS Step Functions, and Postgres.
I don’t think you can ask for anything more from an internship.