The Bike Shed
360: ActiveRecord Models
Fellow thoughtboter Sarah Lima joins Joël to discuss an issue Sarah ran into during a recent code review: making HTTP requests in an ActiveRecord model. Her concern was that the class would take on too many responsibilities, breaking the single-responsibility principle and making the class hard to maintain. Because the ActiveRecord layer is meant to encapsulate business rules and data, adding another responsibility on top of it would be too much. Her solution was to extract a class that would handle the whole HTTP request process.
This episode is brought to you by Airbrake: frictionless error monitoring and performance insight for your app stack.
- SQL TRIM()
- Iteration as an anti-pattern
- WET tests
- thoughtbot code review guidelines
- Side effects in tests
- Active Resource
- Different strategies for 3rd party requests
Transcript:
JOËL: Hello and welcome to another episode of The Bike Shed, a weekly podcast from your friends at thoughtbot about developing great software. I'm Joël Quenneville. And today, I'm joined by fellow thoughtboter Sarah Lima.
SARAH: Happy to be here.
JOËL: And together, we're here to share a little bit of what we've learned along the way. So, Sarah, what's new in your world?
SARAH: Well, after a year and a half working on the same thoughtbot client, I have rolled off, and I have joined a new team. And I am learning a lot, not only about a new codebase but also about working with a new team. So that's always challenging, and this time is no different.
JOËL: What is something that you like to do when joining a new team to help smooth the onboarding process?
SARAH: Well, I think especially getting to know people with one on ones. This time, I didn't do that right away because I had a bunch of time off scheduled right at the beginning of the project. But I did it right after I came back. And I'm learning a lot about my new colleagues, how they like to work, how they learn best. So, for instance, there are some people that like to learn and grow by reading blog posts, reading books, and there are other people that don't like that as much.
JOËL: So when you joined the new project, you just reached out to all of these people and set up a few meetings just to get to know them.
SARAH: Yeah, exactly.
JOËL: That's really good. I've never done that on a project. And now that you've said it, it kind of seems obvious. Maybe I should do that moving forward to get to know new teammates.
SARAH: Yeah. And I think it's easier on my project because it's a very small team. There are four of us thoughtboters, and there are just two client developers. So it was easier.
JOËL: What about on the code side of things? Are there any tricks you like to do when you're first getting started in a new codebase?
SARAH: Well, I think I really enjoy diving in right away, working on something small, and asking questions. I have also found it helpful in the past, especially on larger codebases, that someone that's experienced on a project gives me an overview showing me the quirks. And, of course, a good README is always a good thing to have, and during the process, always be updating the README. In this recent project, it was not different. I opened a lot of PRs to update the README. So that was good to have a PR right on your first day.
JOËL: I love that. I think that's usually my goal when I start on a new project is to have a PR the first day that fixes anything in the setup script that has been broken since the last person onboarded or documentation that was wrong.
SARAH: Yeah, absolutely.
JOËL: It's always a strong first contribution.
SARAH: Yeah. What about you, Joël? What's going on? What's new in your world?
JOËL: I've been investigating flaky tests, and I ran across a wild bug this week. I had a test that would fail every now and then. And it was pulling some data from Postgres and then doing some transformations on it. And I couldn't figure out why it was failing. It was a complex query. So it was just pulling out not ActiveRecord objects but a raw array of values. At some point, I was putting a puts statement in the code with the array of values I expected to get and the array I would actually get.
And I was surprised to see that there is a field in there that is a float that was rounded to a different number of decimal places. I was like, that doesn't seem right. And so I was digging into it more, and I found out that this decimal value is from a timestamp that is in a file name for an mp4 video file name. And what is happening is that when we're querying the database, we're trying to extract the timestamp out of the file name by dropping the .mp4 file extension. And we're using the SQL TRIM function.
Unfortunately, TRIM does not do whatever the original authors thought it does. It doesn't just remove that substring from the end, but instead, it will remove any of those characters, so in my case, any of dot, M, P, or 4 in any combination from the end of the string. So anytime that my timestamp ended in a four, any fours were just getting chopped off. So if it ended in 44.mp4, the 44 would also get removed, not just the .mp4, which meant that randomly whenever a timestamp happened to end in 4, my test would flake.
SARAH: Wow. Do you have any idea how much time you spent debugging that?
JOËL: Oh, it probably took, I'd say, a day or two, spread over a couple of debugging sessions. Most of that time went into finding the particular location of the bug. In the end, it's a couple of days' work, and the diff is only a few lines. But I'm sure that the discussion on the PR is going to be really interesting. There's probably going to be a description that is a lot longer than the actual diff.
SARAH: Yeah, 100%. [laughs]
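The TRIM behavior Joël describes can be reproduced outside the database. The following is a minimal Ruby sketch, not the project's actual query or fix: `sql_style_rtrim` is a hypothetical helper that imitates how trimming with a character list treats its argument as a set of characters, and the file name is made up for illustration. Ruby's `String#delete_suffix` is shown for contrast because it removes only the exact suffix.

```ruby
# Hypothetical helper that imitates trimming trailing characters from a SQL
# string: `characters` is treated as a set, and any trailing run made up of
# those characters is removed.
def sql_style_rtrim(string, characters)
  result = string.dup
  result.chop! while !result.empty? && characters.include?(result[-1])
  result
end

file_name = "1662.44.mp4" # made-up file name whose timestamp ends in 4

# Set-based trimming also eats the trailing "44" and the dot before it.
trimmed = sql_style_rtrim(file_name, ".mp4")
# => "1662"

# Removing the exact suffix keeps the timestamp intact.
exact = file_name.delete_suffix(".mp4")
# => "1662.44"

puts trimmed
puts exact
```

A timestamp that does not end in one of the trimmed characters, such as "1662.37.mp4", gives the same result either way, which is why the test only failed some of the time.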
JOËL: Have you run across any interesting PRs on your new project?
SARAH: Yeah, I did. In fact, I recently reviewed a PR that had three interesting main issues that I wanted to address. And I wanted to lead the person that was working on it to a slightly better solution. So the three issues I saw were that the tests that were added were very DRY, so that was making everything a bit difficult to understand. The second one was that I saw one of the ActiveRecord classes was making HTTP requests, and that didn't sound like a good idea to me.
JOËL: That is unusual.
SARAH: Yes. The third one was that there were a lot of collections being built iteratively where another Enumerable method would be a better fit, such as map instead of an each call.
JOËL: Oh, this is a classic situation where you're just using each to go through and transform something, and you've got some sort of external array that you're mutating as part of the each.
SARAH: Yes.
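As a rough illustration of the pattern being described, here is a small Ruby sketch. The file names are made up, and the transformation (stripping an extension) is only an assumption chosen to keep the example short. The first version mutates an external array inside `each`; the second expresses the same transformation with `map`.

```ruby
file_names = ["intro.mp4", "demo.mp4", "outro.mp4"] # hypothetical data

# Iterative version: an array declared outside the block is mutated
# on every pass through `each`.
titles_with_each = []
file_names.each do |file_name|
  titles_with_each << File.basename(file_name, ".mp4")
end

# `map` version: the new collection is the return value, so there is
# no external state to set up or mutate.
titles_with_map = file_names.map { |file_name| File.basename(file_name, ".mp4") }

puts titles_with_map.inspect
```

Both versions produce the same array; the `map` version simply states the intent (transform every element) directly.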
JOËL: There's a great thoughtbot article, I believe, by Joe Ferris on Iteration as an Anti-pattern.
SARAH: I think it's by Mike Burns. And I have referred to that article. In fact, I had very good articles for two of these three problems. I referred to a bunch of articles about WET tests as opposed to DRY tests, like how striving for tests that are DRY is not a good idea as opposed to telling a whole story in your tests. And I referred to that other article how iteratively building a collection can be an anti-pattern by Mike Burns. But the second issue about HTTP requests I didn't have anything to refer to. Maybe we should write one.
JOËL: This reminds me that in the thoughtbot Slack, we have a custom emoji for you should write a blog post about that. And this would probably be a good time to use it.
SARAH: Yes. So, Joël, how do you typically handle a PR that is maybe too long, and you have a lot of concerns about it? And how do you handle delivering that feedback?
JOËL: Oh, that is a challenge. I've definitely done it poorly in the past. And I think the wrong way to go about that situation is to go thoroughly through the PR and leave 50, 60 comments. That is overwhelming for the other person. And they're going to have a really bad day when they see 50 comments come through. And there's so much that they can't really address the main things you were talking about anyway.
So what I generally try to do, and it's kind of nice now that GitHub doesn't immediately publish your comments, is if I realize...like I start putting some more detailed comments, and then I realize, oh, there's going to be a lot, zoom out a little bit, and try to find are there some higher level trends that I can talk about? And maybe even just summarize in a larger comment at the bottom and say, "Hey, I see some larger structural issues," or "This PR is leaning very heavily on a technique that I think is maybe not the best use here. Maybe we should discuss that," instead of digging into maybe the actual implementation details of the code.
SARAH: Yeah, funny you should mention that. I have recently also started doing that, using the summary version of GitHub reviews. I used to just go file by file and leave comments right away. And I'm thinking that this is not a good idea, especially when the PR is long. So I think another thing I would do is also call the person to pair and ask questions and understand where the person is coming from and also explain what your concerns are and how you both can get to a better place with that PR.
JOËL: That's really important. You have to remember there's another person on the other end of this. I love the idea of reaching out to them directly. Especially if there's a larger conversation to be had around approach or implementation, it's often easier to resolve those directly rather than back and forth in GitHub comments. So you mentioned situations where the PR is really long. Have you ever had to push back on that in some way?
SARAH: Yes. Especially when I saw, whoa, that's going to be difficult to understand, that's going to be difficult to review. And I have reached out to the person to say, "Hey, what about we split that PR in two?" Of course, thinking about splitting the PR in a way that makes sense, in a way that still delivers value to our users as soon as possible.
JOËL: I've been in situations like that where it's a really long PR, and the person has already invested a lot of work into it. And maybe it's even gone through a round of reviews. It feels almost too late to ask them to split up the work. But then I've actually regretted not doing that because there's so much complexity going on that then it doesn't work, or there are some bugs in it. We struggle to ship this, or it might just have to go through so many rounds of review and re-review and re-review. And because the PR is so long, it's a huge commitment for me to re-review it every time.
So there are situations I've been in where I wish that before even looking at the code at all, I was like, this is too long. We need to either slim down the story of what's being done. Because sometimes that's what happens is that the ticket is not well-defined, and someone goes in and just sort of keeps adding more code. And it becomes a bit of a big ball of mud. So, either helping to refine the ticket first or splitting the PR rather than actually looking at the code.
SARAH: Yeah, and pairing often can also help with that. So especially as consultants, our clients may ask us to work on different projects, and you work alone. And you may have tight deadlines, but I think it's always helpful to find time anyway to help your colleagues as well.
JOËL: I like that. I think there's a lot of value in the work that we do, where we collaborate with others in addition to whatever we do solo. So, oftentimes, it's great to pair with people at a client where possible, to become involved in the code review process, to even get involved in maybe some of the broader system design conversations and sprint planning. All of those things are really good to jump into rather than getting siloed into working on just a solo feature.
SARAH: Yes, 100%.
MID-ROLL AD:
Debugging errors can be a developer’s worst nightmare...but it doesn’t have to be. Airbrake is an award-winning error monitoring, performance, and deployment tracking tool created by developers for developers that can actually help cut your debugging time in half.
So why do developers love Airbrake? It has all of the information that web developers need to monitor their application - including error management, performance insights, and deploy tracking!
Airbrake’s debugging tool catches all of your project errors, intelligently groups them, and points you to the issue in the code so you can quickly fix the bug before customers are impacted.
In addition to stellar error monitoring, Airbrake’s lightweight APM helps developers to track the performance and availability of their application through metrics like HTTP requests, response times, error occurrences, and user satisfaction.
Finally, Airbrake Deploy Tracking helps developers track trends, fix bad deploys, and improve code quality.
Since 2008, Airbrake has been a staple in the Ruby community and has grown to cover all major programming languages. Airbrake seamlessly integrates with your favorite apps to include modern features like single sign-on and SDK-based installation. From testing to production, Airbrake notifiers have your back.
Your time is valuable, so why waste it combing through logs, waiting for user reports, or retrofitting other tools to monitor your application? You literally have nothing to lose. Head on over to airbrake.io/try/bikeshed to create your FREE developer account today!
JOËL: So one of the things you mentioned that stood out for you when you were doing some code review recently was making HTTP requests in an ActiveRecord model. Why is that something that sort of caught your eyes, maybe an area to push back on in a particular design?
SARAH: That's a good question. My concern with that approach was that our class was having too many responsibilities that would break the SRP principle, the single-responsibility principle, and that would make our class hard to maintain. So the ActiveRecord layer is a layer that's meant to encapsulate business rules and data. So I was worried that adding another responsibility on top of it would be too much. So my idea was that we would extract a class that would handle the whole HTTP request process.
JOËL: Yeah, I feel like my instincts typically when I've done third-party integrations is that the ActiveRecord class should not know about the external internet world. It knows about the database. It knows about some of its core model functionality but that knowing about the internet world is somebody else's responsibility and that, ideally, the direction of dependency should flow the other way. So maybe the class that makes an external request knows about the ActiveRecord object if it needs to let's say, instantiate an instance of that model using data from an external request.
Or maybe it's even some third-party thing; maybe it's their controller that knows how to make or that will ask another object to make a request to some API and might also make a request to the model and ask it for some database data and then combine those two together. But that the ActiveRecord object only knows about that database area of responsibility and doesn't know that other things are also happening in the system.
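One possible shape for the extraction discussed above is sketched below in plain Ruby. Everything named here is hypothetical: `Profile` stands in for an ActiveRecord model (a Struct is used so the sketch runs without Rails), `ProfileFetcher` is an invented class that owns the HTTP request, and the `api.example.com` URL and `display_name` field are assumptions. The point is the direction of dependency: the fetcher knows about the model, while the model knows nothing about HTTP.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical stand-in for an ActiveRecord model. It holds data and
# knows nothing about the network.
Profile = Struct.new(:username, :display_name, keyword_init: true)

# Hypothetical class that owns the whole HTTP request process.
class ProfileFetcher
  # `http` only needs to respond to `get(uri)` and return the body as a
  # String, which is what Net::HTTP.get does. Injecting it lets tests
  # pass in a fake instead of touching the network.
  def initialize(http: Net::HTTP, base_url: "https://api.example.com")
    @http = http
    @base_url = base_url
  end

  # Fetches the remote data and builds a model instance from it.
  def fetch(username)
    uri = URI.join(@base_url, "/profiles/#{username}")
    data = JSON.parse(@http.get(uri))
    Profile.new(username: username, display_name: data.fetch("display_name"))
  end
end
```

Because the HTTP dependency is passed in, a test can hand `ProfileFetcher` any object with a `get` method that returns canned JSON, and tests for the model itself need no network stubbing at all.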
SARAH: Absolutely. And I was also thinking that that class would have a difficult test to write. So a good idea is to separate our code that is side-effectful into their own classes, and that makes our tests so much easier.
JOËL: I actually wrote an article on the topic where one of my realizations at some point was that a lot of the pain points in code are what functional programmers would call side effects, so things like HTTP requests. And these are often things where we need to stub or do other things. And so isolating them as much as possible often simplifies our tests.
SARAH: Yeah, certainly. And I refer to that article every time I have the chance.
JOËL: Have you encountered the general concept of layered architectures, or hexagonal architectures, or things like that in the world of Rails or maybe elsewhere?
SARAH: Not hexagonal architecture. I have heard about it, but I haven't dived into it yet. Can you give us an overview?
JOËL: So I've also not worked with an actual hexagonal architecture. But the general idea, I guess, of layered architectures is that you build your code in a variety of layers, and different layers don't have access to or don't know about the ones...and I forget in this model if it's above or below, let's say it's below. So the inner layers don't know about the outer layers, but the outer layers can know about anything below them.
And so if the core of your app is the database, your database is most definitely not knowing about anything outside of just its data. And your ActiveRecord models that sit on top of that know about the database, but they don't know if they're being fronted by a web application, or a command line, or anything else. And then, above that, you might have more of a business process layer that knows about the database. It might know about how to make some external requests, but it doesn't know about anything above that.
And then, maybe at the final layer, you've got an application layer that handles things like controllers and interactions with users of the site. The core idea is that you split it into layers, and the higher-up layers know about everything below them, but no layer knows about what's above it. I feel like we're loosely applying that to the situation here with ActiveRecord in that it feels like the ActiveRecord layer if you will, shouldn't really know about third-party API requests.
SARAH: So, one exception to that is the ActiveResource approach that connects our business objects to REST services. So if you have an external website and you want to connect it via HTTP, you can do it using Rails ActiveResource.
JOËL: That is interesting because it functions like an ActiveRecord object, but instead of being backed by the database, it's backed by some kind of API. I almost wonder if...let's refactor our mental model here. And instead of saying that HTTP belongs in a separate layer that's higher up, maybe, in this case, it's almost like a sibling layer.
So your ActiveRecord models know about the database, and they make database requests. An ActiveResource model, or I think there are some gems that provide similar behavior, might be backed by a particular API, but neither of them should know about the other. So maybe an ActiveResource model should not be making database requests.
SARAH: Yes, I like that line of thought.
JOËL: I guess the question then becomes, what about interactions between the two where you want to, I don't know, have some kind of association? You know, I don't think I've ever used ActiveResource on a project.
SARAH: I did once when trying to work with something close to microservice architecture. So we had a monolith, and we built a small service that was also in Rails, and we needed to consume the data that was stored in the monolith.
JOËL: And did you like that approach?
SARAH: Yeah. I think in that specific scenario, it was very productive. And I really enjoyed the API that Rails provided me via ActiveResource.
JOËL: Did you ever have to mix ActiveResource models and ActiveRecord models?
SARAH: No, I didn't; thankfully, not. I have never thought about that.
JOËL: So maybe in most applications, those two will just sort of naturally fall into maybe separate parts of the app, and they don't need to interact that much.
SARAH: Yeah, I think that will be the case. So mixing two of those subjects we're talking about here, that's testing and HTTP requests; we've been having a discussion in our project about the usage of VCR. That's a gem that records your HTTP interactions and replays them during tests. We've been discussing if using it is a good idea or not because we've been having issues with cassettes (that's one of VCR's concepts) when they are not valid anymore. So do you have any thoughts on the subject? Maybe that will make a whole episode.
JOËL: We could definitely do a whole episode, I think, on testing third-party APIs. VCR is one of multiple different strategies that can be used to not make actual real network requests in your tests, which brings some stability. There are also some downsides to it. I have found, in general, that over time, cassettes become brittle. So the idea of VCR is really cool. In practice, I think I've found that a few hand-rolled WebMock stubs usually do the job better for my needs.
SARAH: Yeah, I'll be interested in hearing that episode because, at least in my project, we have a lot of HTTP requests to external services, and they return a lot of information. I'm wondering if just dealing with that with WebMock would be too much work.
JOËL: One of the really useful things about VCR is that you can just make your request from anywhere, and it will just completely handle it. In some ways, though, I think it maybe hides some of that test pain that we were talking about earlier and allows you to sort of put HTTP in a lot of places that maybe you don't want it to. And by allowing yourself to feel a little bit of that test pain, you can more easily notice the places where maybe an object should not be making a request.
Or the actual HTTP logic can be moved to a concentrated place where all the HTTP is done together. And then only that object will need unit tests that actually need to mock the network, and most of your objects are fine. Where it gets interesting is more for things like integration tests, where now you're doing a lot of interactions, and you might have quite a few background requests that need to be made.
SARAH: I'm looking forward to the whole episode on this subject because I feel there's so much to talk about.
JOËL: There really is. I have a blog post that sort of summarizes a few different common categories of approaches to testing third-party requests, which might be different depending on whether you're doing a unit test or an integration test. But I grouped common solutions into four different categories. We'll make sure to link that in the show notes. So we've been talking a lot about testing. I'm curious, when you review a PR, do you start with the tests, maybe read through the tests first, and then the implementation?
SARAH: That's a good question. I have never thought about starting with tests. I think I'm going to give that a try anytime. But I just start reviewing them like by the first file that comes up. [laughs]
JOËL: I'm the same. I normally just do them in order. I have occasionally tried to do a test first, and that is sometimes interesting. Sometimes you read the test and, especially when you don't know what the implementation is going to be, you're like, why is this in the test? And then you jump to the implementation like, oh, that's what's going on.
Well, thank you so much, Sarah, for joining us on this whirlwind tour of code review, design of objects, and interacting with HTTP and testing.
SARAH: My pleasure.
JOËL: Where can people find you online if they would like to follow your work?
SARAH: I'm on Twitter @sarahlima_rb.
JOËL: We'll make sure to link that in the show notes. And with that, let's wrap up.
The show notes for this episode can be found at bikeshed.fm.
This show is produced and edited by Mandy Moore.
If you enjoyed listening, one really easy way to support the show is to leave us a quick rating or even a review in iTunes. It really helps other folks find the show.
If you have any feedback, you can reach us at @_bikeshed, or reach me at @joelquen on Twitter, or at hosts@bikeshed.fm via email. Thank you so much for listening to The Bike Shed, and we'll see you next week. Byeeeeeee!!!!!!
ANNOUNCER: This podcast was brought to you by thoughtbot. thoughtbot is your expert design and development partner. Let's make your product and team a success.
Sponsored By:
- Airbrake: Deploy fearlessly and fix bugs faster with Airbrake Error & Performance Monitoring. Airbrake notifiers are available for all major programming languages and frameworks, and install in minutes, with an open-source SDK-based install and near-zero technical debt. Spend less time tracking down bugs and more time developing. Frictionless error monitoring and performance insight for your app stack.