This article started because I thought that researching a particular bug could be useful to understand devto's source code a little better. So I li...
For further actions, you may consider blocking this person and/or reporting abuse
O/RM methods like
destroy_all
really need to be renameddestroy_slowly_and_painfully_use_sql_if_you_can
, I swear. This isn't the first time I've seen something like this. Batching ids to delete is one option; if you can just run a script or prepared statement to delete reactions by the article/post id that's even better since the engine can use an index on the foreign key (although the difference may be negligible at the scales we're talking about currently). If the reaction deletion from users' feeds is in-band it probably doesn't need to be, and managing the cache of reactions to a dead article is obviously superfluous. No idea why it's saving other articles since there isn't any metrics aggregation but that's already out of band like you mentioned.Ahahah, yeah pretty much.
Nope, definitely not. A lot of ORM like AR encourage their users to surrender to all possible callbacks and many of the gems used by Rails applications hook into those callbacks without the developer knowing so after a while it's not uncommon to be hit by this issue.
Yeah, a post is never going to have millions of reactions, at least not for a while. By the way there's no foreign key between the two tables (articles and reactions) so no index to take advantage of.
You could add an index without the proper foreign key constraint, or vice versa, but without looking at the data model my guess is that having both is probably best in this scenario.
You can't easily add a foreign key between reactions and articles because of the "polymorphic" nature of these relation. A reaction can be attached to a comment, to an article and other stuff I guess. So what Rails does it setup a pair of fields called
reactable_id
andreactable_type
like this:On top of those it mounts what it calles polymorphic associations. The gist is that every select triggered by something like
article.reactions
becomes aSELECT * FROM reactions WHERE reactable_type = 'Article' AND reactable_id = 1234
, same for the other queriesGross. The site runs on Postgres though iirc, so you could establish a foreign key relationship between
reactions
andreactables
as a parent table extended byarticles
,posts
, and so on. But I don't know if ActiveRecord would play nicely with that kind of specialized structure.No you can't use Postgresql inheritance in Rails, not easily.
There's a concept of single table inheritance but it's all virtual. The various entities share the same table and are distinguished by a
type
column. You can find the explanation here medium.com/@dcordz/single-table-in... - I've used it once and it was used to map entities with differed only in name (sensors with a uniform API)Another option is a complex junction table with
reaction_id
,article_id
,post_id
, and aCHECK
constraint ensuring that only one reactable foreign key can have a value. But that does add enough complexity to at least give me pause.The indexing in the current model could still be improved since
reactable_id
is unreliable on its own andreactable_type
is low-cardinality. A single index on(reactable_type, reactable_id)
would be much more useful (including for bulk delete!).Yeah, I think the index on both keys is a good comprise
It might be nice longterm to merge reactables if they're similar enough. Articles are almost exactly posts without parents, after all. Ancillary information can go into a specialization table with an optional 1:1 relationship.
This is fabulous, I’ll have some thoughts to add color in the morning.
There are a couple main themes here for me:
This is exactly why we open sourced. Dives like this help make the code awesome.
Thanks Ben! Take your time, I know it's a lot :-)
Yeah, count me in. I tend to move the callback logic in explicit "service objects" or something like that, like your "labor" folder. It makes testing a little bit easier as well I guess. These kind of refactorings definitely require lots of digging, testing and code breaking because you have to go counter the Rails way of trusting implicit side effects and make them explicit. The gain is that code usually becomes as side effect free as possible and more understandable for new comers because it tends to be narrated in the controller:
something like that :D
We currently do use this sort of pattern, and even give it its own folder
app/services
, but we only use it sometimes.Right now I'd say we're sort of caught in between "hacked to get things to work" and "decent architecture" state (most code could probably be described that way).
The problem is probably that we've sort of gone half-way with some of these things.
The fact that we have the labor folder and the services folder (and the liquid_tags) and the models folders is indicative of some indecision combined with lack of taking the time to actually take the time for dedicated large-scale refactoring.
There are different schools of thoughts in terms of folder structure here (should it all go in models, with subfolders?)
If you have any opinions here, I'm all ears.
PS, we actually define our own timeout at about 9 seconds.
RACK_TIMEOUT
accepts some a Rails-magicky ENV variable calledRACK_TIMEOUT_SERVICE_TIMEOUT
. We could extend this longer, but everything we do really should finish sooner, so it would be better to fix the code IMO.I think it's normal :) A way to refactor could be choosing one style and slowly convert the rest to it
Dunno. I've never liked the "fat model" approach because it leads to giant files with a lot of different business logic. For example, where do you put a method that operates on a user and a comment? In user? In comment? 42?
One thing that I like is having "third party services" clients in the same folder, especially if they are used in multiple part of the business logic.
I've also used presenters in the past to group the display logic and (some) query objects to hide complex queries and be able to test them in isolation.
These are a few articles on the topic of how to structure or refactor Rails code:
Another thing I would consider is to make more explicit what is an asynchronous job. The code base right now uses
delayed_job
directly with its various.delay
,_without_delay
andhandle_asynchronously
. Refactoring all of this into Active Jobs would have the effect of make jobs explicit and produce a niceapp/jobs
folder as a central repository for all the async stuff. This way you can test it in isolation if needed or if in the future you outgrow delayed job you know where to go to make the changes.Ah! Good to know :) Well, it's entirely possible that the wall is hit at 9 seconds then.
Agreed
I know it's a lot, but the great thing about refactoring is that it can be (mostly) done one step a time :-)