Be careful when using BETWEEN
with timestamps. You might end up chopping off a whole day of data 😬
Imagine you were chief safety inspector at a local trampoline park (bonus points if that is your job in real life). You might write a query like this to get a report of accidents in December:
SELECT *
FROM accidents
WHERE created_at BETWEEN '2019-12-01' AND '2019-12-31'
Looks good, right? Nope.
This query would omit any mishaps the whole day of December 31. Why? Your query only looks from midnight on Dec 1 to midnight on Dec 31. Any bump, abrasion, or mid-air collision that occurred after midnight on the 31st won't be in your results. The query above is the same as:
SELECT *
FROM accidents
WHERE created_at >= '2019-12-01 00:00:00.000000'
AND created_at <= '2019-12-31 00:00:00.000000'
You can avoid this problem by writing the query:
SELECT *
FROM accidents
WHERE created_at >= '2019-12-01'
AND created_at < '2020-01-01'
The lesson: save BETWEEN
for discrete quantities like integers. And stay away from trampoline parks. We’ve seen the data and it doesn't look pretty.
Top comments (2)
Yep! During reviews I beg other developers to avoid
BETWEEN
clause completely and be explicit about these kind of queries.Totally! BETWEEN seems like such a shortcut at first, but it definitely bites you eventually.
Btw, I see on your profile that you work at Euromonitor International, which totally takes me back! In a past internship, I used Euromonitor data to help small companies in Ohio export their products. I couldn't believe the depth and richness of your datasets on everything from flavors of ice cream to pistachio exports (sometimes correlated, haha). Lemme know if you're ever in need of a great SQL editor built for teams :) popsql.com