SQL is something many of us have gone to great lengths to try to abstract away. The Repository pattern, ORMs, and Type Providers are examples of abstractions around SQL. But each of these has trade-offs which do not fit every use case.
Today I put forth something that is less of an abstraction and more of a set of helper functions. This has trade-offs too, but it fits my needs very well. See what you think.
What it's like to use
It doesn't try to hide SQL from you; it just tries to get out of your way while you use SQL. Here's how you can define a query.
open QueryHelpers

let listCourses offset limit =
    sql
        """
        SELECT CourseId, CourseName
        FROM Course
        OFFSET @Offset ROWS
        FETCH NEXT @Limit ROWS ONLY
        ;
        """
        [
            p "Offset" offset
            p "Limit" limit
        ]
The helper functions in the above example are sql and p. Actually, p (short for parameter) is just a shortcut for making a tuple and boxing📌 the parameter value.
📌 Boxing parameter values

Boxing means converting something to type obj. For primitive types like int, converting to obj has a memory/performance penalty. (Because the value has to be wrapped in a reference type and allocated by the garbage collector, IIRC.) At the time of this writing, ADO.NET requires boxing parameter values before it will use them. So there is no disadvantage to doing so, and it can prevent type inference problems in our F# code.
Here's how you would run the query.
type Course =
    {
        CourseId : int
        CourseName : string
    }

let query = listCourses request.Offset request.Limit
let coursesAsync = Sql.query<Course> connectString query
// coursesAsync is Async<Course array>
Here, Sql.query<Course> is a function which runs the query and converts each data row into a Course object. As with most mappers, the property types and names have to match the returned columns.
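For example, consuming the result inside another async workflow might look like this (a rough sketch; the function name and the printing are just for illustration):

// Sketch: awaiting the query result and using the mapped Course records
let printCourses connectString =
    async {
        let! courses = Sql.query<Course> connectString (listCourses 0 20)
        for course in courses do
            printfn "%i - %s" course.CourseId course.CourseName
    }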
Note that the query is created separately from the code which executes the query. This is particularly handy for queries which perform updates. I can set up multiple writes ahead of time, even from separate pieces of code, then perform them later in the same transaction.
let deactivateCourse courseId =
    sql "UPDATE ..." [ p "CourseId" courseId ]

let cancelCourseRegistrations courseId =
    sql "DELETE ..." [ p "CourseId" courseId ]

...

let patches =
    [
        deactivateCourse courseId
        cancelCourseRegistrations courseId
    ]

...

Sql.writeBatch connectString patches
// returns Async<unit>
Implementing it
Here is the entire 40-line implementation for the usages above.
namespace Foo.MsSql

open System
open System.Data.SqlClient

type SqlQuery =
    {
        Query : string
        Parameters : (string * obj) list
    }

module QueryHelpers =

    let p name value =
        (name, box value)

    let sql query parameters =
        {
            Query = query
            Parameters = parameters
        }

module Sql =

    open Dapper // adds QueryAsync and ExecuteAsync

    let query<'T> connectString q =
        async {
            use connection = new SqlConnection(connectString)
            do! connection.OpenAsync() |> Async.AwaitTask
            let! result =
                connection.QueryAsync<'T>(q.Query, dict q.Parameters)
                |> Async.AwaitTask
            return Array.ofSeq result
        }

    let writeBatch connectString writes =
        async {
            use connection = new SqlConnection(connectString)
            do! connection.OpenAsync() |> Async.AwaitTask
            use transaction = connection.BeginTransaction()
            for write in writes do
                do!
                    connection.ExecuteAsync
                        (write.Query, dict write.Parameters, transaction)
                    |> Async.AwaitTask
                    |> Async.Ignore
            transaction.Commit()
        }
Dapper does the hard work of mapping rows to objects. Dapper is awesome.
This code is only a starting point, but it leaves plenty of room for adding new features. For instance, I have added the following over time, which I may cover in a later post:
- returning multiple data sets from a query (sketched below)
- more settings, like command timeout
- batching: combining writes into one query, which differs from the above by making only one round trip to the database
- large result set handling: returns a sequence that doesn't read into memory until iterated
- Postgres version
- parameter types, mainly for the Postgres jsonb type, which Npgsql does not correctly infer
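For the first of those, a helper for two result sets can be built on Dapper's QueryMultipleAsync. Here is a rough sketch of the idea (an illustration of the approach, not necessarily the exact code):

// Sketch: reading two result sets from one query in a single round trip
let queryMulti2<'T1, 'T2> connectString q =
    async {
        use connection = new SqlConnection(connectString)
        do! connection.OpenAsync() |> Async.AwaitTask
        use! grid =
            connection.QueryMultipleAsync(q.Query, dict q.Parameters)
            |> Async.AwaitTask
        let! first = grid.ReadAsync<'T1>() |> Async.AwaitTask
        let! second = grid.ReadAsync<'T2>() |> Async.AwaitTask
        return (Array.ofSeq first, Array.ofSeq second)
    }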
I did have to add Dapper Type Handlers for option types. But with that, I can map null DB fields to F# Option, and avoid null handling.
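A minimal handler along those lines looks roughly like this, using Dapper's SqlMapper.TypeHandler (a sketch; the registration function name is illustrative):

// Sketch: map NULL <-> None for any 'T option via a Dapper type handler
open System
open Dapper

type OptionHandler<'T>() =
    inherit SqlMapper.TypeHandler<'T option>()

    override this.SetValue(param, value) =
        param.Value <-
            match value with
            | Some v -> box v
            | None -> box DBNull.Value

    override this.Parse value =
        match value with
        | null -> None
        | :? DBNull -> None
        | v -> Some (v :?> 'T)

// Call once at startup to register a handler per optional type used in queries
let registerOptionTypes () =
    SqlMapper.AddTypeHandler (OptionHandler<int>())
    SqlMapper.AddTypeHandler (OptionHandler<string>())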
What about ...?
Getting only 1 row
Since the data is returned as an array, you can use F# Array functions like Array.tryHead to get the first row if one exists.
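For example (the WHERE query here is illustrative):

// Sketch: fetch a single course, if it exists
let tryGetCourse connectString courseId =
    async {
        let query =
            sql "SELECT CourseId, CourseName FROM Course WHERE CourseId = @CourseId;"
                [ p "CourseId" courseId ]
        let! courses = Sql.query<Course> connectString query
        return Array.tryHead courses
    }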
Getting a scalar
Dapper supports converting rows to primitives like int. So you can use Sql.query<int> and that will return the int value out of the first column for each row. Then you can use Array functions to get only the first row.
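For example (the COUNT query is illustrative):

// Sketch: read a single scalar value (the course count)
let countCourses connectString =
    async {
        let query = sql "SELECT COUNT(*) FROM Course;" []
        let! counts = Sql.query<int> connectString query
        return counts |> Array.tryHead |> Option.defaultValue 0
    }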
How far can this get you?
I use a few principles which make this (plus the additions I mention) satisfy all my SQL needs.
First, I separate decisions from side-effects. So my code returns messages representing what decisions and events happened as a result of the user's request. Later in the pipeline, these decisions are turned into side effects like saving to the database or sending emails.
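As a sketch of that idea, the decisions can be plain data which a later step translates into the SQL writes defined earlier (the union type here is illustrative):

// Illustrative decision type: plain data describing what should happen
type CourseDecision =
    | DeactivateCourse of courseId : int
    | CancelRegistrations of courseId : int

// Later in the pipeline, decisions are translated into SQL writes
let toPatch decision =
    match decision with
    | DeactivateCourse courseId -> deactivateCourse courseId
    | CancelRegistrations courseId -> cancelCourseRegistrations courseId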
Second, I model the data multiple ways. The first model is the authoritative model (source of truth) on which the others are based. This could be an event log or normalized relational tables. This is where most people stop when using a relational database. However, I additionally create models which are tailor-fit for specific purposes.
For example, I store a Course's data (which has several layers of nested objects) as a JSON object, since it is convenient to load and save it that way for Add/Edit/View scenarios. I have a separate model (table) for CourseSearch which consists of some of the same data, but as relational columns. These columns are used for filtering and sorting and populating a text search index, to enable users to perform Full Text Searches. Both models are useful for their specific purposes. This is Separation of Concerns at the data level. But if I tried to squeeze both into the same mega-model, it would be hard to work with. Because I'd always have to think of the mega-model from two different dimensions at once in order to make the right choices about it. My dimensionally-challenged brain can't do that consistently.
Third, as a rule, I do not have queries which both change data and return data. It's either/or. That way I can't introduce bugs due to unanticipated side effects. As a consequence, I cannot use auto-generated IDs. But in the end that turns out to be ok. Because they complicate other features, like retries. And they have to be abandoned (or outsourced) anyway once you outgrow a single server.
Due to the above, my queries stay relatively simple and targeted for a specific purpose. So a simple SQL API meets all my needs.
for each desired change, make the change easy (warning: this may be hard), then make the easy change
- Kent Beck
Top comments (16)
This looks really good -- in my attempts I am having trouble with null values. I've tried the linked OptionHandler but it doesn't seem to work...

Are you able to provide an example handling Option<string>?

Also, have you considered releasing what you've got as a library? If not, I might once I can work out the details...
Here is what I use to set up the optional type handlers. After that, string option and many other optional types are handled automatically.

Then as part of app startup, I call:
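Roughly this (one registration per optional type the queries use):

SqlMapper.AddTypeHandler (OptionHandler<int>())
SqlMapper.AddTypeHandler (OptionHandler<string>())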
To me this is still a bit too jagged of an edge. Probably the way around having to call something at app startup is to include the initialization as a let statement in the Sql module. But I am not sure if it would get optimized away in a production build, since it will never get referenced.

One other gotcha: the column order in the returned query data must match the property order on F# records. Maybe this is because of the way the F# record constructor is generated, but Dapper cannot figure it out if the returned data is in a different order.
I will set up a GitHub repo for it tonight, time permitting. I can include multi queries too (i.e. SELECT ... ; SELECT ... ;).

Created an initial repo. Still need to figure out how to make a NuGet package, add examples, etc.

SlimSql - seems a stupid name, but best I could come up with tonight.
Thanks for this!
I had a project on hold that could use this, but am restarting it now.

I currently have bits of code from your blog post, plus other bits I scraped together as I discover things I need.
When it settles down I will try and switch to SlimSql... and perhaps add a PR or two :)
Sure thing. I need to figure out how to execute the startup code so the user doesn't have to wire it into their app startup. I hate that Dapper requires this.
FWIW I am currently calling this as part of opening the connection. The result is a no-op since the method checks if the type handler already exists in the SqlMapper.typeHandlers dictionary.
Looks something like this:
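Something along these lines (the wrapper and registration function names are hypothetical):

// Hypothetical wrapper: register handlers (a no-op after the first call),
// then open the connection as usual
let openConnection connectString =
    async {
        registerTypeHandlers ()
        let connection = new SqlConnection(connectString)
        do! connection.OpenAsync() |> Async.AwaitTask
        return connection
    }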
PS: Am also forced to use Oracle, so would be nice to make that an option :)
Is there any way to avoid boxing when passing int parameters?
Sure. You can manually construct the query to include the data as part of the query. That way, you deal with the types without boxing to construct the query. However, this is considered bad practice because it makes SQL Injection attacks possible.
For parameterized queries, this is currently not possible when using ADO.NET (which is used by Dapper). There is an open issue to use parameters without boxing. I'll have to see what implementation they come up with before I know whether/how it will complicate the functions I posted here.
In practice, I have not found the performance to be a problem... considering that IO is typically far more costly than boxing, and there is a hard limit on the number of parameters (~2100). Are you just curious or did you have a specific use case where this was a problem?
It's generally better to avoid additional work for the GC, but in this case F# type checking forces writing box even for reference types, which is inconvenient, since the parameters have the type Parameters : (string * obj) list. I have to do this for both ints and strings:
[
    p "paramInt" (box paramInt)
    p "paramString" (box paramString)
]
It is possible to avoid boxing in F#, but there is no point because the value is going to get boxed anyway by the database client (SqlParameter only takes objects as values). Avoiding boxing in the F# code will just bloat the code to no benefit.

I see, thank you for the clarifications
Hi Kasey, thank you for the useful post.
Is there any problem with or disadvantage to defining p as:
let p name value = (name, (box value))
which allows you to just type:
[ p "paramInt" paramInt ]
Edit: I just re-read the preceding conversation and realized I answered the wrong question. Both yours and @lanayx's previously.
No, it is not a problem at all for the p helper to box the value, to avoid repeating box every time. And in fact that is exactly how I defined it in my own library. I believe it was just an oversight in the post. I will update the post.

Leaving the original response below.
I created this tiny helper due to a couple of factors, one being that the value has to be converted to obj anyway. So without the helper, the parameter list looked something like this:
Not terrible, but removing the characters , box to add p on the front seemed a favorable trade.
on the front seemed a favorable trade.That ended up being fortuitous because later, using a tuple was not quite enough. Sometimes it is necessary to declare the database type of the parameter. For example, when the value is null. (With Npgsql also when the column is
jsonb
.) So the parameter tuple became a full record type.p
can be changed to work with this, and in a way that does not break existing defined queries. And it will be drastically shorter than spelling out the full SqlParam record.Hi Kasey, thank you for the quick and helpful reply.
Moving from the F# type providers to Dapper has been a real pleasure for me. For the price of giving up compile-time type checking, I now have simple reusable data transfer objects (DTOs) and functions to convert these to and from the domain.
Best regards
Roland
Thanks for this! I'm an Elm/OCaml programmer and this really makes me want to learn F#
Nice! I'll have to look into Dapper.