I was working on a project with User
and Costumer
identificação. The first is generated by the web app, and the second by the business department.
So I had to develop a logic to make those match between identifications.
The User
entity needs to know the Costumer
identification and, in order to make the match, first I selected the users without a costumer and extracted a list of emails. Then, I selected the customers based on this email list. This statement can be transcribed to SQL syntax, like this:
SELECT *
FROM Customers
WHERE email IN (SELECT email FROM Users WHERE CustomerId IS NULL)
But how to translate the SQL to C# LINQ API?
- First, make the selection from Users, extracting the list of emails
- Select the Customers with those emails
This way:
var emails = _userManager.Users
.Where(user => user.CustomerId == null)
.Select(user => user.Email) // extract the emails from users
.ToList();
var customers = _applicationRepository.GetCustomers()
.Where(customer => emails.Contains(customer.Email)) // the Contains method carry the IN logic when translated to SQL script
.ToList();
Now we have the match programmatically.
That's it!
Top comments (9)
Thank Alex,
the translation looked easy to understand as the LINQ version matches with SQL statement 👍.
And I have a question regarding the conversion.
Would the LINQ version fetch all data from database first and filter (using
Where
clause) in memory?or
would the query create an optimal syntax to send to the database and fetch only filtered records?
Calling
ToList()
results in execution of the query. So the first statement will make an in memory list of email addresses.To get a better feel for what's happening you can intercept the actual queries. Assuming Entity framework is used: Log Queries executed by Entity Framework DbContext.
I stopped writing C# right as LINQ got popular so I don't know it too well but I believe this does store the list of emails in memory.
In SQL you could do this:
and achieve the same result without a subquery. I don't know how you'd go about translating that to LINQ though.
In my opinion, this is a better approach, which doesn't require an additional call to the database and doesn't require to store entities in the memory (which could have terrible results in case of a huge collection).
And of course, it is available in LINQ.
It's true! But the tables are in different contexts, which means they are in different databases. So, it's not possible to make a
JOIN
between them.My condolences :)
Here is the EF SQL of "Users":
And here is the EF SQL of "Customers":
As we can see, the
IN
logic really occursin memory
.Thanks for the clarification Alex 😎
I prefer to use stored procedures and table variables for this kind of query when it gets complex or is working against a large amount of data. I prefer to let the database do the job its been optimized for, parsing and sorting data, and use C# and LINQ for stuff SQL isn't optimized to do well, like string manipulation, or to work with smaller amounts of data, like lookup tables.
Of course, using techniques like in this article are good when there's a "turf war" between DBA's and application developers or when the application team has limited SQL skills and a DBA isn't available to write optimal procs.