C# – LINQ – How to Create a Left Outer Join

Here’s another question that I see come up pretty regularly:

How can I create a left outer join using LINQ?

I can understand how this could be considered tricky.

Here’s the code:

class User
{
    int userID;
    string name;
}

class Purchase
{
    int userID;
    double cost;
}

// Assume users has type List<User>
// Assume purchases has type List<Purchase>

var result = users.GroupJoin(purchases,
        u => u.userID,
        p => p.purchaseID,
        (u, p) => 
            new { User = u, Purchases = p.DefaultIfEmpty() })
    .SelectMany(a => a.Purchases.
        Select(b => new { User = a.User, Purchase = b });

Ok, so let’s take a quick walk through the code.

The very first thing we do is create a class to store a user object and one to store a purchase. The common field between the two is userID.

Imagine we have a full list of Users and Purchases. Now let’s start querying.

users.GroupJoin(purchases,
        u => u.userID,
        p => p.purchaseID,
        (u, p) => 
            new { User = u, Purchases = p.DefaultIfEmpty() })

The GroupJoin here is going to join the users collection with the purchases collection, which is specified as the first parameter to the GroupJoin. The second and third parameters are lambda expressions which specify which fields in the list we are joining on. The last parameter specifies what we are doing with each collection. In this case we are creating a new anonymous type. This type has a User field with the user, and a Purchases field which contains a list of the purchases containing that user’s id. We are also using DefaultIfEmpty. This means that if that user has no purchases then Purchases is going to be null.

We’re practically at outer left join already.

.SelectMany(a => a.Purchases.
    Select(b => new { User = a.User, Purchase = b });

Now we perform a SelectMany. This is being called on our collection of our anonymous type that matches Users to their Purchases or null. We want to select based on the Purchases field. This will ensure we create one record in our new collection for every Purchase in the input collection. We are using them to create a new anonymous type that maps a User to a Purchase — not a collection of purchases.

Our result out of this step of the query is a collection of an anonymous type containing a User and a single Purchase. There is a record in this collection for every single Purchase in purchases, and also a record for each user with no purchases, where Purchase is equal to null.

That is a left outer join in LINQ.