ChatGPT解决这个技术问题 Extra ChatGPT

How do you perform a left outer join using linq extension methods

Assuming I have a left outer join as such:

from f in Foo
join b in Bar on f.Foo_Id equals b.Foo_Id into g
from result in g.DefaultIfEmpty()
select new { Foo = f, Bar = result }

How would I express the same task using extension methods? E.g.

Foo.GroupJoin(Bar, f => f.Foo_Id, b => b.Foo_Id, (f,b) => ???)
    .Select(???)

B
B--rian

For a (left outer) join of a table Bar with a table Foo on Foo.Foo_Id = Bar.Foo_Id in lambda notation:

var qry = Foo.GroupJoin(
          Bar, 
          foo => foo.Foo_Id,
          bar => bar.Foo_Id,
          (x,y) => new { Foo = x, Bars = y })
       .SelectMany(
           x => x.Bars.DefaultIfEmpty(),
           (x,y) => new { Foo=x.Foo, Bar=y});

This is actually not nearly as crazy as it seems. Basically GroupJoin does the left outer join, the SelectMany part is only needed depending on what you want to select.
This pattern is great because Entity Framework recognizes it as a Left Join, which I used to believe was an impossibility
@MarcGravell How would you achieve the same to select only the rows where right side columns are all null (that is the case in SQL Server Outer Join when match does not meet)?
@nam Well you'd need a where statement, x.Bar == null
@AbdulkarimKanaan yes - SelectMany flattens two layers of 1-many into 1 layer with an entry per pair
O
Ocelot20

Since this seems to be the de facto SO question for left outer joins using the method (extension) syntax, I thought I would add an alternative to the currently selected answer that (in my experience at least) has been more commonly what I'm after

// Option 1: Expecting either 0 or 1 matches from the "Right"
// table (Bars in this case):
var qry = Foos.GroupJoin(
          Bars,
          foo => foo.Foo_Id,
          bar => bar.Foo_Id,
          (f,bs) => new { Foo = f, Bar = bs.SingleOrDefault() });

// Option 2: Expecting either 0 or more matches from the "Right" table
// (courtesy of currently selected answer):
var qry = Foos.GroupJoin(
                  Bars, 
                  foo => foo.Foo_Id,
                  bar => bar.Foo_Id,
                  (f,bs) => new { Foo = f, Bars = bs })
              .SelectMany(
                  fooBars => fooBars.Bars.DefaultIfEmpty(),
                  (x,y) => new { Foo = x.Foo, Bar = y });

To display the difference using a simple data set (assuming we're joining on the values themselves):

List<int> tableA = new List<int> { 1, 2, 3 };
List<int?> tableB = new List<int?> { 3, 4, 5 };

// Result using both Option 1 and 2. Option 1 would be a better choice
// if we didn't expect multiple matches in tableB.
{ A = 1, B = null }
{ A = 2, B = null }
{ A = 3, B = 3    }

List<int> tableA = new List<int> { 1, 2, 3 };
List<int?> tableB = new List<int?> { 3, 3, 4 };

// Result using Option 1 would be that an exception gets thrown on
// SingleOrDefault(), but if we use FirstOrDefault() instead to illustrate:
{ A = 1, B = null }
{ A = 2, B = null }
{ A = 3, B = 3    } // Misleading, we had multiple matches.
                    // Which 3 should get selected (not arbitrarily the first)?.

// Result using Option 2:
{ A = 1, B = null }
{ A = 2, B = null }
{ A = 3, B = 3    }
{ A = 3, B = 3    }    

Option 2 is true to the typical left outer join definition, but as I mentioned earlier is often unnecessarily complex depending on the data set.


I think "bs.SingleOrDefault()" will not work if you have another following Join or Include. We need the "bs.FirstOrDefault()" in this cases.
True, Entity Framework and Linq to SQL both require that since they can't easily do the Single check amidst a join. SingleOrDefault however is a more "correct" way to demonstrate this IMO.
You need to remember to Order your joined table or the .FirstOrDefault() is going to get a random row from the multiple rows that might match the join criteria, whatever the database happens to find first.
@ChrisMoschini: Order and FirstOrDefault are unnecessary since the example is for a 0 or 1 match where you would want to fail on multiple records (see comment above code).
This isn't an "extra requirement" unspecified in the question, it's what a lot of people think of when they say "Left Outer Join". Also, the FirstOrDefault requirement referred to by Dherik is EF/L2SQL behavior and not L2Objects (neither of these are in the tags). SingleOrDefault is absolutely the correct method to call in this case. Of course you want to throw an exception if you encounter more records than possible for your data set instead of picking an arbitrary one and leading to a confusing undefined result.
G
Guido Mocha

Group Join method is unnecessary to achieve joining of two data sets.

Inner Join:

var qry = Foos.SelectMany
            (
                foo => Bars.Where (bar => foo.Foo_id == bar.Foo_id),
                (foo, bar) => new
                    {
                    Foo = foo,
                    Bar = bar
                    }
            );

For Left Join just add DefaultIfEmpty()

var qry = Foos.SelectMany
            (
                foo => Bars.Where (bar => foo.Foo_id == bar.Foo_id).DefaultIfEmpty(),
                (foo, bar) => new
                    {
                    Foo = foo,
                    Bar = bar
                    }
            );

EF and LINQ to SQL correctly transform to SQL. For LINQ to Objects it is beter to join using GroupJoin as it internally uses Lookup. But if you are querying DB then skipping of GroupJoin is AFAIK as performant.

Personlay for me this way is more readable compared to GroupJoin().SelectMany()


This perfomed better than a .Join for me, plus I could do my conditonal joint that I wanted (right.FooId == left.FooId || right.FooId == 0)
linq2sql translates this approach as left join. this answer is better and simpler. +1
Warning! Changing my query from GroupJoin to this approach resulted in a CROSS OUTER APPLY instead of a LEFT OUTER JOIN. That can result in very different performance based on your query. (Using EF Core 5)
h
hajirazin

You can create extension method like:

public static IEnumerable<TResult> LeftOuterJoin<TSource, TInner, TKey, TResult>(this IEnumerable<TSource> source, IEnumerable<TInner> other, Func<TSource, TKey> func, Func<TInner, TKey> innerkey, Func<TSource, TInner, TResult> res)
    {
        return from f in source
               join b in other on func.Invoke(f) equals innerkey.Invoke(b) into g
               from result in g.DefaultIfEmpty()
               select res.Invoke(f, result);
    }

This looks like it would work (for my requirement). Can you provide an example? I am new to LINQ Extensions and am having a hard time wrapping my head around this Left Join situation I am in...
@Skychan May be I need to look at it, it's old answer and was working at that time. Which Framework are you using? I mean .NET version?
This works for Linq to Objects but not when querying a database as you need to operate on an IQuerable and use Expressions of Funcs instead
C
Chris Moschini

Improving on Ocelot20's answer, if you have a table you're left outer joining with where you just want 0 or 1 rows out of it, but it could have multiple, you need to Order your joined table:

var qry = Foos.GroupJoin(
      Bars.OrderByDescending(b => b.Id),
      foo => foo.Foo_Id,
      bar => bar.Foo_Id,
      (f, bs) => new { Foo = f, Bar = bs.FirstOrDefault() });

Otherwise which row you get in the join is going to be random (or more specifically, whichever the db happens to find first).


That's it! Any unguaranteed one to one relation.
B
Bob Vale

Whilst the accepted answer works and is good for Linq to Objects it bugged me that the SQL query isn't just a straight Left Outer Join.

The following code relies on the LinqKit Project that allows you to pass expressions and invoke them to your query.

static IQueryable<TResult> LeftOuterJoin<TSource,TInner, TKey, TResult>(
     this IQueryable<TSource> source, 
     IQueryable<TInner> inner, 
     Expression<Func<TSource,TKey>> sourceKey, 
     Expression<Func<TInner,TKey>> innerKey, 
     Expression<Func<TSource, TInner, TResult>> result
    ) {
    return from a in source.AsExpandable()
            join b in inner on sourceKey.Invoke(a) equals innerKey.Invoke(b) into c
            from d in c.DefaultIfEmpty()
            select result.Invoke(a,d);
}

It can be used as follows

Table1.LeftOuterJoin(Table2, x => x.Key1, x => x.Key2, (x,y) => new { x,y});

H
Harley Waldstein

Turning Marc Gravell's answer into an extension method, I made the following.

internal static IEnumerable<Tuple<TLeft, TRight>> LeftJoin<TLeft, TRight, TKey>(
    this IEnumerable<TLeft> left,
    IEnumerable<TRight> right,
    Func<TLeft, TKey> selectKeyLeft,
    Func<TRight, TKey> selectKeyRight,
    TRight defaultRight = default(TRight),
    IEqualityComparer<TKey> cmp = null)
{
    return left.GroupJoin(
            right,
            selectKeyLeft,
            selectKeyRight,
            (x, y) => new Tuple<TLeft, IEnumerable<TRight>>(x, y),
            cmp ?? EqualityComparer<TKey>.Default)
        .SelectMany(
            x => x.Item2.DefaultIfEmpty(defaultRight),
            (x, y) => new Tuple<TLeft, TRight>(x.Item1, y));
}

D
Dejan

Marc Gravell's answer turn into an extension method that support the IQueryable<T> interface is given in this answer and with added support for C# 8.0 NRT reads as follows:

#nullable enable
using LinqKit;
using LinqKit.Core;
using System.Linq.Expressions;

...

/// <summary>
/// Left join queryable. Linq to SQL compatible. IMPORTANT: any Includes must be put on the source collections before calling this method.
/// </summary>
public static IQueryable<TResult> LeftJoin<TOuter, TInner, TKey, TResult>(
    this IQueryable<TOuter> outer,
    IQueryable<TInner> inner,
    Expression<Func<TOuter, TKey>> outerKeySelector,
    Expression<Func<TInner, TKey>> innerKeySelector,
    Expression<Func<TOuter, TInner?, TResult>> resultSelector)
{
    return outer
        .AsExpandable()
        .GroupJoin(
            inner,
            outerKeySelector,
            innerKeySelector,
            (outerItem, innerItems) => new { outerItem, innerItems })
        .SelectMany(
            joinResult => joinResult.innerItems.DefaultIfEmpty(),
            (joinResult, innerItem) =>
                resultSelector.Invoke(joinResult.outerItem, innerItem));
}