ChatGPT解决这个技术问题 Extra ChatGPT

Is having an 'OR' in an INNER JOIN condition a bad idea?

In trying to improve the speed of an immensely slow query (several minutes on two tables with only ~50,000 rows each, on SQL Server 2008 if it matters), I narrowed down the problem to an OR in my inner join, as in:

SELECT mt.ID, mt.ParentID, ot.MasterID
  FROM dbo.MainTable AS mt
  INNER JOIN dbo.OtherTable AS ot ON ot.ParentID = mt.ID
                                  OR ot.ID = mt.ParentID

I changed this to (what I hope is) an equivalent pair of left joins, shown here:

SELECT mt.ID, mt.ParentID,
   CASE WHEN ot1.MasterID IS NOT NULL THEN
      ot1.MasterID ELSE
      ot2.MasterID END AS MasterID
  FROM dbo.MainTable AS mt
  LEFT JOIN dbo.OtherTable AS ot1 ON ot1.ParentID = mt.ID
  LEFT JOIN dbo.OtherTable AS ot2 ON ot2.ID = mt.ParentID
  WHERE ot1.MasterID IS NOT NULL OR ot2.MasterID IS NOT NULL

.. and the query now runs in about a second!

Is it generally a bad idea to put an OR in a join condition? Or am I just unlucky somehow in the layout of my tables?

Show us the execution plan instead of your query.
seems like an odd relationship
@Blindy: good idea. It turns out the execution plans show just what Quassnoi mentions below: the first query results in nested loops, while the second is done with a hash join.

V
Vaibhav

This kind of JOIN is not optimizable to a HASH JOIN or a MERGE JOIN.

It can be expressed as a concatenation of two resultsets:

SELECT  *
FROM    maintable m
JOIN    othertable o
ON      o.parentId = m.id
UNION
SELECT  *
FROM    maintable m
JOIN    othertable o
ON      o.id = m.parentId

, each of them being an equijoin, however, SQL Server's optimizer is not smart enough to see it in the query you wrote (though they are logically equivalent).


this makes sense, thank you. I'm still not sure if there is something peculiar about my query, or if I should just avoid joins of the ON w=x OR y=z pattern entirely?
@ladenedge: these joins will be performed using a table scan in a nested loop. This is slow if your tables are large.
just to be clear, when you say "these joins," you mean all joins of the form ON w=x OR y=z? (Thanks for your patience!)
@ladenedge: there may be additional conditions which could help SQL Server understand that a concatenation would be needed. Say, the query SELECT * FROM othertable WHERE parentId = 1 OR id = 2 will use a concatenation if both fields are indexed so theoretically there is nothing that would prevent doing the same thing in a loop. Whether SQL Server will build this plan actually or not, depends on very many factors, but I've never seen it built in real life.
Also note that if you know they are disjoint sets, union ALL can improve performance significantly by avoiding merging the result sets.
M
MEO

I use following code for get different result from condition That worked for me.

Select A.column, B.column
FROM TABLE1 A
INNER JOIN
TABLE2 B
ON A.Id = (case when (your condition) then b.Id else (something) END)

s
sanampakuwal

You can use UNION ALL instead

SELECT mt.ID, mt.ParentID, ot.MasterID
FROM dbo.MainTable AS mt
UNION ALL
SELECT mt.ID, mt.ParentID, ot.MasterID
FROM dbo.OtherTable AS ot

UNION ALL will give you duplicates compared to the JOIN with an OR condition.
For that UNION will be right. For more details read the following link union-instead-of-or
yes but in your example you written it with union all which is not correct as the article you link to also describes.