ChatGPT解决这个技术问题 Extra ChatGPT

How can I do a FULL OUTER JOIN in MySQL?

I want to do a full outer join in MySQL. Is this possible? Is a full outer join supported by MySQL?

possible duplicate of MySQL Full Outer Join Syntax Error
This question have better answers
Beware of the answers here. The SQL standard says full join on is inner join on rows union all unmatched left table rows extended by nulls union all right table rows extended by nulls. Most answers here are wrong (see the comments) & the ones that aren't wrong don't handle the general case. Even though there are many (unjustified) upvotes. (See my answer.)
@JairoLozano Constraints are not needed to query. Although when constraints hold extra queries return the desired answer that otherwise wouldn't. Constraints don't affect what full join on returns for given arguments. The problem you describe is that the query you wrote is the wrong query. (Presumably the common error where people want some joins, each possibly involving a different key, of some subqueries, each possibly involving join and/or aggregation, but they erroneously try to do all the joining then all the aggregating or to aggregate over previous aggregations.)
all the answers doing UNION instead of UNION ALL are incorrect. all answers with subqueries or 3 unioned selects are inefficient. correct answers will do a union all of a left join with a select from the second table with a where not exists on the first table (or the equivalent outer join + where =NULL condition)

P
Peter Mortensen

You don't have full joins in MySQL, but you can sure emulate them.

For a code sample transcribed from this Stack Overflow question you have:

With two tables t1, t2:

SELECT * FROM t1
LEFT JOIN t2 ON t1.id = t2.id
UNION
SELECT * FROM t1
RIGHT JOIN t2 ON t1.id = t2.id

The query above works for special cases where a full outer join operation would not produce any duplicate rows. The query above depends on the UNION set operator to remove duplicate rows introduced by the query pattern. We can avoid introducing duplicate rows by using an anti-join pattern for the second query, and then use a UNION ALL set operator to combine the two sets. In the more general case, where a full outer join would return duplicate rows, we can do this:

SELECT * FROM t1
LEFT JOIN t2 ON t1.id = t2.id
UNION ALL
SELECT * FROM t1
RIGHT JOIN t2 ON t1.id = t2.id
WHERE t1.id IS NULL

Actually the thing you wrote is not correct. Because when you do a UNION you will remove duplicates, and sometimes when you join two different tables there should be duplicates.
This is the correct example: (SELECT ... FROM tbl1 LEFT JOIN tbl2 ...) UNION ALL (SELECT ... FROM tbl1 RIGHT JOIN tbl2 ... WHERE tbl1.col IS NULL)
So the difference is that I am doing a left inclusive join and then right exclusive using UNION ALL
and I see now that you say that yourself, sorry. Perhaps you could update your answer, given there is this case that it gets wrong and that the UNION ALL is always going to be more efficient?
@ypercube: If there no duplicate rows in t1 and t2, the query in this answer does return a resultset that emulates FULL OUTER JOIN. But in the more general case, for example, the SELECT list doesn't contain sufficient columns/expressions to make the returned rows unique, then this query pattern is insufficient to reproduce the set that would be produced by a FULL OUTER JOIN. To get a more faithful emulation, we'd need a UNION ALL set operator, and one of the queries would need an anti-join pattern. The comment from Pavle Lekic (above) gives the correct query pattern.
C
Community

The answer that Pablo Santa Cruz gave is correct; however, in case anybody stumbled on this page and wants more clarification, here is a detailed breakdown.

Example Tables

Suppose we have the following tables:

-- t1
id  name
1   Tim
2   Marta

-- t2
id  name
1   Tim
3   Katarina

Inner Joins

An inner join, like this:

SELECT *
FROM `t1`
INNER JOIN `t2` ON `t1`.`id` = `t2`.`id`;

Would get us only records that appear in both tables, like this:

1 Tim  1 Tim

Inner joins don't have a direction (like left or right) because they are explicitly bidirectional - we require a match on both sides.

Outer Joins

Outer joins, on the other hand, are for finding records that may not have a match in the other table. As such, you have to specify which side of the join is allowed to have a missing record.

LEFT JOIN and RIGHT JOIN are shorthand for LEFT OUTER JOIN and RIGHT OUTER JOIN; I will use their full names below to reinforce the concept of outer joins vs inner joins.

Left Outer Join

A left outer join, like this:

SELECT *
FROM `t1`
LEFT OUTER JOIN `t2` ON `t1`.`id` = `t2`.`id`;

...would get us all the records from the left table regardless of whether or not they have a match in the right table, like this:

1 Tim   1    Tim
2 Marta NULL NULL

Right Outer Join

A right outer join, like this:

SELECT *
FROM `t1`
RIGHT OUTER JOIN `t2` ON `t1`.`id` = `t2`.`id`;

...would get us all the records from the right table regardless of whether or not they have a match in the left table, like this:

1    Tim   1  Tim
NULL NULL  3  Katarina

Full Outer Join

A full outer join would give us all records from both tables, whether or not they have a match in the other table, with NULLs on both sides where there is no match. The result would look like this:

1    Tim   1    Tim
2    Marta NULL NULL
NULL NULL  3    Katarina

However, as Pablo Santa Cruz pointed out, MySQL doesn't support this. We can emulate it by doing a UNION of a left join and a right join, like this:

SELECT *
FROM `t1`
LEFT OUTER JOIN `t2` ON `t1`.`id` = `t2`.`id`

UNION

SELECT *
FROM `t1`
RIGHT OUTER JOIN `t2` ON `t1`.`id` = `t2`.`id`;

You can think of a UNION as meaning "run both of these queries, then stack the results on top of each other"; some of the rows will come from the first query and some from the second.

It should be noted that a UNION in MySQL will eliminate exact duplicates: Tim would appear in both of the queries here, but the result of the UNION only lists him once. My database guru colleague feels that this behavior should not be relied upon. So to be more explicit about it, we could add a WHERE clause to the second query:

SELECT *
FROM `t1`
LEFT OUTER JOIN `t2` ON `t1`.`id` = `t2`.`id`

UNION

SELECT *
FROM `t1`
RIGHT OUTER JOIN `t2` ON `t1`.`id` = `t2`.`id`
WHERE `t1`.`id` IS NULL;

On the other hand, if you wanted to see duplicates for some reason, you could use UNION ALL.


For MySQL you really want to avoid using UNION instead of UNION ALL if there is no overlap (see Pavle's comment above). If you could add some more info to that effect in your answer here, I think it'd be the preferred answer for this question as it's more thorough.
The recommendation from the "database guru colleague" is correct. In terms of the relational model (all the theoretical work done by Ted Codd and Chris Date), a query of the last form emulates a FULL OUTER JOIN, because it combines two distinct sets, The second query doesn't introduce "duplicates" (rows already returned by the first query) that would not be produced by a FULL OUTER JOIN. There's nothing wrong with doing queries that way, and using UNION to remove those duplicates. But to really replicate a FULL OUTER JOIN, we need one of the queries to be an anti-join.
@IstiaqueAhmed: the goal is to emulate a FULL OUTER JOIN operation. We need that condition in the second query so it returns only rows that don't have a match (an anti-join pattern.). Without that condition, the query is an outer join... it returns rows that match as well as those without a match. And the rows that match were already returned by the first query. If the second query returns those same rows (again), we've duplicated rows and our result will not be equivalent to a FULL OUTER JOIN.
@IstiaqueAhmed: It is true that a UNION operation will remove those duplicates; but it also removes ALL duplicate rows, including duplicate rows that would be in the returned by a FULL OUTER JOIN. To emulate a FULL JOIN b, the correct pattern is (a LEFT JOIN b) UNION ALL (b ANTI JOIN a).
Very concise answer with a great explanation. Thanks for this.
P
Peter Mortensen

Using a union query will remove duplicates, and this is different than the behavior of full outer join that never removes any duplicates:

[Table: t1]        [Table: t2]
value              value
-----------        -------
1                  1
2                  2
4                  2
4                  5

This is the expected result of a full outer join:

value | value
------+-------
1     | 1
2     | 2
2     | 2
Null  | 5
4     | Null
4     | Null

This is the result of using left and right join with union:

value | value
------+-------
Null  | 5
1     | 1
2     | 2
4     | Null

SQL Fiddle

My suggested query is:

select
    t1.value, t2.value
from t1
left outer join t2
  on t1.value = t2.value
union all      -- Using `union all` instead of `union`
select
    t1.value, t2.value
from t2
left outer join t1
  on t1.value = t2.value
where
    t1.value IS NULL

The result of the above query that is as the same as the expected result:

value | value
------+-------
1     | 1
2     | 2
2     | 2
4     | NULL
4     | NULL
NULL  | 5

SQL Fiddle

@Steve Chambers: [From comments, with many thanks!]

Note: This may be the best solution, both for efficiency and for generating the same results as a FULL OUTER JOIN. This blog post also explains it well - to quote from Method 2: "This handles duplicate rows correctly and doesn’t include anything it shouldn’t. It’s necessary to use UNION ALL instead of plain UNION, which would eliminate the duplicates I want to keep. This may be significantly more efficient on large result sets, since there’s no need to sort and remove duplicates."

I decided to add another solution that comes from full outer join visualization and math. It is not better than the above, but it is more readable:

Full outer join means (t1 ∪ t2): all in t1 or in t2 (t1 ∪ t2) = (t1 ∩ t2) + t1_only + t2_only: all in both t1 and t2 plus all in t1 that aren't in t2 and plus all in t2 that aren't in t1:

-- (t1 ∩ t2): all in both t1 and t2
select t1.value, t2.value
from t1 join t2 on t1.value = t2.value
union all  -- And plus
-- all in t1 that not exists in t2
select t1.value, null
from t1
where not exists( select 1 from t2 where t2.value = t1.value)
union all  -- and plus
-- all in t2 that not exists in t1
select null, t2.value
from t2
where not exists( select 1 from t1 where t2.value = t1.value)

SQL Fiddle


We are doing same task tow times , If there are sub query for t1 and t2 then mysql have to do same task more times, is not it ? Can we remove this using alias in this situation ?:
I suggest you to use some temporary tables ;).
This method seems to be the best solution, both for efficiency and for generating the same results as a FULL OUTER JOIN. This blog post also explains it well - to quote from Method 2: "This handles duplicate rows correctly and doesn’t include anything it shouldn’t. It’s necessary to use UNION ALL instead of plain UNION, which would eliminate the duplicates I want to keep. This may be significantly more efficient on large result sets, since there’s no need to sort and remove duplicates."
@SteveChambers it's too late, but thanks for your comment. I added your comment to then answer to highlighted more, If you are not agree please roll it back ;).
No problem @shA.t - IMO this should really have more upvotes and/or be the accepted answer.
P
Peter Mortensen

MySQL does not have FULL-OUTER-JOIN syntax. You have to emulate it by doing both LEFT JOIN and RIGHT JOIN as follows:

SELECT * FROM t1
LEFT JOIN t2 ON t1.id = t2.id
UNION
SELECT * FROM t1
RIGHT JOIN t2 ON t1.id = t2.id

But MySQL also does not have a RIGHT JOIN syntax. According to MySQL's outer join simplification, the right join is converted to the equivalent left join by switching the t1 and t2 in the FROM and ON clause in the query. Thus, the MySQL query optimizer translates the original query into the following -

SELECT * FROM t1
LEFT JOIN t2 ON t1.id = t2.id
UNION
SELECT * FROM t2
LEFT JOIN t1 ON t2.id = t1.id

Now, there is no harm in writing the original query as is, but say if you have predicates like the WHERE clause, which is a before-join predicate or an AND predicate on the ON clause, which is a during-join predicate, then you might want to take a look at the devil; which is in details.

The MySQL query optimizer routinely checks the predicates if they are null-rejected.

https://i.stack.imgur.com/NioBX.png

Now, if you have done the RIGHT JOIN, but with WHERE predicate on the column from t1, then you might be at a risk of running into a null-rejected scenario.

For example, the query

SELECT * FROM t1
LEFT JOIN t2 ON t1.id = t2.id
WHERE t1.col1 = 'someValue'
UNION
SELECT * FROM t1
RIGHT JOIN t2 ON t1.id = t2.id
WHERE t1.col1 = 'someValue'

gets translated to the following by the query optimizer:

SELECT * FROM t1
LEFT JOIN t2 ON t1.id = t2.id
WHERE t1.col1 = 'someValue'
UNION
SELECT * FROM t2
LEFT JOIN t1 ON t2.id = t1.id
WHERE t1.col1 = 'someValue'

So the order of tables has changed, but the predicate is still applied to t1, but t1 is now in the 'ON' clause. If t1.col1 is defined as NOT NULL column, then this query will be null-rejected.

Any outer-join (left, right, full) that is null-rejected is converted to an inner-join by MySQL.

Thus the results you might be expecting might be completely different from what the MySQL is returning. You might think its a bug with MySQL's RIGHT JOIN, but that’s not right. Its just how the MySQL query optimizer works. So the developer in charge has to pay attention to these nuances when he/she is constructing the query.


P
Peter Mortensen

None of the previous answers are actually correct, because they do not follow the semantics when there are duplicated values.

For a query such as (from this duplicate):

SELECT * FROM t1 FULL OUTER JOIN t2 ON t1.Name = t2.Name;

The correct equivalent is:

SELECT t1.*, t2.*
FROM (SELECT name FROM t1 UNION  -- This is intentionally UNION to remove duplicates
      SELECT name FROM t2
     ) n LEFT JOIN
     t1
     ON t1.name = n.name LEFT JOIN
     t2
     ON t2.name = n.name;

If you need this to work with NULL values (which may also be necessary), then use the NULL-safe comparison operator, <=> rather than =.


this is often a good solution, but it might give different results than a FULL OUTER JOIN whenever the name column is null. The union all query with anti-join pattern should reproduce the outer join behavior correctly, but which solution is more appropriate depends on the context and on the constraints that are active on the tables.
@fthiella . . . That is a good point. I adjusted the answer.
okay, but the null-safe comparison operator will make the join succeed, which is different than the full outer join behavior in case you have null names both in t1 and in t2
@fthiella . . . I'll have to think about the best way to do this. But given how wrong the accepted answer is, almost anything is closer to the right answer. (That answer is just wrong if there are multiple keys on either side.)
yes the accepted answer is wrong, as a general solution I think it's correct to use union all, but that answer misses an anti-join pattern in either the first or the second query that will keep existing duplicates but prevents from adding new ones. Depending on the context other solutions (like this one) might be more appropriate.
s
shA.t

In SQLite you should do this:

SELECT * 
FROM leftTable lt 
LEFT JOIN rightTable rt ON lt.id = rt.lrid 
UNION
SELECT lt.*, rl.*  -- To match column set
FROM rightTable rt 
LEFT JOIN  leftTable lt ON lt.id = rt.lrid

Can we use it ? like as: SELECT * FROM leftTable lt LEFT JOIN rightTable rt ON lt.id = rt.lrid UNION SELECT lt.*, rl.* -- To match column set FROM leftTable lt RIGHT JOIN rightTable rt ON lt.id = rt.lrid;
yes but SQLite doesn't support right joins but yes in MYSQL yes
l
lolo

You can do the following:

(SELECT 
    *
FROM
    table1 t1
        LEFT JOIN
    table2 t2 ON t1.id = t2.id
WHERE
    t2.id IS NULL)
UNION ALL
 (SELECT 
    *
FROM
    table1 t1
        RIGHT JOIN
    table2 t2 ON t1.id = t2.id
WHERE
    t1.id IS NULL);

An explanation would be in order. Please respond by editing (changing) your answer, not here in comments (without "Edit:", "Update:", or similar - the answer should appear as if it was written today).
P
Peter Mortensen

I modified shA.t's query for more clarity:

-- t1 left join t2
SELECT t1.value, t2.value
FROM t1 LEFT JOIN t2 ON t1.value = t2.value   

    UNION ALL -- include duplicates

-- t1 right exclude join t2 (records found only in t2)
SELECT t1.value, t2.value
FROM t1 RIGHT JOIN t2 ON t1.value = t2.value
WHERE t1.value IS NULL 

A
Alex Pliutau
SELECT
    a.name,
    b.title
FROM
    author AS a
LEFT JOIN
    book AS b
    ON a.id = b.author_id
UNION
SELECT
    a.name,
    b.title
FROM
    author AS a
RIGHT JOIN
    book AS b
    ON a.id = b.author_id

y
ysth

You can just convert a full outer join, e.g.

SELECT fields
FROM firsttable
FULL OUTER JOIN secondtable ON joincondition

into:

SELECT fields
FROM firsttable
LEFT JOIN secondtable ON joincondition
UNION ALL
SELECT fields (replacing any fields from firsttable with NULL)
FROM secondtable
WHERE NOT EXISTS (SELECT 1 FROM firsttable WHERE joincondition)

Or if you have at least one column, say foo, in firsttable that is NOT NULL, you can do:

SELECT fields
FROM firsttable
LEFT JOIN secondtable ON joincondition
UNION ALL
SELECT fields
FROM firsttable
RIGHT JOIN secondtable ON joincondition
WHERE firsttable.foo IS NULL

P
Peter Mortensen

I fix the response, and works include all rows (based on the response of Pavle Lekic):

    (
    SELECT a.* FROM tablea a
    LEFT JOIN tableb b ON a.`key` = b.key
    WHERE b.`key` is null
    )
    UNION ALL
    (
    SELECT a.* FROM tablea a
    LEFT JOIN tableb b ON a.`key` = b.key
    where  a.`key` = b.`key`
    )
    UNION ALL
    (
    SELECT b.* FROM tablea a
    right JOIN tableb b ON b.`key` = a.key
    WHERE a.`key` is null
    );

No, this is a type of "outer-only" join, that will only return the rows from tablea that don't have a match in tableb and vice versa. The you try to UNION ALL, which would only work if these two tables have equivalently ordered columns, which isn't guaranteed.
it works, I create on temp database tablea(1,2,3,4,5,6) and tableb(4,5,6,7,8,9) its rows have 3 cols "id", "number" and "name_number" as text, and works in result only have (1,2,3,7,8,9)
That's not an outer join. An outer join also includes the matching members.
that new sentence have all results 1,2, ..., 9
That is close to incomprehensible ("I fix the response, and works include all rows"). Can you fix it? (But without "Edit:", "Update:", or similar - the answer should appear as if it was written today.)
P
Peter Mortensen

Use:

SELECT * FROM t1 FULL OUTER JOIN t2 ON t1.id = t2.id;

It can be recreated as follows:

 SELECT t1.*, t2.*
 FROM (SELECT * FROM t1 UNION SELECT name FROM t2) tmp
 LEFT JOIN t1 ON t1.id = tmp.id
 LEFT JOIN t2 ON t2.id = tmp.id;

Using a UNION or UNION ALL answer does not cover the edge case where the base tables have duplicated entries.

Explanation:

There is an edge case that a UNION or UNION ALL cannot cover. We cannot test this on MySQL as it doesn't support full outer joins, but we can illustrate this on a database that does support it:

 WITH cte_t1 AS
 (
     SELECT 1 AS id1
     UNION ALL SELECT 2
     UNION ALL SELECT 5
     UNION ALL SELECT 6
     UNION ALL SELECT 6
 ),
cte_t2 AS
(
     SELECT 3 AS id2
     UNION ALL SELECT 4
     UNION ALL SELECT 5
     UNION ALL SELECT 6
     UNION ALL SELECT 6
)
SELECT  *  FROM  cte_t1 t1 FULL OUTER JOIN cte_t2 t2 ON t1.id1 = t2.id2;

This gives us this answer:

id1  id2
1  NULL
2  NULL
NULL  3
NULL  4
5  5
6  6
6  6
6  6
6  6

The UNION solution:

SELECT  * FROM  cte_t1 t1 LEFT OUTER JOIN cte_t2 t2 ON t1.id1 = t2.id2
UNION    
SELECT  * FROM cte_t1 t1 RIGHT OUTER JOIN cte_t2 t2 ON t1.id1 = t2.id2

Gives an incorrect answer:

 id1  id2
NULL  3
NULL  4
1  NULL
2  NULL
5  5
6  6

The UNION ALL solution:

SELECT  * FROM cte_t1 t1 LEFT OUTER join cte_t2 t2 ON t1.id1 = t2.id2
UNION ALL
SELECT  * FROM  cte_t1 t1 RIGHT OUTER JOIN cte_t2 t2 ON t1.id1 = t2.id2

Is also incorrect.

id1  id2
1  NULL
2  NULL
5  5
6  6
6  6
6  6
6  6
NULL  3
NULL  4
5  5
6  6
6  6
6  6
6  6

Whereas this query:

SELECT t1.*, t2.*
FROM (SELECT * FROM t1 UNION SELECT name FROM t2) tmp
LEFT JOIN t1 ON t1.id = tmp.id
LEFT JOIN t2 ON t2.id = tmp.id;

Gives the following:

id1  id2
1  NULL
2  NULL
NULL  3
NULL  4
5  5
6  6
6  6
6  6
6  6

The order is different, but otherwise matches the correct answer.


That's cute, but misrepresents the UNION ALL solution. Also, it presents a solution using UNION which would be slower on large source tables because of the required de-duplication. Finally, it wouldn't compile, because the field id doesn't exist in the subquery tmp.
I never made a claim about speed, and neither did the OP mention anything about speed. Assuming the the UNION ALL (you don't rely specify which one) and this both give the correct answer, if we wanted to make the assertion that one is faster, we would need to provide benchmarks, and that would be digressing from the OP question.
As to the observation about the id not being in the sub-query, I corrected the typo - thank you for pointing it out. Your misrepresentations claim is vague - if maybe you could provide more information, I can address that. On your final observation about cuteness, I don't have any comment, I would rather focus on the logic of the sql.
Misrepresents: "The UNION ALL solution: ... Is also incorrect." The code you present leaves out the intersection-exclusion from the right-join (where t1.id1 is null) that must be provided in the UNION ALL. Which is to say, your solution trumps all the others, only when one of those other solutions is incorrectly implemented. On "cuteness," point taken. That was gratuitous, my apologies.
P
Peter Mortensen

Use a cross join solution:

SELECT t1.*, t2.*
FROM table1 t1
INNER JOIN table2 t2 
ON 1=1;

No, this is a cross join. It will match every row in t1 to every row in t2, yielding the set of all possible combinations, with select (select count(*) from t1) * (select count(*) from t2)) rows in the result set.
While this code may answer the question, providing additional context regarding how and why it solves the problem would improve the answer's long-term value.
Which addition may be helpful? maybe en example?
a
alamelu

It is also possible, but you have to mention the same field names in select.

SELECT t1.name, t2.name FROM t1
LEFT JOIN t2 ON t1.id = t2.id
UNION
SELECT t1.name, t2.name FROM t2
LEFT JOIN t1 ON t1.id = t2.id

This is just duplicating the results from a left join.
This will not give an equivalent result as full outer join.
p
philipxy

The SQL standard says full join on is inner join on rows union all unmatched left table rows extended by nulls union all right table rows extended by nulls. Ie inner join on rows union all rows in left join on but not inner join on union all rows in right join on but not inner join on.

Ie left join on rows union all right join on rows not in inner join on. Or if you know your inner join on result can't have null in a particular right table column then "right join on rows not in inner join on" are rows in right join on with the on condition extended by and that column is null.

Ie similarly right join on union all appropriate left join on rows.

From What is the difference between “INNER JOIN” and “OUTER JOIN”?:

(SQL Standard 2006 SQL/Foundation 7.7 Syntax Rules 1, General Rules 1 b, 3 c & d, 5 b.)


This seems to be partly incomprehensible (e.g., "full join on is inner join on rows union all unmatched left table rows extended by nulls union all right table rows extended by nulls."). Can you rephrase?
@PeterMortensen I will edit the post soon. FULL JOIN ON returns UNION ALL UNION ALL . Of course the phrasing should be made clear for anyone. But if you know what FULL JOIN ON returns I find it odd you can't parse it as it is.
@PeterMortensen Your commas changed the meaning so I rolled them back. But I will be editing to clarify.