ChatGPT解决这个技术问题 Extra ChatGPT

Explicit vs implicit SQL joins

Is there any efficiency difference in an explicit vs implicit inner join? For example:

SELECT * FROM
table a INNER JOIN table b
ON a.id = b.id;

vs.

SELECT a.*, b.*
FROM table a, table b
WHERE a.id = b.id;
Good question. I'm curious why the explicit join is used at all. Is it not possible to do all queries without it?
use EXPLAIN keyword to know the difference about both the queries.. use JOIN and see the difference.. If you try in a table more than 100k records you can see the difference...
@andrew My question was actually whether implicit join was a form of "hack" (as in "A query involving more than one table, not using a join? That's a hack isn't it?")
They are different, implicit joining will surprise you every once in a while when dealing with null values; use explicit joining and avoid bugs that arise when "nothing changed!"
There is no difference. , is CROSS JOIN with looser binding & INNER JOIN is CROSS JOIN with ON like WHERE but tighter binding. What matters to execution is how the DBMS optimizes queries.

S
Stevoisiak

Performance wise, they are exactly the same (at least in SQL Server).

PS: Be aware that the IMPLICIT OUTER JOIN syntax is deprecated since SQL Server 2005. (The IMPLICIT INNER JOIN syntax as used in the question is still supported)

Deprecation of "Old Style" JOIN Syntax: Only A Partial Thing


@lomaxx, just for clarity's sake, could you specify which syntax of the 2 in the question is deprecated?
Can you provide supporting documentation? This sounds wrong on multiple levels.
How do you deprecate the SQL standard?
@david Crenshaw, the implicit join is no longer in the standard and hasn't been for 18 years.
So-called "implicit joins" of the 'inner' or 'cross' variety remain in the Standard. SQL Server is deprecating the "old-style" outer join syntax (i.e. *= and =*) which has never been Standard.
g
grom

Personally I prefer the join syntax as its makes it clearer that the tables are joined and how they are joined. Try compare larger SQL queries where you selecting from 8 different tables and you have lots of filtering in the where. By using join syntax you separate out the parts where the tables are joined, to the part where you are filtering the rows.


I completely agree, but this is kind of off-topic. OP asked about efficiency.
M
Matt Fenwick

On MySQL 5.1.51, both queries have identical execution plans:

mysql> explain select * from table1 a inner join table2 b on a.pid = b.pid;
+----+-------------+-------+------+---------------+------+---------+--------------+------+-------+
| id | select_type | table | type | possible_keys | key  | key_len | ref          | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+--------------+------+-------+
|  1 | SIMPLE      | b     | ALL  | PRIMARY       | NULL | NULL    | NULL         |  986 |       |
|  1 | SIMPLE      | a     | ref  | pid           | pid  | 4       | schema.b.pid |   70 |       |
+----+-------------+-------+------+---------------+------+---------+--------------+------+-------+
2 rows in set (0.02 sec)

mysql> explain select * from table1 a, table2 b where a.pid = b.pid;
+----+-------------+-------+------+---------------+------+---------+--------------+------+-------+
| id | select_type | table | type | possible_keys | key  | key_len | ref          | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+--------------+------+-------+
|  1 | SIMPLE      | b     | ALL  | PRIMARY       | NULL | NULL    | NULL         |  986 |       |
|  1 | SIMPLE      | a     | ref  | pid           | pid  | 4       | schema.b.pid |   70 |       |
+----+-------------+-------+------+---------------+------+---------+--------------+------+-------+
2 rows in set (0.00 sec)

table1 has 166208 rows; table2 has about 1000 rows.

This is a very simple case; it doesn't by any means prove that the query optimizer wouldn't get confused and generate different plans in a more complicated case.


This should be the accepted answer. This is correct, the plan is the same (or close to with bigger statements) but the amount of records will be drastic, thus causing difference in performance.
e
edosoft

The second syntax has the unwanted possibility of a cross join: you can add tables to the FROM part without corresponding WHERE clause. This is considered harmful.


What if the table names in the from clause are generated from the tables used in the where clause?
you can do a cross join with the explicit JOIN syntax as well.(stackoverflow.com/a/44438026/929164) you probably meant that it is less strict, thus more prone to user error.
a
andy47

The first answer you gave uses what is known as ANSI join syntax, the other is valid and will work in any relational database.

I agree with grom that you should use ANSI join syntax. As they said, the main reason is for clarity. Rather than having a where clause with lots of predicates, some of which join tables and others restricting the rows returned with the ANSI join syntax you are making it blindingly clear which conditions are being used to join your tables and which are being used to restrict the results.


m
marc_s

@lomaxx: Just to clarify, I'm pretty certain that both above syntax are supported by SQL Serv 2005. The syntax below is NOT supported however

select a.*, b.*  
from table a, table b  
where a.id *= b.id;

Specifically, the outer join (*=) is not supported.


Frankly I wouldn't use it even in SQL Server 2000, the *= syntax often gives wrong answers. Sometimes it interprets these as cross joins.
J
Joshdan

Performance wise, they are exactly the same (at least in SQL Server) but be aware that they are deprecating this join syntax and it's not supported by sql server2005 out of the box.

I think you are thinking of the deprecated *= and =* operators vs. "outer join".

I have just now tested the two formats given, and they work properly on a SQL Server 2008 database. In my case they yielded identical execution plans, but I couldn't confidently say that this would always be true.


L
Leigh Caldwell

On some databases (notably Oracle) the order of the joins can make a huge difference to query performance (if there are more than two tables). On one application, we had literally two orders of magnitude difference in some cases. Using the inner join syntax gives you control over this - if you use the right hints syntax.

You didn't specify which database you're using, but probability suggests SQL Server or MySQL where there it makes no real difference.


Leigh, you can use the hints in implicit joins too.
In Oracle it is extremely rare for the join order to affect the execution plan in a meaningful way. See this article by Jonathan Lewis for an explanation.
M
Mike McAllister

As Leigh Caldwell has stated, the query optimizer can produce different query plans based on what functionally looks like the same SQL statement. For further reading on this, have a look at the following two blog postings:-

One posting from the Oracle Optimizer Team

Another posting from the "Structured Data" blog

I hope you find this interesting.


Mike, the difference they are talking about is that you need to be sure that if you specify an explicit join, you specify the join condition to join on, not the filter. You will note that for semantically correct queries, the exec plan is the same.
M
Michele La Ferla

Basically, the difference between the two is that one is written in the old way, while the other is written in the modern way. Personally, I prefer the modern script using the inner, left, outer, right definitions because they are more explanatory and makes the code more readable.

When dealing with inner joins there is no real difference in readability neither, however, it may get complicated when dealing with left and right joins as in the older method you would get something like this:

SELECT * 
FROM table a, table b
WHERE a.id = b.id (+);

The above is the old way how a left join is written as opposed to the following:

SELECT * 
FROM table a 
LEFT JOIN table b ON a.id = b.id;

As you can visually see, the modern way of how the script is written makes the query more readable. (By the way same goes for right joins and a little more complicated for outer joins).

Going back to the boiler plate, it doesn't make a difference to the SQL compiler how the query is written as it handles them in the same way. I've seen a mix of both in Oracle databases which have had many people writing into it, both elder and younger ones. Again, it boils down to how readable the script is and the team you are developing with.


D
David

Performance wise, it should not make any difference. The explicit join syntax seems cleaner to me as it clearly defines relationships between tables in the from clause and does not clutter up the where clause.


S
Sean

In my experience, using the cross-join-with-a-where-clause syntax often produces a brain damaged execution plan, especially if you are using a Microsoft SQL product. The way that SQL Server attempts to estimate table row counts, for instance, is savagely horrible. Using the inner join syntax gives you some control over how the query is executed. So from a practical point of view, given the atavistic nature of current database technology, you have to go with the inner join.


Do you have any proof of this? Because the accepted answer says otherwise.