The JOIN concept
JOIN is a query clause that can be used with the SELECT, UPDATE, and DELETE data query statements to simultaneously affect rows from multiple tables. There are several distinct types of JOIN statements that return different data result sets.
Joined tables must each include at least one field in both tables that contain comparable data. For example, if you want to join a Customer table and a Transaction table, they both must contain a common element, such as a CustomerID column, to serve as a key on which the data can be matched. Tables can be joined on multiple columns so long as the columns have the potential to supply matching information. Column names across tables don't have to be the same, although for readability this standard is generally preferred.
SELECT CustomerName, TransDate
FROM Customer INNER JOIN Transaction
ON Customer.CustomerID = Transaction.CustomerID;
If a row in the Transaction table contains a CustomerID that’s not listed in the Customer table, that row will not be returned as part of the result set. Likewise, if the Customer table has a CustomerIDwith no corresponding rows in the Transaction table, the row from the Customer table won’t be returned.
JOIN is a query clause that can be used with the SELECT, UPDATE, and DELETE data query statements to simultaneously affect rows from multiple tables. There are several distinct types of JOIN statements that return different data result sets.
Joined tables must each include at least one field in both tables that contain comparable data. For example, if you want to join a Customer table and a Transaction table, they both must contain a common element, such as a CustomerID column, to serve as a key on which the data can be matched. Tables can be joined on multiple columns so long as the columns have the potential to supply matching information. Column names across tables don't have to be the same, although for readability this standard is generally preferred.
Now that we’ve examined the basic theory, let’s take a look at the various types of joins and examples of each.
The basic JOIN statement
A basic JOIN statement has the following format:
SELECT Customer.CustomerID, TransID, TransAmt
FROM Customer JOIN Transaction
ON Customer.CustomerID = Transaction.CustomerID;
In practice, you'd never use the example above because the type of join is not specified. In this case, SQL Server assumes an INNER JOIN. You can get the equivalent to this query by using the statement:
SELECT Customer.CustomerID, TransID, TransAmt
FROM Customer, Transaction;
The basic JOIN statement
A basic JOIN statement has the following format:
SELECT Customer.CustomerID, TransID, TransAmt
FROM Customer JOIN Transaction
ON Customer.CustomerID = Transaction.CustomerID;
In practice, you'd never use the example above because the type of join is not specified. In this case, SQL Server assumes an INNER JOIN. You can get the equivalent to this query by using the statement:
SELECT Customer.CustomerID, TransID, TransAmt
FROM Customer, Transaction;
However, the example is useful to point out a few noteworthy concepts:
- TransID and TransAmt do not require fully qualified names because they exist in only one of the tables. You can use fully qualified names for readability if you wish.
- The Customer table is considered to be the “left” table because it was called first. Likewise, theTransaction table is the “right” table.
- You can use more than two tables, in which case each one is “naturally” joined to the cumulative result in the order they are listed, unless controlled by other functionality such as “join hints” or parenthesis.
- You may use WHERE and ORDER BY clauses with any JOIN statement to limit the scope of your results. Note that these clauses are applied to the results of your JOIN statement.
- SQL Server does not recognize the semicolon (;), but I use it in the included examples to denote the end of each statement, as would be expected by most other RDBMSs.
The INNER JOIN drops rows
When you perform an INNER JOIN, only rows that match up are returned. Any time a row from either table doesn’t have corresponding values from the other table, it is disregarded. Because stray rows aren’t included, you don’t have any of the “left” and “right” nonsense to deal with and the order in which you present tables matters only if you have more than two to compare. Since this is a simple concept, here’s a simple example:
When you perform an INNER JOIN, only rows that match up are returned. Any time a row from either table doesn’t have corresponding values from the other table, it is disregarded. Because stray rows aren’t included, you don’t have any of the “left” and “right” nonsense to deal with and the order in which you present tables matters only if you have more than two to compare. Since this is a simple concept, here’s a simple example:
SELECT CustomerName, TransDate
FROM Customer INNER JOIN Transaction
ON Customer.CustomerID = Transaction.CustomerID;
If a row in the Transaction table contains a CustomerID that’s not listed in the Customer table, that row will not be returned as part of the result set. Likewise, if the Customer table has a CustomerIDwith no corresponding rows in the Transaction table, the row from the Customer table won’t be returned.
The OUTER JOIN can include mismatched rows
OUTER JOINs, sometimes called “complex joins,” aren’t actually complicated. They are so-called because SQL Server performs two functions for each OUTER JOIN.
The first function performed is an INNER JOIN. The second function includes the rows that the INNER JOIN would have dropped. Which rows are included depends on the type of OUTER JOIN that is used and the order the tables were presented.
There are three types of an OUTER JOIN: LEFT, RIGHT, and FULL. As you’ve probably guessed, the LEFT OUTER JOIN keeps the stray rows from the “left” table (the one listed first in your query statement). In the result set, columns from the other table that have no corresponding data are filled with NULL values. Similarly, the RIGHT OUTER JOIN keeps stray rows from the right table, filling columns from the left table with NULL values. The FULL OUTER JOIN keeps all stray rows as part of the result set. Here is your example:
SELECT CustomerName, TransDate, TransAmt
FROM Customer LEFT OUTER JOIN Transaction
ON Customer.CustomerID = Transaction.CustomerID;
OUTER JOINs, sometimes called “complex joins,” aren’t actually complicated. They are so-called because SQL Server performs two functions for each OUTER JOIN.
The first function performed is an INNER JOIN. The second function includes the rows that the INNER JOIN would have dropped. Which rows are included depends on the type of OUTER JOIN that is used and the order the tables were presented.
There are three types of an OUTER JOIN: LEFT, RIGHT, and FULL. As you’ve probably guessed, the LEFT OUTER JOIN keeps the stray rows from the “left” table (the one listed first in your query statement). In the result set, columns from the other table that have no corresponding data are filled with NULL values. Similarly, the RIGHT OUTER JOIN keeps stray rows from the right table, filling columns from the left table with NULL values. The FULL OUTER JOIN keeps all stray rows as part of the result set. Here is your example:
SELECT CustomerName, TransDate, TransAmt
FROM Customer LEFT OUTER JOIN Transaction
ON Customer.CustomerID = Transaction.CustomerID;
Customer names that have no associated transactions will still be displayed. However, transactions with no corresponding customers will not, because we used a LEFT OUTER JOIN and theCustomer table was listed first.
In SQL Server, the word OUTER is actually optional. The clauses LEFT JOIN, RIGHT JOIN, and FULL JOIN are equivalent to LEFT OUTER JOIN, RIGHT OUTER JOIN, and FULL OUTER JOIN, respectively.
In SQL Server, the word OUTER is actually optional. The clauses LEFT JOIN, RIGHT JOIN, and FULL JOIN are equivalent to LEFT OUTER JOIN, RIGHT OUTER JOIN, and FULL OUTER JOIN, respectively.
No comments:
Post a Comment