SQL Joins

SQL Joins

When we're using relational databases correctly, we're going to end up with a lot of tables in our database. That's great for organizing our data, but how do we query that data when it's spread across multiple tables? Well, the answer is JOIN.

We're going to work with a small database that contains just two tables pulled from the ACADEMY AWARDS® DATABASE. An actors table and an awards table. The following SQL contains the schema for the database and some seed data.

Note

Make sure you execute the schema then the seeds before continuing.

Schema

Loading...

Seeds

Loading...

If we wanted to get all of the awards with the id of the actor that won the award, that would be a simple query

Loading...

But instead of getting the winner's id, we want the winner's name. We want to

SELECT year, title, full_name

Now we're dealing with data from more than one table. We've got the year and title from the awards table, and the full_name from the actors table. We also have a relationship, the winner_id is an id of one of the actors. So we need to use this information to JOIN the tables together so that we can get the data we need. Note that when there is a shared column name between tables, we must specify which table the column belongs to, hence the actors.id

Loading...

In order to get data from multiple tables, we can JOIN the tables together. When we JOIN tables, we must specify how the tables are related to each other using ON.

This kind of JOIN is called an INNER JOIN. When we just write JOIN like this, it's just short hand for INNER JOIN. Try writing INNER before the word JOIN and re running the query.

Did you notice that the first query without the JOIN returned 12 results, but the second query with the INNER JOIN only returned 8 results? Why is that?

It's because an INNER JOIN will only return values where there's a match between the tables. In the first query, we selected all of the awards, even if there wasn't a winner (the winner_id was null). Since the table's are being JOINed when actors.id = winner_id, we won't get any results when the winner_id is null.

But what if we want all of the awards, even when there's no winner? Then we'll have to use another JOIN!

Loading...
  • INNER returns rows from both tables where there's a match between the tables.
  • LEFT and RIGHT return all rows from the left or right table respectively, and only the matched results from the other table, or NULL if there is no match.
  • FULL returns all rows from both tables.

Find an issue with this page? Fix it on GitHub