Hello everybody,
It’s Michael, and today’s post will be about subqueries. Subqueries are basically queries-within-queries; think mini-queries within a larger query, and you’ve got the idea. The purpose of subqueries is to retrieve data for the main query that will be used as a condition to filter the query output. You can usually find subqueries in three types of clauses (SELECT, FROM, and WHERE) and inside other subqueries.
Here’s a simple example with a WHERE clause:

In this query, I selected the fields Singer Name and Death (referring to a singer’s date of death) from the Singer table. The line WHERE Death = (SELECT MAX(Death) FROM mydb.Singer) asks the query to select and display the maximum, or most recent, date of death listed in the database (along with displaying the corresponding singer name). As you can see, the most recent date of death in the database is June 18, 2018, which corresponds to the rapper XXXTentacion.
- Even though MAX is in the subquery, GROUP BY is not necessary. Don’t worry about GROUP BY when including an aggregate function (COUNT, MAX, MIN, etc.) in your subquery.
Here’s a slightly more complicated example:
In this query, I selected the Singer Name, Age, and Gender fields from the Singer table. The subquery in Line 2 asks the query to only display records for singers within a certain age range (18-36, essentially any of the millennials). Line 3 simply orders the results from oldest to youngest singer.
Let’s try another subquery, except this time using the SELECT clause:
The subquery in the select clause-SELECT MAX(Track Duration) FROM mydb.Tracks-tells the main query to display the length of the longest song listed on this table as a column; the result is 18 minutes and 1 second. For those wondering, the song “Empire of the Clouds” from Iron Maiden’s The Book of Souls is the longest song listed on this table.
- SELECT DISTINCT ensures that the maximum track duration only appears once, otherwise it will appear 800 times (this happened when I first ran the query without the DISTINCT keyword).
Here’s a subquery with the FROM clause:
In this query, I am trying to find out the average song length for songs on each of the 57 albums listed on the database. The subquery `SELECT Album_idAlbum, SEC_TO_TIME(AVG(Track Duration)) AS Average Song Length FROM mydb.Tracks
GROUP BY Album_idAlbum ORDER BY Average Song Length ASC` basically functions as the whole query, as SELECT * asks the query to select everything that matches the criteria listed in the FROM clause subquery.
In the subquery, I selected the Album_idAlbum foreign key and the (average) Track Duration field from the Tracks Table. I then group the results by Album and order the results by Average Song Length in ascending order (from albums with the shortest average track length to albums with the longest average track length).
SEC_TO_TIMEdisplays the average track for each album in hour:minute:second form, because without this function the time displays as a decimal.
One last thing I wanted to mention with regards to subqueries is that it’s possible to have subqueries-within-subqueries. In these cases, the innermost subquery will run first, followed by the outer subquery, and finally the main query. Here’s an example:
In this query, I am trying to find out which rap album has the longest duration. I first select the Name, Release Date, and Duration fields from the Album table. The inner subquery-SELECT Genre FROM mydb.Album WHERE Genre IN (RAP)-retrieves all the rap albums from the database. The outer subquery-SELECT MAX(Duration) FROM mydb.Album WHERE Genre IN (inner subquery)-retrieves the rap album with the longest duration. WHERE Duration = (outer subquery(inner subquery)) tells the query to display the rap album (and corresponding release date) with the longest duration (which is Tupac’s album Until The End of Time).
Thanks for reading,
Michael