Hello everybody,
This is Michael, and as I mentioned in the last post, I will start building the database that I will be using for this series of posts.
The database will store information about 54 albums (3 for each year from 2000 to 2018) such as track listings, artists, featured artists on certain tracks, genre, release year, duration (of album and individual track), etc.
In the previous post, I did mention that MySQL is meant for query-based analysis. However, before beginning to do queries, we must create our database. As the title explains, this post will focus on the creation of an ER (entity-relationship) diagram. An ER diagram is a graphical representation of items in a database (in this case, albums, track listings, artists, etc.) and how they are related to each other (like how albums can have several track listings).
So, without further hesitation, here is the ER diagram for the database I will be using.

Now you may be confused by all of the arrows and tables in the diagram. Here’s an explanation.
- This ER diagram will represent the relationship between albums and singers, songs on the album, featured artists, as well as any awards the album either won or received nominations.
- How do all of these tables relate to each other? Here’s how.
- Each album must have several tracks, while each track belongs to one and only one album (the thing that looks like a three-pointed arrow means “many” while the thing with two vertical lines represents “one”)
- Each track can have several, one, or no feature artists (that’s why you see a circle) but each featured artist must belong to one and only track.
- Each singer can appear more than once (if they have several albums in the database) or just once but each album must correspond to one and only one singer.
- Each album can be nominated for one or several awards (eg. Grammys, MTV VMAs, etc.) but each award must correspond to only one album.
- You’ll notice that this is the only dotted line in the diagram. This is because the relationship between album and awards is non-identifying, meaning that you can identify the award based on idAwards field alone, without needing the album field for identification.
- As for the rest of the relationships (which are known as identifying relationships), each table is dependent on the other table for identification
- For example, you can’t identify a featured artist without knowing what track they appear on. Likewise, you can’t identify a song without knowing what album it is a part of. Nor can you identify an album without knowing which singer/group created it.
Now what about the attributes in each table (those are the things with diamonds right by them)? Here’s what each of them mean.
- The album field contains the attributes
- Name-the name of the album
- Duration-how long the album is (given in hours:minutes:seconds)
- Release Year-the year the album came out
- Album number-how many albums has the artist made up through that point; in other words, is this the artist’s 1st album? 4th? 5th?
- Genre-the genre of the album
- The singer field contains the attributes
- Singer Name-the singer’s (or group’s) name
- Age-the singer’s age as of August 1, 2018 (if they are still living)
- Birthplace-the singer’s birthplace (or where the group was formed)
- Date of Death-the date the artist died
- You’ll notice the attributes age and date of death have white diamonds right by them; this is because each of them can be null (have no value). For instance, the date of death field can stay blank for living artists. For the other attributes that have blue diamonds besides them, they have to have some sort of value (can’t be null in other words).
- The tracks field contains the attributes
- Track Name-the song’s name
- Track Duration-the length of the song (given in hours:minutes:seconds)
- The featured artist field contains the attribute
- Featured artist name-the name of any artist who appears on a particular track
- The awards field contains the attributes
- Ceremony-the ceremony where the album either was nominated for or won an award (Grammys, Billboard Music Awards, etc.)
- Ceremony Year-the year of the ceremony where the album either got nominated for or won an award
- Won/Nominated-whether an album won or was nominated for a particular award
But wait, what are those keys right by some of the attributes? Those are called primary keys, which are one or more columns with data used to uniquely identify each row in the table. Primary keys are usually stored in auto-incrementing indexes (starting with 1, then 2, then 3, and so on) to ensure uniqueness. For example, in the album table, 1 would be the primary key for the first album in the database, then 2, then 3, all the way to 54.
Take this part of the diagram:
idSinger and idAlbum are both primary keys in their respective tables. But wait, why does Singer_idSinger1 appear in the album table? That is because Singer_idSinger1 is a foreign key, which is a column or set of columns in a table that refers to the primary key in another table-which would be the primary key for the singer table. Foreign keys basically serve as a means to connect the referencing table (album) with the referenced table (singer).
If you want to know how primary keys and foreign keys differ from each other, here’s a handy table (Source-https://www.essentialsql.com/what-is-the-difference-between-a-primary-key-and-a-foreign-key/)
That’s all for now. Thanks for reading,
Michael