Surrogate Key vs Primary Key: What’s the Difference?

Ryan Forrester

Nov 1st, 2023

Blog

The Ultimate Guide to Surrogate Key vs Primary Key

This article provides a comprehensive guide on database keys, specifically focusing on surrogate keys and primary keys, explaining their purposes, differences, and scenarios for use.

It is part of a series of articles on database design that also cover important concepts such as one-to-many relationships or many-to-many relationships.

SQL Made Simple - Try Five
Five is the Easiest Way to Create, Model and Query a SQL Database

___

Here’s What You Can Do With Five

Create and Model a SQL Database ✅
Write or Build SQL Queries ✅
Visualize Queries as Charts or in Dashboards ✅
Add CRUD Permissions to Control Data Access ✅
Host Your Database Online ✅

“Five bridges the gap between SQL and the web,
allowing me to create full-stack applications almost entirely in SQL”
– Crag Jones, Database Administrator (DBA)

Table of Contents

Database GUIs: How to Work with a SQL Database
What Is a Database Key?
What Is a Surrogate Key?
What Is a Primary Key?
What Are the Differences Between Surrogate Keys and Primary Keys?
When To Use a Surrogate Key
When To Use a Primary Key
A Simple Analogy for Surrogate Keys vs Primary Keys
Five and Primary Keys

Database GUIs: How to Work with a SQL Database

Learning SQL is an important skill for almost every programmer. SQL is so pervasive in application development that not understanding relational databases and how to interact with them is a big gap in programmers’ skills.

To make learning SQL easy, choosing the right tool is important. MySQL Workbench and other database GUIs all enable programmers to interact with an SQL database.

Five, a rapid app development environment with an integrated MySQL GUI, makes SQL even simpler. For example, Five automatically adds primary keys to all tables created. This feature is highly beneficial for users as it ensures every table has a unique identifier for each record without manual setup. The primary keys used in Five are GUIDs, which are 128-bit text strings. These provide a high level of uniqueness, significantly reducing the likelihood of duplication or collision.

What Is a Database Key?

Let’s start with this first. A database key is a column or set of columns in a database table that uniquely identifies each row in the table. Keys are used to enforce data integrity and to improve the performance of database queries.

What Is a Surrogate Key?

A surrogate key is a database key that is not derived from the natural key of the table. Instead, it is a unique identifier that is generated by the database system. Surrogate keys are often used when the natural key is not unique or when it is subject to change.

What Is a Primary Key?

A primary key is a database key that uniquely identifies each row in a table. The primary key is often the natural key of the table, but it can also be a surrogate key.

What Are the Differences Between Surrogate Keys and Primary Keys?

The main difference between a surrogate key and a primary key is that a surrogate key is not derived from the natural key of the table. This means that a surrogate key is not meaningful to users, but it is efficient for the database system to use. A primary key, on the other hand, is often the natural key of the table, which means that it is meaningful to users and can be used to perform business logic.

In other words, a surrogate key is a type of primary key.

Here is a simple example:

Natural key: A customer’s name.
Primary key: The customer ID number.
Surrogate key: A unique customer ID number.

In this example, the natural key is not unique, because there may be multiple customers with the same name. Therefore, we would use a surrogate key as the primary key. This ensures that each customer record has a unique identifier.

When To Use a Surrogate Key

When the Natural Key Is Not Unique.

A natural key is a column or set of columns that uniquely identify a row in a table based on the business meaning of the data. However, there are cases where a natural key may not be unique, such as when two people have the same name or when a product has the same SKU number but different sizes or colors. In these cases, a surrogate key can be used to uniquely identify each row in the table.

For example, a table of customers may have a natural key of name and address. However, it is possible for two customers to have the same name and address. To ensure that each customer row is uniquely identified, a surrogate key, such as customer_id, can be used.

When the Natural Key Is Too Long or Complex.

A surrogate key is typically a short, integer value. This makes it easier to store and index in a database, and it can also improve the performance of queries. If the natural key is too long or complex, it can impact the performance of the database.

For example, a table of products may have a natural key of product_description. However, product descriptions can be quite long and complex. To improve the performance of the database, a surrogate key, such as product_id, can be used.

When the Natural Key Is Subject to Change.

If the natural key is subject to change, this can cause problems for the database. For example, if the natural key is a customer’s Social Security number, and the customer changes their Social Security number, this will require the database to be updated. This can be a complex and time-consuming process.

To avoid these problems, a surrogate key can be used instead of the natural key. Surrogate keys are typically not subject to change, which makes the database more stable and easier to maintain.

When To Use a Primary Key

When the Natural Key Is Unique and Permanent.

A natural key is a column or set of columns that uniquely identify a row in a table based on the business meaning of the data. For example, a natural key for a table of customers might be customer_id.

If the natural key is unique and permanent, then it is a good candidate for the primary key. This is because the primary key should be a column or set of columns that can uniquely identify each row in the table and that will not change over time.

When the Natural Key Is a Good Identifier for the Record.

The primary key should also be a good identifier for the record. This means that it should be easy to understand and remember. For example, a customer ID is a good identifier for a customer record, but a customer’s Social Security number would not be a good identifier because it is difficult to remember and may change over time.

When the Natural Key Is Needed for Foreign Key Relationships.

Foreign key relationships are used to link tables together. For example, a table of orders may have a foreign key relationship to the table of customers, so that each order can be linked to the customer who placed the order.

If the natural key is needed for a foreign key relationship, then it should be used as the primary key. This will ensure that the foreign key relationship is always valid.

Examples

Here are some examples of when to use a primary key:

A customer ID in a table of customers
A product ID in a table of products
An order ID in a table of orders
An invoice ID in a table of invoices
An employee ID in a table of employees

In general, it is a good practice to use natural keys as primary keys whenever possible. However, there are some cases where it is necessary or beneficial to use a surrogate key instead.

A Simple Analogy for Surrogate Keys vs Primary Keys

Imagine a classroom of students.

A natural key for a student could be their name. However, there may be multiple students in the classroom with the same name. To ensure that each student can be uniquely identified, the teacher could assign each student a unique student ID number. This student ID number would be the student’s surrogate key.

The student ID number uniquely identifies each student and is not subject to change. This makes it a good candidate for the primary key of the student table in the school database.

Five and Primary Keys

Five uses primary keys to uniquely identify each row in a table. This allows Five to efficiently query and manipulate data.

Five also uses primary keys to enforce data integrity. For example, a foreign key constraint can be used to ensure that a row in one table references a valid row in another table.

Here are some specific examples of how Five uses primary keys:

To generate database tables: When you create a new database table in Five, Five will automatically add a primary key to a newly created table. You don’t have to worry about doing so yourself. Given the importance of primary keys for data accuracy, this is a great tool for developers.
To create forms and dashboards: Five can automatically generate forms and dashboards based on your database tables. When generating a form or dashboard, Five will use the primary key column to uniquely identify each row. This allows Five to display the correct data in the form or dashboard.
To create relationships between tables: Five allows you to create relationships between tables. For example, you could create a relationship between a customers table and an orders table. When creating a new table using the Table Wizard, Five will ask you whether you’d like to establish any relationships between tables. Just select the tables that you’re new table is related to, and Five will insert a Foreign Key for you.

Thanks for reading Surrogate Key vs Primary Keys! For a free trial of Five, make sure to sign up here!