What Is a Primary Key?
A primary key is an entry in a database that is unique to a single record. This key is generally derived in one of two ways: a unique identification code from outside the database or a generated number from within the database. When the database will contain information that is always unique to the entry, such as a social security number or part identification number, then those are typically used as a primary key. When the data won’t have such an identifier, the database will often create numbers based on its internal systems to give each record a unique code.
There are three main restrictions on a primary key: existence, uniqueness and immutability. A key must exist at the time the record is made — it can’t be added in later. Each key has to be completely different from any other key. This means that common identifiers, such as name or birth date, can’t be used because it is possible that two people are born on the same day with the same name. Lastly, a primary key can never be altered once created.
Since a database can potentially have an infinite number of entries, a primary key must be infinite as well. To make sure that a database will never run out of keys, most records use a numerical code for the key. Since numbers can always get bigger and computers can simply add place-holding zeroes to older entries, a system will never run out of keys. Sometimes these numbers are based on non-unique information, but a unique identifier is added to make sure the key is viable.
Databases will use a primary key as a way of organizing data. Since the key is never repeated, that piece of information will allow the database to keep every record separate from every other. Each piece of information in a record is connected back to the key; that way, no matter what happens to the system, the database can rebuild the records from loose information.
Assigning a meaningful primary key is often seen as a better practice than auto-generating a value. This will give the record an identifier that both works as a key and provides data. In small databases, this distinction is rarely necessary, but in large systems, the extra space used by a generated key can result is serious database bloat. This will both slow the system down and make the database require significantly more storage space.
I can't agree with what author said about the "meaningful primary key". The primary key should be some unique meaningless identifier. Whenever you assign a meaningful primary key, I guarantee that exception will occur. If you didn't see one, you project is not big enough. For example, you would assume a SSN can be used as unique identifier, but what if the person is an undocumented, new born baby? They don't have a SSN!
@allenJo - One of the other purposes of primary keys is that they enable you to relate tables to each other. Table relationships are another aspect of good database design.
You may be able to relate tables without primary keys, but you will have problems later on if you try to do batch updates of the data and stuff. I also think that in this regard, numbers are indeed better candidates as primary keys.
@hamje32 - You can certainly do that, but in my opinion there is no better primary key than a single, unique number. If you use something else – or even a composite of fields like you said – it slows down the system.
As much as some people bewail Social Security numbers as being the mark of Big Brother, there would no other practical way to uniquely track individuals in our country in my opinion.
What else would you use – a combination of last name, first name, date and location of birth? I guarantee you that you will hit duplicates using that approach eventually.
You can’t go wrong with a number that never repeats. Databases are far more efficient processing numbers than they are with text anyway.
@Mammmood - Yeah, another thing to point out is that you can have multiple primary keys in a table. In this case it would be a composite primary key.
If it takes, for example, three pieces of information to uniquely identify a record, then you can set primary keys on all of them. The combination of those three fields would need to be unique, otherwise the database would throw up an error and prevent you from adding a new record.
Here I am not referring to an auto number field of course, which would be unique in itself. I am referring to other fields you create which you decide to use as primary keys.
Primary keys are one of the first things that you learn about in good database design. To create tables that have no primary keys is to invite a nightmare.
If you don’t want to create a unique primary key on your own, some database systems have what’s called an auto number field. Basically this field starts with the number one and increments with each new record that you add. It’s a long integer field, and it can increment until you reach a certain number into the billions.
Most tables don’t have billions of records so you don’t need to worry there.
@robbie21 - Well, with ISBNs, you see that they had to add three more digits because 10 digits were no longer going to be enough! And I guess with social security numbers, if there were ever more than a billion Americans, they could add digits to that, too!
VINs are so long and remember they also have letters in them, which means there are more possibilities (because a letter space has 26 rather than 10 possible place fillers) so I suppose they have plenty of room for now. After all, there isn't really an infinite number of cars - there are only so many people, and only so many cars that it's practical for one person to own!
But yes, I guess those would be really good examples of primary keys. The difference I see is that the ISBN refers to a whole edition of a book - all that copies would have the same number - while a VIN is unique to one specific vehicle. (I'm a librarian and one memorable summer, I had a job in a body shop!)
Would ISBN (the 10- or 13-digit number you find in a book that uniquely identifies it) and VIN (vehicle identification number, that super-long number your car has) be considered primary keys? They are not infinite; they have set lengths. So do social security numbers, come to think of it.
Post your comments