Differences between key, superkey, minimal superkey, candidate key and primary key
I'm new to MySQL, and I'm really confused about the different terms that I've encountered. I tried googling the answer but the results are really confusing and when I try and understand it just seems like they are the same thing.
What exactly are the differences among key, superkey, minimal superkey, candidate key and primary key?
Here I copy paste some of the information that I have collected
Key A key is a single or combination of multiple fields. Its purpose is to access or retrieve data rows from table according to the requirement. The keys are defined in tables to access or sequence the stored data quickly and smoothly. They are also used to create links between different tables.
Types of Keys
Primary Key The attribute or combination of attributes that uniquely identifies a row or record in a relation is known as primary key.
Secondary key A field or combination of fields that is basis for retrieval is known as secondary key. Secondary key is a non-unique field. One secondary key value may refer to many records.
Candidate Key or Alternate key A relation can have only one primary key. It may contain many fields or combination of fields that can be used as primary key. One field or combination of fields is used as primary key. The fields or combination of fields that are not used as primary key are known as candidate key or alternate key.
Composite key or concatenate key A primary key that consists of two or more attributes is known as composite key.
Sort Or control key A field or combination of fields that is used to physically sequence the stored data called sort key. It is also known s control key.
A superkey is a combination of attributes that can be uniquely used to identify a database record. A table might have many superkeys. Candidate keys are a special subset of superkeys that do not have any extraneous information in them.
Example for super key:
Imagine a table with the fields <Name>
, <Age>
, <SSN>
and <Phone Extension>
. This table has many possible superkeys. Three of these are <SSN>
, <Phone Extension, Name>
and <SSN, Name>
. Of those listed, only <SSN>
is a candidate key, as the others contain information not necessary to uniquely identify records.
Foreign Key A foreign key is an attribute or combination of attribute in a relation whose value match a primary key in another relation. The table in which foreign key is created is called as dependent table. The table to which foreign key is refers is known as parent table.
I have always found it difficult to remember all the keys; so I keep the below notes handy, hope they help someone! Let me know if it can be improved.
Key: An attribute or combination of attributes that uniquely identify an entity/record in a relational table.
PK: A single key that is unique and not-null. It is one of the candidate keys.
Foreign Key: FK is a key in one table (child) that uniquely identifies a row of another table (parent). A FK is not-unique in the child table. It is a candidate key in the parent table. Referential integrity is maintained as the value in FK is present as a value in PK in parent table else it is NULL.
Unique Key: A unique key that may or may not be NULL
Natural key: PK in OLTP. It may be a PK in OLAP.
Surrogate Key: It is the Surrogate PK in OLAP acting as the substitute of the PK in OLTP. Artificial key generated internally in OLAP.
Composite Key: PK made up of multiple attributes
SuperKey: A key that can be uniquely used to identify a database record, that may contain extra attributes that are not necessary to uniquely identify records.
Candidate Key: A candidate key can be uniquely used to identify a database record without any extraneous data. They are Not Null and unique. It is a minimal super-key.
Alternate Key: A candidate key that is not the primary key is called an alternate key.
Candidate Key/s with Extraneous data: Consider that can be used to identify a record in the Employee table but candidate key alone is sufficient for this task. So becomes the extraneous data.
Note that the PK, Foreign Key, Unique Key, Natural key, Surrogate Key, Composite Key are defined as Database objects; where the Natural key is a PK in the OLTP and could be a PK in the target OLAP. For the rest of the keys, it's up to the DB designer/architect to decide whether unique/not-null/referential integrity constraints need to enforced or not.
Below I have tried to use set theory to simplify the representation of the membership of the keys w.r.t. each other.
key = { All of the below keys }
PK = { PK }
Foreign Key = { Key with Not Null constraint }
Unique Key = { {Candidate Key/s}, {attributes containing NULL} }
Natural key = { PK }
Surrogate Key = { PK }
Composite Key = { PK }
Super Key = { {Candidate Key/s}, {Candidate Key/s with Extraneous data} }
Candidate Key = { PK, {Alternate Key/s} }
Alternate Key = { {Candidate Keys} - PK }
Candidate Key/s with Extraneous data = { }
I have summarized it below:
Notes: an-overview-of-the-database-keys-primary-key-composite-key-surrogate-key-et-al