Same data from different entities in Database - Best Practice - Phone numbers example
Given a database system which deals with Staff, Customers and Suppliers, all of which have multiple possible phone numbers, how would you go about storing these numbers in a nice normalised way? I have a had a little think about and the logical way isn't jumping out at me.
Solution 1:
In most cases . . .
- "Staff" always describes people.
- Some customers are people.
- Some customers are businesses (organizations).
- "Suppliers" are usually (always?) organizations.
- Staff can also be customers.
- Suppliers can also be customers.
There are serious problems with having separate tables of staff phone numbers, supplier phone numbers, and customer phone numbers.
- Staff can be customers. If a staff phone number changes, does a customer phone number also need to be updated? How do you know which one to update?
- Suppliers can be customers. If a supplier's phone number changes, does a customer phone number also need to be updated? How do you know which one to update?
- You have to duplicate and maintain without error the constraints for phone numbers in every table that stores phone numbers.
- The same problems arise when a customer's phone number changes. Now you have to check to see whether staff and supplier phone numbers also need to be updated.
- To answer the question "Whose phone number is 123-456-7890?", you have to look in 'n' different tables, where 'n' is the number of different "kinds" of parties you deal with. In addition to staff, customers, and suppliers, think "contractor's phones", "prospect's phones", etc.
You need to implement a supertype/subtype schema. (PostgreSQL code, not rigorously tested.)
create table parties (
party_id integer not null unique,
party_type char(1) check (party_type in ('I', 'O')),
party_name varchar(10) not null unique,
primary key (party_id, party_type)
);
insert into parties values (1,'I', 'Mike');
insert into parties values (2,'I', 'Sherry');
insert into parties values (3,'O', 'Vandelay');
-- For "persons", a subtype of "parties"
create table person_st (
party_id integer not null unique,
party_type char(1) not null default 'I' check (party_type = 'I'),
height_inches integer not null check (height_inches between 24 and 108),
primary key (party_id),
foreign key (party_id, party_type) references parties (party_id, party_type) on delete cascade
);
insert into person_st values (1, 'I', 72);
insert into person_st values (2, 'I', 60);
-- For "organizations", a subtype of "parties"
create table organization_st (
party_id integer not null unique,
party_type CHAR(1) not null default 'O' check (party_type = 'O'),
ein CHAR(10), -- In US, federal Employer Identification Number
primary key (party_id),
foreign key (party_id, party_type) references parties (party_id, party_type) on delete cascade
);
insert into organization_st values (3, 'O', '00-0000000');
create table phones (
party_id integer references parties (party_id) on delete cascade,
-- Whatever you prefer to distinguish one kind of phone usage from another.
-- I'll just use a simple 'phone_type' here, for work, home, emergency,
-- business, and mobile.
phone_type char(1) not null default 'w' check
(phone_type in ('w', 'h', 'e', 'b', 'm')),
-- Phone numbers in the USA are 10 chars. YMMV.
phone_number char(10) not null check (phone_number ~ '[0-9]{10}'),
primary key (party_id, phone_type)
);
insert into phones values (1, 'h', '0000000000');
insert into phones values (1, 'm', '0000000001');
insert into phones values (3, 'h', '0000000002');
-- Do what you need to do on your platform--triggers, rules, whatever--to make
-- these views updatable. Client code uses the views, not the base tables.
-- In current versions of PostgreSQL, I think you'd create some "instead
-- of" rules.
--
create view people as
select t1.party_id, t1.party_name, t2.height_inches
from parties t1
inner join person_st t2 on (t1.party_id = t2.party_id);
create view organizations as
select t1.party_id, t1.party_name, t2.ein
from parties t1
inner join organization_st t2 on (t1.party_id = t2.party_id);
create view phone_book as
select t1.party_id, t1.party_name, t2.phone_type, t2.phone_number
from parties t1
inner join phones t2 on (t1.party_id = t2.party_id);
To stretch this out a little further, a table to implement "staff" needs to reference the person subtype, not the party supertype. Organizations can't be on staff.
create table staff (
party_id integer primary key references person_st (party_id) on delete cascade,
employee_number char(10) not null unique,
first_hire_date date not null default CURRENT_DATE
);
If suppliers can only be organizations, not individuals, then a table implementing suppliers would reference the organizations subtype in a similar way.
For most companies, a customer can be either a person or an organization, so a table implementing customers should reference the supertype.
create table customers (
party_id integer primary key references parties (party_id) on delete cascade
-- Other attributes of customers
);