Best practices for storing postal addresses in a database (RDBMS)?

Are there any good references for best practices for storing postal addresses in an RDBMS? It seems there are lots of tradeoffs that can be made and lots of pros and cons to each to be evaluated -- surely this has been done time and time again? Maybe someone has at least written done some lessons learned somewhere?

Examples of the tradeoffs I am talking about are storing the zipcode as an integer vs a char field, should house number be stored as a separate field or part of address line 1, should suite/apartment/etc numbers be normalized or just stored as a chunk of text in address line 2, how do you handle zip +4 (separate fields or one big field, integer vs text)? etc.

I'm primarily concerned with U.S. addresses at this point but I imagine there are some best practices in regards to preparing yourself for the eventuality of going global as well (e.g. naming fields appropriately like region instead of state or postal code instead of zip code, etc.


Solution 1:

For more international use, one schema to consider is the one used by Drupal Address Field. It's based on the xNAL standard, and seems to cover most international cases. A bit of digging into that module will reveal some nice pearls for interpreting and validating addresses internationally. It also has a nice set of administrative areas ( province, state, oblast, etc ) with ISO codes.

Here's the gist of the schema, copied from the module page:

country => Country (always required, 2 character ISO code)
name_line => Full name (default name entry)
first_name => First name
last_name => Last name
organisation_name => Company
administrative_area => State / Province / Region (ISO code when available)
sub_administrative_area => County / District (unused)
locality => City / Town
dependent_locality => Dependent locality (unused)
postal_code => Postal code / ZIP Code
thoroughfare => Street address
premise => Apartment, Suite, Box number, etc.
sub_premise => Sub premise (unused)

A lessons I've learned:

  • Don't store anything numerically.
  • Store country and administrative area as ISO codes where possible.
  • When you don't know, be lax about requiring fields. Some country may not use fields you take for granted, even basic things like locality & thoroughfare.

Solution 2:

As an 'international' user, there is nothing more frustrating than dealing with a website that is oriented around only US-format addresses. It's a little rude at first, but becomes a serious problem when the validation is also over-zealous.

If you are concerned with going global, the only advice I have is to keep things free-form. Different countries have different conventions - in some, the house number comes before the street name, in some it comes after. Some have states, some regions, some counties, some combinations of those. Here in the UK, the zipcode is not a zipcode, it's a postcode containing both letters and numbers.

I'd advise simply ~10 lines of variable-length strings, together with a separate field for a postcode (and be careful how you describe that to cope with national sensibilities). Let the user/customer decide how to write their addresses.

Solution 3:

If you need comprehensive information about how other countries use postal addresses, here's a very good reference link (Columbia University):

Frank's Compulsive Guide to Postal Addresses
Effective Addressing for International Mail