How do you add a NOT NULL Column to a large table in SQL Server?

I ran into this problem for my work also. And my solution is along #2.

Here are my steps (I am using SQL Server 2005):

1) Add the column to the table with a default value:

ALTER TABLE MyTable ADD MyColumn varchar(40) DEFAULT('')

2) Add a NOT NULL constraint with the NOCHECK option. The NOCHECK does not enforce on existing values:

ALTER TABLE MyTable WITH NOCHECK
ADD CONSTRAINT MyColumn_NOTNULL CHECK (MyColumn IS NOT NULL)

3) Update the values incrementally in table:

GO
UPDATE TOP(3000) MyTable SET MyColumn = '' WHERE MyColumn IS NULL
GO 1000
  • The update statement will only update maximum 3000 records. This allow to save a chunk of data at the time. I have to use "MyColumn IS NULL" because my table does not have a sequence primary key.

  • GO 1000 will execute the previous statement 1000 times. This will update 3 million records, if you need more just increase this number. It will continue to execute until SQL Server returns 0 records for the UPDATE statement.


Here's what I would try:

  • Do a full backup of the database.
  • Add the new column, allowing nulls - don't set a default.
  • Set SIMPLE recovery, which truncates the tran log as soon as each batch is committed.
  • The SQL is: ALTER DATABASE XXX SET RECOVERY SIMPLE
  • Run the update in batches as you discussed above, committing after each one.
  • Reset the new column to no longer allow nulls.
  • Go back to the normal FULL recovery.
  • The SQL is: ALTER DATABASE XXX SET RECOVERY FULL
  • Backup the database again.

The use of the SIMPLE recovery model doesn't stop logging, but it significantly reduces its impact. This is because the server discards the recovery information after every commit.


You could:

  1. Start a transaction.
  2. Grab a write lock on your original table so no one writes to it.
  3. Create a shadow table with the new schema.
  4. Transfer all the data from the original table.
  5. execute sp_rename to rename the old table out.
  6. execute sp_rename to rename the new table in.
  7. Finally, you commit the transaction.

The advantage of this approach is that your readers will be able to access the table during the long process and that you can perform any kind of schema change in the background.


Just to update this with the latest information.

In SQL Server 2012 this can now be carried out as an online operation in the following circumstances

  1. Enterprise Edition only
  2. The default must be a runtime constant

For the second requirement examples might be a literal constant or a function such as GETDATE() that evaluates to the same value for all rows. A default of NEWID() would not qualify and would still end up updating all rows there and then.

For defaults that qualify SQL Server evaluates them and stores the result as the default value in the column metadata so this is independent of the default constraint which is created (which can even be dropped if no longer required). This is viewable in sys.system_internals_partition_columns. The value doesn't get written out to the rows until next time they happen to get updated.

More details about this here: online non-null with values column add in sql server 2012


Admitted that this is an old question. My colleague recently told me that he was able to do it in one single alter table statement on a table with 13.6M rows. It finished within a second in SQL Server 2012. I was able to confirm the same on a table with 8M rows. Something changed in later version of SQL Server?

Alter table mytable add mycolumn char(1) not null default('N');