How to extract the schema of an Access (.mdb) database?

I am trying to extract the schema of an .mdb database, so that I can recreate the database elsewhere.

How can I pull off something like this?


Solution 1:

It is possible to do a little with VBA. For example, here is a start on creating script for a database with local tables.

Dim db As Database
Dim tdf As TableDef
Dim fld As DAO.Field
Dim ndx As DAO.Index
Dim strSQL As String
Dim strFlds As String
Dim strCn As String

Dim fs, f

    Set db = CurrentDb

    Set fs = CreateObject("Scripting.FileSystemObject")
    Set f = fs.CreateTextFile("C:\Docs\Schema.txt")
    
    For Each tdf In db.TableDefs
        If Left(tdf.Name, 4) <> "Msys" Then
            strSQL = "strSQL=""CREATE TABLE [" & tdf.Name & "] ("
            
            strFlds = ""
            
            For Each fld In tdf.Fields
                
                strFlds = strFlds & ",[" & fld.Name & "] "
                
                Select Case fld.Type
                
                    Case dbText
                        'No look-up fields
                        strFlds = strFlds & "Text (" & fld.Size & ")"
                    
                    Case dbLong
                        If (fld.Attributes And dbAutoIncrField) = 0& Then
                            strFlds = strFlds & "Long"
                        Else
                            strFlds = strFlds & "Counter"
                        End If
    
                    Case dbBoolean
                        strFlds = strFlds & "YesNo"
    
                    Case dbByte
                        strFlds = strFlds & "Byte"
                        
                    Case dbInteger
                        strFlds = strFlds & "Integer"
                    
                    Case dbCurrency
                        strFlds = strFlds & "Currency"
            
                    Case dbSingle
                        strFlds = strFlds & "Single"
            
                    Case dbDouble
                        strFlds = strFlds & "Double"
            
                    Case dbDate
                        strFlds = strFlds & "DateTime"
            
                    Case dbBinary
                        strFlds = strFlds & "Binary"
            
                    Case dbLongBinary
                        strFlds = strFlds & "OLE Object"
                        
                    Case dbMemo
                        If (fld.Attributes And dbHyperlinkField) = 0& Then
                            strFlds = strFlds & "Memo"
                        Else
                            strFlds = strFlds & "Hyperlink"
                        End If
                        
                    Case dbGUID
                        strFlds = strFlds & "GUID"
                        
                End Select
            
            Next
    
            strSQL = strSQL & Mid(strFlds, 2) & " )""" & vbCrLf & "Currentdb.Execute strSQL"
            
            f.WriteLine vbCrLf & strSQL
        
            'Indexes
            For Each ndx In tdf.Indexes
            
                If ndx.Unique Then
                    strSQL = "strSQL=""CREATE UNIQUE INDEX "
                Else
                    strSQL = "strSQL=""CREATE INDEX "
                End If
                
                strSQL = strSQL & "[" & ndx.Name & "] ON [" & tdf.Name & "] ("
                
                strFlds = ""

                For Each fld In tdf.Fields
                    strFlds = strFlds & ",[" & fld.Name & "]"
                Next
                    
                strSQL = strSQL & Mid(strFlds, 2) & ") "
                
                strCn = ""
                
                If ndx.Primary Then
                    strCn = " PRIMARY"
                End If
                
                If ndx.Required Then
                    strCn = strCn & " DISALLOW NULL"
                End If
                
                If ndx.IgnoreNulls Then
                    strCn = strCn & " IGNORE NULL"
                End If
                
                If Trim(strCn) <> vbNullString Then
                    strSQL = strSQL & " WITH" & strCn & " "
                End If
                
                f.WriteLine vbCrLf & strSQL & """" & vbCrLf & "Currentdb.Execute strSQL"
            Next
        End If
    Next
        
    f.Close

Solution 2:

It's an ancient question now, but unfortunately perennial :(

I thought this code may be of use to others looking for solutions. It's designed to be run from the command line via cscript, so no need to import code into your Access project. Similar to (and inspired by) the code from Oliver in How do you use version control with Access development.

' Usage:
'  CScript //Nologo ddl.vbs <input mdb file> > <output>
'
' Outputs DDL statements for tables, indexes, and relations from Access file 
' (.mdb, .accdb) <input file> to stdout.  
' Requires Microsoft Access.
'
' NOTE: Adapted from code from "polite person" + Kevin Chambers - see:
' http://www.mombu.com/microsoft/comp-databases-ms-access/t-exporting-jet-table-metadata-as-text-119667.html
'
Option Explicit
Dim stdout, fso
Dim strFile
Dim appAccess, db, tbl, idx, rel

Set stdout = WScript.StdOut
Set fso = CreateObject("Scripting.FileSystemObject")

' Parse args
If (WScript.Arguments.Count = 0) then
    MsgBox "Usage: cscript //Nologo ddl.vbs access-file", vbExclamation, "Error"
    Wscript.Quit()
End if
strFile = fso.GetAbsolutePathName(WScript.Arguments(0))

' Open mdb file
Set appAccess = CreateObject("Access.Application")
appAccess.OpenCurrentDatabase strFile
Set db = appAccess.DBEngine(0)(0)

' Iterate over tables
  ' create table statements
For Each tbl In db.TableDefs
  If Not isSystemTable(tbl) And Not isHiddenTable(tbl) Then
    stdout.WriteLine getTableDDL(tbl)
    stdout.WriteBlankLines(1)

    ' Iterate over indexes
      ' create index statements
    For Each idx In tbl.Indexes
      stdout.WriteLine getIndexDDL(tbl, idx)
    Next

    stdout.WriteBlankLines(2)
  End If
Next

' Iterate over relations
  ' alter table add constraint statements
For Each rel In db.Relations
  Set tbl = db.TableDefs(rel.Table)
  If Not isSystemTable(tbl) And Not isHiddenTable(tbl) Then
    stdout.WriteLine getRelationDDL(rel)
    stdout.WriteBlankLines(1)
  End If
Next

Function getTableDDL(tdef)
Const dbBoolean = 1
Const dbByte = 2
Const dbCurrency = 5
Const dbDate = 8
Const dbDouble = 7
Const dbInteger = 3
Const dbLong = 4
Const dbDecimal = 20
Const dbFloat = 17
Const dbMemo = 12
Const dbSingle = 6
Const dbText = 10
Const dbGUID = 15
Const dbAutoIncrField = 16

Dim fld
Dim sql
Dim ln, a

    sql = "CREATE TABLE " & QuoteObjectName(tdef.name) & " ("
    ln = vbCrLf

    For Each fld In tdef.fields
       sql = sql & ln & " " & QuoteObjectName(fld.name) & " "
       Select Case fld.Type
       Case dbBoolean   'Boolean
          a = "BIT"
       Case dbByte   'Byte
          a = "BYTE"
       Case dbCurrency  'Currency
          a = "MONEY"
       Case dbDate 'Date / Time
          a = "DATETIME"
       Case dbDouble    'Double
          a = "DOUBLE"
       Case dbInteger   'Integer
          a = "INTEGER"
       Case dbLong  'Long
          'test if counter, doesn't detect random property if set
          If (fld.Attributes And dbAutoIncrField) Then
             a = "COUNTER"
          Else
             a = "LONG"
          End If
       Case dbDecimal    'Decimal
          a = "DECIMAL"
       Case dbFloat      'Float
          a = "FLOAT"
       Case dbMemo 'Memo
          a = "MEMO"
       Case dbSingle    'Single
          a = "SINGLE"
       Case dbText 'Text
          a = "VARCHAR(" & fld.Size & ")"
       Case dbGUID 'Text
          a = "GUID"
       Case Else
          '>>> raise error
          MsgBox "Field " & tdef.name & "." & fld.name & _
                " of type " & fld.Type & " has been ignored!!!"
       End Select

       sql = sql & a

       If fld.Required Then _
          sql = sql & " NOT NULL "
       If Len(fld.DefaultValue) > 0 Then _
          sql = sql & " DEFAULT " & fld.DefaultValue

       ln = ", " & vbCrLf
    Next

    sql = sql & vbCrLf & ");"
    getTableDDL = sql

End Function

Function getIndexDDL(tdef, idx)
Dim sql, ln, myfld

    If Left(idx.name, 1) = "{" Then
       'ignore, GUID-type indexes - bugger them
    ElseIf idx.Foreign Then
       'this index was created by a relation.  recreating the
       'relation will create this for us, so no need to do it here
    Else
       ln = ""
       sql = "CREATE "
       If idx.Unique Then
           sql = sql & "UNIQUE "
       End If
       sql = sql & "INDEX " & QuoteObjectName(idx.name) & " ON " & _
             QuoteObjectName(tdef.name) & "( "
       For Each myfld In idx.fields
          sql = sql & ln & QuoteObjectName(myfld.name)
          ln = ", "
       Next
       sql = sql & " )"
       If idx.Primary Then
          sql = sql & " WITH PRIMARY"
       ElseIf idx.IgnoreNulls Then
          sql = sql & " WITH IGNORE NULL"
       ElseIf idx.Required Then
          sql = sql & " WITH DISALLOW NULL"
       End If
       sql = sql & ";"
    End If
    getIndexDDL = sql

End Function

' Returns the SQL DDL to add a relation between two tables.
' Oddly, DAO will not accept the ON DELETE or ON UPDATE
' clauses, so the resulting sql must be executed through ADO
Function getRelationDDL(myrel)
Const dbRelationUpdateCascade = 256
Const dbRelationDeleteCascade = 4096
Dim mytdef
Dim myfld
Dim sql, ln


    With myrel
       sql = "ALTER TABLE " & QuoteObjectName(.ForeignTable) & _
             " ADD CONSTRAINT " & QuoteObjectName(.name) & " FOREIGN KEY ( "
       ln = ""
       For Each myfld In .fields 'ie fields of the relation
          sql = sql & ln & QuoteObjectName(myfld.ForeignName)
          ln = ","
       Next
       sql = sql & " ) " & "REFERENCES " & _
             QuoteObjectName(.table) & "( "
       ln = ""
       For Each myfld In .fields
          sql = sql & ln & QuoteObjectName(myfld.name)
          ln = ","
       Next
       sql = sql & " )"
       If (myrel.Attributes And dbRelationUpdateCascade) Then _
             sql = sql & " ON UPDATE CASCADE"
       If (myrel.Attributes And dbRelationDeleteCascade) Then _
             sql = sql & " ON DELETE CASCADE"
       sql = sql & ";"
    End With
    getRelationDDL = sql
End Function


Function isSystemTable(tbl)
Dim nAttrib
Const dbSystemObject = -2147483646
    isSystemTable = False
    nAttrib = tbl.Attributes
    isSystemTable = (nAttrib <> 0 And ((nAttrib And dbSystemObject) <> 0))
End Function

Function isHiddenTable(tbl)
Dim nAttrib
Const dbHiddenObject = 1
    isHiddenTable = False
    nAttrib = tbl.Attributes
    isHiddenTable = (nAttrib <> 0 And ((nAttrib And dbHiddenObject) <> 0))
End Function

Function QuoteObjectName(str)
    QuoteObjectName = "[" & str & "]"
End Function

If you are looking to export query definitions as well, this question should help. It's a little different because you don't usually create querydefs with plain DDL CREATE VIEW foo AS ... syntax, in fact I'm not sure you can (?)

But here's a little piece of a script I wrote for backing up queries to separate .sql files (which is part of a larger script for backing up all front-end db code, see Oliver's answer for this question).

Dim oApplication
Set oApplication = CreateObject("Access.Application")
oApplication.OpenCurrentDatabase sMyAccessFilePath
oApplication.Visible = False

For Each myObj In oApplication.DBEngine(0)(0).QueryDefs
    writeToFile sExportpath & "\queries\" & myObj.Name & ".sql", myObj.SQL 
Next

Function writeToFile(path, text)
Dim fso, st
  Set fso = CreateObject("Scripting.FileSystemObject")
  Set st = fso.CreateTextFile(path, True)
  st.Write text
  st.Close
End Function

Solution 3:

The following C# outlines how to obtain the schema from a .mdb file.

Obtain a connection to the database:

String f = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + "database.mdb";
OleDbConnection databaseConnection = new OleDbConnection(f);
databaseConnection.Open();

Get the name of each table:

DataTable dataTable = databaseConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, new object[] { null, null, null, "TABLE" });
int numTables = dataTable.Rows.Count;
for (int tableIndex = 0; tableIndex < numTables; ++tableIndex)
{
    String tableName = dataTable.Rows[tableIndex]["TABLE_NAME"].ToString();

Get the fields for each table:

    DataTable schemaTable = databaseConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Columns, new object[] { null, null, tableName, null });
    foreach (DataRow row in schemaTable.Rows)
    {
        String fieldName = row["COLUMN_NAME"].ToString(); //3
        String fieldType = row["DATA_TYPE"].ToString(); // 11
        String fieldDescription = row["DESCRIPTION"].ToString(); //27
    }
}

Where do the 3, 11 and 27 come from? I found them by inspecting DataRow.ItemArray with a debugger, does anyone know the 'correct' way?

Solution 4:

If you're happy to use something other than pure Access SQL, you could persist a collection of ADOX objects and use those to recreate the table structure.

Example (in Python, doesn't currently recreate relationships and indexes as it wasn't needed for the project I was working on):

import os
import sys
import datetime
import comtypes.client as client

class Db:
    def __init__(self, original_con_string = None, file_path = None,
                 new_con_string = None, localise_links = False):
        self.original_con_string = original_con_string
        self.file_path = file_path
        self.new_con_string = new_con_string
        self.localise_links = localise_links

    def output_table_structures(self, verbosity = 0):
        if os.path.exists(self.file_path):
            if not os.path.isdir(self.file_path):
                raise Exception("file_path must be a directory!")
        else:
            os.mkdir(self.file_path)
        cat = client.CreateObject("ADOX.Catalog")
        cat.ActiveConnection = self.original_con_string
        linked_tables = ()
        for table in cat.Tables:
            if table.Type == u"TABLE":
                f = open(self.file_path + os.path.sep +
                         "Tablestruct_" + table.Name + ".txt", "w")
                conn = client.CreateObject("ADODB.Connection")
                conn.ConnectionString = self.original_con_string
                rs = client.CreateObject("ADODB.Recordset")
                conn.Open()
                rs.Open("SELECT TOP 1 * FROM [%s];" % table.Name, conn)
                for field in rs.Fields:
                    col = table.Columns[field.Name]
                    col_details = (col.Name, col.Type, col.DefinedSize,
                                   col.Attributes)
                    property_dict = {}
                    property_dict["Autoincrement"] = (
                        col.Properties["Autoincrement"].Value)
                    col_details += property_dict,
                    f.write(repr(col_details) + "\n")
                rs.Close()
                conn.Close()
                f.close()
            if table.Type == u"LINK":
                table_details = table.Name,
                table_details += table.Properties(
                    "Jet OLEDB:Link DataSource").Value,
                table_details += table.Properties(
                    "Jet OLEDB:Link Provider String").Value,
                table_details += table.Properties(
                    "Jet OLEDB:Remote Table Name").Value,
                linked_tables += table_details,
        if linked_tables != ():
            f = open(self.file_path + os.path.sep +
                     "linked_list.txt", "w")
            for t in linked_tables:
                f.write(repr(t) + "\n")
        cat.ActiveConnection.Close()

A similar reverse function reconstructs the database using the second connection string.

Solution 5:

You can use the ACE/Jet OLE DB Provider and an ADO Connection object's OpenSchema method to get schema information as a Recordset (which is arguable better than a Collection because it can be filtered, sorted, etc).

The basic methodology is to use adSchemaTables to get the base tables (not VIEWs), then use each TABLE_NAME to fetch adSchemaColumns for ORDINAL_POSITION, !DATA_TYPE, !IS_NULLABLE, !COLUMN_HASDEFAULT, !COLUMN_DEFAULT, !CHARACTER_MAXIMUM_LENGTH, !NUMERIC_PRECISION,!NUMERIC_SCALE.

adSchemaPrimaryKeys is straightforward. adSchemaIndexes is where you will find UNIQUE constraints, not sure wether these can be distinguished from unique indexes, also the names of FOREIGN KEYs to plug into the adSchemaForeignKeys rowset e.g. (pseudo code):

rsFK.Filter = "FK_NAME = '" & !INDEX_NAME & "'") 

-- watch for the gotcha that Jet 3.51 allows a FK based on a nameless PK (!!)

Names of Validation Rules and CHECK constraints can be found in the adSchemaTableConstraints rowset, using the table name in the OpenSchema call, then use the name in the call to the adSchemaCheckConstraints rowset, filter for CONSTRAINT_TYPE = 'CHECK' (a gotcha is a constraint named 'ValidationRule' + Chr$(0), so best to escape the null characters form the name). Remember that ACE/Jet Validation rules can be either row-level or table-level (CHECK constraints are always table-level), so you may need to use the table name in the filter: for adSchemaTableConstraints is [].[].ValidationRule will be [].ValidationRule in adSchemaCheckConstraints. Another gotcha (suspected bug) is that the Field is 255 characters wide, so any Validation Rule/CHECK constraint definition of more than 255 characters will have a NULL value.

adSchemaViews, for Access Query objects based on non-paramaterized SELECT SQL DML, is straightforward; you can use the VIEW name in adSchemaColumns to get the column details.

PROCEDURES are in adSchemaProcedures, being all other flavours of Access Query objects including parameterized SELECT DML; for the latter I prefer to replace the PARAMETERS syntax with CREATE PROCEDURE PROCEDURE_NAME in the PROCEDURE_DEFINITION. Don't boterh looking in the adSchemaProcedureParameters, you won't find anything: the parameters can be enumerated by using an ADOX Catalog object to return an ADO Command e.g. (pseudo code):

Set Command = Catalog.Procedures(PROCEDURE_NAME).Command

then enumerate the Comm.Parameters collection for the .Name, .Type for DATA_TYPE, (.Attributes And adParamNullable) for IS_NULLABLE, .Value for COLUMN_HASDEFAULT and COLUMN_DEFAULT, .Size, .Precision, .NumericScale.

For ACE/Jet-specific properties such as Unicode compression you need to use another kind of object. For example, a Long Integer Autonumber in Access-speak can be found using an ADO Catalog object e.g. (pseudo code):

bIsAutoincrement = Catalog.Tables(TABLE_NAME).Columns(COLUMN_NAME).Properties("Autoincrement").Value

Good luck :)