Best practices for automated database migration

Automating aspects of database migration can make significant time savings.

Matthew Dunsdon running a talk on database migration

Overview

I think it is important to look at how databases affect the development of systems. For I believe that this does not only affect how functionality is composed; but defines the roles, responsibilities and social interactions within a development. I would therefore like to take this opportunity to discuss with you the methods that use database version control to provide database migration and the possibilities of automating database migration.

Project Development Level

In project development, it is important to treat the database as a distinct component to the application and external services (such as data logging and report generation tools). The primary job of both the application and any external service is to provide processing to the data it receives, whereas the function of the database is primarily as an information storage and retrieval system.

Diagram showing that the application, the database and the external services are separate.

The problems faced when looking at typical development teams is that there is this propensity for application code to be placed under a version control system, but for the database to be placed outside of version control. It is often common place for single instances of databases to be shared between multiple developers.

From a project development level, each developer should work with their own version of the database. No longer does there need to be a build-master to police the changes made to the database.

By using database scripts to make changes to the database and having a system which can execute those scripts, then there should be the framework to allow developers to re-build their databases to make sure they have the modifications that other team members have made.

Comic sketch by Geek & Poke

Automation Process

The automation process for the database component of a project could be done by tools that generate database scripts. One option would be to use tools that examine code and files to automatically generate these scripts. The alternative approach would be to consider using a graphical tool that allows the user to modify a database, whilst it tracks changes made in order to generate scripts.

An example of a tool that manages scripts is Microsoft Visual Studio Team System, which allows ‘Database Projects’ to be created.

Whether or not you would want to use these tools to automate the process of generating databases scripts is really be up to you. The issue that will be faced is finding the right tool for your needs that will allows those nasty edge cases to be manually enforced.

As long as database descriptions are generated, whether that is manually or otherwise, then they can be versioned. With database descriptions versioned, then the rest of the database migration can take place. From my experience, generating these by hand would work.

Addition of Features

Features are added to a project to add or modify functionality for the application, database (such as reporting data or query performance) for the whole project. It should be the case that when modifications are made to a database, they should also be attached to any relevant code changes via source control.

Developers should be encouraged to change the structure of tables, rather than feeling that they should only rely on views / functions / procedures to generate data. This attitude should affect the culture in the workplace and empower good development choices.

It is worth explaining at this point the two methods of defining modifications to the database when using database scripts.

Object Based Scripts

Example list of data tables
In an object based scripting system, one file is responsible for creating a new item – such as a database table, view, procedure or function. Each file may also take the responsibility of removing or backing up the previous instance of the item before creating a fresh copy.

What this method does is allow fresh items to be created by a developer in one go, since one file contains the complete description needed for creation.

The process for re-building the database is:

  1. Remove database
  2. Run every file to create database, tables and other items
  3. Import data into tables

‘Delta’ Based Scripts

Screen shot of example delta patching list
A ‘delta’ based scripting system, operates by taking built version of a database and applying patches to bring a database up to date. In this type of script, one file shows changes to an item rather than describing item as an item. This means that often it is easier to find out the current structure of a table by checking database rather than reading through multiple files.

The process for re-building the database is:

  1. Remove database
  2. Build database by importing tables and data
  3. Apply patches up till the version needed
  4. Call scripts to create the other items – views, procedures, functions

Scripting Database Migration

‘Delta’ files work well for migration, as long as you keep record of the patches that have been applied to a database.

The process:

  1. Perform delta
  2. Drop other database items (view, procedures, functions)
  3. Build / create other database items

The advantages of using delta files are that they often provide a good description of the changes that have taken place in a table and often capture the intent of the changes as part of a database patch. Data processing and manipulation can take place on a patch by patch basis.

They are not always the best of solutions. They may increase the time taken for a development build of a project to re-build the database, as tables will be modified more than once.

Relying on this approach will also require developers to create ‘delta’ files when they want on the database. It is hard to judge whether this is a positive or negative change.

When using object based scripts for database migration, either a database difference script or a set of ‘delta’ files need to be generated. The problem with relying on a generated script that just compares the database structure is that the outputted migration script is usually too messy to read.

‘Delta’ files can be written by comparing the table differences using a version control system. Those files can either be written manually or by an automation process that can process the logs in the version control system. Both of these approaches often the require contact developer who made the change, in order to understand the intent of the change and whether any data processing is needed.

The Current Migration Scheme

We use object based scripts to keep the barrier for database table changes to be low. Delta files are not automatically generated and require one or two individuals to write them. This does require chatting to other member of the development team to make sure the intent and data changes can be captured.

There are several scripts currently in use to aid the development of ‘delta’ files and to perform database migration. In this first case, there are tools to help perform operations such as add column, which cannot be performed in a single database command. There are scripts that allow test migrations to take place by retrieving and building an old version of a database and then running a set of patches.

There is also a tool called SQL Delta, which allows database structures and data to be compared for inconsistencies.

There is currently no script that handles the recreation of the functions, procedure, views, etc.

Where can we take this?

We need to change to way database migration takes place. The process for updating functions, procedures and views is currently a manual process involving dependencies on GUI. It should be appropriate to drop these items and create them new, since they contain no data.

It is worth considering if the generation of delta files and patches can be aided via an automated process that using a version control system to identify changes made. This may allow delta files to be grouped together when files are modifying same table.

There is no way to automate the testing of a migration easily with the scripts. We may need to look at making changes or additions to the test migration scripts to better support this.

Currently, we can use SQL Delta to some extent to aid testing by using it to compare structure and data differences.

This post is part of our team knowledge share presentations. To see a complete list of these sessions, please take a look at our knowledge sharing page.