Dbmstools Documentation

Contents

Introduction

dbmstools is a set of utilities for software developers who work with databases. dbmstools understands database schemas (tables, sequences, views etc.) and DBMSs (Database Management Systems, e.g. Oracle, Postgres). It can:

  • Generate the same (or similar) schema for many different DBMSs, all from a single XML representation
  • Migrate a schema from one DBMS to another, via an XML representation
  • Given 2 versions of the schema in XML, generate schema upgrade scripts for one or more DBMS
  • Generate HTML documentation (including diagrams) from an XML representation
  • Export data from a schema, in many different formats
  • Export a schema from a DBMS to XML
  • Clear all data from a schema (or some tables)
  • Drop all objects (tables, views, sequences) from a schema (or some objects)
  • Customise data insert scripts for different DBMSs.
images/overview-small.png

Tools in the dbmstools package - click for the full-sized diagram

dbmstools is mainly aimed at developers who work with databases, and particularly those who need to support schemas on more than one DBMS. The base tools are intended to allow all the database information (schema and base data) to be kept in one single place, and to have DBMS-specific scripts (to create and populate the database schema, and upgrade from one version to the next) generated from that data. If you're anything like me you hate duplication in software code, and I wrote dbmstools to remove that duplication.

As well as the tools for manipulating schemas, there's a second group of tools to help with common database-related tasks, in particular those that are needed during system testing (clearing or dropping tables, inserting or exporting test data).

All tools can be run from the command line on most operating systems. There are also wrappers for several of the tools, so that they can be run from within Apache Ant (a Java build system) using a supplied Ant task.

There are a number of GUI tools that do much more than dbmstools does. However, there seem to be very few that can be plugged into a build system, to produce schema create and drop scripts (and schema documentation) automatically. That's a hole that dbmstools attempts to fill.

The project home page is at http://sourceforge.net/projects/dbmstools/.

Main Features

The main purpose of this tool set is creating, maintaining and upgrading DBMS-specific database schemas from a single XML model. dbmstools provides tools (both command-line and running in ant, the Java build tool) to:

There are also 3 tools concerned with DML (data manipulation language - the SQL commands that select, insert, update or delete data). These will:

The tools can generate and upgrade the following schema artifacts:

The DBMS's supported (and the completeness/maturity of that support) are:

A major feature of dbmstools is the ability to define templates for columns and grants, and use those templates in different tables, either as-is or with some attributes overridden. This reduces duplication in your schema, which increases consistency and reduces maintenance (always a good result if you can get it).

The tools can also generate different schema variants, based on variables provided to the tools. This is commonly used when the schema varies by DBMS, but it can be used for any purpose, e.g. to generate master and slave versions of a replicated schema (where the slave has more indexes).

By and large, these tools don't try to hide what's happening, particularly the major SQL differences between different DBMSs - rather, they provide some default mapping ability to each DBMS, but also allow you to specify exactly what should happen for different DBMSs.

Quick Start to Using dbmstools

In the immortal words of the Python README, "If you don't read instructions, congratulations on getting this far. :-)".

I imagine you'll be looking for the quick start guide. You may also want to look at the definition of the XML used by dbmstools.

Prerequisites

At its simplest, dbmstools requires only Python, version 2.2+. This can be either the standard Python (aka CPython), or Jython, a variant that runs in Java (dbmstools provides a stripped-down version of Jython). For most people, a practical minimum will be Python.

However, there are more requirements if you want to use the ant wrappers, extract information from an existing database (both of these require Java 1.3+), or generate schema diagrams (which requires 'dot'). See installation details for more information.

dbmstools should run wherever Python and Java do. It gets tested on Linux and Windows.

Installation

Installation is pretty simple - see the installation guide for details.

The Tools

All the tools are in the main/dbmstools directory. They break down into 2 categories: those related to DDL (i.e. for maintaining schemas), and those related to DML (i.e. for maintaining data).

DDL Tools (for schemas)

xml2ddl:xml2ddl is the main tool. It generates DBMS-specific DDL to create a schema, from the common XML representation. It also generates scripts to drop all the objects in the schema.
xml2ddl-ant:xml2ddl-ant is a wrapper for xml2ddl, allowing it to be used within Apache ant build scripts.
xml2doc:xml2doc generates DBMS-specific documentation in HTML, from the XML schema representation. The documentation can also include clickable diagrams.
xml2doc-ant:xml2doc-ant is an ant wrapper for xml2doc, so you can generate documentation automatically from your automated builds.
deltaddl:deltaddl takes 2 versions of a database schema (as XML files), and diffs them to generate DBMS-specific upgrade (or downgrade) scripts to convert the first schema version into the second.
db2xml:db2xml is a reverse-engineering tool to import an existing database schema into the required XML format by querying an existing database. This script must be run using the dbmsjy wrapper.
dbdrop:dbdrop drops schema objects (tables, sequences and/or views), either directly or by producing an SQL script. This script must be run using the dbmsjy wrapper.

For any of the DDL tools above, you'll want to check out the documentation of the XML schema used by dbmstools.

DML Tools (for data)

filterdml:filterdml generates DBMS-specific DML from a general representation. This allows reference data to be maintained in 1 form only.
filterdml-ant:filterdml-ant is a wrapper for filterdml, allowing it to be used within Apache ant build scripts.
db2sql:db2sql exports data from a database, as SQL INSERT statements, XML, comma- or tab-separated text, Postgres COPY statements, straight text, or just row counts. This can be used to migrate data from 1 DBMS type to another, and is also useful for system testing (compare the database contents after a test run with a previously-exported known-good set of data). This script must be run using the dbmsjy wrapper.
dbclear:dbclear clears data from tables, either directly or by producing an SQL script. This script must be run using the dbmsjy wrapper.

Other Tools

There are two extra helper applications:

dbmsjy:dbmsjy is used to run all the db* scripts, and can also run any other script under the supplied version of Jython. Its task is to set up the environment required by the target script, i.e. Java/Jython/JDBC driver.
Jython ant task:
 The Jython Ant task provides a task for the Apache Ant build tool, to Python scripts in Jython (using the Jython distributed with dbmstools) within a build. It's used to run any of the *-ant wrappers in dbmstools, but can be used for running any other Python script as part of your build process.

How we use dbmstools

At work (Dialect Payment Technologies - we build and host payment software), we now use dbmstools to generate DBMS-specific DDL and DML. It works like this:

Schema:
  • Our DBMS schema is maintained in XML (it was originally imported using db2xml)
  • Our ant build script generates schema create and drop scripts for all target DBMSs as part of the build process, using the Jython ant task and the xml2ddl-ant script. This includes different schema variants for systems with and without a replica database (we use variables and conditions in the XML to control which indexes are included in each variant)
  • The ant build scripts also use xml2doc (via the xml2doc-ant wrapper) to generate HTML documentation for the schema (for each DBMS). We try to put good descriptions in for each element, so the output from this is good enough to provide to customers (as well as being widely used internally)
  • As we draw near to the finish of an iteration, we use deltaddl to generate schema upgrade scripts for the system testers. We generate the deltas against the schema from the last external release, so this is also what will go to customers
  • When we ship our systems to customers, the DBAs typically use our DDL scripts as a starting point, since they'll likely want to manipulate things like tablespaces. The information in the table headers in the script (description, table size, select and update frequency) assist them in doing so. We also provide the HTML documentation.
Test data:
  • Our developer test database (for all our developer-level integration and system testing, including automated tests running from the continuous build system) is maintained as a single SQL DML file, with #dbms() annotations for MSSQL identify column manipulation, and for setting up sequences for Oracle and Postgres
  • To add data to the test database, we often use the system under test to add the required objects, then we use db2sql to export the data as INSERT statements. That way, we know the data is valid
  • Our ant build script generates data insertion scripts for all target DBMSs as part of the build process, using the filterdml-ant script and the Jython ant task
  • The generated schema and data insertion scripts are used as part of the automated deploy step from ant, which builds a fresh database before all tests are run on the continuous integration system
System test:
  • Our system testers use db2sql to verify the state of the database after automated test runs. After each run, they export the database using db2sql, and compare this with an archived export from a known-good run. They always export in text or XML format, since both of these place one column value per line in the output, and so are particularly suited to diffing. They also post-process the results (using e.g. sed scripts) to normalise any varying data such as timestamps.

These tools have allowed us to support 4 DBMSs (with replicated and non-replicated variants) from the one set of data, and to automate what was previously largely manual (and therefore more error-prone). OK, I'm biased of course, but it looks like it's all been well worth it to me.

Samples

In the samples directory you'll find:

Frequently Asked Questions

The FAQ is the first place you should look if you have a problem. If it doesn't mention your issue, please let me know.

Change Log

See the change log for the history of the project, at least since version 0.2.8.

Contacting Me

My name's John Dickson. I'm an architect for server-side Java systems, but Python is what keeps me sane and chirpy. I'm keen to hear from you if you use dbmstools, or even if you looked at it and decided not to. And if you have any suggestions, I'd love to hear them.

You can get in touch with me at the email address: 'captsens at gmail dot com'.

Licence

Dbmstools is licenced under the GNU Public Licence, version 2. See the LICENCE file for details.