Frequently Asked Questions

Contents

Return to main dbmstools documentation page.

General Questions

What database objects can dbmstools deal with?

dbmstools provides at least reasonable support for the following types of database objects:

  • Tables
  • Columns, including column constraints
  • Primary keys
  • Foreign keys
  • Unique constraints
  • Indexes
  • Grants
  • Views (with some limitations)
  • DBMS-specific table properties, e.g. Oracle storage parameters, MySQL InnoDB indicators.

What limitations are there in dbmstools?

Probably quite a few that I haven't noted down yet ;}

But here are some of the main ones:

  • The DBMS's supported (and the completeness/maturity of that support) are:

    • Postgres (7 and 8) - good
    • Oracle - good
    • Microsoft SQLServer - good
    • MySQL - good
    • Hypersonic - limited (xml2* tools only)
    • Sqlite - limited (xml2* tools only).
  • Support for DBMS-specific features in DDL is patchy

  • Although schema names are understood by dbmstools, you can't have 2 top-level tables, views or sequences with the same name, even if they're in different schemas

  • No support for functions, stored procedures etc.

  • The definition of a view is at least partially opaque to dbmstools, so it can't be exported by db2xml for all DBMSs (currently supported for Postgres and Oracle), and deltaddl doesn't understand about the tables the view depends on.

How do I run these tools under Windows?

For the CPython-based tools (xml2*, filterdml, deltaddl), you have these options:

  • If you've installed Python properly, you should just be able to type the script name at the command prompt, and it will run, e.g.

    C:\dbmstools-0.3.8>main\deltaddl.py
    
  • Otherwise, if Python is in your PATH, you can prefix the script name with python, e.g.

    C:\dbmstools-0.3.8>python main\deltaddl.py
    

If neither of these options work, you should either install Python properly or ensure that the python interpreter (python.exe) is in your path. You'll find more information at http://www.python.org/doc/faq/windows/ . Another alternative is to run them under Jython, which is supplied with dbmstools - see the dbmsjy page for details.

For the Jython-based tools (db2sql, db2xml, dbclear, dbdrop), see the dbmsjy page for more information.

Where do I put my DTD so that it gets checked?

The DTD is in the same directory as the dbmstools scripts. When running under Jython, an entity resolver is installed to load the DTD from that directory, or from the Jar file.

Note that the DTD is never checked under CPython - see below.

Why isn't my XML validated under CPython?

The XML parser I've used for the CPython parsing doesn't support validation. The DTD will only be checked if you use Jython (since the standard Java XML parsers do support validation).

Why do I get 'Relative URI can not be resolved'?

When running dbmstools under Java (either using Jython directly, or via one of the ant wrappers), the script may terminate with the following message:

org.xml.sax.SAXParseException: org.xml.sax.SAXParseException: Relative URI "db-schema.dtd"; can not be
resolved without a base URI

This seems to happen mainly when Java is using later versions of the Crimson XML parser, and it's caused by the DTD reference in your XML file.

If your reference looks like this:

<!DOCTYPE db-schema SYSTEM "db-schema.dtd">

it will fail, because Crimson thinks there's insufficient information to locate the DTD. The solution is to add file:/// before the DTD name:

<!DOCTYPE db-schema SYSTEM "file:///db-schema.dtd">

How can I generate different variants of my schema?

There are 2 ways in which dbmstools can generate different variants of a schema.

The first is variants by DBMS, e.g. when you need primary key columns to be based on a sequence for DBMSs that support them (like Oracle and Postgres) or auto-increment (for Mysql and Mssql). This has always been one of the primary functions of dbmstools, and is achieved using the dbms attribute on each schema item. Adding a dbms attribute to any item (e.g. a table) ensures that the item is only included if the target DBMS is one of the values in the attribute.

The second way of getting variants is more powerful (and more recent). By defining acceptable variables in the schema XML, and passing values to those variables to xml2ddl at runtime, you can use those variables in conditions on each database item, to control whether it is included or not. For example, you could define a schemaType variable to identify whether the database is not replicated, or is the master or slave for a replicated system. Then, indexes that you want to vary would include a condition element to ensure they were only included for the correct schema type(s).

See the XML schema documentation for more information on how to use variables.

Note that the schema-override and column-override elements can be used in conjunction with the above mechanisms to allow more fine-grained control.

Do the db* scripts work with the free gcj Java?

That depends very much on your JDBC driver. I don't do any testing with gcj, but so far I've noticed that Postgres seems to work OK, while Oracle gives a NoClassDefFoundError.

Why do my create and drop scripts have the table dependency order wrong?

dbmstools has a very simple way of determining the dependency order for creating and dropping tables - it uses the order from the XML schema file (or the reverse of that order when dropping). If you don't have the order correct, i.e. if you have foreign keys that depend on tables that haven't already been created, then both your create and drop scripts will fail.

If this is a problem for you, you can work around it by using the -s option to xml2ddl, which writes all constraints (including foreign keys) to a separate script. But the drop scripts produced will still be incorrect.

Why does dbmsjy write after the command prompt in Windows?

When using dbmsjy to run the db* scripts from the Windows command prompt, you'll find that the command prompt (e.g. C:>) returns, and then the db* script output is written. Everything runs just fine, but the output looks a little strange.

This is a bug in dbmsjy - I haven't yet worked out how to get this to work correctly on both Windows and Linux/UNIX. Note that it works OK on Cygwin/Windows.

Why are there sometimes strange lines on screen when something goes wrong with a script?

If there's an error running a dbmstools script, e.g. it can't find the directory you told it to use, you'll sometimes see what's called a stack trace on screen, followed by a (hopefully understandable) message about what went wrong. The trick is to look at the last line for the message.

For example, here's the output if you specify an invalid XML schema path to xml2ddl:

$ dbmstools/xml2ddl.py -d postgres8 -wtvo ~/tmp/psschema-export/ missing.xml
Loading DOM
Traceback (most recent call last):
File "dbmstools/xml2ddl.py", line 249, in ?
    main()
File "dbmstools/xml2ddl.py", line 245, in main
    writeDatabaseScripts(extraArgs[0], options)
File "dbmstools/xml2ddl.py", line 220, in writeDatabaseScripts
    database = loadDatabase(databaseFile, options.verbose)
File "/home/johnd/python/dbmstools/dbmstools/common/databaselib.py", line 815, in loadDatabase
    rootElement = xmlutils.loadDom(path, verbose)
File "/home/johnd/python/dbmstools/dbmstools/common/xmlutils.py", line 213, in loadDom
    raise 'XML database schema not found: %s' % path
XML database schema not found: missing.xml

At some point in the future I'll get rid of the last of the stack traces, but I haven't got there yet...

Why does the script generated by dbclear fail to clear some tables?

If you use dbclear, either to generate a script or to delete data directly in the database, you may strike a problem where the delete fails due to a foreign key constraint.

dbclear tries to TRUNCATE tables if possible, and only use DELETE if necessary. TRUNCATE is much faster, but can't be used when there's a foreign key referring to the table. If you use the -i (include) or -e (exclude) option, and the table with the foreign key isn't in the list of included tables, dbclear won't know anything about the foreign key, and therefore will try to TRUNCATE when it should DELETE.

The workaround for this to use the --force-delete option.

How does dbmstools deal with unicode, Latin 1 and other encodings?

dbmstools deals reasonably well with different character encodings, both in the description fields in the XML schema, and in text fields in the database. It is able to output SQL scripts in ISO-8859-1 (also known as Latin 1 and Western European, the default), and UTF-8 (a type of unicode).

XML Schemas:You indicate the encoding in the first line of the file, e.g. <?xml version="1.0" encoding="ISO-8859-1" ?>. The 2 most common encodings are ISO-8859-1 and UTF-8. Note that only the description attributes/elements may contain anything other than US-ASCII.
xml2doc:xml2doc always writes HTML as UTF-8, so it should almost always do the right thing with your descriptions
xml2ddl:Output defaults to ISO-8859-1, unless the --utf8 option is used.
db2xml:The XML schema is specified as UTF-8, and this same encoding is used for encoding the output file. However, output to the console uses the default console encoding, so if you have characters above 127 in your schema descriptions, and your console doesn't default to unicode, you should specify an output file
db2sql:Output defaults to ISO-8859-1, unless the --utf8 option is used. However, output to the console uses the default console encoding, as per db2xml
filterdml:Output defaults to ISO-8859-1, unless the --utf8 option is used. However, output to the console uses the default console encoding, as per db2xml

Deltaddl

Just how good is deltaddl?

Deltaddl is still in development, but it's advancing rapidly. As of version 0.4.5, here's the state of play for the different DBMSs:

Postgres:Largely complete, and well tested. Used in production for frequent releases on Postgres 8.x
Oracle:Reasonably complete, and reasonably well tested. Used in production for frequent releases on Oracle 9, with testing also on 10g
Mysql:Reasonably complete, and reasonably well tested. We don't use it in production
Mssql:Support only recently added, and not as well tested. Some production use
Hypersonic:Not implemented, and no plans to do so
Sqlite:Not implemented, and no plans to do so

It's probably worth trying deltaddl with your schema and DBMS, to see how close it gets to what you need.

Why do my renamed objects get dropped and recreated?

When you use deltaddl to generate an upgrade script, it will rename an object (table, column etc.) if the object has a previous-name attribute, AND it does not exist in the previous version of the schema under its current name.

If you forget the previous-name attribute, deltaddl will think you want to drop one object and create another.

I shortened a column, and the first upgrade script fails

If you change the length of a column, the change will be made in the first DDL script generated by deltaddl. If your data is too long for the new length, this will fail.

You need to write a DML script to restructure the data (you needed to do this anyway), but in this case you must run it before the first DDL script.

What can I do if deltaddl gets it wrong?

There are sometimes things that deltaddl will get wrong, because it just doesn't have quite enough knowledge or intelligence to work out what order to do things in, or because its understanding of SQL variants isn't as good as it should be.

If you run into such a problem, you'll probably want to use --omit to leave out the offending object(s) or column(s), and add it manually to your own custom DDL script.

Why do I lose foreign keys and views after a table is copied and recreated?

There's a bug remaining in deltaddl, to do with renaming temporary tables in postgres (possibly with other DBMSs too, I haven't tested this elsewhere yet).

A foreign key typically has a dependency on the primary key of the referenced table. This means that you cannot drop the primary key of the referenced table without either first dropping all foreign keys that refer to it, or alternatively using CASCADE on the drop. The same is true of a view which references a table.

As at deltaddl version 0.3.8, when renaming a table to be copied, I drop the primary key on the renamed table (so that I can create a primary key of the same name on the copied table). I have to cascade the drop, because of the above reason. This has 2 effects:

  • If the cascade drops a foreign key that was to be dropped anyway as part of the schema upgrade, the DBMS will show an error when the explicit drop is attempted
  • (worse) If the cascade drops a foreign key that was NOT to be dropped anyway, it won't be recreated, so it will be missing from the schema.

In the next version, I aim to add the primary key in the second script, after dropping the temporary table.

Note that on DBMSs that don't support cascading drops (e.g. Mssql, MySQL), dropping the primary key will fail.

What happened to my Oracle/Mysql storage parameters in deltaddl?

Another bug, I'm afraid - deltaddl doesn't yet include storage lines when using the recreate-and-copy strategy, although they are included for any new tables added.

Why can't I add auto-increment to an existing column in Postgres?

Postgres uses the pseudo-types SERIAL and BIGSERIAL for auto-increment columns, but it doesn't seem to recognise these in ALTER TABLE commands. So if you add auto-increment to a column, deltaddl will generate something like the following, which is logical but which Postgres doesn't support :(

ALTER TABLE Foo
    ALTER systemId TYPE BIGSERIAL;

Why can't I drop column defaults in Mssql?

Mssql only allows dropping of column default values by name. Since in standard SQL there's no such thing as a name for a column default, Mssql makes up its own name. deltaddl has no way of knowing what name Mssql assigned, so it will just output an SQL comment explaining that it's skipping this step.

Oracle-Specific

Every DBMS is different, but Oracle seems to pride itself on being more different. In particular, Oracle's JDBC support has been rather Oracle-specific for a long time, and that means that dbmstools sometimes struggles when trying to export schemas or data from Oracle (ie. the db2xml and db2sql tools).

How can I get the db* scripts to work with Oracle?

The main problem when trying to get schema metadata from Oracle is system tables and views - Oracle will tell you they exist, then explode when you ask for information about them. Or, it may export them, leaving you with rubbish in your XML output. dbmstools tries to avoid exporting system tables and the like (by excluding anything from the SYS schema, anything with a $ sign in its name, and anything with a name starting with SYS_). But usually this isn't enough...

So the main trick to getting the db* scripts to work with your Oracle database is to make sure you only ask for the sequences, tables and views you want. The two main ways to do this are by specifying the schema name (using the -s option), and specifying the names of the required tables (using the -i option). Using the schema name is best, since otherwise all sequences and views will be exported.

Note that, for Oracle, the schema name is case-sensitive and is almost always upper case, e.g. -s MYSCHEMA. If no schema is specifed when objects are created, they go into the user's default schema, which normally has the same name as the user.

Postgres-Specific

Why do the db* scripts fail to connect to my Postgres database?

The Jython-based scripts (db*) use TCP/IP to connect to the database. Make sure that you've allowed TCP/IP access to the appropriate user/database/host in your pg_hba.conf file, e.g.

host    all         all         127.0.0.1         255.255.255.255   md5

Why do the db* scripts fail spectacularly on Postgres 8.0.8?

The JDBC driver initially shipped with Postgres 8.0.8 (at least on some Linux distributions), version 316, has a serious bug which prevents db2sql and db2xml working. Version 317 of the driver, which fixes the bug, was released shortly after version 316 - use that instead.

Auto-increment columns leave un-dropped sequences

When you set a column to auto-increment in Postgres, it generates a matching sequence behind the scenes, and uses that to populate the incrementing value. The drop script generated by xml2ddl won't pick up that sequence.

Note: This may be resolved in later versions of Postgres (TBC) - I've heard that it manages these sequences itself.

Miscellaneous

Why won't xml2doc < 0.4.1 generate schema diagrams with Graphviz 2.12+?

Some time around version 2.12, the 'dot' program in Graphviz changed the way it reported its version number, and this prevented xml2doc from recognising it properly. It was xml2doc's fault...

Version 0.4.2 of dbmstools fixes this issue.