db2sql - Export Data

db2sql is a tool to export data from a database, as:

SQL INSERT statements

Comma- or tab-separated data

XML

Text (one column per line)

Postgres COPY statements.

This can be used to migrate data between databases, and even from one DBMS type to another. The text output format is useful for system testing - it allows for easy diffing of database contents after a test run with a previously-exported set of known-good data.

By default, all data in the database is exported. However, this can be restricted by schema, by table, or even to a subset of data within a table (by specifying a query and/or column exclusions).

The tool can also generate row counts for tables, which provides a useful check that all data has been migrated between two databases.

Prerequisites

Because this script needs to get at metadata for the database, it uses the dbmsjy wrapper and a JDBC driver. See the dbmsjy page for details.

Usage

You use the tool like this:

dbmsjy.py db2sql.py [dbOptions] [generalOptions]

or, specifying JDBC details directly:

dbmsjy.py db2sql.py [jdbcOptions] [generalOptions]

Database Options:

`-d dbms`	Name of the DBMS. Required. Values recognised are: postgres7, postgres8, mssql, oracle, mysql, hsql
`-D dbName`	Name of the database to connect to. Required
`-H dbHost`	Name or IP address of the database host to connect to. Optional, defaults to localhost
`-P dbPort`	Port of the database host to connect to. Optional, defaults to the default port for the DBMS

JDBC Options:

`-c driverClass`	Fully-qualified name of the JDBC driver class, e.g. org.postgresql.Driver. If omitted, will be defaulted based on URL. Must be present in classpath, usually by placing in the `jdbc-drivers` directory
`-j JDBC_URL`	JDBC URL for connecting to the database, in the format expected by the JDBC driver, e.g. `jdbc:postgresql://localhost:5432/PS30` Required

General Options:

`-e tableName`	Specifically exclude this table (add a -e option for each table you want to exclude)
`-f filepath`	Output file path - used only with -q
`-h`	Shows this screen
`-i tableName`	Specifically include this table (add a -i option for each table you want to include)
`-n rows`	Number of rows to fetch at a time (defaults to 10,000).
`-o directory`	Write files in this directory, rather than to stdout
`-p password`	Password to use to connect to DB (if required)
`-q query`	Query to use to export data. If a query is specified, no other tables will be exported. Requires a single -i option giving the name of the table
`-s schema`	Schema name to export (defaults to all schemas)
`-u username`	Username to use to connect to DB (if required)
`-v`	Enable verbose logging (to stderr)
`-x columnName`	Specifically exclude this column (add a -x option for each column you want to exclude)
`-y`	Output as postgres COPY statements, rather than INSERT statements

Long Options

`--countonly`	Output only row counts for each table, not the data
`--format fmt`	Specifies the format for output. Options are 'sql' (default), 'copy' (equivalent of -y option, for Postgres COPY), 'text' (1 column value per line, intended for diffing databases), 'csv' (comma-separated value), 'tsv' (tab-separated value) and 'xml'
`--includeschema`
	Include the schema name in the SQL generated (default is to omit)
`--utf8`	Encode output files as UTF-8, rather than ISO-8859-1
`--version`	Print version number and exit

See the dbmsjy page for details of default JDBC drivers and URLs etc.

Examples

Export all tables in all schemas from a local Postgres database called mydb:

$ dbmsjy.py db2sql.py -u fred -p itsasecret -d postgres8 -D mydb

Export 2 named tables from a local MySQL database:

$ dbmsjy.py db2sql.py -i tableone -i table2 -u fred -p itsasecret \
    -d mysql -D mydb

Export only rows meeting specified criteria from an Oracle database on host bigserver:

$ dbmsjy.py db2sql.py -i tableone \
    -q "select * from tableone where lastdate > '2006-04-11' \
    -u fred -p itsasecret -d oracle -H bigserver -D mydb