db2sql - Export Data

Contents

Return to main dbmstools documentation page.

Introduction

db2sql is a tool to export data from a database, as:

  • SQL INSERT statements
  • Comma- or tab-separated data
  • XML
  • Text (one column per line)
  • Postgres COPY statements.

This can be used to migrate data between databases, and even from one DBMS type to another. The text output format is useful for system testing - it allows for easy diffing of database contents after a test run with a previously-exported set of known-good data.

By default, all data in the database is exported. However, this can be restricted by schema, by table, or even to a subset of data within a table (by specifying a query and/or column exclusions).

The tool can also generate row counts for tables, which provides a useful check that all data has been migrated between two databases.

Prerequisites

Because this script needs to get at metadata for the database, it uses the dbmsjy wrapper and a JDBC driver. See the dbmsjy page for details.

Usage

You use the tool like this:

dbmsjy.py db2sql.py [dbOptions] [generalOptions]

or, specifying JDBC details directly:

dbmsjy.py db2sql.py [jdbcOptions] [generalOptions]

Database Options:

-d dbms Name of the DBMS. Required. Values recognised are: postgres7, postgres8, mssql, oracle, mysql, hsql
-D dbName Name of the database to connect to. Required
-H dbHost Name or IP address of the database host to connect to. Optional, defaults to localhost
-P dbPort Port of the database host to connect to. Optional, defaults to the default port for the DBMS

JDBC Options:

-c driverClass Fully-qualified name of the JDBC driver class, e.g. org.postgresql.Driver. If omitted, will be defaulted based on URL. Must be present in classpath, usually by placing in the jdbc-drivers directory
-j JDBC_URL JDBC URL for connecting to the database, in the format expected by the JDBC driver, e.g. jdbc:postgresql://localhost:5432/PS30 Required

General Options:

-e tableName Specifically exclude this table (add a -e option for each table you want to exclude)
-f filepath Output file path - used only with -q
-h Shows this screen
-i tableName Specifically include this table (add a -i option for each table you want to include)
-n rows Number of rows to fetch at a time (defaults to 10,000).
-o directory Write files in this directory, rather than to stdout
-p password Password to use to connect to DB (if required)
-q query Query to use to export data. If a query is specified, no other tables will be exported. Requires a single -i option giving the name of the table
-s schema Schema name to export (defaults to all schemas)
-u username Username to use to connect to DB (if required)
-v Enable verbose logging (to stderr)
-x columnName Specifically exclude this column (add a -x option for each column you want to exclude)
-y Output as postgres COPY statements, rather than INSERT statements

Long Options

--countonly Output only row counts for each table, not the data
--format fmt Specifies the format for output. Options are 'sql' (default), 'copy' (equivalent of -y option, for Postgres COPY), 'text' (1 column value per line, intended for diffing databases), 'csv' (comma-separated value), 'tsv' (tab-separated value) and 'xml'
--includeschema
 Include the schema name in the SQL generated (default is to omit)
--utf8 Encode output files as UTF-8, rather than ISO-8859-1
--version Print version number and exit

See the dbmsjy page for details of default JDBC drivers and URLs etc.

Examples

Export all tables in all schemas from a local Postgres database called mydb:

$ dbmsjy.py db2sql.py -u fred -p itsasecret -d postgres8 -D mydb

Export 2 named tables from a local MySQL database:

$ dbmsjy.py db2sql.py -i tableone -i table2 -u fred -p itsasecret \
    -d mysql -D mydb

Export only rows meeting specified criteria from an Oracle database on host bigserver:

$ dbmsjy.py db2sql.py -i tableone \
    -q "select * from tableone where lastdate > '2006-04-11' \
    -u fred -p itsasecret -d oracle -H bigserver -D mydb