How MySql loads result sets

I was having a devil of a time yesterday with the simple task of using a Java program to copy records as objects from one MySql database to another [1]. I kept running out of memory. While the root problem had to do with creating 2M objects in memory it did lead to a better understanding of how MySql loads result sets. In short, it loads the whole result set into memory in one shot. So, if you have 1M records at 1K each in the result set you will need at least 1G of memory to hold them. If you then build 1M objects from these records you will need an additional 1M * object size of memory. In other words, a lot of memory. You can have the MySql JDBC driver "stream" the result set, however. That is, read the records row by row from the database. It is less efficient for the driver -- multiple trips back and forth between the server -- but doing so requires far less memory. You can turn on streaming at the statement level or at the datasource level.

Statement Level

To turn on streaming at the statement level you need to use a set of common JDBC settings that, when used together, inform the driver to stream. When you create or prepare a statement you must define how the result will be used and what is the fetch size. For example,

Statement statement = connection.createStatement(
 ResultSet.TYPE_FORWARD_ONLY, 
 ResultSet.CONCUR_READ_ONLY);
statement.setFetchSize(Integer.MIN_VALUE);

and

PreparedStatement statement = connection.prepareStatement(
 "select ... from ... where ...",
 ResultSet.TYPE_FORWARD_ONLY,
 ResultSet.CONCUR_UPDATABLE);
statement.setFetchSize(Integer.MIN_VALUE);

For more information see section Result Set in JDBC API Implementation Notes. DataSource Level
To turn on streaming at the statement level you need to add a property to the JDBC uri. For example, Integer. MIN_VALUE is -2^31 and so use

jdbc:mysql://localhost/?defaultFetchSize=-2147483648

For more information see Driver/Datasource Class Names, URL Syntax and Configuration Properties for Connector/J. [1] I could not use the MySql tools for dumping and loading the table data because I used an auto_increment column in one of the related tables, the target database was active with data, and so could not reset the target's auto_increment column to an appropriate value.