Working with a relational database

Groovy’s groovy-sql module provides a higher-level abstraction over Java’s JDBC technology. JDBC itself provides a lower-level but fairly comprehensive API which provides uniform access to a whole variety of supported relational database systems. We’ll use HSQLDB in our examples here but you can alternatively use Oracle, SQL Server, MySQL and a host of others. The most frequently used class within the groovy-sql module is the groovy.sql.Sql class which raises the JDBC abstractions up one level. We’ll cover that first.

1. Connecting to the database

Connecting to a database with Groovy’s Sql class requires four pieces of information:

  • The database uniform resource locator (URL)

  • Username

  • Password

  • The driver class name (which can be derived automatically in some situations)

For our HSQLDB database, the values will be something like that shown in the following table:

Property Value

url

jdbc:hsqldb:mem:yourdb

user

sa (or your username)

password

yourPassword

driver

org.hsqldb.jdbcDriver

Consult the documentation for the JDBC driver that you plan to use to determine the correct values for your situation.

The Sql class has a newInstance factory method which takes these parameters. You would typically use it as follows:

Connecting to HSQLDB
import groovy.sql.Sql

def url = 'jdbc:hsqldb:mem:yourDB'
def user = 'sa'
def password = ''
def driver = 'org.hsqldb.jdbcDriver'
def sql = Sql.newInstance(url, user, password, driver)

// use 'sql' instance ...

sql.close()

If you don’t want to have to handle resource handling yourself (i.e. call close() manually) then you can use the withInstance variation as shown here:

Connecting to HSQLDB (withInstance variation)
Sql.withInstance(url, user, password, driver) { sql ->
  // use 'sql' instance ...
}

1.1. Connecting with a DataSource

It is often preferred to use a DataSource. You may have one available to you from a connection pool. Here we’ll use the one provided as part of the HSQLDB driver jar as shown here:

Connecting to HSQLDB with a DataSource
import groovy.sql.Sql
import org.hsqldb.jdbc.JDBCDataSource

def dataSource = new JDBCDataSource(
    database: 'jdbc:hsqldb:mem:yourDB', user: 'sa', password: '')
def sql = new Sql(dataSource)

// use then close 'sql' instance ...

If you have your own connection pooling, the details will be different, e.g. for Apache Commons DBCP:

Connecting to HSQLDB with a DataSource using Apache Commons DBCP
@Grab('org.apache.commons:commons-dbcp2:2.7.0')
import groovy.sql.Sql
import org.apache.commons.dbcp2.BasicDataSource

def ds = new BasicDataSource(driverClassName: "org.hsqldb.jdbcDriver",
    url: 'jdbc:hsqldb:mem:yourDB', username: 'sa', password: '')
def sql = new Sql(ds)
// use then close 'sql' instance ...

1.2. Connecting using @Grab

The previous examples assume that the necessary database driver jar is already on your classpath. For a self-contained script you can add @Grab statements to the top of the script to automatically download the necessary jar as shown here:

Connecting to HSQLDB using @Grab
@Grab('org.hsqldb:hsqldb:2.7.1:jdk8')
@GrabConfig(systemClassLoader=true)
// create, use, and then close sql instance ...

The @GrabConfig statement is necessary to make sure the system classloader is used. This ensures that the driver classes and system classes like java.sql.DriverManager are in the same classloader.

2. Executing SQL

You can execute arbitrary SQL commands using the execute() method. Let’s have a look at using it to create a table.

2.1. Creating tables

The simplest way to execute SQL is to call the execute() method passing the SQL you wish to execute as a String as shown here:

Creating a table
// ... create 'sql' instance
sql.execute '''
  CREATE TABLE Author (
    id          INTEGER GENERATED BY DEFAULT AS IDENTITY,
    firstname   VARCHAR(64),
    lastname    VARCHAR(64)
  );
'''
// close 'sql' instance ...

There is a variant of this method which takes a GString and another with a list of parameters. There are also other variants with similar names: executeInsert and executeUpdate. We’ll see examples of these variants in other examples in this section.

3. Basic CRUD operations

The basic operations on a database are Create, Read, Update and Delete (the so-called CRUD operations). We’ll examine each of these in turn.

3.1. Creating/Inserting data

You can use the same execute() statement we saw earlier but to insert a row by using a SQL insert statement as follows:

Inserting a row
sql.execute "INSERT INTO Author (firstname, lastname) VALUES ('Dierk', 'Koenig')"

You can use a special executeInsert method instead of execute. This will return a list of all keys generated. Both the execute and executeInsert methods allow you to place '?' placeholders into your SQL string and supply a list of parameters. In this case a PreparedStatement is used which avoids any risk of SQL injection. The following example illustrates executeInsert using placeholders and parameters:

Inserting a row using executeInsert with placeholders and parameters
def insertSql = 'INSERT INTO Author (firstname, lastname) VALUES (?,?)'
def params = ['Jon', 'Skeet']
def keys = sql.executeInsert insertSql, params
assert keys[0] == [1]

In addition, both the execute and executeInsert methods allow you to use GStrings. Any '$' placeholders within the SQL are assumed to be placeholders. An escaping mechanism exists if you want to supply part of the GString with a variable in a position which isn’t where normal placeholders go within SQL. See the GroovyDoc for more details. Also, executeInsert allows you to supply a list of key column names, when multiple keys are returned and you are only interested in some of them. Here is a fragment illustrating key name specification and GStrings:

Inserting a row using executeInsert with a GString and specifying key names
def first = 'Guillaume'
def last = 'Laforge'
def myKeyNames = ['ID']
def myKeys = sql.executeInsert """
  INSERT INTO Author (firstname, lastname)
  VALUES (${first}, ${last})
""", myKeyNames
assert myKeys[0] == [ID: 2]

3.2. Reading rows

Reading rows of data from the database is accomplished using one of several available methods: query, eachRow, firstRow and rows.

Use the query method if you want to iterate through the ResultSet returned by the underlying JDBC API as shown here:

Reading data using query
def expected = ['Dierk Koenig', 'Jon Skeet', 'Guillaume Laforge']

def rowNum = 0
sql.query('SELECT firstname, lastname FROM Author') { resultSet ->
  while (resultSet.next()) {
    def first = resultSet.getString(1)
    def last = resultSet.getString('lastname')
    assert expected[rowNum++] == "$first $last"
  }
}

Use the eachRow method if you want a slightly higher-level abstraction which provides a Groovy friendly map-like abstraction for the ResultSet as shown here:

Reading data using eachRow
rowNum = 0
sql.eachRow('SELECT firstname, lastname FROM Author') { row ->
  def first = row[0]
  def last = row.lastname
  assert expected[rowNum++] == "$first $last"
}

Note that you can use Groovy list-style and map-style notations when accessing the row of data.

Use the firstRow method if you for similar functionality as eachRow but returning only one row of data as shown here:

Reading data using firstRow
def first = sql.firstRow('SELECT lastname, firstname FROM Author')
assert first.values().sort().join(',') == 'Dierk,Koenig'

Use the rows method if you want to process a list of map-like data structures as shown here:

Reading data using rows
List authors = sql.rows('SELECT firstname, lastname FROM Author')
assert authors.size() == 3
assert authors.collect { "$it.FIRSTNAME ${it[-1]}" } == expected

Note that the map-like abstraction has case-insensitive keys (hence we can use 'FIRSTNAME' or 'firstname' as the key) and also that -ve indices (a standard Groovy feature) works when using an index value (to count column numbers from the right).

You can also use any of the above methods to return scalar values, though typically firstRow is all that is required in such cases. An example returning the count of rows is shown here:

Reading scalar values
assert sql.firstRow('SELECT COUNT(*) AS num FROM Author').num == 3

3.3. Updating rows

Updating rows can again be done using the execute() method. Just use a SQL update statement as the argument to the method. You can insert an author with just a lastname and then update the row to also have a firstname as follows:

Updating a row
sql.execute "INSERT INTO Author (lastname) VALUES ('Thorvaldsson')"
sql.execute "UPDATE Author SET firstname='Erik' where lastname='Thorvaldsson'"

There is also a special executeUpdate variant which returns the number of rows updated as a result of executing the SQL. For example, you can change the lastname of an author as follows:

Using executeUpdate
def updateSql = "UPDATE Author SET lastname='Pragt' where lastname='Thorvaldsson'"
def updateCount = sql.executeUpdate updateSql
assert updateCount == 1

def row = sql.firstRow "SELECT * FROM Author where firstname = 'Erik'"
assert "${row.firstname} ${row.lastname}" == 'Erik Pragt'

3.4. Deleting rows

The execute method is also used for deleting rows as this example shows:

Deleting rows
assert sql.firstRow('SELECT COUNT(*) as num FROM Author').num == 3
sql.execute "DELETE FROM Author WHERE lastname = 'Skeet'"
assert sql.firstRow('SELECT COUNT(*) as num FROM Author').num == 2

4. Advanced SQL operations

4.1. Working with transactions

The easiest way to perform database operations within a transaction is to include the database operation within a withTransaction closure as shown in the following example:

A successful transaction
assert sql.firstRow('SELECT COUNT(*) as num FROM Author').num == 0
sql.withTransaction {
  sql.execute "INSERT INTO Author (firstname, lastname) VALUES ('Dierk', 'Koenig')"
  sql.execute "INSERT INTO Author (firstname, lastname) VALUES ('Jon', 'Skeet')"
}
assert sql.firstRow('SELECT COUNT(*) as num FROM Author').num == 2

Here the database starts empty and has two rows after successful completion of the operation. Outside the scope of the transaction, the database is never seen as having just one row.

If something goes wrong, any earlier operations within the withTransaction block are rolled back. We can see that in operation in the following example where we use database metadata (more details coming up shortly) to find the maximum allowable size of the firstname column and then attempt to enter a firstname one larger than that maximum value as shown here:

A failed transaction will cause a rollback
def maxFirstnameLength
def metaClosure = { meta -> maxFirstnameLength = meta.getPrecision(1) }
def rowClosure = {}
def rowCountBefore = sql.firstRow('SELECT COUNT(*) as num FROM Author').num
try {
  sql.withTransaction {
    sql.execute "INSERT INTO Author (firstname) VALUES ('Dierk')"
    sql.eachRow "SELECT firstname FROM Author WHERE firstname = 'Dierk'", metaClosure, rowClosure
    sql.execute "INSERT INTO Author (firstname) VALUES (?)", 'X' * (maxFirstnameLength + 1)
  }
} catch(ignore) { println ignore.message }
def rowCountAfter = sql.firstRow('SELECT COUNT(*) as num FROM Author').num
assert rowCountBefore == rowCountAfter

Even though the first sql execute succeeds initially, it will be rolled back and the number of rows will remain the same.

4.2. Using batches

When dealing with large volumes of data, particularly when inserting such data, it can be more efficient to chunk the data into batches. This is done using the withBatch statement as shown in the following example:

Batching SQL statements
sql.withBatch(3) { stmt ->
  stmt.addBatch "INSERT INTO Author (firstname, lastname) VALUES ('Dierk', 'Koenig')"
  stmt.addBatch "INSERT INTO Author (firstname, lastname) VALUES ('Paul', 'King')"
  stmt.addBatch "INSERT INTO Author (firstname, lastname) VALUES ('Guillaume', 'Laforge')"
  stmt.addBatch "INSERT INTO Author (firstname, lastname) VALUES ('Hamlet', 'D''Arcy')"
  stmt.addBatch "INSERT INTO Author (firstname, lastname) VALUES ('Cedric', 'Champeau')"
  stmt.addBatch "INSERT INTO Author (firstname, lastname) VALUES ('Erik', 'Pragt')"
  stmt.addBatch "INSERT INTO Author (firstname, lastname) VALUES ('Jon', 'Skeet')"
}

After executing these statements, there will be 7 new rows in the database. In fact, they will have been added in batches even though you can’t easily tell that after that fact. If you want to confirm what is going on under the covers, you can add a little bit of extra logging into your program. Add the following lines before the withBatch statement:

Logging additional SQL information
import java.util.logging.*

// next line will add fine logging
Logger.getLogger('groovy.sql').level = Level.FINE
// also adjust logging.properties file in JRE_HOME/lib to have:
// java.util.logging.ConsoleHandler.level = FINE

With this extra logging turned on, and the changes made as per the above comment for the logging.properties file, you should see output such as:

SQL logging output with batching enable
FINE: Successfully executed batch with 3 command(s)
Apr 19, 2015 8:38:42 PM groovy.sql.BatchingStatementWrapper processResult

FINE: Successfully executed batch with 3 command(s)
Apr 19, 2015 8:38:42 PM groovy.sql.BatchingStatementWrapper processResult

FINE: Successfully executed batch with 1 command(s)
Apr 19, 2015 8:38:42 PM groovy.sql.Sql getStatement

We should also note, that any combination of SQL statements can be added to the batch. They don’t all have to be inserting a new row to the same table.

We noted earlier that to avoid SQL injection, we encourage you to use prepared statements, this is achieved using the variants of methods which take GStrings or a list of extra parameters. Prepared statements can be used in combination with batches as shown in the following example:

Batching prepared statements
def qry = 'INSERT INTO Author (firstname, lastname) VALUES (?,?)'
sql.withBatch(3, qry) { ps ->
  ps.addBatch('Dierk', 'Koenig')
  ps.addBatch('Paul', 'King')
  ps.addBatch('Guillaume', 'Laforge')
  ps.addBatch('Hamlet', "D'Arcy")
  ps.addBatch('Cedric', 'Champeau')
  ps.addBatch('Erik', 'Pragt')
  ps.addBatch('Jon', 'Skeet')
}

This provides a much safer option if the data could come from a user such as via a script or a web form. Of course, given that a prepared statement is being used, you are limited to a batch of the same SQL operation (insert in our example) to the one table.

4.3. Performing pagination

When presenting large tables of data to a user, it is often convenient to present information a page at a time. Many of Groovy’s SQL retrieval methods have extra parameters which can be used to select a particular page of interest. The starting position and page size are specified as integers as shown in the following example using rows:

Retrieving pages of data
def qry = 'SELECT * FROM Author'
assert sql.rows(qry, 1, 3)*.firstname == ['Dierk', 'Paul', 'Guillaume']
assert sql.rows(qry, 4, 3)*.firstname == ['Hamlet', 'Cedric', 'Erik']
assert sql.rows(qry, 7, 3)*.firstname == ['Jon']

4.4. Fetching metadata

JDBC metadata can be retrieved in numerous ways. Perhaps the most basic approach is to extract the metadata from any row as shown in the following example which examines the tablename, column names and column type names:

Using row metadata
sql.eachRow("SELECT * FROM Author WHERE firstname = 'Dierk'") { row ->
  def md = row.getMetaData()
  assert md.getTableName(1) == 'AUTHOR'
  assert (1..md.columnCount).collect{ md.getColumnName(it) } == ['ID', 'FIRSTNAME', 'LASTNAME']
  assert (1..md.columnCount).collect{ md.getColumnTypeName(it) } == ['INTEGER', 'VARCHAR', 'VARCHAR']
}

And another slight variant to the previous example, this time also looking at the column label:

Also using row metadata
sql.eachRow("SELECT firstname AS first FROM Author WHERE firstname = 'Dierk'") { row ->
  def md = row.getMetaData()
  assert md.getColumnName(1) == 'FIRSTNAME'
  assert md.getColumnLabel(1) == 'FIRST'
}

Accessing metadata is quite common, so Groovy also provides variants to many of its methods that let you supply a closure that will be called once with the row metadata in addition to the normal row closure which is called for each row. The following example illustrates the two closure variant for eachRow:

Using row and metadata closures
def metaClosure = { meta -> assert meta.getColumnName(1) == 'FIRSTNAME' }
def rowClosure = { row -> assert row.FIRSTNAME == 'Dierk' }
sql.eachRow("SELECT firstname FROM Author WHERE firstname = 'Dierk'", metaClosure, rowClosure)

Note that our SQL query will only return one row, so we could have equally used firstRow for the previous example.

Finally, JDBC also provides metadata per connection (not just for rows). You can also access such metadata from Groovy as shown in this example:

Using connection metadata
def md = sql.connection.metaData
assert md.driverName == 'HSQL Database Engine Driver'
assert md.databaseProductVersion == '2.7.1'
assert ['JDBCMajorVersion', 'JDBCMinorVersion'].collect{ md[it] } == [4, 2]
assert md.stringFunctions.tokenize(',').contains('CONCAT')
def rs = md.getTables(null, null, 'AUTH%', null)
assert rs.next()
assert rs.getString('TABLE_NAME') == 'AUTHOR'

Consult the JavaDoc for your driver to find out what metadata information is available for you to access.

4.5. Named and named-ordinal parameters

Groovy supports some additional alternative placeholder syntax variants. The GString variants are typically preferred over these alternatives but the alternatives are useful for Java integration purposes and sometimes in templating scenarios where GStrings might already be in heavy use as part of a template. The named parameter variants are much like the String plus list of parameter variants but instead of having a list of ? placeholders followed by a list of parameters, you have one or more placeholders having the form :propName or ?.propName and a single map, named arguments or a domain object as the parameter. The map or domain object should have a property named propName corresponding to each supplied placeholder.

Here is an example using the colon form:

Named parameters (colon form)
sql.execute "INSERT INTO Author (firstname, lastname) VALUES (:first, :last)", first: 'Dierk', last: 'Koenig'

And another example using the question mark form:

Named parameters (question mark form)
sql.execute "INSERT INTO Author (firstname, lastname) VALUES (?.first, ?.last)", first: 'Jon', last: 'Skeet'

If the information you need to supply is spread across multiple maps or domain objects you can use the question mark form with an additional ordinal index as shown here:

Named-ordinal parameters
class Rockstar { String first, last }
def pogo = new Rockstar(first: 'Paul', last: 'McCartney')
def map = [lion: 'King']
sql.execute "INSERT INTO Author (firstname, lastname) VALUES (?1.first, ?2.lion)", pogo, map

4.6. Stored procedures

The exact syntax for creating a stored procedure or function varies slightly between different databases. For the HSQLDB database we are using, we can create a stored function which returns the initials of all authors in a table as follows:

Creating a stored function
sql.execute """
  CREATE FUNCTION SELECT_AUTHOR_INITIALS()
  RETURNS TABLE (firstInitial VARCHAR(1), lastInitial VARCHAR(1))
  READS SQL DATA
  RETURN TABLE (
    SELECT LEFT(Author.firstname, 1) as firstInitial, LEFT(Author.lastname, 1) as lastInitial
    FROM Author
  )
"""

We can use a SQL CALL statement to invoke the function using Groovy’s normal SQL retrieval methods. Here is an example using eachRow.

Creating a stored procedure or function
def result = []
sql.eachRow('CALL SELECT_AUTHOR_INITIALS()') {
  result << "$it.firstInitial$it.lastInitial"
}
assert result == ['DK', 'JS', 'GL']

Here is the code for creating another stored function, this one taking the lastname as a parameter:

Creating a stored function with a parameter
sql.execute """
  CREATE FUNCTION FULL_NAME (p_lastname VARCHAR(64))
  RETURNS VARCHAR(100)
  READS SQL DATA
  BEGIN ATOMIC
    DECLARE ans VARCHAR(100);
    SELECT CONCAT(firstname, ' ', lastname) INTO ans
    FROM Author WHERE lastname = p_lastname;
    RETURN ans;
  END
"""

We can use the placeholder syntax to specify where the parameter belongs and note the special placeholder position to indicate the result:

Using a stored function with a parameter
def result = sql.firstRow("{? = call FULL_NAME(?)}", ['Koenig'])
assert result[0] == 'Dierk Koenig'

Finally, here is a stored procedure with input and output parameters:

Creating a stored procedure with input and output parameters
sql.execute """
  CREATE PROCEDURE CONCAT_NAME (OUT fullname VARCHAR(100),
    IN first VARCHAR(50), IN last VARCHAR(50))
  BEGIN ATOMIC
    SET fullname = CONCAT(first, ' ', last);
  END
"""

To use the CONCAT_NAME stored procedure parameter, we make use of a special call method. Any input parameters are simply provided as parameters to the method call. For output parameters, the resulting type must be specified as shown here:

Using a stored procedure with input and output parameters
sql.call("{call CONCAT_NAME(?, ?, ?)}", [Sql.VARCHAR, 'Dierk', 'Koenig']) {
  fullname -> assert fullname == 'Dierk Koenig'
}
Creating a stored procedure with an input/output parameter
sql.execute """
  CREATE PROCEDURE CHECK_ID_POSITIVE_IN_OUT ( INOUT p_err VARCHAR(64), IN pparam INTEGER, OUT re VARCHAR(15))
  BEGIN ATOMIC
    IF pparam > 0 THEN
      set p_err = p_err || '_OK';
      set re = 'RET_OK';
    ELSE
      set p_err = p_err || '_ERROR';
      set re = 'RET_ERROR';
    END IF;
  END;
"""
Using a stored procedure with an input/output parameter
def scall = "{call CHECK_ID_POSITIVE_IN_OUT(?, ?, ?)}"
sql.call scall, [Sql.inout(Sql.VARCHAR("MESSAGE")), 1, Sql.VARCHAR], {
  res, p_err -> assert res == 'MESSAGE_OK' && p_err == 'RET_OK'
}

5. Using DataSets

Groovy provides a groovy.sql.DataSet class which enhances the groovy.sql.Sql class with what can be thought of as mini ORM functionality. Databases are accessed and queried using POGO fields and operators rather than JDBC-level API calls and RDBMS column names.

So, instead of a query like:

def qry = """SELECT * FROM Author
  WHERE (firstname > ?)
  AND (lastname < ?)
  ORDER BY lastname DESC"""
def params = ['Dierk', 'Pragt']
def result = sql.rows(qry, params)
assert result*.firstname == ['Eric', 'Guillaume', 'Paul']

You can write code like this:

def authorDS = sql.dataSet('Author')
def result = authorDS.findAll{ it.firstname > 'Dierk' }
        .findAll{ it.lastname < 'Pragt' }
        .sort{ it.lastname }
        .reverse()
assert result.rows()*.firstname == ['Eric', 'Guillaume', 'Paul']

Here we have a helper "domain" class:

class Author {
    String firstname
    String lastname
}

Database access and manipulation involves creating or working with instances of the domain class.