Google Cloud Spanner Dialect for Hibernate ORM

Table of Contents

Quick Set-Up
Examples
User Guide
Cloud Spanner Hibernate ORM Limitations

This is a dialect compatible with Hibernate 5.4 for the Google Cloud Spanner database service. The SpannerDialect produces SQL, DML, and DDL statements for most common entity types and relationships using standard Hibernate and Java Persistence annotations.

Please see the following sections for important details about dialect differences due to the unique features and limitations of Cloud Spanner.

Quick Set-Up

First, add the Maven dependencies for the Cloud Spanner Hibernate Dialect and the Cloud Spanner JDBC driver.

Maven coordinates for the dialect:

<dependency>
  <groupId>com.google.cloud</groupId>
  <artifactId>google-cloud-spanner-hibernate-dialect</artifactId>
  <version>1.5.3</version>
</dependency>

Maven coordinates for the official open source Cloud Spanner JDBC Driver.

<dependency>
  <groupId>com.google.cloud</groupId>
  <artifactId>google-cloud-spanner-jdbc</artifactId>
  <version>2.0.0</version>
</dependency>

Note	Hibernate ORM with Cloud Spanner is officially supported only with the open source Cloud Spanner JDBC Driver.

If you’re using a SNAPSHOT version of the dialect, please add the Sonatype Snapshots repository to your pom.xml:

<repository>
  <id>snapshots-repo</id>
  <url>https://oss.sonatype.org/content/repositories/snapshots</url>
  <releases><enabled>false</enabled></releases>
  <snapshots><enabled>true</enabled></snapshots>
</repository>

Configuring the SpannerDialect and a Cloud Spanner Driver class is typical of all Hibernate dialects in the hibernate.properties file:

hibernate.dialect=com.google.cloud.spanner.hibernate.SpannerDialect
hibernate.connection.driver_class=com.google.cloud.spanner.jdbc.JdbcDriver
hibernate.connection.url=jdbc:cloudspanner:/projects/{INSERT_PROJECT_ID}/instances/{INSERT_INSTANCE_ID}/databases/{INSERT_DATABASE_ID}

The service account JSON credentials file location should be in the GOOGLE_APPLICATION_CREDENTIALS environment variable. The driver will use default credentials set in the Google Cloud SDK gcloud application otherwise.

You are now ready to begin using Hibernate with Cloud Spanner.

Examples

To see some examples of using the dialect, please consult our sample applications or try the Cloud Spanner Hibernate codelab.

User Guide

This guide contains a variety of best practices for using Hibernate with Spanner which can significantly improve the performance of your application.

Schema Creation and Entity Design

Hibernate generates statements based on your Hibernate entity design. Following these practices can result in better DDL and DML statement generation which can improve performance.

Use Generated UUIDs for ID Generation

The Universally Unique Identifier (UUID) is the preferred ID type in Cloud Spanner because it avoids hotspots as the system divides data among servers by key ranges. UUIDs are strongly preferred over sequentially increasing IDs for this reason.

It is also recommended to use Hibernate’s @GeneratedValue annotation to generate this UUID automatically; this can reduce the number of statements that Hibernate generates to perform an insert because it does not need to run extra SELECT statements to see if the record already exists in the table.

You can configure UUID generation like below:

@Entity
public class Employee {

  @Id
  @GeneratedValue(strategy = GenerationType.AUTO)
  @Type(type="uuid-char")
  public UUID id;
}

The @Type(type="uuid-char") annotation specifies that this UUID value will be stored in Cloud Spanner as a STRING column. Leaving out this annotation causes a BYTES column to be used.

Hibernate’s @GeneratedValue annotation for numeric fields is supported but not recommended:

@Entity
public class Employee {

  @Id
  @GeneratedValue   // Not Recommended.
  public Long id;
}

This results in sequential IDs that are not optimal for Cloud Spanner and makes use of the hibernate_sequence table for generating IDs.

Custom Spanner Column Types

This project offers the following custom Hibernate type mappings for specific Spanner column types:

Spanner Data Type	Hibernate Type
ARRAY	`com.google.cloud.spanner.hibernate.types.SpannerArrayListType`
JSON	`com.google.cloud.spanner.hibernate.types.SpannerJsonType`

You can use these type mappings through the Hibernate @TypeDefs system:

// Use the @TypeDefs annotation to declare custom types you would like to use.
@TypeDefs({
  @TypeDef(
    name = "spanner-array",
    typeClass = SpannerArrayListType.class
  )
})
@Entity
public class Singer {

  // Specify the custom type with the @Type annotation.
  @Type(type = "spanner-array")
  private List<String> nickNames;

  ...
}

A working example of this feature can be found in the The Hibernate Basic Sample.

Auto-generate Schema for Faster Development

It is often useful to generate the schema for your database, such as during the early stages of development. The Spanner dialect supports Hibernate’s hibernate.hbm2ddl.auto setting which controls the framework’s schema generation behavior on start-up.

The following settings are available:

none: Do nothing.
validate: Validate the schema, makes no changes to the database.
update: Create or update the schema.
create: Create the schema, destroying previous data.
create-drop: Drop the schema when the SessionFactory is closed explicitly, typically when the application is stopped.

Hibernate performs schema updates on each table and entity type on startup, which can take more than several minutes if there are many tables. To avoid schema updates keeping Hibernate from starting for several minutes, you can update schemas separately and use the none or validate settings.

Leverage Cloud Spanner Foreign Key Constraints

The dialect supports all of the standard entity relationships:

@OneToOne
@OneToMany
@ManyToOne
@ManyToMany

These can be used via @JoinTable or @JoinColumn.

The Cloud Spanner Hibernate dialect will generate the correct foreign key DDL statements during schema generation for entities using these annotations. However, Cloud Spanner currently does not support cascading deletes on foreign keys, therefore database-side cascading deletes are not supported via the @OnDelete(action = OnDeleteAction.CASCADE).

The dialect also supports unique column constraints applied through @Column(unique = true) or @UniqueConstraint. In these cases, the dialect will create a unique index to enforce uniqueness on the specified columns.

Advanced Cloud Spanner Features (via. JDBC)

Cloud Spanner offers several features that traditional databases typically do not offer. These include:

Stale Reads
Read-only transactions
Partitioned DML
Mutations API (faster insert/update/delete operations)

We provide a Cloud Spanner Features Sample Application which demonstrates best practices for accessing these features through the Cloud Spanner JDBC driver.

Please consult the Cloud Spanner JDBC driver documentation for more information.

Performance Optimizations

There are some practices which can improve the execution time of Hibernate operations.

Be Clear About Inserts or Updates

Hibernate may generate additional SELECT statements if it is unclear whether you are attempting to insert a new record or update an existing record. The following practices can help with this:

Let Hibernate generate the ID by leaving the entity’s id null and annotate the field with @GeneratedValue. Hibernate will know that the record did not exist prior if it generates a new ID. See the above section for more details.
Or use session.persist() which will explicitly attempt the insert.

Enable Hibernate Batching

Batching SQL statements together allows you to optimize the performance of your application by including a group of SQL statements in a single remote call. This allows you to reduce the number of round-trips between your application and Cloud Spanner.

By default, Hibernate does not batch the statements that it sends to the Cloud Spanner JDBC driver.

Batching can be enabled by configuring hibernate.jdbc.batch_size in your Hibernate configuration file:

<property name="hibernate.jdbc.batch_size">100</property>

The property is set to 100 as an example; you may experiment with the batch size to see what works best for your application.

Use Interleaved Tables for Parent-Child Entities

Cloud Spanner offers the concept of Interleaved Tables which allows you to co-locate the rows of an interleaved table with rows of a parent table for efficient retrieval. This feature enforces the one-to-many relationship and provides efficient queries and operations on entities of a single domain parent entity.

If you would like to generate interleaved tables in Cloud Spanner, you must annotate your entity with the @Interleaved annotation. The primary key of the interleaved table must also include at least all of the primary key attributes of the parent. This is typically done using the @IdClass or @EmbeddedId annotation.

The Hibernate Basic Sample contains an example of using @Interleaved for the Singer and Album entities. The code excerpt of the Album entity below demonstrates how to declare an interleaved entity in the Singer table.

@Entity
@Interleaved(parentEntity = Singer.class, cascadeDelete = true)
@IdClass(AlbumId.class)
public class Album {

  @Id
  @GeneratedValue(strategy = GenerationType.AUTO)
  @Type(type = "uuid-char")
  private UUID albumId;

  @Id
  @ManyToOne
  @JoinColumn(name = "singerId")
  @Type(type = "uuid-char")
  private Singer singer;

  // Constructors, getters/setters

  public static class AlbumId implements Serializable {

    // The primary key columns of the parent entity
    // must be declared first.
    Singer singer;

    @Type(type = "uuid-char")
    UUID albumId;

    // Getters and setters
  }
}

The parent entity should define a @OneToMany relationship with the child entity as well. Use the mappedBy setting to specify which field in the child maps back to the parent.

@Entity
public class Singer {

  @OneToMany(mappedBy = "singer")
  List<Album> albums;

  // continued...
}

Tune JDBC Driver Parameters

The Spanner JDBC driver allows you to set the number of GRPC channels initialized through the JDBC connection URL. Each channel can support up to 100 concurrent requests; for applications that require a high amount of concurrency this value can be increased (from the default of 4).

jdbc:cloudspanner:/projects/PROJECT_ID/instances/INSTANCE_ID/databases/DATABASE_ID?numChannels=8

The full list of configurable properties can be found in the Spanner JDBC Driver Java docs.

Use Spanner Query Optimization

The Cloud Spanner SQL syntax offers a variety of query hints to tune and optimize the performance of queries. If you find that you need to take advantage of this feature, you can achieve this in Hibernate using native SQL queries.

This is an example of using the @{FORCE_JOIN_ORDER=TRUE} hint in a native Spanner SQL query.

SQLQuery query = session.createSQLQuery("SELECT * FROM Singers AS s
                                         JOIN@{FORCE_JOIN_ORDER=TRUE} Albums AS a
                                         ON s.SingerId = a.Singerid
                                         WHERE s.LastName LIKE '%x%'
                                         AND a.AlbumTitle LIKE '%love%';");

// Executes the query.
List<Object[]> entities = query.list();

Also, you may consult the Cloud Spanner documentation on general recommendations for optimizing performance.

Cloud Spanner Hibernate ORM Limitations

The Cloud Spanner Hibernate Dialect supports most of the standard Hibernate and Java Persistence annotations, but there are minor differences in supported features because of differences in Cloud Spanner from other traditional SQL databases.

Unsupported Feature	Description
Large DML Transactions	Each Spanner transaction may only have up to 20,000 operations which modify rows of a table.
Catalog and schema scoping for table names	Tables name references cannot contain periods or other punctuation.
Column default values	The dialect does not set default values based on the `@ColumnDefault` annotation, because Cloud Spanner does not support column defaults in the DDL.

Large DML Transactions Limits

Cloud Spanner has a mutation limit on each transaction - each Spanner transaction may only have up to 20,000 operations which modify rows of a table.

Note	Deleting a row counts as one operation and inserting/updating a single row will count as a number of operations equal to the number of affected columns. For example if one inserts a row that contains 5 columns, it counts as 5 modify operations for the insert.

Consequently, users must take care to avoid encountering these constraints.

We recommend being careful with the use of CASCADE_TYPE.ALL in Entity annotations because, depending on the application, it might trigger a large number of entities to be deleted in a single transaction and bring you over the 20,000 limit.
Also, when persisting a collection of entities, be mindful of the 20,000 mutations per transaction constraint.

Catalog/Schema Table Names

The Cloud Spanner Dialect only supports @Table with the name attribute. It does not support table names with catalog and schema components because Spanner table names may not contain punctuation:

// Supported.
@Table(
  name = "book"
)

// Not supported: `public.store.book` is not a valid Cloud Spanner table name reference.
@Table(
  catalog = "public",
  schema = "store",
  name = "book"
)

Column Default Values

The dialect does not support the @ColumnDefault annotation because Cloud Spanner does not offer a way of setting a default value for a column during table creation through DDL statements.

Sanne / google-cloud-spanner-hibernate