Using Primary Keys with Java Persistence

Sean Brydon, Smitha Kangath
Status: In Early Access

Problem Description

Persistent entities need primary keys. This document will cover some guidelines and tips for using primary keys while developing the model tier of an application using the Java Persistence APIs. We will first look at how primary keys are defined and then cover some strategies for generating them.

Solution

There are several guidelines to be followed for primary keys.

Simple or Composite?

The primary key type can be a Java primitive type (int, byte, long, etc.), a primitive wrapper type (Integer, Byte, Long, etc.), java.lang.String, java.util.Date, or java.sql.Date. In addition to choosing a type, application developers must also choose whether to use a simple primary key or a composite primary key.

A simple primary key involves a single persistent field or property of the entity. The @Id annotation is used to denote a simple primary key. The field (or property) annotated with @Id is mapped to the primary key of the corresponding database table. Code Example 1 below shows how a simple primary key is defined:

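import javax.persistence.Entity;
import javax.persistence.Id;

@Entity
public class Item {
  ...
  // use default column mapping
  private String itemId;

  // @Id on the getter selects property-based access for this entity
  @Id
  public String getItemId() {
    return itemId;
  }
  ...
}
Code Example 1: Simple Primary Key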

A composite key may correspond to either a single persistent field or a group of fields and is represented by a primary key class. Composite keys are usually defined as embedded classes which contain several columns from the table. Composite keys are useful when the primary key of the corresponding database table consists of more than one column. Composite keys can be defined by the @EmbeddedId or the @IdClass annotation. When the @EmbeddedId annotation is used, the class representing the composite key is an embedded class marked with the @Embeddable annotation. Note that the composite key class must be serializable and must define equals and hashCode methods. Code Example 2 below shows how a composite key is defined using the @EmbeddedId annotation.

@Embeddable public class ItemId implements Serializable {
    private String firstName;
    private String lastName;
    ...
   
}

@Entity public class Item implements Serializable {
    private ItemId id;
    @EmbeddedId public ItemId getId() {
        return id;
    }
    ... 
}

Code Example 2: Composite Primary Key using @EmbeddedId annotation
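
Code Example 2 elides the body of the key class. As a minimal sketch of the serializable, equals, and hashCode requirements mentioned above, the class could be filled in as follows. The constructors and getters are assumptions added for illustration, and the @Embeddable annotation appears only as a comment so that the sketch compiles with the JDK alone.

```java
import java.io.Serializable;
import java.util.Objects;

// In a real model this class would be annotated @Embeddable and declared
// public with a public no-arg constructor. It is package-private here only
// so the sketch fits in a single source file.
class ItemId implements Serializable {
    private static final long serialVersionUID = 1L;

    private String firstName;
    private String lastName;

    // No-argument constructor (assumed here for use by the persistence provider).
    ItemId() {
    }

    ItemId(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    String getFirstName() {
        return firstName;
    }

    String getLastName() {
        return lastName;
    }

    // Two keys are equal when every key field is equal.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof ItemId)) {
            return false;
        }
        ItemId that = (ItemId) other;
        return Objects.equals(firstName, that.firstName)
            && Objects.equals(lastName, that.lastName);
    }

    // Must be consistent with equals: equal keys produce equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hash(firstName, lastName);
    }
}
```

Basing equals and hashCode on the same set of key fields keeps the two methods consistent, so that two key objects built from the same column values are treated as the same identity.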

Strategies for Generation of Primary Keys

Primary keys can be generated by the database, by the container or by the application itself. The primary key is said to be auto-generated when it is generated by the provider or the database.

Generating primary keys in application

One option here is to use a primary key that is an inherent part of the data. In this case, the data already has a primary key, so there is no need to generate or assign one; the application can just use the primary key of the data. An example would be the social security number for an entity that represents a taxpayer in the US. If your application data does not already contain a primary key, then your application will need to assign one when the data is created and stored in the database. The application would have to create and set the value of the primary key programmatically. One way to do this is to create a table of Ids, such as sequence numbers, and an associated object that is mapped to that table, and to expose an API like IdGenerator.getNextId(). Another approach would be to just call System.currentTimeMillis(). Keep in mind that these methods may not work as expected in a clustered environment, where, for example, two nodes can obtain the same timestamp.

One advantage of having the application generate the primary key is that you get to know the key before the data is persisted in the database.
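
The following is a minimal sketch of the IdGenerator.getNextId() idea, under the assumption that an in-memory counter stands in for the table of sequence numbers described above. The Customer class and the starting value of 1 are hypothetical. Because the counter lives in a single JVM, this version has exactly the clustering limitation noted earlier.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical application-level generator. A real implementation would
// back the counter with a database table rather than JVM memory.
final class IdGenerator {
    private static final AtomicLong NEXT = new AtomicLong(1);

    private IdGenerator() {
    }

    // Returns a value that is unique within this JVM only.
    static long getNextId() {
        return NEXT.getAndIncrement();
    }
}

// Hypothetical domain class that receives its key at construction time.
class Customer {
    private final long id;
    private final String name;

    Customer(String name) {
        // The key is known here, before anything is persisted.
        this.id = IdGenerator.getNextId();
        this.name = name;
    }

    long getId() {
        return id;
    }

    String getName() {
        return name;
    }
}
```

Assigning the key in the constructor illustrates the advantage described above: code that creates a Customer can read getId() immediately, without waiting for a database round trip.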

Automatically generating primary keys

Another option is to generate the Id automatically and let the provider or the database set the value. That way the application developer doesn't have to write any Id generation logic. You just use an annotation on the Id and let the provider or the database do the work.

Java Persistence provides a mechanism for automatic generation of primary keys in this manner through the @GeneratedValue annotation. The annotation accepts an optional id generation strategy and an optional id generator. The @GeneratedValue annotation is used along with the @Id annotation.

Different generation strategies can be used by the persistence provider to generate the primary keys. The generation strategy can be one of TABLE, SEQUENCE, IDENTITY, or AUTO. 

With the IDENTITY strategy, the database adds an identity column and takes care of the primary key generation. This is database-specific and requires the database to support the IDENTITY column type. Hence this strategy may not be portable across databases.

With the SEQUENCE strategy, the persistence provider uses a database sequence that generates primary keys. This requires the database to support sequences and therefore may not be portable across databases.

The TABLE strategy indicates that values are assigned based on an underlying database table. Since the TABLE strategy is not based on any database-specific features, it is portable across databases.

AUTO indicates that the persistence provider should pick an appropriate strategy. 

The SEQUENCE, IDENTITY, and AUTO strategies are not portable across databases. Also, keep in mind that the AUTO, SEQUENCE, and TABLE strategies may not work well in a clustered environment. All three of these strategies require the persistence provider to take part in generating the key, which might cause data inconsistencies if the provider is not capable of coordinating that work across the cluster. With the IDENTITY strategy, the database generates the key without the provider's involvement, so data inconsistency is not expected in a clustered environment.

The id generator named in the @GeneratedValue annotation has to be declared with the @SequenceGenerator annotation or the @TableGenerator annotation, depending on whether the generation strategy is SEQUENCE or TABLE. Both of these annotations can be declared either on the entity class or on the primary key field. The scope of the generator is the persistence unit.

Let's see how primary keys are auto-generated with the TABLE generation strategy. With the table generator, the persistence provider uses a database table to store the generated Ids. Each entity has a single row in the table. The table has a primary key name column and a value column. The SQL command to create the generator table would look like this:

CREATE TABLE ID_GEN (GEN_KEY VARCHAR(10) NOT NULL, GEN_VALUE INTEGER NOT NULL, PRIMARY KEY (GEN_KEY));

A row has to be inserted into this table as a seed for the generator:

INSERT INTO ID_GEN VALUES ('ITEM_ID', 101);

Now if we have an Item table that has the item id as the primary key, the generated Ids will start from 102. You have to refer to this table in the TableGenerator annotation in the persistent entity for Item. The annotation would look like this:

  @TableGenerator(name="ID_GEN",
            table="ID_GEN",
            pkColumnName="GEN_KEY",
            valueColumnName="GEN_VALUE",
            pkColumnValue="ITEM_ID",
            allocationSize=1)
  @GeneratedValue(strategy=GenerationType.TABLE,generator="ID_GEN")
  @Id
  public int getItemID() {
      return itemID;
  }
  ...
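
To make the mechanics concrete, the plain-Java model below mimics a read, increment, and write cycle on the seeded row. It is a conceptual sketch only: IdGenTable and its methods are hypothetical, a HashMap stands in for the ID_GEN table, and actual persistence providers differ in how they allocate and cache values. It follows the statement above that a seed of 101 leads to 102 as the first generated Id.

```java
import java.util.HashMap;
import java.util.Map;

// Conceptual stand-in for the ID_GEN table: GEN_KEY -> GEN_VALUE.
// This is not how any particular provider is implemented.
final class IdGenTable {
    private final Map<String, Integer> rows = new HashMap<>();

    // Plays the role of: INSERT INTO ID_GEN VALUES (genKey, genValue)
    synchronized void seed(String genKey, int genValue) {
        rows.put(genKey, genValue);
    }

    // Reads the row, advances it by one, stores it back, and returns the
    // new value. The synchronized keyword models the atomic update that a
    // database transaction would provide on a single node.
    synchronized int next(String genKey) {
        Integer current = rows.get(genKey);
        if (current == null) {
            throw new IllegalStateException("no seed row for " + genKey);
        }
        int next = current + 1;
        rows.put(genKey, next);
        return next;
    }
}
```

The read and write must happen as one atomic step; if two callers could read the same stored value before either writes it back, both would receive the same Id, which is the kind of inconsistency the clustering caveat above warns about.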


A common use case is when we have an existing database and we need to auto-generate primary keys for the newly inserted records. This is possible if the existing primary keys are sequential.

Maintaining Primary Keys

All entities must have a primary key, which is set when the entity is persisted to the database for the first time. Applications should not modify primary key values of Java entities. The actual behavior when an application tries to modify the value of a primary key is not defined. If an application really needs to modify a primary key, it should first remove the entity and then add it with a new primary key, while also ensuring referential integrity. In most cases, the application does not need to modify the primary key. For example, when using automatically generated ids there is no need for the application to ever set the value.

Since entities have an identity field (or primary key class) which the application code should not modify, it is useful to put some safeguards in the code to prevent the primary key from being changed. One way is to document it through Javadoc on the setPK(..) method or the class, telling users not to call the set method of the primary key and not to access the primary key field directly to modify it. However, documentation does not always work by itself, and the entity developer wants to make it easy for other developers on the team to use the entity properly. For example, a web developer writing code that calls the entities might accidentally modify the primary key value on an entity.

Note that besides your application code accessing the entity fields and methods, the container and persistence provider will also be accessing the entity. The container will access the entity either through the fields or through the property methods, depending on what the application code specifies. The container/provider must access the primary key value because it manages the interactions between the data in the entities and the data in the database tables, keeps them in sync, and does other work. The container will also be accessing the entity in the case of automatically generated Ids, as it will be assigning the values for the primary key. So whatever mechanism you use to try to prevent the code from accidentally modifying the primary keys must take into account that the container needs to access the primary key field (or primary key class) of the entity.
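
One possible safeguard that respects this constraint, offered here as an assumption rather than as the authors' own recipe, is to rely on field-based access and simply not expose a public setter for the key. Application code can then read the key but has no method with which to change it, while a provider using field access can still reach the private field. In the sketch below the annotations appear only as comments so that it compiles with the JDK alone, and the name field is hypothetical.

```java
// In a real model this class would be annotated @Entity, and the itemId
// field would carry @Id so that the provider uses field-based access.
class Item {
    // @Id would go here. The field is private and has no setter.
    private String itemId;

    // Hypothetical non-key attribute, freely modifiable.
    private String name;

    // No-argument constructor, intended for the persistence provider.
    protected Item() {
    }

    // The key is supplied once, when the entity is created.
    Item(String itemId, String name) {
        this.itemId = itemId;
        this.name = name;
    }

    // Read-only view of the primary key for application code.
    public String getItemId() {
        return itemId;
    }

    // Deliberately no setItemId(..) method.

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}
```

With this shape, an accidental attempt to change the key fails at compile time because there is no method to call, which is a stronger guarantee than a Javadoc warning alone.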

Let's outline one strategy to develop an entity bean in such a way that users of the entity can avoid accidentally trying to modify the primary key. Here are the five steps involved in implementing this strategy:


© Sun Microsystems 2006. All of the material in The Java BluePrints Solutions Catalog is copyright-protected and may not be published in other works without express written permission from Sun Microsystems.