Using Primary Keys with Java Persistence
Sean Brydon, Smitha Kangath
Status: In Early Access
Problem Description
Persistent entities need primary keys. This document covers some
guidelines and tips for using primary keys while developing the model
tier of an application using the Java Persistence APIs. We will first
look at how primary keys are defined and then cover some strategies for
generating them.
Solution
There are several guidelines to be followed for primary keys.
Simple or Composite?
The primary key type can be a Java primitive type (int, byte, long,
etc.), a primitive wrapper type (Integer, Byte, Long, etc.),
java.lang.String, java.util.Date, or java.sql.Date.
In addition to choosing a type, the application developer must also
choose whether to use a simple primary key or a composite primary key.
A simple primary key involves a single persistent field or property of
the entity. The @Id annotation is used to denote a simple primary key.
The field (or property) of the entity that is denoted with the @Id
annotation is mapped to the primary key of the corresponding database
table. Code Example 1 below shows how a simple primary key is defined:
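import javax.persistence.Entity;
import javax.persistence.Id;

@Entity
public class Item {
    ...
    // use default column mapping
    String itemId;

    // @Id on the getter marks itemId as the simple primary key
    @Id
    public String getItemId() {
        return itemId;
    }
    ...
}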
Code Example 1: Simple Primary Key
A composite key may correspond to either a single persistent field or a
group of fields and is represented by a primary key class. Composite
keys are usually defined as embedded classes which contain several
columns from the table. Composite keys are useful when the primary key
of the corresponding database table consists of more than one column.
Composite keys can be defined by the @EmbeddedId or the @IdClass
annotation. When the @EmbeddedId annotation is used, the class
representing the composite key is an embedded class specified by the
@Embeddable annotation. Please note that the composite key class must be
serializable and must define equals and hashCode methods. Code Example 2
below shows how a composite key is defined using the @EmbeddedId
annotation.
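import java.io.Serializable;
import javax.persistence.Embeddable;
import javax.persistence.EmbeddedId;
import javax.persistence.Entity;

// Primary key class: one field per key column
@Embeddable
public class ItemId implements Serializable {
    private String firstName;
    private String lastName;
    ...
}

@Entity
public class Item implements Serializable {
    private ItemId id;

    // The embedded key object is the entity's composite primary key
    @EmbeddedId
    public ItemId getId() {
        return id;
    }
    ...
}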
Code Example 2: Composite Primary Key using @EmbeddedId annotation
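Code Example 2 elides the body of the key class. As a rough sketch of
what the equals and hashCode requirement could look like, the version
below fills in those methods for the same two fields. The constructors
and getters are additions for illustration only, and the @Embeddable
annotation is left as a comment so that the sketch compiles without a
persistence provider on the classpath. A real primary key class would
also be declared public.

```java
import java.io.Serializable;
import java.util.Objects;

// Illustrative version of the ItemId key class from Code Example 2.
// In a real application this class would be public and annotated
// with @Embeddable (javax.persistence.Embeddable).
class ItemId implements Serializable {
    private static final long serialVersionUID = 1L;

    private String firstName;
    private String lastName;

    // No-arg constructor for the persistence provider.
    public ItemId() {
    }

    // Convenience constructor (an assumption, not part of the original).
    public ItemId(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    public String getFirstName() {
        return firstName;
    }

    public String getLastName() {
        return lastName;
    }

    // Two keys are equal when every key field matches.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof ItemId)) {
            return false;
        }
        ItemId that = (ItemId) other;
        return Objects.equals(firstName, that.firstName)
                && Objects.equals(lastName, that.lastName);
    }

    // Kept consistent with equals: equal keys produce equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hash(firstName, lastName);
    }
}
```

Both methods are derived from the same set of fields, so two key objects
that describe the same database row compare as equal and hash to the
same value.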
Strategies for Generation of Primary Keys
Primary keys can be generated by the database, by the persistence
provider, or by the application itself. The primary key is said to be
auto-generated when it is generated by the provider or the database.
Generating primary keys in application
One option here is to use a primary key that is an inherent part of the
data. In this case, the data already has a primary key, so there is no
need to generate or assign one; the application can just use the primary
key of the data. An example would be the social security number for an
entity that represents a taxpayer in the US. If your application data
does not already contain a primary key, then your application will need
to assign a primary key to the data when the data is created and stored
in the database. The application would have to create and set the value
of the primary key programmatically. One way to do this is to create a
table of Ids, such as sequence numbers, and an associated object that is
mapped to that table, and have an API like IdGenerator.getNextId().
Another approach would be to just call System.currentTimeMillis(). Keep
in mind that these methods may not work as expected in a clustered
environment; for example, two nodes that call
System.currentTimeMillis() in the same millisecond get the same value.
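To make the clustering caveat concrete, here is a minimal in-memory
sketch of the IdGenerator idea. It is a stand-in built on assumptions:
the counter lives in the JVM rather than in a database table, and
getNextId() is an instance method (rather than the static-style call
mentioned above) so that two generators can be compared side by side.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory IdGenerator. A real one would read and update
// a row in a database table of ids instead of a JVM-local counter.
class IdGenerator {
    private final AtomicLong lastId;

    // seed is the last id that has already been handed out.
    IdGenerator(long seed) {
        this.lastId = new AtomicLong(seed);
    }

    // Atomically advances the counter and returns the new value, so
    // concurrent callers in the same JVM never receive the same id.
    long getNextId() {
        return lastId.incrementAndGet();
    }
}
```

Within one JVM the ids are unique. Two instances seeded with the same
value, as could happen on two cluster nodes that each keep their own
counter, both return 101 on their first call when seeded with 100, which
is the kind of collision the warning above refers to.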
One advantage of having the application generate the primary key is
that you get to know the key before the data is persisted in the
database.
Automatically generating primary keys
Another option is to generate the Id automatically and let the provider
or the database set the value. That way the application developer
doesn't have to write any Id generation logic. You just use an
annotation on the Id and let the provider or the database do the work.
Java Persistence provides a mechanism for automatic generation of
primary keys in this manner through the use of the @GeneratedValue
annotation. Both of its elements, the id generation strategy and the id
generator, are optional. The @GeneratedValue annotation is used along
with the @Id annotation.
Different generation strategies can be used by the persistence provider to generate
the primary keys. The generation strategy can be one of TABLE,
SEQUENCE, IDENTITY, or AUTO.
With the IDENTITY strategy, the database adds an identity column and
takes care of the primary key generation. This is database-specific and
requires the database to support the IDENTITY column type. Hence this
strategy may not be portable across databases.
With the SEQUENCE strategy, the persistence provider uses a database
sequence that generates primary keys. This requires the database to
support sequences and therefore may not be portable across databases.
The TABLE strategy indicates that values are assigned based on an
underlying database table. Since the TABLE strategy is not based on any
database-specific features, it is portable across databases.
AUTO indicates that the persistence provider should pick an appropriate
strategy.
The SEQUENCE, IDENTITY, and AUTO strategies may not be portable across
databases. Also, keep in mind that the AUTO, SEQUENCE, and TABLE
strategies may not work well in a clustered environment. All three of
these strategies rely on the persistence provider to take part in
generating the key, which might cause data inconsistencies if the
provider is not capable of handling this across the cluster. With the
IDENTITY strategy, the database generates the key without the provider,
so data inconsistency is not expected in a clustered environment.
The id generator that we specify as an element of the @GeneratedValue
annotation has to be declared by the @SequenceGenerator annotation or
the @TableGenerator annotation, depending on whether the generation
strategy is SEQUENCE or TABLE. Both of these annotations can be declared
either on the entity class or on the primary key field. The scope of the
generator is the persistence unit.
Let's see how primary keys are auto-generated with the TABLE generation
strategy. With the table generator, the persistence provider uses a
database table to store the generated Ids. Each entity has a single row
in the table. The table has a primary key name column and a value
column. The SQL command to create the generator table would look like
this:
CREATE TABLE ID_GEN (
    GEN_KEY VARCHAR(10) NOT NULL,
    GEN_VALUE INTEGER NOT NULL,
    PRIMARY KEY (GEN_KEY)
);
A row has to be inserted in this table as a seed for the generator.
INSERT INTO ID_GEN VALUES('ITEM_ID',101);
Now if we have an Item table that has the item id as the primary key,
the generated Ids will start from 102. You have to refer to this table
in the @TableGenerator annotation in the persistent entity for Item. The
annotation would look like this:
@TableGenerator(name="ID_GEN",
                table="ID_GEN",
                pkColumnName="GEN_KEY",
                valueColumnName="GEN_VALUE",
                pkColumnValue="ITEM_ID",
                allocationSize=1)
@GeneratedValue(strategy=GenerationType.TABLE, generator="ID_GEN")
@Id
public int getItemID() {
    return itemID;
}
...
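The role of the ID_GEN row and of allocationSize can be illustrated with
a small plain-Java simulation. This is not how any particular
persistence provider is implemented: the class name TableIdAllocator,
the in-memory map standing in for the table, and the exact way the
stored value is advanced are all assumptions chosen to match the
description above, and real providers may interpret the stored value
differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical simulation of a TABLE-style id allocator.
// The map stands in for the ID_GEN table (GEN_KEY -> GEN_VALUE).
class TableIdAllocator {
    private final Map<String, Integer> idGenTable;
    private final String pkColumnValue;
    private final int allocationSize;
    private int next;          // next id to hand out from the reserved block
    private int lastReserved;  // last id of the reserved block

    TableIdAllocator(Map<String, Integer> idGenTable,
                     String pkColumnValue, int allocationSize) {
        this.idGenTable = idGenTable;
        this.pkColumnValue = pkColumnValue;
        this.allocationSize = allocationSize;
        // Start with an empty block so the first call reads the table.
        this.next = 1;
        this.lastReserved = 0;
    }

    // Returns the next id, touching the simulated table only when the
    // current block of reserved ids is used up.
    synchronized int nextId() {
        if (next > lastReserved) {
            // Assumes the seed row for pkColumnValue has been inserted.
            int stored = idGenTable.get(pkColumnValue);
            int updated = stored + allocationSize;
            idGenTable.put(pkColumnValue, updated);
            next = stored + 1;
            lastReserved = updated;
        }
        return next++;
    }
}
```

With the seed row ('ITEM_ID', 101) and an allocationSize of 1, the
simulation hands out 102 and then 103, updating the stored value on
every call. A larger allocationSize reserves several ids per table
update, which reduces database round trips at the cost of possible gaps
in the sequence.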
A common use case is when we have an existing database and we need to
auto-generate primary keys for the newly inserted records. This is
possible if the existing primary keys are of sequential type.
Maintaining Primary Keys
All entities must have a primary key, which is set when the entity is
persisted to the database for the first time. Applications should not
modify the primary key values of entities. The actual behavior when an
application tries to modify the value of a primary key is not defined.
If an application really needs to modify a primary key, it should first
remove the entity and then add it again with a new primary key, while
also ensuring referential integrity. In most cases, the application does
not need to modify the primary key. For example, when using
automatically generated ids there is no need for the application to ever
set the value. Since entities have an identity field (or primary key
class) which the application code should not modify, it is useful to put
some safeguards in the code to prevent the primary key from being
changed.
One way is to use Javadoc comments on the setPK(..) method or on the
class to tell users not to call the set method of the primary key and
not to access the primary key field directly to modify it. However,
documentation alone does not always work. The entity developer wants to
make it easy for other developers on the team to use the entity
properly, and documentation can be overlooked. For example, a web
developer writing code that calls the entities might accidentally modify
the primary key value on an entity.
Note that besides your application code accessing the entity fields and
methods, the container and persistence provider will also be accessing
the entity. The container will access the entity either through the
fields or through the property methods, depending on what the
application code specifies. The container/provider must access the
primary key value because it manages the interactions between the data
in the entities and the data in the database tables, keeps them in sync,
and does other work. The container will also be accessing the entity in
the case of automatically generated Ids, as it will be assigning the
values for the primary key. So whatever mechanism you use to try to
prevent the code from accidentally modifying the primary keys must take
into account that the container needs to access the primary key field
(or primary key class) of the entity.
Let's outline one strategy for developing an entity in such a way that
users of the entity can avoid accidentally trying to modify the primary
key. Here are the five steps involved in implementing this strategy:
- Use field-based annotations. By using field-based annotations, the
container/provider will access the entity values using field-based
access. If you use property-based access then you are forced to expose
getters and setters for all persistent fields, including the primary
key, but in this case you do not want to expose a setter for the primary
key.
- Use package scoping on fields. This makes the primary key field less
visible to other developers but still accessible to the container and
the persistence provider. (The Java Persistence specification also
allows persistent fields to be private or protected, so a stricter scope
is an option too.)
- Provide getter and setter methods for the other fields and maybe a
getter for the primary key field. These methods are accessible by
clients of the entity.
- Have a policy of developers accessing the entities only through getter
and setter methods.
- Make a private setPK(..) method on the entity for the primary key so it
can only be accessed within the class. This prevents a developer using
the entities in their code from calling the setPK(..) method and
modifying the value. Alternatively, you could just not provide a setter
method for the primary key.
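The steps above might be sketched as follows. The class below is
illustrative only: the name field, the constructors, and the choice to
omit the primary key setter entirely (rather than make it private) are
assumptions, and the JPA annotations are shown as comments so that the
sketch compiles without a persistence provider on the classpath.

```java
// Illustrative entity following the five steps. In a real application
// the class would be public and the commented annotations would be
// imported from javax.persistence and applied.
// @Entity
class Item {
    // Steps 1 and 2: the annotation sits on the field (field-based
    // access) and the field has package scope.
    // @Id
    String itemId;

    private String name;

    // No-arg constructor for the persistence provider.
    protected Item() {
    }

    // The primary key is supplied once, when the entity is created.
    public Item(String itemId, String name) {
        this.itemId = itemId;
        this.name = name;
    }

    // Step 3: a getter for the primary key, but no public setter.
    public String getItemId() {
        return itemId;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    // Step 5 (alternative form): no setItemId(..) method is provided,
    // so client code has no API for changing the key.
}
```

Client code that follows the policy in step 4 and works only through the
getters and setters can read the key with getItemId() but has no method
for changing it.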
© Sun Microsystems 2006. All of the material in The Java BluePrints
Solutions Catalog is copyright-protected and may not be published in
other works without express written permission from Sun Microsystems.