EJB QL




EJB QL

Now that you have a basic understanding of how to work with Query objects, you can learn what features are available to you for creating your own EJB QL queries. EJB QL is expressed in terms of the abstract persistence schema of an entity: its abstract schema name, basic properties, and relationship properties. EJB QL uses the abstract schema names to identify beans, the basic properties to specify values, and the relationship properties to navigate across relationships.

To discuss EJB QL, we will use the relationships among the Customer, Address, CreditCard, Cruise, Ship, Reservation, and Cabin entities defined in Chapter 7. Figure is a class diagram that shows the direction and cardinality (multiplicity) of the relationships among these beans.

Titan Cruises class diagram


Abstract Schema Names

The abstract schema name can be defined by metadata or it can default to a specific value. It defaults to the unqualified name of the entity bean class if the name( ) attribute is not specified when declaring the @Entity annotation.

In the following example, the @Entity.name( ) attribute is not specified on the Customer bean class, so Customer is used to reference the entity within EJB QL calls:

package com.titan.domain;

@Entity
public class Customer {...}

entityManager.createQuery("SELECT c FROM Customer AS c");

In the following example, since the @Entity.name( ) attribute is defined, you would reference Customer entities in EJB QL as Cust:

package com.titan.domain;

@Entity(name="Cust")
public class Customer {...}

entityManager.createQuery("SELECT c FROM Cust AS c");

Simple Queries

The simplest EJB QL statement has no WHERE clause and only one abstract schema type. For example, you could define a query method to select all Customer beans:

SELECT OBJECT( c ) FROM Customer AS c

The FROM clause determines which entity bean types will be included in the SELECT statement (i.e., it provides the scope of the select). In this case, the FROM clause declares the type to be Customer, which is the abstract schema name of the Customer entity. The AS c part of the clause assigns c as the identifier of the Customer entity. This is similar to SQL, which allows an identifier to be associated with a table. Identifiers can be any length and follow the same rules that are applied to field names in the Java programming language. However, identifiers cannot be the same as existing abstract schema name values. In addition, identification variable names are not case-sensitive, so an identifier of customer would be in conflict with an abstract schema name of Customer. For example, the following statement is illegal because Customer is the abstract schema name of the Customer EJB:

SELECT OBJECT 
( customer ) FROM Customer AS customer

The AS operator is optional, but it is used in this book to help make the EJB QL statements clearer. The following two statements are equivalent:

SELECT OBJECT(c) FROM Customer AS c

SELECT c FROM Customer c

The SELECT clause determines the type of any values that are returned. In this case, the statement returns the Customer entity bean, as indicated by the c identifier.

The OBJECT( ) operator is optional and is a relic requirement of the EJB 2.1 spec. It is there for backward compatibility.

Identifiers cannot be EJB QL reserved words. In Java Persistence, the following words are reserved: SELECT, FROM, WHERE, UPDATE, DELETE, JOIN, OUTER, INNER, GROUP, BY, HAVING, FETCH, DISTINCT, OBJECT, NULL, TRUE, FALSE, NOT, AND, OR, BETWEEN, LIKE, IN, AS, UNKNOWN, EMPTY, MEMBER, OF, IS, AVG, MAX, MIN, SUM COUNT, ORDER ASC, DESC, MOD, UPPER, LOWER, TRIM, POSITION, CHARACTER_LENGTH, CHAR_LENGTH, BIT_LENGTH, CURRENT_TIME, CURRENT_DATE, CURRENT_TIMESTAMP , and NEW . It's a good practice to avoid all SQL reserved words, because you never know which ones will be used by future versions of EJB QL. You can find more information in the appendix of SQL in a Nutshell (O'Reilly).

Selecting Entity and Relationship Properties

EJB QL allows SELECT clauses to return any number of basic or relationship properties. For example, we can define a simple SELECT statement to return the first and last names of all of Titan Cruises' customers:

SELECT c.firstName, 
 c.lastName FROM Customer AS c

The SELECT clause uses a simple path to select the Customer entity's firstName and lastName properties as the return type. The persistence property names are identified by the access type of your entity bean class, regardless of whether you've applied your mapping annotations on a get or set method or on the member fields of the class.

If you use get or set methods to specify your persistent properties, then the property name is extracted from the method name. The get part of the method name is removed, and the first character of the remaining string is lowercase:

@Entity
public class Customer {
   private ind id;
   private String first; 

   private String last;

   @Id
   public int getId( ) { return id; }
   public String getFirstName( ) { return first; }
   public String getLastName( ) { return first; }

In this example, we are using get and set methods to define our persistent properties. The SELECT clause would look like this:

SELECT c.firstName, c.lastName FROM Customer AS c

If you are mapping your entity directly on the member fields of your bean class, then the field name is used in the SELECT clause:

@Entity
public class Customer {
   @Id private int id;
   private String first;
   private String last;
}

Our example EJB QL statement would need to be rewritten with the changed property names first and last:

SELECT c.first, c.last FROM Customer AS c

When a query returns more than one item, you must use the Query.getResultList( ) method. If the SELECT clause queries more than one column or entity, the results are aggregated in an object array (Object[]) in the java.util.List returned by geTResultList( ). The following code shows how to access the returned results:

Query query = manager.createQuery(
                "SELECT c.firstName, c.lastName FROM Customer AS c");
List results = query.getResultList( );
Iterator it = results.iterator( );
while (it.hasNext( )) {
   Object[] result = (Object[])it.next( );
   String first = (String)result[0];
   String last = (String)result[1];
}

You can also use single-valued relationship field types in simple select statements. The following EJB QL statement selects all CreditCards from their related Customer entities:

SELECT c.creditCard FROM Customer AS c

In this case, the EJB QL statement uses a path to navigate from the Customers to their creditCard relationship fields. The creditCard identifier is obtained by the same access type rules for basic properties.

Paths can be as long as required. It's common to use paths that navigate over one or more relationship fields to end at either a basic or a relationship property. For example, the following EJB QL statement selects all the city fields of all the Addresses in each Customer:

SELECT c.address.city  FROM Customer AS c

In this case, the path uses the abstract schema name of the Customer, the Customer's address relationship field, and the Address's city field.

To illustrate more complex paths, we'll need to expand the class diagram. Figure shows that the CreditCard is related to a CreditCompany that has its own Address.

Expanded class diagram for CreditCard


Using these relationships, we can specify a more complex path that navigates from the Customer to the CreditCompany to the Address. Here's an EJB QL statement that selects the addresses of all the credit card companies used by Titan's customers:

SELECT c.creditCard.creditCompany.address FROM Customer AS c

The EJB QL statement could also navigate all the way to the Address bean's fields. The following statement selects all the cities in which the credit card companies that distribute credit cards used by Titan's customers are based:

SELECT c.creditCard.creditCompany.address.city FROM Customer AS c

Note that these EJB QL statements return address relationship properties or city properties only for those credit card companies responsible for cards owned by Titan's customers. The address information of any credit card companies whose cards are not currently used by Titan's customers won't be included in the results.

Paths cannot navigate beyond persistent properties. For example, imagine that the Address uses a ZipCode class as its zip property and this property is stored as a byte stream in the database:

public class ZipCode implements java.io.Serializable {
    public int mainCode;
    public int codeSuffix;
    ...
}

@Entity
public class Address {
   private ZipCode zip;
}

You can't navigate to one of the ZipCode class's instance fields:

// this is illegal
SELECT c.address.zip.mainCode FROM Customer AS c

Of course, you could make the ZipCode class @Embeddable . If you did this, then you could obtain properties of the ZipCode class:

public class ZipCode implements java.io.Serializable {
    public int mainCode;
    public int codeSuffix;
    ...
}

@Entity
public class Address {
   @Embedded private ZipCode zip;
}

This EJB QL would now be legal:

// @Embedded makes this legal now
SELECT c.address.zip.mainCode FROM Customer AS c

It's illegal to navigate across a collection-based relationship field. The following EJB QL statement is illegal, even though the path ends in a single-type relationship field:

// this is illegal
SELECT c.reservations.cruise FROM Customer AS c

If you think about it, this limitation makes sense. You can't use a navigation operator (.) in Java to access elements of a java.util.Collection object. For example, if getreservations( ) returns a java.util.Collection type, this statement is illegal:

// this is illegal in the Java programming language
customer.getReservations( ).getCruise( );

Referencing the elements of a collection-based relationship field is possible, but it requires the use of an IN or JOIN operator and an identification assignment in the FROM clause.

Constructor Expressions

One of the most powerful features of EJB QL is the ability to specify a constructor within the SELECT clause that can allocate plain Java objects (nonentities) and pass in columns you select into that constructor. For example, let's say we want to aggregate first and last names from our Customer entity into a plain Java object called Name:

public class Name {
   private String first;
   private String last;

   public Name(String first, String last) {
      this.first = first;
      this.last = last;
   }

   public String getFirst( ) { return this.first; }
   public String getLast( ) { return this.last; }
}

We can actually have our query return a list of Name classes instead of a plain list of strings. We do this by calling the constructor of Name directly within our query:

SELECT new com.titan.domain.Name(c.firstName, 
 c.lastName)
FROM Customer c

The Query object will automatically allocate an instance of Name for each row returned, passing in the firstName and lastName columns as arguments to the Name's constructor. This feature is incredibly useful for generating typed reports and can save you a lot of typing.

The IN Operator and INNER JOIN

Many relationships between entity beans are collection-based, and being able to access and select beans from these relationships is important. We've seen that it is illegal to select elements directly from a collection-based relationship. To overcome this limitation, EJB QL introduces the IN operator, which allows an identifier to represent individual elements in a collection-based relationship field.

The following query uses the IN operator to select the elements from a collection-based relationship. It returns all the reservations of all the customers:

SELECT r
FROM Customer AS c,  IN( c.reservations ) r

The IN operator assigns the individual elements in the reservations property to the r identifier. Once we have an identifier to represent the individual elements of the collection, we can reference them directly and even select them in the EJB QL statement. We can also use the element identifier in path expressions. For example, the following statement selects every cruise for which Titan's customers have made reservations:

SELECT r.cruise
FROM Customer AS c, IN( c.reservations ) r

The identifiers assigned in the FROM clause are evaluated from left to right. Once you declare an identifier, you can use it in subsequent declarations in the FROM clause. The c identifier, which was declared first, was subsequently used in the IN operator to define the r identifier.

This query can also be expressed as an INNER JOIN:

SELECT r.cruise
FROM Customer c INNER JOIN 
 c.reservations r

The INNER JOIN syntax parallels the SQL language much better and is more intuitive for developers coming from the relational world.

Identification chains can become very long. The following statement uses two IN operators to navigate two collection-based relationships and a single relationship. While not necessarily useful, this statement demonstrates how a query can use IN operators across many relationships.

SELECT cbn.ship
FROM Customer AS c, IN ( c.reservations ) r,
IN( r.cabins ) cbn

Alternatively, again, the INNER JOIN syntax could be used:

SELECT cbn.ship
FROM Customer c INNER JOIN c.reservations r
INNER JOIN r.cabins cbn

These queries select all ships for which customers have reservations. The INNER keyword is actually optional, so the previous query could be rewritten as:

SELECT cbn.ship
FROM Customer c JOIN c.reservations r
JOIN r.cabins cbn

LEFT JOIN

The LEFT JOIN syntax enables retrieval of a set of entities where matching values in the join statement may not exist. For values that do not exist, a null value is placed in the result set.

For example, let's say we want to generate a report with a customer's name and all the customer's phone numbers. Some customers may not have specified a phone number, but we still want to list their names. We would use a LEFT JOIN to acquire all of this information, including customers with no phone numbers:

SELECT c.firstName, c.lastName, p.number From Customer c
LEFT JOIN c.phoneNumbers p

If there were three customers in our system, and Bill Burke did not provide any phone numbers, the return values might look like this:

David Ortiz 617-555-0900
David Ortiz 617-555-9999
Trot Nixon 781-555-2323
Bill Burke null

The previous query can also be expressed as a LEFT OUTER JOIN . This is just syntax sugar to parallel SQL 92.

SELECT c.firstName, c.lastName, p.number From Customer c
LEFT OUTER JOIN c.phoneNumbers p

Fetch Joins

The JOIN FETCH syntax allows you to preload a returned entity's relationships even if the relationship property has a FetchType of LAZY . For example, let's say we have defined our customer's one-to-many relationship to Phone as follows:

@OneToMany(fetch=FetchType.LAZY)
public Collection<Phone> getPhones( ) { return phones; }

If we want to print out all customer information including their phone numbers, we would usually just query for all customers and then traverse the getPhones( ) method inside a for loop:

1  Query query = manager.createQuery("SELECT c FROM Customer c");
2  List results = query.getResultList( );
3  Iterator it = results.iterator( );
4  while (it.hasNext( )) {
5    Customer c = (Customer)it.next( );
6    System.out.print(c.getFirstName( ) + " " + c.getLastName( ));
7    for (Phone p : c.getPhoneNumbers( )) {
8    System.out.print(p.getNumber( ) + " ");
9    }
10   System.out.println("");
11 }

There are performance problems with the preceding code. Because the Phone relationship is annotated as being lazily loaded in the Customer bean class, the Phone collection will not be initialized when we do the initial query at Line 1. When getPhonesNumbers( ) is executed at Line 7, the persistence engine has to do an additional query to get the Phone entities associated with the customer. This is called the N + 1 problem, as we have to do N extra queries beyond our initial query. When tuning database applications, it is always important to reduce the number of round trips made to the database as much as possible. This is where the JOIN FETCH syntax comes into play. Let's modify our query to preload the Phone association:

SELECT c FROM Customer c LEFT JOIN FETCH c.phones

Using LEFT JOIN FETCH will additionally preload the Phone association. This can have a dramatic effect on performance because instead of N + 1 queries, only one query is made to the database.

Using DISTINCT

The DISTINCT keyword ensures that the query does not return duplicates. For example, the following query finds all customers with reservations. This query will return duplicates:

SELECT res FROM Reservation AS res, IN (res.customers) cust

If a customer has more than one reservation, there will be duplicate references to that customer in the result. Using the DISTINCT keyword ensures that each customer is represented only once in the result:

SELECT DISTINCT cust FROM Reservation AS res,
IN (res.customers) cust

The WHERE Clause and Literals

You can use literal values to narrow the scope of the elements selected. This is accomplished through the WHERE clause, which behaves in much the same way as the WHERE clause in SQL.

For example, you can define an EJB QL statement that selects all the Customer entities that use a specific brand of credit card. The literal in this case is a String literal. Literal strings are enclosed by single quotes. Literal values that include a single quote, like the restaurant name Wendy's, use two single quotes to escape the quote: Wendy''s. The following statement returns customers that use Capital One credit cards. (If you don't want to bother with such details, use a query parameter; the job will be done for you by the query API.)

SELECT c
FROM Customer AS c
WHERE c.creditCard.creditCompany.name = 'Capital One'

Path expressions in the WHERE clause are used in the same way as in the SELECT clause. When making comparisons with a literal, the path expression must evaluate to a basic property; you can't compare a relationship field with a literal.

In addition to literal strings, literals can be exact numeric values (long types) and approximate numeric values (double types). Exact numeric literal values are expressed using the Java integer literal syntax (321, -8932, +22). Approximate numeric literal values are expressed using Java floating-point literal syntax in scientific (5E3, -8.932E5) or decimal (5.234, 38282.2) notation. For example, the following EJB QL statement selects all the ships that weigh 100,000 metric tons:

SELECT s
FROM Ship AS s
WHERE s.tonnage = 100000.00

Boolean literal values use trUE and FALSE. Here's an EJB QL statement that selects all the customers who have good credit:

SELECT c
FROM Customer AS c
WHERE c.hasGoodCredit = TRUE

The WHERE Clause and Operator Precedence

The WHERE clause is composed of conditional expressions that reduce the scope of the query and limit the number of items selected. Several conditional and logical operators can be used in expressions; they are listed here in order of precedence:

  • Navigation operator (.)

  • Arithmetic operators: +, - (unary); *, / (multiplication and division); +, - (addition and subtraction)

  • Comparison operators: =, >, >=, <, <=, <> (not equal), LIKE, BETWEEN, IN, IS NULL, IS EMPTY, MEMBER OF

  • Logical operators: NOT, AND, OR

The WHERE Clause and Arithmetic Operators

The arithmetic operators allow a query to perform arithmetic in the process of doing a comparison. Arithmetic operators can be used only in the WHERE clause, not in the SELECT clause.

The following EJB QL statement returns references to all the Reservation EJBs that will be charged a port tax of more than $300:

SELECT r
FROM Reservation AS r
WHERE (r.amountPaid * .01)  > 300.00

The rules applied to arithmetic operations are the same as those used in the Java programming language, where numbers are widened, or promoted, in the process of performing a calculation. For example, multiplying a double and an int value requires that the int first be promoted to a double value. (The result will always be that of the widest type used in the calculation, so multiplying an int and a double results in a double value.)

String, boolean, and entity object types cannot be used in arithmetic operations. For example, using the addition operator with two String values is considered an illegal operation. There is a special function for concatenating String values, covered later in this chapter in the section titled "Functional expressions in the WHERE clause."

The WHERE Clause and Logical Operators

Logical operators such as AND, OR, and NOT operate the same way in EJB QL as their corresponding logical operators in SQL.

Logical operators evaluate only Boolean expressions, so each operand (i.e., each side of the expression) must evaluate to true, false, or NULL. Logical operators have the lowest precedence so that all the expressions can be evaluated before they are applied.

The AND and OR operators don't behave like their Java language counterparts, && and ||. EJB QL does not specify whether the righthand operands are evaluated conditionally. For example, the && operator in Java evaluates its righthand operand only if the lefthand operand is true. Similarly, the || logical operator evaluates the righthand operand only if the lefthand operand is false. We can't make the same assumption for the AND and OR operators in EJB QL. Whether these operators evaluate righthand operands depends on the native query language into which the statements are translated. It's best to assume that both operands are evaluated on all logical operators.

NOT simply reverses the Boolean result of its operand; expressions that evaluate to the Boolean value of true become false, and vice versa

The WHERE Clause and Comparison Symbols

Comparison operators, which use the symbols =, >, >=, <, <=, and <>, should be familiar to you. The following statement selects all the Ship entities whose tonnage field is greater than or equal to 80,000 tons but less than or equal to 130,000 tons:

SELECT s
FROM Ship AS s
WHERE s.tonnage >= 80000.00 AND s.tonnage <= 130000.00

Only the = and <> (equals and not equals) operators may be used on boolean and entity object identifiers. The greater-than and less-than symbols (>, >=, <, <=) can be used on numeric values as well as strings. However, the semantics of these operations are not defined by the Java Persistence specification. Is character case (upper or lower) important? Does leading and trailing whitespace matter? Issues like these affect the ordering of string values. In order for EJB QL to maintain its status as an abstraction of native query languages, it cannot dictate String ordering, because native query languages may have very different ordering rules. In fact, even different relational database vendors vary on the question of String ordering, which makes it all but impossible to standardize ordering even for SQL "compliant" databases.

Of course, this is all academic if you plan on using the same database well into the future. In such a case, the best thing to do is to examine the documentation for the database you are using to find out how it orders strings in comparisons. This tells you exactly how your EJB QL comparisons will work.

The WHERE Clause and Equality Semantics

While it is legal to compare an exact numeric value (short, int, long) to an approximate numeric value (double, float), all other equality comparisons must compare the same types. You cannot, for example, compare a String value of 123 to the Integer literal 123. However, you can compare two String types for equality.

You can compare numeric values for which the rules of numeric promotion apply. For example, a short may be compared to an int, an int to a long, etc. Java Persistence also states that primitives may be compared to primitive wrapper typesthe rules of numeric promotion apply.

In older versions of the spec, String type comparisons had to match exactly, character for character. EJB 2.1 dropped this requirement, making the evaluation of equality between String types more ambiguous. This continued in Java Persistence. Again, this ambiguity arises from the differences between kinds of databases (relational versus object-oriented versus file), as well as differences between vendors of relational databases. Consult your vendor's documentation to determine exactly how String equality comparisons are evaluated.

You can also compare entity objects for equality, but these too must be of the same type. To be more specific, they must both be entity object references to beans from the same deployment. As an example, the following query finds all the Reservation entities made by a specific Customer. It takes a Customer entity as a parameter:

SELECT r
FROM Reservation r, IN ( r.customers ) AS cust
WHERE  cust = :specificCustomer

Once it's determined that the bean is the correct type, the actual comparison is performed on the beans' primary keys. If they have the same primary key, they are considered equal.

You may use java.util.Date objects in equality comparisons. See the earlier section titled "Date Parameters."

The WHERE Clause and BETWEEN

The BETWEEN clause is an inclusive operator specifying a range of values. In this example, we use it to select all ships weighing between 80,000 and 130,000 tons:

SELECT s
FROM Ship AS s
WHERE s.tonnage BETWEEN 80000.00 AND 130000.00

The BETWEEN clause may be used only on numeric primitives (byte, short, int, long, double, float) and their corresponding java.lang.Number types (Byte, Short, Integer, etc.). It cannot be used on String, boolean, or entity object references.

Using the NOT logical operator in conjunction with BETWEEN excludes the range specified. For example, the following EJB QL statement selects all the ships that weigh less than 80,000 tons or more than 130,000 tons but excludes everything in between:

SELECT s
FROM Ship AS s
WHERE s.tonnage NOT BETWEEN 80000.00 AND 130000.00

The net effect of this query is the same as if it had been executed with comparison symbols:

SELECT s
FROM Ship AS s
WHERE s.tonnage < 80000.00 OR s.tonnage > 130000.00

The WHERE Clause and IN

The IN conditional operator used in the WHERE clause is not the same as the IN operator used in the FROM clause (that's why the JOIN keyword in the FROM clause should be preferred over the IN keyword for collection navigation). In the WHERE clause, IN tests for membership in a list of literal values. For example, the following EJB QL statement uses the IN operator to select all the customers who reside in a specific set of states:

SELECT c
FROM Customer AS c
WHERE c.address.state  IN ('FL', 'TX', 'MI', 'WI', 'MN')

Applying the NOT operator to this expression reverses the selection, excluding all customers who reside in the list of states:

SELECT c
FROM Customer AS c
WHERE c.address.state NOT IN ('FL', 'TX', 'MI', 'WI', 'MN')

If the field tested is null, the value of the expression is "unknown," which means it cannot be predicted.

The IN operator can be used with operands that evaluate to either string or numeric values. For example, the following EJB QL statement uses the IN operator to select all cabins on deck levels 1, 3, 5, and 7:

SELECT cab
FROM Cabin AS cab
WHERE cab.deckLevel IN (1,3,5,7)

The IN operator can also be used with input parameters. For example, the following query selects all the customers who live in the designated states:

SELECT c
FROM Customer AS c
WHERE c.address.state IN ( ?1, ?2, ?3, 'WI', 'MN')

In this case, the input parameters (?1, ?2, and ?3) are combined with string literals ('WI' and 'MN') to show that mixing literal and input parameters is allowed, providing they are "like" types.

The WHERE Clause and IS NULL

The IS NULL comparison operator allows you to test whether a path expression is null. For example, the following EJB QL statement selects all the customers who do not have an address:

SELECT c
FROM Customer AS c
WHERE c.address IS NULL

Using the NOT logical operator, we can reverse the results of this query, selecting all the customers who do have an address:

SELECT c
FROM Customer AS c
WHERE c.address IS NOT NULL

Path expressions are composed using "inner join" semantics. If an entity has a null relationship field, any query that uses that field as part of a path expression eliminates that entity from consideration. For example, if the Customer entity representing "John Smith" has a null value for its address relationship field, then the "John Smith" Customer entity won't be included in the result set for the following query:

SELECT c FROM Customer AS c
WHERE c.address.state = 'TX'
AND c.lastName = 'Smith' AND c.firstName = 'John'

This seems obvious at first, but stating it explicitly helps eliminate much of the ambiguity associated with null relationship fields.

The NULL comparison operator can also be used to test input parameters. In this case, NULL is usually combined with the NOT operator to ensure that an input parameter is not a null value. For example, this query can be used to test for null input parameters. The EJB QL statement first checks that the city and state input parameters are not null and then uses them in comparison operations:

SELECT c FROM Customer AS c
WHERE :city IS NOT NULL AND :state IS NOT NULL
AND c.address.state = :state
AND c.address.city = :city

In this case, if either of the input parameters is a null value, the query returns an empty List , avoiding the possibility of UNKNOWN results from null input parameters. Your Java code should do these null checks up front to avoid an unnecessary database roundtrip.

If the results of a query include a null relationship or a basic field, the results must include null values. For example, the following query selects the addresses of customers whose last name is Smith:

SELECT c.address FROM Customer AS c
WHERE c.lastName = 'Smith'

If the Customer entity representing "John Smith" has a null value for its address relationship field, the previous query returns a List that includes a null valuethe null represents the address relationship field of "John Smith"in addition to a bunch of Address entity references. You can eliminate null values by including the NOT NULL operator in the query, as shown here:

SELECT c.address.city FROM Customer AS c
WHERE c.address.city NOT NULL AND c.address.state = 'FL'

The WHERE Clause and IS EMPTY

The IS EMPTY operator allows the query to test whether a collection-based relationship is empty. Remember from Chapter 7 that a collection-based relationship will never be null. If a collection-based relationship field has no elements, it returns an empty Collection or Set.

Testing whether a collection-based relationship is empty has the same purpose as testing whether a single relationship field or basic field is null: it can be used to limit the scope of the query and items selected. For example, the following query selects all the cruises that have not booked any reservations:

SELECT crs FROM Cruise AS crs
WHERE crs.reservations IS EMPTY

The NOT operator reverses the result of IS EMPTY. The following query selects all the cruises that have at least one reservation:

SELECT crs FROM Cruise AS crs
WHERE crs.reservations IS NOT EMPTY

It is illegal to use IS EMPTY against collection-based relationships that have been assigned an identifier in the FROM clause:

// illegal query
SELECT r
FROM Reservation AS r INNER JOIN 
 r.customers AS c
WHERE
r.customers IS NOT EMPTY AND
c.address.city = 'Boston'

While this query appears to be good insurance against UNKNOWN results, it's not. It's illegal because the IS EMPTY operator cannot be used on a collection-based relationship identified within an INNER JOIN. Because the relationship is specified in the INNER JOIN clause, only those Reservation entities that have a nonempty customers field will be included in the query; any Reservation entity that has an empty relationship field is already excluded because its customers elements cannot be assigned the c identifier.

The WHERE Clause and MEMBER OF

The MEMBER OF operator is a powerful tool for determining whether an entity is a member of a specific collection-based relationship. The following query determines whether a particular Customer entity (specified by the input parameter) is a member of any of the Reservation/Customer relationships:

SELECT crs
FROM Cruise AS crs, IN (crs.reservations) AS res, Customer AS cust
WHERE
cust = :myCustomer
  AND
cust MEMBER OF res.customers

Applying the NOT operator to MEMBER OF has the reverse effect, selecting all the cruises on which the specified customer does not have a reservation:

SELECT crs
FROM Cruise AS crs, IN (crs.reservations) AS res, Customer AS cust
WHERE
cust = :myCustomer
  AND
cust NOT MEMBER OF res.customers

Checking whether an entity is a member of an empty collection always returns false.

The WHERE Clause and LIKE

The LIKE comparison operator allows the query to select String type fields that match a specified pattern. For example, the following EJB QL statement selects all the customers with hyphenated names, like "Monson-Haefel" and "Berners-Lee":

SELECT OBJECT( c ) FROM Customer AS c
WHERE c.lastName LIKE '%-%'

You can use two special characters when establishing a comparison pattern: % (percent) stands for any sequence of characters and _ (underscore) stands for any single character. You can use these characters at any location within a string pattern. If a % or _ actually occurs in the string, you can escape it with the \ character. The NOT logical operator reverses the evaluation so that matching patterns are excluded. The following examples show how the LIKE clause evaluates String type fields:


phone.number LIKE '617%'

True for "617-322-4151"

False for "415-222-3523"


cabin.name LIKE 'Suite _100'

True for "Suite A100"

False for "Suite A233"


phone.number NOT LIKE '608%'

True for "415-222-3523"

False for "608-233-8484"


someField.underscored LIKE '\_%'

True for "_xyz"

False for "abc"


someField.percentage LIKE '\%%'

True for "% XYZ"

False for "ABC"

The LIKE operator can also be used with input parameters:

SELECT c FROM Customer AS c
WHERE c.lastName LIKE :param

Functional Expressions

EJB QL has numerous functions that you can use to process strings and numeric values.

Functional expressions in the WHERE clause

EJB QL has seven functional expressions that allow for simple String manipulation and three functional expressions for basic numeric operations. The String functions are:


LOWER(String)

Converts a string to lowercase.


UPPER(String)

Converts a string to uppercase.


TRIM([[LEADING | TRAILING | BOTH] [trim_char] FROM] String)

Allows you to trim a specified character from the beginning (LEADING), end (trAILING), or both (BOTH). If you do not specify a trim character, the space character will be assumed.


CONCAT(String1, String2)

Returns the String that results from concatenating String1 and String2.


LENGTH(String)

Returns an int indicating the length of the string.


LOCATE(String1, String2 [, start])

Returns an int indicating the position at which String1 is found within String2. If it's present, start indicates the character position in String2 at which the search should start. Support for the start parameter is optional; some containers will support it, and others will not. Don't use it if you want to ensure that the query is portable.


SUBSTRING(String1, start, length)

Returns the String consisting of length characters taken from String1, starting at the position given by start.

The start and length parameters indicate positions in a String as integer values. You can use these expressions in the WHERE clause to refine the scope of the items selected. Here's how the LOCATE and LENGTH functions might be used:

SELECT c
FROM Customer AS c
WHERE
LENGTH(c.lastName) > 6
  AND
LOCATE( c.lastName, 'Monson') > -1

This statement selects all the customers with Monson somewhere in their last names but specifies that the names must be longer than six characters. Therefore, "Monson-Haefel" and "Monson-Ares" would evaluate to true, but "Monson" would return false because it has only six characters.

The arithmetic functions in EJB QL may be applied to primitive as well as corresponding primitive wrapper types:


ABS(number)

Returns the absolute value of a number (int, float, or double)


SQRT(double)

Returns the square root of a double


MOD(int, int)

Returns the remainder for the first parameter divided by the second (i.e., MOD(7, 5) is equal to 2)

Functions returning dates and times

EJB QL has three functions that can return you the current date, time, and timestamp: CURRENT_DATE, CURRENT_TIME, and CURRENT_TIMESTAMP. Here's an example of searching for reservations made on the current date:

SELECT res FROM Reservation res WHERE res.date = CURRENT_DATE

Aggregate functions in the SELECT clause

Aggregate functions are used with queries that return a collection of values. They are fairly simple to understand and can be handy, especially the COUNT( ) function.

COUNT (identifier or path expression)

This function returns the number of items in the query's final result set. The return type is a java.lang.Long, depending on whether it is the return type of the query method. For example, the following query provides a count of all the customers who live in Wisconsin:

SELECT COUNT( c )
FROM Customers AS c
WHERE c.address.state = 'WI'

The COUNT( ) function can be used with an identifier, in which case it always counts entities, or with path expressions, in which case it counts either CMR fields or CMP fields. For example, the following statement provides a count of all the Zip codes that start with the numbers 554:

SELECT  COUNT(c.address.zip)
FROM Customers AS c
WHERE c.address.zip LIKE '554%'

In some cases, queries that count a path expression have a corresponding query that can be used to count an identifier. For example, the result of the following query, which counts Customers instead of the zip field, is equivalent to the previous query:

SELECT COUNT( c )
FROM Customers AS c
WHERE c.address.zip LIKE '554%'

MAX( path expression), MIN( path expression)

These functions can be used to find the largest or smallest value from a collection of any type of field. They cannot be used with identifiers or paths that terminate in a relationship field. The result type will be the type of field that is being evaluated. For example, the following query returns the highest price paid for a reservation:

SELECT MAX( r.amountPaid )
FROM Reservation AS r

The MAX( ) and MIN( ) functions can be applied to any valid value, including primitive types, strings, and even serializable objects. The result of applying the MAX( ) and MIN( ) functions to serializable objects is not specified, because there is no standard way to determine which serializable object is greater than or lesser than another one.

The result of applying the MAX( ) and MIN( ) functions to a String field depends on the underlying data store. This has to do with the inherent problems associated with String type comparisons.

AVG( numeric ), SUM( numeric)

The AVG( ) and SUM( ) functions can be applied only to path expressions that terminate in a numeric primitive field (byte, long, float, etc.) or in one of their corresponding numeric wrappers (Byte, Long, Float, etc.). The result of a query that uses the SUM( ) function has the same type as the numeric type it's evaluating. The result type of the AVG( ) function is a java.lang.Double, depending on whether it is used in the return type of the SELECT method.

For example, the following query uses the SUM( ) function to get the total amount paid by all customers for a specific cruise (specified by the input parameter):

SELECT SUM( r.amountPaid)
FROM Cruise c join c.reservations r
WHERE  c = :cr

DISTINCT, nulls, and empty arguments

The DISTINCT operator can be used with any of the aggregate functions to eliminate duplicate values. The following query uses the DISTINCT operator to count the number of different Zip codes that match the pattern specified:

SELECT DISTINCT COUNT(c.address.zip)
FROM Customers AS c
WHERE c.address.zip LIKE '554%'

The DISTINCT operator first eliminates duplicate Zip codes; if 100 customers live in the same area with the same Zip code their Zip code is counted only once. After the duplicates have been eliminated, the COUNT( ) function counts the number of items left.

Any field with a null value is automatically eliminated from the result set operated on by the aggregate functions. The COUNT( ) function also ignores values with null values. The aggregate functions AVG( ), SUM( ), MAX( ), and MIN( ) return null when evaluating an empty collection. For example, the following query attempts to obtain the average price paid by customers for a specific cruise:

SELECT AVG( r.amountPaid)
FROM Cruise As c JOIN c.reservations r
WHERE c = :myCruise

If the cruise specified by the input parameter has no reservations, the collection on which the AVG( ) function operates is empty (there are no reservations and therefore no amounts paid).

The COUNT( ) function returns 0 (zero) when the argument it evaluates is an empty collection. If the following query is evaluated on a cruise with no reservations, the result is 0 (zero) because the argument is an empty collection:

SELECT COUNT( r )
FROM Cruise AS c, IN( c.reservations ) AS r
WHERE c = ?1

The ORDER BY Clause

The ORDER BY clause allows you to specify the order of the entities in the collection returned by a query. The semantics of the ORDER BY clause are basically the same as in SQL. For example, we can construct a simple query that uses the ORDER BY clause to return an alphabetical list of all of Titan Cruises' customers:

SELECT c
FROM Customers AS c
ORDER BY c.lastName

This might return a Collection of Customer entities in the following order (assume that their last and first names are printed to output):

Aares, John
Astro, Linda
Brooks, Hank
.
.
Xerces, Karen
Zastro, William

You can use the ORDER BY clause with or without the WHERE clause. For example, we can refine the previous query by listing only those U.S. customers who reside in Boston:

SELECT c
FROM Customers AS c
WHERE c.address.city = 'Boston' AND c.address.state = 'MA'
ORDER BY c.lastName

The default order of an item listed in the ORDER BY clause is always ascending, which means that the lesser values are listed first and the greater values last. You can explicitly specify the order as ascending or descending by using the keywords ASC and DESC. The default is ASC. Null elements will be placed on top or at the bottom of the query result depending on the underlying database. Here's a statement that lists customers in reverse (descending) order:

SELECT c
FROM Customers AS c
ORDER BY c.lastName DESC

The results of this query are:

Zastro, William
Xerces, Karen
.
.
Brooks, Hank
Astro, Linda
Aares, John

You can specify multiple order-by items. For example, you can sort customers by lastName in ascending order and firstName in descending order:

SELECT c
FROM Customers AS c
ORDER BY c.lastName ASC, c.firstName DESC

If you have five Customer entities with the lastName equal to Brooks, this query sorts the results as follows:

Brooks, William
Brooks, Henry
Brooks, Hank
Brooks, Ben
Brooks, Andy

Although the fields used in the ORDER BY clause must be basic fields, the value selected can be an entity identifier, a relationship field, or a basic field. For example, the following query returns an ordered list of all Zip codes:

SELECT addr.zip
FROM Address AS addr
ORDER BY addr.zip

The following query returns all the Address entities for customers named Smith, ordered by their Zip code:

SELECT c.address
FOR Customer AS c
WHERE c.lastName = 'Smith'
ORDER BY c.address.zip

You must be careful which basic fields you use in the ORDER BY clause. If the query selects a collection of entities, then the ORDER BY clause can be used with only basic fields of the entity type that is selected. The following query is illegal, because the basic field used in the ORDER BY clause is not a field of the entity type selected:

// Illegal EJB QL
SELECT c
FROM Customer AS c
ORDER BY c.address.city

Because the city field is not a direct field of the Customer entity, you cannot use it in the ORDER BY clause.

A similar restriction applies to results. The field used in the ORDER BY clause must also be in the SELECT clause. The following query is illegal, because the field identified in the SELECT clause is not the same as the one used in the ORDER BY clause:

SELECT c.address.city
FROM Customer AS c
ORDER BY c.address.state

In the previous query, we wanted a list of all the cities ordered by their state. Unfortunately, this is illegal. You can't order by the state field if you are not selecting the state field.

GROUP BY and HAVING

The GROUP BY and HAVING clauses are commonly used to apply stricter organization to a query and to narrow the results for aggregate functions. The GROUP BY clause is usually used in combination with aggregate functions, because it allows you to cluster data by category.

Let's say you want to do a report to find out how many reservations each particular cruise had. You would want to use the COUNT function to count each reservation, but how would you group the calculation of the count on a per-cruise basis? This is what the GROUP BY syntax allows you to do. Here's a query that returns the cruise name and a count of each reservation per cruise:

SELECT cr.name, COUNT (res) FROM Cruise cr
LEFT JOIN 
 cr.reservations res
GROUP BY cr.name

The GROUP BY clause must specify one of the columns that you are returning in the query. Because we are using a LEFT JOIN, cruises that do not have any reservations will be returned with a count of zero. If you wanted to exclude cruises from the report that did not have any reservations, you would use an INNER JOIN instead.

The GROUP BY syntax is even more interesting if we combine it with a constructor expression. Let's populate a list of ReservationSummary instances. ReservationSummary is a helper class comprised of the cruise name, the number of reservations, and how much money was collected:

public class ReservationSummary {
   public String cruise;
   public int numReservations;
   public double cashflow;

   public ReservationSummary(String c, int num, double cash) {
     this.cruise = c;
     this.numReservations = num;
     this.cashflow = cash;
   }
}

We will invoke the constructor directly within our query:

SELECT new ReservationSummary(cr.name, COUNT(res), SUM(res.amountPaid))
FROM Cruise cr
LEFT JOIN cr.reservations res
GROUP BY cr.name

The HAVING clause is used with a GROUP BY clause and acts as a filter, restricting the final output. The HAVING clause employs aggregate functional expressions using only the identifiers used in the SELECT clause. You can restrict the GROUP BY result by using the HAVING syntax. Let's restrict our report to show only those cruises with more than 10 reservations:

SELECT cr.name, COUNT (res) FROM Cruise cr
JOIN cr.reservations res
GROUP BY cr.name
HAVING count(res) > 10

The same rules that govern SELECT clauses govern HAVING clauses. Only grouped properties may appear outside of any function that you use, though.

Subqueries

Subqueries are SELECT statements embedded in another query. EJB QL supports subqueries in WHERE and HAVING clauses. Subqueries are very useful when normal mechanisms for narrowing your search cannot obtain the data you desire. Here's an example of finding the count of all reservations whose amount paid is greater than the average amount paid for all reservations:

SELECT COUNT(res)
FROM Reservation res
WHERE res.amountPaid > (SELECT avg(r.amountPaid) FROM Reservation r)

If you look carefully, you can see that this example can actually be broken up into two separate queries. You could execute one query that found the average amount paid and then pass this value into the second query to find all amounts paid that are greater than the average. For performance reasons, it is better to submit one query, as you can avoid the overhead of an extra network call to the database. It is also possible that the database could optimize this large query.

You can also reference identifiers in the FROM clause of the outer query of your subquery. For example, let's say we want to find all cruises that made more than $100,000. The query would look like this:

FROM Cruise cr
WHERE 100000 < (
   SELECT SUM(res.amountPaid) FROM cr.reservations res)
)

The subquery in this example references the reservation association of the cruise specified in the outer query's FROM clause.

ALL, ANY, SOME

When a subquery returns multiple rows, it is possible to quantify the results with the ALL, ANY, and SOME expressions.

The ALL operator returns TRue if all the things in the subquery match the conditional expression. For example, maybe we want to list the cruises for which every reservation has a down payment:

FROM Cruise cr
WHERE 0 < ALL (
   SELECT res.amountPaid from cr.reservations res
)

The ANY operator returns true if anything in the subquery matches the conditional expression. For example, maybe we want to find all cruises that have at least one reservation without a down payment:

FROM Cruise cr
WHERE 0 = ANY (SELECT res.amountPaid from cr.reservations res);

SOME is a synonym of ANY, and the two are syntactically equivalent. You can use a NOT expression to obtain the same query:

FROM Cruise cr
WHERE 0 < NOT ALL (SELECT res.amountPaid from cr.reservations res)

EXISTS

The EXISTS operator returns true if the subquery result consists of one or more values. If no values are returned by the subquery, then false is returned. We could rewrite our query to search for cruises for which some customers had not paid their balance due:

FROM Cruise cr
WHERE EXISTS (SELECT res FROM cr.reservations WHERE res.amountPaid = 0)

Bulk UPDATE and DELETE

Java Persistence has the ability to perform bulk UPDATE and DELETE operations. This can save you a lot of typing. For example, let's say we want to give a $10 credit across the board to any customer named Bill Burke. We can do the following bulk UPDATE:

UPDATE Reservation res SET res.amountPaid = (res.amountPaid + 10)
WHERE EXISTS (
   SELECT c FROM res.customers c
   WHERE c.firstName = 'Bill' AND c.lastName='Burke'
)

An example of DELETE could be that we want to remove all reservations made by Bill Burke:

DELETE FROM Reservation res
WHERE EXISTS (
   SELECT c FROM res.customers c
   WHERE c.firstName = 'Bill' AND c.lastName='Burke'
)

Be very careful how you use bulk UPDATE and DELETE. It is possible, depending on the vendor implementation, to create inconsistencies between the database and entities that are already being managed by the current persistence context. Vendor implementations are required only to execute the update or delete directly on the database. They do not have to modify the state of any currently managed entity. For this reason, it is recommended that you do these operations within their own transaction or at the beginning of a transaction (before any entities are accessed that might be affected by these bulk operations). Alternatively, executing EntityManager.flush( ) and EntityManager.clear( ) before executing a bulk operation will keep you safe.