CLOSED-LOOP: Help needed: Mapping a bidirectional list with Hibernate

Tuesday, February 3, 2009

Help needed: Mapping a bidirectional list with Hibernate

I don't understand the behavior of Hibernate when mapping a bidirectional list. The SQL statements that Hibernate produces seem not optimal to me. Can somebody enlighten me?

The scenario is the following: I have a one-to-many parent-child relationship. I map this relationship with a bidirectional list.

According to the Hibernate Annotation Reference Guide (Chapter 2.4.6.2.3. Bidirectional association with indexed collections) the mapping should look like this:

@Entity
public class Parent {

    @Id  @GeneratedValue private long id;
    @Version  private int version;
    private String name;

    @OneToMany(cascade = CascadeType.ALL)
    @JoinColumn(name = "parent_id", nullable=false)
    @org.hibernate.annotations.IndexColumn(name = "parent_index")
    List<Child> children = new ArrayList<Child>();

	...

@Entity
public class Child {

    @Id @GeneratedValue private Long id;
    @Version private int version;
    private String name;

    @ManyToOne
    @JoinColumn(name = "parent_id", updatable = false, insertable = false, nullable=false)
    private Parent parent;

	...

But in this case Hibernate produces three SQL statements when persisting a parent with one child:

Hibernate: insert into Parent (name, version, id) values (?, ?, ?)
Hibernate: insert into Child (name, price, version, parent_id, parent_index, id) values (?, ?, ?, ?, ?, ?)
Hibernate: update Child set parent_id=?, parent_index=? where id=?

The third statement seems to be redundant, because parent_id and parent_index seem to be set already in the second statement.

When I change the mapping and repeat the attributes 'updatable = false, insertable = false' to the declaration of the @JoinColumn in the Parent like this:

@Entity
public class Parent {

    @Id  @GeneratedValue private long id;
    @Version  private int version;
    private String name;

    @OneToMany(cascade = CascadeType.ALL)
    @JoinColumn(name = "parent_id", updatable = false, insertable = false, nullable=false)
    @org.hibernate.annotations.IndexColumn(name = "parent_index")
    List<Child> children = new ArrayList<Child>();

	...

@Entity
public class Child {

    @Id @GeneratedValue private Long id;
    @Version private int version;
    private String name;

    @ManyToOne
    @JoinColumn(name = "parent_id", updatable = false, insertable = false, nullable=false)
    private Parent parent;

	...

...then Hibernate seems to produce much more optimized SQL:

Hibernate: insert into Parent (name, version, id) values (?, ?, ?)
Hibernate: insert into Child (name, price, version, parent_id, parent_index, id) values (?, ?, ?, ?, ?, ?)

The client code looks like this:

        EntityManagerFactory emf = Persistence.createEntityManagerFactory("test");
        EntityManager em = emf.createEntityManager();
        EntityTransaction tx = em.getTransaction();
        tx.begin();

        Parent newParent = new Parent();
        newParent.setName("Parent1");

        Child newChild = new Child();
        newChild.setName("Child1");

        newParent.getChildren().add(newChild);
        newChild.setParent(newParent);

        em.persist(newParent);

        em.flush();
        tx.commit();

I am using hibernate-entitymanager 3.4.0.GA.

What am I missing? Is the Hibernate Reference Guide not correct, or am I overlooking something?

Update 2009-01-06:
Ok, I was not reading the Annotations Reference Guide thoroughly enough.

In Chapter 2.2.5.3.2.1 it is stated clearly:

To map a bidirectional one to many, with the one-to-many side as the owning side, you have to remove the mappedBy element and set the many to one @JoinColumn as insertable and updatable to false. This solution is obviously not optimized and will produce some additional UPDATE statements.

It probably would not hurt to repeat this information in Chapter 2.4.6.2.3 .

But a question remains: If I repeat the @JoinColumn attributes 'updatable = false' and 'insertable = false' on the Parent (see code above) the additional update statements seem not to get produced... is this a legitimate workaround? Or does this result in another problem?

4 comments:

UnknownFebruary 3, 2009 at 1:09 AM
What you're overlooking is that this Hibernate stuff is total nonsense. So much effort to avoid having to write SQL, and for what? You waste a bunch of time sprinkling your source code with user-friendly annotations such as:

@org.hibernate.annotations.IndexColumn(name = "parent_index")

and

@OneToMany(cascade = CascadeType.ALL)

"Great!", you say. "It talks to the database for me! ... Why is this query taking eight minutes??"

So you cozy up to "hibernate statistics" or whatever for three hours to try to figure out where and why it decided to put a SQL query in an inner loop, and how to get it to stop doing that.

"Success!", you cry. "Only 71 queries!"

Then, if you're like me, you realize that if you'd done it by hand it'd be half as many queries, 4x as fast, and you'd have finished two and a half hours ago.
ReplyDelete
Replies
AnonymousFebruary 3, 2009 at 1:33 AM
@how.gauche, I hear this argument a lot, but I totally think you're misunderstanding the point of an ORM. Sure, for searching for content matching a single query, it can be rather annoying and inefficient. Where it shines is complex data, and the associations therein. For example, when you have something like a Question and an Answer, with flags attached to both of those, and those are associated with any given page on your website, it's really annoying to get all that data in a sql statement. The ORM does all that for you, putting them in nice little packages so you can use them anywhere else. It also decouples you between your code and the database. For some complicated queries, I agree you should bypass the ORM and use SQL, but using the ORM provides every level of convenience for both maintenance and creating new code. Another nicety is that it's easily unit-testable. It also provides a nice level 2 cache. How much longer can this paragraph go?

-Josh
ReplyDelete
Replies
UnknownFebruary 3, 2009 at 7:19 AM
I don't think I misunderstand the purpose of an ORM at all. The theory goes: I sprinkle my Java class with some goofy annotations and Hibernate auto-magically derives an interface for pickling and unpickling objects of that class to the database.

This is supposed to save you time and make it more likely that your data access layer has consistent and correct semantics. In reality, Hibernate annotations are more verbose and baroque than the SQL they're supposed to be replacing, and unless you spend four hours hand-tweaking it, the data layer ends up slow as fuck.
ReplyDelete
Replies
UnknownFebruary 3, 2009 at 3:12 PM
Can you use a bridging table/object would that help?

As for the whole ORM thing... personally, I'm partial to GORM. ORM may not solve all problems and may even create new ones. But, this particular problem is probably not all that bad either way.

You know, a highly skilled human can still write better binary than some compilers. The question is... is it worth it? Sometimes it is. That's why we have guys who write drivers for a living. I figure the ORM and SQL performance issue is the same deal. Sometimes it is better to crack open Hibernate's HQL and do the mapping yourself... sometimes it's better to do JDBC yourself. Sometimes it's better to use a search engine technology for persistence. It depends on your goals and requirements.
ReplyDelete
Replies

Add comment