Recently I had to look for ways of optimizing a very slow database request on a very large ResultSet (more than 5000 results with several @OneToMany associations).
While reading EclipseLink's documentation I stumbled upon an interesting piece of information regarding handling large ResultSets. EclipseLink recommends using Streamed Cursors when handling large resultsets
According to the documentation : Cursored streams provide the ability to read back a query result set from the database in manageable subsets, and to scroll through the result set stream.
The initial query would load practically the entire database into memory in one request (all the associations are marked with QueryHint.LEFT_FETCH in order to eagerly fetch the associations upon the initial request).
I will not go into details regarding the process since it's irrelevant to this post not the goal of this post, but basically the program would query an Oracle Database transform every JPA Entity and store it in a somewhat different object in a Mongo Database
The process was really long (more than 1 hour between recovering all the objects from the Oracle Database and processing them into the Mongo Database
The DAO class
public CursoredStream getMyLargeResultset(){ Query programsQuery = em.createNamedQuery("MyNamedQuery"); // tell the EntityManager to return a StreamedCursor programsQuery.setHint("eclipselink.cursor", true); return (CursoredStream) programsQuery.getSingleResult(); }
The client
public void synchronizeDatabases(){ final int PAGE_SIZE = 50; //get the cursor CursoredStream cursor= myDao.getMyLargeResultset(); // iterate through curor cursor(!cursor.atEnd()){ // get the next batch of objects and // cast it to the target entity ListmyEntities = (List )(List)cursor.next(PAGE_SIZE); processEntities(myEntities); } cursor.close(); }
By using this technique I was able to reduce by a factor of 10 the process time of this operation!
PS: I realize that the double casting on the List is not very pretty and I could use a different approach by using the next() method without arguments but the existent method processEntities() accepted a List
Documentation sources
No comments:
Post a Comment