Custom database functions using JPA, JPQL and EclipseLink

Sometimes it's useful to exploit some of the native functions provided by database vendors while using JPA and JPQL.

And although it's possible to execute native queries, you'll lose some of the JPA magic in the process.

Thankfully EclipseLink (and probably Hibernate) allows some degree of flexibility through special operators.

According to the official documentation, the FUNC operator allows a database function to be called from JPQL: it can call any database function not supported directly in JPQL, including user-defined or library-specific functions.


@Entity
public class Employee implements Serializable {

   @Id
   private Long id;
   private String firstname;
   private String lastname;

   // Setters - Getters
}

Below is a quick example of how to use FUNC to call Oracle's TO_CHAR function and convert all the ids in the query to Strings.

Let's say that, for some obscure reason, we would like the list of all the employee ids not as a list of java.lang.Long but as a list of java.lang.String.

We could write the query like so :



List employeesId = 
     em.createQuery("SELECT FUNC('TO_CHAR', e.id) FROM Employee e ").getResultList();

The usage is pretty straightforward: the first argument is the name of the native function, followed by the function's arguments.

More information about this can be found in EclipseLink's official documentation.

It's worth noting that FUNC will be replaced by the standard FUNCTION operator in JPA 2.1.
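Under JPA 2.1 the same query would then look like this (a sketch; the argument order stays the same) :

    List employeesId =
         em.createQuery("SELECT FUNCTION('TO_CHAR', e.id) FROM Employee e").getResultList();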

JPA EclipseLink : optimizing query performance on a large ResultSet

Recently I had to look for ways of optimizing a very slow database request on a very large ResultSet (more than 5000 results with several @OneToMany associations).

While reading EclipseLink's documentation I stumbled upon an interesting piece of information about handling large ResultSets: EclipseLink recommends using cursored streams.

According to the documentation : "Cursored streams provide the ability to read back a query result set from the database in manageable subsets, and to scroll through the result set stream."

The initial query would load practically the entire database into memory in one request (all the associations are marked with the QueryHints.LEFT_FETCH hint in order to fetch them eagerly in the initial request).

I will not go into the details of the process since that's not the goal of this post, but basically the program would query an Oracle database, transform every JPA entity and store it in a somewhat different form in a Mongo database.

The process was really long (more than 1 hour between recovering all the objects from the Oracle database and processing them into the Mongo database).

The DAO class

public CursoredStream getMyLargeResultset(){

   Query programsQuery = em.createNamedQuery("MyNamedQuery");
   // tell EclipseLink to return a CursoredStream instead of a List
   // (the hint name is also available as the QueryHints.CURSOR constant)
   programsQuery.setHint("eclipselink.cursor", true);
   return (CursoredStream) programsQuery.getSingleResult();
}

The client


public void synchronizeDatabases(){

    final int PAGE_SIZE = 50;

    // get the cursor
    CursoredStream cursor = myDao.getMyLargeResultset();

    // iterate through the cursor
    while (!cursor.atEnd()) {

        // get the next batch of objects and
        // cast it to the target entity (MyEntity stands in for the actual entity class)
        List<MyEntity> myEntities = (List<MyEntity>) (List<?>) cursor.next(PAGE_SIZE);
        processEntities(myEntities);
    }

    cursor.close();
}

By using this technique I was able to reduce the processing time of this operation by a factor of 10!

PS: I realize that the double cast on the List is not very pretty, and I could have taken a different approach using the next() method without arguments, but the existing processEntities() method accepted a List as a parameter and I wasn't allowed to modify it.
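For reference, a cast-free version of the loop body could simply copy each batch into a typed list (a sketch; MyEntity stands in for the actual entity class) :

    // copy the raw batch into a properly typed list
    List<MyEntity> myEntities = new ArrayList<MyEntity>();
    for (Object entity : cursor.next(PAGE_SIZE)) {
        myEntities.add((MyEntity) entity);
    }
    processEntities(myEntities);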



MongoDB $where clause to query array length

There is no direct way in MongoDB to return all the documents where an embedded array has at least X entries. Consider the following documents :

  {
    "name" : "Terry Brooks",
    "books" : [ 
              "The Sword of Shannara", 
               "The Elfstones of Shannara",
               "The Wishsong of Shannara"
              ]
  },
 {
    "name" : "Oscar Wilde",
    "books" : [ 
              "The Picture of Dorian Gray"
             
              ]
  }

Let's say that I want all the authors who have written more than one book. There is no direct way to do this in MongoDB; it needs to be done either with map-reduce or perhaps with the new aggregation framework, but you cannot combine the $gt and $size operators like so :
    db.AUTHORS.find({"books" : {$size : {$gt : 1}}});
It doesn't work: you won't get any error message, just an empty result. MongoDB does allow JavaScript evaluation through the $where operator; although it's significantly slower than the native operators, it's very flexible and a quick way of executing such a query without resorting to map-reduce or other means :
    db.AUTHORS.find({$where : "this.books.length > 1"});
But when this query was executed the following error kept coming up :
{
    "error": {
        "$err": "erroroninvocationof$wherefunction: JSError: TypeError: this.bookshasnopropertiesnofile_a: 0",
        "code": 10071
    }
}

The error is not very helpful (at least to me); as it turns out, the origin of the problem was that not all author documents in my database had the "books" array. So in order to run a length query on the "books" array, it's necessary to first ensure that the field exists :
    db.AUTHORS.find({"books" : {$exists: true}, $where : "this.books.length > 1"});

EclipseLink (JPA) Spring - No persistence exception translators exception

I recently stumbled upon a Spring exception while working with Spring Data JPA and attempting to deploy a WAR file on Glassfish. I have a JTA datasource defined in Glassfish which Spring uses (through JNDI) to instantiate my entity manager. The JPA configuration was pretty straightforward, but every time I tried to deploy the app on Glassfish I ran into this error :
org.springframework.beans.factory.BeanCreationException: 
Error creating bean with name 
'org.springframework.dao.annotation.PersistenceExceptionTranslationPostProcessor#0': 
Initialization of bean failed; 
nested exception is java.lang.IllegalStateException: 
No persistence exception translators found in bean factory.
Cannot perform exception translation.
Apparently, when using Hibernate there is an easy fix for this: declaring a HibernateExceptionTranslator bean in the Spring configuration file.
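Something like the following (a sketch; the exact package depends on your Spring/Hibernate versions) :

    <bean class="org.springframework.orm.hibernate3.HibernateExceptionTranslator"/>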

When using EclipseLink, though, this doesn't work... So I looked for implementations of the Spring interface :

org.springframework.dao.support.PersistenceExceptionTranslator 

And found out that there is no EclipseLinkExceptionTranslator (as there is for Hibernate), but there is an EclipseLinkJpaDialect :
   org.springframework.orm.jpa.vendor.EclipseLinkJpaDialect

which implements the interface. So I declared it as a bean in the Spring configuration file and was able to deploy the app on Glassfish.
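The bean declaration itself is a one-liner (a sketch, minus the rest of the configuration file) :

    <bean class="org.springframework.orm.jpa.vendor.EclipseLinkJpaDialect"/>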

REST load testing with siege

Siege is a pretty nifty Linux utility for HTTP/HTTPS load testing; it can be downloaded from the siege homepage.
Siege allows you to write a list of URLs in a file (one URL per line); the utility will then parse the file and execute the load tests according to your configuration file (.siegerc) or the command's arguments.
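Out of the box, a run over such a file looks something like this (a sketch; -b is benchmark mode, -c the number of concurrent users, -r the repetitions, -f the URL file) :

    siege -b -c 50 -r 1 -f urls.txt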

The only thing I do not like about siege is that I'd want stats per URL; when siege parses the URL file it writes a log file with global statistics for the whole siege session.

Something like this :

2012-09-21 12:03:52,   1813,     121.43,          19,       0.49,       14.93,        0.16,        7.38,    1813,       0

which to my taste is not very clear

While executing the tests siege prints out statistics that are not written to the siege log file (and in my case those are the more useful ones), but when you have multiple URLs you don't know which URL the printed stats belong to.

So I decided to handle the situation differently using awk and a bash script

Note that I'm no Linux expert, so the script could probably be written better, but here is how I broke it down :

I have 3 files :

  1. The urls.txt file, a text file containing the URLs I want to test :

    #URL1
    http://myURL1/param1/param2.json
    http://myURL1/param3/param2.json
    #URL2
    http://myURL2/param3/param2.json
    http://myURL2/param3/param1.json

  2. The awk script file (siege-benchmark.awk), which skips the comment lines and sieges every other line :

    !/#/{

       print "\n"
       print "**********************************************************"
       print "Testing URL : "$1
       print "**********************************************************"
       system ("echo Testing URL : "$1 " >> " SIEGE_OUTPUT " 2>&1")
       system ("siege " $1 " -v -b -r" REPS " -c" CONC " --mark=" $1 " --log=" SIEGE_LOG_FILE " >> " SIEGE_OUTPUT " 2>&1")
       print "\n"

    }

  3. The bash script containing all the configuration and calling the awk script :

    #! /bin/sh
    URL_FILE=./urls.txt # file containing the URLs to test
    USER=$(whoami)
    NOW=$(date +"%d-%m-%Y_%H-%M-%S")
    PWD=$(pwd)
    SIEGE_LOG_FILE=$PWD"/log_"$NOW".log"  # siege's log file
    SIEGE_OUTPUT=$PWD"/output_"$NOW".log" # where siege's output will be redirected
    CONC=50 # number of concurrent users
    REPS=1  # number of repetitions

    echo "loading URLs from the file : "$URL_FILE
    echo "writing the siege log into the file : "$SIEGE_LOG_FILE

    awk -f siege-benchmark.awk -v CONC=$CONC REPS=$REPS SIEGE_LOG_FILE=$SIEGE_LOG_FILE SIEGE_OUTPUT=$SIEGE_OUTPUT $URL_FILE

Now, depending on how your .siegerc file is configured, the output can be a bit different; I have verbose mode off and benchmark mode on. Below is an excerpt from my output file :

******************************************
Testing URL : http://myURL1/param1/param2.json
******************************************
** SIEGE 2.72
** Preparing 50 concurrent users for battle.
The server is now under siege...


Transactions:                     50 hits
Availability:                 100.00 %
Elapsed time:                   0.08 secs
Data transferred:               0.00 MB
Response time:                  0.04 secs
Transaction rate:             625.00 trans/sec
Throughput:                     0.03 MB/sec
Concurrency:                   24.62
Successful transactions:          50
Failed transactions:               0
Longest transaction:            0.07
Shortest transaction:           0.01

MongoDB & Spring Data remove Elements from Arrays

In this blog post I will show how to execute a Mongo query using Spring Data in order to "pull" multiple items from an object's embedded collection. The model :

@Document(collection="Book")
 public class Book{

      private String ISBN;
      private String bookTitle;
      private List< String > categories;


  //SETTERS - GETTERS
  }

  public class Category{
    
      private String categoryId;
      private String categoryName;
   }

Typical Mongo documents would look something like :

   {
      "isbn" : "11111111", 
      "bookTitle" : "My Book Title", 
      "categories" :[
          {
            "categoryId" : "cat1",
            "categoryName": "My Category 1"
          },
          {
            "categoryId" : "cat2",
            "categoryName": "My Category 2"
          }
        ]
     }

    {
      "isbn" : "22222222222", 
      "bookTitle" : "My Book Title2", 
      "categories" :[
          {
            "categoryId" : "cat1",
            "categoryName": "My Category 1"
          },
          {
            "categoryId" : "cat3",
            "categoryName": "My Category 3"
          }
        ]
     }

     {
      "isbn" : "333333333", 
      "bookTitle" : "My Book Title3", 
      "categories" :[
          {
            "categoryId" : "cat2",
            "categoryName": "My Category 2"
          },
          {
            "categoryId" : "cat3",
            "categoryName": "My Category 3"
          }
        ]
     }


Now let's assume that cat1 and cat2 are no longer valid. It would be nice if we could remove these categories from all the books in a single Mongo update query. Here is how to do it using Spring Data.

@Component
public class MyDAO{

    // Mongo template injected through Spring
    @Autowired
    private MongoOperations mongoTemplate;

    public void deleteCategories(String[] categories){

        // find query :
        // find all the books containing the categories passed as parameter
        Query findQuery =
            Query.query(Criteria.where("categories.categoryId").in(Arrays.asList(categories)));

        // build the update :
        // pull every embedded category whose categoryId is in the array
        DBObject pullUpdate =
            BasicDBObjectBuilder.start()
                .add("categoryId",
                     BasicDBObjectBuilder.start().add("$in", categories).get())
                .get();

        Update update = new Update().pull("categories", pullUpdate);

        // execute the update on every matching document
        mongoTemplate.updateMulti(findQuery, update, Book.class);
    }
}



Calling the DAO from a quick demo class :

    public class SpringDataDemo{

        public static void main(String[] args){

            ClassPathXmlApplicationContext context =
                new ClassPathXmlApplicationContext("classpath:spring-beans.xml");
            MyDAO myDao = (MyDAO) context.getBean("myDao");
            // delete categories "cat1" and "cat2"
            myDao.deleteCategories(new String[]{"cat1", "cat2"});
        }
    }


Now the categories cat1 and cat2 will be removed from all the books in the collection.
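For reference, the update built by the DAO above should be equivalent to the following shell command (a sketch of what Spring Data sends to the server) :

    db.Book.update(
        { "categories.categoryId" : { $in : ["cat1", "cat2"] } },
        { $pull : { "categories" : { "categoryId" : { $in : ["cat1", "cat2"] } } } },
        false, // upsert
        true   // multi
    );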

Resolving Spring Data's error message "Cannot resolve the name to a(n) 'element declaration'"

While working on a project combining Spring Data JPA and Spring Data MongoDB, I ran into a lot of trouble when I introduced the Spring Data MongoDB namespace into my Spring XML configuration.
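The declaration was along these lines (a sketch; the namespace and schema location are the standard Spring Data MongoDB ones, the rest of the beans file is omitted) :

    <beans xmlns="http://www.springframework.org/schema/beans"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns:mongo="http://www.springframework.org/schema/data/mongo"
           xsi:schemaLocation="
               http://www.springframework.org/schema/beans
               http://www.springframework.org/schema/beans/spring-beans.xsd
               http://www.springframework.org/schema/data/mongo
               http://www.springframework.org/schema/data/mongo/spring-mongo.xsd">
        ...
    </beans>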

The error message :

Cannot resolve the name to a(n) 'element declaration'

I spent some time looking for the cause, and then I realised that Spring Data MongoDB and Spring Data JPA had dependencies on different versions of Spring Data Commons, and that mismatch was causing the trouble.

So if you get the same error, just check your Maven dependency hierarchy, or go check the POM files of the libraries.
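One way to align both modules on a single Spring Data Commons version is a Maven dependencyManagement entry; a sketch (the artifact name and version are placeholders to adapt to your own dependency tree) :

    <dependencyManagement>
      <dependencies>
        <!-- pin the Spring Data Commons version shared by both modules -->
        <dependency>
          <groupId>org.springframework.data</groupId>
          <artifactId>spring-data-commons</artifactId>
          <version>x.y.z.RELEASE</version>
        </dependency>
      </dependencies>
    </dependencyManagement>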

You can check out the POM files in Spring's repo :

Spring Maven repository

Untar a multi-volume (split) tar file

One way to untar multi-volume (split) tar files is to use a combination of the cat and tar commands :
   cat splitFile1 splitFile2 | tar -xf -
You can also add a v to your tar command to make it verbose.
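For example, extracting verbosely, and one way such split files are produced in the first place (the file names are illustrative; split appends its own suffixes such as aa, ab) :

    cat splitFile1 splitFile2 | tar -xvf -
    tar -cf - myDirectory | split -b 100m - splitFile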

Bash - create subfolder in each folder

I recently had to add a subfolder to each folder inside a directory that contained more than 100 folders; it was a bit tedious to create the subfolder inside each one of them by hand. Thankfully this can easily be done with a bash one-liner :


cd /opt/myfolder
for f in *; do mkdir "$f/myFolder"; done


Find out linux distribution

This post is mostly for me, as I keep forgetting how to do it.

Type the following command in the terminal in order to find the Linux distribution :

     cat /etc/issue
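If /etc/issue has been customized, a couple of alternatives usually work too (availability depends on the distribution) :

     lsb_release -a
     cat /etc/*-release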

MyFaces Trinidad invalid_lhs_in_assignment using Partial Page Rendering (PPR)

I stumbled upon a weird error last week. The following error was showing up in some of my pages after a PPR request :
["Error ", 
ReferenceError
arguments: Array[0]
get message: function getter() { [native code] }
get stack: function getter() { [native code] }
set message: function setter() { [native code] }
set stack: function setter() { [native code] }
type: "invalid_lhs_in_assignment"
__proto__: Error
, " delivering XML request status changed to ", function (requestEvent)
{
  if (requestEvent.getStatus() == TrXMLRequestEvent.STATUS_COMPLETE)
  {
    var statusCode = requestEvent.getResponseStatusCode();

    // The server might not return successfully, for example if an
    // exception is thrown.  When that happens, a non-200 (OK) status
    // code is returned as part of the HTTP prototcol.
    if (statusCode == 200)
    {
      _pprStopBlocking(window);

      if (requestEvent.isPprResponse())
      {
        var responseDocument = requestEvent.getResponseXML();

        // Though not yet supported, Nokia browser sometimes fails to get
        // XML content. (Currently being investigated.) When that happens,
        // the function should simply return without calling
        // _handlePprResponse rather than crushing the function with null
        // pointer reference.
        // This is a temporary workaround and should be revisited when
        // Nokia browser is officially supported.
        if (responseDocument != null)
        {
          this._handlePprResponse(responseDocument.documentElement);
        }
      }
      else
      {
        // Should log some warning that we got an invalid response
      }
    }
    else if (statusCode >= 400)
    {
      // The RequestQueue logs these for us, so
      // we don't need to take action here.  IMO, that's probably
      // wrong - we should do the handling here
      _pprStopBlocking(window);
    }

  }
}]

In some pages this caused everything to break (nothing was being updated), while in other pages my PPR requests were working properly despite the error message. What I found out from this very eloquent error message is that the error occurs when a component has a "-" in its ID.
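For example, a component declared like this (an illustrative sketch, not the actual page) :

    <tr:inputText id="my-input" value="#{myBean.value}"/>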
So I removed the "-" from the ID.
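Applied to the sketch above, the fixed component becomes :

    <tr:inputText id="myInput" value="#{myBean.value}"/>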
And everything went back to normal!

OSX show used ports or listening applications with their PID

On OSX you can display applications listening on a given port using the lsof command; the commands described below will show listening applications...
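A sketch of the kind of invocation in question (not necessarily the exact commands from the original post) :

    # all TCP listeners, with their PIDs, without name resolution
    sudo lsof -nP -iTCP -sTCP:LISTEN
    # whatever is listening on a specific port
    sudo lsof -nP -i :8080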