REST with Asynchronous Jersey and RXJava – Part 3

By | December 17, 2016

In this series of articles I will implement a RESTful web service using Spring Boot and Jersey. I will then modify the service to use asynchronous server-side request processing and RXJava. Load-tests will be run using Gatling before and after the modifications, in order to determine any improvements.

The first article in this series is available here and the second part here.

In this part, part three, I will examine the performance of the example program using the Gatling load-test developed in part two. Apart from the numbers and the excellent reports generated by Gatling, I will also use Java Mission Control to examine the example program during the load-tests.

Raise the Number of Allowed Files/Ports in the OS

In *nix OSes such as Ubuntu and OS X, there is a limit on the number of files and ports that may be open. Depending on how this limit is set, it may affect the result of load-tests. There is an excellent section in the Gatling documentation on OS Tuning and Open File Limits.

In my case, I have raised the maximum number of open file handles on both the computer on which the external database is run and the computer on which the REST service and the Gatling load-test are run.

Use an External Database

While load-testing the example program in preparation for this article, I noticed that running an embedded database, or even an external database on the same computer as the example program, consumes a significant amount of resources during the load-test.

After trying several alternatives in my somewhat limited machine park, I came to the conclusion that the best alternative is to run an external H2 database on my laptop and the load-tests on my stationary computer.

Install and Run the H2 Database

To install and run the H2 database, I did the following:

  • Download the H2 Database Engine.
    I downloaded version 1.4.193, which at the time of writing this article is the latest version available. I downloaded the All Platforms version that comes in a ZIP archive.
  • Unpack the H2 database to a suitable location.
  • In the bin directory in the H2 database directory, create a file named loadtesth2.sh with the following contents:

    Note that you may have to adapt the size of RAM allocated for the H2 database (4096m) depending on the size of your computer’s RAM. The name of the JAR file will also need to be modified if you have downloaded another version of the H2 database engine.
  • Make the custom H2 start-script executable.
  • Raise the number of open files and ports permitted in the system if this number is low.
    In a *nix OS such as Ubuntu or OS X, the number of permitted files and ports can be examined using the ulimit -n command. To set this for the database, the same command can be used followed by a number, for instance ulimit -n 65536
  • Start the H2 database.

    You should see output similar to this:

If you do not know which IP your “database server” has, you need to find that out. On Ubuntu, for instance, use the ifconfig command.

The H2 Database Web Console

Note that there was a message about a web console when starting the H2 database. This is a nice feature of the H2 database that allows us to inspect the data in the database using a browser. To examine the external database:

  • Open the URL http://[ip of external database]:8081 in a browser.
  • Use the following connection parameters in the login form that appears in your web browser:
    Driver class: org.h2.Driver
    JDBC URL: jdbc:h2:tcp://[ip of external database]:1521/./dbname
    User Name: sa
    Password: [empty]

    H2 Database web console login screen.

Modify the Example Program to Use an External Database

The following modifications to the example program are required in order for it to use the external H2 database instead of the embedded HSQLDB database:

  • Modify the application.properties file to have the contents below.
    Note that you need to replace the IP address 192.168.1.68 with the IP address of your “database server”.
  • In the Maven pom.xml of the example program, replace the HSQLDB dependency with this H2 DB dependency:

The above configuration in the application.properties file will cause the unit-tests to fail unless the external database is actually available at the datasource URL. To run the tests without the external database in place, comment out the property spring.datasource.url by placing a # at the start of the line in question. Don’t forget to remove the comment when you actually want to connect to the external database!

The example application should now be able to start up and connect to the external database without any errors.

Load-Testing

First of all, I will mention that I am running both the example program and the load-tests on one and the same computer. This will of course affect the result, but nevertheless I hope I will get a hint about any performance improvements.

Running a Load-Test

In order to become acquainted with the current load-testing setup, I will run one load-test on the REST example application.

  • Start the example program.
    As earlier, right-click the RestExampleApplication class and run it.
  • Launch Java Mission Control and attach to the example program’s JVM.

    Attach Java Mission Control to the example program.

  • Run the Gatling load-test in a terminal window.

Observe the display of Java Mission Control as the program runs. In the picture below I have run two 60-second load-tests with Gatling without restarting the example application.

Monitoring two load-tests of the example program in Java Mission Control.

We can see that the CPU and memory usage are moderate during the load-tests.
Note that the example application uses more memory during the second load-test. My suggestion is to run not only load-tests that focus on a high number of requests per second, but also load-tests that span a significantly longer period of time, hours if not days, and in which the load is close to the real load expected on the application in question. The latter type of load-test focuses on long-term memory use and will hopefully catch problems like memory leaks.

If you take a look in the console of the example program, you may see errors like this one after the load-test has ended:

Errors of this type are caused by the server, in my case the REST example program, not having had time to process a request before the client, Gatling in this example, closed the connection.

In the terminal window in which the Gatling load-test was launched, a text-only report will be printed after a load-test:

On the first row below Global Information, we can see that a total of 4645 requests were sent out by the Gatling load-test. Some rows down we see that the mean number of requests per second during the load-test was 76.148.
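
As a quick sanity check on these two figures, the total number of requests can be divided by the mean request rate to see which time span the mean covers. The sketch below assumes that the mean is simply the total number of requests divided by the elapsed seconds; the class and method names are made up for this illustration and are not part of Gatling or the example program.

```java
public class MeanRateCheck {

    /** Elapsed seconds implied by a request count and a mean requests-per-second figure. */
    static double impliedDurationSeconds(long totalRequests, double meanRequestsPerSecond) {
        return totalRequests / meanRequestsPerSecond;
    }

    public static void main(String[] args) {
        // Figures taken from the text-only Gatling report discussed above.
        double seconds = impliedDurationSeconds(4645, 76.148);
        System.out.println("Implied duration in seconds: " + seconds);
    }
}
```

The result is very close to 61 seconds, which fits a load-test of roughly 60 seconds.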

Looking at the numbers below the Response Time Distribution, we can see that 4% of the requests had a response time between 800 and 1200 ms.
What happened to the requests that I spoke of earlier that caused broken pipes? As far as I have seen, such requests are not counted by Gatling, probably because Gatling doesn’t have an opportunity to record a response before the load-test ends and thus there is no HTTP status or response time for these requests.

Finally I want to recommend restarting the application to be load-tested before each load-test, unless of course you are interested in looking at the long-term behaviour of your application. The restarts are in order to attempt to establish some kind of predictability in the load-tests and will also ensure that the database is cleared.

Load-Test Baseline

If I increase the parameters that specify the number of simulated users and number of requests per second in the load-test until there are errors in the REST example program console after the load-test and some distribution between the three response time distribution groups, I end up with the following values:

These values are not an exact science, so don’t spend too much time trying to get them “right”.

The final console output from the load-test run with these parameters on my computer:

I will focus on two graphs from the Gatling report: the response time distribution graph and the graph showing the number of responses per second.

The two graphs for the load-test baseline look like this:

Response time distribution of the baseline load-test of the unmodified version of the REST service.

Responses per second of the baseline load-test of the unmodified version of the REST service.

Load-Test With Stubbed Database

I know that the database is a bottleneck as far as the performance of my example application is concerned, but I did not know the magnitude of the impact. In order to determine the impact, I created a new base-class for the services which did not use a repository but just returned some data to simulate instantaneous storage and retrieval. In addition, I tweaked the load-test parameters and ended up with the following, after which no additional increase in performance was noted:

The two Gatling graphs that I have chosen to look at look like this for the load-test with the stubbed database:

Responses per second for the load-test of the REST example application with the stubbed database.

We can see that the response times are very short and that the number of responses per second reaches about 20,000 on my computer.
What is the value of this test with a stubbed database?
Well, it did confirm my suspicion that the database slows my application down significantly. I also learned that there seems to be some kind of upper limit at about 20K requests per second in my environment. Regardless of this, I will perform all the remaining load-tests with a real database since this will put a certain stress on my service layer, which will have to wait for the slower persistence layer.
Learning whether my application is able to cope with a slow persistence layer is, as far as I am concerned, one important reason for running the load-tests.

Introduce RXJava

In the quest for performance and beautiful code, the first step will be to use RXJava in the example program. We’ll do this by replacing the abstract base classes of services and resources with RXJava versions.

Add RXJava Dependency

As mentioned in the first article, I have chosen to use version 2. of RXJava. Adding RXJava to a project is as simple as adding one single dependency.

  • In the pom.xml file of the example program, add the RXJava dependency:

New Service Base Class

Instead of modifying the service base class, I have chosen to implement a new version.

  • Implement the AbstractServiceBaseRxJava as shown below:

Note that:

  • The return type of all methods has changed to Observable.
    An observable in the RXJava sense can be likened to a box in which we pack some code. The code won’t be executed until we, at some later point in time, open the box to see what is inside. Observables can be composed and also allow for synchronous or asynchronous execution – please refer to the RXJava wiki for more information.
    Later we will see how an observable is used on the client side.
  • The implementation of the methods has changed.
    To be honest, the amount of boilerplate code has increased. Let’s take a look at the save method before and after RXJava.

Concerning the new version of the save method, note that:

  • The saved entity is not returned but instead passed to the observable using the onNext method.
    An RXJava observable is an emitter of signals. Three types of signals can be emitted: a normal value is emitted by invoking the onNext method, an error is emitted by invoking the onError method, and the completion of the emitter is signaled by invoking the onComplete method.
    Thus the saved entity is emitted as a normal value from the observable.
  • After the saved entity has been emitted, the onComplete method is invoked on the observable.
    As above, this emits a signal that indicates that the emitter (observable) has completed doing what it is supposed to do and that there will be no further signals from the observable.
  • The RXJava version of the save method handles exceptions.
    In the old save method, we could rely on regular Java exceptions, with their advantages and drawbacks. In the RXJava version we want to emit an error signal instead of throwing an exception – in the catch block, the onError method is invoked on the observable with an exception as parameter.
  • In the catch block, onComplete is not invoked after having called onError on the observable.
    Calling onError terminates the observable and any further calls on it, such as onComplete, will not emit any signals.
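
The signal rules listed above can be illustrated without RXJava itself. The following is a minimal sketch in plain Java, not the code of the example program: RecordingEmitter is a hypothetical stand-in for an emitter that simply records the signals a subscriber would receive, and the repositorySave parameter is an assumed placeholder for the repository call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

public class EmitterSignalSketch {

    /** Hypothetical stand-in for an emitter; records the signals a subscriber would receive. */
    static final class RecordingEmitter<T> {
        private final List<String> signals = new ArrayList<>();
        private boolean terminated;

        void onNext(T value) {
            if (!terminated) signals.add("next:" + value);
        }

        void onError(Throwable error) {
            if (!terminated) {
                terminated = true; // an error is a terminal signal
                signals.add("error:" + error.getMessage());
            }
        }

        void onComplete() {
            if (!terminated) {
                terminated = true; // completion is a terminal signal
                signals.add("complete");
            }
        }

        List<String> signals() {
            return signals;
        }
    }

    /** Mirrors the described save flow: emit the saved entity, then complete; emit an error on failure. */
    static <T> List<String> save(T entity, UnaryOperator<T> repositorySave) {
        RecordingEmitter<T> emitter = new RecordingEmitter<>();
        try {
            emitter.onNext(repositorySave.apply(entity));
            emitter.onComplete();
        } catch (RuntimeException e) {
            emitter.onError(e); // no onComplete here: the error already terminated the emitter
        }
        return emitter.signals();
    }

    public static void main(String[] args) {
        System.out.println(save("circle", e -> e));
        System.out.println(save("circle", e -> { throw new IllegalStateException("database down"); }));
    }
}
```

A successful save yields a value signal followed by a completion signal, while a failing save yields a single error signal and nothing more.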

If we go back to the AbstractServiceBaseRxJava class:

  • In the find method, the found entity is verified not to be null before calling onNext on the observable.
    RXJava 2 does not allow emitting null values and thus we need to make sure that there indeed is an entity to emit.

Update Services’ Superclass

Introducing the new service base-class in the concrete service classes is as simple as replacing the superclass.

  • In the CircleService class, replace the superclass (was AbstractServiceBasePlain) with AbstractServiceBaseRxJava.
  • In the RectangleService class, replace the superclass with AbstractServiceBaseRxJava.
  • In the DrawingService class, replace the superclass with AbstractServiceBaseRxJava.
  • Delete the old superclass AbstractServiceBasePlain from the project.

New REST Resource Base Class

Updating the services is not sufficient; the REST resource base-class must also be updated since it is the client of the services. As with the service base class, I will create a new base-class for the REST resources.

  • Create the class RestResourceBaseRxJava implemented as follows:

Note that:

  • The instance variable mService is of the type AbstractServiceBaseRxJava.
  • In the getAll method, for instance, the previous construct that used the performServiceOperation method has been replaced by an RXJava invocation that subscribes to the observable produced by the service invocation.
    When subscribing to an observable, you may provide up to three callbacks; one for the next value emitted by the observable, one for errors and a final callback for completion of the observable.
  • All the methods that return a Response object use a container of the type AtomicReference<Response> to convey the object to be returned from the RXJava callbacks to the surrounding method.
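    One could supposedly use a local variable in the surrounding method and set its value from the RXJava callbacks, but Java requires local variables captured by a lambda or anonymous class to be final or effectively final, so my IDE would not let me. I therefore opted for the AtomicReference<Response> solution.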
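
The AtomicReference technique can be sketched in plain Java as follows. This is an illustration under assumptions, not the code of RestResourceBaseRxJava: the findGreeting method is a hypothetical stand-in for a service call that reports its outcome through three callbacks, and plain strings stand in for the Response objects.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

public class CallbackResultSketch {

    /** Hypothetical service call that reports its outcome through value, error and completion callbacks. */
    static void findGreeting(String name,
                             Consumer<String> onNext,
                             Consumer<Throwable> onError,
                             Runnable onComplete) {
        if (name == null || name.isEmpty()) {
            onError.accept(new IllegalArgumentException("name is required"));
            return; // an error is terminal, so completion is not signaled
        }
        onNext.accept("Hello, " + name);
        onComplete.run();
    }

    /** Conveys the result from the callbacks to the surrounding method using an AtomicReference. */
    static String respond(String name) {
        // A plain local variable could not be reassigned from inside the lambdas below.
        AtomicReference<String> response = new AtomicReference<>("no result");
        findGreeting(name,
                value -> response.set("200 " + value),
                error -> response.set("500 " + error.getMessage()),
                () -> { /* nothing to do on completion in this sketch */ });
        return response.get();
    }

    public static void main(String[] args) {
        System.out.println(respond("Ada"));
        System.out.println(respond(""));
    }
}
```

Note that this only works because the callbacks in the sketch run synchronously, before respond returns. If the callbacks ran on another thread, the surrounding method would have to wait for a result before reading the reference.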

Update REST Resources’ Superclass

As with the services, the superclass of the concrete REST resource classes also needs to be updated to the new REST resource base class that uses RXJava.

  • In the CircleResource class, replace the superclass (was RestResourceBasePlain) with RestResourceBaseRxJava.
  • In the RectangleResource class, replace the superclass with RestResourceBaseRxJava.
  • In the DrawingResource class, replace the superclass with RestResourceBaseRxJava.
  • Delete the old superclass RestResourceBasePlain from the project.

Run Tests

To verify that our modifications haven’t broken the program, we can now run all the tests. Remember that the tests are TestNG tests – in my IDE there is a special option for running all the TestNG tests in a project.

Alternatively you can run all the tests using Maven. In a terminal window, use the command:

All tests should pass.

Run the Load-Test

We are now ready to run the load-test on the new version of the REST service.

  • Start the example program.
    As earlier, right-click the RestExampleApplication class and run it.
  • Run the Gatling load-test in a terminal window.
    mvn gatling:test
  • Examine the Gatling load-test report.

My report written to the terminal window looks like this:

The response time distribution diagram from the graphical version of the report looks like this:

Response time distribution from load test of the RXJava version of the example program.

This is the diagram showing responses per second:

Responses per second from the load-test of the RXJava version of the example program.

RXJava With Stubbed Database

After having written the conclusion, I felt that this article would be incomplete if I did not also run a load-test with both RXJava and the stubbed database in place.

Here are the two graphs from the Gatling load-test report from that test:

Response time distribution graph from load-test of example program with RXJava and stubbed database.

Responses per second graph from load-test of example program with RXJava and stubbed database.

Conclusion

If we place the response time distribution graph from the baseline load-test in the background, in blue, and the same graph from the RXJava load-test on top of it, in purple, we can see that there are only minor differences between the two graphs.

Response time distribution graphs from the baseline (blue) and RXJava (purple) load-tests.

One conclusion that can be drawn from this is that the performance after having introduced RXJava is about the same as before: RXJava has introduced neither a performance degradation nor a performance improvement.
Given the earlier load-test with a stubbed database, I would like to rephrase the conclusion a bit:
Any performance improvement or degradation obtained by introducing RXJava in the example program is negligible in comparison with the performance impact caused by the database.

Having added the above section with the load-test of the RXJava version of the example program with a stubbed database, we can indeed see that RXJava seems to cause neither performance degradation nor performance improvement.

So what is there to learn from this?
If this article were about performance optimization of the example program, I would have made a big mistake by introducing RXJava instead of looking at the real bottleneck, which is the database.

RXJava seems to be a good framework in that it holds up well under load, given the simple scenarios in the example program, and does not introduce any performance degradation.
In the example program, the benefit from introducing RXJava is limited; somewhat better error handling and a higher degree of uniformity, as far as RXJava signal passing is concerned.
I do suspect that the example program is too trivial and/or of a type that does not reap large advantages from the use of RXJava.

In the next part of this series, asynchronous request processing with Jersey (JAX-RS) will be added to the example program and further load-tests run.

Happy coding!
