Wednesday, 30 October 2013

Getting Started with Hazelcast

In July I wrote a blog introducing Erlang to Java developers, highlighting some of the similarities and differences between the two languages. The Erlang virtual machine has a number of impressive built-in features, one of which is that Erlang VMs are location independent and can talk to each other. This means that data can be synchronised between VMs by writing very few lines of code. This is really good news if you have a networked cluster of servers all doing the same thing.

You could argue that there's something lacking in the JVM if it can't even perform the most basic interprocess communication; however, Java takes the opposite view: it has a basic VM and then layers different services on top as and when required. Whether this is right is a matter of opinion and I'll leave it as a subject for a future blog, because it seems that the Hazelcast Guys have solved the problem of JVMs talking to each other, which is the point of this blog.

So, what is Hazelcast?


The Hazelcast press release goes something like this: "Hazelcast (www.hazelcast.com) is reinventing in-memory data grid through open source. Hazelcast provides a drop-in library that any Java developer can include in minutes to enable them to build elegantly simple mission-critical, transactional, and terascale in-memory applications".

So, what does that really mean?


Okay, so that's just marketing/PR bumf. What is Hazelcast… in real life? The answer can be succinctly given using code. Imagine you're writing an application and you need a Map<String,String>, and when you're in production you'll have multiple instances of your app in a cluster. Then writing the following code:

    HazelcastInstance instance = Hazelcast.newHazelcastInstance();
    loggedOnUsers = instance.getMap("Users");

…means that data added to your map by one instance of your application is available to all the other instances of your application [2].

There are a few points that you can deduce from this. Firstly, Hazelcast nodes are 'masterless', which means that it isn't a client-server system. There is a cluster leader, by default the oldest member of the cluster, which manages how data is spread across the system; however, if that node goes down, then the next oldest takes over.
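To make the 'oldest member leads' idea concrete, here's a minimal sketch in plain Java. It is not Hazelcast code and doesn't claim to mirror how Hazelcast implements this internally; the class name OldestMemberLeads and the node names are made up purely for illustration. It simply keeps members in the order they joined and treats the first surviving one as the leader.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: models "the oldest member leads" with a list kept in
// join order. NOT Hazelcast code; all names here are hypothetical.
final class OldestMemberLeads {

  // Members in the order they joined, so index 0 is always the oldest.
  private final List<String> membersInJoinOrder = new ArrayList<>();

  void join(String member) {
    membersInJoinOrder.add(member);
  }

  void leave(String member) {
    membersInJoinOrder.remove(member);
  }

  // The leader is the oldest surviving member, or null for an empty cluster.
  String leader() {
    return membersInJoinOrder.isEmpty() ? null : membersInJoinOrder.get(0);
  }

  public static void main(String[] args) {
    OldestMemberLeads cluster = new OldestMemberLeads();
    cluster.join("node-A");
    cluster.join("node-B");
    cluster.join("node-C");
    System.out.println(cluster.leader()); // node-A

    cluster.leave("node-A");              // the leader goes down
    System.out.println(cluster.leader()); // node-B
  }
}
```

The point to take away is that no node is special: leadership falls out of join order, so any member can end up in charge.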

Having a bunch of distributed Maps, Lists, Queues etc. means that everything is held in memory. If one node in your cluster dies, then you're okay, there's no loss of data; however, if a number of nodes die at the same time, then you're in trouble and you'll get data loss, as the system won't have time to rebalance itself. It also goes without saying that if the whole cluster dies, then you're in big trouble.
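The difference between losing one node and losing several at once can be sketched with a toy model. The assumptions here are mine: each entry lives on one 'owner' node plus exactly one backup node, and the backup is simply the next node along. Real Hazelcast partitioning is more sophisticated than this, and BackupSketch and its methods are hypothetical names, not part of any library.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy model, NOT Hazelcast's algorithm: every key has one owner node and
// one backup node (assumption: a single backup copy per entry).
final class BackupSketch {

  private final int nodeCount;

  BackupSketch(int nodeCount) {
    if (nodeCount < 2) {
      throw new IllegalArgumentException("need at least two nodes for a backup");
    }
    this.nodeCount = nodeCount;
  }

  // Owner chosen from the key's hash; floorMod keeps the result non-negative.
  int ownerOf(String key) {
    return Math.floorMod(key.hashCode(), nodeCount);
  }

  // Assumption for illustration: the backup is the next node along.
  int backupOf(String key) {
    return (ownerOf(key) + 1) % nodeCount;
  }

  // An entry is lost only if its owner AND its backup die together.
  boolean survives(String key, Set<Integer> deadNodes) {
    return !(deadNodes.contains(ownerOf(key)) && deadNodes.contains(backupOf(key)));
  }

  public static void main(String[] args) {
    BackupSketch grid = new BackupSketch(4);
    String key = "fred123";

    // One node dying: the other copy is still there.
    Set<Integer> oneDead = new HashSet<>(Arrays.asList(grid.ownerOf(key)));
    System.out.println(grid.survives(key, oneDead)); // true

    // Owner and backup dying at the same time: the entry is gone.
    Set<Integer> bothDead = new HashSet<>(Arrays.asList(grid.ownerOf(key), grid.backupOf(key)));
    System.out.println(grid.survives(key, bothDead)); // false
  }
}
```

In this model a single failure never loses data, whilst a simultaneous double failure loses whichever entries had both of their copies on the dead nodes.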

So, why is Hazelcast a good bet?


  1. It's open source. This is usually a good thing…
  2. Hazelcast have just received a large cash injection to 'commoditize' the product. For more on this take a look here and here.
  3. Rod Johnson, yes Mr Spring, is now on the board of Hazelcast.
  4. It just works [1].
  5. Getting started is pretty easy.

The Scenario


To demonstrate Hazelcast, imagine that you're writing an application, in this case modelled by the MyApplication class, and then there's a big, wide world of users as modelled by the BigWideWorld class. As expected, users from the BigWideWorld log in and out of your application. Your application is very popular and you're running multiple instances of it in a cluster, so when a user logs in to an instance of the app, it stores their details (as modelled by the User class) in a Map, and the contents of the map are synchronised with the maps held by the other instances of your application.


POM Configuration


The first thing to do is to set up the pom.xml, and there's only one entry to consider:

    <dependency>
        <groupId>com.hazelcast</groupId>
        <artifactId>hazelcast</artifactId>
        <version>3.1</version>
    </dependency>

The Code


The BigWideWorld is the starting point for the code and it's a very small class for such a large concept. It has one method, nextUser(), which randomly chooses the name of the next user to log in or out from a collection of all your application's users.

import java.util.Random;

public class BigWideWorld {

  private static Random rand = new Random(System.currentTimeMillis());

  private final Users users = new Users();

  private final int totalNumUsers = users.size();

  /** Randomly pick the name of the next user to log in or out. */
  public String nextUser() {

    User user = users.get(rand.nextInt(totalNumUsers));
    String name = user.getUsername();

    return name;
  }

}

The collection of users is managed by the Users class. This is a sample-code convenience class that contains a number of hard-coded users' details.

import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class Users {

  /** The users in the database */
  private final User[] users = { new User("fred123", "Fred", "Jones", "[email protected]"),
      new User("jim", "Jim", "Jones", "[email protected]"),
      new User("bill", "Bill", "Jones", "[email protected]"),
      new User("ted111", "Edward", "Jones", "[email protected]"),
      new User("annie", "Annette", "Jones", "[email protected]"),
      new User("lucy", "Lucy", "Jones", "[email protected]"),
      new User("jimj", "James", "Jones", "[email protected]"),
      new User("jez", "Jerry", "Jones", "[email protected]"),
      new User("will", "William", "Jones", "[email protected]"),
      new User("shaz", "Sharon", "Jones", "[email protected]"),
      new User("paula", "Paula", "Jones", "[email protected]"),
      new User("leo", "Leonardo", "Jones", "[email protected]"), };

  /** The same users, keyed by user name for quick lookup */
  private final Map<String, User> userMap;

  public Users() {

    userMap = new HashMap<String, User>();

    for (User user : users) {
      userMap.put(user.getUsername(), user);
    }
  }

  /**
   * The number of users in the database
   */
  public int size() {
    return userMap.size();
  }

  /**
   * Given a number, return the user
   */
  public User get(int index) {
    return users[index];
  }

  /**
   * Given the user's name return the User details
   */
  public User get(String username) {
    return userMap.get(username);
  }

  /**
   * Return the user names.
   */
  public Set<String> getUserNames() {
    return userMap.keySet();
  }
}

This class contains a few database-style calls, such as get(String username) to return the User object for a given name, get(int index) to return a given user from the DB, and size() to return the number of users in the database.

The user is described by the User class; a simple Java bean:

import java.io.Serializable;

public class User implements Serializable {

  private static final long serialVersionUID = 1L;
  private final String username;
  private final String firstName;
  private final String lastName;
  private final String email;

  public User(String username, String firstName, String lastName, String email) {
    super();
    this.username = username;
    this.firstName = firstName;
    this.lastName = lastName;
    this.email = email;
  }

  public String getUsername() {
    return username;
  }

  public String getFirstName() {
    return firstName;
  }

  public String getLastName() {
    return lastName;
  }

  public String getEmail() {
    return email;
  }

  @Override
  public String toString() {

    StringBuilder sb = new StringBuilder("User: ");
    sb.append(username);
    sb.append(" ");
    sb.append(firstName);
    sb.append(" ");
    sb.append(lastName);
    sb.append(" ");
    sb.append(email);

    return sb.toString();
  }
}
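Note that User implements Serializable. An object stored in a map that's shared between JVMs has to be turned into bytes before it can travel across the network, and Serializable is the standard Java way of opting in to that. The sketch below shows a plain Java serialization round trip using only the JDK; it isn't Hazelcast code, and Hazelcast may well do something smarter internally. The Person class is a hypothetical, cut-down stand-in for User, and the email address is made up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Plain JDK serialization round trip; no Hazelcast involved.
final class SerializationRoundTrip {

  // Hypothetical stand-in for the User class, trimmed to two fields.
  static final class Person implements Serializable {
    private static final long serialVersionUID = 1L;
    final String username;
    final String email;

    Person(String username, String email) {
      this.username = username;
      this.email = email;
    }
  }

  // Write the object to a byte array, then read a fresh copy back from it.
  static Object roundTrip(Serializable value) {
    try {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
        out.writeObject(value);
      }
      try (ObjectInputStream in = new ObjectInputStream(
          new ByteArrayInputStream(buffer.toByteArray()))) {
        return in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("Serialization round trip failed", e);
    }
  }

  public static void main(String[] args) {
    Person original = new Person("fred123", "fred@example.com");
    Person copy = (Person) roundTrip(original);

    System.out.println(copy.username + " " + copy.email); // fred123 fred@example.com
    System.out.println(copy == original);                 // false: it's a new object
  }
}
```

The copy carries the same data but is a different object, which is worth remembering: what another instance reads from a shared map is a copy of your User, not a reference to it.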

Moving on to the crux of the blog, which is the MyApplication class. Most of the code in this blog is merely window dressing; the code that's of importance is in MyApplication's constructor. The constructor contains two lines of code: the first gets hold of a new Hazelcast instance, whilst the second uses that instance to create a Map<String, User> with a namespace of "Users". This is all the Hazelcast-specific code that's needed. The other methods, logon(), logout() and isLoggedOn(), just manage the users.

import java.text.SimpleDateFormat;
import java.util.Collection;
import java.util.Date;
import java.util.Map;

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class MyApplication {

  private final Map<String, User> loggedOnUsers;

  private final Users userDB = new Users();

  private final SimpleDateFormat sdf = new SimpleDateFormat("kk:mm:ss-SS");

  private long lastChange;

  public MyApplication() {

    // The only Hazelcast-specific code: join the cluster and get the shared map
    HazelcastInstance instance = Hazelcast.newHazelcastInstance();

    loggedOnUsers = instance.getMap("Users");
  }

  /**
   * A user logs on to the application
   *
   * @param username
   *            The user name
   */
  public void logon(String username) {

    User user = userDB.get(username);

    loggedOnUsers.put(username, user);
    lastChange = System.currentTimeMillis();
  }

  /**
   * The user logs out (or off depending on your pov).
   */
  public void logout(String username) {

    loggedOnUsers.remove(username);
    lastChange = System.currentTimeMillis();
  }

  /**
   * @return true if the user is logged on
   */
  public boolean isLoggedOn(String username) {
    return loggedOnUsers.containsKey(username);
  }

  /**
   * Return a list of the currently logged on users - perhaps to sys admin.
   */
  public Collection<User> loggedOnUsers() {
    return loggedOnUsers.values();
  }

  /**
   * Display the logged on users
   */
  public void displayUsers() {

    StringBuilder sb = new StringBuilder("Logged on users:\n");
    Collection<User> users = loggedOnUsers.values();
    for (User user : users) {
      sb.append(user);
      sb.append("\n");
    }
    sb.append(loggedOnUsers.size());
    sb.append(" -- ");
    sb.append(sdf.format(new Date(lastChange)));
    sb.append("\n");
    System.out.println(sb.toString());
  }

}

All the above is tied together using a simple Main class:

import java.util.concurrent.TimeUnit;

public class Main {

  public static void main(String[] args) throws InterruptedException {

    BigWideWorld theWorld = new BigWideWorld();

    MyApplication application = new MyApplication();

    while (true) {

      String username = theWorld.nextUser();

      // Toggle the user's state: log out if logged on, otherwise log on
      if (application.isLoggedOn(username)) {
        application.logout(username);
      } else {
        application.logon(username);
      }

      application.displayUsers();
      TimeUnit.SECONDS.sleep(2);
    }
  }

}

This code creates an instance of the BigWideWorld and MyApplication. It then loops forever, grabbing hold of the next random user name. If the user is already logged in, then the user logs out. If the user is not logged in, then the user logs in. The logged in users are then displayed so that you can see what's going on.

Running the App


After building the app, open a terminal and navigate to the project's target/classes directory. Then type in the following command:

java -cp /your path to the/hazelcast-3.1/lib/hazelcast-3.1.jar:. com.captaindebug.hazelcast.gettingstarted.Main

When running, you'll get output that looks something like this:

Logged on users:
User: fred123 Fred Jones [email protected]
User: jimj James Jones [email protected]
User: shaz Sharon Jones [email protected]
User: paula Paula Jones [email protected]
User: lucy Lucy Jones [email protected]
User: jez Jerry Jones [email protected]
User: jim Jim Jones [email protected]
7 -- 14:54:16-17

Next, open more terminals and run a few more instances of your application.

If you trawl through the output you can see users logging in and out, with the user Map being displayed on each change. The clue that the changes in one app's map are reflected in the other instances can be hard to spot, but can be deduced from the total size of the map (the first number on the last line of the output). Each time the map is displayed, the instance you're watching has logged exactly one user in or out; however, the total size can change by more than one, meaning that other instances' changes have affected the size of the map you're looking at.
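If the size argument isn't obvious, here's a small sketch of it in plain Java. It doesn't use Hazelcast at all: a single ConcurrentHashMap handed to two objects stands in for the distributed map, and the SharedMapSketch and AppInstance names are made up for this illustration. Instance A makes one change between its two size checks, yet sees the size move by three because instance B has been busy in the meantime.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustration only: one in-process map shared by two objects stands in for
// the distributed "Users" map. No Hazelcast involved; names are hypothetical.
final class SharedMapSketch {

  static final class AppInstance {
    private final Map<String, String> loggedOnUsers;

    AppInstance(Map<String, String> sharedMap) {
      this.loggedOnUsers = sharedMap;
    }

    // Log the user out if logged on, otherwise log them on.
    void toggle(String username) {
      if (loggedOnUsers.containsKey(username)) {
        loggedOnUsers.remove(username);
      } else {
        loggedOnUsers.put(username, username);
      }
    }

    int size() {
      return loggedOnUsers.size();
    }
  }

  public static void main(String[] args) {
    Map<String, String> shared = new ConcurrentHashMap<>();
    AppInstance a = new AppInstance(shared);
    AppInstance b = new AppInstance(shared);

    a.toggle("fred123");
    int before = a.size();  // 1

    b.toggle("jim");        // changes made by the OTHER instance
    b.toggle("bill");

    a.toggle("lucy");       // A's single change
    int after = a.size();   // 4

    System.out.println(after - before); // 3, not 1
  }
}
```

A jump of more than one between consecutive displays is therefore the tell-tale sign that another instance has written to the same map.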


So, there you have it: a simple app whose instances, when four of them are running, keep themselves in sync and know which users are logged in.

It's supposed to work in large clusters, but I've never tried it. Apparently, in large clusters, you have to do some jiggery-pokery with the config file, but that's beyond the scope of this blog.


[1] Okay, enough of the marketing speak. In general it does 'just work', but remember that it is software, written by developers like you and me, so it does have its features and idiosyncrasies. For example, if you're still using version 2.4 then upgrade NOW. This has a memory leak that means it 'just silently stops working' when it feels like it. The latest version is 3.1.

[2] I've chosen Map as an example, but it's also true for other collection types such as List, Set and Queue, plus Hazelcast has many other features that are beyond the scope of this blog, including a bunch of concurrency utilities and publish/subscribe messaging.

The code for this blog is available on github at: https://github.com/roghughe/captaindebug/tree/master/hazelcast
