neo4j and haproxy: some best practices and tricks

TL;DR

The goal of that blog post is to provide a reasonable haproxy configuration suitable for most neo4j clusters. It covers the Neo4j specific needs (authentication, stickyness, request routing) suitable for typically deployments.

Some words for introduction

When running a Neo4j cluster you typically want to front it with a load balancer. The load balancer itself is not part of Neo4j since a lot of customers already have hardware load balancers in their infrastructure. In case they have not or want some easy-to-go solution for that the recommended tool is haproxy.

In a Neo4j cluster its perfectly possible to run any operation (reads and writes) on any cluster member. However performance-wise it’s a good idea to send write requests directly to the master and send read-only requests to all cluster member or just to the slaves – depending how busy your master is.

What we need to cover

These days most folks interact with Neo4j by sending Cypher commands through the transactional cypher end point. This endpoint uses always a HTTP POST regardless if the Cypher commands are intended to perform read or write operation. This causes some pain for load balancing since we need to distinguish reads from writes.

One approach for this is to use a additional custom HTTTP header on the client side. Assume any request that should perform writes gets a X-Write:1 header in the request. Then haproxy just needs to check for this header in and acl and associate this acl with the backend containing only the master instance.

Another idea is that the load balancer inspects the request’s payload (aka the Cypher command you’re sending). If that payload contains write clauses like CREATE, MERGE, SET, DELETE, REMOVE or FOREACH it is most likely a write operation to be directed to the master. Luckily haproxy has capbilites to inspect the payload and apply regex matches.

A third approach would be to encapsulate each and every of your use cases into an unmanaged extension and use Cypher, Traversal API or core API inside as an implementation detail. By using the semantics of REST appropriately you use GET for side-effect free reads, PUT, POST and DELETE for writes. These can be used in haproxy’s acl as well very easily.

Another issue we have to deal when setting up the load balancer with is authentication. Since Neo4j 2.2 the database is protected by default with username and password. Of course we can switch that feature off, but if security is a concern it’s wise to have it enabled. In combination with haproxy we need to be aware that haproxy needs to use authentication as well for its status checks to identify who’s available and who’s master.

recommended setup

I’ve spent some time to fiddle out a haproxy config addressing all the points mentioned above. Let’s go through the relevant parts below line by line – for the rest I delegate to the excellent documentation for haproxy. Please note that this config requires a recent version of haproxy:

 

glory details

 

defining access control lists

 

The acl declare conditions to be used later one. We have acls to identify if the incoming request

  • has http method of POST, PUT or DELETE (l.16)
  • has a specific request header X-Write:1
  • contains one of the magic words in the payload to identify a write: CREATE, MERGE, DELETE, REMOVE, SET
  • is targeted at the transactional cypher endpoint
store a variable indicating tx endpoint

We do a nasty trick here: we store a internal variable holding a boolean depending on wether the request is for the tx endpoint. Later on we refer back to that variable. The rationale is that haproxy cannot access acls in other sections of the config file.

decide the backend to be used (aka do we read or write)

 

The acls defined above decide to which backend a request should be directed. We have one endpoint for the neo4j cluster’s master and another one for a pool of all available neo4j instances (master and slaves). A request is send to master if

  • the X-Write:1 header is set, or
  • the request hits the tx cypher endpoint AND it contains one of the magic words.
  • it is a POST, PUT or DELETE.
check status of neo4j instances, eventually with authentication

 

A backend needs to check which of its member is available. Therefore Neo4j has a REST endpoint exposing the individual instance’s status. In case we use Neo4j authentication the requests to the status endpoint need to be authenticated. Thanks to stackoverflow, I’ve figured out that you can add a Authorization to these requests. As usual, the value for the auth header is a base64 encoded string of “<username>:<password>”. On a unix command line use:

In case you’ve disabled authentication in Neo4j, use the commented lines instead.

stickyness for tx endpoint

Using our previously stored variable, we get the value back and use it to declare an acl in this backend. To make sure any request for the same transaction number is sent to the same neo4j instance we create a sticky table to store the transaction ids. You might want to adopt the size of that table (here 1k) to your needs, as well the expire time should be aligned with neo4j’s org.neo4j.server.transaction.timeout setting. It defaults to 60s, so if we expire the sticky table entry after 70s we’re on the safe side. If a http response from the tx endpoint has a Location header we store its 6th word – that is the transaction number – in the sticky table. If we hit the tx endpoint we check if the request path’s 4th word (the transaction id) is found in the sticky table. If so, the request gets send to the same server again, otherwise we use the default policy (round-robin).

define neo4j cluster instances

 

These lines hold a list of all cluster instances. Be sure to align the maxconn value with the amount of max_server_threads. By default Neo4j uses 10 threads per CPU core. So 32 is a good value for a 4 core box.

haproxy’s stats interface

Haproxy provides a nice status dashbord. In most cases you want to protect that with a password. haproxy_stats

Conclusion

I hope this post could provide you some insights on how to use haproxy in front of a neo4j cluster efficiently. Of course I am aware that my experience and knowledge with haproxy is limited. So I appreciate any feedback to improve the config described here.

Update 2015-10-13

Based on couple of questions I should have more clearly mentioned that the sketched setup requires a haproxy 1.6 which is currently in development.

Update 2015-11-09

Haproxy 1.6 has been released GA on Oct 23rd – so no need to use a in-dev version any more.

Update 2016-10-18

There’s a docker image providing a preconfigured haproxy for neo4j available at https://github.com/UKHomeOffice/docker-neo4j-haproxy. Kudos to Giles.

7 thoughts on “neo4j and haproxy: some best practices and tricks

  1. Christophe Willemsen

    Hi Stefan,

    Thanks a lot for the detailed blogpost. I have a “little” concern for the acls defining if the request should go to master. I generally find that only the header is a reliable manner.

    1. You can have CREATE, MERGE in the value of a property
    2. POST method is used for all Cypher transactional requests, even for READ.

    Secondly for “the request hits the tx cypher endpoint AND it contains one of the magic words.” : If you need to handle manually the opening and commiting of transactions, you will issue a POST request to the transactional endpoint without body for eg, so the acl rule will not be matched.

  2. Stefan Armbruster Post author

    Christophe, thanks for your feedback. I agree a custom header is the most reliable approach.

    on 1) I’m aware that my approach might lead to false positive. Since I didn’t want to write a full cypher parser – haproxy 1.6 would allow this using lua code 😉 I think it’s acceptable.

    Your second point could in fact lead to situations where a transaction starting with empty (or read only) requests and later on amended a write statement are processed on one of the slaves. This is something you can only avoid by the custom header to indicate a write explicitly.

  3. Christoph

    Could you please explain what the 6th word of the response’s Location header is (line 33)?

  4. Phanor Coll

    Hi, Im getting started with Neo4j HA clustering and HAproxy.. lets say I have all that configuration done.. how do I connect an app to the cluster, I know is through the HAproxy, but how?.. thanks

  5. admin

    You have to point your application to the port configured in the frontend section. In my example it would be http://:8090. Amend the URL for the REST endpoint to that base URL, e.g. http://:8090/db/data/transaction/commit.

Leave a Reply

Your email address will not be published. Required fields are marked *