Cypher fun: finding the position of an element in an array

At a onsite workshop with a new potential customer in wonderful Zurich I was challenged with a requirement I’ve never had in the last few years with Cypher.

Since I cannot disclose any details of the use case or the data used let’s create an artificial example with a similar structure:

  • a restaurant has several rooms
  • each room has various tables
  • each table is occupied with 0 to 6 guests

The goal of the query is to find the most occupied table for each room. As an example, I’ve created a Neo4j console at http://console.neo4j.org/r/ufmpwi.

The challenging here is that we want to do a kind of local filtering: for a given room we need the most occupied table. Calling the reduce function to the rescue. The idea is to run the reduce with 3 state variables:

  • the index of the highest occupation so far
  • the current index (aka iteration number)
  • the value of highest occupation so far

Reduce allows just one single state variable to be used but we can use a three element array instead (which is one variable 🙂 ). When the current element is larger than the maximum so far (aka the 3rd element of the state), we update the first element to the current position. The second element (current index) is always incremented. If the current element is not larger then the maximum so far, we just increment the current count (2nd element) and keep the other values:

The full query is:

The first line is pretty much obvious. Since there might be tables without guests OPTIONAL MATCH is required in line 2.

Cypher does not allow you to do direct aggregations of aggregations. Using multiple WITH helps here. In line 3 we first calculate counts of guests per table. Line 4 returns one line per room with two collections – one holding the tables, the other their occupation. Note that both collections do have the same order. Finally from line 5 on the reduce function is applied to find the most occupied table in each room.

4 thoughts on “Cypher fun: finding the position of an element in an array

  1. Michael Hunger

    You can also use: range for the index handling.

    reduce(x=[0,0], idx in range(0,size(tables)-1) | case when tables[idx] > x[1] then [idx,tables[idx] else x end)

  2. F.

    How about

    MATCH (r:Restaurant)-[:HAS_ROOM]->(room)-[:HAS_TABLE]->(table)
    OPTIONAL MATCH (guest)-[:SITS_AT]->(table)
    WITH room, table, count(guest) AS guestsAtTable
    ORDER BY guestsAtTable DESC
    WITH room, collect(table) as tables
    RETURN room, tables[0] as mostOccupied

    Or is it about the line 6
    CASE WHEN o>x[2]
    which might be more complicated and can’t be achieved with order by?

  3. Stefan Armbruster Post author

    Hi, the intention was on getting the position of the largest element in an collection and not just the largest element by itself. In the latter case, your cypher statement would do the job. Cheers, Stefan

Leave a Reply

Your email address will not be published. Required fields are marked *