At a onsite workshop with a new potential customer in wonderful Zurich I was challenged with a requirement I’ve never had in the last few years with Cypher.
Since I cannot disclose any details of the use case or the data used let’s create an artificial example with a similar structure:
- a restaurant has several rooms
- each room has various tables
- each table is occupied with 0 to 6 guests
The goal of the query is to find the most occupied table for each room. As an example, I’ve created a Neo4j console at http://console.neo4j.org/r/ufmpwi.
The challenging here is that we want to do a kind of local filtering: for a given room we need the most occupied table. Calling the reduce function to the rescue. The idea is to run the reduce with 3 state variables:
- the index of the highest occupation so far
- the current index (aka iteration number)
- the value of highest occupation so far
RETURN reduce(x=[0,0,0], i IN [1,2,2,5,2,1] |
CASE WHEN i>x[2] THEN [x[1],x[1]+1,o] ELSE [x[0], x[1]+1,x[2]] END
)[0]
Reduce allows just one single state variable to be used but we can use a three element array instead (which is one variable 🙂 ). When the current element is larger than the maximum so far (aka the 3rd element of the state), we update the first element to the current position. The second element (current index) is always incremented. If the current element is not larger then the maximum so far, we just increment the current count (2nd element) and keep the other values:
The full query is:
MATCH (:Restaurant)-[:HAS_ROOM]->(room)-[:HAS_TABLE]->(table)
OPTIONAL MATCH (guest)-[:SITS_AT]->(table)
WITH room, table, count(guest) AS guestsAtTable
WITH room, collect(table) AS tables, collect(guestsAtTable) AS occupation
RETURN room, tables[reduce(x=[0,0,0], o IN occupation |
CASE WHEN o>x[2]
THEN [x[1], x[1]+1,o]
ELSE [x[0], x[1]+1,x[2]]
END
)[0]] AS mostOccupied, tables, occupation
The first line is pretty much obvious. Since there might be tables without guests OPTIONAL MATCH
is required in line 2.
Cypher does not allow you to do direct aggregations of aggregations. Using multiple WITH
helps here. In line 3 we first calculate counts of guests per table. Line 4 returns one line per room with two collections – one holding the tables, the other their occupation. Note that both collections do have the same order. Finally from line 5 on the reduce function is applied to find the most occupied table in each room.
4 replies on “Cypher fun: finding the position of an element in an array”
You can also use: range for the index handling.
reduce(x=[0,0], idx in range(0,size(tables)-1) | case when tables[idx] > x[1] then [idx,tables[idx] else x end)
Good catch, Michael. Thank you.
How about
MATCH (r:Restaurant)-[:HAS_ROOM]->(room)-[:HAS_TABLE]->(table)
OPTIONAL MATCH (guest)-[:SITS_AT]->(table)
WITH room, table, count(guest) AS guestsAtTable
ORDER BY guestsAtTable DESC
WITH room, collect(table) as tables
RETURN room, tables[0] as mostOccupied
Or is it about the line 6
CASE WHEN o>x[2]
which might be more complicated and can’t be achieved with order by?
Hi, the intention was on getting the position of the largest element in an collection and not just the largest element by itself. In the latter case, your cypher statement would do the job. Cheers, Stefan