cancel
Showing results for 
Search instead for 
Did you mean: 

Speed up SAP HANA graph query

0 Kudos

Hi,

I'm working with a graph database in sap hana and i've notices that performance isn't (consistently) great.

I've defined my edges, vertices and graph as:

CREATE COLUMN TABLE "VERTICES" (
	"ID" BIGINT PRIMARY KEY,
	"DIRECT_ID" BIGINT,
	"TYPE" NVARCHAR(100)
);

CREATE TABLE "EDGES" (
		"ID" BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
		"DIRECT_ID" BIGINT,
		"TYPE" NVARCHAR(255),
		
		"SOURCE" BIGINT REFERENCES "VERTICES"("ID") ON DELETE CASCADE NOT NULL,
		"TARGET" BIGINT REFERENCES "VERTICES"("ID") ON DELETE CASCADE NOT NULL
);

CREATE GRAPH WORKSPACE "GRAPH"
	EDGE TABLE "EDGES"
		SOURCE COLUMN "SOURCE"
		TARGET COLUMN "TARGET"
		KEY COLUMN "ID"
	VERTEX TABLE "VERTICES" 
		KEY COLUMN "ID"

It consists or ~30M vertices and 57M edges. The edges are almost always double between two nodes: one edge (n1)-(n2) and another edge (n2)-(n1).

The problems I want to tackle are:

  1. The shortest (weighted) path between two vertices in the graph (one-to-one)
  2. The shortest path between a known vertex and a vertex of a specific type

I've managed to get both questions answered, buy performance is not what I expect. Sometimes the one-to-one is lightning fast, and sometimes it take 1,30 minutes to do the same job.

How I managed to get the known one-to-one shortest path. This take 1,5 when call the procedure the first time. Calling it again with the same or some different node(s) is lightning fast.

CREATE TYPE "TT_VERTICES_SPOTO" AS TABLE ("ID" BIGINT, "TYPE" NVARCHAR(100), VERTEX_ORDER BIGINT);
CREATE OR REPLACE PROCEDURE "SHORTEST_ONE_TO_ONE"(
	IN i_startVertex BIGINT, 		-- the ID of the start vertex
	IN i_endVertex BIGINT, 			-- the ID of the end vertex
	IN i_dir NVARCHAR(10), 			-- the the direction of the edge traversal: OUTGOING (default), INCOMING, ANY
	OUT o_pathLength BIGINT,		-- the hop distance between start and end
	OUT o_pathWeight DOUBLE,	-- the path weight/cost based on the WEIGHT attribute
	OUT o_vertices "TT_VERTICES_SPOTO"
	)
LANGUAGE GRAPH READS SQL DATA AS
BEGIN
	-- Create an instance of the graph, refering to the graph workspace object
	GRAPH g = Graph("GRAPH");


	-- Create an instance of the start/end vertex
	VERTEX v_start = Vertex(:g, :i_startVertex);
	VERTEX v_end = Vertex(:g, :i_endVertex);
	-- Running shortest path using the WEIGHT column as cost
	WeightedPath<DOUBLE> p = Shortest_Path(:g, :v_start, :v_end, (Edge e) => DOUBLE{ return :e."SYSTEEMLENGTE"; });
	
	-- Project the results from the path
	o_pathLength = LENGTH(:p);
	o_pathWeight = WEIGHT(:p);
	o_vertices = SELECT :v."ID", :v."TYPE", :VERTEX_ORDER FOREACH v IN Vertices(:p) WITH ORDINALITY AS VERTEX_ORDER;
END;

How I've managed to get the known vertex to vertex of a certain type. This always takes about ~1,5 to 2,5 minutes

CREATE TYPE "TT_VERTICES_SPTT" AS TABLE ("ID" BIGINT, "TYPE" VNARCHAR(100), "DISTANCE" DOUBLE);
CREATE OR REPLACE PROCEDURE "SHORTEST_PATH_TO_ALL"(
	IN i_startVertex BIGINT, 		-- the ID of the start vertex
	OUT o_vertices "TT_VERTICES_SPTT"
	)
LANGUAGE GRAPH READS SQL DATA AS
BEGIN
	-- Create an instance of the graph, refering to the graph workspace object
	GRAPH g = Graph("GRAPH");

	-- Create an instance of the start/end vertex
	VERTEX v_start = Vertex(:g, :i_startVertex);

	GRAPH henl = SHORTEST_PATHS_ONE_TO_ALL(:g, :v_start, "DISTANCE", (Edge e) => DOUBLE{ return :e."SYSTEEMLENGTE";});


	o_vertices = SELECT :v."ID", :v."TYPE", :v."DISTANCE" FOREACH v IN Vertices(:henl);
END;

-- Wrapper around SHORTEST_PATH_TO_ALL
CREATE OR REPLACE FUNCTION "SHORTEST_PATH_TO_TARGET_TYPE"(
	IN "SOURCE" BIGINT, IN "TARGET_TYPE" NVARCHAR(100)
	)
	RETURNS TABLE ("ID" BIGINT, "TYPE" NVARCHAR(100), "DISTANCE" DOUBLE)
LANGUAGE SQLSCRIPT READS SQL DATA AS


BEGIN
	CALL SHORTEST_PATH_TO_ALL(:SOURCE, o_table);
	RETURN
		WITH DESTINATIONS AS (
				SELECT 	ID,
						TYPE,
						DISTANCE
				FROM :o_table
				WHERE NAME = :TARGET_TYPE
			),
			
			CLOSEST_VERTICES_OF_INTEREST AS (
				SELECT MIN(DISTANCE) AS MIN_DIST
				FROM DESTINATIONS
			)
		SELECT 	ID,
				TYPE,
				DISTANCE
		FROM DESTINATIONS
		JOIN CLOSEST_VERTICES_OF_INTEREST
			ON DESTINATIONS.DISTANCE = CLOSEST_VERTICES_OF_INTEREST.MIN_DIST;
END;

How do I get this to perform better? Ultimately I want to do a lot of calls from a list of nodes to nodes of a certain type and get the shortest paths back.

former_member751591
Participant
0 Kudos

Thank you for visiting SAP Community to get answers to your questions. Since you're asking a question here for the first time, I recommend you to read this overview Community Q&A, and to take our Q&A tutorial . With these tips you'll be able to prepare questions that draw responses from our members.

The more details you provide, the more likely it is that members will be able to answer your question.

Should you wish, you can revise your question by selecting Actions, then Edit.

By adding a Picture to your profile you encourage readers to respond.

Accepted Solutions (0)

Answers (2)

Answers (2)

mfath
Advisor
Advisor

Hej Duco,

when calling a graph procedure for the first time, an internal index is built. For subsequent calls, the index is used and hence query performance is much better. There are certain situations where the index is re-built. Changing a procedure is one of these situation. So if you work with HANA Cloud, I'd suggest to use "anonymous blocks" (DO() BEGIN...END; see the attached code) while in development/design phase. Once the logic is ok, turn the block into a "real" procedure.

For your second procedure (nearest vertex of a certain type), I'd suggest to do the filtering in the Graph Procedure. I don't see a reason why you should pass the complete SPOA result back to SQL. In the end, I guess your SQL statement in the RETURN clause is not optimal. In case you have "ties", i.e. neighbors with the same distance - do you want to return both?

Hope the attached code helps. It works on HANA Cloud QRC2nn-weigthed-and-filtered.txt.

Regards, Markus

0 Kudos
Hi  markus.fath 

Thanks so mutch for such an elaborate answer!

Your answer confirms my hunge that when changing the procedure the graph needs to be reinitialized.

As for the supplied example for the "nearest vertex of a certain type" I have two questions. Hopefully you could push me in the right direction.

I unfortunately cannot work with anonymous blocks with Graph language 😞 So I need to make procedures.

I've written it as:

CREATE OR REPLACE PROCEDURE "SHORTEST_PATH_TO_ALL"(
	IN i_startVertex BIGINT, 		-- the ID of the start vertex
	IN i_k BIGINT, -- number OF nearest neighbors TO be returned
	IN i_dir NVARCHAR(10), -- direction OF edges TO be traversed
	IN i_maxDist DOUBLE, -- maximum distance, TO LIMIT the SEARCH SCOPE OF the SPOA agorithm
	IN i_type NVARCHAR(100), -- the FILTER criteria OF the nearest neighbor(s) TO be found
	OUT o_res TABLE("ID" BIGINT, "TYPE" NVARCHAR(100), "$DISTANCE" DOUBLE)
	)
	LANGUAGE GRAPH READS SQL DATA AS
BEGIN


	GRAPH g = Graph("AMKV", "NEO4OJEE");
	VERTEX v_start = Vertex(:g, :i_startVertex);
	GRAPH g_spoa = SHORTEST_PATHS_ONE_TO_ALL(:g, :v_start, "$DISTANCE", 
		(Edge e, DOUBLE distance, BIGINT hops) => 
			DOUBLE{ 
				IF(:distance <= :i_maxDist) {	RETURN :e."SYSTEEMLENGTE";	}
				ELSE { END TRAVERSE; }
			}, :i_dir);
	-- filter the SPOA vertice, and order by distance
	SEQUENCE<Vertex> s_neighbors = SEQUENCE<Vertex>(v IN Vertices(:g_spoa) WHERE :v."TYPE" == :i_type) ORDER BY ("$DISTANCE" ASC);
	FOREACH v IN :s_neighbors WITH ORDINALITY AS i {
		IF(:i > :i_k){break;}
		o_res."ID"[:i] = :v."ID";
		o_res."TYPE"[:i] = :v."TYPE";
		o_res."$DISTANCE"[:i] = :v."$DISTANCE";
	}
END;
Could not execute 'CREATE OR REPLACE PROCEDURE "SHORTEST_PATH_TO_ALL"( IN i_startVertex BIGINT, -- the ID of the start ...'
SAP DBTech JDBC: [4901]: GraphEngine: compilation failed: exception 73002201: syntax error, unexpected "," near ",": line 19 col 5 (at pos 852)

This yield into a syntax error. I've had this before when implementing the SHORTEST_PATH procedure. When leaving out the :i_dir variable it usually works, but now gives the following compilation error:

Could not execute 'CREATE OR REPLACE PROCEDURE "SHORTEST_PATH_TO_ALL"( IN i_startVertex BIGINT, -- the ID of the start ...'
SAP DBTech JDBC: [4901]: GraphEngine: compilation failed: exception 73002201: Unexpected reserved keyword "ORDER": line 21 col 101 (at pos 1008)

This looks like the SQL-like language is not allowed in the graph-procedure.

mfath
Advisor
Advisor

Hej Duco,

which HANA version are you on? The code I gave you runs on HANA Cloud QRC2. If you are on an older release we need to see how we can adapt. I'll try to get you into a chat... let's see never used before.

Regards, Markus

0 Kudos

Hi!

I've switched browsers and for some reason I can see the comments on my question again. Weird.

My version of HANA is:

It runs on HEC and is maintained by our companies IT-staff.

if a chat works, that'll be amazing 🙂