Skip to Content

Scalar Function Memory Allocation - Poor Performance

Hi all,

I am doing tests using Scalar Functions and I realize that using this type of function my SQL queries get much slower. To try to understand the reasons for this poor performance I created two scenarios to compare.

I'm using TABLE_1 whith 6k records.

SELECT FIELD1, FIELD2, (SELECT 4 FROM DUMMY) AS VLR  FROM "TABLE_1" ;

and the plan of above SQL

To compare, I create a simple Scalar Fucntion as follow:

FUNCTION "MY_SCHEMA"."comp.test.functions::return_value" (P_1 NUMBER ) 
	RETURNS RESULT NUMBER
	LANGUAGE SQLSCRIPT
	SQL SECURITY INVOKER AS
BEGIN


 DECLARE V_V1  NUMBER;
 
 SELECT :P_1 INTO V_V1 FROM DUMMY;
 
 RESULT := V_V1;
 
END;
SELECT FIELD1, FIELD2,"MY_SCHEMA"."comp.test.functions::return_value" (3 ) AS VLR FROM "TABLE_1"

and the plan of above SQL

Perceive the difference between the execution times, compilation and the volume of memory allocated.

Why Scalar Function are so slow?

Why does Scalar Function allocate so much memory? (14GB in second image)

Thanks

plan1.png (20.8 kB)
plan2.png (34.0 kB)
Add comment
10|10000 characters needed characters exceeded

  • Get RSS Feed

1 Answer

  • Apr 01, 2017 at 12:19 AM

    The two cases are not comparable. The sub-select in the first statement is a constant and gets factored out by the optimizer.

    By rewriting the function to not use a SELECT INTO ... FROM DUMMY you get the same level of performance and memory consumption.

    Also, up until HANA 2 SQLScript didn't provide statement caching, which added parsing overhead to every executed query.

    Besides that, when you compile your function this is the warning you get:

    java.sql.SQLWarning: Not recommended feature: Using SELECT INTO in Scalar UDF

    Generally, you want to avoid single record processing (which is what you do with scalar functions) but try and process whole sets of data instead.

    Add comment
    10|10000 characters needed characters exceeded

    • Because it is in fact not a simple function, but one that does involve a SQL query. Any SQL query in itself is treated as a sort of program that needs to be parsed, analysed, optimised, and executed. As you call the program practically in a loop, the intermediate result sets & parsed statements are kept during the lifetime of your query.

      So, the high memory consumption represents the accumulated memory consumption of all the thousands of calls.

      With HANA2 there is more caching of SQLScript statements and resultset happening and it's possible to declare functions as deterministic - which reduces the need to re-execute function for the same input parameters.

      In total, the recommendation is to avoid any SELECT statement in scalar functions when performance is important. Instead, these should be small and ideally self-contained.