Former Member
Nov 01, 2012 at 02:12 PM

Communication Between SAP HANA and R Server

Hello,

Where can I find more information about the internals of how communication between SAP HANA and the R server works? What I'm trying to find out is whether R receives the data in bulk or entry by entry.

For example, assume I have an INPUT_TABLE with a column PERIOD (a date) and a column VALUE (a double), and the following implementation:

CREATE PROCEDURE CALC_STATS( )
LANGUAGE SQLSCRIPT AS
BEGIN
  input_data = SELECT * FROM INPUT_TABLE;
  CALL CALC_INPUT( :input_data, T_OUTPUT_TABLE );
  INSERT INTO OUTPUT_TABLE SELECT * FROM :T_OUTPUT_TABLE;
END;

And for the RLANG procedure I have:

CREATE PROCEDURE CALC_INPUT( IN input_table INPUT_TABLE, OUT result T_OUTPUT_TABLE )
LANGUAGE RLANG AS
BEGIN
  input_period <- input_table$PERIOD
  input_value <- sum( as.double( input_table$VALUE ) )
  result <- data.frame( PERIOD = input_period, VALUE = input_value )
END;

After I run CALC_STATS(), the sum() call in my RLANG procedure treats input_table$VALUE as a vector holding the values from the whole table; that is, I get the sum over ALL entries, whereas input_period appears to contain only the value for the current entry. What does this mean?

Does the R script get some sort of cursor and then pull data as necessary? If I have 100 million entries, are all of them sent to the R server at once (assuming I do SELECT * FROM ...)?

For the example above, given the following input:

INPUT_TABLE
PERIOD, VALUE
2012-11-01, 100
2012-11-02, 200
2012-11-03, 300

I would expect:

OUTPUT_TABLE
PERIOD, VALUE
2012-11-01, 100
2012-11-02, 200
2012-11-03, 300

Instead, I have:

OUTPUT_TABLE
PERIOD, VALUE
2012-11-01, 600
2012-11-02, 600
2012-11-03, 600

Why is that? Wouldn't R process an entry at a time?
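
For reference, the result I see can be reproduced in plain R without HANA. This is only a minimal sketch, and it rests on an assumption I have not confirmed: that inside the RLANG procedure input_table behaves like an ordinary R data frame holding every row of INPUT_TABLE. The names observed and expected are just illustrative.

```r
# Assumption: input_table arrives as a regular data frame containing ALL rows
# of INPUT_TABLE (this is what the observed output suggests, not a confirmed fact).
input_table <- data.frame(
  PERIOD = as.Date(c("2012-11-01", "2012-11-02", "2012-11-03")),
  VALUE  = c(100, 200, 300)
)

# Same logic as CALC_INPUT: sum() collapses the whole VALUE column to one number.
input_period <- input_table$PERIOD
input_value  <- sum(as.double(input_table$VALUE))

# data.frame() recycles that single number across all three periods.
observed <- data.frame(PERIOD = input_period, VALUE = input_value)
print(observed)

# Element-wise variant with no aggregation: each row keeps its own value.
expected <- data.frame(
  PERIOD = input_table$PERIOD,
  VALUE  = as.double(input_table$VALUE)
)
print(expected)
```

Under that assumption, observed has 600 in every row, which matches the OUTPUT_TABLE I actually get, while expected matches what I was hoping for.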

Thanks in advance for reading thus far and for any assistance you can provide.

Genc