Skip to Content

using table variables in combined virtual and non virtual environment

According to one of the earlier posts on table variables in HANA SQL Script, a table variable is not materialized when it is created but it is rather 'inlined' whenever it is used in a query. See answer from Lars Breddemann in this thread: According to Lars this allows for more optimizations than simply materializing, which makes sense. However, for an environment where you have data in HANA and data in virtual tables (SDA), this is unwanted behavior (at least it is in my use case). I've tried creating table variables that only query data in the virtual tables and table variables that only query data in tables that are in HANA memory and then combining the two data sets assuming that HANA would materialize both data sets. But as Lars confirmed, HANA doesn't materialize the table variables which in the end causes the overall query that contains both virtual and non-virtual tables to be pushed to the virtual database. For performance reasons that is not what I want. Is there a way to force HANA to materialize the data that I pull from the virtual tables before I join it to the data that lives on HANA? If I can't force HANA to materialize a table variable, could I use an array to 'materialize' the virtual data?

Add comment
10|10000 characters needed characters exceeded

  • Follow
  • Get RSS Feed

1 Answer

  • May 24, 2018 at 02:44 AM

    You are mixing different concepts/technologies here.

    Query processing with virtual tables allows for "push-down" of join tables to the remote site. This is completely independent of how SQLScript deals with table variables.

    As for SDA, it can be quite complicated to determine whether or not a join or sub-select should be pushed to the remote site or whether it is better to get all necessary data into the local instance and continue processing the query there.

    To my knowledge, HANA 1 basically looked into optimizing for minimal total data transfer costs as this is one of the main drivers for long runtime. So HANA 1 tries to estimate how much data would need to be transferred to the remote site and back to the local instance to process the query. In many cases I've looked at - especially those with relatively large result sets - this approach will lead to the decision to not push down data to the remote site.

    Note, that this data transfer cost estimation does not take into account how long the data transfer between two systems actually take. As far as I know, there is no configurable transfer cost coefficient available in HANA 1

    HANA 2 appears to me to be more sophisticated for this scenario and produces even better execution plans. But, I don't have much insight into the implementation details for this.

    A rather important factor for using this query-block push-down is that the table/view meta-data (primary key/unique constraints, NOT NULL constraints and data statistics) is correct, up to date and as specific as possible. Not specifying e.g. the primary key on a table can lead to vastly different optimisation results.

    Coming back to your question: so far it worked out well for me to use a temporary table instead of a table variable in order to avoid push-down to the remote site, which is what you describe you want to do.

    Add comment
    10|10000 characters needed characters exceeded