2 billion row limit?

Former Member
0 Kudos

I am new to SAP HANA and came across several issues when handling large tables (>2 billion rows).

As far as I understood, such tables need to be partitioned when written, but apart from that one should not need to bother further. I am therefore quite surprised to constantly run into trouble when querying large tables. Leaving aside performance reasons for splitting such large tables, I am wondering whether there is really no other option.

To be more specific, I am working in the following setup:

- A table "T" with primary key "a" and roughly 2.2 billion rows

- A table "S" with several million rows

- A decomposition of "T" into "T1" and "T2" of similar size such that T = T1 UNION T2

On these tables the following statements fail:

1.

select T.a, S.a
from T
inner join S
on T.a = S.a;

--> SAP DBTech JDBC: [2048]: column store error: search table error:  [2598] column search intermediate result exceeds 2 billion rows limitation

2.

select b, count(a)
from (
    select a, b from T1
    union
    select a, b from T2
)
group by b;

--> SAP DBTech JDBC: [2048]: column store error: search table error:  [34104] Intermediate result is too large in CalculationEngine.

3.

create column table MYTABLE as (
    select a from T
) partition by roundrobin partitions 16;

--> [129]: transaction rolled back by an internal error: exception 10001001: Search result size limit exceeded: 2236482994

In general, it appears that tables with more than 2 billion rows can be handled whenever they are already partitioned and written. In contrast, when results of that size arise during a query (e.g. from an outer join or a union), HANA mostly terminates with an error.

Is this behaviour intended by HANA?

Is there an explicit rule for what HANA can handle?

Are there hints or other routines to overcome the described issues?

I really appreciate any clarifications.

Cheers,

Mapkyc

Accepted Solutions (0)

Answers (3)


suresh_devarajan
Explorer
0 Kudos

Please refer to OSS note 2154870 - How-To: Understanding and defining SAP HANA Limitations


Some of the scenarios you describe are listed in this note.


Thanks,


Suresh

Former Member
0 Kudos

Thanks for pointing me to the "Limitations" note.

It doesn't really improve my understanding of the issue, though. The note in particular says:

In very specific scenarios limits for (intermediate) result sets exist, e.g. 2147483648 (2 billion) or 4294967296 (4 billion). If a limit is hit, an error like the following is issued:

search table error: [2598] column search intermediate result exceeds 2 billion rows limitation

search table error: [2999] Docid sequence with more than 4 billion elements not supported

search table error: [2724] Olap temporary data size exceeded 31/32 bit limit

exception 10001001: Search result size limit exceeded: <result_size>

In order to prevent these errors, you have to optimize the underlying database requests (see SAP Note 2000002).

This is really vague. What are "very specific scenarios"? And what does "optimize the underlying database requests" generally mean?

Former Member
0 Kudos

Hi Mapkyc,

I got the same error. Did you find a solution to fix that?

Thanks

Houssem

Former Member
0 Kudos

Hi Houssem,

it seems that although you can create tables with more than 2 billion rows (by partitioning), HANA is in general simply not capable of working with them. This also holds for intermediate results that exceed 2 billion rows. You will find queries that work on such tables, but I couldn't find a characterisation of when they do work.

Accordingly, the only solution I found is to break up my queries into several smaller ones, which can become very cumbersome... 😞
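For illustration, here is roughly what breaking up statement 1 looks like for me (just a sketch; the cutoff value 1000000000 is hypothetical and has to be chosen from the actual distribution of T.a):

-- each chunk keeps the join's intermediate result below the 2 billion row limit
select T.a, S.a
from T
inner join S
on T.a = S.a
where T.a < 1000000000;

select T.a, S.a
from T
inner join S
on T.a = S.a
where T.a >= 1000000000;

The partial results then have to be combined again, e.g. by inserting them chunk by chunk into a partitioned target table.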

Cheers,

mapkyc

lbreddemann
Active Contributor
0 Kudos

The limitations around 2 billion records per table partition as well as the limitations around the intermediate result set sizes have been documented and explained quite extensively.

When you write that you don't understand what it means to optimise the database requests in order to avoid data-processing situations where these limits are hit, what exactly is it that you don't get?

Looking at your queries, I wonder whether these are just general "try out" queries or if there is an actual application use linked to them.

For example:

select b, count(a)
from (
    select a, b from T1
    union
    select a, b from T2
)
group by b;

This is a pretty odd statement in terms of semantics.

You want a count of a, grouped by b, over the distinct result of the union.

This could be easily rewritten to

SELECT b, sum(cnt_a) FROM (
    select b, count(a) cnt_a from T1 group by b
    UNION ALL
    select b, count(a) cnt_a from T2 group by b
) GROUP BY b

As long as b is not a nearly unique key but something that actually leads to some record aggregation, this avoids the large intermediate result set that the original UNION requires.
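A quick sanity check before relying on this rewrite (a sketch using the table and column names from this thread):

-- if the number of distinct values of b is small relative to the row count,
-- the per-table pre-aggregation shrinks both inputs of the UNION ALL
-- well below the 2 billion row limit
select count(distinct b) from T1;
select count(distinct b) from T2;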

Having written that now, I'm eager to learn about the many use cases where you have to break up your queries in a cumbersome fashion.

After nearly 6 years of working with SAP HANA, most of the cases I've seen where these limitations come up were either test scenarios (like this one) or naive data model implementations.

former_member183326
Active Contributor
0 Kudos

Hello,

I have seen this issue multiple times.

Truncating the table will solve the issue.

0 Kudos

Hi,

as you already mentioned, these tables need to be partitioned on writing. Could you please try to create the structure of the table, partition it, and reload it with the data?

Regards

-Seb.

Former Member
0 Kudos

Thanks for your message, Seb.

Not sure what you mean. Statements 1 and 2 do not refer to writing the table, so I guess you mean statement 3.

When I first create a partitioned structure of the table, I can fill it with more than 2 billion rows, but only if this is done in several steps. That is, the following works:

create column table MYTAB (
...
) partition by roundrobin partitions 8;

insert into MYTAB (
    select * from T1
);

insert into MYTAB (
    select * from T2
);

But not:

create column table MYTAB (
...
) partition by roundrobin partitions 8;

insert into MYTAB (
    select * from T
);
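What did work for me when there is no ready-made split like T1/T2 is to chunk the insert by ranges of the key (a sketch; the range boundaries are hypothetical and need to be picked from the distribution of a):

-- load T in slices so each individual insert-select stays below the limit
insert into MYTAB (select * from T where a < 1000000000);
insert into MYTAB (select * from T where a >= 1000000000 and a < 2000000000);
insert into MYTAB (select * from T where a >= 2000000000);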

mapkyc