Discussion:
Overflow on SUM of a column defined as INT
(too old to reply)
Martin
2010-09-11 07:36:56 UTC
Permalink
If a column is defined as SMALLINT, and one does a SUM (or other similar
function) on that column, the result is an INT. However, if the column is an
INT, the result of the SUM function remains an INT.

This is causing overflow on our data warehouse queries that are
automatically generated by a point and click tool (such as Cognos), and we
cannot ask end users to customize the SQL to cast the result as BIGINT,
because it is all point and click and they never see the SQL.

Any chance of getting DB2 to automatically make SUM function result into
BIGINT if the table column is INT? How do I request such an enhancement from
IBM?

Having to define the detail columns as BIGINT just because a column may be
SUM'ed will waste a lot space in our warehouse since we have billions of
rows.
Tonkuma
2010-09-11 09:07:09 UTC
Permalink
One idea is to create a view using "BIGINT(int_column_name) AS
int_column_name".
Then, show the view to users.
Luuk
2010-09-11 10:43:11 UTC
Permalink
Post by Martin
If a column is defined as SMALLINT, and one does a SUM (or other similar
function) on that column, the result is an INT. However, if the column is an
INT, the result of the SUM function remains an INT.
This is causing overflow on our data warehouse queries that are
automatically generated by a point and click tool (such as Cognos), and we
cannot ask end users to customize the SQL to cast the result as BIGINT,
because it is all point and click and they never see the SQL.
Any chance of getting DB2 to automatically make SUM function result into
BIGINT if the table column is INT? How do I request such an enhancement from
IBM?
Having to define the detail columns as BIGINT just because a column may be
SUM'ed will waste a lot space in our warehouse since we have billions of
rows.
According to some googling, i found that DB2 functions can be overloaded

So you should be able (but i dont know how) to create your own SUM()
function that always returns a BIGINT
--
Luuk
Luuk
2010-09-11 13:09:39 UTC
Permalink
Post by Luuk
Post by Martin
If a column is defined as SMALLINT, and one does a SUM (or other similar
function) on that column, the result is an INT. However, if the column is an
INT, the result of the SUM function remains an INT.
This is causing overflow on our data warehouse queries that are
automatically generated by a point and click tool (such as Cognos), and we
cannot ask end users to customize the SQL to cast the result as BIGINT,
because it is all point and click and they never see the SQL.
Any chance of getting DB2 to automatically make SUM function result into
BIGINT if the table column is INT? How do I request such an
enhancement from
IBM?
Having to define the detail columns as BIGINT just because a column may be
SUM'ed will waste a lot space in our warehouse since we have billions of
rows.
According to some googling, i found that DB2 functions can be overloaded
So you should be able (but i dont know how) to create your own SUM()
function that always returns a BIGINT
to clarify more,
there is an "How to Create User-Defined Functions" in COGNOS
(i found it at: http://apsdwh.vlaanderen.be/cognos/help/hwtcrtudfs.pdf)

i think this is a problem for COGNOS, or any reporting tool...?
--
Luuk
Tonkuma
2010-09-11 13:20:45 UTC
Permalink
Post by Luuk
According to some googling, i found that DB2 functions can be overloaded
So you should be able (but i dont know how) to create your own SUM()
function that always returns a BIGINT
Excellent!

I tried it and successed.

Here are examples.
(Please see data type of result column educated_manpower.)

------------------------- Commands Entered -------------------------
CREATE FUNCTION db2admin.SUM( INTEGER )
SOURCE sysibm.SUM( BIGINT )
RETURNS BIGINT
;
--------------------------------------------------------------------
DB20000I The SQL command completed successfully.

------------------------- Commands Entered -------------------------
VALUES current path;
--------------------------------------------------------------------

1
--------------------------------------------------
"SYSIBM","SYSFUN","SYSPROC","SYSIBMADM","DB2ADMIN"

1 record(s) selected.


------------------------- Commands Entered -------------------------
SET PATH = "DB2ADMIN" , "SYSIBM","SYSFUN","SYSPROC","SYSIBMADM";
--------------------------------------------------------------------
DB20000I The SQL command completed successfully.

------------------------- Commands Entered -------------------------
DESCRIBE
SELECT workdept
, SUM( INT(edlevel) ) AS educated_manpower
, SUM(salary + bonus) AS yearly_payment
FROM employee
GROUP BY
workdept
;
--------------------------------------------------------------------

Column Information

Number of columns: 3

SQL type Type length Column name
Name length
-------------------- ----------- ------------------------------
-----------
453 CHARACTER 3
WORKDEPT 8
493 BIGINT 8
EDUCATED_MANPOWER 17
485 DECIMAL 31, 2
YEARLY_PAYMENT 14


------------------------- Commands Entered -------------------------
SELECT workdept
, SUM( INT(edlevel) ) AS educated_manpower
, SUM(salary + bonus) AS yearly_payment
FROM employee
GROUP BY
workdept
;
--------------------------------------------------------------------

WORKDEPT EDUCATED_MANPOWER YEARLY_PAYMENT
-------- -------------------- ---------------------------------
A00 51 271000.00
B01 18 95050.00
C01 54 242370.00
D11 149 516500.00
D21 93 323820.00
E01 16 80975.00
E11 71 237090.00
E21 60 217210.00

8 record(s) selected.


/**************************************************
********** Return CURRENT PATH ********************
**************************************************/

------------------------- Commands Entered -------------------------
SET PATH = "SYSIBM","SYSFUN","SYSPROC","SYSIBMADM","DB2ADMIN";
--------------------------------------------------------------------
DB20000I The SQL command completed successfully.

------------------------- Commands Entered -------------------------
DESCRIBE
SELECT workdept
, SUM( INT(edlevel) ) AS educated_manpower
, SUM(salary + bonus) AS yearly_payment
FROM employee
GROUP BY
workdept
;
--------------------------------------------------------------------

Column Information

Number of columns: 3

SQL type Type length Column name
Name length
-------------------- ----------- ------------------------------
-----------
453 CHARACTER 3
WORKDEPT 8
497 INTEGER 4
EDUCATED_MANPOWER 17
485 DECIMAL 31, 2
YEARLY_PAYMENT 14
Martin
2010-09-12 01:04:31 UTC
Permalink
Post by Tonkuma
Excellent!
I tried it and successed.
SUM( INT(edlevel) )
I don't understand how a point-and-click report writer (which only
understands standard SQL functions) is going to generate the above
expression. Maybe I am missing something.

BTW, you don't need to cast a SUM as INT, since that is the default of doing
a SUM, even if the base column is a SMALLINT. The problem is that a SUM of
INT also returns INT (instead of BIGINT).
Tonkuma
2010-09-12 02:30:26 UTC
Permalink
Post by Martin
Post by Tonkuma
Excellent!
I tried it and successed.
SUM( INT(edlevel) )
I don't understand how a point-and-click report writer (which only
understands standard SQL functions) is going to generate the above
expression. Maybe I am missing something.
If current_path was "DB2ADMIN" ,
"SYSIBM","SYSFUN","SYSPROC","SYSIBMADM",
db2admin.SUM was used for SUM( INT(edlevel) ) in the query:
SELECT workdept
, SUM( INT(edlevel) ) AS educated_manpower
, SUM(salary + bonus) AS yearly_payment
FROM employee
GROUP BY
workdept
;

If current_path was SYSIBM","SYSFUN","SYSPROC","SYSIBMADM","DB2ADMIN",
sysibm.SUM was used for SUM( INT(edlevel) ) in same query.

So, it is possible to let DB2 use db2admin.SUM instead of sysibm.SUM
without modifying queries(wheather the queries were generated by tools
or constructed by users/programmers/administrators), by updating
current_path.
Post by Martin
BTW, you don't need to cast a SUM as INT, since that is the default of doing
a SUM, even if the base column is a SMALLINT. The problem is that a SUM of
INT also returns INT (instead of BIGINT).
I used INT(edlevel), because I couldn't find INT column in DB2
supplied SAMPLE tables.
Martin
2010-09-12 00:59:00 UTC
Permalink
Post by Luuk
According to some googling, i found that DB2 functions can be overloaded
So you should be able (but i dont know how) to create your own SUM()
function that always returns a BIGINT
--
Luuk
I don't understand how a custom function is going to be used by a
point-and-click report generator.
Luuk
2010-09-12 16:20:01 UTC
Permalink
Post by Martin
Post by Luuk
According to some googling, i found that DB2 functions can be overloaded
So you should be able (but i dont know how) to create your own SUM()
function that always returns a BIGINT
--
Luuk
I don't understand how a custom function is going to be used by a
point-and-click report generator.
"Tonkuma" explained that in another thread....
(if you dont see it, look here: http://tinyurl.com/2w54s7x)

Basically you define a new function for the user of this report-regenerator,
give it the name SUM
and make it return a BIGINT (which contains, of course, the SUM)
--
Luuk
Martin
2010-09-12 17:39:15 UTC
Permalink
Post by Luuk
"Tonkuma" explained that in another thread....
(if you dont see it, look here: http://tinyurl.com/2w54s7x)
Basically you define a new function for the user of this
report-regenerator,
give it the name SUM
and make it return a BIGINT (which contains, of course, the SUM)
--
Luuk
I did the following:

1. created the new function as DB2INST1.SUM that converts SUM to BIGINT
2. db2 SET PATH = "DB2INST1","SYSIBM","SYSFUN","SYSPROC","SYSIBMADM"
3. db2 select sum(edlevel) from emp -- this returns an INT
4. db2 select db2inst1.sum(edlevel) from emp -- this returns a BIGINT

I need number 3 above to return a BIGINT. I cannot have the query tool
customize the SQL in any way.
Tonkuma
2010-09-12 19:45:31 UTC
Permalink
Post by Martin
1. created the new function as DB2INST1.SUM that converts SUM to BIGINT
2. db2 SET PATH = "DB2INST1","SYSIBM","SYSFUN","SYSPROC","SYSIBMADM"
3. db2 select sum(edlevel) from emp -- this returns an INT
4. db2 select db2inst1.sum(edlevel) from emp -- this returns a BIGINT
I need number 3 above to return a BIGINT. I cannot have the query tool
customize the SQL in any way.
How did you created the table emp?
(I don't know DB2 supplied emp table.)

I guessed that data type of edlevel in emp is SMALLINT.
Another my guess was that you created the new function as
DB2INST1.SUM( INT ).
If so, best fit for sum(edlevel) in number 3 above is sysibm.sum.
Because sysibm.sum accept SMALLINT directly, while db2inst1.sum need
to convert SMALLINT to INT.

If all my guess looks appropriate, you can confirm my guess by using
sum( INT(edlevel) ) in number 3.
Or change data type of edlevel in emp to INT.
Martin
2010-09-13 00:09:58 UTC
Permalink
Post by Tonkuma
How did you created the table emp?
(I don't know DB2 supplied emp table.)
I guessed that data type of edlevel in emp is SMALLINT.
Yes it is SMALLINT. So what? The IBM supplied SUM function on a SMALLINT
returns a INT. The SUM funciton on an INT returns an INT, and a SUM function
on a BIGINT returns a BIGINT. That's they way DB2 LUW 9.7 FP2 works.
Post by Tonkuma
Another my guess was that you created the new function
as DB2INST1.SUM( INT ).
If so, best fit for sum(edlevel) in number 3 above is sysibm.sum.
Because sysibm.sum accept SMALLINT directly, while db2inst1.sum need
to convert SMALLINT to INT.
If all my guess looks appropriate, you can confirm my guess by using
sum( INT(edlevel) ) in number 3.
Or change data type of edlevel in emp to INT.
I have no idea what you are talking about. The problem is that, even though
I created a UDF called DB2INST1.SUM that does always return a BIGINT as I
want, it will not get invoked by using the SUM(COL-NAME), unless I specify
DB2INST1.SUM(COL_NAME). But I can't do that since the select statemnet is
created by the query tool and is not hand-coded.
Luuk
2010-09-13 09:16:35 UTC
Permalink
Post by Martin
Post by Tonkuma
How did you created the table emp?
(I don't know DB2 supplied emp table.)
I guessed that data type of edlevel in emp is SMALLINT.
Yes it is SMALLINT. So what? The IBM supplied SUM function on a SMALLINT
returns a INT. The SUM funciton on an INT returns an INT, and a SUM function
on a BIGINT returns a BIGINT. That's they way DB2 LUW 9.7 FP2 works.
Post by Tonkuma
Another my guess was that you created the new function
as DB2INST1.SUM( INT ).
If so, best fit for sum(edlevel) in number 3 above is sysibm.sum.
Because sysibm.sum accept SMALLINT directly, while db2inst1.sum need
to convert SMALLINT to INT.
If all my guess looks appropriate, you can confirm my guess by using
sum( INT(edlevel) ) in number 3.
Or change data type of edlevel in emp to INT.
I have no idea what you are talking about. The problem is that, even though
I created a UDF called DB2INST1.SUM that does always return a BIGINT as I
want, it will not get invoked by using the SUM(COL-NAME), unless I specify
DB2INST1.SUM(COL_NAME). But I can't do that since the select statemnet is
created by the query tool and is not hand-coded.
ok, i finnally tries it out meself... ;-)

i'm using database "TEST"

i did create 2 functions:
CONNECT TO TEST;
CREATE FUNCTION luuk.SUM( SMALLINT )
SOURCE sysibm.SUM( BIGINT )
RETURNS BIGINT
;

CONNECT TO TEST;
CREATE FUNCTION luuk.SUM( INTEGER )
SOURCE sysibm.SUM( BIGINT )
RETURNS BIGINT
;

DESCRIBE SELECT * FROM TEST
returns:
SQL type;Type length; Column name; Name length
497 INTEGER; 4; TESTINT; 7
501 SMALLINT; 2; TESTSMALL; 9

after:
set path = "LUUK","SYSIBM","SYSFUN","SYSPROC","SYSIBMADM"

this:
DESCRIBE SELECT SUM(TESTINT) AS sum FROM TEST
returned:
SQL type; Type length; Column name; Name length
493 BIGINT; 8; SUM; 3

this:
DESCRIBE SELECT SUM(TESTSMALL) AS sum FROM TEST
returned:
SQL type; Type length; Column name; Name length
493 BIGINT; 8; SUM; 3


My conclusion is that this works,
The only thing you have to find out
(and my DB2 knowlede is too limited to know)
it this default PATH can be changed.....
--
Luuk
Mark A
2010-09-13 14:13:59 UTC
Permalink
Post by Luuk
ok, i finnally tries it out meself... ;-)
i'm using database "TEST"
My conclusion is that this works,
The only thing you have to find out
(and my DB2 knowlede is too limited to know)
it this default PATH can be changed.....
--
Luuk
Yes, we already know that creating and executing the functions work if the
schema name of the new function is specified. That was absolutely clear from
the previous posts if you read them.

The problem is in getting the newly defined function called <schema>.SUM to
be invoked when SUM is specified in the SQL. To say that this is "the only
thing ...." is a gross understatement.
Luuk
2010-09-13 16:53:59 UTC
Permalink
Post by Mark A
Post by Luuk
ok, i finnally tries it out meself... ;-)
i'm using database "TEST"
My conclusion is that this works,
The only thing you have to find out
(and my DB2 knowlede is too limited to know)
it this default PATH can be changed.....
--
Luuk
Yes, we already know that creating and executing the functions work if the
schema name of the new function is specified. That was absolutely clear from
the previous posts if you read them.
sorry, it was not that clear to /me
cause will be the combination of my English, and my knowledge of DB2
Post by Mark A
The problem is in getting the newly defined function called<schema>.SUM to
be invoked when SUM is specified in the SQL. To say that this is "the only
thing ...." is a gross understatement.
Do you expect any other problems?

I do mean, when only the PATH of this 1 user changes, all other users of
the same database should not notice anything.
--
Luuk
Mark A
2010-09-13 18:05:50 UTC
Permalink
Post by Luuk
Do you expect any other problems?
I do mean, when only the PATH of this 1 user changes, all other users of
the same database should not notice anything.
--
Luuk
This is what Martin posted:

I did the following:

1. created the new function as DB2INST1.SUM that converts SUM to BIGINT
2. db2 SET PATH = "DB2INST1","SYSIBM","SYSFUN","SYSPROC","SYSIBMADM"
3. db2 select sum(edlevel) from emp -- this returns an INT
4. db2 select db2inst1.sum(edlevel) from emp -- this returns a BIGINT

Notice that even after setting the path in step 2, that step 3 does not work
as desired (returns INT).

Step 4 works (returns BIGINT) when explicitly naming the new UDF (including
schema name).
Tonkuma
2010-09-13 18:55:08 UTC
Permalink
Post by Martin
1. created the new function as DB2INST1.SUM that converts SUM to BIGINT
2. db2 SET PATH = "DB2INST1","SYSIBM","SYSFUN","SYSPROC","SYSIBMADM"
3. db2 select sum(edlevel) from emp -- this returns an INT
4. db2 select db2inst1.sum(edlevel) from emp -- this returns a BIGINT
Notice that even after setting the path in step 2, that step 3 does not work
as desired (returns INT).
Because data type of edlevel used in sum(edlevel) is SMALLINT, so
sysibm.SUM is "best fit" to sum(edlevel) than db2inst1.SUM which takes
INT as a parameter.

You can't say "SET PATH not worked as expected" from Martin's test
steps.
(Matching data type is precedent than schema selection using
current_path.)

DB2 will choose db2inst1.SUM for sum( INT ).

You can test this by changing data type of edlevel to INT or by using
INT column(or expression) for sum(expression) in your query.

If you couldn't understand my explanations(including my previous
post),
you should read carefully "Function resolution", "Determining the set
of candidate functions" and "Determining the best fit" in Chapter 2.
Language elements in the manual "DB2 SQL Reference, Volume 1".
Luuk
2010-09-13 20:01:32 UTC
Permalink
Post by Tonkuma
Post by Martin
1. created the new function as DB2INST1.SUM that converts SUM to BIGINT
2. db2 SET PATH = "DB2INST1","SYSIBM","SYSFUN","SYSPROC","SYSIBMADM"
3. db2 select sum(edlevel) from emp -- this returns an INT
4. db2 select db2inst1.sum(edlevel) from emp -- this returns a BIGINT
Notice that even after setting the path in step 2, that step 3 does not work
as desired (returns INT).
Because data type of edlevel used in sum(edlevel) is SMALLINT, so
sysibm.SUM is "best fit" to sum(edlevel) than db2inst1.SUM which takes
INT as a parameter.
You can't say "SET PATH not worked as expected" from Martin's test
steps.
(Matching data type is precedent than schema selection using
current_path.)
lucky /me.... my first reading make me think this,
thats why i started testing tooo.

my tests shows changing PATH works....
Post by Tonkuma
DB2 will choose db2inst1.SUM for sum( INT ).
You can test this by changing data type of edlevel to INT or by using
INT column(or expression) for sum(expression) in your query.
If you couldn't understand my explanations(including my previous
post),
you should read carefully "Function resolution", "Determining the set
of candidate functions" and "Determining the best fit" in Chapter 2.
Language elements in the manual "DB2 SQL Reference, Volume 1".
nah, too much IBM documents..... ;-)
--
Luuk
Martin
2010-09-14 04:59:39 UTC
Permalink
Post by Tonkuma
Because data type of edlevel used in sum(edlevel) is SMALLINT, so
sysibm.SUM is "best fit" to sum(edlevel) than db2inst1.SUM which takes
INT as a parameter.
OK, my apologies. It does work when doing a SUM on INT column with my own
UDF and the set path does work. I didn't understand (or I didn't believe)
that there was a distiction between SUMing a SMALLINT and INT, but there
obviously is.

Now I just need to to figure out how to automatically set the path for
ad-hoc users (they will not do it manually). Looks like 9.7 PF 3 may do
that, but waiting on FP3 for the Linux 64-bit version for AMD/Intel to be
released.
Serge Rielau
2010-09-13 04:17:30 UTC
Permalink
Martin,

Such a change would break a large number of existing applications.
So your request will yield a near "over my dead body" answer ;-)
The behavior of SUM is well documented and when a tool is summing INTs
and there is a fear that may overflow then it shall cast to a higher
type. And that may not even be BIGINT, but DECFLOAT(34).

To fix an existing behavior the proposal to overload is a good one.
Not that in DB2 9.7 FP3 there is also a "connect procedure" which allows
you to force e.g. a SET PATH statement onto any session.

Cheers
Serge
--
Serge Rielau
SQL Architect DB2 for LUW
IBM Toronto Lab
Martin
2010-09-13 06:58:00 UTC
Permalink
Post by Serge Rielau
Martin,
Such a change would break a large number of existing applications.
So your request will yield a near "over my dead body" answer ;-)
The behavior of SUM is well documented and when a tool is summing INTs and
there is a fear that may overflow then it shall cast to a higher type. And
that may not even be BIGINT, but DECFLOAT(34).
To fix an existing behavior the proposal to overload is a good one.
Not that in DB2 9.7 FP3 there is also a "connect procedure" which allows
you to force e.g. a SET PATH statement onto any session.
Cheers
Serge
--
Serge Rielau
SQL Architect DB2 for LUW
IBM Toronto Lab
How about a environment variable to override the current SUM function that
would casue the SUM to return a BIGINT. This would be useful for Data
Warehouse apps where query tools generate the SQL and we cannot cast the
columns becasue the tool generates the SQL on the fly for ad-hoc users.

I tried to create my own SUM funciton (DB2INST1.SUM) that works fine, but
cannot override the search order with the PATH statement that would cause it
to be invoked with just referencing SUM (as I mentioned in my previous
post). Am I missing something?
Ian
2010-09-13 22:09:32 UTC
Permalink
Post by Serge Rielau
Not that in DB2 9.7 FP3 there is also a "connect procedure" which allows
you to force e.g. a SET PATH statement onto any session.
This is so awesome, you just made my day. Thank you for adding this
feature!


Ian
Martin
2010-09-14 04:52:18 UTC
Permalink
Post by Serge Rielau
Not that in DB2 9.7 FP3 there is also a "connect procedure" which allows
you to force e.g. a SET PATH statement onto any session.
Is this a client feature or server feature? Do you know where it is
documented?
Helmut Tessarek
2010-09-14 05:21:18 UTC
Permalink
Post by Martin
Is this a client feature or server feature? Do you know where it is
documented?
This is a server feature, implemented as a db configuration parameter.

See: http://bit.ly/c6T8ay
--
Helmut K. C. Tessarek
DB2 Performance and Development

/*
Thou shalt not follow the NULL pointer for chaos and madness
await thee at its end.
*/
Serge Rielau
2010-09-14 14:16:38 UTC
Permalink
Post by Helmut Tessarek
Post by Martin
Is this a client feature or server feature? Do you know where it is
documented?
This is a server feature, implemented as a db configuration parameter.
See: http://bit.ly/c6T8ay
db2 => CREATE OR REPLACE PROCEDURE SERGE.ON_CONNECT() SET PATH MYSTUFF,
CURRENT PATH;
DB20000I The SQL command completed successfully.
db2 => UPDATE DB CFG USING CONNECT_PROC 'SERGE.ON_CONNECT';
DB20000I The UPDATE DATABASE CONFIGURATION command completed successfully.
db2 => VALUES CURRENT PATH;
1
--------------------------------------------------------------------
"SYSIBM","SYSFUN","SYSPROC","SYSIBMADM","SERGE"
1 record(s) selected.

db2 => CONNECT RESET;
DB20000I The SQL command completed successfully.
db2 => CONNECT TO TEST;

Database Connection Information

Database server = DB2/NT 9.7.3
SQL authorization ID = SERGE
Local database alias = TEST

db2 => VALUES CURRENT PATH;
1
---------------------------------------------------------
"MYSTUFF","SYSIBM","SYSFUN","SYSPROC","SYSIBMADM","SERGE"
1 record(s) selected.

db2 =>
--
Serge Rielau
SQL Architect DB2 for LUW
IBM Toronto Lab
p***@gmail.com
2018-06-18 18:13:58 UTC
Permalink
I know this is an old thread, but I just ran into this. Wrapping the column in bigint solved it.

SELECT
SUM(BIGINT(COLUMNNAME))
FROM TABLE.

Loading...