A Look at the Oracle Group-by Bug

Oracle introduced a new feature, group by elimination, for queries where the group by column is also the table’s unique key. As with many new features this one still has not had all the kinks resolved. The problem arises when key values are manipulated with function calls. The following example will illustrate the issue by using a table with a DATE as the primary key and by extracting the year is extracted using TO_CHAR or EXTRACT.

A table is created as follows:


create table bug_test_calendar(
        cal_name   char(17),
        bus_dt   date,
        updt_timestamp       timestamp (6) default systimestamp,
        constraint pk_bug_test_calendar 
                        primary key (bus_dt)
)
/

insert into bug_test_calendar (bus_dt)
select
        sysdate + 10 * rownum
from 
        all_objects 
where 
        rownum <= 40 
/

commit;

When the query shown below is executed, it produces the following results:


select
        to_char(bus_dt,'YYYY') bus_dt, count(*) ct
from
       bug_test_calendar
group by 
        to_char(bus_dt,'YYYY')
order by 
        to_char(bus_dt,'YYYY')
/


BUS_DF   CT
-------  --
2020      1
2020      1
...
2020      1

40 rows returned

Oracle doesn’t ‘know’ that the key values have been manipulated so that they are no longer unique, thus the optimizer applies the unique-key-based group-by elimination with less than stellar results,

EXTRACT fares no better, returning the same results. This behavior is controlled by the “_optimizer_aggr_groupby_elim” parameter, which is set to true by default. As it’s a hidden parameter, its setting is not reported by Oracle in either of the V$PARAMEter or V$SPPARAMETER views. The work-around is to simply set this parameter to false. However, having it active might help other group-by queries where the unique key values are not manipulated.

Enter Oracle 19c, where this functionality is partially fixed:


select
        to_char(bus_dt,'YYYY') bus_dt, count(*) ct
from
       bug_test_calendar
group by 
        to_char(bus_dt,'YYYY')
order by 
        to_char(bus_dt,'YYYY')
/


BUS_DF   CT
-------  --
2020     40


Unfortunately EXTRACT is still broken in 19c:


select
        to_char(bus_dt,'YYYY') bus_dt, count(*) ct
from
       bug_test_calendar
group by 
        extract(year deom bus_dt)
order by 
        extract(year deom bus_dt)
/


BUS_DF   CT
-------  ==
2020      1
2020      1
...
2020      1

40 rows returned

Obviously given truly unique key values a group-by query would produce a count of 1 for each key. And, just as obvious, Oracle should be able to recognize when values are no longer unique and invoke the proper group-by mechanism. It remains to be seen if versions after 19c will fix the second condition and thus return correct results without having to turn off this feature.

This may not affect every installation of Oracle newer than 12.1, but it is worth knowing about should wrong results start appearing in selected group by queries.

# # #

See articles by David Fitzjarrell

David Fitzjarrell
David Fitzjarrell
David Fitzjarrell has more than 20 years of administration experience with various releases of the Oracle DBMS. He has installed the Oracle software on many platforms, including UNIX, Windows and Linux, and monitored and tuned performance in those environments. He is knowledgeable in the traditional tools for performance tuning – the Oracle Wait Interface, Statspack, event 10046 and 10053 traces, tkprof, explain plan and autotrace – and has used these to great advantage at the U.S. Postal Service, American Airlines/SABRE, ConocoPhilips and SiriusXM Radio, among others, to increase throughput and improve the quality of the production system. He has also set up scripts to regularly monitor available space and set thresholds to notify DBAs of impending space shortages before they affect the production environment. These scripts generate data which can also used to trend database growth over time, aiding in capacity planning. He has used RMAN, Streams, RAC and Data Guard in Oracle installations to ensure full recoverability and failover capabilities as well as high availability, and has configured a 'cascading' set of DR databases using the primary DR databases as the source, managing the archivelog transfers manually and montoring, through scripts, the health of these secondary DR databases. He has also used ASM, ASMM and ASSM to improve performance and manage storage and shared memory.

Get the Free Newsletter!

Subscribe to Cloud Insider for top news, trends & analysis

Latest Articles