Performance is the reason companies choose Exadata over other platforms, as it can provide avenues of improvement commodity hardware can’t match. One of those avenues is the Smart Scan, a mechanism where the database servers and the storage cells join forces to return the requested results. Let’s see what triggers a Smart Scan and discover how it reduces the processing time for qualifying queries.
Smart Scans are remarkable; they take a seemingly common query and divide the labor between the database servers and the storage cells to return results in a fraction of the time required from non-Exadata hardware. There are several conditions, which must be met before a Smart Scan can be executed; I’ve mentioned these before but I will provide them again. To qualify for a Smart Scan a query must:
Use full table scans and/or full index scans coupled with direct path reads, have at least one predicate and use one or more of the following simple comparison operators
=,<,>,>=,=<,BETWEEN,IN,IS NULL,IS NOT NULL
– or –
Use parallel query, again with at least one predicate and using the above mentioned simple comparison operators.
Once the criterion is met the ‘magic’ begins. Oracle will pass the query to the storage cells in a manner similar to that used by Parallel Query (but, remember, Parallel Query does not need to be in use to get a Smart Scan to run). Each storage cell then processes the query/where clause, builds or updates the various storage indexes and returns only the requested data to the database servers eliminating the need for these servers to actually scan/read entire data blocks. Since the end result of such pre-processing is much smaller, the database servers can more quickly process the data, returning the final results to the user very quickly.
Knowing if a Smart Scan was used is fairly simple, but does require a bit of work. Execution plans can show Smart Scans were executed but those plans need to be generated by the optimizer at run time. Explain Plan won’t be a reliable indicator of Smart Scan usage as a different mechanism is used to generate the anticipated plan. Note that I said ‘anticipated plan’ as that is exactly what Explain Plan provides, a guess at what Oracle will actually do to return the data. Valid methods for gathering usable plans are:
- Using autotrace on
- Querying either V$SQL_PLAN or DBA_HIST_SQL_PLAN for the actual plan
- Tracing the session with event 10046 and processing the trace file with tkprof
[There are a number of options to autotrace, including explain, but the explain option is no better than using Explain Plan.]
There are also two counters available in the V$SQL and GV$SQL views, which report on Smart Scan activity; these are IO_CELL_OFFLOAD_ELIGIBLE_BYTES and IO_CELL_OFFLOAD_RETURNED_BYTES. Let’s look at a query that uses a Smart Scan and query V$SQL for the resulting I/O savings:
SQL> select * 2 from emp 3 where empid = 7934; EMPID EMPNAME DEPTNO ---------- ---------------------------------------- ---------- 7934 Smorthorper7934 15 Elapsed: 00:00:00.02 Execution Plan ---------------------------------------------------------- Plan hash value: 3956160932 ---------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 26 | 1167 (1)| 00:00:01 | |* 1 | TABLE ACCESS STORAGE FULL| EMP | 1 | 26 | 1167 (1)| 00:00:01 | ---------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - storage("EMPID"=7934) filter("EMPID"=7934) Statistics ---------------------------------------------------------- 1 recursive calls 1 db block gets 4349 consistent gets 4276 physical reads 0 redo size 680 bytes sent via SQL*Net to client 524 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed SQL> SQL> set autotrace off timing off SQL> SQL> select sql_id, 2 io_cell_offload_eligible_bytes qualifying, 3 io_cell_offload_returned_bytes actual, 4 round(((io_cell_offload_eligible_bytes - io_cell_offload_returned_bytes)/io_cell_offload_eligible_bytes)*100, 2) io_saved_pct, 5 sql_text 6 from v$sql 7 where io_cell_offload_returned_bytes > 0 8 and instr(sql_text, 'emp') > 0 9 and parsing_schema_name = 'BING'; SQL_ID QUALIFYING ACTUAL IO_SAVED_PCT SQL_TEXT ------------- ---------- ---------- ------------ -------------------------------------- gfjb8dpxvpuv6 35028992 6872 99.98 select * from emp where empid = 7934 SQL> SQL> create index empid_idx on emp(empid); Index created. SQL> SQL> set autotrace on timing on SQL> SQL> select * 2 from emp 3 where empid = 7934; EMPID EMPNAME DEPTNO ---------- ---------------------------------------- ---------- 7934 Smorthorper7934 15 Elapsed: 00:00:01.00 Execution Plan ---------------------------------------------------------- Plan hash value: 1109982043 ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 26 | 4 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 26 | 4 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | EMPID_IDX | 1 | | 3 (0)| 00:00:01 | ----------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("EMPID"=7934) Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 5 consistent gets 2 physical reads 148 redo size 684 bytes sent via SQL*Net to client 524 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed SQL> SQL> set autotrace off timing off SQL> SQL> select sql_id, 2 io_cell_offload_eligible_bytes qualifying, 3 io_cell_offload_returned_bytes actual, 4 round(((io_cell_offload_eligible_bytes - io_cell_offload_returned_bytes)/io_cell_offload_eligible_bytes)*100, 2) io_saved_pct, 5 sql_text 6 from v$sql 7 where io_cell_offload_returned_bytes > 0 8 and instr(sql_text, 'emp') > 0 9 and parsing_schema_name = 'BING'; no rows selected SQL>
The execution plan reports that a Smart Scan was used — the TABLE SCAN STORAGE FULL entry and the storage(“EMPID”=7934) predicate information indicate this. To back up that assertion the query from V$SQL shows that there were qualifying and returned offload bytes and computes the percentage of I/O saved because of the Smart Scan. Following traditional practices, an index was created to eliminate the full table scan; notice that the presence of the index also disabled the Smart Scan and increased the response time for the query. Granted, had it not been reported by SQL*Plus, the difference would not have been noticeable but it is interesting to see the elapsed times for both queries.
The I/O savings with Smart Scans can be dramatic; in some cases a Smart Scan can outperform an index scan on the same data (proven by the example shown), which is why it may be beneficial to remove some indexes from tables migrated to Exadata.
Smart Scans may not be the greatest thing since sliced bread but they do fulfill the promise of improved performance offered by Exadata. The nice thing about these is you don’t have to set any obscure parameter to get them to work, which goes to show that simpler can indeed, be better.