Monitoring Smart Scans in Oracle Exadata

Smart Scans are part of what Kerry Osborne calls the ‘secret sauce’ of Exadata, and he’s right. Smart Scans can definitely improve throughput by using the various optimizations to reduce the amount of data read and the volume of data returned to the database servers. You really want to have them working on an Exadata machine. But how do you know they’re working? Let’s look at a few ways to prove you have Smart Scans working for you.

The easiest way to prove you have a Smart Scan working (from the database server side) is to query the V$SQL view (or GV$SQL if you want to check across the RAC cluster). There are two ‘counters’ available, which report on Smart Scan activity; these are IO_CELL_OFFLOAD_ELIGIBLE_BYTES and IO_CELL_OFFLOAD_RETURNED_BYTES. I’ve used this example in another article but it’s a good example for showing Smart Scan activity so I’ll use it again here:

SQL> select *
  2  from emp
  3  where empid = 7934;

     EMPID EMPNAME                                      DEPTNO
---------- ---------------------------------------- ----------
      7934 Smorthorper7934                                  15

Elapsed: 00:00:00.02

Execution Plan
Plan hash value: 3956160932

| Id  | Operation                 | Name | Rows  | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT          |      |     1 |    26 |  1167   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS STORAGE FULL| EMP  |     1 |    26 |  1167   (1)| 00:00:01 |

Predicate Information (identified by operation id):

   1 - storage("EMPID"=7934)

          1  recursive calls
          1  db block gets
       4349  consistent gets
       4276  physical reads
          0  redo size
        680  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> set autotrace off timing off
SQL> select       sql_id,
  2          io_cell_offload_eligible_bytes qualifying,
  3          io_cell_offload_returned_bytes actual,
  4          round(((io_cell_offload_eligible_bytes - io_cell_offload_returned_bytes)/io_cell_offload_eligible_bytes)*100, 2) io_saved_pct,
  5          sql_text
  6  from v$sql
  7  where io_cell_offload_returned_bytes > 0
  8  and instr(sql_text, 'emp') > 0
  9  and parsing_schema_name = 'BING';

------------- ---------- ---------- ------------ --------------------------------------
gfjb8dpxvpuv6   35028992       6872        99.98 select * from emp where empid = 7934


Looking at the output from the V$SQL query we can see that a Smart Scan was executed since we had both eligible and returned bytes from the cell offload process. This is probably the quickest and easiest way to verify Smart Scan activity. There is another way though, that is a bit more involved but gives more detailed stats on Smart Scan activity. This method involves the storage cells and the cellsrvstat program; since it involves the storage cells it can be a bit more work, especially if passwordless ssh connectivity isn’t configured between the database servers and the storage cells. [I won’t go into configuring that in this article but there are various resources on the Internet available to assist with that task. is a great place to start looking for those.] I will presume that such connectivity is configured; getting these stats is a fairly straightforward task that can be scripted and run from the command line on one of the database servers. The script would be similar to this:


ssh [email protected]  "cellsrvstat -stat_group=smartio -interval=$1 -count=$2"
ssh [email protected]  "cellsrvstat -stat_group=smartio -interval=$1 -count=$2"
ssh [email protected]  "cellsrvstat -stat_group=smartio -interval=$1 -count=$2"

The script takes two parameters, the sample interval and the count of samples to execute; as an example of how this would look at the Linux command line:

$ 5 10

which would set a sampling interval of 5 seconds with 10 samples gathered. One nice aspect of setting this up with a script and ssh access is that you can redirect the output to a local file on the database server and look at it later:

$ 5 10 > cellsrvstat_all_cells_smartio.lst

This makes it easier to scan all of the output later for review. The output is presented in a tabular format, shown below:

===Current Time===                                      Fri Mar  8 14:37:15 2013

== SmartIO related stats ==
Number of active smart IO sessions                              8             11
High water mark of smart IO sessions                            0            130
Number of completed smart IO sessions                           4        8490054
Smart IO offload efficiency (percentage)                        0              7
Size of IO avoided due to storage index (KB)                    0  2511867327696
Current smart IO to be issued (KB)                              0         214456
Total smart IO to be issued (KB)                            69592  5335617009440
Current smart IO in IO (KB)                                     0              0
Total smart IO in IO (KB)                                   69592  5333234484896
Current smart IO being cached in flash (KB)                     0              0
Total smart IO being cached in flash (KB)                       0              0
Current smart IO with IO completed (KB)                         0           3200
Total smart IO with IO completed (KB)                       69592  5427224552840
Current smart IO being filtered (KB)                            0              0
Total smart IO being filtered (KB)                          69592  5426811394136
Current smart IO filtering completed (KB)                       0          24528
Total smart IO filtering completed (KB)                     69592  5332811276704
Current smart IO filtered size (KB)                             0           8075
Total smart IO filtered (KB)                                13258   378946996063
Total cpu passthru output IO size (KB)                          0      449965296
Total passthru output IO size (KB)                              0      472758949
Current smart IO with results in send (KB)                      0              0
Total smart IO with results in send (KB)                    69592  5332478889672
Current smart IO filtered in send (KB)                          0              0
Total smart IO filtered in send (KB)                        13258   378925244226
Total smart IO read from flash (KB)                             0              0
Total smart IO initiated flash population (KB)                  0              0
Total smart IO read from hard disk (KB)                     69592  2786064722152
Total smart IO writes (fcre) to hard disk (KB)                  0    35298961256
Number of smart IO requests < 512KB                            20      134456077
Number of smart IO requests >= 512KB and < 1MB                 65      116155599
Number of smart IO requests >= 1MB and < 2MB                    3        2226841
Number of smart IO requests >= 2MB and < 4MB                    0      188723594
Number of smart IO requests >= 4MB and < 8MB                    0     1085851766
Number of smart IO requests >= 8MB                              0              0
Size of the smart IO 1MB IO quota being used                    0              0
Hwm of the smart IO 1MB IO quota being used                     0           1002
Number of failures to get 1MB IO quota for smart IO             0         674099
Number of times smart IO buffer reserve failures                0              0
Number of times smart IO request misses                         0         296985
Number of times IO for smart IO not allowed to be issued        0    58504757930
Number of times smart IO prefetch limit was reached             2      125315622
Number of times smart scan used unoptimized mode                0              0
Number of times smart fcre used unoptimized mode                0              0
Number of times smart backup used unoptimized mode              0              0

The metrics are, to me at least, clearly described in the table so you won’t be wondering what an abbreviated metric name actually represents.

Another nice aspect of the storage cell method is you can monitor Smart Scan activity in progress, instead of waiting for the process to finish and Oracle to update the stats in the V$SQL/GV$SQL views. To monitor all cells simultaneously you can create three (or more, depending upon how many storage cells you have in your Exadata configuration) separate scripts and run them at basically the same time in the background. To be ‘really cool’ you could wrap the individual storage cell scripts into a single driver script and, at the end of the monitoring, concatenate the log files together into one huge report. Neat.

Exadata provides not only Smart Scans but at least a couple of ways to monitor them; which one you use is your choice really, and it depends on if you want real time monitoring or just a daily report on how much Smart Scan activity occurred on a given day. Either way you’ll know if you’re executing Smart Scans and how much time (and I/O) you’ve saved because of them.

See all articles by David Fitzjarrell

David Fitzjarrell
David Fitzjarrell
David Fitzjarrell has more than 20 years of administration experience with various releases of the Oracle DBMS. He has installed the Oracle software on many platforms, including UNIX, Windows and Linux, and monitored and tuned performance in those environments. He is knowledgeable in the traditional tools for performance tuning – the Oracle Wait Interface, Statspack, event 10046 and 10053 traces, tkprof, explain plan and autotrace – and has used these to great advantage at the U.S. Postal Service, American Airlines/SABRE, ConocoPhilips and SiriusXM Radio, among others, to increase throughput and improve the quality of the production system. He has also set up scripts to regularly monitor available space and set thresholds to notify DBAs of impending space shortages before they affect the production environment. These scripts generate data which can also used to trend database growth over time, aiding in capacity planning. He has used RMAN, Streams, RAC and Data Guard in Oracle installations to ensure full recoverability and failover capabilities as well as high availability, and has configured a 'cascading' set of DR databases using the primary DR databases as the source, managing the archivelog transfers manually and montoring, through scripts, the health of these secondary DR databases. He has also used ASM, ASMM and ASSM to improve performance and manage storage and shared memory.

Latest Articles