Introduction to SQL Server Proactive Caching


Arshad Ali explores Proactive Caching, available in SQL Server Analysis Services 2005 and later versions. Proactive Caching helps DBAs to better control the cube data automatic refresh frequency. Proactive Caching offers near real time data for reports, while at same time providing a query performance similar to MOLAP storage mode.

Proactive Caching

SQL Server Analysis Services offers three basic storage modes (MOLAP, ROLAP and HOLAP) to store
cube data and aggregations. These
storage modes determine storage location, storage space requirements, cube
processing and cube query performance. For example. the default MOLAP storage
mode stores both the cube data and its aggregation in the OLAP server itself
and hence it requires more space on the OLAP server and provides the fastest
query performance. In ROLAP mode, both data and aggregation are stored in the
relational data store and no additional space is required at the OLAP server
(other than to store meta data, which is stored on the OLAP server anyway,
irrespective of any storage modes) but query performance is slowest in this
case. HOLAP storage mode is a hybrid between MOLAP and ROLAP; it does mean in
this storage mode data is stored in the relational data store whereas the
aggregations are stored in the OLAP server; in this mode there is a trade-off
between required storage space and query performance.

SQL Server Analysis Services 2005 and later versions have a feature called
Proactive Caching, which allows the database administrator to better control
the cube data automatic refresh frequency so that reports based on these can
get near real time data while at the same time providing a query performance
similar to MOLAP storage mode.

Introduction

Mostly data warehousing and Business Intelligence (BI)
applications work on historical data, which are normally refreshed once in a daily,
weekly or monthly basis. Although there are some near real time data
warehousing and BI applications, their query performance is painfully slow.  To
help in this scenario, Analysis Services (Microsoft SQL Server 2005 Analysis
Services and later versions) provides an amazing feature called "Proactive
Caching". This new feature gives a query performance, similar to what you
had with historical data, with near real time data and minimal impact on the overall
system. As a BI developer, you just need to configure proactive caching, and
then SQL Server Analysis Services (SSAS) will ensure that you get the performance
of a fully processed data warehouse on near real time data.

How it works

When you configure proactive caching for a cube, a measure group partition
or a dimension, SSAS starts listening to data change notifications from the
underlying relational database. On the relational database side, if there is
any change in the data of the related objects, an event is generated and
notification is sent. SSAS intercepts this notification and start refreshing
the MOLAP cache. Depending on your configuration, during that time frame SSAS
either drops the Multidimensional OLAP structure (MOLAP cache) and serves the
query request from the underlying relational database (behaves as ROLAP storage
mode) or creates another MOLAP cache in parallel to the outdated one. As long
as the new MOLAP cache is being built, the queries are served from the outdated
MOLAP cache (of course query result will be based on old data while the new
MOLAP cache is being built). The moment the new refreshed MOLAP cache is ready,
all queries are switched back to this new MOLAP cache and the outdated MOLAP
cache is dropped. This process of synchronization happens in the background so
there is minimal impact on the overall system.

While configuring the proactive caching you can specify the following:

  • How frequently the MOLAP cache is to be rebuilt
  • From where SSAS should get data while the data refresh is
    in progress; it should get it from underlying relational database or from the
    outdated MOLAP cache.
  • The MOLAP cache should be refreshed automatically when a
    notification arrives or on specified interval/scheduled basis.

After enabling proactive caching, you get the best query performance of
MOLAP storage mode on the most recent data. You should consider if your clients
want to see recent data without compromising with the query performance.

MOLAP

In MOLAP storage mode (default mode), detail data and aggregations are
stored in multidimensional format at the SSAS server and no notifications are
received by the SSAS server because proactive caching does not work in this
mode. MOLAP storage mode is applicable if you want the best query performance,
freshness of data is not that important and you have enough storage space on
the SSAS server. You can either schedule processing of MOLAP cache or process
it manually whenever required. Apart from this mode, all other storage modes
discussed below support enabling proactive caching.

Scheduled
MOLAP

Like MOLAP storage mode, in Scheduled MOLAP mode the detail data and
aggregations will be stored in multidimensional format at the SSAS server. The
only difference is, MOLAP cache is automatically rebuilt on a 24 hour schedule
by SSAS itself (SSAS does not listen for change notification from relational
data source; it just does the processing every 24 hours). As a Data
Warehouse/BI developer, to do the processing you aren’t required to create an
SSIS Package or SQL Server agent job with this mode.

Scheduled MOLAP is suitable for applications that have daily
data refresh requirement.

Automatic
MOLAP

Again, like MOLAP storage mode, in Automatic MOLAP mode the detail data and
aggregations will be stored in multidimensional format at the SSAS server but
in this mode, SSAS listens for change notifications from the underlying
relational database source. If any update happens in the relational database,
an event is triggered and SSAS is notified about the data changes. On receiving
this change notification, SSAS automatically starts refreshing the cube or
partition data. It actually a creates a new MOLAP cache and also maintains the
old one in parallel so that the existing request can be served from the
outdated/old MOLAP cache. The moment creation and processing of new MOLAP cache
is complete, all query requests are switched back to the new MOLAP cache and the
old one is dropped.

If you want processing to occur,
to refresh the data in MOLAP, with a latency interval of 2 hours and if you
also want high query performance, this mode would be your choice. During
rebuild and processing of the new MOLAP cache, the query gets stale data as it
is accessing the outdated MOLAP cache.

With this mode there are two important settings (Silence Interval
and Silence Override Interval) that you need to be aware of. Silence
Interval has a default value of 10 seconds, which means after receiving a
change notification SSAS waits for another 10 seconds to see if there is any
more change notification coming (this ensures more updates can be considered in
a single processing). If another change notification does not arrive in 10
seconds, SSAS will start processing the cube or partition. If any change
notification arrive within 10 seconds then it waits for another 10 seconds
until the Silence Override Interval is reached (default is 10 minutes). Silence
Override Interval is required because without Silence Override Interval, your
cube will never be processed if continuous updates are happening at relational
data source.

Medium
Latency MOLAP

Like the MOLAP storage mode, in this mode the detail data and aggregations
will be stored in multidimensional format at the SSAS server. Also, as in case
of Automatic MOLAP, SSAS listens for change notifications from the underlying
relational database. On receiving this change notification, SSAS automatically
invalidates the existing MOLAP caches and starts reprocessing it (notice it
does not create another MOLAP cache in parallel to the old one, as in the case
of Automatic MOLAP). While reprocessing, the queries are switched to real time
ROLAP (discussed in the next article) and once the processing of the MOLAP
cache is complete; queries are again switched back to the MOLAP cache. Since
during reprocessing real time ROLAP is used to serve the queries, performance
would obviously be slower during this time frame.

This mode has a target latency of 4 hours; it means SSAS will drop out the MOLAP
cache after 4 hours of receiving any change notification from the underlying
relational database. Similar to automatic MOLAP, in this case Silence Interval
has a default value of 10 seconds and Silence Override Interval has a default
value of 10 minutes.

You would use this storage mode when the relational data source is updated
infrequently and query performance is important for your users.

Low Latency MOLAP

As you might have guessed, Low Latency MOLAP is, to some extent, similar to
Medium Latency MOLAP mode. The only basic difference is the latency; in this
mode, the processing occurs automatically as data changes with a target latency
of 30 minutes after the first change notification is received.

A point to note here is that since this mode does switching to real time
ROLAP frequently (30 minutes after a change notification), the performance of a
query would be poorer.

You would use this storage mode when the relational data source is updated
frequently and refreshed data is more important than query performance.

Conclusion

In this article, I’ve discussed a new feature, called Proactive Caching, of
SSAS 2005 and later versions. This new feature gives similar query performance
(as you had with historical data) with near real time data, with minimal impact
on the overall system. As a BI developer, you just need to configure proactive
caching, and then SQL SSAS will ensure you get performance of a fully processed
data warehouse on near real time data without creating any SSIS packages or SQL
Server Agent jobs. In this article, I have also covered different variants of
settings if the storage mode is MOLAP. In my next article, I will be covering
settings for HOLAP and ROLAP, and how to configure proactive caching.

References

MSDN: Proactive
Caching (Partitions)

MSDN: Choosing
a Standard Storage Setting

»


See All Articles by Columnist

Arshad Ali

Arshad Ali
Arshad Ali
Arshad Ali works with Microsoft India R&D Pvt Ltd. He has 8+ years of experience, mostly on Microsoft Technologies. Most recently, as a SQL Developer and BI Developer he has been working on a Data Warehousing project. Arshad is an MCSD, MCITP: Business Intelligence, MCITP: Database Developer 2008 and MCITP: Database Administrator 2008 certified and has presented at several technical events including SQL-School. On an educational front, he has an MCA (Master in Computer Applications) and MBA-IT. Disclaimer : I work for Microsoft and help people and businesses make better use of technology to realize their full potential. The opinions mentioned herein are solely mine and do not reflect those of my current employer or previous employers.

Latest Articles