SecureFiles in Oracle Database 11g

Large Objects (LOBs): A Brief History Lesson

The
first time I encountered an application that required a binary large object
(BLOB) was literally a millennium ago. In 1999, our IT organization at the
American Medical Association (AMA) in Chicago, Illinois was tasked with
upgrading that organization’s seventy-plus mainframe applications into four
client server applications. I led one of several development teams that used
PowerBuilder to develop the next generation of client-server front-end applications.

Our
flagship membership processing application had a requirement to store a
physician’s personal photograph so that when she accessed the AMA’s web site,
her picture would appear next to her personal information. We later expanded
this requirement so that we could store images of a physician practice location
or an organization’s logo. We also developed a mailing list application that
utilized Microsoft Word documents as templates for mailing labels, form
letters, and fax cover sheets so that we could easily generate documents for
distributing membership communication in standardized formats. During the initial
project development cycle we utilized Informix as our RDBMS, and it had the
capability to store JPEGs and Microsoft Word documents as BLOBs.

At
that time our long-term vision for these images and documents was that they’d
be loaded one time, updated only infrequently, and would be relatively small in
size. And as it turns out, we weren’t the only IT team with this relatively
narrow vision; in fact, these assumptions are reflected in the initial roll-out
of Large Object capabilities in Oracle Database 8i as well:

  • LOBs would
    never grow beyond a size of a few
    hundred kilobytes or in extreme cases a megabyte or two. Based on this
    assumption, the largest “chunk size” – the maximum amount of space that Oracle
    would have to convert from its source file to a LOB at one time – could be kept
    relatively small as well, at just 32KB. 
  • LOBs would be loaded
    relatively infrequently – usually, only once – and then would be updated very rarely. After all, how often
    does a standard mailing label format or form letter change?
  • Likewise, we anticipated that only one application user session would be
    modifying a physician’s picture or a mailing label master document at one time.
    Therefore, concurrency and read consistency was not going to be a major
    concern, so we didn’t need to worry about the impact of maintaining additional
    rollback segments just for LOBs.
  • Finally, remember that Oracle 8i didn’t yet offer the complete power of
    the clustered, multi-instance database to support highly-available, scalable
    applications; that only happened when Oracle 9i
    introduced Cache Fusion as part of its deployment of Real Application Clusters.

So … What’s Changed?

At
the risk of overstating the obvious, the footprint of Large Objects has increased
exponentially within our databases over the last decade. It’s being driven by
diverse business needs that span every industry: medical and pharmaceutical
applications that capture and serve up patient images, manufacturing
applications that store components’ designs, and semi-structured data that
business-to-business (B2B) applications use to order products to replenish a
company’s warehouse. However, these newer application requirements have several
things in common:

  • LOB sources are
    much more diverse.
    These sources include Extended Markup Language (XML) documents, geospatial data,
    biometric information, video files, and other semi-structured or non-structured
    data.
  • Simultaneous
    multiple-user updates are common.
    As semi-structured data like XML
    has proliferated, the need to provide the ability for more than one user to
    simultaneously update the same LOB has also exploded.
  • Large objects
    are gigantic … and getting larger.
    It’s not uncommon for LOBs to
    extend into the gigabyte and even terabyte range, especially for audiovisual
    and medical imaging data.
  • Transmission
    bandwidths have increased dramatically.
    Likewise, the applications
    that consume these LOBs are now able to take advantage of the fiber-optics
    broadband lines that telecommunications providers laid down in the late 1990s.
    (I have several friends who’ve terminated their cable TV subscriptions because
    most of the television shows they want to watch are available as streaming
    video from Internet content providers for no additional cost beyond that of their
    broadband internet service.)
  • Self-published
    user content has exploded.
    The YouTube phenomenon and other web
    sites that store self-produced audiovisual files for wide consumption has also
    contributed to the explosion of LOB sources.
  • Security
    requirements have increased dramatically.
    Finally, many of these
    LOBs require special security – especially personal medical data that is stored
    within Digital Imaging
    for Communication of Medical Information
    (DICOM)
    sources – because of federal, state, and international regulations like the Health Insurance Portability and Accountability Act
    (HIPAA).

LOBs, the Next Generation: SecureFiles

As a
result of these new business requirements, the LOB data type has been
redesigned completely in Oracle 11g
Release 1. This new LOB datatype is called a SecureFile,
but unfortunately that’s a bit of a misnomer because security enhancements are
a relatively small subset of SecureFile’s features, as we’ll see throughout this
article series.

Performance and Manageability Improvements.
Since the sources of LOBs have increased in size by several orders of magnitude
in the past decade, it was obvious that the methods that load, store, and
return data from LOBs needed complete refurbishment. Oracle 11g doesn’t disappoint in this regard:

Bigger CHUNK sizes. In prior releases, the maximum size of a
“chunk” of LOB data that could be read or written at one time was limited to 32KB.
Internal and external LOB sizes have grown exponentially since Oracle 8i, so this was a serious limitation to
effective LOB I/O. Oracle 11g remedies this by automatically choosing an
appropriate CHUNK
size between the database’s standard block size and a maximum of 64MB. Further,
Oracle 11g helps limit the internal fragmentation of SecureFile LOBs
by attempting to store extremely large SecureFile LOB segments on disk so that
they’re physically adjacent. Finally, the ability to utilize variable CHUNK
sizes means that the corresponding UNDO segments generated during updates of
SecureFile LOBs only need to be as large as the CHUNKs and thus contain less
wasted space.

Improved REDO and UNDO Generation.
Before Oracle 11g, the Oracle DBA
had relatively free rein when configuring CHUNK size, as well as the retention
of undo segments for LOBs, and that didn’t always yield optimal results. But
since Oracle 11g will choose automatically
the best method for generating redo and undo for SecureFile LOBs, only two storage-related
parameters are required: RETENTION and MAXSIZE. The RETENTION
directive tells Oracle 11g how to
handle the consistent read data
(stored in UNDO segments) for a SecureFile LOB. Here’s how the setting affects
read consistency:

Table 1-1. RETENTION
Directive Settings for SecureFiles

Setting

Explanation

MAX

is
used to start reclaiming old versions after segment MAXSIZE is reached

MIN

keeps old versions
for the specified least amount of time

AUTO

This
is the default setting. It’s
basically a trade-off between space and time. This is automatically
determined

NONE

Instructs
Oracle 11g to reuse older versions of the SecureFile LOB as much as possible

The
new MAXSIZE storage setting works in concert with the
setting for RETENTION,
and it determines the maximum size of a SecureFile LOB. When RETENTION
is set to MAX,
then the LOB is allowed to grow up to a size of MAXSIZE, after which space can be
reclaimed from UNDO segments; otherwise, the MAXSIZE setting is treated as a
hard limit on the size of the SecureFile LOB’s segment.

Note
that the storage attributes CHUNK, PCTVERSION, FREEPOOLS,
FREELISTS,
and FREELIST
GROUPS
are simply ignored for SecureFile LOBs

“Intelligent Compression.” This new
feature allows Oracle 11g to compress and uncompress SecureFile LOB data
automatically and transparently, and this means that disk space for these data
is reduced dramatically. Whenever a query or DML statement retrieves or applies
changes to a SecureFile LOB’s data, Oracle 11g
uncompresses only those data blocks needed for the operation, and it
automatically maintains the required mapping between the uncompressed data
blocks in the Database Buffer Cache and the database blocks on disk. There are two
compression levels: MEDIUM (the default) and HIGH;
specifying HIGH
will tend to increase latency during SecureFile LOB access, but will also
result in the highest storage compression ratios.

Automatic Deduplication. One of the more
intriguing SecureFile LOB features is the ability for Oracle 11g to automatically locate any LOBs that
store exactly the same data within the same table and column. Once detected,
Oracle 11g will store only one copy
of the LOB data. The advantage here, of course, is dramatically reduced amounts
of disk space are needed to store the same amount of data; further, redo
logging and UNDO is reduced significantly because an update to SecureFile LOB
data doesn’t require retaining duplicate versions of redo log entries and UNDO
segments. Deduplication of SecureFile LOB data is enforceable at either the
table or partition level. Note that deduplication does require the Oracle Advanced Compression option, a
separately-licensed option that’s also required for OLTP table compression and
several other advanced compression features in Oracle 11g.

Transparent Encryption. Every organization
has ethical and legal obligations to protect customer-sensitive
information
stored within LOBs. For example, a patient’s medical
records and radiology images must be protected under HIPAA regulations.
However, there’s also enormous amounts of intellectual
property information
that must be stored in semi-structured object
forms – for example, molecular formulas and manufacturing blueprints – that is
crucial to any company’s competitive advantage.

SecureFile
LOBs fill this hitherto serious gap in database security because it’s now
possible to encrypt LOB data within an Oracle 11g database. SecureFile LOBs are encrypted using the same Transparent Data Encryption (TDE)
algorithms that were introduced in Oracle 10gR2,
and this means it’s possible to encrypt either any or all SecureFile LOBs
within the same table using the four standard TDE encryption algorithms
(AES128, AES192, AES256 and 3DES168). Finally, note that TDE does require the
Oracle Advanced Security pack, an extra-cost option, to enable encryption of
SecureFile LOBs.

Wait! What About My “Old” LOBs?

The
DBA must set the value of the COMPATIBLE initialization
parameter to 11.0.0.0.0
or higher to enable SecureFile LOB features. However, the original LOB datatype
and its capabilities are still preserved for backward compatibility: A
pre-11gR1 LOB is now called a BasicFile.
BasicFile and SecureFile LOBs can even co-exist in the same table. In fact, if
a LOB is not specified as a SecureFile during its creation, then Oracle 11g
simply assumes it to be a BasicFile LOB.

Controlling Selection of SecureFiles for LOB Storage.
The new DB_SECUREFILE
initialization parameter offers another layer of control over how Oracle 11g chooses the type of LOB to create when
neither the BasicFile or SecureFile option is specified:

Table 1-2.
DB_SECUREFILE Initialization Parameter Values

 

Value

Meaning

 

ALWAYS

Attempts
to create all LOBs as SecureFile LOBs but creates any LOBs not in Automatic
Segment Space Management (ASSM) tablespaces as BasicFile LOBs

FORCE

Forces
all LOBs created going forward to be SecureFile LOBs

PERMITTED

(Default) Allows SecureFiles to be created

NEVER

Disallows
SecureFiles from being created going forward. If specified, any LOBs directed
to be created as SecureFiles will be created as BasicFiles instead.

IGNORE

Disallows
SecureFiles and ignores any errors that would otherwise be caused by forcing
BasicFiles with SecureFiles options

Note
that if a SecureFile LOB storage option is specified during the creation or
modification of a BasicFile LOB – for example, TDE encryption, deduplication,
or compression – Oracle 11g will raise an exception during the operation.

Using SecureFiles: A Practical Example

To
illustrate how simple and practical it is to create and use SecureFile LOBs, I’ll
construct a sample application that captures trouble ticket information. Along
with each trouble ticket identifier and description, the application needs to
retain a Microsoft Word document that each customer submits when a trouble
ticket is created; optionally, the customer can also provide an appropriate
screen print or other image in JPEG format to help clarify the problem
encountered.

I’ve
provided the code to create the initial infrastructure for this application in Listing
1.1
:

  • First, I’ll create two new tablespaces, BASICFILES
    and SECUREFILES,
    in which all BasicFile and SecureFile LOBs will be stored, respectively.
  • Next, I’ll create a new schema named TRBTKT
    that will own all database objects for this application.
  • To provide a location from which the Microsoft Word
    documents and JPEG images will be stored as external LOBs until they’re loaded
    into the database, I’ll create two new directory objects, TRBTKT_DOCS
    and TRBTKT_IMGS.

The
sample Microsoft Word documents and JPEG files can be downloaded as a single ZIP file. To duplicate this application
infrastructure, you’ll need to create directory folders that correspond to the
ones created above and then copy the documents and images into their respective
directories.

Now
that my infrastructure is in place, I’ll construct two tables to demonstrate
how to store the trouble ticket information and related LOBs. As shown in Listing
1.2
, both tables are virtually identical; however, table TRBTKT.TICKETS
will utilize BasicFiles to store its LOBs, while, table TRBTKT.SECURE_TICKETS
will use SecureFiles. I’ll be using both tables in this article series to
compare and contrast the different features of BasicFiles and SecureFiles.

I’ll also
construct a new package, TRBTKT.PKG_SECUREFILES, that’s
shown in Listing 1.3.
I’ll use this package to implement all methods for loading LOBs, determining
LOB sizing and space utilization, and displaying rudimentary information from
the character-based LOBs.

Next,
I’ll load sample data into table TRBTKT.TICKETS via calls to
procedure AddTroubleTickets
in package TRBTKT.PKG_SECUREFILES.
I’ve shown the invocation of this procedure in Listing
1.4
. The sample Microsoft Word documents and JPEG files can be
downloaded as a single ZIP file here;
to duplicate my loading technique, you’ll need to create directory folders that
correspond to the ones defined in for directory objects TRBTKT_DOCS
and TRBTKT_IMGS.

Finally,
I’ll transfer the data from TRBTKT.TICKETS to TRBTKT.SECURE_TICKETS
using a simple INSERT
INTO … SELECT
statement as shown in Listing
1.5
.

Next Steps

In
the next article in this series, I’ll delve a lot more deeply into the advanced
capabilities of Oracle Database 11g’s
new SecureFile features, including:

  • Migrating
    efficiently from BasicFile LOBs to SecureFile LOBs
  • Determining the relative storage efficiency of SecureFile vs. BasicFile LOBs
  • Exploring the different compression and deduplication
    options for SecureFile LOBs
  • Encrypting
    SecureFile LOBs via Transparent Data
    Encryption

References and Additional Reading

While
I’m hopeful that I’ve given you a thorough grounding in the technical aspects
of the features I’ve discussed in this article, I’m also sure that there may be
better documentation available since it’s been published. I therefore strongly
suggest that you take a close look at the corresponding Oracle documentation on
these features to obtain crystal-clear understanding before attempting to
implement them in a production environment. Please note that I’ve drawn upon
the following Oracle Database 11g
documentation for the deeper technical details of this article:

B28279-02 Oracle Database 11g New Features Guide

B28320-01 Oracle Database 11g Reference Guide

B28393-03 Oracle Database 11g SecureFiles and Large Objects
Developer’s Guide

B28419-02 Oracle Database 11g PL/SQL Packages and Types
Reference

B28424-03 Oracle Database 11g Advanced Application Developer’s
Guide

»


See All Articles by Columnist
Jim Czuprynski

Jim Czuprynski
Jim Czuprynski
Jim Czuprynski has accumulated over 30 years of experience during his information technology career. He has filled diverse roles at several Fortune 1000 companies in those three decades - mainframe programmer, applications developer, business analyst, and project manager - before becoming an Oracle database administrator in 2001. He currently holds OCP certification for Oracle 9i, 10g and 11g. Jim teaches the core Oracle University database administration courses on behalf of Oracle and its Education Partners throughout the United States and Canada, instructing several hundred Oracle DBAs since 2005. He was selected as Oracle Education Partner Instructor of the Year in 2009. Jim resides in Bartlett, Illinois, USA with his wife Ruth, whose career as a project manager and software quality assurance manager for a multinational insurance company makes for interesting marital discussions. He enjoys cross-country skiing, biking, bird watching, and writing about his life experiences in the field of information technology.

Latest Articles