Large Objects (LOBs): A Brief History Lesson
first time I encountered an application that required a binary large object
(BLOB) was literally a millennium ago. In 1999, our IT organization at the
American Medical Association (AMA) in Chicago, Illinois was tasked with
upgrading that organizations seventy-plus mainframe applications into four
client server applications. I led one of several development teams that used
PowerBuilder to develop the next generation of client-server front-end applications.
flagship membership processing application had a requirement to store a
physicians personal photograph so that when she accessed the AMAs web site,
her picture would appear next to her personal information. We later expanded
this requirement so that we could store images of a physician practice location
or an organizations logo. We also developed a mailing list application that
utilized Microsoft Word documents as templates for mailing labels, form
letters, and fax cover sheets so that we could easily generate documents for
distributing membership communication in standardized formats. During the initial
project development cycle we utilized Informix as our RDBMS, and it had the
capability to store JPEGs and Microsoft Word documents as BLOBs.
that time our long-term vision for these images and documents was that theyd
be loaded one time, updated only infrequently, and would be relatively small in
size. And as it turns out, we werent the only IT team with this relatively
narrow vision; in fact, these assumptions are reflected in the initial roll-out
of Large Object capabilities in Oracle Database 8i as well:
never grow beyond a size of a few
hundred kilobytes or in extreme cases a megabyte or two. Based on this
assumption, the largest chunk size the maximum amount of space that Oracle
would have to convert from its source file to a LOB at one time could be kept
relatively small as well, at just 32KB.
LOBs would be loaded
relatively infrequently usually, only once and then would be updated very rarely. After all, how often
does a standard mailing label format or form letter change?
Likewise, we anticipated that only one application user session would be
modifying a physicians picture or a mailing label master document at one time.
Therefore, concurrency and read consistency was not going to be a major
concern, so we didnt need to worry about the impact of maintaining additional
rollback segments just for LOBs.
Finally, remember that Oracle 8i didnt yet offer the complete power of
the clustered, multi-instance database to support highly-available, scalable
applications; that only happened when Oracle 9i
introduced Cache Fusion as part of its deployment of Real Application Clusters.
So ... Whats Changed?
the risk of overstating the obvious, the footprint of Large Objects has increased
exponentially within our databases over the last decade. Its being driven by
diverse business needs that span every industry: medical and pharmaceutical
applications that capture and serve up patient images, manufacturing
applications that store components designs, and semi-structured data that
business-to-business (B2B) applications use to order products to replenish a
companys warehouse. However, these newer application requirements have several
things in common:
LOB sources are
much more diverse. These sources include Extended Markup Language (XML) documents, geospatial data,
biometric information, video files, and other semi-structured or non-structured
multiple-user updates are common. As semi-structured data like XML
has proliferated, the need to provide the ability for more than one user to
simultaneously update the same LOB has also exploded.
are gigantic ... and getting larger. Its not uncommon for LOBs to
extend into the gigabyte and even terabyte range, especially for audiovisual
and medical imaging data.
bandwidths have increased dramatically. Likewise, the applications
that consume these LOBs are now able to take advantage of the fiber-optics
broadband lines that telecommunications providers laid down in the late 1990s.
(I have several friends whove terminated their cable TV subscriptions because
most of the television shows they want to watch are available as streaming
video from Internet content providers for no additional cost beyond that of their
broadband internet service.)
user content has exploded. The YouTube phenomenon and other web
sites that store self-produced audiovisual files for wide consumption has also
contributed to the explosion of LOB sources.
requirements have increased dramatically. Finally, many of these
LOBs require special security especially personal medical data that is stored
within Digital Imaging
for Communication of Medical Information (DICOM)
sources because of federal, state, and international regulations like the Health Insurance Portability and Accountability Act
LOBs, the Next Generation: SecureFiles
result of these new business requirements, the LOB data type has been
redesigned completely in Oracle 11g
Release 1. This new LOB datatype is called a SecureFile,
but unfortunately thats a bit of a misnomer because security enhancements are
a relatively small subset of SecureFiles features, as well see throughout this
Performance and Manageability Improvements.
Since the sources of LOBs have increased in size by several orders of magnitude
in the past decade, it was obvious that the methods that load, store, and
return data from LOBs needed complete refurbishment. Oracle 11g doesnt disappoint in this regard:
Bigger CHUNK sizes. In prior releases, the maximum size of a
chunk of LOB data that could be read or written at one time was limited to 32KB.
Internal and external LOB sizes have grown exponentially since Oracle 8i, so this was a serious limitation to
effective LOB I/O. Oracle 11g remedies this by automatically choosing an
size between the databases standard block size and a maximum of 64MB. Further,
Oracle 11g helps limit the internal fragmentation of SecureFile LOBs
by attempting to store extremely large SecureFile LOB segments on disk so that
theyre physically adjacent. Finally, the ability to utilize variable CHUNK
sizes means that the corresponding UNDO segments generated during updates of
SecureFile LOBs only need to be as large as the CHUNKs and thus contain less
Improved REDO and UNDO Generation.
Before Oracle 11g, the Oracle DBA
had relatively free rein when configuring CHUNK size, as well as the retention
of undo segments for LOBs, and that didnt always yield optimal results. But
since Oracle 11g will choose automatically
the best method for generating redo and undo for SecureFile LOBs, only two storage-related
parameters are required: RETENTION and MAXSIZE. The RETENTION
directive tells Oracle 11g how to
handle the consistent read data
(stored in UNDO segments) for a SecureFile LOB. Heres how the setting affects
Table 1-1. RETENTION
Directive Settings for SecureFiles
used to start reclaiming old versions after segment MAXSIZE is reached
keeps old versions
for the specified least amount of time
is the default setting. Its
basically a trade-off between space and time. This is automatically
Oracle 11g to reuse older versions of the SecureFile LOB as much as possible
new MAXSIZE storage setting works in concert with the
setting for RETENTION,
and it determines the maximum size of a SecureFile LOB. When RETENTION
is set to MAX,
then the LOB is allowed to grow up to a size of MAXSIZE, after which space can be
reclaimed from UNDO segments; otherwise, the MAXSIZE setting is treated as a
hard limit on the size of the SecureFile LOBs segment.
that the storage attributes CHUNK, PCTVERSION, FREEPOOLS,
GROUPS are simply ignored for SecureFile LOBs
Intelligent Compression. This new
feature allows Oracle 11g to compress and uncompress SecureFile LOB data
automatically and transparently, and this means that disk space for these data
is reduced dramatically. Whenever a query or DML statement retrieves or applies
changes to a SecureFile LOBs data, Oracle 11g
uncompresses only those data blocks needed for the operation, and it
automatically maintains the required mapping between the uncompressed data
blocks in the Database Buffer Cache and the database blocks on disk. There are two
compression levels: MEDIUM (the default) and HIGH;
will tend to increase latency during SecureFile LOB access, but will also
result in the highest storage compression ratios.
Automatic Deduplication. One of the more
intriguing SecureFile LOB features is the ability for Oracle 11g to automatically locate any LOBs that
store exactly the same data within the same table and column. Once detected,
Oracle 11g will store only one copy
of the LOB data. The advantage here, of course, is dramatically reduced amounts
of disk space are needed to store the same amount of data; further, redo
logging and UNDO is reduced significantly because an update to SecureFile LOB
data doesnt require retaining duplicate versions of redo log entries and UNDO
segments. Deduplication of SecureFile LOB data is enforceable at either the
table or partition level. Note that deduplication does require the Oracle Advanced Compression option, a
separately-licensed option thats also required for OLTP table compression and
several other advanced compression features in Oracle 11g.
Transparent Encryption. Every organization
has ethical and legal obligations to protect customer-sensitive
information stored within LOBs. For example, a patients medical
records and radiology images must be protected under HIPAA regulations.
However, theres also enormous amounts of intellectual
property information that must be stored in semi-structured object
forms for example, molecular formulas and manufacturing blueprints that is
crucial to any companys competitive advantage.
LOBs fill this hitherto serious gap in database security because its now
possible to encrypt LOB data within an Oracle 11g database. SecureFile LOBs are encrypted using the same Transparent Data Encryption (TDE)
algorithms that were introduced in Oracle 10gR2,
and this means its possible to encrypt either any or all SecureFile LOBs
within the same table using the four standard TDE encryption algorithms
(AES128, AES192, AES256 and 3DES168). Finally, note that TDE does require the
Oracle Advanced Security pack, an extra-cost option, to enable encryption of
Wait! What About My Old LOBs?
DBA must set the value of the COMPATIBLE initialization
parameter to 18.104.22.168.0
or higher to enable SecureFile LOB features. However, the original LOB datatype
and its capabilities are still preserved for backward compatibility: A
pre-11gR1 LOB is now called a BasicFile.
BasicFile and SecureFile LOBs can even co-exist in the same table. In fact, if
a LOB is not specified as a SecureFile during its creation, then Oracle 11g
simply assumes it to be a BasicFile LOB.
Controlling Selection of SecureFiles for LOB Storage.
The new DB_SECUREFILE
initialization parameter offers another layer of control over how Oracle 11g chooses the type of LOB to create when
neither the BasicFile or SecureFile option is specified:
DB_SECUREFILE Initialization Parameter Values
to create all LOBs as SecureFile LOBs but creates any LOBs not in Automatic
Segment Space Management (ASSM) tablespaces as BasicFile LOBs
all LOBs created going forward to be SecureFile LOBs
(Default) Allows SecureFiles to be created
SecureFiles from being created going forward. If specified, any LOBs directed
to be created as SecureFiles will be created as BasicFiles instead.
SecureFiles and ignores any errors that would otherwise be caused by forcing
BasicFiles with SecureFiles options
that if a SecureFile LOB storage option is specified during the creation or
modification of a BasicFile LOB for example, TDE encryption, deduplication,
or compression Oracle 11g will raise an exception during the operation.
Using SecureFiles: A Practical Example
illustrate how simple and practical it is to create and use SecureFile LOBs, Ill
construct a sample application that captures trouble ticket information. Along
with each trouble ticket identifier and description, the application needs to
retain a Microsoft Word document that each customer submits when a trouble
ticket is created; optionally, the customer can also provide an appropriate
screen print or other image in JPEG format to help clarify the problem
provided the code to create the initial infrastructure for this application in Listing
First, Ill create two new tablespaces, BASICFILES
in which all BasicFile and SecureFile LOBs will be stored, respectively.
Next, Ill create a new schema named TRBTKT
that will own all database objects for this application.
To provide a location from which the Microsoft Word
documents and JPEG images will be stored as external LOBs until theyre loaded
into the database, Ill create two new directory objects, TRBTKT_DOCS
sample Microsoft Word documents and JPEG files can be downloaded as a single ZIP file. To duplicate this application
infrastructure, youll need to create directory folders that correspond to the
ones created above and then copy the documents and images into their respective
that my infrastructure is in place, Ill construct two tables to demonstrate
how to store the trouble ticket information and related LOBs. As shown in Listing
1.2, both tables are virtually identical; however, table TRBTKT.TICKETS
will utilize BasicFiles to store its LOBs, while, table TRBTKT.SECURE_TICKETS
will use SecureFiles. Ill be using both tables in this article series to
compare and contrast the different features of BasicFiles and SecureFiles.
construct a new package, TRBTKT.PKG_SECUREFILES, thats
shown in Listing 1.3.
Ill use this package to implement all methods for loading LOBs, determining
LOB sizing and space utilization, and displaying rudimentary information from
the character-based LOBs.
Ill load sample data into table TRBTKT.TICKETS via calls to
in package TRBTKT.PKG_SECUREFILES.
Ive shown the invocation of this procedure in Listing
1.4. The sample Microsoft Word documents and JPEG files can be
downloaded as a single ZIP file here;
to duplicate my loading technique, youll need to create directory folders that
correspond to the ones defined in for directory objects TRBTKT_DOCS
Ill transfer the data from TRBTKT.TICKETS to TRBTKT.SECURE_TICKETS
using a simple INSERT
INTO ... SELECT statement as shown in Listing
the next article in this series, Ill delve a lot more deeply into the advanced
capabilities of Oracle Database 11gs
new SecureFile features, including:
efficiently from BasicFile LOBs to SecureFile LOBs
Determining the relative storage efficiency of SecureFile vs. BasicFile LOBs
Exploring the different compression and deduplication
options for SecureFile LOBs
SecureFile LOBs via Transparent Data
References and Additional Reading
Im hopeful that Ive given you a thorough grounding in the technical aspects
of the features Ive discussed in this article, Im also sure that there may be
better documentation available since its been published. I therefore strongly
suggest that you take a close look at the corresponding Oracle documentation on
these features to obtain crystal-clear understanding before attempting to
implement them in a production environment. Please note that Ive drawn upon
the following Oracle Database 11g
documentation for the deeper technical details of this article:
B28279-02 Oracle Database 11g New Features Guide
B28320-01 Oracle Database 11g Reference Guide
B28393-03 Oracle Database 11g SecureFiles and Large Objects
B28419-02 Oracle Database 11g PL/SQL Packages and Types
B28424-03 Oracle Database 11g Advanced Application Developers
See All Articles by Columnist Jim Czuprynski