SecureFiles in Oracle Database 11g

January 29, 2009

Large Objects (LOBs): A Brief History Lesson

The first time I encountered an application that required a binary large object (BLOB) was literally a millennium ago. In 1999, our IT organization at the American Medical Association (AMA) in Chicago, Illinois was tasked with upgrading that organization’s seventy-plus mainframe applications into four client server applications. I led one of several development teams that used PowerBuilder to develop the next generation of client-server front-end applications.

Our flagship membership processing application had a requirement to store a physician’s personal photograph so that when she accessed the AMA’s web site, her picture would appear next to her personal information. We later expanded this requirement so that we could store images of a physician practice location or an organization’s logo. We also developed a mailing list application that utilized Microsoft Word documents as templates for mailing labels, form letters, and fax cover sheets so that we could easily generate documents for distributing membership communication in standardized formats. During the initial project development cycle we utilized Informix as our RDBMS, and it had the capability to store JPEGs and Microsoft Word documents as BLOBs.

At that time our long-term vision for these images and documents was that they’d be loaded one time, updated only infrequently, and would be relatively small in size. And as it turns out, we weren’t the only IT team with this relatively narrow vision; in fact, these assumptions are reflected in the initial roll-out of Large Object capabilities in Oracle Database 8i as well:

  • LOBs would never grow beyond a size of a few hundred kilobytes or in extreme cases a megabyte or two. Based on this assumption, the largest “chunk size” – the maximum amount of space that Oracle would have to convert from its source file to a LOB at one time – could be kept relatively small as well, at just 32KB. 
  • LOBs would be loaded relatively infrequently – usually, only once – and then would be updated very rarely. After all, how often does a standard mailing label format or form letter change?
  • Likewise, we anticipated that only one application user session would be modifying a physician’s picture or a mailing label master document at one time. Therefore, concurrency and read consistency was not going to be a major concern, so we didn’t need to worry about the impact of maintaining additional rollback segments just for LOBs.
  • Finally, remember that Oracle 8i didn’t yet offer the complete power of the clustered, multi-instance database to support highly-available, scalable applications; that only happened when Oracle 9i introduced Cache Fusion as part of its deployment of Real Application Clusters.

So ... What’s Changed?

At the risk of overstating the obvious, the footprint of Large Objects has increased exponentially within our databases over the last decade. It’s being driven by diverse business needs that span every industry: medical and pharmaceutical applications that capture and serve up patient images, manufacturing applications that store components’ designs, and semi-structured data that business-to-business (B2B) applications use to order products to replenish a company’s warehouse. However, these newer application requirements have several things in common:

  • LOB sources are much more diverse. These sources include Extended Markup Language (XML) documents, geospatial data, biometric information, video files, and other semi-structured or non-structured data.
  • Simultaneous multiple-user updates are common. As semi-structured data like XML has proliferated, the need to provide the ability for more than one user to simultaneously update the same LOB has also exploded.
  • Large objects are gigantic ... and getting larger. It’s not uncommon for LOBs to extend into the gigabyte and even terabyte range, especially for audiovisual and medical imaging data.
  • Transmission bandwidths have increased dramatically. Likewise, the applications that consume these LOBs are now able to take advantage of the fiber-optics broadband lines that telecommunications providers laid down in the late 1990s. (I have several friends who’ve terminated their cable TV subscriptions because most of the television shows they want to watch are available as streaming video from Internet content providers for no additional cost beyond that of their broadband internet service.)
  • Self-published user content has exploded. The YouTube phenomenon and other web sites that store self-produced audiovisual files for wide consumption has also contributed to the explosion of LOB sources.
  • Security requirements have increased dramatically. Finally, many of these LOBs require special security – especially personal medical data that is stored within Digital Imaging for Communication of Medical Information (DICOM) sources – because of federal, state, and international regulations like the Health Insurance Portability and Accountability Act (HIPAA).

LOBs, the Next Generation: SecureFiles

As a result of these new business requirements, the LOB data type has been redesigned completely in Oracle 11g Release 1. This new LOB datatype is called a SecureFile, but unfortunately that’s a bit of a misnomer because security enhancements are a relatively small subset of SecureFile’s features, as we’ll see throughout this article series.

Performance and Manageability Improvements. Since the sources of LOBs have increased in size by several orders of magnitude in the past decade, it was obvious that the methods that load, store, and return data from LOBs needed complete refurbishment. Oracle 11g doesn’t disappoint in this regard:

Bigger CHUNK sizes. In prior releases, the maximum size of a “chunk” of LOB data that could be read or written at one time was limited to 32KB. Internal and external LOB sizes have grown exponentially since Oracle 8i, so this was a serious limitation to effective LOB I/O. Oracle 11g remedies this by automatically choosing an appropriate CHUNK size between the database’s standard block size and a maximum of 64MB. Further, Oracle 11g helps limit the internal fragmentation of SecureFile LOBs by attempting to store extremely large SecureFile LOB segments on disk so that they’re physically adjacent. Finally, the ability to utilize variable CHUNK sizes means that the corresponding UNDO segments generated during updates of SecureFile LOBs only need to be as large as the CHUNKs and thus contain less wasted space.

Improved REDO and UNDO Generation. Before Oracle 11g, the Oracle DBA had relatively free rein when configuring CHUNK size, as well as the retention of undo segments for LOBs, and that didn’t always yield optimal results. But since Oracle 11g will choose automatically the best method for generating redo and undo for SecureFile LOBs, only two storage-related parameters are required: RETENTION and MAXSIZE. The RETENTION directive tells Oracle 11g how to handle the consistent read data (stored in UNDO segments) for a SecureFile LOB. Here’s how the setting affects read consistency:

Table 1-1. RETENTION Directive Settings for SecureFiles

Setting

Explanation

MAX

is used to start reclaiming old versions after segment MAXSIZE is reached

MIN

keeps old versions for the specified least amount of time

AUTO

This is the default setting. It’s basically a trade-off between space and time. This is automatically determined

NONE

Instructs Oracle 11g to reuse older versions of the SecureFile LOB as much as possible

The new MAXSIZE storage setting works in concert with the setting for RETENTION, and it determines the maximum size of a SecureFile LOB. When RETENTION is set to MAX, then the LOB is allowed to grow up to a size of MAXSIZE, after which space can be reclaimed from UNDO segments; otherwise, the MAXSIZE setting is treated as a hard limit on the size of the SecureFile LOB’s segment.

Note that the storage attributes CHUNK, PCTVERSION, FREEPOOLS, FREELISTS, and FREELIST GROUPS are simply ignored for SecureFile LOBs

“Intelligent Compression.” This new feature allows Oracle 11g to compress and uncompress SecureFile LOB data automatically and transparently, and this means that disk space for these data is reduced dramatically. Whenever a query or DML statement retrieves or applies changes to a SecureFile LOB’s data, Oracle 11g uncompresses only those data blocks needed for the operation, and it automatically maintains the required mapping between the uncompressed data blocks in the Database Buffer Cache and the database blocks on disk. There are two compression levels: MEDIUM (the default) and HIGH; specifying HIGH will tend to increase latency during SecureFile LOB access, but will also result in the highest storage compression ratios.

Automatic Deduplication. One of the more intriguing SecureFile LOB features is the ability for Oracle 11g to automatically locate any LOBs that store exactly the same data within the same table and column. Once detected, Oracle 11g will store only one copy of the LOB data. The advantage here, of course, is dramatically reduced amounts of disk space are needed to store the same amount of data; further, redo logging and UNDO is reduced significantly because an update to SecureFile LOB data doesn’t require retaining duplicate versions of redo log entries and UNDO segments. Deduplication of SecureFile LOB data is enforceable at either the table or partition level. Note that deduplication does require the Oracle Advanced Compression option, a separately-licensed option that’s also required for OLTP table compression and several other advanced compression features in Oracle 11g.

Transparent Encryption. Every organization has ethical and legal obligations to protect customer-sensitive information stored within LOBs. For example, a patient’s medical records and radiology images must be protected under HIPAA regulations. However, there’s also enormous amounts of intellectual property information that must be stored in semi-structured object forms – for example, molecular formulas and manufacturing blueprints – that is crucial to any company’s competitive advantage.

SecureFile LOBs fill this hitherto serious gap in database security because it’s now possible to encrypt LOB data within an Oracle 11g database. SecureFile LOBs are encrypted using the same Transparent Data Encryption (TDE) algorithms that were introduced in Oracle 10gR2, and this means it’s possible to encrypt either any or all SecureFile LOBs within the same table using the four standard TDE encryption algorithms (AES128, AES192, AES256 and 3DES168). Finally, note that TDE does require the Oracle Advanced Security pack, an extra-cost option, to enable encryption of SecureFile LOBs.

Wait! What About My “Old” LOBs?

The DBA must set the value of the COMPATIBLE initialization parameter to 11.0.0.0.0 or higher to enable SecureFile LOB features. However, the original LOB datatype and its capabilities are still preserved for backward compatibility: A pre-11gR1 LOB is now called a BasicFile. BasicFile and SecureFile LOBs can even co-exist in the same table. In fact, if a LOB is not specified as a SecureFile during its creation, then Oracle 11g simply assumes it to be a BasicFile LOB.

Controlling Selection of SecureFiles for LOB Storage. The new DB_SECUREFILE initialization parameter offers another layer of control over how Oracle 11g chooses the type of LOB to create when neither the BasicFile or SecureFile option is specified:

Table 1-2. DB_SECUREFILE Initialization Parameter Values

 

Value

Meaning

 

ALWAYS

Attempts to create all LOBs as SecureFile LOBs but creates any LOBs not in Automatic Segment Space Management (ASSM) tablespaces as BasicFile LOBs

FORCE

Forces all LOBs created going forward to be SecureFile LOBs

PERMITTED

(Default) Allows SecureFiles to be created

NEVER

Disallows SecureFiles from being created going forward. If specified, any LOBs directed to be created as SecureFiles will be created as BasicFiles instead.

IGNORE

Disallows SecureFiles and ignores any errors that would otherwise be caused by forcing BasicFiles with SecureFiles options

Note that if a SecureFile LOB storage option is specified during the creation or modification of a BasicFile LOB – for example, TDE encryption, deduplication, or compression – Oracle 11g will raise an exception during the operation.

Using SecureFiles: A Practical Example

To illustrate how simple and practical it is to create and use SecureFile LOBs, I’ll construct a sample application that captures trouble ticket information. Along with each trouble ticket identifier and description, the application needs to retain a Microsoft Word document that each customer submits when a trouble ticket is created; optionally, the customer can also provide an appropriate screen print or other image in JPEG format to help clarify the problem encountered.

I’ve provided the code to create the initial infrastructure for this application in Listing 1.1:

  • First, I’ll create two new tablespaces, BASICFILES and SECUREFILES, in which all BasicFile and SecureFile LOBs will be stored, respectively.
  • Next, I’ll create a new schema named TRBTKT that will own all database objects for this application.
  • To provide a location from which the Microsoft Word documents and JPEG images will be stored as external LOBs until they’re loaded into the database, I’ll create two new directory objects, TRBTKT_DOCS and TRBTKT_IMGS.

The sample Microsoft Word documents and JPEG files can be downloaded as a single ZIP file. To duplicate this application infrastructure, you’ll need to create directory folders that correspond to the ones created above and then copy the documents and images into their respective directories.

Now that my infrastructure is in place, I’ll construct two tables to demonstrate how to store the trouble ticket information and related LOBs. As shown in Listing 1.2, both tables are virtually identical; however, table TRBTKT.TICKETS will utilize BasicFiles to store its LOBs, while, table TRBTKT.SECURE_TICKETS will use SecureFiles. I’ll be using both tables in this article series to compare and contrast the different features of BasicFiles and SecureFiles.

I’ll also construct a new package, TRBTKT.PKG_SECUREFILES, that’s shown in Listing 1.3. I’ll use this package to implement all methods for loading LOBs, determining LOB sizing and space utilization, and displaying rudimentary information from the character-based LOBs.

Next, I’ll load sample data into table TRBTKT.TICKETS via calls to procedure AddTroubleTickets in package TRBTKT.PKG_SECUREFILES. I’ve shown the invocation of this procedure in Listing 1.4. The sample Microsoft Word documents and JPEG files can be downloaded as a single ZIP file here; to duplicate my loading technique, you’ll need to create directory folders that correspond to the ones defined in for directory objects TRBTKT_DOCS and TRBTKT_IMGS.

Finally, I’ll transfer the data from TRBTKT.TICKETS to TRBTKT.SECURE_TICKETS using a simple INSERT INTO ... SELECT statement as shown in Listing 1.5.

Next Steps

In the next article in this series, I’ll delve a lot more deeply into the advanced capabilities of Oracle Database 11g’s new SecureFile features, including:

  • Migrating efficiently from BasicFile LOBs to SecureFile LOBs
  • Determining the relative storage efficiency of SecureFile vs. BasicFile LOBs
  • Exploring the different compression and deduplication options for SecureFile LOBs
  • Encrypting SecureFile LOBs via Transparent Data Encryption

References and Additional Reading

While I’m hopeful that I’ve given you a thorough grounding in the technical aspects of the features I’ve discussed in this article, I’m also sure that there may be better documentation available since it’s been published. I therefore strongly suggest that you take a close look at the corresponding Oracle documentation on these features to obtain crystal-clear understanding before attempting to implement them in a production environment. Please note that I’ve drawn upon the following Oracle Database 11g documentation for the deeper technical details of this article:

B28279-02 Oracle Database 11g New Features Guide

B28320-01 Oracle Database 11g Reference Guide

B28393-03 Oracle Database 11g SecureFiles and Large Objects Developer’s Guide

B28419-02 Oracle Database 11g PL/SQL Packages and Types Reference

B28424-03 Oracle Database 11g Advanced Application Developer’s Guide

» See All Articles by Columnist Jim Czuprynski








The Network for Technology Professionals

Search:

About Internet.com

Legal Notices, Licensing, Permissions, Privacy Policy.
Advertise | Newsletters | E-mail Offers