SHARE
Facebook X Pinterest WhatsApp

Storing Word Documents in Oracle

Written By
thumbnail
James Koopmann
James Koopmann
Feb 5, 2004

Have you ever wondered about
storing documents into your Oracle database and just didn’t know where to
start? Here is a quick introduction to the basics you need to know.

Manipulating
Oracle Files with UTL_FILE
showed you how to read the alert log
and do some manipulation on the file while it was external to the database. You
should review this article as it contains some background information you will
need to know, along with some explanation of some of the procedures in this
code that I will not go into here. The next logical extension to the last
article is the manipulation of external files, such as documents, and the
storage in the database. This article will take you through a brief overview of
the datatypes and procedures in order to store word documents within the
database.

The Datatypes

When talking about manipulating documents within a database,
there are only a few choices for a datatype that can handle a large document.
These large objects (LOBs) can use any one of the four datatypes depending on
the characteristics of the object you are storing. These large objects can be
in the form of text, graphics, video or audio.


Datatype

Description

BLOB

Used to store unstructured binary data up to 4G. This
datatype stores the full binary object in the database.

CLOB/NCLOB

Used to store up to 4G of character data. This datatype
stores the full character data in the database.

BFILE

Used to point at large objects that are external to the
database and in operating system files. The BFILE column also contains binary
data and cannot be selected.

Benefits of LOBs

It use to be that the largest object you could store in the
database was of the datatype LONG. Oracle has for the last few releases kept
telling us to convert our LONG datatypes to a LOB datatype (maybe they will
too). The reason for converting our LONGs to LOBs can be seen in this short
list of benefits.

  1. LOB
    columns can reach the size of 4G.
  2. You
    can store LOB data internally within a table or externally.
  3. You
    can perform random access to the LOB data.
  4. It
    is easier to do transformations on LOB columns.
  5. You
    can replicate the tables that contain LOB columns.

Create a Table to Store the Document

In order to store the documents into the database you must
obviously first create an object to store the information. Following is the DDL
to create the table MY_DOCS. You will notice that there is a holder for the
bfile location and a column (DOC_BLOB) to hold the document.

	CREATE TABLE my_docs
		(doc_id   NUMBER,
		bfile_loc BFILE,
		doc_title VARCHAR2(255),
		doc_blob  BLOB DEFAULT EMPTY_BLOB() );

The Load Procedure

The load procedure takes as arguments the document name and
an id number for the document. The procedure will then prime a row for update
based on the document id, BFILE location and document name (which becomes the
document title). The procedure will then open internal and external BLOBs and
load the internal from the external. At this point, the document has been
loaded into the database table.

Code
Meaning
bfile_loc := BFILENAME('DOC_DIR', in_doc);

In order to load the document, you must first point to the
external object through a BFILE locator. The BFILENAME procedure takes a
directory location and the document name.

INSERT INTO
  my_docs (doc_id, bfile_loc, doc_title) 
  VALUES (1, bfile_loc, in_doc);

This statement is to prime the row into which the external
object will be inserted.

SELECT doc_blob INTO temp_blob 
  FROM my_docs WHERE doc_id = in_id
  FOR UPDATE;

Associate the temporary blob object to the table blob
object for updating.

DBMS_LOB.OPEN(bfile_loc, DBMS_LOB.LOB_READONLY);

Open the external blob object for reading.

DBMS_LOB.OPEN(temp_blob, DBMS_LOB.LOB_READWRITE);

Open the temporary blob object for reading and writing.

DBMS_LOB.LOADFROMFILE
  (temp_blob, bfile_loc, Bytes_to_load);

Copy the entire external blob object (BFILE) into the
internal temporary blob object.

The Search Procedure

The search procedure takes as arguments a document id and a
search string. The search procedure takes as arguments a document id and a search string. The procedure then converts the search string into raw format and places it into the variable named PATTERN. Once the variable PATTERN is populated, it is used for searching the loaded temporary BLOB DOC_BLOB to see if the particular pattern exists.

Pattern    := utl_raw.cast_to_raw(in_search);

Take the input search characters and convert them to raw
characters that can be used to search your document.

SELECT doc_blob INTO lob_doc
  FROM my_docs WHERE doc_id = in_id;

Put the document into a temporary BLOB for manipulation.

DBMS_LOB.OPEN (lob_doc, DBMS_LOB.LOB_READONLY);

Open the temporary BLOB for reading.

Position := DBMS_LOB.INSTR
  (lob_doc, Pattern, Offset, Occurrence);

Search the temporary BLOB for the supplied search string.
If it finds the pattern then the variable POSITION will be not 0.

How to Use the code.

The procedures that I have given you are very simplistic in
nature and are intended to be part of a larger application for managing
external documents within a database. They are intended to setup a directory
where your documents live, load the documents into a database, and then search
for string patterns in the document id provided. I personally can see you taking
out the reliance of supplying a document id and allowing the search to span
multiple documents within your library. Below I have given a brief description
on how to use the code as is but feel free to modify and integrate into your
own set of procedures.

How to Use

  1. log into your database of choice as the SYS user
  2. compile the package
	SQL> @mydocs.sql
  1. set serveroutput on
	SQL> set serveroutput on
  1. initial setup of directory object where your documents
    live
	SQL> exec mydocs.doc_dir_setup
  1. Check to make sure you can read one of your documents on
    disk.
	SQL> exec mydocs.list(Your Document Here.doc');
  1. Load your document into the database
	SQL> exec mydocs.load(Your Document Here.doc', 1);
  1. Search your documents for a string pattern
	SQL> exec mydocs.search(Search Pattern', 1);



The Code

CREATE OR REPLACE PACKAGE
          mydocs
	  AS
 PROCEDURE doc_dir_setup;
 PROCEDURE list   (in_doc    IN VARCHAR2);
 PROCEDURE load   (in_doc    IN VARCHAR2,
                   in_id     IN NUMBER);
 PROCEDURE search (in_search IN VARCHAR2,
                   in_id     IN NUMBER);
END mydocs;
/
CREATE OR REPLACE PACKAGE BODY
          mydocs
	  AS
vexists      BOOLEAN;
vfile_length NUMBER;
vblocksize   NUMBER;
PROCEDURE doc_dir_setup IS
 BEGIN
  EXECUTE IMMEDIATECREATE DIRECTORY DOC_DIR AS’||
  ”'”E:jkoopmannpublishdatabasejournalOracle””’;
END doc_dir_setup;
PROCEDURE list (in_doc IN VARCHAR2) IS
BEGIN
      UTL_FILE.FGETATTR(‘DOC_DIR’,
                        in_doc,
			vexists,
			vfile_length,
			vblocksize);
      IF vexists THEN
        dbms_output.put_line(in_doc||’    ‘||vfile_length);
      END IF;
END list;
PROCEDURE load (in_doc IN VARCHAR2,
                in_id  IN NUMBER) IS
temp_blob         BLOB := empty_blob();
bfile_loc         BFILE;
Bytes_to_load     INTEGER := 4294967295;
BEGIN
  bfile_loc := BFILENAME(‘DOC_DIR’, in_doc);
  INSERT INTO my_docs (doc_id, bfile_loc, doc_title)
         VALUES (in_id, bfile_loc, in_doc);
  SELECT doc_blob INTO temp_blob
    FROM my_docs WHERE doc_id = in_id
     FOR UPDATE;
  DBMS_LOB.OPEN(bfile_loc, DBMS_LOB.LOB_READONLY);
  DBMS_LOB.OPEN(temp_blob, DBMS_LOB.LOB_READWRITE);
  DBMS_LOB.LOADFROMFILE(temp_blob, bfile_loc, Bytes_to_load);
  DBMS_LOB.CLOSE(temp_blob);
  DBMS_LOB.CLOSE(bfile_loc);
COMMIT;
END load;
PROCEDURE search (in_search VARCHAR2,
                  in_id     NUMBER) IS
lob_doc        BLOB;
Pattern        VARCHAR2(30);
Position       INTEGER := 0;
Offset         INTEGER := 1;
Occurrence     INTEGER := 1;
BEGIN
  Pattern    := utl_raw.cast_to_raw(in_search);
  SELECT doc_blob INTO lob_doc
    FROM my_docs WHERE doc_id = in_id;
  DBMS_LOB.OPEN (lob_doc, DBMS_LOB.LOB_READONLY);
  Position := DBMS_LOB.INSTR(lob_doc, Pattern, Offset, Occurrence);
  IF Position = 0 THEN
    DBMS_OUTPUT.PUT_LINE(‘Pattern not found’);
  ELSE
    DBMS_OUTPUT.PUT_LINE(‘The pattern occurs at ‘|| position);
  END IF;
  DBMS_LOB.CLOSE (lob_doc);
END search;
BEGIN
  DBMS_OUTPUT.ENABLE(1000000);
END mydocs;
/

»


See All Articles by Columnist
James Koopmann

thumbnail
James Koopmann

James Koopmann has fourteen years of database design, development and performance tuning experience. In addition, he has extensive database administration experience in Oracle and other relational databases in production environments, specializing in performance tuning of database engines and SQL based applications. Koopmann is an accomplished author with several technical papers in various Oracle related publications such as Oracle Magazine, Oracle Professional and SQL>UPDATE_RMOUG. He is a featured author and database expert for DatabaseJournal, a member of the editorial review committee for Select Journal (The Magazine for the International Oracle Users Group), an Oracle Certified Professional DBA and noted speaker at local Oracle User Groups around the country.

Recommended for you...

Best Certifications for Database Administrators
Ronnie Payne
Oct 14, 2022
Best Courses for Database Administrators
Ronnie Payne
Jul 22, 2022
TYPE Definition Change in Oracle 21c
Is COUNT(rowid) Faster Than COUNT(*)?
Database Journal Logo

DatabaseJournal.com publishes relevant, up-to-date and pragmatic articles on the use of database hardware and management tools and serves as a forum for professional knowledge about proprietary, open source and cloud-based databases--foundational technology for all IT systems. We publish insightful articles about new products, best practices and trends; readers help each other out on various database questions and problems. Database management systems (DBMS) and database security processes are also key areas of focus at DatabaseJournal.com.

Property of TechnologyAdvice. © 2025 TechnologyAdvice. All Rights Reserved

Advertiser Disclosure: Some of the products that appear on this site are from companies from which TechnologyAdvice receives compensation. This compensation may impact how and where products appear on this site including, for example, the order in which they appear. TechnologyAdvice does not include all companies or all types of products available in the marketplace.