BLOB data storage in geodatabases in Oracle

BLOB is a database management system (DBMS) industry acronym for binary large object. BLOB columns were implemented several years ago by Oracle Corporation to replace LONG RAW technology for storing binary data.

The architecture of the BLOB data type is divided into three basic components: the BLOB column, LOB segment, and LOB index. The BLOB column stores the LOB locator (36 bytes) and binary data in row if it is less than 3,965 bytes and in-row storage has not been disabled for the column.

NoteNote:

Tests by Esri have shown that allowing storage in row provides the best performance, so you are advised not to disable in-row storage.

If the binary data exceeds 3,964 bytes, the in-row storage space of the BLOB column is not allocated, and the LOB locator references the binary data stored in the LOB segment.

Therefore, a value stored in a BLOB column with in-row storage enabled is always at least 36 bytes (the space allocated to the LOB locator) and may be as large as 4,000 bytes (the combined space allocated to the LOB locator and the maximum space that can be allocated to binary data stored in row).

The LOB segment is divided into chunks. Chunks must be a multiple of the Oracle data block size. For example, if the data block size is 8K, the LOB segment can be created with a minimum chunk size of 8K. If the length of the data stored within the LOB segment is 5,000 bytes, it is stored in the LOB segment since it exceeds 3,964 bytes and the chunk size is 8K or 8,192 bytes. In this case, 3,192 bytes of the LOB segment chunk remains unused. Transferring data from LONG RAW to BLOB can result in more space being required—perhaps as much as a 30 percent increase due to the unused space in the LOB segment. This is unavoidable if your data exceeds the 3,964-byte, in-row storage threshold of the BLOB column.

The 8K chunk size experiences the best I/O performance while wasting the least amount of space. The 16K chunk size wastes more space than an 8K chunk size. Therefore, to avoid the loss of space, you are advised to either re-create the database that currently has a 16K data block size with an 8K data block size or, if that is not possible, create LOB segments in tablespaces that have been created with an 8K block size. To do this, you need to allocate an 8K buffer cache in the Oracle System Global Area (SGA).

Chunk sizes of 4K and 2K have been found to waste less space, but the increase in I/O cost does not warrant using them.

The LOB index is only used if the number of chunks addressed by the LOB locator exceeds 12; otherwise, the first 12 chunks are addressed by the LOB locator.

The following three figures illustrate the three possible storage cases of binary data stored in a BLOB column. In the first case, 3,000 bytes of binary data are stored in row, since 3,000 bytes is less than the 3,965-byte, in-row storage threshold. If in-row storage is not disabled for the BLOB column, the LOB segment and LOB index are not used. Typically, this results in a faster fetch of the BLOB data due to the reduced number of I/O operations since Oracle does not need to access the LOB segment or LOB index.

BLOB data smaller than 3,965 bytes in size stored in-row
BLOB data smaller than 3,965 bytes in size stored in-row

The next figure illustrates the second case, in which the binary data is larger than 3,964 bytes (in this case, the data is 81,920 bytes) and cannot fit in row. Therefore, the LOB locator references the binary data that is stored in the LOB segment. Since the binary data does not occupy more than 12 chunks in the LOB segment, the LOB locator stores its addresses. In this case, the LOB index is not used.

BLOB data greater than 3,964 bytes in size stored out-of-row.
BLOB data greater than 3,964 bytes in size stored out-of-row. A LOB locator in the table points to the LOB segment where the data is stored.

In the final illustration, the binary data is so large (106,496 bytes) that the LOB index is required. In this case, the binary data exceeds the in-row storage plus requires more than 12 chunks within the LOB segment to store it. For data this large, the LOB locator references the LOB index to obtain the location of the chunks within the LOB segment. This case is extremely rare for vector data and can be avoided for raster data.

BLOB data stored out-of-row, requiring a LOB index
BLOB data stored out-of-row, requiring a LOB index

Setting the DBTUNE parameters to store BLOB columns

The storage parameters of the DBTUNE table control how ArcGIS creates tables and indexes in Oracle. Some of the storage parameters also determine which data type is used when a table is created. For details on the DBTUNE table, see What is the DBTUNE table? For general information on DBTUNE storage parameters, see What are DBTUNE configuration keywords and parameters?

The DBTUNE storage parameters, GEOMETRY_STORAGE, RASTER_STORAGE, and ATTRIBUTE_BINARY, determine which Oracle data type is used to store data.

The GEOMETRY_STORAGE parameter controls how vector data is stored in a feature class. The RASTER_STORAGE parameter controls how raster data is stored in a raster dataset, raster catalog, or raster attribute. Finally, the ATTRIBUTE_BINARY parameter controls the storage of all other binary data that is not vector or raster.

To create BLOB columns, the parameters must be set as follows within a given DBTUNE keyword:

GEOMETRY_STORAGE SDELOB
RASTER_STORAGE BLOB
ATTRIBUTE_BINARY BLOB

Esri recommends the following LOB storage parameters for vector and raster data:

The following examples show how the raster DBTUNE storage parameters have been modified to accommodate a raster blocks table stored as a BLOB data type:

RASTER_STORAGE "BLOB"
BLK_STORAGE "PCTFREE 0 INITRANS 4 TABLESPACE RASTER 
             LOB (BLOCK_DATA) STORE AS 
             (TABLESPACE RASTER_LOB_SEGMENT 
              CACHE PCTVERSION 0)" 

AUX_STORAGE "PCTFREE 0 INITRANS 4 TABLESPACE RASTER 
             LOB (OBJECT) STORE AS 
             (TABLESPACE RASTER 
              CACHE PCTVERSION 0)"
RASTER_STORAGE "ST_RASTER"
BLK_STORAGE "PCTFREE 0 INITRANS 4 TABLESPACE RASTER 
             LOB (BLOCK_DATA) STORE AS 
             (TABLESPACE RASTER_LOB_SEGMENT 
              CACHE PCTVERSION 0)"

If the raster block pixel data is less than 3,965 bytes, it is stored within the BLOCK_DATA column in the RASTER tablespace. However, if it exceeds this threshold, it is stored in the LOB segment in the RASTER_LOB_SEGMENT tablespace. The LOB index is only used if the number of chunks exceeds 12. This is unlikely to happen for geodatabase data. Consider a LOB segment with a chunk size of 8K. Before the LOB index is used, the ArcSDE binary data needs to exceed 96K.

The following examples show how the vector DBTUNE storage parameters have been modified to accommodate the feature table stored in a BLOB data type:

GEOMETRY_STORAGE "SDELOB"

F_STORAGE "PCTFREE 0 INITRANS 4 TABLESPACE VECTOR 
             LOB (POINTS) STORE AS 
             (TABLESPACE VECTOR_LOB_SEGMENT 
              CACHE PCTVERSION 0)"
GEOMETRY_STORAGE  "ST_GEOMETRY"

If the feature's binary data is less than 3,965 bytes, it is stored within the POINTS column in the VECTOR tablespace. However, if it exceeds this threshold, it is stored in the LOB segment in the VECTOR_LOB_SEGMENT tablespace.

ATTRIBUTE_BINARY "BLOB"

B_STORAGE "PCTFREE 0 INITRANS 4 TABLESPACE BIZZTABS 
             LOB (DOCUMENT) STORE AS 
             (TABLESPACE BIZZ_LOB_SEGMENT 
              CACHE PCTVERSION 0)"

A_STORAGE "PCTFREE 0 INITRANS 4 TABLESPACE BIZZTABS 
             LOB (DOCUMENT) STORE AS 
             (TABLESPACE BIZZ_LOB_SEGMENT 
              CACHE PCTVERSION 0)"

In this example, if the business table's binary data is less than 3,965 bytes, it is stored within the business table's BLOB column in the BIZZTABS tablespace. However, if it exceeds this threshold, it is stored in the LOB segment in the BIZZ_LOB_SEGMENT tablespace. The BLOB column in this example is DOCUMENT. If the above B_STORAGE DBTUNE parameter is used to create a table that does not have a DOCUMENT column, the following error is returned by Oracle:

ORA-00904: "DOCUMENT": invalid identifier

Therefore, it is not wise to add B_STORAGE or A_STORAGE parameters referencing a specific BLOB column to the DEFAULTS keyword, since the business table must contain these columns. Instead, create separate DBTUNE keywords and add these storage parameters to the keywords. The keyword that contains the storage parameter is referenced during the creation of the table. It should also be noted that storage parameters of the DEFAULTS keyword are used if they are not included with a specific keyword. Due to this fact, it is not necessary to add a particular storage parameter within a keyword if its configuration string is identical to the storage parameter under the DEFAULTS keyword. For instance, if all the storage parameters except B_STORAGE and A_STORAGE of a new keyword, ROADS, have the same configuration string as those of the DEFAULTS keyword, you only need to create the B_STORAGE and A_STORAGE parameters under the ROADS keyword. All other storage parameters are read from the DEFAULTS keyword since they are not found in the ROADS keyword.

4/2/2015