Oracle8 ConText Cartridge Administrator's Guide
Release 2.4

A63820-01

7
Automated Text Loading

This chapter describes the ConText data dictionary objects provided for automated text loading.

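The topics discussed in this chapter are: Overview of Automated Loading, Sources, Preferences for Text Loading, Reader Tiles, Translator Tiles, and Engine Tiles.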

Overview of Automated Loading

Figure 7-1


If you set up sources for your columns, you can use ConText servers running with the Loader (R) personality to automate batch loading of text from operating system files.

See Also:

For an example of automated text loading, see "Using ConText Servers for Automated Text Loading" in Chapter 9, "Setting Up and Managing Text".  

ConText Servers

A ConText server running with the R personality regularly checks all the sources that have been defined for columns in the database, then scans the directories specified in those sources for new files. When a new file appears, the server calls ctxload to load the contents of the file into the appropriate column.

When loading of the file contents is successful, the server deletes the file to prevent the contents from being loaded again.

Text Loading Utility (ctxload)

The text loading utility, ctxload, loads text from operating system files into the LONG or LONG RAW column in a table. ctxload requires the files to be in the load file format. If the files are not in the load file format, the files need to be formatted before loading.

To ensure that the files are in the correct format, a user-defined translator can be specified as one of the preferences in the source for the column.

A user-defined translator is any program that accepts a plain text file as input and generates a load file formatted for ctxload as its output. The user-defined translator could also be used to perform pre-loading cleanup and spell-checking of your text.

See Also:

For more information about ctxload and the required format for load files, see Chapter 10, "Text Loading Utility".

For more information about translators for text loading, see "Translator Tiles" in this chapter.  

Error Handling

If an error occurs while loading, the error is written to the error log, which can be viewed using CTX_INDEX_ERRORS. In addition, the original file is not deleted.
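
The server behavior described in this overview (scan a directory, load each new file, delete the file on success, keep it and log the error on failure) can be pictured as a simple polling loop. The following Python sketch is a conceptual model only, not the ConText implementation. The names scan_once, load_file, and error_log are hypothetical: load_file stands in for the call to ctxload, and a plain list stands in for the error log.

```python
import os
import tempfile
from typing import Callable, List, Tuple


def scan_once(directory: str,
              load_file: Callable[[str], None],
              error_log: List[Tuple[str, str]]) -> List[str]:
    """Process every regular file currently in `directory` once.

    `load_file` is a hypothetical stand-in for the call to ctxload and is
    expected to raise an exception if loading fails.
    Returns the names of the files that were loaded and then deleted.
    """
    loaded = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        try:
            load_file(path)
        except Exception as exc:
            # Failed load: record the error and keep the original file.
            error_log.append((name, str(exc)))
            continue
        # Successful load: delete the file so it is not loaded again.
        os.remove(path)
        loaded.append(name)
    return loaded


def demo() -> Tuple[List[str], List[str], List[Tuple[str, str]]]:
    """Run one scan over a temporary directory holding one good and one bad file."""
    def fake_loader(path: str) -> None:
        # Simulated loader: fails only for the file named bad.txt.
        if path.endswith("bad.txt"):
            raise ValueError("simulated load failure")

    with tempfile.TemporaryDirectory() as directory:
        for name in ("good.txt", "bad.txt"):
            with open(os.path.join(directory, name), "w") as handle:
                handle.write("sample text\n")
        errors: List[Tuple[str, str]] = []
        loaded = scan_once(directory, fake_loader, errors)
        remaining = sorted(os.listdir(directory))
    return loaded, remaining, errors
```

In this model, a file that fails to load remains in the directory and would be seen again on the next scan. This chapter does not say how the server treats such files on later scans, so that detail is a property of the sketch only.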

Sources

Figure 7-2


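To automate loading text from operating system files into a database column, ConText requires the following information: the location of the files to be loaded, the translator (if any) used to convert the files into the load file format, and the options to use for the loading utility.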

A source provides this information, in the form of text loading preferences (one preference for each of the requirements). Sources can be created by any ConText user with the CTXAPP role. Sources are stored in the ConText data dictionary.


Note:

A source must exist for a column before a ConText server with the Loader personality can load text from operating system files into the column.  


In addition to the preferences for a source, users specify a name and text column for the source. The text column in the source indicates the column to which text is loaded by ConText servers.


Note:

The column datatype must be LONG or LONG RAW, because ctxload only supports loading text for these types.  


Users can also choose to specify a description and a refresh rate for directory scanning.

Each source created by a user must be unique for that user. Consequently, a user cannot assign the same source to more than one column.

See Also:

For an example of automated text loading with ConText servers, see "Loading Text" in Chapter 9, "Setting Up and Managing Text".

For more information about text loading preferences, see "Preferences for Text Loading" in this chapter.  

Preferences for Text Loading

This section provides both conceptual and reference information for text loading preferences, which are stored in the ConText data dictionary.

What is a Text Loading Preference?

Text loading preferences specify the options that ConText uses to automatically load text. Each preference represents one (and only one) text loading option and is grouped into one of three categories or types (Reader, Translator, or Engine), which correspond to the information ConText requires for automating text loading.

When creating a source, three preferences are specified for the source, one for each of the three types. If one of the types of preference is not specified when the source is created, the default, predefined preference for that type is used in the source.

A preference can be used in more than one source; however, two preferences of the same type cannot be used in the same source.
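
The rules above (one preference per type, defaults for any type left out, no two preferences of the same type in one source) can be modeled in a few lines. The following Python sketch is illustrative only and does not use any ConText API. The Preference class and the resolve_source_preferences function are hypothetical names; the default preference names are the predefined preferences described later in this chapter.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

# Predefined preference for each type, as named in this chapter.
DEFAULTS: Dict[str, str] = {
    "READER": "DEFAULT_READER",
    "TRANSLATOR": "DEFAULT_TRANSLATOR",
    "ENGINE": "DEFAULT_LOADER",
}


@dataclass(frozen=True)
class Preference:
    """Hypothetical model of a text loading preference."""
    name: str
    kind: str  # "READER", "TRANSLATOR", or "ENGINE"


def resolve_source_preferences(chosen: Iterable[Preference]) -> Dict[str, str]:
    """Return the preference name used for each type in a source.

    Raises ValueError for an unknown type or for two preferences of the
    same type. Any type that is not supplied falls back to its default.
    """
    resolved: Dict[str, str] = {}
    for pref in chosen:
        if pref.kind not in DEFAULTS:
            raise ValueError(f"unknown preference type: {pref.kind}")
        if pref.kind in resolved:
            raise ValueError(f"two preferences of type {pref.kind} in one source")
        resolved[pref.kind] = pref.name
    for kind, default_name in DEFAULTS.items():
        # Fill in the predefined preference for every missing type.
        resolved.setdefault(kind, default_name)
    return resolved
```

For example, supplying only a Reader preference named MY_READER (a made-up name) yields MY_READER for the Reader type and the predefined preferences for the Translator and Engine types.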

Tiles in Preferences

A text loading preference consists of a ConText Tile and one or more attributes (and their corresponding values) for the Tile.

See Also:

For more information about the Tiles used in text loading preferences, see "Reader Tiles", "Translator Tiles", or "Engine Tiles" in this chapter.  

Predefined Preferences

ConText provides predefined preferences for each type. These predefined preferences can be used by any ConText user with the CTXAPP role to create sources without first creating preferences.


Note:

The predefined preference for the Reader category should not be used. The directory specified in the default Reader preference is a generic directory specified for default purposes only; the directory most likely does not exist in your file system.  


User-defined Preferences

A ConText user with the CTXAPP role can create their own preferences by setting the required attributes for the appropriate Tile, then calling CTX_DDL.CREATE_PREFERENCE and specifying the name of the Tile.


Note:

When creating a source, users can use all preferences that have been defined in the ConText data dictionary, including their own preferences, preferences created by other users, or the predefined preferences provided by ConText.  


Reader Predefined Preferences

ConText provides a single predefined Reader preference, DEFAULT_READER, for text loading.

DEFAULT_READER

This preference calls the DIRECTORY READER Tile and specifies a dummy directory for the Tile.


Note:

Because it is unknown which directory contains the files to be loaded and path names are operating-system specific, this preference is provided as a default only and should not be used when creating a source.

Before creating a source, you should create your own Reader preference that specifies the directory where your files to be loaded are located.  


Translator Predefined Preferences

ConText provides a single predefined Translator preference, DEFAULT_TRANSLATOR, for text loading.

DEFAULT_TRANSLATOR

This preference calls the NULL TRANSLATOR Tile, which indicates no translation is performed on the files to be loaded; the files are in the format required by ctxload.

Engine Predefined Preferences

ConText provides a single predefined Engine preference, DEFAULT_LOADER, for text loading.

DEFAULT_LOADER

This preference calls the GENERIC LOADER Tile, which indicates the preference can be used to load text from files in an operating system directory.

Reader Tiles

The Reader Tiles are used to specify the location of the files to be loaded.

ConText provides a single Tile, DIRECTORY READER, for creating Reader preferences for text loading sources.

DIRECTORY READER

The DIRECTORY READER Tile is used to specify the location of files to be loaded when the files are located in the local operating system.

DIRECTORY READER has the following attribute(s):

Attribute     Attribute Values
directories   pathname for the directory where text loading files are located

directories

The directories attribute specifies the full pathname for the directory that the ConText server with the Loader personality scans when looking for new files to load into a column in a table or view.

The structure of the value for directories will vary depending on the directory naming conventions used by your operating system.

Translator Tiles

ConText provides the following Tiles for creating Translator preferences for text loading sources:

Tile              Description
NULL TRANSLATOR   Files to be loaded are already in the load file format required by ctxload.
USER TRANSLATOR   Files to be loaded are converted into the required load file format using a translator provided and specified by the user.

NULL TRANSLATOR

The NULL TRANSLATOR Tile is used to specify that the load files for the loader (ctxload) are already in the format required by ctxload. It has no attributes.

USER TRANSLATOR

The USER TRANSLATOR Tile is used to specify a translator program that converts the files to be loaded into the load file format required by ctxload.

USER TRANSLATOR has the following attribute(s):

Attribute   Attribute Values
command     executable for translator program

command

The command attribute specifies the executable name of the translator program used to convert a file to be loaded into the load file format required by ctxload.


Note:

The specified translator executable must be stored in the appropriate directory in the Oracle home directory.

For example, in a UNIX-based environment, all translator executables must be stored in $ORACLE_HOME/ctx/bin.

In a Windows NT environment, the translator executables must be stored in \BIN in the Oracle home directory.

For more information about directory structures for ConText, see the Oracle8 installation documentation specific to your operating system.  
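
As an illustration of the directory rules in the note above, the following Python sketch builds the full path at which a translator executable would be expected. It is a minimal sketch under stated assumptions: the function name translator_path, the platform labels "unix" and "nt", and the Oracle home values used in the example are all hypothetical, and the actual directory layout should be confirmed in the installation documentation for your operating system.

```python
import ntpath
import posixpath


def translator_path(oracle_home: str, command: str, platform: str = "unix") -> str:
    """Return the expected location of a translator executable.

    UNIX-based environments: <oracle_home>/ctx/bin/<command>
    Windows NT environments: <oracle_home>\\BIN\\<command>
    """
    if platform == "unix":
        return posixpath.join(oracle_home, "ctx", "bin", command)
    if platform == "nt":
        return ntpath.join(oracle_home, "BIN", command)
    raise ValueError(f"unsupported platform label: {platform}")
```

For instance, with a made-up Oracle home of /u01/app/oracle and a translator named mytrans, the sketch returns /u01/app/oracle/ctx/bin/mytrans.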


Engine Tiles

ConText provides a single Tile, GENERIC LOADER, for creating Engine preferences for text loading sources:

GENERIC LOADER

The GENERIC LOADER Tile is used to specify the command-line options for ctxload.

GENERIC LOADER has the following attribute(s):

Attribute   Attribute Values
separate    Y (text stored in separate file(s), load file contains pointers to separate file(s))
            N (text stored in load file, default)
longsize    maximum size, in kilobytes, of text to be loaded (default 64)

separate

The separate attribute specifies whether the -separate option for ctxload is enabled. When the -separate option is enabled, the load files do not contain the actual text of the documents to be loaded, but, rather, contain pointers to separate files where the text of the documents is stored.

The default for separate is N.

longsize

The longsize attribute specifies a value for the -longsize option for ctxload. The -longsize option specifies the maximum size, in kilobytes, allowed for text loaded by ctxload.

See Also:

For more information about how the -separate and -longsize options work in ctxload for loading text, see "Command-line Syntax" in Chapter 10, "Text Loading Utility".  
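
The relationship between the GENERIC LOADER attributes and the ctxload options can be sketched as a small mapping function. The following Python example is a hedged illustration, not part of ConText: the function name loader_options is hypothetical, and the sketch assumes that -separate is a bare flag and that -longsize takes its value as the next argument. Only the two options named in this section are produced; all other ctxload arguments are omitted.

```python
from typing import List


def loader_options(separate: str = "N", longsize: int = 64) -> List[str]:
    """Return the ctxload option fragment implied by GENERIC LOADER attributes.

    Defaults follow this chapter: separate is N and longsize is 64 (kilobytes).
    Assumption: -separate is a bare flag and -longsize is followed by its value.
    """
    if separate not in ("Y", "N"):
        raise ValueError("separate must be 'Y' or 'N'")
    if longsize <= 0:
        raise ValueError("longsize must be a positive number of kilobytes")
    options: List[str] = []
    if separate == "Y":
        # Load files hold pointers to separate files that contain the text.
        options.append("-separate")
    options.extend(["-longsize", str(longsize)])
    return options
```

With the default attribute values the sketch yields only the -longsize option with a value of 64; setting separate to Y adds the -separate flag in front of it.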




Copyright © 1998 Oracle Corporation.

All Rights Reserved.
