Frequently Asked Questions

OPUS Process Resource Files

What is a process resource file?
What is contained in a process resource file?
What's an easy way to view the process resource file?
What is the complete format of a line in the process resource file?
What are the minimum keywords required in a process resource file?
Which process resource file keywords are reserved or have specific uses?
Where will I find process resource files in the OPUS directory tree?
What types of pipeline triggers can be set up in a process resource file?
How do I set up a process with an OSF trigger?
How do I set up a process to trigger on the existence of a certain file?
How do I set up a relative time trigger?
How do I set up a process to trigger once a day?
How do I set up a process to trigger once an hour starting at some point?
How can I use the exit status of a process to trigger pipeline actions?
Ok, but what if my external poller processes more than one item per event?
Can a process with a file trigger perform an arbitrary task upon successful completion?
How can process resource keyword values be accessed by an external polling process?
Can keyword values be obtained from files other than the process resource file?
What are the differences between path files and process resource files?
Can resource file parameters refer to a specific path?
How do you decide whether to put a keyword value in a path file or in a process resource file?
Can I give a keyword different values for the same process running in different paths?
Can process resource file keyword values come from the environment?
Are process resource file keyword names case sensitive?
Can OPUS protect against running out of disk space?
So what do I do if a process is in "iowait" state?

OPUS Process Resource Files

What is a process resource file?

A process resource file is a simple ASCII text file that describes how an OPUS pipeline starts a particular pipeline process and how the pipeline manages it.

What is contained in a process resource file?

The process resource file follows a "keyword = value" format, and a number of the keyword names have special meaning to the OPUS system. The process resource file also contains information that the particular pipeline process needs in order to run (input directory names, output directory names, etc.), but that the OPUS system doesn't care about. In this way the process resource file can be thought of as a configuration file for the pipeline process.

What's an easy way to view the process resource file?

The process resource file is located in the OPUS_DEFINITIONS_DIR. Its filename is the process name with the extension ".resource". Rather than hunting for a process resource file for a given process, you can have the Process Manager (PMG) display the file for you. Just select the "Select Process..." option from the "File" menu:

[PMG File Menu]

and double click on the desired process name in the "run process..." list.

[OPUS Process Selection window]

Another window will pop up with the resource file displayed.

What is the complete format of a line in the process resource file?

A process resource file entry follows a keyword, value format like this:

   KEYWORD = VALUE ! COMMENT

Tokens are space delimited (NO tabs). Stand-alone comment lines begin with "!". The trailing comment field, together with its leading "!", is optional. A value may include spaces only if it is enclosed in single quotes:

   KEYWORD = 'value with spaces'   !An example which includes spaces

A special syntax is used for keywords that should be set in the environment of the process. For those keywords, the string ENV. is prepended to the keyword name. This is the primary mechanism through which an external polling process gains access to values in the process resource and path files.
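As an illustration of the line format described above, here is a minimal Python sketch of a line parser. It is not part of OPUS; the function name parse_resource_line is hypothetical, and the sketch assumes that a "!" never appears inside an unquoted value.

```python
from typing import Optional, Tuple


def parse_resource_line(line: str) -> Optional[Tuple[str, str]]:
    """Return (keyword, value) for one line, or None for blank/comment lines."""
    text = line.strip()
    if not text or text.startswith("!"):
        return None  # blank line or stand-alone comment
    keyword, sep, rest = text.partition("=")
    if not sep:
        raise ValueError(f"no '=' found in line: {line!r}")
    keyword = keyword.strip()
    rest = rest.strip()
    if rest.startswith("'"):
        # Quoted value: everything up to the closing single quote.
        end = rest.find("'", 1)
        if end == -1:
            raise ValueError(f"unterminated quote in line: {line!r}")
        value = rest[1:end]
    else:
        # Unquoted value: drop the optional trailing "! COMMENT" field.
        # Assumption: "!" does not occur inside an unquoted value.
        value = rest.split("!", 1)[0].strip()
    return keyword, value
```

A keyword returned with the ENV. prefix (for example ENV.OUTPATH) is left untouched by this sketch; handling of that prefix is a separate step.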

What are the minimum keywords required in a process resource file?

The only two required keyword entries for all processes are:

  POLLING_TIME = 10                      !seconds between tests for a new event
  TASK = < xpoll -p $PATH_FILE -r taskname >  !the OPUS method for invoking
                                              !the process

For so-called "external polling" processes, which include scripts and programs with no knowledge of how the OPUS pipeline works, these keywords are also required:

   COMMAND = name_of_script  !the script (in your path) that
                             !runs this pipeline process
   XPOLL_STATE.01 = SUCCESS  !maps a successful exit status (01 in this 
                             !case) from the process
                             !so that XPOLL does not think the process
                             !encountered an error

If the MINBLOCKS keyword is provided, the OUTPATH keyword MUST also be included:

   MINBLOCKS = 50000        !disk space in 512-byte blocks required for
                            !the process to begin execution
   OUTPATH = output_directory    !defines the disk device where MINBLOCKS of
                                 !free space must exist in order to start the
                                 !process

Simple examples of resource files for many types of external pollers are included with the sample pipeline.
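The requirements above can be checked mechanically. The following Python sketch is only an illustration and not an OPUS tool: the function name missing_keywords is hypothetical, the resource file is assumed to have already been read into a dictionary, and a process is assumed to be an external poller when "xpoll" appears in its TASK value.

```python
from typing import Dict, List


def missing_keywords(resource: Dict[str, str]) -> List[str]:
    """List the required keywords that are absent from a parsed resource file."""
    missing = [key for key in ("POLLING_TIME", "TASK") if key not in resource]

    # Assumption: TASK mentioning xpoll marks an external polling process.
    if "xpoll" in resource.get("TASK", "").lower():
        if "COMMAND" not in resource:
            missing.append("COMMAND")
        if not any(key.startswith("XPOLL_STATE.") for key in resource):
            missing.append("XPOLL_STATE.NN")

    # MINBLOCKS is only meaningful together with OUTPATH.
    if "MINBLOCKS" in resource and "OUTPATH" not in resource:
        missing.append("OUTPATH")
    return missing
```

An empty list means the minimum set described in this answer is present; it does not mean the file is otherwise valid.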

Which process resource file keywords are reserved or have specific uses?

CLASS - a three character designator for the class of data that the process operates on. The value of this keyword appears in the PMG display in the CLS column, and the list of monitored processes can be sorted by this field in the PMG. If this field is missing from the process resource file, the CLS column will be set to PRO.

COMMAND - the name of your script to be executed by the OPUS XPOLL task (when XPOLL appears in the TASK keyword value).

DELTA_TIME - interval added to the current time to determine when the pipeline process will next awaken and run (DDD:HH:MM:SS).

DESCRIPTION - a short description of the process that appears with the process name in the PMG's process selection dialog. If this string contains spaces, it must be enclosed within single quotes.

DISPLAY_ORDER - a secondary sort key for processes in the PMG's process selection dialog. Once processes are separated into groups based on their SYSTEM value, they are sorted within SYSTEM according to the value of this keyword. Processes with the same value of DISPLAY_ORDER are sorted alphabetically according to the process name.

FILE_ABSENT - defines the modifier applied to the FILE_OBJECT when a process goes absent.

FILE_DIRECTORYn - where n=1...(Number of file searches); identifies the directory where files are searched for.

FILE_OBJECTn - where n=1...(Number of file searches); identifies the filename or file mask used to search for files.

FILE_RANK - priority of FILE events in the OPUS event queue (1...N, where N is a positive integer).

FILE_MAXTARGSn - where n=1...(Number of triggers); identifies how many files an application will handle per event (the application must support multiple files per event if > 1). If this key is not defined, a single file will be included per event.

FILE_ACTION - specifies that an external polling process should perform the task indicated by the value if an XPOLL_STATE definition maps an exit status to this keyword.

FILE_ACTION_OK - specifies that the exit status of a FILE_ACTION task be tested and if not equal to this value, an error should be posted to the process log file.

MINBLOCKS - number of 512-byte blocks that must be free on the output device specified by OUTPATH before a pipeline process will begin doing work. While less than this number of blocks are free, the process remains in IOWAIT state.

OSF_PROCESSING.XX - identifies the OSF stage (XX) and value assigned while a pipeline process is doing work.

OSF_RANK - priority of OSF events in the OPUS event queue (1...N, where N is a positive integer).

OSF_TRIGGERn.XX - where n=1...(Number of triggers); identifies the OSF stage (XX) that must be matched before a process begins doing work.

OSF_TRIGGERn.MAXTARGS - where n=1...(Number of triggers); identifies how many OSF's an application will handle per event (the application must support multiple OSF's per event if > 1). If this key is not defined, a single OSF will be included per event.

OSF_ABSENT - defines the modifier applied to the OSF_PROCESSING column of an OSF when a process goes absent during processing of that OSF.

OUTPATH - output path (must be present if MINBLOCKS is defined).

PASSWORD - indicates to the Process Manager (PMG) that it should prompt the user for a password when starting this task. The password entered is encrypted and passed to the task as a pair of extra parameters on the task's COMMAND line. If you include a PASSWORD resource item, you should also include a PASSWORD_PROMPT item, especially if you start multiple processes that require a password.

PASSWORD_PROMPT - displayed by the PMG when the user is asked for a password before starting this task. This value can help clarify for which process the password is needed, and it is also used by the PMG to "group" password-required processes. This grouping results in the user only being prompted for a password once for each unique combination of PASSWORD_PROMPT and node name for a set of processes. The password entered by the user is automatically passed to each of the processes in the same group that are started on the same node. Processes started on the same node which do not provide a PASSWORD_PROMPT resource value are assigned to the default process group, and all get assigned the same password as if they were grouped together by a common blank PASSWORD_PROMPT value.

POLLING_TIME - interval time (seconds) at which the OPUS pipeline checks for a new process "event" (i.e., checks to see if the process has work to do).

RESUME_BLOCKS - can be used to specify the number of blocks needed for a process to exit IOWAIT state.

START_TIME - time at which the pipeline process will first awaken and run (DDD:HH:MM:SS).

SYSTEM - the primary key used to sort processes for display in the PMG's process selection dialog. Processes with the same value for SYSTEM are grouped together in this dialog. SYSTEM groups are displayed in alphabetical order of this value.

TASK - indicates the command syntax that the OPUS pipeline will use to invoke the pipeline process.

TIME_RANK - priority of TIME events in the OPUS event queue (1...N, where N is a positive integer).

XPOLL_ERROR.XX - identifies the OSF stage (XX) and value assigned when the process exit status does not map to the known exit status values given in the XPOLL_STATE.NN keywords.

XPOLL_ERROR_COUNT - the number of times that XPOLL will accept an error status return from a pipeline process and continue to activate the process on the next event. Once this threshold is exceeded, XPOLL will stop activating the process (in pipeline terms, the process will go ABSENT).

XPOLL_STATE.NN - maps the process exit status NN (zero-fill, e.g. 01) to a set of process resource file keywords that indicate the pipeline action performed when receiving that exit status.

Where will I find process resource files in the OPUS directory tree?

Process resource files are found under the OPUS_DEFINITIONS_DIR directory. This environment variable is defined by the opus_login.csh file in OPUS_DEFINITIONS_DIR. The directory can be defined to stretch through a local directory and then through the OPUS system directories, similar to a Unix path. This allows the user to override certain resource files with local copies, while using the OPUS system resource files for cases in which overrides are not necessary.

What types of pipeline triggers can be set up in a process resource file?

There are three (3) types of triggers in a process resource file: OSF (Observation Status File), File, and Time triggers.

OSF triggers cause a pipeline process to begin executing when an OSF with certain characteristics is posted on the OPUS blackboard. For example, a telemetry-parsing pipeline stage could be triggered by an OSF indicating that a data-receipt pipeline stage has completed.

File triggers occur when a file matching a certain filename mask appears in a particular directory. For example, a process could be triggered when files named f*.pod appear in the directory /mydata/incoming/.

A time trigger occurs either at specific calendar dates and times, at some delta time offset from the current time, or at some delta time offset from some time in the future. For example, a process could trigger every day at 08:00 AM, while another might trigger every 15 minutes around the clock. Yet another process might trigger every hour starting the following Sunday. Only one time trigger definition is allowed per process resource file.

How do I set up a process with an OSF trigger?

There are many possible resource file keyword combinations that can be used to define an OSF trigger. One example will be given here, with more to be found in the resource files for the sample pipeline. Given the following entries in a process resource file:

   OSF_RANK = 1                    !if multiple trigger types are used in
                                   !combination, this determines what order
                                   !this trigger will be checked relative to
                                   !the others
   OSF_TRIGGER1.XX = w             !when OSF column XX = "w" (waiting)
   OSF_TRIGGER1.YY = c             !and when OSF column YY = "c" (complete)
   OSF_TRIGGER1.DATA_ID = zzz      !and the OSF data id = "zzz"...
   OSF_PROCESSING.XX = p           !set this OSF stage upon being triggered

When this combination of OSF characteristics is met by an OSF on the blackboard, the process will trigger (i.e. start up, or awaken). The other columns of the OSF do not matter. Only the columns mentioned in the trigger are checked. Once the process is triggered, the OSF column XX will be set to "p" (processing) to indicate that work is being done. This will also prevent the same OSF from triggering other copies of this process, if multiple copies of this process are currently running.

The OSF_RANK, OSF_TRIGGERn.XX, and OSF_PROCESSING.XX keys are required to form a valid OSF trigger.
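The matching rule described above (only the columns named in the trigger are checked) can be sketched in Python. This is an illustration, not OPUS code: the function names are hypothetical, and an OSF is modeled as a plain dictionary of column name to value, which is an assumption about representation only.

```python
from typing import Dict


def osf_matches_trigger(resource: Dict[str, str], osf: Dict[str, str], n: int = 1) -> bool:
    """True if every column named in OSF_TRIGGERn.* has the required value."""
    prefix = f"OSF_TRIGGER{n}."
    conditions = {
        key[len(prefix):]: wanted
        for key, wanted in resource.items()
        # MAXTARGS is a per-event limit, not a column to match.
        if key.startswith(prefix) and key[len(prefix):] != "MAXTARGS"
    }
    return bool(conditions) and all(
        osf.get(column) == wanted for column, wanted in conditions.items()
    )


def mark_processing(resource: Dict[str, str], osf: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the OSF with the OSF_PROCESSING.XX values applied."""
    updated = dict(osf)
    for key, value in resource.items():
        if key.startswith("OSF_PROCESSING."):
            updated[key.split(".", 1)[1]] = value
    return updated
```

Once mark_processing has set column XX to "p", the same OSF no longer satisfies OSF_TRIGGER1.XX = w, which is how a second copy of the process is kept from picking it up.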

How do I set up a process to trigger on the existence of a certain file?

Add the following types of entries to the process resource file:

   FILE_RANK = 1                   !if multiple trigger types are used in
                                   !combination, this determines what order
                                   !this trigger will be checked relative to
                                   !the others
   FILE_DIRECTORY1 = /mydisk/mydata/   !incoming data found here
   FILE_OBJECT1 = *.incoming           !trigger on files that match the pattern
   FILE_PROCESSING = _proc         ! file renamed to include this on end
                                   ! when found (prevents retriggering)
   FILE_SUCCESS.DIRECTORY = /mydisk/done/ ! If successful, file is moved
                                          ! here (still has _proc on name)
   FILE_ERROR.DIRECTORY = /mydisk/bad/    ! If unsuccessful, file is moved
                                          ! here (still has _proc on name)

The FILE_RANK, FILE_DIRECTORYn, FILE_OBJECTn, FILE_PROCESSING, FILE_SUCCESS, and FILE_ERROR keys are required to form a valid file trigger.
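To see why appending the FILE_PROCESSING value prevents retriggering, consider this small Python sketch. It is an illustration only: find_file_events is a hypothetical name, the directory listing is passed in as a list of names, and Python's fnmatch wildcard rules are assumed to be close enough to the OPUS file mask rules for this example.

```python
import fnmatch
from typing import Iterable, List, Tuple


def find_file_events(
    filenames: Iterable[str], file_object: str, processing_suffix: str
) -> List[Tuple[str, str]]:
    """Pair each file matching the mask with the name it gets while processing."""
    events = []
    for name in sorted(filenames):
        # The mask must match the whole name, so "x.incoming_proc"
        # does not match "*.incoming" and is not picked up again.
        if fnmatch.fnmatchcase(name, file_object):
            events.append((name, name + processing_suffix))
    return events
```

With FILE_OBJECT1 = *.incoming and FILE_PROCESSING = _proc, a file that has already been renamed to end in _proc falls outside the mask.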

How do I set up a relative time trigger?

Add the following types of entries to the process resource file:

   TIME_RANK = 1                    !if multiple trigger types are used in
                                    !combination, this determines what order
                                    !this trigger will be checked relative to
                                    !the others
   DELTA_TIME = 000:00:30:00   !trigger every 30 minutes from the initial 
                               !time at which the process is started

How do I set up a process to trigger once a day?

Add the following types of entries to the process resource file:

   TIME_RANK = 1                    !if multiple trigger types are used in
                                    !combination, this determines what order
                                    !this trigger will be checked relative to
                                    !the others
   DELTA_TIME = 001:00:00:00    !trigger every 24 hours from the initial 
                                !time at which the process is started

How do I set up a process to trigger once an hour starting at some point?

Add the following types of entries to the process resource file:

   TIME_RANK = 1                    !if multiple trigger types are used in
                                    !combination, this determines what order
                                    !this trigger will be checked relative to
                                    !the others
   START_TIME = 12:00:00        !start triggering at 12 noon, local time
   DELTA_TIME = 000:01:00:00    !trigger every hour from the START_TIME
                                !onward
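The DELTA_TIME values used in the time trigger examples follow the DDD:HH:MM:SS layout given in the keyword list. The Python sketch below converts such a value to an interval and adds it to a reference time. The function names are hypothetical, and the sketch only accepts the four-field form.

```python
from datetime import datetime, timedelta


def parse_ddd_hh_mm_ss(text: str) -> timedelta:
    """Convert a DDD:HH:MM:SS string into a timedelta."""
    # Unpacking raises ValueError unless exactly four fields are present.
    days, hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def next_trigger(reference: datetime, delta_time: str) -> datetime:
    """Time of the next wake-up: the reference time plus DELTA_TIME."""
    return reference + parse_ddd_hh_mm_ss(delta_time)
```

For example, 000:00:30:00 is 30 minutes, 000:01:00:00 is one hour, and 001:00:00:00 is 24 hours.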

How can I use the exit status of a process to trigger pipeline actions?

External pollers like those in the sample pipeline trigger pipeline actions via the exit status of the process. Special keywords are placed in the process resource file that map the exit status to other keywords which in turn indicate the type of action to take after the external poller exits.

For example, given an external polling process triggering on OSF stage GC, the following keywords define three possible exit status values (1, 3, and 5) and map each one to another keyword in the process resource file. They also define how the GC OSF stage should be updated for any exit status other than 1, 3, or 5:

   XPOLL_STATE.01 = OSF_SUCCESS1
   XPOLL_STATE.03 = OSF_ERROR1
   XPOLL_STATE.05 = OSF_SUCCESS2
   XPOLL_ERROR.GC = x             !any exit status != 1,3,5 sets OSF stage
                                  !GC to x

The corresponding keywords that map the exit status values to OSF update actions might look like:

   OSF_SUCCESS1.GC = c      !if exit status=1 set OSF stage GC to c...
   OSF_SUCCESS1.CA = n      !...and set OSF stage CA to n (not necessary)
   OSF_SUCCESS2.GC = c      !if exit status=5 set OSF stage GC to c...
   OSF_SUCCESS2.CA = w      !...and set OSF stage CA to w (waiting)
   OSF_ERROR1.GC = e        !if exit status=3 set OSF stage GC to e

In this case, should the external polling process exit with status = 1, the GC stage of the OSF it was working on would be updated to "c" (for complete, perhaps) and the CA stage would be updated to "n", possibly indicating that the CA stage is not necessary for this data set. An exit status of 5 results in a similar update except that CA is changed to "w". On the other hand, should an exit status of 3 be returned, the GC stage of the OSF would be set to "e" indicating that an error occurred during processing; the CA stage would not be changed since GC did not complete normally. Finally, an exit status not equal to 1, 3, or 5 would result in the GC stage being marked "x".
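The two-step lookup just described (exit status to XPOLL_STATE.NN, then that name to its OSF updates, with XPOLL_ERROR as the fallback) can be sketched in Python. The function name osf_updates_for_exit is hypothetical and the resource file is assumed to be already parsed into a dictionary; this is not how XPOLL itself is implemented.

```python
from typing import Dict


def osf_updates_for_exit(resource: Dict[str, str], exit_status: int) -> Dict[str, str]:
    """Return {stage: value} updates for the given process exit status."""
    # Exit status is zero-filled to two digits, e.g. 1 -> XPOLL_STATE.01
    state = resource.get(f"XPOLL_STATE.{exit_status:02d}")
    # Unmapped exit status values fall back to the XPOLL_ERROR.XX entries.
    prefix = f"{state}." if state is not None else "XPOLL_ERROR."
    return {
        key[len(prefix):]: value
        for key, value in resource.items()
        if key.startswith(prefix)
    }
```

Applied to the GC/CA example, exit status 1 yields updates for GC and CA, exit status 3 yields an update for GC only, and any unmapped status yields the XPOLL_ERROR.GC value.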

Ok, but what if my external poller processes more than one item per event?

If your external poller specifies an OSF_TRIGGERn.MAXTARGS or FILE_MAXTARGSn value > 1, you can assign separate "exit status" values to each of the items in the event using an intermediary file.

The process must define EVENT_STATE_FILE in the environment (the best way to do this is through an ENV. entry in the process resource file). The value of this environment variable should be the name of a file in OPUS_HOME_DIR in which the external poller will report the individual exit status values for each item in the event.

The external poller must place one line per event item in this file in the following format:

   OSF_STATUS  = s0
   OSF_STATUS1 = s1
   .
   .
   .
   OSF_STATUS(n-1) = s(n-1)

for OSF events or

   EVENT_STATUS  = s0
   EVENT_STATUS1 = s1
   . 
   .
   .
   EVENT_STATUS(n-1) = s(n-1)

for file events where n is the number of items in the event, the OSF_STATUS or EVENT_STATUS definitions in the file correspond to the items defined in the environment of the same name, and s0...s(n-1) are the exit status values to be mapped for that item.

Finally, the external poller must signal that the file should be read by exiting with a status that maps to an XPOLL_STATE value of CHECK_FILE. For example, if the process resource file defines

   XPOLL_STATE.01 = CHECK_FILE

the external poller should exit with a value of 1 in order to have the file read.
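A poller written in Python might build the contents of the status file along these lines. This is a sketch under assumptions: format_event_status is a hypothetical helper, the item names are taken from whatever OSF_STATUS or EVENT_STATUS variables the poller found in its environment, and status values are written as plain integers (whether zero-filling is expected is not stated here).

```python
from typing import Sequence


def format_event_status(names: Sequence[str], statuses: Sequence[int]) -> str:
    """Build one 'NAME = status' line per event item."""
    if len(names) != len(statuses):
        raise ValueError("need exactly one status per event item")
    lines = [f"{name} = {status}" for name, status in zip(names, statuses)]
    return "".join(line + "\n" for line in lines)
```

The poller would write this text to the file named by EVENT_STATE_FILE and then exit with the status that maps to CHECK_FILE.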

Can a process with a file trigger perform an arbitrary task upon successful completion?

Yes.

The FILE_ACTION keyword can be defined for a process with a file trigger. The value of this keyword should specify a script or process to run if the FILE_SUCCESS status is applied. The command-line arguments for the script can include the token ^f or ^F for which the file name that was processed is substituted.

For example, an entry of the form:

   FILE_ACTION = '/bin/rm -f ^f'

will cause the program /bin/rm to run if FILE_SUCCESS is signalled upon completion of file event processing. The file name is substituted in place of ^f, so this file action results in deletion of the file.

Optionally, the keyword FILE_ACTION_OK can accompany the definition of FILE_ACTION to define the expected return status of the file action command. If the return status does not match the value of FILE_ACTION_OK, an error message is reported in the process log file.
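The ^f / ^F substitution can be pictured with a short Python sketch. The function name expand_file_action is hypothetical, and the sketch only shows the token replacement, not how OPUS runs the resulting command or compares its status to FILE_ACTION_OK.

```python
import re


def expand_file_action(action: str, filename: str) -> str:
    """Replace every ^f or ^F token in a FILE_ACTION value with the file name."""
    # A function is used as the replacement so that backslashes in the
    # file name are inserted literally rather than treated as escapes.
    return re.sub(r"\^[fF]", lambda _match: filename, action)
```

With the /bin/rm example above, the token is replaced by the name of the file that was processed.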

How can process resource keyword values be accessed by an external polling process?

Scripts run in a pipeline as an external poller can have process resource file keyword values available to them as system environment variables if those keywords are prepended with ENV. in the process resource file. The prefix ENV. does not become part of the environment variable name.
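The effect of the ENV. prefix can be sketched in Python as follows. The function name environment_from_resource is hypothetical and the resource file is assumed to be already parsed into a dictionary; the sketch shows only the prefix handling, not any translation through a path file.

```python
from typing import Dict


def environment_from_resource(resource: Dict[str, str]) -> Dict[str, str]:
    """Collect ENV.-prefixed keywords, dropping the prefix from each name."""
    prefix = "ENV."
    return {
        key[len(prefix):]: value
        for key, value in resource.items()
        if key.startswith(prefix)
    }
```

A script started as an external poller would then read such a value with its language's normal environment lookup, for example os.environ["OUTPATH"] in Python.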

Can keyword values be obtained from files other than the process resource file?

In addition to placing keyword values directly in a process resource file, you can obtain keyword values indirectly through another file called a path file. The value of each process resource file keyword that is prefixed with ENV. becomes an environment variable for that process; this value is checked against the path file for the path in which the process is being run. If the value from the process resource file entry is found as a keyword in the path file, a second translation takes place to obtain the value of the item from the path file. For this resource entry,

   ENV.OUTPATH = DATOUT

and this path file entry

   DATOUT            /mydisk/mydata/

the value of OUTPATH is translated to "/mydisk/mydata/". If the DATOUT entry in the path file were removed, the value of OUTPATH would not be translated a second time and would remain equal to "DATOUT". Path file keyword entries cannot be set as environment variables directly using ENV.; this only works for process resource file keywords.
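The second translation amounts to a dictionary lookup with the original value as the fallback. A minimal Python sketch, with the hypothetical name resolve_through_path and the path file modeled as a dictionary:

```python
from typing import Dict


def resolve_through_path(value: str, path_file: Dict[str, str]) -> str:
    """Return the path file's value if `value` is a path file keyword, else `value`."""
    return path_file.get(value, value)
```

With the DATOUT entry present the result is the directory from the path file; without it the result stays "DATOUT".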

What are the differences between path files and process resource files?

A process resource file defines parameters used to run a specific process within an OPUS pipeline. A path file defines an environment in which an OPUS pipeline (many pipeline processes) is run. A process cannot be run within an OPUS pipeline without the existence of both a process resource file and a path file.

The path file usually provides a mapping of symbolic names listed in the process resource file entries to specific file paths. This allows a pipeline process to be run on different types of data with different input and output directories (i.e., in parallel pipelines) without altering the process resource file values. For example, STScI runs a real-time pipeline on one path and a production pipeline on another path using the same set of process resource files: all pipeline-specific data are in the path files and are retrieved by the processes through indirection. The path file construct increases the flexibility of OPUS, particularly in multiple-pipeline environments.

Can resource file parameters refer to a specific path?

Yes. This is referred to as creating a "bridge" process.

How do you decide whether to put a keyword value in a path file or in a process resource file?

If you wish to have a keyword value apply to all instances of a process, regardless of the path the process is run in, then the keyword and value belong in the process resource file. However, if you want to make the value path dependent, then you should put the keyword in the process resource file with a value that resolves to a keyword in the path file. An example of the latter case is as follows: An entry in a process resource file like

   DATABASE = PATH_DATABASE

and one in a path file like

   PATH_DATABASE = REQUIRED

results in a value of REQUIRED for DATABASE when the process is run in this path.

To override the value of a keyword in a process resource file for a path, you would prepend the name of the process to the keyword in the path file and supply the overriding value. For example, a path file might contain the entry:

   g2f.DATABASE = OPTIONAL

No matter what value is assigned to DATABASE in the process resource file of g2f, its value will be OPTIONAL when run in this path. If you wish to override the process resource keyword value for all processes in a particular path, you can use "*" as the process name prefix to the keyword in the path file as in:

   *.DATABASE = PROHIBITED
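The lookup order implied by these examples can be sketched in Python. The function name effective_value is hypothetical, and the sketch assumes that a process-specific override is consulted before a "*" override when both are present; this answer does not state which of the two wins.

```python
from typing import Dict


def effective_value(
    keyword: str, process: str, resource: Dict[str, str], path_file: Dict[str, str]
) -> str:
    """Resolve a keyword for one process running in one path."""
    # Assumption: "process.KEYWORD" is checked before "*.KEYWORD".
    for override in (f"{process}.{keyword}", f"*.{keyword}"):
        if override in path_file:
            return path_file[override]
    # No override: translate the resource value through the path file.
    value = resource[keyword]
    return path_file.get(value, value)
```

With DATABASE = PATH_DATABASE in the resource file, a path file containing only PATH_DATABASE = REQUIRED gives REQUIRED, while adding g2f.DATABASE = OPTIONAL changes the result to OPTIONAL for g2f alone.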

Can I give a keyword different values for the same process running in different paths?

Yes.

The process resource file keyword value should have a symbolic name. That name should appear as a keyword in the path file. Different path files, then, can assign different values to that symbolic name.

Can process resource file keyword values come from the environment?

In general, no.

However, if the process resource file keyword values for COMMAND (external pollers), TASK or FILE_ACTION contain the substring SUB[env_var_name], where env_var_name is the name of an environment variable, then the value of that environment variable is substituted for the SUB[] expression before the command is executed.

This same mechanism is supported for all path file values, so a process resource file can obtain values from the environment indirectly using the technique described here.
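The SUB[env_var_name] substitution can be sketched in Python as below. The function name expand_sub is hypothetical, and raising KeyError for an undefined variable is a choice made for this sketch; how OPUS treats an undefined variable is not covered here.

```python
import os
import re
from typing import Mapping, Optional


def expand_sub(text: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Replace each SUB[NAME] with the value of environment variable NAME."""
    env = os.environ if environ is None else environ
    # Assumption: an undefined variable is an error (KeyError) in this sketch.
    return re.sub(r"SUB\[([^\]]+)\]", lambda match: env[match.group(1)], text)
```

For instance, with a hypothetical variable MY_BIN set to /opt/bin, the COMMAND value SUB[MY_BIN]/run.csh would expand to /opt/bin/run.csh.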

Are process resource file keyword names case sensitive?

Yes, both keyword names and values are case sensitive.

Can OPUS protect against running out of disk space?

Yes.

There is a built-in keyword pair designed to provide a disk space check. Specify the keywords as follows:

   MINBLOCKS = 50000           !number of free 512-byte blocks required on
                               !output device
   OUTPATH = /mydir/outgoing   !directory where output products are placed

The OPUS system will automatically verify that sufficient disk space exists on the output device defined by OUTPATH before each "event" (new input file available, or previous pipeline step complete, or process wake-up time arrives). If sufficient free space does NOT exist, the pipeline process will be suspended until enough space is made available. The pipeline process status file will have its status field set to "iowait" in this case. This will show up in the Process Manager (PMG) display.

So what do I do if a process is in "iowait" state?

You should free up enough space to allow the process to continue. "Enough" is defined either by the process resource RESUME_BLOCKS or MINBLOCKS value. If RESUME_BLOCKS is missing from the process resource file, then the value of MINBLOCKS will be used. It's not a good idea to set RESUME_BLOCKS to a value less than MINBLOCKS.
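To check by hand whether enough space has been freed, the comparison can be sketched in Python. This is not the OPUS implementation: the function names are hypothetical, and the sketch assumes that RESUME_BLOCKS applies only to a process already in iowait while MINBLOCKS applies otherwise.

```python
import shutil
from typing import Optional

BLOCK_SIZE = 512  # MINBLOCKS and RESUME_BLOCKS are counted in 512-byte blocks


def free_blocks(outpath: str) -> int:
    """Free space on the device holding `outpath`, in 512-byte blocks."""
    return shutil.disk_usage(outpath).free // BLOCK_SIZE


def should_wait(
    free: int,
    minblocks: int,
    resume_blocks: Optional[int] = None,
    in_iowait: bool = False,
) -> bool:
    """True if the process would enter (or stay in) iowait."""
    if in_iowait and resume_blocks is not None:
        threshold = resume_blocks  # level needed to leave iowait
    else:
        threshold = minblocks  # RESUME_BLOCKS missing: fall back to MINBLOCKS
    return free < threshold
```

Calling should_wait(free_blocks(outpath), minblocks, resume_blocks, in_iowait=True) with the values from the resource file indicates whether more space still needs to be freed.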
