
Frequently Asked Questions


OPUS Path Files





What is a path?

You can think of a path as a set of directories on disk which contain the data being processed in its various stages. Thus the data may start out in a symbolic directory called "input_dir", intermediate products may be created in "middle_dir" of the pipeline, and the final output may be in "final_dir". These symbolic names are found in entries in the process resource files. For example, in the g2f.resource file you will find:
   INPATH     = gif_data   ! Directory where the input files are found
   OUTPATH    = fits_data  ! Directory where output files are written
The path file binds those symbolic names to physical directories. In the path file you will find something like:
   gif_data   =  /home/mydir/opus_test/dat/
   fits_data  =  /home/mydir/pipe/out/
This allows you to create several independent pipelines without having to change either code or the process resource files. All you need to define a distinct pipeline is a new path file.
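
The binding described above can be sketched in Python. This is only an illustration, not OPUS code: the keywords are held in plain dictionaries, the `bind` helper is hypothetical, and the second set of directories (`/home/mydir/other/...`) is invented to show a second pipeline.

```python
# Symbolic names, as they appear in the g2f.resource example.
resource_keywords = {"INPATH": "gif_data", "OUTPATH": "fits_data"}

# Two path files binding the same symbols to different directories.
# path_b is a hypothetical second pipeline.
path_a = {"gif_data": "/home/mydir/opus_test/dat/",
          "fits_data": "/home/mydir/pipe/out/"}
path_b = {"gif_data": "/home/mydir/other/dat/",
          "fits_data": "/home/mydir/other/out/"}


def bind(resource, path):
    """Replace each symbolic name with the directory the path file assigns it."""
    return {keyword: path[symbol] for keyword, symbol in resource.items()}


# The same resource keywords yield different directories per path file.
print(bind(resource_keywords, path_a))
print(bind(resource_keywords, path_b))
```

The resource dictionary is never modified; only the path dictionary changes between pipelines.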


Where do I find the path files in the OPUS directory tree?

Path files are created by the user and are put in the OPUS_DEFINITIONS_DIR directory. This environment variable is defined by the opus_login.csh file in OPUS_DEFINITIONS_DIR. The variable can be defined to stretch through a local user directory and then through the OPUS system directories, similar to a UNIX path. A template for a path file is included in the sample pipeline in the directory
   ~/opus_test/definitions/


How do I find out what directories are used in a path?

The specific directories for the sample pipeline paths are contained in the path files located in:
   ~/opus_test/definitions/
Rather than hunting for this file, you can have the Process Manager (PMG) display it. Under the "File" menu, click on the "Select Process.." option.

[Figure: PMG File Menu]

Double-click on the name of the path, and the path file will be displayed.

[Figure: PMG Path File View]


What are the rules for making a path file?

First, the default path root names are limited to 9 characters. This is the default size of the "PATH" field in an observation status file. Although the OSF field sizes are user-configurable, the current OMG only supports the default sizes:
   babylon5.path                        ! correct
   xfiles.path                          ! correct
   nine_char.path                       ! correct

   samplepipe.path                      ! incorrect
   more_than_nine_char.path             ! incorrect
Each path file must use the extension ".path". The format of the path file contents is similar to a process resource file. Exclamation points are used to delimit comments, either in-line or at the beginning of a line; there are keywords and values in the path file separated by equal signs.
   !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
   ! this is a comment line
   !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!(so is this)!!!!!!!!!!!!!!!!
   KEYWORD     =     value          !comment
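
A minimal Python sketch of reading this format follows. It is not the OPUS parser: the function name `parse_path_file` is hypothetical, and the sketch assumes that "!" always begins a comment, so a value cannot itself contain an exclamation point.

```python
def parse_path_file(text):
    """Return the KEYWORD = value pairs of a path file as a dict.

    Assumption: '!' always starts a comment, so values cannot contain '!'.
    """
    keywords = {}
    for raw in text.splitlines():
        # Strip full-line and in-line comments, then surrounding blanks.
        line = raw.split("!", 1)[0].strip()
        if not line:
            continue
        if "=" not in line:
            raise ValueError(f"expected KEYWORD = value, got: {raw!r}")
        key, value = line.split("=", 1)
        keywords[key.strip()] = value.strip()
    return keywords


sample = """\
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! this is a comment line
gif_data   =  /home/mydir/opus_test/dat/
OPUS_DB    =  OPUSTEST            !Database for testing
"""

print(parse_path_file(sample))
```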
Unlike the process resource file, an override path file is not allowed. That is, path files with the same name are not allowed within the OPUS_DEFINITIONS_DIR stretch.
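
The naming rules above can be checked mechanically. The sketch below is an illustration under assumptions: both helper names are hypothetical, and the contents of the OPUS_DEFINITIONS_DIR stretch are passed in as a dictionary of directory listings rather than read from disk, because the exact format of the variable is not described here.

```python
import os

MAX_ROOT_LENGTH = 9  # default size of the PATH field in an OSF


def check_path_file_name(filename):
    """Return a list of rule violations for a path file name (empty if none)."""
    problems = []
    root, extension = os.path.splitext(filename)
    if extension != ".path":
        problems.append("extension must be .path")
    if len(root) > MAX_ROOT_LENGTH:
        problems.append(f"root name exceeds {MAX_ROOT_LENGTH} characters")
    return problems


def find_duplicates(directory_listings):
    """Given {directory: [file names]}, return path files named more than once."""
    seen, duplicates = set(), set()
    for names in directory_listings.values():
        for name in names:
            if not name.endswith(".path"):
                continue  # resource files may be overridden; only path files may not
            if name in seen:
                duplicates.add(name)
            seen.add(name)
    return sorted(duplicates)


for name in ("babylon5.path", "nine_char.path", "samplepipe.path"):
    print(name, check_path_file_name(name))
```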


Are things other than directories in a path file?

Yes.

Keywords other than those that define directories can reside in the path file, and each of these keywords is accessible to the tasks which are running in that path. For example, the default database for the pipeline tasks can be set in the path file:

   OPUS_DB   =   OPUSTEST            !Database for testing
But this can be overridden by an identical keyword in the process resource file.


Do the path file variables supersede process resource file variables?

No.

It's the other way around. If a process resource file has a keyword which has an identical keyword in a path file, then the process resource file takes precedence.

The precedence rules between path keywords and resource file keywords are a little elaborate.

  1. The path file can override a keyword in a particular resource file by prepending the name of the process:
       ENV.OK_TO_UPDATE_DATABASE = TRUE       !in the getkw process resource file
       
       GETKW.OK_TO_UPDATE_DATABASE  =  FALSE  !in the path file
    
    Then when the getkw task inquires about the value of the OK_TO_UPDATE_DATABASE environment variable, it will be FALSE (recall that the ENV. prepended to the keyword name is not part of the name itself, but instead is a command telling OPUS to set this keyword and value in the process' environment).

  2. Similarly, the path file can override a keyword for all resource files by prepending "*." to the keyword name:
       ENV.OK_TO_UPDATE_DATABASE = TRUE       !in the process resource file
    
       *.OK_TO_UPDATE_DATABASE  =  FALSE      !in the path file
    
    Then when any task inquires about the value of OK_TO_UPDATE_DATABASE, it will be FALSE.

  3. If the process resource file points to a keyword in the path file, then the value of the process resource file will be the value of the path file keyword:
       ENV.OUTDIR = fits_directory            !in process resource file
    
       fits_directory  =  /home/test/fits/    !in the path file
    
    Then when the task inquires about the value of OUTDIR, the path "/home/test/fits/" will be returned.

  4. Finally, if the process resource file keyword is not overridden by the rules above, its keyword value will be unaltered:
       ENV.DSQUERY = nomad                    !in the process resource file
    
       DSQUERY = daneel                       !in the path file
    
    Then when the task inquires about the value of DSQUERY, the value "nomad" will be returned.
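
The four rules can be summarized in a short Python sketch. It is an interpretation of the rules above, not the OPUS implementation. The function name `resolve` is hypothetical, and several details are assumptions: the process-specific override is tried before the "*." override, the process name in the path file is matched without regard to case (as the getkw/GETKW example suggests), and an overriding value is returned as written without further translation.

```python
def resolve(process, keyword, resource, path):
    """Return the value a task would see for `keyword` (sketch).

    `resource` holds the process resource file keywords with any ENV. prefix
    already removed; `path` holds the path file keywords.
    """
    # Rule 1: PROCESS.KEYWORD in the path file overrides this one process.
    for key, value in path.items():
        prefix, dot, name = key.partition(".")
        if dot and name == keyword and prefix.lower() == process.lower():
            return value

    # Rule 2: *.KEYWORD in the path file overrides every process.
    if "*." + keyword in path:
        return path["*." + keyword]

    if keyword in resource:
        value = resource[keyword]
        # Rule 3: the resource value names a path file keyword, so translate it.
        # Rule 4: otherwise the resource value is returned unaltered.
        return path.get(value, value)

    # Keyword present only in the path file: tasks in the path can still read it.
    return path.get(keyword)


print(resolve("g2f", "DSQUERY", {"DSQUERY": "nomad"}, {"DSQUERY": "daneel"}))
```

The final call mirrors rule 4: the identical keyword in the path file does not override the resource file value.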


Can path file variables refer to environment variables?

Path file values can be retrieved from the process' environment, in whole or in part, by using the following syntax in the value field:
   SUB[env_var_name]
where env_var_name is an environment variable. For example, a keyword/value entry in a path file might be written:
   DATA_PATH = SUB[DPATH]
in order to assign the value of the environment variable DPATH to the path file keyword DATA_PATH.

 
One use for such a facility is to maintain static or configured versions of path files that can be used in different pipelines. In such a setup, the path file values can be tailored to each pipeline through the environment of the process. Each instance of the SUB[] token is replaced with the value of the named environment variable at the time that the path file value is requested by the application. Note that this also applies to process resource file values that resolve to path file values. More than one instance of this construct may appear in a single path file keyword value definition. Since the substitution is performed on demand, a process could effectively change path file values on the fly by altering the value of environment variables between calls to fetch path file keyword values.

 
For example, an arbitrary path file might contain:
   USER_HOME_DIR = /home/SUB[USER]
In this case, the result of retrieving the value of USER_HOME_DIR from this path file will depend on the value of USER in the process' environment.

As a second example, an arbitrary process resource file might contain:
   OUTPATH = BLUE->OUTPATH
and blue.path might contain:
   OUTPATH  = SUB[OUTPUT_DIR]
In this case, OUTPATH in both the process resource file and the blue path file is assigned the value of OUTPUT_DIR in the process' environment.
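
The on-demand substitution of SUB[] tokens can be sketched in Python with a regular expression. This is an illustration only: the function name `expand_sub` is hypothetical, and raising `KeyError` for an undefined environment variable is an assumption, since the behavior of OPUS in that case is not described here.

```python
import os
import re

# Matches SUB[name] and captures the environment variable name.
SUB_TOKEN = re.compile(r"SUB\[([^\]]+)\]")


def expand_sub(value, environ=None):
    """Replace every SUB[name] token in `value` with the value of `name`.

    Assumption: an undefined variable raises KeyError.
    """
    environ = os.environ if environ is None else environ
    return SUB_TOKEN.sub(lambda match: environ[match.group(1)], value)


# The substitution happens at request time, so a changed environment
# changes the result of the next request.
env = {"USER": "alice"}
first = expand_sub("/home/SUB[USER]", env)
env["USER"] = "bob"
second = expand_sub("/home/SUB[USER]", env)
print(first, second)
```

Passing the environment as a parameter keeps the sketch testable; a real task would use its own process environment.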

 


How do path names get translated?

The process resource file ordinarily has symbolic names for paths. For example:
   ENV.OUTDIR = TTAG_CALIB_DIR       !The timetag calibration directory
The physical names of the directories are usually maintained in the path file where, for example, you might see:
   TTAG_CALIB_DIR  =  /home/opus/ttag_data/calib/  !calibration directory
OPUS guarantees that the translation of the environment variable is done before the task is actually started, so the task can use $OUTDIR and be assured that the right path will be substituted. This indirection allows the same process to run in several different paths without changing the process resource file. If TTAG_CALIB_DIR is not defined in the path file, then $OUTDIR will literally be a directory of the name "TTAG_CALIB_DIR". Additionally, OUTDIR is an environment variable available only to the process(es) in whose process resource file it is defined. Also note that it is the value defined in the process resource file that becomes the environment variable, not the entry found in the path file.
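
The translation step described above can be sketched as follows. This is a simplified illustration, not OPUS code: the function name `build_task_environment` is hypothetical, the keywords are held in dictionaries, and the override rules and SUB[] tokens are left out.

```python
def build_task_environment(resource, path):
    """Return the environment variables that would be set for a task (sketch)."""
    environment = {}
    for keyword, value in resource.items():
        if not keyword.startswith("ENV."):
            continue  # only ENV. keywords are placed in the environment
        # The variable is named after the resource keyword, not the path entry.
        name = keyword[len("ENV."):]
        # Translate through the path file; an undefined symbol stays literal.
        environment[name] = path.get(value, value)
    return environment


resource = {"ENV.OUTDIR": "TTAG_CALIB_DIR", "INPATH": "gif_data"}
path = {"TTAG_CALIB_DIR": "/home/opus/ttag_data/calib/"}

print(build_task_environment(resource, path))  # OUTDIR is translated
print(build_task_environment(resource, {}))    # OUTDIR stays "TTAG_CALIB_DIR"
```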


Can one pipeline feed data into another pipeline?

Yes.

It is possible to construct a "bridge" task. A process in one pipeline can, for instance, create a file in another pipeline. The existence of that file might trigger new actions in the second pipeline. For example, assume the process resource file for the first task contained the following keyword:

   ENV.OUTDIR = blue->INPUT_DIR !Send data to another pipeline
Assume, further, that the first task is just a shell script which wants the "blue" path to be notified of an event. That first task need only create a notification file in the directory where the process in the second (blue) pipeline is expecting notification. For example:
   % touch $OUTDIR/F93874592_notification.fnd
OPUS takes care of the translation between the symbol OUTDIR and the actual directory used by the second process. That directory is normally specified in the blue.path file as some physical directory, for example:
   INPUT_DIR   =   /home/mydir/blue/notify/ !notification directory
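
The "path->KEYWORD" form can be resolved with a small lookup across path files. The Python sketch below is an assumption-laden illustration: `resolve_reference` is a hypothetical name, "red" is an invented name for the first pipeline, and the path name is lower-cased before lookup because this FAQ writes both "blue->" and "BLUE->" for blue.path.

```python
def resolve_reference(value, current_path, all_paths):
    """Translate `value`, honouring the path->KEYWORD form (sketch).

    `all_paths` maps a path root name (e.g. "blue") to that path file's keywords.
    """
    if "->" in value:
        path_name, keyword = (part.strip() for part in value.split("->", 1))
        # Assumption: path names are matched without regard to case.
        return all_paths[path_name.lower()][keyword]
    # No cross-path reference: translate through the task's own path file.
    return all_paths[current_path].get(value, value)


all_paths = {
    "blue": {"INPUT_DIR": "/home/mydir/blue/notify/"},
    "red": {},  # hypothetical first pipeline
}

print(resolve_reference("blue->INPUT_DIR", "red", all_paths))
```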


How does a task "inquire" about a path file value?

The OPUS system performs all required translations, including any indirect definitions in the path file, and sets an environment variable for each keyword in the process resource file that is prefixed with ENV. Thus a shell script can access any of those process resource file parameters in the same way it accesses environment variables.

It is consequently a good idea, but not required, to use uppercase for the process resource file keywords that will be placed in the environment.
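
From the task's side, no OPUS-specific call is needed to read such a value. The sketch below shows a task written in Python rather than as a shell script; the helper name `require_env` is hypothetical, and OUTDIR stands for any keyword defined with the ENV. prefix in the task's process resource file.

```python
import os


def require_env(name, environ=None):
    """Return an environment variable, failing clearly if it is not set."""
    environ = os.environ if environ is None else environ
    try:
        return environ[name]
    except KeyError:
        raise RuntimeError(
            f"{name} is not set; is ENV.{name} defined in the process resource file?"
        ) from None


# A stand-in environment is used here so the sketch runs outside a pipeline;
# a real task would call require_env("OUTDIR") with no second argument.
outdir = require_env("OUTDIR", {"OUTDIR": "/home/test/fits/"})
print(outdir)
```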


What keywords are required in the path file?

There is only one required keyword for path files: OPUS_OBSERVATIONS_DIR. Each path represents a distinct pipeline containing observations (datasets) being processed in that path. These observations are tracked through a set of Observation Status Files (OSF) maintained in the directory specified by OPUS_OBSERVATIONS_DIR. Each path must use a different OSF directory.

Each path also must be linked to a pipeline stage file that defines the processing stages for that path. If this file is named [path]_pipeline.stage, where [path] is the root name of the path, and located in OPUS_DEFINITIONS_DIR, then the pipeline stage file need not be defined in the path file. However, if a different file name or location is chosen, the STAGE_FILE keyword must be present in the path file, and it must indicate the complete pipeline stage file specification.

For example, a path called "blue" might contain the following lines in its path file.

   STAGE_FILE             = OPUS_DEFINITIONS_DIR:blue_pipeline.stage
   OPUS_OBSERVATIONS_DIR  = /home/mydir/blue/observations/
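
These requirements can be checked with a short Python sketch. Both helper names are hypothetical, and writing the default stage file as "OPUS_DEFINITIONS_DIR:[path]_pipeline.stage" is an assumption modeled on the STAGE_FILE example above. The "red" and "green" paths are invented for the illustration.

```python
def stage_file_for(path_name, keywords):
    """Return the pipeline stage file specification for a path (sketch)."""
    if "OPUS_OBSERVATIONS_DIR" not in keywords:
        raise ValueError(f"{path_name}.path lacks OPUS_OBSERVATIONS_DIR")
    # Assumed spelling of the default stage file location.
    default = f"OPUS_DEFINITIONS_DIR:{path_name}_pipeline.stage"
    return keywords.get("STAGE_FILE", default)


def shared_osf_directories(paths):
    """Return OSF directories claimed by more than one path."""
    owners = {}
    for name, keywords in paths.items():
        owners.setdefault(keywords["OPUS_OBSERVATIONS_DIR"], []).append(name)
    return {directory: names for directory, names in owners.items() if len(names) > 1}


paths = {
    "blue": {"OPUS_OBSERVATIONS_DIR": "/home/mydir/blue/observations/"},
    "red": {"OPUS_OBSERVATIONS_DIR": "/home/mydir/blue/observations/"},
    "green": {"OPUS_OBSERVATIONS_DIR": "/home/mydir/green/observations/"},
}

print(stage_file_for("blue", paths["blue"]))
print(shared_osf_directories(paths))  # blue and red wrongly share a directory
```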

