Frequently Asked Questions


Introduction


What is OPUS?

OPUS is a distributed pipeline system that allows multiple instances of multiple processes to run on multiple nodes over multiple paths. It is a generic system, tied neither to a particular processing environment nor to a particular mission, and it is flexible enough to support the development of a pipeline for your own telemetry processing.

The OPUS system is designed to support a sequence of independent applications which take datasets from a raw form and process them to some intermediate or final state.

The OPUS system does not supply you with the mission-specific applications themselves. Instead OPUS provides you with a fully distributed "pipeline" processing environment which will help you organize your applications, monitor the processing, and control what is going on in your pipeline.

The monitoring is done with two Motif applications: the Process Manager (PMG), and the Observation Manager (OMG). The first gives you the tools you need to control the individual tasks in your pipeline. The second gives you a quick view of the status of datasets in your pipeline and allows you to control individual observations.

But more than monitoring, the OPUS system makes it easy for you to distribute processing over multiple nodes, running several instances of the same task either on one machine, or on several machines.

The OPUS system is designed to be used for a variety of purposes, simultaneously. You can establish a production OPUS pipeline, while running a reprocessing OPUS pipeline at the same time. You can use the OPUS system to control and monitor a variety of calibration sequences which simultaneously share reference data and input datasets.


What is included in the OPUS System?

The OPUS System includes, for several operating systems, the eXternal POLLer (XPOLL), the OPUS Process Manager (PMG), and the OPUS Observation Manager (OMG). These three components allow you to add your own applications and construct your own production pipeline.

In addition, the OPUS Application Programming Interface (OAPI) is included. The OAPI is distributed as an object library and C++ header files with which you can write applications that interface to the OPUS system directly from within C++ code, or even extend the functionality of OPUS to suit your needs.

On their own, these components don't process any data. They simply provide the capability for you to construct a distributed, automated production pipeline designed to process telemetry for your instruments. Your pipeline consists of tasks you have written, which are then started, monitored, and managed by OPUS.
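
Because XPOLL can wrap any ordinary executable, a pipeline task does not have to link against the OAPI at all. Below is a minimal sketch, in C++, of the kind of stand-alone step XPOLL might run. The dataset argument, the .raw/.cal suffixes, and the exit codes are illustrative assumptions, not part of the OPUS interface; in a real pipeline the mapping from exit status to pipeline status would be defined in the process resource file.

    // toy_step.cpp -- a hypothetical pipeline step, not part of OPUS.
    // XPOLL runs an ordinary executable such as this and interprets its
    // exit status; the mapping from exit code to pipeline status would
    // be defined in the process resource file.
    #include <fstream>
    #include <iostream>
    #include <string>

    int main(int argc, char* argv[]) {
        if (argc < 2) {
            std::cerr << "usage: toy_step <dataset>\n";
            return 2;                      // bad invocation
        }
        const std::string dataset = argv[1];

        std::ifstream in(dataset + ".raw");
        if (!in)
            return 1;                      // missing input: an error status

        std::ofstream out(dataset + ".cal");
        out << in.rdbuf();                 // stand-in for real processing
        return out ? 0 : 1;                // 0 = success
    }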


What are the OPUS "Managers"?

Two pipeline "managers" come with the OPUS system; both are Motif GUI applications that assist the user in monitoring the system. The Process Manager (PMG) not only assists with configuring the system, but also monitors which processes are running on which nodes and what they are currently doing.

The Observation Manager (OMG) takes a second view of the pipeline activities, monitoring what datasets are where in the pipeline and alerting the operator when observations are unable to complete pipeline processing.


What is not included in the distribution?

The HST-specific applications are not included. These have limited applicability to other missions and are designed to process the telemetry for the specific instruments aboard the Hubble Space Telescope. However, there is sufficient software on the demonstration CD to enable you to build a complete production pipeline with your own applications.


What does the OPUS Sample Pipeline demonstrate?

In addition to the OPUS System, a simple set of applications is included in the CD-ROM distribution. This "sample pipeline" demonstrates some of the capabilities of the OPUS system. It allows you to run the pipeline, understand what happens when you modify process resource files, experiment with the OPUS Managers (OMG and PMG), and test the OPUS capability to distribute processing.

The sample pipeline was developed to test the functionality of the OPUS system. It is used at the Space Telescope Science Institute (STScI) to verify the correctness of new OPUS builds and installations.


Why can't I just use a shell script to tie my applications together?

Certainly you can. And for a low-volume pipeline this might be the low-cost solution. However, as the volume of data increases, as the number of applications grows, and as the complexity of the processing grows, distributing the processing over multiple nodes and monitoring the status of each process and each observation in the pipeline become tasks that quickly outgrow a shell script.

This complex distribution and monitoring task is what OPUS is designed to handle in a robust way.


Why are multiple instances of applications important?

Not all applications are equal. Some run for a significant amount of time; others are quite speedy. Some require a large amount of resources; others are not so demanding. You can increase the total throughput of your pipeline by having multiple copies of an application running simultaneously, perhaps on different machines. This way the pipeline can process several datasets at once.

OPUS allows you to tailor the mix of processes and to add multiple copies of critical applications to the pipeline.
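
What keeps several copies from stepping on each other is that each dataset is claimed atomically through the blackboard (see "How does OPUS work?" below). The sketch below shows the general idea only, not OPUS's actual bookkeeping: each instance tries to rename the dataset's status file, and the filesystem guarantees that only one rename succeeds.

    // claim.cpp -- illustrative only; OPUS's real bookkeeping is richer.
    // Several instances race to claim the same dataset; because a
    // filesystem rename is atomic, exactly one of them wins.
    #include <filesystem>
    #include <iostream>
    #include <string>

    namespace fs = std::filesystem;

    // Try to move dataset.ready -> dataset.busy-<worker>; the losers'
    // renames fail because the source file has already disappeared.
    bool try_claim(const std::string& dataset, const std::string& worker) {
        try {
            fs::rename(dataset + ".ready", dataset + ".busy-" + worker);
            return true;
        } catch (const fs::filesystem_error&) {
            return false;                  // another instance got there first
        }
    }

    int main() {
        if (try_claim("n12345678", "worker1"))
            std::cout << "claimed; safe to process\n";
        else
            std::cout << "already taken; poll for the next dataset\n";
    }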


How many steps can there be in the pipeline?

The more the merrier! Having more steps in the pipeline (decoupling processes so they do only a single task) is essential in constructing an efficient and flexible pipeline. The OPUS motto has always been: decouple, decouple, decouple. Only with a modular system can you use your own resources efficiently to attain the throughput you need.

The default observation status file (OSF) structure accommodates up to 24 stages. Additional stages are possible by reconfiguring the OSF size.
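
To make that limit concrete: an OSF is an empty file whose name carries, among other fields, one status character per pipeline stage. The filename layout and status letters below are a hypothetical illustration, not the literal OSF format.

    // osf_status.cpp -- hypothetical illustration of per-stage status
    // characters packed into a filename; not the literal OSF layout.
    #include <iostream>
    #include <string>

    int main() {
        // Assumed fields: <timestamp>-<one status char per stage>-<dataset>
        const std::string osf_name = "3a7f2c41-ccp_w___-n12345678";

        // Eight stages in this toy layout; the real default allows 24.
        const std::string stages = osf_name.substr(9, 8);
        for (std::size_t i = 0; i < stages.size(); ++i)
            std::cout << "stage " << i + 1 << ": " << stages[i] << '\n';
        // e.g. c = complete, p = processing, w = waiting, _ = not applicable
    }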


What are paths?

A path is a set of directories used when processing data in the pipeline. Multiple pipelines with identical steps, but with different paths, can be run simultaneously, yet without interference. For example, at STScI it is necessary to operate a real-time pipeline at the same time that a production pipeline is processing, while a reprocessing pipeline may also be simultaneously converting science images in the background, and another pipeline may be processing engineering telemetry.
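
Concretely, a path is described by a small text file that maps logical directory names to physical locations, so the same pipeline definition can run against different sets of directories. The keys, delimiter style, and directories below are invented for illustration; they are not the literal OPUS path-file entries:

    input_dir  = /data/quick/input
    cal_dir    = /data/quick/cal
    output_dir = /data/quick/output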


How does OPUS work?

The success of OPUS can be attributed in part to its adoption of a blackboard architecture for interprocess communication. Processes do not communicate with each other directly; they post and update information about their status on a common blackboard.

The blackboard is implemented simply as an ordinary directory on an ordinary disk. The status messages are implemented simply as ordinary files. The files are empty; pertinent information is contained in the filenames.

This technique effectively decouples the communicating processes and automatically makes the entire system more robust. The standard file system available under the operating system provides OPUS with a simple, robust blackboard.
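
The essence of the technique fits in a few lines of C++. The sketch below assumes only a shared directory; the directory name and filename conventions are illustrative, not the literal OPUS ones. Posting a status means creating an empty, well-named file; updating it means an atomic rename, so other pollers never see a half-written message.

    // blackboard.cpp -- minimal sketch of a file-system blackboard.
    // All information lives in the filename; the file itself stays empty.
    #include <filesystem>
    #include <fstream>
    #include <iostream>
    #include <string>

    namespace fs = std::filesystem;

    const fs::path kBoard = "/tmp/opus_board";  // illustrative shared directory

    // Post a status message: create an empty, well-named file.
    void post(const std::string& dataset, const std::string& status) {
        std::ofstream{kBoard / (dataset + "." + status)};
    }

    // Update a status message: rename is atomic, so other pollers
    // never see a half-written message.
    void update(const std::string& dataset,
                const std::string& from, const std::string& to) {
        fs::rename(kBoard / (dataset + "." + from),
                   kBoard / (dataset + "." + to));
    }

    int main() {
        fs::create_directories(kBoard);
        post("n12345678", "waiting");
        update("n12345678", "waiting", "processing");
        for (const auto& entry : fs::directory_iterator(kBoard))
            std::cout << entry.path().filename().string() << '\n';
    }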

The next release of the OPUS system will offer a distributed object alternative to the file system-based storage of blackboards that improves scalability of the system.


Wasn't OPUS developed for the HST project?

Yes, OPUS was first developed for the Hubble Space Telescope and constitutes the production pipeline system at the Space Telescope Science Institute (STScI). It takes the incoming telemetry stream from Goddard Space Flight Center, converts the data to standard FITS files, and stages the observations for calibration. When processing for an observation is complete, all data is then staged to be inserted into the DADS archive.

Special-purpose OPUS applications were developed for the FUSE (Far Ultraviolet Spectroscopic Explorer) mission. While the HST applications were not directly applicable to the FUSE telemetry (although code reuse was significant in this project), the OPUS system, the OPUS pipeline infrastructure, and the OPUS managers are used by the FUSE group.

Since OPUS was originally distributed on CD-ROM in 1997, a number of astronomical institutions have been reviewing its capabilities. The International Gamma-Ray Astrophysics Laboratory (INTEGRAL) in Switzerland is one of the missions that have decided to use OPUS for their pipeline platform.

Other groups have picked up OPUS and are using or planning to use the OPUS platform for their own projects. These include the Chandra X-Ray Observatory and the Mount Stromlo and Siding Spring Observatories (MSSSO) in Australia, with their large camera mosaic project.

Recently, the Space Infrared Telescope Facility (SIRTF) has joined the OPUS platform group, and will be using OPUS to control and monitor their production pipeline after launch.


Isn't OPUS too elaborate for a small mission?

First, OPUS is not a large system. It is small, designed to solve a specific problem in a robust way: distributed processing with controlled monitoring. Even if your processing steps are simple shell scripts, OPUS can provide the glue that ties everything together.

Second, OPUS frees your talented engineering and science staff to do the more "interesting" work. Your mission is to understand the instrument and the science, not to build pipelines.

