GridBean Developer's Guide

Alexander Lukichev, Intel Corporation

Version 0.3 November 30, 2007

Version history

Version and date	By whom	Changes
0.0 December 19, 2005	Alexander Lukichev	Document created
0.1 October 26, 2006	Alexander Lukichev	Clients API updated
0.2 November 9, 2007	Alexander Lukichev	Document revised
0.3 November 30, 2007	Alexander Lukichev	Workflow (gridlet) support described
0.3.1 January 23, 2008	Alexander Lukichev	Fixes of typos

1 Introduction

GridBean is one of the main concepts of GPE.

A GridBean is an object responsible for:

generating job descriptions for grid services;
providing a graphical user interface for input data;
providing a graphical user interface for output data;
providing an interface for the user's interaction with grid services.

GridBeans are divided into several modules:

job description generation module;
one or more user interface modules.

GridBeans are developed using the GridBean SDK and deployed at GridBean Services. GPE clients contact GridBean services for available GridBeans and download the selected ones. Consider the typical use-cases of the GridBeans.

Use-case 1: Computational GridBean

In this case the GridBean provides a GUI for a generator of one-step (computational) jobs. The user fills in the fields on the input panels of the GridBean. During the submit stage (Application Client) or the workflow-generation stage (Expert Client) the GridBean is asked to set up the job definition in the JSDL POSIX Extension format.

Such a job description consists of:

Application name and version
Named application parameters
Stage-in files
Stage-out files

Such job descriptions can be submitted to atomic target systems. They are represented by objects implementing the GPEJob interface. The job execution consists of:

Incarnating the job request using the template from the incarnation database, the supplied parameters and the predefined (per-user, per-application) system parameters
File stage-in
Executing the script
File stage-out
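
The incarnation step listed above can be pictured as template substitution. The following is a minimal sketch under stated assumptions, not the GPE implementation: the `${NAME}` placeholder syntax, the class `IncarnationSketch`, the method `incarnate` and the sample parameter names are all hypothetical, and the rule that supplied job parameters take precedence over predefined system parameters is an assumption made only for this illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IncarnationSketch {
    // Hypothetical placeholder syntax: ${NAME}
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    // Fills a script template with system parameters and job parameters.
    // Assumption: job parameters override system parameters with the same name.
    static String incarnate(String template,
                            Map<String, String> systemParams,
                            Map<String, String> jobParams) {
        Map<String, String> all = new HashMap<>(systemParams);
        all.putAll(jobParams);

        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = all.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("No value for parameter: " + m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in values from being interpreted
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> system = Map.of("APP_HOME", "/opt/app");
        Map<String, String> job = Map.of("SOURCE", "scene.pov", "FORMAT", "PNG");
        String template = "${APP_HOME}/bin/render --input ${SOURCE} --format ${FORMAT}";
        // prints: /opt/app/bin/render --input scene.pov --format PNG
        System.out.println(incarnate(template, system, job));
    }
}
```

A missing parameter is reported as an error here rather than silently left in the script; a real incarnation database may handle this differently.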

Use-case 2: Workflow GridBean

Different types of GridBeans create specific workflows. Such a GridBean is typically an interface to some workflow service. Detailed description to be developed...

2 GPE Clients API

2.1 The general design of the GPE Clients API

The GPE Clients API is a set of Java interfaces and abstract classes providing access to atomic and higher level services. The API classes include:

Clients: the objects providing access to remote operation invocations; each client corresponds to some port type.
Data structures: the objects wrapping data of complex types obtained from the grid services.

The clients are:

`FileTransferClient`	The client for file transfer services
`GridBeanClient`	The client for the GridBean service
`RegistryClient`	The client for the Target System Registry
`TargetSystemClient`	The client for the Target System Resource
`JobClient`	The client for the Job Management Resource
`StorageClient`	The client for the Storage Management Resource
`WSRFClient`	The client for the WSRF-enabled Resource
`WSLTClient`	The client for the WSRF Resource with WS-Resource Lifetime support
`WSRPClient`	The client for the WSRF Resource with WS-Resource Properties support
`WSNClient`	The client for the WSRF Resource implementing WS-NotificationProducer porttype

Generally you cannot instantiate these clients directly through their constructors. They form a tree with the RegistryClient as its root, where child clients are returned by method invocations on parent clients. For example, you obtain a JobClient as the result of the TargetSystemClient.submit operation.
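
The tree-of-clients idea can be illustrated without the GPE libraries. The sketch below uses hypothetical stand-in classes (`RegistryHandle`, `TargetSystemHandle`, `JobHandle`) and invented method signatures; it is not the GPE Clients API. It only shows the pattern: child objects have private constructors and can be reached solely through a method of their parent.

```java
import java.util.ArrayList;
import java.util.List;

class ClientTreeSketch {
    // Hypothetical stand-in for a job client: created only by a target system
    static final class JobHandle {
        final String targetSystem;
        final String description;

        private JobHandle(String targetSystem, String description) {
            this.targetSystem = targetSystem;
            this.description = description;
        }
    }

    // Hypothetical stand-in for a target system client: created only by the registry
    static final class TargetSystemHandle {
        final String name;

        private TargetSystemHandle(String name) {
            this.name = name;
        }

        JobHandle submit(String description) {
            return new JobHandle(name, description);
        }
    }

    // Hypothetical stand-in for the root of the tree
    static final class RegistryHandle {
        private final List<String> names;

        RegistryHandle(List<String> names) {
            this.names = names;
        }

        List<TargetSystemHandle> getTargetSystems() {
            List<TargetSystemHandle> result = new ArrayList<>();
            for (String n : names) {
                result.add(new TargetSystemHandle(n));
            }
            return result;
        }
    }

    public static void main(String[] args) {
        RegistryHandle registry = new RegistryHandle(List.of("cluster-a", "cluster-b"));
        TargetSystemHandle targetSystem = registry.getTargetSystems().get(0);
        JobHandle job = targetSystem.submit("hello-job");
        // prints: hello-job submitted to cluster-a
        System.out.println(job.description + " submitted to " + job.targetSystem);
    }
}
```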

2.2 Generating job descriptions

One of the main data structures used in GridBeans is the abstraction of the job, the interface Job. This interface is too general to be used in real job descriptions, but it is the root of the GPE job description hierarchy.

Each target system may accept several kinds of job definitions. The method TargetSystemClient.getJobType(String) returns the descriptor of a job definition kind based on the target system properties and the provided kind name. Sample names of such job definition kinds are listed in JobType.

Once the job kind is selected, an empty job definition template of this kind can be obtained from the corresponding TargetSystemClient instance with the method TargetSystemClient.newJob(JobType). Then cast the obtained Job instance to the required, less generic job abstraction (JSDLJob, GPEJob, etc.):

    TargetSystemClient targetSystem;
    ...
    JobType jobType = targetSystem.getJobType(JobType.JobDefinitions.GPEJSDL);
    GPEJob gpeJob = (GPEJob) targetSystem.newJob(jobType);

The concrete job definition template is then filled in with the job parameters. See the sections below for details.

2.2.1 Specifying JSDLJob

The JSDLJob is a wrapper of a generic JSDL document. As this document is generic, the only things it can specify are:

resource requirements;
files to stage-in/stage-out.

The possible applications of such description are:

request to a broker;
3rd party transfer job.

2.2.1.1 Specifying stage-in/stage-out

The job description may contain instructions to transfer files:

before job execution (staging-in);
after job execution (staging-out).

A typical application of staging-in is, for instance, transferring the input files from some Grid location to the job's working directory.

To add a stage-in element to the job description, use JSDLJob.addDataStagingImportElement. For stage-out, use JSDLJob.addDataStagingExportElement. For example:

    JSDLJob job;
    ...
    job.addDataStagingImportElement("http://www.intel.com", "Work", "intel.txt");

The first parameter is the URI of the file to get the data from. In this case it is a simple web page accessible through HTTP, but it could equally be a terabyte Grid file that should be transferred over GridFTP. The second parameter is the file system at the target system in which to store the imported file. It may be a static storage such as Root or Temp, or the working directory ("Work") as in this case. The third parameter is the name of the file to create.

2.2.1.2 Specifying job requirements

The job description may also provide requirements to be used by a brokering service to select the target system that executes the job. For example, you may limit the acceptable target systems to Suse9 Linux machines with at least 2 GB of memory:

    JSDLJob job;
    ...
    job.setOperatingSystemRequirements(new OperatingSystemRequirementsType(OSType.linux, "Suse9"));
    job.setIndividualPhysicalMemoryRequirements(new RangeValueType(2147483648L, 2147483648L, Double.POSITIVE_INFINITY));
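
The value 2147483648 used for the memory bound is 2 GB expressed in bytes. The short check below is independent of the GPE API (the class name `MemoryBoundSketch` is made up for this illustration); it shows that the value is one more than the largest Java `int`, so it has to be computed and written as a `long` or `double`.

```java
class MemoryBoundSketch {
    // 2 GB in bytes; the 2L forces long arithmetic so the product does not overflow int
    static long twoGigabytes() {
        return 2L * 1024 * 1024 * 1024;
    }

    public static void main(String[] args) {
        System.out.println(twoGigabytes());                     // prints 2147483648
        System.out.println(twoGigabytes() - Integer.MAX_VALUE); // prints 1
    }
}
```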

2.2.2 Specifying GPEJob

The GPEJob is a wrapper of a GPE atomic job description. It is an extension of JSDLJob.

The following job attributes may be specified:

resource requirements;
files to stage-in/stage-out;
application name and version;
named application parameters;
application working directory;
names of stdout and stderr files.

The application name and version uniquely select the application (or script) to be run on the target system. Named application parameters are substituted into the selected application. The sample mechanism is described in OS Profile for GPE.

2.2.3 Creating workflow jobs

A workflow job is a sequence of instructions for a workflow resource. It is submitted to the resource and then executed there. The client may disconnect right after the submission. The workflow interpreter itself acts as a client of the atomic services.

In GPE the gridlet approach is used for workflow specification. In this approach a workflow is a Java class whose code is transmitted to the workflow service and executed there. Since the workflow specification class (gridlet) is not known to the executing resource (service), its code has to be sent over the network. The transmitted code is executed in a restricted JVM to prevent undesired side effects of users' workflows.

The workflow specification must be a class implementing the com.intel.gpe.clients.api.workflow.gridlet.GridletJob interface.

    class SomeGridlet implements Gridlet {
        public void run() throws Throwable {
            System.out.println("Hello, world!");
        }
    }

The gridlet code is then packed into the JSDL job description:

    // obtain JSDL template
    JobType jobType = gridletTSS.getJobType(JobType.JobDefinitions.GPEJSDL);
    GridletJob job = (GridletJob) gridletTSS.newJob(jobType);

    // create the Gridlet description element
    GridletElement gridletElt = new GridletElement();

    // add the class code
    GridletUtil.addClasses(gridletElt, SomeGridlet.class.getName());

    // set the name of the main class
    gridletElt.setMainClass(SomeGridlet.class.getName());

    // add some variables
    GridletUtil.addVariable(gridletElt, "SomeVariable", "The value");

    // obtain org.w3c.dom.Element
    Element glElem = GridletUtil.getGridletElement(gridletElt);

    // set the JSDL template contents
    job.setGridlet(glElem);

To access the passed variables or obtain the clients to remote services the workflow specification code may use the com.intel.gpe.clients.api.gridlet.Engine object:

    // obtain the current engine reference
    Engine engine = Engine.getEngine();

    // get the variable value
    Object[] value = engine.getVariable("SomeVariable");

    // get the GPE object factory
    GridProgrammingEnvironmentFactory gpef = GridProgrammingEnvironmentFactory.getInstance(engine);

    // get the string value
    String stringValue = gpef.getVariable("SomeVariable");

    // get the broker client reference
    TargetSystemClient broker = gpef.getBroker("Broker");

The workflow can also set the variables (they are accessed as resource properties of the corresponding job resource):

    ...
    engine.setVariable("Result1", new Object[] { "Result" });
    ...
    JobClient jobClient = ... ;
    gpef.setVariable("JobClient", jobClient);
    ...

These values are accessed as resource properties of the corresponding job resource:

    // submit the workflow job
    GridletJobClient gridletJobClient = gridletTSS.submit(...);
    ...
    // walk through variables
    for (Variable variable : gridletJobClient.getVariables()) {
        Object[] value = variable.getValue();
    }
    ...
    // get the WSRF client defined by some variable
    WSRFClient wsrfClient = gridletJobClient.getEndpointReference("JobClient");
    ...

3 Developing GridBeans

3.1 Typical GridBean structure

A typical GridBean consists of 2 modules:

job description generation module;
user interface module.

The job description generation module (GridBean model) implements two interfaces and thus provides the following functions:

storing named GridBean parameters (IGridBeanModel);
generating a job description using these stored parameters (IGridBean).

There may be several user interface modules. The module for GPE standalone clients provides the plugin for the Application and Expert Clients. It implements IGridBeanPlugin and provides the following:

the set of input panels (IGridBeanPanel);
the set of output panels (IGridBeanPanel);
methods for loading data from and storing data to the GridBean model;
validating input data.

3.2 Storing GridBean parameters and generating job description

The user enters data into the GridBean input panels. During the job submission stage these data are transformed and stored in the GridBean model fields. Each such field has a name, represented as a QName, and a value.

The client application requests the GridBean to generate a job description by calling IGridBean.setupJobDefinition(Job). The passed parameter is a job description template. Typical code for generating an atomic job description is as follows:

    public void setupJobDefinition(Job job) throws GridBeanException {
        if (job instanceof GPEJob) {
            GPEJob gpeJob = (GPEJob) job;
            gpeJob.setApplicationName(APPLICATION_NAME);
            gpeJob.setApplicationVersion(APPLICATION_VERSION);
            gpeJob.setWorkingDirectory(GPEConstants.JobManagement.TEMPORARY_DIR_NAME);
            gpeJob.addOption(FORMAT_OPTION, "PNG");
            gpeJob.addField(SOURCE_FIELD, ((AbstractFile) get(SOURCE)).getTargetSystemFile());
            gpeJob.addField(TARGET_FIELD, ((AbstractFile) get(TARGET)).getTargetSystemFile());
            gpeJob.setId((String) get(JOBNAME));
        } else {
            throw new GridBeanException("Unsupported job class: " + job.getClass().getName());
        }
    }
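
The named fields of the GridBean model described in this section can be pictured as a map keyed by QName. The sketch below is a simplified stand-in and not the actual IGridBeanModel implementation: the class `ModelSketch`, its `set` and `get` methods, the namespace URI and the field names are assumptions for illustration only. It uses the standard `javax.xml.namespace.QName` class.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.namespace.QName;

class ModelSketch {
    // Field values keyed by qualified name (namespace URI + local part)
    private final Map<QName, Object> fields = new LinkedHashMap<>();

    void set(QName name, Object value) {
        fields.put(name, value);
    }

    Object get(QName name) {
        return fields.get(name);
    }

    public static void main(String[] args) {
        // Hypothetical namespace and field names
        String ns = "http://example.org/gridbeans/sketch";
        QName jobName = new QName(ns, "JOBNAME");
        QName width = new QName(ns, "WIDTH");

        ModelSketch model = new ModelSketch();
        model.set(jobName, "render-1");
        model.set(width, 640);

        // Values come back as Object and are cast at the point of use,
        // in the same way as get(JOBNAME) is cast to String when a job is set up
        String id = (String) model.get(jobName);
        int w = (Integer) model.get(width);
        System.out.println(id + " " + w); // prints: render-1 640
    }
}
```

Two QName instances with the same namespace URI and local part are equal, so a field can be looked up with a freshly constructed name.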

3.3 Input and output parameters

Some fields may be declared as input or output parameters.

Input parameters are used to declare input files (or arbitrary data) obtained from the local machine or from other jobs.

Output parameters are used to declare output files (or arbitrary data) which may be transferred to the local machine or to other jobs.

Input and output parameters belong to one of the following types (see GridBeanParameterType):

XML
URL
GPE File

Table 1. Input parameters

Parameter type	Description	Atomic job
GPE File	A file at a remote location	The object of class `InputFileParameterValue` representing a remote file.
URL	A file at a remote location represented by its URL	The object of class `InputFileParameterValue` representing a remote file.
File Set	A set of files at a remote location	The object of class `InputFileSetParameterValue` representing a fileset.

Table 2. Output parameters

Parameter type	Description	Atomic job
GPE File	A file at a remote location	The object of class `OutputFileParameterValue` representing a remote file. The file is accessed via `getGpeFile` method. Before rendering the GridBean output panel (where the output parameters are usually accessed) the files are fetched from the job's outcome directory.
URL	A file at a remote location represented by its URL	The object of class `OutputFileParameterValue` representing a remote file. The file is accessed via `getGpeFile` method. Before rendering the GridBean output panel (where the output parameters are usually accessed) the files are fetched from the job's outcome directory.
File Set	A set of files at a remote location	The object of class `OutputFileSetParameterValue` representing a fileset. At GridBean job design time (before submission) the methods of the object may be used to specify which files are to be included in the fileset (`setIncludes`, `setExcludes`, `setDirStructure`, `setCaseSensitive`). At rendering time the list of fetched files is accessed via `getFiles`.
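
The include/exclude selection mentioned for file sets can be sketched with the standard `java.nio.file` glob matcher. This is only an illustration: the pattern syntax actually accepted by `setIncludes` and `setExcludes` is not described here, so glob patterns are an assumption, and the class `FileSetSketch` with its `select` method is hypothetical.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class FileSetSketch {
    // Keeps the names that match the include pattern and do not match the exclude pattern
    static List<String> select(List<String> names, String include, String exclude) {
        PathMatcher includes = FileSystems.getDefault().getPathMatcher("glob:" + include);
        PathMatcher excludes = FileSystems.getDefault().getPathMatcher("glob:" + exclude);

        List<String> selected = new ArrayList<>();
        for (String name : names) {
            if (includes.matches(Paths.get(name)) && !excludes.matches(Paths.get(name))) {
                selected.add(name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> outcome = List.of("frame1.png", "frame2.png", "preview.png", "render.log");
        // prints: [frame1.png, frame2.png]
        System.out.println(select(outcome, "*.png", "preview*"));
    }
}
```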

3.4 Developing GridBean user's interface for GPE standalone clients

In order to be used in a GPE standalone client (Application or Expert) the GridBean must provide the corresponding graphical plugin. The plugin must implement the interface IGridBeanPlugin. Such a plugin provides:

the set of input panels;
the set of output panels.

Each input or output panel is an instance of IGridBeanPanel. This interface provides methods for:

uniform control of the user's interface components on the panel;
getting the component representing the panel;
validating the input data.

Each control within the input or output panel may have a name - qname - and in this case it may also have:

translator;
validator;
description.

A translator is an object implementing IValueTranslator that translates the original value that is contained in the graphical component into the GridBean's internal value representation. E.g. the value of the text field component that is used for specification of some numerical data is of type String but it may be translated and stored in the GridBean model as double.

A validator is an object implementing IValueValidator that validates the translated value of the graphical component and provides the error message in the case of invalid value. E.g. the validator may check that some input value is not empty.

A description is a string describing the corresponding field semantics. It is mostly used to generate diagnostics to the user in case of failed value validation.

The translated values of the input panel components are stored in the GridBean model under their names. The decoded values from the GridBean model may be loaded into the output panel when the job's outcome is displayed.

Typical code for creating the components of an input panel looks as follows:
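
The interplay of translator, validator and description can be sketched as follows. The interfaces `Translator` and `Validator` below are hypothetical stand-ins; the real IValueTranslator and IValueValidator signatures may differ. The sketch only shows the order of operations: the raw component value is translated first, the translated value is validated next, and the description is used to build the diagnostic message.

```java
class TranslateValidateSketch {
    // Hypothetical stand-in: converts the raw component value to the internal representation
    interface Translator<T> {
        T translate(String raw) throws Exception;
    }

    // Hypothetical stand-in: returns null for a valid value, otherwise an error message
    interface Validator<T> {
        String validate(T value);
    }

    // Returns null when the input is accepted, otherwise a diagnostic for the user
    static String check(String description, String raw) {
        Translator<Integer> translator = s -> Integer.valueOf(s.trim());
        Validator<Integer> validator = v -> v > 0 ? null : "must be positive";
        try {
            Integer value = translator.translate(raw);
            String error = validator.validate(value);
            return error == null ? null : description + ": " + error;
        } catch (Exception e) {
            return description + ": not a number";
        }
    }

    public static void main(String[] args) {
        System.out.println(check("Width", "640")); // prints: null
        System.out.println(check("Width", "-5"));  // prints: Width: must be positive
        System.out.println(check("Width", "abc")); // prints: Width: not a number
    }
}
```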

    private void buildComponents() throws DataSetException {
        setLayout(new GridBagLayout());

        JTextField nameTextField = new JTextField();
        add(new JLabel("Name:"), LayoutTools.makegbc(0, 0, 1, 0, false));
        GridBagConstraints c = LayoutTools.makegbc(1, 0, 1, 0, true);
        add(nameTextField, c);
        linkJobNameTextField(POVRayGridBean.JOBNAME, nameTextField);

        JTextField widthField = new JTextField();
        add(new JLabel("Width:"), LayoutTools.makegbc(1, 1, 1, 0, false));
        add(widthField, LayoutTools.makegbc(2, 1, 1, 0, true));
        linkTextField(POVRayGridBean.WIDTH, widthField);
        setValueTranslator(POVRayGridBean.WIDTH, StringValueTranslator.getInstance());
        setValueValidator(POVRayGridBean.WIDTH, IntegerValueValidator.getInstance());

        JTextField heightField = new JTextField();
        add(new JLabel("Height:"), LayoutTools.makegbc(3, 1, 1, 0, false));
        add(heightField, LayoutTools.makegbc(4, 1, 1, 0, true));
        linkTextField(POVRayGridBean.HEIGHT, heightField);
        setValueTranslator(POVRayGridBean.HEIGHT, StringValueTranslator.getInstance());
        setValueValidator(POVRayGridBean.HEIGHT, IntegerValueValidator.getInstance());
    }

Glossary

Atomic job. A one-step computational job. It can be submitted to some atomic target system and executed there. The expected outcome is a set of output files including standard output and standard error. The job is specified in the form of GPE Atomic JSDL extension.

Atomic services. The services implementing the corresponding port types defined by Unigrids project. These port types include:

TargetSystem
JobManagement
StorageManagement
FileTransfer
TargetSystemFactory

The WSRF resources behind these services perform atomic operations like job submission, file transfer, other file operations.

Atomic Target System. The target system resource that can execute atomic jobs.

Endpoint Reference. The essential element of Web Services Addressing specification. The endpoint reference is an XML structure providing the information for unique location of some network entity (web-service, resource, etc.).

GPE Atomic JSDL extension. The JSDL extension used to specify atomic jobs. It is based syntactically on the JSDL POSIX extension but uses the environment section of the Application element to pass the named parameters of the abstract application instead of actual environment variable values.

GPE Clients API. The set of client classes for working with atomic and higher level services. See also the section GPE Clients API.

GridBean SDK. A set of libraries and tools provided with GPE to develop GridBeans. Visit the following links for javadoc on the GPE API:

GridBean service. The service used to publish GridBeans on the Internet.

Higher level services. The services built on top of atomic services. They include:

Registry
Workflow (gridlet) Engine
other services like Streaming Service, Visualization Service, etc.

Job Management Resource. The type of resources managed through the job management service - one of the atomic services. This resource allows managing a single job submitted to some target system. The list of operations includes in particular:

start job;
abort job;
destroy job.

Job Submission Description Language. The XML language that defines how to describe the submission of a single job. The description includes:

job specification;
request for resources;
file stage in/stage out specification.

Refer to the Job Submission Description Language (JSDL) Specification for JSDL specification details.

JSDL POSIX Extension. The normative extension of JSDL. Provides a method for specifying an executable invocation in a POSIX environment. Refer to the Job Submission Description Language (JSDL) Specification for details on JSDL and its normative extensions.

Registry. One of the higher level services used to store and provide the information on the available Grid resources such as target systems.

Storage Management Resource. The type of resources managed through the storage management service - one of the atomic services. Such resources are the abstractions of the remote file storages and provide the file operations and operations for establishing file transfers.

Target System. The type of resources managed through the target system service - one of the atomic services. The target system has only one operation: submit a job. The list of available properties includes:

the list of submitted jobs;
system performance characteristics;
available storage resources.

Web Services Resource Framework. The set of standards and guidelines defining an open framework for modeling and accessing stateful resources using Web services. Visit WSRF TC at OASIS homepage for more information.

Workflow job. A grid-service orchestration job. It can be submitted to some workflow target system. The job description consists of the workflow description in the form of GPE workflow (gridlet) JSDL extension.

Workflow Target System. The target system resource that can execute workflow jobs.

WS-NotificationProducer. The port type defined in Web Services Base Notification specification. This port type provides operation to subscribe for notifications from some entity.

WS-ResourceLifetime. The set of additional requirements for the WSRF resource in order to enable its lifetime management. This includes the definition of 2 port types:

ImmediateResourceTermination
ScheduledResourceTermination

Refer to Web Services Resource Lifetime 1.2 (WS-ResourceLifetime) for more information.

WS-ResourceProperties. The set of additional requirements for the WSRF resource in order to enable its properties querying and management. This includes the definition of the following port types:

GetResourcePropertyDocument
GetResourceProperty
GetMultipleResourceProperties
PutResourcePropertyDocument
SetResourceProperties
InsertResourceProperties
UpdateResourceProperties
DeleteResourceProperties
QueryResourceProperties

Not all the listed operations are available through the GPE Clients API. For more information on WS-Resource Properties refer to Web Services Resource Properties 1.2 (WS-ResourceProperties).

WSRF resource. A stateful entity that may be managed through a web-service interface following the guidelines of the WSRF standard. Refer to Web Services Resource 1.2 (WS-Resource) for more information.

XPath. The language used to query data from XML documents. GPE uses XPath version 1.0. Refer to XML Path Language (XPath) for more details.
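
As an illustration of an XPath 1.0 query, the sketch below evaluates an expression against a small XML document using the standard `javax.xml.xpath` API of the JDK. The document structure, the element names and the class `XPathSketch` are invented for this example and do not reflect the actual resource property documents of GPE services.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathSketch {
    // Evaluates an XPath 1.0 expression against an XML string and returns the result as text
    static String evaluate(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Cannot evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical document; not an actual GPE resource property document
        String xml = "<properties>"
                + "<job id=\"1\"><status>RUNNING</status></job>"
                + "<job id=\"2\"><status>DONE</status></job>"
                + "</properties>";
        // Select the status of the job whose id attribute equals 2
        System.out.println(evaluate(xml, "/properties/job[@id='2']/status")); // prints: DONE
    }
}
```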