Alexander Lukichev, Intel Corporation
Version 0.3 November 30, 2007
Version history
Version and date | By whom | Changes |
---|---|---|
0.0 December 19, 2005 | Alexander Lukichev | Document created |
0.1 October 26, 2006 | Alexander Lukichev | Clients API updated |
0.2 November 9, 2007 | Alexander Lukichev | Document revised |
0.3 November 30, 2007 | Alexander Lukichev | Workflow (gridlet) support described |
0.3.1 January 23, 2008 | Alexander Lukichev | Fixes of typos |
GridBean is one of the main concepts of GPE.
A GridBean is an object responsible for:
GridBeans are divided into several modules:
GridBeans are developed using the GridBean SDK and deployed at GridBean Services. GPE clients contact GridBean services for available GridBeans and download the selected ones. Consider the typical use-cases of the GridBeans.
In this case the GridBean provides a GUI for a "one-step (computational) job" generator. The user fills in the fields on the input panels of the GridBean. During the submit (Application Client) or workflow-generation (Expert Client) stage the GridBean is requested to set up the job definition in the JSDL POSIX-Extension format.
Such a job description consists of:
Such job descriptions can be submitted to atomic target systems. They are represented by objects implementing the GPEJob interface. The job execution consists of:
Different types of GridBeans for creating specific workflows. Typically an interface to some workflow service. Detailed description to be developed...
The GPE Client API is a set of Java interfaces and abstract classes providing access to atomic and higher level services. The API classes include:
The clients are:
FileTransferClient | The client for file transfer services |
GridBeanClient | The client for the GridBean service |
RegistryClient | The client for the Target System Registry |
TargetSystemClient | The client for the Target System Resource |
JobClient | The client for the Job Management Resource |
StorageClient | The client for the Storage Management Resource |
WSRFClient | The client for the WSRF-enabled Resource |
WSLTClient | The client for the WSRF Resource with WS-Resource Lifetime support |
WSRPClient | The client for the WSRF Resource with WS-Resource Properties support |
WSNClient | The client for the WSRF Resource implementing WS-NotificationProducer porttype |
Generally you cannot instantiate these clients directly through their constructors. They form a tree with the RegistryClient as its root, where child clients are the results of method invocations on parent clients. For example, you obtain a JobClient as the result of the TargetSystemClient.submit operation.
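The parent-to-child navigation can be sketched with simplified stand-in types. The interfaces below are illustrative only and are not the GPE API: `getTargetSystems` and `getStatus` are hypothetical method names, the `String` job description is a placeholder, and only the idea that `submit` on a target system client yields a job client is taken from the text above.

```java
import java.util.List;

class ClientTreeSketch {
    // Stand-in types, NOT the GPE interfaces; names other than submit are hypothetical.
    interface JobClient { String getStatus(); }
    interface TargetSystemClient { JobClient submit(String jobDescription); }
    interface RegistryClient { List<TargetSystemClient> getTargetSystems(); }

    /** Walks the tree: registry (root) -> target system -> job. */
    static JobClient submitToFirst(RegistryClient registry, String jobDescription) {
        List<TargetSystemClient> systems = registry.getTargetSystems();
        if (systems.isEmpty()) {
            throw new IllegalStateException("registry lists no target systems");
        }
        // The child client is returned by the parent; no constructor is called here.
        return systems.get(0).submit(jobDescription);
    }

    /** In-memory fake root so the sketch runs without any Grid services. */
    static RegistryClient fakeRegistry() {
        TargetSystemClient targetSystem = description -> () -> "SUBMITTED: " + description;
        return () -> List.of(targetSystem);
    }

    public static void main(String[] args) {
        JobClient job = submitToFirst(fakeRegistry(), "demo job");
        System.out.println(job.getStatus());
    }
}
```

Calling code never constructs a child client itself; it only keeps a reference to the root and follows the methods that hand out children.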
One of the main data structures used in GridBeans is the abstraction of the job: the interface Job. This interface is too general to be used in real job descriptions, but it is the root of the GPE job description hierarchy.
Each target system may accept several kinds of job definitions. The method TargetSystemClient.getJobType(String) may be used to get the descriptor of a job definition kind based on the target system properties and the provided kind name. Sample names of such job definition kinds are listed in JobType.
Once the job kind is selected, one may obtain an empty job definition template of this kind from the corresponding TargetSystemClient instance. Use the method TargetSystemClient.newJob(JobType) for that. Then cast the obtained Job instance to the requested less generic job abstraction (JSDLJob, GPEJob, etc.):
TargetSystemClient targetSystem;
...
JobType jobType = targetSystem.getJobType(JobType.JobDefinitions.GPEJSDL);
GPEJob gpeJob = (GPEJob) targetSystem.newJob(jobType);
The concrete job definition template is then filled in with the job parameters. See sections below for details.
The JSDLJob is a wrapper of a generic JSDL document. As this document is generic, the only things it can specify are:
The possible applications of such description are:
The job description may contain instructions to transfer files:
A typical application of stage-in, for instance, is to transfer the input files from some Grid location to the job's working directory.
To add a stage-in element to the job description, use JSDLJob.addDataStagingImportElement. For stage-out, use JSDLJob.addDataStagingExportElement. For example:
JSDLJob job;
...
job.addDataStagingImportElement("http://www.intel.com", "Work", "intel.txt");
The first parameter is the URI of the file to get the data from. In this case it is a simple web page accessible through HTTP, but it could be a terabyte Grid file that should be transferred over GridFTP. The second parameter is the file system at the target system where the imported file is stored. It may be a static storage like Root, Temp, etc., or the working directory ("Work") as in this case. The third parameter is the name of the file to create.
The job description may also provide requirements to be used by a brokering service to select the target system to execute the job. For example you may limit the acceptable target systems to Suse9 Linux machines with not less than 2GB of memory:
JSDLJob job;
...
job.setOperatingSystemRequirements(new OperatingSystemRequirementsType(OSType.linux, "Suse9"));
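// Lower bound of 2 GB in bytes; written as a double literal because the value exceeds the int range
job.setIndividualPhysicalMemoryRequirements(new RangeValueType(2147483648.0, 2147483648.0, Double.POSITIVE_INFINITY));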
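The memory bound in the example above is 2 GB expressed in bytes. The following check of the arithmetic is a small standalone sketch (the class and method names are made up for illustration); it also shows why the number cannot be written as a plain Java `int` literal, since it is one larger than `Integer.MAX_VALUE`.

```java
class MemoryBoundSketch {
    /** Converts a size in GB (counted as 1024^3 bytes) to bytes. */
    static long gigabytesToBytes(long gigabytes) {
        return gigabytes * 1024L * 1024L * 1024L;
    }

    public static void main(String[] args) {
        long lowerBound = gigabytesToBytes(2);
        System.out.println(lowerBound);                      // 2147483648
        System.out.println(lowerBound > Integer.MAX_VALUE);  // true
    }
}
```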
The GPEJob is a wrapper of a GPE atomic job description. It is an extension of JSDLJob.
The following job attributes may be specified:
The application name and version uniquely select the application (or script) to be run on the target system. Named application parameters are substituted into the selected application. The sample mechanism is described in OS Profile for GPE.
A workflow job is a sequence of instructions for a workflow resource. It is submitted to the resource and then executed there. The client may disconnect right after the submission. The workflow interpreter itself acts as a client of the atomic services.
In GPE the gridlet approach is used for workflow specification. In this approach a workflow is a Java class whose code is transmitted to the workflow service and executed there. Since the workflow specification class (gridlet) is not known to the executing resource (service), its code has to be sent over the network. The transmitted code is executed in a restricted JVM to prevent undesired side effects of users' workflows.
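The idea of shipping a class to a service that has never seen it can be illustrated with plain JDK facilities. The sketch below is not how GPE packs or executes gridlets; it only assumes that the sender reads the compiled bytecode as bytes and that the receiver defines the class from those bytes in its own class loader. `HelloTask`, `readBytecode` and `ReceivingLoader` are made-up names, and no sandboxing (the "restricted JVM") is attempted here.

```java
import java.io.IOException;
import java.io.InputStream;

class CodeShippingSketch {
    /** Stand-in for a user's workflow class; it is not the GPE gridlet interface. */
    public static class HelloTask implements Runnable {
        @Override
        public void run() {
            System.out.println("Hello, world!");
        }
    }

    /** Sender side: reads the compiled bytecode of a class as a byte array. */
    static byte[] readBytecode(Class<?> type) throws IOException {
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("bytecode not found for " + type.getName());
            }
            return in.readAllBytes();
        }
    }

    /** Receiver side: a class loader that defines classes from received bytes. */
    static class ReceivingLoader extends ClassLoader {
        Class<?> define(String name, byte[] bytecode) {
            return defineClass(name, bytecode, 0, bytecode.length);
        }
    }

    /** Ships HelloTask as bytes, defines it in a fresh loader, runs it, and
     *  reports whether the received class is distinct from the local one. */
    static boolean shipAndRun() {
        try {
            byte[] bytecode = readBytecode(HelloTask.class); // would travel over the network
            Class<?> received = new ReceivingLoader().define(HelloTask.class.getName(), bytecode);
            Runnable task = (Runnable) received.getDeclaredConstructor().newInstance();
            task.run();
            return received != HelloTask.class;
        } catch (IOException | ReflectiveOperationException e) {
            throw new IllegalStateException("shipping failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(shipAndRun());
    }
}
```

Because the receiver defines the class in its own loader, it can run code it was not compiled against. Restricting what that code may do is a separate concern that this sketch leaves out.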
The workflow specification must be a class implementing the com.intel.gpe.clients.api.workflow.gridlet.GridletJob interface.
class SomeGridlet implements Gridlet {
    public void run() throws Throwable {
        System.out.println("Hello, world!");
    }
}
The gridlet code is then packed into the JSDL job description:
// obtain JSDL template
JobType jobType = gridletTSS.getJobType(JobType.JobDefinitions.GPEJSDL);
GridletJob job = (GridletJob) gridletTSS.newJob(jobType);
// create the Gridlet description element
GridletElement gridletElt = new GridletElement();
// add the class code
GridletUtil.addClasses(gridletElt, SomeGridlet.class.getName());
// set the name of the main class
gridletElt.setMainClass(SomeGridlet.class.getName());
// add some variables
GridletUtil.addVariable(gridletElt, "SomeVariable", "The value");
// obtain org.w3c.dom.Element
Element glElem = GridletUtil.getGridletElement(gridletElt);
// set the JSDL template contents
job.setGridlet(glElem);
To access the passed variables or obtain the clients to remote services, the workflow specification code may use the com.intel.gpe.clients.api.gridlet.Engine object:
// obtain the current engine reference
Engine engine = Engine.getEngine();
// get the variable value
Object[] value = engine.getVariable("SomeVariable");
// get the GPE object factory
GridProgrammingEnvironmentFactory gpef = GridProgrammingEnvironmentFactory.getInstance(engine);
// get the string value
String stringValue = gpef.getVariable("SomeVariable");
// get the broker client reference
TargetSystemClient broker = gpef.getBroker("Broker");
The workflow can also set variables (they are accessed as resource properties of the corresponding job resource):
...
engine.setVariable("Result1", new Object[] { "Result" });
...
JobClient jobClient = ... ;
gpef.setVariable("JobClient", jobClient);
...
These values are accessed as resource properties of the corresponding job resource:
// submit the workflow job
GridletJobClient gridletJobClient = gridletTSS.submit(...);
...
// walk through variables
for (Variable variable : gridletJobClient.getVariables()) {
Object[] value = variable.getValue();
}
...
// get the WSRF client defined by some variable
WSRFClient wsrfClient = gridletJobClient.getEndpointReference("JobClient");
...
A typical GridBean consists of 2 modules:
The job description generation module (GridBean model) inherits 2 interfaces, IGridBeanModel and IGridBean, and thus provides the functions they declare.
There may be several user interface modules. The module for GPE standalone clients provides the plugin for the Application and Expert Clients. It inherits IGridBeanPlugin and provides the input and output panels of the GridBean (IGridBeanPanel).
A user enters data into the GridBean input panels. During the job submission stage these data are transformed and stored in the GridBean model fields. Each such field has a name, represented as a qname, and a value.
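A model whose fields are keyed by qualified names can be pictured as a map from `javax.xml.namespace.QName` to a value. The sketch below is a minimal stand-in under that assumption, not the GridBean model class itself; the namespace URI, the field names and the `Model` class are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.namespace.QName;

class ModelFieldsSketch {
    // Hypothetical namespace and field names, chosen only for this illustration.
    static final String NAMESPACE = "http://example.org/gridbean/demo";
    static final QName JOBNAME = new QName(NAMESPACE, "JOBNAME");
    static final QName WIDTH = new QName(NAMESPACE, "WIDTH");

    /** Minimal stand-in for a GridBean model: fields keyed by qualified name. */
    static class Model {
        private final Map<QName, Object> fields = new LinkedHashMap<>();

        void set(QName name, Object value) {
            fields.put(name, value);
        }

        Object get(QName name) {
            return fields.get(name);
        }
    }

    /** Fills a model the way an input panel might after the user submits. */
    static Model filledModel() {
        Model model = new Model();
        model.set(JOBNAME, "render-1");
        model.set(WIDTH, 640);
        return model;
    }

    public static void main(String[] args) {
        Model model = filledModel();
        System.out.println(JOBNAME + " = " + model.get(JOBNAME));
        System.out.println(WIDTH + " = " + model.get(WIDTH));
    }
}
```

Two `QName` instances with the same namespace URI and local part are equal, so a field can be looked up with a freshly constructed name as well as with the shared constant.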
The client application requests the GridBean to generate a job description by calling IGridBean.setupJobDefinition(Job). The passed parameter is a job description template. Typical code for generating an atomic job description is as follows:
public void setupJobDefinition(Job job) throws GridBeanException {
    if (job instanceof GPEJob) {
        GPEJob gpeJob = (GPEJob) job;
        gpeJob.setApplicationName(APPLICATION_NAME);
        gpeJob.setApplicationVersion(APPLICATION_VERSION);
        gpeJob.setWorkingDirectory(GPEConstants.JobManagement.TEMPORARY_DIR_NAME);
        gpeJob.addOption(FORMAT_OPTION, "PNG");
        gpeJob.addField(SOURCE_FIELD, ((AbstractFile) get(SOURCE)).getTargetSystemFile());
        gpeJob.addField(TARGET_FIELD, ((AbstractFile) get(TARGET)).getTargetSystemFile());
        gpeJob.setId((String) get(JOBNAME));
    }
    else {
        throw new GridBeanException("Unsupported job class: " + job.getClass().getName());
    }
}
Some fields may be declared as input or output parameters.
Input parameters are used to declare input files (or arbitrary data) obtained from the local machine or from other jobs.
Output parameters are used to declare output files (or arbitrary data) that may be transferred to the local machine or to other jobs.
Input and output parameters belong to one of the following types (see GridBeanParameterType):
Table 1. Input parameters
Parameter type | Description | Atomic job |
---|---|---|
GPE File | A file at a remote location | An object of class InputFileParameterValue representing a remote file. |
URL | A file at a remote location represented by its URL | An object of class InputFileParameterValue representing a remote file. |
File Set | A set of files at a remote location | An object of class InputFileSetParameterValue representing a fileset. |
Table 2. Output parameters
Parameter type | Description | Atomic job |
---|---|---|
GPE File | A file at a remote location | An object of class OutputFileParameterValue representing a remote file. The file is accessed via the getGpeFile method. Before the GridBean output panel (where the output parameters are usually accessed) is rendered, the files are fetched from the job's outcome directory. |
URL | A file at a remote location represented by its URL | An object of class OutputFileParameterValue representing a remote file. The file is accessed via the getGpeFile method. Before the GridBean output panel (where the output parameters are usually accessed) is rendered, the files are fetched from the job's outcome directory. |
File Set | A set of files at a remote location | An object of class OutputFileSetParameterValue representing a fileset. At GridBean job design time (before submission) the methods of the object may be used to specify which files to include in the fileset (setIncludes, setExcludes, setDirStructure, setCaseSensitive). At rendering time the list of fetched files is accessed via getFiles. |
In order to be used in a GPE standalone client (Application or Expert) the GridBean must provide the corresponding graphical plugin. The plugin must implement the interface IGridBeanPlugin.
Such a plugin provides:
Each input or output panel is an instance of IGridBeanPanel. This interface provides methods for:
Each control within the input or output panel may have a name - qname - and in this case it may also have:
A translator is an object implementing IValueTranslator that translates the original value contained in the graphical component into the GridBean's internal value representation. E.g. the value of a text field component used to specify some numerical data is of type String, but it may be translated and stored in the GridBean model as a double.
A validator is an object implementing IValueValidator that validates the translated value of the graphical component and provides an error message if the value is invalid. E.g. the validator may check that some input value is not empty.
A description is a string describing the corresponding field semantics. It is mostly used to generate diagnostics to the user in case of failed value validation.
The translated values of the input panel components are stored in the GridBean model under their names. The decoded values from the GridBean model may be loaded into the output panel when the job's outcome is displayed.
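The translate, validate and store sequence described above can be sketched as follows. The `Translator` and `Validator` interfaces are simplified stand-ins introduced for this illustration; the method signatures of the real IValueTranslator and IValueValidator may differ, and the model is reduced to a plain map with string keys.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class TranslateValidateSketch {
    // Stand-in interfaces; the real IValueTranslator / IValueValidator signatures may differ.
    interface Translator<T> { T translate(String rawText); }
    interface Validator<T> { Optional<String> validate(T value, String description); }

    /** Turns the text of a field into the model's internal representation. */
    static final Translator<Double> TO_DOUBLE = raw -> Double.valueOf(raw.trim());

    /** Returns an error message built from the field description, or empty if valid. */
    static final Validator<Double> POSITIVE = (value, description) ->
            value > 0 ? Optional.empty() : Optional.of(description + " must be positive");

    /** Translates the raw text, validates it, and stores it in the model on success. */
    static Optional<String> store(Map<String, Object> model, String name,
                                  String description, String rawText) {
        Double value;
        try {
            value = TO_DOUBLE.translate(rawText);
        } catch (NumberFormatException e) {
            return Optional.of(description + " is not a number: " + rawText);
        }
        Optional<String> error = POSITIVE.validate(value, description);
        if (!error.isPresent()) {
            model.put(name, value);
        }
        return error;
    }

    public static void main(String[] args) {
        Map<String, Object> model = new HashMap<>();
        System.out.println(store(model, "WIDTH", "Width", "640"));
        System.out.println(store(model, "HEIGHT", "Height", "-5"));
        System.out.println(model);
    }
}
```

Only values that pass validation reach the model, and the field description is what makes the diagnostic readable to the user.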
Typical code for creating input panel components looks as follows:
private void buildComponents() throws DataSetException {
    setLayout(new GridBagLayout());
    JTextField nameTextField = new JTextField();
    add(new JLabel("Name:"), LayoutTools.makegbc(0, 0, 1, 0, false));
    GridBagConstraints c = LayoutTools.makegbc(1, 0, 1, 0, true);
    add(nameTextField, c);
    linkJobNameTextField(POVRayGridBean.JOBNAME, nameTextField);
    JTextField widthField = new JTextField();
    add(new JLabel("Width:"), LayoutTools.makegbc(1, 1, 1, 0, false));
    add(widthField, LayoutTools.makegbc(2, 1, 1, 0, true));
    linkTextField(POVRayGridBean.WIDTH, widthField);
    setValueTranslator(POVRayGridBean.WIDTH, StringValueTranslator.getInstance());
    setValueValidator(POVRayGridBean.WIDTH, IntegerValueValidator.getInstance());
    JTextField heightField = new JTextField();
    add(new JLabel("Height:"), LayoutTools.makegbc(3, 1, 1, 0, false));
    add(heightField, LayoutTools.makegbc(4, 1, 1, 0, true));
    linkTextField(POVRayGridBean.HEIGHT, heightField);
    setValueTranslator(POVRayGridBean.HEIGHT, StringValueTranslator.getInstance());
    setValueValidator(POVRayGridBean.HEIGHT, IntegerValueValidator.getInstance());
}
Atomic job. A one-step computational job. It can be submitted to some atomic target system and executed there. The expected outcome is a set of output files including standard output and standard error. The job is specified in the form of the GPE Atomic JSDL extension.
Atomic services. The services implementing the corresponding port types defined by the Unigrids project. These port types include:
The WSRF resources behind these services perform atomic operations like job submission, file transfer, other file operations.
Atomic Target System. The target system resource that can execute atomic jobs.
Endpoint Reference. The essential element of Web Services Addressing specification. The endpoint reference is an XML structure providing the information for unique location of some network entity (web-service, resource, etc.).
GPE Atomic JSDL extension. The JSDL extension used to specify atomic jobs. It is based syntactically on the JSDL POSIX extension but uses the environment section of the Application element to pass the named parameters of the abstract application instead of actual environment variable values.
GPE Clients API. The set of client classes for working with atomic and higher level services. See also the section GPE Clients API.
GridBean SDK. A set of libraries and tools provided with GPE to develop GridBeans. Visit the following links for javadoc on the GPE API:
GridBean service. The service used to publish GridBeans on the Internet.
Higher level services. The services built on top of atomic services. They include:
Job Management Resource. The type of resources managed through the job management service, one of the atomic services. This resource allows the management of a single job submitted to some target system. The list of operations includes, in particular:
Job Submission Description Language. The XML language that defines how to describe the submission of a single job. The description includes:
Refer to the Job Submission Description Language (JSDL) Specification for JSDL specification details.
JSDL POSIX Extension. The normative extension of JSDL. Provides a method for specifying an executable invocation in a POSIX environment. Refer to the Job Submission Description Language (JSDL) Specification for details on JSDL and its normative extensions.
Registry. One of the higher level services used to store and provide the information on the available Grid resources such as target systems.
Storage Management Resource. The type of resources managed through the storage management service - one of the atomic services. Such resources are the abstractions of the remote file storages and provide the file operations and operations for establishing file transfers.
Target System. The type of resources managed through the target system service - one of the atomic services. The target system has only one operation: submit a job. The list of available properties includes:
Web Services Resource Framework. The set of standards and guidelines defining an open framework for modeling and accessing stateful resources using Web services. Visit WSRF TC at OASIS homepage for more information.
Workflow job. A grid-service orchestration job. It can be submitted to some workflow target system. The job description consists of the workflow description in the form of GPE workflow (gridlet) JSDL extension.
Workflow Target System. The target system resource that can execute workflow jobs.
WS-NotificationProducer. The port type defined in the Web Services Base Notification specification. This port type provides an operation to subscribe to notifications from some entity.
WS-ResourceLifetime. The set of additional requirements for the WSRF resource in order to enable its lifetime management. This includes the definition of 2 port types:
Refer to Web Services Resource Lifetime 1.2 (WS-ResourceLifetime) for more information.
WS-ResourceProperties. The set of additional requirements for the WSRF resource in order to enable its properties querying and management. This includes the definition of the following port types:
Not all the listed operations are available through the GPE Clients API. For more information on WS-Resource Properties refer to Web Services Resource Properties 1.2 (WS-ResourceProperties).
WSRF resource. A stateful entity that may be managed through a web-service interface following the guidelines of the WSRF standard. Refer to Web Services Resource 1.2 (WS-Resource) for more information.
XPath. The language for querying data in XML documents. GPE uses XPath version 1.0. Refer to XML Path Language (XPath) for more details.
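As a brief illustration of an XPath 1.0 query, the sketch below uses the standard `javax.xml.xpath` API of the JDK rather than any GPE class. The XML document and its element names are made up for the example and do not follow a GPE schema.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathSketch {
    // Made-up document; the element names are illustrative, not a GPE schema.
    static final String XML =
            "<job><name>render-1</name><status>RUNNING</status></job>";

    /** Evaluates an XPath expression against the sample document and returns its string value. */
    static String query(String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("invalid XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(query("/job/name"));    // render-1
        System.out.println(query("/job/status"));  // RUNNING
    }
}
```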