User interface
To develop an application in the GRID superscalar paradigm, a programmer must go through the following three stages:
- Task definition: identify those subroutines/programs in the application that are going to be executed in the computational Grid.
- Task parameters definition: identify which parameters are input/output files and which are input/output generic scalars. These first two steps are equivalent, at the processor architecture level, to defining the instruction set.
- Write the sequential program (main program and task code).
Stages 1 and 2 (task definition and task parameters definition) are performed by writing an interface definition file (IDL file). Figure 1 shows an example of a task and parameters definition in GRID superscalar.
interface OPT {
Figure 1: Example of application IDL
Each task that the user identifies as a candidate to be run in the GRID appears in the IDL file. The parameter list of each task is also specified, indicating the type of each parameter and whether it is an input (in), output (out) or input/output (inout).
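Declaring a direction for every parameter makes it possible to tell which task produces a file and which task consumes it. The following C++ sketch illustrates that idea under stated assumptions: `TaskCall` and `find_true_dependences` are hypothetical names that are not part of GRID superscalar, and only read-after-write dependences on files are considered. It is not a description of how the run-time is implemented.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record of one task call: the files it reads (in/inout
// parameters) and the files it writes (out/inout parameters).
struct TaskCall {
    std::string name;
    std::set<std::string> reads;
    std::set<std::string> writes;
};

// Returns (producer, consumer) index pairs where a later call reads a file
// that an earlier call wrote. Every earlier writer is reported, not only
// the most recent one.
std::vector<std::pair<std::size_t, std::size_t>>
find_true_dependences(const std::vector<TaskCall>& calls) {
    std::vector<std::pair<std::size_t, std::size_t>> deps;
    for (std::size_t later = 0; later < calls.size(); ++later) {
        for (std::size_t earlier = 0; earlier < later; ++earlier) {
            for (const std::string& file : calls[later].reads) {
                if (calls[earlier].writes.count(file) > 0) {
                    deps.emplace_back(earlier, later);
                    break;  // one edge per pair of calls is enough
                }
            }
        }
    }
    return deps;
}
```

Calls that share no files produce no pairs and could therefore run concurrently on different resources.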
The main program that the user writes for a GRID superscalar application is essentially identical to the one that would be written for a sequential version of the application. The difference is that, at some points in the code, GRID superscalar primitives are called. For example, GS_On and GS_Off are called at the beginning and at the end of the application, respectively. Changes are also needed in those parts of the main program where files are read or written. Since files are the objects that define the data dependences, the run-time needs to be aware of any operation performed on a file. The current version offers four primitives for file handling: GS_Open, GS_Close, GS_FOpen and GS_FClose, which at user level behave like the functions open, close, fopen and fclose. In addition, the GS_Barrier function allows programmers to control the task flow explicitly: it waits until all GRID tasks have finished.
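The call pattern described above can be sketched as follows. So that the example compiles without the GRID superscalar run-time, the primitives are replaced by local stand-ins in a `standin` namespace. Their names come from the text, but their signatures are an assumption (mirroring fopen and fclose) and are not taken from GS_master.h. `run_main_sketch` and `trace` are hypothetical helpers.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Local stand-ins, NOT the real library. Signatures are assumed.
namespace standin {
// Records which primitives were called, in order.
std::vector<std::string> trace;

void GS_On() { trace.push_back("GS_On"); }
void GS_Off() { trace.push_back("GS_Off"); }
void GS_Barrier() { trace.push_back("GS_Barrier"); }
std::FILE* GS_FOpen(const char* path, const char* mode) {
    trace.push_back("GS_FOpen");
    return std::fopen(path, mode);
}
int GS_FClose(std::FILE* f) {
    trace.push_back("GS_FClose");
    return std::fclose(f);
}
}  // namespace standin

// Shape of a main program: start the run-time, call tasks, wait for them,
// read a result file through the GS file primitives, stop the run-time.
bool run_main_sketch(const char* result_path, double* value_out) {
    using namespace standin;
    GS_On();
    // ... calls to the tasks declared in the IDL file would go here ...
    GS_Barrier();  // explicit synchronization, shown for illustration
    bool ok = false;
    if (std::FILE* f = GS_FOpen(result_path, "r")) {
        ok = (std::fscanf(f, "%lf", value_out) == 1);
        GS_FClose(f);
    }
    GS_Off();
    return ok;
}
```

Apart from the GS_ calls, the body reads like ordinary sequential code, which is the point the paragraph above makes.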
An example of an application written in the GRID superscalar paradigm is shown in figure 2. In this example, a set of N parametric simulations is launched with a given simulator (in this case, the performance prediction simulator Dimemas), varying some parameters of the simulation. The range of the parameters is then modified according to the simulation results in order to move towards a goal. The application runs until the goal is reached.
GS_On();
Figure 2: Example of GRID superscalar application
The interface definition file for this example is the one shown in figure 1. Function generate_new_range does not appear in this file because it runs locally on the client. For that reason, the GRID superscalar specific file functions are used when it opens the file final_result.txt. Additionally, the user provides the code of the functions that have been selected to run on the GRID (filter, dimemas_funct, extract). The code of those functions does not differ from the code they would have in a sequential application. The only current requirement is that they be provided in a file separate from the main program.
Versions are also available that allow applications to be written in Perl, Java and shell script.
Recently, the user interface of the Globus versions has been extended to allow users to express resource requirements and performance models for the tasks.
Expressing constraints and cost
A Grid is typically composed of a large set of different computing resources, from clusters to single personal computers, which differ in software installation, architecture, operating system, network speed, etc. There is therefore a need to express which elements make up this heterogeneity. If there is a means of describing the features of each resource, the user can ask for a specific one.
To implement this feature, GRID superscalar takes advantage of the ClassAds Library, developed by the Computer Science department at the University of Wisconsin.
Also, allowing the user to express performance models of the tasks will help the GRID superscalar scheduler.
To let the user express costs and constraints, the gsstubgen tool has been extended to generate three extra files: {app_name}_constraints.h, {app_name}_constraints.cc and {app_name}_constraints_wrapper.cc. The last file contains wrappers that call the functions defined in {app_name}_constraints.cc and should not be modified by the user. File {app_name}_constraints.cc contains default functions for specifying the tasks' constraints and performance models. The programmer can edit these functions to tailor the application to the Grid environment.
Figure 3 shows an example of a user constraint function. In this case, the user requires task dimemas_funct to be executed on a resource that has the Dimemas simulator in its list of software.
string dimemas_funct_constraints(file cfgFile, file traceFile)
Figure 3: Sample constraint function
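Since the constraint function returns a string, one way to picture its result is as a ClassAd expression that is matched against each resource description. The sketch below builds such an expression with the ClassAd `member` function. The helper name `software_constraint` and the attribute name `SoftNameList` are assumptions made for illustration and are not confirmed by the text above.

```cpp
#include <string>

// Builds a ClassAd-style requirement stating that a resource must list
// `software` among its installed packages.
// ASSUMPTION: the resource advertises its software in an attribute called
// SoftNameList; the real attribute name may differ.
std::string software_constraint(const std::string& software) {
    return "member(\"" + software + "\", other.SoftNameList)";
}
```

For example, `software_constraint("Dimemas")` yields the expression `member("Dimemas", other.SoftNameList)`.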
Similarly, figure 4 shows an example of a performance model for task dimemas_funct. In this case, the function indicates that the time required to execute the task depends on the size of the input file traceFile and depends inversely on the performance of the selected machine. GS_Filesize and GS_Gflops are built-in functions of the GRID superscalar runtime library.
double dimemas_funct_cost(file cfgFile, file traceFile)
Figure 4: Sample performance model function
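The relationship described above (cost grows with the input size and shrinks with machine performance) can be written as a small standalone function. This is a sketch under assumptions: `estimate_cost` and the calibration factor `work_per_byte` are hypothetical, and the two inputs stand for the values that GS_Filesize and GS_Gflops would supply inside a real cost function.

```cpp
#include <cassert>

// Estimated execution time in seconds.
//   trace_bytes    : size of the input trace file, in bytes
//   machine_gflops : performance of the candidate machine, in Gflop/s
//   work_per_byte  : ASSUMED calibration factor, in Gflop per byte
double estimate_cost(double trace_bytes, double machine_gflops,
                     double work_per_byte) {
    assert(machine_gflops > 0.0);  // a machine with no capacity has no estimate
    return trace_bytes * work_per_byte / machine_gflops;
}
```

With the same input file, a machine rated at twice the Gflop/s gets half the estimated time, which gives a scheduler a basis for comparing candidate resources.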
Exception handling mechanism
The Globus versions of GRID superscalar provide a mechanism that allows speculative execution of tasks. The syntax of this mechanism is similar to the exception handling mechanism available in C++ or Java, although its behavior is not exactly the same.
The mechanism is composed of two primitives:
| GS_Speculative_End(func) | Indicates the point where the speculative area finishes. The function func will be executed if GS_Throw is called. This function should be a void function without parameters. |
| GS_Throw | Throws an exception. Can be called from any task listed in the IDL file. |
Figure 5 shows an example of a main program where this mechanism is used. The application is programmed as a loop in which MAXITER iterations are started. GRID superscalar starts executing the tasks called in the loop, taking into account the available resources, the constraints and performance models given by the user, and the dependencies between the tasks. If at any moment the GS_Throw primitive is called from any task (filter, dimemas_funct or extract), all tasks that were generated after the task that raises the exception are undone. This means that a pending task will not be executed, a running task will be canceled, and the results of a finished task will be discarded. In addition, function myfunc is executed, which allows some form of exception treatment.
#include "GS_master.h"
Figure 5: Use of exception mechanism
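The undo rule described above (pending tasks are skipped, running tasks are canceled, finished tasks have their results discarded) can be modeled in a few lines. The types and the function `undo_plan` below are hypothetical and only simulate the rule; they do not show how the run-time implements it.

```cpp
#include <cstddef>
#include <vector>

enum class TaskState { Pending, Running, Finished };
enum class UndoAction { None, Skip, Cancel, DiscardResults };

// `states` lists the tasks in the order they were generated. `thrower` is
// the index of the task that raised the exception. Only tasks generated
// after the thrower are affected; the thrower and earlier tasks are kept.
std::vector<UndoAction> undo_plan(const std::vector<TaskState>& states,
                                  std::size_t thrower) {
    std::vector<UndoAction> plan(states.size(), UndoAction::None);
    for (std::size_t i = thrower + 1; i < states.size(); ++i) {
        switch (states[i]) {
            case TaskState::Pending:
                plan[i] = UndoAction::Skip;            // never executed
                break;
            case TaskState::Running:
                plan[i] = UndoAction::Cancel;          // stopped mid-run
                break;
            case TaskState::Finished:
                plan[i] = UndoAction::DiscardResults;  // output thrown away
                break;
        }
    }
    return plan;
}
```

A task generated before the thrower is left untouched even if it is still running, because the speculation only covers work generated after the point where the exception was raised.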
Figure 6 shows an example of the use of the GS_Throw primitive. In this case, GS_Throw is called when the result of the dimemas_funct task reaches a given threshold.
void Dimemas(char * cfgFile, char * traceFile, double goal, char * DimemasOUT)
Figure 6: Use of exception mechanism




