HPJava Seminar Report
ABSTRACT
The idea that Java may enable new programming environments, combining attractive user interfaces with high performance computation, is gaining increasing attention amongst computational scientists. Java boasts a direct simplicity reminiscent of Fortran, but also incorporates many of the important ideas of modern object-oriented programming. Of course it comes with an established track-record in the domains of Web and Internet programming.
The language outlined here provides HPF-like distributed arrays as language primitives, and new distributed control constructs to facilitate access to the local elements of these arrays. In the SPMD mold, the model allows processors the freedom to independently execute complex procedures on local elements: it is not limited by SIMD-style array syntax. All access to non-local array elements must go through library functions--typically collective communication operations. This puts an extra onus on the programmer; but making communication explicit encourages the programmer to write algorithms that exploit locality, and simplifies the task of the compiler writer. On the other hand, by providing distributed arrays as language primitives we are able to simplify error-prone tasks such as converting between local and global array subscripts and determining which processor holds a particular element. As in HPF, it is possible to write programs at a natural level of abstraction where the meaning is insensitive to the detailed mapping of elements. Lower-level styles of programming are also possible.

1. Introduction
The explosion of Java over the last year has been driven largely by its role in bringing a new generation of interactive pages to the World Wide Web. Undoubtedly various features of the language--compactness, bytecode portability, security, and so on--make it particularly attractive as an implementation language for applets embedded in web pages. But it is clear that the ambitions of the Java development team go well beyond enhancing the functionality of HTML documents.
Java is designed to meet the challenges of application development in the context of heterogeneous, network-wide distributed environments. Paramount among these challenges is the secure delivery of applications that consume a minimum of system resources, can run on any hardware and software platform, and can be extended dynamically.
Several of these concerns are mirrored in developments in the High Performance Computing world over a number of years. A decade ago the focus of interest in the parallel computing community was on parallel hardware. A parallel computer was typically built from specialized processors connected through a proprietary high-performance communication switch. If the machine also had to be programmed in a proprietary language, that was an acceptable price for the benefits of using a supercomputer. This attitude was not sustainable as one parallel architecture gave way to another and the cost of porting software became exorbitant. For several years now, portability across platforms has been a central concern in parallel computing.
HPJava is a programming language extended from Java to support parallel programming, especially (but not exclusively) data parallel programming on message passing and distributed memory systems, from multi-processor systems to workstation clusters.
Although it has a close relationship with HPF, the design of HPJava does not inherit the HPF programming model. Instead the language introduces a high-level structured SPMD programming style--the HPspmd model. A program written in this kind of language explicitly coordinates well-defined process groups. These cooperate in a loosely synchronous manner, sharing logical threads of control. As in a conventional distributed-memory SPMD program, only a process owning a data item such as an array element is allowed to access the item directly. The language provides special constructs that allow programmers to meet this constraint conveniently.
Besides the normal variables of the sequential base language, the language model introduces classes of global variables that are stored collectively across process groups. Primarily, these are distributed arrays. They provide a global name space in the form of globally subscripted arrays, with assorted distribution patterns. This helps to relieve programmers of error-prone activities such as the local-to-global, global-to-local subscript translations which occur in data parallel applications.
In addition to special data types the language provides special constructs to facilitate both data parallel and task parallel programming. Through these constructs, different processors can either work simultaneously on globally addressed data, or independently execute complex procedures on locally held data. The conversion between these phases is seamless.
In the traditional SPMD mold, the language itself does not provide implicit data movement semantics. This greatly simplifies the task of the compiler, and should encourage programmers to use algorithms that exploit locality. Data on remote processors is accessed exclusively through explicit library calls. In particular, the initial HPJava implementation relies on a library of collective communication routines originally developed as part of an HPF runtime library. Other distributed-array-oriented communication libraries may be bound to the language later. Due to the explicit SPMD programming model, low level MPI communication is always available as a fall-back. The language itself only provides basic concepts to organize data arrays and process groups. Different communication patterns are implemented as library functions. This allows the possibility that if a new communication pattern is needed, it is relatively easily integrated through new libraries.
2. Overview of HPJava

HPJava stands for High Performance Java. Java already provides parallelism through threads, but that model of parallelism can only be easily exploited on shared memory computers. HPJava is targeted at distributed memory parallel computers (most likely, networks of PCs and workstations).
2.1. HPJava History
Active in the early 1990s, the High Performance Fortran Forum brought together leading high-performance computing practitioners to define a common language for data parallel computing. Inspired by the success of parallel dialects of Fortran such as Connection Machine Fortran, the resulting High Performance Fortran (HPF) language definition extended the then-standard Fortran 90 with several parallel features.

The HPJava project started around 1997, growing out of the group's earlier HPF work. People were starting to talk about Java Grande and the possibility that eventually Java could be a high-performance language. The reasoning was that Java's syntax was considerably simpler, and perhaps there was scope to extend it with the features wanted for data-parallel computing.
2.2. HPJava Philosophy

HPJava lifts some ideas directly from HPF, and its domain of applicability is likely to overlap HPF's to a large extent. It has an almost equivalent model of distributed arrays (the only very significant difference in this respect is that HPJava eventually abandoned the block-cyclic distribution format); you can write HPJava programs that look very much like corresponding HPF programs. But the philosophies of the languages differ in some significant ways. HPJava was designed in a bottom-up manner.
3. Characteristics
We have explored the practicality of doing parallel computing in Java, and of providing Java interfaces to High Performance Computing software. For various reasons, the success of this exercise was not a foregone conclusion. Java sits on a virtual machine model that is significantly different to the hardware-oriented model which C or Fortran exploit directly. Java discourages or prevents direct access to some of the fundamental resources of the underlying hardware (most extremely, its memory).
Which is the better strategy? In the long term Java may become a major implementation language for large software packages like MPI. It certainly has advantages in respect of portability that could simplify implementations dramatically. In the immediate term, recoding these packages does not appear so attractive; Java wrappers to existing software look more sensible. On a cautionary note, our experience with MPI suggests that interfacing Java to non-trivial communication packages may be less easy than it sounds. Nevertheless, we intend in the future to create a Java interface to an existing run-time library for data parallel computation.
So is Java, as it stands, a good language for High Performance Computing?
It still has to be demonstrated that Java can be compiled to code of efficiency comparable with C or Fortran. Many avenues are being followed simultaneously towards a higher performance Java. Besides the Java chip effort of Sun, it has been reported at this workshop that IBM is developing an optimizing Java compiler which produces binary code directly, that Rice University and Rochester University are working on optimization and restructuring of bytecode generated by javac, and that Indiana University is working on source restructuring to parallelize Java. Parallel interpretation of bytecode is also an emerging practice. For example, the IBM JVM, an implementation of JVM on shared memory architectures, was released in spring 1996, and UIUC has recently started work aimed at parallel interpretation of Java bytecode for distributed memory systems.
Another promising approach under investigation is to integrate interpretation and compilation techniques for parallel execution of Java programs. In such a system, a partially ordered set of interpretive frames is generated by an II/CVM compiler. A frame is a description of some subtask, whose granularity may range from a single scalar assignment statement to a solver for a system of equations. Under supervision of the virtual machine (II/CVM), the actions specified in a frame may be performed in one of three ways:
• Executed by an interpretive module directly, which also incorporates JIT compilation capability.
• Some precompiled computational library function is invoked locally to accomplish the task; this function may be executed sequentially or in parallel.
• The frame is sent to some registered remote system, which will get the work done, once again either sequentially or in parallel.
With this approach, optimized binary codes for well formed computation subtasks exist in runtime libraries, supporting a high level interpretive environment. Task parallelism is observed among different frames executed by the three mechanisms simultaneously, while data parallelism is observed in the execution of some of the runtime functions.

Presuming these efforts satisfactorily address the performance issue, the second aspect in question concerns the expressiveness of the Java language. Our final interface to MPI is quite elegant, and provides much of the functionality of the standard C and Fortran bindings. But creating this interface was a more difficult process than one might hope, both in terms of getting a good specification and in terms of making the implementation work. The lack of features like C++ templates (or any form of parametric polymorphism) and user-defined operator overloading (available in many modern languages, from functional programming languages to Fortran) made it difficult to produce a completely satisfying interface to a data parallel library. The Java language as currently defined imposes various limits on the creativity of the programmer.
In many respects Java is undoubtedly a better language than Fortran. It is object-oriented to the core and highly dynamic, and there is every reason to suppose that such features will be as valuable in scientific computing as in any other programming discipline. But to displace established scientific programming languages Java will surely have to acquire some of the facilities taken for granted in those languages.
Popular acclaim aside, there are some reasons to think that Java may be a good language for scientific and parallel programming.
Java is a descendant of C++. C and C++ are used increasingly in scientific programming; they are already used almost universally by implementers of parallel libraries and compilers. In recent years numerous variations on the theme of C++ for parallel computing have appeared.
Java omits various features of C and C++ that are considered difficult--notably, pointers. Poor compiler analysis has often been blamed on these features. The inference is that Java, like Fortran, may be a suitable source language for highly optimizing compilers (although direct evidence for this belief is still lacking).
Java comes with built-in multithreading. Independent threads may be scheduled on different processors by a suitable runtime. In any case multithreading can be very convenient in explicit message-passing styles of parallel programming.
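As a minimal illustration of this built-in thread support (ordinary Java; the class and printed messages here are only for demonstration):
public class TwoTasks {
    public static void main(String [] args) throws InterruptedException {
        // a second thread that a suitable runtime could schedule on
        // another processor
        Thread worker = new Thread(new Runnable() {
            public void run() { System.out.println("worker task") ; }
        }) ;
        worker.start() ;
        System.out.println("main task") ;   // runs concurrently with the worker
        worker.join() ;                     // wait for the worker to finish
    }
}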
HPJava is a language for parallel programming, especially suitable for programming massively parallel, distributed memory computers.
3.1. Multidimensional arrays
First we describe a modest extension to Java that adds a class of true multi-dimensional arrays to the standard Java language. The new arrays allow regular section subscripting, similar to Fortran 90 arrays. The syntax described in this section is a subset of the syntax introduced later for parallel arrays and algorithms: the only motivation for discussing the sequential subset first is to simplify the overall presentation. No attempt is made to integrate the new multidimensional arrays with the standard Java arrays: they are a new kind of entity that coexists in the language with ordinary Java arrays. There are good technical reasons for keeping the two kinds of array separate.

The type-signatures and constructors of the multidimensional array use double brackets to distinguish them from ordinary arrays:
int [[,]] a = new int [[5, 5]] ;
float [[,,]] b = new float [[10, n, 20]] ;
int [[]] c = new int [[100]] ;
a, b and c are respectively 2-, 3- and 1-dimensional arrays. Of course c is very similar in structure to the standard array d, created by
int [] d = new int [100] ;
c and d are not identical, though.
Access to individual elements of a multidimensional array goes through a subscripting operation involving single brackets, for example:
for(int i = 0 ; i < 4 ; i++)
a [i, i + 1] = i + c [i] ;
For reasons that will become clearer in later sections, this style of subscripting is called local subscripting. In the current sequential context, apart from the fact that a single pair of brackets may include several comma-separated subscripts, this kind of subscripting works just like ordinary Java array subscripting. Subscripts always start at zero, in the ordinary Java or C style (there is no Fortran-like lower bound).
In general our language has no idea of Fortran-like array assignments.
int [[,]] e = new int [[n, m]] ;
...
a = e ;
The assignment simply copies a handle to the object referenced by e into a. There is no element-by-element copy involved. Similarly we introduce no idea of elemental arithmetic or elemental function application. If e and a are arrays, the expressions
e + a
Math.cos(e)
are type errors.

Our HPJava does import a Fortran-90-like idea of array regular sections. The syntax for section subscripting is different to the syntax for local subscripting. Double brackets are used. These brackets can include scalar subscripts or subscript triplets.

A section is an object in its own right--its type is that of a suitable multi-dimensional array. It describes some subset of the elements of the parent array. This is slightly different to the situation in Fortran, where sections cannot usually be captured as named entities.
int [[]] e = a [[2, 2 :]] ;
foo(b [[ : , 0, 1 : 10 : 2]]) ;
e becomes an alias for the 3rd row of elements of a. The procedure foo should expect a two-dimensional array as argument. It can read or write to the set of elements of b selected by the section. As in Fortran, upper or lower bounds can be omitted in triplets, defaulting to the actual bound of the parent array, and the stride entry of the triplet is optional. The subscripts of e, like any other array, start at 0, although the first element is identified with a [2, 2].
In our language, unlike Fortran, it is not allowed to use vectors of integers as subscripts. The only sections recognized are regular sections defined through scalar and triplet subscripts.
The language provides a library of functions for manipulating its arrays, closely analogous to the array transformational intrinsic functions of Fortran 90:
int [[,]] f = new int [[5, 5]] ;
HPJlib.shift(f, a, -1, 0, CYCL) ;
float g = HPJlib.sum(b) ;
int [[]] h = new int [[100]] ;
HPJlib.copy(h, c) ;
The shift operation with shift-mode CYCL executes a cyclic shift on the data in its second argument, copying the result to its first argument--an array of the same shape. In the example the shift amount is -1, and the shift is performed in dimension 0 of the array--the first of its two dimensions. The sum operation simply adds all elements of its argument array. The copy operation copies the elements of its second argument to its first--it is something like an array assignment. These functions may have to be overloaded to apply to some finite set of array types, e.g. they may be defined for arrays with elements of any suitable Java primitive type, up to some maximum rank of array. Alternatively the type hierarchy for arrays can be defined in a way that allows these functions to be more polymorphic.
3.2. Process arrays
HPJava adds class libraries and some additional syntax for dealing with distributed arrays. These arrays are viewed as coherent global entities, but their elements are divided across a set of cooperating processes. As a pre-requisite to introducing distributed arrays we discuss the process arrays over which their elements are scattered.
An abstract base class Procs has subclasses Procs1, Procs2, ..., representing one-dimensional process arrays, two-dimensional process arrays, and so on.
Procs2 p = new Procs2(2, 2) ;
Procs1 q = new Procs1(4) ;
These declarations set p to represent a 2 by 2 process array and q to represent a 4-element, one-dimensional process array. In either case the object created describes a group of 4 processes. At the time the Procs constructors are executed the program should be executing on four or more processes. Either constructor selects four processes from this set and identifies them as members of the constructed group.
Procs has a member function called member, returning a boolean value. This is true if the local process is a member of the group, false otherwise.
if(p.member()) {
...
}
The code inside the if is executed only if the local process is a member of p. We will say that inside this construct the active process group is restricted to p.

The multi-dimensional structure of a process array is reflected in its set of process dimensions. An object is associated with each dimension. These objects are accessed through the inquiry member dim:
Dimension x = p.dim(0) ;
Dimension y = p.dim(1) ;
Dimension z = q.dim(0) ;
The object returned by the dim inquiry has class Dimension. The members of this class include the inquiry crd. This returns the coordinate of the local process with respect to the process dimension. The result is only well-defined if the local process is a member of the parent process array. The inner body code in
if(p.member())
if(x.crd() == 0)
if(y.crd() == 0) {
...
}
will only execute on the first process from p, with coordinates (0, 0).
3.3. Distributed arrays
Some or all of the dimensions of a multi-dimensional array can be declared to be distributed ranges. In general a distributed range is represented by an object of class Range. A Range object defines a range of integer subscripts, and defines how they are mapped into a process array dimension. In fact the Dimension class introduced in the previous section is a subclass of Range. In this case the integer range is just the range of coordinate values associated with the dimension. Each value in the range is mapped, of course, to the process (or slice of processes) with that coordinate. This kind of range is also called a primitive range. More complex subclasses of Range implement more elaborate maps from integer ranges to process dimensions. Some of these will be introduced in later sections. For now we concentrate on arrays constructed with Dimension objects as their distributed ranges.

The syntax of section 3.1 is extended in the following way to support distributed arrays:
• A distributed range object may appear in place of an integer extent in the ``constructor'' of the array (the expression following the new keyword).
• If a particular dimension of the array has a distributed range, the corresponding slot in the type signature of the array should include a # symbol.
• In general the constructor of the distributed array must be followed by an on clause, specifying the process group over which the array is distributed. Distributed ranges of the array must be distributed over distinct dimensions of this group.
Assume p, x and y are declared as in the previous section, then
float [[#,#,]] a = new float [[x, y, 100]] on p ;
defines a as a 2 by 2 by 100 array of floating point numbers. Because the first two dimensions of the array are distributed ranges--dimensions of p--a is actually realized as four segments of 100 elements, one in each of the processes of p. The process in p with coordinates i, j holds the section a [[i, j, :]].
The distributed array a is equivalent in terms of storage to four local arrays defined by
float [] b = new float [100] ;
But because a is declared as a collective object we can apply collective operations to it. The HPJlib functions introduced in section 3.1 apply equally well to distributed arrays, but now they imply inter-processor communication.
float [[#,#,]] a = new float [[x, y, 100]] on p,
b = new float [[x, y, 100]] on p ;
HPJlib.shift(a, b, -1, 0, CYCL) ;
The shift operation causes the local values of a to be overwritten with values of b from a processor adjacent in the x dimension.
There is a catch in this. When subscripting the distributed dimensions of an array it is simply disallowed to use subscripts that refer to off-processor elements. While this:
int i = x.crd(), j = y.crd() ;
a [i, j, 20] = a [i, j, 21] ;
is allowed, this:
int i = x.crd(), j = y.crd() ;
a [i, j, 20] = b [(i + 1) % 2, j, 20] ;
is forbidden. The second example could apparently be implemented using a nearest-neighbour communication, quite similar to the shift example above. But our language imposes a strict policy distinguishing it from most data parallel languages: while library functions may introduce communications, language primitives such as array subscripting never imply communication.
If subscripting distributed dimensions is so restricted, why are the i, j subscripts on the arrays needed at all? In the examples of this section these subscripts are allowed only one value on each processor. Well, the inconvenience of specifying the subscripts will be reduced by language constructs introduced later, and the fact that only one subscript value is local is a special feature of the primitive ranges used here. The higher level distributed ranges introduced later map multiple elements to individual processes, and subscripting will no longer look so redundant.
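As a preview, a sketch using the BlockRange class described in section 4 (assuming its constructor takes the range extent and a process dimension):
Procs1 q = new Procs1(4) ;
Range x = new BlockRange(100, q.dim(0)) ;   // 25 consecutive elements per process
float [[#]] a = new float [[x]] on q ;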

3.4. The on construct and the active process group
In section 3.2 the construct
if(p.member()) {
...
}
appeared. Our language provides a short way of writing this construct:
on(p) {
...
}
In fact the on construct provides some extra value. Informally we said in section 3.2 that the active process group is restricted to p inside the body of the p.member() conditional construct. The language incorporates a more formal idea of an active process group (APG). At any point of execution some process group is singled out as the APG. An on(p) construct specifically changes the value of the APG to p. On exit from the construct, the APG is restored to its value on entry.
Elevating the active process group to a part of the language allows some simplifications. For example, it provides a natural default for the on clause in array constructors. More importantly, formally defining the active process group simplifies the statement of various rules about what operations are legal inside distributed control constructs like on.
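For instance, inside an on construct the on clause of a constructor may be omitted (a small sketch, reusing the declarations of section 3.2):
on(p) {
    // defaults to the active process group, here p
    float [[#,#,]] a = new float [[x, y, 100]] ;
}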
3.5. Other features
We have already described most of the important language features we propose to implement. Two additional features that are quite important in practice but have not been discussed are subranges and subgroups. A subrange is simply a range which is a regular section of some other range, created by syntax like x [0 : 49]. Subranges are created tacitly when a distributed array is subscripted with a triplet, and they can also be used directly to create distributed arrays with general HPF-like alignments. A subgroup is some slice of a process array, formed by restricting process coordinates in one or more dimensions to single values. Again they may be created implicitly by section subscripting, this time using a scalar subscript. They also formally describe the state of the active process group inside at and over constructs.
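For example (a sketch; assume x is a distributed range with at least 50 elements):
Range u = x [0 : 49] ;                   // subrange: a regular section of x
float [[#]] b = new float [[u]] on p ;   // a distributed array aligned with part of x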
The framework described is much more powerful than space allows us to demonstrate in this paper. This power comes in part from the flexibility to add features by extending the libraries associated with the language. We have only illustrated the simplest kinds of distribution format. But any HPF 1.0 array distribution format, plus various others, can be incorporated by extending the Range hierarchy in the run-time library. We have only illustrated shift and writeHalo operations from the communication library, but the library also includes much more powerful operations for remapping arrays and performing irregular data accesses. Our intention is to provide minimal language support for distributed arrays, just enough to facilitate further extension through construction of new libraries.
For a more complete description of a slightly earlier version of the proposed language, see
4. Considerations in HPJava language design
This section discusses some design and implementation issues in the HPJava language. The language is briefly reviewed, then the class library that forms the foundation of the translation scheme is described. Through example code, we illustrate how HPJava source can be translated straightforwardly into ordinary SPMD Java programs calling this library. This is followed by a discussion of the rationale for introducing the language in the first place, and of how various language features have been designed to facilitate efficient implementation.
4.1. Translation scheme

The initial HPJava compiler is implemented as a source-to-source translator converting an HPJava program to a Java node program, with calls to runtime functions. The runtime system is built on the NPAC PCRC runtime library, which has a kernel implemented in C++ and a Java interface implemented in Java and C++.
4.1.1. Java packages for HPspmd programming
The current runtime interface for HPJava is called adJava. It consists of two Java packages. The first is the HPspmd runtime proper. It includes the classes needed to translate language constructs. The second package provides communication and some simple I/O functions. These two packages will be outlined in this section.

The classes in the first package include an environment class, distributed array ``container classes'', and related classes describing process groups and index ranges. The environment class SpmdEnv provides functions to initialize and finalize the underlying communication library (currently MPI). Constructors call native functions to prepare the lower level communication package. An important field, apg, defines the group of processes that is cooperating in ``loose synchrony'' at the current point of execution.
The other classes in this package correspond directly to HPJava built-in classes. The first hierarchy is based on Group. A group, or process group, defines some subset of the processes executing the SPMD program. Groups have two important roles in HPJava. First they are used to describe how program variables such as arrays are distributed or replicated across the process pool. Secondly they are used to specify which subset of processes execute a particular code fragment. Important members of the adJava Group class include the pair on() and no(), used to translate the on construct.
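The generated code is not shown in this report, but one plausible rendering of the translation (the details are an assumption) is:
// HPJava source:  on(p) { S }
if(p.on()) {     // make p the active process group; true on member processes
    S ;
}
p.no() ;         // restore the previous active process group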

Figure 1: The HPJava Group hierarchy
The most common way to create a group object is through the constructor for one of the subclasses representing a process grid. The subclass Procs represents a grid of processes and carries information on process dimensions: in particular an inquiry function dim(r) returns a range object describing the r-th process dimension. Procs is further subclassed by Procs0, Procs1, Procs2, ..., which provide simpler constructors for fixed dimensionality process grids. The class hierarchy of groups and process grids is shown in figure 1.
The second hierarchy in the package is based on Range. A range is a map from an integer interval into some process dimension (i.e., some dimension of a process grid). Ranges are used to parametrize distributed arrays and the overall distributed loop.

Figure 2: The HPJava Range hierarchy
The most common way to create a range object is to use the constructor for one of the subclasses representing ranges with specific distribution formats. The current class hierarchy is given in figure 2. Simple block distribution format is implemented by BlockRange, while CyclicRange and BlockCyclicRange represent other standard distribution formats of HPF. The subclass CollapsedRange represents a sequential (undistributed) range. Finally, DimRange represents the range of coordinates of a process dimension itself--just one element is mapped to each process.

The related adJava class Location represents an individual location in a particular distributed range. Important members of the adJava Range class include the function location(i), which returns the i-th location in a range, and its inverse idx(l), which returns the global subscript associated with a given location. Important members of the Location class include at() and ta(), used in the implementation of the HPJava at construct.
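By analogy with on() and no(), a hedged sketch of how at() and ta() might appear in translated code (the at construct's surface syntax is not shown in this report):
Location l = x.location(1) ;   // the location with global subscript 1 in x
if(l.at()) {     // true only on the process owning this location
    ... ;
}
l.ta() ;         // restore the previous active process group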
Finally in this package we have the rather complex hierarchy of classes representing distributed arrays. HPJava global arrays declared using [[ ]] are represented by Java objects belonging to classes such as:
Array1dI, Array1cI,
Array2ddI, Array2dcI, Array2cdI, Array2ccI,
...
Array1dF, Array1cF,
Array2ddF, Array2dcF, Array2cdF, Array2ccF,
...
Generally speaking, a class name of this form--``Array'', then the rank, then a string of ``c''s and ``d''s, then an element-type code--represents a distributed array of that rank and element type. The element-type codes are currently I, F, ..., meaning int, float, .... The string of ``c''s and ``d''s specifies whether each dimension is collapsed or distributed (Array2dcI, for example, is a two-dimensional int array with its first dimension distributed and its second collapsed). These correlate with the presence or absence of an asterisk in the slots of the HPJava type signature. The concrete Array... classes implement a series of abstract interfaces. These follow a similar naming convention, but the root of their names is Section rather than Array (so Array2dcI, for example, implements Section2dcI). The hierarchy of Section interfaces is illustrated in figure 3.

Figure 3: The adJava Section hierarchy
The need to introduce the Section interfaces should be evident from the hierarchy diagram. The type hierarchy of HPJava involves a kind of multiple inheritance. The array type int [[*, *]], for example, is a specialization of both the types int [[*, ]] and int [[, *]]. Java allows ``multiple inheritance'' only from interfaces, not classes.

Here we mention some important members of the Section interfaces. The inquiry dat() returns the ordinary one-dimensional Java array used to store the locally held elements of the distributed array. The member pos(i, ...), which takes one argument per array dimension, returns the local offset of the element specified by its list of arguments. Each argument is either a location (if the corresponding dimension is distributed) or an integer (if it is collapsed). The inquiry grp() returns the group over which elements of the array are distributed. The inquiry rng(d) returns the d-th range of the array.
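Putting these inquiries together, translated element access might look as follows (a hedged illustration; i and j are Location values for the two distributed dimensions of a two-dimensional float array a):
float v = a.dat() [a.pos(i, j)] ;   // fetch the locally held element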
The second package in adJava is the communication library. The adJava communication package includes classes corresponding to the various collective communication schedules provided in the NPAC PCRC kernel. Most of them provide a constructor to establish a schedule, and an execute method which carries out the data movement specified by the schedule. The communication schedules provided in this package are based on the NPAC runtime library. Different communication models may eventually be added through further packages.

The collective communication schedules can be used directly by the programmer or invoked through certain wrapper functions. A class named Adlib is defined with static members that create and execute communication schedules and perform simple I/O functions. This class includes, for example, the following methods, each implemented by constructing the appropriate schedule and then executing it.
static public void remap(Section dst, Section src)
static public void shift(Section dst, Section src,
int shift, int dim, int mode)
static public void copy(Section dst, Section src)
static public void writeHalo(Section src,
int[] wlo, int[] whi, int[] mode)
Use of these functions will be illustrated in later examples. Polymorphism is achieved by using arguments of class Section.
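As a simple hedged example (the CyclicRange constructor is assumed to mirror BlockRange's), a single remap call redistributes a block-distributed array into cyclic format:
Range x = new BlockRange(100, p.dim(0)) ;
Range c = new CyclicRange(100, p.dim(0)) ;
float [[#]] a = new float [[x]] on p,
            b = new float [[c]] on p ;
Adlib.remap(b, a) ;   // all communication implied by the redistribution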
4.2. Issues in the language design
With some of the implementation mechanisms exposed, we can better discuss the language design itself.
4.2.1. Extending the Java language

The first question to answer is why use Java as a base language? Actually, the programming model embodied in HPJava is largely language independent. It can be bound to other languages like C, C++ and Fortran. But Java is a convenient base language, especially for initial experiments, because it provides full object-orientation--convenient for describing complex distributed data--implemented in a relatively simple setting, conducive to implementation of source-to-source translators. It has been noted elsewhere that Java has various features suggesting it could be an attractive language for science and engineering.
4.2.2. Datatypes in HPJava
In a parallel language, it is desirable to have both local variables (like the ones in MPI programming) and global variables (like the ones in HPF programming). The former provide flexibility and are ideal for task parallel programming; the latter are convenient especially for data parallel programming.

In HPJava, variable names are divided into two sets. In general those declared using ordinary Java syntax represent local variables and those declared with [[ ]] represent global variables. The two sectors are independent. In the implementation of HPJava the global variables have special data descriptors associated with them, defining how their components are divided or replicated across processes. The significance of the data descriptor is most obvious when dealing with procedure calls.
Passing array sections to procedure calls is an important component of the array processing facilities of Fortran 90 [1]. The data descriptor of Fortran 90 includes stride information for each array dimension. One can assume that HPF needs a much more complex kind of data descriptor to allow passing distributed arrays across procedure boundaries. In either case the descriptor is not visible to the programmer. Java has a more explicit data descriptor concept; its arrays are considered as objects with, for example, a publicly accessible length field. In HPJava, the data descriptors for global data are similar to those used in HPF, but more explicitly exposed to programmers. Inquiry functions such as grp() and rng() have a similar role for global data to the field length in an ordinary Java array.
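The parallel can be made concrete (a small sketch):
int [] d = new int [100] ;
int n = d.length ;                // descriptor field of an ordinary Java array

float [[#]] a = new float [[x]] on p ;
Group g = a.grp() ;               // group over which a is distributed
Range r = a.rng(0) ;              // distributed range of dimension 0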

Keeping two data sectors seems to complicate the language and its syntax, but it provides convenience for both task and data parallel processing. There is no need for things like the LOCAL mechanism in HPF to call a local procedure on the node processor. The descriptors for ordinary Java variables are unchanged in HPJava. On each node processor ordinary Java data will be used as local variables, as in an MPI program.
4.2.3. Programming convenience

The language provides some special syntax for the programmer's convenience. Unlike the syntax for data declaration, which has fundamental significance in the programming model, these extensions purely provide syntactic convenience.
A limited number of Java operators are overloaded. A group object can be restricted by a location using the / operator, and a subrange or location can be obtained from a range using the [ ] operator enclosing a triplet expression or an integer. These pieces of syntax can be considered shorthand for suitable constructors in the corresponding classes. This is comparable to the way Java provides special syntax support for the String class.
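In code these shorthands might appear as follows (a sketch; the variable names are illustrative):
Range u = x [0 : 49] ;    // subrange from a triplet, via the [ ] operator
Location l = x [10] ;     // a single location, via an integer subscript
Group q = p / l ;         // the subgroup of p holding location l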
Another kind of overloading occurs in location shift, which is used to support ghost regions. A shift operator + is defined between a location and an integer. This is a restricted operation--it has meaning (and is legal) only in an array subscript expression, as illustrated schematically below.
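Schematically (a fragment; assume i is bound to a location of the relevant range, and the ghost regions of a have been brought up to date with writeHalo):
Adlib.writeHalo(a, wlo, whi, mode) ;   // refresh the ghost (halo) regions
... a [i + 1] ...                      // shifted subscript reads from the halo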
5. Discussion and related work

We have described a conservative set of extensions to Java. In the context of an explicitly SPMD programming environment with a good communication library, we claim these extensions provide much of the concise expressiveness of HPF, without relying on very sophisticated compiler analysis. The object-oriented features of Java are exploited to give an elegant parameterization of the distributed arrays of the extended language. Because of the relatively low-level programming model, interfacing to other parallel-programming paradigms is more natural than in HPF. With suitable care, it is possible to make direct calls to, say, MPI from within the data parallel program. In [3] we suggest a concrete Java binding for MPI.
We will mention two related projects. Spar [11] is a Java-based language for array-parallel programming. Like our language it introduces multi-dimensional arrays, array sections, and a parallel loop. There are some similarities in syntax, but semantically Spar is very different to our language. Spar expresses parallelism but not explicit data placement or communication--it is a higher level language. ZPL [10] is a new programming language for scientific computations. Like Spar, it is an array language. It has an idea of performing computations over a region, or set of indices. Within a compound statement prefixed by a region specifier, aligned elements of arrays distributed over the same region can be accessed. This idea has certain similarities to our over construct. Communication is more explicit than Spar, but not as explicit as in the language discussed in this article.
5.1. Implementation of Collectives
In this section we discuss the Java implementation of the Adlib collective operations. For illustration we concentrate on the important remap operation. Although it is a powerful and general operation, it is actually one of the simpler collectives to implement in the HPJava framework. General algorithms for this primitive have been described by other authors; for example, it is essentially equivalent to the operation called Regular_Section_Copy_Sched in [2]. Here we want to illustrate how this kind of operation can be implemented in terms of the particular Range and Group hierarchies of HPJava (complemented by a suitable set of messaging primitives).

All collective operations in the library are based on communication schedule objects. Each kind of operation has an associated class of schedules. Particular instances of these schedules, involving particular data arrays and other parameters, are created by the class constructors. Executing a schedule initiates the communications required to effect the operation. A single schedule may be executed many times, repeating the same communication pattern. In this way, especially for iterative programs, the cost of computations and negotiations involved in constructing a schedule can often be amortized over many executions. This paradigm was pioneered in the CHAOS/PARTI libraries [8]. If a communication pattern is to be executed only once, simple wrapper functions can be made available to construct a schedule, execute it, then destroy it. The overhead of creating the schedule is essentially unavoidable, because even in the single-use case individual data movements generally have to be sorted and aggregated for efficiency. The associated data structures are just those associated with schedule construction. The constructor and public method of the remap schedule for distributed arrays of float elements can be described as follows:
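A minimal sketch of such a schedule class, consistent with the paradigm just described (the class and member names here are assumptions, in the style of the Adlib listing above):
public class Remap {
    // constructor: enumerate, sort and aggregate all data movements
    // implied by the distribution formats of dst and src
    public Remap(Section dst, Section src)
    // carry out the communications; may be invoked repeatedly
    public void execute()
}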

The remap schedule combines two functionalities: it reorganizes data in the way indicated by the distribution formats of the source and destination arrays, and, if the destination array has a replicated distribution format, it broadcasts data to all copies of the destination. Here we will concentrate on the former aspect, which is handled by an object of class RemapSkeleton contained in every Remap object. During construction of a RemapSkeleton schedule, all send messages, receive messages, and internal copy operations implied by execution of the schedule are enumerated and stored in light-weight data structures. These messages have to be sorted before sending, for possible message agglomeration, and to ensure a deadlock-free communication schedule. These algorithms, and maintenance of the associated data structures, are dealt with in a base class of RemapSkeleton called BlockMessSchedule. The API for this superclass is outlined in Figure 11. To set up such a low-level schedule, one makes a series of calls to sendReq and recvReq to define the required messages. Messages are characterized by an offset in some local array segment, and a set of strides and extents parameterizing a multi-dimensional patch of the (flat Java) array. Finally the build() operation does any necessary processing of the message lists. The schedule is executed in a ``forward'' or ``backward'' direction by invoking gather() or scatter().

Figure 11: API of the class BlockMessSchedule
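A plausible outline of this API, reconstructed from the description above (the exact signatures are assumptions):
public abstract class BlockMessSchedule {
    // register a message to send: a multi-dimensional patch of a flat
    // Java array, given by an offset plus per-dimension strides and extents
    void sendReq(int offset, int [] strs, int [] exts, int dstProc)
    // register a matching receive
    void recvReq(int offset, int [] strs, int [] exts, int srcProc)
    // sort and aggregate the registered messages
    void build()
    // execute the schedule in the ``forward'' direction
    void gather()
    // execute the schedule in the ``backward'' direction
    void scatter()
}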
5.2. Titanium

The Titanium language is designed to support high-performance scientific applications. Historically, few languages that made such a claim have achieved a significant degree of serious use by scientific programmers. Among the reasons are the high learning curve for such languages, the dependence on heroic parallelizing compiler technology and the consequent absence of compilers and tools, and incompatibilities with the languages used for libraries. The goal is to provide a language that gives its users access to modern program structuring through object-oriented technology, that enables its users to write explicitly parallel code to exploit their understanding of the computation, and that has a compiler that uses optimizing compiler technology where it is reliable and gives predictable results. The starting design point for Titanium is Java, chosen for several reasons. The Titanium project extends the Java language for scientific computing but compiles down to C or C++. This approach offers the hope of tuning for higher ultimate performance, but sacrifices various benefits of the full Java platform.
5.3. FIDIL
The multidimensional array support in HPJava is strongly influenced by FIDIL maps and domains [6, 11]. HPJava, however, sacrifices expressiveness for performance. FIDIL maps have arbitrary shapes. FIDIL has only a general domain type, thus making it harder to optimize code that uses the more common rectangular kind.
5.4. Split-C

The parallel execution model and global address space support in HPJava are closely related to Split-C [5] and AC [3]. HPJava shares a common communication layer with Split-C on distributed memory machines, which we have extended as part of the HPJava project to run on shared memory machines. Split-C differs from HPJava in that the default pointer type is local rather than global; a local pointer default simplifies interfacing to existing sequential code, but a global default makes it easier to port shared memory applications to distributed memory machines. Split-C uses sequential consistency as its default consistency model, but provides explicit operators to allow non-blocking operations to be used. In AC the compiler introduces non-blocking memory operations automatically, using only dependence information, not parallel program analysis.

6. Conclusion

Our experience thus far is that Java is a good choice as a base language: it is easy to extend, and its safety features greatly simplify the compiler writer's task. We also believe that extending Java is easier than obtaining high performance within Java's strict language specification (assuming the latter is feasible at all). Many of the features of HPJava would be hard or impossible to achieve as Java libraries, and the compiler would not be able to perform static analysis and optimizations on them. HPJava will be most helpful for problems that have some degree of regularity. To a first approximation, HPJava's domain is similar to HPF's. Many of the most challenging problems in modern computational science have irregular structure, so the value of our language features in those domains is more controversial.

7. References
[1] G. Fox, Java and Grande Application, Computing in Science & Engineering
[2] URL: http://hpjavaindex.html
[3] URL: http://communitygrids.iu.edu/IC2.html
[4] URL: http://nacseHPJava/index.html
[5] URL: http://hpjavampijava.html

CONTENTS
1 Introduction
2 Overview of HPJava
2.1 HPJava History
2.2 HPJava Philosophy
3 Characteristics
3.1 Multidimensional arrays
3.2 Process arrays
3.3 Distributed arrays
3.4 The on construct and the active process group
3.5 Other features
4 Considerations in HPJava language design
4.1 Translation scheme
4.1.1 Java packages for HPspmd programming
4.2 Issues in the language design
4.2.1 Extending the Java language
4.2.2 Datatypes in HPJava
4.2.3 Programming convenience
5 Discussion and related work
5.1 Implementation of Collectives
5.2 Titanium
5.3 FIDIL
5.4 Split-C
6 Conclusion
7 References


ACKNOWLEDGEMENTS

I express my sincere thanks to Prof. M. N. Agnisarman Namboothiri (Head of the Department, Computer Science and Engineering, MESCE) and Mr. Zainul Abid (staff in charge) for their kind co-operation in presenting this seminar.
I also extend my sincere thanks to all other members of the faculty of Computer Science and Engineering Department and my friends for their co-operation and encouragement.
Rony V John