Background concepts
-Cloud computing
Resource management architecture
-Cloud management
-Image catalogue
-Security mechanisms


Cloud computing is emerging as a paradigm for the next generation of large scale scientific computing.
Here we examine the use of compute clouds to extend a grid workflow environment in order to speed up the execution of scientific workflows.


In the last decade, Grid Computing gained high popularity in the field of scientific computing.
Scientific computing is traditionally a high-utilization workload.

Advantages of cloud computing

Clouds promote the concept of leasing remote resources rather than buying own hardware.
Clouds eliminate the physical overhead cost of adding new hardware.
Clouds also promote the concept of hardware virtualization.
Clouds provision the resources through business relationships.


ASKALON is a Grid application development and computing environment developed at the University of Innsbruck.
Objective: To simplify the development and optimization of applications


Cloud computing is increasingly used for provisioning various services through the Internet that are billed like utilities.
The most popular interpretation of cloud computing is Infrastructure as a Service (IaaS).
IaaS is characterized by the concept of resource virtualization.


Cloud management
Image catalogue
Security mechanisms


The cloud-enabled resource manager extends the old Grid resource manager with two new runtime functions.
The cloud management component is responsible for provisioning, releasing and checking the status of an instance.


Each cloud infrastructure provides a different set of images offered by the provider or defined by the users themselves.
The task of the image catalogue is to systematically organize missing information, which is registered manually by the resource manager administrator.


Security is a critical topic in cloud computing.
Several issues need to be addressed such as authentication to the cloud services.
Two types of credentials in cloud environments: user credentials and instance credentials.


Here we extended a grid workflow development and computing environment to use on-demand cloud resources in grid environments.
Workflows with large problem sizes can benefit from being executed in a combined grid and cloud environment.
This environment currently supports providers offering Amazon EC2-compliant interfaces, which can be extended to other cloud providers.


[1] A. Iosup, C. Dumitrescu, D. Epema, H. Li, and L. Wolters, How are real grids used? The analysis of four grid traces and its implications, in International Conference on Grid Computing. IEEE Computer Society, 2006, pp. 262-269.
[2] G. D. Costa, M. D. Dikaiakos, and S. Orlando, Analyzing the workload of the South-East Federation of the EGEE Grid infrastructure, CoreGRID Technical Report, Tech. Rep. TR-0063, 2007.
In the last decade, Grid computing gained high popularity in the field of scientific computing through the idea of distributed resource sharing among institutions and scientists. Scientific computing is traditionally a high-utilization workload, with production Grids often running at over 80% utilization (generating high and often unpredictable latencies), and with smaller national Grids offering a rather limited amount of high-performance resources. Running large-scale simulations in such overloaded Grid environments often becomes latency bound or suffers from well-known Grid reliability problems. Today, a new research direction known as Cloud computing proposes an alternative that is attractive to the scientific computing community primarily because of four main advantages.
First, Clouds promote the concept of leasing remote resources rather than buying one's own hardware, which frees institutions from permanent maintenance costs and eliminates the burden of hardware depreciation following Moore's law.
Second, Clouds eliminate the physical overhead cost of adding new hardware, such as compute nodes to clusters or supercomputers, and the financial burden of permanent over-provisioning of occasionally needed resources. Through a new concept of scaling by credit card, Clouds promise to immediately scale an infrastructure up or down according to temporal needs in a cost-effective fashion.
Third, the concept of hardware virtualization can represent a significant breakthrough for the automatic and scalable deployment of complex scientific software and can also significantly improve the shared resource utilization.
Fourth, the provisioning of resources through business relationships compels specialized data centre companies to offer reliable services, which existing Grid infrastructures fail to deliver.
Despite the existence of several integrated environments for transparent programming and high-performance use of Grid infrastructures for scientific applications, no results have yet been published in the community on extending them to enjoy the benefits offered by Cloud computing. While several early efforts investigate the appropriateness of Clouds for scientific computing, they are either limited to simulations, do not address the highly successful workflow paradigm, or do not attempt to extend Grids with Clouds as a hybrid combined platform for scientific computing.
In this paper we extend a Grid workflow application development and computing environment to harness resources leased from Cloud computing providers. Our goal is to provide an infrastructure that allows the execution of workflows on conventional Grid resources, which can be supplemented on demand with additional Cloud resources if necessary. We concentrate our presentation on the extensions we made to the resource management service to consider Cloud resources, comprising new Cloud management, software (image) deployment, and security components. We present experimental results using a real-world application in the Austrian Grid environment, extended with our own academic Cloud constructed using the Eucalyptus middleware and Xen virtualization technology.


While there are several workflow execution middlewares for Grid computing, none is known to support the new type of Cloud infrastructure.


ASKALON is a Grid application development and computing environment developed at the University of Innsbruck with the goal of simplifying the development and optimization of applications that can harness the power of Grid computing (see Figure 2.1). In ASKALON, the user composes workflow applications at a high level of abstraction using a UML graphical modeling tool. Workflows are specified as a directed graph of activity types representing an abstract semantic description of the computation such as a Gaussian elimination algorithm, a Fast Fourier Transform, or an N-body simulation. The activity types are interconnected in a workflow through control flow and data flow dependencies. The abstract workflow representation is given in an XML form to the ASKALON middleware services for transparent execution onto the Grid.
This task is mainly accomplished by a fault-tolerant enactment engine, together with a scheduling service in charge of computing optimized mappings of workflow activities onto the available Grid resources. To achieve this task, the scheduler employs a resource management service that consists of two main components: GridARM for discovery and brokerage of hardware resources by interfacing with a Grid information service, and GLARE for registration and provisioning of software resources. An important function of GLARE is the automatic provisioning of activity deployments on remote Grid sites, which are properly configured installations of the legacy software and services implementing the activity types. Once an activity deployment has been installed, we say that the remote resource has been provisioned and can be used by the scheduler and enactment engine for the workflow execution. This execution can be monitored using graphical tools or via the engine's event system.
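To illustrate the directed-graph view of a workflow described above, the following is a minimal sketch that groups activities into "waves" that could run in parallel once their dependencies are satisfied. It is not ASKALON's workflow representation or scheduler: the activity names, the `workflow` dictionary, and the `execution_waves` helper are hypothetical, and the sketch assumes an acyclic graph.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each activity maps to the set of activities it depends on.
workflow = {
    "fft": {"read_input"},
    "gaussian_elimination": {"read_input"},
    "merge": {"fft", "gaussian_elimination"},
}


def execution_waves(dependencies):
    """Group activities into waves whose members have all dependencies met."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # activities runnable right now
        waves.append(ready)
        sorter.done(*ready)  # mark them finished so successors become ready
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(execution_waves(workflow), start=1):
        print(number, wave)
```

A real scheduler would additionally weigh resource availability and data transfer costs when mapping each wave onto Grid sites; the sketch only captures the dependency ordering.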

2.2 Cloud Computing

The buzzword Cloud computing is increasingly used for the provisioning of various services through the Internet that are billed like utilities. From a scientific point of view, the most popular interpretation of Cloud computing is Infrastructure as a Service (IaaS), which provides generic means for hosting and provisioning access to raw computing infrastructure and its operating software. IaaS is typically provided by a data center renting modern hardware facilities to customers who pay only for what they effectively use, which frees them from the burden of hardware maintenance and depreciation. IaaS is characterized by the concept of resource virtualization, which allows a customer to deploy and run their own guest operating system on top of the virtualization software offered by the provider.
Virtualization in IaaS is also a key step towards distributed, automatic, and scalable deployment, installation, and maintenance of software. To deploy a guest operating system that presents the user with another abstract, higher-level emulated platform, the user creates a virtual machine image, in short image. In order to use a Cloud resource, the user needs to copy an image onto it and boot it; the booted image is called a virtual machine instance, in short instance. After an instance has been started on a Cloud resource, we say that the resource has been provisioned and can be used. If a resource is no longer necessary, it must be released so that the user no longer pays for its use. Commercial Cloud providers typically offer customers a selection of resource classes or instance types with different characteristics, including CPU type, number of cores, memory, hard disk, and I/O performance.


To enable the ASKALON Grid environment to use Cloud resources from different providers, we extended the resource management service with three new components: Cloud management, image catalogue, and security mechanisms. Whenever the high-performance Grid resources are exhausted, the ASKALON scheduler has the option of supplementing them with additional ones leased from Cloud providers to complete the workflow faster. A limit on the maximum number of leased resources that can be requested is set for each Cloud in its credential properties. This limit helps to save money and to stay within the resource limits imposed by the Cloud provider. EC2 allows users to request up to 20 instances on a normal account, while bigger resource requests require contacting Amazon manually. The dps.cloud used here offers 12 cores and could not serve any further requests, so its limit for resource requests was set to 12. When a deployment request for a new Cloud resource arrives from the scheduler, the resource manager arranges its provisioning by performing the following steps (see Figure 3.1).

Figure 3.1: The Cloud-enhanced resource management architecture
1) The resource manager retrieves a signed request for a certain number of activity deployments needed to complete the workflow;
2) The security component checks the credential of the request and which Clouds are available for the requesting user;
3) The image catalogue component retrieves the predefined registered images for the accessible Clouds;
4) The images are checked to determine whether they include the requested activity deployment or have the capability to auto-deploy it;
5) The instances are started using the Cloud management component and the image boot process is monitored until an SSH control connection to the new instance is possible. If the instance does not contain the requested activity deployment, an optional auto-deployment process using GLARE takes place;
6) A new entry is created in GridARM with all information required by the new instance such as identifier, IP address, and number of CPUs;
7) All the activity deployments contained in the booted image are registered in GLARE;
8) The resource manager replies to the scheduler with the new deployments for the requested activity types.
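The eight steps above, together with the per-Cloud lease limit, can be sketched as a single function. This is only an illustration under stated assumptions: the `Image` and `Cloud` classes, the `provision` function, and the plain dictionaries standing in for GridARM and GLARE are hypothetical and do not reflect the actual ASKALON interfaces. Booting, SSH monitoring, and signature checking are reduced to comments.

```python
from dataclasses import dataclass, field


@dataclass
class Image:
    image_id: str
    deployments: set            # activity types already installed in the image
    auto_deploy: bool = False   # whether missing activities could be auto-deployed


@dataclass
class Cloud:
    name: str
    max_instances: int                            # per-Cloud lease limit
    images: list = field(default_factory=list)    # registered images
    running: list = field(default_factory=list)   # ids of started instances


def provision(user, activity, count, clouds, credentials, gridarm, glare):
    """Walk through steps 2-8 for one request and return the new instance ids.

    Step 1 (receiving the signed request) is represented by the call itself.
    """
    started = []
    for cloud in clouds:
        # Step 2: skip Clouds for which the user holds no credential.
        if cloud.name not in credentials.get(user, set()):
            continue
        # Steps 3-4: pick a registered image that contains the activity
        # or is able to auto-deploy it.
        image = next((i for i in cloud.images
                      if activity in i.deployments or i.auto_deploy), None)
        if image is None:
            continue
        # Never exceed the lease limit configured for this Cloud.
        free = cloud.max_instances - len(cloud.running)
        for _ in range(min(free, count - len(started))):
            # Step 5: "start" an instance; a real system would boot the image
            # and wait until an SSH connection is possible.
            instance_id = f"{cloud.name}-{len(cloud.running)}"
            cloud.running.append(instance_id)
            # Step 6: register the hardware resource (GridARM stand-in).
            gridarm[instance_id] = {"cloud": cloud.name, "image": image.image_id}
            # Step 7: register the activity deployments (GLARE stand-in).
            glare[instance_id] = image.deployments | {activity}
            started.append(instance_id)
        if len(started) == count:
            break
    # Step 8: reply to the scheduler with the new deployments.
    return started
```

With a Cloud limited to 12 instances, a request for 15 deployments yields only 12 instance ids, which mirrors the dps.cloud limit described earlier.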

3.1 Cloud management

In terms of functionality, the Cloud-enabled resource manager extends the old Grid resource manager with two new runtime functions: the request for new deployments for a specific activity type and the release of a resource after its use has ended. The Cloud management component is responsible for provisioning, releasing, and checking the status of an instance. Figure 3.2 shows a generic instance state transition diagram, which we constructed by analyzing the instance states in different Cloud implementations.

Upon a request for additional resources, the Cloud management component selects the resources (instance types) with the best price/performance ratio, to which it transfers an image that contains the required activity deployments or is enabled with auto-deployment functionality (state starting). In the running state the image is booted, while in the accessible state the instance is ready to be used. In the resizing phase the underlying hardware is reconfigured, e.g. by adding more cores or memory, while in the restarting phase the image is rebooted, for example upon a kernel change. The release of an image upon shutdown is signaled by the terminated state. The failed state indicates an error of any kind that automatically releases the resource. Upon a resource release, the instance and all its registered deployments are removed from GridARM and GLARE. However, if there are pending requests for an existing instance containing the required deployments, the resource manager can optimize the provisioning by reusing the same instance for the next user, provided both users share the same Cloud credential.
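The instance states named above can be modelled as a small state machine. The state names come from the text, but the transition table below is an assumption made for illustration, since the state diagram itself is not reproduced here; the `advance` and `is_released` helpers are hypothetical as well.

```python
from enum import Enum


class State(Enum):
    STARTING = "starting"
    RUNNING = "running"
    ACCESSIBLE = "accessible"
    RESIZING = "resizing"
    RESTARTING = "restarting"
    TERMINATED = "terminated"
    FAILED = "failed"


# Assumed transitions (not taken from the original diagram): any active
# state may fail, and resizing or restarting leads back to a booted image.
TRANSITIONS = {
    State.STARTING: {State.RUNNING, State.FAILED},
    State.RUNNING: {State.ACCESSIBLE, State.FAILED},
    State.ACCESSIBLE: {State.RESIZING, State.RESTARTING,
                       State.TERMINATED, State.FAILED},
    State.RESIZING: {State.RUNNING, State.FAILED},
    State.RESTARTING: {State.RUNNING, State.FAILED},
    State.TERMINATED: set(),
    State.FAILED: set(),
}


def advance(current, target):
    """Return the target state, or raise if the transition is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target


def is_released(state):
    """Both terminated and failed instances no longer hold a resource."""
    return state in (State.TERMINATED, State.FAILED)
```

Keeping the allowed transitions in one table makes it straightforward to adjust the sketch to the state model of a particular Cloud implementation.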
The Cloud manager also maintains a registry of the available resource classes (or instance types) offered by different Cloud providers, containing the number of cores, the amount of memory and hard disk, I/O performance, and cost per unit of computation. For example, Table 3.1 contains the resource class information offered by four Cloud providers, which needs to be manually entered by the resource manager administrator in the Cloud management registry due to the lack of a corresponding API. Today, different commercial and academic Clouds provide different interfaces to their services, as no official standard has been defined yet.
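A registry of resource classes lends itself to a simple selection routine. The sketch below assumes that "best price/performance ratio" means the lowest hourly price per core among the types that satisfy minimum requirements; this metric, the `InstanceType` fields, and the `best_price_performance` function are illustrative assumptions, and any entries passed to it would be placeholders rather than values from Table 3.1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    provider: str
    name: str
    cores: int
    memory_gb: float
    price_per_hour: float


def best_price_performance(registry, min_cores=1, min_memory_gb=0.0):
    """Pick the type with the lowest price per core that meets the minimums.

    Returns None when no registered type is large enough.
    """
    candidates = [t for t in registry
                  if t.cores >= min_cores and t.memory_gb >= min_memory_gb]
    if not candidates:
        return None
    return min(candidates, key=lambda t: t.price_per_hour / t.cores)
```

Other metrics, such as price per unit of measured benchmark performance, could be swapped in by changing the `key` function.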
In the Cloud management component we use the Amazon API defined by EC2, which is also implemented by the Eucalyptus and Nimbus middlewares used for building academic Clouds. Supporting more Clouds requires plugins for other interfaces or the use of a metacloud software. Table 3.2 shows an overview of the Cloud providers that currently offer API access to provision and release their resources, and which could therefore be integrated into an automatic resource management system. This overview also shows the difference in available hardware configurations of the five selected providers. There is also a wide range of Cloud providers that do not offer an API to control the instances and are therefore not listed.

3.2 Image catalogue

Each Cloud infrastructure provides a different set of images offered by the provider or defined by the users themselves, which need to be organized in order to be of effective use. For example, the Amazon EC2 API provides built-in functionality to retrieve the list of available images, other providers only offer plain-text HTML pages listing their offers, and some providers have the lists of possible images hidden in their instance start API documentation. The information about the images provided by different Cloud providers is in all cases limited to a simple string name and lacks additional semantic descriptions of image characteristics such as the supported architecture, operating system type, embedded software deployments, or support for auto-deployment functionality. The task of the image catalogue is to systematically organize this missing information, which is registered manually by the resource manager administrator.
Figure 3.3 shows the hierarchical image catalogue structure, where each provider has an assigned set of images, and for each image there is a list of activity deployments that are embedded or can be automatically deployed. Custom images with embedded deployments have a reduced provisioning overhead, as the deployment step is skipped. Images are currently not interoperable between Cloud providers, which results in a large image catalogue that needs to be managed. As Table 3.2 demonstrates, the variety of the offers between different providers is high. For example, Amazon EC2 has by far the most images available, also due to the fact that users can upload their custom or modified images and make them available to the community. At the other extreme, AppNexus only provides one standard instance for its users.
The bus size (32 or 64 bit) of the different images may create additional problems with the activity deployments on the started instances; e.g. Amazon EC2 only offers 32 bit architectures on its two cheapest instance types, while the others are 64 bit.
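The hierarchical catalogue (provider, then image, then deployments) can be sketched as a nested structure with a lookup that prefers images with embedded deployments, since these skip the deployment step. The `ImageEntry` fields, the `find_images` function, and the architecture strings are hypothetical and only illustrate the kind of metadata an administrator would register.

```python
from dataclasses import dataclass, field


@dataclass
class ImageEntry:
    name: str
    architecture: str                                  # e.g. "i386" or "x86_64"
    os_type: str
    embedded: set = field(default_factory=set)         # deployments in the image
    auto_deployable: set = field(default_factory=set)  # installable on demand


def find_images(catalogue, activity, architecture=None):
    """Return (provider, image name, needs_deploy) tuples for an activity.

    `catalogue` maps a provider name to its list of ImageEntry objects.
    Images with the activity embedded are listed first.
    """
    hits = []
    for provider, images in catalogue.items():
        for img in images:
            if architecture and img.architecture != architecture:
                continue  # e.g. skip 32 bit images for a 64 bit deployment
            if activity in img.embedded:
                hits.append((provider, img.name, False))
            elif activity in img.auto_deployable:
                hits.append((provider, img.name, True))
    # Stable sort: entries that need no deployment (False) come first.
    return sorted(hits, key=lambda hit: hit[2])
```

Filtering by architecture in the lookup is one way to avoid the 32 bit versus 64 bit mismatch mentioned above before an instance is ever started.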

3.3 Security

Security is a critical topic in Cloud computing, with applications running and producing confidential data on remote unknown resources that need to be protected. Several issues need to be addressed, such as authentication to the Cloud services and to the started instances, as well as securing user credit card information. Authentication is supported by existing providers either through a key pair and certificate mechanism, or by using login and password combinations (see Table 3.2).
One can distinguish between two types of credentials in Cloud environments:
A user credential is a persistent credential associated with a credit card number, used for provisioning and releasing Cloud resources;
An instance credential is a temporary credential used for manipulating an instance through the SSH protocol.
Since these credentials are issued separately by the providers, users will have different credentials for each Cloud infrastructure, in addition to their Grid Security Infrastructure (GSI) certificate. The resource manager needs to manage these credentials in a safe manner, while granting to the other services and to the application secure access to the deployed Cloud resources. The security mechanism of the resource manager is based on GSI proxy delegation credentials, which we extended with two secured repositories for Cloud access:
A MyCloud repository which, similar to a MyProxy repository, stores copies of the user credentials, which can only be accessed by authenticating with the correct GSI credential associated with them;
A MyInstance repository for storing temporary instance credentials generated for each started instance.
The detailed security procedure upon an image deployment request is as follows (see Figure 3.4):
1) A GSI-authenticated request for a new image deployment is received;
2) The security component checks in the MyCloud repository for the Clouds for which the user has valid credentials;
3) A new credential is generated for the new instance that needs to be started. In case multiple images need to be started, the same instance credential can be used to reduce the credential generation overhead (i.e. about 6-10 seconds in our experiments, including the communication overhead);
4) The new instance credentials are stored in the MyInstance repository, which will only be accessible to the enactment engine service for job execution after a proper GSI authentication;
5) A start instance request is sent to the Cloud using the newly generated instance credential;
6) When an instance is released, the resource manager deletes the corresponding credential from the MyInstance repository.
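The six-step procedure above can be sketched with two in-memory repositories. This is a heavily simplified illustration: GSI authentication is reduced to comparing an identity string, access is tied to the requesting identity rather than to the enactment engine service, and the `CredentialStore` class with all of its methods is hypothetical. The standard-library `secrets` module stands in for whatever key generation a Cloud provider performs.

```python
import secrets


class CredentialStore:
    """Toy stand-in for the MyCloud and MyInstance repositories."""

    def __init__(self):
        self.my_cloud = {}     # GSI identity -> {cloud name: user credential}
        self.my_instance = {}  # instance id -> (GSI identity, instance credential)

    def clouds_for(self, gsi_identity):
        # Step 2: Clouds for which the user has stored credentials.
        return sorted(self.my_cloud.get(gsi_identity, {}))

    def start_instances(self, gsi_identity, cloud, instance_ids):
        if cloud not in self.my_cloud.get(gsi_identity, {}):
            raise PermissionError(f"{gsi_identity} has no credential for {cloud}")
        # Step 3: one credential is generated and shared by the whole batch
        # to avoid repeating the generation overhead per instance.
        credential = secrets.token_hex(16)
        # Step 4: store it in the MyInstance repository.
        for instance_id in instance_ids:
            self.my_instance[instance_id] = (gsi_identity, credential)
        # Step 5 would send the start request to the Cloud with this credential.
        return credential

    def instance_credential(self, gsi_identity, instance_id):
        owner, credential = self.my_instance[instance_id]
        if owner != gsi_identity:
            raise PermissionError("credential belongs to another identity")
        return credential

    def release(self, instance_id):
        # Step 6: delete the credential when the instance is released.
        self.my_instance.pop(instance_id, None)
```

A production design would of course encrypt the stored credentials and verify real GSI proxy certificates instead of comparing strings.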


In this paper we extended a Grid workflow development and computing environment to use on-demand Cloud resources in Grid environments offering a limited amount of high-performance resources. We presented the extensions to the resource management architecture to consider Cloud resources, comprising three new components: Cloud management for automatic image management, image catalogue for management of software deployments, and security for authenticating with multiple Cloud providers. We presented experimental results of using a real-world application in the Austrian Grid environment, extended with our own academic Cloud. Our results demonstrate that workflows with large problem sizes can significantly benefit from being executed in a combined Grid and Cloud environment.
Similarly, the cost of using Cloud resources is more favorable for large workflows due to the hourly billing increment policies applied. Our environment currently supports providers offering Amazon EC2-compliant interfaces, which we plan to extend to other Cloud providers. We also plan to investigate more sophisticated multi-criteria scheduling strategies, such as the effect of the resource class granularity (i.e. number of underlying cores) on the execution time, resource allocation efficiency, and the overall cost. We also intend to use an existing Cloud simulation framework for validating various scheduling and optimization strategies at a larger scale.
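The remark on hourly billing increments can be made concrete with a little arithmetic. The sketch assumes that every started hour is billed in full for each instance; the function names and any prices passed in are hypothetical, not figures from the experiments.

```python
import math


def billed_cost(runtime_minutes, instances, price_per_hour):
    """Cost when every started hour is charged in full for each instance."""
    if runtime_minutes <= 0:
        raise ValueError("runtime must be positive")
    billed_hours = math.ceil(runtime_minutes / 60)
    return billed_hours * instances * price_per_hour


def paid_utilization(runtime_minutes):
    """Fraction of the billed time that the workflow actually used."""
    if runtime_minutes <= 0:
        raise ValueError("runtime must be positive")
    billed_minutes = math.ceil(runtime_minutes / 60) * 60
    return runtime_minutes / billed_minutes
```

Under this assumption a 10-minute workflow uses only one sixth of the hour it pays for, whereas a 170-minute workflow uses 170 of its 180 billed minutes, which is why longer runs amortize the billing increment better.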


[3] J. Yu and R. Buyya, A taxonomy of scientific workflow systems for grid computing, ACM SIGMOD Rec., vol. 34, no. 3, pp. 44-49, 2005.
[4] E. Deelman, G. Singh, M. Livny, J. B. Berriman, and J. Good, The cost of doing science on the cloud: the Montage example, in Proceedings of the ACM/IEEE Conference on High Performance Computing, SC 2008, November 15-21, 2008, Austin, Texas, USA. IEEE/ACM, 2008, p. 50.
[5] A. C. M. Assuncao and R. Buyya, Evaluating the cost-benefit of using cloud computing to extend the capacity of clusters, in 11th IEEE International Conference on High Performance Computing and Communications, HPCC 2009, D. Kranzlmüller, A. Bode, H.-G. Hegering, H. Casanova, and M. Gerndt, Eds. ACM, 2009.
[6] D. Nurmi, R. Wolski, C. Grzegorczyk, G. Obertelli, S. Soman, L. Youseff, and D. Zagorodnov, Eucalyptus: A technical report on an elastic utility computing architecture linking your programs to useful systems, UCSB Computer Science Technical Report, Tech. Rep. 2008-10, 2008.

Channel access method

In telecommunications and computer networks, a channel access method or multiple access method allows several terminals connected to the same physical medium to transmit over it and to share its capacity. Examples of shared physical media are bus networks, ring networks, hub networks, wireless networks and half-duplex point-to-point links. Corresponding wording is recommended in the IETF draft on Mobile Ad Hoc Networking Terminology [http://tools.ietfhtml/draft-ietf-manet-term-00].
Multiple access protocols and control mechanisms are called media access control (MAC) for Data links, which is provided by the Data Link Layer in the OSI model and the Link Layer of the TCP/IP model.
A multiple access method is based on a multiplexing method that allows several data streams or signals to share the same communication channel or physical medium. Multiplexing is provided by the Physical Layer. Note that multiplexing may also be used in full-duplex point-to-point communication between nodes in a switched network, which should not be considered multiple access.
List of channel access methods
Circuit mode and channelization methods
The following are common circuit mode and channelization channel access methods:
*Frequency division multiple access (FDMA)
**Orthogonal frequency division multiple access (OFDMA)
**Wavelength division multiple access (WDMA)
*Time-division multiple access (TDMA)
**Multi-Frequency Time Division Multiple Access (MF-TDMA)
*Spread spectrum multiple access (SSMA)
**Direct-sequence spread spectrum (DSSS)
**Frequency-hopping spread spectrum (FHSS)
**Orthogonal Frequency-Hopping Multiple Access (OFHMA)
**Code division multiple access (CDMA) - the overarching form of DS-SS and FH-SS
**Multi-carrier code division multiple access (MC-CDMA)
*Space division multiple access (SDMA)
Packet mode methods
The following are examples of packet mode channel access methods:
*Contention based random multiple access methods:
**Slotted Aloha
**Multiple Access with Collision Avoidance (MACA)
**Multiple Access with Collision Avoidance for Wireless (MACAW)
**Carrier sense multiple access (CSMA)
**Carrier sense multiple access with collision detection (CSMA/CD)
**Carrier sense multiple access with collision avoidance (CSMA/CA)
***Distributed Coordination Function (DCF)
***Point Coordination Function (PCF)
**Carrier sense multiple access with collision avoidance and Resolution using Priorities (CSMA/CARP)
* Token passing:
**Token ring
**Token bus
* Polling
* Resource reservation (scheduled) packet-mode protocols:
** Dynamic Time Division Multiple Access (Dynamic TDMA)
** Packet reservation multiple access (PRMA)
** Reservation ALOHA (R-ALOHA)
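As an illustration of a contention-based method from the list above, the following sketch models slotted Aloha with `n` stations that each transmit independently with probability `p` in every slot. A slot is successful only if exactly one station transmits, which gives the throughput n·p·(1−p)^(n−1) per slot. The function names and the simple traffic model (every station always has a frame to send) are assumptions made for this example.

```python
import random


def analytic_throughput(n, p):
    """Probability that exactly one of n stations transmits in a slot."""
    return n * p * (1 - p) ** (n - 1)


def simulate_throughput(n, p, slots=100_000, seed=0):
    """Estimate the same quantity by simulating independent slots."""
    rng = random.Random(seed)  # seeded for reproducible runs
    successes = 0
    for _ in range(slots):
        transmitting = sum(rng.random() < p for _ in range(n))
        if transmitting == 1:  # two or more transmitters would collide
            successes += 1
    return successes / slots


if __name__ == "__main__":
    n = 10
    p = 1 / n  # p = 1/n maximizes the analytic expression
    print(analytic_throughput(n, p), simulate_throughput(n, p))
```

For large `n` and `p = 1/n` the throughput approaches 1/e, roughly 0.37 successful frames per slot, which is the well-known upper bound for slotted Aloha under this model.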
Duplexing methods
Where these methods are used for dividing forward and reverse communication channels, they are known as duplexing methods, such as:
*Time division duplex (TDD)
*Frequency division duplex (FDD)
Hybrid channel access scheme application examples
Note that hybrids of these techniques can be - and frequently are - used. Some examples:
* The GSM cellular system combines the use of frequency division duplex (FDD) to prevent interference between outward and return signals, with FDMA and TDMA to allow multiple handsets to work in a single cell.
* GSM with the GPRS packet switched service combines FDD and FDMA with slotted Aloha for reservation inquiries, and a Dynamic TDMA scheme for transferring the actual data.
* Bluetooth packet mode communication combines frequency hopping (for shared channel access among several personal area networks in the same room) with CSMA/CA (for shared channel access inside a medium).
* IEEE 802.11b wireless local area networks (WLANs) are based on FDMA and DS-CDMA for avoiding interference among adjacent WLAN cells or access points. This is combined with CSMA/CA for multiple access within the cell.
* HIPERLAN/2 wireless networks combine FDMA with dynamic TDMA, meaning that resource reservation is achieved by packet scheduling.
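The way a hybrid scheme such as GSM combines FDMA with TDMA can be sketched as a mapping from handsets to (carrier, time slot) pairs. The default of eight slots per carrier matches the GSM TDMA frame; the `assign_channels` function, the handset names, and the carrier labels are hypothetical, and real channel allocation also reserves slots for signalling.

```python
def assign_channels(handsets, carriers, slots_per_carrier=8):
    """Give each handset its own (carrier, time slot) pair within one cell.

    Carriers are filled one after another (FDMA dimension), and each
    carrier is divided into time slots (TDMA dimension).
    """
    capacity = len(carriers) * slots_per_carrier
    if len(handsets) > capacity:
        raise ValueError(f"cell capacity of {capacity} channels exceeded")
    return {
        handset: (carriers[index // slots_per_carrier], index % slots_per_carrier)
        for index, handset in enumerate(handsets)
    }


if __name__ == "__main__":
    phones = [f"handset-{i}" for i in range(9)]
    for phone, channel in assign_channels(phones, ["carrier-A", "carrier-B"]).items():
        print(phone, channel)
```

In this example the ninth handset spills over to the first slot of the second carrier, which shows how the two access methods multiply the number of simultaneous users per cell.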
Definition within certain application areas
Local and metropolitan area networks
In local area networks (LANs) and metropolitan area networks (MANs), multiple access methods enable bus networks, ring networks, hubbed networks, wireless networks and half duplex point-to-point communication, but are not required in full duplex point-to-point serial lines between network switches and routers, or in switched networks (logical star topology). The most common multiple access method is CSMA/CD, which is used in Ethernet. Although today's Ethernet installations typically are switched, CSMA/CD is utilized anyway to achieve compatibility with hubs.
Satellite communications
In satellite communications, multiple access is the capability of a communications satellite to function as a portion of a communications link between more than one pair of satellite terminals concurrently. Three types of multiple access presently used with communications satellites are code-division, frequency-division, and time-division multiple access.
Switching centers
In telecommunication switching centers, multiple access is the connection of a user to two or more switching centers by separate access lines using a single message routing indicator or telephone number.


