24-01-2011, 05:12 PM
Dheeraj Bhardwaj
Department of Computer Science and Engineering
Indian Institute of Technology, Delhi
Outline
The technology landscape
Grid computing
The Globus Toolkit
Applications and technologies
Data-intensive; distributed computing; collaborative; remote access to facilities
Grid infrastructure
Open Grid Services Architecture
Global Grid Forum
Summary and conclusions
Living in an Exponential World: (2) Storage
Storage density doubles every 12 months
Dramatic growth in online data (1 petabyte = 1,000 terabytes = 1,000,000 gigabytes)
2000 ~0.5 petabyte
2005 ~10 petabytes
2010 ~100 petabytes
2015 ~1000 petabytes?
Transforming entire disciplines in physical and, increasingly, biological sciences; humanities next?
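As a rough check on the growth figures above, the doubling time implied by each pair of estimates can be worked out directly. This is only a sketch: it assumes smooth exponential growth between the quoted years, and the function name `doubling_time_years` is illustrative rather than anything from the slides. Note that the slide's 12-month figure refers to storage density, while the petabyte estimates refer to the volume of online data, so the two need not match.

```python
import math

# Online-data estimates quoted in the slide (year -> petabytes).
# The 2015 value is marked with a "?" in the slide and is speculative.
estimates_pb = {2000: 0.5, 2005: 10.0, 2010: 100.0, 2015: 1000.0}


def doubling_time_years(year_a, size_a, year_b, size_b):
    """Doubling time implied by exponential growth between two data points."""
    return (year_b - year_a) * math.log(2) / math.log(size_b / size_a)


years = sorted(estimates_pb)
implied = {}
for start, end in zip(years, years[1:]):
    # Store the implied doubling time (in years) for each consecutive interval.
    implied[(start, end)] = doubling_time_years(
        start, estimates_pb[start], end, estimates_pb[end]
    )
    months = implied[(start, end)] * 12
    print(f"{start}-{end}: data volume doubles roughly every {months:.1f} months")
```

Under these assumptions the quoted estimates imply a doubling time of about 14 months for 2000-2005 and about 18 months for the later intervals, which is somewhat slower than the 12-month density figure.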
Data Intensive Physical Sciences
High energy & nuclear physics
Including new experiments at CERN
Gravitational wave searches
LIGO, GEO, VIRGO
Time-dependent 3-D systems (simulation, data)
Earth Observation, climate modeling
Geophysics, earthquake modeling
Fluids, aerodynamic design
Pollutant dispersal scenarios
Astronomy: Digital sky surveys
Ongoing Astronomical Mega-Surveys
Large number of new surveys
Multi-TB in size, 100M objects or larger
In databases
Individual archives planned and under way
Multi-wavelength view of the sky
> 13 wavelength coverage within 5 years
Impressive early discoveries
Finding exotic objects by unusual colors
L and T dwarfs, high-redshift quasars
Finding objects by time variability
Gravitational micro-lensing
Coming Floods of Astronomy Data
The planned Large Synoptic Survey Telescope will produce over 10 petabytes per year by 2008!
All-sky survey every few days, so will have fine-grain time series for the first time
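To put the quoted survey volume in perspective, the yearly figure can be converted into an average sustained data rate. This is a back-of-the-envelope sketch, assuming the decimal petabyte used earlier in the slides (1 PB = 10^15 bytes) and output averaged evenly over a 365-day year; the function name `sustained_rate_mb_per_s` is illustrative.

```python
# Decimal convention, matching the slide's 1 petabyte = 1,000 terabytes.
PETABYTE = 10**15  # bytes
MEGABYTE = 10**6   # bytes
SECONDS_PER_YEAR = 365 * 24 * 3600


def sustained_rate_mb_per_s(petabytes_per_year):
    """Average data rate in MB/s for a given yearly volume in petabytes."""
    return petabytes_per_year * PETABYTE / SECONDS_PER_YEAR / MEGABYTE


# Figure quoted in the slide: over 10 petabytes per year.
rate = sustained_rate_mb_per_s(10)
print(f"10 PB/year averages to about {rate:.0f} MB/s, around the clock")
```

Averaged over the whole year, 10 petabytes corresponds to a little over 300 MB/s; the rate during actual observing time would be higher, since a telescope does not collect data continuously.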
Data Intensive Biology and Medicine
Medical data
X-Ray, mammography data, etc. (many petabytes)
Digitizing patient records (ditto)
X-ray crystallography
Molecular genomics and related disciplines
Human Genome, other genome databases
Proteomics (protein structure, activities, …)
Protein interactions, drug delivery
Virtual Population Laboratory (proposed)
Simulate likely spread of disease outbreaks
Brain scans (3-D, time dependent)
Evolution of Business
Pre-Internet
Central corporate data processing facility
Business processes not compute-oriented
Post-Internet
Enterprise computing is highly distributed, heterogeneous, inter-enterprise (B2B)
Outsourcing becomes feasible => service providers of various sorts
Business processes increasingly computing- and data-rich
The Grid
“Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations”
The Grid Opportunity: eScience and eBusiness
Physicists worldwide pool resources for peta-op analyses of petabytes of data
Civil engineers collaborate to design, execute, & analyze shake table experiments
An insurance company mines data from partner hospitals for fraud detection
An application service provider offloads excess load to a compute cycle provider
An enterprise configures internal & external resources to support eBusiness workload
Challenging Technical Requirements
Dynamic formation and management of virtual organizations
Online negotiation of access to services: who, what, why, when, how
Establishment of applications and systems able to deliver multiple qualities of service
Autonomic management of infrastructure elements
Open Grid Services Architecture