03-03-2011, 03:48 PM
presented by:
Tran, Van Hoai
Introduction to DISTRIBUTED COMPUTING
• Tran, Van Hoai
• Department of Systems & Networking
• Faculty of Computer Science & Engineering
• HCMC University of Technology
Outline
• Why is distributed computing needed?
– performed by distributed systems
• Examples
• Definitions
• Goals of building distributed systems
Why are distributed systems needed? (1)
• Functional distribution: computers have different functional capabilities
– Client/server
– Host/terminal
– Data gathering/data processing
– sharing of resources with specific functionalities
• Inherent distribution: stemming from application domain, e.g.,
– cash register and inventory systems for supermarket chains
– computer supported collaborative work
Why are distributed systems needed? (2)
• Load distribution/balancing: assign tasks to computers such that overall performance is optimized
• Replication of processing power: independent computers working on the same task
– collection of microcomputers may have processing power that no supercomputer will ever achieve
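As an illustration of load distribution, here is a minimal sketch of one common heuristic: give each task to the computer that currently has the least work. The function name `assign_tasks` and the task costs are assumptions made for this example and are not part of the slides; real systems would also account for communication cost and changing load.

```python
import heapq

def assign_tasks(task_costs, num_computers):
    """Greedy load balancing: each task goes to the least-loaded computer."""
    # Heap entries are (current_load, computer_index).
    heap = [(0, i) for i in range(num_computers)]
    heapq.heapify(heap)
    assignment = {i: [] for i in range(num_computers)}
    # Placing the largest tasks first usually gives a more even result.
    for cost in sorted(task_costs, reverse=True):
        load, idx = heapq.heappop(heap)
        assignment[idx].append(cost)
        heapq.heappush(heap, (load + cost, idx))
    return assignment

if __name__ == "__main__":
    plan = assign_tasks([7, 5, 4, 3, 2, 2, 1], num_computers=3)
    for computer, tasks in plan.items():
        print(computer, tasks, sum(tasks))
```

The heuristic does not guarantee an optimal assignment, but it keeps the heaviest computer close to the average load in typical cases.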
Why are distributed systems needed? (3)
• Physical separation: relying on the fact that computers are physically separated (e.g., to satisfy reliability requirements)
• Economics: collections of microprocessors offer a better price/performance ratio than large mainframes
– mainframes: 10 times faster, 1000 times as expensive
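The economics argument can be checked with a short calculation. The sketch below uses only the relative figures from the slide (10 times faster, 1000 times as expensive) and treats a single microcomputer as the baseline; the variable and function names are chosen for this example.

```python
# Relative figures from the slide; one microcomputer is the baseline (1, 1).
micro_speed, micro_cost = 1, 1
mainframe_speed, mainframe_cost = 10, 1000

def cost_per_unit_speed(speed, cost):
    """Price paid for one unit of processing speed."""
    return cost / speed

# How much more the mainframe costs per unit of speed than the microcomputer.
ratio = (cost_per_unit_speed(mainframe_speed, mainframe_cost)
         / cost_per_unit_speed(micro_speed, micro_cost))
print(ratio)  # 100.0
```

Under these figures the mainframe is 100 times more expensive per unit of speed. The comparison assumes the workload can actually be spread over many small machines, which is not true of every application.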
Examples (1)
• Network of workstations
– all files accessible from all machines in the same way and using the same path name
– system looks for the best place to execute a command
– distributed system
• Workflow information system: automatic order processing
– people from several departments at different locations
– users are unaware of how an order is processed
– distributed system
• World Wide Web: offering uniform model of distributed documents
– in theory, no need to know where the document is fetched
– in practice, users have to be aware of the location
Definitions (1)
“A system in which hardware or software located at networked computers communicate and coordinate their actions only by message passing”.
[Coulouris]
“A system that consists of a collection of two or more independent computers which coordinate their processing through exchange of synchronous or asynchronous message passi
“A distributed system is a collection of independent computers that appear to the users of the system as a single computer”.
[Tanenbaum]
“A distributed system is a collection of autonomous computers linked by a network with software designed to produce an integrated computing facility”.
• There are several autonomous computational entities, each of which has its own local memory.
[Andrews et al]
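The definitions share two ideas: each computer has its own local memory, and coordination happens only by message passing. A minimal sketch of that model follows, using a thread and two queues as stand-ins for a separate computer and a network link. The names `worker` and `run_demo` and the "stop" message are assumptions made for this illustration.

```python
import queue
import threading

def worker(inbox, outbox):
    """A stand-in for one computer: private state, reachable only by messages."""
    local_total = 0  # local memory, never accessed directly by anyone else
    while True:
        msg = inbox.get()
        if msg == "stop":
            # Report the result as a message instead of exposing the variable.
            outbox.put(("total", local_total))
            return
        local_total += msg

def run_demo():
    to_worker, from_worker = queue.Queue(), queue.Queue()
    t = threading.Thread(target=worker, args=(to_worker, from_worker))
    t.start()
    for value in (1, 2, 3):
        to_worker.put(value)      # asynchronous send: the sender does not wait
    to_worker.put("stop")
    reply = from_worker.get()     # synchronous receive: blocks until the reply arrives
    t.join()
    return reply

if __name__ == "__main__":
    print(run_demo())
```

In a real distributed system the queues would be replaced by network connections, and messages could be delayed or lost.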
Computer networks vs. distributed systems
• Computer network: autonomous computers are explicitly visible (have to be explicitly addressed)
• Distributed system: existence of multiple computers is transparent
• However,
– many problems in common
– in some sense networks (or parts of them, e.g. name services) are also distributed systems
– normally, every distributed system relies on services provided by a computer network
Which examples are distributed systems?
• Network of workstations
– distributed system
• Workflow information system: automatic order processing
– distributed system
• World Wide Web
– not fully qualified as a distributed system (Tanenbaum)
– distributed system (Coulouris)
Middleware service
• To guarantee
– support for heterogeneous computers
– a single view for users
Goals of building a distributed system (1)
• Connecting users and resources
– sharing resources
– easier to collaborate and exchange information
– disadvantages: security (intrusion), privacy violation (communication tracking)
• Transparency
• Openness
– Offering services according to standard rules that describe syntax and semantics of those services
• syntax specification: in interface definition language
• semantic specification: in natural language
– Interoperability and portability
– Flexibility: using different components from different developers
• Scalability
– Measured in three dimensions
• size: more users and resources can be added easily
• geography: users and resources may lie far apart
• administration: still easy to manage even spanning many independent administrative organizations
– Some problems must be solved
• size: centralization
– centralized service: single server for all users
– centralized data: single online telephone book
– centralized algorithm: routing based on complete information
• geography: synchronous and unreliable communication
– some systems are designed only for LANs (blocking communication depends strongly on quick responses)
• administration: conflicting policies w.r.t. resource usage, management, security
Scaling techniques
• Asynchronous communication
• Distribution
• Replication, caching
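To illustrate the first technique, the sketch below issues several requests without blocking on each reply in turn, so the total waiting time is roughly that of the slowest reply rather than the sum of all of them. The `fetch` coroutine only simulates a remote call with `asyncio.sleep`; the server names and the delay are assumptions made for this example.

```python
import asyncio

async def fetch(server, delay):
    # Stand-in for a remote request; asyncio.sleep simulates network latency.
    await asyncio.sleep(delay)
    return f"reply from {server}"

async def ask_all(servers, delay=0.01):
    # Start all requests first, then wait for the replies together,
    # instead of blocking on each reply before sending the next request.
    return await asyncio.gather(*(fetch(s, delay) for s in servers))

def run():
    return asyncio.run(ask_all(["a", "b", "c"]))

if __name__ == "__main__":
    print(run())
```

`asyncio.gather` returns the replies in the order the requests were listed, regardless of which one finished first.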
Typical properties
• The system has to tolerate failures in individual computers
• The structure of the system (network topology, network latency, number of computers) is not known in advance
• Each computer has only a limited, incomplete view of the system
Architectures
• Client-server:
– permanent data on server
• 3-tier architecture:
– stateless client
– N-tier: web applications
• Tightly-coupled (clustered):
– NOW, cluster of machines
• Peer-to-peer
– Grid computing (VO level)
• Space-based
– virtualization as one single address-space
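As a small illustration of the client-server entry in this list, the sketch below keeps the data on the server and lets a client ask for it over a TCP connection on the local machine. The one-line text protocol, the `STORE` dictionary, and the names `Handler`, `lookup`, and `demo` are assumptions made for this example only.

```python
import socket
import socketserver
import threading

STORE = {"greeting": "hello"}  # the permanent data lives on the server side

class Handler(socketserver.StreamRequestHandler):
    def handle(self):
        # Protocol (hypothetical): the client sends one key per line,
        # the server answers with the stored value or an empty line.
        key = self.rfile.readline().decode().strip()
        self.wfile.write((STORE.get(key, "") + "\n").encode())

def lookup(address, key):
    """Client side: holds no data of its own, it only asks the server."""
    with socket.create_connection(address) as conn:
        conn.sendall((key + "\n").encode())
        return conn.makefile().readline().strip()

def demo():
    # Port 0 lets the operating system pick a free port.
    with socketserver.TCPServer(("127.0.0.1", 0), Handler) as server:
        threading.Thread(target=server.serve_forever, daemon=True).start()
        try:
            return lookup(server.server_address, "greeting")
        finally:
            server.shutdown()

if __name__ == "__main__":
    print(demo())
```

Because the client keeps no state between requests, any number of clients can use the same server, which then becomes the single point that has to scale.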