Distributing Frequency-Dependent Data Stream Computations
#1

Abstract
Data stream computations in domains such as internet applications are often performed in a highly distributed fashion in order to save time. An example is the class of applications that use the Google Mapreduce framework of scalable distributed processing as presented by (Dean & Ghemawat 2004). A basic question here is: what kind of data stream computations admit scalable and efficient distributed algorithms? We show that the class of data stream computations that approximate functions of the frequency vector of the stream can be computed efficiently in a distributed manner.
1 Introduction
Modern distributed data-centric applications typically process massive amounts of data over a collection of nodes that are organized in the form of a tree. The applications include internet data processing and sensor networks, among others. An example is the class of applications that use the Google Mapreduce framework of scalable and flexible distributed processing as presented by (Dean & Ghemawat 2004). In this model, data resides at the hundreds or even thousands of nodes that form the leaf nodes of a depth one tree. Data at each of the nodes is locally reduced, and the reduced data is sent to the root site, which combines the locally reduced copies into a single copy and then computes an answer from it. In sensor networks, the sensors are organized in the form of a directed spanning tree and local data at nodes is reduced and combined up the tree. Many such practical algorithms are maximally flexible in the sense that all trees over the same set of leaf nodes may be used to compute the answer. There is another practical model of data processing, namely, data stream processing, that also processes massive amounts of data, but, in a sequential, online fashion. Typically, a low space summary of the data stream is constructed and maintained with respect to newly arriving data or updates. Although the data streaming model is sequential, it is an oftspoken virtue of data stream processing that typical stream summary structures are efficiently composable. By composability, we mean that there is a binary composition operation _ that takes instances of the summary structure that are maintained independently over distributed streams and combines them into a single instance whose state is the same as, or Copyright c 2009, Australian Computer Society, Inc. This paper appeared at the Fifteenth Computing: The Australasian Symposium (CATS 2009), Wellington, New Zealand. Conferences in Research and Practice in Information Technology (CRPIT), Vol. 94, Rod Downey and Prabhu Manyem, Ed. Reproduction for academic, not-for profit purposes permitted provided this text is included. is equivalent to, the state of the summary structure after processing the concatenation of the distributed streams at a single site. The property of composable summaries makes it viable to have data-parallel distributed streaming computation with substantial flexibility, as illustrated by the following example. Example 1. Consider a distributed computation consisting of k nodes such that the input at each node is an n-dimensional vector fi, i = 1, 2, . . . , k. The problem is to design an efficient distributed algorithm to compute an approximation to the `2 norm of the sum of the vectors f1 + . . . + fk. This is possible using the randomized linear sketch technique by (Alon et al. 1998) for approximating the `2 norm of a vector in the data stream model. Each site computes a random sketch of its input vector using the same random bits and sends it to a central site. The central site composes the sketches received from the sites, by simply adding them like vectors in the sketch space and then applies the `2-approximation function on the composed sketches. The composition operation is time-efficient, as well as associative and commutative. So any binary tree over the set of leaf nodes may be used to compute the answer, making the distributed computation highly flexible. The composability property of stream structures presents the possibility that sequential data stream execution can be molded into data-parallel execution over distributed streams. This was explored in the mud model (acronym for Massive Unordered Distributed algorithms) presented by (Feldman et al. 2008). We first outline the data streaming model and then discuss the mud model and the results by (Feldman et al. 2008).


Download full report
http://googleurl?sa=t&source=web&cd=1&ve...cats09.pdf&ei=NK1ETrz1GIPqrQexloT9Aw&usg=AFQjCNHOzCZZPYxrDsAqZnIWhe-HPn_DLQ&sig2=K4sJ0QjCXrWsBjm5G0tSvg
Reply

Important Note..!

If you are not satisfied with above reply ,..Please

ASK HERE

So that we will collect data for you and will made reply to the request....OR try below "QUICK REPLY" box to add a reply to this page
Popular Searches: daynamic load balancing distributing multiple nodes with exploiting node pdf abstract, iv computations formula, sturtevant cartercar and lambert featured friction dependent cvts puttr 1991, advantages of light dependent resistor, distributing computing ppt, cts distributing, uab dean of,

[-]
Quick Reply
Message
Type your reply to this message here.

Image Verification
Please enter the text contained within the image into the text box below it. This process is used to prevent automated spam bots.
Image Verification
(case insensitive)

Possibly Related Threads...
Thread Author Replies Views Last Post
  ROBUST DWT-SVD DOMAIN IMAGE WATERMARKING: EMBEDDING DATA IN ALL FREQUENCIES computer science crazy 2 5,199 19-06-2018, 06:10 PM
Last Post: KavyaIyengar
  CONTENT DEPENDENT WATER MARKING SCHEME FOR SPEECH SIGNAL seminar class 3 2,393 04-05-2015, 03:15 PM
Last Post: seminar report asees
  ELECTROSTATIC MICRO POWER GENERATOR FROM LOW FREQUENCY VIBRATION SUCH AS HUMA seminar surveyer 2 3,936 14-09-2013, 09:53 AM
Last Post: computer topic
  wireless-data-communication-infrared-led seminar class 4 3,308 31-07-2013, 10:16 AM
Last Post: computer topic
  Secured Data Transmission through Network seminar surveyer 2 2,310 26-04-2013, 02:02 PM
Last Post: computer topic
  Wirelesss Data Encryptiion and Decryption using RF Communication project topics 17 11,410 03-02-2013, 10:30 PM
Last Post: mohanece401
  IMAGE ENHANCEMENT TECHNIQUES USING FREQUENCY DOMAIN FILTERING project report tiger 1 6,263 18-01-2013, 04:45 PM
Last Post: Guest
  Enhancing Data Migration Performance via Parallel Data Compression seminar class 2 1,582 29-11-2012, 02:18 PM
Last Post: seminar details
  Radio Frequency based Real time Child monitoring and alarm system smart paper boy 1 2,479 26-11-2012, 12:55 PM
Last Post: seminar details
  Patient Monitoring System and Data Acquisition Through GSM seminar class 1 2,560 24-02-2012, 01:16 PM
Last Post: seminar paper

Forum Jump: