Processing Set Expressions over Continuous Update Streams
#1

ABSTRACT
There is growing interest in algorithms for processing and querying continuous data streams (i.e., data that is seen only once in a _xed order) with limited memory resources. In its most general form, a data stream is actually an update stream, i.e., comprising data-item deletions as well as inser- tions. Such massive update streams arise naturally in several application domains (e.g., monitoring of large IP network installations, or processing of retail-chain transactions). Estimating the cardinality of set expressions de_ned over several (perhaps, distributed) update streams is perhaps one of the most fundamental query classes of interest; as an ex- ample, such a query may ask \what is the number of distinct IP source addresses seen in passing packets from both router R1 and R2 but not router R3?". Earlier work has only ad- dressed very restricted forms of this problem, focusing solely on the special case of insert-only streams and speci_c op- erators (e.g., union). In this paper, we propose the _rst space-e_cient algorithmic solution for estimating the car- dinality of full-edged set expressions over general update streams. Our estimation algorithms are probabilistic in na- ture and rely on a novel, hash-based synopsis data structure, termed \2-level hash sketch". We demonstrate how our 2- level hash sketch synopses can be used to provide low-error, high-con_dence estimates for the cardinality of set expres- sions (including operators such as set union, intersection, and di_erence) over continuous update streams, using only small space and small processing time per update. Further- more, our estimators never require rescanning or resampling of past stream items, regardless of the number of deletions in the stream. We also present lower bounds for the problem, demonstrating that the space usage of our estimation algo- rithms is within small factors of the optimal. Preliminary experimental results verify the e_ectiveness of our approach.
1. INTRODUCTION
Query-processing algorithms for conventional Database Management Systems (DBMS) rely on (possibly) several passes over a collection of static data sets in order to produce an accurate answer to a user query. For several emerging ap- plication domains, however, updates to the data arrive on a continuous basis, and the query processor needs to be able to produce answers to user queries based solely on the observed stream of data and without the bene_t of several passes over a static data image. As a result, there has been a urry of recent work on designing e_ective query-processing algo- rithms that work over continuous data streams to produce results online while guaranteeing (1) small memory foot- prints, and (2) low processing times per stream item [1, 10, 13, 15]. Such algorithms typically rely on summarizing the data stream(s) involved in concise synopses that can be used to provide approximate answers to user queries along with some reasonable guarantees on the quality of the approxi- mation. In their most general form, real-life data streams are ac- tually update streams; that is, the stream is a sequence of updates to data items, comprising data-item deletions as well as insertions 1. Such continuous update streams arise naturally, for example, in the network installations of large Internet service providers, where detailed usage information (SNMP/RMON packet-ow data, active VPN circuits, etc.) from di_erent parts of the underlying network needs to be continuously collected and analyzed for interesting trends. Other application domains giving rise to continuous and massive update streams include retail-chain transaction pro- cessing (e.g., purchase and sale records), ATM and credit- card operations, logging Web-server usage records, and so on. The processing of such streams follows, in general, a distributed model where each stream (or, part of a stream) is observed and summarized by its respective party (e.g., the element-management system of an individual IP router) and the resulting synopses are then collected (e.g., periodically) at a central site, where queries over the entire collection of streams can be processed [15]. This model is used, for exam- ple, in Lucent's Interprenet and Cisco's NetFlow products for IP network monitoring. Clearly, there are several forms of queries that users or applications may wish to pose (online) over such continuous update streams; examples include joins or multi-joins [1, 10], norm computations [2, 19], or quantile estimation [16]. Perhaps one of the most fundamental queries of interest is estimating the result cardinalities of set expressions de_ned


Download full report
http://googleurl?sa=t&source=web&cd=1&ve...gmod03.pdf&ei=e8BETuPUF4iJrAef5vH5Aw&usg=AFQjCNERW4_YhkyS1L-uiB1JpGAHuXL4eA&sig2=Tw5WOhhccq0DuYJevV3GRw
Reply

Important Note..!

If you are not satisfied with above reply ,..Please

ASK HERE

So that we will collect data for you and will made reply to the request....OR try below "QUICK REPLY" box to add a reply to this page
Popular Searches: nexi robot with facial expressions pdf, gsm update, update results, web page update notification, smart server update mechanism pdf, karmasanghatan paper last update, web page update monitoring program,

[-]
Quick Reply
Message
Type your reply to this message here.

Image Verification
Please enter the text contained within the image into the text box below it. This process is used to prevent automated spam bots.
Image Verification
(case insensitive)

Possibly Related Threads...
Thread Author Replies Views Last Post
  Digital signal processing (DSP) smart paper boy 2 2,881 22-12-2018, 02:50 AM
Last Post:
  image processing projects electronics seminars 7 32,503 05-09-2015, 11:47 AM
Last Post: seminar report asees
  Brain Tumour Detection Using Water shedding and basic Image Processing Techniques smart paper boy 2 3,078 01-08-2015, 02:53 PM
Last Post: seminar report asees
  SECURE ATM BY IMAGE PROCESSING seminar class 6 9,903 06-04-2014, 05:49 PM
Last Post: Guest
  Space Time Adaptive Processing smart paper boy 2 1,840 29-03-2014, 10:38 PM
Last Post: ramsaini
  over-under voltage cut-off with ON-Time delay PROJECT REPORT project topics 3 5,369 01-12-2012, 12:23 PM
Last Post: seminar details
  Over / Under Line Voltage Protection for Electrical Appliances smart paper boy 1 2,433 12-10-2012, 01:05 PM
Last Post: seminar details
  OVER/UNDER VOLTAGE PROTECTION OF ELECTRICAL APPLIANCES smart paper boy 1 2,678 12-10-2012, 01:05 PM
Last Post: seminar details
  LOCATION BASED SPATIAL QUERY PROCESSING IN WIRELESS BROADCAST ENVIRONMENTS computer science crazy 4 3,737 15-03-2012, 11:07 AM
Last Post: seminar paper
  Image Processing & Communications Project ideas list computer science crazy 1 1,786 20-02-2012, 10:39 PM
Last Post: jayeshrane

Forum Jump: