On Estimating Frequency Moments of Data Streams

Abstract.
Space-economical estimation of the pth frequency moment, defined as $F_p = \sum_{i=1}^{n} |f_i|^p$ for $p > 0$, is of interest in estimating all-pairs distances in a large data matrix [14], in machine learning, and in data stream computation. Random sketches formed by the inner product of the frequency vector $f_1, \ldots, f_n$ with a suitably chosen random vector were pioneered by Alon, Matias and Szegedy [1], and have since played a central role in estimating $F_p$ and in data stream computation in general. The concept of p-stable sketches, formed by the inner product of the frequency vector with a random vector whose components are drawn from a p-stable distribution, was proposed by Indyk [11] for estimating $F_p$, for $0 < p < 2$, and has been further studied by Li [13]. In this paper, we consider the problem of estimating $F_p$, for $0 < p < 2$. A disadvantage of the stable sketches technique and its variants is that they require $O(1/\epsilon^2)$ inner products of the frequency vector with dense vectors of stable (or nearly stable [14, 13]) random variables to be maintained. This means that each stream update can be quite time-consuming. We present algorithms for estimating $F_p$, for $0 < p < 2$, that do not require the use of stable sketches or their approximations. Our technique is elementary in nature, in that it uses simple randomization in conjunction with well-known summary structures for data streams, such as the COUNT-MIN sketch [7] and the COUNTSKETCH structure [5]. Our algorithms require space $\tilde{O}(1/\epsilon^{2+p})$ to estimate $F_p$ to within $1 \pm \epsilon$ factors and require expected time $O(\log F_1 \log \frac{1}{\delta})$ to process each update. Thus, our technique trades an $O(1/\epsilon^p)$ factor in space for much more efficient processing of stream updates. We also present a stand-alone iterative estimator for $F_1$.
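To make the quantities in the abstract concrete, here is a minimal Python sketch, not the paper's algorithm. It computes $F_p$ exactly from a stream of (item, delta) updates, which takes space linear in the number of distinct items. It also shows a basic Count-Min sketch of the kind the abstract names as a building block. The names `exact_fp` and `CountMinSketch`, the integer item identifiers, the hash family, and the width and depth values are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict


def exact_fp(updates, p):
    """Exact F_p = sum_i |f_i|^p from (item, delta) updates (linear space)."""
    freq = defaultdict(int)
    for item, delta in updates:
        freq[item] += delta
    return sum(abs(f) ** p for f in freq.values() if f != 0)


class CountMinSketch:
    """Basic Count-Min sketch over integer items (illustrative parameters)."""

    def __init__(self, width, depth, seed=0):
        rng = random.Random(seed)
        self.width = width
        self.prime = 2**61 - 1  # modulus for the hash family (an assumption)
        # One (a, b) pair per row: h(x) = ((a*x + b) mod prime) mod width
        self.hashes = [
            (rng.randrange(1, self.prime), rng.randrange(self.prime))
            for _ in range(depth)
        ]
        self.table = [[0] * width for _ in range(depth)]

    def _bucket(self, row, item):
        a, b = self.hashes[row]
        return ((a * item + b) % self.prime) % self.width

    def update(self, item, delta=1):
        # Each update touches one counter per row.
        for row in range(len(self.table)):
            self.table[row][self._bucket(row, item)] += delta

    def estimate(self, item):
        # With non-negative updates this never underestimates the frequency.
        return min(
            self.table[row][self._bucket(row, item)]
            for row in range(len(self.table))
        )


stream = [(1, 3), (2, 1), (1, 2), (3, 4)]  # frequencies: f_1=5, f_2=1, f_3=4
cms = CountMinSketch(width=64, depth=4)
for item, delta in stream:
    cms.update(item, delta)

print(exact_fp(stream, 1))    # F_1, the sum of the frequencies
print(exact_fp(stream, 0.5))  # a fractional moment with 0 < p < 2
print(cms.estimate(1))        # upper bound on f_1
```

Each update touches only one counter per row, in contrast to the dense inner products that the abstract attributes to stable sketches.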
1 Introduction
Recently, there has been an emergence of monitoring applications in diverse areas including network traffic monitoring, network topology monitoring, sensor networks, financial market monitoring, and web-log monitoring. In these applications, data is generated rapidly and continuously, and must be analyzed efficiently, in real time and in a single pass over the data, to identify large trends, anomalies, user-defined exception conditions, and so on. Many of these applications need to continuously track the “big picture”, or an aggregate view of the data, as opposed to a detailed view. In such scenarios, efficient approximate computation is often acceptable.

