A Dual Framework and Algorithms for Targeted Online Data Delivery
Online Data Delivery.pdf (Size: 2.56 MB / Downloads: 1)
Abstract
A variety of emerging online data delivery applications challenge existing techniques for data delivery to human users,
applications, or middleware that are accessing data from multiple autonomous servers. In this paper, we develop a framework for
formalizing and comparing pull-based solutions and present dual optimization approaches. The first approach, most commonly used
nowadays, maximizes user utility under the strict setting of meeting a priori constraints on the usage of system resources.
INTRODUCTION
THE diversity of data sources and Web services currently
available on the Internet and the computational Grid, as
well as the diversity of clients and application requirements,
poses significant infrastructure challenges. In this paper, we
address the task of targeted data delivery. Users may have
specific requirements for data delivery, e.g., how frequently
or under what conditions they wish to be alerted about
update events or update values, or their tolerance to delays
or stale information.
DUAL FRAMEWORK FOR TARGETED DATA
DELIVERY
In what follows, let R ¼ fr1; r2; . . . ; rng be a set of resources;
let T be an epoch; and let fT1; T2; . . . ; TNg be a set of
equidistant chronons1 in T . A schedule S ¼ fsi;jgi¼1...n;j¼1...N
is a set of binary decision variables, set to 1 if resource ri is
probed at time Tj, and 0 otherwise. For example, Fig. 1
illustrates a possible schedule S as a matrix with binary
values, where rows represent resources and columns are
chronons. Thus, for example, at chronon T3, the illustrated
schedule assigns probes to resources r2 and r4. We further
denote by S the set of all possible schedules. Next, we
define the dual approaches for targeted data delivery,
namely, OptMon1 and OptMon2.
MODEL FOR TARGETED DATA DELIVERY
The centerpiece of our model is the notion of execution
intervals, a simple modeling tool for representing dynamically
changing client needs. We discuss user profiles, server
notifications, and monitoring. We also discuss how execution
intervals are generated from user profiles. We then turn
our attention to the formal definition of a schedule and the
utility of probing.
Execution Intervals and Monitoring
Once an event, specified in the trigger part of the
notification rule, occurs, the trigger condition is immediately
evaluated and if it is true, the notification is said to be
executable. The period in which a notification rule is
executable was referred to in the literature as life [24]. We
emphasize here the difference between the executable
period of a notification (life) and the period in which rules,
in general, can be evaluated (epoch). Two examples of life
we shall use in this paper are life ¼ overwrite, in which an
update is available for monitoring only until the next
update to the same resource occurs.
CONCLUSIONS
In this work, we focused on pull-based data delivery that
supports user profile diversity. Minimizing the number of
probes to sources is important for pull-based applications to
conserve resources and improve scalability. Solutions that
can adapt to changes in source behavior are also important
due to the difficulty of predicting when updates occur. In this
paper, we have addressed these challenges through the use
of a new formalism of a dual optimization problem
(OptMon2), reversing the roles of user utility and system
resources. This revised specification leads naturally to a
surprisingly simple, yet powerful algorithm (SUP) which
satisfies user specifications while minimizing system resource
consumption.