COMPUTER NETWORKS RESEARCH LAB

Department of Electrical and Computer Engineering, McGill University

Project Descriptions

Estimating Original Flow Characteristics from Sampled Traffic Statistics

Student: Tarem Ahmed, Ph.D. Student
Supervisor: Prof. Mark Coates
Research problem:
In an AAPN (Agile All-Photonic Network), routers cannot keep detailed statistics for every packet that traverses them. Sampling, and estimation of the original flow characteristics from the sampled statistics, is therefore required to:

analyze utilization of router processing power
determine need for web proxies
perform load balancing
make dynamic routing decisions
reserve resources for new flows.

When a new flow arrives at an Edge Router in the AAPN (say, at Node A, destined for Node B), the router must decide which Core Router (Node P or Node Q) to route it through.

Once a new flow arrives, the router can predict its length from the estimated distribution of flow lengths, here assumed to be approximately normal by appeal to the Central Limit Theorem. The router can then use this prediction to reserve time slots in future frames.
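A minimal sketch of one way such a prediction could be made, assuming the router keeps a history of completed flow lengths. It uses the empirical distribution directly rather than a normal fit, and the function name `predict_length` and the sample history are hypothetical, not part of the project:

```python
def predict_length(observed_lengths, packets_seen):
    """Predict a flow's total length as the mean length of past flows
    that lasted at least `packets_seen` packets (empirical conditional mean).
    Hypothetical helper for illustration only."""
    candidates = [x for x in observed_lengths if x >= packets_seen]
    if not candidates:
        # No past flow was this long; fall back to what has been seen so far.
        return float(packets_seen)
    return sum(candidates) / len(candidates)

# Hypothetical history of completed flow lengths, in packets.
history = [1, 2, 2, 3, 5, 8, 40, 100]

at_arrival = predict_length(history, 1)    # mean over all past flows
after_ten = predict_length(history, 10)    # only flows of length >= 10 count
print(at_arrival, after_ten)
```

The prediction grows as more packets of the flow are observed, because short flows drop out of the conditioning set.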
Immediate Research Objectives:
The immediate goals are to estimate from a sample:

the number of distinct flows
the number of packets in each flow (i.e. flow length)
the number of bytes in these packets (i.e. packet length).

It is tempting to:

sample every Nth packet
attribute an original length of N×L to a flow whose sampled length is L

However, this simple approach has some problems:

some smaller flows are not sampled at all
larger flows are more likely to be sampled
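Both problems can be seen in a small simulation. The following is a sketch under stated assumptions: the trace is synthetic (200 two-packet flows and one 1000-packet flow, interleaved at random), and the helper names `periodic_sample` and `naive_flow_lengths` are hypothetical.

```python
import random
from collections import Counter

def periodic_sample(packets, n):
    """Keep every n-th packet in arrival order (the n-th, 2n-th, ...)."""
    return packets[n - 1::n]

def naive_flow_lengths(sampled, n):
    """Naive estimate: a flow with sampled length L gets original length n * L."""
    counts = Counter(sampled)
    return {flow: n * length for flow, length in counts.items()}

# Synthetic trace (assumption): each packet is represented by its flow label.
rng = random.Random(0)
true_lengths = {f"short-{i}": 2 for i in range(200)}
true_lengths["long"] = 1000
packets = [flow for flow, length in true_lengths.items() for _ in range(length)]
rng.shuffle(packets)

n = 10
estimates = naive_flow_lengths(periodic_sample(packets, n), n)
missed = [flow for flow in true_lengths if flow not in estimates]

print("true flows:     ", len(true_lengths))
print("flows in sample:", len(estimates))
print("flows missed:   ", len(missed))
```

Only 140 packets are sampled from 201 flows, so at least 61 flows are never seen. Every short flow that is sampled is estimated at 10 packets or more, although its true length is 2.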

Various sampling strategies can be chosen:

Periodic sampling of every Nth packet
  introduces sampling correlations
  biases against selection of multiple closely-spaced packets
Independent sampling with probability p=1/N
Multirate sampling: vary the sampling rate over time
  start with a high value of p, and then gradually reduce it
  so as not to lose information contained in smaller flows.
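The three strategies can be sketched as functions that return the indices of the selected packets. This is an illustration only: the function names are hypothetical, and the linear decay schedule used for multirate sampling is an assumption, since no particular schedule is specified here.

```python
import random

def periodic(num_packets, n):
    """Periodic sampling: indices of every n-th packet."""
    return list(range(n - 1, num_packets, n))

def independent(num_packets, n, rng):
    """Independent sampling: keep each packet with probability p = 1/n."""
    p = 1.0 / n
    return [i for i in range(num_packets) if rng.random() < p]

def multirate(num_packets, p_start, p_end, rng):
    """Multirate sampling: p moves linearly from p_start to p_end
    over the trace (assumed schedule, for illustration)."""
    kept = []
    for i in range(num_packets):
        frac = i / max(num_packets - 1, 1)
        p = p_start + (p_end - p_start) * frac
        if rng.random() < p:
            kept.append(i)
    return kept

rng = random.Random(1)
print(periodic(50, 10))                    # gaps between picks are always 10
print(independent(50, 10, rng))            # gaps vary; adjacent picks are possible
print(multirate(50, 0.5, 0.05, rng))       # picks tend to cluster early in the trace
```

Periodic sampling with N > 1 can never select two adjacent packets, which is the bias against closely-spaced packets noted above. Independent sampling has no such constraint.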

References:
"Estimating Flow Distributions from Sampled Flow Statistics", Duffield, Lund & Thorup, SIGCOMM'03, August 25-29, 2003, Karlsruhe, Germany
http://pma.nlanr.net/Traces