COMPUTER NETWORKS RESEARCH LAB

TSP Lab

Department of Electrical and Computer Engineering, McGill University

Estimating Original Flow Characteristics from Sampled Traffic Statistics
 

Student: Tarem Ahmed, Ph.D. student
Supervisor: Prof. Mark Coates

Research problem:
In an AAPN, it is impossible for a router to keep detailed statistics on every packet that traverses it. Packets must therefore be sampled, and the characteristics of the original flows estimated from the sampled statistics, in order to:

  • analyze utilization of router processing power
  • determine need for web proxies
  • perform load balancing
  • make dynamic routing decisions
  • reserve resources for new flows.

When a new flow arrives at an Edge Router in the AAPN (say at Node A, destined for Node B), the router must decide which Core Router (Node P or Node Q) to route it through.

Once a new flow arrives, the router can predict its length from the estimated distribution of flow lengths. Note that the Central Limit Theorem justifies a normal approximation only for averages taken over many flows, not for the length of a single flow, so the prediction should be drawn from the estimated distribution itself rather than from an assumed normal shape. The router can then use the predicted length to reserve time slots in future frames.
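
As a rough illustration of predicting a flow's length from a flow-length distribution, the sketch below returns the mean length and a high quantile of an estimated distribution. The function name, the input format (a mapping from flow length to estimated flow count) and the use of a quantile as a conservative reservation size are assumptions made for this example, not part of the project's method.

```python
def predict_flow_length(length_counts, quantile=0.9):
    """Return (mean length, quantile length) of a flow-length distribution.

    length_counts maps a flow length in packets to the estimated number
    of flows of that length (hypothetical input format).
    """
    total = sum(length_counts.values())
    if total <= 0:
        raise ValueError("distribution must contain at least one flow")

    # Mean flow length, weighted by how many flows have each length.
    mean = sum(length * count for length, count in length_counts.items()) / total

    # Smallest length whose cumulative share of flows reaches the quantile.
    cumulative = 0
    for length in sorted(length_counts):
        cumulative += length_counts[length]
        if cumulative >= quantile * total:
            return mean, length
    return mean, max(length_counts)


# Illustrative numbers only: many short flows and a few long ones.
example_distribution = {1: 50, 10: 30, 100: 20}
mean_length, reserve_length = predict_flow_length(example_distribution)
```

Reserving for the quantile rather than the mean is one possible design choice: with many short flows and a few long ones, the mean alone can badly under-provision the long flows.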

Immediate Research Objectives:
The immediate goals are to estimate from a sample:

  • the number of distinct flows
  • the number of packets in each flow (i.e., flow length)
  • the number of bytes in each packet (i.e., packet length).
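
The raw inputs to these estimates are the same three quantities as observed in the sample. A minimal sketch of tallying them is given below; the packet representation (a flow key paired with a size in bytes) and the string flow keys are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def summarize_sample(sampled_packets):
    """Tally sampled packets per flow.

    sampled_packets is an iterable of (flow_key, size_bytes) pairs
    (hypothetical format). Returns {flow_key: (packet_count, byte_count)}.
    """
    packet_counts = defaultdict(int)
    byte_counts = defaultdict(int)
    for flow_key, size_bytes in sampled_packets:
        packet_counts[flow_key] += 1
        byte_counts[flow_key] += size_bytes
    return {key: (packet_counts[key], byte_counts[key]) for key in packet_counts}


# Illustrative sample: two packets from one flow, one from another.
sample = [("A->B:80", 1500), ("A->B:80", 40), ("A->C:22", 64)]
summary = summarize_sample(sample)
distinct_sampled_flows = len(summary)
```

These are sampled counts only; recovering the original number of flows and their lengths from them is the estimation problem described here.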

It is tempting to:

  • sample every Nth packet
  • attribute an original length of N × L packets to a flow whose sampled length is L

However, this simple approach has some problems:

  • some smaller flows are not sampled at all
  • larger flows are more likely to be sampled
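
Both problems can be quantified under a simplifying assumption that is not part of the scheme above: each packet is sampled independently with probability 1/N instead of strictly every Nth packet. A flow of L packets is then missed entirely with probability (1 - 1/N)^L, and the N × L rule, averaged only over the flows that appear in the sample, returns L / (1 - (1 - 1/N)^L). The sketch below evaluates both expressions; the function names are hypothetical.

```python
def miss_probability(length, n):
    """Probability that a flow of `length` packets has no sampled packet.

    Assumes independent sampling of each packet with probability 1/n
    (a simplification of periodic 1-in-n sampling).
    """
    return (1.0 - 1.0 / n) ** length


def mean_scaled_estimate_if_seen(length, n):
    """Average of n * (sampled length) over flows sampled at least once.

    The sampled length is Binomial(length, 1/n) with mean length / n, so
    conditioning on at least one sampled packet divides the mean of the
    scaled estimate by the probability of being seen.
    """
    return length / (1.0 - miss_probability(length, n))


# Compare a one-packet flow with a 1000-packet flow at N = 100.
short_missed = miss_probability(1, 100)
long_missed = miss_probability(1000, 100)
short_estimate = mean_scaled_estimate_if_seen(1, 100)
```

Under this assumption a one-packet flow is almost always missed, and when it is sampled the scaling rule credits it with N packets, so the sampled flows over-represent long flows and overstate the lengths of short ones.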

Various sampling strategies can be chosen:

  • Periodic sampling of every Nth packet:
      • introduces sampling correlations
      • biases against selecting multiple closely spaced packets
  • Independent sampling of each packet with probability p = 1/N
  • Multirate sampling, which varies the sampling rate over time:
      • start with a high value of p and then gradually reduce it, so as not to lose the information contained in smaller flows.
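
The three strategies can be sketched as follows. This is a minimal illustration, not the project's implementation: the function names are hypothetical, and the multirate schedule (a geometric decay of p per packet, down to a floor) is an assumed example of "start high, then gradually reduce", with the packet index standing in for time.

```python
import random


def periodic_sample(packets, n):
    """Keep every n-th packet (the n-th, 2n-th, ... in arrival order)."""
    return [pkt for i, pkt in enumerate(packets, start=1) if i % n == 0]


def independent_sample(packets, p, rng):
    """Keep each packet independently with probability p."""
    return [pkt for pkt in packets if rng.random() < p]


def multirate_probability(index, p_start, p_min, decay):
    """Assumed schedule: p_start * decay**index, never below p_min."""
    return max(p_min, p_start * decay ** index)


def multirate_sample(packets, p_start, p_min, decay, rng):
    """Sample with a probability that starts high and gradually decreases."""
    return [
        pkt
        for index, pkt in enumerate(packets)
        if rng.random() < multirate_probability(index, p_start, p_min, decay)
    ]


# Packets are represented by their arrival numbers for illustration.
packets = list(range(1, 11))
rng = random.Random(0)  # seeded so that a run can be repeated
every_third = periodic_sample(packets, 3)
random_subset = independent_sample(packets, 1 / 3, rng)
decaying_subset = multirate_sample(packets, 1.0, 0.1, 0.8, rng)
```

Periodic sampling can never select two adjacent packets when N > 1, which is the bias noted above; independent sampling removes that constraint at the cost of a variable sample size.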

References:
"Estimating Flow Distributions from Sampled Flow Statistics", Duffield, Lund & Thorup, SIGCOMM'03, August 25-29, 2003, Karlsruhe, Germany
http://pma.nlanr.net/Traces