Friday 28 February 2014

A Geometric Approach to Improving Active Packet Loss Measurement

1.      INTRODUCTION


Abstract



Measurement and estimation of packet loss characteristics are challenging due to the relatively rare occurrence and typically short duration of packet loss episodes. While active probe tools are commonly used to measure packet loss on end-to-end paths, there has been little analysis of the accuracy of these tools or their impact on the network. The objective of our study is to understand how to measure packet loss episodes accurately with end-to-end probes. We begin by testing the capability of standard Poisson-modulated end-to-end measurements of loss in a controlled laboratory environment using IP routers and commodity end hosts. Our tests show that loss characteristics reported from such Poisson-modulated probe tools can be quite inaccurate over a range of traffic conditions. Motivated by these observations, we introduce a new algorithm for packet loss measurement that is designed to overcome the deficiencies in standard Poisson-based tools. Specifically, our method entails probe experiments that follow a geometric distribution to 1) enable an explicit trade-off between accuracy and impact on the network, and 2) enable more accurate measurements than standard Poisson probing at the same rate. We evaluate the capabilities of our methodology experimentally by developing and implementing a prototype tool, called BADABING. The experiments demonstrate the trade-offs between impact on the network and measurement accuracy. We show that BADABING reports loss characteristics far more accurately than traditional loss measurement tools.

Introduction & Problem Description

There are trade-offs in packet loss measurements between probe rate, measurement accuracy, impact on the path and timeliness of results. The objective is to accurately measure loss characteristics on end-to-end paths with probes.


Measuring and analyzing network traffic dynamics between end hosts has provided the foundation for the development of many different network protocols and systems. Of particular importance is understanding packet loss behavior, since loss can have a significant impact on the performance of both TCP- and UDP-based applications. Despite the efforts of network engineers and operators to limit loss, it will probably never be eliminated, due to the intrinsic dynamics and scaling properties of traffic in packet-switched networks [1]. Network operators have the ability to passively monitor nodes within their network for packet loss on routers using SNMP. End-to-end active measurements using probes provide an equally valuable perspective, since they indicate the conditions that application traffic is experiencing on those paths.

The most commonly used tools for probing end-to-end paths to measure packet loss resemble the ubiquitous PING utility. PING-like tools send probe packets (e.g., ICMP echo packets) to a target host at fixed intervals. Loss is inferred by the sender if the response packets expected from the target host are not received within a specified time period. Generally speaking, an active measurement approach is problematic because of the discrete sampling nature of the probe process. Thus, the accuracy of the resulting measurements depends both on the characteristics and interpretation of the sampling process and on the characteristics of the underlying loss process.
Despite their widespread use, there is almost no mention in the literature of how to tune and calibrate [2] active measurements of packet loss to improve accuracy, or how to best interpret the resulting measurements. One approach is suggested by the well-known PASTA principle [3] which, in a networking context, tells us that Poisson-modulated probes will provide unbiased time-average measurements of a router queue's state. This idea has been suggested as a foundation for active measurement of end-to-end delay and loss [4]. However, the asymptotic nature of PASTA means that when it is applied in practice, the higher moments of measurements must be considered to determine the validity of the reported results. A closely related issue is the fact that loss is typically a rare event in the Internet [5]. This reality implies either that measurements must be taken over a long time period, or that average rates of Poisson-modulated probes may have to be quite high in order to report accurate estimates in a timely fashion. However, increasing the mean probe rate may lead to the situation that the probes themselves skew the results. Thus, there are trade-offs in packet loss measurements between probe rate, measurement accuracy, impact on the path and timeliness of results.
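To make the probe process concrete, the sketch below (our illustration only, not part of any existing tool; the mean gap and probe count are assumed example values) shows how a Poisson-modulated prober can draw independent, exponentially distributed gaps between probe send times in Java.

import java.util.Random;

// Illustrative sketch: generating Poisson-modulated probe send times.
// The mean gap (in milliseconds) and probe count are assumed example values.
public class PoissonSchedule {
    public static void main(String[] args) {
        double meanGapMs = 100.0;      // average spacing between probes
        int probes = 10;               // number of probes to schedule
        Random rng = new Random();

        double t = 0.0;
        for (int i = 0; i < probes; i++) {
            // Exponentially distributed gap: -mean * ln(U), with U uniform on (0,1]
            double gap = -meanGapMs * Math.log(1.0 - rng.nextDouble());
            t += gap;
            System.out.printf("probe %d scheduled at %.1f ms%n", i, t);
        }
    }
}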
  The goal of our study is to understand how to accurately measure loss characteristics on end-to-end paths with probes. We are interested in two specific characteristics of packet loss: loss episode frequency, and loss episode duration [5]. Our study consists of three parts:
(i) empirical evaluation of the currently prevailing approach,
(ii) development of estimation techniques that are based on novel experimental design, novel probing techniques, and simple validation tests, and
(iii) empirical evaluation of this new methodology.

We begin by testing standard Poisson-modulated probing in a controlled and carefully instrumented laboratory environment consisting of commodity workstations separated by a series of IP routers. Background traffic is sent between end hosts at different levels of intensity to generate loss episodes thereby enabling repeatable tests over a range of conditions. We consider this setting to be ideal for testing loss measurement tools since it combines the advantages of traditional simulation environments with those of tests in the wide area. Namely, much like simulation, it provides for a high level of control and an ability to compare results with “ground truth.” Furthermore, much like tests in the wide area, it provides an ability to consider loss processes in actual router buffers and queues, and the behavior of implementations of the tools on commodity end hosts. Our tests reveal two important deficiencies with simple Poisson probing. First, individual probes often incorrectly report the absence of a loss episode (i.e., they are successfully transferred when a loss episode is underway). Second, they are not well suited to measure loss episode duration over limited measurement periods.

Our observations about the weaknesses in standard Poisson probing motivate the second part of our study: the development of a new approach for end-to-end loss measurement that includes four key elements. First, we design a probe process that is geometrically distributed and that assesses the likelihood of loss experienced by other flows that use the same path, rather than merely reporting its own packet losses. The probe process assumes FIFO queues along the path with a drop-tail policy. Second, we design a new experimental framework with estimation techniques that directly estimate the mean duration of the loss episodes without estimating the duration of any individual loss episode. Our estimators are proved to be consistent under mild assumptions on the probing process. Third, we provide simple validation tests (that require no additional experimentation or data collection) for some of the statistical assumptions that underlie our analysis. Finally, we discuss the variance characteristics of our estimators and show that while the variance of the frequency estimate depends only on the total number of probes emitted, the variance of the loss duration estimate depends on the frequency estimate as well as on the number of probes sent.
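As a rough illustration of the geometrically distributed probe process (not the exact BADABING algorithm or its estimators, which send probes in short sequences and treat episode boundaries more carefully), the Java sketch below sends a probe at each discrete time slot with probability p, so the gaps between probes are geometrically distributed, and then computes a naive loss-episode frequency estimate as the fraction of probes that observe congestion. The parameter values and the simulated congestion check are assumptions made only for this example.

import java.util.Random;

// Minimal sketch of a geometrically distributed probe process: at every
// discrete time slot a probe is sent with probability p, so the gaps between
// probes are geometrically distributed. The simulated "congestion" check and
// the naive frequency estimate are illustrative only.
public class GeometricProbeSketch {
    public static void main(String[] args) {
        double p = 0.2;          // per-slot probe probability (assumed example value)
        int slots = 10_000;      // length of the measurement period in slots
        Random rng = new Random();

        int probesSent = 0;
        int probesSeeingLoss = 0;
        for (int slot = 0; slot < slots; slot++) {
            if (rng.nextDouble() < p) {          // Bernoulli trial: probe this slot?
                probesSent++;
                if (observeCongestion(rng)) {
                    probesSeeingLoss++;
                }
            }
        }
        // Naive loss-episode frequency estimate: fraction of probes that saw congestion.
        double frequencyEstimate =
                probesSent == 0 ? 0.0 : (double) probesSeeingLoss / probesSent;
        System.out.printf("sent %d probes, estimated frequency %.4f%n",
                probesSent, frequencyEstimate);
    }

    // Stand-in for the real measurement: here congestion is simply simulated.
    private static boolean observeCongestion(Random rng) {
        return rng.nextDouble() < 0.05;   // assumed 5% of slots congested
    }
}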

The third part of our study involves the empirical evaluation of our new loss measurement methodology. To this end, we developed a one-way active measurement tool called BADABING. BADABING sends fixed-size probes at specified intervals from one measurement host to a collaborating target host. The target system collects the probe packets and reports the loss characteristics after a specified period of time. We also compare BADABING with a standard tool for loss measurement that emits probe packets at Poisson intervals. The results show that our tool reports loss episode estimates much more accurately for the same number of probes. We also show that BADABING estimates converge to the underlying loss episode frequency and duration characteristics.

The most important implication of these results is that there is now a methodology and tool available for wide-area studies of packet loss characteristics that enables researchers to understand and specify the trade-offs between accuracy and impact. Furthermore, the tool is self-calibrating [2] in the sense that it can report when estimates are poor. Practical applications could include its use for path selection in peer-to-peer overlay networks and as a tool for network operators to monitor specific segments of their infrastructures.


2. MODULES

The modules are:

  • User Interface Design
  • Packet Separation
  • Designing the Queue
  • Packet Receiver
  • Packet Loss Calculation


  • User Interface Design:

                        In this module we design the user interfaces for the Sender, the Queue, the Receiver and the result display window. These windows are designed to display all the processes in this project.
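A minimal sketch of how such windows might be created with Swing (the project's stated front end) is shown below; the class name, window titles and sizes are hypothetical, not the project's actual source.

import javax.swing.JFrame;
import javax.swing.JScrollPane;
import javax.swing.JTextArea;
import javax.swing.SwingUtilities;

// Hypothetical sketch of the four display windows; names and layout are illustrative.
public class DisplayWindows {

    // Creates one simple log window and returns its text area so that
    // messages can be appended to it later.
    static JTextArea createWindow(String title, int x, int y) {
        JFrame frame = new JFrame(title);
        JTextArea log = new JTextArea(15, 40);
        log.setEditable(false);
        frame.add(new JScrollPane(log));
        frame.pack();
        frame.setLocation(x, y);
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE); // closing any window ends the demo
        frame.setVisible(true);
        return log;
    }

    public static void main(String[] args) {
        SwingUtilities.invokeLater(() -> {
            createWindow("Sender", 0, 0);
            createWindow("Queue", 500, 0);
            createWindow("Receiver", 0, 350);
            createWindow("Result", 500, 350);
        });
    }
}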

  • Packet Separation:

                        In this module the data selected for sending is divided into packets, which are then sent to the Queue.
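The sketch below illustrates one way this separation could be done in Java; the packet size, class name and sample data are assumed example values.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch: splitting the selected data into fixed-size packets.
public class PacketSeparator {
    static List<byte[]> split(byte[] data, int packetSize) {
        List<byte[]> packets = new ArrayList<>();
        for (int offset = 0; offset < data.length; offset += packetSize) {
            int end = Math.min(offset + packetSize, data.length);
            packets.add(Arrays.copyOfRange(data, offset, end));
        }
        return packets;
    }

    public static void main(String[] args) {
        byte[] data = "Some sample data selected by the user for sending".getBytes();
        List<byte[]> packets = split(data, 8);   // 8-byte packets for illustration
        System.out.println("Created " + packets.size() + " packets");
    }
}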


  • Designing the Queue:

                        The Queue is designed to introduce packet loss. It receives the packets from the Sender, creates the packet loss and then sends the remaining packets to the Receiver.
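A minimal sketch of such a queue is shown below, here modeled as a drop-tail buffer that discards arriving packets when it is full; the capacity, class name and demo values are assumptions for illustration.

import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch of a drop-tail queue used to create packet loss:
// arriving packets are discarded when the buffer is full.
public class DropTailQueue<T> {
    private final Deque<T> buffer = new ArrayDeque<>();
    private final int capacity;
    private int dropped = 0;

    public DropTailQueue(int capacity) {
        this.capacity = capacity;
    }

    // Enqueue from the Sender; returns false (and counts a drop) if the queue is full.
    public boolean enqueue(T packet) {
        if (buffer.size() >= capacity) {
            dropped++;           // drop-tail: the arriving packet is lost
            return false;
        }
        buffer.addLast(packet);
        return true;
    }

    // Dequeue toward the Receiver; returns null when the queue is empty.
    public T dequeue() {
        return buffer.pollFirst();
    }

    public int droppedCount() {
        return dropped;
    }

    public static void main(String[] args) {
        DropTailQueue<Integer> queue = new DropTailQueue<>(5);   // capacity of 5 packets
        for (int i = 0; i < 8; i++) {
            queue.enqueue(i);    // packets 5, 6 and 7 arrive to a full queue and are dropped
        }
        System.out.println("Dropped " + queue.droppedCount() + " packets");
    }
}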





  • Packet Receiver:

                        The Packet Receiver receives the packets that arrive from the Queue after the packet loss has been introduced, and then displays the received packets.
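A small illustrative sketch of a receiver that collects the surviving packets is shown below; the class and method names are hypothetical, and in the project the summary would be shown in a Swing window rather than printed.

import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the Packet Receiver: it collects whatever packets
// survive the queue and keeps them for display.
public class PacketReceiver {
    private final List<byte[]> received = new ArrayList<>();

    // Called for each packet that made it through the queue.
    public void receive(byte[] packet) {
        received.add(packet);
    }

    public int receivedCount() {
        return received.size();
    }

    public String summary() {
        return "Received " + received.size() + " packets";
    }

    public static void main(String[] args) {
        PacketReceiver receiver = new PacketReceiver();
        receiver.receive("hello".getBytes());
        receiver.receive("world".getBytes());
        System.out.println(receiver.summary());   // prints: Received 2 packets
    }
}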

  • Packet Loss Calculation:
                        The calculations that determine the packet loss are performed in this module. Together, these modules form the tool we are developing to measure packet loss.
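The calculation itself reduces to comparing the number of packets sent with the number received: the loss count is sent minus received, and the loss rate is that count divided by the number sent. The sketch below illustrates this with assumed example counts.

// Illustrative sketch of the packet loss calculation.
public class PacketLossCalculator {
    public static double lossRate(int sent, int received) {
        if (sent <= 0) {
            throw new IllegalArgumentException("sent must be positive");
        }
        int lost = sent - received;              // packets dropped by the queue
        return (double) lost / sent;             // fraction of sent packets lost
    }

    public static void main(String[] args) {
        int sent = 200;       // example counts, not measured values
        int received = 187;
        System.out.printf("Lost %d of %d packets (%.1f%%)%n",
                sent - received, sent, 100.0 * lossRate(sent, received));
    }
}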

3.  FEASIBILITY STUDY

3.1  TECHNICAL  FEASIBILITY  :

Evaluating the technical feasibility is the trickiest part of a feasibility study. This is because, at this point in time, very little detailed design of the system exists, making it difficult to assess issues like performance and costs (on account of the kind of technology to be deployed). A number of issues have to be considered while doing a technical analysis.
i) Understand the different technologies involved in the proposed system:
Before commencing the project, we have to be very clear about which technologies are required for the development of the new system.
ii) Find out whether the organization currently possesses the required technologies:
• Is the required technology available with the organization?
• If so, is the capacity sufficient?
For instance –
"Will the current printer be able to handle the new reports and forms required for the new system?"



3.2                                           ECONOMIC  FEASIBILITY  :



        Economic feasibility attempts to weigh the costs of developing and implementing a new system against the benefits that would accrue from having the new system in place. This feasibility study gives the top management the economic justification for the new system.
A simple economic analysis which gives the actual comparison of costs and benefits is much more meaningful in this case. In addition, it proves to be a useful point of reference against which actual costs can be compared as the project progresses. There could be various types of intangible benefits on account of automation. These could include increased customer satisfaction, improvement in product quality, better decision making, timeliness of information, expediting of activities, improved accuracy of operations, better documentation and record keeping, faster retrieval of information, and better employee morale.

3.3                                           OPERATIONAL FEASIBILITY :


Proposed projects are beneficial only if they can be turned into information systems that will meet the organization's operating requirements. Simply stated, this test of feasibility asks whether the system will work when it is developed and installed, and whether there are major barriers to implementation. Here are questions that will help test the operational feasibility of a project:
•  Is there sufficient support for the project from management and from users? If the current system is well liked and used to the extent that people will not be able to see reasons for change, there may be resistance.
•  Are the current business methods acceptable to the users? If they are not, users may welcome a change that will bring about a more operational and useful system.
•  Have the users been involved in the planning and development of the project? Early involvement reduces the chances of resistance to the system and in general increases the likelihood of a successful project.
Since the proposed system helps reduce the hardships encountered in the existing manual system, the new system was considered to be operationally feasible.


4.      SYSTEM ANALYSIS

Existing Systems
There have been many studies of packet loss behavior in the Internet. Bolot and Paxson evaluated end-to-end probe measurements and reported characteristics of packet loss over a selection of paths in the wide area. Yajnik et al. evaluated packet loss correlations on longer time scales and developed Markov models for temporal dependence structures. Zhang et al. characterized several aspects of packet loss behavior. In particular, that work reported measures of constancy of loss episode rate, loss episode duration, loss free period duration and overall loss rates. Papagiannaki et al. used a sophisticated passive monitoring infrastructure inside Sprint’s IP backbone to gather packet traces and analyze characteristics of delay and congestion. Finally, Sommers and Barford pointed out some of the limitations in standard end-to-end Poisson probing tools by comparing the loss rates measured by such tools to loss rates measured by passive means in a fully instrumented wide area infrastructure.
The foundation for the notion that Poisson Arrivals See Time Averages (PASTA) was developed by Brumelle, and later formalized by Wolff. Adaptation of those queuing theory ideas into a network probe context to measure loss and delay characteristics began with Bolot's study and was extended by Paxson. Several studies include the use of loss measurements to estimate network properties such as bottleneck buffer size and cross traffic intensity. The Internet Performance Measurement and Analysis efforts resulted in a series of RFCs that specify how packet loss measurements should be conducted. However, those RFCs are devoid of details on how to tune probe processes and how to interpret the resulting measurements.
ZING is a tool for measuring end-to-end packet loss in one direction between two participating end hosts. ZING sends UDP packets at Poisson-modulated intervals with fixed mean rate. Savage developed the STING tool to measure loss rates in both forward and reverse directions from a single host. STING uses a clever scheme for manipulating a TCP stream to measure loss. Allman et al. demonstrated how to estimate TCP loss rates from passive packet traces of TCP transfers taken close to the sender. A related study examined passive packet traces taken in the middle of the network. Network tomography based on using both multicast and unicast probes has also been demonstrated to be effective for inferring loss rates on internal links on end-to-end paths.
Proposed System
The goal of this study is to understand how to accurately measure loss characteristics on end-to-end paths with probes. We are interested in two specific characteristics of packet loss: loss episode frequency, and loss episode duration [5]. Our study consists of three parts: (i) empirical evaluation of the currently prevailing approach, (ii) development of estimation techniques that are based on novel experimental design, novel probing techniques, and simple validation tests, and (iii) empirical evaluation of this new methodology.
We begin by testing standard Poisson-modulated probing in a controlled and carefully instrumented laboratory environment consisting of commodity workstations separated by a series of IP routers. Background traffic is sent between end hosts at different levels of intensity to generate loss episodes thereby enabling repeatable tests over a range of conditions. We consider this setting to be ideal for testing loss measurement tools since it combines the advantages of traditional simulation environments with those of tests in the wide area. Namely, much like simulation, it provides for a high level of control and an ability to compare results with “ground truth.” Furthermore, much like tests in the wide area, it provides an ability to consider loss processes in actual router buffers and queues, and the behavior of implementations of the tools on commodity end hosts. Our tests reveal two important deficiencies with simple Poisson probing. First, individual probes often incorrectly report the absence of a loss episode (i.e., they are successfully transferred when a loss episode is underway). Second, they are not well suited to measure loss episode duration over limited measurement periods.




DATA FLOW DIAGRAMS :

Data Flow Model is commonly used during the analysis phase.  The analysis is depicted by the engineer pictorially with the help of Data Flow Diagrams (DFDs).  

A DFD shows the input/output flow of data, the reports generated and the data stores that are used by the system. It views a system as a process that transforms the inputs into desired outputs. That means a DFD shows the transformation of data from input to output, through processes, which may be described logically and independently of the physical components associated with the system.

Context Diagram :

            The top-level diagram is often called the "context diagram." It contains a single process, but it plays a very important role in studying the current system. Anything that is not inside the process identified in the context diagram will not be part of the system study. It represents the entire software element as a single bubble, with input and output data indicated by incoming and outgoing arrows respectively.

The basic notations used to create DFDs are as follows:

Line with single end – represents data flow out of an entity.
Bi-Direction Line – represents data flow In and Out of an entity.
Circle – represents data process
Entity Sets – represents objects/department/branches
Data Stores – represents data storage objects


 5.      SYSTEM REQUIREMENT SPECIFICATION

5.1 Hardware Requirements Specification :

Processor                     :           Intel Pentium Family
Processor Speed          :           250 MHz to 667 MHz
RAM                           :           128 MB to 512 MB
Hard Disk                   :           4 GB or higher
Keyboard                    :           Standard 104 enhanced keyboard

5.2.1 Technologies Used
           
            Technologies used in this project are as follows –

Technology                   :           Java
Front End                    :           Swing

 6. SYSTEM DESIGN

6.1 INTRODUCTION :

The most creative and challenging phase of the life cycle is system design. The term design describes a final system and the process by which it is developed. It refers to the technical specifications that will be applied in implementing the candidate system.
Design may be defined as "the process of applying various techniques and principles for the purpose of defining a device, a process or a system in sufficient detail to permit its physical realization".
The importance of software design can be stated in a single word: "quality". Design is the only way in which we can accurately translate a customer's requirements into a complete software product or system. Without design we risk building an unstable system that might fail if small changes are made, that may be difficult to test, or whose quality cannot be assessed until late in the process. It is therefore an essential phase in the development of a software product.

6.2  SYSTEM FLOW CHART:

The entire system is projected with a physical diagram which specifies the actual storage parameters that are physically necessary for any database to be stored on the disk. The overall conception of the system is derived from this diagram.

6.5       INPUT AND OUTPUT DESIGN :
 Interface design: The current trend in the software industry is toward user friendliness and flexibility. The two main factors contributing to this are screens and menus. The screens in the system are designed to be self-descriptive, so as to guide the user while using the system. The screen format is user friendly.
Input design: Inaccurate input data is the most common cause of errors in data processing. Input interface design plays an important role in controlling these errors. Error messages are generated using the exception handling feature of Java. The input forms are designed in a flexible way.
Output design: Output design describes how to present all the information and data after the input data has been processed.
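As an example of the exception-handling-based input validation mentioned above, the sketch below parses and range-checks a packet-size field; the field name, limits and messages are assumptions for illustration, not taken from the project's code.

// Illustrative sketch of input validation using Java exception handling.
public class InputValidation {
    static int parsePacketSize(String text) {
        try {
            int size = Integer.parseInt(text.trim());
            if (size < 1 || size > 1500) {
                throw new IllegalArgumentException("Packet size must be between 1 and 1500 bytes");
            }
            return size;
        } catch (NumberFormatException e) {
            // Re-thrown with a user-friendly message that the UI can show in a dialog.
            throw new IllegalArgumentException("Packet size must be a whole number", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parsePacketSize(" 512 "));   // prints 512
    }
}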

6.6 UML DIAGRAMS:

The Unified Modeling Language allows the software engineer to express an analysis model using a modeling notation that is governed by a set of syntactic, semantic and pragmatic rules.
A UML system is represented using five different views that describe the system from distinctly different perspectives. Each view is defined by a set of diagrams, as follows.
•    User Model View
·         This view represents the system from the user's perspective.
·         The analysis representation describes a usage scenario from the end-user's perspective.
•    Structural Model View
·         In this view the data and functionality are viewed from inside the system.
·         This model view models the static structures.
•    Behavioral Model View
·         It represents the dynamic or behavioral aspects of the system, depicting the interactions between the various structural elements described in the user model and structural model views.
•    Implementation Model View
·         In this view the structural and behavioral aspects of the system are represented as they are to be built.
•    Environmental Model View
·         In this view the structural and behavioral aspects of the environment in which the system is to be implemented are represented.

7. SYSTEM TESTING


7.1  Introduction:
Testing is the process of detecting errors. Testing plays a very critical role in quality assurance and in ensuring the reliability of software. The results of testing are also used later, during maintenance.
The aim of testing is often taken to be demonstrating that a program works by showing that it has no errors. The basic purpose of the testing phase, however, is to detect the errors that may be present in the program. Hence one should not start testing with the intent of showing that a program works; the intent should be to show that a program does not work. Testing is the process of executing a program with the intent of finding errors.
Testing Objectives
The main objective of testing is to uncover a host of errors, systematically and with minimum effort and time. Stated formally:
•  Testing is a process of executing a program with the intent of finding an error.
•  A successful test is one that uncovers an as-yet-undiscovered error.
•  A good test case is one that has a high probability of finding an error, if one exists.
•  Even a thorough set of tests cannot guarantee the absence of errors; it can only reveal errors that are present.
                                   
 7.2   Testing Strategies:
            A strategy for software testing integrates software test case design methods into a well-planned series of steps that result in the successful construction of software. 
Unit Testing
Unit testing focuses verification effort on the smallest unit of software, i.e., the module. Using the detailed design and the process specifications, testing is done to uncover errors within the boundary of the module. All modules must pass the unit test before integration testing begins.
Unit Testing in this project:  In this project each module can be thought of as a unit, for example Packet Separation, the Queue, the Packet Receiver and Packet Loss Calculation. Each module was tested both during its development and after its development was finished, so that it works without any error. The inputs are validated when they are accepted from the user.
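As an illustration only, a unit test for the packet loss calculation sketched in the modules section might look like the following, assuming JUnit 4 is available; the class under test is the hypothetical PacketLossCalculator sketch, not the project's actual source.

import org.junit.Test;
import static org.junit.Assert.assertEquals;

// Hypothetical JUnit 4 tests for the loss-calculation sketch: they check an
// ordinary case, a boundary case and invalid input.
public class PacketLossCalculatorTest {

    @Test
    public void lossRateIsLostOverSent() {
        // 13 of 200 packets lost -> 6.5% loss rate
        assertEquals(0.065, PacketLossCalculator.lossRate(200, 187), 1e-9);
    }

    @Test
    public void noLossGivesZero() {
        assertEquals(0.0, PacketLossCalculator.lossRate(50, 50), 1e-9);
    }

    @Test(expected = IllegalArgumentException.class)
    public void rejectsNonPositiveSentCount() {
        PacketLossCalculator.lossRate(0, 0);
    }
}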
7.3 Test Plan:

A number of activities must be performed for testing software. Testing starts with a test plan. The test plan identifies all testing-related activities that need to be performed, along with the schedule and guidelines for testing. The plan also specifies the levels of testing that need to be done, by identifying the different units. For each unit specified in the plan, the test cases and reports are produced first; these reports are then analyzed.
            A test plan is a general document for the entire project which defines the scope, the approach to be taken and the personnel responsible for the different testing activities. The inputs for forming the test plan are:
                        1. Project plan
                        2. Requirements document
                        3. System design
7.4  White Box Testing
White box testing mainly focuses on the internal workings of the product. Here one part is taken at a time and tested thoroughly at the statement level to find the maximum possible number of errors. Loops are also constructed in such a way that each part is tested within a range; that is, the part is executed at its boundary values and within bounds for the purpose of testing.
White Box Testing in this Project:  I tested every piece of code step by step, taking care that every statement in the code is executed at least once. I generated a list of test cases and sample data, which is used to check all possible combinations of execution paths through the code at every module level.

7.5  Black Box Testing
This testing method considers a module as a single unit and checks the unit at its interfaces and in its communication with other modules, rather than getting into details at the statement level. Here the module is treated as a black box that takes some input and generates output. Outputs for a given set of input combinations are forwarded to other modules.
Black Box Testing in this Project:   I tested each module by considering it as a unit. I prepared a set of input combinations and checked the outputs for those inputs. I also tested whether the communication between one module and another is performing well.

7.6 Integration Testing
After unit testing we have to perform integration testing. The goal here is to see whether the modules can be integrated properly. This testing activity can be considered as testing the design, and hence the emphasis is on testing module interactions. It also helps to uncover the set of errors associated with interfacing. The inputs to this stage are the unit-tested modules.
Integration testing is classified into two types:
1.      Top-down integration testing.
2.      Bottom-up integration testing.
In top-down integration testing, modules are integrated by moving downward through the control hierarchy, beginning with the main control module.
In bottom-up integration testing, each sub-module is tested separately and then the full system is tested.
Integration Testing in this project:  In this project the main system is formed by integrating all the modules, which means I used bottom-up integration testing. When integrating the modules I checked whether the integration affects the working of any of the services, by giving different combinations of inputs with which the services ran correctly before integration.
7.7 System Testing
Project testing is an important phase, without which the system cannot be released to the end users. It is aimed at ensuring that all the processes conform accurately to the specification.
System Testing in this project:  Here the entire system has been tested against the requirements of the project, and it has been checked whether all the requirements of the project have been satisfied.
Alpha Testing
This refers to the system testing that is carried out by the test team within the organization.
Beta Testing
This refers to the system testing that is performed by a select group of friendly customers.
Acceptance Testing
Acceptance testing is performed with realistic data from the client to demonstrate that the software is working satisfactorily. Testing here is focused on the external behavior of the system; the internal logic of the program is not emphasized.
Acceptance Testing in this project:  In this project I collected some data belonging to the University and tested whether the project works correctly on it.
I conclude that this system has been tested using all of these varieties of tests and no errors were found; hence the testing process is complete.

9. CONCLUSION
The purpose of our study was to understand how to measure end-to-end packet loss characteristics accurately with probes and in a way that enables us to specify the impact on the bottleneck queue. We began by evaluating the capabilities of simple Poisson-modulated probing in a controlled laboratory environment consisting of commodity end hosts and IP routers. We consider this testbed ideal for loss measurement tool evaluation since it enables repeatability, establishment of ground truth, and a range of traffic conditions to which the tool can be subjected. Our initial tests indicate that simple Poisson probing is relatively ineffective at measuring loss episode frequency or measuring loss episode duration, especially when subjected to TCP (reactive) cross traffic. These experimental results led to our development of a geometrically distributed probe process that provides more accurate estimation of loss characteristics than simple Poisson probing. The experimental design is constructed in such a way that the performance of the accompanying estimators relies on the total number of probes that are sent, but not on their sending rate. Moreover, simple techniques that allow users to validate the measurement output are introduced. We implemented this method in a new tool, BADABING, which we tested in our laboratory. Our tests demonstrate that BADABING, in most cases, accurately estimates loss frequencies and durations over a range of cross traffic conditions. For the same overall packet rate, our results show that BADABING is significantly more accurate than Poisson probing for measuring loss episode characteristics.

While BADABING enables superior accuracy and a better understanding of link impact versus timeliness of measurement, there is still room for improvement. First, we intend to investigate why some of the estimates do not appear to improve even as the number of probes increases. Second, we plan to examine the issue of appropriate parameterization of BADABING, including packet sizes and the probe-process parameters, over a range of realistic operational settings including more complex multihop paths. Finally, we have considered adding adaptivity to our probe process model in a limited sense. We are also considering alternative, parametric methods for inferring loss characteristics from our probe process. Another task is to estimate the variability of the estimates of congestion frequency and duration themselves directly from the measured data, under a minimal set of statistical assumptions on the congestion process.

