CS 260 - Measurements: Traffic, topology, p2p and security.
This seminar is going to be very research and project oriented. To make a long story short: if you are not keenly interested in networks, almost to the point of wanting to jump-start your thesis or research project or publish something in a networking conference, don't take this seminar. We will have very limited organized interaction, and I expect people to come up with projects and complete them satisfactorily.
Warning: if you are a first-year student, I recommend you don't take this seminar, unless we have spoken and you have convinced me of vast networking experience or a great desire to really work in this area.
We will have no lectures, only paper presentations, but I expect people to find a topic within the first 3 weeks and dedicate significant effort to making it happen. This seminar will be more of a research group meeting than a class.
Warning no. 2: I have given seminars before; THIS one is going to be different.
Here I have a list of topics for projects. (check for updates)
Here is the list of papers that we will start with from lecture 2.
Optional means we will do it if there is time or interest.
-- Incentives Build Robustness in BitTorrent, B. Cohen
v Describes the whole protocol; the first paper on BitTorrent
-- "Is P2P dying or just hiding?",
Karagiannis et al, Global Internet 2004.
* Uses bytes from the packet payload to find p2p traffic
* 4 methods are developed
* Finds that p2p traffic is not decreasing
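As an illustration of payload-based detection in general (not the four methods of the paper, which are not reproduced here), a minimal sketch can check whether a packet payload starts with the BitTorrent handshake, which begins with the byte 19 followed by the string "BitTorrent protocol". The function name `looks_like_bittorrent` is a hypothetical helper chosen for this example.

```python
# Minimal sketch of payload-signature matching, assuming we only want to
# recognize the start of a BitTorrent handshake. Real classifiers use many
# signatures for many protocols; this is one illustrative signature.

# The BitTorrent handshake starts with a length byte (19) and the
# protocol string "BitTorrent protocol".
BT_HANDSHAKE_PREFIX = b"\x13BitTorrent protocol"


def looks_like_bittorrent(payload: bytes) -> bool:
    """Return True if the payload begins with the BitTorrent handshake prefix."""
    return payload.startswith(BT_HANDSHAKE_PREFIX)
```

A check like this needs access to the first payload bytes of a flow, which is exactly what header-only methods have to do without.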
Thomas Karagiannis, Andre Broido, Michalis Faloutsos, kc claffy,
ACM SIGCOMM/USENIX Internet Measurement Conference (IMC 2004),
Taormina, Italy, October, 2004.
* Proposes methods to detect p2p traffic by just looking at the headers
-- Understanding BT: An Experimental Perspective, TR, under submission.
v They instrumented a client and ran experiments on a large number of torrent files
v Approach used: peer-oriented, detailed information about exchanged messages and protocol events
v Explored the choke and rarest-first algorithms
v They have a detailed view of how the protocol works
v Findings: Both core algorithms work well, but the older version of the choke algorithm (still widely deployed) has some problems, and a malicious peer can monopolize the seed's resources. The new version, however, solves the free-rider problem.
v Explored the dynamics of the peer set, which capture most of the variability and provide insight for realistic models of BT
v In a highly dynamic environment, the rarest-first algorithm is good at balancing the piece distribution as peers join or leave the peer set
v Evaluated the protocol overhead
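To make the rarest-first idea concrete, here is a minimal sketch of the selection step, assuming each peer in the peer set advertises the set of piece indices it holds. The function name `rarest_first` and the data layout are assumptions for illustration; a real client also handles details such as random first piece and endgame mode, which are omitted.

```python
import random
from collections import Counter


def rarest_first(my_pieces, peer_pieces, rng=random):
    """Pick the next piece to request: the missing piece held by the fewest peers.

    my_pieces:   set of piece indices we already have
    peer_pieces: dict mapping peer id -> set of piece indices that peer has
    Returns a piece index, or None if no peer has a piece we still need.
    """
    # Count how many peers in the peer set hold each piece.
    availability = Counter()
    for pieces in peer_pieces.values():
        availability.update(pieces)

    # Only pieces we are missing and that at least one peer can provide.
    candidates = [p for p in availability if p not in my_pieces]
    if not candidates:
        return None

    # Keep the least-replicated pieces and break ties at random.
    lowest = min(availability[p] for p in candidates)
    rarest = [p for p in candidates if availability[p] == lowest]
    return rng.choice(rarest)
```

For example, with peers holding {0, 1, 2}, {0, 1} and {0}, a peer with no pieces would request piece 2 first, since only one peer has it.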
-- A measurement-based traffic profile of the eDonkey filesharing service, Kurt Tutschku. Proceedings of Passive and Active Network Measurement: 5th International Workshop, PAM 2004.
v They provide a measurement-based traffic profile of eDonkey
v They discuss how this type of service intensifies the "mice and elephants" phenomenon in Internet traffic characteristics.
-- Dissecting BT: Five Months in a Torrent's Lifetime, Alex Sung (PAM 04)
v Analysis of BT based on measurements collected over a five month period
v They used the tracker log file of a very popular torrent and based their analysis on it
v They also use data acquired from a modified client
v They assess the performance of the core algorithms used in BT through several metrics
v Findings: BT is a realistic and inexpensive alternative to classical server-based distribution
Business practices: Free Rider Problem and Attacks
-- BitTorrent and Free Riders - How to cheat BT and why nobody does, Hales and Patarin (TR UBLCS 05/12/05)
v Very interesting work! Explains briefly the BT algorithm
v Analyzes the prisoner's dilemma problem and explains why Tit-for-Tat is not the "best strategy": there is no best strategy in the population of strategies
v Explains how to fake an identity in BT
v Explains why BT works ...
v Makes some hypotheses: meta-search systems in BT will have a deleterious effect on altruism and cooperation, because peers can locate and simultaneously participate in many related swarms, so the group-selective mechanism will weaken.
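The point that no single strategy is best regardless of the population can be illustrated with a small iterated prisoner's dilemma. This is a sketch under assumptions: the payoff values (5, 3, 1, 0) are the commonly used textbook ones, not taken from the paper, and the names `tit_for_tat`, `always_defect` and `play` are hypothetical.

```python
# Payoffs (row player, column player) for moves "C" (cooperate) and "D" (defect).
# Assumed textbook values: temptation 5, reward 3, punishment 1, sucker 0.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(opponent_history):
    """Cooperate first, then repeat the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    """Defect no matter what the opponent did."""
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play the repeated game and return the total scores (a, b)."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Each strategy only sees the moves its opponent has made so far.
        move_a = strategy_a(history_b)
        move_b = strategy_b(history_a)
        gain_a, gain_b = PAYOFF[(move_a, move_b)]
        score_a += gain_a
        score_b += gain_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b
```

Over 10 rounds, two Tit-for-Tat players score 30 each, while a defector facing Tit-for-Tat scores 14 against 9. Against a defector, however, defecting (10) beats Tit-for-Tat (9), so which strategy does best depends on who else is in the population.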
-- Pollution in P2P File Sharing Systems (Infocom 05)
v One way to combat illegal file sharing is to deposit large volumes of polluted files
v Measurement study of the nature and magnitude of pollution in KaZaA
v Identification and description of anti-pollution mechanisms
-- MuON: Epidemic Based Mutual Anonymity, Bansod et al. (ICNP 05)
v A mutual anonymity service that hides the identity of the client from the server and vice versa
v Their security analysis and simulation results show that MuON provides mutual anonymity over unstructured P2P networks while maintaining predictable latencies, high reliability, and low communication overhead.
-- DoS Resilience in P2P, Dumitriu et al. (Sigmetrics 05)
v Analytical modeling and simulation of the resilience of p2p file sharing systems against DoS attacks in which malicious nodes respond to queries with erroneous responses
v File-targeted attacks: the attacker puts a large number of corrupted versions of a single file on the network
v Network-targeted attacks: attackers respond to queries for any file with erroneous information.
v Explains the key factors for these vulnerabilities
v They consider and propose some counter strategies for the above vulnerabilities
-- Incentives in BitTorrent Induce Free Riding (Sigcomm P2PECON 05)
v Guidelines for designing a good incentive mechanism
v Analysis and experimental results using PlanetLab show that the original incentive mechanism of BT can induce free riding, i.e., it is susceptible to free riders
v They propose a new mechanism that is more robust against free riders
-- An Empirical Study of Free-Riding Behavior in the Maze P2P System (IPTPS 05)
v They use Maze, a P2P system with an active database, which is deployed by an academic research team
v Maze offers capabilities for conducting experiments to understand user behavior
v The paper presents an analysis of different incentive policies and how users react to them
v Findings: Incentive policies are generally effective, but they also encourage the more selfish users to cheat by whitewashing their accounts, a variation of the Sybil attack
-- Taxonomy of Trust: Categorizing P2P Reputation Systems, Marti and Garcia-Molina
v They present a taxonomy of reputation system components and their properties, and discuss how user behavior and technical constraints can conflict. In their discussion, they describe research that exemplifies compromises made to deliver a usable, implementable system
-- Modeling and Performance Analysis of BT-Like P2P Networks (Sigcomm 04)
v Fluid model
v Scalability, performance and efficiency of the file sharing mechanism
v Effect of the built-in incentive mechanisms on network performance
v Numerical results from both simulations and real traces obtained from the Internet
-- Coupon Replication System (Sigmetrics 05)
v Probabilistic model of a coupon replication system for file swarming systems à la BT
v Users are characterized by their current collection of coupons, and they leave once they have collected all the coupons; users that meet exchange coupons
v They discuss different scenarios: which users meet, how they meet, and user arrivals
v Their results suggest that performance of file swarming systems does not depend critically on either altruistic user behavior, or on load balancing strategies such as rarest first.
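A toy simulation can make the coupon model above more tangible. This is only a sketch under strong assumptions that are not from the paper: a closed system with no arrivals, one permanent holder of all coupons, uniformly random meetings, and random (not rarest-first) coupon choice. The names `simulate_coupon_system` and `give_one` are hypothetical.

```python
import random


def give_one(source, target, rng):
    """Copy one randomly chosen coupon that target lacks from source to target."""
    useful = sorted(source - target)
    if useful:
        target.add(rng.choice(useful))


def simulate_coupon_system(num_users=20, num_coupons=10, seed=0, max_steps=1_000_000):
    """Return the number of meetings until every user has collected all coupons.

    Users start with no coupons and leave as soon as their collection is
    complete. One permanent holder (an assumption) owns every coupon and
    never leaves, so the system can always make progress.
    """
    rng = random.Random(seed)
    permanent_holder = frozenset(range(num_coupons))
    users = [set() for _ in range(num_users)]  # active, incomplete users
    steps = 0
    while users and steps < max_steps:
        steps += 1
        first = rng.choice(users)
        # The contact is another active user or the permanent holder.
        others = [u for u in users if u is not first]
        second = rng.choice(others + [permanent_holder])
        # Users that meet exchange coupons: each may gain one coupon.
        give_one(second, first, rng)
        if second is not permanent_holder:
            give_one(first, second, rng)
        # Users with a complete collection leave the system.
        users = [u for u in users if len(u) < num_coupons]
    return steps
```

Swapping the random choice in `give_one` for a rarest-first choice and comparing the step counts is one way to explore the paper's claim experimentally.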
Search - Optional
-- Analysis of Search and Replication in Unstructured Peer-to-Peer Networks (Sigmetrics 05)
v They study the effect of the number of replicas on search performance in unstructured P2P networks
v They observe that for a search network with a random graph topology where file replicas are uniformly distributed, the hop distance to a replica of a file is logarithmic in the number of replicas.
v Using this observation, they show that flooding-based search is optimized when the number of replicas is proportional to the file request rates.
Traffic Measurement - Classification - Detection
-- BLINC: Multilevel Traffic Classification in the Dark (Sigcomm 05)
v Classification of traffic flows according to the applications that generated them
v Their approach is based on observing and identifying patterns of host behavior at the transport layer
v These patterns are analyzed at three levels: social, functional and application
v The approach operates in the dark, i.e., it has no access to packet payload, no knowledge of port numbers, and no information beyond what current flow collectors provide; these restrictions respect privacy and technological constraints
-- Internet Traffic classification using Bayesian Analysis Techniques (Sigmetrics 05)
v They use the well-known data mining technique of the naive Bayes estimator to categorize traffic by application
v With the simplest naive Bayes estimator they achieve 65% accuracy, and with some refinements 95%
v The training and testing were done using header-derived discriminators
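As a rough illustration of the technique, here is a minimal Gaussian naive Bayes classifier written from scratch. It is a sketch, not the paper's estimator or its refinements. The two features used in the usage note (mean packet size, flow duration) and the class names are made-up stand-ins for header-derived discriminators.

```python
import math
from collections import defaultdict


def train(samples):
    """Fit per-class priors, feature means and variances.

    samples: list of (feature_vector, label) pairs.
    Returns a dict: label -> (prior, means, variances).
    """
    rows_by_label = defaultdict(list)
    for features, label in samples:
        rows_by_label[label].append(features)

    model = {}
    for label, rows in rows_by_label.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A tiny constant keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(samples), means, variances)
    return model


def classify(model, features):
    """Return the label with the highest log posterior, assuming independent features."""
    def log_posterior(label):
        prior, means, variances = model[label]
        total = math.log(prior)
        for value, mean, var in zip(features, means, variances):
            # Log of the Gaussian density for this feature.
            total += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        return total

    return max(model, key=log_posterior)
```

With a hypothetical training set of three short "web" flows around 550 bytes per packet and three long "bulk" flows around 1450 bytes per packet, a new flow of (520 bytes, 1.2 s) is assigned to "web" and one of (1480 bytes, 75 s) to "bulk".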
-- Mining Anomalies Using Traffic Feature Distributions (paper)
Anukool Lakhina, Dept. of Computer Science, Boston University
Mark Crovella, Dept. of Computer Science, Boston University
Christophe Diot, Intel Research, Cambridge, UK
Kuai Xu, University of Minnesota
Zhi-Li Zhang, University of Minnesota
Supratik Bhattacharya, Sprint ATL
-- Combining Filtering and Statistical Methods for Anomaly Detection (30 min.)
Augustin Soule and Kavé Salamatian, LIP6-UPMC; Nina Taft, Intel Research
-- Flow Classification by Histograms or How to Go on Safari in the Internet
Augustin Soule, Kavé Salamatian, Nina Taft, Richard Emilion and Konstantina Papagiannaki
New York - June, 2004
Measurements at the BGP level: Security and Topology
-- Reducing Large Internet Topologies for Faster Simulations
Vaishnavi Krishnamurthy, et al.
In Proceedings of IFIP Networking 2005, Waterloo, Ontario, Canada, May 2-6, 2005
-- The Missing AS Links and Their Impact On the Internet Topology Model, Y. He et al.
Under submission (I will make it available).
-- A Blueprint for Improving the Robustness of Internet Routing, Siganos et al.
Under submission (I will make it available).
Topological data: http://www.netdimes.org/index.html
All kinds of data: www.caida.org
Routeviews repository: http://archive.routeviews.org/