Michalis Faloutsos      2 April, 2002

 Bibliography  you can consult.

For this class the theme is going to revolve around these main topics:

* Modeling the Internet topology
* Measuring the Internet performance
* How asymmetric is the Internet behavior and structure
     - in terms of routing paths, in terms of end-to-end delay, in terms of capacity
* Topology in other communication networks: www, peer-to-peer
* BGP routing: routing paths, asymmetry, routing stability, anomalies
* Peer-to-peer networks: performance analysis, simulations, scalability issues

Most topics that you can think of within these guidelines should be reasonable;
however, make sure you contact me early.

(Below: * = topic of current interest and recommended)
New- Some Specific projects
 Each numbered item (e.g. 2.1) could be a project on its own or combined with other items
 If there is a related paper, please take a look at it first, before seeing me.

1. Generate undirected realistic Internet graphs according to Papadimitriou
    1.1 - Test and implement his algorithm
           - See if you can improve it
    1.2 *- Reduce a real graph to smaller realistic graph
            - We already have ideas  and a program to reduce graphs
            - We want to test how "realistic" the reduced graphs are

2. *Generate directed realistic BGP graphs
      - Graphs in BGP are directed: currently we generate undirected graphs
      2.1 - Step 1: what does a directed BGP graph look like?
          See the paper by Lixin Gao (U. Massachusetts at Amherst) on her page; it appeared in ITCOM
          Does the directed graph look like a jellyfish? (see Tauro's work in my webpage Global Internet 2001)
       2.2- Step 2:  How can I create such a graph?
           Modify existing generators for undirected graphs
           See paper of Towsley  and his student in INFOCOM 2002 and a few previous generators

3. * Compare and improve the inference of the relationships between ASes.
   Main problem: Provider-Customer inferences are pretty much correct. We need to improve the
    peer-to-peer accuracy
   3.1 - Gao's (appeared in Global Internet or Globecom and she has a new paper in ToN)  and in ,
     and  Rexford's algorithm (appearing in INFOCOM 2002)
   3.2 - Use our idea of using BGP routing paths to understand the "direction" a path
       is used more. You would need to talk to me or Georgos about this.
        (interact with George Siganos: siganos@cs)
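   As a starting point for 3.1, here is a minimal sketch of a degree-based
   heuristic for labeling AS links from AS paths. It is a simplification
   written for illustration, not the exact algorithm from the papers above:
   it assumes the highest-degree AS on each path is the "top" of the path,
   treats links before it as customer-to-provider and links after it as
   provider-to-customer, and marks a link as sibling when both directions are
   observed. The function name and the path format (lists of AS numbers) are
   assumptions. Note that it makes no attempt to detect peer-to-peer links,
   which is exactly the part that needs improvement.

```python
from collections import defaultdict


def infer_relationships(paths):
    """Label AS links as provider-customer or sibling (simplified heuristic).

    paths: iterable of AS paths, each a list of AS numbers.
    Returns (provider_customer, siblings):
      provider_customer: set of (provider, customer) pairs
      siblings: set of frozenset({a, b}) pairs
    """
    paths = [list(p) for p in paths if len(p) >= 2]

    # Degree of an AS = number of distinct neighbours seen across all paths.
    neighbours = defaultdict(set)
    for path in paths:
        for a, b in zip(path, path[1:]):
            neighbours[a].add(b)
            neighbours[b].add(a)
    degree = {asn: len(adj) for asn, adj in neighbours.items()}

    # transit holds (customer, provider) pairs suggested by at least one path.
    transit = set()
    for path in paths:
        # Assumption: the highest-degree AS is the top of the path.
        top = max(range(len(path)), key=lambda i: degree[path[i]])
        for i in range(len(path) - 1):
            u, v = path[i], path[i + 1]
            if i < top:
                transit.add((u, v))  # uphill: v provides transit for u
            else:
                transit.add((v, u))  # downhill: u provides transit for v

    provider_customer, siblings = set(), set()
    for customer, provider in transit:
        if (provider, customer) in transit:
            siblings.add(frozenset((customer, provider)))
        else:
            provider_customer.add((provider, customer))
    return provider_customer, siblings
```

   For example, the paths [1, 2, 3], [4, 2, 3] and [5, 3] make AS 2 the
   provider of ASes 1, 3 and 4, and AS 3 the provider of AS 5.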

4. Study the traffic characteristics and  end-to-end performance
   4.1 - Characterize the packet loss traces between two points (burstiness, LRD, predictability)
    4.2 - The same for round-trip delay
    4.3 - Compare and cluster destinations according to their packet loss and delay behavior
      (see paper: Padmanabhan in sigcomm 2001)
      (interact with Thomas Karagiannis: tkarag@cs)
    4.4  Statistical observations and modeling of traffic:
         - What applications dominate the network?
         - what kind of packets do we see?
          - Identify patterns and trends
 Good sites to look at: see the bibliography above.
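
 For 4.1, a first look at burstiness could be as simple as the sketch below.
 It assumes a trace is a list of 0/1 flags, one per probe packet (1 = lost);
 that format and the function name are illustrative assumptions, not an
 existing tool. It reports the overall loss rate, the lengths of consecutive
 loss runs, and the probability of a loss given that the previous packet was
 lost. When that conditional probability is well above the overall loss rate,
 losses tend to come in bursts.

```python
def loss_burstiness(trace):
    """Summarize how bursty a packet-loss trace is.

    trace: non-empty list of 0/1 flags, 1 meaning the packet was lost.
    """
    if not trace:
        raise ValueError("trace must not be empty")

    loss_rate = sum(trace) / len(trace)

    # Lengths of maximal runs of consecutive losses.
    runs, current = [], 0
    for lost in trace:
        if lost:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)

    # P(packet i+1 lost | packet i lost), estimated from adjacent pairs.
    prev_lost = sum(trace[:-1])
    both_lost = sum(1 for a, b in zip(trace, trace[1:]) if a and b)
    cond_loss = both_lost / prev_lost if prev_lost else 0.0

    return {
        "loss_rate": loss_rate,
        "runs": runs,
        "mean_run": sum(runs) / len(runs) if runs else 0.0,
        "cond_loss": cond_loss,
    }
```

 The same summary can be computed per destination and used as a feature
 vector when clustering destinations in 4.3.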

5.  What is the effect of topology on  traffic?  (for experienced people)
      Conduct simulations with the Berkeley/ISI  Network Simulator (NS).
      5.1 - Examine the traffic patterns of a link in different topologies
          a) mesh b) tree c) powerlaw like
      I will give you a paper by Thomas that is under submission.
     A good but heavy paper to look at: Anja Feldmann et al. "Dynamics of IP Traffic ..." SIGCOMM '99
       and other papers by her.
    (interact with Thomas Karagiannis: tkarag@cs)
      5.2 - Examine the effect of placing source-destinations in different places in the network
       (see above papers)
      5.3 *- Examine the effect of placing sources and destinations of multicast users in a network
                on the tree, and on our ability to aggregate the tree
         (see papers of Chuang-Sirbu, Phillips et al in sigcomm'99,
          "Aggregated Multicast with Inter-Group Sharing",
          A. Fei, J. Cui,M. Gerla, M. Faloutsos,
          Networked Group Communications (NGC), London, November, 2001 )

6. Study the topology of the UCR or UCR-CS site or the webgraph of Google
     6.1 - Characterize it (study distribution of indegree, outdegree)
     6.2  - Examine the hits that pages get and try to identify patterns and correlation with their location
     6.3  - Compare different web-sites to find patterns
     6.4 - Compare web-sites with the large web-graph
Papers: There are some papers by Andrew Tomkins
-the bowtie idea (in WWW conference 2-3 years ago)
-the "trawling for communities"
-and more recently a paper in VLDB (I think) that
 finds thematic communities that look like smaller bowties
Then there is the classic paper of Kleinberg on hubs and authorities.
Several people have shown that the distribution of degrees in the
graph follows a power law.
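
For 6.1, the sketch below shows one rough way to get started. It assumes the
graph is given as a list of directed (source, destination) link pairs; the
function names are illustrative. It counts indegree and outdegree per page,
then fits a straight line to log(frequency) versus log(degree) by least
squares. A slope near a negative constant is consistent with a power law,
but this kind of fit is crude and should be treated as a first check only.

```python
import math
from collections import Counter


def degree_counts(edges):
    """Return (indegree, outdegree) Counters for a directed edge list."""
    indeg, outdeg = Counter(), Counter()
    for src, dst in edges:
        outdeg[src] += 1
        indeg[dst] += 1
    return indeg, outdeg


def loglog_slope(degrees):
    """Least-squares slope of log(frequency) against log(degree).

    degrees: iterable of node degrees; zero degrees are ignored.
    """
    freq = Counter(d for d in degrees if d > 0)
    if len(freq) < 2:
        raise ValueError("need at least two distinct positive degrees")
    xs = [math.log(d) for d in freq]
    ys = [math.log(freq[d]) for d in freq]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

A typical use would be loglog_slope(indeg.values()) on the indegree counts of
a crawled site, repeated for outdegree and for each site being compared.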

7. Peer-to-peer networks:
    - * How can I optimize the topology of a p2p network?
    - What do I want to optimize vs how much collaboration I can expect
    - What can peer-2-peer networks learn from the Internet topology?
    (see papers SIGCOMM 2002, and 2001)

 8. BGP convergence: simulation and analysis;
      - Start from the papers:
            "An Experimental Analysis of BGP Convergence Time", T. Griffin et al. in ICNP 2001
             "BGP Routing Properties at a Large Time Scale",    G. Siganos, M. Faloutsos
             Global Internet Symposium,  2002  ( I will put it on the web soon or ask Georgos)
              Look at sigcomm 2002 and previous sigcomm.
9. Multicasting:  modeling and managing multicast connections
        - Realistic multicast simulations and benchmarks
             where should I place the group members? Does it matter?
              What affects the performance of multicast protocols and the accuracy of simulations (topology,
               capacity, traffic matrix, users)
         - How can I manage multicast connections efficiently and scalably:
              see our papers on aggregate multicasting (from my web page)
      Good starting points apart from the above papers:
       "Channelization problem in large scale data dissemination" M. Adler, ICNP 2001
       "An Analysis of Multicast Forwarding State Scalability (2000) "
              Tina Wong and Randy Katz Department of Electrical Engineering and Computer...
                         IEEE ACM Transactions on Networking
              earlier version appeared in ICNP (I think  2000)

More generally, here are some topics that I would encourage you to think about.

* What does the Internet topology look like:
     - find structure and patterns
* What is the right topological model for simulations
    - compare existing topological models
    - establish techniques to compare the "similarity" of graphs
    - does topology really make a difference in simulations?
* What is the minimum size of a graph that I need to use in my simulations?
* How much capacity is in the network:
    - how is capacity distributed
    - what kind of capacity does a typical connection "see"?
* Is the Internet topology optimized?
    - Optimized for what?
    - What is the effect of small changes?

* How does topology affect traffic
* What does the traffic on a single link look like:
    - Describe the traffic load
    - How many and what type of connections do I see?
* Input and output traffic of a router/switch: what goes where?
    - Similar questions as above
* How do users/applications behave?
    - what is typical user behavior
    - model traffic demand: upload vs download
    - user behavior in time
* Simulating multicast communications
    - what topology should I use?
    - Where are the multicast users located?
           -- Can I infer multicast behavior from common interests in www?

* Analyze UCR's BGP routing table
    - Frequency of updates
    - Erroneous updates
* Real BGP measurements
    - Study the convergence of BGP in our testbed (to be set up soon)
    - Verify previous simulation studies
*  Analyze BGP routing
    - stability
    - convergence
* How well is BGP designed?
    - Identify real BGP problems
    - how can they be avoided?
* How can I model and simulate BGP?
    - What are the assumptions I need to make?
    - The SSFNET software and its BGP extensions

* What does a site look like?
* What does the web look like?
* How can I model user access patterns
    - regarding a particular web-site (which pages within a site)
    - in general (which sites, how often etc)
* How can I infer "thematic communities" from the structure of the web?

* What do p2p networks look like?
* How can I optimize the topology of a p2p network?
    - what do I want to optimize vs how much collaboration I can expect
* Improving peer-to-peer searches and downloads
* Evaluating the different peer-to-peer architectures