We're still working with Pajek with a little bit of Matlab in the mix.
1. Real world networks vs. random
Download the two files everglades.paj (this is a Pajek project file rather than a .net file, so one way you can open it is by pressing 'F1') and gnutella.net. The first is a food web of the Everglades Graminoid marshes and the second is a snapshot of a Gnutella peer to peer filesharing network in the year 2000.
- For both networks compute the average clustering coefficient and average shortest path (Net>Paths between two vertices > Distribution of distances > From all vertices and look in the report window).
- For everglades.paj (a directed network) construct a random network with the same number of nodes and arcs, using the Net>Random Network>total # of arcs command. For the gnutella graph, construct an Erdos-Renyi random graph with the same number of vertices and the same average degree (Net>Random Network>Erdos-Renyi). Check that the random graphs you have obtained have the same number of vertices and a similar number of edges as the originals (write them down). Compute the average clustering coefficients and average shortest paths for the two corresponding random graphs.
- How do the clustering coefficients and average shortest paths of the two real world networks compare to their random counterparts? Do they exhibit small world properties?
- Describe one difference you see between the two real-world networks.
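If you want to cross-check the two quantities in problem 1 outside Pajek, the following is a minimal Python sketch rather than a replacement for the Pajek commands. The function names are hypothetical and not part of Pajek or the course scripts. It assumes an undirected graph stored as a dictionary of neighbour sets, counts vertices with fewer than two neighbours as having clustering 0 (Pajek's convention for such vertices may differ), and averages shortest paths over reachable pairs only.

```python
import random
from collections import deque


def random_graph(n, m, seed=0):
    """Undirected random graph with n vertices and exactly m distinct edges."""
    if m > n * (n - 1) // 2:
        raise ValueError("too many edges for a simple graph")
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    edges = 0
    while edges < m:
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v and v not in adj[u]:  # skip loops and repeated edges
            adj[u].add(v)
            adj[v].add(u)
            edges += 1
    return adj


def average_clustering(adj):
    """Mean over all vertices of (links among neighbours) / (possible links)."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # assumption: such vertices contribute 0
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_shortest_path(adj):
    """Mean breadth-first-search distance over ordered, reachable vertex pairs."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("inf")
```

For example, `g = random_graph(1000, 3000)` followed by `average_clustering(g)` and `average_shortest_path(g)` gives the random-graph baseline for a network with 1000 vertices and 3000 edges. The all-pairs search takes time proportional to vertices times edges, so it may be slow on large graphs.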
2. Small world network vs. random graph
Download the file lattice.net from cTools.
- Lay out the network using the Kamada-Kawai algorithm (you can try others).
- Compute the clustering coefficient and average shortest path.
- Add 10 random edges. If you need a random number generator you can go to http://www.random.org/nform.html. Then either edit the .net file to add the edges, or right click on the randomly chosen node in the visualization, then click on 'newline' and type in the number of the other random node it is supposed to connect to.
- What percentage of the edges are now random?
- Submit an image of the resulting network. (*I*)
- Create an Erdos-Renyi graph in Pajek with the same number of nodes and the same average degree as the lattice graph. Compute the clustering coefficient and average shortest path. How close did the lattice graph with the 10 random edges come to having the same average clustering as the random graph? How about the average shortest path?
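As an alternative to drawing the random endpoints by hand in problem 2, here is a small Python sketch, offered as one possible approach. All function names are hypothetical. Because the structure of lattice.net is not described here, the sketch uses a ring lattice as a stand-in (an assumption); with the real file you would pass in its own edge list and vertex count. It picks new edges that are not already present, reports what share of the final edges is random, and formats the new edges as 1-based vertex pairs of the kind listed under `*Edges` in a Pajek .net file.

```python
import random


def ring_lattice(n, k):
    """Stand-in lattice: n vertices on a ring, each tied to k neighbours per side.

    Assumes n > 2 * k so that no edge is generated twice.
    """
    edges = set()
    for v in range(n):
        for step in range(1, k + 1):
            u = (v + step) % n
            edges.add((min(u, v), max(u, v)))
    return edges


def add_random_edges(edges, n, extra, seed=0):
    """Return (all edges, added edges) after adding `extra` new random edges.

    Assumes the graph has room for `extra` more distinct edges.
    """
    rng = random.Random(seed)
    result = set(edges)
    added = []
    while len(added) < extra:
        u, v = rng.sample(range(n), 2)  # two distinct vertices
        edge = (min(u, v), max(u, v))
        if edge not in result:  # skip edges that already exist
            result.add(edge)
            added.append(edge)
    return result, added


def percent_random(original_count, added_count):
    """Share of the final edge set that was added at random, in percent."""
    return 100.0 * added_count / (original_count + added_count)


def to_pajek_lines(added):
    """Format 0-based edges as 1-based 'u v' lines for a .net *Edges section."""
    return [f"{u + 1} {v + 1}" for u, v in added]
```

For a lattice with 200 edges, `percent_random(200, 10)` evaluates 100 * 10 / 210, a little under 5 percent.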
3. Generating and fitting power law distributions
If you are using Matlab to solve this problem please refer to http://www-personal.umich.edu/~ladamic/si614w06/matlab/ for instructions on how to use the various functions which can be downloaded there. Feel free to use software other than Matlab for this task.
- Generate 100,000 random integers from a power law distribution with exponent alpha = 2.1
- What is the largest value in your sample? Is it possible for a node in a network to have a degree this high (assuming you don't allow multiple edges between two nodes)?
- Construct a histogram of the frequency of occurrence of each integer in your sample. Turn in both a linear scale plot and a log-log scale plot.
- What happens to the bins with zero count in the log-log plot?
- Try a simple linear regression on the log transformation of both variables (if you are using the fitlineonloglog.m Matlab script, you will feed it the binned data, and it will take the log of the x and y for you before doing a linear fit). What is your value of the power-law exponent alpha? Include a plot of the data with the fit superimposed. (*I*)
- Now exponentially bin the data and fit with a line. What is your value of alpha? (*I*)
- Finally, do a cumulative frequency plot of the original data sample. Fit, plot, and report on the fitted exponent and the corresponding value of alpha. (*I*)
- Which method was the most accurate? Which one, in your opinion, gave the best view of the data and the fit?
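If you choose software other than Matlab for problem 3, the steps above can be sketched in plain Python as follows. This is a sketch under stated assumptions, not a port of the course Matlab scripts, and every function name is hypothetical. The sampler floors a continuous power-law variate obtained by inverse-transform sampling with minimum value 1, which is only one way to get power-law distributed integers. The exponential bins are fixed to powers of 2. Plotting is left out.

```python
import math
import random
from collections import Counter


def sample_power_law(n, alpha=2.1, xmin=1.0, seed=0):
    """n integers: floor of xmin * (1 - u) ** (-1 / (alpha - 1)), u uniform in [0, 1)."""
    rng = random.Random(seed)
    exponent = -1.0 / (alpha - 1.0)
    return [int(xmin * (1.0 - rng.random()) ** exponent) for _ in range(n)]


def fit_loglog(xs, ys):
    """Least-squares slope and intercept of log10(y) against log10(x)."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    slope = sxy / sxx
    return slope, my - slope * mx


def log_binned_density(values):
    """Density in bins [2**i, 2**(i+1)) for positive integers; empty bins dropped."""
    counts = Counter(v.bit_length() - 1 for v in values)  # exact floor(log2(v))
    n = len(values)
    xs, ys = [], []
    for i in sorted(counts):
        width = 2 ** i  # the bin holds the integers 2**i .. 2**(i+1) - 1
        xs.append(2 ** i * math.sqrt(2.0))  # geometric centre of the bin
        ys.append(counts[i] / (width * n))
    return xs, ys


def ccdf(values):
    """Unique values and the fraction of samples greater than or equal to each."""
    counts = Counter(values)
    n = len(values)
    xs = sorted(counts)
    ys, remaining = [], n
    for x in xs:
        ys.append(remaining / n)
        remaining -= counts[x]
    return xs, ys
```

A possible usage: with `data = sample_power_law(100000)`, the call `fit_loglog(*log_binned_density(data))` returns a slope whose negative estimates alpha, while `fit_loglog(*ccdf(data))` returns a slope s for the cumulative distribution, from which alpha is estimated as 1 - s because the cumulative distribution of a power law with exponent alpha decays with exponent alpha - 1. Bins with zero count are dropped before fitting because their logarithm is undefined.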