Calculation of source counts from Time Directed Partner Graphs:

From TDPG it is possible to construct a "source" count for each partnership. This is the number of previous partnerships that could have originated a chain of infection that could reach the partnership under concern. Cumulative source counts for an individual, A at a time T are calculated by finding the nodes (ie. Relationships) that individual A is a constituent member or was a member (if the relationship has ended) within a fixed time interval d of time. Then for each of these nodes (relationships) calculate the source counts and sum these counts. Any partnership characteristic or any characteristic of individuals in partnerships can be used to specify such source counts. For example, the source counts for partnerships formed in one setting could be specified by source partnerships in other settings. Or the number of source partnerships involving another type of individual could specify the source partnerships for one class of individual. For example, the source partnerships for individuals who have been monogamous could be specified by partnerships made with commercial sex workers or by partnerships involving individuals with high rates of partnership formation or concurrency. This allows for constructing network measures that describe both the degree of linkage between individuals in a group and the extent of linkage between groups of individuals.

Source counts can be calculated in many ways. The method you would use to calculate the source counts depends on the data structure used to represent the TDPG. Time Directed Partner Graphs are digraphs and therefore are usually represented in one of three data structures incidence lists, adjacency lists or adjacency matrices. For example if adjacency matrices are employed to represent a TDPG then matrix algebra can be used to calculate source counts. When using adjacency matrices the source count of a node is the number of distinct nodes that can reach the node of interest. For example, the source counts of a node are the number of nodes that can reach a node ni. Therefore, sum the ith column of the reachability matrix, XR (defined below). We illustrate this with the adjacency matrix X at time 25 derived using a d of 11 (see Table 4).

X=, X2=X2’=, X3=

XXR

The source counts are 2, 2, 3, 4, 0, 3, 0 for n1, n3, n4, n5, n6, n7, n8 respectively. The number of nodes that nodes ni and nj have in common can also be calculated by examining the rows and columns of the product matricies, Xp.

The adjacency matricies corresponding to the TDPG’s illustrated in Table 1 have entries consisting of 0 and 1’s. The adjacency matrices are a convenient representation of TDPG but have the disadvantage that they can be unwieldy in terms of both the matrix algebra involved in the calculation of measures and the space required to represent TDPG's. The TDPG make these matrices more visually meaningful. This is readily calculated using matrix multiplication as follows. If X is a connection matrix then xij=1 means that there is an arc from ni to nj. Computing the powers of the adjacency matrix forms the product matricies, Xp. Xp¢will denote the product matrices modified with 0’s along the diagonal (since we do not allow ping ponging during the computation of the source count). Next I will define a standardizing transformation, standardized matrices will be denoted, Xs.

For each, xijX and yij Xs

yij = 1 ,if xij

yij = 0 ,otherwise

The matrix, Xis then formed,

X = where,

k=min ;.

(Note: The k defined above is the number of terms in the sum that forms the X

matrix. It say’s that you should stop when the standardized product matrix is equal to a previous standardized product matrix in the sum or when the number of terms reaches one less than the order of the matrix, which ever comes first.)

The reachability matrix, XR is then formed by standardizing the matrix , X.

XR=X

If there is a 0 in the xij entry of the reachability matrix, XR then there is no path from the ni node to the nj node. If there is a 1 in the xij entry of the reachability matrix then there is at least one path from the ni node to the nj node.
 
 

For example consider the graph below,

Let X be the adjacency matrix for this graph. The indegree of node ni can be computing the sum of ith column (and indegree is found by computing the sum of the ith row) of the adjacency matrix.

Þ X2 = Þ X3 = X= X+ X2 +X3

So, now transform Xto XR using the standardizing transformation defined above,

X= XR the column sums (source counts) of the reachability matrix XR are

n1=0+0+0+0=0, n2=1+0+0+0=1, n3=1+1+0+1=3, n4=1+0+0+0=1

Note that even though there are two paths from n1 to n3 the source count for n3 is correct. This method works with any matrix representation of a graph whether or not the relation is directional, non-directional.

Since it is not possible to continually updated value of each individual source count during the simulation due to computational limitations. The source counts can be continually updated as the TDPG are constructed and updated from the raw data in the simulation output files. At each event where a relationship is formed or dissolved, only some of the adjacency matrices will be modified and the network measure for the individuals in the matrices recalculated. Concurrent relationships are represented as new links. These new links are either non-directional or directional since non-directional links represent concurrent relationships and directional links represent possible transmission possibilities carries forward from one node to another. When the concurrent relationships end they eventually (depending on d length) change to directional links. The only links that are removed are (the oldest) directional links. For example consider the TDPG below consisting of 3 connected components.

The adjacency matrices for this TDPG are,

X1=, X2=, X3= respectively.

Suppose at the next event a new concurrent relationship is initiated between n6 to n9. The adjacency matrices X2 and X3 would be joined to form a new matrix, X4.

X4=

The new adjacency matrix is now modified by changing the {n6,n9} entry from 0 to a 1 and the {n9,n6} entry from 0 to a 1 which would change the new adjacency matrix X4,

X4

The graph of the TDPG is now represented by 2 adjacency matrices,

but only the last matrix requires addition analysis

in order to determine the updated source counts and nodal degree of n5, n6, n7, n8, n9, n10. The updated graphs are depicted below,

When a relationship is initiated either a node will be added to an adjacency matrix representing a connected component of the TDPG or two adjacency matrices will be joined to form a single adjacency matrix which represents the connected component formed by the connection of two previously unconnected components. The nij and nji entries of this new connected component will then be changed from 0 to1 to reflect the concurrent relationship between node i and node j. Only the modified or new adjacency matrices will be analyzed in order to update the network measures of interest. This will reduce the amount of computation required to update the network measures of interest at each event. At each event the updated network measure and statistics can be recorded in a j by ki array where j is the number of measures or statistics being recorded (such as current source count, max source count, current nodal degree, etc.). When a relationship is initiated a new node is created then a search for other nodes that involve either of the two individuals that constitute the new node must be completed. If there exist other nodes that the share the same individuals are found then these nodes are evaluated to determine whether a link or links are initiated using the timing rules in effect. Any adjacency matrices that include these nodes are then reanalyzed. When a relationship is terminated links are scheduled to be removed according to the timing rules in effect and the matrices that involve the relevant nodes are reanalyzed using the timing rules in effect.

For an n-node network (graph) the method outlined above can be computed in time proportional to n4. This method is not efficient enough to be employed when compared to other methods but it illustrates the advantages of using adjacency matrices to represent the TDPG’s. The Floyd-Warshall algorithm based on the triangle inequality is a more efficient algorithm that can be employed in computing source counts in time proportional to n3. The Floyd-Warshall algorithm finds the shortest paths between all pair of nodes in an n-node directional graph and can be used to determine connected subcomponents when links are dissolved and then calculate updated source counts for the modified matrix or matrices. The advantage of using the Floyd-Warshall algorithm to calculate the shortest path between all of the nodes in a graph is that this algorithm could be used to calculate subgraphs, it works with weighted links and it is efficient. This disadvantage is that the matricies used in the calculations requires a lot of space since our simulation involves a large number of nodes.