Guide Graphs as Structural Models: The Application of Graphs and Multigraphs in Cluster Analysis




The Havel-Hakimi algorithm [35, 36] constructs graphs by sorting nodes according to their degree and successively connecting nodes of highest degree with each other. After each step of connecting the highest-degree node, the degree list is re-sorted, and the process continues until all the edges of the graph are connected. Here, we modify this to construct between-edges by sorting nodes by highest between-degree, in order of highest total between-degree for the module to which they belong, and successively connecting the node at the top of the list randomly with other nodes.

Connections are only made between nodes if they are not previously connected, belong to different modules, and do not both have a within-degree of zero (to avoid disconnected components). After each step the between-degree list is re-sorted, and the process continues until all between-edges are connected.
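The between-edge step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `connect_between_edges`, the dictionary-based inputs, and the choice to add one edge at a time before re-sorting are all assumptions made for the example.

```python
import random

def connect_between_edges(between_deg, within_deg, module_of, seed=None):
    """Greedy wiring of between-module edges (hypothetical helper).

    between_deg, within_deg: dict mapping node -> target degree.
    module_of: dict mapping node -> module label.
    Returns a set of frozenset({u, v}) edges; raises ValueError if stuck.
    """
    rng = random.Random(seed)
    remaining = dict(between_deg)          # unmet between-degree per node
    edges = set()
    while any(remaining.values()):
        # Total unmet between-degree of each module.
        module_total = {}
        for v, d in remaining.items():
            module_total[module_of[v]] = module_total.get(module_of[v], 0) + d
        # Top of the list: highest module total first, then highest node degree.
        u = max((v for v in remaining if remaining[v] > 0),
                key=lambda v: (module_total[module_of[v]], remaining[v]))
        # Partners must be unconnected to u, in another module, and the pair
        # must not both have a within-degree of zero.
        candidates = [v for v in remaining
                      if remaining[v] > 0
                      and module_of[v] != module_of[u]
                      and frozenset((u, v)) not in edges
                      and not (within_deg[u] == 0 and within_deg[v] == 0)]
        if not candidates:
            raise ValueError("no valid partner left for node %r" % (u,))
        v = rng.choice(candidates)
        edges.add(frozenset((u, v)))
        remaining[u] -= 1
        remaining[v] -= 1
    return edges
```

A greedy pass like this can fail on degree sequences that are realizable in principle, which is why the sketch raises an error rather than returning a partial graph.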

After all between-edges have been connected, the connections are randomized using a well-known method of rewiring through double-edge swaps [37]. Specifically, two randomly chosen between-edges (u, v) and (x, y) are removed and replaced by two new edges (u, x) and (v, y), as long as u and x belong to different modules and v and y belong to different modules.

The swaps are constrained to avoid the formation of self loops and multi-edges. This process is repeated a large number of times to randomize edges.
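A constrained double-edge swap of this kind might look as follows. The sketch assumes edges are given as pairs of hashable node labels; the name `swap_between_edges` and the random flip of one edge's orientation (so that both possible pairings of endpoints can occur) are illustrative choices rather than details taken from the text.

```python
import random

def swap_between_edges(edges, module_of, n_swaps, seed=None):
    """Randomize between-module edges by double-edge swaps (hypothetical helper).

    edges: iterable of 2-element edges between different modules.
    Returns a set of frozenset({u, v}) edges with every node degree preserved.
    """
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    edge_set = {frozenset(e) for e in edge_list}
    if len(edge_list) < 2:
        return edge_set
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        u, v = edge_list[i]
        x, y = edge_list[j]
        if rng.random() < 0.5:             # assumption: try either orientation
            x, y = y, x
        if len({u, v, x, y}) < 4:          # shared endpoint: skip (conservative guard against self-loops)
            continue
        if module_of[u] == module_of[x] or module_of[v] == module_of[y]:
            continue                       # new edges must stay between modules
        new1, new2 = frozenset((u, x)), frozenset((v, y))
        if new1 in edge_set or new2 in edge_set:
            continue                       # would create a multi-edge
        edge_set -= {frozenset((u, v)), frozenset((x, y))}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = (u, x), (v, y)
    return edge_set
```

Rejected swaps are simply skipped, so `n_swaps` is the number of attempts, not the number of successful rewirings.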


We then connect within-edges using the standard Havel-Hakimi algorithm, applied to each module independently. Specifically, within-edges of a module are connected by sorting nodes of the module according to their within-degree and successively connecting nodes of highest within-degree with each other.


Connections are only made between nodes if they are not previously connected and do not both have a between-degree of zero (to avoid disconnected components). After each step the within-degree list is re-sorted, and the process continues until all the within-edges of the module are connected. The connections are then randomized by rewiring through double-edge swaps [37].
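For reference, the standard Havel-Hakimi construction applied to a single module can be sketched as below. The function name `havel_hakimi` is hypothetical, and the sketch deliberately omits the additional rule about pairs of nodes with zero between-degree, so it illustrates the classical algorithm only.

```python
def havel_hakimi(degrees):
    """Build a simple graph with the given degrees (classical Havel-Hakimi).

    degrees: dict mapping node -> target degree (within-degree of one module).
    Returns a set of frozenset({u, v}) edges; raises ValueError if the
    sequence is not graphical.
    """
    remaining = dict(degrees)
    edges = set()
    while True:
        # Re-sort the nodes that still need edges, highest degree first.
        order = sorted((v for v in remaining if remaining[v] > 0),
                       key=lambda v: -remaining[v])
        if not order:
            return edges
        u, rest = order[0], order[1:]
        d = remaining[u]
        if d > len(rest):
            raise ValueError("degree sequence is not graphical")
        # Connect the top node to the d next-highest-degree nodes.
        for v in rest[:d]:
            edges.add(frozenset((u, v)))
            remaining[v] -= 1
        remaining[u] = 0                   # u is finished and never revisited
```

Because a node is never revisited once its degree is met, the construction cannot produce self-loops or multi-edges.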

We do not specify that each module be connected, only that the full graph is connected. Specifically, the algorithm selects two random edges (u, v) and (x, y) that belong to two different disconnected components of the module. As long as (u, x) and (v, y) are not existing edges, the (u, v) and (x, y) edges are removed and (u, x) and (v, y) are added.

Using our simulation algorithm, we were able to generate modular random graphs of variable network size, number of communities, degree distribution, and community size distribution. In the sections that follow, we consider the algorithm performance, as well as structural properties of the generated graphs.
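The component-joining swap can be illustrated with a short sketch. The helper names `components` and `merge_two_components` are assumptions, as is the edge representation (a set of 2-tuples); the sketch performs a single swap attempt.

```python
import random
from collections import deque

def components(nodes, edges):
    """Connected components of an undirected graph, found by breadth-first search."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for s in nodes:
        if s in seen:
            continue
        comp, queue = {s}, deque([s])
        seen.add(s)
        while queue:
            a = queue.popleft()
            for b in adj[a]:
                if b not in seen:
                    seen.add(b)
                    comp.add(b)
                    queue.append(b)
        comps.append(comp)
    return comps

def merge_two_components(nodes, edges, seed=None):
    """Swap one edge from each of two random components (hypothetical helper)."""
    rng = random.Random(seed)
    edges = set(edges)
    comps = components(nodes, edges)
    if len(comps) < 2:
        return edges
    comp_a, comp_b = rng.sample(comps, 2)
    edges_a = [e for e in edges if e[0] in comp_a]
    edges_b = [e for e in edges if e[0] in comp_b]
    if not edges_a or not edges_b:         # an isolated node has no edge to swap
        return edges
    (u, v), (x, y) = rng.choice(edges_a), rng.choice(edges_b)
    # (u, x) and (v, y) cannot already exist: they span different components.
    edges -= {(u, v), (x, y)}
    edges |= {(u, x), (v, y)}
    return edges
```

One swap preserves every node degree, but it joins the two components only if at least one of the removed edges lies on a cycle; if both are bridges, the number of components does not drop, so in practice the step would be repeated until the graph is connected.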

We then highlight two applications of our model: (1) to generate benchmark graphs for validation of community detection algorithms and (2) to generate null graphs for the analysis of empirical networks.

Community detection algorithms assist in identifying community structure in empirical networks. Our model is able to generate modular networks de novo to test these algorithms. Once community structure has been identified in an empirical network with a community detection algorithm, the number of communities and the modularity level Q (and, if desired, the community size distribution and within-degree sequence) can be used as input to our model to generate graphs that can act as random controls to test hypotheses about the empirical system.

As the modularity increases, the ratio of the total number of edges within modules to the number of edges in the network increases i. Our model generates graphs that closely match the expected modularity and degree distribution. The deviation of the observed modularity is less than 0. However, our model overcomes several limitations of the model proposed by Girvan and Newman [3] and others [16, 17] by considering heterogeneity in total degree, within-module degree distribution, and module sizes.
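To make the link between within-module edges and modularity concrete, the sketch below computes Q for a given partition. It assumes the standard Newman-Girvan definition, Q = sum over modules k of (e_kk - a_k^2), where e_kk is the fraction of edges inside module k and a_k is the fraction of edge ends attached to module k; the text above does not spell out the formula, so this choice and the function name `modularity` are assumptions.

```python
def modularity(edges, module_of):
    """Newman-Girvan modularity Q of a partition (assumed definition).

    edges: iterable of (u, v) pairs of an undirected simple graph.
    module_of: dict mapping node -> module label.
    """
    edges = list(edges)
    m = len(edges)
    within = {}   # number of edges with both ends in module k
    ends = {}     # number of edge ends attached to module k
    for u, v in edges:
        mu, mv = module_of[u], module_of[v]
        ends[mu] = ends.get(mu, 0) + 1
        ends[mv] = ends.get(mv, 0) + 1
        if mu == mv:
            within[mu] = within.get(mu, 0) + 1
    return sum(within.get(k, 0) / m - (ends[k] / (2 * m)) ** 2 for k in ends)
```

For two triangles joined by a single edge, with each triangle as one module, this gives Q = 2 * (3/7 - 1/4) = 5/14, or about 0.357.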

Unlike many of the existing models [18–21], our model can generate modular random graphs with arbitrary degree distributions, including those obtained from empirical networks. Though we discuss modular random graphs with positive Q values, our model can also generate disassortative modular random graphs (see Figure S3 in Additional file 1). In this case, nodes tend to connect to nodes in other modules and thus the density of edge connections within a module is less than what is expected at random. Additionally, we also compare our model to graphs generated based on a degree-corrected stochastic block model (SBM).

There are several other topological properties besides degree distribution and community structure that can influence network function and dynamics. We have developed this model to generate graphs with specified degree distribution and modularity, while minimizing structural byproducts. Thus, it is important to confirm that we have reached this goal with the generative model above. We chose these particular types of degree distributions as they have been widely studied in the context of biological networks [39–41].

For each level of modularity, we generated 50 such modular random graphs and calculated the degree assortativity r, clustering coefficient C, and average path length L for each network, as illustrated in Figure 3. Figure 3 shows that for increasing values of modularity, degree assortativity, clustering coefficient, and average path length remain relatively constant for all three network types (i.e. Poisson, geometric and power-law). At the highest levels of modularity, edge connections are constrained, particularly for the heavy-tailed geometric and power-law degree distributions, leading to an increase in clustering coefficient.

Correlations between high clustering coefficient and high modularity have also been observed before [2]. The average path length for all network types also increases at the highest levels of modularity, likely reflecting the lack of many paths between modules, requiring additional steps to reach nodes in different modules. Thus, our model is able to increase levels of modularity in random graphs without altering other topological properties significantly.
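The three quantities tracked here can be computed with the standard library alone, as in the sketch below. The definitions are assumptions, since the text does not give them: r is taken as the Pearson correlation of the degrees at the two ends of each edge, C as the average local clustering coefficient, and L as the mean shortest-path length over reachable node pairs. All function names are hypothetical.

```python
from collections import deque

def adjacency(edges):
    """Adjacency sets of an undirected graph given as (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree_assortativity(edges):
    """Pearson correlation of end-point degrees, each edge counted both ways."""
    adj = adjacency(edges)
    xs, ys = [], []
    for u, v in edges:
        xs += [len(adj[u]), len(adj[v])]
        ys += [len(adj[v]), len(adj[u])]
    mean = sum(xs) / len(xs)
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys))
    var = sum((x - mean) ** 2 for x in xs)
    return cov / var                       # undefined for regular graphs (var == 0)

def average_clustering(edges):
    """Mean local clustering; nodes of degree < 2 contribute zero."""
    adj = adjacency(edges)
    total = 0.0
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k >= 2:
            # Each link among neighbours is counted twice by this sum.
            linked = sum(len(adj[a] & nbrs) for a in nbrs)
            total += linked / (k * (k - 1))
    return total / len(adj)

def average_path_length(edges):
    """Mean BFS distance over all ordered pairs of mutually reachable nodes."""
    adj = adjacency(edges)
    total = pairs = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            a = queue.popleft()
            for b in adj[a]:
                if b not in dist:
                    dist[b] = dist[a] + 1
                    queue.append(b)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs
```

Averaging each quantity over a set of generated graphs at every modularity level would give curves of the kind the text describes.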

Figure 3. Values of (a) assortativity, r, (b) clustering coefficient, C, and (c) path length, L, in modular random graphs do not vary significantly with increasing modularity Q. The data points represent the average value of 50 random graphs. Standard deviations are plotted as error bars.

Biological networks show remarkable variation in network size, connectivity and community size distribution, with some of them having particularly small network size, high degree, and small module sizes e.

We therefore tested the performance of our generated networks under deviations in the network specifications of size, mean degree and module size distribution (results presented in Additional file 1: Figures S5, S6 and S7). At these parameter extremes, the modular random graphs become degree disassortative and have increased clustering coefficient.

A similar observation of network degree disassortativity has been made in hierarchically modular networks [42]. In these two scenarios, the highest value of within-degree d_w(v_i) that a node can attain is constrained by the community size, which reduces the number of possible high within-degree nodes. As a consequence, high within-degree nodes must connect to low within-degree nodes more than expected, resulting in a degree disassortative network.
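The constraint mentioned above follows from the graph being simple: a node can have at most one within-edge to each other member of its module, so its within-degree cannot exceed the module size minus one. A small check along these lines, with the hypothetical name `within_degree_violations`, could be run on an input specification before generating a graph.

```python
def within_degree_violations(within_deg, module_of):
    """Return nodes whose within-degree exceeds (module size - 1).

    within_deg: dict mapping node -> requested within-degree.
    module_of: dict mapping node -> module label.
    """
    sizes = {}
    for v, k in module_of.items():
        sizes[k] = sizes.get(k, 0) + 1
    # In a simple graph a node has at most one edge to each module-mate.
    return [v for v, d in within_deg.items() if d > sizes[module_of[v]] - 1]
```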

In these two cases, modules also become denser and thus create more triangles, resulting in a gradual increase in clustering. Path length, on the other hand, is not affected by these conditions and shows a consistent dependence on network size and mean degree, which is well known [43, 44].


Extant techniques such as modularity maximization, hierarchical clustering, the clique-based method, the spin glass method etc. Choosing the best algorithm can be a difficult task, especially as algorithms often use distinct definitions of communities and perform well within that description. Thus, it is exceedingly important to test community-detection algorithms against a suitable benchmark.
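One simple way to score a detection algorithm against a benchmark graph is to compare the partition it returns with the planted one. The sketch below uses the Rand index, the fraction of node pairs on which the two partitions agree. The text does not name a scoring measure, so this is an assumption chosen for brevity, and the function name `rand_index` is hypothetical.

```python
from itertools import combinations

def rand_index(planted, detected):
    """Fraction of node pairs grouped consistently by both partitions.

    planted, detected: dicts mapping node -> module label (labels may differ).
    """
    agree = total = 0
    for u, v in combinations(list(planted), 2):
        same_planted = planted[u] == planted[v]
        same_detected = detected[u] == detected[v]
        agree += same_planted == same_detected
        total += 1
    return agree / total
```

A score of 1.0 means the detected modules match the planted ones up to relabeling; the measure is not corrected for chance agreement.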