Machine Learning in Complex Networks (eBook)

eBook Download: PDF
2016 | 1st edition
XVIII, 345 pages
Springer-Verlag
978-3-319-17290-3 (ISBN)


Thiago Christiano Silva, Liang Zhao
€106.99 incl. VAT
  • Available for immediate download
This book presents the features and advantages offered by complex networks in the machine learning domain. The first part gives an overview of complex networks and network-based machine learning, providing the necessary background material. The second part describes in detail specific complex-network-based techniques for supervised, unsupervised, and semi-supervised learning. In particular, a stochastic particle competition technique for both unsupervised and semi-supervised learning, built on a stochastic nonlinear dynamical system, is described in detail, and a theoretical analysis is supplied that enables one to predict the behavior of the proposed technique. In addition, data reliability issues are explored in semi-supervised learning; this matter has practical importance yet is seldom addressed in the literature. To validate these techniques on real problems, simulations on widely accepted databases are conducted.

The book also presents a hybrid supervised classification technique that combines low and high levels of learning. The low-level term can be implemented by any classification technique, while the high-level term is realized by extracting features of the underlying network constructed from the input data. Thus, the former classifies test instances by their physical features, while the latter measures the compliance of the test instances with the pattern formation of the data. We show that the high-level technique can perform classification according to the semantic meaning of the data. By combining two widely studied research areas, machine learning and complex networks, this book should be of broad interest to the scientific community, particularly in computer science and engineering.
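
To make the hybrid low-level/high-level combination above concrete, the following is a minimal Python sketch of how the two terms might be blended per class. The function name hybrid_scores, the convex-combination form with a weight lam, and the per-term normalization are illustrative assumptions for this sketch, not the book's exact formulation.

import numpy as np

def hybrid_scores(low_level, high_level, lam=0.5):
    # low_level[y]:  membership score for class y from any conventional
    #                classifier (the "physical features" evidence)
    # high_level[y]: compliance of the test instance with the structural
    #                pattern of class y's network (e.g. how little the
    #                network measures change when the instance is inserted)
    # lam:           weight of the high-level term, 0 <= lam <= 1
    low = np.asarray(low_level, dtype=float)
    high = np.asarray(high_level, dtype=float)
    # Normalize each term so the two kinds of evidence are comparable.
    low = low / low.sum() if low.sum() > 0 else low
    high = high / high.sum() if high.sum() > 0 else high
    return (1.0 - lam) * low + lam * high

# Hypothetical usage with three candidate classes.
low = [0.70, 0.20, 0.10]    # evidence from the low-level classifier
high = [0.30, 0.60, 0.10]   # evidence from the class networks' measures
scores = hybrid_scores(low, high, lam=0.6)
predicted_class = int(np.argmax(scores))

With lam = 0 the decision reduces to the low-level classifier alone; increasing lam gives more weight to the structural evidence extracted from the class networks (for instance the assortativity, clustering coefficient, or tourist-walk descriptors listed under Sects. 8.4.1 and 8.4.2 below).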

Preface 8
Acknowledgments 9
Contents 10
Biography 16
List of Symbols 18
1 Introduction 20
1.1 General Background 20
1.2 Focus of This Book 23
1.3 Organization of the Remainder of the Book 29
References 30
2 Complex Networks 33
2.1 Basic Concepts of Graphs 33
2.1.1 Graph Definitions 34
2.1.2 Connectivity 38
2.1.3 Paths and Cycles 42
2.1.4 Subgraphs 44
2.1.5 Trees and Forest 46
2.1.6 Graph Representation 47
2.2 Complex Network Models 49
2.2.1 Random Networks 50
2.2.2 Small-World Networks 52
2.2.3 Scale-Free Networks 53
2.2.4 Random Clustered Networks 55
2.2.5 Core-Periphery Networks 55
2.3 Complex Network Measures 58
2.3.1 Degree and Degree-Correlation Measures 58
2.3.2 Distance and Path Measures 61
2.3.3 Structural Measures 62
2.3.4 Centrality Measures 65
2.3.4.1 Distance-Based Centrality Measures 65
2.3.4.2 Path- and Walk-Based Centrality Measures 66
2.3.4.3 Vitality 68
2.3.4.4 General Feedback Centrality 69
2.3.5 Classification of the Network Measurements 72
2.4 Dynamical Processes in Complex Networks 73
2.4.1 Random Walks 73
2.4.2 Lazy Random Walks 80
2.4.3 Self-Avoiding Walks 81
2.4.4 Tourist Walks 81
2.4.5 Epidemic Spreading 83
2.4.5.1 Susceptible-Infected-Recovered (SIR) Model 83
2.4.5.2 Susceptible-Infected-Susceptible (SIS) Model 84
2.4.5.3 Epidemic Spreading in Complex Networks 84
2.5 Chapter Remarks 85
References 85
3 Machine Learning 89
3.1 Overview of Machine Learning 89
3.2 Supervised Learning 92
3.2.1 Mathematical Formalization and Fundamental Assumptions 92
3.2.2 Overview of the Techniques 95
3.3 Unsupervised Learning 96
3.3.1 Mathematical Formalization and Fundamental Assumptions 96
3.3.2 Overview of the Techniques 98
3.4 Semi-Supervised Learning 100
3.4.1 Motivations 100
3.4.2 Mathematical Formalization and Fundamental Assumptions 101
3.4.3 Overview of the Techniques 103
3.5 Overview of Network-Based Machine Learning 104
3.6 Chapter Remarks 105
References 106
4 Network Construction Techniques 110
4.1 Introduction 110
4.2 Similarity and Dissimilarity Functions 113
4.2.1 Formal Definitions 113
4.2.2 Examples of Vector-Based Similarity Functions 115
4.2.2.1 Numerical Data 116
4.2.2.2 Categorical Data 119
4.3 Transforming Vector-Based Data into Networks 121
4.3.1 Analysis of k-Nearest Neighbors and ε-Radius Networks 123
4.3.2 Combination of k-Nearest Neighbors and ε-Radius Network Formation Techniques 125
4.3.3 b-Matching Networks 126
4.3.4 Linear Neighborhood Networks 127
4.3.5 Relaxed Linear Neighborhood Networks 129
4.3.6 Network Formation Using Clustering Heuristics 131
4.3.7 Network Formation Using Overlapping Histogram Segments 132
4.3.8 More Advanced Network Formation Techniques 136
4.4 Transforming Time Series Data into Networks 138
4.4.1 Cycle Networks 141
4.4.2 Correlation Networks 142
4.4.3 Recurrence Networks 143
4.4.4 Transition Networks 143
4.5 Classification of Network Formation Techniques 144
4.6 Challenges in Transforming Unstructured Data to Networked Data 145
4.7 Chapter Remarks 147
References 147
5 Network-Based Supervised Learning 150
5.1 Introduction 150
5.2 Representative Network-Based Supervised Learning Techniques 152
5.2.1 Classification Using k-Associated Graphs 153
5.2.2 Network Learning Toolkit (NetKit) 154
5.2.3 Classification Using Ease of Access Heuristic 155
5.2.3.1 Training Phase 156
5.2.3.2 Classification Phase 156
5.3 Chapter Remarks 157
References 158
6 Network-Based Unsupervised Learning 159
6.1 Introduction 159
6.2 Community Detection 162
6.2.1 Relevant Concepts and Motivations 162
6.2.2 Mathematical Formalization and Fundamental Assumptions 164
6.2.3 Overview of the State-of-the-Art Techniques 166
6.2.4 Community Detection Benchmarks 166
6.3 Representative Network-Based Unsupervised Learning Techniques 167
6.3.1 Betweenness 168
6.3.2 Modularity Maximization 169
6.3.2.1 Clauset et al. Algorithm 170
6.3.2.2 Louvain Algorithm 172
6.3.3 Spectral Bisection Method 173
6.3.4 Community Detection Using Particle Competition 175
6.3.5 Chameleon 177
6.3.6 Community Detection by Space Transformation and Swarm Dynamics 179
6.3.7 Synchronization Methods 183
6.3.8 Finding Overlapping Communities 185
6.3.8.1 Clique Percolation 186
6.3.8.2 Bayesian Nonnegative Matrix Factorization Algorithm 187
6.3.8.3 Fuzzy Partition Algorithm 188
6.3.9 Network Embedding and Dimension Reduction 190
6.4 Chapter Remarks 192
References 193
7 Network-Based Semi-Supervised Learning 197
7.1 Introduction 197
7.2 Network-Based Semi-Supervised Learning Assumptions 199
7.3 Representative Network-Based Semi-Supervised Learning Techniques 201
7.3.1 Maximum Flow and Minimum Cut 202
7.3.2 Gaussian Field and Harmonic Function 203
7.3.3 Tikhonov Regularization Framework 205
7.3.4 Local and Global Consistency 206
7.3.5 Adsorption 207
7.3.6 Semi-Supervised Modularity Method 210
7.3.7 Interaction Forces 213
7.3.8 Discriminative Walks (D-Walks) 214
7.4 Chapter Remarks 218
References 219
8 Case Study of Network-Based Supervised Learning: High-Level Data Classification 222
8.1 A Quick Overview of the Chapter 223
8.2 Motivation 224
8.3 Model Description 227
8.3.1 Fundamental Ideas Behind the Model 227
8.3.1.1 Training Phase 227
8.3.1.2 Classification Phase 229
8.3.2 Derivation of the Hybrid Classification Framework 231
8.4 Possible Ways of Composing High-Level Classifiers 234
8.4.1 High-Level Classification Using a Mixture of Complex Network Measures 234
8.4.1.1 First Network Measure: Assortativity 235
8.4.1.2 Second Network Measure: Clustering Coefficient 235
8.4.1.3 Third Network Measure: Average Degree or Connectivity 236
8.4.2 High-Level Classification Using Tourist Walks 237
8.4.2.1 Calculating the Variational Descriptors T_i^(y)(μ) and C_i^(y)(μ) 239
8.4.2.2 Determining the Intra-Modulation Parameters w_intra^(y)(μ) 240
8.4.2.3 Determining the Inter-Modulation Parameters w_inter^(y)(μ) 240
8.5 Numerical Analysis of the High-Level Classification 241
8.5.1 An Illustrative Example 241
8.5.2 Parameter Sensitivity Analysis 242
8.5.2.1 Influence of the Parameters Related to the Network Formation 243
8.5.2.2 Influence of the Critical Memory Length 243
8.6 Application: Handwritten Digits Recognition 246
8.6.1 Motivation 246
8.6.2 Description of the MNIST Data Set 247
8.6.3 A Suitable Similarity Measure for Images 247
8.6.4 Configurations of the Low-Level Classification Techniques 248
8.6.5 Experimental Results 248
8.6.6 Illustrative Examples: High-Level Classification vs. Low-Level Classification 249
8.7 Chapter Remarks 253
References 253
9 Case Study of Network-Based Unsupervised Learning: Stochastic Competitive Learning in Networks 256
9.1 A Quick Overview of the Chapter 256
9.2 Description of the Stochastic Competitive Model 257
9.2.1 Intuition of the Model 258
9.2.2 Derivation of the Transition Matrix 259
9.2.3 Definition of the Stochastic Nonlinear Dynamical System 267
9.2.4 Method for Estimating the Number of Communities 269
9.2.5 Method for Detecting Overlapping Structures 270
9.2.6 Parameter Sensitivity Analysis 270
9.2.6.1 Impact of the Parameter λ 270
9.2.6.2 Impact of the Parameter Δ 272
9.2.6.3 Impact of the Parameter K 273
9.2.7 Convergence Analysis 274
9.3 Theoretical Analysis of the Model 278
9.3.1 Mathematical Analysis 278
9.3.1.1 Discovering the Factor P_p(t+1) 279
9.3.1.2 Discovering the Factor P_N(t+1) 279
9.3.1.3 Discovering the Factor P_E(t+1) 280
9.3.1.4 Discovering the Factor P_S(t+1) 281
9.3.1.5 The Transition Probability Function 281
9.3.1.6 Discovering the Distribution P(N(t)) 282
9.3.1.7 Discovering the Distribution of the Domination Level Matrix P(N̄(t)) 286
9.3.2 Linking the Particle Competition Model and the Classical Multiple Independent Random Walks System 289
9.3.3 A Numerical Example 291
9.4 Numerical Analysis of the Detection of Overlapping Vertices and Communities 295
9.4.1 Zachary's Karate Club Network 295
9.4.2 Dolphin Social Network 297
9.4.3 Les Misérables Novel Network 298
9.5 Application: Handwritten Digits and Letters Clustering 299
9.5.1 Brief Information of the Handwritten Digits and Letters Data Sets 299
9.5.2 Determining the Optimal Number of Particles and Clusters 300
9.5.3 Handwritten Data Clustering 300
9.6 Chapter Remarks 303
References 304
10 Case Study of Network-Based Semi-Supervised Learning: Stochastic Competitive-Cooperative Learning in Networks 306
10.1 A Quick Overview of the Chapter 306
10.2 Description of the Stochastic Competitive-Cooperative Model 307
10.2.1 Differences of the Semi-Supervised and the Unsupervised Versions 308
10.2.2 Familiarizing with the Semi-Supervised Environment 310
10.2.3 Deriving the Modified Competitive Transition Matrix 310
10.2.4 Modified Initial Conditions of the System 311
10.3 Theoretical Analysis of the Model 313
10.3.1 Mathematical Analysis 313
10.3.2 A Numerical Example 316
10.4 Numerical Analysis of the Model 319
10.4.1 Simulation on a Synthetic Data Set 319
10.4.2 Simulations on Real-World Data Sets 321
10.5 Application: Detection and Prevention of Error Propagation in Imperfect Learning 323
10.5.1 Motivation 323
10.5.2 Detecting Imperfect Training Data 324
10.5.3 Preventing Label Propagation from Imperfect Training Data 326
10.5.4 Definition of the Modified Learning System to Withstand Imperfect Data 328
10.5.5 Parameter Sensitivity Analysis 328
10.5.5.1 Impact of ? 329
10.5.5.2 Impact of ? 331
10.5.6 Computer Simulations 332
10.5.6.1 Synthetic Data Sets 332
10.5.6.2 Real-World Data Sets 334
10.6 Chapter Remarks 335
References 336
Index 337

Publication date (per publisher): 28 January 2016
Additional information: XVIII, 331 p., 87 illus., 80 illus. in color
Place of publication: Cham
Language: English
Subject areas: Mathematics / Computer Science > Computer Science > Databases; Computer Science > Theory / Studies > Artificial Intelligence / Robotics; Technology
Keywords: community detection • complex networks • Data Classification • Data Clustering • machine learning
ISBN-10 3-319-17290-5 / 3319172905
ISBN-13 978-3-319-17290-3 / 9783319172903
File format: PDF (digital watermark)
Size: 8.7 MB

DRM: digital watermark
This eBook contains a digital watermark and is therefore personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to its source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly well suited to technical books with columns, tables, and figures. A PDF can be displayed on almost all devices, but it is only suitable to a limited extent for small screens (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read with (almost) all eBook readers. However, it is not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Additional feature: online reading
In addition to the download, you can also read this eBook online in your web browser.

Buying eBooks from abroad
For tax law reasons we can only sell eBooks within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.
