Network Analysis and Modeling
CSCI 5352, Fall 2022

Time: Tuesday and Thursday, 11:00am - 12:15pm
Room: ECCS 1B14

Instructor: Aaron Clauset
Office: ECES 118B
Office hours: Thursday, 1:30-2:30pm
Email: zzilm.xozfhvg@xlolizwl.vwf (an Atbash cipher)
Syllabus

Description
Course structure
Schedule and lecture notes
Problem sets
Supplemental readings
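
The instructor's email address above is written in an Atbash cipher, which swaps each letter with its mirror in the alphabet (a with z, b with y, and so on) and leaves other characters alone. A minimal sketch of a decoder follows; the function name `atbash` is illustrative and not part of the course materials. Because the cipher is its own inverse, the same function both encodes and decodes.

```python
import string


def atbash(text: str) -> str:
    """Map each ASCII letter to its mirror in the alphabet (a<->z, b<->y, ...).

    Non-letter characters (digits, '.', '@', spaces) pass through unchanged.
    """
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    table = str.maketrans(lower + upper, lower[::-1] + upper[::-1])
    return text.translate(table)


# Applying the cipher twice returns the original text.
encoded = atbash("hello")
decoded = atbash(encoded)
```

Calling `atbash` on the encoded address recovers a readable one.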

Description
Network science is a thriving and increasingly important cross-disciplinary domain that focuses on the representation, analysis and modeling of complex social, biological and technological systems as networks or graphs. Modern data sets often include some kind of network. Nodes can have locations, directions, memory, demographic characteristics, content, and preferences. Edges can have lengths, directions, capacities, costs, durations, and types. These variables, and the network structure itself, can also vary over time, with edges and nodes appearing, disappearing and changing their characteristics. Capturing, modeling and understanding networks and their rich data requires understanding both the mathematics of networks and the computational tools for identifying and explaining the patterns they contain.

This graduate-level course will examine modern techniques for analyzing and modeling the structure and dynamics of complex networks. The focus will be on statistical algorithms and methods, and both lectures and assignments will emphasize model interpretability and understanding the processes that generate real data. Applications will be drawn from computational biology and computational social science. No biological or social science training is required. (Note: this is not a scientific computing course, but there will be plenty of computing for science.)
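
As a rough illustration of the description above, the sketch below stores a small directed network whose nodes and edges carry attributes, and extracts the edges that are active at a given time. Everything in it is hypothetical: the class `AttributedGraph`, the attribute names (`location`, `kind`, `start`, `end`), and the toy data are assumptions made for this example, not course code. Packages such as NetworkX, listed under Network Tools, provide far more complete versions of the same idea.

```python
from dataclasses import dataclass, field


@dataclass
class AttributedGraph:
    """Directed graph whose nodes and edges carry attribute dictionaries."""
    nodes: dict = field(default_factory=dict)  # node -> attribute dict
    edges: dict = field(default_factory=dict)  # (source, target) -> attribute dict

    def add_node(self, node, **attrs):
        self.nodes.setdefault(node, {}).update(attrs)

    def add_edge(self, source, target, **attrs):
        # Make sure both endpoints exist before recording the edge.
        self.add_node(source)
        self.add_node(target)
        self.edges.setdefault((source, target), {}).update(attrs)

    def out_degree(self, node):
        return sum(1 for (source, _target) in self.edges if source == node)

    def snapshot(self, time):
        """Return a copy holding only edges active at `time`.

        An edge is active when start <= time < end; an edge missing
        either attribute is treated as unbounded on that side.
        """
        snap = AttributedGraph()
        for node, attrs in self.nodes.items():
            snap.add_node(node, **attrs)
        for (source, target), attrs in self.edges.items():
            start = attrs.get("start", float("-inf"))
            end = attrs.get("end", float("inf"))
            if start <= time < end:
                snap.add_edge(source, target, **attrs)
        return snap


# Toy data, invented for illustration only.
g = AttributedGraph()
g.add_node("alice", location="Boulder")
g.add_node("bob", location="Denver")
g.add_node("carol", location="Boulder")
g.add_edge("alice", "bob", kind="email", start=0, end=5)
g.add_edge("alice", "carol", kind="call", start=3, end=10)

early = g.snapshot(1)  # only the alice -> bob edge is active
late = g.snapshot(7)   # only the alice -> carol edge is active
```

Keeping attributes in plain dictionaries keeps the sketch short; a real analysis would more likely rely on one of the listed libraries.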

Prerequisites
Recommended: CSCI 3104 (undergraduate algorithms) and APPM 3570 (applied probability), or equivalent preparation.

Note: An adequate mathematical and programming background is mandatory. The concepts and techniques covered in this course depend heavily on basic statistics (distributions, Monte Carlo techniques), scientific programming, and calculus (integration and differentiation). Students without sufficient preparation will struggle to keep up with the lectures and assignments, but they may audit the course.

Text
Required (available at the CU Bookstore):
Networks: An Introduction by M.E.J. Newman

Supplementary (optional):
1. All of Statistics by L. Wasserman
2. Numerical Recipes
3. Networks, Crowds and Markets by D. Easley and J. Kleinberg
4. Error and the Growth of Experimental Knowledge by D.G. Mayo
5. Pattern Recognition and Machine Learning by C.M. Bishop

Course structure

Learning Objectives:

  • develop network intuition, and understand how to reason about network phenomena
  • understand network representations
  • learn principles and methods for describing and clustering network data
  • learn to predict missing network information
  • understand how to conduct and interpret numerical network experiments
  • analyze and model real-world network data

Overview:

  • lectures twice a week, including some guest lectures and class discussions
  • problem sets (6 total) due every 2 weeks throughout the semester
  • a class project, presentation, and final report
  • this will be a challenging and fun course; plan accordingly

See the syllabus for more detail on the structure and requirements of the class.

Tentative Schedule
Week 1 : Fundamentals of networks (Lecture 0 and Lecture 1)
Week 2 : Network representations and summary statistics (Lecture 2)
Week 3 : Random graphs with homogeneous degrees (Lecture 3)
Week 4 : Random graphs with heterogeneous degrees (Lecture 4)
Week 5 : Network prediction: node attributes (Lecture 5)
Week 6 : Network prediction: missing links (Lecture 6)
Week 7 : Community structure and mixing patterns (Lecture 7)
Week 8 : Community structure models (Lecture 8)
Week 9 : Spreading processes and cascades (Lecture 9)
Week 10 : Spreading processes with structure (Lecture 10)
Week 11 : Sampling network data (Lecture 11a) and
                 Modeling network growth (Lecture 11b)
Week 12 : Ranking in networks (Lecture 12)
Week 13 : Ethics and networks (Lecture 13)
Week 14 : Fall break
Weeks 15-16 : Project presentations and wrap-up

Supplemental Readings

Week 1:

Week 2:

Weeks 3-4:

Week 5:

Week 6:

Week 7:

Week 8:

Week 9:

Week 10:

Week 11:

Week 12:

Week 13:

Network Tools
NetworkX, network analysis package (Python)
igraph, network analysis tools (Python, C++, R)
graph-tool, network analysis and visualization software (Python, C++)
graspologic, latent community and spatial network analysis (Python)
NodeXL, network analysis and visualization software

Network Visualization
Cytoscape, network visualization software
CosmoGraph.app, in-browser, larger network visualization
yEd Graph Editor, network visualization software
Graphviz, network visualization software
Gephi, network visualization software
graph-tool, network analysis and visualization software
webweb, network visualization tool joining Matlab and d3
MuxViz, multilayer analysis and visualization platform

Network Data Sets
The Colorado Index of Complex Networks (ICON; more than 698 network data sets)
Netzschleuder (network catalogue, repository and centrifuge; more than 278 network data sets)
US Census Education-Employment network (social, bipartite, weighted)

Other Courses on Networks
Network Theory (University of Michigan)
Statistical Network Analysis (Purdue University)
Networks (Cornell University)
Networks (Harvard University)
Social and Economic Networks: Models and Analysis (Coursera / Stanford)
Social Network Analysis (Coursera / University of Michigan)
Social and Information Network Analysis (Stanford)
Graphs and Networks (Yale)
Spectral Graph Theory (Yale)
The Structure of Social Data (Stanford)

Resources
LaTeX (general) and TeXShop (Mac)
Matlab license for CU staff (includes student employees)
Mathematica license for CU students
NumPy/SciPy libraries for Python (similar to Matlab)
GNU Octave (similar to Matlab)
Wolfram Alpha (Web interface for simple integration and differentiation)
Introduction to the Modeling and Analysis of Complex Systems, by Hiroki Sayama (free online textbook)

Things Worth Reading
Everything you wanted to know about Data Analysis and Fitting but were afraid to ask, by Peter Young
Machine Learning, Statistical Inference and Induction Notebook (by Cosma Shalizi)
Power Law distributions, etc. Notebook (by Cosma Shalizi)
Statistics Done Wrong, The woefully complete guide (by Alex Reinhart)
Some Advice on Process for [Research Projects]