Biological Networks
CSCI 3352, Spring 2023
Time: Tuesday and Thursday, 1:00pm - 2:15pm
Room: JSCBB A104
Instructor: Aaron Clauset
Office: Zoomiverse
Office hours: Tuesday 2:30-3:30pm, and Thursday 11:30am-12:30pm, or by appointment
Email: zzilm.xozfhvg@xlolizwl.vwf (an Atbash cipher)
Syllabus
Description
Learning objectives
Schedule and lecture notes
Supplemental readings
Description
This undergraduate-level course examines the computational representation and analysis of biological phenomena through the structure and dynamics of networks, from molecules to species. Attention focuses on algorithms for clustering network structures; predicting missing information; modeling flows, regulation, and spreading-process dynamics; and examining the evolution of network structure. The course also develops intuition for how network structure and dynamics relate to biological phenomena.
Prerequisites
Recommended: Data structures (CSCI 2270) or similar, Data science (CSCI 3320) or similar, Calculus I (or equivalent); programming competence is assumed.
Textbook
Optional:
1. Networks by Mark Newman
2. Introduction to Systems Biology by Uri Alon
Learning Objectives
In this class, students will
- understand the representation and analysis of biological phenomena as networks
- develop intuition for how network structure shapes dynamics
- learn principles and methods for describing and clustering network data
- learn to predict missing network information
- understand and simulate models of biological network dynamics
- understand and model the evolution of biological networks
- analyze real-world biological network data
Problem sets: There will be 10 weekly problem sets. Each will include some mathematical and some computational problems. See syllabus for details.
Class project: In the class project, students will explore a class topic of their choice more deeply. Students may work individually or in pairs. There are three deliverables associated with the class project: (i) a project proposal, (ii) a brief in-class presentation of the project idea and results, and (iii) a 6-page project report, due at the end of the semester. See syllabus for details.
Grading: See the syllabus.
Tentative Schedule
Week 1 : Fundamentals of networks (Lecture 0 and Lecture 1)
Week 2 : Network representations and statistics (Lecture 2)
Week 3 : Random graph models (Lecture 3)
Week 4 : Predicting missing links and attributes (Lecture 4)
Week 5 : Modular networks I: structure (Lecture 5)
Week 6 : Modular networks II: inference (Lecture 6)
Week 7 : Epidemiology I: models and networks (Lecture 7)
Week 8 : Epidemiology II: models and complications (Lecture 8)
Week 9 : Proteins & genes I: structure (Lecture 9)
Week 10 : Proteins & genes II: models (Lecture 10)
Week 11 : Spring break
Week 12 : Metabolism I: structure (Lecture 11)
Week 13 : Metabolism II: flux balance analysis (Lecture 12)
Week 14 : Ethics, biology, and networks (Lecture 13)
Week 15 : Project presentations
Week 16 : Project presentations
Week 17 : Finals
Supplemental Readings
Week 0:
- D.J. Nicholson, On being the right size, revisited: The problem with engineering metaphors in molecular biology. In Holm, S. & Serban, M. (eds.), Philosophical Perspectives on the Engineering Approach in Biology: Living Machines? London: Routledge, pp. 40–68 (2020).
- C.T. Butts, "Revisiting the foundations of network analysis." Science 325, 414-416 (2009).
Week 1:
- M.E.J. Newman, The structure and function of complex networks. SIAM Review 45, 167-256 (2003).
- R. Milo et al., Network Motifs: Simple Building Blocks of Complex Networks. Science 298, 824-827 (2002).
- T.E. Gorochowski, C.S. Grierson, and M. di Bernardo, Organization of feed-forward loop motifs reveals architectural principles in natural and engineered networks. Science Advances 4(3), eaap9751 (2018).
Week 2:
- L. Peel, J.-C. Delvenne, and R. Lambiotte, Multiscale mixing patterns in networks. Proc. Natl. Acad. Sci. USA, Early Edition (2018).
- M. Mitzenmacher, A Brief History of Generative Models for Power Law and Lognormal Distributions. Internet Mathematics 1(2), 226-251 (2004).
- A. Clauset, C.R. Shalizi and M.E.J. Newman, Power-law distributions in empirical data. SIAM Review 51(4), 661-703 (2009).
Week 3:
- E.A. Hobson et al., "A guide to choosing and implementing reference models for social network analysis." Biological Reviews (2021).
- B.K. Fosdick et al., Configuring random graph models with fixed degree sequences. SIAM Review 60(2), 315–355 (2018).
- B. Karrer and M.E.J. Newman, Random graphs containing arbitrary distributions of subgraphs. Physical Review E 82, 066118 (2010).
- K. Van Koevering, A.R. Benson and J. Kleinberg, "Random graphs with prescribed k-core sequences: a new null model for network analysis." Proc. Web Conference 2021, 367–378 (2021).
Week 4:
- D. Liben-Nowell and J. Kleinberg, The Link Prediction Problem for Social Networks. J. Amer. Soc. Info. Sci. and Tech. 58(7), 1019-1031 (2007).
- A. Clauset, C. Moore, and M.E.J. Newman, Hierarchical structure and the prediction of missing links in networks. Nature 453, 98-101 (2008).
- A. Ghasemian et al., "Stacking models for nearly optimal link prediction in complex networks." Proc. Natl. Acad. Sci. USA 117, 23393-23400 (2020).
- T. Zhou, "Progresses and challenges in link prediction." Preprint, arxiv:2102.11472 (2021).
Week 5:
- B. Karrer and M.E.J. Newman, Stochastic blockmodels and community structure in networks. Phys. Rev. E 83, 016107 (2011).
- M.E.J. Newman, Mixing patterns in networks. Physical Review E 67, 026126 (2003).
Week 6:
- L. Peel, D.B. Larremore, and A. Clauset, The ground truth about metadata and community detection in networks. Science Advances 3(5), e1602548 (2017).
- B.H. Good, Y.-A. de Montjoye and A. Clauset, Performance of modularity maximization in practical contexts. Physical Review E 81, 046106 (2010).
- A. Decelle et al., Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications. Phys. Rev. E 84, 066106 (2011).
- T.P. Peixoto, Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models. Phys. Rev. E 89, 012804 (2014).
Weeks 7-8:
- E. Yong, The Deceptively Simple Number Sparking Coronavirus Fears, The Atlantic, 28 January 2020.
- J. Bedson et al., A review and agenda for integrated disease models including social and behavioural factors. Nature Human Behaviour 5, 834–846 (2021).
- L.A. Meyers et al., Network theory and SARS: predicting outbreak diversity. J. Theoretical Biology 232(1), 71-81 (2005).
- S.M. Kissler et al., Projecting the transmission dynamics of SARS-CoV-2 through the post-pandemic period. Science 368(6493), 860-868 (2020).
- D.H. Morris et al., Optimal, near-optimal, and robust epidemic control. Communications Physics 4, 78 (2021).
- D.J. Watts et al., Multiscale, resurgent epidemics in a hierarchical metapopulation model. Proc. Natl. Acad. Sci. USA 102(32) 11157-11162 (2005).
- D. Brockmann and D. Helbing, The hidden geometry of complex, network-driven contagion phenomena. Science 342, 1337-1342 (2013).
Weeks 9-10:
- H. Huang, B.M. Jedynak, and J.S. Bader, Where Have All the Interactions Gone? Estimating the Coverage of Two-Hybrid Protein Interaction Maps. PLoS Computational Biology 3(11), e214 (2007).
- M. Middendorf, E. Ziv, and C.H. Wiggins, Inferring network mechanisms: The Drosophila melanogaster protein interaction network. Proc. Natl. Acad. Sci. USA 102(9), 3192-3197 (2005).
- A. Vazquez et al., Modeling of protein interaction networks. Complexus 1, 38-44 (2003).
- M.M. Saint-Antoine and A. Singh, Network Inference in Systems Biology: Recent Developments, Challenges, and Applications. Preprint (2019).
- S. Bornholdt, Boolean network models of cellular regulation: prospects and limitations. J. R. Soc. Interface 5, S85-S94 (2008).
Weeks 12-13:
- M. Huss and P. Holme, Currency and commodity metabolites: Their identification and relation to the modularity of metabolic networks. IET Systems Biology 1, 280-285 (2007).
- S. Maslov et al., Toolbox model of evolution of prokaryotic metabolic networks and their regulation. Proc. Natl. Acad. Sci. USA 106(24), 9743-9748 (2009).
- J.D. Orth, I. Thiele, and B. Palsson, What is flux balance analysis? Nature Biotechnology 28, 245-248 (2010).
- S. Hui et al., Quantitative fluxomics of circulating metabolites. Cell Metabolism 32, 1-13 (2020).
- J.M. Monk, Genome-scale metabolic network reconstructions of diverse Escherichia strains reveal strain-specific adaptations. Phil. Trans. R. Soc. B 377, 20210236 (2022).
Network Tools
NetworkX, network analysis package (Python)
igraph, network analysis tools (Python, C++, R)
graph-tool, network analysis and visualization software (Python, C++)
GraphLab, scalable network analysis (Python, C++)
NodeXL, network analysis and visualization software
Network Visualization
Cytoscape, network visualization software
CosmoGraph.app, in-browser, larger network visualization
yEd Graph Editor, network visualization software
Graphviz, network visualization software
Gephi, network visualization software
graph-tool, network analysis and visualization software
webweb, network visualization tool joining Matlab and d3
MuxViz, multilayer analysis and visualization platform
Network Data Sets
The Colorado Index of Complex Networks (ICON; more than 5000 graphs)
Other Courses on Networks
Network Theory (University of Michigan)
Statistical Network Analysis (Purdue University)
Networks (Cornell University)
Networks (Harvard University)
Social and Economic Networks: Models and Analysis (Coursera / Stanford)
Social Network Analysis (Coursera / University of Michigan)
Social and Information Network Analysis (Stanford)
Graphs and Networks (Yale)
Spectral Graph Theory (Yale)
The Structure of Social Data (Stanford)
Resources
LaTeX (general) and TeXShop (Mac)
Matlab license for CU staff (includes student employees)
Mathematica license for CU students
NumPy/SciPy libraries for Python (similar to Matlab)
GNU Octave (similar to Matlab)
Wolfram Alpha (Web interface for simple integration and differentiation)
Introduction to the Modeling and Analysis of Complex Systems, by Hiroki Sayama (free online textbook)
Things Worth Reading
Everything you wanted to know about Data Analysis and Fitting but were afraid to ask, by Peter Young
Machine Learning, Statistical Inference and Induction Notebook (by Cosma Shalizi)
Power Law distributions, etc. Notebook (by Cosma Shalizi)
Statistics Done Wrong, The woefully complete guide (by Alex Reinhart)
Some Advice on Process for Research Projects