Focus Areas
My group's research focuses on the development of novel methods for analyzing and modeling complex systems of all kinds and for extracting scientifically valuable insights from complex data. We are particularly interested in notions of collective dynamics, the emergence of patterns in random processes, population dynamics, and statistical forecasting. These efforts are fundamentally non-disciplinary, sitting at the intersection of Computer Science, Physics, and Statistics, with broad applications across the sciences.
Network Science
The quantitative study of networks has emerged as a fundamental tool for the
study of complex systems, in part for its ability to provide a rigorous
foundation to the study of biological and social complexity. Our
work here focuses on developing novel methods and models of large-scale
structure (regularities like modules, communities, and hierarchies) that can
be fitted directly to empirical network data, that account for auxiliary
information including vertex and edge annotations and temporal dynamics, and
that make precise predictions about missing information, anomalies, or future
evolution.
Computational Social Science
The computer revolution is generating a revolution in the social sciences,
via both the collection of massive data sets on social behavior and a newfound
ability to test complex theories with empirical data. These changes are
allowing us to examine old questions with new data and models and to pose
fundamentally new questions about large-scale patterns in social phenomena.
My group's work here focuses on the 'science of science', and in particular
the drivers of different kinds of inequalities in the academic workforce and
how they shape who makes what scientific discoveries. I also study global
patterns in terrorism and the dynamics of warfare and competition, as well
as the way networks provide a way to bridge the micro-dynamics of individuals
and the macro-patterns of populations.
Computational Systems Biology
Fundamental questions in biology increasingly demand answers that consider the
interactions of different components or subsystems and the impact of macroevolutionary
forces on the large-scale and long-term dynamics of the biosphere. This work
spans all scales, including work with oncologists, paleontologists, epidemiologists,
geneticists, and microbial ecologists. Currently, my group's work aims to understand
the complex dynamics of ovarian cancer and the statistical patterns of genomic
structural variants. In the past, my group has developed predictive theories of
the macroevolution of species body sizes, microbial ecologies, and malaria.
- Sequential stacking link prediction algorithms for temporal networks.
X. He, A. Ghasemian, E. Lee, A. Clauset, and P.J. Mucha
Submitted (2023).
- Gender and retention patterns among U.S. faculty.
K. Spoon, N. LaBerge, K.H. Wapman, S. Zhang, A.C. Morgan, M. Galesic, B.K. Fosdick, D.B. Larremore, and A. Clauset
Submitted (2022).
- Sampling random graphs with specified degree sequences.
U. Dutta, B.K. Fosdick, and A. Clauset
Submitted (2022).
- Supporting working parents: The effects of work-family policies on research productivity trends.
D. Van Egdom, C. Spitzmueller, P. Lindner, and A. Clauset
Submitted (2022).
Publications (Refereed)
- An open-source cultural consensus approach to name-based gender classification.
I. V. Buskirk, A. Clauset, and D.B. Larremore
To appear, ICWSM (2023).
- Subfield prestige and gender inequality in computer science.
N. LaBerge, K.H. Wapman, A.C. Morgan, S. Zhang, D.B. Larremore, and A. Clauset
Communications of the ACM 65(12), 46-55 (CACM) (2022). [CACM version] [twitterthread]
- Labor advantages drive the greater productivity of faculty at elite universities.
S. Zhang, K.H. Wapman, D.B. Larremore, and A. Clauset
Science Advances 8(46), eabq7056 (2022). [preprint version] [twitterthread]
- Quantifying hierarchy and dynamics in US faculty hiring and retention.
K.H. Wapman, S. Zhang, A. Clauset, and D.B. Larremore
Nature 610, 120-127 (2022). [twitterthread]
- Socioeconomic roots of academic faculty.
A.C. Morgan, N. LaBerge, D.B. Larremore, M. Galesic, J.E. Brand, and A. Clauset
Nature Human Behaviour 6, 1625-1633 (2022). [Nat. Hum. Behav. version] [twitterthread]
- Untangling the network effects of productivity and prominence among scientists.
W. Li, S. Zhang, Z. Zheng, S.J. Cranmer and A. Clauset
Nature Communications 13, 4907 (2022). [twitterthread]
- The dynamics of faculty hiring networks.
E. Lee, A. Clauset, and D.B. Larremore
EPJ Data Science 10, 48 (2021). [EPJ Data Science version] [twitterthread]
- Examining the consumption of radical content on YouTube.
H. Hosseinmardi, A. Ghasemian, A. Clauset, M. Mobius, D.M. Rothschild, and D.J. Watts
Proc. Natl. Acad. Sci. USA 118(32), e2101967118 (2021). [PNAS version]
- Denoising large-scale biological data using network filters.
A.J. Kavran and A. Clauset
BMC Bioinformatics 22, article 157 (2021). [code and data]
- The unequal impact of parenthood in academia.
A.C. Morgan, S.F. Way, M.J.D. Hoefer, D.B. Larremore, M. Galesic, and A. Clauset
Science Advances 7(9), eabd1996 (2021). [slides] [video presentation]
[Paper of the Year Award, 2021, Internat. Soc. Sciento. and Infor. (ISSI)] [twitterthread]
- The capacity of the ovarian cancer tumor microenvironment to integrate inflammation signaling conveys a shorter disease-free interval.
K.R. Jordan, M.J. Sikora, J.E. Slansky, A. Minic, J.K. Richer, M.R. Moroney, J.C. Costello, A. Clauset, K. Behbakht, T.R. Kumar, and B.G. Bitler
Clinical Cancer Research 26(23), 6362-6373 (2020).
- Stacking models for nearly optimal link prediction in complex networks.
A. Ghasemian, H. Hosseinmardi, A. Galstyan, E.M. Airoldi, and A. Clauset
Proc. Natl. Acad. Sci. USA 117(38), 23393-23400 (2020). [code and data] [PNAS version] [video presentation] [slides] [twitterthread]
- Productivity, prominence, and the effects of academic environment.
S. F. Way, A. C. Morgan, D. B. Larremore, and A. Clauset
Proc. Natl. Acad. Sci. USA 116(22), 10729-10733 (2019). [twitterthread]
- Evaluating overfit and underfit in models of network community structure.
A. Ghasemian, H. Hosseinmardi, and A. Clauset
IEEE Trans. Knowledge and Data Engineering 32(9), 1722-1735 (2019). [TKDE version]
- Environmental changes and the dynamics of musical identity.
S. F. Way, S. Gil, I. Anderson, and A. Clauset
Proc. of the 13th International AAAI Conference on the Web and Social Media (ICWSM) 13, 527-536 (2019).
- Scale-free networks are rare.
A. D. Broido and A. Clauset
Nature Communications 10, 1017 (2019). [Nature Communications version] [data and code] [video presentation, at 41:00]
- Prestige drives epistemic inequality in the diffusion of scientific ideas.
A. C. Morgan, D. J. Economou, S. F. Way, and A. Clauset
EPJ Data Science 7, 40 (2018). [EPJ Data Science version]
- Automatically assembling a full census of an academic field.
A. C. Morgan, S. F. Way, and A. Clauset
PLOS ONE 13(8), e0202223 (2018). [PLoS version]
- Trends and fluctuations in the severity of interstate wars.
A. Clauset
Science Advances 4, eaao3580 (2018). [video presentation at NASEM 2019]
- A communal catalogue reveals Earth’s multiscale microbial diversity.
L. R. Thompson, J. G. Sanders, [et al. including A. Clauset]
Nature 551, 457-463 (2017).
- The misleading narrative of the canonical faculty productivity trajectory.
S. F. Way, A. C. Morgan, A. Clauset, and D. B. Larremore
Proc. Natl. Acad. Sci. USA 114(44), E9216–E9223 (2017). [PNAS version] [video presentation]
- Using null models to infer microbial co-occurrence networks.
N. Connor, A. Barberán and A. Clauset
PLOS ONE 12(5), e0176751 (2017). [PLoS version]
- The ground truth about metadata and community detection in networks.
L. Peel, D. B. Larremore, and A. Clauset
Science Advances 3(5), e1602548 (2017). [Science Advances version] [code here and here] [video presentation]
- Eigenvector-based centrality measures for temporal networks.
D. Taylor, S. A. Myers, A. Clauset, M. A. Porter, P. J. Mucha
Multiscale Modeling and Simulation 15(1), 537-574 (2017).
- Detectability thresholds and optimal algorithms for community structure in dynamic networks.
A. Ghasemian, P. Zhang, A. Clauset, C. Moore, and L. Peel
Physical Review X 6, 031005 (2016).
- Structure and inference in annotated networks.
M. E. J. Newman and A. Clauset
Nature Communications 7, 11863 (2016). [NComms version] [video presentation]
- Gender, productivity, and prestige in computer science faculty hiring networks.
S. F. Way, D. B. Larremore, and A. Clauset
Proc. 2016 World Wide Web Conference (WWW), 1169-1179 (2016).
- Predicting sports scoring dynamics with restoration and anti-persistence.
L. Peel and A. Clauset
Proc. 2015 IEEE International Conference on Data Mining (ICDM), 339-348 (2015).
- Ape origins of human malaria virulence genes.
D. B. Larremore, S. A. Sundararaman, W. Liu, W. R. Proto, A. Clauset, D. E. Loy, S. Speede, P. M. Sharp, B. H. Hahn, J. C. Rayner, and C. O. Buckee
Nature Communications 6, 8368 (2015).
- Assembling thefacebook: Using heterogeneity to understand online social network assembly.
A. Z. Jacobs, S. F. Way, J. Ugander and A. Clauset
Proc. ACM Web Science Conference (WebSci), article 18 (2015). [data supplement]
- Safe leads and lead changes in competitive team sports.
A. Clauset, M. Kogan and S. Redner
Physical Review E 91, 062815 (2015).
- Systematic inequality and hierarchy in faculty hiring networks.
A. Clauset, S. Arbesman and D. B. Larremore
Science Advances 1(1), e1400005 (2015). (code and data) (visualizations)
Perspective piece in Slate, with Joel Warner
- Detecting change points in the large-scale structure of evolving networks.
L. Peel and A. Clauset
Proc. of the 29th International Conference on Artificial Intelligence (AAAI), 2914-2920 (2015). (download the code)
- Learning latent block structure in weighted networks.
C. Aicher, A. Z. Jacobs and A. Clauset
Journal of Complex Networks 3(2), 221-248 (2015). (download the code) [JCN version]
- Forecasting of the risk of extreme massacres in Syria.
A. Scharpf, G. Schneider, A. Noh and A. Clauset
European Review of International Studies 1(2), 50-68 (2014).
- Efficiently inferring community structure in bipartite networks.
D. B. Larremore, A. Clauset and A. Z. Jacobs
Physical Review E 90, 012805 (2014). (download the code and data) [PRE version]
- Exploring community structure in biological networks with random graphs.
P. Sah, L. O. Singh, A. Clauset and S. Bansal
BMC Bioinformatics 15, 220 (2014).
- Scoring dynamics across professional team sports: tempo, balance and predictability.
S. Merritt and A. Clauset
EPJ Data Science 3, 4 (2014). [EPJ Data Science version]
- Power-law distributions in binned empirical data.
Y. Virkar and A. Clauset
Annals of Applied Statistics 8(1), 89-119 (2014). (download the code) [AoAS version]
- Body mass evolution and diversification within horses (family Equidae).
L. Shoemaker and A. Clauset
Ecology Letters 17(2), 211-220 (2014).
- Estimating the historical and future probabilities of large terrorist events.
A. Clauset and R. Woodard
Annals of Applied Statistics 7(4), 1838-1865 (2013). (download the code; video lecture, November 2013) [AoAS version]
- A network approach to analyzing highly recombinant malaria parasite genes.
D. B. Larremore, A. Clauset, and C. O. Buckee
PLOS Computational Biology 9(10), e1003268 (2013). [PLoS version]
- Environmental structure and competitive scoring advantages in team competitions.
S. Merritt and A. Clauset
Scientific Reports 3, Article number 3067 (2013). (video presentation, Spring 2013) [SciRep version]
- The Blood Trail of the Veto: A Forecast of the Risk of Extreme Massacres in Syria.
A. Scharpf, G. Schneider, A. Noh and A. Clauset
Zeitschrift für Friedens- und Konfliktforschung 2(1), 6-31 (2013). [In German.]
- Detecting friendship within dynamic online interaction networks.
S. Merritt, A. Z. Jacobs, W. Mason and A. Clauset
Proc. of the 7th International AAAI Conference on Weblogs and Social Media (ICWSM), 380-389 (2013).
- Transformation of Social Networks in the Late Prehispanic U.S. Southwest.
B. J. Mills, J. J. Clark, M. Peeples, W. R. Haas Jr., J. M. Roberts Jr., B. Hill, D. L. Huntley, L. Borck, R. L. Breiger, A. Clauset, and M. S. Shackley
Proc. Natl. Acad. Sci. USA 110(15), 5785-5790 (2013).
- How large should whales be?
A. Clauset
PLOS ONE 8(1), e53967 (2013). [PLoS version]
- Friends FTW! Friendship, Collaboration and Competition in Halo: Reach.
W. Mason and A. Clauset
Proc. of the 2013 Conf. on Computer Supported Cooperative Work (CSCW), 375-386 (2013).
- The developmental dynamics of terrorist organizations.
A. Clauset and K. S. Gleditsch
PLOS ONE 7(11), e48633 (2012). (video lecture, Summer 2009) [PLoS version]
- The performance of modularity maximization in practical contexts.
B. H. Good, Y.-A. de Montjoye and A. Clauset
Physical Review E 81, 046106 (2010). (download the code; video presentation, Fall 2010)
- The Strategic Calculus of Terrorism: Substitution and Competition in the Israel-Palestine Conflict.
A. Clauset, L. Heger, M. Young and K. S. Gleditsch
Cooperation & Conflict 45(1), 6-33 (2010). [C & C version]
- A generalized aggregation-disintegration model for the frequency of severe terrorist attacks.
A. Clauset and F. W. Wiegel
Journal of Conflict Resolution 54(1), 179-197 (2010).
- Power-law distributions in empirical data.
A. Clauset, C. R. Shalizi and M. E. J. Newman
SIAM Review 51(4), 661-703 (2009). (download the code)
- On the Bias of Traceroute Sampling.
D. Achlioptas, A. Clauset, D. Kempe and C. Moore
Journal of the ACM 56(4), article 21, 28 pages (2009). [ACM version]
- Evolutionary Model of Species Body Mass Diversification.
A. Clauset and S. Redner
Physical Review Letters 102, 038103 (2009).
- Methodologies for Continuous Cellular Tower Data Analysis.
N. Eagle, J. Quinn and A. Clauset
Proc. 7th International Conference on Pervasive Computing (Pervasive '09), 342-353 (2009).
- How many species have mass M?
A. Clauset, D. J. Schwab and S. Redner
American Naturalist 173, 256-263 (2009).
- Controlling across complex networks - Emerging links between networks and control.
A. Clauset, H. G. Tanner, C. T. Abdallah and R. H. Byrne
Annual Reviews in Control 32, 183-192 (2008).
- The evolution and distribution of species body size.
A. Clauset and D. H. Erwin
Science 321, 399-401 (2008). [free reprint via Science]
Accompanying Perspectives piece.
- Hierarchical structure and the prediction of missing links in networks.
A. Clauset, C. Moore and M. E. J. Newman
Nature 453, 98-101 (2008). (download the code) [Nature version]
Accompanying News & Views piece.
- On the Frequency of Severe Terrorist Attacks.
A. Clauset, M. Young and K. S. Gleditsch
Journal of Conflict Resolution 51(1), 58-88 (2007). (First pre-print version: physics/0502014 in Feb. 2005; replication data) [JCR version]
- Scale Invariance in Road Networks.
V. Kalapala, V. Sanwalani, A. Clauset and C. Moore
Physical Review E 73, 026130 (2006).
- Molecular modeling of mono- and bis-quaternary ammonium salts as ligands at the α4β2 nicotinic acetylcholine receptor subtype using nonlinear techniques.
J. T. Ayers, A. Clauset, J. D. Schmitt, L. P. Dwoskin and P. A. Crooks
American Association of Pharmaceutical Scientists Journal 7(3), E678-85 (2005).
- Supervised Self-Organizing Maps in QSAR I: Robust behavior with underdetermined datasets.
Y. D. Xiao, A. Clauset, R. Harris, E. Bayram, P. Santago II, and J. D. Schmitt
Journal of Chemical Information and Modeling 45(6), 1749-1758 (2005).
- Finding local community structure in networks.
A. Clauset
Physical Review E 72, 026132 (2005).
- On the bias of traceroute sampling; or, Power-law degree distributions in regular graphs.
D. Achlioptas, A. Clauset, D. Kempe and C. Moore
Proc. 37th ACM Symposium on Theory of Computing (STOC) (Baltimore, May 2005).
- Accuracy and Scaling Phenomena in Internet Mapping.
A. Clauset and C. Moore
Physical Review Letters 94, 018701 (2005).
- Finding community structure in very large networks.
A. Clauset, M. E. J. Newman and C. Moore
Physical Review E 70, 066111 (2004). (download the code)
- Genetic Algorithms and Self-Organizing Maps: A Powerful Combination for Modeling Complex QSAR and QSPR Problems.
E. Bayram, P. Santago II, R. Harris, Y. D. Xiao, A. Clauset and J. D. Schmitt
Journal of Computer-Aided Molecular Design 18(7-9), 483-493 (2004).
Book Chapters
- On the frequency and severity of interstate wars.
A. Clauset
In N. P. Gleditsch (Ed.), Lewis F. Richardson -- His Intellectual Legacy and Influence in the Social Sciences, Springer, Pioneers in Arts, Humanities, Science, Engineering, Practice (2020). [Springer entire book]
- Trends in Conflict.
K. S. Gleditsch and A. Clauset
In A. Gheciu and W. C. Wohlforth (Eds.), The Oxford Handbook of International Security (pp 227-244) Oxford University Press (2018).
Essays and Perspectives
- Decoding the dynamic tumor microenvironment.
A. Clauset, K. Behbakht and B. G. Bitler
Science Advances 7(23), eabi5904 (2021).
- Data-driven predictions in the science of science.
A. Clauset, D. B. Larremore, and R. Sinatra
Science 355(6324), 477-480 (2017). [Invited] [Science version] [video 2018] [slides and video 2021]
- Synthesis aided design: The biological design-build-test engineering paradigm?
R. T. Gill, A. L. Halweg-Edwards, S. F. Way and A. Clauset
Biotechnology and Bioengineering 113(1), 7-10 (2016).
Workshop Papers
- If the data do not speak for themselves, how ought we to speak for the data?
I. V. Buskirk, B. Zaharatos, A. Clauset, D. B. Larremore
ICWSM Workshop on Disrupt, Ally, Resist, Embrace (DARE): Action Items for Computational Social Scientists in a Changing World (D.A.R.E. Workshop 2023).
- Highly Accurate Link Prediction in Networks Using Stacked Generalization.
A. Ghasemian, A. Galstyan, and A. Clauset
WSDM International Workshop on Heterogeneous Networks Analysis and Mining (HeteroNAM 2018).
- A unified view of generative models for networks: models, methods, opportunities, and challenges.
A. Z. Jacobs and A. Clauset
NIPS Workshop on Networks: From Graphs to Rich Data (2014).
- Change-point detection in temporal networks using hierarchical random graphs.
L. Peel and A. Clauset
KDD Workshop on Outlier Detection & Description under Data Diversity (2014). (download the code)
- Social Network Dynamics in a Massive Online Game: Network Turnover, Non-densification, and Team Engagement in Halo Reach.
S. Merritt and A. Clauset
Eleventh Workshop on Mining and Learning with Graphs (MLG) (2013).
- Adapting the Stochastic Block Model to Edge-Weighted Networks.
C. Aicher, A. Z. Jacobs and A. Clauset
ICML Workshop on Structured Learning (SLG 2013). (download the code)
- Location Segmentation, Inference and Prediction for Anticipatory Computing.
N. Eagle, A. Clauset and J. Quinn
Proc. AAAI Spring Symposium, 20-25 (2009).
- Persistence and periodicity in a dynamic proximity network.
A. Clauset and N. Eagle
DIMACS Workshop on Computational Methods for Dynamic Interaction Networks (Piscataway), 2007.
- Structural Inference of Hierarchies in Networks.
A. Clauset, C. Moore and M. E. J. Newman
In E. M. Airoldi et al. (Eds.): ICML 2006 Ws, Lecture Notes in Computer Science 4503, 1-13. Springer-Verlag, Berlin Heidelberg (2007).
Popular Press
- More Inclusive Scholarship Begins With Active Experimentation.
D. B. Larremore, A. C. Morgan and A. Clauset
The Chronicle of Higher Education, published online 1 November (2017).
- Why predicting the future is more than just horseplay.
D. B. Larremore and A. Clauset
Christian Science Monitor, published online 24 April (2017).
- The Academy’s Dirty Secret.
J. Warner and A. Clauset
Slate, published online 23 February (2015).
- What Same-Sex Marriage Means for the Future of Recreational Weed.
J. Warner and A. Clauset
Pacific Standard, published online 24 October (2014).
Preprints and Other Publications
- Optimizing polymerase chain reaction (PCR) using machine learning.
N. J. Cordaro, A. J. Kavran, M. Smallegan, M. Palacio, N. Lammer, T. S. Brant, V. DuMont, N. Doherty Garcia, S. Miller, T. Jourabchi, S. L. Sawyer, and A. Clauset
Preprint, bioRxiv DOI:10.1101/2021.08.12.455589 (2021).
- Predicting the outcomes of policy diffusion from U.S. states to federal law.
N. Connor and A. Clauset
Preprint, arXiv:1810.08988 (2018).
- Thermodynamics of the minimum description length on community detection.
J. I. Perotti, C. J. Tessone, A. Clauset and G. Caldarelli
Preprint, arXiv:1806.07005 (2018).
- Characterizing the structural diversity of complex networks across domains.
K. Ikehara and A. Clauset
Preprint, arXiv:1710.11304 (2017).
- The evolution of primate body size: left-skewness, maximum size, and Cope's rule.
R. C. Tillquist, L. Shoemaker, K. B. Knight, and A. Clauset
Preprint, bioRxiv DOI:10.1101/092866 (2016).
- Revisiting the effect of red on competition in humans.
L. Fortunato and A. Clauset
Preprint, bioRxiv DOI:10.1101/086710 (2016).
- Untangling the roles of parasites in food webs with generative network models.
A. Z. Jacobs, J. A. Dunne, C. Moore, and A. Clauset
Preprint, arXiv:1505.04741 (2015).
- Rejoinder of "Estimating the historical and future probabilities of large terrorist events".
A. Clauset and R. Woodard
Annals of Applied Statistics 7(4), 1895-1897 (2013). [AoAS version]
- Adapting to Non-stationarity with Growing Expert Ensembles.
C. R. Shalizi, A. Z. Jacobs, K. L. Klinkner and A. Clauset
Preprint, arXiv:1103.0949 (2011).
- A Novel Explanation of the Power-Law Form of the Frequency of Severe Terrorist Events: Reply to Saperstein.
A. Clauset, M. Young and K.S. Gleditsch
Peace Economics, Peace Science and Public Policy 16(1), Article 12 (2010).
- Story-telling, Statistics, And Other Grave Scientific Insults.
A. Clauset
Nature Soapbox Science Blog, posted 27 October (2010).
- A theoretician ponders what physics has to offer ecology.
A. Clauset
Nature 465, 139 (2010).
- Multi-dimensional Edge Inference: Response to Comment by Dr. Adams.
N. Eagle, A. Clauset, A. Pentland and D. Lazer
Proc. Natl. Acad. Sci. USA 107(9), E31 (2010).
- Comment on Yu et al., 'High Quality Binary Protein Interaction Map of the Yeast Interactome Network.' Science 322, 104 (2008).
A. Clauset
Preprint, arXiv:0901.0530 (2009).
- How do networks become navigable?
A. Clauset and C. Moore
Preprint, arXiv:cond-mat/0309415 (2003).
- Chaos You Can Play In.
A. Clauset, N. Grigg, M. Lim and E. Miller
Proc. 2003 Santa Fe Institute Complex Systems Summer School (Santa Fe, July 2003).