Synopsis
Data Skeptic is a data science podcast exploring machine learning, statistics, artificial intelligence, and other data topics through short tutorials and interviews with domain experts.
Episodes
-
Interpretable Real Estate Recommendations
22/09/2025 Duration: 32min
In this episode of Data Skeptic's Recommender Systems series, host Kyle Polich interviews Dr. Kunal Mukherjee, a postdoctoral research associate at Virginia Tech, about the paper "Z-REx: Human-Interpretable GNN Explanations for Real Estate Recommendations". The discussion explores how the post-COVID real estate landscape has created a need for better recommendation systems that can introduce home buyers to emerging neighborhoods they might not know about. Dr. Mukherjee explains how his team developed a graph neural network approach that not only recommends properties but also provides human-interpretable explanations for why certain regions are suggested. The conversation covers the advantages of using graph-based models over traditional recommendation systems, the importance of regional context in real estate features, and how co-click data from similar users can create more effective recommendations. Key topics include the distinction between model developer explanations and end-user explanations, the challenge
-
Why Am I Seeing This?
08/09/2025 Duration: 49min
In this episode of Data Skeptic, we explore the challenges of studying social media recommender systems when exposure data isn't accessible. Our guests Sabrina Guidotti, Gregor Donabauer, and Dimitri Ognibene introduce their innovative "recommender neutral user model" for inferring the influence of opaque algorithms.
-
Eco-aware GNN Recommenders
30/08/2025 Duration: 44min
In this episode of Data Skeptic, we dive into eco-friendly AI with Antonio Purificato, a PhD student from Sapienza University of Rome. Antonio discusses his research on "EcoAware Graph Neural Networks for Sustainable Recommendations" and explores how we can measure and reduce the environmental impact of recommender systems without sacrificing performance.
-
Networks and Recommender Systems
17/08/2025 Duration: 17min
Kyle reveals that next season's topic will be "Recommender Systems". Asaf shares insights on how network science contributes to the recommender system field.
-
Network of Past Guests Collaborations
21/07/2025 Duration: 34min
Kyle and Asaf discuss a project in which we link former guests of the podcast based on their co-authorship of academic papers.
-
The Network Diversion Problem
06/07/2025 Duration: 46min
In this episode, Professor Pål Grønås Drange from the University of Bergen introduces the field of Parameterized Complexity - a powerful framework for tackling hard computational problems by focusing on specific structural aspects of the input. This framework allows researchers to solve NP-complete problems more efficiently when certain parameters, like the structure of the graph, are "well-behaved". At the center of the discussion is the network diversion problem, where the goal isn't to block all routes between two points in a network, but to force flow - such as traffic, electricity, or data - through a specific path. While this problem appears deceptively similar to the classic "Min-Cut/Max-Flow" problem, it turns out to be much harder and, in general, its complexity is still unknown. Parameterized complexity plays a key role here by offering ways to make the problem tractable under constraints like low treewidth or planarity, which often exist in real-world networks like road systems or utility grids.
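For contrast with the network diversion problem described above, here is a minimal sketch of the classic max-flow computation, whose value equals the minimum cut capacity by the max-flow/min-cut theorem. It uses the Edmonds-Karp method; the function name `max_flow` and the toy network are illustrative assumptions, not material from the episode, and the sketch does not solve network diversion itself.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp: push flow along shortest augmenting paths until none remain."""
    # Build residual capacities, adding zero-capacity reverse edges.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)

    flow = 0
    while True:
        # Breadth-first search for a source-to-sink path with spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow  # no augmenting path left: flow is maximal

        # Find the bottleneck capacity along the path found.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push the bottleneck amount along the path, updating reverse edges.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


# Hypothetical toy network: edge capacities from node to node.
toy_network = {
    "s": {"a": 3, "b": 2},
    "a": {"b": 1, "t": 2},
    "b": {"t": 3},
    "t": {},
}
toy_result = max_flow(toy_network, "s", "t")
print(toy_result)
```

The min-cut question asks for the cheapest set of edges that separates two nodes, and this computation answers it in polynomial time. The diversion problem instead asks which edges to remove so that all remaining flow is forced through a chosen link, and that is where the difficulty described above arises.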
-
Complex Dynamic in Networks
28/06/2025 Duration: 56min
In this episode, we learn why simply analyzing the structure of a network is not enough, and how the dynamics - the actual mechanisms of interaction between components - can drastically change how information or influence spreads. Our guest, Professor Baruch Barzel of Bar-Ilan University, is a leading researcher in network dynamics and complex systems ranging from biology to infrastructure and beyond.
Links: BarzelLab; BarzelLab on YouTube
Paper in focus: Universality in network dynamics, 2013
-
Github Network Analysis
22/06/2025 Duration: 36min
In this episode we'll discuss how to use GitHub data as a network to extract insights about teamwork. Our guest, Gabriel Ramirez, manager of the notifications team at GitHub, will show how to apply network analysis to better understand and improve collaboration within his engineering team by analyzing GitHub metadata - such as pull requests, issues, and discussions - as a bipartite graph of people and projects. Some insights we'll discuss are how network centrality measures (like eigenvector and betweenness centrality) reveal organizational dynamics, how vacation patterns influence team connectivity, and how decentralizing communication hubs can foster healthier collaboration. Gabriel's open-source project, GH Graph Explorer, enables other managers and engineers to extract, visualize, and analyze their own GitHub activity using tools like Python, Neo4j, Gephi and LLMs for insight generation, but always remember: don't take the results at face value. Instead, use the results to guide your qualitative investi
-
Networks and Complexity
14/06/2025 Duration: 17min
In this episode, Kyle gives an overview of the intersection of graph theory and computational complexity theory. In complexity theory, we care about the runtime of an algorithm as a function of its input size. For many graph problems, the interesting questions we want to ask take longer and longer to answer! This episode provides the fundamental vocabulary and signposts along the path of exploring the intersection of graph theory and computational complexity theory.
-
Graphs for Causal AI
24/05/2025 Duration: 41min
How can we build artificial intelligence systems that understand cause and effect, moving beyond simple correlations? As we all know, correlation is not causation. "Spurious correlations" can show, for example, how rising ice cream sales might statistically link to more drownings, not because one causes the other, but due to an unobserved common cause like warm weather. Our guest, Utkarshani Jaimini, a researcher from the University of South Carolina's Artificial Intelligence Institute, tries to tackle this problem by using knowledge graphs that incorporate domain expertise. Knowledge graphs (structured representations of information) are combined with neural networks in the field of neurosymbolic AI to represent and reason about complex relationships. This involves creating causal ontologies and incorporating the "weight" or strength of causal relationships and hyperrelations. This field has many practical applications, such as AI explainability, healthcare and autonomous driving. Follow our guest Utkarshani Ja
-
Network Manipulation
30/04/2025 Duration: 40min
In this episode we talk with Manita Pote, a PhD student at Indiana University Bloomington, specializing in online trust and safety, with a focus on detecting coordinated manipulation campaigns on social media. Key insights include how coordinated reply attacks target influential figures like journalists and politicians, how machine learning models can detect these inauthentic campaigns using structural and behavioral features, and how deletion patterns reveal efforts to evade moderation or manipulate engagement metrics.
Follow our guest: X/Twitter, Google Scholar
Papers in focus: Coordinated Reply Attacks in Influence Operations: Characterization and Detection, 2025; Manipulating Twitter through Deletions, 2022
-
The Small World Hypothesis
21/04/2025 Duration: 17min
Kyle discusses the history and proof for the small world hypothesis.
-
Thinking in Networks
12/04/2025 Duration: 33min
Kyle asks Asaf questions about the new network science course he is now teaching. The conversation delves into topics such as contact tracing, tools for analyzing networks, example use cases, and the importance of thinking in networks.
-
Fraud Networks
01/04/2025 Duration: 42min
In this episode we talk with Bavo DC Campo, a data scientist and statistician, who shares his expertise on the intersection of actuarial science, fraud detection, and social network analytics. Together we will learn how to use graphs to fight against insurance fraud by uncovering hidden connections between fraudulent claims and bad actors. Key insights include how social network analytics can detect fraud rings by mapping relationships between policyholders, claims, and service providers, and how the BiRank algorithm, inspired by Google's PageRank, helps rank suspicious claims based on network structure. Bavo will also present his iFraud simulator, which can be used to model fraudulent networks for detection training purposes. Do you have a question about fraud detection? Bavo says he will gladly help. Feel free to contact him.
Want to listen ad-free? Try our Graphs Course? Join Data Skeptic+ for $5 / month or $50 / year: https://plus.dataskeptic.com
-
Criminal Networks
17/03/2025 Duration: 43min
In this episode we talk with Justin Wang Ngai Yeung, a PhD candidate at the Network Science Institute at Northeastern University in London, who explores how network science helps uncover criminal networks. Justin is also a member of the organizing committee of the satellite conference dealing with criminal networks at the network science conference in The Netherlands in June 2025. Listeners will learn how graph-based models assist law enforcement in analyzing missing data, identifying key figures in criminal organizations, and improving intervention strategies. Key insights include the challenges of incomplete and inaccurate data in criminal network analysis, how law enforcement agencies use network dismantling techniques to disrupt organized crime, and the role of machine learning in predicting hidden connections within illicit networks.
-
Graph Bugs
10/03/2025 Duration: 29min
In this episode, today's guest Celine Wüst, a master's student at ETH Zurich specializing in secure and reliable systems, shares her work on automated software testing for graph databases. Celine shows how fuzzing (the process of automatically generating complex queries) helps uncover hidden bugs in graph database management systems like Neo4j, FalconDB, and Apache AGE. Key insights include how state-aware query generation can detect critical issues like buffer overflows and crashes, the challenges of debugging complex database behaviors, and the importance of security-focused software testing. We'll also find out which Graph DB company offers swag for finding bugs in its software and get Celine's advice about which graph DB to use.
-
Organizational Network Analysis
03/03/2025 Duration: 44min
In this episode, Gabriel Petrescu, an organizational network analyst, discusses how network science can provide deep insights into organizational structures using OrgXO, a tool that maps companies as networks rather than rigid hierarchies. Listeners will learn how analyzing workplace collaboration networks can reveal hidden influencers, organizational bottlenecks, and engagement levels, offering a data-driven approach to improving effectiveness and resilience. Key insights include how companies can identify overburdened employees, address silos between departments, and detect vulnerabilities where too few individuals hold critical knowledge. Real-life applications range from mergers and acquisitions, where network analysis helps assess company dynamics before an acquisition, to restructuring efforts that improve workflow and team collaboration. Gabriel's work highlights how organizations can shift from traditional hierarchical thinking to a network-based perspective, leading to smarter decision-making and mor
-
Organizational Networks
25/02/2025 Duration: 27min
Is it better to have your work team fully connected or sparsely connected? In this episode we'll try to answer this question and more with our guest Hiroki Sayama, a SUNY Distinguished Professor and director of the Center for Complex Systems at Binghamton University. Hiroki delves into the applications of network science in organizational structures and innovation dynamics by showing his recent work on extracting network structures from organizational charts to enable insights into decision-making and performance. He'll also cover how network connectivity impacts team creativity and innovation. Key insights include how the structure of organizational networks, such as the depth of hierarchy or proximity to leadership, can influence corporate performance and how sparse network connectivity fosters more diverse and innovative ideas than fully connected networks.