Data Skeptic

  • Author: Various
  • Narrator: Various
  • Publisher: Podcast
  • Duration: 299:48:45

Synopsis

Data Skeptic is a data science podcast exploring machine learning, statistics, artificial intelligence, and other data topics through short tutorials and interviews with domain experts.

Episodes

  • The Defeat of the Winograd Schema Challenge

    11/09/2023 Duration: 31min

    Our guest today is Vid Kocijan, a Machine Learning Engineer at Kumo AI. Vid holds a Ph.D. in Computer Science from the University of Oxford, where his research focused on commonsense reasoning, pre-training of LLMs, pre-training for knowledge base completion, and how such pre-training impacts societal bias. He joins us to discuss how he built a BERT model that solved the Winograd Schema Challenge.

  • LLMs in Social Science

    04/09/2023 Duration: 34min

    Today, we are joined by Petter Törnberg, an Assistant Professor in Computational Social Science at the University of Amsterdam and a Senior Researcher at the University of Neuchâtel. His research centers on computational methods and their applications in the social sciences. He joins us to discuss findings from two of his papers: "ChatGPT-4 Outperforms Experts and Crowd Workers in Annotating Political Twitter Messages with Zero-Shot Learning" and "How to Use LLMs for Text Analysis."

  • LLMs in Music Composition

    28/08/2023 Duration: 33min

    In this episode, we are joined by Carlos Hernández Oliván, a Ph.D. student at the University of Zaragoza whose research focuses on building new models for symbolic music generation. Carlos shared his thoughts on whether these models are genuinely creative, described situations where AI-generated music can pass the Turing test, and outlined some essential considerations when constructing models for music composition.

  • Cuttlefish Model Tuning

    21/08/2023 Duration: 27min

    Hongyi Wang, a Senior Researcher in the Machine Learning Department at Carnegie Mellon University, joins us. His research sits at the intersection of systems and machine learning. On today’s show, he discussed his research paper, Cuttlefish: Low-Rank Model Training without All the Tuning. Hongyi started by sharing his thoughts on whether developers need to learn how to fine-tune models. He then spoke about the need to optimize the training of ML models, especially as these models grow bigger, noting that data centers have the hardware to train these large models, but the broader community does not. He then spoke about the Low-Rank Adaptation (LoRA) technique and where it is used. Hongyi discussed the Cuttlefish model and how it improves on LoRA, and shared the use cases of Cuttlefish and who should use it. Rounding up, he gave his advice on how people can get into the machine learning field and shared his future research ideas.
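
    The Low-Rank Adaptation (LoRA) technique mentioned in this episode can be sketched in a few lines: a frozen pretrained weight matrix W is left untouched, and only two small factors B and A are trained, whose rank-r product forms the update. The dimensions and initialization below are illustrative, not taken from the paper.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    d, k, r = 64, 64, 4  # full weight dims and low rank (r << d, k)

    W = rng.standard_normal((d, k))          # frozen pretrained weight
    A = rng.standard_normal((r, k)) * 0.01   # trainable low-rank factor
    B = np.zeros((d, r))                     # zero init, so the update starts at 0

    def lora_forward(x):
        # base path plus low-rank update: W x + B (A x)
        return W @ x + B @ (A @ x)

    x = rng.standard_normal(k)
    # with B = 0, the adapted model matches the frozen base exactly
    assert np.allclose(lora_forward(x), W @ x)
    ```

    The appeal is the parameter count: training A and B touches r·(d+k) numbers instead of the d·k in W, which is why the technique matters as models grow bigger.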

  • Which Professions Are Threatened by LLMs

    15/08/2023 Duration: 38min

    On today’s episode, we have Daniel Rock, an Assistant Professor of Operations, Information and Decisions at the Wharton School of the University of Pennsylvania. Daniel’s research focuses on the economics of AI and ML, specifically how digital technologies are changing the economy. Daniel discussed how AI has disrupted the job market in past years, and explained that it has created more winners than losers. Daniel spoke about the empirical study he and his coauthors did to quantify the threat LLMs pose to professionals. He shared how they used the O*NET dataset and the BLS occupational employment survey to measure the impact of LLMs on different professions. Using the radiology profession as an example, he listed tasks that LLMs could assume. Daniel broadly highlighted the professions that are most and least exposed to LLM proliferation. He also spoke about the risks of LLMs and his thoughts on implementing policies for regulating them.

  • Why Prompting is Hard

    08/08/2023 Duration: 48min

    We are excited to be joined by J.D. Zamfirescu-Pereira, a Ph.D. student at UC Berkeley. He focuses on the intersection of human-computer interaction (HCI) and artificial intelligence (AI). He joins us to share his work in his paper, Why Johnny Can’t Prompt: How Non-AI Experts Try (and Fail) to Design LLM Prompts. The discussion also explores lessons learned and achievements related to BotDesigner, a tool for creating chatbots.

  • Automated Peer Review

    31/07/2023 Duration: 36min

    In this episode, we are joined by Ryan Liu, a Computer Science graduate of Carnegie Mellon University. Ryan will begin his Ph.D. program at Princeton University this fall, focusing on the intersection of large language models and how humans think. Ryan joins us to discuss his research titled "ReviewerGPT? An Exploratory Study on Using Large Language Models for Paper Reviewing."

  • Prompt Refusal

    24/07/2023 Duration: 44min

    The creators of large language models impose restrictions on some of the types of requests one might make of them. LLMs commonly refuse to give advice on committing crimes, to produce adult content, or to respond with details about a variety of sensitive subjects. As with any content filtering system, there are false positives and false negatives. Today's interview with Max Reuter and William Schulze discusses their paper "I'm Afraid I Can't Do That: Predicting Prompt Refusal in Black-Box Generative Language Models". In this work, they explore which types of prompts get refused and build a machine learning classifier adept at predicting whether a particular prompt will be refused.

  • A Long Way Till AGI

    18/07/2023 Duration: 37min

    Our guest today is Maciej Świechowski. Maciej is affiliated with QED Software and QED Games. He has a Ph.D. in Systems Research from the Polish Academy of Sciences. Maciej joins us to discuss findings from his study, Deep Learning and Artificial General Intelligence: Still a Long Way to Go.

  • Brain Inspired AI

    11/07/2023 Duration: 36min

    Today on the show, we are joined by Lin Zhao and Lu Zhang. Lin is a Senior Research Scientist at United Imaging Intelligence, while Lu is a Ph.D. candidate in the Department of Computer Science and Engineering at the University of Texas. They shared findings from their work, When Brain-inspired AI Meets AGI. Lin and Lu began by discussing the connections between the brain and neural networks, covering the similarities as well as the differences, and whether advances in neural networks could plausibly reach the point of AGI. They explained how a deeper understanding of the brain can help drive more robust artificial intelligence systems, and how the brain inspired popular machine learning architectures like transformers. They also discussed how AI models can learn alignment from the human brain, and contrasted the brain's low energy usage with that of high-end computers, asking whether computers can become more energy efficient.

  • Computable AGI

    03/07/2023 Duration: 36min

    On today’s show, we are joined by Michael Timothy Bennett, a Ph.D. student at the Australian National University. Michael’s research is centered around Artificial General Intelligence (AGI), specifically the mathematical formalism of AGIs. He joins us to discuss findings from his study, Computable Artificial General Intelligence.

  • AGI Can Be Safe

    26/06/2023 Duration: 45min

    We are joined by Koen Holtman, an independent AI researcher focusing on AI safety. Koen is the Founder of Holtman Systems Research, a research company based in the Netherlands. Koen started the conversation with his take on an AI apocalypse in the coming years. He discussed the obedience problem with AI models and what a safe form of obedience looks like. Koen explained the concept of a Markov Decision Process (MDP) and how it is used to build machine learning models. He spoke about the problem of AGIs not allowing changes to their utility function after the model is deployed, and shared an alternative approach to solving that problem. He explained how to safely engineer AGI systems now and in the future, and how to implement safety layers on AI models. Koen discussed the ultimate goal of a safe AI system and how to verify that an AI system is indeed safe. He also discussed the intersection between large language models (LLMs) and MDPs, and shared the key ingredients needed to scale current AI implementations.
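
    The Markov Decision Process formalism mentioned in this episode can be made concrete with a toy example: a set of states, actions with stochastic transitions and rewards, and a discount factor, solved by value iteration. Everything below (the two states, the action names, and the reward numbers) is a made-up illustration, not anything from the conversation.

    ```python
    # P[s][a] = list of (probability, next_state, reward) outcomes
    P = {
        0: {"stay": [(1.0, 0, 0.0)], "go": [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
        1: {"stay": [(1.0, 1, 2.0)], "go": [(1.0, 0, 0.0)]},
    }
    gamma = 0.9  # discount factor

    # value iteration: repeatedly apply the Bellman optimality update
    V = {s: 0.0 for s in P}
    for _ in range(200):
        V = {
            s: max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in P[s].values()
            )
            for s in P
        }
    # V[1] converges to 2 / (1 - gamma) = 20, since "stay" earns 2 forever
    ```

    The utility-function concern from the episode maps directly onto this picture: an agent maximizing these discounted rewards has no incentive to let its reward definition be changed after deployment.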

  • AI Fails on Theory of Mind Tasks

    19/06/2023 Duration: 52min

    An Assistant Professor of Psychology at Harvard University, Tomer Ullman, joins us. Tomer discussed theory of mind and whether machines can genuinely pass tests of it. Using variations of the Sally-Anne test and the Smarties tube test, he explained how LLMs can fail theory-of-mind tests.

  • AI for Mathematics Education

    12/06/2023 Duration: 35min

    The application of LLMs cuts across various industries. Today, we are joined by Steven Van Vaerenbergh, who discussed the application of AI in mathematics education. He discussed how AI tools have changed the landscape of solving mathematical problems. He also shared LLMs' current strengths and weaknesses in solving math problems.

  • Evaluating Jokes with LLMs

    06/06/2023 Duration: 43min

    Fabricio Goes, a Lecturer in Creative Computing at the University of Leicester, joins us today. Fabricio discussed what creativity entails and how to evaluate jokes with LLMs. He specifically shared the process of evaluating jokes with GPT-3 and GPT-4. He concluded with his thoughts on the future of LLMs for creative tasks.

  • Why Machines Will Never Rule the World

    29/05/2023 Duration: 55min

    Barry Smith and Jobst Landgrebe, authors of the book “Why Machines Will Never Rule the World,” join us today. They discussed the limitations of AI systems in today’s world and shared detailed reasons why AI will struggle to attain the level of human intelligence.

  • A Psychopathological Approach to Safety in AGI

    23/05/2023 Duration: 49min

    While the possibilities of AGI emergence seem great, they also raise safety concerns. On the show, Vahid Behzadan, an Assistant Professor of Computer Science and Data Science, joins us to discuss the complexities of modeling AGIs that accurately achieve their objective functions. He touched on related issues such as abstractions during training, the problem of unpredictability, and communication among agents.

  • The NLP Community Metasurvey

    15/05/2023 Duration: 49min

    Julian Michael, a postdoc at the Center for Data Science, New York University, joins us today. Julian’s conversation with Kyle was centered on the NLP community metasurvey: a survey aimed at understanding expert opinions on controversial NLP issues. He shared the process of preparing the survey as well as some shocking results.

  • Skeptical Survey Interpretation

    10/05/2023 Duration: 21min

    Kyle shares his own perspectives on the challenges of getting insight from surveys. The discussion ranges from commentary on the market research industry to specific advice for detecting disingenuous or fraudulent responses and filtering them from your analysis. Finally, he shares some quick thoughts on using the chi-square test to interpret cross-tab results in survey analysis.
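
    The chi-square test Kyle mentions for cross-tab results can be shown with a small worked example. The observed counts below are hypothetical, and the statistic is computed from scratch to expose the mechanics: expected counts under independence, then the sum of squared deviations scaled by those expectations.

    ```python
    # hypothetical 2x2 cross tab, e.g. response option vs. respondent segment
    observed = [[30, 20],
                [20, 30]]

    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / total  # expected count under independence
            chi2 += (o - e) ** 2 / e

    print(chi2)  # prints 4.0
    ```

    With one degree of freedom for a 2x2 table, a statistic of 4.0 exceeds the 3.84 critical value at the 0.05 level, so this hypothetical cross tab would show a significant association. In practice one would use a library routine (e.g. scipy.stats.chi2_contingency) rather than hand-rolling the sum.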

  • The Gallup Poll

    01/05/2023 Duration: 40min

    Jeff Jones, a Senior Editor at Gallup, joins us today. His conversation with Kyle spanned a range of topics on Gallup’s poll creation process. He discussed how Gallup generates unbiased questionnaires, gets respondents, analyzes results, and everything in between.
