Data Skeptic

  • Author: Various
  • Narrator: Various
  • Publisher: Podcast
  • Duration: 291:45:45

Information:

Synopsis

Data Skeptic is a data science podcast exploring machine learning, statistics, artificial intelligence, and other data topics through short tutorials and interviews with domain experts.

Episodes

  • Do Results Generalize for Privacy and Security Surveys

    17/01/2023 Duration: 40min

    Today, Jenny Tang, a Ph.D. student of societal computing at Carnegie Mellon University, discusses her work on how well privacy and security survey results generalize on platforms such as Amazon MTurk and Prolific. Jenny shares the drawbacks of using such online platforms, the discrepancies observed in the samples drawn, and key insights from her results.

  • 4 out of 5 Data Scientists Agree

    10/01/2023 Duration: 28min

    This episode kicks off the new season of the show, Data Skeptic: Surveys. Linhda rejoins the show for a conversation with Kyle about her experience taking surveys and what questions she has for the season. Lastly, Kyle announces survey.dataskeptic.com, a new site we're launching to gather your opinions. Please take a moment and share your thoughts!

  • Crowdfunded Board Games

    26/12/2022 Duration: 34min

    It may be intuitive to think crowdfunding a project drives its innovation and novelty, but there are no empirical studies that prove this. On the show, Johannes Wachs shares his research that sought to determine whether crowdfunding truly drives innovation. He used board games as a case study and shared the results he found.

  • Russian Election Interference Effectiveness

    19/12/2022 Duration: 41min

    There were reports of Russia’s interference in the 2016 US elections. In today’s episode, Koustuv Saha, a researcher at Microsoft Research, walks us through the effect of targeted ads for political campaigns. Using practical examples, he discusses how targeted ads can propagate fake news, the ripple effects on electioneering, and how to find a sweet spot with targeted ads.

  • Placement Laundering Fraud

    15/12/2022 Duration: 32min

    There is an unsung kind of ad fraud brewing in the ad tech space — placement laundering fraud. On the show, Jeff Kline discusses what placement laundering fraud is, how it can be identified, and possible solutions to it. Listen to learn more.

  • Data Clean Rooms

    12/12/2022 Duration: 31min

    Bosko Milekic, the Co-founder of Optable, a data collaboration platform for the media and advertising industry, joins us today. Bosko talks about clean rooms, the technology driving data privacy during collaboration. He discusses why clean rooms are gaining widespread adoption and how users can use Optable’s clean room platform for a secure data-sharing experience.

  • Dark Patterns in Site Design

    05/12/2022 Duration: 34min

    Kerstin Bongard-Blanchy is a Research Associate at the University of Luxembourg. She joins us to discuss her study that investigated dark patterns in web design. She discusses the results, the effect of dark patterns on users, whether an average user can detect them, and the way forward to a more ethical web space.

  • Internet Advertising Bureau Media Lab

    03/12/2022 Duration: 37min

    We are joined by Anthony Katsur, the CEO of IAB Tech Lab. Anthony discusses standards within the ad tech industry. He explains how IAB Tech Lab sets and propagates global standards, the actions it takes to ensure compliance from advertisers, and industry trends toward a more privacy-centric ad tech space.

  • Your Mouse Reveals Your Gender and Age

    28/11/2022 Duration: 39min

    When we navigate a webpage, it is fairly easy for our mouse movements to be tracked and collected. Today, Luis Leiva, a Professor of Computer Science, discusses how this mouse tracking data can be used to predict age, gender, and user attention. He also discusses the privacy concerns with mouse tracking data and possible ways it can be curtailed.

  • Measuring Web Search Behavior

    21/11/2022 Duration: 36min

    On the show, Aleksandra Urman and Mykola Makhortykh join us to discuss their work on the comparative analysis of web search behavior using web tracking data. They share interesting results from their analysis, covering user preferences for search engines, demographic patterns, and differences between how men and women surf the net.

  • StrategyQA and Big Bench

    18/11/2022 Duration: 41min

    Did Aristotle Use a Laptop?  That's a question from the StrategyQA benchmark which highlights the stretch goals for current artificial intelligence systems.  Answering a question like that requires several cognitive steps and reasoning.  Constructing a dataset of similarly challenging questions is a major undertaking.  On today's episode, Mor Geva returns to share details about the creation of StrategyQA and the larger Big Bench dataset it has been included in.

  • Ad Blockers Effect on News Consumption

    14/11/2022 Duration: 38min

    While at first glance the use of ad blockers reduces the revenue of news publishers, this may not be completely true. On the show today, Shunyao Yan, an Assistant Professor in Marketing at Leavey School of Business, Santa Clara University, discusses the effect of ad blockers on news consumption and how ad blockers can potentially be helpful for news publishers.

  • Your Consent is Worth 75 Euros a Year

    07/11/2022 Duration: 24min

    People who do not want their data tracked and shared online can pay a small fee at a cookie paywall. But are the websites keeping to their side of the bargain? Victor Morel, a Postdoc candidate at the Chalmers University of Technology, joins us to discuss his work on auditing the activities of cookie paywalls. He discusses the findings from his analysis and proposes some solutions for making cookie paywalls more transparent.

  • Automated Email Generation for Targeted Attacks

    31/10/2022 Duration: 45min

    The advancement of generative language models has been a force for good, but also for evil. On the show, Avisha Das, a post-doctoral scholar at the University of Texas Health Center, joins us to discuss how attackers use machine learning to create phishing emails that do not arouse suspicion. She also discusses how she used an RNN for automated email generation, with the goal of defeating statistical detectors.

  • Tribal Marketing

    24/10/2022 Duration: 37min

    Peter Gloor, a Research Scientist at the MIT Center for Collective Intelligence, takes us into the world of tribe classification. He discusses at length the need for such classification on the internet and how he built a machine learning model that does it. Listen to find out more!

  • Debiasing GPT-3 Job Ads

    10/10/2022 Duration: 48min

    We hear about the impressive achievements of GPT-3 models, but such large generative models come with their own biases. On the show today, Conrad Borchers, a Ph.D. student in Human-Computer Interaction, joins us to discuss the bias in GPT-3 for job ads and how such large models can be de-biased. Listen to learn more!

  • ML Ops in Production

    06/10/2022 Duration: 41min

    Moses Guttman from Clear ML joins us to share insights about how organizations leveraging machine learning keep their programs on track. While many parallels exist between the software development life cycle (SDLC) and the machine learning development life cycle, successful deployments of ML in production have demonstrated that a unique set of tools is required. Moses and I discuss the emergence of ML Ops, success stories, and how modern teams leverage tools like Clear ML's open source solution to maximize the value of ML in the organization.

  • Ad Network Tomography

    03/10/2022 Duration: 35min

    Data sharing in the ad tech space has largely been a black box. While it is obvious that data is being collected, the data-sharing process is opaque to users. On the show today, Maaz Bin Musa and Rishab, both researchers at the University of Iowa, speak about the importance of data transparency and ATOM, their tool for achieving it. Listen to find out how ATOM uncovers data-sharing relationships in the ad tech space.

  • First Party Tracking Cookies

    26/09/2022 Duration: 35min

    When you accept cookies on a website, you cannot tell whether the cookies are used for tracking your personal data or not. Shaoor Munir’s machine learning model does that. On the show today, the Ph.D. student at the University of California discusses the world of first-party cookies and how he developed a machine learning model that predicts whether a first-party cookie is used for tracking purposes.
