Manifold

Robots, Small Models, and RL with DeepSeek Alumnus Zihan Wang — #86

Information:

Synopsis

Zihan Wang is an AI researcher at Northwestern University, where he works on vision-language models, robotics, and reinforcement learning. Previously, he interned at DeepSeek, contributing to projects like DeepSeek-V2.

Zihan's homepage: https://zihanwang314.github.io/

(00:00) - Introduction
(01:13) - Zihan's Background, CS and AI Research in China
(11:09) - DeepSeek; Human capital flow from PRC to US
(16:07) - DeepSeek, Open Source and AI Research
(31:52) - Model Size and Performance Constraints
(33:01) - Data Bottleneck in Pre-trained Models
(34:12) - Transformer Architecture and Scaling Laws
(36:30) - Efficiency in Model Training
(47:44) - Chain of Experts Architecture
(01:01:06) - Future of AI and Robotics

Music used with permission from Blade Runner Blues Livestream improvisation by State Azure.

–

Steve Hsu is Professor of Theoretical Physics and of Computational Mathematics, Science, and Engineering at Michigan State University. Previously, he was Senior Vice President for Research and Innovation at MSU an