

Why AI Isn't Just More Software: A Guide to ML, MLOps, and Reinforcement Learning
machine learning
mlops
reinforcement learning
ai strategy
software engineering
Why can't you apply Agile sprints to an AI project? This episode dives into why ML development is 'fuzzy' and non-linear, unlike traditional software. We explore the 'nothing, nothing, something' problem that frustrates engineers and managers alike. Discover the real-world challenges of MLOps, from testing non-deterministic models to deployment. The conversation also breaks down Reinforcement Learning (RL), explaining how it learns from exploration, the high-stakes risks, and its role in training LLMs.
Hosted by: Deejay
Featuring: Phil Winder
Guest Role & Company: CEO @ Winder.AI

Episode Highlights
AI projects are 'fuzzy' and non-linear, whereas traditional software engineering is prescriptive and plannable.
ML models need to be 'massaged and babied' and often show 'nothing, nothing, something' progress, which frustrates agile teams.
Testing AI models is probabilistic rather than pass/fail, and requires 'fuzzing' to find edge cases where the model lacks data.
Reinforcement Learning (RL) is defined by its 'agency to explore' an environment, a key difference from other ML types.
The biggest challenge in RL is the need for a safe simulation, as live exploration in industrial settings 'could be catastrophic.'
'Offline Reinforcement Learning' is a powerful alternative that can train effective agents purely from pre-existing logged data.
Modern LLMs are already trained using RL: reinforcement learning from human feedback (RLHF) fine-tunes models for better conversational behavior.
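
The highlight about probabilistic testing can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and does not come from the episode: `toy_model` is a hypothetical stand-in for a trained classifier, and the noise level and the 0.9 accuracy threshold are arbitrary. The idea is that the test feeds the model many randomly perturbed ("fuzzed") inputs and checks an aggregate score against a threshold, instead of asserting one exact output.

```python
import random


def toy_model(x: float) -> int:
    """Hypothetical stand-in for a trained classifier: predicts 1 when x > 0.5."""
    return 1 if x > 0.5 else 0


def fuzz_accuracy(model, n_samples: int = 1000, noise: float = 0.05, seed: int = 0) -> float:
    """Estimate accuracy on randomly generated, noise-perturbed inputs."""
    rng = random.Random(seed)  # fixed seed so the test run is repeatable
    correct = 0
    for _ in range(n_samples):
        x = rng.random()                          # clean input in [0, 1)
        label = 1 if x > 0.5 else 0               # ground truth for this toy task
        noisy_x = x + rng.uniform(-noise, noise)  # fuzzed input the model sees
        correct += int(model(noisy_x) == label)
    return correct / n_samples


def passes_threshold(accuracy: float, threshold: float = 0.9) -> bool:
    """Pass when the aggregate score clears the (assumed) threshold."""
    return accuracy >= threshold


accuracy = fuzz_accuracy(toy_model)
print(f"fuzzed accuracy: {accuracy:.3f}, passes: {passes_threshold(accuracy)}")
```

In this toy setup the misclassified samples cluster near the decision boundary at 0.5, which is the kind of edge case fuzzing is meant to surface. A real test suite would swap in the actual model, a held-out dataset, and a threshold agreed with stakeholders.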
https://re-cinq.com/podcast/why-ai-isnt-just-more-software-a-guide-to-ml-mlops-and-reinforcement-learning




