The event looked great, and the t-shirt looked amazing!
If you want to look that good at your next one, pick up a little something here.
PODCAST
Hard-Learned Lessons from Over a Decade in AI
Agents and GenAI are great, but I remember when, back in the day, ML was about helping decisions happen quietly in the background. Funny how that’s still where most of the value is.
I spoke with Mike about why predictive ML still delivers most of the value in production today, especially in automated decisioning systems. We explored how, at Uber, building robust data pipelines - not model serving - was the biggest challenge. That led the Michelangelo team to build an internal feature store, which later inspired the creation of Tecton.
Fraud detection is one area where ML maturity is high. Companies increasingly blend in-house models with external signals to stay ahead:
• External APIs are used as model inputs
• Pipelines are tuned for rapid iteration
• Models are carefully balanced to catch fraud without blocking good users
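The blend described above can be sketched in a few lines. This is a hypothetical illustration, not anything from the episode: the weighting, thresholds, and function names are all made up, but they show the core trade-off of combining an in-house model score with an external risk signal, then tuning thresholds to catch fraud without blocking good users.

```python
def fraud_score(in_house_prob: float, external_risk: float,
                w_external: float = 0.3) -> float:
    """Blend an in-house model probability with an external API signal.

    Both inputs are assumed to be in [0, 1]; w_external controls how
    much weight the third-party signal gets (illustrative values).
    """
    return (1 - w_external) * in_house_prob + w_external * external_risk


def decide(score: float, block_at: float = 0.9, review_at: float = 0.6) -> str:
    """Two-threshold policy: block clear fraud, route the gray area to
    manual review, and let everyone else through. Moving the thresholds
    trades fraud caught against friction for good users."""
    if score >= block_at:
        return "block"
    if score >= review_at:
        return "review"
    return "allow"
```

In practice the interesting work is in tuning `w_external` and the thresholds against labeled outcomes, which is exactly where the rapid-iteration pipelines earn their keep.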
No ML decision needed - just click below and listen.
PODCAST
Packaging MLOps Tech Neatly for Engineers and Non-engineers
When I was 8 and no one believed I'd done a backflip over a moving car, I learned a hard lesson about the importance of reproducibility. And an even harder one about road surfaces.
I spoke with Jukka about building an open-source ML platform combining Kubeflow, MLflow, KServe, Prometheus, and Grafana. It’s designed to help researchers and engineers run reproducible experiments and deploy models, even on HPC infrastructure.
We explored how Git-based CI/CD adds much-needed traceability, especially as AI regulations tighten. This setup helps move research beyond isolated experiments:
• Pipelines and deployments can be versioned and automated
• Large-model training runs on HPC clusters
• Teams can more easily bridge the gap from research to production
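The traceability idea above boils down to one habit: every run records the exact code and config that produced it. Here is a minimal stdlib-only sketch of that habit; the helper name and record fields are invented for illustration, not part of Kubeflow or MLflow.

```python
import hashlib
import json
import subprocess
from datetime import datetime, timezone


def run_metadata(params: dict) -> dict:
    """Capture the info needed to trace a training run back to its source.

    Hypothetical helper: records the git commit and parameters, plus a
    short content hash usable as a run id. Real platforms (MLflow,
    Kubeflow) do this for you; the point is what gets recorded.
    """
    try:
        commit = subprocess.check_output(
            ["git", "rev-parse", "HEAD"], text=True
        ).strip()
    except (subprocess.CalledProcessError, FileNotFoundError, OSError):
        commit = "unknown"  # e.g. running outside a git checkout
    record = {
        "commit": commit,
        "params": params,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    # Deterministic hash over the record gives a compact, citable run id.
    record["run_id"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()
    ).hexdigest()[:12]
    return record
```

Attach a record like this to every experiment and the "which code produced this model?" question - the one regulators increasingly ask - has a one-line answer.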
An easy episode to reproduce - click below to listen again and again.
MEME OF THE WEEK

ML CONFESSIONS
I once accidentally improved our feature store performance… by fat-fingering a config.
We were trying to speed up feature retrieval for online inference. Loads of debates about caching strategies, materialization schedules, all the usual. I was testing some settings and meant to set prefetch_factor=2. Typed prefetch_factor=20 instead.
Didn’t notice. Ran the benchmarks. Blazing fast.
I thought something was broken, but checked the logs - nope, we were just hammering the cache aggressively. Weirdly, it worked great for our access pattern (lots of hot features, mostly static).
Showed the results in the team meeting. People were amazed. “What changed?” I mumbled something about tuning prefetch. They asked for the config. I… quickly fixed the typo in the PR and set it to 10 (safer, still fast). Shipped it.
Now everyone calls it "the turbo prefetch" config and it’s our default. Performance improved ~3x. We closed three Jira tickets because of it.
I tell this story proudly. Sometimes mistakes work for you.
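Why did the fat-fingered config work? With lots of hot, mostly static features, pulling in extra keys on every miss keeps the cache warm at almost no cost. A toy sketch of that dynamic, with invented names (this is not any real feature store's API, and it assumes integer feature ids for the "neighbouring keys" heuristic):

```python
from collections import OrderedDict


class PrefetchingFeatureCache:
    """Toy LRU cache that, on a miss, also pulls in the next
    `prefetch_factor` keys from the backing store. Illustrative only."""

    def __init__(self, store, capacity=1000, prefetch_factor=2):
        self.store = store            # backing key -> value mapping
        self.capacity = capacity
        self.prefetch_factor = prefetch_factor
        self.cache = OrderedDict()
        self.hits = self.misses = 0

    def _put(self, key, value):
        self.cache[key] = value
        self.cache.move_to_end(key)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used

    def get(self, key):
        if key in self.cache:
            self.hits += 1
            self.cache.move_to_end(key)
            return self.cache[key]
        self.misses += 1
        value = self.store[key]
        self._put(key, value)
        # The accidental turbo: speculatively warm the next few keys.
        # Assumes integer ids so "next" is well-defined.
        for nxt in range(key + 1, key + 1 + self.prefetch_factor):
            if nxt in self.store and nxt not in self.cache:
                self._put(nxt, self.store[nxt])
        return value
```

Crank `prefetch_factor` from 2 to 20 against a hot, static access pattern and the hit rate jumps - which is roughly the benchmark the confession above stumbled into.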
Submit your accidental wins here.
A curated GitHub list compiling foundational courses, framework tutorials, and evaluation benchmarks for building and evaluating autonomous LLM-powered agents.
Summarising Mary Meeker’s 340-page 2025 AI Trends deck, highlighting macro AI adoption patterns, infrastructure investment, model cost trends, and enterprise deployment challenges.
Presenting Sentinel, a ModernBERT-large-based model for detecting prompt injection attacks in LLM applications, with open benchmarks and an accompanying model repository for practical evaluation and integration.
A developer-focused session introducing Apple’s Foundation Models framework, enabling on-device LLM capabilities in Swift - covering structured output generation, real-time streaming, tool execution, and multi-turn session management.