Red flags when building AI
Common red flags that lead to AI builds dragging on indefinitely or features falling flat on launch, and how to distinguish them from the expected discomfort of building probabilistic systems.
Practical thinking on AI strategy, ML product development, and building AI features that actually work.
Why CTOs should carve out time to write code despite conventional wisdom, and how to do it without becoming a bottleneck.

Why creating a golden dataset of curated user questions paired with verified correct answers is a critical first step before building AI-powered applications.

Why the placement of an ML feature in the user journey matters more than model choice, using the Jobs-to-Be-Done framework to define what your model should optimize for.

Why chat interfaces aren't the future of AI UX, and how AI patterns like smart defaults, contextual suggestions, and proactive helpers create better experiences.

One of the biggest mindset shifts when building AI features: moving from design-first to data-first, where your data determines what's feasible, stable, and valuable.

How I help startup founders and tech leads bring clarity and momentum to their AI efforts, from figuring out what's worth building to shipping features users trust.

Why hand-labeling data and manually inspecting errors are some of the highest-ROI activities when building AI features, despite teams resisting the manual work.

Why standard uptime monitoring isn't enough for AI-powered features, and the minimum metrics teams should track to catch model drift and degradation.

How much data do you need to build a predictive model? The honest answer is that you don't know upfront, but starting with 100-200 labeled examples is a solid approach.

Before building an AI feature, make sure someone will actually use it. Key takeaways from The Mom Test for better customer discovery.

A practical approach to using LLMs as classifiers: combine multi-criteria LLM judgments with a lightweight model like XGBoost for better accuracy and calibration.

Why there's rarely a single 'best' technical decision, and how decision records, speed, and iterative review help engineering organizations get better over time.

Five common failure modes that derail ML projects, from mismatched expectations and lack of progress to inadequate data, overly complex solutions, and lack of expertise.

Why 1-week sprints are uniquely powerful for ML and research teams, despite seeming counterintuitive for open-ended experimental work.
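The LLMs-as-classifiers idea above can be sketched in a few lines: the LLM judge scores each example on several criteria, and those scores become features for a small downstream model. This is a minimal, self-contained illustration with synthetic data and a plain logistic regression standing in for the lightweight model (in practice you would feed the same features to something like `xgboost.XGBClassifier`); the criteria names and data are invented for the example.

```python
import math
import random

# Assume an LLM judge has already rated each example on three criteria
# (e.g. relevance, groundedness, completeness), each in [0, 1].
# We fabricate separable synthetic scores here purely for illustration.
random.seed(0)

def synth(label, n=40):
    base = 0.75 if label else 0.25
    return [
        (
            [min(1.0, max(0.0, base + random.uniform(-0.2, 0.2))) for _ in range(3)],
            label,  # human-verified label: 1 = acceptable output
        )
        for _ in range(n)
    ]

train = synth(1) + synth(0)

# Lightweight downstream model: logistic regression trained with plain
# SGD on log loss. A stand-in for XGBoost, which tends to give better
# accuracy and calibration on real judge-score features.
w, b, lr = [0.0, 0.0, 0.0], 0.0, 0.5
for _ in range(200):
    for x, y in train:
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        p = 1 / (1 + math.exp(-z))
        g = p - y  # gradient of log loss w.r.t. z
        w = [wi - lr * g * xi for wi, xi in zip(w, x)]
        b -= lr * g

def predict(scores):
    """Probability the output is acceptable, given judge criterion scores."""
    z = sum(wi * xi for wi, xi in zip(w, scores)) + b
    return 1 / (1 + math.exp(-z))

print(predict([0.9, 0.8, 0.85]))  # high judge scores -> high probability
print(predict([0.1, 0.2, 0.15]))  # low judge scores -> low probability
```

The point of the two-stage setup is that the LLM handles the fuzzy per-criterion judgments, while the small model learns how much each criterion actually matters against human labels, producing calibrated probabilities you can threshold.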