
ProteinLens: Vibe Coding on Azure… and the Christmas Lunch That Triggered ‘Quota exceeded’

Luca Berton
#azure #ai #ai-foundry #serverless #azure-functions #postgresql #static-web-apps #cicd #vibecoding #copilot

On Dec 25th I messaged a few friends: “I’m working on this project… if you want to put the site under pressure with some tests. It’s all on Azure. Merry Christmas!”

The project is ProteinLens: you take a photo of your meal and the app tries to figure out what it is and how much protein it contains. Simple idea, “weekend MVP” vibes — but with a real goal: build something production-shaped, with proper cloud infrastructure and automated deploys.

Then reality showed up, fast: "Quota exceeded."

That Christmas stress test turned into a mini post-mortem. And honestly, it was perfect.


What ProteinLens does (and what’s under the hood)

The MVP does one thing clearly: snap a photo → get a structured result (recognition + estimated protein). The AI part runs via AI Foundry.
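To make "structured result" concrete, here's roughly the shape I have in mind. The field names are hypothetical, for illustration only, not ProteinLens's actual schema:

```python
from dataclasses import dataclass

@dataclass
class MealEstimate:
    """Hypothetical shape of the structured result sent back to the frontend."""
    dish: str               # e.g. "grilled salmon with rice"
    confidence: float       # recognition confidence, 0..1
    protein_grams: float    # estimated protein for the pictured portion
    notes: str = ""         # caveats, e.g. "portion size guessed from plate diameter"
```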

Azure infrastructure overview

Current stack: Azure Static Web Apps for the frontend, a serverless backend on Azure Functions, PostgreSQL for data, AI Foundry for the model calls, and a GitHub pipeline for CI/CD.

In the chat I joked “it’s super scalable.” The services are, on paper. But “scalable services” don’t automatically make a scalable system — especially when an entire group starts hammering it with food photos at the same time.

GitHub Pipeline

The uncomfortable part: “AI writes your code” isn’t real yet (not end-to-end)

I built a lot of this using Spec Driven Vibe Coding with tools like SpecKit/Copilot. They absolutely speed things up — but the “AI will build the whole app for you” story is still oversold.

GitHub Spec Driven Vibe Coding

Here's what happened when I iterated fast with LLM-generated infra + pipelines: I ran straight into the most dangerous pattern. Many tools "help" by adding something on every interaction — another resource, another config file, another pipeline step. But real progress often comes from the opposite: removing, simplifying, testing, refining.


“Quota exceeded” isn’t just a bug — it’s product and architecture feedback

GitHub Copilot Business

When multiple people hit an MVP simultaneously, the real bottlenecks show up immediately, starting with that "Quota exceeded" error.

At that moment I realized it wasn’t just “I need more horsepower.” It was: I need engineering discipline — observability, retry/backoff, sane limits, guardrails.
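To show what I mean by retry/backoff, here's a minimal sketch. `call_model` stands in for whatever actually hits AI Foundry, and `RateLimitError` is a placeholder for the exception your SDK raises on "Quota exceeded" / HTTP 429; neither is real API from this project.

```python
import random
import time


class RateLimitError(Exception):
    """Placeholder for whatever your SDK raises on HTTP 429 / quota errors."""


def call_with_backoff(call_model, max_retries=5, base_delay=1.0, max_delay=30.0):
    """Retry a model call with exponential backoff + jitter on rate-limit errors."""
    for attempt in range(max_retries):
        try:
            return call_model()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries: let the caller show a clear message
            # Exponential backoff with jitter so concurrent users don't retry in lockstep.
            delay = min(max_delay, base_delay * (2 ** attempt))
            time.sleep(delay * random.uniform(0.5, 1.5))
```

Wrap the AI call in something like `call_with_backoff(lambda: client.analyze(photo))` (where `client.analyze(photo)` is whatever your real call looks like) and the Christmas crowd gets slightly slower answers instead of errors.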

Azure log troubleshooting


Authentication: custom in Postgres or managed identity (Entra / SWA Auth)?

After the first stress test, it became obvious I need auth to unlock the next set of features.

Because the frontend is on Static Web Apps, the platform already offers built-in auth options — which is appealing for moving fast without reinventing identity from scratch. The alternative is rolling your own user table in Postgres, but then you own passwords, email flows, security, compliance… and the debt grows quickly.

For an MVP that wants to become a real product: managed identity tends to save months.
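For reference, with Static Web Apps auth enabled the platform forwards the logged-in user to the linked API as Base64-encoded JSON in the `x-ms-client-principal` header, so the backend never touches passwords at all. A minimal sketch of reading it from a Python Azure Function (error handling omitted):

```python
import base64
import json
from typing import Optional

import azure.functions as func


def get_client_principal(req: func.HttpRequest) -> Optional[dict]:
    """Return the user injected by Static Web Apps auth, or None if anonymous.

    SWA sends the authenticated principal as Base64-encoded JSON in the
    'x-ms-client-principal' header of requests it proxies to the API.
    """
    header = req.headers.get("x-ms-client-principal")
    if not header:
        return None
    decoded = base64.b64decode(header).decode("utf-8")
    return json.loads(decoded)  # keys include userId, userDetails, userRoles
```

On the frontend, the same info is available from the built-in `/.auth/me` endpoint, so you can gate UI features without writing any identity code yourself.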


Another reality check: model availability and quota are part of the design

One more thing that surprised me: you can’t always “just use the best model” when you want.

OpenAI gpt-5.1-codex-max

Concrete examples I ran into:

Anthropic claude-opus-4.5

This affects real decisions: fallback models, region strategy, budgeting, and even UX (clear messages when you’re rate-limited).
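As a sketch of what "fallback models" can look like in code: the model names below are made up, and the two exception classes are placeholders for whatever quota/availability errors your SDK actually raises.

```python
class RateLimitError(Exception):
    """Placeholder for 'Quota exceeded' / HTTP 429 errors."""


class ModelUnavailableError(Exception):
    """Placeholder for 'model not deployed / not available in this region' errors."""


def analyze_with_fallback(photo, clients, preference=("primary-model", "cheaper-backup")):
    """Try model deployments in preference order, falling back on quota/availability errors.

    `clients[name]` is a callable that sends the photo to that deployment.
    Returns (model_used, result) so the UX can say which model answered.
    """
    failures = {}
    for name in preference:
        try:
            return name, clients[name](photo)
        except (RateLimitError, ModelUnavailableError) as exc:
            failures[name] = exc  # remember why this one failed, try the next
    # Everything failed: surface a clear, user-facing message instead of a stack trace.
    raise RuntimeError(f"All models exhausted: {failures}")
```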


What I learned (the honest version)

  1. AI accelerates — it doesn’t replace: if you don’t understand what it generated, you’ll ship a Frankenstein that “sort of works.”
  2. The difference between a demo and a product is made of boring things: retries, logging, caching, tests, limits.
  3. Good iteration includes deleting: fewer resources, fewer variables, less magic.
  4. Quotas aren’t an afterthought — they’re part of system design.
  5. Cloud ≠ local, and fighting that truth wastes time. Build with observability and resilience early.

What’s next for ProteinLens

Very practical next steps: auth, observability, retry/backoff, and sane limits. The boring things, in other words.

If you want to try it (and yes, feel free to “stress test” it with a heavy meal photo 😄): https://www.proteinlens.com/
