Luca Berton
AI Engineering

Sorry, the response hit the length limit — How I Stopped Fighting Claude Opus (and Started Shipping)

#claude#copilot#prompting#llm#token-limits#workflows#writing

“Sorry, the response hit the length limit. Please rephrase your prompt.”
Model: Claude Opus 4.5 (3x) · Copilot

If you’ve seen this message enough times, you start reading it like a weather forecast:

“Too much. Try again. Good luck.”

At first, I treated it like a bug.
Then I realized it’s closer to a design constraint:

The model is fine. My prompting shape wasn’t.


What actually happened (the unglamorous truth)

I asked for something “simple” like:

“Write the complete post: intro, every section, code examples, and a conclusion, all in one reply.”

Which is basically saying:

“Please generate a small book, in one go.”

Claude Opus tries.
Copilot tries.
And then the response gets guillotined mid-sentence.


Why length limits hit harder in Copilot

In most editors, Copilot isn’t just your prompt.

It’s also:

- the system instructions your editor injects,
- the open files and selections it pulls in as context,
- your earlier turns in the chat history.
So even before the model starts answering, you might already be spending a big chunk of the context window.
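A rough back-of-envelope sketch of that accounting (the ~4-characters-per-token heuristic and all the sizes below are illustrative assumptions, not Copilot's real internals):

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English text.
    return max(1, len(text) // 4)

def remaining_budget(window: int, *chunks: str) -> int:
    # Everything the editor sends counts against the window,
    # not just the prompt you typed.
    used = sum(estimate_tokens(c) for c in chunks)
    return window - used

# Hypothetical sizes, just to make the point visible.
system_prompt = "x" * 8000    # editor-injected instructions
open_files = "y" * 40000      # files attached as context
chat_history = "z" * 12000    # earlier turns
my_prompt = "Write the whole post."

left = remaining_budget(200_000, system_prompt, open_files, chat_history, my_prompt)
print(f"Tokens left for the answer: ~{left}")  # → Tokens left for the answer: ~184995
```

The exact numbers don't matter; the shape does: the answer only gets whatever the prompt, files, and history haven't already spent.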

Then you request a long response…

…and the model goes:

🧠 ✅ “I can do it.”
📦 ❌ “I can’t fit it.”


The fix: stop prompting for output, prompt for process

This one change eliminates 90% of my length-limit pain:

Instead of:

“Write the whole post.”

Do:

“Plan it, then write section-by-section.”

You’re not lowering quality — you’re forcing a workflow that fits the model’s constraints.
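The process-over-output idea can be sketched as a tiny driver loop. `ask` stands in for whatever model call you use (here it's a stub, since the real API is beside the point), and the 250-word cap is an arbitrary example budget:

```python
def write_in_sections(outline, ask):
    """Request one section per call, so no single response
    has to carry the whole piece."""
    draft = []
    for i, section in enumerate(outline, 1):
        prompt = (
            f"Write only section {i}: {section}. "
            "Stay under 250 words. Do not write the other sections."
        )
        draft.append(ask(prompt))
    return "\n\n".join(draft)

# Stub model call for illustration; swap in your real client here.
def ask(prompt: str) -> str:
    return f"[draft for: {prompt[:40]}...]"

outline = ["Why limits happen", "The fix", "The playbook"]
post = write_in_sections(outline, ask)
print(post.count("[draft"))  # → 3  (one chunk per section)
```

Each call stays comfortably inside the limit, and you get natural checkpoints to redirect the model between sections.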


My “never hit the limit again” playbook

1) Ask for an outline with budgets

Give the model a structure and a maximum size per section.

Create an outline with 6 sections.
For each section, include:
- 1 sentence goal
- 3 bullets max
Keep the whole outline under 200 words.
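You can even check the budget mechanically before moving on to full sections. A minimal sketch, using word count as a stand-in for tokens (the 200-word cap comes from the prompt above):

```python
def outline_within_budget(outline_text: str, max_words: int = 200) -> bool:
    # Count words the simple way; good enough to decide whether
    # to ask the model to tighten the outline before continuing.
    return len(outline_text.split()) <= max_words

outline = """1. Why limits happen: the model runs out of room, not ideas.
2. The fix: plan first, then write section by section."""
print(outline_within_budget(outline))  # → True
```

If the check fails, the follow-up prompt is cheap: "Tighten the outline to under 200 words," instead of regenerating everything.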