For the RFP Rocket project at Nettra Media, the goal was simple: take a process that took an account manager several hours and compress it to minutes. Here's the technical breakdown of how we did it.
The pipeline at a glance
The flow looked like this: a Slack message triggers an n8n webhook → n8n pulls the RFP brief from Google Drive → the brief is chunked and sent to the OpenAI API → structured JSON responses are assembled into a Word doc → the finished packet is posted back to Slack and filed in Drive.
The stack: n8n (self-hosted), OpenAI gpt-4o, the Google Drive API, Slack webhooks, and docxtemplater for Word assembly. Total infrastructure cost: ~$15/month on a small VPS.
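The post does not say how the brief was chunked, so the following is only a minimal sketch of one plausible approach: pack whole paragraphs into chunks under a character budget. The function name chunkBrief and the 8000-character default are assumptions for illustration, not details from the project.

```javascript
// Hypothetical helper (not from the original workflow): split a brief into
// chunks of at most maxChars, breaking on blank lines so paragraphs stay whole.
function chunkBrief(text, maxChars = 8000) {
  const paragraphs = text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter(Boolean);

  const chunks = [];
  let current = "";

  for (const para of paragraphs) {
    // A single paragraph longer than the budget is hard-split at maxChars.
    if (para.length > maxChars) {
      if (current) {
        chunks.push(current);
        current = "";
      }
      for (let i = 0; i < para.length; i += maxChars) {
        chunks.push(para.slice(i, i + maxChars));
      }
      continue;
    }

    // Otherwise, add the paragraph to the current chunk if it still fits.
    const candidate = current ? current + "\n\n" + para : para;
    if (candidate.length > maxChars) {
      chunks.push(current);
      current = para;
    } else {
      current = candidate;
    }
  }

  if (current) chunks.push(current);
  return chunks;
}
```

In an n8n Code node, a function like this could run on the text pulled from Drive and emit one item per chunk. A character budget is a rough stand-in for a token budget; a tokenizer-based limit would be more precise.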
Getting consistent output from the model
The hardest part wasn't hooking up the APIs — it was making the model produce output that was reliable enough to drop straight into a client document. Two techniques made the biggest difference:
1. Structured outputs (JSON mode)
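We set OpenAI's response_format to { type: "json_object" } (JSON mode) and included a JSON schema (the expected keys and value types) in the system prompt. JSON mode guarantees that the response parses as JSON, but not that it contains the keys you asked for, so the prompt still has to spell out the shape. For us, this eliminated the "the model added a preamble paragraph" class of failures entirely.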
// Chat Completions request body. rfpBrief is a variable holding the brief text being sent.
const requestBody = {
  "model": "gpt-4o",
  "response_format": { "type": "json_object" },
  "messages": [
    {
      "role": "system",
      "content": "Return a JSON object with keys: executive_summary, scope_of_work, pricing_narrative. Each value is a string of 2-3 professional paragraphs."
    },
    { "role": "user", "content": rfpBrief }
  ]
};
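Because JSON mode only guarantees parseable JSON, it is worth checking the keys before the text goes into a client document. The sketch below shows one way to do that; parseRfpSections is a hypothetical helper name, and the input is assumed to be the message content string returned by the API.

```javascript
// The three keys requested in the system prompt above.
const REQUIRED_KEYS = ["executive_summary", "scope_of_work", "pricing_narrative"];

// Hypothetical validator: parse the model's reply and confirm that every
// required section is present as a non-empty string.
function parseRfpSections(rawContent) {
  let parsed;
  try {
    parsed = JSON.parse(rawContent);
  } catch (err) {
    throw new Error("Model response was not valid JSON: " + err.message);
  }

  if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
    throw new Error("Model response was not a JSON object");
  }

  const sections = {};
  for (const key of REQUIRED_KEYS) {
    const value = parsed[key];
    if (typeof value !== "string" || value.trim() === "") {
      throw new Error("Missing or empty section: " + key);
    }
    sections[key] = value.trim();
  }
  return sections;
}
```

Throwing on a missing section lets the workflow route the document to an error branch instead of producing a Word file with a blank heading.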
2. Few-shot examples in the system prompt
We included one complete example of a well-written RFP response in the system prompt (strictly speaking, a one-shot prompt). With that single example, the model's tone and formatting converged on the agency's house style without any further tuning.
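As a rough illustration of embedding an example in the system prompt, the sketch below appends a sample brief and its ideal response to the key instructions. The function names, the "Example brief" and "Example response" labels, and the choice to include the sample brief alongside the response are all assumptions; the post only says that one complete example response was included.

```javascript
// Hypothetical prompt builder: key instructions first, then one worked example.
function buildSystemPrompt(exampleBrief, exampleResponse) {
  return [
    "Return a JSON object with keys: executive_summary, scope_of_work, pricing_narrative.",
    "Each value is a string of 2-3 professional paragraphs.",
    "",
    "Example brief:",
    exampleBrief.trim(),
    "",
    "Example response (match this tone and structure):",
    JSON.stringify(exampleResponse, null, 2),
  ].join("\n");
}

// Assemble the messages array for the request body.
function buildMessages(exampleBrief, exampleResponse, rfpBrief) {
  return [
    { role: "system", content: buildSystemPrompt(exampleBrief, exampleResponse) },
    { role: "user", content: rfpBrief },
  ];
}
```

Serializing the example with JSON.stringify keeps it in the same shape the model is asked to return, so the example reinforces the output format as well as the tone.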
Managing API quotas in n8n
n8n's built-in retry (the per-node Retry On Fail setting) handles transient 429s, but we also added an explicit rate limiter: a Wait node with a short fixed delay in front of every OpenAI call. For document-heavy batches we split the workflow into a parent workflow that schedules the work and a child workflow that runs once per document, so a failure in one document did not take down the rest of the batch.
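For readers calling the API outside n8n, retrying on 429 can be sketched in plain JavaScript as below. This is not how n8n implements its retries: the exponential backoff schedule, the helper names, and the assumption that a rate-limit error carries a numeric status property of 429 are all illustrative choices.

```javascript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical wrapper: call callApi up to maxTries times, retrying only when
// the thrown error looks like a rate-limit response (err.status === 429).
async function withRetry(callApi, { maxTries = 3, sleep = defaultSleep } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await callApi();
    } catch (err) {
      const retryable = Boolean(err) && err.status === 429;
      if (!retryable || attempt + 1 >= maxTries) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

Passing sleep as an option makes the wrapper easy to test without real waiting. Errors other than 429 are rethrown immediately, since retrying a malformed request only burns quota.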
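OpenAI's rate limits depend on your usage tier and are expressed per minute (requests per minute and tokens per minute). If you're processing multiple documents in a loop, add a 1–2 second delay between calls even if you're within your TPM limit: a per-minute quota can be enforced over shorter windows, so a burst of back-to-back calls can still draw 429s. In our experience, burst overages are the most common failure mode in production.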
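The fixed delay and the per-document failure isolation described above can be sketched as a single paced loop. This is an illustration in plain JavaScript, not the n8n workflow itself; processBatch, processOne, and the 1500 ms default are hypothetical names and values.

```javascript
// Hypothetical batch runner: process documents one at a time, pausing delayMs
// between calls, and record failures per document instead of aborting the batch.
async function processBatch(
  documents,
  processOne,
  { delayMs = 1500, sleep = (ms) => new Promise((r) => setTimeout(r, ms)) } = {}
) {
  const results = [];
  for (let i = 0; i < documents.length; i++) {
    // Pause before every call except the first to avoid back-to-back bursts.
    if (i > 0) await sleep(delayMs);
    try {
      const value = await processOne(documents[i]);
      results.push({ doc: documents[i], ok: true, value });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      results.push({ doc: documents[i], ok: false, error: message });
    }
  }
  return results;
}
```

The caller can then post successes back to Slack and list the failed documents separately, which mirrors what the parent and child split achieves inside n8n.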
Results
The agency went from 4–6 hours per RFP to under 30 minutes. The remaining time is human review, which is exactly where it should be.
Don't try to automate the human decision — automate the tedious information-gathering and formatting that surrounds it. That's where the 80% time savings live.
The first version had no few-shot examples and no JSON schema. It still cut turnaround time by 60%. You don't need a perfect prompt to get value — ship, collect feedback, and iterate.