Your MCP Tools Don't Need Anthropic: The No-Terminal Way to Connect Gemini
If you've built a custom MCP server, you've probably structured your entire workflow around Claude. The tools live on the server. The context files live on the server. The data lives on the server. Claude is just the interface that calls them.
Which means when Anthropic has an outage — and they do — your entire workflow goes dark. Not because your server is down. Because the one client you built for is unavailable.
The fix turned out to be simpler than expected: MCP is an open protocol, not an Anthropic-only one, and Gemini-backed clients can speak it too. You can connect a second AI client to your existing server in about five minutes, no terminal commands required.
How Everyone Else Is Doing This (And Why It's More Work)
Most guides for connecting Gemini to an MCP server send you down the Gemini CLI path. The typical setup looks like this:
npm install -g @google/gemini-cli
gemini mcp add my-server http://your-server:8000/mcp
From there you're managing a ~/.gemini/settings.json file, dealing with OAuth flows, and running everything through a terminal session. The Docker MCP Toolkit approach is similar — you need Docker Desktop installed, a separate MCP gateway running, and a CLI command to wire it together.
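For reference, the file that path leaves you managing ends up looking roughly like this. The exact field names are an assumption on my part (the Gemini CLI schema has changed between releases — check its docs for the current shape); the point is that it's a second config surface to maintain:

```json
{
  "mcpServers": {
    "my-server": {
      "httpUrl": "http://your-server:8000/mcp"
    }
  }
}
```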
These setups work. But they add infrastructure that doesn't need to exist if you're already working in VS Code.
The Simpler Path: Continue + config.yaml
Continue is a VS Code extension that connects directly to any MCP server over streamable-http. No CLI. No npm install. No Docker dependency. Just a YAML file.
Step 1: Install Continue
VS Code Extensions panel → search "Continue" → install.
Step 2: Edit ~/.continue/config.yaml
Add your MCP server and Gemini models. The entire config is one file:
models:
  - name: Gemini 2.5 Flash
    provider: gemini
    model: gemini-2.5-flash
    apiKey: YOUR_GEMINI_API_KEY
  - name: Gemini 2.5 Flash Lite
    provider: gemini
    model: gemini-2.5-flash-lite
    apiKey: YOUR_GEMINI_API_KEY

mcpServers:
  - name: your-mcp-server
    url: http://your-server-ip:8000/mcp
    type: streamable-http
Step 3: Reload VS Code
Cmd+Shift+P → Developer: Reload Window
That's it. Your full MCP tool suite is now available from the Continue chat panel with Gemini as the orchestrator.
Get your API key at aistudio.google.com. Free tier works for testing. Enable billing for production use — the paid tier rate limits are orders of magnitude higher and cost almost nothing at normal usage volumes.
Why This Works: Your Intelligence Lives on the Server
Most people think of the AI model as the intelligent part of the system. In an MCP setup, that framing is backwards.
Your tools, memory, context files, and agents all live on the MCP server. The model is an orchestrator — it decides which tools to call and in what order, then synthesizes the results. The actual domain intelligence is distributed across your server infrastructure.
This means the model is more replaceable than it seems. As long as a client supports MCP tool-calling and can reason about sequencing, it can drive the same workflow. Claude, Gemini, or anything else.
The practical upside: when Anthropic has downtime, your data tools stay available. You switch clients, not workflows.
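To make the "model is an orchestrator" point concrete, here is a minimal sketch of the request any MCP client sends when it invokes one of your tools. The tool name and arguments below are hypothetical, but the envelope is the JSON-RPC 2.0 shape the MCP protocol defines — which is why Claude, Gemini, or a plain script can all drive the same server:

```python
import json

def mcp_tool_call(call_id: int, tool: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for MCP's tools/call method.

    Every MCP client POSTs this same shape to the server's /mcp
    endpoint, regardless of which model sits behind the client.
    """
    return {
        "jsonrpc": "2.0",
        "id": call_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }

# Hypothetical tool on your server — the model just fills in
# the name and arguments; the server holds the actual logic.
req = mcp_tool_call(1, "search_records", {"query": "invoices 2024"})
print(json.dumps(req))
```

Nothing in that payload is model-specific, which is the whole reason swapping the client is a config change rather than a rewrite.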
What Gemini Flash Actually Handles Well
In practice, Gemini Flash handles MCP tool orchestration cleanly. It:
- Calls tools in the correct sequence without explicit instruction
- Checks for existing records before writing new ones
- Filters and summarizes structured tool output accurately
- Handles multi-step workflows — retrieve, analyze, write — without losing the thread
Flash Lite is noticeably faster and sufficient for most retrieval and summarization tasks. Flash is the better choice when reasoning across multiple tool outputs simultaneously.
The Gemma Problem (And Why It Stays Server-Side)
While setting this up, we tried adding gemma-3-27b-it directly as a Continue model through the same Gemini API. The model resolves correctly — it's a valid endpoint — but hits a hard 15,000 token/minute cap regardless of whether your account is on free or paid tier.
This is a model-level limit Google applies to Gemma specifically. Upgrading billing doesn't change it.
For now, Gemma is better kept as a server-side tool — called programmatically through your MCP server where you control the call rate — rather than used as an orchestrating client. Flash and Flash Lite don't have this limitation.
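If you do route Gemma calls through your server, you have to stay under that cap yourself. Here is a minimal sketch of a sliding-window tokens-per-minute limiter — the 15,000 figure matches the cap described above, but the class and its interface are my own illustration, not part of any Google SDK:

```python
import time
from collections import deque

class TokenBudget:
    """Sliding-window tokens-per-minute limiter for server-side model calls.

    Keeps the total tokens sent in any `window`-second span under `limit`
    (15,000 TPM being the Gemma cap described above).
    """

    def __init__(self, limit: int = 15_000, window: float = 60.0,
                 clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock          # injectable for testing
        self.events = deque()       # (timestamp, tokens) pairs

    def try_spend(self, tokens: int) -> bool:
        """Return True and record the spend if it fits in the window."""
        now = self.clock()
        # Drop spends that have aged out of the window.
        while self.events and now - self.events[0][0] >= self.window:
            self.events.popleft()
        used = sum(t for _, t in self.events)
        if used + tokens > self.limit:
            return False  # caller should queue or back off
        self.events.append((now, tokens))
        return True
```

Your MCP tool handler would call `try_spend(estimated_tokens)` before each Gemma request and queue the work when it returns False — the control you give up entirely if Gemma sits in the client seat.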
One Small Detail: Use a Separate API Key
If your MCP server already calls the Gemini API for server-side tasks, your Continue client will share the same quota. Create a second API key at aistudio.google.com and assign it to your Continue config — different project, separate quota. The keys are free.
# Continue config — dedicated key
models:
  - name: Gemini 2.5 Flash
    provider: gemini
    model: gemini-2.5-flash
    apiKey: YOUR_CONTINUE_ONLY_KEY
What Continue Can't Do
Continue with Gemini has full access to your MCP tools. What it doesn't have is everything outside the MCP protocol — SSH access, file editing, bash execution. Those remain Claude Code territory.
In practice the split is clean:
- Tool calls, data queries, writes to external services → Continue + Gemini
- Server config, file edits, deployments → Claude Code
Continue becomes your always-available lightweight client for data work. Claude Code handles infrastructure changes.
The Bottom Line
The Gemini CLI approach works, but it trades simplicity for flexibility most developers don't need. If you're already in VS Code and already have an MCP server running, a dozen or so lines of YAML get you a fully functional second client without touching the terminal.
Your MCP server already abstracts your tools away from any single provider. This just completes that architecture on the client side.