Google Cloud Run MCP server
Create a Model Context Protocol (MCP) server for Google Cloud Run to deploy, manage, and scale containerized applications without managing infrastructure. This integration lets AI agents automate service deployment, manage revisions, split traffic for gradual rollouts, and monitor serverless applications, all with secure service account authentication.
Setting up an MCP server
This article covers the standard steps for creating an MCP server in AI Gateway and connecting it to an AI client. The steps are the same for every integration — application-specific details (API credentials, OAuth endpoints, and scopes) are covered in the individual application pages.
Before you begin
You'll need:
- Access to AI Gateway with permission to create MCP servers
- API credentials for the application you're connecting (see the relevant application page for what to collect)
Create an MCP server
Find the API in the catalog
- Sign in to AI Gateway and select MCP Servers from the left navigation.
- Select New MCP Server.
- Search for the application you want to connect, then select it from the catalog.
Configure the server
- Enter a Name for your server — something descriptive that identifies both the application and its purpose (for example, "Zendesk Support — Prod").
- Enter a Description so your team knows what the server is for.
- Set the Timeout value. 30 seconds works for most APIs; increase to 60 seconds for APIs that return large payloads.
- Toggle Production mode on if this server will be used in a live workflow.
- Select Next.
Configure authentication
Enter the authentication details for the application. This varies by service — see the Authentication section of the relevant application page for the specific credentials, OAuth URLs, and scopes to use.
Configure security
- Set any Rate limits appropriate for your use case and the API's own limits.
- Enable Logging if you want AI Gateway to record requests and responses for auditing.
- Select Next.
Deploy
Review the summary, then select Deploy. AI Gateway provisions the server and provides a server URL you'll use when configuring your AI client.
Connect to an AI client
Once your server is deployed, add it to the AI client your team uses. See your client's documentation for setup instructions.
Tips
- You can create multiple MCP servers for the same application — for example, a read-only server for reporting agents and a read-write server for automation workflows.
- If you're unsure which OAuth scopes to request, start with the minimum read-only set and add write scopes only when needed. Most application pages include scope recommendations.
- You can edit a server's name, description, timeout, and security settings after deployment without redeploying.
Authentication
Google Cloud Run uses OAuth 2.0 with service accounts for API access. Create a service account in your Google Cloud project and download a JSON key file. The service account needs the Cloud Run Admin role (roles/run.admin) and the Service Account User role (roles/iam.serviceAccountUser), or equivalent granular permissions such as run.services.create, run.services.update, and run.services.delete. The Google OAuth token endpoint is https://oauth2.googleapis.com/token, and the integration requires the https://www.googleapis.com/auth/cloud-platform scope for full Cloud Run access.
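In the service-account flow, a signed JWT is exchanged for an access token at the Google token endpoint. The sketch below builds the claim set and signing input only; the service account email is a hypothetical example, and in practice signing with the key file's private_key and POSTing to the endpoint are handled by a library such as google-auth:

```python
import base64
import json
import time

TOKEN_URL = "https://oauth2.googleapis.com/token"
SCOPE = "https://www.googleapis.com/auth/cloud-platform"

def build_jwt_claims(sa_email: str, lifetime: int = 3600) -> dict:
    """Claim set for the OAuth 2.0 JWT bearer grant (RFC 7523)."""
    now = int(time.time())
    return {
        "iss": sa_email,        # service account email from the JSON key
        "scope": SCOPE,
        "aud": TOKEN_URL,       # the token endpoint is the audience
        "iat": now,
        "exp": now + lifetime,  # access tokens live at most 3600 seconds
    }

def b64url(data: dict) -> str:
    """Unpadded base64url encoding of a JSON object (JWT convention)."""
    raw = json.dumps(data, separators=(",", ":")).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

# Header and claims form the signing input; the JSON key's private_key
# signs it with RS256 before the token exchange.
header = {"alg": "RS256", "typ": "JWT"}
claims = build_jwt_claims("deployer@my-project.iam.gserviceaccount.com")
signing_input = f"{b64url(header)}.{b64url(claims)}"
```

The signed result is posted to the token endpoint with grant_type urn:ietf:params:oauth:grant-type:jwt-bearer; the response contains the bearer token the gateway uses for Cloud Run API calls.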
Available tools
These tools enable AI agents to manage the full Cloud Run lifecycle — deployment, configuration, traffic management, and monitoring. Together they support CI/CD automation, canary deployments, and serverless infrastructure as code.
| Tool | Description |
|---|---|
| List services | Find all Cloud Run services in a region or project |
| Create service | Deploy a container image as a new Cloud Run service |
| Get service | Retrieve details about a service (memory, CPU, environment variables) |
| Update service | Change service configuration (memory, concurrency, image) |
| Delete service | Remove a service and stop all revisions |
| List revisions | View all versions of a service |
| Get revision | Fetch details about a specific revision |
| Delete revision | Remove an old revision to save costs |
| Update traffic | Split traffic between revisions (for canary or blue-green deployments) |
| Create domain mapping | Link a custom domain to a service |
| Delete domain mapping | Remove a custom domain |
| List jobs | View Cloud Run jobs (batch tasks) |
| Create job | Create a new job for scheduled or event-driven work |
| Run job | Execute a job immediately |
| List executions | View past job executions and their status |
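The Update traffic tool maps onto the traffic field of a Cloud Run service. A minimal sketch of the request body for a canary rollout, assuming the Cloud Run Admin API v2 TrafficTarget shape and hypothetical revision names:

```python
def canary_traffic(stable_revision: str, canary_revision: str,
                   canary_percent: int) -> dict:
    """Traffic block for a services.patch call: send canary_percent of
    traffic to the new revision, the rest to the stable one."""
    assert 0 <= canary_percent <= 100
    return {
        "traffic": [
            {
                "type": "TRAFFIC_TARGET_ALLOCATION_TYPE_REVISION",
                "revision": stable_revision,
                "percent": 100 - canary_percent,
            },
            {
                "type": "TRAFFIC_TARGET_ALLOCATION_TYPE_REVISION",
                "revision": canary_revision,
                "percent": canary_percent,
            },
        ]
    }

# Example: route 10% of traffic to the new revision.
body = canary_traffic("my-svc-00001", "my-svc-00002", 10)
```

Pinning both targets by revision (rather than letting one track latest) keeps the split stable while you watch the canary; widening the rollout is just another patch with a larger percentage.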
Tips
- Build workflows that deploy automatically from artifact repositories when new container images are pushed — no manual gcloud commands needed.
- For canary deployments, use traffic splitting to send 10% of traffic to the new version first, testing behavior in production with real users, then gradually increase (20%, 50%, 100%) as you gain confidence that it's performing well.
- For blue-green deployments, create a second revision with the new code and test it thoroughly before routing any traffic to it, then switch 100% of traffic instantly once you confirm it works — if issues arise, roll back just as fast.
- Automate setting environment variables, memory limits, and concurrency limits per environment (dev, staging, prod) without manual edits.
- Use Cloud Run Jobs to automate batch tasks (cleanup, reporting, data sync) on a schedule without managing Kubernetes or cron infrastructure.