Google BigQuery MCP Server
Create a powerful Model Context Protocol (MCP) server for Google BigQuery in minutes with our AI Gateway. This guide walks you through setting up seamless data warehouse integration with enterprise analytics and instant OAuth authentication.
About Google BigQuery API
Google BigQuery is a fully-managed, serverless data warehouse that enables scalable analysis over petabytes of data. The API provides comprehensive access to datasets, tables, queries, and machine learning capabilities with standard SQL support.
Key Capabilities
- Serverless Architecture: No infrastructure management
- Petabyte Scale: Analyze massive datasets
- Standard SQL: ANSI-compliant SQL dialect
- Real-time Analytics: Streaming data ingestion
- Machine Learning: Built-in ML with BigQuery ML
- Geospatial Analysis: Geography functions
- Data Sharing: Cross-org dataset sharing
- Cost Control: On-demand and flat-rate pricing
API Features
- RESTful API: JSON over HTTPS
- OAuth 2.0: Secure authentication
- Client Libraries: Multiple languages
- Streaming Inserts: Real-time data loading
- Batch Loading: Bulk data import
- Export Options: Multiple formats
- IAM Integration: Fine-grained access
- Audit Logging: Comprehensive tracking
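To ground these features, here is a minimal sketch of a direct call to the underlying REST API's jobs.query endpoint (PROJECT_ID and accessToken are placeholders; in practice the MCP server issues these calls for you):
// Minimal direct REST call: run a trivial query through jobs.query
// PROJECT_ID and accessToken are placeholders for illustration only
const response = await fetch(
  `https://bigquery.googleapis.com/bigquery/v2/projects/${PROJECT_ID}/queries`,
  {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${accessToken}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ query: 'SELECT 1 AS ok', useLegacySql: false })
  }
);
const result = await response.json();
console.log(result.jobComplete, result.rows);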
What You Can Do with BigQuery MCP Server
The MCP server transforms the BigQuery API into a natural language interface, enabling AI agents to:
Query Operations
- SQL Queries
  - "Show total sales by region last quarter"
  - "Find top 10 customers by revenue"
  - "Calculate year-over-year growth"
  - "Analyze user behavior patterns"
- Data Exploration
  - "Preview data in customers table"
  - "Show table schema"
  - "Get row count for all tables"
  - "Sample 100 random records"
- Query Management
  - "List running queries"
  - "Cancel long-running query"
  - "Get query execution plan"
  - "Show query history"
Dataset & Table Management
- Dataset Operations
  - "Create analytics dataset"
  - "List all datasets"
  - "Share dataset with team"
  - "Set dataset expiration"
- Table Operations
  - "Create partitioned table"
  - "Load CSV into table"
  - "Copy table to backup"
  - "Delete old tables"
- Schema Management
  - "Add column to table" (see the example after this list)
  - "Update field descriptions"
  - "Create nested schema"
  - "Validate schema changes"
Data Loading & Export
- Batch Loading
  - "Import data from GCS"
  - "Load Parquet files"
  - "Import from Cloud SQL"
  - "Load JSON newline data"
- Streaming Inserts
  - "Stream real-time events"
  - "Insert sensor data"
  - "Load transaction stream"
  - "Buffer and batch inserts"
- Data Export
  - "Export query results to GCS"
  - "Save table as CSV"
  - "Export to Cloud Bigtable"
  - "Create Avro backup"
BigQuery ML
- Model Creation
  - "Create linear regression model"
  - "Build classification model"
  - "Train time series model"
  - "Create clustering model"
- Model Operations
  - "Evaluate model performance"
  - "Make predictions"
  - "Explain model features"
  - "Export model"
- ML Workflows
  - "Retrain with new data"
  - "A/B test models"
  - "Feature engineering"
  - "Hyperparameter tuning"
Data Discovery
- Metadata Search
  - "Find tables with PII"
  - "Search column names"
  - "Discover data assets"
  - "Tag sensitive data"
- Data Profiling
  - "Analyze data quality"
  - "Find null percentages" (see the sketch after this list)
  - "Detect anomalies"
  - "Profile distributions"
Views & Materialized Views
- View Management
  - "Create business logic view"
  - "Update view definition"
  - "Create authorized view"
  - "List view dependencies"
- Materialized Views
  - "Create materialized view"
  - "Refresh materialized data"
  - "Monitor refresh status"
  - "Optimize query performance"
Security & Access Control
- IAM Management
  - "Grant dataset access"
  - "Create service account"
  - "Set row-level security" (see the sketch after this list)
  - "Audit permissions"
- Data Governance
  - "Apply column encryption"
  - "Set data retention"
  - "Configure DLP scanning"
  - "Manage data policies"
Prerequisites
- Access to Cequence AI Gateway
- Google Cloud Project
- BigQuery API enabled
- Service account or OAuth credentials
Step 1: Configure BigQuery API Access
1.1 Enable BigQuery API
- Go to Google Cloud Console
- Navigate to APIs & Services > Library
- Search for "BigQuery API"
- Click Enable
1.2 Create Service Account
- Go to IAM & Admin > Service Accounts
- Click Create Service Account
- Configure:
  - Name: "AI Gateway BigQuery"
  - Role: BigQuery Admin (or custom role)
- Create and download JSON key
1.3 OAuth 2.0 Setup (Alternative)
- Go to APIs & Services > Credentials
- Create OAuth 2.0 Client ID
- Configure consent screen
- Add authorized redirect URIs
1.4 Set Permissions
Grant the necessary permissions:
- bigquery.datasets.create
- bigquery.tables.create
- bigquery.tables.getData
- bigquery.jobs.create
- Additional permissions as needed
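If you prefer least privilege over the broad BigQuery Admin role, these permissions can be bundled into a custom IAM role. A minimal sketch of the role definition (title, description, and role contents are placeholders to adapt to your project):
{
  "title": "AI Gateway BigQuery Access",
  "description": "Minimal permissions for the BigQuery MCP server",
  "stage": "GA",
  "includedPermissions": [
    "bigquery.datasets.create",
    "bigquery.tables.create",
    "bigquery.tables.getData",
    "bigquery.jobs.create"
  ]
}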
Steps 2-4: Standard Setup
Follow the standard steps to access the AI Gateway, locate the BigQuery API, and create your MCP server.
Step 5: Configure API Endpoints
- Base URL: https://bigquery.googleapis.com/bigquery/v2
- Project ID: Your GCP project ID
- Location: Dataset location (e.g., US, EU)
- Click Next
Step 6: MCP Server Configuration
- Name: "BigQuery Analytics"
- Description: "Data warehouse and analytics"
- Query Timeout: 300 seconds (5 minutes)
- Click Next
Step 7: Configure Authentication
- Authentication Type: Service Account or OAuth 2.0
- For Service Account:
  - Upload JSON key file
  - Or paste key contents
- For OAuth 2.0:
  - Enter Client ID and Secret
  - Configure scopes
- Test connection
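A lightweight way to confirm the connection is a metadata-only call that needs valid credentials but processes no table data. A sketch, assuming the MCP client exposes a hypothetical listDatasets method:
// Hypothetical smoke test: list datasets in the configured project
const datasets = await mcpClient.bigquery.listDatasets({
  projectId: 'your-project-id'
});
console.log(`Connected. Found ${datasets.length} datasets.`);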
Available BigQuery API Operations
Query APIs
- Query Execution
  - Run interactive queries
  - Execute batch queries
  - Get query results
  - List query jobs
- Query Configuration
  - Set query parameters
  - Configure destination
  - Set processing location
  - Control caching
Dataset APIs
- Dataset Management
  - Create dataset
  - Update properties
  - Delete dataset
  - List datasets
- Access Control
  - Get/Set ACLs
  - Share datasets
  - Row-level security
Table APIs
- Table Operations
  - Create table
  - Update schema
  - Delete table
  - Copy table
- Data Operations
  - Insert rows
  - Stream data
  - Get table data
  - Export table
Job APIs
- Job Management
  - List jobs
  - Get job details
  - Cancel job
  - Get job statistics
Model APIs (BigQuery ML)
- Model Operations
  - Create model
  - List models
  - Delete model
  - Export model
- Prediction
  - ML.PREDICT
  - ML.EVALUATE
  - ML.EXPLAIN_PREDICT
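The ML functions are invoked through standard SQL rather than dedicated endpoints. For example, evaluating a trained model might look like the following (model and table names mirror the illustrative examples later in this guide):
// Illustrative: evaluate a BigQuery ML model against held-out data
const evaluation = await mcpClient.bigquery.query({
  query: `
    SELECT *
    FROM ML.EVALUATE(
      MODEL \`analytics_warehouse.customer_churn_model\`,
      (SELECT * FROM \`analytics_warehouse.customer_features_holdout\`)
    )
  `,
  useLegacySql: false
});
console.log(evaluation.rows[0]);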
Steps 8-10: Complete Setup
Configure security settings, choose deployment options, and deploy your server.
Using Your BigQuery MCP Server
With Claude Desktop
{
  "servers": {
    "bigquery": {
      "url": "your-mcp-server-url",
      "auth": {
        "type": "service-account",
        "credentials": "base64-encoded-service-account-key"
      },
      "config": {
        "project_id": "your-project-id",
        "location": "US"
      }
    }
  }
}
Natural Language Commands
- "Show me total revenue by product category last month"
- "Create a table for customer transactions"
- "Load the sales CSV file into BigQuery"
- "Build a model to predict customer churn"
- "Export query results to Google Cloud Storage"
API Integration Example
// Initialize MCP client
const mcpClient = new MCPClient({
  serverUrl: 'your-mcp-server-url',
  auth: {
    type: 'service-account',
    credentials: serviceAccountKey
  }
});
// Run a query
const queryResults = await mcpClient.bigquery.query({
  query: `
    SELECT
      product_category,
      SUM(revenue) as total_revenue,
      COUNT(DISTINCT customer_id) as unique_customers,
      AVG(order_value) as avg_order_value
    FROM \`project.dataset.sales\`
    WHERE date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
    GROUP BY product_category
    ORDER BY total_revenue DESC
    LIMIT 10
  `,
  useLegacySql: false,
  location: 'US'
});

console.log('Top Product Categories:');
queryResults.rows.forEach(row => {
  console.log(`${row.product_category}: $${row.total_revenue.toLocaleString()}`);
});
// Create a dataset
const dataset = await mcpClient.bigquery.createDataset({
  datasetId: 'analytics_warehouse',
  location: 'US',
  description: 'Central analytics data warehouse',
  defaultTableExpirationMs: 90 * 24 * 60 * 60 * 1000, // 90 days
  labels: {
    environment: 'production',
    team: 'analytics'
  }
});
// Create a partitioned table
const table = await mcpClient.bigquery.createTable({
  datasetId: 'analytics_warehouse',
  tableId: 'events',
  schema: {
    fields: [
      { name: 'event_timestamp', type: 'TIMESTAMP', mode: 'REQUIRED' },
      { name: 'user_id', type: 'STRING', mode: 'REQUIRED' },
      { name: 'event_type', type: 'STRING', mode: 'REQUIRED' },
      { name: 'event_properties', type: 'JSON', mode: 'NULLABLE' },
      { name: 'device_info', type: 'RECORD', mode: 'NULLABLE', fields: [
        { name: 'platform', type: 'STRING' },
        { name: 'version', type: 'STRING' },
        { name: 'browser', type: 'STRING' }
      ]}
    ]
  },
  timePartitioning: {
    type: 'DAY',
    field: 'event_timestamp',
    expirationMs: 365 * 24 * 60 * 60 * 1000 // 1 year
  },
  clustering: {
    fields: ['user_id', 'event_type']
  }
});
// Load data from Cloud Storage
const loadJob = await mcpClient.bigquery.loadTableFromGCS({
  datasetId: 'analytics_warehouse',
  tableId: 'customers',
  sourceUris: [
    'gs://my-bucket/customers/customers_*.csv'
  ],
  sourceFormat: 'CSV',
  skipLeadingRows: 1,
  autodetect: true,
  writeDisposition: 'WRITE_TRUNCATE',
  maxBadRecords: 100,
  allowJaggedRows: false
});

// Wait for job completion
const completedJob = await mcpClient.bigquery.waitForJob({
  jobId: loadJob.jobReference.jobId,
  location: loadJob.jobReference.location
});

console.log(`Loaded ${completedJob.statistics.load.outputRows} rows`);
// Stream data
const streamResponse = await mcpClient.bigquery.insertRows({
  datasetId: 'analytics_warehouse',
  tableId: 'events',
  rows: [
    {
      insertId: 'event-001',
      json: {
        event_timestamp: new Date().toISOString(),
        user_id: 'user123',
        event_type: 'page_view',
        event_properties: {
          page: '/products',
          referrer: 'google.com'
        },
        device_info: {
          platform: 'web',
          version: '2.0',
          browser: 'Chrome'
        }
      }
    }
  ],
  skipInvalidRows: false,
  ignoreUnknownValues: false
});
// Create BigQuery ML model
const modelQuery = `
  CREATE OR REPLACE MODEL \`analytics_warehouse.customer_churn_model\`
  OPTIONS(
    model_type='LOGISTIC_REG',
    input_label_cols=['churned'],
    auto_class_weights=TRUE,
    data_split_method='AUTO_SPLIT',
    max_iterations=20
  ) AS
  SELECT
    customer_age,
    total_purchases,
    days_since_last_purchase,
    average_order_value,
    support_tickets_count,
    churned
  FROM \`analytics_warehouse.customer_features\`
  WHERE date >= '2024-01-01'
`;

const modelJob = await mcpClient.bigquery.query({
  query: modelQuery,
  useLegacySql: false
});
// Make predictions
const predictions = await mcpClient.bigquery.query({
  query: `
    SELECT
      customer_id,
      predicted_churned,
      predicted_churned_probs[OFFSET(1)].prob as churn_probability
    FROM ML.PREDICT(
      MODEL \`analytics_warehouse.customer_churn_model\`,
      (
        SELECT
          customer_id,
          customer_age,
          total_purchases,
          days_since_last_purchase,
          average_order_value,
          support_tickets_count
        FROM \`analytics_warehouse.customer_features\`
        WHERE date = CURRENT_DATE()
      )
    )
    WHERE predicted_churned_probs[OFFSET(1)].prob > 0.7
    ORDER BY churn_probability DESC
  `
});

console.log(`${predictions.totalRows} customers at high risk of churn`);
// Create materialized view
const materializedView = await mcpClient.bigquery.query({
  query: `
    CREATE MATERIALIZED VIEW \`analytics_warehouse.daily_revenue_summary\`
    PARTITION BY DATE(order_date)
    CLUSTER BY product_category
    AS
    SELECT
      DATE(order_timestamp) as order_date,
      product_category,
      SUM(revenue) as daily_revenue,
      COUNT(DISTINCT customer_id) as unique_customers,
      COUNT(*) as order_count
    FROM \`analytics_warehouse.orders\`
    GROUP BY order_date, product_category
  `
});
// Create authorized view with row-level security
// (account_owner_email is an illustrative column holding each row's owner email,
// so SESSION_USER() limits results to the querying user's rows)
const authorizedView = await mcpClient.bigquery.createView({
  datasetId: 'analytics_warehouse',
  viewId: 'customer_data_secure',
  view: {
    query: `
      SELECT *
      FROM \`analytics_warehouse.customers\`
      WHERE account_owner_email = SESSION_USER()
    `,
    useLegacySql: false
  },
  access: [
    {
      role: 'READER',
      userByEmail: 'analyst@company.com'
    }
  ]
});
// Export query results
const exportJob = await mcpClient.bigquery.exportQueryResults({
  query: 'SELECT * FROM `analytics_warehouse.customers` WHERE active = true',
  destinationUri: 'gs://my-bucket/exports/active_customers_*.csv',
  format: 'CSV',
  compression: 'GZIP',
  fieldDelimiter: ',',
  printHeader: true
});
// Monitor long-running query
const queryJob = await mcpClient.bigquery.startQuery({
  query: complexAnalyticsQuery,
  useQueryCache: false,
  maximumBytesBilled: 1000000000000 // 1TB limit
});

// Check job status
const jobStatus = await mcpClient.bigquery.getJob({
  jobId: queryJob.jobReference.jobId
});

if (jobStatus.status.state === 'RUNNING') {
  console.log(`Query progress: ${jobStatus.statistics.query.totalBytesProcessed} bytes processed`);
}
// Set up scheduled query
const scheduledQuery = await mcpClient.bigquery.createScheduledQuery({
  displayName: 'Daily Revenue Report',
  schedule: 'every day 02:00',
  timeZone: 'America/New_York',
  query: `
    INSERT INTO \`analytics_warehouse.daily_reports\`
    SELECT
      CURRENT_DATE() as report_date,
      SUM(revenue) as total_revenue,
      COUNT(DISTINCT customer_id) as customers
    FROM \`analytics_warehouse.orders\`
    WHERE DATE(order_timestamp) = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
  `,
  destinationDataset: 'analytics_warehouse',
  writeDisposition: 'WRITE_APPEND'
});
Common Use Cases
Data Analytics
- Business intelligence
- Customer analytics
- Product analytics
- Financial reporting
Data Engineering
- ETL pipelines
- Data warehousing
- Real-time streaming
- Data lake integration
Machine Learning
- Predictive modeling
- Customer segmentation
- Anomaly detection
- Recommendation systems
Data Governance
- Access control
- Data lineage
- Compliance reporting
- Data quality monitoring
Best Practices
- Query Optimization:
  - Use partitioning and clustering
  - Avoid SELECT *
  - Use approximate aggregation functions
  - Materialize frequently used queries
- Cost Management (see the sketch after this list):
  - Set byte processing limits
  - Use query preview
  - Enable query caching
  - Monitor slot usage
- Data Loading:
  - Batch small files
  - Use appropriate file formats
  - Validate schemas
  - Handle errors gracefully
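As an illustration of the cost-control items above, a hedged sketch using the same MCP client as earlier (the dryRun option mirrors the BigQuery API's dry-run flag; exact option names may differ in your deployment):
// Estimate bytes scanned before running the real query (assumed dryRun option)
const dryRun = await mcpClient.bigquery.query({
  query: 'SELECT product_category, APPROX_COUNT_DISTINCT(customer_id) AS customers FROM `analytics_warehouse.orders` GROUP BY product_category',
  useLegacySql: false,
  dryRun: true
});
console.log(`Estimated bytes processed: ${dryRun.totalBytesProcessed}`);

// Run it with a hard cap so a runaway query cannot bill more than ~100 GB
const capped = await mcpClient.bigquery.query({
  query: 'SELECT product_category, APPROX_COUNT_DISTINCT(customer_id) AS customers FROM `analytics_warehouse.orders` GROUP BY product_category',
  useLegacySql: false,
  maximumBytesBilled: 100 * 1024 * 1024 * 1024 // 100 GB
});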
Troubleshooting
Common Issues
- Authentication Errors
  - Verify service account key
  - Check API enablement
  - Validate permissions
  - Review OAuth scopes
- Query Errors (see the sketch after this list)
  - Check SQL syntax
  - Verify table references
  - Review data types
  - Check resource limits
- Performance Issues
  - Analyze query plan
  - Check data distribution
  - Review slot allocation
  - Optimize table design
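When a query fails, the details live on the job's status. A minimal sketch using the getJob call from the earlier examples (field names follow the BigQuery jobs resource; the job ID is a placeholder):
// Inspect a failed job: errorResult is the fatal error, errors lists all issues
const failedJob = await mcpClient.bigquery.getJob({ jobId: 'your-failed-job-id' });
if (failedJob.status.errorResult) {
  console.error(`Reason: ${failedJob.status.errorResult.reason}`);
  console.error(`Message: ${failedJob.status.errorResult.message}`);
  (failedJob.status.errors || []).forEach(e => console.error(`- ${e.message}`));
}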
Getting Help
- Documentation: AI Gateway Docs
- Support: support@cequence.ai
- BigQuery Docs: cloud.google.com/bigquery/docs