API Gateway Metrics

Effective monitoring of API gateways requires understanding the key metrics that reveal performance, errors, and usage patterns. This guide covers essential metrics for AWS API Gateway and Oracle API Gateway, showing how to interpret them and set up effective monitoring workflows.
Why Gateway Metrics Matter
API gateways sit on the critical path of every request and can become bottlenecks. Monitoring these five areas helps catch problems before they reach users:
- Error rates (4xx/5xx)
- Latency percentiles
- Cache effectiveness
- Traffic patterns
- Backend integration health
AWS API Gateway Metrics (CloudWatch)
4XXError
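Number of client-side errors (HTTP 4xx) in the period. Gateway responses whose status code has been customized to a 4xx value are counted here as well.
5XXError
Number of server-side errors (HTTP 5xx) in the period, whether returned by the backend or generated by API Gateway itself (for example, on an integration timeout)
Latency
Time in milliseconds between API Gateway receiving a request from the client and returning the response to the client. It includes IntegrationLatency plus API Gateway's own overhead.
IntegrationLatency
Time in milliseconds between API Gateway relaying a request to the backend and receiving the backend's response
CacheHitCount
Number of requests served from the API cache in the period (reported only when caching is enabled)
Count
Total number of API requests in the period; this is your traffic baseline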
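Because Latency includes IntegrationLatency, subtracting one from the other gives a rough view of how much time is spent inside the gateway rather than in the backend. The following is a minimal sketch, assuming both values are averages in milliseconds over the same period; the function names are illustrative and not part of any AWS SDK. Treat the result as an approximation, since the two averages may not cover exactly the same set of requests.

```python
def gateway_overhead_ms(latency_ms: float, integration_latency_ms: float) -> float:
    """Approximate time spent in the gateway itself (auth, mapping, throttling)."""
    # Clamp at zero: averages over slightly different request sets can cross.
    return max(latency_ms - integration_latency_ms, 0.0)


def backend_share(latency_ms: float, integration_latency_ms: float) -> float:
    """Fraction of end-to-end latency attributable to the backend (0.0 to 1.0)."""
    if latency_ms <= 0:
        return 0.0
    return min(integration_latency_ms / latency_ms, 1.0)


overhead = gateway_overhead_ms(240.0, 200.0)
share = backend_share(240.0, 180.0)
print(f"gateway overhead: {overhead} ms, backend share: {share:.0%}")
```

A backend share close to 1.0 suggests tuning the integration; a large overhead points at gateway-side work such as authorizers or request mapping.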
Sample AWS CLI Command
aws cloudwatch get-metric-statistics \
--namespace AWS/ApiGateway \
--metric-name Latency \
--dimensions Name=ApiName,Value=MyAPI Name=Stage,Value=prod \
--statistics Average \
--period 3600 \
--start-time 2025-05-01T00:00:00Z \
--end-time 2025-05-02T00:00:00Z
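The command above returns JSON with a Datapoints array, and CloudWatch does not return those datapoints in chronological order. The sketch below shows one way to sort them and find the slowest hour. The sample payload is made up for illustration, and the helper name sorted_averages is an assumption, not an AWS API.

```python
import json
from datetime import datetime


def sorted_averages(cli_output: str) -> list[tuple[datetime, float]]:
    """Return (timestamp, average) pairs from get-metric-statistics JSON, oldest first."""
    payload = json.loads(cli_output)
    points = [
        # Accept both "...Z" and "...+00:00" timestamp suffixes.
        (datetime.fromisoformat(p["Timestamp"].replace("Z", "+00:00")), p["Average"])
        for p in payload.get("Datapoints", [])
    ]
    return sorted(points)


# Illustrative output only; real values come from the CLI call.
sample = """{"Label": "Latency", "Datapoints": [
  {"Timestamp": "2025-05-01T02:00:00+00:00", "Average": 130.0, "Unit": "Milliseconds"},
  {"Timestamp": "2025-05-01T00:00:00+00:00", "Average": 110.0, "Unit": "Milliseconds"},
  {"Timestamp": "2025-05-01T01:00:00Z", "Average": 180.0, "Unit": "Milliseconds"}
]}"""

series = sorted_averages(sample)
peak_time, peak_value = max(series, key=lambda pair: pair[1])
print(f"slowest hour starts {peak_time.isoformat()} at {peak_value} ms")
```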
Oracle API Gateway Metrics
| AWS Metric | Oracle Equivalent | Key Differences |
|---|---|---|
| 4XXError | HttpResponses{status=4xx} | Oracle separates by exact status code |
| 5XXError | BackendHttpResponses{status=5xx} | Oracle distinguishes gateway vs backend errors |
| Latency | Latency + InternalLatency | Oracle provides more granular timing breakdown |
| CacheHitCount | ResponseCacheAction | Oracle includes cache write metrics |
Unique Oracle Metrics
Data Volume
- BytesReceived: Inbound data size
- BytesSent: Outbound data size
Business Metrics
- UsagePlanRequests: Track by subscription tier
- SubscriberRequests: Per-client usage
Viewing and Analyzing Metrics
CloudWatch Console
- Navigate to CloudWatch Metrics
- Select API Gateway namespace
- Filter by API, stage, or method
- Create dashboards with key metrics
Third-Party Tools
- Datadog: Correlate with app metrics
- Prometheus: For custom metric collection
- Grafana: Visualization and alerting
Method-Level Metrics
In AWS, per-method metrics are off by default: you must enable detailed CloudWatch metrics on the stage, which may incur additional CloudWatch charges. Oracle provides method-level breakdowns by default, at the cost of higher metric cardinality.
Creating Effective Alerts
Error Rate Alerts
Trigger when 5xx errors exceed 1% of total requests (5XXError divided by Count), or when 4xx errors rise sharply above their usual level
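The 1% rule is a ratio of two sums over the same period, not a raw error count. Here is a minimal sketch of the check, assuming you already have the 5XXError and Count sums for one period. The min_requests guard is an added assumption to avoid alerting on a handful of requests during quiet hours; it is not something CloudWatch applies for you.

```python
def error_rate(error_count: float, request_count: float) -> float:
    """Errors as a fraction of total requests; 0.0 when there was no traffic."""
    if request_count <= 0:
        return 0.0
    return error_count / request_count


def should_alert_5xx(
    errors: float,
    requests: float,
    threshold: float = 0.01,   # 1% of traffic, as in the rule above
    min_requests: int = 100,   # assumed floor to suppress low-traffic noise
) -> bool:
    """True when the 5xx rate strictly exceeds the threshold on meaningful traffic."""
    if requests < min_requests:
        return False
    return error_rate(errors, requests) > threshold


print(should_alert_5xx(errors=15, requests=1000))  # 1.5% of traffic
print(should_alert_5xx(errors=2, requests=10))     # 20%, but only 10 requests
```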
Latency Alerts
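Alert when p99 latency crosses your SLO threshold (e.g., above 500 ms). Averages hide tail latency, so alert on a percentile statistic rather than on the mean.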
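To see why a percentile is the better alerting signal, consider a small worked example. The sketch below uses the nearest-rank definition of a percentile on a made-up set of latencies; CloudWatch computes its own percentile statistics, so this only illustrates the idea and is not how the service implements it.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical sample: 98 fast requests and 2 very slow ones.
latencies_ms = [100.0] * 98 + [900.0] * 2

average = sum(latencies_ms) / len(latencies_ms)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
print(f"avg={average} ms, p95={p95} ms, p99={p99} ms")
```

In this sample the average stays well under a 500 ms SLO while p99 exceeds it, which is exactly the situation an average-based alert would miss.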
Traffic Anomalies
Detect request volumes or patterns that deviate from the normal baseline, which may indicate an attack or a misbehaving client
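One simple way to flag unusual traffic is to compare the current request count with recent history using a z-score. This is a sketch under stated assumptions: the baseline is a list of Count values from comparable periods (for example, the same hour on previous days), and the limit of 3 standard deviations is a common starting point rather than a recommendation from AWS or Oracle.

```python
from statistics import mean, pstdev


def zscore(current: float, baseline: list[float]) -> float:
    """Number of standard deviations that current lies from the baseline mean."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        # Flat baseline: any deviation at all is infinitely unusual.
        return 0.0 if current == mu else float("inf")
    return (current - mu) / sigma


def is_anomalous(current: float, baseline: list[float], limit: float = 3.0) -> bool:
    """True when current is more than `limit` standard deviations from the mean."""
    return abs(zscore(current, baseline)) > limit


# Hypothetical request counts for the same hour on five previous days.
history = [1000, 1040, 980, 1020, 960]
print(is_anomalous(1500, history))  # sudden surge
print(is_anomalous(1030, history))  # within normal variation
```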
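Sample CloudWatch Alarm
This alarm fires on an absolute count (more than 10 5xx responses in each of two consecutive 5-minute periods) rather than on a percentage of traffic. The dimensions must match those of the metric you want to watch; without them the alarm matches no data.
aws cloudwatch put-metric-alarm \
  --alarm-name "High-5XX-Rate" \
  --metric-name 5XXError \
  --namespace AWS/ApiGateway \
  --dimensions Name=ApiName,Value=MyAPI Name=Stage,Value=prod \
  --statistic Sum \
  --period 300 \
  --evaluation-periods 2 \
  --threshold 10 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:MyAlerts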
Advanced Custom Metrics
When built-in metrics aren't sufficient, create custom metrics from:
Access Logs
- Parse with Lambda or Fluentd
- Extract client-specific patterns
- Calculate business metrics
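As an illustration of extracting client-specific patterns from access logs, the sketch below counts 5xx responses per client. It assumes the access log format has been configured to emit one JSON object per line, with hypothetical keys clientId and status. Your own format defines the actual key names, so adjust them accordingly.

```python
import json
from collections import Counter
from typing import Iterable


def errors_per_client(log_lines: Iterable[str]) -> Counter:
    """Count 5xx responses per client from JSON-formatted access log lines."""
    counts: Counter = Counter()
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict):
            continue
        try:
            # Status may be logged as a string or a number depending on the format.
            status = int(entry.get("status", 0))
        except (TypeError, ValueError):
            continue
        if status >= 500:
            counts[entry.get("clientId", "unknown")] += 1  # hypothetical key name
    return counts


# Made-up log lines for illustration.
lines = [
    '{"clientId": "acme", "status": "502"}',
    '{"clientId": "acme", "status": "200"}',
    '{"clientId": "globex", "status": 504}',
    'not json',
    '{"clientId": "acme", "status": "500"}',
]
per_client = errors_per_client(lines)
print(per_client.most_common())
```

The resulting counts could then be published as a custom metric, with the client as a dimension, by whatever collector parses the logs.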
X-Ray Traces
- Analyze latency distributions
- Track downstream dependencies
- Identify bottleneck segments
Real-World Implementation
A fintech company improved their API reliability by:
- Setting 5xx alerts at 0.5% threshold
- Creating custom metrics for PII detection
- Building dashboards with 95th percentile latency
Result: 40% faster incident detection and 25% lower error rates.
Best Practices
1. Monitor Key Ratios
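Track ratios rather than raw counts: the cache hit ratio (CacheHitCount relative to CacheHitCount plus CacheMissCount) and the 4xx and 5xx error rates relative to Count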
2. Dimension Filtering
Break down by stage, method, and resource for troubleshooting
3. Baseline Comparison
Compare current metrics to historical baselines
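The cache ratio from the first practice is straightforward to compute from the CacheHitCount and CacheMissCount sums for a period. A minimal sketch follows; the function name is illustrative. It returns None instead of 0 when there was no cacheable traffic, so that an idle period is not mistaken for a cold cache.

```python
from typing import Optional


def cache_hit_ratio(hits: float, misses: float) -> Optional[float]:
    """Fraction of cacheable requests served from cache, or None if there were none."""
    total = hits + misses
    if total == 0:
        return None
    return hits / total


ratio = cache_hit_ratio(hits=750, misses=250)
print(f"cache hit ratio: {ratio:.0%}")
```

Raw hit counts rise and fall with traffic, while the ratio stays comparable across busy and quiet periods.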
Conclusion
Effective API gateway monitoring requires tracking the right metrics with appropriate granularity. While AWS and Oracle provide similar core metrics around errors, latency, and traffic, their implementations differ in:
- Granularity: Oracle provides more detailed status code breakdowns
- Cost Structure: AWS charges for detailed metrics while Oracle has higher base cardinality
- Business Metrics: Oracle includes more subscription-aware tracking
For comprehensive monitoring, combine platform metrics with custom metrics from logs and traces. Set up alerts on error rates and latency thresholds, but also monitor trends and ratios that indicate emerging issues before they impact users.