Effective monitoring of API gateways requires understanding the key metrics that reveal performance, errors, and usage patterns. This guide covers essential metrics for AWS API Gateway and Oracle API Gateway, showing how to interpret them and set up effective monitoring workflows.

Why Gateway Metrics Matter

API gateways sit in front of every request, so they can easily become bottlenecks. Monitoring these five key areas helps catch issues before they reach users:

  1. Error rates (4xx/5xx)
  2. Latency percentiles
  3. Cache effectiveness
  4. Traffic patterns
  5. Backend integration health

AWS API Gateway Metrics (CloudWatch)

4XXError

Client-side errors (HTTP 4xx) including modified gateway responses

Namespace: AWS/ApiGateway | Unit: Count

5XXError

Server-side errors (HTTP 5xx), typically indicating backend or integration failures such as timeouts

Namespace: AWS/ApiGateway | Unit: Count

Latency

End-to-end request time, from when API Gateway receives a request to when it returns the response to the client

Includes integration latency plus gateway overhead | Unit: Milliseconds

IntegrationLatency

Time between API Gateway relaying a request to the backend and receiving the backend's response

Isolates backend performance | Unit: Milliseconds
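Because Latency includes IntegrationLatency, the difference between the two approximates the time spent in the gateway itself. The following is a minimal sketch of that arithmetic; the function names are illustrative and not part of any AWS SDK. It is meant for averages or per-request values, since percentiles of the two metrics cannot be subtracted meaningfully.

```python
def gateway_overhead_ms(latency_ms, integration_latency_ms):
    """Approximate time spent inside the gateway, outside the backend call."""
    # Clamp at zero in case the two values come from slightly different samples.
    return max(latency_ms - integration_latency_ms, 0.0)


def backend_share(latency_ms, integration_latency_ms):
    """Fraction of end-to-end latency attributable to the backend (0.0 to 1.0)."""
    if latency_ms <= 0:
        return 0.0
    return min(integration_latency_ms / latency_ms, 1.0)


if __name__ == "__main__":
    # Example values only: average Latency 250 ms, average IntegrationLatency 210 ms.
    print(gateway_overhead_ms(250.0, 210.0))
    print(backend_share(250.0, 210.0))
```

A high backend share suggests tuning the integration first; a large overhead points at gateway-side work such as authorizers or request mapping.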

CacheHitCount

Requests served from API cache (when caching enabled)

Compare with CacheMissCount | Unit: Count
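Comparing the two cache counters usually means computing a hit ratio. A small sketch, assuming you have already summed CacheHitCount and CacheMissCount over the same period (the function name is hypothetical):

```python
from typing import Optional


def cache_hit_ratio(hits: int, misses: int) -> Optional[float]:
    """Return hits / (hits + misses), or None when no cacheable requests were seen."""
    total = hits + misses
    if total == 0:
        return None  # Avoid reporting a misleading 0% when there was no traffic.
    return hits / total


if __name__ == "__main__":
    # Example sums for one period; real values would come from CloudWatch.
    print(cache_hit_ratio(900, 100))
```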

Count

Total API requests in period - your traffic baseline

Primary volume metric | Unit: Count

Sample AWS CLI Command

aws cloudwatch get-metric-statistics \
  --namespace AWS/ApiGateway \
  --metric-name Latency \
  --dimensions Name=ApiName,Value=MyAPI Name=Stage,Value=prod \
  --statistics Average \
  --period 3600 \
  --start-time 2025-05-01T00:00:00Z \
  --end-time 2025-05-02T00:00:00Z
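The command above returns JSON with a Datapoints list, and the datapoints are not guaranteed to arrive in chronological order. The sketch below sorts them before further analysis. The sample payload is made up for illustration; only the Label, Datapoints, Timestamp, Average, and Unit keys are taken from the CLI's output shape.

```python
import json
from datetime import datetime

# Illustrative output only; real values come from the CLI call.
SAMPLE_OUTPUT = """
{
  "Label": "Latency",
  "Datapoints": [
    {"Timestamp": "2025-05-01T02:00:00Z", "Average": 131.5, "Unit": "Milliseconds"},
    {"Timestamp": "2025-05-01T00:00:00Z", "Average": 118.2, "Unit": "Milliseconds"},
    {"Timestamp": "2025-05-01T01:00:00Z", "Average": 140.0, "Unit": "Milliseconds"}
  ]
}
"""


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' or an explicit offset."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def sorted_datapoints(raw_json, statistic="Average"):
    """Return (timestamp, value) pairs ordered by time."""
    data = json.loads(raw_json)
    points = [(parse_timestamp(p["Timestamp"]), p[statistic]) for p in data["Datapoints"]]
    return sorted(points, key=lambda point: point[0])


if __name__ == "__main__":
    for timestamp, value in sorted_datapoints(SAMPLE_OUTPUT):
        print(timestamp.isoformat(), value)
```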

Oracle API Gateway Metrics

AWS Metric      Oracle Equivalent                  Key Differences
4XXError        HttpResponses{status=4xx}          Oracle separates by exact status code
5XXError        BackendHttpResponses{status=5xx}   Oracle distinguishes gateway vs backend errors
Latency         Latency + InternalLatency          Oracle provides more granular timing breakdown
CacheHitCount   ResponseCacheAction                Oracle includes cache write metrics

Unique Oracle Metrics

Data Volume

  • BytesReceived: Inbound data size
  • BytesSent: Outbound data size

Business Metrics

  • UsagePlanRequests: Track by subscription tier
  • SubscriberRequests: Per-client usage

Viewing and Analyzing Metrics

CloudWatch Console

  1. Navigate to CloudWatch Metrics
  2. Select API Gateway namespace
  3. Filter by API, stage, or method
  4. Create dashboards with key metrics

Third-Party Tools

  • Datadog: Correlate with app metrics
  • Prometheus: For custom metric collection
  • Grafana: Visualization and alerting

Method-Level Metrics

AWS publishes method-level metrics only after you explicitly enable detailed metrics on a stage, which may incur additional CloudWatch charges. Oracle provides these by default but with higher cardinality costs.

Creating Effective Alerts

Error Rate Alerts

Trigger when 5xx errors exceed 1% of traffic or 4xx errors spike unexpectedly
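An error-rate rule like this divides the 5XXError sum by the Count sum for the same period. The sketch below shows that check; the `min_requests` guard is an added assumption (not from the original rule) to avoid alerting on a handful of requests during quiet periods.

```python
def error_rate_percent(error_count, request_count):
    """Errors as a percentage of total requests; 0.0 when there was no traffic."""
    if request_count == 0:
        return 0.0
    return 100.0 * error_count / request_count


def should_alert(errors_5xx, requests, threshold_percent=1.0, min_requests=100):
    """True when the 5xx rate exceeds the threshold on a meaningful sample size."""
    if requests < min_requests:
        return False  # Assumed guard: too little traffic to judge a rate.
    return error_rate_percent(errors_5xx, requests) > threshold_percent


if __name__ == "__main__":
    print(should_alert(errors_5xx=15, requests=1000))  # 1.5% of traffic
    print(should_alert(errors_5xx=5, requests=1000))   # 0.5% of traffic
```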

Latency Alerts

Monitor p99 latency crossing SLO thresholds (e.g., >500ms)
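CloudWatch can compute percentile statistics such as p99 for you. If you derive latencies from logs instead, a nearest-rank percentile is one simple way to run the same check. This is a sketch under that assumption; the function names and the 500 ms default are illustrative.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of samples at or below it."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def breaches_slo(latencies_ms, slo_ms=500, p=99):
    """True when the p-th percentile latency is above the SLO threshold."""
    return percentile(latencies_ms, p) > slo_ms


if __name__ == "__main__":
    samples = [120, 130, 110, 900, 125]  # Example latencies in milliseconds.
    print(percentile(samples, 99), breaches_slo(samples))
```

Nearest-rank always returns an observed value, so with few samples a single outlier determines p99; alerting on several consecutive periods reduces that noise.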

Traffic Anomalies

Detect unusual request patterns that may indicate attacks
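One simple way to flag unusual request volume is a z-score against recent history. The sketch below is one possible approach, not a full anomaly detector: it assumes `history` holds request counts for comparable recent periods, and the threshold of 3 standard deviations is a common but arbitrary starting point.

```python
from statistics import mean, stdev


def is_traffic_anomaly(history, current, z_threshold=3.0):
    """Flag `current` when it is more than z_threshold standard deviations from the mean."""
    if len(history) < 2:
        return False  # Not enough data to estimate variability.
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return current != mu
    return abs(current - mu) / sigma > z_threshold


if __name__ == "__main__":
    recent_counts = [1000, 1020, 980, 1010, 990]  # Example per-period Count sums.
    print(is_traffic_anomaly(recent_counts, 1500))
    print(is_traffic_anomaly(recent_counts, 1015))
```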

Sample CloudWatch Alarm

aws cloudwatch put-metric-alarm \
  --alarm-name "High-5XX-Rate" \
  --metric-name 5XXError \
  --namespace AWS/ApiGateway \
  --dimensions Name=ApiName,Value=MyAPI Name=Stage,Value=prod \
  --statistic Sum \
  --period 300 \
  --evaluation-periods 2 \
  --threshold 10 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:MyAlerts
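The alarm above fires on an absolute count of 5xx errors. To alarm on a percentage of traffic instead, CloudWatch alarms can evaluate a metric math expression. The sketch below only builds the alarm definition as a Python dict; its keys are intended to match the keyword arguments of boto3's `put_metric_alarm`, but verify them against the current SDK documentation before use. The API name, stage, and helper name are placeholders.

```python
import json


def error_rate_alarm(api_name, stage, threshold_percent, sns_topic_arn):
    """Build an alarm definition that fires when 5XXError / Count exceeds a percentage."""
    dimensions = [
        {"Name": "ApiName", "Value": api_name},
        {"Name": "Stage", "Value": stage},
    ]

    def metric_query(query_id, metric_name):
        # Input series for the expression; not evaluated by the alarm directly.
        return {
            "Id": query_id,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/ApiGateway",
                    "MetricName": metric_name,
                    "Dimensions": dimensions,
                },
                "Period": 300,
                "Stat": "Sum",
            },
            "ReturnData": False,
        }

    return {
        "AlarmName": f"High-5XX-Rate-{api_name}-{stage}",
        "ComparisonOperator": "GreaterThanThreshold",
        "EvaluationPeriods": 2,
        "Threshold": threshold_percent,
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
        "Metrics": [
            metric_query("errors", "5XXError"),
            metric_query("requests", "Count"),
            {
                "Id": "error_rate",
                "Expression": "100 * errors / requests",
                "Label": "5XX error rate (%)",
                "ReturnData": True,  # The alarm evaluates this series only.
            },
        ],
    }


if __name__ == "__main__":
    alarm = error_rate_alarm(
        "MyAPI", "prod", 1.0, "arn:aws:sns:us-east-1:123456789012:MyAlerts"
    )
    print(json.dumps(alarm, indent=2))
```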

Advanced Custom Metrics

When built-in metrics aren't sufficient, create custom metrics from:

Access Logs

  • Parse with Lambda or Fluentd
  • Extract client-specific patterns
  • Calculate business metrics
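As an illustration of log-derived metrics, the sketch below counts error responses per client from JSON access log lines. API Gateway access log formats are user-defined, so the `ip` and `status` field names here are assumptions; adjust them to whatever your configured format emits.

```python
import json
from collections import Counter


def errors_by_client(log_lines):
    """Count 4xx/5xx responses per (client IP, status class) from JSON log lines."""
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
            status = int(entry.get("status", 0))
        except (ValueError, TypeError, AttributeError):
            continue  # Skip malformed lines or non-numeric statuses.
        if status >= 400:
            status_class = f"{status // 100}xx"
            counts[(entry.get("ip", "unknown"), status_class)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical log lines for demonstration.
    sample = [
        '{"ip": "203.0.113.7", "status": "404", "path": "/orders/1"}',
        '{"ip": "203.0.113.7", "status": "404", "path": "/orders/2"}',
        '{"ip": "198.51.100.2", "status": "502", "path": "/pay"}',
        '{"ip": "198.51.100.2", "status": "200", "path": "/pay"}',
        "not json",
    ]
    for key, count in errors_by_client(sample).most_common():
        print(key, count)
```

The resulting counts could then be published as custom metrics or fed into a dashboard.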

X-Ray Traces

  • Analyze latency distributions
  • Track downstream dependencies
  • Identify bottleneck segments
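To find bottleneck segments, you can walk a trace's nested subsegments and rank them by duration. The sketch below assumes a segment document shaped like X-Ray's, with `name`, `start_time`, `end_time` (epoch seconds), and optional `subsegments`; the sample data and function name are invented for illustration.

```python
def slowest_subsegments(segment, top_n=3):
    """Return up to top_n (path, duration_seconds) pairs, slowest first."""
    results = []

    def walk(node, path):
        for sub in node.get("subsegments", []):
            sub_path = path + [sub["name"]]
            # In-progress subsegments have no end_time yet; skip their duration.
            if "start_time" in sub and "end_time" in sub:
                results.append(("/".join(sub_path), sub["end_time"] - sub["start_time"]))
            walk(sub, sub_path)

    walk(segment, [segment["name"]])
    return sorted(results, key=lambda item: item[1], reverse=True)[:top_n]


if __name__ == "__main__":
    # Hypothetical trace segment for demonstration.
    trace_segment = {
        "name": "orders-api",
        "start_time": 100.0,
        "end_time": 101.0,
        "subsegments": [
            {"name": "auth", "start_time": 100.0, "end_time": 100.125},
            {
                "name": "lambda",
                "start_time": 100.125,
                "end_time": 100.875,
                "subsegments": [
                    {"name": "dynamodb", "start_time": 100.25, "end_time": 100.75},
                ],
            },
        ],
    }
    for path, duration in slowest_subsegments(trace_segment):
        print(f"{path}: {duration * 1000:.0f} ms")
```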

Real-World Implementation

A fintech company improved its API reliability by:

  • Setting 5xx alerts at 0.5% threshold
  • Creating custom metrics for PII detection
  • Building dashboards with 95th percentile latency

Result: 40% faster incident detection and 25% lower error rates.

Best Practices

1. Monitor Key Ratios

Track the cache hit ratio (CacheHitCount relative to CacheHitCount + CacheMissCount) and 4xx/5xx error rates relative to total Count, rather than raw counts alone

2. Dimension Filtering

Break down by stage, method, and resource for troubleshooting

3. Baseline Comparison

Compare current metrics to historical baselines
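A baseline comparison can be as simple as the percentage change from a typical historical value. The sketch below uses the median of earlier observations (for example, the same hour on previous weeks) as the baseline; that choice and the function name are assumptions, and other baselines such as rolling averages work the same way.

```python
from statistics import median
from typing import Optional, Sequence


def deviation_from_baseline(current: float, baseline_values: Sequence[float]) -> Optional[float]:
    """Percentage change of `current` relative to the median of baseline_values."""
    if not baseline_values:
        return None
    baseline = median(baseline_values)
    if baseline == 0:
        return None  # A percentage change from zero is undefined.
    return 100.0 * (current - baseline) / baseline


if __name__ == "__main__":
    # Example: average latency (ms) for the same hour over the last three weeks.
    previous_weeks = [100.0, 90.0, 110.0]
    print(deviation_from_baseline(150.0, previous_weeks))
```

The median is less sensitive than the mean to a single bad week in the history.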

Conclusion

Effective API gateway monitoring requires tracking the right metrics with appropriate granularity. While AWS and Oracle provide similar core metrics around errors, latency, and traffic, their implementations differ in:

  • Granularity: Oracle provides more detailed status code breakdowns
  • Cost Structure: AWS charges for detailed metrics while Oracle has higher base cardinality
  • Business Metrics: Oracle includes more subscription-aware tracking

For comprehensive monitoring, combine platform metrics with custom metrics from logs and traces. Set up alerts on error rates and latency thresholds, but also monitor trends and ratios that indicate emerging issues before they impact users.