Monitoring API usage
This page describes how to use API metrics to track and understand your usage
of Google APIs and Google Cloud APIs.
Google APIs produce detailed usage metrics that can help you:
Track and understand your usage of Google APIs.
Monitor performance of your applications and Google APIs.
Discover problems between your applications and Google APIs.
These metrics can also dramatically speed up resolution times when you
troubleshoot problems or need technical support from Google.
The metrics that Google APIs produce are the standard signals that Google's
own Site Reliability Engineers use to assess the health of a service.
These metrics cover request counts, error rates, total latencies, backend
latencies, request sizes, and response sizes. For the API metric definitions,
see the Cloud Monitoring documentation.
You can view API metrics in two places: the API Dashboard and
Cloud Monitoring. The metrics you see are specific to
your project; they don't reflect the overall service status.
Using the API Dashboard
The simplest way to view your API metrics is to use the Google Cloud
console's API Dashboard. You can see an
overview of all your API usage, or you can drill down to your usage of a
specific API.
To see an overview of your API usage:
Visit the Google Cloud console's APIs and Services section.
The main API Dashboard is displayed by default. On this page you can
see all the APIs you currently have enabled for your project,
as well as overview charts for the following metrics:
Traffic: the number of requests per second made by or about your
project to enabled APIs
Errors: the percentage of requests to enabled APIs that
resulted in errors
Median latency: the median latency for requests to enabled APIs,
if available
To view usage details for a specific API:
Select the API you want to view in the main API Dashboard list of APIs.
The API's Overview page shows a more detailed traffic chart with a
breakdown by response code.
For even more detailed usage information, select View metrics.
By default, the following pre-built charts are displayed,
though more are available:
Traffic by response code
Errors by API method
Overall latency at the 50th, 95th, and 99th percentile
Latency by API method (median)
If you want to add more charts, you can select additional
pre-built charts from the Select Graphs drop-down menu.
Using Cloud Monitoring
If you use Cloud Monitoring, you can use the Metrics Explorer to dig deeper
into the available metrics data and gain greater insight into your API usage.
Cloud Monitoring supports a wide variety of metrics, which you can combine
with filters and aggregations for new and insightful views into your application
performance. For example, you can combine a request count metric with a filter
on the HTTP Response Code class to build a dashboard that shows error rates over
time, or you can look at the 95th percentile latency of requests to the Cloud
Pub/Sub API.
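As a rough illustration of the error-rate view described above, the following Python sketch computes an error rate per time interval from request counts that are already grouped by response code class. The input tuples, the function name error_rate_by_interval, and the choice to count both "4xx" and "5xx" as errors are assumptions made for this example; it does not call Cloud Monitoring.

```python
from collections import defaultdict


def error_rate_by_interval(points, error_classes=("4xx", "5xx")):
    """Return {interval_start: error_rate} for request-count samples.

    `points` is an iterable of (interval_start, response_code_class, count)
    tuples. This shape is hypothetical and stands in for request counts
    exported from a monitoring query.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for interval_start, code_class, count in points:
        totals[interval_start] += count
        if code_class in error_classes:
            errors[interval_start] += count
    # Skip intervals with no traffic to avoid dividing by zero.
    return {t: errors[t] / totals[t] for t in sorted(totals) if totals[t] > 0}


# Made-up sample data: two one-minute intervals.
sample = [
    ("10:00", "2xx", 95), ("10:00", "5xx", 5),
    ("10:01", "2xx", 80), ("10:01", "4xx", 10), ("10:01", "5xx", 10),
]
rates = error_rate_by_interval(sample)
print(rates)
```

Whether 4xx responses should count as errors depends on your application; pass a different error_classes tuple if only server-side failures matter to you.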
Available metrics
The following table lists the available serviceruntime metrics.
The API-usage metrics are those that include
consumed_api as a monitored resource.
The "metric type" strings in this table must be prefixed
with serviceruntime.googleapis.com/. That prefix has been
omitted from the entries in the table.
When querying a label, use the metric.labels. prefix; for
example, metric.labels.LABEL="VALUE".
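The prefix rules above can be sketched as a small helper that assembles a filter string. This is a minimal example, not an official client: the helper name build_filter is hypothetical, and api/request_count is used as the metric type on the assumption that it is the request-count entry whose name is omitted from the table as shown here. The resource.type clause follows the Consumed API resource type mentioned on this page.

```python
SERVICERUNTIME_PREFIX = "serviceruntime.googleapis.com/"


def build_filter(metric_type, labels=None, resource_type="consumed_api"):
    """Build a filter string for a serviceruntime metric.

    `metric_type` may be given with or without the serviceruntime prefix.
    `labels` maps metric label names to the values to match.
    """
    if not metric_type.startswith(SERVICERUNTIME_PREFIX):
        metric_type = SERVICERUNTIME_PREFIX + metric_type
    clauses = [
        f'metric.type="{metric_type}"',
        f'resource.type="{resource_type}"',
    ]
    # Sort labels so the generated string is deterministic.
    for name, value in sorted((labels or {}).items()):
        clauses.append(f'metric.labels.{name}="{value}"')
    return " AND ".join(clauses)


print(build_filter("api/request_count", {"response_code_class": "5xx"}))
```

The resulting string can be pasted into a filter field or passed to whichever client you use to query time series.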
Metric type Launch stage(Resource hierarchy levels) Display name
The count of completed requests. Sampled every 60 seconds. After sampling, data is not visible for up to 1800 seconds. protocol:
The protocol of the request, e.g. "http", "grpc".
response_code:
The HTTP response code for HTTP requests, or HTTP equivalent code for gRPC requests. See code mapping in https://github.com/googleapis/googleapis/blob/master/google/rpc/code.proto.
response_code_class:
The response code class for HTTP requests, or HTTP equivalent class for gRPC requests, e.g. "2xx", "4xx".
grpc_status_code:
The numeric gRPC response code for gRPC requests, or gRPC equivalent code for HTTP requests. See code mapping in https://github.com/googleapis/googleapis/blob/master/google/rpc/code.proto.
Distribution of backend latencies in seconds for non-streaming requests. Sampled every 60 seconds. After sampling, data is not visible for up to 1800 seconds.
Distribution of request latencies in seconds for non-streaming requests excluding the backend. Sampled every 60 seconds. After sampling, data is not visible for up to 1800 seconds.
Distribution of request sizes in bytes recorded at request completion. Sampled every 60 seconds. After sampling, data is not visible for up to 1800 seconds.
Distribution of response sizes in bytes recorded at request completion. Sampled every 60 seconds. After sampling, data is not visible for up to 1800 seconds.
The count of MCP requests. response_code:
The HTTP response code for HTTP requests, or HTTP equivalent code for MCP requests.
response_code_class:
The response code class for HTTP requests, or HTTP equivalent class for gRPC requests, e.g. '2xx', '4xx'.
The total consumed allocation quota. Values reported more than 1/min are dropped. If no changes are received in quota usage, the last value is repeated at least every 24 hours. Sampled every 60 seconds. quota_metric:
The name of quota metric or quota group.
The number of times exceeding the concurrent quota was attempted. Sampled every 86400 seconds. After sampling, data is not visible for up to 180 seconds. limit_name:
The quota limit name, such as "Requests per day" or "In-use IP addresses".
quota_metric:
The name of quota metric or quota group.
time_window:
The window size for concurrent operation limits.
The concurrent limit for the quota. Sampled every 86400 seconds. After sampling, data is not visible for up to 180 seconds. limit_name:
The quota limit name, such as "Requests per day" or "In-use IP addresses".
quota_metric:
The name of quota metric or quota group.
time_window:
The window size for concurrent operation limits.
The concurrent usage of the quota. Sampled every 60 seconds. After sampling, data is not visible for up to 180 seconds. limit_name:
The quota limit name, such as "Requests per day" or "In-use IP addresses".
quota_metric:
The name of quota metric or quota group.
time_window:
The window size for concurrent operation limits.
The error happened when the quota limit was exceeded. Sampled every 60 seconds. limit_name:
The quota limit name, such as "Requests per day" or "In-use IP addresses".
quota_metric:
The name of quota metric or quota group.
The limit for the quota. Sampled every 86400 seconds. limit_name:
The quota limit name, such as "Requests per day" or "In-use IP addresses".
quota_metric:
The name of quota metric or quota group.
The total consumed rate quota. Sampled every 60 seconds. After sampling, data is not visible for up to 240 seconds. method:
The API method name, such as "disks.list".
quota_metric:
The name of quota metric or quota group.
The number of times exceeding the windowed rate quota was attempted. Sampled every 86400 seconds. limit_name:
The quota limit name, such as "Requests per day" or "In-use IP addresses".
quota_metric:
The name of quota metric or quota group.
window_size:
The window size for rate operation limits.
The windowed rate limit for the quota. Sampled every 86400 seconds. limit_name:
The quota limit name, such as "Requests per day" or "In-use IP addresses".
quota_metric:
The name of quota metric or quota group.
window_size:
The window size for rate operation limits.
The windowed rate usage of the quota. Sampled every 60 seconds. method:
The API method name, such as "disks.list".
limit_name:
The quota limit name, such as "RequestsPerDay" or "InUseIpAddresses".
quota_metric:
The name of quota metric or quota group.
window_size:
The window size for rate operation limits.
window_start_time:
The start time of the window.
Deprecated. Sampled every 60 seconds. After sampling, data is not visible for up to 180 seconds. quota_name:
Deprecated.
credential_id:
Deprecated.
quota_location:
Deprecated.
Table generated at 2026-06-18 17:12:37 UTC.
To see API metrics in Metrics Explorer, select Consumed API as the resource
type, then select one of the serviceruntime metrics. Then use the filter and
aggregation options to refine your data.
After you've found the API usage information you want, you can use
Cloud Monitoring to create custom dashboards and alerts that will help you
continue to monitor and maintain a robust application. To learn how, see the
Cloud Monitoring documentation on dashboards and alerting.
API metrics can be particularly useful if you need to contact Google when
something goes wrong, and may even show you that you don't need to contact
support at all. For example:
If all of your calls to a service are failing for a single credential ID, but
not for any other, chances are that something is wrong with that account that
you can easily fix yourself without opening a ticket.
You’re troubleshooting a problem with your app, and notice a correlation
between your application’s degraded performance and a sustained increase in
the 50th percentile latency of a critical GCP service. Definitely call us and
point us to this data so we can start working on the problem as quickly as
possible.
The latencies reported for a GCP service look good and unchanged from before,
but your in-app metrics report that the latency on calls to the service is
abnormally high. That suggests a problem somewhere in the network.
Call your network provider (in some cases, Google) to get the debugging
process started.
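The first scenario above, where calls fail for only one credential, could be checked with a sketch like the following. It assumes you have already exported request counts grouped by credential ID and response code class; the tuple shape, the function names, and the sample credential IDs are hypothetical.

```python
from collections import defaultdict


def error_rate_by_credential(records, error_classes=("4xx", "5xx")):
    """Return {credential_id: error_rate} from (credential_id, class, count)."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for credential_id, code_class, count in records:
        totals[credential_id] += count
        if code_class in error_classes:
            errors[credential_id] += count
    return {c: errors[c] / totals[c] for c in totals if totals[c] > 0}


def single_failing_credential(records):
    """Return the one credential whose calls all fail, or None.

    None is also returned when several credentials fail completely or when
    no other credential is succeeding, since the problem is then unlikely
    to be specific to one account.
    """
    rates = error_rate_by_credential(records)
    failing = [c for c, rate in rates.items() if rate == 1.0]
    healthy = [c for c, rate in rates.items() if rate < 1.0]
    if len(failing) == 1 and healthy:
        return failing[0]
    return None


# Made-up sample: one API key is mostly fine, one service account always fails.
sample = [
    ("apikey:abc", "2xx", 120),
    ("apikey:abc", "5xx", 1),
    ("serviceaccount:xyz", "4xx", 40),
]
culprit = single_failing_credential(sample)
print(culprit)
```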
Best practices
While API metrics are an extremely useful tool, there are issues you need to
consider to make sure they provide useful information, particularly when setting
up alerts based on metric values. The following best practices will help you get
the most from API metrics data.
Is latency causing a problem?
While some services are quite latency-sensitive, for others scale and
reliability matter more. Some APIs, Cloud Storage or
BigQuery for example, can have a couple of seconds of high
latency without customers noticing. With data from API metrics, you can learn
what your users need from a given service.
Look for changes from the norm
Before you decide to alert on a particular metric value, consider what actually
counts as unusual behavior. Looking at your API metrics can show you that
latency results for most services fall within a normal distribution: a big hump
in the middle, and outliers on either side. The metrics will help you understand
the normal distribution so that you can engineer your app to work well within
the distribution curve. Metrics can also help you correlate distribution changes
with times where your app is not working as intended, to help you find the root
cause of an issue. We expect the 99th percentile to look very different from the
median; what we don’t expect are dramatic changes in those percentiles
over time.
You may also see that some kinds of requests take longer than others. If the
median size of a photo uploaded to Google Photos is 4 MB, but you normally
upload 20 MB RAW files, your average time to upload 20 photos is likely to be
substantially worse than that of most users, yet that is still normal behavior
for you.
All this means that it's not particularly useful to alert the first time a
second-long RPC or 5xx HTTP call is detected. Instead, when investigating a
Google service as a possible cause for an issue your application is
experiencing, compare the return codes and latencies over time and
look for sustained changes from the norm that correlate with observed issues
in your application.
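One way to express "sustained change from the norm" is to compare the most recent samples of a percentile against a baseline taken from the samples just before them, and to flag a shift only when every recent sample is elevated. The sketch below is a minimal illustration; the function name sustained_shift, the 1.5x factor, and the window lengths are arbitrary assumptions to tune for your own traffic.

```python
from statistics import median


def sustained_shift(series, baseline_len, run_len, factor=1.5):
    """Return True if the last `run_len` values all exceed the baseline.

    The baseline is the median of the `baseline_len` values immediately
    before the recent run. A single spike inside the recent run does not
    trigger, because every recent value must exceed factor * baseline.
    """
    if run_len <= 0 or len(series) < baseline_len + run_len:
        return False
    baseline = median(series[-(baseline_len + run_len):-run_len])
    recent = series[-run_len:]
    return all(value > factor * baseline for value in recent)


# Hypothetical 95th percentile latencies in seconds, one per interval.
one_spike = [0.20, 0.22, 0.19, 0.21, 0.20, 0.21, 0.20, 0.90, 0.21]
sustained = [0.20, 0.22, 0.19, 0.21, 0.20, 0.21, 0.45, 0.50, 0.48]

print(sustained_shift(one_spike, baseline_len=6, run_len=3))
print(sustained_shift(sustained, baseline_len=6, run_len=3))
```

Using the median for the baseline keeps an occasional outlier in the baseline window from masking a later shift.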
Traffic rate
API metrics are most useful where you have a high volume of traffic going to the
API. If you call a service only intermittently, your API metrics won’t be
statistically valid and won’t give you meaningful triage information.
For example, if you want to track the 99.5th percentile latency for a service,
and you only do 100 calls an hour, watching the measurement over a two-hour
period would give you only one data point beyond the 99.5th percentile
(0.5% of 200 calls), which won't tell you much about the normal behavior of
the API or your
application. Make sure the traffic rate, the percentile you are tracking,
and the time window you are considering generate many data points of interest
or the monitoring data will not be helpful to you.
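The arithmetic behind this example can be written out as a short sketch. The function names are hypothetical, and the target of 30 tail samples in the second call is only an illustrative choice, not a recommendation from this page.

```python
import math


def expected_tail_samples(calls_per_hour, window_hours, percentile):
    """Expected number of calls that fall above the given percentile."""
    total_calls = calls_per_hour * window_hours
    return total_calls * (100 - percentile) / 100


def calls_needed(tail_samples, percentile):
    """Total calls needed to expect `tail_samples` above the percentile."""
    return math.ceil(tail_samples * 100 / (100 - percentile))


# 100 calls an hour for two hours, tracking the 99.5th percentile.
print(expected_tail_samples(100, 2, 99.5))

# Calls needed to expect 30 samples above the 99.5th percentile.
print(calls_needed(30, 99.5))
```

If the first number is close to one, widen the time window, track a lower percentile, or accept that the metric will be noisy at that traffic rate.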
Supported APIs
All Google APIs and Google Cloud APIs, as well as APIs built on top of Cloud
Endpoints and API Gateway, support API metrics. If you are an API consumer,
you can view the Consumed API metrics in the
API Dashboard. If you
are an API producer, you can view the Produced API metrics in the
Endpoints Dashboard.
Last updated 2026-06-18 UTC.