Google Workspace Status Dashboard

This page provides status information on the services that are part of Google Workspace. Check back here to view the current status of the services listed below. If you are experiencing an issue not listed here, please contact Support. Learn more about what's posted on the dashboard in this FAQ. For additional information on these services, please visit https://workspace.google.com/. For incidents related to Google Analytics, visit the Google Ads Status Dashboard.

Incident affecting NotebookLM

Incident began at 2025-09-29 17:53 and ended at 2025-09-29 18:42 (times are in Coordinated Universal Time (UTC)).
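The window above is given in UTC, while the reports below use US/Pacific. As a quick cross-check, the following minimal sketch converts the two timestamps and computes the duration. It assumes a fixed UTC-7 offset (PDT, which was in effect on this date); the variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Assumption: US/Pacific was on daylight time (UTC-7) on 29 September 2025.
PDT = timezone(timedelta(hours=-7), name="PDT")

# Incident window as published on the dashboard, in UTC.
start_utc = datetime(2025, 9, 29, 17, 53, tzinfo=timezone.utc)
end_utc = datetime(2025, 9, 29, 18, 42, tzinfo=timezone.utc)

# Convert to the local time zone used in the incident reports.
start_local = start_utc.astimezone(PDT)
end_local = end_utc.astimezone(PDT)

# Duration is independent of the time zone chosen.
duration_minutes = int((end_utc - start_utc).total_seconds() // 60)

print(start_local.strftime("%H:%M"), end_local.strftime("%H:%M"), duration_minutes)
```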

Date Time Description
Oct 2, 2025 8:17 PM UTC

Incident Report

Summary

On Monday, 29 September 2025, several AI products experienced elevated 500 (internal) errors globally for 49 minutes, between 10:53 and 11:42 US/Pacific.

To our customers whose operations were impacted during this disruption, we sincerely apologize. This is not the level of quality and reliability we strive to offer you, and we are taking immediate steps to improve the platform’s performance and availability.

Root Cause

The incident occurred on a core machine learning (ML) serving platform at Google, responsible for hosting and delivering the Large Language Models that power many of our AI products and services.

A software update intended to improve the platform's performance and scalability introduced a bug in a central control system. This system directs traffic to the correct machine learning models, and the bug caused it to incorrectly determine that some models were unavailable, leading to errors for our customers.

The impact of this issue was unintentionally magnified because the faulty component was responding more quickly, which caused our traffic management system to direct more requests to that component.
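The amplification described above can be illustrated with a toy model. This is only a sketch: the report does not say how the traffic management system weighs response times, so the inverse-latency weighting, the backend names, and the latency figures below are all assumptions chosen for illustration.

```python
def traffic_shares(latency_ms):
    """Split traffic in proportion to 1/latency (hypothetical policy)."""
    inverse = {name: 1.0 / ms for name, ms in latency_ms.items()}
    total = sum(inverse.values())
    return {name: weight / total for name, weight in inverse.items()}


# Four equally fast backends: each receives a quarter of the traffic.
healthy = {"a": 100.0, "b": 100.0, "c": 100.0, "d": 100.0}
before = traffic_shares(healthy)

# Backend "a" starts failing fast, so its observed latency drops tenfold.
after = traffic_shares({**healthy, "a": 10.0})

print(f"share of 'a' before: {before['a']:.0%}, after: {after['a']:.0%}")
```

Under these assumed numbers the fast-failing backend goes from a quarter of the traffic to roughly three quarters, which is why a latency signal that ignores error status can magnify a fault.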

Remediation and Prevention

Google engineers were alerted to the outage via manual escalation from multiple products on Monday, 29 September 2025 at 11:09 US/Pacific and immediately started an investigation. Once the underlying cause was identified, the problematic change was rolled back. All the errors returned by ML serving were then reduced to pre-fault levels, mitigating customer impact at 11:42 US/Pacific.

Google is committed to preventing a repeat of this issue in the future and is completing the following actions:

  • Update error handling for the ML serving platform to not incorrectly return an unavailable status for deployed models.
  • Shard the deployed models such that the critical ones always get new updates in the last waves of rollouts.
  • Add missing checks to automatically roll back this class of changes.
  • Review and update internal processes to ensure timely communications with external customers.
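
Two of the actions above, updating critical models in the last rollout waves and rolling back automatically, can be sketched together. This is a hypothetical illustration and not Google's rollout tooling: the `Model` type, the `plan_waves` and `roll_out` helpers, the 1% error threshold, and the simulated error rate are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Model:
    name: str
    critical: bool


def plan_waves(models: List[Model], wave_size: int) -> List[List[Model]]:
    """Group models into waves, placing every critical model after all others."""
    waves = []
    for wanted in (False, True):  # non-critical first, critical last
        group = [m for m in models if m.critical is wanted]
        for i in range(0, len(group), wave_size):
            waves.append(group[i:i + wave_size])
    return waves


def roll_out(
    waves: List[List[Model]],
    error_rate_after: Callable[[List[Model]], float],
    max_error_rate: float = 0.01,  # assumed threshold
) -> Tuple[str, List[str]]:
    """Apply waves in order and stop at the first wave that breaches the threshold."""
    touched = []
    for wave in waves:
        touched.extend(m.name for m in wave)
        if error_rate_after(wave) > max_error_rate:
            return "rolled_back", touched
    return "completed", touched


models = [
    Model("prod-x", True), Model("exp-a", False), Model("exp-b", False),
    Model("prod-y", True), Model("exp-c", False),
]
waves = plan_waves(models, wave_size=2)


def fake_error_rate(wave: List[Model]) -> float:
    # Simulated health check: the wave containing "exp-c" triggers the fault.
    return 0.21 if any(m.name == "exp-c" for m in wave) else 0.0


status, touched = roll_out(waves, fake_error_rate)
print(status, touched)
```

In this simulated run the fault is caught in an early wave, so the rollout stops before any critical model receives the change.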

We apologize for the length and severity of this incident. We are taking immediate steps to prevent a recurrence and improve reliability in the future.

Detailed Description of Impact

On Monday, 29 September 2025, from 10:53 to 11:42 US/Pacific, Google Workspace products with a dependency on the LM Serving infrastructure experienced routing issues globally. The incident resulted in 21 percent of requests failing with a [model] NOT_FOUND error, which was returned to customers as an internal error.


Sep 30, 2025 5:15 AM UTC

Mini Incident Report

We apologize for the inconvenience this service disruption/outage may have caused. We would like to provide some information about this incident below. Please note, this information is based on our best knowledge at the time of posting and is subject to change as our investigation continues. If you have experienced impact outside of what is listed below, please reach out to Google Cloud Support using https://cloud.google.com/support or to Google Workspace Support using help article https://support.google.com/a/answer/1047213.

(All Times US/Pacific)

Incident Start: 29 September, 2025 10:53

Incident End: 29 September, 2025 11:42

Duration: 49 minutes

GWS Affected Services and Features:

Workspace Gemini and NotebookLM

Regions/Zones: Global

Description: Various AI products experienced elevated 500 (internal) errors due to the rollout of a faulty change to production. Google will complete a full Incident Report (IR) in the following days that will provide a full root cause.

Customer Impact:

Customers affected by this issue may have observed up to 100% of requests failing with '500' errors during the impacted time, between 10:53 and 11:42 US/Pacific on 29 September 2025.

Additional details:

A faulty change was rolled out to the control plane of the canonical Machine Learning (ML) serving stack in one cluster. Traffic routers that received their configuration from the faulty cluster assumed that some models were deleted and returned errors to the clients. Due to the error, the faulty control plane cluster had a significantly lower load, which resulted in more routers getting their configuration from the faulty cluster. This exacerbated the impact of the change.
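
The feedback loop in the paragraph above can be reproduced with a small simulation. It is a sketch under stated assumptions: the report does not describe how routers choose a control plane cluster, so the greedy least-loaded selection, the cluster names, and the per-router load costs are hypothetical.

```python
def assign_routers(num_routers, cost_per_router):
    """Greedily attach each router to the currently least-loaded cluster."""
    load = {name: 0 for name in cost_per_router}
    count = {name: 0 for name in cost_per_router}
    for _ in range(num_routers):
        target = min(load, key=load.get)  # ties go to the first cluster listed
        load[target] += cost_per_router[target]
        count[target] += 1
    return count


# Assumption: serving one router costs 5 load units on a healthy cluster,
# but only 1 unit on the faulty cluster because it reports models as missing
# instead of doing the full work.
counts = assign_routers(70, {"cluster-a": 5, "cluster-b": 5, "cluster-faulty": 1})

print(counts)
```

With these assumed costs the faulty cluster ends up configuring 50 of the 70 routers, because its low load keeps making it the most attractive choice.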

The bad change was rolled back and all the errors returned by ML serving were reduced to pre-fault levels, mitigating customer impact.

Sep 29, 2025 7:56 PM UTC

Summary:

Workspace NotebookLM customers may experience elevated errors for online users

Description:

The issue with Workspace NotebookLM has been mitigated as of Monday, 2025-09-29 11:49 PDT.

Based on our preliminary investigation, the issue was caused by a recent change, which has been rolled back.

Customer Symptoms:

Customers are experiencing elevated "INTERNAL_ERROR" responses with error code 500

Sep 29, 2025 7:39 PM UTC

Summary:

NotebookLM customers may experience elevated errors for online users

Description:

Our engineering team has identified the cause of the issue and a mitigation has been put in place. We are currently monitoring and believe we are seeing full recovery.

Customer Symptoms:

Customers are experiencing elevated "INTERNAL_ERROR" responses with error code 500

Sep 29, 2025 6:57 PM UTC

We're investigating reports of an issue with NotebookLM. We will provide more information shortly.

Some users may observe elevated errors from Gemini for Workspace.