Unintended Data Exposure

Microsoft AI Research Team's Open-Source Initiative Unveils Critical Security Lapse

Microsoft AI Research Team's Unintended Data Exposure

While sharing open-source code and AI models for image recognition with the research community, the Microsoft AI research team inadvertently exposed 38 terabytes of the company's sensitive data on the internet. The exposure was discovered by Wiz, a cybersecurity firm, which found a link that gave access to critical data, including backups of employees' computers that Microsoft had mistakenly made public.

These backups contained passwords for Microsoft services, secret cryptographic keys, and some 30,000 internal messages exchanged among hundreds of Microsoft employees. Despite the scale of the exposure, the company stated in its incident report that "no customer data had been compromised" and that no other internal services were at risk.

The Unintentional Leak and Its Fallout

The exposure stemmed from a link deliberately included in the files so that interested researchers could download pre-trained artificial intelligence (AI) models. To create it, Microsoft researchers used an Azure feature known as Shared Access Signature (SAS) tokens, which lets users generate sharing links that grant access to specific content within their Azure Storage accounts.

Users of this feature can tailor the access a SAS link grants, whether to a single file, an entire folder, or the whole storage account. In this case, however, the researchers shared a link that gave unrestricted access to the complete storage account.
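To make the difference in scope concrete, here is a minimal Python sketch that reads the query string of a SAS link and reports what it appears to grant. It relies on the documented SAS query parameters (`srt` for account-level tokens, `sr` for a container or blob, `sp` for permissions, `se` for expiry). The function name `describe_sas`, the permission table, and the example URL are illustrative assumptions, not part of any Microsoft tool.

```python
from urllib.parse import urlparse, parse_qs

# Common permission letters found in the `sp` parameter of a SAS token.
PERMISSION_NAMES = {
    "r": "read", "a": "add", "c": "create",
    "w": "write", "d": "delete", "l": "list",
}

def describe_sas(url: str) -> dict:
    """Summarize the scope, permissions, and expiry encoded in a SAS URL.

    Hypothetical helper: it only inspects the query string and never
    contacts Azure or validates the signature.
    """
    params = {k: v[0] for k, v in parse_qs(urlparse(url).query).items()}

    if "srt" in params:               # account SAS: covers the storage account
        scope = "account"
    elif params.get("sr") == "c":     # service SAS scoped to one container
        scope = "container"
    elif params.get("sr") == "b":     # service SAS scoped to one blob (file)
        scope = "blob"
    else:
        scope = "unknown"

    # Unknown letters are passed through unchanged rather than guessed.
    permissions = [PERMISSION_NAMES.get(ch, ch) for ch in params.get("sp", "")]
    return {"scope": scope, "permissions": permissions, "expiry": params.get("se")}


if __name__ == "__main__":
    # Made-up URL with a placeholder signature.
    example = (
        "https://example.blob.core.windows.net/?sv=2020-08-04&ss=b&srt=sco"
        "&sp=rwl&se=2050-01-01T00:00:00Z&sig=PLACEHOLDER"
    )
    print(describe_sas(example))
```

A link that reports an "account" scope together with write or delete permissions is the kind of broad grant described above, whereas a link meant only for downloading one model would ideally report a "blob" scope with read permission alone.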

Immediate Action and Future Precautions

Wiz reported the issue to Microsoft on June 22, 2023, and Microsoft revoked the SAS token two days later, on June 24. By then, however, the data had been publicly accessible for roughly three years, which underscores the gravity of the lapse.

Although the problematic link has been disabled, misconfigured SAS tokens can still lead to future data leaks and serious privacy problems. The company says it is continuing to scan its public repositories and is watching more closely for tokens that expose more than necessary, so that similar incidents can be caught before they cause harm.
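As a rough illustration of what such a scan might look for, the Python sketch below searches text (for example, the contents of a repository file) for SAS-style query strings and flags those that grant more than read or list access or that stay valid for a long time. This is not Microsoft's scanner: the function name `find_risky_sas`, the regular expression, and the seven-day lifetime threshold are assumptions chosen for the example.

```python
import re
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs

# Matches a URL query string that carries a SAS signature (`sig=`).
SAS_QUERY = re.compile(r"""\?[^\s"'<>]*\bsig=[^\s"'<>]+""")

RISKY_PERMISSIONS = set("wdac")    # write, delete, add, create
MAX_LIFETIME = timedelta(days=7)   # assumed policy threshold, not an Azure limit

def find_risky_sas(text: str, now: datetime) -> list[tuple[str, list[str]]]:
    """Return (redacted query string, reasons) for each suspicious SAS token."""
    findings = []
    for match in SAS_QUERY.finditer(text):
        query = match.group()[1:]  # drop the leading "?"
        params = {k: v[0] for k, v in parse_qs(query).items()}
        reasons = []

        if RISKY_PERMISSIONS & set(params.get("sp", "")):
            reasons.append("grants more than read/list access")

        expiry = params.get("se")
        if expiry:
            try:
                expires = datetime.fromisoformat(expiry.replace("Z", "+00:00"))
            except ValueError:
                expires = None     # unrecognized date format: skip this check
            if expires is not None:
                if expires.tzinfo is None:
                    expires = expires.replace(tzinfo=timezone.utc)
                if expires - now > MAX_LIFETIME:
                    reasons.append("expires more than 7 days from now")

        if reasons:
            # Never echo the signature itself into a report.
            redacted = re.sub(r"sig=[^&]+", "sig=REDACTED", query)
            findings.append((redacted, reasons))
    return findings


if __name__ == "__main__":
    sample = (
        "models: https://example.blob.core.windows.net/models"
        "?sv=2020-08-04&ss=b&srt=sco&sp=rwdl&se=2050-01-01T00:00:00Z&sig=abc123"
    )
    for token, why in find_risky_sas(sample, datetime.now(timezone.utc)):
        print(token, why)
```

A pattern match like this can only catch tokens that appear in plain text, so a real scan would need to be paired with short expiry times and narrowly scoped links in the first place.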
