Microsoft AI Research SAS Token — Over-Permissioned Token → Public GitHub → 38TB Internal Data Exposed for 3 Years
A Microsoft AI researcher published a download URL for open-source training data in a public GitHub repository. The URL contained an Azure Shared Access Signature (SAS) token, but instead of being scoped to a specific file or container it was an Account SAS with full-control permissions over the entire storage account, set to expire in 2051. Anyone who found the URL could read, modify, or delete 38TB of internal Microsoft data, including employee workstation backups, private keys, saved passwords, and 30,000+ internal Teams messages. Discovered and responsibly disclosed by Wiz Research in June 2023 after roughly 3 years of exposure.
When sharing open-source AI training data publicly, the researcher used Azure's SAS token feature but chose the broadest option, an Account SAS, rather than a narrowly scoped Service SAS. They set the permissions to full control (read, write, delete) and the expiry to October 2051. Because a SAS token is signed client-side with the storage account key, Azure never records its creation, so administrators had no way to discover or inventory the token.
Permissions set: Full control — read, write, delete, list everything
Expiry set: October 6, 2051 (30+ years)
Azure's own warning: "Not possible to audit generation of SAS tokens" — no admin visibility
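The entire grant lives in the URL's query string, so the difference between the token that leaked and a safer one comes down to a handful of parameters. A stdlib-only sketch of the two shapes, using Azure's documented SAS query parameter names (sv, ss, srt, sr, sp, se, sig); the signature values are placeholders, not real HMACs, and the permission strings are illustrative:

```python
def account_sas_params(expiry):
    # Shape of the leaked token (Account SAS): ss=b covers the Blob
    # service, srt=sco covers service, container, AND object scope,
    # and sp grants read/write/delete/list and more: full control.
    return {"sv": "2020-08-04", "ss": "b", "srt": "sco",
            "sp": "rwdlacup", "se": expiry, "sig": "<placeholder>"}

def container_sas_params(expiry):
    # A narrowly scoped Service SAS: sr=c pins it to one container,
    # sp=rl allows read and list only, enough to download the data.
    return {"sv": "2020-08-04", "sr": "c", "sp": "rl",
            "se": expiry, "sig": "<placeholder>"}

risky = account_sas_params("2051-10-06")    # expiry decades out
safer = container_sas_params("2023-07-27")  # short-lived, read-only
```

The presence of srt (signed resource types) is what marks a token as an Account SAS; a Service SAS carries sr instead and can never grant account-wide access.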
The researcher committed the complete SAS token URL to the public GitHub repository "robust-models-transfer" as part of its download instructions. GitHub's secret scanning did not cover Account SAS token patterns at the time, so the URL remained publicly visible for nearly 3 years. In October 2021 the token was renewed, with the expiry extended to October 2051.
Exposed from: July 20, 2020 to June 24, 2023 (2 years 11 months)
Token renewed: October 2021 — expiry extended to 2051 (30 more years)
Scanning gap: GitHub secret scanning did not cover Account SAS tokens until after this disclosure
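The scanning gap is straightforward to close in principle: a SAS URL is a blob endpoint followed by the signed query parameters (sv, sig), and the srt parameter distinguishes an Account SAS from a scoped Service SAS. A hypothetical detector sketch in the style of a secret-scanning rule (the regex and the heuristic are assumptions, not GitHub's actual pattern):

```python
import re

# Heuristic: a blob-storage URL whose query string contains both a
# signed version (sv=) and a signature (sig=) is a SAS URL.
SAS_URL = re.compile(
    r"https://[a-z0-9]+\.blob\.core\.windows\.net/\S*\?"
    r"(?=\S*sv=)(?=\S*sig=)\S*"
)

def find_sas_urls(text):
    """Return each SAS URL found, flagging Account SAS tokens
    (srt= marks account scope; a Service SAS carries sr= instead)."""
    return [{"url": m.group(0), "account_sas": "srt=" in m.group(0)}
            for m in SAS_URL.finditer(text)]
```

A real rule would also need to handle sovereign-cloud endpoints and URL-encoded query strings; this sketch only shows the shape of the check.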
Anyone with the URL had full access to an internal Azure Blob storage account — not just the intended training data folder. The account contained disk backups of two Microsoft employees' workstations with saved passwords, private keys, and an archive of 30,000+ internal Microsoft Teams messages. Full-control permissions also meant a malicious actor could have injected code into AI model files, creating a supply chain attack vector.
→ Disk backups of 2 employee workstations (passwords, private keys, personal data)
→ 30,000+ Microsoft Teams messages from 359 employees
→ Internal credentials and secret keys
→ Intended open-source AI training data
Supply chain risk: Write access meant an attacker could have injected malicious code into AI model files
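Because write access made tampering possible, anyone downloading the model files could not trust the bucket's contents alone. A standard mitigation (not something the disclosure says the repository did at the time) is to publish SHA-256 checksums out-of-band, e.g. committed to the Git repository rather than stored in the writable bucket, and verify every download; a minimal sketch:

```python
import hashlib
import os

def sha256_file(path, chunk_size=1 << 20):
    # Hash in chunks so multi-GB model files don't need to fit in memory.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_manifest(manifest, base_dir="."):
    """manifest: {filename: expected_sha256}, published out-of-band.
    Returns the list of files whose hash does not match."""
    return [name for name, expected in manifest.items()
            if sha256_file(os.path.join(base_dir, name)) != expected]
```

A non-empty return value means the artifact was modified after the manifest was published and must not be loaded.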
Wiz Research runs an ongoing project scanning the internet and public repositories for misconfigured cloud storage. While reviewing Microsoft's public AI GitHub repositories, they found the SAS token URL, followed it, and discovered the full scope of the exposure. They reported it to the Microsoft Security Response Center (MSRC) on June 22, 2023; the token was revoked two days later, on June 24. Coordinated public disclosure followed on September 18, 2023.
Reported: June 22, 2023 | Token revoked: June 24, 2023 (48 hours)
GitHub URL updated: July 7, 2023 | Public disclosure: September 18, 2023
No abuse found: Microsoft's investigation found no evidence of malicious access or exfiltration beyond Wiz's research