Google’s Vertex AI SDK could allow RCE through bucket squatting


Google reportedly patched a flaw in the Vertex AI SDK for Python that could allow attackers to hijack model uploads and trigger remote code execution across tenants.

A design flaw in the Python software development kit (SDK) for Vertex AI, Google Cloud’s managed platform for building, training, and deploying AI models and agents, could let an attacker hijack and poison models in Google Cloud projects other than their own.

According to Unit 42 researchers, a combination of bad bucket naming logic and missing authentication made it possible for an attacker to hijack a victim’s model uploads knowing only the victim’s project ID and region.

“Since no two buckets across all of Google Cloud can share the same name, an attacker who is able to predict a bucket name can preemptively create it in their own project,” the researchers said in a blog post. “Any subsequent attempt to use a bucket with that name, even from a different project, silently falls back to the attacker’s bucket.”

Researchers said this is a known class of vulnerability that “takes advantage of the global uniqueness” of cloud storage bucket names. They called it “Bucket Squatting”.

Successful exploitation could let an attacker inject a malicious model that is then loaded by the Vertex AI infrastructure, resulting in code execution across tenants. The flaw was reported to Google, which reportedly fixed the underlying issue.

Google did not immediately respond to CSO’s request for comment.

Pickle deserialization for cross-tenant RCE

According to Unit 42, the vulnerable model workflow in Vertex AI SDK for Python versions 1.139.0 and 1.140.0 relied on a staging bucket name derived exclusively from a customer’s project ID and region. When a bucket with that name already existed, the SDK only verified its existence and did not confirm ownership.

This created a bucket-squatting scenario in which an attacker could pre-create a bucket matching a victim’s expected staging bucket and wait for model uploads to be directed there. Once a model artifact was uploaded to the attacker-controlled bucket, the attacker could replace it with a malicious version during a narrow race-condition window before Vertex AI’s service agent retrieved it.
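The squatting scenario above can be sketched with a toy, in-memory model of a globally unique bucket namespace. This is not Google Cloud or Vertex AI SDK code: the naming scheme, the `GLOBAL_BUCKETS` registry, and every function name are hypothetical, chosen only to contrast an existence-only check with the kind of ownership check the researchers say was missing.

```python
# Toy model of a globally unique bucket namespace: bucket name -> owning project.
# Hypothetical illustration only; none of this is real Google Cloud SDK code.
GLOBAL_BUCKETS: dict[str, str] = {}


def staging_bucket_name(project_id: str, region: str) -> str:
    """Hypothetical deterministic name derived only from project ID and region."""
    return f"{project_id}-staging-{region}"


def create_bucket(name: str, owner_project: str) -> None:
    """Create a bucket; a name can exist only once across all projects."""
    if name in GLOBAL_BUCKETS:
        raise ValueError(f"bucket {name!r} already exists")
    GLOBAL_BUCKETS[name] = owner_project


def resolve_existence_only(project_id: str, region: str) -> str:
    """Flawed pattern: reuse the bucket if it exists, whoever owns it."""
    name = staging_bucket_name(project_id, region)
    if name not in GLOBAL_BUCKETS:
        create_bucket(name, project_id)
    return GLOBAL_BUCKETS[name]  # the project that will receive the upload


def resolve_with_ownership_check(project_id: str, region: str) -> str:
    """Safer pattern: refuse a bucket that belongs to another project."""
    name = staging_bucket_name(project_id, region)
    if name not in GLOBAL_BUCKETS:
        create_bucket(name, project_id)
    owner = GLOBAL_BUCKETS[name]
    if owner != project_id:
        raise PermissionError(f"bucket {name!r} is owned by {owner!r}")
    return owner


# The attacker predicts the name and registers it first.
create_bucket(staging_bucket_name("victim-project", "us-central1"), "attacker-project")

# The victim's upload silently lands in the attacker's bucket.
upload_destination = resolve_existence_only("victim-project", "us-central1")
print(upload_destination)  # attacker-project
```

In this sketch, calling `resolve_with_ownership_check("victim-project", "us-central1")` after the squat raises `PermissionError` instead of returning the attacker's project.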

The attack could escalate to RCE because machine learning models in Python are commonly stored using the pickle or joblib serialization formats. Since pickle deserialization can execute arbitrary code through specially crafted objects, a poisoned model could execute attacker-supplied code when loaded by Vertex AI’s serving infrastructure.

The researchers dubbed this cross-tenant exploitation process “Pickle in the Middle” because it depended, in part, on deserialization by Python’s built-in pickle module.
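The code-execution risk of pickle described above can be shown with a harmless example. This is a minimal sketch, not the researchers' exploit: the `Payload` class, the `NoGlobalsUnpickler` class, and the `restricted_loads` helper are hypothetical names, and the built-in `sum` stands in for whatever callable an attacker would choose. The second half uses the `find_class` override described in Python's pickle documentation to reject any pickle that references a global.

```python
import io
import pickle


class Payload:
    """Hypothetical stand-in for a poisoned model object."""

    def __reduce__(self):
        # pickle stores "call sum([40, 2])" and performs that call at load time.
        # sum is a harmless placeholder for an attacker-chosen callable.
        return (sum, ([40, 2],))


blob = pickle.dumps(Payload())
loaded = pickle.loads(blob)  # the call runs during deserialization
print(loaded)  # 42, the result of the call, not a Payload instance


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses every global lookup, so no callable can be resolved."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"blocked global {module}.{name}")


def restricted_loads(data: bytes):
    """Load a pickle while rejecting any reference to functions or classes."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


try:
    restricted_loads(blob)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
print(blocked)  # True
```

Blocking all globals is too strict for real model files, which usually need some classes to load, so in practice an allowlist of expected classes would be needed. The sketch only illustrates that loading an untrusted pickle runs whatever call it encodes.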

Google fixed the AI-hunted bug

As part of the research, Unit 42 incorporated a large language model (LLM) into its code analysis workflow to accelerate vulnerability discovery.

“Analysis that once took days can now be executed significantly faster,” the researchers said. “By iteratively narrowing the model’s focus and instructing it to look for specific patterns, we found paths that led to resources provisioned on the cloud, affected by user-controlled or project-derived inputs.”

Google reportedly modified the affected workflow so that staging buckets are now validated before use, preventing attackers from registering bucket names that could be mistaken for resources belonging to other projects.

The fixes were deployed in SDK versions 1.144.0 and 1.148.0, and users must upgrade to either of the patched versions.
