Commit 5da2616

fix(gcp): use authorized session for workbench notebooks API (#2442)
### Type of change

- [x] Bug fix (non-breaking change that fixes an issue)
- [x] Documentation update

### Summary

This fixes recurring GCP Vertex AI Workbench sync failures where `notebooks.googleapis.com` location discovery returned `401 Unauthorized` in production/staging runs.

Changes made:

- Switched Workbench Notebooks API calls to use `google.auth.transport.requests.AuthorizedSession` instead of manually constructing `Authorization: Bearer ...` headers from discovery credentials.
- Extended `paginate_vertex_api()` to optionally execute requests via a provided authorized session (used by Workbench v2 instance listing path).
- Added focused unit tests for:
  - authorized-session usage in Workbench location discovery
  - `401` handling behavior
  - passing the authorized session through to paginated Workbench instance calls
- Updated GCP configuration docs to include:
  - optional `roles/notebooks.viewer` and `roles/run.viewer`
  - enabling `notebooks.googleapis.com`
  - note on per-location Cloud Run permission warnings being skipped gracefully

### Breaking changes

None.

### How was this tested?

- added tests
- tested locally:

### Checklist

#### General

- [x] I have read the [contributing guidelines](https://cartography-cncf.github.io/cartography/dev/developer-guide.html).
- [x] The linter passes locally (`make lint`).
- [x] I have added/updated tests that prove my fix is effective or my feature works.

#### Proof of functionality

- [x] New or updated unit/integration tests.

#### If you are changing a node or relationship

- [ ] Updated the [schema documentation](https://github.com/cartography-cncf/cartography/tree/master/docs/root/modules).
- [ ] Updated the [schema README](https://github.com/cartography-cncf/cartography/blob/master/docs/schema/README.md).

#### If you are implementing a new intel module

- [ ] Used the NodeSchema [data model](https://cartography-cncf.github.io/cartography/dev/writing-intel-modules.html#defining-a-node).

### Notes for reviewers

- The `401` appeared consistently in multiple production-like runs for Workbench location discovery, while other Vertex API calls succeeded in the same runs.
- This PR intentionally keeps behavior non-fatal for Workbench discovery failures (skip Workbench for that project/run), matching existing graceful-degradation behavior.

---------

Signed-off-by: Kunaal Sikka <kunaal@subimage.io>
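
The location-discovery change described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `discover_location_ids`, `StubResponse`, and `StubSession` are hypothetical names, and the stubs stand in for `google.auth.transport.requests.AuthorizedSession` so the sketch runs without network access or credentials. The point is that the caller hands over a session that attaches (and refreshes) the bearer token itself, and that `401`, `403`, and `404` are reported as "skip" by returning `None`.

```python
from typing import Any, Dict, List, Optional

NOTEBOOKS_ENDPOINT = "https://notebooks.googleapis.com"


def discover_location_ids(session: Any, project_id: str) -> Optional[List[str]]:
    """Return Notebooks API location IDs, or None when discovery should be skipped.

    `session` is anything with a requests-style get(). In production this would
    be google.auth.transport.requests.AuthorizedSession(creds), which adds the
    Authorization header and refreshes the token on its own.
    """
    url = f"{NOTEBOOKS_ENDPOINT}/v1/projects/{project_id}/locations"
    response = session.get(url, timeout=60)
    if response.status_code in (401, 403, 404):
        # Non-fatal: the caller skips Workbench for this project.
        return None
    response.raise_for_status()
    return [loc.get("locationId", "") for loc in response.json().get("locations", [])]


class StubResponse:
    """Hypothetical stand-in for requests.Response, for illustration only."""

    def __init__(self, status_code: int, payload: Optional[Dict] = None) -> None:
        self.status_code = status_code
        self._payload = payload or {}

    def json(self) -> Dict:
        return self._payload

    def raise_for_status(self) -> None:
        if self.status_code >= 400:
            raise RuntimeError(f"HTTP {self.status_code}")


class StubSession:
    """Hypothetical session that records calls and returns a canned response."""

    def __init__(self, response: StubResponse) -> None:
        self.response = response
        self.calls: List[Dict[str, Any]] = []

    def get(self, url: str, **kwargs: Any) -> StubResponse:
        self.calls.append({"url": url, **kwargs})
        return self.response


ok = StubSession(StubResponse(200, {"locations": [{"locationId": "us-central1-a"}]}))
print(discover_location_ids(ok, "my-project"))
print(discover_location_ids(StubSession(StubResponse(401)), "my-project"))
```

Passing the session in as a parameter is a choice made for this sketch; it is what lets a stub replace the real authorized session in a unit test.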
1 parent ac260b5 commit 5da2616

4 files changed

Lines changed: 272 additions & 95 deletions

cartography/intel/gcp/vertex/instances.py

Lines changed: 96 additions & 85 deletions
@@ -1,8 +1,10 @@
 import logging
 from typing import Dict
 from typing import List
+from typing import Optional

 import neo4j
+import requests
 from googleapiclient.discovery import Resource

 from cartography.client.core.tx import load
@@ -14,98 +16,105 @@


 @timeit
-def get_workbench_api_locations(aiplatform: Resource, project_id: str) -> List[str]:
+def get_workbench_api_locations(
+    aiplatform: Resource,
+    project_id: str,
+) -> Optional[List[str]]:
     """
     Gets all available Workbench (In Notebooks API) API locations for a project.
     The Notebooks API uses both zones and regions, unlike Vertex AI which primarily uses regions.
     Filters to commonly-used locations to improve sync performance.
     """
-    import requests
-    from google.auth.transport.requests import Request as AuthRequest
+    from google.auth.transport.requests import AuthorizedSession

-    # Get credentials and refresh token if needed
     creds = aiplatform._http.credentials
-    if not creds.valid:
-        creds.refresh(AuthRequest())
+    session = AuthorizedSession(creds)

     # Query Notebooks API for available locations
     notebooks_endpoint = "https://notebooks.googleapis.com"
     url = f"{notebooks_endpoint}/v1/projects/{project_id}/locations"
-    headers = {
-        "Authorization": f"Bearer {creds.token}",
-        "Content-Type": "application/json",
-    }
+    response = session.get(url, timeout=60)
+
+    if response.status_code == 401:
+        logger.warning(
+            "Unauthorized when trying to get Notebooks API locations for project %s. "
+            "Ensure credentials are valid for notebooks.googleapis.com and that the "
+            "Notebooks API is enabled on the host/quota project.",
+            project_id,
+        )
+        return None
+    if response.status_code == 403:
+        logger.warning(
+            "Access forbidden when trying to get Notebooks API locations for project %s. "
+            "Ensure the Notebooks API is enabled and you have the necessary permissions.",
+            project_id,
+        )
+        return None
+    if response.status_code == 404:
+        logger.warning(
+            "Notebooks API locations not found for project %s. "
+            "The Notebooks API may not be enabled.",
+            project_id,
+        )
+        return None

     try:
-        response = requests.get(url, headers=headers)
         response.raise_for_status()
-        data = response.json()
-
-        # Filter to commonly-used locations to avoid excessive API calls
-        # Include major regions and their zones
-        # Reference: https://cloud.google.com/vertex-ai/docs/general/locations
-        supported_prefixes = {
-            "us-central1",
-            "us-east1",
-            "us-east4",
-            "us-west1",
-            "us-west2",
-            "us-west3",
-            "us-west4",
-            "europe-west1",
-            "europe-west2",
-            "europe-west3",
-            "europe-west4",
-            "asia-east1",
-            "asia-northeast1",
-            "asia-northeast3",
-            "asia-southeast1",
-            "australia-southeast1",
-            "northamerica-northeast1",
-            "southamerica-east1",
-        }
-
-        locations = []
-        all_locations = data.get("locations", [])
-        for location in all_locations:
-            # Extract location ID from the full path
-            # Format: "projects/PROJECT_ID/locations/LOCATION_ID"
-            location_id = location.get("locationId", "")
-
-            # Check if this location matches any of our supported prefixes
-            # This handles both regions (us-central1) and zones (us-central1-a, us-central1-b)
-            if any(location_id.startswith(prefix) for prefix in supported_prefixes):
-                locations.append(location_id)
-
-        logger.info(
-            f"Found {len(locations)} supported Notebooks API locations "
-            f"(filtered from {len(all_locations)} total) for project {project_id}"
-        )
-        return locations
-
-    except requests.exceptions.HTTPError as e:
-        if e.response.status_code == 403:
-            logger.warning(
-                f"Access forbidden when trying to get Notebooks API locations for project {project_id}. "
-                "Ensure the Notebooks API is enabled and you have the necessary permissions.",
-            )
-        elif e.response.status_code == 404:
-            logger.warning(
-                f"Notebooks API locations not found for project {project_id}. "
-                "The Notebooks API may not be enabled.",
-            )
-        else:
-            logger.error(
-                f"Error getting Notebooks API locations for project {project_id}: {e}",
-                exc_info=True,
-            )
-        return []
-    except Exception as e:
+    except requests.HTTPError:
         logger.error(
-            f"Unexpected error getting Notebooks API locations for project {project_id}: {e}",
+            "Error getting Notebooks API locations for project %s: HTTP %s - %s",
+            project_id,
+            response.status_code,
+            response.reason,
             exc_info=True,
         )
-        return []
+        raise
+
+    data = response.json()
+
+    # Filter to commonly-used locations to avoid excessive API calls
+    # Include major regions and their zones
+    # Reference: https://cloud.google.com/vertex-ai/docs/general/locations
+    supported_prefixes = {
+        "us-central1",
+        "us-east1",
+        "us-east4",
+        "us-west1",
+        "us-west2",
+        "us-west3",
+        "us-west4",
+        "europe-west1",
+        "europe-west2",
+        "europe-west3",
+        "europe-west4",
+        "asia-east1",
+        "asia-northeast1",
+        "asia-northeast3",
+        "asia-southeast1",
+        "australia-southeast1",
+        "northamerica-northeast1",
+        "southamerica-east1",
+    }
+
+    locations = []
+    all_locations = data.get("locations", [])
+    for location in all_locations:
+        # Extract location ID from the full path
+        # Format: "projects/PROJECT_ID/locations/LOCATION_ID"
+        location_id = location.get("locationId", "")
+
+        # Check if this location matches any of our supported prefixes
+        # This handles both regions (us-central1) and zones (us-central1-a, us-central1-b)
+        if any(location_id.startswith(prefix) for prefix in supported_prefixes):
+            locations.append(location_id)
+
+    logger.info(
+        "Found %s supported Notebooks API locations (filtered from %s total) for project %s",
+        len(locations),
+        len(all_locations),
+        project_id,
+    )
+    return locations


 @timeit
@@ -119,33 +128,28 @@ def get_workbench_instances_for_location(
     Note: This queries the Notebooks API v2 for Workbench instances. The v2 API is used
     by the GCP Console for creating new Workbench instances. The v1 API is deprecated.
     """
-    from google.auth.transport.requests import Request as AuthRequest
+    from google.auth.transport.requests import AuthorizedSession

     from cartography.intel.gcp.vertex.utils import paginate_vertex_api

-    # Get credentials and refresh token if needed
     creds = aiplatform._http.credentials
-    if not creds.valid:
-        creds.refresh(AuthRequest())
+    session = AuthorizedSession(creds)

     # Prepare request parameters for Notebooks API v2
     # Workbench Instances use notebooks.googleapis.com/v2, not aiplatform.googleapis.com
     notebooks_endpoint = "https://notebooks.googleapis.com"
     parent = f"projects/{project_id}/locations/{location}"
-    headers = {
-        "Authorization": f"Bearer {creds.token}",
-        "Content-Type": "application/json",
-    }
     url = f"{notebooks_endpoint}/v2/{parent}/instances"

     # Use helper function to handle pagination and error handling
     return paginate_vertex_api(
         url=url,
-        headers=headers,
+        headers={"Content-Type": "application/json"},
         resource_type="workbench instances",
         response_key="instances",
         location=location,
         project_id=project_id,
+        session=session,
     )


@@ -184,7 +188,8 @@ def transform_workbench_instances(instances: List[Dict]) -> List[Dict]:
         transformed_instances.append(transformed_instance)

     logger.info(
-        f"Transformed {len(transformed_instances)} Vertex AI Workbench instances"
+        "Transformed %s Vertex AI Workbench instances",
+        len(transformed_instances),
     )
     return transformed_instances

@@ -234,6 +239,12 @@ def sync_workbench_instances(
     # Note: We use the Notebooks API location list, not Vertex AI locations, because
     # Workbench Instances can be deployed in zones (e.g., us-east1-b) not just regions
     locations = get_workbench_api_locations(aiplatform, project_id)
+    if locations is None:
+        logger.warning(
+            "Skipping Vertex AI Workbench instances sync for project %s to preserve existing data.",
+            project_id,
+        )
+        return

     # Collect instances from all locations
     all_instances = []
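
The `locations is None` guard separates "discovery failed" from "discovery succeeded but found nothing". A minimal sketch of that distinction follows. It assumes the rest of the sync loads the fetched instances and then runs a cleanup step, which this diff does not show. `sync_workbench` and its `fetch`, `load`, and `cleanup` callables are hypothetical names used only for illustration.

```python
from typing import Callable, Dict, List, Optional


def sync_workbench(
    locations: Optional[List[str]],
    fetch: Callable[[str], List[Dict]],
    load: Callable[[List[Dict]], None],
    cleanup: Callable[[], None],
) -> bool:
    """Return True if the sync ran, False if it was skipped."""
    if locations is None:
        # Discovery failed: touch nothing so previously synced data survives.
        return False
    instances: List[Dict] = []
    for location in locations:
        instances.extend(fetch(location))
    load(instances)
    # Assumed step: stale records are removed only after a successful discovery.
    cleanup()
    return True


events: List[str] = []
skipped = sync_workbench(
    None,
    fetch=lambda loc: [],
    load=lambda rows: events.append("load"),
    cleanup=lambda: events.append("cleanup"),
)
print(skipped, events)
```

Under that assumption, an empty list still reaches the load and cleanup steps, while `None` leaves existing data untouched. This matches the "to preserve existing data" wording in the warning.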

cartography/intel/gcp/vertex/utils.py

Lines changed: 29 additions & 10 deletions
@@ -31,20 +31,29 @@ def handle_vertex_api_response(
     """
     if response.status_code == 404:
         logger.debug(
-            f"Vertex AI {resource_type} not found in {location} for project {project_id}. "
-            f"This location may not have any {resource_type}."
+            "Vertex AI %s not found in %s for project %s. This location may not have any %s.",
+            resource_type,
+            location,
+            project_id,
+            resource_type,
         )
         return None, False
     elif response.status_code == 403:
         logger.warning(
-            f"Access forbidden when trying to get Vertex AI {resource_type} in {location} "
-            f"for project {project_id}."
+            "Access forbidden when trying to get Vertex AI %s in %s for project %s.",
+            resource_type,
+            location,
+            project_id,
         )
         return None, False
     elif response.status_code != 200:
         logger.error(
-            f"Error getting Vertex AI {resource_type} in {location} for project {project_id}: "
-            f"HTTP {response.status_code} - {response.reason}",
+            "Error getting Vertex AI %s in %s for project %s: HTTP %s - %s",
+            resource_type,
+            location,
+            project_id,
+            response.status_code,
+            response.reason,
             exc_info=False,
         )
         return None, False
@@ -55,34 +64,40 @@ def handle_vertex_api_response(

 def paginate_vertex_api(
     url: str,
-    headers: Dict[str, str],
+    headers: Optional[Dict[str, str]],
     resource_type: str,
     response_key: str,
     location: str,
     project_id: str,
+    session: Optional[Any] = None,
 ) -> List[Dict]:
     """
     Handle paginated requests to Vertex AI regional endpoints.

     :param url: Base API URL (without pagination params)
-    :param headers: HTTP headers including Authorization
+    :param headers: Optional HTTP headers
     :param resource_type: Type of resource (for logging)
     :param response_key: Key in JSON response containing the resource list
     :param location: GCP location/region
     :param project_id: GCP project ID
+    :param session: Optional authorized session used to execute requests
     :return: List of all resources across all pages
     """
     import requests

     resources = []
     page_token = None
+    request_headers = headers or {}

     while True:
         params: Dict[str, str] = {}
         if page_token:
             params["pageToken"] = page_token

-        response = requests.get(url, headers=headers, params=params)
+        if session is not None:
+            response = session.get(url, headers=request_headers, params=params)
+        else:
+            response = requests.get(url, headers=request_headers, params=params)

         # Handle response with common error patterns
         data, should_continue = handle_vertex_api_response(
@@ -101,6 +116,10 @@ def paginate_vertex_api(
             break

     logger.info(
-        f"Found {len(resources)} Vertex AI {resource_type} in {location} for project {project_id}"
+        "Found %s Vertex AI %s in %s for project %s",
+        len(resources),
+        resource_type,
+        location,
+        project_id,
     )
     return resources
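
The session-or-fallback pagination added to `paginate_vertex_api()` can be sketched in a self-contained form. This is an illustration under stated assumptions, not the function from this commit. `paginate`, `FakeResponse`, `PagedSession`, and `fail_get` are hypothetical. `default_get` stands in for `requests.get` so the sketch needs only the standard library. The `nextPageToken` response field is assumed, because the part of the loop that reads it is outside the hunks shown here.

```python
from typing import Any, Callable, Dict, List, Optional


def paginate(
    url: str,
    response_key: str,
    default_get: Callable[..., Any],
    headers: Optional[Dict[str, str]] = None,
    session: Optional[Any] = None,
) -> List[Dict]:
    """Collect items from every page, using session.get() when a session is given."""
    resources: List[Dict] = []
    page_token: Optional[str] = None
    request_headers = headers or {}
    # Prefer the (authorized) session; otherwise fall back to the plain callable.
    get = session.get if session is not None else default_get
    while True:
        params: Dict[str, str] = {}
        if page_token:
            params["pageToken"] = page_token
        response = get(url, headers=request_headers, params=params)
        if response.status_code != 200:
            break
        data = response.json()
        resources.extend(data.get(response_key, []))
        page_token = data.get("nextPageToken")  # assumed field name
        if not page_token:
            break
    return resources


class FakeResponse:
    """Hypothetical 200 response wrapping a JSON payload."""

    def __init__(self, payload: Dict) -> None:
        self.status_code = 200
        self._payload = payload

    def json(self) -> Dict:
        return self._payload


class PagedSession:
    """Hypothetical session serving canned pages keyed by page token."""

    def __init__(self, pages: Dict[str, Dict]) -> None:
        self.pages = pages
        self.seen_tokens: List[str] = []

    def get(self, url: str, headers: Any = None, params: Any = None) -> FakeResponse:
        token = (params or {}).get("pageToken", "")
        self.seen_tokens.append(token)
        return FakeResponse(self.pages[token])


def fail_get(*args: Any, **kwargs: Any) -> Any:
    raise AssertionError("fallback must not be used when a session is supplied")


paged = PagedSession(
    {
        "": {"instances": [{"name": "a"}], "nextPageToken": "t2"},
        "t2": {"instances": [{"name": "b"}]},
    }
)
result = paginate("https://example.invalid/instances", "instances", fail_get, session=paged)
print(result)
```

Keeping `session` optional leaves callers that pass only headers unchanged, which is consistent with the "Breaking changes: None" note in the description.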

docs/root/modules/gcp/config.md

Lines changed: 5 additions & 0 deletions
@@ -19,6 +19,8 @@ Grant the following roles to the identity at the **organization level**. This en
 | `roles/bigquery.connectionUser` | List BigQuery connections | Optional |
 | `roles/cloudasset.viewer` | Sync IAM policy bindings (effective policies across org hierarchy) | Optional |
 | `roles/artifactregistry.reader` | List/get Artifact Registry repositories and artifacts | Optional |
+| `roles/run.viewer` | List/get Cloud Run services, jobs, and executions | Optional |
+| `roles/notebooks.viewer` | List/get Vertex AI Workbench (Notebooks API) resources | Optional |

 To grant a role at the organization level:
 ```bash
@@ -68,6 +70,7 @@ gcloud services enable secretmanager.googleapis.com --project=YOUR_HOST_PROJECT
 gcloud services enable artifactregistry.googleapis.com --project=YOUR_HOST_PROJECT
 gcloud services enable run.googleapis.com --project=YOUR_HOST_PROJECT
 gcloud services enable aiplatform.googleapis.com --project=YOUR_HOST_PROJECT
+gcloud services enable notebooks.googleapis.com --project=YOUR_HOST_PROJECT
 gcloud services enable cloudasset.googleapis.com --project=YOUR_HOST_PROJECT
 ```

@@ -79,6 +82,8 @@ If you set `GOOGLE_CLOUD_QUOTA_PROJECT` to override the default quota project, e

 If an API is not enabled on your host/quota project, Cartography will log a warning and skip syncing that resource type rather than crashing. Other modules will continue normally.

+Some services also emit per-location permission warnings (for example Cloud Run in restricted regions). Cartography logs these and skips only affected locations.
+
 ### Cloud Asset Inventory (CAI)

 Cartography uses the [Cloud Asset Inventory API](https://cloud.google.com/asset-inventory/docs/overview) for two features:
