

@fyuan1316 (Contributor) commented Jan 29, 2026

Summary by CodeRabbit

  • Documentation
    • Added a comprehensive guide for model storage and loading in cloud-native inference environments.
    • Covers S3 object storage, OCI containerized model images, and PVC-based storage with clear examples.
    • Includes authentication guidance, deployment procedures, verification steps, prerequisites, and operational notes.


coderabbitai bot commented Jan 29, 2026

Walkthrough

Adds a new MDX documentation file describing three model storage modalities for cloud-native inference—S3 Object Storage, OCI Model-as-Image, and PVC—plus loading mechanisms (Init Container, Sidecar), configuration examples, and deployment snippets.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Documentation - Model Storage Guide<br/>`docs/en/model_inference/model_management/functions/model_storage.mdx` | Adds a new comprehensive guide covering S3 (auth, Kubernetes Secret/ServiceAccount, Init Container example), OCI Model-as-Image (packaging, Sidecar/native OCI examples), and PVC (upload, storageUri, verification). Includes prerequisites, code blocks, and deployment examples. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐇 I hopped through docs to store a dream,
S3 baskets and OCI cream,
PVC burrows snug and neat,
Models safe where cloud and ground meet. ✨

🚥 Pre-merge checks | ✅ 3 passed

| Check name | Status | Explanation |
|---|---|---|
| Description check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Add Model Storage' clearly and directly describes the main change: introducing new documentation for model storage options. It is specific enough to convey the primary purpose of the changeset. |
| Docstring coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage; check skipped. |



coderabbitai bot left a comment

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In `@docs/en/model_inference/model_management/functions/model_storage.mdx`:
- Line 22: Replace the hyphenated S3 endpoint placeholders in the table
("your-s3-service-ip:your-s3-port") with the underscore-formatted placeholders
used in the YAML ("your_s3_service_ip:your_s3_port") so the example strings
match; update all occurrences (e.g., the cell at the shown table row and the
other instance around line 36) to use the underscore format for consistency.
- Around line 5-6: The opening sentence under the "Model Storage" heading
currently lists only S3 and OCI; update that sentence to also mention PVC
(Persistent Volume Claim) so it reflects all storage options documented on the
page (e.g., "You can store a model in an S3 bucket, Open Container Initiative
(OCI) containers, or a Persistent Volume Claim (PVC)."). Locate and edit the
initial paragraph following the "Model Storage" heading and ensure terminology
matches other sections that reference PVC (use "Persistent Volume Claim (PVC)"
on first mention).
- Around line 216-220: Steps 3 and 4 repeat the sentence starter "In…" — reword
them to avoid repetition by merging into one instruction: replace the two lines
starting "In your workbench IDE, navigate to the file browser:" and "In the file
browser, navigate to the home directory." with a single line like "Open the file
browser (Files tab in JupyterLab or Explorer view in code-server) and navigate
to the home directory, which represents the root of your attached PVC." This
keeps the referenced UI elements ("Files tab", "Explorer view", "home
directory") but removes the repeated "In…" sentence starts.
- Around line 125-127: Update the prerequisite text that currently reads "PSA
(Pod Security Admission) Enforce set to Privilege" to use the correct lowercase
Kubernetes Pod Security Admission level: change it to "PSA (Pod Security
Admission) Enforce set to privileged"; ensure the rest of the prerequisite (the
Enable Modelcar line with uidModelcar set to 0) remains unchanged.
- Line 272: Update the storageUri example and add a short note clarifying the
optional model path and namespace behavior: state that the PVC URI format is
pvc://<pvc-name>/<model-path-within-pvc>, that the example storageUri:
pvc://model-pvc refers to the PVC root, and show an example with a specific path
(e.g., pvc://model-pvc/models/my-model); also add a sentence that the PVC must
exist in the same Kubernetes namespace as the InferenceService (namespace does
not apply in the URI).
🧹 Nitpick comments (1)
docs/en/model_inference/model_management/functions/model_storage.mdx (1)

24-24: Add a production safety note for HTTP.
This row reads like a recommendation; consider explicitly warning against HTTP in production.

✏️ Suggested edit
-| HTTPS Enabled | 0 | Encryption disabled for internal test/Demo environment |
+| HTTPS Enabled | 0 | Use only for internal test/demo; use HTTPS (1) in production |

Comment on lines +5 to +6
# Model Storage
You can store a model in an S3 bucket or Open Container Initiative (OCI) containers.

⚠️ Potential issue | 🟡 Minor

Include PVC in the opening sentence.
The intro says only S3/OCI, but the page also documents PVC storage.

✏️ Suggested edit
-You can store a model in an S3 bucket or Open Container Initiative (OCI) containers.
+You can store a model in an S3 bucket, an Open Container Initiative (OCI) container image, or a PVC.

#### S3 Key Configuration Parameters
| Configuration Item | Actual Value | Description |
|-------------------|-------------|-------------|
| Endpoint | your-s3-service-ip:your-s3-port | IP and port pointing to private MinIO service |

⚠️ Potential issue | 🟡 Minor

Align S3 endpoint placeholder formatting.
The table uses hyphens while the YAML uses underscores; this can confuse copy/paste.

✏️ Suggested edit
-| Endpoint | your-s3-service-ip:your-s3-port | IP and port pointing to private MinIO service |
+| Endpoint | your_s3_service_ip:your_s3_port | IP and port pointing to private MinIO service |

Also applies to: 36-36


Comment on lines +125 to +127
#### Prerequisites:
- The namespace where the inference service is located must have PSA (Pod Security Admission) Enforce set to Privilege
- Enable Modelcar in AmlCluster resource with uidModelcar set to 0

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

Kubernetes Pod Security Admission levels privileged baseline restricted official

💡 Result:

Kubernetes Pod Security Admission (PSA) is the built-in admission controller that can enforce the Pod Security Standards (PSS) at the namespace level. PSA supports three official PSS levels (from least to most restrictive): privileged, baseline, restricted. (v1-34.docs.kubernetes.io)

Official Pod Security levels (PSS)

  • privileged: Unrestricted; allows known privilege escalations and host-level access patterns (intended for trusted system/infrastructure workloads). (v1-34.docs.kubernetes.io)
  • baseline: Minimally restrictive; “default-friendly” while preventing known privilege escalations. (v1-34.docs.kubernetes.io)
  • restricted: Heavily restrictive; aligns to current pod hardening best practices (most secure, least compatible). (v1-34.docs.kubernetes.io)

How PSA applies these levels (namespace labels)

You set namespace labels to choose a level for each PSA mode:

  • pod-security.kubernetes.io/enforce: <privileged|baseline|restricted>
  • pod-security.kubernetes.io/audit: <...>
  • pod-security.kubernetes.io/warn: <...>
    Optional pinning:
  • pod-security.kubernetes.io/<mode>-version: <k8s minor version or latest> (v1-34.docs.kubernetes.io)

PSA modes mean:

  • enforce: reject non-compliant pods
  • audit: allow but annotate for audit logs
  • warn: allow but show user warning (v1-34.docs.kubernetes.io)
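The namespace labels described above can be expressed declaratively. A minimal sketch, assuming a hypothetical namespace name `inference-ns`, matching the "privileged" enforce level this review recommends:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: inference-ns   # hypothetical namespace name
  labels:
    # Reject pods that do not meet the privileged level (i.e., allow everything)
    pod-security.kubernetes.io/enforce: privileged
    # Optionally pin the policy version
    pod-security.kubernetes.io/enforce-version: latest
```

Equivalently, `kubectl label namespace inference-ns pod-security.kubernetes.io/enforce=privileged` applies the same label to an existing namespace.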

PSA level should be "privileged".
Kubernetes Pod Security Admission levels use lowercase terminology: "privileged", "baseline", "restricted". The current text uses incorrect capitalization.

✏️ Suggested edit
-- The namespace where the inference service is located must have PSA (Pod Security Admission) Enforce set to Privilege
+- The namespace where the inference service is located must have PSA (Pod Security Admission) Enforce set to privileged

Comment on lines +216 to +220
3. In your workbench IDE, navigate to the file browser:
- In JupyterLab, this is the **Files** tab in the left sidebar.
- In code-server, this is the **Explorer** view in the left sidebar.

4. In the file browser, navigate to the home directory. This directory represents the root of your attached PVC.

⚠️ Potential issue | 🟡 Minor

Reduce repeated “In …” sentence starts.
Three consecutive steps start with “In…”, which reads repetitive.

✏️ Suggested edit
-3. In your workbench IDE, navigate to the file browser:
+3. From your workbench IDE, open the file browser:
🧰 Tools
🪛 LanguageTool

[style] ~220-~220: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...xplorer** view in the left sidebar. 4. In the file browser, navigate to the home ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


cpu: '2'
memory: 4Gi
runtime: aml-vllm-0.11.2-cpu # [!code callout]
storageUri: pvc://model-pvc # [!code callout]

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

cd docs/en/model_inference/model_management/functions && wc -l model_storage.mdx

Repository: alauda/aml-docs

Length of output: 80


🏁 Script executed:

cd docs/en/model_inference/model_management/functions && sed -n '260,285p' model_storage.mdx

Repository: alauda/aml-docs

Length of output: 1030


🌐 Web query:

KServe PVC storageUri format URI subpath namespace requirements

💡 Result:

KServe storageUri for PVC: format + subpath

Use the pvc:// scheme with PVC name followed by an optional path inside the PVC:

  • Format: pvc://<pvc-name>/<model-path-within-pvc>
  • Examples:
    • Single file model: pvc://task-pv-claim/model.joblib (KServe example) [1]
    • General form: pvc://<pvc_name>/<model_path> [2]

<model-path-within-pvc> is effectively a subpath inside the PVC (for example, a file like model.joblib or a directory containing model artifacts). [1]

Namespace requirements

  • You cannot reference a PVC in another namespace via storageUri (there is no namespace field in the pvc:// URI).
  • The PVC must exist in the same Kubernetes namespace as the serving Pod / InferenceService, because PVCs are namespaced and must be in the same namespace as the Pod that uses them. [3]

Sources: KServe PVC storage guide examples [1], Open Data Hub (KServe) serving docs [2], Kubernetes persistent volume docs (claims + namespaces) [3].


Consider clarifying the optional model path in the PVC URI.
The format is pvc://<pvc-name>/<model-path-within-pvc>. The example pvc://model-pvc is valid for the root of the PVC, but consider adding a note that specific model paths (e.g., pvc://model-pvc/models/my-model) can be used. Namespace does not apply in the URI—the PVC must exist in the same Kubernetes namespace as the InferenceService.

