Add Model Storage #89
Walkthrough
Adds a new MDX documentation file describing three model storage modalities for cloud-native inference (S3 Object Storage, OCI Model-as-Image, and PVC), plus loading mechanisms (Init Container, Sidecar), configuration examples, and deployment snippets.
Estimated code review effort: 2 (Simple), ~12 minutes
Pre-merge checks: 3 passed
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In `@docs/en/model_inference/model_management/functions/model_storage.mdx`:
- Line 22: Replace the hyphenated S3 endpoint placeholders in the table
("your-s3-service-ip:your-s3-port") with the underscore-formatted placeholders
used in the YAML ("your_s3_service_ip:your_s3_port") so the example strings
match; update all occurrences (e.g., the cell at the shown table row and the
other instance around line 36) to use the underscore format for consistency.
- Around line 5-6: The opening sentence under the "Model Storage" heading
currently lists only S3 and OCI; update that sentence to also mention PVC
(Persistent Volume Claim) so it reflects all storage options documented on the
page (e.g., "You can store a model in an S3 bucket, Open Container Initiative
(OCI) containers, or a Persistent Volume Claim (PVC)."). Locate and edit the
initial paragraph following the "Model Storage" heading and ensure terminology
matches other sections that reference PVC (use "Persistent Volume Claim (PVC)"
on first mention).
- Around line 216-220: Steps 3 and 4 repeat the sentence starter "In…" — reword
them to avoid repetition by merging into one instruction: replace the two lines
starting "In your workbench IDE, navigate to the file browser:" and "In the file
browser, navigate to the home directory." with a single line like "Open the file
browser (Files tab in JupyterLab or Explorer view in code-server) and navigate
to the home directory, which represents the root of your attached PVC." This
keeps the referenced UI elements ("Files tab", "Explorer view", "home
directory") but removes the repeated "In…" sentence starts.
- Around line 125-127: Update the prerequisite text that currently reads "PSA
(Pod Security Admission) Enforce set to Privilege" to use the correct lowercase
Kubernetes Pod Security Admission level: change it to "PSA (Pod Security
Admission) Enforce set to privileged"; ensure the rest of the prerequisite (the
Enable Modelcar line with uidModelcar set to 0) remains unchanged.
- Line 272: Update the storageUri example and add a short note clarifying the
optional model path and namespace behavior: state that the PVC URI format is
pvc://<pvc-name>/<model-path-within-pvc>, that the example storageUri:
pvc://model-pvc refers to the PVC root, and show an example with a specific path
(e.g., pvc://model-pvc/models/my-model); also add a sentence that the PVC must
exist in the same Kubernetes namespace as the InferenceService (namespace does
not apply in the URI).
🧹 Nitpick comments (1)
docs/en/model_inference/model_management/functions/model_storage.mdx (1)
24-24: Add a production safety note for HTTP.
This row reads like a recommendation; consider explicitly warning against HTTP in production.
✏️ Suggested edit
-| HTTPS Enabled | 0 | Encryption disabled for internal test/Demo environment |
+| HTTPS Enabled | 0 | Use only for internal test/demo; use HTTPS (1) in production |
# Model Storage
You can store a model in an S3 bucket or Open Container Initiative (OCI) containers.
Include PVC in the opening sentence.
The intro says only S3/OCI, but the page also documents PVC storage.
✏️ Suggested edit
-You can store a model in an S3 bucket or Open Container Initiative (OCI) containers.
+You can store a model in an S3 bucket, an Open Container Initiative (OCI) container image, or a PVC.
🤖 Prompt for AI Agents
In `@docs/en/model_inference/model_management/functions/model_storage.mdx` around
lines 5 - 6, The opening sentence under the "Model Storage" heading currently
lists only S3 and OCI; update that sentence to also mention PVC (Persistent
Volume Claim) so it reflects all storage options documented on the page (e.g.,
"You can store a model in an S3 bucket, Open Container Initiative (OCI)
containers, or a Persistent Volume Claim (PVC)."). Locate and edit the initial
paragraph following the "Model Storage" heading and ensure terminology matches
other sections that reference PVC (use "Persistent Volume Claim (PVC)" on first
mention).
#### S3 Key Configuration Parameters
| Configuration Item | Actual Value | Description |
|-------------------|-------------|-------------|
| Endpoint | your-s3-service-ip:your-s3-port | IP and port pointing to private MinIO service |
Align S3 endpoint placeholder formatting.
The table uses hyphens while the YAML uses underscores; this can confuse copy/paste.
✏️ Suggested edit
-| Endpoint | your-s3-service-ip:your-s3-port | IP and port pointing to private MinIO service |
+| Endpoint | your_s3_service_ip:your_s3_port | IP and port pointing to private MinIO service |
Also applies to: 36-36
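For context, here is a rough sketch of how the underscore-style placeholder might look in a credentials Secret. It assumes the page follows the upstream KServe convention of `serving.kserve.io/s3-*` annotations. The YAML under review is not shown in this thread, so the Secret name and the credential placeholders are illustrative only:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: s3-credentials                     # hypothetical name, not taken from the PR
  annotations:
    # Same placeholder string as the table row, so copy/paste stays consistent.
    serving.kserve.io/s3-endpoint: your_s3_service_ip:your_s3_port
    # "0" disables HTTPS; suitable only for internal test/demo environments.
    serving.kserve.io/s3-usehttps: "0"
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: your_access_key       # placeholder
  AWS_SECRET_ACCESS_KEY: your_secret_key   # placeholder
```

Whichever placeholder style the page settles on, the table and the YAML should use the same string character for character.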
🤖 Prompt for AI Agents
In `@docs/en/model_inference/model_management/functions/model_storage.mdx` at line
22, Replace the hyphenated S3 endpoint placeholders in the table
("your-s3-service-ip:your-s3-port") with the underscore-formatted placeholders
used in the YAML ("your_s3_service_ip:your_s3_port") so the example strings
match; update all occurrences (e.g., the cell at the shown table row and the
other instance around line 36) to use the underscore format for consistency.
#### Prerequisites:
- The namespace where the inference service is located must have PSA (Pod Security Admission) Enforce set to Privilege
- Enable Modelcar in AmlCluster resource with uidModelcar set to 0
🧩 Analysis chain
🌐 Web query:
Kubernetes Pod Security Admission levels privileged baseline restricted official
💡 Result:
Kubernetes Pod Security Admission (PSA) is the built-in admission controller that can enforce the Pod Security Standards (PSS) at the namespace level. PSA supports three official PSS levels (from least to most restrictive): privileged, baseline, restricted. (v1-34.docs.kubernetes.io)
Official Pod Security levels (PSS)
- privileged: Unrestricted; allows known privilege escalations and host-level access patterns (intended for trusted system/infrastructure workloads). (v1-34.docs.kubernetes.io)
- baseline: Minimally restrictive; “default-friendly” while preventing known privilege escalations. (v1-34.docs.kubernetes.io)
- restricted: Heavily restrictive; aligns to current pod hardening best practices (most secure, least compatible). (v1-34.docs.kubernetes.io)
How PSA applies these levels (namespace labels)
You set namespace labels to choose a level for each PSA mode:
- pod-security.kubernetes.io/enforce: <privileged|baseline|restricted>
- pod-security.kubernetes.io/audit: <...>
- pod-security.kubernetes.io/warn: <...>
Optional pinning:
- pod-security.kubernetes.io/<mode>-version: <k8s minor version or latest> (v1-34.docs.kubernetes.io)
PSA modes mean:
- enforce: reject non-compliant pods
- audit: allow but annotate for audit logs
- warn: allow but show user warning (v1-34.docs.kubernetes.io)
Citations:
- 1: https://v1-34.docs.kubernetes.io/docs/concepts/security/pod-security-standards/?utm_source=openai
- 2: https://v1-34.docs.kubernetes.io/docs/concepts/security/pod-security-standards/?utm_source=openai
- 3: https://v1-34.docs.kubernetes.io/docs/concepts/security/pod-security-standards/?utm_source=openai
- 4: https://v1-34.docs.kubernetes.io/docs/concepts/security/pod-security-standards/?utm_source=openai
- 5: https://v1-34.docs.kubernetes.io/docs/concepts/security/pod-security-admission/?utm_source=openai
- 6: https://v1-34.docs.kubernetes.io/docs/concepts/security/pod-security-admission/?utm_source=openai
PSA level should be "privileged".
Kubernetes Pod Security Admission levels are lowercase: "privileged", "baseline", "restricted". The current text uses "Privilege", which is not one of these level names in either spelling or capitalization.
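For reference, the enforce level is set with a namespace label, as described in the query result above. A minimal sketch, in which the namespace name `my-inference-ns` is a placeholder and not a name from this PR:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: my-inference-ns                    # hypothetical namespace name
  labels:
    # The value must be one of: privileged, baseline, restricted (all lowercase).
    pod-security.kubernetes.io/enforce: privileged
```

Showing the literal label value in the prerequisite would also make the expected spelling unambiguous to readers.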
✏️ Suggested edit
-- The namespace where the inference service is located must have PSA (Pod Security Admission) Enforce set to Privilege
+- The namespace where the inference service is located must have PSA (Pod Security Admission) Enforce set to privileged
🤖 Prompt for AI Agents
In `@docs/en/model_inference/model_management/functions/model_storage.mdx` around
lines 125 - 127, Update the prerequisite text that currently reads "PSA (Pod
Security Admission) Enforce set to Privilege" to use the correct lowercase
Kubernetes Pod Security Admission level: change it to "PSA (Pod Security
Admission) Enforce set to privileged"; ensure the rest of the prerequisite (the
Enable Modelcar line with uidModelcar set to 0) remains unchanged.
3. In your workbench IDE, navigate to the file browser:
   - In JupyterLab, this is the **Files** tab in the left sidebar.
   - In code-server, this is the **Explorer** view in the left sidebar.
4. In the file browser, navigate to the home directory. This directory represents the root of your attached PVC.
Reduce repeated “In …” sentence starts.
Three consecutive steps start with “In…”, which reads repetitive.
✏️ Suggested edit
-3. In your workbench IDE, navigate to the file browser:
+3. From your workbench IDE, open the file browser:
🧰 Tools
🪛 LanguageTool
[style] ~220-~220: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...xplorer** view in the left sidebar. 4. In the file browser, navigate to the home ...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
🤖 Prompt for AI Agents
In `@docs/en/model_inference/model_management/functions/model_storage.mdx` around
lines 216 - 220, Steps 3 and 4 repeat the sentence starter "In…" — reword them
to avoid repetition by merging into one instruction: replace the two lines
starting "In your workbench IDE, navigate to the file browser:" and "In the file
browser, navigate to the home directory." with a single line like "Open the file
browser (Files tab in JupyterLab or Explorer view in code-server) and navigate
to the home directory, which represents the root of your attached PVC." This
keeps the referenced UI elements ("Files tab", "Explorer view", "home
directory") but removes the repeated "In…" sentence starts.
cpu: '2'
memory: 4Gi
runtime: aml-vllm-0.11.2-cpu # [!code callout]
storageUri: pvc://model-pvc # [!code callout]
🧩 Analysis chain
🏁 Script executed:
cd docs/en/model_inference/model_management/functions && wc -l model_storage.mdx
Repository: alauda/aml-docs
Length of output: 80
🏁 Script executed:
cd docs/en/model_inference/model_management/functions && sed -n '260,285p' model_storage.mdx
Repository: alauda/aml-docs
Length of output: 1030
🌐 Web query:
KServe PVC storageUri format URI subpath namespace requirements
💡 Result:
KServe storageUri for PVC: format + subpath
Use the pvc:// scheme with the PVC name followed by an optional path inside the PVC:
- Format: pvc://<pvc-name>/<model-path-within-pvc>
- Examples:
  - Single file model: pvc://task-pv-claim/model.joblib (KServe example) [1]
  - General form: pvc://<pvc_name>/<model_path> [2]
<model-path-within-pvc> is effectively a subpath inside the PVC (for example, a file like model.joblib or a directory containing model artifacts). [1]
Namespace requirements
- You cannot reference a PVC in another namespace via storageUri (there is no namespace field in the pvc:// URI).
- The PVC must exist in the same Kubernetes namespace as the serving Pod / InferenceService, because PVCs are namespaced and must be in the same namespace as the Pod that uses them. [3]
Sources: KServe PVC storage guide examples [1], Open Data Hub (KServe) serving docs [2], Kubernetes persistent volume docs (claims + namespaces) [3].
Consider clarifying the optional model path in the PVC URI.
The format is pvc://<pvc-name>/<model-path-within-pvc>. The example pvc://model-pvc is valid for the root of the PVC, but consider adding a note that specific model paths (e.g., pvc://model-pvc/models/my-model) can be used. Namespace does not apply in the URI—the PVC must exist in the same Kubernetes namespace as the InferenceService.
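As an illustration of the subpath form, here is a sketch of an InferenceService that points at a directory inside the PVC. The resource name, namespace, and `modelFormat` value are assumptions made for the example; only `runtime` and the `model-pvc` claim name come from the snippet under review:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: my-model                           # hypothetical name
  namespace: my-inference-ns               # hypothetical; must be the namespace that contains model-pvc
spec:
  predictor:
    model:
      modelFormat:
        name: transformers                 # assumed format; use whatever the runtime expects
      runtime: aml-vllm-0.11.2-cpu
      # pvc://<pvc-name>/<model-path-within-pvc>; omit the path to use the PVC root.
      storageUri: pvc://model-pvc/models/my-model
```

The URI has no namespace component, so the claim is always resolved in the InferenceService's own namespace.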
🤖 Prompt for AI Agents
In `@docs/en/model_inference/model_management/functions/model_storage.mdx` at line
272, Update the storageUri example and add a short note clarifying the optional
model path and namespace behavior: state that the PVC URI format is
pvc://<pvc-name>/<model-path-within-pvc>, that the example storageUri:
pvc://model-pvc refers to the PVC root, and show an example with a specific path
(e.g., pvc://model-pvc/models/my-model); also add a sentence that the PVC must
exist in the same Kubernetes namespace as the InferenceService (namespace does
not apply in the URI).
3e87e0a to 51db17e