Add file size validation to document upload endpoint #584

Open
ankit-mehta07 wants to merge 5 commits into ProjectTech4DevAI:main from ankit-mehta07:enhancement/add-file-size-validation-document-upload

Conversation


@ankit-mehta07 ankit-mehta07 commented Feb 5, 2026

Summary

  • Added file size validation to document upload endpoint
  • Rejects files exceeding the allowed limit
  • Prevents empty file uploads

Result

Improves reliability and prevents oversized uploads.

Summary by CodeRabbit

  • New Features

    • Added server-side file upload validation (default max 512MB); empty files are rejected.
  • Documentation

    • Updated upload docs with file size limits and error responses.
  • Tests

    • Added tests covering oversized, empty, and valid uploads.
  • Chores

    • Improved conversation query performance via database indexing.
    • Job records now capture project and organization context for better traceability.


coderabbitai bot commented Feb 5, 2026

No actionable comments were generated in the recent review. 🎉


📝 Walkthrough

Walkthrough

Adds organization and project foreign keys to Job (DB + model + CRUD + service callers) via migrations. Introduces async document file-size validation with a configurable MAX_DOCUMENT_UPLOAD_SIZE_MB (default 512MB), integrates it into the upload route, and adds tests for the oversized, empty, and valid-file cases.
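For context, a minimal sketch of how such a setting can be resolved from the environment (the real project reads it via the settings object in backend/app/core/config.py; the helper name below is illustrative, not the project's API):

```python
import os

# Default mirrors the PR's MAX_DOCUMENT_UPLOAD_SIZE_MB setting (512MB).
DEFAULT_MAX_DOCUMENT_UPLOAD_SIZE_MB = 512


def max_document_upload_size_bytes() -> int:
    """Resolve the upload limit in bytes, falling back to the 512MB default."""
    size_mb = int(
        os.environ.get(
            "MAX_DOCUMENT_UPLOAD_SIZE_MB", DEFAULT_MAX_DOCUMENT_UPLOAD_SIZE_MB
        )
    )
    return size_mb * 1024 * 1024
```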

Changes

  • Database Migrations (backend/app/alembic/versions/043_add_project_org_to_job_table.py, backend/app/alembic/versions/044_optimize_conversation_query.py): Revision 043 adds nullable organization_id and project_id columns to job with foreign keys (ON DELETE CASCADE) and indexes; downgrade removes them. Revision 044 adds a composite index on openai_conversation over (ancestor_response_id, project_id, is_deleted, inserted_at); downgrade drops it.
  • Configuration & Documentation (backend/app/core/config.py, backend/app/api/docs/documents/upload.md): Adds MAX_DOCUMENT_UPLOAD_SIZE_MB (default 512) to settings. Upload docs now state that the max size is configurable (512MB default), with 413 returned for oversized files and 422 for empty files.
  • Document Upload Validation (backend/app/services/documents/validators.py, backend/app/api/routes/documents.py): New validators module with MAX_DOCUMENT_SIZE and an async validate_document_file(UploadFile) that checks file size and raises 413/422. The upload route now awaits validate_document_file before proceeding to upload/pre-transform validation.
  • Job Model & CRUD (backend/app/models/job.py, backend/app/crud/jobs.py): The Job model gains organization_id and project_id foreign keys and optional relationships; JobUpdate exposes error_message and task_id. JobCrud.create now requires project_id and organization_id and persists them on create.
  • Service Layer Integration (backend/app/services/llm/jobs.py, backend/app/services/response/jobs.py): Job-creation calls now pass project_id and organization_id through the LLM and response job-creation paths to persist context.
  • Tests (backend/app/tests/api/routes/documents/test_route_document_upload.py): Adds tests for file-size behavior: oversized -> 413, empty -> 422, valid size -> success and DB record creation; uses patches to simulate different MAX_DOCUMENT_SIZE values.
  • Misc / Branch Metadata (repo branch list entries): Adds branch/remote listing metadata referencing the enhancement branch names (no code changes).
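The validator's behavior can be sketched framework-agnostically as follows (the real validate_document_file is async and operates on FastAPI's UploadFile, raising HTTPException; the exception class and function signature below are simplified for illustration):

```python
import io
from typing import IO

# Mirrors the module-level limit described above (512MB default).
MAX_DOCUMENT_SIZE = 512 * 1024 * 1024


class UploadValidationError(Exception):
    """Carries an HTTP-style status code (413 for oversized, 422 for empty)."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code


def validate_document_file(file_obj: IO[bytes], max_size: int = MAX_DOCUMENT_SIZE) -> int:
    """Return the file's size in bytes, or raise for empty/oversized files."""
    file_obj.seek(0, io.SEEK_END)  # measure without reading the payload into memory
    size = file_obj.tell()
    file_obj.seek(0)  # rewind so the caller can still upload the content
    if size == 0:
        raise UploadValidationError(422, "Empty file uploads are rejected")
    if size > max_size:
        raise UploadValidationError(413, "File exceeds the maximum allowed size")
    return size
```

Seeking to the end and rewinding keeps the check O(1) in memory regardless of file size, which matters with a 512MB ceiling.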

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Route as /documents/upload
    participant Validator as validate_document_file
    participant Storage as CloudStorage
    participant DB as Database

    Client->>Route: POST file + metadata
    Route->>Validator: await validate_document_file(file)
    alt size > MAX
        Validator-->>Route: raise 413
        Route-->>Client: 413 Payload Too Large
    else size == 0
        Validator-->>Route: raise 422
        Route-->>Client: 422 Unprocessable Entity
    else valid size
        Validator-->>Route: return file_size
        Route->>Storage: upload file
        Storage-->>Route: upload success
        Route->>DB: create document record
        DB-->>Route: record created
        Route-->>Client: 200 OK (document created)
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Suggested labels

enhancement

Suggested reviewers

  • nishika26
  • AkhileshNegi

Poem

🐰 I hopped through tables, added fields with care,
Files now measured, no uploads too rare.
Projects and orgs snug in each job's tune,
Tiny validations hum like a tune.
Hop, test, deploy — a carrot-shaped boon! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): Docstring coverage is 53.33%, which is below the required 80.00% threshold. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
  • Description Check (✅ Passed): Check skipped; CodeRabbit's high-level summary is enabled.
  • Title check (✅ Passed): The title accurately describes the primary change: adding file size validation to the document upload endpoint, which aligns with the main objective and the majority of file changes.




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/app/services/response/jobs.py (1)

1-6: ⚠️ Potential issue | 🟡 Minor

Add a type hint for task_instance.
task_instance is untyped on Line 61, which breaks the project’s type-hint requirement.

✅ Suggested fix
+from typing import Any
@@
 def execute_job(
     request_data: dict,
     project_id: int,
     organization_id: int,
     job_id: str,
     task_id: str,
-    task_instance,
+    task_instance: Any,
 ) -> None:
As per coding guidelines, "Always add type hints to all function parameters and return values in Python code".

Also applies to: 55-62

🤖 Fix all issues with AI agents
In `@backend/app/alembic/versions/043_add_project_org_to_job_table.py`:
- Around line 22-41: The migration adds non-nullable columns organization_id and
project_id to the job table using op.add_column without a server_default, which
will fail if rows already exist; update the migration to perform a safe
two-phase change: either (A) add organization_id and project_id with a sensible
server_default (or temporary default value) so existing rows get backfilled,
commit, then remove the server_default and alter nullable to False, or (B) add
both columns as nullable (nullable=True) via op.add_column, run a data backfill
step to populate them, then run a follow-up ALTER to set nullable=False; adjust
the op.add_column calls for "organization_id" and "project_id" accordingly and
include a follow-up migration step to remove defaults or flip nullable once
backfill is done.
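Option (B) could look roughly like the following Alembic sketch (illustrative only: the backfill SQL is a placeholder, since deriving the correct organization/project for each existing job depends on the application's data):

```python
import sqlalchemy as sa
from alembic import op


def upgrade() -> None:
    # Phase 1: add the columns as nullable so existing rows are unaffected.
    op.add_column("job", sa.Column("organization_id", sa.Integer(), nullable=True))
    op.add_column("job", sa.Column("project_id", sa.Integer(), nullable=True))

    # Phase 2: backfill existing rows. The values here are placeholders;
    # real SQL must derive each job's organization/project from related tables.
    op.execute(
        "UPDATE job SET organization_id = 0, project_id = 0 "
        "WHERE organization_id IS NULL OR project_id IS NULL"
    )

    # Phase 3: tighten the constraint once every row is populated.
    op.alter_column("job", "organization_id", nullable=False)
    op.alter_column("job", "project_id", nullable=False)
```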

In `@backend/app/alembic/versions/044_optimize_conversation_query.py`:
- Around line 18-34: The migration functions upgrade and downgrade lack explicit
return type annotations; update their signatures (functions named upgrade and
downgrade in this migration) to include return type hints (i.e., -> None) so
they comply with the project's mandatory type-hints guideline, leaving the
function bodies unchanged and keeping the existing op.create_index/op.drop_index
calls intact.
- Around line 11-15: The migration functions upgrade() and downgrade() lack
return type annotations; update their definitions to include explicit return
types by changing them to "def upgrade() -> None:" and "def downgrade() ->
None:" so both functions are annotated as returning None (keep bodies unchanged
and only adjust the function signatures for upgrade and downgrade).

In `@backend/app/api/docs/documents/upload.md`:
- Around line 7-11: The docs claim a 50MB max but the code default constant
MAX_DOCUMENT_UPLOAD_SIZE_MB is 512; update the documentation in the upload.md
text to reflect the actual default (change "Maximum file size: 50MB" to "Maximum
file size: 512MB (configurable via MAX_DOCUMENT_UPLOAD_SIZE_MB environment
variable)") and ensure any related lines about rejection behavior remain
unchanged; reference the MAX_DOCUMENT_UPLOAD_SIZE_MB symbol so readers know the
source of truth.
🧹 Nitpick comments (2)
backend/app/models/job.py (1)

92-94: Consider adding back_populates for bidirectional navigation.

The relationships lack back_populates, meaning you cannot navigate from Organization or Project to their associated jobs. If bidirectional access is needed (e.g., organization.jobs), you'll need to add corresponding relationship fields to those models.

♻️ Example with back_populates
     # Relationships
-    organization: Optional["Organization"] = Relationship()
-    project: Optional["Project"] = Relationship()
+    organization: Optional["Organization"] = Relationship(back_populates="jobs")
+    project: Optional["Project"] = Relationship(back_populates="jobs")

Then add to Organization and Project models:

jobs: list["Job"] = Relationship(back_populates="organization", cascade_delete=True)
backend/app/crud/jobs.py (1)

15-31: Consider adding a log statement for job creation.

Per the coding guidelines, log messages should be prefixed with the function name. Adding a log entry here would improve observability for job creation events.

📝 Proposed logging addition
         self.session.add(new_job)
         self.session.commit()
         self.session.refresh(new_job)
+        logger.info(
+            f"[create] Job created | job_id={new_job.id}, job_type={job_type}, "
+            f"project_id={project_id}, organization_id={organization_id}"
+        )
         return new_job

Collaborator

@Prajna1999 Prajna1999 left a comment


Is this not a duplicate PR?

