Persist article storage metrics during ingestion #44

Merged
jonocodes merged 3 commits into dev from claude/track-article-assets-RfVdP
Feb 13, 2026

@jonocodes
Owner

Summary

This PR optimizes storage size calculation by persisting asset count and total size information during article ingestion, eliminating the need for expensive on-demand calculations in most cases.

Key Changes

  • Added storage metrics to Article model: New optional fields assetCount and sizeBytes to track persisted storage information
  • Updated ingestion pipeline: Modified ingestHtml(), ingestPdf(), and ingestImage() to calculate and persist asset counts and total sizes during the ingestion process
  • Enhanced image processing: Modified downloadAndResizeImages() to return detailed metrics (successful downloads, saved bytes, file count) instead of just a count
  • Optimized ArticleScreen display logic: Updated storage size display to prefer persisted metrics from the article object, falling back to on-demand calculation only when metrics are unavailable
  • Conditional calculation: Added checks in two locations to skip expensive calculateArticleStorageSize() calls when persisted metrics are already available
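
As an illustration of the richer return value described for downloadAndResizeImages(), here is a minimal sketch of how per-image outcomes could be folded into one metrics object. The PR does not show the actual return type, so the ImageDownloadMetrics field names, the ImageResult shape, and the summarizeImageResults helper are all assumptions made for this example.

```typescript
// Hypothetical shape: the PR names the three metrics but not the field names.
interface ImageDownloadMetrics {
  successfulDownloads: number; // images fetched and saved successfully
  savedBytes: number;          // total bytes written for those images
  fileCount: number;           // files written (e.g. image plus thumbnail)
}

// Hypothetical per-image outcome: sizes of the files written, or null on failure.
type ImageResult = { fileSizes: number[] } | null;

function summarizeImageResults(results: ImageResult[]): ImageDownloadMetrics {
  const metrics: ImageDownloadMetrics = {
    successfulDownloads: 0,
    savedBytes: 0,
    fileCount: 0,
  };
  for (const result of results) {
    if (result === null) continue; // a failed download contributes nothing
    metrics.successfulDownloads += 1;
    metrics.fileCount += result.fileSizes.length;
    metrics.savedBytes += result.fileSizes.reduce((sum, size) => sum + size, 0);
  }
  return metrics;
}
```

Returning an object rather than a bare count lets the caller add savedBytes and fileCount straight into the article totals without re-reading the saved files.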

Implementation Details

  • Storage size calculation now happens during ingestion when all file data is readily available, improving performance
  • The on-demand calculation remains as a fallback for articles ingested before this change or in edge cases
  • Asset counts include base files (HTML, metadata, logs) plus image/thumbnail resources
  • Size calculations account for all stored files: rendered HTML, raw HTML, fetch logs, article metadata, and media resources
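
The accounting described above can be sketched roughly as follows. This is not the code from the PR: BaseFiles, StorageMetrics, and computeStorageMetrics are hypothetical names, and the sketch assumes the four base files are held as strings at ingestion time and measured as UTF-8.

```typescript
// Hypothetical inputs: the text of each base file written for an article.
interface BaseFiles {
  renderedHtml: string;
  rawHtml: string;
  fetchLog: string;
  metadataJson: string;
}

interface StorageMetrics {
  assetCount: number;
  sizeBytes: number;
}

// TextEncoder is a standard global in browsers and Node; it encodes as UTF-8.
const encoder = new TextEncoder();
const byteLength = (text: string): number => encoder.encode(text).length;

function computeStorageMetrics(
  base: BaseFiles,
  mediaFileCount: number, // image/thumbnail files written
  mediaBytes: number,     // total bytes of those media files
): StorageMetrics {
  const baseSizes = [
    base.renderedHtml,
    base.rawHtml,
    base.fetchLog,
    base.metadataJson,
  ].map(byteLength);
  return {
    assetCount: baseSizes.length + mediaFileCount,
    sizeBytes: baseSizes.reduce((sum, size) => sum + size, 0) + mediaBytes,
  };
}
```

The sketch measures the metadata text as given. Whether the real pipeline measures article.json before or after the metric fields are written into it is not stated in the PR.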

https://claude.ai/code/session_011h8i7BVMqrpsxkER9yKVSF

Track the number of stored files and total storage size in bytes for
each article during ingestion. These fields are computed from the actual
content saved (HTML, images, thumbnails, metadata) and persisted in the
article.json metadata for all ingestion types (HTML, PDF, image).

https://claude.ai/code/session_011h8i7BVMqrpsxkER9yKVSF

Display article size and file count from persisted fields when available,
skipping the expensive calculateArticleStorageSize call. Legacy articles
without these fields fall back to the on-demand calculation.

https://claude.ai/code/session_011h8i7BVMqrpsxkER9yKVSF
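
The prefer-persisted, fall-back-to-calculation behaviour described in this commit could look something like the sketch below. ArticleLike, StorageInfo, and resolveStorageInfo are hypothetical names, and the fallback is passed in as a callback because the signature of calculateArticleStorageSize is not shown in the PR.

```typescript
// Hypothetical subset of the Article model: only the two new optional fields.
interface ArticleLike {
  assetCount?: number;
  sizeBytes?: number;
}

interface StorageInfo {
  assetCount: number;
  sizeBytes: number;
}

// Returns the persisted metrics when both are present. Otherwise it calls the
// supplied fallback, which may return a value or a Promise of one.
function resolveStorageInfo<R>(
  article: ArticleLike,
  fallback: () => R,
): StorageInfo | R {
  if (
    typeof article.assetCount === "number" &&
    typeof article.sizeBytes === "number"
  ) {
    return { assetCount: article.assetCount, sizeBytes: article.sizeBytes };
  }
  return fallback(); // legacy article: compute on demand
}
```

A caller with an async calculator can simply await the result, since awaiting a plain object yields the object unchanged. Checking with typeof rather than truthiness keeps a legitimately persisted value of 0 from triggering the fallback.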
@netlify

netlify bot commented Feb 13, 2026

Deploy Preview for savrlist ready!

Name Link
🔨 Latest commit bc08b97
🔍 Latest deploy log https://app.netlify.com/projects/savrlist/deploys/698f561f7ab384000836a05a
😎 Deploy Preview https://deploy-preview-44--savrlist.netlify.app

@netlify

netlify bot commented Feb 13, 2026

Deploy Preview for savrdev ready!

Name Link
🔨 Latest commit bc08b97
🔍 Latest deploy log https://app.netlify.com/projects/savrdev/deploys/698f561ffb39930008f413ae
😎 Deploy Preview https://deploy-preview-44--savrdev.netlify.app

Verify that ingestHtml sets assetCount (4 base files for no-image articles)
and sizeBytes (positive, larger than raw HTML) on the returned article,
and that these fields are persisted in the saved article.json.

https://claude.ai/code/session_011h8i7BVMqrpsxkER9yKVSF
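
The conditions this commit verifies can be expressed as a small framework-free check. The test file itself is not shown in the PR, so IngestedArticle and storageMetricProblems are hypothetical names, and the default of 4 comes from the commit message's "4 base files for no-image articles".

```typescript
// Hypothetical subset of the article returned by ingestion.
interface IngestedArticle {
  assetCount?: number;
  sizeBytes?: number;
}

// Returns a list of human-readable problems; an empty list means all checks pass.
function storageMetricProblems(
  article: IngestedArticle,
  rawHtmlBytes: number,
  expectedAssetCount: number = 4, // base files for an article with no images
): string[] {
  const problems: string[] = [];
  if (article.assetCount !== expectedAssetCount) {
    problems.push(
      `assetCount should be ${expectedAssetCount}, got ${String(article.assetCount)}`,
    );
  }
  if (typeof article.sizeBytes !== "number" || article.sizeBytes <= 0) {
    problems.push("sizeBytes should be a positive number");
  } else if (article.sizeBytes <= rawHtmlBytes) {
    // The total also covers rendered HTML, logs, and metadata.
    problems.push("sizeBytes should be larger than the raw HTML alone");
  }
  return problems;
}
```

Running the same check on the object parsed back from the saved article.json would cover the persistence half of the commit message.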
jonocodes merged commit 02a0f0b into dev on Feb 13, 2026
9 checks passed