Conversation

@TylerZeroMaster

https://openstax.atlassian.net/browse/CORE-1521

The idea is that we will use partial_data to determine if there was an error. If there was an error, we can cache less greedily.

I tried to do this in a way that wouldn't break things that do not care about partial data. It's maybe a little hacky...

Next I plan to update content.rb in exercises so that it sets the cache expiration to 30 minutes if partial_data is true.
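
As a rough sketch of that plan (the cache key, the Rails.cache call, and the 1.day "normal" TTL are assumptions; only partial_data and slugs_by_page_uuid come from this PR):

    # Hypothetical caching logic for content.rb in exercises:
    abl = OpenStax::Content::Abl.new
    slugs = abl.slugs_by_page_uuid

    # Cache less greedily when some books failed to process
    ttl = abl.partial_data ? 30.minutes : 1.day
    Rails.cache.write('content/slugs_by_page_uuid', slugs, expires_in: ttl)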

TylerZeroMaster (Author) commented Jan 27, 2026

It might also make sense for it to look at the previous content version, not just the previous archive version. I think in most cases the page slugs will remain the same between content versions. This would still potentially cause some context tags to be removed if the slugs differ between versions, but I think that would be very rare.

The ABL keeps two versions when someone adds a new version: the version that is currently on REX (if there is one), and the new version they add.

Edit: It already does this.

hash[page.uuid] ||= []
hash[page.uuid] << { book: book.slug, page: page.slug }
end
rescue StandardError => exception
Member commented:

Hmm the issue is that this rescue will prevent the rescue in https://github.com/openstax/content-ruby/blob/main/lib/openstax/content/abl.rb#L57 from triggering, so we'll lose the behavior of trying old archive versions.

Maybe this logic has to replace the raise exception in https://github.com/openstax/content-ruby/blob/main/lib/openstax/content/abl.rb#L58 and https://github.com/openstax/content-ruby/blob/main/lib/openstax/content/abl.rb#L66 instead?
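
A minimal illustration of the shadowing (method names are made up):

    begin
      slugs_for(book)              # if this rescues StandardError internally...
    rescue StandardError
      retry_with_previous_archive  # ...this outer fallback never triggers
    end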

Author replied:

Oh, you're right. I think I got in too big of a hurry on this one. I will fix it.

Author replied:

@Dantemss I moved the error handling into each_book_with_previous_archive_version_fallback and changed it so that each book starts from the newest archive version and walks backwards up to max_attempts.

Before, it would reuse the same previous_archive for subsequent books, and only allowed up to max_attempts failures across all books combined.
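
Roughly, the new per-book shape (a sketch based on the description above; build_book_with_archive stands in for however the book gets rebuilt against the earlier archive):

    books.each do |book|
      attempt = 0
      begin
        yield book
      rescue StandardError => exception
        previous_version = book.archive.previous_version
        # No earlier archive versions left, or this book's attempts are used up
        raise exception if previous_version.nil? || attempt >= max_attempts

        attempt += 1
        book = build_book_with_archive(book, previous_version)  # hypothetical helper
        retry
      end
    end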

…eful degradation

The changes fix a critical bug and significantly improve error handling.

Key improvements:

1. Each book now gets its own previous_version and previous_archive instead of reusing the first failing book's archive for all subsequent failures
2. Replace complex batch retry pattern with straightforward per-book retry loop
3. New allow_partial_data parameter allows processing to continue when books fail, logging warnings instead of raising exceptions (see the sketch after this list)
4. Remove duplicate begin/rescue from slugs_by_page_uuid, consolidating it in each_book_with_previous_archive_version_fallback
5. Add proper test setup with mocked books to verify graceful degradation and partial data handling
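
A caller's-eye sketch of the new parameter (the block body and any other arguments are illustrative):

    abl.each_book_with_previous_archive_version_fallback(allow_partial_data: true) do |book|
      index_book(book)  # hypothetical per-book work
    end
    abl.partial_data  # => true if any book was skipped with a logged warning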
Dantemss requested a review from Copilot February 2, 2026 17:26
Copilot AI left a comment

Pull request overview

This PR adds error handling and tracking to the ABL (Approved Book List) processing logic. The primary goal is to enable graceful degradation when some books fail to process, allowing the system to return partial data while tracking that errors occurred. This will enable less aggressive caching when errors are encountered.

Changes:

  • Added a partial_data flag to track when book processing encounters errors
  • Modified each_book_with_previous_archive_version_fallback to support an allow_partial_data parameter that controls whether errors are raised or suppressed
  • Updated the retry logic to work on a per-book basis instead of retrying all failed books together
  • Added comprehensive test coverage for the new error handling behavior

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

lib/openstax/content/abl.rb: Added partial_data attribute and modified retry logic to handle errors gracefully on a per-book basis, with optional suppression of exceptions
spec/openstax/content/abl_spec.rb: Added three new tests covering partial data scenarios: single book failure, exhausting all retries, and the allow_partial_data: false behavior


# There are no more earlier archive versions
raise exception
previous_version = book.archive.previous_version
if previous_version.nil? or attempt >= max_attempts
Copilot AI commented Feb 2, 2026

Use || instead of or for boolean logic. In Ruby, or has very low precedence and is typically used for control flow, while || is the standard boolean OR operator. Using || makes the intent clearer and avoids potential precedence issues.

Suggested change
- if previous_version.nil? or attempt >= max_attempts
+ if previous_version.nil? || attempt >= max_attempts
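
A standalone illustration of the precedence difference:

    # `or` binds more loosely than `=`, so these two lines parse differently:
    a = false || true   # parsed as a = (false || true)  => a == true
    b = false or true   # parsed as (b = false) or true  => b == false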

Member commented:

not a huge deal


# If the book requires an archive version that hasn't finished building yet, don't include it
retry_books << retry_book if retry_book.valid?
break unless book.valid?
Copilot AI commented Feb 2, 2026

When a book is invalid after being created with a previous archive version, the code breaks out of the retry loop without setting @partial_data = true or logging a warning. This means the book is silently skipped, and callers won't know that some books were not processed. Consider setting @partial_data = true and logging a warning before the break, similar to the error handling on lines 64-68.

Suggested change
- break unless book.valid?
+ unless book.valid?
+   @partial_data = true
+   OpenStax::Content::logger.warn do
+     "Failed to process slugs for book: #{book.uuid}. " \
+     "Error: invalid book for archive version #{previous_version}"
+   end
+   break
+ end

Member commented:

Oh, this one probably should be fixed.

Comment on lines +47 to +54
result << OpenStax::Content::Book.new(
  archive: archive,
  uuid: '00000000-0000-0000-0000-000000000000',
  version: version,
  min_code_version: archive_version,
  slug: 'test-book',
  committed_at: '2026-01-21T21:45:57+00:00'
)
Copilot AI commented Feb 2, 2026

The test creates two books with the same UUID ('00000000-0000-0000-0000-000000000000') but different versions. This doesn't match realistic data where each book should have a unique UUID. While this may work for testing purposes, it could mask issues where book identity matters. Consider using different UUIDs for the two test books to better reflect production data.

Member commented:

not a huge deal, but I tend to just use SecureRandom.uuid unless I need a fixed value for a snapshot or something

Comment on lines 59 to 68
allow_any_instance_of(OpenStax::Content::Book).to receive(:all_pages).and_wrap_original do |method, *args|
  # Fail for the first book encountered
  if @first_book_processed
    # Return fake pages for the second book
    [
      OpenStruct.new(uuid: '11111111-1111-1111-1111-111111111111', slug: 'test-page-1'),
      OpenStruct.new(uuid: '22222222-2222-2222-2222-222222222222', slug: 'test-page-2')
    ]
  else
    @first_book_processed = true
Copilot AI commented Feb 2, 2026

The test uses an instance variable @first_book_processed in the stub, which is set at the RSpec test level. This creates shared state that persists across stub invocations and could lead to test pollution if the stub is called in unexpected ways. Consider using a local variable or a more explicit counter mechanism to track which book is being processed.

Suggested change
- allow_any_instance_of(OpenStax::Content::Book).to receive(:all_pages).and_wrap_original do |method, *args|
-   # Fail for the first book encountered
-   if @first_book_processed
-     # Return fake pages for the second book
-     [
-       OpenStruct.new(uuid: '11111111-1111-1111-1111-111111111111', slug: 'test-page-1'),
-       OpenStruct.new(uuid: '22222222-2222-2222-2222-222222222222', slug: 'test-page-2')
-     ]
-   else
-     @first_book_processed = true
+ first_book_processed = false
+ allow_any_instance_of(OpenStax::Content::Book).to receive(:all_pages).and_wrap_original do |method, *args|
+   # Fail for the first book encountered
+   if first_book_processed
+     # Return fake pages for the second book
+     [
+       OpenStruct.new(uuid: '11111111-1111-1111-1111-111111111111', slug: 'test-page-1'),
+       OpenStruct.new(uuid: '22222222-2222-2222-2222-222222222222', slug: 'test-page-2')
+     ]
+   else
+     first_book_processed = true

Member commented:

not a huge deal I think

raise exception unless allow_partial_data
@partial_data = true
OpenStax::Content::logger.warn do
  "Failed to process slugs for book: #{book.uuid}. " \
Copilot AI commented Feb 2, 2026

The error message says "Failed to process slugs for book" but this method is generic and not specific to slug processing. It could be called from contexts other than slugs_by_page_uuid. Consider a more generic message like "Failed to process book" to better reflect the method's general-purpose nature.

Suggested change
- "Failed to process slugs for book: #{book.uuid}. " \
+ "Failed to process book: #{book.uuid}. " \
