Harden provider flows: EPUB/PDF enforcement, mirror retries, and Z-Lib compatibility fixes #41
Open
tetrabit wants to merge 3 commits into justrals:main
Improve core utility behavior used by setup/search/download flows:
- Expand find_working_url to retry up to 5 times with a longer timeout before failing.
- Add title normalization helper for cross-result matching.
- Add select_preferred_format_book_info to prioritize EPUB then PDF variants for a title/provider pair.
- Add zlib_md5_is_downloadable with lightweight cache to mark complaint-blocked Z-Library entries and avoid repeatedly probing the same MD5.

This commit centralizes selection and availability logic in misc.sh so call-sites can stay small and consistent.
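
The retry behavior described for find_working_url could look roughly like the sketch below. It is not the PR's implementation: the names find_working_url_sketch and probe_url, the 15-second timeout, the RETRY_DELAY variable, and the choice to cycle through every candidate mirror on each attempt are assumptions made for illustration. Only the limit of 5 attempts comes from the commit message.

```sh
#!/bin/sh
# Hypothetical probe helper: succeeds if the URL answers within the timeout.
# $1 = URL, $2 = timeout in seconds.
probe_url() {
    curl -fsS -o /dev/null --max-time "$2" "$1" 2>/dev/null
}

# Sketch only: print the first candidate URL that responds, retrying the
# whole candidate list up to 5 times before giving up.
find_working_url_sketch() {
    fwu_max_attempts=5      # from the commit message
    fwu_timeout=15          # assumed value; the PR only says "longer"
    fwu_attempt=1
    while [ "$fwu_attempt" -le "$fwu_max_attempts" ]; do
        for fwu_url in "$@"; do
            if probe_url "$fwu_url" "$fwu_timeout"; then
                printf '%s\n' "$fwu_url"
                return 0
            fi
        done
        fwu_attempt=$((fwu_attempt + 1))
        # Short pause between passes (assumed; overridable for testing).
        sleep "${RETRY_DELAY:-1}"
    done
    return 1
}
```

Keeping the probe in its own function makes it easy to stub out when testing off-device, or to point at a different curl build.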
Tighten search/result UX around preferred formats and availability:
- Filter parsed Anna's Archive entries to EPUB/PDF only.
- During source selection, gate zlib availability through provider-aware selection logic.
- Show an explicit note when zlib is unavailable for the selected title (blocked or no EPUB/PDF copy).

This reduces dead-end source choices and aligns visible results with download format policy.
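
As an illustration of the EPUB-then-PDF preference combined with title normalization, here is a minimal sketch. The record layout (one `title|format|md5` line per result in a plain file) and the function names normalize_title_sketch and select_preferred_sketch are hypothetical; the actual helpers in misc.sh may store and match results differently.

```sh
#!/bin/sh
# Sketch only: lowercase a title, turn runs of non-alphanumerics into a
# single space, and trim the ends, so near-identical titles compare equal.
normalize_title_sketch() {
    printf '%s' "$1" \
        | tr '[:upper:]' '[:lower:]' \
        | tr -cs '[:alnum:]' ' ' \
        | sed 's/^ *//; s/ *$//'
}

# Sketch only: $1 = wanted title, $2 = file of "title|format|md5" lines
# (assumed layout). Prints the first EPUB match, else the first PDF match.
select_preferred_sketch() {
    sel_want=$(normalize_title_sketch "$1")
    for sel_fmt in epub pdf; do
        while IFS='|' read -r sel_title sel_format sel_md5; do
            [ "$(normalize_title_sketch "$sel_title")" = "$sel_want" ] || continue
            sel_lc=$(printf '%s' "$sel_format" | tr '[:upper:]' '[:lower:]')
            [ "$sel_lc" = "$sel_fmt" ] || continue
            printf '%s|%s|%s\n' "$sel_title" "$sel_format" "$sel_md5"
            return 0
        done < "$2"
    done
    # No EPUB or PDF copy for this title.
    return 1
}
```

A non-zero return is the point where a caller could show the "no EPUB/PDF copy" note instead of offering a dead-end source.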
Update both provider downloaders to use preferred-format selection and improve reliability:
- lgli_download now resolves the selected record through EPUB/PDF preference and emits a clear error when no compatible format is available.
- zlib_download now resolves modern and legacy book URL shapes, extracts missing numeric IDs from the book page, and falls back to page-level /dl links when API downloadLink is absent/invalid.
- zlib downloader now detects copyright-complaint blocks and returns an explicit message instead of surfacing opaque "Invalid hash" behavior.
- Normalize extension handling/fallback when API metadata is incomplete.

Together these changes reduce false failures and keep downloads aligned with EPUB/PDF policy.
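
Handling both book URL shapes could be sketched as follows. This is an assumption-laden illustration, not the code in zlib_download.sh: the function name parse_zlib_book_url_sketch is hypothetical, and the heuristic (a leading all-digit segment followed by another segment is treated as the numeric ID) is a guess based on the commit message's mention of numeric IDs.

```sh
#!/bin/sh
# Sketch only: split a book URL into "<id> <hash>".
# Accepts /book/<id>/<hash> and /book/<hash>; prints "-" for a missing ID,
# which a caller would then need to recover from the book page.
parse_zlib_book_url_sketch() {
    pz_path=${1#*/book/}
    # No /book/ component at all: not a book URL.
    [ "$pz_path" != "$1" ] || return 1
    pz_path=${pz_path%%\?*}          # drop any query string
    pz_first=${pz_path%%/*}          # first path segment
    pz_rest=${pz_path#*/}            # remainder (equals pz_path if no slash)
    pz_id=""
    pz_hash=$pz_first
    case "$pz_first" in
        ''|*[!0-9]*)
            # Empty or non-numeric first segment: treat it as the hash.
            ;;
        *)
            if [ "$pz_rest" != "$pz_path" ] && [ -n "$pz_rest" ]; then
                pz_id=$pz_first
                pz_hash=${pz_rest%%/*}
            fi
            ;;
    esac
    [ -n "$pz_hash" ] || return 1
    printf '%s %s\n' "${pz_id:--}" "$pz_hash"
}
```

For example, `parse_zlib_book_url_sketch "https://z.example/book/12345/abcdef"` prints `12345 abcdef`, while the hash-only shape prints `- abcdef`.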
Observation from on-device testing (not a direct code change that exists in this PR):
We observed TLS handshake failures when accessing Anna's Archive mirrors with the stock system curl/OpenSSL stack on this device. Typical error was:
This caused mirror checks and requests to fail even when domains were otherwise reachable. As a workaround during testing, replacing/upgrading curl to a modern build (curl [...]
Marking this here as an environment compatibility note for maintainers/readers of this PR. I described how I fixed it in a reply on issue #40 (comment).
Summary
This PR hardens KindleFetch search/download behavior across Anna's Archive, LibGen, and Z-Library, with a focus on reliability and avoiding dead-end results.
Main goals:
- Enforce format preference (EPUB, fallback PDF)
Problems Observed
1) Mirror/connectivity false negatives
find_working_url() used a very short timeout and no retries, causing intermittent "failed to connect" outcomes on slower Kindle network conditions.
2) Inconsistent/undesired formats
Search and download flows could still select non-preferred formats (e.g. AZW3), despite user intent to prioritize EPUB/PDF.
3) Z-Library flow regressions
Recent Z-Lib behavior differs from older assumptions:
- Book URLs may be /book/<hash> (not always /book/<id>/<hash>)
- eapi/book/<id>/<hash>/file can return Invalid hash for some entries even when logged in
This produced opaque failures in KindleFetch (e.g. generic Invalid hash).
What Changed
A) Core helper improvements (kindlefetch/bin/misc.sh)
- find_working_url():
- normalize_title() for stable same-title matching.
- select_preferred_format_book_info():
  - EPUB first, then PDF
- zlib_md5_is_downloadable() with cache ($TMP_DIR/zlib_availability.cache):
B) Search filtering and source gating (kindlefetch/bin/search.sh)
C) LibGen downloader updates (kindlefetch/bin/downloads/lgli_download.sh)
D) Z-Library downloader hardening (kindlefetch/bin/downloads/zlib_download.sh)
- Supported book URL shapes:
  - /book/<id>/<hash>
  - /book/<hash>
- Extract book_id from page HTML when needed.
- If downloadLink is missing/invalid, fall back to HTML /dl/... extraction.
Why This Approach
Validation Performed
- Invalid hash case now handled more explicitly
User-Visible Behavior Changes
Files Changed
- kindlefetch/bin/misc.sh
- kindlefetch/bin/search.sh
- kindlefetch/bin/downloads/lgli_download.sh
- kindlefetch/bin/downloads/zlib_download.sh