Harden provider flows: EPUB/PDF enforcement, mirror retries, and Z-Lib compatibility fixes #41

Open
tetrabit wants to merge 3 commits into justrals:main from tetrabit:main

tetrabit commented Mar 6, 2026

Summary

This PR hardens KindleFetch search/download behavior across Anna's Archive, LibGen, and Z-Library, with a focus on reliability and avoiding dead-end results.

Main goals:

  • enforce preferred download formats (EPUB, fallback PDF)
  • reduce connection flakiness during mirror probing
  • handle current Z-Library URL/API behavior changes
  • avoid presenting Z-Lib options that are known to be blocked/unusable

Problems Observed

1) Mirror/connectivity false negatives

find_working_url() used a very short timeout and no retries, causing intermittent "failed to connect" outcomes on slower Kindle network conditions.
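
The retry behavior this calls for can be sketched as a small wrapper around any probe command. This is only an illustration, not the code in this PR: `retry_probe` and `probe_url` are hypothetical names, and the probe assumes a curl build that supports `-f` and `--max-time`.

```shell
#!/bin/sh
# Hypothetical helper (not from this PR): run a probe command up to $1 times,
# sleeping $2 seconds between failed attempts.
# Returns 0 on the first success, 1 if every attempt fails.
retry_probe() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Only wait when another attempt will follow.
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Hypothetical probe: succeeds if the URL answers within 10 seconds
# without an HTTP error status (400 or above).
probe_url() {
    curl -s -f -o /dev/null --max-time 10 "$1"
}

# Example (performs network access, so it is left commented out):
# retry_probe 5 1 probe_url "https://example.org"
```

Passing the probe as a command keeps the retry policy separate from the transport, so the same loop can be exercised with a fake probe when no network is available.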

2) Inconsistent/undesired formats

Search and download flows could still select non-preferred formats (e.g. AZW3), despite user intent to prioritize EPUB/PDF.

3) Z-Library flow regressions

Recent Z-Lib behavior differs from older assumptions:

  • redirects commonly use /book/<hash> (not always /book/<id>/<hash>)
  • hashes may be mixed-case
  • eapi/book/<id>/<hash>/file can return Invalid hash for some entries even when logged in
  • some books are visible but blocked from download due to copyright complaints

This produced opaque failures in KindleFetch (e.g. generic Invalid hash).

What Changed

A) Core helper improvements (kindlefetch/bin/misc.sh)

  • Added retries + longer timeout in find_working_url():
    • up to 5 attempts
    • 10-second timeout
    • short delay between retries
  • Added normalize_title() for stable same-title matching.
  • Added select_preferred_format_book_info():
    • for a selected result/title/provider, choose EPUB first, then PDF
    • return no candidate if neither exists
  • Added zlib_md5_is_downloadable() with cache ($TMP_DIR/zlib_availability.cache):
    • probes Z-Lib entry by md5
    • detects copyright-complaint blocked pages
    • caches blocked/ok decisions to avoid repeated network checks
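
To illustrate the title normalization and EPUB-then-PDF preference described above, here is a minimal sketch. It is not the implementation in misc.sh: the function names and the `format|title|md5` line format are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical normalizer (not the PR's normalize_title): lowercase the title,
# turn every non-alphanumeric character into a space, then squeeze and trim spaces.
normalize_title_sketch() {
    printf '%s' "$1" \
        | tr '[:upper:]' '[:lower:]' \
        | sed 's/[^a-z0-9]/ /g' \
        | tr -s ' ' \
        | sed 's/^ //; s/ $//'
}

# Hypothetical selector: reads "format|title|md5" lines on stdin (assumed format)
# and prints the first EPUB line whose title matches $1, else the first PDF line.
# Prints nothing and returns 1 when neither format is available.
pick_preferred_format() {
    wanted=$(normalize_title_sketch "$1")
    epub=""
    pdf=""
    while IFS='|' read -r fmt title md5; do
        [ "$(normalize_title_sketch "$title")" = "$wanted" ] || continue
        case "$(printf '%s' "$fmt" | tr '[:upper:]' '[:lower:]')" in
            epub) [ -n "$epub" ] || epub="$fmt|$title|$md5" ;;
            pdf)  [ -n "$pdf" ]  || pdf="$fmt|$title|$md5" ;;
        esac
    done
    if [ -n "$epub" ]; then
        printf '%s\n' "$epub"
    elif [ -n "$pdf" ]; then
        printf '%s\n' "$pdf"
    else
        return 1
    fi
}
```

For example, `printf 'azw3|The Hobbit|aaa\npdf|The Hobbit|bbb\n' | pick_preferred_format "the hobbit"` selects the PDF line, because no EPUB candidate exists and AZW3 is never considered.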

B) Search filtering and source gating (kindlefetch/bin/search.sh)

  • Filter parsed Anna's results to EPUB/PDF only before presenting list entries.
  • At source selection time:
    • only expose zlib when a valid EPUB/PDF candidate exists and is not known blocked
    • otherwise show an explicit note that zlib is unavailable for that title
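
The gating above depends on cached availability decisions. A rough sketch of how such a cache could work follows; the one-pair-per-line file format, the `ok`/`blocked` status words, and all function names are assumptions for illustration, not the PR's actual code.

```shell
#!/bin/sh
# Assumed cache format: one "md5 status" pair per line, status "ok" or "blocked".
CACHE_FILE="${TMP_DIR:-/tmp}/zlib_availability.cache"

# Print the cached status for an md5, or return 1 if nothing is cached.
# When an md5 appears more than once, the last entry is used.
cache_lookup() {
    [ -f "$CACHE_FILE" ] || return 1
    status=$(awk -v key="$1" \
        '($1 "") == (key "") { s = $2 } END { if (s != "") print s }' "$CACHE_FILE")
    [ -n "$status" ] || return 1
    printf '%s\n' "$status"
}

# Append a decision to the cache.
cache_store() {
    printf '%s %s\n' "$1" "$2" >> "$CACHE_FILE"
}

# Return 0 if zlib should be offered for md5 $1.
# The remaining arguments name a probe command; it is run with the md5 as its
# last argument only on a cache miss, and must exit 0 when the entry is downloadable.
zlib_offerable() {
    md5=$1
    shift
    if cached=$(cache_lookup "$md5"); then
        [ "$cached" = "ok" ]
        return $?
    fi
    if "$@" "$md5"; then
        cache_store "$md5" ok
        return 0
    fi
    cache_store "$md5" blocked
    return 1
}
```

Because the probe only runs on a cache miss, repeated source-menu renders for the same title cost at most one network check per md5.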

C) LibGen downloader updates (kindlefetch/bin/downloads/lgli_download.sh)

  • Resolve selected item through provider-aware format preference helper.
  • Return clear message when no LibGen EPUB/PDF variant exists.
  • Normalize output extension casing.

D) Z-Library downloader hardening (kindlefetch/bin/downloads/zlib_download.sh)

  • Support both URL layouts:
    • legacy /book/<id>/<hash>
    • current /book/<hash>
  • Accept mixed-case hashes.
  • Extract missing numeric book_id from page HTML when needed.
  • If API downloadLink is missing/invalid, fall back to HTML /dl/... extraction.
  • Add explicit detection/message for copyright-complaint blocked entries.
  • Improve title/extension fallback behavior when API fields are missing.
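
As an illustration of handling both URL layouts, the sketch below splits a book URL into an optional numeric ID and a hash. It is a simplified stand-in rather than the code in zlib_download.sh: `parse_zlib_book_url` is a hypothetical name, and the sketch assumes hashes are plain alphanumerics and that nothing follows the hash except an optional query string, fragment, or trailing slash.

```shell
#!/bin/sh
# Hypothetical parser (not from this PR).
# Prints "<id> <hash>" for /book/<id>/<hash> and "- <hash>" for /book/<hash>.
# Returns 1 when the URL matches neither layout.
parse_zlib_book_url() {
    rest=${1#*/book/}
    [ "$rest" != "$1" ] || return 1    # no /book/ segment present
    rest=${rest%%\?*}                  # drop query string
    rest=${rest%%\#*}                  # drop fragment
    rest=${rest%/}                     # drop one trailing slash
    case "$rest" in
        */*/*)
            return 1                   # more path segments than either layout allows
            ;;
        */*)
            id=${rest%%/*}
            hash=${rest#*/}
            # Legacy layout: the ID must be purely numeric.
            case "$id" in ''|*[!0-9]*) return 1 ;; esac
            ;;
        *)
            id="-"                     # current layout carries no ID in the URL
            hash=$rest
            ;;
    esac
    # Accept mixed-case alphanumeric hashes only (assumption).
    case "$hash" in ''|*[!0-9A-Za-z]*) return 1 ;; esac
    printf '%s %s\n' "$id" "$hash"
}
```

When the ID comes back as `-`, a caller would still need to recover the numeric book_id some other way, for example from the book page HTML as described above.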

Why This Approach

  • Keeps behavior aligned with user-facing expectation: EPUB first, then PDF.
  • Moves volatile provider-specific checks into reusable helpers to reduce duplicated logic.
  • Handles provider drift (especially Z-Lib) without requiring user intervention.
  • Replaces ambiguous errors with actionable messages.

Validation Performed

  • Shell syntax checks on modified scripts.
  • Real on-device Kindle tests over SSH:
    • Anna search parsing/output
    • LibGen download success path
    • Z-Lib success path
    • Z-Lib blocked/copyright complaint path
    • previously failing Invalid hash case now handled more explicitly
  • Regression checks for search/source menu behavior after gating logic.

User-Visible Behavior Changes

  • Search result list excludes non-EPUB/PDF entries.
  • Downloader refuses titles without EPUB/PDF variants (instead of silently taking AZW3).
  • Z-Lib source may be hidden for titles known to be blocked/unusable.
  • Error messages are more specific for blocked Z-Lib items.

Files Changed

  • kindlefetch/bin/misc.sh
  • kindlefetch/bin/search.sh
  • kindlefetch/bin/downloads/lgli_download.sh
  • kindlefetch/bin/downloads/zlib_download.sh

tetrabit added 3 commits March 6, 2026 00:40
Improve core utility behavior used by setup/search/download flows:
  • Expand find_working_url to retry up to 5 times with a longer timeout before failing.
  • Add title normalization helper for cross-result matching.
  • Add select_preferred_format_book_info to prioritize EPUB then PDF variants for a title/provider pair.
  • Add zlib_md5_is_downloadable with lightweight cache to mark complaint-blocked Z-Library entries and avoid repeatedly probing the same MD5.

This commit centralizes selection and availability logic in misc.sh so call-sites can stay small and consistent.

Tighten search/result UX around preferred formats and availability:
  • Filter parsed Anna's Archive entries to EPUB/PDF only.
  • During source selection, gate zlib availability through provider-aware selection logic.
  • Show an explicit note when zlib is unavailable for the selected title (blocked or no EPUB/PDF copy).

This reduces dead-end source choices and aligns visible results with download format policy.

Update both provider downloaders to use preferred-format selection and improve reliability:
  • lgli_download now resolves the selected record through EPUB/PDF preference and emits a clear error when no compatible format is available.
  • zlib_download now resolves modern and legacy book URL shapes, extracts missing numeric IDs from the book page, and falls back to page-level /dl links when API downloadLink is absent/invalid.
  • zlib downloader now detects copyright-complaint blocks and returns an explicit message instead of surfacing opaque "Invalid hash" behavior.
  • Normalize extension handling/fallback when API metadata is incomplete.

Together these changes reduce false failures and keep downloads aligned with EPUB/PDF policy.

tetrabit commented Mar 6, 2026

Observation from on-device testing (not a code change in this PR):

  • Device: Kindle Paperwhite 2 (model PQ948KJ)
  • Firmware: 15.8.1

We observed TLS handshake failures when accessing Anna's Archive mirrors with the stock system curl/OpenSSL stack on this device. The typical error was:

SSL23_GET_SERVER_HELLO: tlsv1 alert protocol version

This caused mirror checks and requests to fail even when domains were otherwise reachable.

As a workaround during testing, replacing/upgrading curl to a modern build (curl 8.17.0 with OpenSSL 3.5.4) restored connectivity to affected endpoints.

Marking this here as an environment compatibility note for maintainers/readers of this PR.

I described how I fixed it in a reply on #40 (comment).
