Add BFI Festivals event scraper for film festival screenings #281

Merged
alistairjcbrown merged 1 commit into main from claude/epic-carson-hc9jb
Feb 23, 2026
@alistairjcbrown
Member

Summary

This PR adds a new event source for BFI Festivals (starting with the Flare festival) to scrape film screening information from the BFI website.

Key Changes

  • New source module: bfi.org.uk-bfi-festivals that retrieves and parses BFI festival film screening data
  • Retrieve functionality: Scrapes A-Z film listings and individual film pages using Playwright, extracting HTML and search result metadata
  • Event parsing: Extracts screening performances with dates, venues, and booking URLs from parsed HTML
  • Accessibility support: Parses and maps access screening types (relaxed screenings, audio descriptions, captions, etc.) to structured accessibility flags
  • Film metadata: Extracts directors, actors, classification, year, and duration from film info sections
  • Test suite: Includes comprehensive tests with Polly.js recording for HTTP interactions
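The accessibility mapping described above could look roughly like the following. This is a minimal sketch, not the code from this PR: the function name `mapAccessibility`, the flag names (`relaxed`, `audioDescription`, `captions`), and the label patterns are all assumptions made for illustration.

```javascript
// HYPOTHETICAL label patterns and flag names; the mapping in the PR may differ.
const ACCESS_PATTERNS = [
  { flag: "relaxed", pattern: /relaxed/i },
  { flag: "audioDescription", pattern: /audio[- ]descri/i },
  { flag: "captions", pattern: /caption/i },
];

// Turn a free-text access screening label into an object of boolean flags.
// Only flags whose pattern matches the label are set.
function mapAccessibility(label) {
  const flags = {};
  for (const { flag, pattern } of ACCESS_PATTERNS) {
    if (pattern.test(label)) flags[flag] = true;
  }
  return flags;
}
```

Keeping the patterns in a table means a new access screening type can be supported by adding one entry rather than another branch.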

Notable Implementation Details

  • Uses hardcoded array indices to extract data from window.articleContext.searchResults (indices confirmed from live data)
  • Parses dates for access screenings by inferring the year from performance entries
  • Filters screenings by cinema venue using existing venue matching utilities
  • Handles error pages and network timeouts gracefully during page retrieval
  • Deduplicates articles by stripping context_id parameters from URLs
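As an illustration of the deduplication step in the last bullet, here is a minimal sketch using the standard `URL` API. The helper names `stripContextId` and `dedupeArticleUrls` and the `base` argument are assumptions; only the `context_id` parameter name comes from the description above.

```javascript
// Remove the context_id query parameter so that links to the same
// article from different contexts normalise to the same string.
function stripContextId(href, base) {
  const url = new URL(href, base);
  url.searchParams.delete("context_id");
  return url.toString();
}

// Return the unique normalised URLs, preserving first-seen order.
function dedupeArticleUrls(hrefs, base) {
  const seen = new Set();
  for (const href of hrefs) {
    seen.add(stripContextId(href, base));
  }
  return [...seen];
}
```

A `Set` keeps insertion order, so the films stay in the order they appear in the listing.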

https://claude.ai/code/session_01GMhmjDMEe9PYJz1tg3mPXH

New source module for BFI festival screenings (BFI Flare for this
iteration). Fetches the festival A-Z listing page with Playwright,
extracts per-film articleContext.searchResults to get performances,
and matches venues against cinemas via venueMatchesCinema.
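To make the index-based extraction from articleContext.searchResults concrete, the sketch below assumes each search result is an array of fields. The index positions and field names in `FIELD_INDEX` are invented for illustration and are NOT the ones used in this PR; the real positions are hardcoded in the source and were confirmed against live data.

```javascript
// HYPOTHETICAL index positions, for illustration only.
const FIELD_INDEX = { startTime: 0, venue: 1, bookingUrl: 2 };

// Map one searchResults row (assumed to be an array) to a named object.
// Returns null for rows that are not arrays or are too short to read.
function rowToPerformance(row, fieldIndex = FIELD_INDEX) {
  const highestIndex = Math.max(...Object.values(fieldIndex));
  if (!Array.isArray(row) || row.length <= highestIndex) return null;
  return {
    startTime: row[fieldIndex.startTime],
    venue: row[fieldIndex.venue],
    bookingUrl: row[fieldIndex.bookingUrl],
  };
}
```

Naming the indices in one place keeps the dependency on the page's data layout visible, which helps if the upstream structure ever changes.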

- retrieve.js: Playwright fetch of A-Z page, deduplicates by article_id,
  extracts HTML + articleContext.searchResults from each film page
- find-events.js: Parses film info for overview, Access screenings section
  for per-performance accessibility flags, and searchResults for
  date/venue/booking URL; matches against cinema via venueMatchesCinema
- Tests wired up with isRecording=true pending first recording run

https://claude.ai/code/session_01GMhmjDMEe9PYJz1tg3mPXH
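The year inference used when parsing the Access screenings section could be sketched as below. This assumes access screening text carries a day and month name but no year (for example "Saturday 21 March 18:10") and that performance entries are available as `Date` objects; the function names and the text format are assumptions, and comparing in UTC is a simplification.

```javascript
const MONTHS = [
  "january", "february", "march", "april", "may", "june",
  "july", "august", "september", "october", "november", "december",
];

// Find the first "<day> <month name>" pair in the text.
// Returns { day, month } with a 0-based month, or null if none is found.
function parseDayMonth(text) {
  for (const m of text.matchAll(/(\d{1,2})\s+([A-Za-z]+)/g)) {
    const month = MONTHS.indexOf(m[2].toLowerCase());
    if (month !== -1) return { day: Number(m[1]), month };
  }
  return null;
}

// Infer the year for a year-less date by finding a performance that
// falls on the same day and month. Returns null when nothing matches.
function inferYear(accessDateText, performanceDates) {
  const parsed = parseDayMonth(accessDateText);
  if (!parsed) return null;
  const match = performanceDates.find(
    (d) => d.getUTCDate() === parsed.day && d.getUTCMonth() === parsed.month
  );
  return match ? match.getUTCFullYear() : null;
}
```

Returning null instead of guessing the current year means an access screening that has no matching performance is surfaced rather than silently misdated.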
@alistairjcbrown force-pushed the claude/epic-carson-hc9jb branch from 6ff02ca to df640d3 on February 23, 2026 17:32
@alistairjcbrown merged commit ea3e49a into main on Feb 23, 2026
1 check passed
@alistairjcbrown deleted the claude/epic-carson-hc9jb branch on February 23, 2026 17:58