GPU Price Scraper & Dashboard #1

@kchandan

🎯 Objective

Build a Python-based web scraper that collects GPU pricing data across multiple cloud providers and online marketplaces, saves it into a CSV file, and displays the results on a simple web page dashboard.


🔹 Data Sources

Target Sites (Cloud Providers + Marketplaces)

The intern should check real, public-facing pricing pages such as:


🔹 Technical Tasks

1. Web Scraper (Python)

  • Use requests + BeautifulSoup (or Selenium if a page only renders its prices in a browser).
  • Extract the GPU type plus either price/hour and region (cloud providers) or price/unit (retail).
  • Normalize field names (e.g., provider, gpu_model, price_usd, unit).
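The normalization step could be sketched as follows. This is a minimal illustration, not a prescribed implementation: the helper name normalize_row, the UNIT_ALIASES table, and the per_unit value for retail prices are all assumptions (the issue itself only shows per_hour). It takes text that has already been pulled out of a page and maps it onto the field names listed above.

```python
import re
from datetime import date

# Hypothetical mapping from scraped unit suffixes to schema unit values.
# "per_unit" is an assumed name for one-time retail prices.
UNIT_ALIASES = {
    "h": "per_hour",
    "hr": "per_hour",
    "hour": "per_hour",
    "unit": "per_unit",
    "each": "per_unit",
}

# Matches strings such as "$3.20/hr", "3.20 per hour", or "$1,599.00".
PRICE_RE = re.compile(r"\$?\s*(\d[\d,]*(?:\.\d+)?)\s*(?:/|per)?\s*([A-Za-z]*)")


def normalize_row(provider, gpu_model, raw_price, default_unit="per_unit", today=None):
    """Turn one scraped (provider, model, price string) triple into a schema row."""
    match = PRICE_RE.search(raw_price)
    if match is None:
        raise ValueError(f"could not parse price: {raw_price!r}")
    amount = float(match.group(1).replace(",", ""))
    suffix = match.group(2).lower()
    return {
        "timestamp": (today or date.today()).isoformat(),
        "provider": provider.strip(),
        "gpu_model": " ".join(gpu_model.split()),  # collapse stray whitespace
        "price_usd": round(amount, 2),
        "unit": UNIT_ALIASES.get(suffix, default_unit),
    }
```

A string with no recognizable number (for example "Contact sales") raises ValueError, so the caller can decide whether to skip or log that entry.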

2. CSV Output

  • Write to gpu_prices.csv with schema:

    timestamp | provider | gpu_model | price_usd | unit
    
  • Example row:

    2025-08-24 | AWS | NVIDIA A100 | 3.20 | per_hour
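One way to write this schema with the standard csv module is sketched below. It assumes the pipes in the schema above are only a display convention and that the file itself is comma-separated; the function name write_prices is hypothetical.

```python
import csv
import io

FIELDNAMES = ["timestamp", "provider", "gpu_model", "price_usd", "unit"]


def write_prices(rows, fileobj):
    """Write a header plus one line per row, with prices fixed at two decimals."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDNAMES)
    writer.writeheader()
    for row in rows:
        formatted = dict(row, price_usd=f"{float(row['price_usd']):.2f}")
        writer.writerow(formatted)


# The example row from the issue, written to an in-memory buffer.
rows = [
    {
        "timestamp": "2025-08-24",
        "provider": "AWS",
        "gpu_model": "NVIDIA A100",
        "price_usd": 3.20,
        "unit": "per_hour",
    }
]
buffer = io.StringIO()
write_prices(rows, buffer)
csv_text = buffer.getvalue()
```

To produce the real file, the same function can be called with open("gpu_prices.csv", "w", newline="") in place of the buffer.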
    

3. Web Page Dashboard

  • Build a simple Flask (or Streamlit) app.

  • Load gpu_prices.csv.

  • Show in:

    • Table view (sortable by provider, GPU).
    • Bar chart (cheapest GPU per provider).
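The data behind the "cheapest GPU per provider" chart can be computed before any Flask or Streamlit code is involved. The sketch below uses only the standard library; the function name cheapest_per_provider is hypothetical, and every sample row other than the AWS A100 row from the issue is made up purely for illustration.

```python
import csv
import io


def cheapest_per_provider(fileobj, unit="per_hour"):
    """Return {provider: (gpu_model, price_usd)} for the lowest price in one unit."""
    cheapest = {}
    for row in csv.DictReader(fileobj):
        if row["unit"] != unit:
            continue  # never compare hourly rentals with one-time purchases
        price = float(row["price_usd"])
        best = cheapest.get(row["provider"])
        if best is None or price < best[1]:
            cheapest[row["provider"]] = (row["gpu_model"], price)
    return cheapest


# Illustrative data only: provider names and prices below are placeholders.
SAMPLE_CSV = """timestamp,provider,gpu_model,price_usd,unit
2025-08-24,AWS,NVIDIA A100,3.20,per_hour
2025-08-24,AWS,NVIDIA T4,0.50,per_hour
2025-08-24,ExampleCloud,NVIDIA A100,2.90,per_hour
2025-08-24,ExampleShop,NVIDIA RTX 4090,1500.00,per_unit
"""

bars = cheapest_per_provider(io.StringIO(SAMPLE_CSV))
```

The resulting dictionary has one entry per provider, which maps directly onto one bar per provider in whichever charting layer the dashboard uses.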

🔹 Challenges for the Intern

  1. Dynamic pages → some sites require scraping HTML, others expose structured JSON.
  2. Currency normalization → convert all prices to USD if multiple currencies appear.
  3. Different units → hourly cloud rental vs one-time retail purchase.
  4. Automation → the script should be runnable daily (via cron or a Databricks job).
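The currency normalization challenge could be handled with a small lookup-based converter, sketched below. The rates are placeholder numbers, not real exchange rates, and the names RATES_TO_USD and to_usd are hypothetical; a real run would need rates from a source the project chooses.

```python
# Placeholder exchange rates to USD. These are NOT real market rates;
# they exist only so the example runs.
RATES_TO_USD = {"USD": 1.0, "EUR": 1.1, "INR": 0.012}


def to_usd(amount, currency, rates=RATES_TO_USD):
    """Convert an amount in the given currency to USD, rounded to cents."""
    try:
        rate = rates[currency.upper()]
    except KeyError:
        raise ValueError(f"no exchange rate configured for {currency!r}") from None
    return round(amount * rate, 2)
```

Failing loudly on an unknown currency keeps an unconverted price from silently landing in the price_usd column.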

🔹 Deliverables

  1. Python Scraper: gpu_scraper.py

    • Configurable list of providers & URLs.
    • Outputs gpu_prices.csv.
  2. Web Dashboard: app.py (Flask or Streamlit)

    • Table of scraped GPU prices.
    • Bar chart comparing providers.
  3. README.md

    • How to run the scraper.
    • How to launch the dashboard.
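For the "configurable list of providers & URLs" in gpu_scraper.py, one possible shape is a small list of config records, sketched below. The class name ProviderConfig, the kind field, the per_unit value, and the example.com URLs are all placeholders; the actual target sites are not named in this issue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderConfig:
    name: str
    url: str
    kind: str  # "cloud" (hourly rental) or "retail" (one-time purchase)

    @property
    def default_unit(self):
        """Unit assumed when a scraped price string carries no unit of its own."""
        return "per_hour" if self.kind == "cloud" else "per_unit"


# Placeholder entries only; replace with the real pricing pages.
PROVIDERS = [
    ProviderConfig("ExampleCloud", "https://example.com/gpu-pricing", "cloud"),
    ProviderConfig("ExampleShop", "https://example.com/gpus", "retail"),
]
```

Keeping this list separate from the scraping logic means adding a provider is a one-line change.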

🔹 Stretch Goals (Optional)

  • Store results historically (append to CSV) → analyze price trends over time.
  • Deploy the dashboard to Heroku or Render, or rebuild it with Databricks SQL dashboards.
  • Add alerts: flag when GPU prices drop below a threshold.
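The alert idea can be sketched as a simple filter over scraped rows. The function name find_price_alerts and the per-model thresholds mapping are assumptions made for this example; the issue does not say how thresholds should be configured.

```python
def find_price_alerts(rows, thresholds):
    """Return the rows whose price is below the threshold set for their GPU model.

    `thresholds` maps gpu_model to a USD limit; models without an entry
    never trigger an alert.
    """
    alerts = []
    for row in rows:
        limit = thresholds.get(row["gpu_model"])
        if limit is not None and float(row["price_usd"]) < limit:
            alerts.append(row)
    return alerts
```

For example, with a threshold of 3.50 for "NVIDIA A100", the example row from the CSV section (3.20 per_hour) would be flagged, while a threshold of 3.00 would not flag it.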
