-
Notifications
You must be signed in to change notification settings - Fork 1
Open
Description
GPU Price Scraper & Dashboard
🎯 Objective
Build a Python-based web scraper that collects GPU pricing data across multiple cloud providers and online marketplaces, saves it into a CSV file, and displays the results on a simple web page dashboard.
🔹 Data Sources
Target Sites (Cloud Providers + Marketplaces)
The intern should check real, public-facing pricing pages such as:
-
AWS EC2 Pricing (NVIDIA A100, V100, T4, H100 instances)
-
Google Cloud GPU Pricing
-
Azure GPU Pricing
-
Paperspace / CoreWeave (GPU Cloud)
-
Optional – Retail Sites (for hardware GPUs)
- Newegg: https://www.newegg.com/
- BestBuy: https://www.bestbuy.com/
🔹 Technical Tasks
1. Web Scraper (Python)
- Use
requests+BeautifulSoup(orseleniumif needed). - Extract GPU type, price/hour, region (if cloud), or price/unit (if retail).
- Normalize field names (e.g.,
provider,gpu_model,price_usd,unit).
2. CSV Output
-
Write to
gpu_prices.csvwith schema:timestamp | provider | gpu_model | price_usd | unit -
Example row:
2025-08-24 | AWS | NVIDIA A100 | 3.20 | per_hour
3. Web Page Dashboard
-
Build a simple Flask (or Streamlit) app.
-
Load
gpu_prices.csv. -
Show in:
- Table view (sortable by provider, GPU).
- Bar chart (cheapest GPU per provider).
🔹 Challenges for the Intern
- Dynamic pages → some sites require scraping HTML, others expose structured JSON.
- Currency normalization → convert all prices to USD if multiple currencies appear.
- Different units → hourly cloud rental vs one-time retail purchase.
- Automation → script should be runnable daily (via cron or Databricks job).
🔹 Deliverables
-
Python Scraper:
gpu_scraper.py- Configurable list of providers & URLs.
- Outputs
gpu_prices.csv.
-
Web Dashboard:
app.py(Flask or Streamlit)- Table of scraped GPU prices.
- Bar chart comparing providers.
-
README.md
- How to run the scraper.
- How to launch the dashboard.
🔹 Stretch Goals (Optional)
- Store results historically (append to CSV) → analyze price trends over time.
- Deploy dashboard to Heroku/Render/Databricks SQL + dashboarding.
- Add alerts: flag when GPU prices drop below a threshold.
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels