A deep learning-based system for detecting deepfake images using a fine-tuned SWIN Transformer (Shifted Window Transformer). The model classifies facial images as Real or Fake with confidence scores and is deployed as a live web application.
B.Tech Major Project by Purna Chandar Konda
🤗 View Trained Model on Hugging Face Hub →
- About
- How It Works
- Project Structure
- Installation
- Usage
- Model Training
- Results
- Live Demo
- Tech Stack
- License
- Acknowledgments
The rapid advancement of deepfake technology has made it increasingly difficult to distinguish real facial images from manipulated ones, posing a serious threat to digital media integrity, cybersecurity, and trust in online content.
This project addresses the problem by leveraging the SWIN Transformer architecture, a hierarchical vision transformer that uses shifted window-based self-attention, to detect and classify manipulated facial images with high accuracy.
Key highlights:
- Fine-tuned SWIN-Tiny model (microsoft/swin-tiny-patch4-window7-224) pretrained on ImageNet-1K
- Trained on 190,000+ real and fake face images from the Deepfake and Real Images dataset
- Deployed as a live Gradio web application on Hugging Face Spaces
- Includes a one-click Google Colab training notebook for easy reproducibility
```
┌─────────────┐     ┌──────────────────┐     ┌─────────────────┐     ┌───────────────┐
│ Input Image │────▶│  Preprocessing   │────▶│ SWIN Transformer│────▶│ Classification│
│  (224×224)  │     │  (Resize, Norm)  │     │  (Feature Ext.) │     │    Output     │
└─────────────┘     └──────────────────┘     └─────────────────┘     └───────────────┘
                                                                             │
                                                                             ▼
                                                                   ┌──────────────────┐
                                                                   │   Real / Fake    │
                                                                   │  + Confidence %  │
                                                                   └──────────────────┘
```
Pipeline:
- Input → A facial image is uploaded through the web interface (any resolution).
- Preprocessing → The image is resized to 224×224 pixels and normalized using ImageNet mean/std values.
- Feature Extraction → The SWIN-Tiny transformer processes the image through 4 hierarchical stages using shifted window multi-head self-attention, extracting both local and global features.
- Classification → A linear classification head maps the extracted 768-dimensional feature vector to 2 output classes (Real / Fake) using softmax probabilities.
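The preprocessing and classification math above can be sketched in plain NumPy. This is a minimal illustration only: the deployed app uses the HuggingFace image processor, and the Real/Fake label order shown here is an assumption.

```python
import numpy as np

# ImageNet normalization constants (per RGB channel), as used for SWIN preprocessing
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(image: np.ndarray) -> np.ndarray:
    """Scale an HxWx3 uint8 image to [0, 1] and normalize per channel."""
    x = image.astype(np.float32) / 255.0
    return (x - IMAGENET_MEAN) / IMAGENET_STD

def classify(logits: np.ndarray) -> dict:
    """Convert 2-class logits to softmax probabilities (label order assumed)."""
    exp = np.exp(logits - logits.max())  # subtract max for numerical stability
    probs = exp / exp.sum()
    return {"Real": float(probs[0]), "Fake": float(probs[1])}

# A dummy 224x224 grey image stands in for a resized face crop
dummy = np.full((224, 224, 3), 128, dtype=np.uint8)
pixels = preprocess(dummy)
scores = classify(np.array([2.0, -1.0]))
```

In the real model the logits come from the linear head on the 768-dimensional SWIN feature vector; only the softmax step is reproduced here.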
```
DeepfakeDetectionUsingSWINTransformer/
├── app.py                          # Gradio web app (deployed on HF Spaces)
├── demo.py                         # Quick local demo using HF pipeline
├── deploy_to_spaces.py             # One-click deployment script for HF Spaces
├── train_on_colab.ipynb            # Google Colab training notebook (recommended)
├── swin-tiny-complete-training.py  # Local training script (requires GPU)
├── model-testing.py                # Model evaluation script
├── image_extractor.py              # Frame extraction from video datasets
├── models/
│   └── swin-tiny-complete/         # Model configuration files
│       ├── config.json
│       └── preprocessor_config.json
├── requirements.txt
├── .gitignore
├── LICENSE
└── README.md
```
Note: Model weights (~110 MB) are hosted on Hugging Face Hub and are automatically downloaded when you run the app.
- Python 3.10+
- Git
- (Optional) NVIDIA GPU with CUDA for local training
```bash
# Clone the repository
git clone https://github.com/Purnachander-Konda/DeepfakeDetectionUsingSWINTransformer.git
cd DeepfakeDetectionUsingSWINTransformer

# Create and activate virtual environment
python -m venv venv
source venv/bin/activate   # Linux/Mac
venv\Scripts\activate      # Windows

# Install dependencies
pip install -r requirements.txt
```

The trained model weights are hosted on Hugging Face Hub and are downloaded automatically when you run app.py. For manual download:

```bash
pip install huggingface-hub
huggingface-cli download Purnachander-Konda/deepfake-detection-swin --local-dir ./models/swin-tiny-complete
```

Run the web app:

```bash
python app.py
```

Open http://localhost:7860 in your browser and upload any face image to get a Real/Fake prediction with confidence scores.

Run the quick local demo:

```bash
python demo.py
```

Evaluate the model:

```bash
python model-testing.py
```

The fastest way to train, with no local GPU needed:
- Open the notebook in Colab
- Set Runtime → GPU (T4)
- Run all cells; the notebook handles everything:
- Downloads the 190K+ image dataset
- Fine-tunes SWIN-Tiny for 3 epochs (~30-60 min)
- Evaluates and prints metrics
- Uploads the trained model to Hugging Face Hub
Requires an NVIDIA GPU and the dataset:
```bash
python swin-tiny-complete-training.py
```

Training configuration:

| Parameter | Value |
|---|---|
| Base Model | microsoft/swin-tiny-patch4-window7-224 |
| Dataset | Hemg/deepfake-and-real-images (190K+ images) |
| Train/Test Split | 80/20 (stratified) |
| Learning Rate | 2e-5 |
| Batch Size | 16 (×2 gradient accumulation = effective 32) |
| Epochs | 3 |
| Optimizer | AdamW (weight decay: 0.01) |
| Precision | FP16 mixed precision |
| Checkpointing | Gradient checkpointing enabled |
| Total Parameters | ~27.5M |
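The hyperparameters in the table can be collected as a plain Python dict. The key names mirror HuggingFace TrainingArguments fields, but this is a sketch of the configuration, not the actual training script:

```python
# Hyperparameters from the table above; key names mirror HuggingFace
# TrainingArguments fields, though the actual script may differ.
train_config = {
    "model_name": "microsoft/swin-tiny-patch4-window7-224",
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "gradient_accumulation_steps": 2,
    "num_train_epochs": 3,
    "weight_decay": 0.01,
    "fp16": True,                    # mixed-precision training
    "gradient_checkpointing": True,  # trade extra compute for lower memory
}

# Effective batch size = per-device batch size x accumulation steps
effective_batch = (train_config["per_device_train_batch_size"]
                   * train_config["gradient_accumulation_steps"])  # 32
```

Gradient accumulation lets a 16-image batch on a single T4 behave like a batch of 32 by summing gradients over two steps before each optimizer update.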
Evaluation metrics on the held-out test set (20% of dataset, ~38K images):
| Metric | Score |
|---|---|
| Accuracy | 0.9881 |
| F1 Score (Macro) | 0.9881 |
| Precision (Macro) | 0.9881 |
| Recall (Macro) | 0.9881 |
The model achieves strong performance on binary deepfake classification. Metrics were computed using HuggingFace Evaluate on the stratified test split.
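As a sanity check on what "macro" averaging means in the table above, macro F1 can be reproduced in a few lines of plain Python. This is a toy illustration on made-up labels; the project computes its metrics with HuggingFace Evaluate:

```python
# Macro F1: compute F1 per class, then average the per-class scores equally.
def macro_f1(y_true, y_pred, classes=(0, 1)):
    f1s = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# Toy labels: 0 = Real, 1 = Fake (ordering assumed for illustration)
y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
score = macro_f1(y_true, y_pred)  # (2/3 + 4/5) / 2 = 11/15
```

Because the dataset's Real/Fake split is close to balanced, the macro scores land very close to accuracy, which is why all four reported metrics agree to four decimal places.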
The model is deployed as a Gradio web application on Hugging Face Spaces:
🔗 https://huggingface.co/spaces/Purnachander-Konda/deepfake-detection-swin
Features:
- Upload any face image for instant Real/Fake classification
- Confidence scores for both classes
- No sign-up or installation required
| Component | Technology |
|---|---|
| Model | SWIN Transformer (Tiny variant) |
| Framework | PyTorch + HuggingFace Transformers |
| Web Interface | Gradio |
| Dataset | Deepfake and Real Images (190K+ images) |
| Evaluation | HuggingFace Evaluate (Accuracy, F1, Precision, Recall) |
| Training | Google Colab (T4 GPU) |
| Model Hosting | Hugging Face Hub |
| App Hosting | Hugging Face Spaces |
This project is licensed under the MIT License; see the LICENSE file for details.
- Microsoft Research for the SWIN Transformer architecture
- Hemg for the Deepfake and Real Images dataset
- Hugging Face for Transformers library, model hosting, and Spaces
- Gradio for the web interface framework
- Google Colab for free GPU access
Built by Purna Chandar Konda