Fact Checking System: Information Credibility Verification


PyPI version DOI Python 3.8+ License: MIT Open In Colab Kaggle Buy me a coffee Sponsor on GitHub

PhD Thesis Prototype - Dominique S. Loyer
Citation Key: loyerModelingHybridSystem2025

Note

New in v2.2 (Jan 29, 2026):

  • GraphRAG: Contextual memory from Knowledge Graph.
  • Interactive Graph: D3.js visualization with physics and details on click.
  • Cloud Ready: Docker & Supabase integration.

📋 Overview

A neuro-symbolic AI system for verifying information credibility that combines:

  • Symbolic AI: Rule-based reasoning with OWL ontologies (RDF/Turtle)
  • Neural AI: Transformer models for sentiment analysis and NER
  • IR Engine: BM25, TF-IDF, and PageRank estimation

The system provides explainable credibility scores (High/Medium/Low) with a detailed factor breakdown.


🚀 Quick Start (v2.2 - January 2026)

Installation via PyPI (Recommended)

Option 1: Minimal Installation (Lightweight, ~100 MB)

Suited to exploring the code and running basic credibility checks without ML features:

pip install syscred

Option 2: With Machine Learning (Complete, ~2.5 GB)

Includes PyTorch, Transformers, and all ML models for full credibility analysis:

pip install "syscred[ml]"

Option 3: Full Installation (All features)

Includes ML, production tools, and development dependencies:

pip install "syscred[all]"

Alternative: Run on Kaggle/Colab

  1. Click the Kaggle or Colab badge above
  2. Enable GPU runtime
  3. Run All cells

Alternative: Local Installation with Docker

# Clone the repository
git clone https://github.com/DominiqueLoyer/systemFactChecking.git
cd systemFactChecking/02_Code

# Run with Startup Script (Mac/Linux)
./start_syscred.sh
# Access at http://localhost:5001

Python API Usage

from syscred import CredibilityVerificationSystem

# Initialize
system = CredibilityVerificationSystem()

# Verify a URL
result = system.verify_information("https://www.lemonde.fr/article")
print(f"Score: {result['scoreCredibilite']} ({result['niveauCredibilite']})")

# Verify text directly
result = system.verify_information(
    "According to Harvard researchers, the new study shows significant results."
)

📡 REST API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/verify` | POST | Full credibility verification |
| `/api/seo` | POST | SEO analysis only (faster) |
| `/api/ontology/stats` | GET | Ontology statistics |
| `/api/health` | GET | Server health check |

Example Request

curl -X POST http://localhost:5000/api/verify \
  -H "Content-Type: application/json" \
  -d '{"input_data": "https://www.bbc.com/news/article"}'

Example Response

{
  "scoreCredibilite": 0.78,
  "niveauCredibilite": "HIGH",
  "analysisDetails": {
    "sourceReputation": "High",
    "domainAge": 9125,
    "sentiment": {"label": "NEUTRAL", "score": 0.52},
    "entities": [{"word": "BBC", "entity_group": "ORG"}]
  }
}
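
The request and response above can also be handled from Python. The following is a minimal sketch using only the standard library, assuming the server is reachable at `http://localhost:5000` as in the curl example. The helper names `build_verify_request`, `verify`, and `summarize` are hypothetical and are not part of the `syscred` package.

```python
import json
import urllib.request

# Assumption: the server listens on localhost:5000, as in the curl example.
BASE_URL = "http://localhost:5000"


def build_verify_request(input_data, base_url=BASE_URL):
    """Build (but do not send) a POST request for /api/verify."""
    body = json.dumps({"input_data": input_data}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/verify",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def verify(input_data, base_url=BASE_URL, timeout=30):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_verify_request(input_data, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.loads(reply.read().decode("utf-8"))


def summarize(response):
    """Format the level and score fields shown in the example response."""
    score = response["scoreCredibilite"]
    level = response["niveauCredibilite"]
    return f"{level} ({score:.2f})"
```

With the server running, `summarize(verify("https://www.bbc.com/news/article"))` would return a short string built from the `niveauCredibilite` and `scoreCredibilite` fields of the response.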

πŸ“ Project Structure

systemFactChecking/ β”œβ”€β”€ README.md # This file β”œβ”€β”€ 01_Presentations/ # Presentations (.pdf, .tex) β”œβ”€β”€ 02_Code/ # Source Code & Docker β”‚ β”œβ”€β”€ syscred/ # ⭐ CORE ENGINE (v2.2) β”‚ β”‚ β”œβ”€β”€ graph_rag.py # [NEW] GraphRAG Module β”‚ β”‚ β”œβ”€β”€ verification_system.py β”‚ β”‚ β”œβ”€β”€ database.py # [NEW] Supabase Connector β”‚ β”‚ └── ... β”‚ β”œβ”€β”€ start_syscred.sh # Startup Script β”‚ β”œβ”€β”€ Dockerfile # Deployment Config β”‚ └── requirements.txt β”œβ”€β”€ 03_Docs/ # Documentation (.pdf) └── 04_Bibliography/ # References (.bib, .pdf)


---

## 🔧 Configuration

Set environment variables or edit `02_Code/v2_syscred/config.py`:

```bash
# Optional: Google Fact Check API key
export SYSCRED_GOOGLE_API_KEY=your_key_here

# Server settings
export SYSCRED_PORT=5000
export SYSCRED_DEBUG=true
export SYSCRED_ENV=production  # or development, testing
```
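
As an illustration of how these variables might be consumed, here is a minimal sketch of a settings loader. It is not the project's actual `config.py`: the `Settings` class, the `load_settings` function, and the default values used when a variable is unset are all assumptions made for this example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Environment names listed in the README comment for SYSCRED_ENV.
VALID_ENVS = ("production", "development", "testing")


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the four SYSCRED_* variables."""
    google_api_key: Optional[str]
    port: int
    debug: bool
    env: str


def load_settings(environ: Optional[Mapping[str, str]] = None) -> Settings:
    """Read SYSCRED_* variables; the defaults here are assumptions."""
    source = os.environ if environ is None else environ
    env = source.get("SYSCRED_ENV", "production")
    if env not in VALID_ENVS:
        raise ValueError(f"SYSCRED_ENV must be one of {VALID_ENVS}, got {env!r}")
    debug_raw = source.get("SYSCRED_DEBUG", "false").strip().lower()
    return Settings(
        # An unset or empty key is treated as "no key configured".
        google_api_key=source.get("SYSCRED_GOOGLE_API_KEY") or None,
        port=int(source.get("SYSCRED_PORT", "5000")),
        debug=debug_raw in ("1", "true", "yes"),
        env=env,
    )
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to test, for example `load_settings({"SYSCRED_PORT": "5001"})`.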

📊 Credibility Scoring

The system uses weighted factors to calculate credibility:

| Factor | Weight | Description |
|--------|--------|-------------|
| Source Reputation | 25% | Known credible sources database |
| Domain Age | 10% | WHOIS lookup for domain history |
| Sentiment Neutrality | 15% | Extreme sentiment = lower score |
| Entity Presence | 15% | Named entities (ORG, PER) |
| Text Coherence | 15% | Vocabulary diversity |
| Fact Check | 20% | Google Fact Check API results |

Thresholds:

  • HIGH: Score ≥ 0.7
  • MEDIUM: 0.4 ≤ Score < 0.7
  • LOW: Score < 0.4
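
The weights and thresholds above can be combined as a weighted sum followed by a threshold lookup. The sketch below assumes each factor has already been normalized to a value between 0 and 1; that normalization, the dictionary keys, and the function names are assumptions for illustration and may differ from the actual SysCRED implementation.

```python
# Weights taken from the factor table; the key names are hypothetical.
WEIGHTS = {
    "source_reputation": 0.25,
    "domain_age": 0.10,
    "sentiment_neutrality": 0.15,
    "entity_presence": 0.15,
    "text_coherence": 0.15,
    "fact_check": 0.20,
}


def credibility_score(factors):
    """Weighted sum of factor values, each assumed to lie in [0, 1]."""
    missing = set(WEIGHTS) - set(factors)
    if missing:
        raise ValueError(f"missing factors: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0.0 <= factors[name] <= 1.0:
            raise ValueError(f"factor {name!r} must be between 0 and 1")
    return sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)


def credibility_level(score):
    """Map a score to HIGH, MEDIUM, or LOW using the listed thresholds."""
    if score >= 0.7:
        return "HIGH"
    if score >= 0.4:
        return "MEDIUM"
    return "LOW"


# Illustrative factor values only; they are not measurements from the system.
example_factors = {
    "source_reputation": 0.9,
    "domain_age": 1.0,
    "sentiment_neutrality": 0.8,
    "entity_presence": 0.6,
    "text_coherence": 0.7,
    "fact_check": 0.7,
}
example_score = credibility_score(example_factors)
example_level = credibility_level(example_score)
```

With these illustrative values the weighted sum is about 0.78, which falls in the HIGH band. Because the weights add up to 100%, the score stays in the same 0 to 1 range as the individual factors.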

📚 Documentation & Papers


🏷️ Citation

@software{loyer2025syscred,
  author = {Loyer, Dominique S.},
  title = {SysCRED: Neuro-Symbolic System for Information Credibility Verification},
  year = {2026},
  publisher = {GitHub},
  url = {https://github.com/DominiqueLoyer/systemFactChecking}
}

📜 License

MIT License - See LICENSE for details.


🔄 Version History

| Version | Date | Changes |
|---------|------|---------|
| v2.0 | Jan 2026 | Complete rewrite with modular architecture, Kaggle/Colab support, REST API |
| v1.0 | Apr 2025 | Initial prototype with basic credibility scoring |