Fix scheduler stepping and label dtype handling in training loop #156

Open
Apprentice2907 wants to merge 1 commit into ML4SCI:main from
Apprentice2907:fix/scheduler-dtype-clean
Conversation

@Apprentice2907

Fixes #152

Changes Made

  1. Scheduler Fix

    • Replaced scheduler.step(loss) with scheduler.step()
    • CosineAnnealingWarmRestarts takes no metric argument; a plain step() call is expected (only ReduceLROnPlateau steps on a loss value)
  2. Label dtype Fix

    • Replaced labels.type(torch.LongTensor).to(device) with labels.long().to(device)
    • The old call always materializes a CPU LongTensor before moving it, forcing an unnecessary CPU→GPU transfer when labels already sit on the GPU
    • The new approach casts the dtype without changing the tensor's device, so no extra transfer occurs
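The two fixes above can be sketched in a minimal training step. The model, optimizer settings, and batch shapes here are hypothetical stand-ins, not the actual contents of train.py:

```python
import torch
from torch import nn
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts

# Hypothetical stand-ins for the real model and data in train.py
model = nn.Linear(4, 3)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=10)
criterion = nn.CrossEntropyLoss()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

inputs = torch.randn(8, 4).to(device)
labels = torch.randint(0, 3, (8,), dtype=torch.int32)

# .long() returns an int64 tensor on the same device as the input;
# the old labels.type(torch.LongTensor) would force a CPU tensor first.
labels = labels.long().to(device)

optimizer.zero_grad()
loss = criterion(model(inputs), labels)
loss.backward()
optimizer.step()
# CosineAnnealingWarmRestarts is stepped without arguments;
# passing the loss is only valid for ReduceLROnPlateau.
scheduler.step()
```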

Files Changed

  • DeepLense_Classification_Transformers_Archil_Srivastava/train.py

Development

Successfully merging this pull request may close these issues.

Incorrect scheduler stepping and label dtype handling in training loop