
Fix scheduler stepping and label dtype handling in training loop#153

Open
SarthakJagota wants to merge 1 commit into ML4SCI:main from SarthakJagota:fix/training-scheduler-dtype

Conversation

@SarthakJagota SarthakJagota commented Feb 24, 2026

This PR introduces two small training stability improvements:

• Replaced scheduler.step(loss) with scheduler.step() for compatibility with schedulers such as CosineAnnealingWarmRestarts, which do not take a loss/metric argument (unlike ReduceLROnPlateau).
• Replaced labels.type(torch.LongTensor).to(device) with labels.long().to(device) to avoid forcing an intermediate CPU tensor and to keep device handling consistent.

These changes do not conceptually alter training behavior, but they improve the correctness and stability of the training loop.
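For context, the two fixes can be sketched as a minimal training-loop fragment (the model, data, and hyperparameters below are hypothetical placeholders, not taken from the repository):

```python
import torch
import torch.nn as nn

# Hypothetical toy model and batch, just to illustrate the two fixes.
model = nn.Linear(4, 3)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
# CosineAnnealingWarmRestarts.step() takes an optional epoch index, not a
# metric, so scheduler.step(loss) would silently misuse the loss value.
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=10)
criterion = nn.CrossEntropyLoss()

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

inputs = torch.randn(8, 4)
labels = torch.randint(0, 3, (8,), dtype=torch.int32)  # e.g. loaded as int32

inputs = inputs.to(device)
# labels.long() casts the dtype wherever the tensor lives; the old
# labels.type(torch.LongTensor) would always materialize a CPU tensor first.
labels = labels.long().to(device)

optimizer.zero_grad()
loss = criterion(model(inputs), labels)
loss.backward()
optimizer.step()
scheduler.step()  # fixed: no loss argument
```

With ReduceLROnPlateau, by contrast, scheduler.step(loss) is the correct call, which is why the two forms are easy to mix up.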

#152

