CNNT: dynamic positional embeddings to support variable input sizes #150

Open
SarthakJagota wants to merge 1 commit into ML4SCI:main from SarthakJagota:fix/cnnt-dynamic-pos-embedding

@SarthakJagota
This PR removes the hard-coded patch count used to initialize positional embeddings in the CNNT model and instead creates them dynamically based on the CNN output sequence length.

Changes

  1. Compute patch sequence length from CNN feature maps

  2. Create positional embeddings dynamically to avoid shape mismatches

  3. Properly register positional embeddings so they are tracked by the optimizer

  4. Maintain existing behavior for default input configurations
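The changes listed above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code in this PR: `TinyCNNT`, its layer sizes, and the `ref_size` argument are hypothetical, and it uses one common technique (a learnable embedding grid sized from a dummy forward pass, then interpolated when the feature map differs), which may not match the PR's approach.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyCNNT(nn.Module):
    """Hypothetical stand-in for CNNT; layer sizes are illustrative only."""

    def __init__(self, in_channels=1, embed_dim=32, ref_size=(64, 64)):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(in_channels, embed_dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(embed_dim, embed_dim, kernel_size=3, stride=2, padding=1),
        )
        # Infer the feature-map grid from the CNN itself instead of
        # hard-coding a patch count.
        with torch.no_grad():
            feat = self.cnn(torch.zeros(1, in_channels, *ref_size))
        grid = tuple(feat.shape[-2:])
        # Assigning an nn.Parameter as a module attribute registers it, so it
        # shows up in model.parameters() and is seen by the optimizer.
        self.pos_embed = nn.Parameter(torch.zeros(1, embed_dim, *grid))
        nn.init.trunc_normal_(self.pos_embed, std=0.02)

    def forward(self, x):
        feat = self.cnn(x)  # (B, C, H', W')
        pos = self.pos_embed
        if feat.shape[-2:] != pos.shape[-2:]:
            # Resize the learned grid to the current feature map so other
            # input resolutions do not cause a shape mismatch.
            pos = F.interpolate(
                pos, size=tuple(feat.shape[-2:]), mode="bilinear", align_corners=False
            )
        # Flatten the spatial grid into a token sequence: (B, H'*W', C)
        return (feat + pos).flatten(2).transpose(1, 2)


model = TinyCNNT()
print(model(torch.zeros(2, 1, 64, 64)).shape)  # default resolution
print(model(torch.zeros(2, 1, 96, 96)).shape)  # larger input, same weights
```

Creating the parameter in `__init__` rather than lazily inside `forward` matters for the optimizer: a parameter that only comes into existence after the optimizer has been constructed would not be in its parameter groups.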

Motivation

The previous implementation assumed a fixed input resolution, which limited reuse across DeepLense tasks and could lead to runtime errors when image sizes change.
This update makes CNNT resolution-agnostic and improves model flexibility.
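To illustrate why a fixed patch count breaks when the image size changes, here is a small sketch that applies the standard convolution output-size formula (dilation 1). The layer stack in `LAYERS` is a hypothetical example, not CNNT's actual configuration.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of one conv layer along one axis (dilation = 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def patch_count(height, width, layers):
    """Number of tokens after flattening the final feature map."""
    for kernel, stride, padding in layers:
        height = conv_out(height, kernel, stride, padding)
        width = conv_out(width, kernel, stride, padding)
    return height * width


# Hypothetical backbone: two 3x3 convs with stride 2 and padding 1.
LAYERS = [(3, 2, 1), (3, 2, 1)]

print(patch_count(64, 64, LAYERS))    # 256
print(patch_count(150, 150, LAYERS))  # 1444
```

A positional embedding initialized for 256 tokens cannot be added to a 1444-token sequence, which is the kind of shape mismatch described above.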

#149
