[Performance + VRAM] Re-implement 16-byte Vertex Format + use 4 Byte alignment #487
thr3343 wants to merge 71 commits into xCollateral:1.21
|
This doesn't seem to improve performance on Nvidia RTX 2000+, while older tests with alternate versions of the 16-byte format did. I suspect this is due to alignment again.
//2 Byte aligned: At least 10% more FPS on Nvidia Turing+, but has performance regressions on GCN
layout (location = 0) in ivec4 Position;
layout (location = 1) in vec4 Color;
layout (location = 2) in uvec2 UV0;
//4 Byte aligned: Slower due to not hitting the fast FP16 path on Nvidia RTX 2000+
layout (location = 0) in ivec4 Position;
layout (location = 1) in vec4 Color;
layout (location = 2) in uint UV0;
This is problematic, as this PR should improve performance on Nvidia, not decrease it. So I will mark it as a draft until the missing performance uplift on Nvidia is fixed.
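To make the difference between the two declarations concrete, here is a minimal Python sketch of one 16-byte vertex under each layout. The component widths are assumptions, not taken from the PR: Position as four signed 16-bit integers, Color as four unsigned bytes, and UV0 as either two unsigned 16-bit integers or one packed 32-bit integer. The helper names are hypothetical.

```python
import struct

# Assumed widths (not confirmed by the PR):
#   Position: 4 x int16 = 8 bytes, Color: 4 x uint8 = 4 bytes, UV0: 4 bytes
LAYOUT_2_BYTE = "<4h4B2H"  # UV0 as two uint16 (uvec2 in the shader)
LAYOUT_4_BYTE = "<4h4BI"   # UV0 as one packed uint32 (uint in the shader)


def pack_2_byte(pos, color, u, v):
    """Pack a vertex with UV0 stored as two separate 16-bit values."""
    return struct.pack(LAYOUT_2_BYTE, *pos, *color, u, v)


def pack_4_byte(pos, color, u, v):
    """Pack a vertex with UV0 merged into one 32-bit value (u in the low half)."""
    return struct.pack(LAYOUT_4_BYTE, *pos, *color, u | (v << 16))


def unpack_uv(packed):
    """Split a packed 32-bit UV0 back into (u, v), as a shader would."""
    return packed & 0xFFFF, packed >> 16
```

Under these assumptions both layouts occupy 16 bytes and, with little-endian packing, produce identical bytes, so the two variants would differ only in how the UV0 attribute is declared to the GPU.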
Select 2 Byte alignment by default (including Nvidia), otherwise use 4 Bytes on AMD (GCN)
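The selection rule in the commit above could be sketched as follows, in Python for brevity. The PCI vendor IDs are the standard ones for AMD and Nvidia; the `is_gcn` flag and the function name are hypothetical, since the PR does not show how GCN hardware is detected.

```python
VENDOR_AMD = 0x1002     # PCI vendor ID for AMD
VENDOR_NVIDIA = 0x10DE  # PCI vendor ID for Nvidia


def select_vertex_alignment(vendor_id: int, is_gcn: bool) -> int:
    """Return the vertex attribute alignment in bytes.

    2 bytes is the default (including Nvidia); 4 bytes is used only
    for AMD GPUs flagged as GCN. How `is_gcn` is determined is left
    open here and is an assumption of this sketch.
    """
    if vendor_id == VENDOR_AMD and is_gcn:
        return 4
    return 2
```

With this rule, `select_vertex_alignment(VENDOR_NVIDIA, False)` gives 2 and `select_vertex_alignment(VENDOR_AMD, True)` gives 4.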
- Refactor - Fix biome blending not applied in some cases
- Implement BlockColorRegistry for fast look up - Refactor
- refactor shader loading
- Refactor CommandPool class
- Refactor
- Add gl method overwrites
- Refactor buffer
|
Hey, could this be updated? It fixes all the VRAM issues I have with the mod (with the newest version I can't even get to a render distance of 96, but with this I can get to 164).
|
My bad, I made a typo in the terrain shader (an incorrect multiplication), which caused the GPU to use the wrong UVs for the block atlas.
- Refactor
- Fix: normalize normal vector if needed
|
This has been added into upstream as of b35e7a3. The PR has served its purpose, so I will close it. If I'm not mistaken, Nvidia Turing or later needs 2-byte alignment (i.e. not using 4-byte attributes in the vertex format) for optimal performance, but I never confirmed this for sure (I currently lack the free time to do community test builds).

The 16-byte vertex format for terrain was planned in earlier versions of VulkanMod, but was scrapped because it caused performance regressions on AMD's GCN architecture.
This patch works around that by doubling the vertex byte alignment from 2 bytes to 4, which fixes the regression and allows the 16-byte format to be used on GCN at full performance.
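As a rough illustration of the VRAM side, the per-vertex saving is simple arithmetic. The sketch below uses only the two vertex sizes named in this PR; the function names are hypothetical and the vertex count in the usage note is an arbitrary example, not a measurement.

```python
OLD_VERTEX_BYTES = 20  # current terrain vertex format
NEW_VERTEX_BYTES = 16  # format proposed in this PR


def vertex_buffer_bytes(vertex_count: int, vertex_bytes: int) -> int:
    """Raw vertex data size, ignoring any allocator overhead."""
    return vertex_count * vertex_bytes


def saving_fraction() -> float:
    """Fraction of vertex data saved by the smaller format."""
    return 1 - NEW_VERTEX_BYTES / OLD_VERTEX_BYTES
```

For example, 1,000,000 vertices would take 20,000,000 bytes in the old format and 16,000,000 bytes in the new one, a 20% reduction in raw vertex data.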
The 16-byte vertex format provides the following advantages over the current 20-byte format: