Support GEMM on CPU & MetaX and Add Generic Dispatcher #1
TL;DR: This PR primarily extends support for the GEMM (`gemm`) operator to MetaX and adds a generic dispatcher.

Key Changes

- Multi-platform Support:
  - `mcblas` implementation of `gemm` on MetaX, along with its example program.
  - [...] `gemm` and is tested on MetaX.
- BLAS Abstraction: Abstract a `blas.h`, which contains the common framework for calling the BLAS library across different platforms (currently CUDA-like ones).
- Generic Dispatcher: Add the generic dispatcher in `src/dispatcher.h`, which provides both the core dispatcher and some specialized interfaces/APIs.
- Codebase Refactoring:
  - `DataType`: change it from a `class` to an `enum class` and update the relevant calling code.
  - `common/` directory: contains common constructs for internal use only (not exposed externally); device-specific subdirectories are planned for later.
  - Add a set of compile-time constructs, currently used mainly by the dispatcher and its calling code.
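
To illustrate the idea behind a generic dispatcher over an `enum class` data type, here is a minimal sketch. It is not the code in `src/dispatcher.h`: the enumerator names, `TypeTag`, `DispatchByDataType`, `NaiveGemm`, and `Gemm` are all hypothetical names chosen for this example, and the naive triple loop only stands in for a real CPU or `mcblas` backend. The sketch is written in C++17 style, passing the selected type to the callback as a tag value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <utility>

// Hypothetical stand-in for the PR's DataType; enumerator names are assumed.
enum class DataType { kInt32, kFloat32, kFloat64 };

// Tag type that carries a C++ type as a value.
template <typename T>
struct TypeTag {
  using type = T;
};

// Maps a runtime DataType to a compile-time type and calls `fn` with a tag.
template <typename Fn>
decltype(auto) DispatchByDataType(DataType dtype, Fn&& fn) {
  switch (dtype) {
    case DataType::kInt32:
      return std::forward<Fn>(fn)(TypeTag<std::int32_t>{});
    case DataType::kFloat32:
      return std::forward<Fn>(fn)(TypeTag<float>{});
    case DataType::kFloat64:
      return std::forward<Fn>(fn)(TypeTag<double>{});
  }
  assert(false && "unhandled DataType");
  std::abort();  // Keeps release builds from falling off the end.
}

// Reference row-major GEMM: C (m x n) = A (m x k) * B (k x n).
template <typename T>
void NaiveGemm(const T* a, const T* b, T* c, std::size_t m, std::size_t n,
               std::size_t k) {
  for (std::size_t i = 0; i < m; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      T acc{};
      for (std::size_t p = 0; p < k; ++p) {
        acc += a[i * k + p] * b[p * n + j];
      }
      c[i * n + j] = acc;
    }
  }
}

// Type-erased entry point: picks the element type at runtime.
inline void Gemm(DataType dtype, const void* a, const void* b, void* c,
                 std::size_t m, std::size_t n, std::size_t k) {
  DispatchByDataType(dtype, [&](auto tag) {
    using T = typename decltype(tag)::type;
    NaiveGemm(static_cast<const T*>(a), static_cast<const T*>(b),
              static_cast<T*>(c), m, n, k);
  });
}
```

A caller would pass the runtime type alongside untyped buffers, for example `Gemm(DataType::kFloat32, a, b, c, 2, 2, 2)` for two 2x2 `float` matrices. Keeping the `switch` in one place means each operator only supplies a callback, which is the usual motivation for a shared dispatcher.
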
Known Issues & Future Work:

- C++ Standard Compatibility: The dispatcher still relies on one C++20 feature, the explicit template parameter list for lambdas. A pure C++17 substitute is planned for compatibility.
- Type Mapping: `float16` and `bfloat16` have NOT been mapped to primitive types yet.
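
For readers unfamiliar with the C++20 feature mentioned above, the sketch below contrasts a lambda with an explicit template parameter list against one possible C++17 workaround that passes the type as a tag value. This is only an illustration of the language difference: `TypeTag`, `SizeViaTag`, and `SizeViaTemplateLambda` are hypothetical names, and the PR does not state which substitute will actually be used.

```cpp
#include <cassert>
#include <cstddef>

// Tag type that carries a C++ type as a value (works in C++17).
template <typename T>
struct TypeTag {
  using type = T;
};

// C++17 form: a generic lambda recovers the type from its tag argument.
inline std::size_t SizeViaTag() {
  auto size_of = [](auto tag) {
    using T = typename decltype(tag)::type;
    return sizeof(T);
  };
  return size_of(TypeTag<double>{});
}

#if __cplusplus >= 202002L
// C++20 form: the lambda declares its own template parameter list,
// so the caller names the type explicitly when invoking operator().
inline std::size_t SizeViaTemplateLambda() {
  auto size_of = []<typename T>() { return sizeof(T); };
  return size_of.operator()<double>();
}
#endif
```

Both functions return `sizeof(double)`. The tag form costs an extra `decltype` line inside the lambda, but it compiles under C++17, which is why it is a common fallback.
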