llama : add fd-based model loading via llama_model_load_from_fd (REWORK) #20402

Siddhesh2377 wants to merge 3 commits into ggml-org:master from
Conversation
ggml/include/gguf.h (outdated)

    GGML_API struct gguf_context * gguf_init_empty(void);
    GGML_API struct gguf_context * gguf_init_from_file(const char * fname, struct gguf_init_params params);
    GGML_API struct gguf_context * gguf_init_from_fd(int fd, struct gguf_init_params params);
For your purposes, would it work to expose the current gguf_init_from_file_impl as gguf_init_from_file_ptr and to use that as the basis for the implementation instead? That way we would be able to also use this code on Windows in conjunction with ggml_fopen.
Done, replaced gguf_init_from_fd with gguf_init_from_file_ptr(FILE *) and moved the dup+fdopen logic up to llama-model-loader.
The llama C API should also use a file pointer if at all possible; the conversion from file descriptor to file pointer should happen in your user code.
Done, switched the llama C API to a FILE pointer as well. The test shows the fd-to-FILE* conversion on the caller side.
Adds llama_model_load_from_fd() to load GGUF models from a POSIX file descriptor instead of a file path.

On Android, apps accessing user files through SAF only get a file descriptor, not a path. The alternatives are copying the model into app storage or requesting MANAGE_EXTERNAL_STORAGE, which gets rejected by Google Play. This happened with my app (ToolNeuron).
Reworked version of a previous PR that was rejected for code quality.
Not supported on Windows. The fd is dup'd internally, so the caller retains ownership of the original descriptor.
Tested locally with CI and a real model (vocab_only + mmap).