llama.cpp

Port of Meta's LLaMA model in C/C++

About llama.cpp

Inference of the LLaMA model in pure C/C++. The main goal is to run the model on a MacBook using 4-bit integer quantization.
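A typical workflow is to build the project, quantize a converted model to 4 bits, and then run inference. A minimal sketch, assuming the repository is cloned and a 7B model has already been converted to the ggml FP16 format (the model paths below are illustrative):

```shell
# Build the tools with make (macOS / Linux)
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# Quantize the FP16 model to 4-bit (q4_0), then run inference
./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin 2
./main -m ./models/7B/ggml-model-q4_0.bin -n 128 -p "Building a website can be done in 10 simple steps:"
```

The 4-bit file is several times smaller than the FP16 original, which is what makes MacBook-class hardware viable for inference.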

Supported platforms:

  • macOS
  • Linux
  • Windows (via CMake)
  • Docker

Supported models:

  • LLaMA 🦙
  • Alpaca
  • GPT4All
  • Chinese LLaMA / Alpaca
  • Vigogne (French)
  • Vicuna
