Llama.cpp
Efficient inference engine for large language models, written in C and C++.
Free
Efficient tensor library for machine learning on everyday devices.