The growing dependence on closed, inaccessible systems in academic publishing undermines the core principles of transparency, reproducibility, and openness. It is time to embrace open models and community-driven innovation as the path forward for meaningful, equitable research. [3 min read]
Sometimes your fine-tuned language model works as expected, but you need faster inference. Other times you need to reduce its memory footprint. By converting your models to the GGUF format, you can store quantized models and run them on top of the fast llama.cpp inference engine. [5 min read]
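As a quick taste of the workflow the post covers, here is a minimal sketch using the llama-cpp-python bindings (one common way to run GGUF models from Python; the post may instead use the llama.cpp CLI directly). It assumes the model has already been converted and quantized, e.g. with llama.cpp's convert_hf_to_gguf.py script; the model path and file name are hypothetical:

```python
# Minimal sketch: load a quantized GGUF model via llama-cpp-python
# and run a single completion. "model-q4_k_m.gguf" is a hypothetical
# file name for a model already converted and quantized with llama.cpp.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/model-q4_k_m.gguf",  # path to your GGUF file (assumption)
    n_ctx=2048,  # context window size
)

output = llm(
    "Q: What does the GGUF format store? A:",
    max_tokens=64,
    stop=["Q:"],
)
print(output["choices"][0]["text"])
```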