Search results
README.md. bitnet.cpp is the official inference framework for 1-bit LLMs (e.g., BitNet b1.58). It offers a suite of optimized kernels that support fast and lossless inference of 1.58-bit models on CPU (with NPU and GPU support coming next).
28 Feb 2024 · Recent research, such as BitNet, is paving the way for a new era of 1-bit Large Language Models (LLMs). In this work, we introduce a 1-bit LLM variant, namely BitNet b1.58, in which every single parameter (or weight) of the LLM is ternary {-1, 0, 1}.
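As a rough illustration of what "ternary" means in the snippet above, the following Python sketch maps a handful of float weights to {-1, 0, 1} by scaling with their mean absolute value, then rounding and clipping. The function name `absmean_ternarize` and the sample weights are hypothetical, chosen only for illustration; this is not code from bitnet.cpp or its kernels.

```python
import math


def absmean_ternarize(weights, eps=1e-8):
    """Map float weights to ternary values in {-1, 0, 1}.

    Hypothetical helper: scales by the mean absolute weight,
    then rounds to the nearest integer and clips to [-1, 1].
    Returns (ternary_weights, scale).
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    ternary = [max(-1, min(1, round(w / (scale + eps)))) for w in weights]
    return ternary, scale


# Made-up sample weights, for illustration only.
weights = [0.9, -0.05, 0.4, -1.2, 0.0, 0.3]
ternary, scale = absmean_ternarize(weights)

# Three possible values per weight carry log2(3) bits of information,
# which is where the "1.58" in the model name comes from.
bits_per_weight = math.log2(3)

print(ternary)                     # [1, 0, 1, -1, 0, 1]
print(round(bits_per_weight, 2))   # 1.58
```

Small weights collapse to 0 while larger ones keep only their sign, so each weight can take one of three values.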
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities - microsoft/unilm
Key Features of BitNet.cpp. BitNet.cpp comes with a treasure trove 🪙 of features designed to optimize performance and usability: Optimized Performance 🚀: BitNet.cpp is tuned to run on both ARM and x86 CPUs, which are commonly found in PCs and mobile devices. Performance gains are impressive 🔥: on ARM CPUs, speedups range from 1.37x to 5.07x, and on x86 CPUs they reach up to 6.17x.
arXiv. Publication. The increasing size of large language models has posed challenges for deployment and raised concerns about environmental impact due to high energy consumption. In this work, we introduce BitNet, a scalable and stable 1-bit Transformer architecture designed for large language models.
26 Mar 2024 · BitLinear workflow and BitNet architecture. Source: https://arxiv.org/pdf/2310.11453.pdf. Idea behind 1-bit LLM: Quantization. Artificial neural networks are composed of activation...