Search results
bitnet.cpp is the official inference framework for 1-bit LLMs (e.g., BitNet b1.58). It offers a suite of optimized kernels that support fast and lossless inference of 1.58-bit models on CPU (with NPU and GPU support coming next).
28 Feb 2024 · Recent research, such as BitNet, is paving the way for a new era of 1-bit Large Language Models (LLMs). In this work, we introduce a 1-bit LLM variant, namely BitNet b1.58, in which every single parameter (or weight) of the LLM is ternary {-1, 0, 1}.
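For illustration, here is a minimal PyTorch sketch of the absmean ternary quantization rule this abstract describes (scale the weights by their mean absolute value, then round and clip each entry to {-1, 0, +1}); the function name ternary_quantize and the toy tensor are assumptions for the example, not code from the paper:

    import torch

    def ternary_quantize(w: torch.Tensor, eps: float = 1e-8):
        # Per-tensor scale: mean of the absolute weight values (absmean).
        gamma = w.abs().mean().clamp(min=eps)
        # Round to the nearest value in {-1, 0, +1}.
        w_q = (w / gamma).round().clamp(-1, 1)
        return w_q, gamma

    w = torch.randn(4, 4)
    w_q, gamma = ternary_quantize(w)
    print(w_q)          # entries are -1., 0., or 1.
    print(w_q * gamma)  # dequantized approximation of w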
This repository not only provides PyTorch implementations for training and evaluating 1.58-bit neural networks but also includes a unique integration where the experiments conducted automatically update a LaTeX-generated paper.
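To make the training side concrete, here is a hedged sketch of how a 1.58-bit linear layer can be trained with a straight-through estimator in PyTorch; the class name BitLinear and its details are assumptions for illustration, not the repository's actual implementation:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class BitLinear(nn.Linear):
        # Linear layer whose weights are quantized to {-1, 0, +1} on the fly.
        def forward(self, x):
            w = self.weight
            gamma = w.abs().mean().clamp(min=1e-8)
            w_q = (w / gamma).round().clamp(-1, 1) * gamma
            # Straight-through estimator: the forward pass uses the quantized
            # weights, while gradients flow to the full-precision weights.
            w_ste = w + (w_q - w).detach()
            return F.linear(x, w_ste, self.bias)

    layer = BitLinear(16, 4)
    y = layer(torch.randn(2, 16))  # forward pass with ternary weights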
27 Feb 2024 · Abstract. Recent research, such as BitNet [23], is paving the way for a new era of 1-bit Large Language Models (LLMs). In this work, we introduce a 1-bit LLM variant, namely BitNet b1.58, in which every single parameter (or weight) of the LLM is ternary {-1, 0, 1}.
29 Mar 2024 · Here are the commands to run the evaluation:
    pip install lm-eval==0.3.0
    python eval_ppl.py --hf_path 1bitLLM/bitnet_b1_58-3B --seqlen 2048
    python eval_task.py --hf_path 1bitLLM/bitnet_b1_58-3B \
        --batch_size 1 \
        --tasks \
        --output_path result.json \
        --num_fewshot 0 \
        --ctx_size 2048
5 Aug 2024 · When doubling the number of weights, we still retain a significant part of the memory savings of going from 16 bits to 1.58 bits, i.e., we need 2 × 2 bits instead of 16 bits.
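As a concrete reading of this claim: a 16-bit model with N weights needs about 16N bits, while a ternary model with 2N weights stored at roughly 2 bits each needs about 4N bits, so even after doubling the parameter count the weight memory drops by roughly 4x (the exact figure depends on how the 1.58-bit values are packed).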
We introduce a variant of BitNet b1.58 that relies on the median rather than the mean of the absolute values of the weights. Through extensive experiments we investigate and compare the scaling, the learning-rate robustness, and the regularization properties of both 1.58-bit variants. Our work demonstrates that ...
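A minimal sketch of the difference between the two variants, assuming the same round-and-clip rule as in the example above; swapping the scale statistic from the mean to the median of the absolute weights is the only change (illustrative, not the authors' code):

    import torch

    def ternary_quantize(w, use_median=False, eps=1e-8):
        a = w.abs()
        # BitNet b1.58 scales by the mean of |w|; the variant described
        # above scales by the median of |w| instead.
        gamma = (a.median() if use_median else a.mean()).clamp(min=eps)
        return (w / gamma).round().clamp(-1, 1), gamma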