The hype around 1-bit LLMs explained | Simplified Explanation for Beginners
The Era of 1-bit LLMs
This blog covers the core methodology introduced in the paper, along with the key terms and concepts needed to understand it.
Why are 1-bit LLMs important?
As LLMs get better, that performance comes at the cost of growing model size, along with higher inference time and cost. So a major focus area of research is building LLMs that keep high performance at a much smaller size, making them more cost-effective in terms of latency, memory, throughput, and energy consumption.
The 1-bit LLM is a breakthrough in that direction: it defines a new scaling law and a recipe for training new generations of LLMs that are both high-performance and cost-effective.
Let’s understand what the 1.58 in “1-bit LLM” means.
Until now, LLMs have had billions of parameters, and the better the model, the larger the number of parameters. These parameters are essentially the weights of the model: they are stored in matrices and multiplied with the input to produce an output. Today, these weights are 16 bits each, because they are stored as floating-point values.
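To make this concrete, here is a minimal NumPy sketch of a single linear layer with 16-bit floating-point weights. The shapes and random values are illustrative, not from the paper:

```python
import numpy as np

# A toy linear layer: 4 inputs -> 3 outputs (shapes are illustrative).
# In a real LLM, W holds billions of such weights, stored as 16-bit floats.
W = np.random.randn(3, 4).astype(np.float16)  # weight matrix, 16 bits per value
x = np.random.randn(4).astype(np.float16)     # input vector

y = W @ x  # the weights are multiplied with the input to produce an output
print(W.nbytes, "bytes for", W.size, "weights")  # 2 bytes (16 bits) per weight
```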
Ternary weights: In this approach, the authors reduce each weight to one of the following three values: {-1, 0, 1}.
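As a rough sketch of how this looks in code, here is the absmean-style ternary quantization described in the BitNet b1.58 paper (divide by the mean absolute weight, then round and clip), written in NumPy. The function name and epsilon value are my own illustrative choices:

```python
import numpy as np

def ternary_quantize(W: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Map every weight to -1, 0, or 1 using absmean scaling:
    divide by the mean absolute weight, then round and clip."""
    scale = np.mean(np.abs(W)) + eps            # mean absolute weight
    return np.clip(np.round(W / scale), -1, 1)  # values are now in {-1, 0, 1}

W = np.random.randn(3, 4)        # full-precision weights
W_ternary = ternary_quantize(W)
print(W_ternary)                 # every entry is -1., 0., or 1.
```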
1-bit LLM: And since each weight can only take one of these three values, the total number of bits required to store it is log2(3) ≈ 1.58, rather than a full 16. That is where the 1.58 comes from.
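The arithmetic behind that 1.58 is a one-liner:

```python
import math

# Each weight has 3 possible states, so the information needed per weight is:
bits_per_weight = math.log2(3)
print(round(bits_per_weight, 2))  # 1.58 -> hence the "1.58-bit" LLM
```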