There’s some research on fine-tuning 1-bit quants, and they seem pretty responsive to it. Of course you’ll never get a full generalist model out of it, but there’s some hope for tiny specialized models that can run on a CPU for a fraction of the energy bill.
The big models are great marketing because their verbal output is believable, but they’re grossly overkill for most tasks.
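For anyone wondering what a "1-bit quant" amounts to, here is a rough sketch, assuming the simplest scheme: each weight is stored as a sign, and each row keeps one float scale. The function names and the example row are made up for illustration, and the methods in the actual research may differ (ternary weights, other scaling rules, and so on).

```python
def binarize_row(weights):
    """Quantize one row of weights to signs (+1/-1) plus a single float scale."""
    # Assumed scheme: scale = mean absolute value. For sign codes this is the
    # scale that minimizes the squared reconstruction error.
    scale = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return signs, scale


def dequantize_row(signs, scale):
    """Rebuild an approximate float row from the stored signs and scale."""
    return [s * scale for s in signs]


# Hypothetical example row, not taken from any real model.
row = [0.5, -1.5, 1.0, -1.0]
signs, scale = binarize_row(row)
approx = dequantize_row(signs, scale)
print(signs, scale, approx)
```

Each weight costs one bit plus a shared scale per row, which is where the memory and energy savings come from. Fine-tuning then has to recover the accuracy lost to the rounding.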