Gemma 4 models use a training trick to slash their memory footprint
TL;DR: Gemma 4 models trained with quantization-aware training (QAT) are now available for download. QAT shrinks the models' file size and memory footprint, and these open-source models retain quality better thanks…
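
To make the idea concrete, here is a minimal sketch of the "fake quantization" step that QAT schemes generally simulate during training, together with the back-of-the-envelope arithmetic behind the memory savings. It is a generic illustration, not Gemma's actual recipe: the function names `fake_quantize` and `weight_memory_gb`, the 4-bit width, and the symmetric per-tensor scaling are all assumptions made for this example.

```python
def fake_quantize(weights, bits=4):
    """Snap weights to a signed integer grid, then map them back to floats.

    Hypothetical helper: symmetric, per-tensor scaling is assumed here.
    """
    qmax = 2 ** (bits - 1) - 1                 # 7 for 4-bit signed integers
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest grid point and clamp to the representable range.
    quantized = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    # Dequantize so the rest of the network still sees float values.
    return [q * scale for q in quantized], scale


def weight_memory_gb(num_params, bits):
    """Approximate weight storage in GB (10^9 bytes), ignoring any overhead."""
    return num_params * bits / 8 / 1e9


if __name__ == "__main__":
    original = [0.7, -0.31, 0.02, 0.0]
    dequantized, scale = fake_quantize(original, bits=4)
    for w, d in zip(original, dequantized):
        print(f"{w:+.3f} -> {d:+.3f} (error {abs(w - d):.3f})")

    # Illustrative parameter count only, not a real model size.
    params = 1_000_000_000
    print(weight_memory_gb(params, 16), "GB at 16 bits")
    print(weight_memory_gb(params, 4), "GB at 4 bits")
```

In this sketch each weight lands within half a grid step of its original value. Training with that rounding in the loop lets a model adapt to the error, which is the general reason QAT tends to preserve quality better than quantizing only after training.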