Tether Releases QVAC Cross-Platform BitNet LoRA Framework: Enabling Training of Billion-Parameter AI Models on Consumer Devices
According to an official announcement, Tether has unveiled a cross-platform BitNet LoRA fine-tuning framework within QVAC Fabric, enabling optimized training and inference for Microsoft's BitNet 1-bit LLM architecture. The framework sharply reduces computational and memory requirements, allowing billion-parameter models to be trained and fine-tuned on laptops, consumer-grade GPUs, and even smartphones.
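The memory savings follow the standard LoRA recipe applied on top of a weight-quantized base model: the ternary BitNet weights stay frozen, and only small low-rank adapter matrices are trained in higher precision, so gradients and optimizer state exist only for the adapters. The sketch below illustrates that pattern; the class name, quantization scheme, and shapes are illustrative assumptions, not the QVAC Fabric or BitNet API.

```python
import torch
import torch.nn as nn

class LoRABitLinear(nn.Module):
    """Illustrative LoRA adapter over a frozen, ternary-quantized linear layer.

    Names and the quantization scheme are assumptions for illustration only;
    this is not the QVAC Fabric / BitNet implementation.
    """

    def __init__(self, in_features: int, out_features: int, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        # Frozen base weights, quantized to {-1, 0, +1} with a per-tensor scale,
        # mimicking BitNet-style ternary weights. Buffers receive no gradients.
        w = torch.randn(out_features, in_features)
        scale = w.abs().mean()
        self.register_buffer("w_ternary", torch.round(torch.clamp(w / scale, -1, 1)))
        self.register_buffer("w_scale", scale)

        # Trainable low-rank adapters kept in full precision: the only
        # parameters that need gradients and optimizer state.
        self.lora_a = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        base = x @ (self.w_ternary * self.w_scale).t()    # frozen 1-bit path
        update = (x @ self.lora_a.t()) @ self.lora_b.t()  # low-rank update
        return base + self.scaling * update


# Adapter memory scales with rank * (in_features + out_features)
# rather than with the full weight matrix.
layer = LoRABitLinear(1024, 1024, rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable adapter params: {trainable}")  # 16,384 vs 1,048,576 full weights
```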
This release marks the first implementation of BitNet model fine-tuning on mobile GPUs (including Adreno, Mali, and Apple Bionic). Tests show that a 125M-parameter model can be fine-tuned in approximately 10 minutes and a 1B-parameter model in about an hour, with the approach scaling to models as large as 13B parameters on mobile devices.
The framework also supports heterogeneous hardware, including Intel, AMD, and Apple Silicon, marking the first instance of 1-bit LLM LoRA fine-tuning on non-NVIDIA devices. On performance, BitNet models run inference 2x to 11x faster on mobile GPUs than on CPUs, while cutting VRAM usage by up to roughly 77.8% compared with traditional 16-bit models.
Tether stated that the technology could reduce dependence on high-end compute and cloud infrastructure, push AI training toward decentralized, on-device execution, and lay the groundwork for new use cases such as federated learning.
