Google AI Edge · On-Device AI · 2026-04-09

Google Launches LiteRT-LM and AI Edge Gallery — On-Device LLMs Just Became a Mainstream Android Feature

Google released LiteRT-LM, an open-source production inference framework for running LLMs on Android, iOS, and IoT devices, alongside the AI Edge Gallery Android app. Together they make Gemma 4 on-device inference a Play Store-distributed consumer reality.


On April 8, 2026, Google shipped two releases that together make on-device LLM inference a mainstream Android proposition: LiteRT-LM, an open-source production inference framework for edge devices, and the Google AI Edge Gallery, a Play Store app that lets any Android user run Gemma 4 E2B or E4B locally — no cloud, no API key, no data leaving the device.

LiteRT-LM is Google's LLM-focused inference framework, built on LiteRT (the successor to TensorFlow Lite) for production deployment. It provides hardware acceleration via GPU and NPU across Android, iOS, desktop, and IoT platforms, supports multimodal inputs (vision and audio), includes tool use and function calling for agentic workflows, and runs the Gemma, Llama, Phi-4, and Qwen model families. The framework was designed around two non-negotiable requirements: latency low enough for interactive use, and privacy guarantees that cloud inference cannot provide.
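The tool-use loop at the heart of those agentic workflows can be sketched generically: the model emits a structured tool call, the app executes it locally, and the result feeds back into the conversation. The sketch below is in Python for illustration only; the tool names, the JSON call format, and the stand-in generation function are assumptions, not the LiteRT-LM API.

```python
import json

# Hypothetical tool registry: the model requests a tool by name and the
# app runs it on-device, so no data leaves the phone.
TOOLS = {
    "get_battery_level": lambda: {"percent": 87},
}

def fake_model_generate(prompt: str) -> str:
    """Stand-in for on-device generation: always emits a JSON tool call."""
    return json.dumps({"tool": "get_battery_level", "args": {}})

def run_agent_turn(prompt: str) -> dict:
    """One agent step: generate, detect a tool call, execute it locally."""
    reply = fake_model_generate(prompt)
    call = json.loads(reply)
    if call.get("tool") in TOOLS:
        return TOOLS[call["tool"]](**call.get("args", {}))
    return {"text": reply}
```

A real integration would replace `fake_model_generate` with a call into the inference runtime and validate the tool arguments before executing them.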

The AI Edge Gallery app is the consumer face of this stack. It ships as a standard Play Store install, presents a clean model selection interface, and demonstrates on-device text generation, image analysis, and agent behavior using Gemma 4 — Google's open multimodal model family released earlier this month. For developers, the open-source codebase is a complete reference implementation showing how to integrate LiteRT-LM into an Android app with production-quality model download, caching, and accelerator management.
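The model download and caching that the Gallery codebase demonstrates follows a familiar pattern: fetch a multi-gigabyte model file once, key the cache by its URL, and reuse the local copy on every later launch. A minimal sketch of that pattern, in Python for illustration (the cache layout, file extension, and fetch hook are assumptions, not the app's actual code):

```python
import hashlib
import urllib.request
from pathlib import Path

# Illustrative cache directory; the real app uses Android app storage.
CACHE_DIR = Path("model_cache")

def cached_model_path(url: str, fetch=urllib.request.urlretrieve) -> Path:
    """Return a local path for `url`, downloading only on a cache miss."""
    CACHE_DIR.mkdir(exist_ok=True)
    key = hashlib.sha256(url.encode()).hexdigest()[:16]  # stable cache key
    target = CACHE_DIR / f"{key}.task"
    if not target.exists():
        fetch(url, str(target))  # a real app also verifies checksums and resumes
    return target
```

In production this would additionally handle interrupted downloads, integrity checks, and eviction when storage runs low, which is exactly the plumbing the open-source Gallery app is meant to serve as a reference for.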

The significance is not just technical. By shipping via Play Store rather than developer-only channels, Google is communicating to enterprises, regulators, and privacy advocates that on-device AI is a supported, long-term commitment — not a research demo. For healthcare, finance, legal, and government use cases where data residency is mandatory, Gemma 4 + LiteRT-LM on-device is now a production-ready path.

The competitive framing is clear: Apple has shipped Apple Intelligence as on-device AI at the OS level. Google's answer is an open-source stack that any developer can build on, any device manufacturer can ship, and any user can install today from the Play Store. The on-device AI race just became a platform war.

Panel Takes