Question 1

Which is better: claude-mem or LiteRT-LM?

Accepted Answer

Based on our expert panel, claude-mem has a stronger verdict with a 75% Ship rate. claude-mem received a panel verdict of Ship and LiteRT-LM received Ship.

Question 2

Is claude-mem free?

Accepted Answer

claude-mem pricing: Free / Open Source (AGPL-3.0)

Question 3

Is LiteRT-LM free?

Accepted Answer

LiteRT-LM pricing: Open Source (Apache 2.0)

Question 4

What do experts say about claude-mem vs LiteRT-LM?

Accepted Answer

claude-mem: claude-mem is a Claude Code plugin that hooks into the agent's full session lifecycle — capturing every tool call, observation, and interaction — compresses them semantically using Claude's agent-sdk, and stores everything in a local SQLite + Chroma vector database. On each new session, it injects only the most contextually relevant history via a 3-layer token-efficient retrieval system. The result: a coding agent that actually remembers your project across disconnected sessions.

It's crossed 55K GitHub stars with support for Cursor, Gemini CLI, Windsurf, and OpenClaw. A community audit flagged the unauthenticated HTTP API on port 37777 as a HIGH severity issue — any local process can read every stored observation including API keys. The fix hasn't shipped yet.

The 'Endless Mode' beta enables truly continuous sessions with automatic context compression when approaching token limits, making it useful for long-running projects that currently require frequent re-orientation. LiteRT-LM: LiteRT-LM is Google's production-grade, open-source inference framework for deploying Large Language Models on edge devices — phones, IoT hardware, Raspberry Pi, and desktop machines without cloud connectivity. Launched April 7, 2026 alongside Gemma 4 support, it enables developers to run Gemma, Llama, Phi-4, Qwen, and other models entirely locally via a simple CLI or embedded SDK.

The framework handles the hard parts of edge inference: memory-mapped per-layer embeddings, 2-bit and 4-bit quantization, NPU acceleration for Qualcomm and MediaTek chipsets (early access), and cross-platform support spanning Android, iOS, Web, and desktop. Gemma 4's E2B variant runs under 1.5GB RAM on some devices, making full LLM functionality viable on mid-range hardware.

What makes LiteRT-LM significant is the agentic angle. It's one of the first frameworks to support multi-step agentic workflows running completely on-device — function calling, tool use, vision and audio inputs — without a single network request. For developers building privacy-sensitive apps or offline-capable agents, this changes the calculus entirely.

claude-mem vs LiteRT-LM

claude-mem

LiteRT-LM

Bookmarks