Question 1

Which is better: Extractor or GuppyLM?

Accepted Answer

Our expert panel gave both tools a verdict of Ship. Extractor is rated slightly stronger, with a 100% Ship rate.

Question 2

Is Extractor free?

Accepted Answer

Yes. Extractor pricing: Free / Open Source.

Question 3

Is GuppyLM free?

Accepted Answer

Yes. GuppyLM pricing: Open Source (MIT).

Question 4

What do experts say about Extractor vs GuppyLM?

Accepted Answer

Extractor: Extractor by Lightfeed is a TypeScript library that uses LLMs to extract structured data from websites. It handles messy HTML, JavaScript-rendered content, and inconsistent page layouts that break traditional scrapers. Define your schema and let the LLM figure out where the data lives.

GuppyLM: GuppyLM is a deliberately tiny language model (9 million parameters, 6 transformer layers) that roleplays as a fish and can be fully trained in under 5 minutes on a free Google Colab T4 GPU. The entire pipeline, from data generation to training loop to inference, fits in approximately 130 lines of PyTorch, making it the most compressed end-to-end LLM tutorial available.

Unlike educational projects that paper over complexity with abstraction layers, GuppyLM deliberately avoids modern optimizations — no RoPE positional encoding, no grouped-query attention, no SwiGLU activations. You see exactly why each component exists when you remove it. It ships with a 60,000-example synthetic conversation dataset and produces coherent (if goofy) fish-themed responses after training.
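To make the "9 million parameters, 6 transformer layers" figure concrete, here is a rough back-of-the-envelope sketch of how the weights in a plain transformer of this kind add up. The hidden size (384), feed-forward width (768), vocabulary size (4096), context length (128), and tied output head are illustrative assumptions, not values stated above, and count_params is a hypothetical helper, not part of GuppyLM.

```python
def count_params(d_model, n_layers, d_ff, vocab_size, max_len, tie_embeddings=True):
    """Count weights in a plain decoder-only transformer (hypothetical helper)."""
    # Token embeddings plus a learned positional table (i.e. no RoPE).
    embed = vocab_size * d_model + max_len * d_model
    # Standard multi-head attention: Q, K, V and output projections,
    # each d_model x d_model with a bias (no grouped-query sharing).
    attn = 4 * (d_model * d_model + d_model)
    # Plain two-layer feed-forward network with biases (no SwiGLU gate).
    ffn = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    # Two LayerNorms per block, each with a scale and a shift vector.
    norms = 2 * 2 * d_model
    per_layer = attn + ffn + norms
    # Final LayerNorm; the output head adds nothing if it reuses the embeddings.
    final_norm = 2 * d_model
    head = 0 if tie_embeddings else vocab_size * d_model
    return embed + n_layers * per_layer + final_norm + head


# All hyperparameters below are assumptions chosen for illustration.
total = count_params(d_model=384, n_layers=6, d_ff=768, vocab_size=4096, max_len=128)
print(f"{total:,} parameters")
```

Under these assumed settings the total comes to roughly 8.7 million, which rounds to the 9 million quoted above. Most of the weights sit in the six attention and feed-forward blocks rather than in the embeddings.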

The project hit the top of Hacker News Show HN with 365 points and 31 comments. Developers praised how the simplicity forces you to confront how training data shapes model behavior directly, with multiple commenters saying it's the clearest path from 'I know Python' to 'I understand why LLMs work.'
