Getting Started
Learn about logpare - semantic log compression for LLM context windows
logpare is a semantic log compression library for LLM context windows. It uses the Drain algorithm to extract templates from repetitive log data, achieving 60-90% token reduction while preserving diagnostic information.
The Problem
AI assistants processing logs waste tokens on repetitive patterns. A 10,000-line log dump might contain 50 unique message templates repeated thousands of times — but the LLM sees (and bills for) every repetition.
The Solution
logpare identifies log templates and outputs a compressed format showing each template once with occurrence counts.
Before logpare
INFO Connection from 192.168.1.1 established
INFO Connection from 192.168.1.2 established
INFO Connection from 10.0.0.55 established
... (10,844 more similar lines)After logpare
=== Log Compression Summary ===
Input: 10,847 lines → 23 templates (99.8% reduction)
Top templates by frequency:
1. [4,521x] INFO Connection from <*> established
2. [3,892x] DEBUG Request <*> processed in <*>
3. [1,203x] WARN Retry attempt <*> for <*>Key Features
- High compression rates: 60-90% token reduction
- Semantic understanding: Preserves diagnostic information
- Automatic extraction: URLs, HTTP status codes, correlation IDs, durations
- Severity detection: Automatic tagging as error, warning, or info
- Multiple output formats: Summary, detailed, and JSON
- Fast & efficient: Processes 10,000+ lines/second
- V8-optimized: Uses monomorphic classes and Map-based children
- TypeScript-first: Full type safety with strict checking
How It Works
logpare uses the Drain algorithm to parse logs:
- Preprocessing: Mask known variables (IPs, UUIDs, timestamps)
- Tokenization: Split log line into tokens
- Tree Navigation: Navigate parse tree by token count → first token → subsequent tokens
- Cluster Matching: Find cluster with highest similarity above threshold
- Template Update: Update pattern, replacing differing tokens with
<*>