Show HN: Mothertoken – know the mother tongue of your LLMs
Category: cli-tool
Tags: tokenizer, llm, nlp, multilingual, cli-tool
Score: 5.8/10 (Innovation: 6, Technical: 5, Documentation: 7, Utility: 5)
Mothertoken is a CLI toolkit that analyzes and compares the tokenizer efficiency of large language models across languages, helping users identify which model tokenizes their language most efficiently. It benchmarks tokenizers, ranks them by language, and compares custom models via Hugging Face refs, which makes it useful for multilingual NLP workflows.
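To make the idea of ranking tokenizers by efficiency concrete, here is a minimal Python sketch. It is not Mothertoken's code or API: the tokens-per-character metric, the function names, and the two toy tokenizers are all assumptions chosen for illustration. In practice the toy functions would be swapped for real model tokenizers.

```python
from typing import Callable, Dict, List, Tuple

# A tokenizer here is any function mapping text to a list of tokens.
# (Hypothetical interface for illustration, not Mothertoken's API.)
Tokenizer = Callable[[str], List[str]]


def whitespace_tokenizer(text: str) -> List[str]:
    """Toy stand-in: one token per whitespace-separated word."""
    return text.split()


def char_tokenizer(text: str) -> List[str]:
    """Toy stand-in: one token per non-whitespace character."""
    return [ch for ch in text if not ch.isspace()]


def tokens_per_char(tokenizer: Tokenizer, text: str) -> float:
    """Assumed efficiency metric: fewer tokens per character is better."""
    if not text:
        raise ValueError("text must be non-empty")
    return len(tokenizer(text)) / len(text)


def rank_tokenizers(
    tokenizers: Dict[str, Tokenizer], text: str
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs sorted from most to least efficient."""
    scores = [(name, tokens_per_char(tok, text)) for name, tok in tokenizers.items()]
    return sorted(scores, key=lambda pair: pair[1])


# Usage: rank the two toy tokenizers on a short Basque sample.
sample = "Kaixo mundua"
ranking = rank_tokenizers(
    {"whitespace": whitespace_tokenizer, "char": char_tokenizer}, sample
)
for name, score in ranking:
    print(f"{name}: {score:.3f} tokens per character")
```

A per-character ratio is only one possible normalization. Tokens per word or per byte would rank tokenizers differently for scripts with long words or multi-byte characters, so the metric should be chosen to match the cost being measured.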
Target audience: NLP researchers, ML engineers, and developers working with multilingual language models
Repository: https://mothertoken.inigoimaz.com/ · Python · MIT · 1 star