Show HN: Which AI model is best for real data analysis?
Category: ai-ml
Tags: llm-benchmark, data-analysis, jupyter-notebooks, ai-evaluation
Score: 5.5/10 (Innovation: 4, Technical: 5, Documentation: 7, Utility: 6)
This project is a practical benchmark comparing several LLMs (GPT, GLM, Gemma, Qwen) on real-world data analysis workflows run in Jupyter notebooks. Rather than scoring single prompts, it evaluates each model on multi-step analytical pipelines and preserves the complete conversation artifacts, so results can be compared directly across domains such as EDA, time series, and machine learning.
Target audience: data engineers, data scientists, ML engineers
Repository: https://mljar.com/analysis/ · Jupyter Notebook · Apache-2.0 · 2 stars