Show HN: Which AI model is best for real data analysis?

Category: ai-ml

Tags: llm-benchmark, data-analysis, jupyter-notebooks, ai-evaluation

Score: 5.5/10 (Innovation: 4, Technical: 5, Documentation: 7, Utility: 6)

This project benchmarks several LLM families (GPT, GLM, Gemma, Qwen) on real-world data analysis workflows run in Jupyter notebooks. Rather than scoring single prompts, it evaluates each model across multi-step analytical pipelines and preserves the complete conversational artifacts, so outputs can be compared directly across domains such as EDA, time series, and machine learning.

Target audience: data engineers, data scientists, ML engineers

Repository: https://mljar.com/analysis/ · Jupyter Notebook · Apache-2.0 · 2 stars
