HELM LLM Leaderboards
Website: https://crfm.stanford.edu/helm/lite/latest/#/leaderboard
More info: https://github.com/stanford-crfm/helm
HELM (Holistic Evaluation of Language Models) is a transparent and open framework for evaluating LLMs. It was created by the Center for Research on Foundation Models (CRFM) at Stanford University.
HELM maintains multiple leaderboards for different types of models. The “Lite” leaderboard currently evaluates 80+ models across 10 benchmark scenarios, giving a broad assessment of general LLM capabilities via in-context learning (few-shot prompting).
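Because the framework itself is open source, you can reproduce leaderboard-style runs locally. The sketch below drives HELM's command-line tools from Python; it assumes the `crfm-helm` package is installed (which provides the `helm-run` and `helm-summarize` commands), and the run-entry string and flag names are illustrative and may differ between HELM versions, so check the GitHub README for the current quickstart.

```python
"""Minimal sketch of running a small HELM evaluation locally.

Assumptions (not taken from the article): `pip install crfm-helm` has been
run, and the installed HELM version accepts the flags shown below.
"""
import subprocess

# Run one small scenario (MMLU, philosophy subset) against a model.
# The run-entry string, suite name, and instance cap are illustrative.
subprocess.run(
    [
        "helm-run",
        "--run-entries", "mmlu:subject=philosophy,model=openai/gpt2",
        "--suite", "my-suite",
        "--max-eval-instances", "10",
    ],
    check=True,
)

# Aggregate the raw run outputs into leaderboard-style summary tables.
subprocess.run(["helm-summarize", "--suite", "my-suite"], check=True)
```

The official leaderboard aggregates many such scenario runs across all evaluated models; a local run like this only covers whichever scenarios and models you specify.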