EQ-Bench

Website: https://eqbench.com/

EQ-Bench is a benchmark framework that measures the emotional intelligence of LLMs.

“Why emotional intelligence? One reason is that it represents a subset of abilities that are important for the user experience, and which isn’t explicitly tested by other benchmarks. Another reason is that it’s not trivial to improve scores by fine tuning for the benchmark, which makes it harder to ‘game’ the leaderboard.”

The leaderboard also features two additional benchmarks:

1) Creative Writing uses an LLM judge (Claude 3.5 Sonnet) to evaluate each model's responses to a set of writing prompts.

2) Judgemark evaluates a model's ability to judge creative writing using a numerical scoring system.
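
The LLM-as-judge idea behind both benchmarks can be sketched roughly as follows. This is only an illustrative outline, not EQ-Bench's actual implementation: the `Judge` interface, `score_model`, and `toy_judge` are hypothetical names, and the toy judge stands in for a real call to a judge model such as Claude 3.5 Sonnet. The sketch assumes the judge returns one numerical score per response and that the scores are simply averaged.

```python
from statistics import mean
from typing import Callable, Dict

# Hypothetical judge interface (an assumption for this sketch):
# takes a writing prompt and a model's response, returns a numeric score.
Judge = Callable[[str, str], float]


def score_model(judge: Judge, responses: Dict[str, str]) -> float:
    """Average the judge's scores over all (prompt, response) pairs."""
    if not responses:
        raise ValueError("no responses to score")
    return mean(judge(prompt, text) for prompt, text in responses.items())


def toy_judge(prompt: str, response: str) -> float:
    # Stand-in for a real LLM judge call: half a point per word, capped at 10.
    return min(10.0, len(response.split()) / 2)


# Made-up prompts and responses, purely for illustration.
responses = {
    "Write a story about rain.": "Rain fell softly on the tin roof all night",
    "Describe a quiet harbor.": "Boats rocked gently",
}

overall = score_model(toy_judge, responses)
print(overall)
```

In a real setup, `toy_judge` would be replaced by a function that sends the prompt, the response, and a scoring rubric to the judge model and parses a number from its reply.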