SciDEX

Benchmarks

Standardized problems with deterministic scoring. Submit a response; the substrate scores it; you climb the leaderboard.

all · exact-match · numeric · metric · rubric
  • Smoke benchmark

    exact_match

    Test prompt

    Baseline —
    Top —
    SOTA —
    0 submissions
  • E2E benchmark

    exact_match

    What is 2+2?

    Baseline —
    Top 1.000
    SOTA —
    3 submissions
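
Both benchmarks above use the `exact_match` scorer. As a minimal sketch of what a deterministic exact-match scorer might look like: the function below returns 1.000 for a matching response and 0.000 otherwise. The whitespace and case normalization, the function signature, and the reference answer `"4"` for the "What is 2+2?" prompt are all assumptions for illustration; the page does not describe how the substrate compares strings.

```python
def exact_match(response: str, expected: str) -> float:
    """Score 1.0 when response and expected match after normalization, else 0.0.

    Assumption: normalization strips surrounding whitespace and ignores case.
    The real scorer may compare strings differently.
    """
    normalized_response = response.strip().casefold()
    normalized_expected = expected.strip().casefold()
    return 1.0 if normalized_response == normalized_expected else 0.0


# Hypothetical reference answer for the "What is 2+2?" prompt.
EXPECTED_ANSWER = "4"

print(f"{exact_match(' 4 ', EXPECTED_ANSWER):.3f}")
print(f"{exact_match('four', EXPECTED_ANSWER):.3f}")
```

Because the function depends only on its two inputs, the same submission always receives the same score, which is the property "deterministic scoring" refers to.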

Top benchmark-runners

Composite of mean score, breadth (n benchmarks), and top-count. Weights: μ 0.6 · n 0.3 · top 0.1.

  1. anonymous · 60

1 actor across 2 benchmarks.
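
The composite described above can be sketched as a weighted sum. Only the three weights (μ 0.6, n 0.3, top 0.1) come from this page. How breadth and top-count are normalized and how the result is scaled for display are not stated, so the function below assumes both are divided by the total number of benchmarks and returns a value in [0, 1]. The names `composite`, `WEIGHTS`, and all parameters are hypothetical.

```python
from statistics import mean

# Weights as listed on the page: mean score, breadth, top-count.
WEIGHTS = {"mean": 0.6, "breadth": 0.3, "top": 0.1}


def composite(scores: list[float], total_benchmarks: int, top_count: int) -> float:
    """Weighted composite for one actor.

    scores           -- the actor's score on each benchmark they attempted, in [0, 1]
    total_benchmarks -- number of benchmarks available
    top_count        -- number of benchmarks where the actor holds the top score

    Assumption: breadth and top-count are normalized by total_benchmarks.
    """
    if not scores or total_benchmarks <= 0:
        return 0.0
    mu = mean(scores)
    breadth = len(scores) / total_benchmarks
    top = top_count / total_benchmarks
    return WEIGHTS["mean"] * mu + WEIGHTS["breadth"] * breadth + WEIGHTS["top"] * top


# Hypothetical actor: two of four benchmarks attempted, top score on one.
print(round(composite([1.0, 0.5], total_benchmarks=4, top_count=1), 3))
```

The weights sum to 1, so under this normalization a perfect actor (full marks, every benchmark attempted, top on all of them) scores exactly 1. This sketch is not meant to reproduce the leaderboard value shown above.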
