CAT·01
Segmentation
Pixel-perfect moose segmentation
A vetted workforce of field biologists annotates bounding boxes, antler-tine keypoints, and pelage masks on an animal that is, on average, the size of a delivery van and visible from a moving train. Inter-annotator agreement holds above 98.4%.
CAT·02
Alignment
RLHF, where the H is a moose
We collect pairwise preference data directly from the herd. Two forage options, one moose, one ranked judgment, logged. Your model learns to align with what a fourteen-hundred-pound ruminant actually prefers, which turns out to be willow.
CAT·03
Evaluation
MOOSE-bench evaluation suite
Twelve thousand held-out examples measuring exactly one capability: can your model tell a moose from a horse wearing a winter coat? Most frontier models score below 60%. We send you the number on a card.