WildDash 2 Benchmark

For all metrics, higher scores are better. To participate in the benchmark, check our submission instructions.

	Meta AVG	Classic		Negative	Impact (AP)
Algorithm	AP	AP	AP 50%	AP	Blur	Coverage	Distortion	Hood	Occ.	Overexp.	Particles	Screen	Underexp.	Var.
RUSH_ROB	26.6%	27.6%	56.5%	21.2%	-25%	0%	0%	-9%	-15%	-46%	-21%	-16%	-9%	-22%
NL_ROI_ROB	21.9%	19.4%	34.0%	19.7%	-65%	-21%	-34%	0%	-36%	-30%	-31%	0%	0%	-43%
MaskRCNN-R-50-FPN-GN	20.0%	21.1%	49.9%	14.2%	-45%	-5%	-18%	-21%	-17%	-71%	-15%	-22%	-39%	-36%
MRCNN_CS	12.2%	12.4%	28.3%	6.9%	-13%	0%	-56%	-33%	-35%	-65%	-12%	-33%	-53%	-61%
MRCNN_VSCMLab_ROB	11.8%	9.3%	18.7%	63.3%	-60%	-13%	-29%	-14%	-57%	-63%	0%	-54%	-33%	-63%
MaskRCNN_ROB	9.0%	9.0%	20.2%	6.6%	-52%	-27%	-36%	-40%	-44%	-43%	-47%	-68%	-38%	-53%
MRCNN++_VSCMLab_ROB	8.4%	7.3%	14.7%	40.7%	-55%	-20%	-32%	-35%	-41%	-78%	-32%	-27%	-64%	-67%
Sem2Ins	8.3%	7.7%	14.5%	3.9%	-62%	0%	-26%	-30%	-51%	-58%	-27%	0%	0%	-3%
MRCNN_K	4.0%	3.9%	8.2%	4.2%	-54%	-13%	-23%	-15%	-47%	-55%	-57%	-27%	-42%	-29%
BAMRCNN_ROB	1.7%	0.8%	1.9%	24.9%	-14%	-4%	-83%	-35%	-89%	-51%	-33%	0%	-43%	-21%

Cached July 15, 2025, 9:23 p.m. UTC+0

Methodology:
Our benchmark evaluates the negative impact of common visual hazards on algorithm performance. The impact is calculated as:
impact = min(metric_low, metric_high) / max(metric_none, metric_low) - 1.0
Each metric_{none/low/high} is evaluated on the subset of the benchmark dataset that corresponds to the identified severity of the hazard (e.g. the subset Blur_high contains images with a lot of visible blur). Positive impacts are truncated to zero.
An impact of -10% for Blur means that the algorithm's performance is expected to degrade by 10 percent on an image with considerable blur, compared to a similar image without noticeable blur.
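The formula above can be sketched in a few lines of Python. This is only an illustration, not the benchmark's evaluation code: the function name hazard_impact is hypothetical, and returning 0.0 when the denominator is zero is an assumption, since the text does not say how that case is handled.

```python
def hazard_impact(metric_none: float, metric_low: float, metric_high: float) -> float:
    """Relative performance change caused by a hazard, as a fraction (-0.10 == -10%).

    Follows: impact = min(metric_low, metric_high) / max(metric_none, metric_low) - 1.0
    with positive impacts truncated to zero.
    """
    denominator = max(metric_none, metric_low)
    if denominator <= 0.0:
        # Assumption (not stated in the source): without a usable reference
        # score, report no measurable impact instead of dividing by zero.
        return 0.0
    impact = min(metric_low, metric_high) / denominator - 1.0
    # Truncate positive impacts to zero, as the methodology describes.
    return min(impact, 0.0)


# Made-up scores: 0.40 AP without the hazard, 0.38 at low and 0.36 at high severity.
print(f"{hazard_impact(0.40, 0.38, 0.36):.0%}")  # -10%
# Scores that improve with severity give a positive ratio, which is truncated.
print(f"{hazard_impact(0.30, 0.35, 0.40):.0%}")  # 0%
```

The scores in the usage lines are invented for illustration and do not come from the results table.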
These are all currently evaluated hazards:
Blur: Image is noticeably affected by blur (e.g. motion blur, defocusing, compression artifacts...)
Coverage: Normally visible parts of the road are covered (e.g. unusual lane markings, snow, leaves...)
Distortion: Visible lens distortion
Hood: Ego-vehicle is visible, non-windscreen parts (e.g. car hood, mirrors)
Occ.: Objects are partially occluded or cut off by the image border
Overexp.: The scene is overexposed
Particles: Particles in the air obstruct the view (e.g. heavy rain, snow, fog)
Screen: The windscreen is interfering (e.g. interior reflections, wipers, rain on the windscreen...)
Underexp.: The image is underexposed
Var.: Intra-class variations within the image (i.e. unusual representations of labels, such as unique cars)
More details on evaluation metrics and negative test cases can also be found on the FAQ page.