Official codebase for the "rsbench: A Benchmark Suite for Systematically Evaluating Reasoning Shortcuts" benchmark paper.

unitn-sml/rsbench-code

RSbench

This is the official codebase for "RSbench: A Benchmark Suite for Systematically Evaluating Reasoning Shortcuts", NeurIPS 2024. The suite provides tools for generating datasets that expose Reasoning Shortcuts (RSs) and for evaluating models on them.

Content Overview

  • rsscount: Use this module to count RSs in your datasets.

  • rsseval: Evaluate the presence and impact of RSs using the rsbench datasets.

  • rssgen: Generate datasets designed to study and analyze RSs effectively.

Each component is designed to help you systematically assess and understand RSs in various machine learning models.
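
To make the idea of counting RSs concrete, here is a minimal sketch on a toy task. It is not the rsscount implementation or its API. It assumes two binary concepts, the label `y = c1 XOR c2`, and a deliberately simple working definition: an RS is any deterministic remapping of ground-truth concepts to predicted concepts, other than the identity, that still yields the correct label for every input. The names `xor` and `count_label_preserving_maps` are hypothetical and chosen only for this illustration.

```python
from itertools import product


def xor(concepts):
    """Toy prior knowledge: the label is the XOR of two binary concepts."""
    return concepts[0] ^ concepts[1]


def count_label_preserving_maps(knowledge, concept_vectors):
    """Count maps alpha with knowledge(alpha(c)) == knowledge(c) for every c.

    Each candidate map assigns one predicted concept vector to each
    ground-truth concept vector. This brute-force enumeration is an
    illustrative assumption, not the method used by rsscount.
    """
    count = 0
    for images in product(concept_vectors, repeat=len(concept_vectors)):
        if all(knowledge(img) == knowledge(c)
               for c, img in zip(concept_vectors, images)):
            count += 1
    return count


# All four ground-truth concept vectors: (0, 0), (0, 1), (1, 0), (1, 1).
concept_vectors = list(product([0, 1], repeat=2))

total = count_label_preserving_maps(xor, concept_vectors)
# The identity map is the intended solution; every other map is a shortcut.
num_shortcuts = total - 1

print(f"label-preserving maps: {total}, reasoning shortcuts: {num_shortcuts}")
```

Exhaustive enumeration grows exponentially with the number of concept vectors, so a sketch like this is only practical for very small tasks.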

Website

For more info, go to the dedicated website: Link to Website.
