ferret.Benchmark.evaluate_samples

Benchmark.evaluate_samples(dataset: BaseDataset, sample: int | List[int], target: int | None = None, show_progress_bar: bool = True, n_workers: int = 1, **evaluation_args) → Dict

Explain a dataset sample, evaluate explanations, and compute average scores.

Parameters:
  • dataset (BaseDataset) – XAI dataset to explain and evaluate

  • sample (Union[int, List[int]]) – index or list of indexes of the dataset samples to explain and evaluate

  • target (int, optional) – class label for which the explanations are computed and evaluated. If None, explanations are computed and evaluated for the predicted class

  • show_progress_bar (bool) – enable the progress bar

  • n_workers (int) – number of workers

Returns:

the average evaluation scores and their standard deviation for each explainer. The result has the following form: {explainer: {"evaluation_measure": (avg_score, std)}}

Return type:

Dict
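A minimal usage sketch follows. It assumes a Benchmark built from a Hugging Face sequence-classification model and tokenizer; the model name, the bench.load_dataset("hatexplain") helper, and the bench.show_samples_evaluation_table call are taken from the library's README-style examples and should be adapted to your own setup.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from ferret import Benchmark

# Illustrative model/tokenizer pair; swap in a model suited to your dataset.
name = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = AutoTokenizer.from_pretrained(name)

bench = Benchmark(model, tokenizer)

# Assumed helper: load a built-in XAI dataset (a BaseDataset instance).
dataset = bench.load_dataset("hatexplain")

# Explain and evaluate three samples; target=None uses the predicted class.
scores = bench.evaluate_samples(dataset, sample=[0, 1, 2], n_workers=1)

# scores is a dict of the form {explainer: {"evaluation_measure": (avg_score, std)}}
for explainer, measures in scores.items():
    print(explainer, measures)

# Optional: render the averaged scores as a table (assumed helper).
bench.show_samples_evaluation_table(scores)
```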