ferret.Benchmark

class ferret.Benchmark(model, tokenizer, task_name: str = 'text-classification', explainers: List | None = None, evaluators: List | None = None, class_based_evaluators: List | None = None)

Generic interface to compute and evaluate multiple explanations.

__init__(model, tokenizer, task_name: str = 'text-classification', explainers: List | None = None, evaluators: List | None = None, class_based_evaluators: List | None = None)
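
A minimal construction sketch, assuming a Hugging Face sequence-classification model and tokenizer; the checkpoint name below is only illustrative, and explainers and evaluators are left at their None defaults:

    from transformers import AutoModelForSequenceClassification, AutoTokenizer
    from ferret import Benchmark

    # Illustrative checkpoint; any sequence-classification model works.
    name = "distilbert-base-uncased-finetuned-sst-2-english"
    model = AutoModelForSequenceClassification.from_pretrained(name)
    tokenizer = AutoTokenizer.from_pretrained(name)

    # explainers and evaluators are not passed, so they keep their None defaults.
    bench = Benchmark(model, tokenizer, task_name="text-classification")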

Methods

__init__(model, tokenizer[, task_name, ...])

evaluate_explanation(explanation[, ...])

Evaluate an explanation using all the evaluators stored in the class.

evaluate_explanations(explanations[, ...])

Evaluate explanations using all the evaluators stored in the class.

evaluate_samples(dataset, sample[, target, ...])

Explain a dataset sample, evaluate explanations, and compute average scores.

explain(text[, target, show_progress, ...])

Compute explanations using all the explainers stored in the class.

load_dataset(dataset_name, **kwargs)

score(text[, return_dict])

Compute prediction scores for a single query.

show_evaluation_table(explanation_evaluations)

show_samples_evaluation_table(...[, apply_style])

Format average evaluation scores into a colored table.

show_table(explanations[, ...])
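
A sketch of the single-text workflow on the bench object built above, assuming a binary sentiment model where target=1 is the positive class; the input string and target value are only illustrative:

    # Prediction scores for a single query.
    scores = bench.score("You look stunning!", return_dict=True)

    # One explanation per configured explainer, for the chosen target class.
    explanations = bench.explain("You look stunning!", target=1)
    bench.show_table(explanations)

    # Run every configured evaluator on each explanation and display the results.
    evaluations = bench.evaluate_explanations(explanations, target=1)
    bench.show_evaluation_table(evaluations)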

Attributes

targets
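
Dataset-level evaluation follows the same pattern; the dataset name "hatexplain" and the sample indices below are assumptions and should be replaced with a dataset actually supported by load_dataset:

    # Hypothetical dataset name; pass any name accepted by load_dataset.
    dataset = bench.load_dataset("hatexplain")

    # Explain and evaluate a few samples, then show the averaged scores.
    sample_evaluations = bench.evaluate_samples(dataset, sample=[0, 1, 2])
    bench.show_samples_evaluation_table(sample_evaluations)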