ferret.Benchmark
- class ferret.Benchmark(model, tokenizer, task_name: str = 'text-classification', explainers: List | None = None, evaluators: List | None = None, class_based_evaluators: List | None = None)
Generic interface to compute multiple explanations.
- __init__(model, tokenizer, task_name: str = 'text-classification', explainers: List | None = None, evaluators: List | None = None, class_based_evaluators: List | None = None)
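A minimal construction sketch, assuming a HuggingFace sequence-classification checkpoint; the model name below is only an illustrative choice, and any compatible model/tokenizer pair should work:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

from ferret import Benchmark

# Illustrative checkpoint; any HuggingFace sequence-classification
# model/tokenizer pair can be substituted here.
name = "cardiffnlp/twitter-xlm-roberta-base-sentiment"
model = AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = AutoTokenizer.from_pretrained(name)

# Leaving explainers/evaluators/class_based_evaluators as None
# selects the library's default sets.
bench = Benchmark(model, tokenizer, task_name="text-classification")
```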
Methods

| Method | Description |
| --- | --- |
| `__init__(model, tokenizer[, task_name, ...])` |  |
| `evaluate_explanation(explanation[, ...])` | Evaluate an explanation using all the evaluators stored in the class. |
| `evaluate_explanations(explanations[, ...])` | Evaluate explanations using all the evaluators stored in the class. |
| `evaluate_samples(dataset, sample[, target, ...])` | Explain a dataset sample, evaluate explanations, and compute average scores. |
| `explain(text[, target, show_progress, ...])` | Compute explanations using all the explainers stored in the class. |
| `load_dataset(dataset_name, **kwargs)` |  |
| `score(text[, return_dict])` | Compute prediction scores for a single query. |
| `show_evaluation_table(explanation_evaluations)` |  |
| `show_samples_evaluation_table(...[, apply_style])` | Format average evaluation scores into a colored table. |
| `show_table(explanations[, ...])` |  |

Attributes
targets
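Putting the methods above together, a minimal end-to-end sketch, continuing from the `bench` instance constructed earlier; the input text, target class index, and dataset name are illustrative (`hatexplain` is one of the datasets ferret's documentation loads via `load_dataset`):

```python
text = "You look stunning!"  # illustrative input

# Prediction scores for a single query.
scores = bench.score(text)

# One explanation per configured explainer; target is the class
# index to explain (illustrative here).
explanations = bench.explain(text, target=1)
bench.show_table(explanations)

# Evaluate every explanation with every configured evaluator.
evaluations = bench.evaluate_explanations(explanations, target=1)
bench.show_evaluation_table(evaluations)

# Dataset-level workflow: explain a few samples, evaluate the
# explanations, and show the averaged scores.
dataset = bench.load_dataset("hatexplain")
sample_evaluations = bench.evaluate_samples(dataset, sample=[0, 1, 2])
bench.show_samples_evaluation_table(sample_evaluations)
```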