Benchmark#

Constructor#

Benchmark(model, tokenizer[, task_name, ...])

Generic interface to compute multiple explanations.
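
A minimal construction sketch, assuming a Hugging Face sequence-classification model (the checkpoint name below is illustrative):

    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    from ferret import Benchmark

    # Any sequence-classification checkpoint works; this one is illustrative.
    name = "distilbert-base-uncased-finetuned-sst-2-english"
    model = AutoModelForSequenceClassification.from_pretrained(name)
    tokenizer = AutoTokenizer.from_pretrained(name)

    bench = Benchmark(model, tokenizer)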

Explaining#

Benchmark.explain(text[, target, ...])

Compute explanations using all the explainers stored in the class.
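
A usage sketch, continuing with the `bench` instance from the constructor example (the input text and target class are illustrative):

    # Returns one explanation per registered explainer for the given target class.
    explanations = bench.explain("You look stunning!", target=1)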

Benchmarking Explanations#

Benchmark.evaluate_explanation(explanation)

Evaluate an explanation using all the evaluators stored in the class.
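
For example, assuming the `explanations` list from Benchmark.explain above (passing `target` here mirrors the explain call and is an assumption):

    # Score a single explanation with every registered evaluator.
    evaluation = bench.evaluate_explanation(explanations[0], target=1)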

Benchmark.evaluate_explanations(explanations)

Evaluate explanations using all the evaluators stored in the class.
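
The batch counterpart, under the same assumptions:

    # Evaluate every explanation in one call.
    evaluations = bench.evaluate_explanations(explanations, target=1)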

Visualization#

Benchmark.show_table(explanations[, ...])

Format explanation scores into a colored table.

Benchmark.show_evaluation_table(...[, style])

Format evaluation scores into a colored table.

Benchmark.show_samples_evaluation_table(...)

Format average evaluation scores into a colored table.
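
A visualization sketch, assuming the `explanations` and `evaluations` objects built in the examples above (the tables render in a Jupyter-style display):

    bench.show_table(explanations)            # per-token attribution scores
    bench.show_evaluation_table(evaluations)  # evaluation scores for each explainer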

Datasets Interface#

Benchmark.load_dataset(dataset_name, **kwargs)

Load a supported dataset by name.

Benchmark.evaluate_samples(dataset, sample)

Explain a dataset sample, evaluate explanations, and compute average scores.
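
A sketch of the dataset workflow ("hatexplain" is one of the bundled datasets; the sample index is illustrative):

    dataset = bench.load_dataset("hatexplain")

    # Explain and evaluate the chosen sample, then average the scores.
    sample_evaluations = bench.evaluate_samples(dataset, sample=32)
    bench.show_samples_evaluation_table(sample_evaluations)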

Inference#

Benchmark.score(text[, return_dict])

Compute prediction scores for a single query.
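
For example (the input text is illustrative, and keying the result by class label via `return_dict` is an assumption about its behavior):

    # Class scores for one input; with return_dict=True the result is keyed by label.
    scores = bench.score("You look stunning!", return_dict=True)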