Modules#
Benchmark#
- class ferret.benchmark.Benchmark(model, tokenizer, explainers: List | None = None, evaluators: List | None = None, class_based_evaluators: List | None = None)[source]#
Generic interface to compute multiple explanations.
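A minimal setup sketch (the checkpoint name is an illustrative assumption; any Hugging Face sequence-classification model and tokenizer should work). The snippets under the methods below continue from this `bench` instance:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from ferret import Benchmark

# Illustrative checkpoint; swap in your own sequence-classification model.
name = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = AutoTokenizer.from_pretrained(name)

# Leaving explainers/evaluators as None presumably selects the library defaults.
bench = Benchmark(model, tokenizer)
```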
- evaluate_explanation(explanation: Explanation | ExplanationWithRationale, target, human_rationale=None, class_explanation: List[Explanation | ExplanationWithRationale] | None = None, show_progress: bool = True, **evaluation_args) ExplanationEvaluation [source]#
Evaluate an explanation using all the evaluators stored in the class.
- Parameters:
explanation (Union[Explanation, ExplanationWithRationale]) – explanation to evaluate.
target (int) – class label for which the explanation is evaluated
human_rationale (list) – one-hot encoding indicating whether each token is in the human (or ground-truth) rationale (1) or not (0)
class_explanation (list) – list of explanations, one per target class: the explanation in position i is computed using class label i as the target, so the length equals the number of target classes. If provided, class-based scores are computed.
show_progress (bool) – enable progress bar
- Returns:
the evaluation of the explanation
- Return type:
ExplanationEvaluation
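A sketch of evaluating a single explanation, continuing from the `Benchmark` sketch above (the rationale values are illustrative; the vector length must match the explanation's token count):

```python
explanations = bench.explain("You look stunning!", target=1)

# One-hot human rationale: 1 marks tokens a human annotator deemed important.
# Values are illustrative; the length must match the explanation's tokens.
rationale = [0, 1, 1, 1, 0, 0]

evaluation = bench.evaluate_explanation(
    explanations[0], target=1, human_rationale=rationale
)
```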
- evaluate_explanations(explanations: List[Explanation | ExplanationWithRationale], target, human_rationale=None, class_explanations=None, show_progress=True, **evaluation_args) List[ExplanationEvaluation] [source]#
Evaluate explanations using all the evaluators stored in the class.
- Parameters:
explanations (List[Union[Explanation, ExplanationWithRationale]]) – list of explanations to evaluate.
target (int) – class label for which the explanations are evaluated
human_rationale (list) – one-hot encoding indicating whether each token is in the human rationale (1) or not (0). If provided, every explanation is also evaluated against the human rationale.
class_explanations (list) – list of lists of explanations, one inner list per explanation: the explanation in position (k, i) is computed for the k-th explanation using class label i as the target, so the shape is #explanations × #target classes. If provided, class-based scores are computed.
show_progress (bool) – enable progress bar
- Returns:
the evaluation for each explanation
- Return type:
List[ExplanationEvaluation]
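To evaluate every stored explainer's output at once (continuing from the sketch above):

```python
explanations = bench.explain("You look stunning!", target=1)
evaluations = bench.evaluate_explanations(explanations, target=1)
```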
- evaluate_samples(dataset: BaseDataset, sample: int | List[int], target: int | None = None, show_progress_bar: bool = True, n_workers: int = 1, **evaluation_args) Dict [source]#
Explain a dataset sample, evaluate explanations, and compute average scores.
- Parameters:
dataset (BaseDataset) – XAI dataset to explain and evaluate
sample (Union[int, List[int]]) – index or list of indices of the samples to explain and evaluate
target (int) – class label for which the explanations are computed and evaluated. If None, explanations are computed and evaluated for the predicted class
show_progress_bar (bool) – enable progress bar
n_workers (int) – number of workers
- Returns:
the average evaluation scores and their standard deviation for each explainer, in the form {explainer: {"evaluation_measure": (avg_score, std)}}
- Return type:
Dict
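A sketch using one of ferret's XAI dataset wrappers (the `HateXplainDataset` import path is an assumption; any `BaseDataset` works):

```python
from ferret import HateXplainDataset  # assumed top-level export

data = HateXplainDataset(tokenizer)
# Explain and evaluate the first three samples; returns averaged scores per explainer.
scores_by_explainer = bench.evaluate_samples(data, sample=[0, 1, 2])
```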
- explain(text, target=1, show_progress: bool = True, normalize_scores: bool = True, order: int = 1) List[Explanation] [source]#
Compute explanations using all the explainers stored in the class.
- Parameters:
text (str) – text string to explain.
target (int) – class label to produce the explanations for
show_progress (bool) – enable progress bar
normalize_scores (bool) – apply lp normalization to the scores so they are comparable across explainers
order (int) – order p of the lp norm used when normalize_scores is True
- Returns:
list of all explanations produced
- Return type:
List[Explanation]
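For example, continuing from the `Benchmark` sketch above:

```python
explanations = bench.explain("You look stunning!", target=1)

# One Explanation per stored explainer; the printed repr includes tokens and scores.
for e in explanations:
    print(e)
```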
- get_dataframe(explanations: List[Explanation]) DataFrame [source]#
Convert explanations into a pandas DataFrame.
- Parameters:
explanations (List[Explanation]) – list of explanations
- Returns:
explanations in table format. The columns are the tokens and the rows are the explanation scores, one for each explainer.
- Return type:
pd.DataFrame
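Continuing the sketch above:

```python
df = bench.get_dataframe(explanations)
print(df)  # one row per explainer, one column per token
```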
- score(text: str, return_dict: bool = True)[source]#
Compute prediction scores for a single query.
- Parameters:
text (str) – query text to compute prediction scores for
return_dict (bool) – if True (default), return a dict mapping each class label to its score; otherwise, return the softmaxed logits as a torch.Tensor
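For example (the label names come from the model's own config, so the keys below are illustrative):

```python
scores = bench.score("You look stunning!")
print(scores)  # e.g. {'NEGATIVE': 0.0002, 'POSITIVE': 0.9998} -- illustrative values
```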
- show_evaluation_table(explanation_evaluations: List[ExplanationEvaluation], apply_style: bool = True) DataFrame [source]#
Format evaluation scores into a colored table.
- Parameters:
explanation_evaluations (List[ExplanationEvaluation]) – a list of evaluations of explanations
apply_style (bool) – color the table of evaluation scores
- Returns:
a colored (styled) pandas DataFrame of evaluation scores
- Return type:
pd.DataFrame
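Continuing the evaluation sketch above (in a notebook, the returned styled DataFrame renders as a color-coded table):

```python
bench.show_evaluation_table(evaluations)
```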
- show_samples_evaluation_table(evaluation_scores_by_explainer, apply_style: bool = True) DataFrame [source]#
Format average evaluation scores into a colored table.
- Parameters:
evaluation_scores_by_explainer (Dict) – the average evaluation scores and their standard deviation for each explainer (the output of evaluate_samples)
apply_style (bool) – color the table of average evaluation scores
- Returns:
a colored (styled) pandas DataFrame of the average evaluation scores across the evaluated samples
- Return type:
pd.DataFrame
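Continuing the `evaluate_samples` sketch above:

```python
bench.show_samples_evaluation_table(scores_by_explainer)
```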
- show_table(explanations: List[Explanation], apply_style: bool = True, remove_first_last: bool = True) DataFrame [source]#
Format explanation scores into a colored table.
- Parameters:
explanations (List[Explanation]) – list of explanations
apply_style (bool) – apply color to the table of explanation scores
remove_first_last (bool) – do not display the first and last tokens, which are typically the special CLS and EOS tokens
- Returns:
a colored (styled) pandas DataFrame of explanation scores
- Return type:
pd.DataFrame
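Continuing the `explain` sketch above:

```python
bench.show_table(explanations)  # drops the first/last special tokens by default
```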
Explainers#
- class ferret.explainers.gradient.GradientExplainer(model, tokenizer, multiply_by_inputs: bool = True)[source]#
- NAME = 'Gradient'#
- class ferret.explainers.gradient.IntegratedGradientExplainer(model, tokenizer, multiply_by_inputs: bool = True)[source]#
- NAME = 'Integrated Gradient'#
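To benchmark only the gradient-based explainers, the instances can be passed to `Benchmark` explicitly (a sketch using only the constructors documented on this page):

```python
from ferret import Benchmark
from ferret.explainers.gradient import (
    GradientExplainer,
    IntegratedGradientExplainer,
)

bench = Benchmark(
    model,
    tokenizer,
    explainers=[
        GradientExplainer(model, tokenizer, multiply_by_inputs=True),
        IntegratedGradientExplainer(model, tokenizer, multiply_by_inputs=True),
    ],
)
```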