Modules#

Benchmark#

class ferret.benchmark.Benchmark(model, tokenizer, explainers: List | None = None, evaluators: List | None = None, class_based_evaluators: List | None = None)[source]#

Generic interface to compute multiple explanations.
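
A minimal setup sketch, assuming a Hugging Face sequence-classification model and its tokenizer (the checkpoint name below is only illustrative):

    from transformers import AutoModelForSequenceClassification, AutoTokenizer
    from ferret import Benchmark

    # Any sequence-classification checkpoint works; this one is an example.
    name = "distilbert-base-uncased-finetuned-sst-2-english"
    model = AutoModelForSequenceClassification.from_pretrained(name)
    tokenizer = AutoTokenizer.from_pretrained(name)

    # explainers and evaluators are optional; leaving them as None falls back
    # to the library defaults.
    bench = Benchmark(model, tokenizer)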

evaluate_explanation(explanation: Explanation | ExplanationWithRationale, target, human_rationale=None, class_explanation: List[Explanation | ExplanationWithRationale] | None = None, show_progress: bool = True, **evaluation_args) ExplanationEvaluation[source]#

Evaluate an explanation using all the evaluators stored in the class.

Parameters:
  • explanation (Union[Explanation, ExplanationWithRationale]) – explanation to evaluate.

  • target (int) – class label for which the explanation is evaluated

  • human_rationale (list) – one-hot encoding indicating whether each token is in the human (ground-truth) rationale (1) or not (0)

  • class_explanation (list) – list of explanations, one per target class: the explanation at position i is computed using class label i as the target, so the length equals the number of target classes. If provided, class-based scores are computed.

  • show_progress (bool) – enable progress bar

Returns:

the evaluation of the explanation

Return type:

ExplanationEvaluation
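
A minimal usage sketch, assuming the Benchmark instance from the class example above and an explanation produced with explain() (documented below); the text and target are illustrative:

    explanation = bench.explain("I love this movie", target=1)[0]
    # Evaluate the single explanation for the same target class.
    evaluation = bench.evaluate_explanation(explanation, target=1)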

evaluate_explanations(explanations: List[Explanation | ExplanationWithRationale], target, human_rationale=None, class_explanations=None, show_progress=True, **evaluation_args) List[ExplanationEvaluation][source]#

Evaluate explanations using all the evaluators stored in the class.

Parameters:
  • explanations (List[Union[Explanation, ExplanationWithRationale]]) – list of explanations to evaluate.

  • target (int) – class label for which the explanations are evaluated

  • human_rationale (list) – one-hot encoding indicating whether each token is in the human rationale (1) or not (0). If provided, all explanations are also evaluated against the human rationale.

  • class_explanations (list) – list of lists of explanations. The k-th element is the list of explanations for the k-th input computed varying the target class: the explanation at position (k, i) uses class label i as the target. The size is (#explanations, #target classes). If provided, class-based scores are computed.

  • show_progress (bool) – enable progress bar

Returns:

the evaluation for each explanation

Return type:

List[ExplanationEvaluation]
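
A hedged sketch continuing the example above, evaluating all explanations of one input at once:

    explanations = bench.explain("I love this movie", target=1)
    evaluations = bench.evaluate_explanations(explanations, target=1)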

evaluate_samples(dataset: BaseDataset, sample: int | List[int], target: int | None = None, show_progress_bar: bool = True, n_workers: int = 1, **evaluation_args) Dict[source]#

Explain a dataset sample, evaluate explanations, and compute average scores.

Parameters:
  • dataset (BaseDataset) – XAI dataset to explain and evaluate

  • sample (Union[int, List[int]]) – index or list of indexes

  • target (int) – class label for which the explanations are computed and evaluated. If None, explanations are computed and evaluated for the predicted class

  • show_progress_bar (bool) – enable progress bar

  • n_workers (int) – number of workers

Returns:

the average evaluation scores and their standard deviation for each explainer, in the form {explainer: {"evaluation_measure": (avg_score, std)}}

Return type:

Dict
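
A hedged sketch, assuming a dataset loaded through load_dataset(); the dataset name is an example of a supported XAI dataset and may need adjusting:

    # "hatexplain" is used here purely as an example dataset name.
    data = bench.load_dataset("hatexplain")
    # Explain and evaluate three samples, then average the scores per explainer.
    sample_scores = bench.evaluate_samples(data, sample=[0, 1, 2])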

explain(text, target=1, show_progress: bool = True, normalize_scores: bool = True, order: int = 1) List[Explanation][source]#

Compute explanations using all the explainers stored in the class.

Parameters:
  • text (str) – text string to explain.

  • target (int) – class label to produce the explanations for

  • show_progress (bool) – enable progress bar

  • normalize_scores (bool) – apply Lp-normalization so that scores are comparable across explainers

Returns:

list of all explanations produced

Return type:

List[Explanation]
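
A minimal sketch; the input text and target class are illustrative:

    # One Explanation per explainer stored in the Benchmark.
    explanations = bench.explain("You look stunning!", target=1)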

get_dataframe(explanations: List[Explanation]) DataFrame[source]#

Convert explanations into a pandas DataFrame.

Parameters:

explanations (List[Explanation]) – list of explanations

Returns:

explanations in table format. The columns are the tokens and the rows are the explanation scores, one for each explainer.

Return type:

pd.DataFrame
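
A minimal sketch, reusing the explanations computed with explain() above:

    df = bench.get_dataframe(explanations)
    print(df)  # rows: one per explainer, columns: tokens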

load_dataset(dataset_name: str, **kwargs)[source]#
score(text: str, return_dict: bool = True)[source]#

Compute prediction scores for a single query.

Parameters:
  • text (str) – query to compute the logits for

  • return_dict (bool) – if True, return a dict mapping each class label to its score. Otherwise, return the softmaxed logits as a torch.Tensor. Default: True
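
A minimal sketch (the query string is illustrative):

    scores = bench.score("You look stunning!")                    # dict: class label -> score
    probs = bench.score("You look stunning!", return_dict=False)  # softmaxed logits as a torch.Tensor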

show_evaluation_table(explanation_evaluations: List[ExplanationEvaluation], apply_style: bool = True) DataFrame[source]#

Format evaluation scores into a colored table.

Parameters:
  • explanation_evaluations (List[ExplanationEvaluation]) – a list of evaluations of explanations

  • apply_style (bool) – color the table of evaluation scores

Returns:

a colored (styled) pandas dataframe of evaluation scores

Return type:

pd.DataFrame
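
A minimal sketch, reusing the evaluations computed with evaluate_explanations() above:

    # Returns a styled pandas DataFrame; it renders as a colored table in a notebook.
    bench.show_evaluation_table(evaluations)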

show_samples_evaluation_table(evaluation_scores_by_explainer, apply_style: bool = True) DataFrame[source]#

Format average evaluation scores into a colored table.

Parameters:
  • evaluation_scores_by_explainer (Dict) – the average evaluation scores and their standard deviation for each explainer (the output of evaluate_samples)

  • apply_style (bool) – color the table of average evaluation scores

Returns:

a colored (styled) pandas dataframe of average evaluation scores of explanations of a sample

Return type:

pd.DataFrame
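
A minimal sketch, reusing the output of evaluate_samples() above:

    bench.show_samples_evaluation_table(sample_scores)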

show_table(explanations: List[Explanation], apply_style: bool = True, remove_first_last: bool = True) DataFrame[source]#

Format explanation scores into a colored table.

Parameters:
  • explanations (List[Explanation]) – list of explanations

  • apply_style (bool) – apply color to the table of explanation scores

  • remove_first_last (bool) – do not visualize the first and last tokens, typically the special CLS and EOS tokens

Returns:

a colored (styled) pandas dataframe of explanation scores

Return type:

pd.DataFrame
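
A minimal sketch, reusing the explanations computed with explain() above:

    # Returns a styled pandas DataFrame of token-level scores, one row per explainer.
    bench.show_table(explanations)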

Explainers#

class ferret.explainers.gradient.GradientExplainer(model, tokenizer, multiply_by_inputs: bool = True)[source]#
NAME = 'Gradient'#
compute_feature_importance(text: str, target: False, **explainer_args)[source]#
class ferret.explainers.gradient.IntegratedGradientExplainer(model, tokenizer, multiply_by_inputs: bool = True)[source]#
NAME = 'Integrated Gradient'#
compute_feature_importance(text, target, **explainer_args)[source]#
class ferret.explainers.shap.SHAPExplainer(model, tokenizer)[source]#
NAME = 'Partition SHAP'#
compute_feature_importance(text, target=1, **explainer_args)[source]#
class ferret.explainers.lime.LIMEExplainer(model, tokenizer)[source]#
MAX_SAMPLES = 5000#
NAME = 'LIME'#
compute_feature_importance(text, target=1, **explainer_args)[source]#
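
Explainers can also be used on their own, outside a Benchmark. A hedged sketch with the Partition SHAP explainer, assuming the same model and tokenizer as in the Benchmark example (the input text is illustrative):

    from ferret.explainers.shap import SHAPExplainer

    explainer = SHAPExplainer(model, tokenizer)
    explanation = explainer.compute_feature_importance("You look stunning!", target=1)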