archetypax.tools package

Submodules

archetypax.tools.evaluation module

Evaluation metrics for Archetypal Analysis.

class archetypax.tools.evaluation.ArchetypalAnalysisEvaluator(model: ArchetypalAnalysis)

Bases: object

Evaluator for Archetypal Analysis results, especially for high-dimensional data.

Provides metrics and visualizations to assess model quality.

__init__(model: ArchetypalAnalysis)

Initialize the evaluator.

Parameters:

model – Fitted ArchetypalAnalysis model

archetype_feature_importance() → DataFrame

Analyze which features are most important for each archetype.

Returns:

DataFrame with feature importance for each archetype

archetype_separation() → dict[str, float]

Measure how well-separated the archetypes are.

Returns:

Dictionary with separation metrics
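As a rough illustration of what separation metrics can look like, the sketch below computes pairwise Euclidean distances between archetypes with NumPy. The dictionary keys (`min_distance`, `mean_distance`, `max_distance`) and the choice of Euclidean distance are assumptions made for this example; they are not the documented keys returned by `archetype_separation()`.

```python
import numpy as np

# Hypothetical archetype matrix with shape (n_archetypes, n_features).
archetypes = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])

# Pairwise Euclidean distances via broadcasting.
diffs = archetypes[:, None, :] - archetypes[None, :, :]
dist = np.linalg.norm(diffs, axis=-1)

# Keep each unordered pair once (upper triangle, diagonal excluded).
pairwise = dist[np.triu_indices(len(archetypes), k=1)]

# Assumed summary keys, for illustration only.
separation = {
    "min_distance": float(pairwise.min()),
    "mean_distance": float(pairwise.mean()),
    "max_distance": float(pairwise.max()),
}
print(separation)
```

A small minimum distance relative to the mean would suggest that two archetypes nearly coincide.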

clustering_metrics(X: ndarray) → dict[str, float]

Calculate clustering quality metrics by using dominant archetypes as cluster assignments.

Parameters:

X – Original data matrix

Returns:

Dictionary with clustering metrics
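The idea of using dominant archetypes as cluster assignments can be sketched as follows: each sample is assigned to the archetype with the largest weight, and any standard clustering score is then computed on those hard labels. The within-cluster sum of squares used here is only a stand-in chosen for this example; the metrics that `clustering_metrics()` actually reports are not listed in this reference.

```python
import numpy as np

# Hypothetical data (n_samples, n_features) and weights (n_samples, n_archetypes).
X = np.array([[0.0, 0.0], [0.2, 0.0], [4.0, 4.0], [4.0, 4.2]])
weights = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.3, 0.7]])

# Hard cluster label = index of the dominant archetype for each sample.
labels = weights.argmax(axis=1)

# Stand-in quality score: within-cluster sum of squared distances to the centroid.
inertia = 0.0
for k in np.unique(labels):
    members = X[labels == k]
    inertia += float(np.sum((members - members.mean(axis=0)) ** 2))

print(labels.tolist(), inertia)
```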

comprehensive_evaluation(X: ndarray) → dict[str, Any]

Run all evaluation metrics and return comprehensive results.

Parameters:

X – Original data matrix

Returns:

Dictionary with all evaluation metrics

convex_hull_metrics() → dict[str, Any]

Calculate metrics related to the convex hull formed by the archetypes.

This method evaluates whether the archetypes form a non-degenerate convex hull by calculating its volume/area and comparing it to the data’s convex hull.

Returns:

Dictionary with convex hull metrics, including:

  • volume/area of the convex hull

  • ratio compared to data hull volume/area

  • dimensionality of the hull
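To make the notions of hull dimensionality and degeneracy concrete, here is a minimal sketch for the special case where the archetypes are the vertices of a simplex. The helper `simplex_volume` is hypothetical and is not part of archetypax; the dimensionality is taken as the rank of the centered archetype matrix, and the volume comes from the Gram determinant of the edge vectors.

```python
import math
import numpy as np

def simplex_volume(vertices: np.ndarray) -> float:
    """Volume of the simplex spanned by k+1 vertices (hypothetical helper)."""
    edges = vertices[1:] - vertices[0]      # k edge vectors from the first vertex
    k = edges.shape[0]
    gram = edges @ edges.T                  # k x k Gram matrix
    det = max(float(np.linalg.det(gram)), 0.0)  # guard against tiny negative values
    return math.sqrt(det) / math.factorial(k)

# Three archetypes lying in the plane z = 1 of a 3-D feature space.
archetypes = np.array([[0.0, 0.0, 1.0], [2.0, 0.0, 1.0], [0.0, 2.0, 1.0]])

# Affine dimension of the hull: rank of the centered archetype matrix.
hull_dim = int(np.linalg.matrix_rank(archetypes - archetypes.mean(axis=0)))
area = simplex_volume(archetypes)
print(hull_dim, area)
```

A volume of zero, or a rank below `n_archetypes - 1`, indicates a degenerate hull. For general point sets, where the hull is not a simplex, `scipy.spatial.ConvexHull` is one way to obtain a volume.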

dominant_archetype_purity() → dict[str, Any]

Analyze how dominant each archetype is for its assigned samples.

Returns:

Dictionary with purity metrics
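One plausible reading of purity, used in the sketch below, is the weight a sample places on its dominant archetype. The grouping by dominant archetype and the variable names are assumptions made for illustration; the exact contents of the dictionary returned by `dominant_archetype_purity()` are not specified here.

```python
import numpy as np

# Hypothetical weight matrix (n_samples, n_archetypes); each row sums to 1.
weights = np.array([
    [0.90, 0.10, 0.00],
    [0.40, 0.35, 0.25],
    [0.10, 0.20, 0.70],
])

dominant = weights.argmax(axis=1)   # index of the dominant archetype per sample
purity = weights.max(axis=1)        # weight placed on that archetype

# Mean purity of the samples assigned to each archetype that dominates at least once.
mean_purity = {
    int(k): float(purity[dominant == k].mean()) for k in np.unique(dominant)
}
print(mean_purity)
```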

explained_variance(X: ndarray) → float

Calculate the explained variance of the model.

Parameters:

X – Data matrix

Returns:

Explained variance (0-1)
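For orientation, a common definition of explained variance is one minus the ratio of residual to total sum of squares. The sketch below assumes that definition and takes a reconstruction `X_hat` as input; whether archetypax computes the quantity in exactly this way is not confirmed by this reference.

```python
import numpy as np

def explained_variance_sketch(X: np.ndarray, X_hat: np.ndarray) -> float:
    """1 - residual SS / total SS (assumed definition, for illustration)."""
    residual = np.sum((X - X_hat) ** 2)
    total = np.sum((X - X.mean(axis=0)) ** 2)
    return float(1.0 - residual / total)

X = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])

# A perfect reconstruction explains all of the variance.
perfect = explained_variance_sketch(X, X)

# Reconstructing every sample as the feature mean explains none of it.
baseline = explained_variance_sketch(X, np.tile(X.mean(axis=0), (len(X), 1)))
print(perfect, baseline)
```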

plot_archetype_feature_comparison(top_n: int = 5, feature_names: list[str] | None = None) → None

Plot a radar chart or bar chart comparing the top N most important features for each archetype.

Parameters:
  • top_n – Number of top features to display

  • feature_names – Optional list of feature names

plot_convex_hull(feature_indices: list[int] | None = None, figsize: tuple[int, int] = (10, 8)) → None

Plot the convex hull formed by archetypes in 2D or 3D.

Parameters:
  • feature_indices – Indices of features to use for visualization (2 or 3 features)

  • figsize – Size of the figure

plot_distance_matrix() → None

Plot distance matrix between archetypes.

plot_entropy_vs_reconstruction(X: ndarray, n_samples: int = 1000) → None

Plot relationship between sample entropy and reconstruction error.

Parameters:
  • X – Original data matrix

  • n_samples – Number of samples to plot (random subset)

plot_feature_importance_heatmap(feature_names: list[str] | None = None) → None

Plot heatmap of feature importance across archetypes.

Parameters:

feature_names – Optional list of feature names

plot_purity_distribution() → None

Plot the distribution of dominant archetype weights (purity).

plot_weight_distributions(bins: int = 20) → None

Plot histograms of weight distributions for each archetype.

Parameters:

bins – Number of histogram bins

print_evaluation_report(X: ndarray) → None

Print a comprehensive evaluation report.

Parameters:

X – Original data matrix

reconstruction_error(X: ndarray, metric: str = 'frobenius') → float

Calculate the reconstruction error of the model.

Parameters:
  • X – Data matrix

  • metric – Error metric to use (‘frobenius’, ‘mae’, ‘mse’, or ‘relative’)

Returns:

Reconstruction error value
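The four metric names can be read as the usual matrix error measures. The sketch below spells out one conventional definition for each, given a reconstruction `X_hat`. These formulas are assumptions; in particular, normalizing the 'relative' metric by the Frobenius norm of `X` is a guess, and the library may define it differently.

```python
import numpy as np

def reconstruction_error_sketch(X: np.ndarray, X_hat: np.ndarray,
                                metric: str = "frobenius") -> float:
    """Conventional definitions of the four metric names (illustrative only)."""
    diff = X - X_hat
    if metric == "frobenius":
        return float(np.linalg.norm(diff, ord="fro"))
    if metric == "mae":
        return float(np.mean(np.abs(diff)))
    if metric == "mse":
        return float(np.mean(diff ** 2))
    if metric == "relative":
        # Assumed normalization: Frobenius norm of the original data.
        return float(np.linalg.norm(diff, ord="fro") / np.linalg.norm(X, ord="fro"))
    raise ValueError(f"Unknown metric: {metric!r}")

X = np.array([[3.0, 4.0], [0.0, 0.0]])
X_hat = np.zeros_like(X)  # deliberately poor reconstruction

errors = {m: reconstruction_error_sketch(X, X_hat, m)
          for m in ("frobenius", "mae", "mse", "relative")}
print(errors)
```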

weight_diversity() → dict[str, float]

Measure how diverse the weight distributions are across samples.

Returns:

Dictionary with diversity metrics
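A standard way to quantify how spread out a sample's weights are is Shannon entropy, normalized by its maximum value `log(n_archetypes)`. The sketch below assumes that measure; the function name `normalized_entropy` is hypothetical, and this reference does not state which diversity metrics `weight_diversity()` returns.

```python
import numpy as np

def normalized_entropy(weights: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Per-sample Shannon entropy scaled to [0, 1] (hypothetical helper)."""
    n_archetypes = weights.shape[1]
    w = np.clip(weights, eps, 1.0)           # avoid log(0)
    entropy = -np.sum(w * np.log(w), axis=1)
    return entropy / np.log(n_archetypes)

weights = np.array([
    [1.0, 0.0, 0.0],           # pure sample: a single archetype
    [1 / 3, 1 / 3, 1 / 3],     # maximally mixed sample
])
scores = normalized_entropy(weights)
print(scores)
```

Values near 0 indicate samples explained by a single archetype; values near 1 indicate an even mixture.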

class archetypax.tools.evaluation.BiarchetypalAnalysisEvaluator(model)

Bases: object

Evaluator for Biarchetypal Analysis results.

Provides metrics and visualizations to assess model quality for biarchetypal models, which use two sets of archetypes to represent data.

__init__(model)

Initialize the evaluator.

Parameters:

model – Fitted BiarchetypalAnalysis model

archetype_separation()

Calculate separation metrics between archetypes.

Returns:

Dictionary of separation metrics

comprehensive_evaluation(X: ndarray) → dict

Perform a comprehensive evaluation of the model.

Parameters:

X – Data matrix

Returns:

Dictionary of evaluation metrics

dominant_archetype_purity() → dict

Calculate purity metrics for dominant archetypes.

Returns:

Dictionary of purity metrics

explained_variance(X: ndarray) → float

Calculate the explained variance of the model.

Parameters:

X – Data matrix

Returns:

Explained variance (0-1)

print_evaluation_report(X: ndarray) → None

Print a comprehensive evaluation report.

Parameters:

X – Original data matrix

print_summary(results: dict)

Print a summary of the evaluation results.

Parameters:

results – Dictionary of evaluation results

reconstruction_error(X: ndarray, metric: str = 'frobenius') → float

Calculate the reconstruction error of the model.

Parameters:
  • X – Data matrix

  • metric – Error metric to use (‘frobenius’, ‘mae’, ‘mse’, or ‘relative’)

Returns:

Reconstruction error value

weight_diversity() → dict

Calculate diversity metrics for archetype weights.

Returns:

Dictionary of diversity metrics

archetypax.tools.interpret module

Interpretability metrics for Archetypal Analysis.

class archetypax.tools.interpret.ArchetypalAnalysisInterpreter(models_dict: dict[int, ArchetypalAnalysis] | None = None)

Bases: object

Interpreter for Archetypal Analysis results, focusing on interpretability metrics.

Provides quantitative measures for archetype interpretability and optimal number selection.

__init__(models_dict: dict[int, ArchetypalAnalysis] | None = None) → None

Initialize the interpreter.

Parameters:

models_dict – Optional dictionary of {n_archetypes: model} pairs

add_model(n_archetypes: int, model: ArchetypalAnalysis) → ArchetypalAnalysisInterpreter

Add a fitted model to the interpreter.

cluster_purity(weights: ndarray, threshold: float = 0.6) → tuple[ndarray, float]

Calculate purity of each archetype’s associated data points.

Parameters:
  • weights – Weight matrix (n_samples, n_archetypes)

  • threshold – Threshold for considering an archetype as dominant

Returns:

Tuple of purity scores per archetype, average purity
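The role of `threshold` can be illustrated with a small sketch: among the samples whose dominant archetype is k, count the fraction whose dominant weight reaches the threshold. This interpretation is an assumption, as is the function name `cluster_purity_sketch`; only the return shape, per-archetype scores plus their average, follows the documented signature.

```python
import numpy as np

def cluster_purity_sketch(weights: np.ndarray,
                          threshold: float = 0.6) -> tuple[np.ndarray, float]:
    """Share of each archetype's samples that are strongly dominated by it."""
    dominant = weights.argmax(axis=1)
    top_weight = weights.max(axis=1)
    n_archetypes = weights.shape[1]
    purity = np.zeros(n_archetypes)
    for k in range(n_archetypes):
        members = top_weight[dominant == k]
        if members.size:                      # archetypes with no samples stay at 0
            purity[k] = np.mean(members >= threshold)
    return purity, float(purity.mean())

weights = np.array([[0.90, 0.10], [0.55, 0.45], [0.20, 0.80], [0.30, 0.70]])
purity, average = cluster_purity_sketch(weights, threshold=0.6)
print(purity, average)
```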

evaluate_all_models(X: ndarray) → dict[int, dict[str, Any]]

Evaluate interpretability metrics for all models.

Parameters:

X – Original data matrix

Returns:

Dictionary of results per number of archetypes

feature_consistency(X: ndarray, n_archetypes: int, n_trials: int = 5, top_k: int = 5, random_seed: int = 42) → ndarray

Calculate feature importance consistency across multiple initializations.

Parameters:
  • X – Original data matrix

  • n_archetypes – Number of archetypes to evaluate

  • n_trials – Number of different initializations to try

  • top_k – Number of top features to consider

  • random_seed – Base random seed (will be incremented for each trial)

Returns:

Array of consistency scores for each archetype
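One way to think about consistency across initializations is the overlap between the top-k feature sets that different trials produce for the same archetype. The sketch below uses the mean pairwise Jaccard index for a single archetype and assumes the archetypes have already been matched across trials; both the measure and the helper names are assumptions, not the library's documented procedure.

```python
from itertools import combinations

def top_k_features(archetype: list[float], top_k: int) -> set[int]:
    """Indices of the top_k features by absolute value (hypothetical helper)."""
    order = sorted(range(len(archetype)), key=lambda j: abs(archetype[j]), reverse=True)
    return set(order[:top_k])

def consistency_sketch(trials: list[list[float]], top_k: int) -> float:
    """Mean pairwise Jaccard overlap of top-k feature sets across trials."""
    sets = [top_k_features(a, top_k) for a in trials]
    overlaps = [len(a & b) / len(a | b) for a, b in combinations(sets, 2)]
    return sum(overlaps) / len(overlaps)

# The same (already matched) archetype as estimated in three trials.
trials = [
    [0.9, 0.1, 0.8, 0.0],   # top-2 features: {0, 2}
    [0.7, 0.2, 0.9, 0.1],   # top-2 features: {0, 2}
    [0.8, 0.9, 0.1, 0.0],   # top-2 features: {0, 1}
]
score = consistency_sketch(trials, top_k=2)
print(score)
```

In practice, archetypes from different runs come out in arbitrary order, so a matching step would be needed before any comparison of this kind.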

feature_distinctiveness(archetypes: ndarray) → ndarray

Calculate how distinctive each archetype is in terms of feature values.

Parameters:

archetypes – Archetype matrix (n_archetypes, n_features)

Returns:

Array of distinctiveness scores for each archetype
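As an illustration of a per-archetype distinctiveness score, the sketch below measures, for each archetype, its largest absolute feature-wise deviation from the mean of the other archetypes. This particular formula is an assumption chosen for clarity; the reference does not state how `feature_distinctiveness()` is computed.

```python
import numpy as np

def distinctiveness_sketch(archetypes: np.ndarray) -> np.ndarray:
    """Largest feature-wise gap between each archetype and the mean of the rest."""
    n_archetypes = archetypes.shape[0]
    total = archetypes.sum(axis=0)
    scores = np.empty(n_archetypes)
    for k in range(n_archetypes):
        others_mean = (total - archetypes[k]) / (n_archetypes - 1)
        scores[k] = np.max(np.abs(archetypes[k] - others_mean))
    return scores

# Hypothetical archetype matrix (n_archetypes, n_features).
archetypes = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
scores = distinctiveness_sketch(archetypes)
print(scores)
```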

information_gain(X: ndarray) → list[tuple[int, float]]

Calculate information gain when adding each additional archetype.

Parameters:

X – Original data matrix

Returns:

List of (n_archetypes, gain) pairs
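The `(n_archetypes, gain)` pairs can be pictured as the relative improvement obtained by each additional archetype. The sketch below assumes that gain is the relative drop in reconstruction error between consecutive model sizes; the error values are made up, and the library's actual definition of gain may differ.

```python
def information_gain_sketch(errors: dict[int, float]) -> list[tuple[int, float]]:
    """Relative error reduction when moving to each larger model (assumed definition)."""
    sizes = sorted(errors)
    gains = []
    for prev, curr in zip(sizes, sizes[1:]):
        gains.append((curr, (errors[prev] - errors[curr]) / errors[prev]))
    return gains

# Made-up reconstruction errors keyed by number of archetypes.
errors = {2: 10.0, 3: 6.0, 4: 5.4}
gains = information_gain_sketch(errors)
print(gains)
```

Here the step from 2 to 3 archetypes removes about 40% of the error while the step from 3 to 4 removes about 10%, the kind of diminishing return that an elbow-style selection rule looks for.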

plot_interpretability_metrics()

Plot interpretability metrics for different numbers of archetypes.

sparsity_coefficient(archetypes: ndarray, percentile: float = 80) → ndarray

Calculate sparsity of each archetype’s feature representation.

Parameters:
  • archetypes – Archetype matrix (n_archetypes, n_features)

  • percentile – Percentile threshold for considering features as prominent

Returns:

Array of sparsity scores for each archetype (higher is more interpretable)
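To show how a percentile threshold can separate prominent features from the rest, the sketch below scores each archetype by the share of its total absolute magnitude carried by features at or above its own percentile cutoff. This scoring rule is an assumption made for illustration and may not match the library's formula; it does share the property that higher values mean the archetype is described by fewer features.

```python
import numpy as np

def sparsity_sketch(archetypes: np.ndarray, percentile: float = 80) -> np.ndarray:
    """Share of absolute magnitude held by each archetype's prominent features."""
    magnitude = np.abs(archetypes)
    # Per-archetype cutoff: features at or above it count as prominent.
    cutoff = np.percentile(magnitude, percentile, axis=1, keepdims=True)
    prominent = magnitude >= cutoff
    # Note: an all-zero archetype would divide by zero; not handled in this sketch.
    return (magnitude * prominent).sum(axis=1) / magnitude.sum(axis=1)

archetypes = np.array([
    [10.0, 0.0, 0.0, 0.0, 0.0],   # one feature carries everything
    [1.0, 2.0, 3.0, 4.0, 5.0],    # magnitude spread over all features
])
scores = sparsity_sketch(archetypes, percentile=80)
print(scores)
```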

suggest_optimal_archetypes(method: str = 'balance') → int

Suggest optimal number of archetypes based on interpretability metrics.

Parameters:

method – Method to use for selection (‘balance’, ‘interpretability’, or ‘information_gain’)

Returns:

Optimal number of archetypes

class archetypax.tools.interpret.BiarchetypalAnalysisInterpreter(models_dict: dict[tuple[int, int], BiarchetypalAnalysis] | None = None)

Bases: object

Interpreter for Biarchetypal Analysis results, focusing on interpretability metrics.

Provides quantitative measures for biarchetype interpretability and optimal number selection.

__init__(models_dict: dict[tuple[int, int], BiarchetypalAnalysis] | None = None) → None

Initialize the interpreter.

Parameters:

models_dict – Optional dictionary of {(n_archetypes_first, n_archetypes_second): model} pairs

add_model(n_archetypes_first: int, n_archetypes_second: int, model: BiarchetypalAnalysis) → BiarchetypalAnalysisInterpreter

Add a fitted model to the interpreter.

cluster_purity(weights: ndarray, threshold: float = 0.6) → tuple[ndarray, float]

Calculate purity of each archetype’s associated data points.

Parameters:
  • weights – Weight matrix (n_samples, n_archetypes)

  • threshold – Threshold for considering an archetype as dominant

Returns:

Tuple of purity scores per archetype, average purity

compute_information_gain(X: ndarray) → None

Calculate information gain between different archetype number combinations.

Parameters:

X – Original data matrix

evaluate_all_models(X: ndarray) → dict[tuple[int, int], dict[str, Any]]

Evaluate interpretability metrics for all models.

Parameters:

X – Original data matrix

Returns:

Dictionary of results per combination of archetypes

feature_distinctiveness(archetypes: ndarray) → ndarray

Calculate how distinctive each archetype is in terms of feature values.

Parameters:

archetypes – Archetype matrix (n_archetypes, n_features)

Returns:

Array of distinctiveness scores for each archetype

plot_interpretability_heatmap() → Figure

Plot heatmaps of interpretability metrics for different archetype number combinations.

Returns:

The matplotlib figure object

sparsity_coefficient(archetypes: ndarray, percentile: float = 80) → ndarray

Calculate sparsity of each archetype’s feature representation.

Parameters:
  • archetypes – Archetype matrix (n_archetypes, n_features)

  • percentile – Percentile threshold for considering features as prominent

Returns:

Array of sparsity scores for each archetype (higher is more interpretable)

suggest_optimal_biarchetypes(method: str = 'balance') → tuple[int, int]

Suggest optimal archetype number combination based on interpretability metrics.

Parameters:

method – Method to use for selection (‘balance’, ‘interpretability’, or ‘information_gain’)

Returns:

Optimal combination as a tuple (n_archetypes_first, n_archetypes_second)

archetypax.tools.visualization module

Visualization utilities for Archetypal Analysis.

class archetypax.tools.visualization.ArchetypalAnalysisVisualizer

Bases: object

Visualization utilities for Archetypal Analysis.

static plot_archetype_distribution(model: ArchetypalAnalysis) → None

Plot the distribution of dominant archetypes across samples.

Parameters:

model – Fitted ArchetypalAnalysis model

static plot_archetype_profiles(model: ArchetypalAnalysis, feature_names: list[str] | None = None) → None

Plot feature profiles of each archetype.

Parameters:
  • model – Fitted ArchetypalAnalysis model

  • feature_names – Optional list of feature names for axis labels

static plot_archetypes_2d(model: ArchetypalAnalysis, X: ndarray, feature_names: list[str] | None = None) → None

Plot data and archetypes in 2D.

Parameters:
  • model – Fitted ArchetypalAnalysis model

  • X – Original data

  • feature_names – Optional feature names for axis labels

static plot_loss(model: ArchetypalAnalysis) → None

Plot the loss history from training.

Parameters:

model – Fitted ArchetypalAnalysis model

static plot_membership_weights(model: ArchetypalAnalysis, n_samples: int | None = None) → None

Plot membership weights for samples.

Parameters:
  • model – Fitted ArchetypalAnalysis model

  • n_samples – Optional number of samples to visualize (default: all)

static plot_reconstruction_comparison(model: ArchetypalAnalysis, X: ndarray) → None

Plot original vs reconstructed data.

Parameters:
  • model – Fitted ArchetypalAnalysis model

  • X – Original data matrix

static plot_simplex_2d(model: ArchetypalAnalysis, n_samples: int | None = 500) → None

Plot samples in 2D simplex space (only works for 3 archetypes).

Parameters:
  • model – Fitted ArchetypalAnalysis model

  • n_samples – Number of samples to plot (default: 500)
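The restriction to 3 archetypes follows from the geometry: three weights that sum to 1 are barycentric coordinates inside a triangle, which fits in a 2D plot. The sketch below shows that mapping with NumPy. Placing the corners on an equilateral triangle is an assumption about layout, not necessarily the one `plot_simplex_2d` uses.

```python
import numpy as np

# Corners of an equilateral triangle, one per archetype (assumed layout).
corners = np.array([
    [0.0, 0.0],
    [1.0, 0.0],
    [0.5, np.sqrt(3) / 2],
])

# Hypothetical weights (n_samples, 3); each row sums to 1.
weights = np.array([
    [1.0, 0.0, 0.0],           # pure archetype 0: lands on the first corner
    [0.0, 0.0, 1.0],           # pure archetype 2: lands on the third corner
    [1 / 3, 1 / 3, 1 / 3],     # even mixture: lands on the centroid
])

# Barycentric to Cartesian: each point is the weighted average of the corners.
points = weights @ corners
print(points)
```

With more than three archetypes the weights live on a higher-dimensional simplex, so a faithful 2D embedding of this kind is no longer available.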

class archetypax.tools.visualization.BiarchetypalAnalysisVisualizer

Bases: object

Visualization utilities for Biarchetypal Analysis.

static plot_biarchetypal_reconstruction(model: BiarchetypalAnalysis, X: ndarray) → None

Plot original data vs. reconstructions from each archetype set and combined.

Parameters:
  • model – Fitted BiarchetypalAnalysis model

  • X – Original data matrix

static plot_dual_archetypes_2d(model: BiarchetypalAnalysis, X: ndarray, feature_names: list[str] | None = None) → None

Plot data and both sets of archetypes in 2D.

Parameters:
  • model – Fitted BiarchetypalAnalysis model

  • X – Original data

  • feature_names – Optional feature names for axis labels

static plot_dual_membership_heatmap(model: BiarchetypalAnalysis, n_samples: int = 50) → None

Plot heatmap of membership weights for both sets of archetypes.

Parameters:
  • model – Fitted BiarchetypalAnalysis model

  • n_samples – Number of samples to visualize

static plot_dual_simplex_2d(model: BiarchetypalAnalysis, n_samples: int = 200) → None

Plot samples in separate 2D simplex spaces for each archetype set (only works for 3 archetypes per set).

Parameters:
  • model – Fitted BiarchetypalAnalysis model

  • n_samples – Number of samples to plot

static plot_mixture_effect(model: BiarchetypalAnalysis, X: ndarray, mixture_steps: int = 5) → None

Plot the effect of different mixture weights between the two archetype sets.

Parameters:
  • model – Fitted BiarchetypalAnalysis model

  • X – Original data matrix

  • mixture_steps – Number of different mixture weights to try

Module contents

Utility modules for Archetypal Analysis.