archetypax.tools package
Submodules
archetypax.tools.evaluation module
Evaluation metrics for Archetypal Analysis.
- class archetypax.tools.evaluation.ArchetypalAnalysisEvaluator(model: ArchetypalAnalysis)[source]
Bases: object
Evaluator for Archetypal Analysis results, especially for high-dimensional data.
Provides metrics and visualizations to assess model quality.
- __init__(model: ArchetypalAnalysis)[source]
Initialize the evaluator.
- Parameters:
model – Fitted ArchetypalAnalysis model
- archetype_feature_importance() DataFrame[source]
Analyze which features are most important for each archetype.
- Returns:
DataFrame with feature importance for each archetype
- archetype_separation() dict[str, float][source]
Measure how well-separated the archetypes are.
- Returns:
Dictionary with separation metrics
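The page does not list which separation metrics are returned. As a rough illustration only, one common way to quantify separation is to summarize the pairwise Euclidean distances between archetypes. The function name `pairwise_separation` and the dictionary keys are assumptions made for this sketch, not the library's output.

```python
import numpy as np


def pairwise_separation(archetypes: np.ndarray) -> dict[str, float]:
    """Summarize Euclidean distances between all pairs of archetypes.

    Hypothetical helper: the metric names are illustrative only.
    """
    diffs = archetypes[:, None, :] - archetypes[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # Keep each unordered pair once (strict upper triangle).
    upper = dists[np.triu_indices(len(archetypes), k=1)]
    return {
        "min_distance": float(upper.min()),
        "mean_distance": float(upper.mean()),
        "max_distance": float(upper.max()),
    }


# Three archetypes forming a 3-4-5 right triangle.
archetypes = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
separation = pairwise_separation(archetypes)
```

A small minimum distance relative to the mean would suggest that two archetypes are nearly redundant.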
- clustering_metrics(X: ndarray) dict[str, float][source]
Calculate clustering quality metrics by using dominant archetypes as cluster assignments.
- Parameters:
X – Original data matrix
- Returns:
Dictionary with clustering metrics
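To make "dominant archetypes as cluster assignments" concrete, the sketch below labels each sample by its largest weight and then computes a simple within-cluster scatter ratio. The ratio is a stand-in chosen for this example. The page does not say which clustering metrics the method reports, and `dominant_labels` and `within_cluster_scatter_ratio` are hypothetical names.

```python
import numpy as np


def dominant_labels(weights: np.ndarray) -> np.ndarray:
    """Assign each sample to the archetype with the largest weight."""
    return np.argmax(weights, axis=1)


def within_cluster_scatter_ratio(X: np.ndarray, labels: np.ndarray) -> float:
    """Within-cluster sum of squares divided by total sum of squares.

    Illustrative metric only (lower means tighter clusters).
    """
    total = np.sum((X - X.mean(axis=0)) ** 2)
    within = 0.0
    for label in np.unique(labels):
        members = X[labels == label]
        within += np.sum((members - members.mean(axis=0)) ** 2)
    return float(within / total)


X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
weights = np.array([[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.1, 0.9]])
labels = dominant_labels(weights)
ratio = within_cluster_scatter_ratio(X, labels)
```

The same label vector could be passed to any standard clustering score that accepts data and hard assignments.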
- comprehensive_evaluation(X: ndarray) dict[str, Any][source]
Run all evaluation metrics and return comprehensive results.
- Parameters:
X – Original data matrix
- Returns:
Dictionary with all evaluation metrics
- convex_hull_metrics() dict[str, Any][source]
Calculate metrics related to the convex hull formed by the archetypes.
This method evaluates whether the archetypes form a non-degenerate convex hull by calculating its volume/area and comparing it to the data’s convex hull.
- Returns:
Dictionary with convex hull metrics, including the volume/area of the convex hull, its ratio to the data hull volume/area, and the dimensionality of the hull
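The hull-to-hull comparison can be illustrated in two dimensions with a small self-contained sketch. It uses Andrew's monotone chain to find hull vertices and the shoelace formula for the area. This is an assumption-laden stand-in for illustration: the helpers `convex_hull_2d` and `polygon_area` are hypothetical, and the library may well compute hulls differently (for example in higher dimensions).

```python
import numpy as np


def convex_hull_2d(points: np.ndarray) -> np.ndarray:
    """Return 2D hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(map(tuple, np.asarray(points, dtype=float))))
    if len(pts) <= 2:
        return np.array(pts)

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a left turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]  # last point repeats as the start of the other half

    return np.array(half(pts) + half(reversed(pts)))


def polygon_area(vertices: np.ndarray) -> float:
    """Shoelace formula; degenerate hulls (fewer than 3 vertices) have area 0."""
    if len(vertices) < 3:
        return 0.0
    x, y = vertices[:, 0], vertices[:, 1]
    return float(0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1))))


# Toy data: a square with one interior point; archetypes span half of it.
X = np.array([[0, 0], [4, 0], [4, 4], [0, 4], [2, 2]], dtype=float)
archetypes = np.array([[0, 0], [4, 0], [0, 4]], dtype=float)

data_area = polygon_area(convex_hull_2d(X))
archetype_area = polygon_area(convex_hull_2d(archetypes))
hull_ratio = archetype_area / data_area
```

An area of zero for the archetype hull would indicate a degenerate configuration, such as collinear archetypes.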
- dominant_archetype_purity() dict[str, Any][source]
Analyze how dominant each archetype is for its assigned samples.
- Returns:
Dictionary with purity metrics
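One plausible reading of "purity" here is the size of the largest weight in each sample, averaged overall and per dominant archetype. The sketch below follows that reading. The function name and the dictionary keys are assumptions for illustration and may not match the keys the method actually returns.

```python
import numpy as np


def dominant_purity(weights: np.ndarray) -> dict:
    """Average dominant weight, overall and grouped by dominant archetype.

    Hypothetical helper illustrating one definition of purity.
    """
    dominant = np.argmax(weights, axis=1)
    max_weight = np.max(weights, axis=1)
    per_archetype = {
        int(k): float(max_weight[dominant == k].mean())
        for k in np.unique(dominant)
    }
    return {
        "overall_purity": float(max_weight.mean()),
        "per_archetype_purity": per_archetype,
    }


weights = np.array(
    [
        [0.9, 0.1, 0.0],
        [0.7, 0.2, 0.1],
        [0.2, 0.6, 0.2],
        [0.1, 0.1, 0.8],
    ]
)
purity = dominant_purity(weights)
```

Values close to 1 mean samples sit near a single archetype; values close to `1 / n_archetypes` mean samples are evenly mixed.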
- explained_variance(X: ndarray) float[source]
Calculate the explained variance of the model.
- Parameters:
X – Data matrix
- Returns:
Explained variance (0-1)
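For orientation, explained variance is commonly defined as one minus the ratio of residual to total sum of squares. The sketch below uses that textbook definition with a reconstruction formed as `weights @ archetypes`. Whether the library uses exactly this formula is an assumption, and `explained_variance_score` is a hypothetical name.

```python
import numpy as np


def explained_variance_score(X: np.ndarray, X_reconstructed: np.ndarray) -> float:
    """1 - (residual sum of squares / total sum of squares around the mean)."""
    residual = np.sum((X - X_reconstructed) ** 2)
    total = np.sum((X - X.mean(axis=0)) ** 2)
    return float(1.0 - residual / total)


archetypes = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
weights = np.array(
    [
        [1.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0],
        [0.0, 0.5, 0.5],
    ]
)
X = weights @ archetypes  # data that the archetypes reproduce exactly

perfect = explained_variance_score(X, weights @ archetypes)
baseline = explained_variance_score(X, np.tile(X.mean(axis=0), (len(X), 1)))
```

Under this definition a perfect reconstruction scores 1 and predicting the feature means scores 0. A reconstruction worse than the mean would score below 0.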
- plot_archetype_feature_comparison(top_n: int = 5, feature_names: list[str] | None = None) None[source]
Plot radar chart or bar chart comparing top N most important features for each archetype.
- Parameters:
top_n – Number of top features to display
feature_names – Optional list of feature names
- plot_convex_hull(feature_indices: list[int] | None = None, figsize: tuple[int, int] = (10, 8)) None[source]
Plot the convex hull formed by archetypes in 2D or 3D.
- Parameters:
feature_indices – Indices of features to use for visualization (2 or 3 features)
figsize – Size of the figure
- plot_entropy_vs_reconstruction(X: ndarray, n_samples: int = 1000) None[source]
Plot relationship between sample entropy and reconstruction error.
- Parameters:
X – Original data matrix
n_samples – Number of samples to plot (random subset)
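The two quantities being plotted can be computed without any plotting code. The sketch below assumes that "sample entropy" means the Shannon entropy of each sample's weight vector and that reconstruction error is the per-sample Euclidean norm of the residual. Both are assumptions, and the helper names are hypothetical.

```python
import numpy as np


def weight_entropy(weights: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Shannon entropy (natural log) of each row of a weight matrix."""
    w = np.clip(weights, eps, 1.0)  # avoid log(0)
    return -np.sum(w * np.log(w), axis=1)


def reconstruction_error(
    X: np.ndarray, weights: np.ndarray, archetypes: np.ndarray
) -> np.ndarray:
    """Per-sample Euclidean distance between X and weights @ archetypes."""
    return np.linalg.norm(X - weights @ archetypes, axis=1)


archetypes = np.array([[0.0, 0.0], [2.0, 2.0]])
weights = np.array([[1.0, 0.0], [0.5, 0.5]])
X = np.array([[0.0, 0.0], [1.0, 2.0]])

# Random subset, mirroring the n_samples parameter.
n_samples = 1000
rng = np.random.default_rng(0)
subset = rng.choice(len(X), size=min(n_samples, len(X)), replace=False)

entropy = weight_entropy(weights)
errors = reconstruction_error(X, weights, archetypes)
```

Plotting `entropy[subset]` against `errors[subset]` as a scatter plot would then show whether highly mixed samples are also the poorly reconstructed ones.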
- plot_feature_importance_heatmap(feature_names: list[str] | None = None) None[source]
Plot heatmap of feature importance across archetypes.
- Parameters:
feature_names – Optional list of feature names
- plot_purity_distribution() None[source]
Plot the distribution of dominant archetype weights (purity).
- plot_weight_distributions(bins: int = 20) None[source]
Plot histograms of weight distributions for each archetype.
- Parameters:
bins – Number of histogram bins
- print_evaluation_report(X: ndarray) None[source]
Print a comprehensive evaluation report.
- Parameters:
X – Original data matrix
- class archetypax.tools.evaluation.BiarchetypalAnalysisEvaluator(model)[source]
Bases: object
Evaluator for Biarchetypal Analysis results.
Provides metrics and visualizations to assess model quality for biarchetypal models, which use two sets of archetypes to represent data.
- __init__(model)[source]
Initialize the evaluator.
- Parameters:
model – Fitted BiarchetypalAnalysis model
- archetype_separation()[source]
Calculate separation metrics between archetypes.
- Returns:
Dictionary of separation metrics
- comprehensive_evaluation(X: ndarray) dict[source]
Perform a comprehensive evaluation of the model.
- Parameters:
X – Data matrix
- Returns:
Dictionary of evaluation metrics
- dominant_archetype_purity() dict[source]
Calculate purity metrics for dominant archetypes.
- Returns:
Dictionary of purity metrics
- explained_variance(X: ndarray) float[source]
Calculate the explained variance of the model.
- Parameters:
X – Data matrix
- Returns:
Explained variance (0-1)
- print_evaluation_report(X: ndarray) None[source]
Print a comprehensive evaluation report.
- Parameters:
X – Original data matrix
- print_summary(results: dict)[source]
Print a summary of the evaluation results.
- Parameters:
results – Dictionary of evaluation results
archetypax.tools.interpret module
Interpretability metrics for Archetypal Analysis.
- class archetypax.tools.interpret.ArchetypalAnalysisInterpreter(models_dict: dict[int, ArchetypalAnalysis] | None = None)[source]
Bases: object
Interpreter for Archetypal Analysis results, focusing on interpretability metrics.
Provides quantitative measures for archetype interpretability and optimal number selection.
- __init__(models_dict: dict[int, ArchetypalAnalysis] | None = None) None[source]
Initialize the interpreter.
- Parameters:
models_dict – Optional dictionary of {n_archetypes: model} pairs
- add_model(n_archetypes: int, model: ArchetypalAnalysis) ArchetypalAnalysisInterpreter[source]
Add a fitted model to the interpreter.
- cluster_purity(weights: ndarray, threshold: float = 0.6) tuple[ndarray, float][source]
Calculate purity of each archetype’s associated data points.
- Parameters:
weights – Weight matrix (n_samples, n_archetypes)
threshold – Threshold for considering an archetype as dominant
- Returns:
Tuple of (purity scores per archetype, average purity)
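The role of `threshold` can be sketched as follows, under the assumption that an archetype's purity is the fraction of its assigned samples whose dominant weight reaches the threshold. This interpretation and the name `purity_above_threshold` are hypothetical. The library's actual definition may differ.

```python
import numpy as np


def purity_above_threshold(
    weights: np.ndarray, threshold: float = 0.6
) -> tuple[np.ndarray, float]:
    """Per-archetype share of assigned samples with dominant weight >= threshold."""
    dominant = np.argmax(weights, axis=1)
    n_archetypes = weights.shape[1]
    purity = np.zeros(n_archetypes)
    for k in range(n_archetypes):
        member_weights = weights[dominant == k, k]
        if member_weights.size:  # archetypes with no samples keep purity 0
            purity[k] = np.mean(member_weights >= threshold)
    return purity, float(purity.mean())


weights = np.array([[0.9, 0.1], [0.55, 0.45], [0.2, 0.8], [0.3, 0.7]])
scores, average = purity_above_threshold(weights, threshold=0.6)
```

Raising the threshold makes the score stricter, since fewer samples count as clearly belonging to one archetype.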
- evaluate_all_models(X: ndarray) dict[int, dict[str, Any]][source]
Evaluate interpretability metrics for all models.
- Parameters:
X – Original data matrix
- Returns:
Dictionary of results per number of archetypes
- feature_consistency(X: ndarray, n_archetypes: int, n_trials: int = 5, top_k: int = 5, random_seed: int = 42) ndarray[source]
Calculate feature importance consistency across multiple initializations.
- Parameters:
X – Original data matrix
n_archetypes – Number of archetypes to evaluate
n_trials – Number of different initializations to try
top_k – Number of top features to consider
random_seed – Base random seed (will be incremented for each trial)
- Returns:
Array of consistency scores for each archetype
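A simple way to score consistency across initializations is the average Jaccard overlap of the top-k feature sets found in each trial. The sketch below takes that approach and assumes that the archetype rows have already been aligned across trials, which a real implementation would need to handle. All names are hypothetical, and the library's scoring rule is not documented on this page.

```python
from itertools import combinations

import numpy as np


def top_k_features(archetype: np.ndarray, k: int) -> set[int]:
    """Indices of the k features with the largest absolute value."""
    return set(np.argsort(-np.abs(archetype))[:k].tolist())


def jaccard_consistency(trial_archetypes: list[np.ndarray], top_k: int) -> np.ndarray:
    """Mean pairwise Jaccard overlap of top-k feature sets, per archetype.

    Assumes at least two trials and that row i refers to the same
    archetype in every trial.
    """
    n_archetypes = trial_archetypes[0].shape[0]
    scores = np.zeros(n_archetypes)
    for a in range(n_archetypes):
        sets = [top_k_features(trial[a], top_k) for trial in trial_archetypes]
        overlaps = [len(s & t) / len(s | t) for s, t in combinations(sets, 2)]
        scores[a] = np.mean(overlaps)
    return scores


trial_1 = np.array([[5.0, 4.0, 0.0, 0.0], [0.0, 0.0, 3.0, 2.0]])
trial_2 = np.array([[5.0, 0.0, 4.0, 0.0], [0.0, 0.0, 2.0, 3.0]])
consistency = jaccard_consistency([trial_1, trial_2], top_k=2)
```

A score of 1 means every trial picked the same top features for that archetype.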
- feature_distinctiveness(archetypes: ndarray) ndarray[source]
Calculate how distinctive each archetype is in terms of feature values.
- Parameters:
archetypes – Archetype matrix (n_archetypes, n_features)
- Returns:
Array of distinctiveness scores for each archetype
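As one possible illustration, distinctiveness can be measured as the distance between an archetype and the mean of all the other archetypes. This definition is an assumption chosen for the sketch, and `distance_from_others` is a hypothetical name.

```python
import numpy as np


def distance_from_others(archetypes: np.ndarray) -> np.ndarray:
    """Euclidean distance from each archetype to the mean of the others.

    Requires at least two archetypes.
    """
    n = archetypes.shape[0]
    total = archetypes.sum(axis=0)
    others_mean = (total - archetypes) / (n - 1)
    return np.linalg.norm(archetypes - others_mean, axis=1)


archetypes = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
distinctiveness = distance_from_others(archetypes)
```

Because the score depends on feature scale, standardizing features first would make scores comparable across datasets.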
- information_gain(X: ndarray) list[tuple[int, float]][source]
Calculate information gain when adding each additional archetype.
- Parameters:
X – Original data matrix
- Returns:
List of (n_archetypes, gain) pairs
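The shape of the return value, a list of `(n_archetypes, gain)` pairs, can be illustrated with a sketch that treats gain as the relative drop in reconstruction error when moving from one model size to the next. That definition is an assumption, as are the function name and the error values used in the example.

```python
def relative_error_reduction(
    errors_by_k: dict[int, float],
) -> list[tuple[int, float]]:
    """Relative reduction in error for each step up in n_archetypes.

    Hypothetical helper: errors_by_k maps n_archetypes to a
    reconstruction error for the corresponding fitted model.
    """
    ks = sorted(errors_by_k)
    gains = []
    for prev_k, k in zip(ks, ks[1:]):
        prev_error = errors_by_k[prev_k]
        gains.append((k, (prev_error - errors_by_k[k]) / prev_error))
    return gains


# Made-up error values purely for illustration.
gains = relative_error_reduction({2: 10.0, 3: 6.0, 4: 5.4})
```

A sharp fall in gain from one step to the next is the usual elbow signal that additional archetypes add little.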
- plot_interpretability_metrics()[source]
Plot interpretability metrics for different numbers of archetypes.
- sparsity_coefficient(archetypes: ndarray, percentile: float = 80) ndarray[source]
Calculate sparsity of each archetype’s feature representation.
- Parameters:
archetypes – Archetype matrix (n_archetypes, n_features)
percentile – Percentile threshold for considering features as prominent
- Returns:
Array of sparsity scores for each archetype (higher is more interpretable)
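To show how a percentile threshold can separate prominent features from the rest, the sketch below scores each archetype by the share of its total absolute magnitude carried by features at or above its own percentile cutoff. This is one hypothetical definition that increases with concentration. It is not taken from the library, and `prominent_mass_share` is an invented name.

```python
import numpy as np


def prominent_mass_share(archetypes: np.ndarray, percentile: float = 80) -> np.ndarray:
    """Share of absolute magnitude held by features at or above the cutoff."""
    magnitude = np.abs(archetypes)
    # One cutoff per archetype, computed over that archetype's features.
    cutoffs = np.percentile(magnitude, percentile, axis=1, keepdims=True)
    prominent = np.where(magnitude >= cutoffs, magnitude, 0.0)
    total = magnitude.sum(axis=1)
    return prominent.sum(axis=1) / np.where(total > 0, total, 1.0)


archetypes = np.array(
    [
        [10.0, 0.0, 0.0, 0.0, 0.0],  # one feature carries everything
        [1.0, 2.0, 3.0, 4.0, 5.0],   # magnitude spread across features
    ]
)
shares = prominent_mass_share(archetypes, percentile=80)
```

The first archetype scores higher because a single feature describes it, which matches the intuition that sparse archetypes are easier to interpret.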
- class archetypax.tools.interpret.BiarchetypalAnalysisInterpreter(models_dict: dict[tuple[int, int], BiarchetypalAnalysis] | None = None)[source]
Bases: object
Interpreter for Biarchetypal Analysis results, focusing on interpretability metrics.
Provides quantitative measures for biarchetype interpretability and optimal number selection.
- __init__(models_dict: dict[tuple[int, int], BiarchetypalAnalysis] | None = None) None[source]
Initialize the interpreter.
- Parameters:
models_dict – Optional dictionary of {(n_archetypes_first, n_archetypes_second): model} pairs
- add_model(n_archetypes_first: int, n_archetypes_second: int, model: BiarchetypalAnalysis) BiarchetypalAnalysisInterpreter[source]
Add a fitted model to the interpreter.
- cluster_purity(weights: ndarray, threshold: float = 0.6) tuple[ndarray, float][source]
Calculate purity of each archetype’s associated data points.
- Parameters:
weights – Weight matrix (n_samples, n_archetypes)
threshold – Threshold for considering an archetype as dominant
- Returns:
Tuple of (purity scores per archetype, average purity)
- compute_information_gain(X: ndarray) None[source]
Calculate information gain between different archetype number combinations.
- Parameters:
X – Original data matrix
- evaluate_all_models(X: ndarray) dict[tuple[int, int], dict[str, Any]][source]
Evaluate interpretability metrics for all models.
- Parameters:
X – Original data matrix
- Returns:
Dictionary of results per combination of archetypes
- feature_distinctiveness(archetypes: ndarray) ndarray[source]
Calculate how distinctive each archetype is in terms of feature values.
- Parameters:
archetypes – Archetype matrix (n_archetypes, n_features)
- Returns:
Array of distinctiveness scores for each archetype
- plot_interpretability_heatmap() Figure[source]
Plot heatmaps of interpretability metrics for different archetype number combinations.
- Returns:
The matplotlib figure object
- sparsity_coefficient(archetypes: ndarray, percentile: float = 80) ndarray[source]
Calculate sparsity of each archetype’s feature representation.
- Parameters:
archetypes – Archetype matrix (n_archetypes, n_features)
percentile – Percentile threshold for considering features as prominent
- Returns:
Array of sparsity scores for each archetype (higher is more interpretable)
- suggest_optimal_biarchetypes(method: str = 'balance') tuple[int, int][source]
Suggest optimal archetype number combination based on interpretability metrics.
- Parameters:
method – Method to use for selection (‘balance’, ‘interpretability’, or ‘information_gain’)
- Returns:
Optimal combination as a tuple (n_archetypes_first, n_archetypes_second)
archetypax.tools.visualization module
Visualization utilities for Archetypal Analysis.
- class archetypax.tools.visualization.ArchetypalAnalysisVisualizer[source]
Bases: object
Visualization utilities for Archetypal Analysis.
- static plot_archetype_distribution(model: ArchetypalAnalysis) None[source]
Plot the distribution of dominant archetypes across samples.
- Parameters:
model – Fitted ArchetypalAnalysis model
- static plot_archetype_profiles(model: ArchetypalAnalysis, feature_names: list[str] | None = None) None[source]
Plot feature profiles of each archetype.
- Parameters:
model – Fitted ArchetypalAnalysis model
feature_names – Optional list of feature names for axis labels
- static plot_archetypes_2d(model: ArchetypalAnalysis, X: ndarray, feature_names: list[str] | None = None) None[source]
Plot data and archetypes in 2D.
- Parameters:
model – Fitted ArchetypalAnalysis model
X – Original data
feature_names – Optional feature names for axis labels
- static plot_loss(model: ArchetypalAnalysis) None[source]
Plot the loss history from training.
- Parameters:
model – Fitted ArchetypalAnalysis model
- static plot_membership_weights(model: ArchetypalAnalysis, n_samples: int | None = None) None[source]
Plot membership weights for samples.
- Parameters:
model – Fitted ArchetypalAnalysis model
n_samples – Optional number of samples to visualize (default: all)
- static plot_reconstruction_comparison(model: ArchetypalAnalysis, X: ndarray) None[source]
Plot original vs reconstructed data.
- Parameters:
model – Fitted ArchetypalAnalysis model
X – Original data matrix
- class archetypax.tools.visualization.BiarchetypalAnalysisVisualizer[source]
Bases: object
Visualization utilities for Biarchetypal Analysis.
- static plot_biarchetypal_reconstruction(model: BiarchetypalAnalysis, X: ndarray) None[source]
Plot original data vs. reconstructions from each archetype set and combined.
- Parameters:
model – Fitted BiarchetypalAnalysis model
X – Original data matrix
- static plot_dual_archetypes_2d(model: BiarchetypalAnalysis, X: ndarray, feature_names: list[str] | None = None) None[source]
Plot data and both sets of archetypes in 2D.
- Parameters:
model – Fitted BiarchetypalAnalysis model
X – Original data
feature_names – Optional feature names for axis labels
- static plot_dual_membership_heatmap(model: BiarchetypalAnalysis, n_samples: int = 50) None[source]
Plot heatmap of membership weights for both sets of archetypes.
- Parameters:
model – Fitted BiarchetypalAnalysis model
n_samples – Number of samples to visualize
- static plot_dual_simplex_2d(model: BiarchetypalAnalysis, n_samples: int = 200) None[source]
Plot samples in separate 2D simplex spaces for each archetype set (only works for 3 archetypes per set).
- Parameters:
model – Fitted BiarchetypalAnalysis model
n_samples – Number of samples to plot
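The restriction to three archetypes per set follows from the geometry: three weights that sum to one are barycentric coordinates on a triangle. A minimal sketch of that mapping is given here. The vertex placement (an equilateral triangle with unit side) and the name `simplex_coordinates` are choices made for this example, not details taken from the library.

```python
import numpy as np

# Corners of an equilateral triangle, one per archetype.
TRIANGLE_VERTICES = np.array(
    [
        [0.0, 0.0],
        [1.0, 0.0],
        [0.5, np.sqrt(3.0) / 2.0],
    ]
)


def simplex_coordinates(weights: np.ndarray) -> np.ndarray:
    """Map (n_samples, 3) weight rows to 2D points inside the triangle."""
    weights = np.asarray(weights, dtype=float)
    if weights.ndim != 2 or weights.shape[1] != 3:
        raise ValueError("simplex plot needs exactly 3 archetype weights per sample")
    return weights @ TRIANGLE_VERTICES


weights = np.array([[1.0, 0.0, 0.0], [1 / 3, 1 / 3, 1 / 3]])
points = simplex_coordinates(weights)
```

A sample dominated by one archetype lands at a corner, while an evenly mixed sample lands at the centroid.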
- static plot_mixture_effect(model: BiarchetypalAnalysis, X: ndarray, mixture_steps: int = 5) None[source]
Plot the effect of different mixture weights between the two archetype sets.
- Parameters:
model – Fitted BiarchetypalAnalysis model
X – Original data matrix
mixture_steps – Number of different mixture weights to try
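The idea of sweeping mixture weights can be sketched as a convex blend of two reconstructions evaluated at `mixture_steps` evenly spaced weights. How the library actually combines the two archetype sets is not stated on this page, so the blending rule, the function name, and the toy arrays below are all assumptions.

```python
import numpy as np


def mixture_errors(
    X: np.ndarray,
    recon_first: np.ndarray,
    recon_second: np.ndarray,
    mixture_steps: int = 5,
) -> list[tuple[float, float]]:
    """Squared reconstruction error for each blend of two reconstructions."""
    results = []
    for lam in np.linspace(0.0, 1.0, mixture_steps):
        blended = lam * recon_first + (1.0 - lam) * recon_second
        results.append((float(lam), float(np.sum((X - blended) ** 2))))
    return results


X = np.array([[1.0, 1.0]])
recon_first = np.array([[0.0, 0.0]])   # hypothetical reconstruction from set 1
recon_second = np.array([[2.0, 2.0]])  # hypothetical reconstruction from set 2

errors = mixture_errors(X, recon_first, recon_second, mixture_steps=5)
best_weight, best_error = min(errors, key=lambda pair: pair[1])
```

In this toy case the equal blend reproduces the data exactly, so the minimum falls at a weight of 0.5.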
Module contents
Utility modules for Archetypal Analysis.