Mathematics for Archetypal Analysis
This is a sidebar to a longer paper on Archetypal Analysis in Marketing Research on the Action Marketing Research website.
Archetypal Analysis uses a form of iterative regression. This summary is based on the formulation by Professor Adele Cutler.
Assume:
- n consumers (survey respondents)
- m variables
- p archetypes to be solved for
- X is the observed data matrix, of dimensions n × m
- Z is the matrix of archetypes to be solved for, of dimensions p × m
- α is the matrix of mixture weights, of dimensions n × p
- β is the matrix relating consumers to archetypes, of dimensions p × n
The objective of the procedure is to explain as much of the data in X as possible or, more precisely, to minimize the residual sum of squares (RSS) in X by alternating between two constrained least squares problems, one for each weight matrix:
Each row of α represents one respondent.
- The sum of values in that row = 1.00
- Each value >= 0.00 and <= 1.00
Each row of β represents one archetype.
- The sum of values in that row = 1.00
- Each value >= 0.00 and <= 1.00
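The definitions and constraints above can be collected into a single objective. The block below is a reconstruction from the stated matrix dimensions and from the standard formulation by Cutler and Breiman, not a quotation from the longer paper. The Frobenius-norm notation and the index names i and j are assumptions made for illustration.

```latex
\begin{aligned}
Z &= \beta X, \\
\mathrm{RSS}(\alpha, \beta) &= \lVert X - \alpha Z \rVert_F^2
  = \lVert X - \alpha \beta X \rVert_F^2, \\
\text{subject to}\quad
  & \alpha_{ij} \ge 0,\ \sum_{j=1}^{p} \alpha_{ij} = 1 \quad (i = 1, \dots, n), \\
  & \beta_{ji} \ge 0,\ \sum_{i=1}^{n} \beta_{ji} = 1 \quad (j = 1, \dots, p).
\end{aligned}
```

The dimensions agree: β X is (p × n)(n × m) = p × m, matching Z, and α Z is (n × p)(p × m) = n × m, matching X. Read this way, each archetype is a weighted average of respondents, and each respondent is approximated by a weighted average of archetypes.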
Algorithm:
1. Initialize the β matrix
2. Solve for α using constrained least squares
3. Solve for β using constrained least squares
4. Check for improvement in RSS; return to step 2 if it is still improving
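The four steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the solver used in the longer paper. It swaps in projected gradient descent for the constrained least squares solves, because projecting each row onto the simplex enforces both constraints (values between 0 and 1, row sum of 1) directly. All names here (archetypal_analysis, project_simplex, outer, inner, tol, seed) and the sample data are hypothetical.

```python
import random

def matmul(A, B):
    """Matrix product of two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def sq_norm(A):
    """Squared Frobenius norm (sum of squared entries)."""
    return sum(v * v for row in A for v in row)

def project_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(sorted(v, reverse=True), start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:
            theta = t
    return [max(x - theta, 0.0) for x in v]

def residual(X, alpha, Z):
    """alpha Z - X, the signed reconstruction error."""
    R = matmul(alpha, Z)
    return [[r - x for x, r in zip(xr, rr)] for xr, rr in zip(X, R)]

def pg_step(W, G, step):
    """One gradient step on W, then project every row back onto the simplex."""
    return [project_simplex([w - step * g for w, g in zip(wr, gr)])
            for wr, gr in zip(W, G)]

def archetypal_analysis(X, p, outer=50, inner=50, tol=1e-9, seed=0):
    rng = random.Random(seed)
    n = len(X)
    # Step 1: initialize beta (p x n) with random rows on the simplex.
    beta = [project_simplex([rng.random() for _ in range(n)]) for _ in range(p)]
    alpha = [[1.0 / p] * p for _ in range(n)]  # assumed uniform start
    Xt, x_norm = transpose(X), sq_norm(X)
    history = []
    for _ in range(outer):
        # Step 2: update alpha with the archetypes Z = beta X held fixed.
        Z = matmul(beta, X)
        Zt = transpose(Z)
        step = 1.0 / (2.0 * sq_norm(Z) + 1e-12)  # conservative step size
        for _ in range(inner):
            G = [[2.0 * g for g in row]
                 for row in matmul(residual(X, alpha, Z), Zt)]
            alpha = pg_step(alpha, G, step)
        # Step 3: update beta with alpha held fixed.
        At = transpose(alpha)
        step = 1.0 / (2.0 * sq_norm(alpha) * x_norm + 1e-12)
        for _ in range(inner):
            R = residual(X, alpha, matmul(beta, X))
            G = [[2.0 * g for g in row] for row in matmul(matmul(At, R), Xt)]
            beta = pg_step(beta, G, step)
        # Step 4: record the RSS and stop once it no longer improves.
        history.append(sq_norm(residual(X, alpha, matmul(beta, X))))
        if len(history) > 1 and history[-2] - history[-1] < tol:
            break
    return alpha, beta, matmul(beta, X), history

# Hypothetical data: 6 respondents measured on 2 variables, 3 archetypes.
X = [[0.0, 0.0], [4.0, 0.0], [0.0, 4.0], [1.0, 1.0], [2.0, 1.0], [1.0, 2.0]]
alpha, beta, Z, history = archetypal_analysis(X, p=3)
```

Here history holds the RSS after each pass through steps 2 and 3. The step sizes are deliberately small so that the RSS cannot increase from one pass to the next. A production implementation would more likely use a dedicated constrained least squares routine and several random restarts for β, since the result depends on the initialization.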