IBM SPSS Categories

Predict outcomes and reveal relationships in categorical data


Unleash the full potential of your data through predictive analysis, statistical learning, perceptual mapping, preference scaling, and dimension reduction techniques—including optimal scaling of your variables. IBM SPSS Categories (formerly called SPSS Categories) provides you with all the tools you need to obtain clear insight into complex categorical and numeric data, as well as high-dimensional data. For example, use IBM SPSS Categories to understand which characteristics consumers relate most closely to your brand, or to determine customer perception of your products compared to other products you or your competitors offer.

You can visually interpret datasets and see how rows and columns relate in large tables of scores, counts, ratings, rankings, or similarities. This gives you the ability to:

  • Work with and understand nominal and ordinal data with procedures similar to conventional regression, principal components, and canonical correlation
  • Deal with non-normal residuals in numeric data or nonlinear relationships between predictor variables and the outcome variable. Ridge regression, the Lasso, the Elastic Net, variable selection, and model selection are available for both numeric and categorical data.
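
To make the difference between the three regularization methods concrete, here is a minimal sketch in Python. It is not IBM SPSS Categories code; it covers only the textbook special case of a single standardized predictor, where each penalized estimate has a closed form. The function names and the penalty parameters `l1` and `l2` are assumptions chosen for illustration, and the product may parameterize its penalties differently.

```python
def soft_threshold(value, threshold):
    """Move value toward zero by threshold; values within the threshold become 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def shrink(ols_coef, l1=0.0, l2=0.0):
    """Penalized estimate for one standardized predictor (illustrative only).

    Minimizes 0.5 * (ols_coef - b)**2 + l1 * abs(b) + 0.5 * l2 * b**2 over b.
    l1 > 0 and l2 == 0 gives the Lasso, l1 == 0 and l2 > 0 gives ridge,
    and both > 0 gives the Elastic Net.
    """
    return soft_threshold(ols_coef, l1) / (1.0 + l2)


# Hypothetical unpenalized coefficients for three predictors.
ols = [0.80, 0.30, -0.05]

ridge = [shrink(b, l2=0.5) for b in ols]          # every coefficient shrinks, none hits zero
lasso = [shrink(b, l1=0.1) for b in ols]          # small coefficients are set to exactly zero
enet = [shrink(b, l1=0.1, l2=0.5) for b in ols]   # combines both effects

for name, coefs in (("ridge", ridge), ("lasso", lasso), ("elastic net", enet)):
    print(name, [round(c, 3) for c in coefs])
```

The Lasso's ability to set a coefficient to exactly zero is what makes it usable for variable selection, while ridge only stabilizes the estimates by shrinking them.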

Turn Qualitative Variables into Quantitative Ones

The advanced procedures available in IBM SPSS Categories enable you to perform additional statistical operations on categorical data.

  • Use IBM SPSS Categories’ optimal scaling procedures to assign units of measurement and zero-points to your categorical data
  • Choose from state-of-the-art procedures for model selection and regularization
  • Perform correspondence and multiple correspondence analyses to numerically evaluate similarities between two or more nominal variables in your dataset
  • Summarize your data according to important components by using principal components analysis
  • Quantify your ordinal and nominal variables with an optimal scaling correlation matrix
  • Use nonlinear canonical correlation analysis to incorporate and analyze variables of different measurement levels
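
As a rough illustration of what quantifying a categorical variable means, the Python sketch below handles the simplest case: one nominal predictor and one numeric outcome. In that case, scoring each category with the mean outcome of its cases gives the numeric coding that correlates most strongly with the outcome. This is not the algorithm used by IBM SPSS Categories, which works iteratively on several variables at once; the function names and the data are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def quantify_nominal(categories, outcome):
    """Return {category: quantification} for a single nominal predictor.

    Each category is scored with the mean outcome of its cases. The scores are
    then standardized over cases (mean 0, variance 1). Assumes the category
    means are not all equal, otherwise the spread is zero.
    """
    groups = defaultdict(list)
    for cat, y in zip(categories, outcome):
        groups[cat].append(y)
    means = {cat: sum(ys) / len(ys) for cat, ys in groups.items()}

    n = len(categories)
    scored = [means[cat] for cat in categories]
    center = sum(scored) / n
    spread = sqrt(sum((s - center) ** 2 for s in scored) / n)
    return {cat: (m - center) / spread for cat, m in means.items()}


def correlation(xs, ys):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: sales region (nominal) and spend (numeric).
region = ["north", "north", "south", "south", "west", "west"]
spend = [10, 12, 30, 34, 20, 22]

quant = quantify_nominal(region, spend)
optimal = [quant[r] for r in region]

# An arbitrary integer coding of the same categories, for comparison.
arbitrary = [{"north": 1, "south": 2, "west": 3}[r] for r in region]

r_optimal = correlation(optimal, spend)
r_arbitrary = correlation(arbitrary, spend)
print(quant)
print(round(r_optimal, 3), round(r_arbitrary, 3))
```

The quantified variable can then be used like any numeric variable, which is the sense in which optimal scaling gives categorical data units of measurement and a zero point.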

Graphically Display Underlying Relationships

IBM SPSS Categories’ dimension reduction techniques enable you to clarify relationships in your data by using perceptual maps and biplots.

  • Perceptual maps are high-resolution summary charts that graphically display similar variables or categories close to each other. They provide you with unique insight into relationships between more than two categorical variables.
  • Biplots and triplots enable you to look at the relationships among cases, variables, and categories. For example, you can define relationships between products, customers, and demographic characteristics.

By using the preference scaling feature, you can further visualize relationships among objects. The breakthrough algorithm on which this procedure is based enables you to perform non-metric analyses for ordinal data and obtain meaningful results. The proximities scaling procedure allows you to analyze similarities between objects and to incorporate characteristics of those objects in the same analysis.

The data are a 2x5x6 table containing information on two genders, five age groups and six products. This plot shows the results of a two-dimensional multiple correspondence analysis of the table. Notice that products such as "A" and "B" are chosen at younger ages and by males, while products such as "G" and "C" are preferred at older ages.

IBM SPSS Categories is available in English, Japanese, French, German, Italian, Spanish, Chinese, Polish, Korean, and Russian. Contact your local office to find out more.



Procedures and Statistics for Analyzing Categorical Data

Using IBM SPSS Categories with IBM SPSS Statistics Base gives you a selection of statistical techniques for analyzing high-dimensional or categorical data.

  • Categorical regression (CATREG) predicts the values of a nominal, ordinal, or numerical outcome variable from a combination of categorical predictor variables. Optimal scaling techniques are used to quantify variables. Three new regularization methods (Ridge regression, the Lasso, and the Elastic Net) improve prediction accuracy by stabilizing the parameter estimates. Automatic variable selection makes it possible to analyze datasets that have more variables than objects. By using the numeric scaling level, you can also apply the Lasso or the Elastic Net to numeric data. In addition, you can use CATREG to fit particular Generalized Additive Models (GAM) for both numeric and categorical data.

  • Correspondence analysis (CORRESPONDENCE) enables you to analyze two-way tables that contain some measurement of correspondence between rows and columns, as well as display rows and columns as points in a map.

  • Multiple correspondence analysis (MULTIPLE CORRESPONDENCE) is used to analyze multivariate categorical data. It differs from correspondence analysis in that it allows you to use more than two variables in your analysis. With this procedure, all the variables are analyzed at the nominal level (unordered categories).

  • Categorical principal components analysis (CATPCA) uses optimal scaling to generalize the principal components analysis procedure so that it can accommodate variables of mixed measurement levels. It is similar to multiple correspondence analysis, except that you are able to specify an analysis level on a variable-by-variable basis.

  • Nonlinear canonical correlation analysis (OVERALS) uses optimal scaling to generalize the canonical correlation analysis procedure so that it can accommodate variables of mixed measurement levels. This type of analysis enables you to compare multiple sets of variables to one another in the same graph, after removing the correlation within sets.

  • Multidimensional scaling (PROXSCAL) performs multidimensional scaling of one or more matrices with similarities or dissimilarities (proximities). Alternatively, you can compute distances between cases in multivariate data as input to PROXSCAL.

  • Preference scaling (PREFSCAL) visually examines relationships between two sets of objects, for example, consumers and products. Preference scaling performs multidimensional unfolding in order to find a map that represents the relationships between these two sets of objects as distances between two sets of points.
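
To give a feel for how correspondence analysis turns a two-way table into points on a map, here is a minimal Python sketch of reciprocal averaging, a classical way to find the first axis of a correspondence analysis. It is not the CORRESPONDENCE procedure itself and computes only one dimension. The function name and the table of counts are hypothetical, and the sketch assumes every row and column total is positive and that rows and columns are not independent.

```python
from math import sqrt


def reciprocal_averaging(table, iterations=200, tol=1e-12):
    """First non-trivial correspondence analysis axis of a two-way table.

    Returns (row_scores, col_scores, inertia). Column scores are standardized
    with the column totals as weights; row scores are count-weighted averages
    of the column scores.
    """
    n_rows, n_cols = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(n_rows)) for j in range(n_cols)]
    total = sum(row_tot)

    def row_average(col_scores):
        return [sum(table[i][j] * col_scores[j] for j in range(n_cols)) / row_tot[i]
                for i in range(n_rows)]

    col_scores = [float(j) for j in range(n_cols)]  # arbitrary non-constant start
    inertia = 0.0
    for _ in range(iterations):
        row_scores = row_average(col_scores)
        # Each column score becomes the count-weighted average of the row scores.
        raw = [sum(table[i][j] * row_scores[i] for i in range(n_rows)) / col_tot[j]
               for j in range(n_cols)]
        # Centering removes the trivial solution in which all scores are equal.
        mean = sum(c * s for c, s in zip(col_tot, raw)) / total
        raw = [s - mean for s in raw]
        # At convergence the rescaling factor equals the inertia of the axis.
        norm = sqrt(sum(c * s * s for c, s in zip(col_tot, raw)) / total)
        col_scores = [s / norm for s in raw]
        converged = abs(norm - inertia) < tol
        inertia = norm
        if converged:
            break
    return row_average(col_scores), col_scores, inertia


# Hypothetical counts: rows are age groups, columns are products A, B, C.
counts = [
    [20, 5, 2],   # young
    [6, 15, 7],   # middle
    [1, 6, 18],   # older
]
row_scores, col_scores, inertia = reciprocal_averaging(counts)
print([round(s, 3) for s in row_scores])
print([round(s, 3) for s in col_scores])
print(round(inertia, 3))
```

Rows and columns that occur together often receive similar scores, so plotting both sets of scores on one axis places each age group near the products it chooses most. The sign of the axis is arbitrary.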
