
Overview

SurveyAnalytica’s Advanced Segmentation module lets you divide your data into meaningful groups based on shared characteristics. The platform supports three segmentation approaches: rules-based segmentation using business logic, auto-segmentation using unsupervised machine learning algorithms, and predictive segmentation using supervised ML classifiers. Access segmentation from any analytics view by clicking the Segments tab.

Rules-Based Segmentation

Rules-based segmentation lets you define segments using explicit conditions and business rules. This approach is ideal when you know exactly how you want to categorize your data.

Creating a Segment Rule

Navigate to the Segments view within Analytics.
Click Create Segment and select Rules-Based.
Give your segment a descriptive name.
Add conditions using the condition builder.

Condition Operators

Each condition evaluates a field against a value using one of these operators:

EQUALS / NOT_EQUALS: Exact match (case-insensitive for strings)
CONTAINS / NOT_CONTAINS: Substring match
STARTS_WITH / ENDS_WITH: Prefix or suffix match
GREATER_THAN / LESS_THAN: Numeric comparison
GREATER_THAN_OR_EQUAL / LESS_THAN_OR_EQUAL: Inclusive numeric comparison
BETWEEN: Range check (requires a second value)
IN / NOT_IN: List membership
IS_EMPTY / IS_NOT_EMPTY: Null or blank check
MATCHES_REGEX: Regular expression matching
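As a rough illustration of how the operators above could behave, here is a minimal Python sketch. The operator names come from the list; the function name evaluate, its signature, and the exact semantics (inclusive BETWEEN, case-sensitive CONTAINS, regex search rather than full match) are assumptions for illustration, not the platform's actual implementation.

```python
import re

def evaluate(operator, field_value, value=None, second_value=None):
    """Evaluate one condition (hypothetical helper, assumed semantics)."""
    if operator == "IS_EMPTY":
        return field_value is None or str(field_value).strip() == ""
    if operator == "IS_NOT_EMPTY":
        return not evaluate("IS_EMPTY", field_value)
    if operator.startswith("NOT_"):
        # NOT_EQUALS, NOT_CONTAINS, NOT_IN negate their positive counterpart.
        return not evaluate(operator[4:], field_value, value, second_value)
    if field_value is None:
        return False
    if operator == "EQUALS":
        # Case-insensitive for strings, exact for everything else.
        if isinstance(field_value, str) and isinstance(value, str):
            return field_value.lower() == value.lower()
        return field_value == value
    if operator == "IN":
        return field_value in value
    text = str(field_value)
    if operator == "CONTAINS":
        return str(value) in text
    if operator == "STARTS_WITH":
        return text.startswith(str(value))
    if operator == "ENDS_WITH":
        return text.endswith(str(value))
    if operator == "MATCHES_REGEX":
        return re.search(value, text) is not None
    numeric = {
        "GREATER_THAN": lambda n: n > value,
        "LESS_THAN": lambda n: n < value,
        "GREATER_THAN_OR_EQUAL": lambda n: n >= value,
        "LESS_THAN_OR_EQUAL": lambda n: n <= value,
        "BETWEEN": lambda n: value <= n <= second_value,  # assumed inclusive
    }
    if operator in numeric:
        return numeric[operator](float(field_value))
    raise ValueError(f"Unknown operator: {operator}")
```

For example, evaluate("EQUALS", "Gold", "gold") matches because string comparison ignores case, while evaluate("BETWEEN", 7, 0, 6) does not.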

Combining Conditions

Conditions are organized into groups. Within a group, conditions are combined using AND or OR logic. Multiple groups can themselves be combined with AND/OR logic, giving you full flexibility to express complex business rules.
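The two-level grouping can be sketched in Python as follows. The rule layout (a "logic" key plus a list of groups, each with its own "logic" and conditions) and the function name matches are hypothetical; conditions are modelled as plain callables to keep the example short.

```python
def matches(record, rule):
    """Return True if the record satisfies the rule (illustrative sketch)."""
    combine = {"AND": all, "OR": any}
    # Evaluate each group with its own logic, then combine the group results.
    group_results = (
        combine[group["logic"]](cond(record) for cond in group["conditions"])
        for group in rule["groups"]
    )
    return combine[rule["logic"]](group_results)

# (nps >= 9 AND region == "EMEA") OR (plan == "enterprise")
rule = {
    "logic": "OR",
    "groups": [
        {"logic": "AND", "conditions": [
            lambda r: r["nps"] >= 9,
            lambda r: r["region"] == "EMEA",
        ]},
        {"logic": "OR", "conditions": [
            lambda r: r["plan"] == "enterprise",
        ]},
    ],
}
```

With this rule, a free-plan EMEA respondent with an NPS of 10 matches through the first group, and any enterprise respondent matches through the second.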

Segment Statistics

After applying a segment rule, the system calculates and displays:

Total records matching the segment
Percentage of total dataset
Distribution of values within the segment
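The three statistics above can be computed along these lines. This is only a sketch: the function name segment_statistics, the record layout, and the rounding to one decimal place are assumptions.

```python
from collections import Counter

def segment_statistics(records, predicate, field):
    """Count, share of the dataset, and value distribution for one segment."""
    matched = [r for r in records if predicate(r)]
    total = len(records)
    return {
        "count": len(matched),
        "percentage": round(100 * len(matched) / total, 1) if total else 0.0,
        "distribution": dict(Counter(r[field] for r in matched)),
    }

records = [
    {"nps": 10, "region": "EMEA"},
    {"nps": 9, "region": "APAC"},
    {"nps": 4, "region": "EMEA"},
    {"nps": 9, "region": "EMEA"},
]
# Segment: promoters (NPS of 9 or higher), profiled by region.
stats = segment_statistics(records, lambda r: r["nps"] >= 9, "region")
```

Here three of the four records match, so the segment covers 75.0% of the dataset, with two records from EMEA and one from APAC.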

Auto-Segmentation (ML Clustering)

Auto-segmentation uses unsupervised machine learning to automatically discover natural groupings in your data. The platform supports three clustering algorithms:

K-Means Clustering

The most commonly used algorithm, ideal for finding spherical clusters of similar size.

How it works: Partitions data into K clusters by minimizing the sum of squared distances between each data point and the center (centroid) of its assigned cluster.
Configuration: Specify the number of clusters (K) or let the system auto-detect optimal K using the elbow method and silhouette analysis.
Scalability: Optimized to handle large datasets efficiently.
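To make the assign-and-update loop behind K-Means concrete, here is a minimal pure-Python sketch of the standard algorithm. It is not the platform's implementation: the function name kmeans is hypothetical, and seeding the centroids with the first K points is a simplification chosen to keep the result deterministic.

```python
import math

def kmeans(points, k, iterations=100):
    """Basic Lloyd's algorithm; returns (labels, centroids)."""
    # Simplified initialisation (assumption): first k points as centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # converged: no point changed cluster
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels, centroids

points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (0.9, 1.1), (8.2, 7.9), (7.8, 8.1)]
labels, centroids = kmeans(points, k=2)
```

On this toy data the points near (1, 1) end up in one cluster and the points near (8, 8) in the other. Production implementations typically use smarter seeding and several restarts, since the result depends on the starting centroids.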

Hierarchical Clustering

Builds a tree-like hierarchy of clusters, useful for exploring data at different levels of granularity.

How it works: Uses an agglomerative (bottom-up) approach that starts with each record as its own cluster and repeatedly merges the two closest clusters. Supported linkage methods are ward, complete, average, and single.
Best for: Smaller datasets where you want to explore nested cluster structures.
Output: A dendrogram visualization showing how clusters merge at different distance thresholds.
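The bottom-up merging that a dendrogram visualizes can be sketched as below. This is an illustrative toy, not the platform's code: the function name agglomerative and the merge-history format are assumptions, and only single and complete linkage are shown (ward and average are omitted for brevity).

```python
import math

def agglomerative(points, linkage="single"):
    """Return the merge history as (cluster_a, cluster_b, distance) tuples."""
    # single = closest pair between clusters, complete = farthest pair.
    pick = {"single": min, "complete": max}[linkage]
    clusters = {i: [i] for i in range(len(points))}
    merges = []
    next_id = len(points)  # merged clusters get fresh ids
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                d = pick(
                    math.dist(points[p], points[q])
                    for p in clusters[a] for q in clusters[b]
                )
                if best is None or d < best[2]:
                    best = (a, b, d)
        a, b, d = best
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, d))
        next_id += 1
    return merges

points = [(0.0,), (1.0,), (5.0,), (6.5,)]
merges = agglomerative(points, linkage="single")
```

Each tuple records which two clusters merged and at what distance, which is the information a dendrogram plots. Cutting the history at a distance threshold yields the clusters at that level of granularity.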

DBSCAN (Density-Based Spatial Clustering of Applications with Noise)

Discovers clusters of arbitrary shape based on data density, automatically detecting outliers.

How it works: Groups together points that are closely packed, marking points in low-density regions as outliers.
Key parameters: eps (neighborhood radius) and min_samples (minimum number of points within eps of a point for it to seed or extend a cluster).
Best for: Datasets with irregularly shaped clusters or when you expect outliers.
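The role of eps and min_samples is easier to see in code. The following is a compact pure-Python sketch of the standard DBSCAN procedure, not the platform's implementation; the function name dbscan and the use of -1 for outliers are assumptions, and each point is counted as part of its own neighborhood.

```python
import math

NOISE = -1  # label for outliers (assumed convention)

def dbscan(points, eps, min_samples):
    """Return one cluster label per point; NOISE marks outliers."""
    labels = [None] * len(points)

    def neighbors(i):
        # All points within eps of point i, including i itself.
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            more = neighbors(j)
            if len(more) >= min_samples:
                queue.extend(more)  # core point: keep growing the cluster
    return labels

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_samples=3)
```

The two dense groups become clusters 0 and 1, while the isolated point at (50, 50) has too few neighbors and is marked as an outlier.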

Mixed Data Type Support

The clustering engine handles both numeric and categorical data:

Numeric features: Scaled to a common range before clustering so that features with large values do not dominate the distance calculations.
Categorical features: Encoded appropriately depending on cardinality.
Auto-detection: The system automatically identifies feature types from question types (RADIO, CHECKBOX, DROPDOWN are treated as categorical; NPS, SLIDER, numeric fields as numeric).
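A minimal sketch of this preprocessing is shown below. The question type names come from the text; everything else is assumed for illustration: the "NUMERIC" type name stands in for "numeric fields", z-score standardization is one common scaling choice, and one-hot encoding is just one possible encoding, since the exact scheme the platform picks per cardinality is not specified here.

```python
from statistics import mean, pstdev

CATEGORICAL_TYPES = {"RADIO", "CHECKBOX", "DROPDOWN"}
NUMERIC_TYPES = {"NPS", "SLIDER", "NUMERIC"}  # "NUMERIC" is a hypothetical name

def feature_kind(question_type):
    """Map a question type to 'categorical' or 'numeric'."""
    if question_type in CATEGORICAL_TYPES:
        return "categorical"
    if question_type in NUMERIC_TYPES:
        return "numeric"
    raise ValueError(f"Unrecognized question type: {question_type}")

def standardize(values):
    """Z-score scaling: zero mean, unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]

def one_hot(values):
    """One indicator column per distinct category, in sorted order."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]
```

For example, one_hot(["a", "b", "a"]) yields [[1, 0], [0, 1], [1, 0]], and standardize([0, 5, 10]) maps the middle value to 0.0.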

Feature Studio

Before running clustering, you can use the Feature Studio to prepare and transform your data features. Available transformations include:

Normalize: Standard scaling (z-score) or min-max normalization
Encode: Encoding for categorical variables
Impute: Handle missing values with mean, median, mode, or custom strategies
Bin: Convert continuous variables into discrete bins (equal-width, equal-frequency, or custom)
Scale: Apply logarithmic, square root, or other scaling transformations
Derived features: Create new features from existing ones using custom expressions
Feature selection: Automatically identify the most important features for clustering
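A few of these transformations can be sketched in plain Python to show what they do to a column of values. The function names and the representation of missing values as None are assumptions; the Feature Studio applies the equivalent steps through its interface.

```python
import math
from statistics import mean, median

def impute(values, strategy="mean"):
    """Replace missing values (None) with the mean or median of the rest."""
    present = [v for v in values if v is not None]
    fill = {"mean": mean, "median": median}[strategy](present)
    return [fill if v is None else v for v in values]

def equal_width_bins(values, n_bins):
    """Assign each value a bin index from 0 to n_bins - 1 (equal-width)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        return [0] * len(values)
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def log_scale(values):
    """Logarithmic scaling; log1p keeps zero values valid."""
    return [math.log1p(v) for v in values]
```

For instance, impute([1, None, 3]) fills the gap with 2, and equal_width_bins([0, 2, 5, 9, 10], 2) returns [0, 0, 1, 1, 1].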

Clustering Quality Metrics

After running auto-segmentation, the platform provides quality metrics to evaluate your clusters:

Silhouette Score: Measures how similar a point is to its own cluster versus other clusters. Ranges from -1 to 1, with higher values indicating better-defined clusters.
Davies-Bouldin Index: Measures the average similarity between each cluster and the cluster most similar to it. Lower values indicate better separation between clusters, with 0 as the lowest possible value.
Cluster sizes: The number of records in each cluster, helping you identify imbalanced segments.
Cluster profiles: Statistical summaries of each cluster showing the distinguishing characteristics.
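To show what the Silhouette Score measures, here is a small pure-Python sketch of the standard definition: for each point, a is the mean distance to the other points in its own cluster, b is the mean distance to the nearest other cluster, and the point's score is (b - a) / max(a, b). The function name is an assumption, and the sketch requires at least two clusters.

```python
import math
from statistics import mean

def silhouette_score(points, labels):
    """Mean silhouette over all points (needs at least two clusters)."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    scores = []
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # convention for single-point clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            mean(math.dist(p, q) for q in other)
            for key, other in clusters.items() if key != label
        )
        scores.append((b - a) / max(a, b))
    return mean(scores)

points = [(0.0,), (1.0,), (10.0,), (11.0,)]
good = silhouette_score(points, [0, 0, 1, 1])
bad = silhouette_score(points, [0, 1, 0, 1])
```

The natural grouping scores close to 1 (about 0.90), while the deliberately mixed-up grouping scores below zero, which is the signal to look for when comparing clustering runs.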

Predictive Segmentation

Once you have established segments (either through rules or clustering), you can train a supervised ML model to predict segment membership for new data. Predictive segmentation supports:

Logistic Regression: Fast, interpretable model for segment classification.
Random Forest: More complex model that can capture non-linear relationships.

Training a Predictive Model

Select the features to use for prediction.
Choose the target column (existing segment assignments).
Configure the train/test split ratio (default: 80/20, meaning 80% of records for training and 20% for testing).
Select the model type (logistic regression or random forest).
Train the model and review metrics (accuracy, precision, recall, F1 score, confusion matrix).
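The split and the review metrics in the steps above can be sketched as follows. This illustrates the standard definitions only: the function names, the fixed random seed, and the one-segment-at-a-time report are assumptions, and model training itself is left out.

```python
import random
from collections import Counter

def train_test_split(rows, test_ratio=0.2, seed=0):
    """Shuffle the rows and hold out test_ratio of them for evaluation."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]

def classification_report(actual, predicted, positive):
    """Accuracy overall, plus precision, recall, and F1 for one segment."""
    # Counts of (actual, predicted) pairs are the confusion matrix cells.
    confusion = Counter(zip(actual, predicted))
    tp = confusion[(positive, positive)]
    fp = sum(n for (a, p), n in confusion.items() if p == positive and a != positive)
    fn = sum(n for (a, p), n in confusion.items() if a == positive and p != positive)
    correct = sum(n for (a, p), n in confusion.items() if a == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": correct / len(actual), "precision": precision,
            "recall": recall, "f1": f1, "confusion": dict(confusion)}

actual = ["A", "A", "A", "B", "B"]
predicted = ["A", "A", "B", "B", "A"]
report = classification_report(actual, predicted, positive="A")
train, test = train_test_split(list(range(10)))
```

In this example three of five predictions are correct (accuracy 0.6), and segment A has precision, recall, and F1 of about 0.67 each. The ten rows are split into eight for training and two for testing.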

Feature Importance

After training, the system extracts feature importance scores showing which variables have the most influence on segment assignment. This helps you understand what drives the differences between your segments.

Applying Segments

Once segments are defined and applied, the assignment is stored on each record in the data. Each stored segment assignment includes:

configId: The segmentation configuration ID
configName: Name of the segmentation config
segmentName: The assigned segment name
appliedAt: Timestamp when the segment was applied

Segments can be used as filters throughout the analytics interface and as targeting criteria in campaign workflows.
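As a sketch of what a stored assignment might look like and how it could drive a filter, consider the following. The four field names come from the list above; the "segments" key on the record, the helper names, the sample IDs, and the ISO 8601 timestamp format are assumptions for illustration only.

```python
from datetime import datetime, timezone

def apply_segment(record, config_id, config_name, segment_name):
    """Attach a segment assignment to a record (hypothetical layout)."""
    record.setdefault("segments", []).append({
        "configId": config_id,
        "configName": config_name,
        "segmentName": segment_name,
        "appliedAt": datetime.now(timezone.utc).isoformat(),  # assumed format
    })
    return record

def in_segment(record, config_id, segment_name):
    """Filter predicate: does the record carry this segment assignment?"""
    return any(
        s["configId"] == config_id and s["segmentName"] == segment_name
        for s in record.get("segments", [])
    )

records = [{"id": 1, "nps": 10}, {"id": 2, "nps": 3}]
apply_segment(records[0], "cfg-001", "NPS groups", "Promoters")
apply_segment(records[1], "cfg-001", "NPS groups", "Detractors")
promoters = [r for r in records if in_segment(r, "cfg-001", "Promoters")]
```

Filtering on both configId and segmentName keeps segments from different configurations apart even when they share a name.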
