Predictive Modeling

Marketing is expensive when you target the wrong people.

Improve Targeting Through Predictive Modeling

Targeting the prospects most likely to become customers reduces waste and improves marketing efficiency.


Custom audiences can be marketed through many platforms: programmatic, video (OTT/CTV), YouTube, social media, email, and direct mail.


We can build the models and even provide the campaign management services needed to execute in-market tests, or we can work with your existing providers. Whatever works for you.


Different Models for Different Purposes

1. Look-alike Models

Look-alike modeling targets consumers who are more likely to be interested in your brand, product or service due to their attributes and demographics matching your current customers. Think of it like cloning your current customers. Look-alike models are a common starting point for custom audiences.


2. Response Models

Response modeling identifies consumers most likely to respond to your ads, using in-market data from previous marketing tests. This is one step up from look-alike models because it predicts an action (responding) rather than a profile.


3. LTV Models

LTV modeling identifies consumers who are likely to respond to your ad and fit the profile of high lifetime-value customers. This is especially important for subscription-based products, and model development typically includes retention modeling.


4. Lift Models

Lift modeling identifies consumers most likely to generate incremental sales or LTV because they saw your ads. It can only be built through in-market experiments that maintain a representative holdout audience. Response modeling can target consumers who were already going to purchase without your ad; lift modeling targets consumers who purchase only because of your ads, resulting in the lowest marginal cost per acquisition.
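As a rough illustration of the idea (not MineTrove's actual method), segment-level lift from a test with a representative holdout can be estimated as the treated response rate minus the holdout response rate. The segment names and counts below are made up:

```python
# Hypothetical sketch: rank segments by incremental response rate from an
# in-market test with a holdout. All inputs here are illustrative.
def rank_segments_by_lift(results):
    """results: segment -> (treated_responders, treated_n,
                            holdout_responders, holdout_n)."""
    lift = {
        seg: tr / tn - hr / hn
        for seg, (tr, tn, hr, hn) in results.items()
    }
    # Highest incremental response first.
    return sorted(lift, key=lift.get, reverse=True)
```

Segments whose holdout response is nearly as high as their treated response (people who would have bought anyway) rank low, which is exactly the point of lift modeling.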

Why Work With MineTrove?

1. We’re Specialists

We compiled decades of thought leadership and in-market experience into our modeling platform: an unsupervised modeling environment that combines machine learning and regression-based predictions to produce unparalleled results.


2. We’re Fast

We build all of our predictive models in-house, using our proprietary predictive modeling platform. What does this mean to you? Better customer targeting based on better science and shorter model development timelines.


3. Flexible Pricing

We can provide models based on a fixed price per model for internal company models (think lead optimization & remarketing) or on a CPM basis if third-party data needs to be licensed for acquisition audiences. We provide modeling services directly to marketers, ad agencies, direct mail agencies & data brokers.


4. No Platform Fees

We built our own technology to support our data processing and machine learning platform. We don’t incur charges from platforms like AWS, Databricks, Snowflake, Google Cloud, or any other third party because we’re not renting their tools. What we deliver to you is either: 1) a completely developed audience (if you’re licensing third-party data) or 2) a SQL-based model algorithm that you can implement in-house.


5. Your Data Is Secure

Producing predictive models requires obtaining a sample of your current customers to train the model. The sample can be provided with hashed PII to further enhance security. All of our data is stored in the U.S. in a secure data center, accessible only to us. Our models are all developed in-house, not outsourced to a third party where your data would be exposed to yet another unknown party.

Smarter. Faster. Better.

What does this mean to you? Better customer targeting based on better science, with shorter model development timelines, flexible pricing and data security for your customer data.

Here’s How We Do It

Programmatic Hygiene

We have programmatic routines to scan for unusable features (attributes) and remove them from the modeling process. This includes features with high cardinality, single values, or an unusually high percent of nulls or zeros.
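A minimal sketch of this kind of screening pass, assuming a columnar table of raw values; the thresholds are illustrative assumptions, not production settings:

```python
# Hypothetical feature-screening routine: flag features that are unusable
# for modeling. Thresholds are assumptions chosen for illustration.
def screen_features(table, max_distinct_share=0.5, max_missing_share=0.75):
    """table: feature name -> list of raw values (None marks a null)."""
    dropped = {}
    for name, values in table.items():
        n = len(values)
        non_null = [v for v in values if v is not None]
        distinct = set(non_null)
        missing = (n - len(non_null) + sum(1 for v in non_null if v == 0)) / n
        if len(distinct) <= 1:
            dropped[name] = "single value"          # no variability at all
        elif len(distinct) / n > max_distinct_share:
            dropped[name] = "high cardinality"      # nearly unique per record
        elif missing > max_missing_share:
            dropped[name] = "mostly null or zero"
    return dropped
```

Anything the screen flags is excluded before modeling begins; everything else moves on to imputation and screening.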

Null Imputation

It’s common for third-party data appends to include features with a high share of null or zero values. Yet these features are often still useful. Our algorithms impute values for nulls based on the dependent variable, and also determine whether a zero value should be imputed.
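One way to impute "based on the dependent variable" (a sketch of the general idea, not necessarily the exact algorithm) is to give null records the observed value whose response rate best matches the response rate of the null records themselves:

```python
# Hypothetical target-aware imputation: nulls inherit the value whose
# response rate is closest to the null records' own response rate.
def impute_nulls(values, y):
    """values: feature column (None = null); y: 0/1 dependent variable."""
    nulls = [r for v, r in zip(values, y) if v is None]
    if not nulls:
        return values[:]
    null_rate = sum(nulls) / len(nulls)
    rates = {}
    for v, r in zip(values, y):
        if v is not None:
            hit, n = rates.get(v, (0, 0))
            rates[v] = (hit + r, n + 1)
    # Pick the value whose response rate is closest to the nulls' rate.
    fill = min(rates, key=lambda v: abs(rates[v][0] / rates[v][1] - null_rate))
    return [fill if v is None else v for v in values]
```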

IV Screening

Training predictive models on datasets with hundreds or thousands of features is the new norm. We use Information Value screening to identify features with sufficient variability, streamlining the modeling process.
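Information Value is a standard screening statistic; for a binned feature it weights the difference between responder and non-responder distributions by the weight of evidence. A minimal computation looks like this (the example counts are invented):

```python
import math

# Information Value for one binned feature. A common rule of thumb reads
# IV below roughly 0.02 as having little predictive power.
def information_value(bins):
    """bins: list of (responders, non_responders) counts per bin."""
    total_r = sum(r for r, _ in bins)
    total_n = sum(n for _, n in bins)
    iv = 0.0
    for r, n in bins:
        pr, pn = r / total_r, n / total_n
        if pr > 0 and pn > 0:                     # skip empty cells
            iv += (pr - pn) * math.log(pr / pn)   # WoE-weighted difference
    return iv
```

A feature whose bins split responders and non-responders evenly scores zero and is screened out; a feature whose bins separate them sharply scores high and survives.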

Dimension Reduction

Dimension reduction is an important step in feature selection. The larger the number of explanatory variables introduced into regression modeling, the greater the likelihood of overfitting, leading models to fail to generalize to other datasets. We use machine learning algorithms to classify features into principal components through a clustering process. By then selecting the top candidate feature from each cluster, we reduce collinearity, helping to reduce model overfitting.
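A greatly simplified version of this idea, assuming correlation as the similarity measure and a greedy grouping rule (the real clustering is more sophisticated): group features whose pairwise correlation exceeds a threshold, then keep from each group the member most correlated with the target.

```python
# Hypothetical greedy variable-clustering sketch; the threshold and the
# greedy assignment rule are illustrative assumptions.
def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def reduce_features(features, target, threshold=0.8):
    """features: name -> column; keep one representative per cluster."""
    clusters = []
    for name in features:
        for members in clusters:
            if abs(pearson(features[name], features[members[0]])) >= threshold:
                members.append(name)    # joins an existing cluster
                break
        else:
            clusters.append([name])     # seeds a new cluster
    # Keep the member most correlated with the target from each cluster.
    return [max(m, key=lambda n: abs(pearson(features[n], target)))
            for m in clusters]
```

Collinear features land in the same cluster, so only one of them reaches the regression stage.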

Non-linear Transformation

Most continuous numeric features like age, housing values, wealth, etc. have a non-linear effect on your dependent variable (like response rates). For example, response rates may grow with age up to a point, then flatten out. We conduct automated transformations of continuous numeric features to find the best-fitting form. This ensures that the transformed feature has a linear effect on the dependent variable, maximizing model lift and stability, and reducing overfitting.
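The automated search can be sketched as trying a handful of candidate transforms and keeping the one most linearly related to the dependent variable. The candidate list here is an illustrative assumption:

```python
import math

# Hypothetical transform search: score each candidate by the strength of
# its linear relationship with the target and keep the winner.
TRANSFORMS = {
    "identity": lambda v: v,
    "sqrt": lambda v: math.sqrt(v),
    "log1p": lambda v: math.log1p(v),
    "square": lambda v: v * v,
}

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def best_transform(x, y):
    scores = {name: abs(pearson([f(v) for v in x], y))
              for name, f in TRANSFORMS.items()}
    return max(scores, key=scores.get)
```

For a response curve that grows and then flattens out, a logarithmic transform typically wins this comparison.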

Feature Recoding

Many class-level features, which are coded attributes such as segmentation codes or state codes, struggle to perform in predictive models as most coded variables have high cardinality. Sorting alphabetically is typically irrelevant, and even if you sort by response rate, data may be too thin in specific codes. We cluster values within class features into like-performing groups prior to introducing those attributes into the final modeling stage. The process results in more relevant class features surviving the pre-screen process, improving final model fit.
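A simplified sketch of the grouping step, assuming response rate as the performance measure: thin codes fall back to the overall rate, then codes are ranked by rate and cut into a few like-performing groups. The group count and minimum count are illustrative assumptions:

```python
from collections import defaultdict

# Hypothetical level-recoding routine: collapse a high-cardinality coded
# attribute into a few groups with similar response rates.
def recode_levels(codes, y, n_groups=2, min_count=5):
    counts = defaultdict(lambda: [0, 0])          # code -> [responders, total]
    for c, r in zip(codes, y):
        counts[c][0] += r
        counts[c][1] += 1
    overall = sum(y) / len(y)
    # Thin codes get the overall rate rather than a noisy estimate.
    rate = {c: (resp / n if n >= min_count else overall)
            for c, (resp, n) in counts.items()}
    ranked = sorted(rate, key=rate.get)           # worst to best performers
    size = -(-len(ranked) // n_groups)            # ceiling division
    return {c: i // size + 1 for i, c in enumerate(ranked)}
```

The model then sees a handful of stable groups instead of dozens of sparse codes.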

Iterative Regression Modeling

Once we reduce our feature set to the transformed and recoded versions of the principal components that show the most promise for model development, we begin an iterative model training routine that seeks to further reduce the final feature set. Our goal is to produce the simplest model possible with the greatest predictive lift. Simpler models improve model stability, which improves the model’s ability to accurately predict (i.e., generalize) during production use.
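One classic form of this routine is backward elimination: repeatedly drop the feature whose removal costs the least fit, stopping when any further drop would hurt. The sketch below uses R² of a small OLS fit as the fit measure and an invented tolerance; it illustrates the shape of the loop, not the production training routine:

```python
# Hypothetical backward-elimination sketch using OLS R^2 as the fit proxy.
def solve(a, b):
    """Gauss-Jordan elimination for a small linear system a x = b."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def r_squared(features, names, y):
    """R^2 of an OLS fit (normal equations) on the named features."""
    rows = [[1.0] + [features[n][i] for n in names] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    ybar = sum(y) / len(y)
    ss_res = sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
                 for r, yi in zip(rows, y))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def backward_eliminate(features, y, tol=0.01):
    keep = list(features)
    score = r_squared(features, keep, y)
    dropped_one = True
    while dropped_one and len(keep) > 1:
        dropped_one = False
        for name in list(keep):
            trial = [n for n in keep if n != name]
            s = r_squared(features, trial, y)
            if score - s < tol:       # dropping this feature costs ~nothing
                keep, score, dropped_one = trial, s, True
                break
    return keep
```

Noise features whose removal barely moves the fit get pruned, leaving the simplest model that preserves predictive lift.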

Cross-Validation

No modeling process would be complete without cross-validation. A validation set is a subset of the initial dataset held out from model training. Validation indicates whether a model can generalize in production use on a rollout basis. This is standard fare for our process.
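In its simplest form (a sketch with an assumed holdout share and seed, not the exact validation scheme): hold out a sample before training, then confirm that the top-scored decile of the holdout still out-responds the base rate.

```python
import random

# Hypothetical holdout split plus a top-decile lift check on the holdout.
def split_holdout(rows, valid_share=0.3, seed=42):
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * valid_share)
    return shuffled[cut:], shuffled[:cut]          # (train, validation)

def top_decile_lift(scored):
    """scored: list of (model_score, responded 0/1) on the validation set."""
    ranked = sorted(scored, reverse=True)
    top = ranked[: max(1, len(ranked) // 10)]
    base_rate = sum(r for _, r in scored) / len(scored)
    top_rate = sum(r for _, r in top) / len(top)
    return top_rate / base_rate
```

A model whose top-decile lift holds up on the validation set is a good candidate for rollout; one whose lift collapses there was overfit to the training data.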

Diagnostic Reporting

We provide a suite of reports that illustrate general model diagnostics and model lift, rank features from most to least influential, and profile consumers from highest scored (e.g., most likely to respond or purchase) to lowest. This is an especially useful way to help marketers understand their target audience.

Simplified Implementation

Our modeling platform automatically recodes the final modeling algorithm using SQL so you don’t need a proprietary software license to implement the model. This makes scoring a breeze for any database environment. Prefer for us to score a file and send it back? That’s fine too.
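To make the idea concrete, here is a sketch of what "recoding a model to SQL" can look like for a logistic model: the coefficients become arithmetic, the link function becomes EXP, and any database can score. The column and table names are made up for illustration:

```python
# Hypothetical generator that renders a fitted logistic model as a SQL
# scoring query. Coefficients, columns, and table name are illustrative.
def model_to_sql(intercept, coefficients, table):
    terms = " + ".join(f"{w:.6f} * {col}" for col, w in coefficients.items())
    linear = f"{intercept:.6f} + {terms}"
    return (
        "SELECT id,\n"
        f"       1.0 / (1.0 + EXP(-({linear}))) AS score\n"
        f"FROM {table};"
    )
```

Because the output is plain SQL, scoring needs no modeling software at all, only the database the audience already lives in.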

Let’s discuss your next project