Mechanical Engineering

Effectiveness verification framework for coupon distribution marketing measure considering users’ potential purchase intentions

Article: 2307718 | Received 26 Jun 2023, Accepted 16 Jan 2024, Published online: 21 Jan 2024

Abstract

In recent years, web marketing has thrived, and online coupon distribution has become a significant marketing measure that leads to increased sales. However, randomly distributing coupons risks lowering the profit ratio of companies. Therefore, it is important to estimate the effect of coupons and analyze the causal relationship between coupons and results. The potential purchase intention (PPI) of users is believed to influence the effect of coupons. For example, distributing coupons to users with a low PPI is likely to increase the gross profit of companies, whereas distributing coupons to users with a high PPI is likely to decrease the gross profit. Therefore, by analyzing the relationship between PPI and the effect of coupons, highly effective targeting can be conducted based on the PPI. In this paper, we propose an experimental design based on machine learning to analyze the effect of coupons, which varies depending on the PPI. We propose a method to predict users’ PPI based on their purchase history data using machine learning and analyze the relationship between PPI and the effect of coupons. Finally, we demonstrate the effectiveness of the proposed framework by applying it to real-world data.

1. Introduction

In recent years, data-driven marketing has gained increasing importance, as businesses make decisions based on accumulated data (Rodrigues et al., Citation2020; Sugisaki et al., Citation2021; Yoneda et al., Citation2022). As customers on e-commerce (EC) sites log in using their user IDs to browse product pages, search, and purchase products, the site side can acquire various action histories. Purchase and browsing history data on EC sites are valuable for understanding customer preferences, and it is expected that useful knowledge can be obtained by analyzing these data (Goto et al., Citation2015). Regarding purchasing behavior in real stores, the spread of point cards, credit cards, and electronic money systems has made it possible to acquire purchase history data for numerous customers and use them for marketing activities. Several retailers use purchase history data to develop measures for building better long-term relationships with customers. In this context, various studies have been actively conducted on methods for analyzing customer purchase and browsing history data to obtain knowledge that can be used in businesses (Apichottanakul et al., Citation2021; Okayama et al., Citation2019; Seko et al., Citation2021; Yang et al., Citation2022; Hasumoto and Goto, Citation2022; Shimizu et al., Citation2022; Amano et al., Citation2023).

To conduct effective marketing activities based on data, it is necessary to accurately estimate the effects of various measures using historical data and analyze the causal relationships between the measures and the results obtained. One of the most effective marketing measures is coupon distribution, which can increase sales, particularly in web marketing (Dias et al., Citation2015). However, randomly distributing coupons risks lowering the profit ratio of companies (Cheng & Dogan, Citation2008) because of the costs and the existence of customers who only use coupons and do not contribute to an increase in sales. Therefore, it is important to estimate the effect of distributing coupons accurately and analyze the causal relationship between coupons and their results. Randomized controlled trials (RCTs) (Medical Research Council, Citation1948; Crofton & Mitchison, Citation1948) are required to evaluate the effects of marketing measures, such as coupon distribution, correctly (Didero et al., Citation2021). However, it is often difficult for companies conducting daily business to run RCTs because of the high experimental costs. Although ideal experimental data from RCTs are therefore hard to obtain in business, large amounts of observational data are sometimes accumulated when measures, such as distributing coupons, have been implemented in the past. The effect of the measure needs to be correctly estimated from such observational data, and various approaches have been studied in the field of causal inference (Morgan, Citation2014; Pearl et al., Citation2016). A method based on causal inference was proposed to estimate the effect of coupons on a per-user basis (Tsuboi et al., Citation2022). However, this method is limited to verifying the effect of coupons through simulations using artificial and past real-world data. Few studies go beyond estimating the effect of measures from past data to cover the full process of determining in advance how to implement the measures, implementing them in actual services, and verifying their effectiveness. Such studies are valuable because they allow the effect of coupons to be estimated more accurately.

This paper proposes a series of frameworks for designing experiments based on machine-learning methods to accurately estimate how the effectiveness of coupon measures differs with users’ potential purchase intention (PPI). This proposal assumes that the user PPI influences the effect of coupons. By planning measures that consider PPI and analyzing the relationship between the PPI and the effects of coupons, it is possible to verify the effectiveness of the measures accurately and obtain important knowledge for marketing purposes.

In this study, a machine-learning method is used to predict the PPI of users based on purchase history data. An experimental design based on a conditional RCT is developed to accurately analyze how the effects of the measures differ with the predicted PPI. Furthermore, the effects of the measures are estimated and analyzed in detail using both model-based and model-free approaches. The proposed framework is applied to historical data from a coupon measure to demonstrate its usefulness.

2. Related work

2.1. Randomized controlled trials

An RCT is an experimental method to estimate the effect of measures accurately by randomly assigning customers to either a treatment group (with measures) or control group (with no measures) (Medical Research Council, Citation1948; Crofton & Mitchison, Citation1948). The differences in gender, age, and other factors can have an effect on the outcome variable, making it difficult to estimate the effect of measures accurately. When analyzing the effect of measures from various observational data obtained from business activities rather than data obtained from properly controlled experiments, it is necessary to consider that such data may contain selection bias. To eliminate such selection bias, it is possible to apply causal inference methods to data with selection biases (Morgan, Citation2014; Pearl et al., Citation2016). However, the RCT accurately estimates the effect of measures by randomly assigning customers to either the treatment or control group and comparing the results of the treatment group with those of the control group.

2.2. Studies on effectiveness verification of past actual measures

Various studies have been conducted to analyze customer purchasing characteristics by applying machine-learning models to customer purchase histories and browsing histories (Goto et al., Citation2015; Apichottanakul et al., Citation2021; Okayama et al., Citation2019; Seko et al., Citation2021; Yang et al., Citation2022; Hasumoto and Goto, Citation2022). These analytical models are expected to provide results that can lead to ideas for planning subsequent marketing measures, such as grouping customers with similar preferences and identifying important products and target customers.

In recent years, several studies have estimated and validated the effectiveness of marketing measures in real-world businesses. Tsuboi et al. (Citation2022) introduced a model that utilizes observational data to estimate the effects of multiple measures, aiming to analyze the optimal measures for individual users. Saito et al. (Citation2019) proposed a standardized approach for uplift modeling using observational data, to achieve optimal personalization across various fields. Langen and Huber (Citation2023) presented a causal machine-learning algorithm that assesses the causal effects of coupon campaigns, marketing campaigns aimed at increasing sales, by analyzing not only the average effects of different types of coupons but also the heterogeneity of causal effects among different customer subgroups. Shimizu et al. (Citation2023) proposed a method for fashion image retrieval and performed comparative verification through questionnaire experiments.

Several studies that estimate and verify the effectiveness of business measures rely on simulations using past measurement results or artificial data. It is valuable to conduct studies that go beyond simulation for effect estimation to calculate the effects of measures accurately by planning experimental designs, implementing the measures in an actual service, and analyzing their effects. Therefore, we particularly refer to Li et al. (Citation2021)’s study, which proposed a framework encompassing experimental design, measure implementation, and effect verification in the following section.

2.3. Effectiveness verification of reminder notification measures

Li et al. (Citation2021) proposed a series of frameworks in which they developed an experimental design, implemented measures on an actual online service, and estimated and analyzed the effect of measures. The study evaluated the causal effect of a marketing measure based on experimental results using a precise design based on the authors’ insights and hypotheses. However, our study attempted to construct a machine-learning model from log data, which are the result of measures implemented in the past, and to design appropriate experiments using this model. Here, we summarize the research by Li et al., who obtained important results using traditional experimental approaches.

2.3.1. Problem settings

The objective of Li et al. (Citation2021) was to analyze whether the length of time until a reminder is sent affects the probability of a customer purchasing a product left in the cart. In previous studies and considering the general idea of reminders, the length of time until a reminder is sent should be as short as possible (Moore, Citation2013; Prioleau, Citation2013). In Li et al.’s study, simple aggregations of data suggested that a shorter length of time until a reminder was sent was associated with higher purchase probabilities. However, if we assume that the shorter the time elapsed after adding a product to the cart, the higher the probability of purchase (Brown, Citation1958; Mueller et al., Citation2003), regardless of the reminder, an early reminder may have a negative effect on the probability of purchase (Aaker & Bruzzone, Citation1985; Goldstein et al., Citation2014; Todri et al., Citation2020; Yoo & Kim, Citation2005), as shown in Figure 1.

Figure 1. Image of the length of time until a reminder is sent and the probability of purchase. The difference between the treatment group and the control group is the effect of the reminder.

As shown in Figure 1, reminding users too early may negatively affect the probability of purchase. Therefore, Li et al. proposed a framework that enables an accurate estimation of the effects of reminders by comparing a control group with a treatment group at each time point.

2.3.2. Experimental design

Li et al. used an RCT to estimate the effect of reminders by comparing a control group with a treatment group at each time point.

The experimental design in Li et al. is as follows:

  1. Set multiple conditions for the length of time until a reminder (0.5, 1, 3, 6, 9, 12, 24, and 72 h later).

  2. Assign conditions to target users randomly.

  3. Assign randomly whether to send a reminder to users who have left an item in their cart until the allocated time has passed.

  4. Send (or not) a reminder according to the assigned condition.

  5. Observe and analyze whether the target users purchase the item in their cart within one month after receiving the reminder.
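
Steps 1 to 3 of this design can be sketched as follows. This is a minimal illustration, not Li et al.’s implementation: the function name, the uniform choice among delay conditions, and the 50% reminder probability are assumptions made for the example.

```python
import random

# Delay conditions (hours) listed in step 1.
DELAYS_H = [0.5, 1, 3, 6, 9, 12, 24, 72]


def assign_reminder_conditions(user_ids, send_probability=0.5, seed=0):
    """Randomly give each user a delay condition and a reminder/no-reminder arm.

    The uniform choice of delay and the default 50% send probability are
    assumptions for illustration only.
    """
    rng = random.Random(seed)
    plan = {}
    for user_id in user_ids:
        plan[user_id] = {
            "delay_h": rng.choice(DELAYS_H),                   # step 2
            "send_reminder": rng.random() < send_probability,  # step 3
        }
    return plan


# Hypothetical user IDs.
plan = assign_reminder_conditions([f"u{i}" for i in range(100)])
```

Because every delay condition contains both reminded and non-reminded users, the two arms can be compared separately at each delay.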

2.3.3. Effectiveness verification

Based on the experimental design described above, Li et al. implemented the reminder measure and compared the results of the treatment and control groups for each condition. The results showed that too short a length of time until a reminder had a negative effect on the probability of purchase, whereas a long length of time until a reminder had a positive effect on the probability of purchase.

In addition to the simple verification described above (model-free verification), Li et al. predicted the purchase probability by applying a regression model and analyzed the effect of the length of time until a reminder on the probability of purchase by comparing the coefficients of the obtained regression model (model-based verification).

As described above, by designing and conducting the RCT in which the treatment and control groups can be compared based on the length of time until a reminder, it is possible to analyze the relationship between the length of time until a reminder and the effect of measures in the series of proposed frameworks.

3. Proposed framework

In the framework proposed by Li et al., which is based on specialized business knowledge, the variable that can be controlled by humans, called the notification time, is set as the axis of the experimental design. In this study, machine learning was used to set the unobservable variables as the axes of the experimental design.

3.1. Problem setting and overview of the proposal

We propose a series of frameworks, spanning from an experimental design based on a conditional RCT to the verification of the effect of coupons, to estimate and analyze the relationship between the PPI and the effect of coupons. The proposal rests on the following hypotheses.

  • Hypothesis 1. The effectiveness of coupon measures varies depending on the PPI (Barat & Ye, Citation2015).

  • Hypothesis 2. By using machine learning to predict PPI based on past data, it is possible to leverage it as a key factor in setting up future campaign targeting (Andronie et al., Citation2021; Dabija et al., Citation2022; Pop et al., Citation2023).

  • Hypothesis 3. By conducting randomized controlled trials separately for each group in the experimental setup, it is possible to allocate a consistent number of users to each group with certainty (Li et al., Citation2021).

We posit that user PPI is a major factor that causes differences in coupon effects according to the conventional understanding (Barat & Ye, Citation2015). Therefore, analyzing the relationship between the PPI and the effects of coupons will provide useful knowledge for highly effective targeting (based on PPI) in the future. To this end, using machine-learning methods, we predicted the PPI, such as the expected purchase amount and frequency of purchases for each user, based on data, such as past purchase history and user attributes, in the proposed framework.

Subsequently, using the predicted PPI, we divided the users into groups and planned a conditional randomized comparison test for each group in the proposed framework. In other words, the users in each group were randomly assigned to a treatment group, to which coupons were distributed, or a control group, to which no coupons were distributed.

Finally, by comparing the results of the treatment and control groups, we analyzed the relationship between the PPI and the effect of coupons in the proposed framework.

In a standard RCT, the entire population is randomly assigned to the treatment or control group. In this study, on the other hand, it is desirable to have a certain number of samples for each condition of users. Therefore, users were divided into groups based on the predicted PPI, and coupons were randomly assigned to users within each group. In this study, we define such an experimental design as a ‘conditional randomized comparison test’, which we also refer to as a conditional RCT.
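
The within-group randomization described here can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the 50/50 split, and the toy group labels are hypothetical and are not taken from the experiment.

```python
import random
from collections import defaultdict


def conditional_randomize(user_groups, treat_fraction=0.5, seed=0):
    """Randomly assign users to treatment/control separately within each group.

    user_groups: dict mapping user_id -> group label (e.g. a PPI segment).
    Returns a dict mapping user_id -> "treatment" or "control".
    The treat_fraction default is an assumption for illustration.
    """
    rng = random.Random(seed)
    members = defaultdict(list)
    for user_id, group in user_groups.items():
        members[group].append(user_id)

    assignment = {}
    for group in sorted(members):
        users = sorted(members[group])
        rng.shuffle(users)  # random order within the group
        n_treat = round(len(users) * treat_fraction)
        for i, user_id in enumerate(users):
            assignment[user_id] = "treatment" if i < n_treat else "control"
    return assignment


# Hypothetical toy data: 12 users in three groups of unequal size.
labels = ["low"] * 6 + ["mid"] * 4 + ["high"] * 2
user_groups = {f"u{i}": g for i, g in enumerate(labels)}
assignment = conditional_randomize(user_groups)
```

Randomizing inside each group fixes the number of treated and control users per group in advance, which a single randomization over the whole population would not guarantee for small groups.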

The details of the framework proposed in this paper are described below.

  • STEP1. Predict users’ PPI by machine learning.

  • STEP2. Group users based on the predicted PPI.

  • STEP3. Assign randomly whether or not to distribute coupons to users assigned to each group (setting control and treatment groups).

  • STEP4. Distribute (or not) coupons according to the assigned condition.

  • STEP5. Observe and analyze whether or not the target user will purchase using the coupon during the implementation period.

Figure 2 shows the overall architecture diagram of the framework proposed in this paper.

Figure 2. The overall architecture diagram. Based on the Data Flow Diagram (DFD), the arrows represent input and output, vertical lines indicate data, circles represent processes, and below, the tools used are shown.

3.2. Prediction of potential purchase intention and setting of experimental conditions

In this section, we will provide a detailed explanation of STEP 1, 2 and 3 of the proposed framework.

To group users based on their PPI, we predicted their PPI using a machine-learning method in the proposed framework. The learning process uses the user’s behavioral history, such as past purchases and browsing, and user attribute information as input data and predicts the user’s PPI, which is expressed in terms of future purchase amounts, purchase frequency, and other purchase behavior outcome indices. In this study, we first take attribute and behavioral data on users during a specific past period as inputs and train LightGBM (Ke et al., Citation2017) using the users’ purchase data for the subsequent period as the target values.

LightGBM is a popular machine-learning framework designed for efficient and high-performance gradient boosting. It is particularly useful for tasks such as classification and regression. LightGBM stands out for its speed and memory efficiency, which make it capable of handling large datasets and complex models. It uses a histogram-based algorithm to split data during training, which speeds up the process significantly.

Subsequently, using the learned model for all the users in the experiment, the PPIs of all the users were predicted.

In the proposed framework, two types of PPI were predicted and groupings based on both types of PPI were created to conduct a more detailed analysis in a single experiment. The variables that express the PPI include the future purchase amount and the frequency of purchases by each user. Depending on the content of the measure, it may be possible to set variables that are limited to items in specific categories, such as the purchase amount for cosmetics or purchase frequency for shoes. The details of the variables used in the experimental analyses are presented in the experimental section.

Figure 3 shows the grouping of users based on the two types of PPI and the assignment of experimental conditions for conditional RCTs in the proposed framework.

Figure 3. Image of grouping users based on PPI and assigning experimental conditions. Using two types of PPI, divide into nine groups and assign the conditions of no coupon and coupon available within them.

After grouping the users, as shown in Figure 3, we plan to conduct a conditional RCT to assign users in each group randomly to a treatment group to which coupons are distributed and a control group to which no coupons are distributed.
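
The nine-group segmentation can be sketched as follows. The paper does not state the thresholds, so rank-based terciles on each axis are an assumption made here for illustration, as are the function names and the toy values.

```python
def tercile_labels(values):
    """Label each value 'low', 'medium', or 'high' by rank.

    Rank-based terciles are an assumption; they keep the three bins on one
    axis as even as possible.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [None] * n
    for rank, i in enumerate(order):
        bin_index = min(rank * 3 // n, 2)
        labels[i] = ("low", "medium", "high")[bin_index]
    return labels


def nine_groups(ppi_a, ppi_b):
    """Combine the tercile labels of two predicted PPI values into one cell per user."""
    return list(zip(tercile_labels(ppi_a), tercile_labels(ppi_b)))


# Hypothetical predicted purchase amounts for six users.
ppi_secondhand = [100, 900, 400, 50, 700, 300]
ppi_new = [5000, 200, 3000, 4000, 100, 2500]
groups = nine_groups(ppi_secondhand, ppi_new)
```

Even bins on each axis do not guarantee even counts in the nine cells when the two PPI values are correlated, so in practice the thresholds may need adjusting to balance the cells.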

3.3. Effectiveness verification

In this section, we will provide a detailed explanation of STEP 4 and 5 of the proposed framework. After implementing the measures (experiments), in addition to comparing the outcome variables by propensity score matching, a prediction model was constructed using the outcome variables as objective variables, and the effects of the explanatory variables (covariates) were confirmed in the proposed framework. Thus, we not only obtain an overall understanding of the effects obtained from the measures, but also analyze the relationship among the PPI, covariates, and the effects of the measures.

Propensity score matching makes it possible to remove the influence of selection bias and compare the outcome variables between the control and treatment groups. This method uses data, such as user attribute information and past behavioral history, as inputs, similar to the explanatory variables used to predict the PPI. A logistic regression is then learned using whether coupons are distributed to users as the objective variable, and the results are compared between the users in the control and treatment groups whose predicted values (propensity scores) are similar. This verification of the proposed framework enabled us to obtain an overview of the effects of the measures on each group.
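
The matching step can be sketched as follows. This is a minimal illustration that takes the propensity scores as given (for example, from a fitted logistic regression). Greedy 1:1 nearest-neighbor matching with a caliper is one common variant and is an assumption here, since the paper does not specify the matching rule. All names and values are hypothetical.

```python
def match_on_propensity(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, control: lists of (user_id, propensity_score, outcome) tuples.
    Returns (treated_row, control_row) pairs whose scores differ by at most
    `caliper`. The caliper value is an assumption for illustration.
    """
    available = list(control)
    pairs = []
    for t in sorted(treated, key=lambda row: row[1]):
        if not available:
            break
        best = min(available, key=lambda c: abs(c[1] - t[1]))
        if abs(best[1] - t[1]) <= caliper:
            pairs.append((t, best))
            available.remove(best)  # each control user is used at most once
    return pairs


def matched_effect(pairs):
    """Mean outcome difference (treated minus control) over the matched pairs."""
    if not pairs:
        raise ValueError("no matched pairs")
    return sum(t[2] - c[2] for t, c in pairs) / len(pairs)


# Hypothetical scores and binary purchase outcomes.
treated = [("t1", 0.30, 1), ("t2", 0.52, 0), ("t3", 0.90, 1)]
control = [("c1", 0.31, 0), ("c2", 0.50, 0), ("c3", 0.55, 1)]
pairs = match_on_propensity(treated, control)
effect = matched_effect(pairs)
```

In this toy input, the treated user with a score of 0.90 has no control user within the caliper and is left unmatched, so the comparison uses only users with similar scores.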

Furthermore, by using a machine-learning method to predict the outcome variables during the measurement period, with user attribute information, past behavioral history, experimental condition groups, and whether to distribute coupons as inputs, and by analyzing the degree of influence of each variable on the outcome variables in the proposed framework, it is possible to analyze the relationship among the PPI, other covariates, and the effect of measures.

In Li et al. (Citation2021), the authors conducted this verification using a logistic regression model with high interpretability but lower predictive accuracy than other methods. However, recent developments in explainable artificial intelligence technology, such as SHAP (Lundberg & Lee, Citation2017), have made it possible to easily interpret models with high prediction accuracy but low interpretability, such as LightGBM.

SHAP assesses how much each individual feature contributes to a prediction and breaks down the model’s predictions into interpretable components. This allows for a better understanding of model predictions and the evaluation of feature importance. SHAP helps improve the interpretability of black-box models and makes the model’s decision-making process more transparent.
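
The idea of splitting one prediction into per-feature contributions can be sketched with exact Shapley values on a toy model. This illustrates the underlying attribution principle only. It is not the SHAP library or its tree-specific algorithm. The toy model, the baseline, and the function names are assumptions.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction.

    Features outside a coalition are set to their baseline value. All
    coalitions are enumerated, so this is practical only for a few features.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy model with an interaction between features 1 and 2.
def toy_model(f):
    return 2.0 * f[0] + f[1] * f[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

The contributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is shared equally between the two features involved.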

Various studies based on LightGBM and SHAP are actively being conducted in various fields (Kim Chi et al., Citation2023; Chadaga et al., Citation2022; Muhamedyev et al., Citation2020). In the proposed framework, we used LightGBM as a validation model and SHAP to interpret the model and analyze the relationship between the PPI and other covariates and the effect of the measures in detail.

4. Experimental analyses

To demonstrate the usefulness of the proposed framework, we simulate an actual experimental design using data related to past coupon distribution measures on the fashion EC site ZOZOTOWN (ZOZO, Inc., 2023).

Numerous studies have analyzed PPI in the fashion industry, and predicting and analyzing PPI is considered important (Andronie et al., Citation2021; Dabija et al., Citation2022; Pop et al., Citation2023). In this study, we assume that the user PPI influences the effect of coupons. Therefore, analyzing the relationship between PPI in the fashion industry and the effectiveness of coupon measures is expected to be beneficial.

In this study, the target measure is a coupon distribution measure that aims to promote new purchases of second-hand fashion items by users who have not previously purchased second-hand fashion items from the target service.

4.1. Analysis conditions

Based on the results of a preliminary experiment, two predicted values were used as the PPI to divide users into groups in the proposed framework:

  1. the expected purchase amount of second-hand fashion items in the next year

  2. the expected purchase amount of new fashion items in the next year

To train the model to calculate these predictions in the proposed framework, we used the data of 200,000 users who made their first purchase of second-hand fashion items in September 2021, including their purchase history, browsing history of new fashion items, browsing history of second-hand fashion items, and user attribute information, for the period from September 2020 to August 2021 as explanatory variables. In addition, LightGBM was trained using the purchase amounts of new and second-hand fashion items for a one-year period from September 2021 as the objective variable.
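
The split between the feature period and the target period can be sketched as follows. This is a minimal illustration: the record layout, the two example features, and the function name are hypothetical, and only the window boundaries follow the periods stated above.

```python
from datetime import date


def build_training_rows(purchases, feature_start, label_start, label_end):
    """Split each user's purchase log into past-window features and a future-window label.

    purchases: list of (user_id, purchase_date, amount) tuples.
    Features come from [feature_start, label_start); the label is the total
    amount spent in [label_start, label_end).
    """
    rows = {}
    for user_id, day, amount in purchases:
        row = rows.setdefault(
            user_id, {"past_amount": 0, "past_count": 0, "label_amount": 0}
        )
        if feature_start <= day < label_start:
            row["past_amount"] += amount
            row["past_count"] += 1
        elif label_start <= day < label_end:
            row["label_amount"] += amount
    return rows


# Hypothetical purchase log.
purchases = [
    ("u1", date(2020, 10, 5), 3000),
    ("u1", date(2021, 3, 14), 4500),
    ("u1", date(2021, 11, 2), 2000),
    ("u2", date(2021, 7, 30), 8000),
    ("u2", date(2022, 9, 1), 1000),  # outside both windows, ignored
]
rows = build_training_rows(
    purchases, date(2020, 9, 1), date(2021, 9, 1), date(2022, 9, 1)
)
```

Keeping the feature window strictly before the target window prevents information from the predicted period from leaking into the explanatory variables.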

For measured data, we use the results of the coupon distribution measure implemented for 1.6 million users who had not purchased second-hand fashion items in August 2022 and who had purchased new fashion items within the past year. Therefore, using variables, such as purchase history, browsing history of new fashion items, browsing history of second-hand fashion items, and user attribute information, from August 2021 to July 2022, we predicted the purchase amounts of new and second-hand fashion items for the year after August 2022 using a learned model. The predicted values were then used as the PPI of the users targeted by the measure, and the users were divided into groups in the proposed framework.

The outcome variable was whether the user made the first purchase of second-hand fashion items during the measurement period. To verify the effects of the measure, we compare the treatment and control groups within each group using propensity score matching (Rosenbaum & Rubin, Citation1983). A logistic regression model was used for propensity scoring. As mentioned previously, model-based validation combines predictions by LightGBM and interpretations by SHAP to provide a detailed analysis of the results and other covariates in the proposed framework.

4.2. Grouping users based on potential purchase intention

Under the aforementioned conditions, we used LightGBM in the proposed framework to predict 1) the expected purchase amount of new fashion items and 2) the expected purchase amount of second-hand fashion items in the next year.

In the prediction of the expected purchase amount of new fashion items in the next year, the average time spent on the site and the number of products purchased in the past year were important features. In the prediction of the expected purchase amount of second-hand fashion items, age and the number of second-hand fashion items viewed in the past month were important features. These features were shown to affect the predicted values. Thus, the prediction using the proposed framework provides additional insights into user characteristics.

As it was necessary to have a certain number of users in each group, we set the number of segments to nine (3 × 3) such that the number of users in each group was as even as possible. For convenience, we omit the threshold values used for segmentation on each axis. Table 1 shows the number of users assigned to each group.

Table 1. Number of users for each group and each condition.

4.3. Results and discussion of the effectiveness test

Figure 4 shows the results of the comparison between the treatment and control groups using propensity score matching within each group divided by the two types of PPI in the proposed framework. In other words, it shows, for each group, the average treatment effect (Rubin, Citation1974) of coupons on the probability of purchasing second-hand fashion items; a darker color indicates a higher effect.

Figure 4. Comparison result of the average treatment effect between the treatment and control groups using propensity score matching. A darker color in the figure indicates that the measure was more effective. *the values on the axis are hidden for convenience.

The results in the proposed framework are positive for all the groups, indicating that the probability of purchasing second-hand fashion items increases with the coupon distribution, regardless of the PPI. In addition, a comparison based on the PPI shows that the measure has a particularly large effect on the groups with a ‘high’ or ‘medium’ PPI for second-hand fashion items and a ‘low’ PPI for new fashion items. This suggests that users who are not attached to new fashion items and have a latent preference for second-hand fashion items are more likely to purchase second-hand fashion items because coupons stimulate their purchase intentions for second-hand fashion items. In contrast, the measure has a small effect on the users with a ‘high’ PPI of new fashion items regardless of their PPI of second-hand fashion items. This result suggests that users have a strong preference for new fashion items and have already used the target service sufficiently to purchase new fashion items. Therefore, coupons for second-hand fashion items are unlikely to encourage them to purchase second-hand fashion items.
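
The per-group comparison summarized in Figure 4 can be sketched as a difference in purchase rates between the treatment and control users of each group. This is a simplified illustration that omits the matching step. The record layout and the toy numbers are hypothetical and are not the experimental results.

```python
from collections import defaultdict


def effect_by_group(records):
    """Difference in purchase rate (treatment minus control) within each group.

    records: iterable of (group, arm, purchased) with arm in
    {"treatment", "control"} and purchased in {0, 1}.
    """
    totals = defaultdict(lambda: {"treatment": [0, 0], "control": [0, 0]})
    for group, arm, purchased in records:
        totals[group][arm][0] += purchased  # number of purchasers
        totals[group][arm][1] += 1          # number of users
    effects = {}
    for group, arms in totals.items():
        (t_buy, t_n), (c_buy, c_n) = arms["treatment"], arms["control"]
        if t_n and c_n:  # skip groups missing one arm
            effects[group] = t_buy / t_n - c_buy / c_n
    return effects


# Hypothetical toy records; cells are (second-hand PPI, new PPI).
cell_a = ("high", "low")
cell_b = ("low", "high")
records = (
    [(cell_a, "treatment", 1)] * 2 + [(cell_a, "treatment", 0)] * 2
    + [(cell_a, "control", 1)] * 1 + [(cell_a, "control", 0)] * 3
    + [(cell_b, "treatment", 1)] * 1 + [(cell_b, "treatment", 0)] * 3
    + [(cell_b, "control", 1)] * 1 + [(cell_b, "control", 0)] * 3
)
effects = effect_by_group(records)
```

Each value corresponds to one cell of a 3 × 3 heat map, where a larger positive difference means that coupons raised the purchase probability more in that group.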

We analyzed the features that contribute to the purchase of second-hand fashion items by training LightGBM and interpreting the resulting model with SHAP in the proposed framework. The objective variable was whether a new purchase of second-hand fashion items was made during the period of the measure, and the explanatory variables were the covariates, whether coupons were distributed, and the group to which the user was assigned. Figure 5 shows the top five variables that SHAP identified as having a significant effect on the objective variable, together with their SHAP values.

Figure 5. Results of the analysis of the effects of each covariate on the outcome variable using model-based validation. The top five variables and their SHAP values, which were determined to have a significant effect on the objective variable by SHAP.

The results indicate that the probability of newly purchasing second-hand fashion items is higher for users who already have an interest in second-hand fashion items and have browsed such products. In addition, users with lower average unit prices for new items were more likely to purchase second-hand fashion items. This may reflect a motivation to minimize costs and a lower emphasis on new fashion items. The top SHAP values do not include a feature related to whether coupons are distributed, which may be because the number of users who purchased second-hand fashion items is small.

The proposed framework can yield more detailed analysis results for this experiment, such as the specific relationship between PPI and coupon effectiveness within each group and the relationship between individual user characteristics and the effectiveness of coupon measures. However, due to confidentiality obligations related to the data source, we are unable to provide more specific business insights in this paper. Therefore, this paper presents only an overall view of the relationship between PPI and the effectiveness of coupon strategies across the groups.

In the proposed framework, we first performed an experimental design based on machine learning, considering the factors that the service side wanted to analyze, while ensuring a sufficient number of users to whom the measure would (or would not) be applied. We then verified the effectiveness of the measure using the results. Model-free verification captured the overall picture, and model-based verification confirmed that the extent to which each factor influences the outcome variable can be analyzed in detail.
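The group-wise assignment step of such an experimental design can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the actual service: the function name `assign_within_groups`, the 50% treatment fraction, the fixed seed, and the three group labels are all hypothetical.

```python
import random
from collections import defaultdict


def assign_within_groups(user_groups, treat_fraction=0.5, seed=0):
    """Randomly assign users to coupon/control separately within each group.

    user_groups maps a user id to a group label. Returns a dict mapping
    each user id to True (coupon) or False (control).
    """
    rng = random.Random(seed)
    members = defaultdict(list)
    for user_id, group in user_groups.items():
        members[group].append(user_id)

    assignment = {}
    for group in sorted(members):
        users = sorted(members[group])  # fixed order before shuffling
        rng.shuffle(users)
        n_treated = round(len(users) * treat_fraction)
        for position, user_id in enumerate(users):
            assignment[user_id] = position < n_treated
    return assignment


# Hypothetical example: 12 users spread evenly over three PPI groups.
user_groups = {f"u{i:02d}": ("low", "medium", "high")[i % 3] for i in range(12)}
assignment = assign_within_groups(user_groups)
```

Randomizing inside each group, rather than over the whole user base, guarantees that every group contains both coupon recipients and non-recipients, which is what makes a per-group comparison possible.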

5. Discussion

5.1. Effectiveness of the proposed framework

In this study, we applied the proposed framework to data from past coupon distribution measures implemented in an actual service. We predicted the PPI based on machine learning and planned an experimental design based on the predicted PPI. Assuming the application of the experimental design to past data, we conducted an effect verification using the results of the measures.

Regarding the prediction of PPI in the proposed framework, we used LightGBM to predict the annual expected purchase amounts for new and second-hand fashion items as target variables, and SHAP to interpret the resulting models. Through this prediction, we obtained additional insights into the characteristics of users with a high PPI. When conducting the effect verification of the pseudo-applied experimental design in the proposed framework, we confirmed that model-free verification captures the overall sense of the effectiveness of coupon measures, whereas model-based verification allows for a detailed analysis of the effect of each factor on the outcome variable.
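One common form of model-free verification is to compare purchase rates between coupon recipients and non-recipients inside each group. The sketch below illustrates that idea only; the function name `uplift_by_group`, the record layout, the group labels "A" and "B", and all numbers are hypothetical and are not taken from the study's data.

```python
from collections import defaultdict


def uplift_by_group(records):
    """Difference in purchase rate (coupon minus control) for each group.

    records is an iterable of (group, got_coupon, purchased) tuples.
    Groups lacking either arm are skipped because no contrast exists.
    """
    # For each group and arm, keep [number of buyers, number of users].
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for group, got_coupon, purchased in records:
        arm = counts[group][bool(got_coupon)]
        arm[0] += int(purchased)
        arm[1] += 1

    uplift = {}
    for group, arms in counts.items():
        (t_buy, t_n), (c_buy, c_n) = arms[True], arms[False]
        if t_n and c_n:
            uplift[group] = t_buy / t_n - c_buy / c_n
    return uplift


# Hypothetical records for two groups, four users per arm.
records = (
    [("A", True, 1)] * 2 + [("A", True, 0)] * 2
    + [("A", False, 1)] * 1 + [("A", False, 0)] * 3
    + [("B", True, 1)] * 1 + [("B", True, 0)] * 3
    + [("B", False, 1)] * 1 + [("B", False, 0)] * 3
)
uplift = uplift_by_group(records)
```

When assignment is randomized within each group, this simple difference estimates the group's average coupon effect without fitting any model. With real data, each estimate would also need a confidence interval or significance test, which this sketch omits.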

By observing the results obtained from the proposed framework, we gained insights into how the effectiveness of measures varies depending on the PPI, and the features that influence the outcome variable (the presence or absence of purchases using distributed coupons). Simultaneously, we confirmed the usefulness of the proposed framework.

5.2. Scalability of the proposed framework

In this study, we adopted the expected purchase amounts for new and second-hand fashion items as the two dimensions of PPI. However, the dimensions can vary according to the service objectives. For example, if the purpose of the coupon measure is to promote purchases of new items among dormant users, we can use the purchase unit price and purchase frequency for new items as dimensions and plan an experimental design based on users' multifaceted purchasing tendencies. Furthermore, through effect verification and analysis in the proposed framework, we can accurately estimate the effectiveness of the measure and gain insights into appropriate targeting from a business perspective. Thus, the framework can be extended and utilized by determining appropriate dimensions according to the objectives.
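Grouping users along two freely chosen PPI dimensions can be sketched as below. The rank-based split into three levels, the function names `tertile_labels` and `ppi_groups`, and the sample values are assumptions made for illustration; the actual cut points would depend on the service and its objectives.

```python
def tertile_labels(values):
    """Label each value low/medium/high by its rank within the sample.

    Ties are broken by input order because Python's sort is stable.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [None] * n
    for rank, index in enumerate(order):
        labels[index] = ("low", "medium", "high")[3 * rank // n]
    return labels


def ppi_groups(dim_a, dim_b):
    """Combine two per-user PPI dimensions into (level_a, level_b) groups."""
    return list(zip(tertile_labels(dim_a), tertile_labels(dim_b)))


# Hypothetical predictions for six users: unit price and purchase frequency
# for new items, as in the dormant-user example above.
unit_price = [1200, 8000, 3500, 500, 6000, 2500]
frequency = [0.5, 4.0, 1.0, 6.0, 2.0, 0.2]
groups = ppi_groups(unit_price, frequency)
```

Swapping in different predicted quantities for `dim_a` and `dim_b` changes the axes of the design without changing the rest of the pipeline.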

Additionally, while this study focused on analyzing coupon measures and PPI, the framework can be applied to other fields if variables related to the effectiveness of the measures and their correlation with user behavior or attributes can be predicted. For example, when estimating the effects of ingredients believed to reduce the risk of cancer, analyzing the predicted risk of cancer as an axis could provide insights into how certain ingredients affect individuals who are more susceptible to cancer-related risks owing to their health conditions.

5.3. Practical applicability of the proposed framework

We conducted simulations using data from previously implemented measures and demonstrated the effectiveness of the proposed framework. When a sufficient amount of data is available for past measures, the proposed framework can be applied to these measures to gain new insights. In addition, by utilizing the proposed framework in actual experimental planning and effect verification, it becomes possible to estimate the effectiveness of new coupon measures accurately and to use the insights for targeting and other purposes. The proposed framework thus offers strong practical applicability for estimating and analyzing the effectiveness of coupons and other novel measures, enabling its use in various business contexts.

6. Conclusion and Limitations

6.1. Conclusion

In this study, we proposed a framework to analyze the relationship between users’ PPI and the effectiveness of coupon distribution measures. The framework includes predicting PPI using machine learning, experimental planning of coupon distribution measures based on two predicted types of PPI, and the analysis of the results of these measures. Such a comprehensive framework for evaluating the effectiveness of measures is rare and valuable.

Furthermore, we pseudo-applied the proposed framework to historical data of coupon distribution strategies implemented in an actual service and examined the results. By predicting two types of PPI and grouping users accordingly, we were able to conduct a more detailed analysis in a single experiment. Additionally, by interpreting the machine-learning predictions of PPI, we gained business-relevant insights, such as the characteristics of users with a high PPI.

Moreover, the results obtained using the proposed framework allowed us to analyze how the effectiveness of measures varies based on different levels of PPI. For example, we observed that the effectiveness of measures is particularly significant in groups with high or medium purchase amounts for used items and low purchase amounts for new items. By interpreting the results of predicting strategy outcomes through machine learning, we analyzed which user characteristics influence the results of measures, thus confirming the usefulness of our proposal.

In the proposed framework, machine learning is used to predict PPI from accumulated historical data and to group users accordingly. Guided by the insights derived from this research, we expect that effective PPI-based targeting can be conducted in the future using past data.

6.2. Limitations

In this study, the coupon distribution measures employed on the past online service were extensive, resulting in a substantial number of recipients. However, in some cases the number of recipients may need to be limited because of cost constraints or other factors. In such situations, it may be difficult to conduct conditional randomized controlled trials by grouping users based on their PPI while ensuring a certain number of users in each group. When cost constraints are binding and prioritizing profit is essential, it is considered preferable to limit coupon distribution to groups in which an effect is likely, rather than conducting conditional randomized controlled trials for all groups.

In this study, we adopted two axes, namely the expected purchase amounts for new and used items, as a measure of PPI for each user. However, various configurations are possible depending on the service’s objectives and other factors. Moreover, if PPI is difficult to predict from existing data, proper group allocation cannot be achieved, and the resulting experimental design may be inaccurate.

Acknowledgment

The authors express their gratitude to Hokuto Sasaki and Tomoaki Ozawa for their helpful discussions and reviews of the contents of our paper.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Additional information

Funding

This study was supported by JSPS KAKENHI [grant number: 21H04600].

References

  • Aaker, D. A., & Bruzzone, D. E. (1985). Causes of irritation in advertising. Journal of Marketing, 49(2), 47–57. https://doi.org/10.2307/1251564
  • Amano, T., Shimizu, R., & Goto, M. (2023). Recommendation item selection algorithm considering the recommendation region in embedding space and new evaluation Metric. Industrial Engineering & Management Systems, 22(3), 340–348. https://doi.org/10.7232/iems.2023.22.3.340
  • Andronie, M., Lăzăroiu, G., Ștefănescu, R., Ionescu, L., & Cocoșatu, M. (2021). Neuromanagement decision-making and cognitive algorithmic processes in the technological adoption of mobile commerce apps. Oeconomia Copernicana, 12(4), 1033–1062. https://doi.org/10.24136/oc.2021.034
  • Apichottanakul, A., Goto, M., Piewthongngam, K., & Pathumnakul, S. (2021). Customer behaviour analysis based on buying-data sparsity for multi-category products in pork industry: A hybrid approach. Cogent Engineering, 8(1), 1865598. https://doi.org/10.1080/23311916.2020.1865598
  • Barat, S., & Ye, L. (2015). Effects of coupons on consumer purchase behavior: a meta-analysis. Journal of Marketing Development and Competitiveness, 6, 131–145. https://doi.org/10.1007/978-3-319-11779-9_15
  • Brown, J. (1958). Some tests of the decay theory of immediate memory. Quarterly Journal of Experimental Psychology, 10(1), 12–21. https://doi.org/10.1080/17470215808416249
  • Chadaga, K., Prabhu, S., Sampathila, N., Chadaga, R., K S, S., & Sengupta, S. (2022). Predicting cervical cancer biopsy results using demographic and epidemiological parameters: a custom stacked ensemble machine learning approach. Cogent Engineering, 9(1), 2143040. https://doi.org/10.1080/23311916.2022.2143040
  • Cheng, H. K., & Dogan, K. (2008). Customer-centric marketing with Internet coupons. Decision Support Systems, 44(3), 606–620. https://doi.org/10.1016/j.dss.2007.09.001
  • Crofton, J., & Mitchison, D. A. (1948). Streptomycin resistance in pulmonary tuberculosis. British Medical Journal, 2(4588), 1009–1015. https://doi.org/10.1136/bmj.2.4588.1009
  • Dabija, D. C., Câmpian, V., Pop, A.-R., & Băbuț, R. (2022). Generating loyalty towards fast fashion stores: a cross-generational approach based on store attributes and socio-environmental responsibility. Oeconomia Copernicana, 13(3), 891–934. https://doi.org/10.24136/oc.2022.026
  • Dias, G. P., Gomes, H., Gonçalves, J., Magueta, D., Marques, F., Martins, C., & Araújo, J. (2015). Discount coupons dematerialization: a comprehensive literature review [Paper presentation]. 7th International Conference on Information Process and Knowledge Management, 92.
  • Didero, N., Costanigro, M., & Jablonski, B. (2021). Promoting farmers market via information nudges and coupons: A randomized control trial. Agribusiness, 37(3), 531–549. https://doi.org/10.1002/agr.21688
  • Goldstein, D. G., Suri, S., McAfee, R. P., Ekstrand-Abueg, M., & Diaz, F. (2014). The economic and cognitive costs of annoying display advertisements. Journal of Marketing Research, 51(6), 742–752. https://doi.org/10.1509/jmr.13.0439
  • Goto, M., Mikawa, K., Hirasawa, S., Kobayashi, M., Suko, T., & Horii, S. (2015). A New Latent Class Model for Analysis of Purchasing and Browsing Histories on EC Sites. Industrial Engineering & Management Systems, 14(4), 335–346. https://doi.org/10.7232/iems.2015.14.4.335
  • Hasumoto, K., & Goto, M. (2022). Predicting customer churn for platform businesses: Using Latent variables of variational autoencoder as consumers’ purchasing behavior. Neural Computing and Applications, 34(21), 18525–18541. https://doi.org/10.1007/s00521-022-07418-8
  • Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., & Liu, T. Y. (2017). LightGBM: A highly efficient gradient boosting decision tree. Advances in Neural Information Processing Systems, 30, 3146–3154.
  • Kim Chi, D. T., Van Lang, T., & Nguyen, T. Q. (2023). Clinical data-driven approach to identifying COVID-19 and influenza from a gradient-boosting model. Cogent Engineering, 10(1), 2188683. https://doi.org/10.1080/23311916.2023.2188683
  • Langen, H., & Huber, M. (2023). How causal machine learning can leverage marketing strategies: Assessing and improving the performance of a coupon campaign. PloS One, 18(1), e0278937. https://doi.org/10.1371/journal.pone.0278937
  • Li, J., Luo, X., Lu, X., & Moriguchi, T. (2021). The double-edged effects of e-commerce cart retargeting: does retargeting too early backfire? Journal of Marketing, 85(4), 123–140. https://doi.org/10.1177/0022242920959043
  • Lundberg, S. M., & Lee, S. I. (2017). A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, 30, 4768–4777. https://doi.org/10.48550/arXiv.1705.07874
  • Medical Research Council. (1948). Streptomycin treatment of pulmonary tuberculosis: a Medical Research Council investigation. British Medical Journal, 2(4582), 769–782. https://doi.org/10.1136/bmj.2.4582.769
  • Morgan, S. L. (2014). Counterfactuals and Causal Inference: Methods And Principles For Social Research (Analytical Methods for Social Research). Cambridge University Press.
  • Moore, J. (2013). Time Means Everything in Programmatic Display. Marketing Land. https://marketingland.com/the-element-of-time-means-everything-in-programmatic-display-33928
  • Mueller, S. T., Seymour, T. L., Kieras, D. E., & Meyer, D. E. (2003). Theoretical implications of articulatory duration, phonological similarity, and phonological complexity in verbal working memory. Journal of Experimental Psychology. Learning, Memory, and Cognition, 29(6), 1353–1380. https://doi.org/10.1037/0278-7393.29.6.1353
  • Muhamedyev, R., Yakunin, K., Kuchin, Y. A., Symagulov, A., Buldybayev, T., Murzakhmetov, S., & Abdurazakov, A. (2020). The use of machine learning “black boxes” explanation systems to improve the quality of school education. Cogent Engineering, 7(1), 1769349. https://doi.org/10.1080/23311916.2020.1769349
  • Okayama, S., Yamashita, H., Mikawa, K., Goto, M., & Yoshikai, T. (2019). Relational Analysis Model of Weather Conditions and Sales Patterns Based on Nonnegative Tensor Factorization. International Journal of Production Research, 58(8), 2477–2489. https://doi.org/10.1080/00207543.2019.1692157
  • Pearl, J., Glymour, M., & Jewell, N. P. (2016). Causal Inference in Statistics: A Primer. Wiley.
  • Pop, R.-A., Hlédik, E., & Dabija, D.-C. (2023). Predicting consumers’ purchase intention through fast fashion mobile apps: the mediating role of attitude and the moderating role of COVID-19. Technological Forecasting and Social Change, 186(Part A), 122111. https://doi.org/10.1016/j.techfore.2022.122111
  • Prioleau, F. (2013). The Recency Bump: In Retargeting Timing Is Everything. Search Engine Land, https://searchengineland.com/the-recency-bump-in-retargeting-timing-is-everything-151099
  • Rodrigues, A. P., Chiplunkar, N. N., & Fernandes, R. (2020). Aspect-based classification of product reviews using Hadoop framework. Cogent Engineering, 7(1), 1810862. https://doi.org/10.1080/23311916.2020.1810862
  • Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1), 41–55. https://doi.org/10.1093/biomet/70.1.41
  • Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5), 688–701. https://doi.org/10.1037/h0037350
  • Saito, Y., Sakata, H., & Nakata, K. (2019). Doubly robust prediction and evaluation methods improve uplift modeling for observational data [Paper presentation]. Proceedings of the 2019 SIAM international conference on data mining (pp. 468–476). https://doi.org/10.1137/1.9781611975673.53
  • Seko, Y., Shimizu, R., Kumoi, G., Yoshikai, T., & Goto, M. (2021). A latent class analysis for item demand based on temperature difference and store characteristics. Industrial Engineering & Management Systems, 20(1), 35–47. https://doi.org/10.7232/iems.2021.20.1.35
  • Shimizu, R., Matsutani, M., & Goto, M. (2022). An explainable recommendation framework based on an improved knowledge graph attention network with massive volumes of side information. Knowledge-Based Systems, 239, 107970. https://doi.org/10.1016/j.knosys.2021.107970
  • Shimizu, R., Saito, Y., Matsutani, M., & Goto, M. (2023). Fashion intelligence system: An outfit interpretation utilizing images and rich abstract tags. Expert Systems with Applications, 213, 119167. https://doi.org/10.1016/j.eswa.2022.119167
  • Sugisaki, T., Nishio, Y., Mikawa, K., Goto, M., & Sakurai, T. (2021). Analysis of entry behavior of students on job boards in Japan based on factorization machine considering the interaction among features. Cogent Engineering, 8(1), 1988381. https://doi.org/10.1080/23311916.2021.1988381
  • Todri, V., Ghose, A., & Singh, P. V. (2020). Trade-offs in online advertising: Advertising effectiveness and annoyance dynamics across the purchase funnel. Information Systems Research, 31(1), 102–125. https://doi.org/10.1287/isre.2019.0877
  • Tsuboi, Y., Sakai, Y., Shimizu, R., & Goto, M. (2022). Multiple treatment effect estimation for e-commerce marketing using observational data [Paper presentation]. Proceedings of Asia Pacific Industrial Engineering & Management System Conference.
  • Yang, T., Kumoi, G., Yamashita, H., & Goto, M. (2022). Transfer Learning Based on Probabilistic Latent Semantic Analysis for Analyzing Purchase Behavior Considering Customers’ Membership Stages. Journal of Japan Industrial Management Association, 73, 160–175. https://doi.org/10.11221/jima.73.160(2E).
  • Yoneda, A., Matsunae, R., Yamashita, H., & Goto, M. (2022). A study of improving the serendipity of recommendation lists based on collaborative metric learning [Paper presentation]. Proceedings of the annual conference of japanese society for artificial intelligence. https://doi.org/10.11517/pjsai.JSAI2022.0_1A4GS203
  • Yoo, C. Y., & Kim, K. (2005). Processing of animation in online banner advertising: The roles of cognitive and emotional responses. Journal of Interactive Marketing, 19(4), 18–34. https://doi.org/10.1002/dir.20047
  • ZOZO, Inc. (2023). ZOZOTOWN. https://zozo.jp/