Ⅰ. Introduction
Segmentation, targeting, and positioning (STP) are the most important keys to a successful marketing strategy in the hospitality industry. STP means dividing customers with various characteristics into similar groups and implementing a marketing strategy tailored to the needs and desires of each group. As the first step of STP, a marketer should identify the various customer groups (Kotler, Bowen, & Makens, 2017). Marketers believe that they should be aware of the customer segments that they can satisfy better than their competitors can, address those segments with direct marketing activities, and then provide products or services that can capture the target segments. The step that requires the most care in this strategic approach is choosing which segments the company will focus on. Many hospitality companies use several methods for this purpose, and it is important to know which method to use (Bowen, 2000). For example, while male business people often enjoy hotels with outlets such as bars and clubs, families or housewives may prefer hotels that include large restaurants or bakeries. Therefore, knowing who can be satisfied with the products and services a company offers is a necessary first step in the hospitality business.
* Professor, College of Hospitality and Tourism, Sejong University, e-mail: [email protected]
† (Corresponding author) Professor, Sogang Business School, Sogang University, e-mail: [email protected]
International Journal of Tourism and Hospitality Research
Volume 31, Number 10, pp. 85-97, 2017
ISSN(Print): 1738-3005
Homepage: http://www.ktra.or.kr
DOI: http://dx.doi.org/10.21298/IJTHR.2017.10.31.10.85
Grouping hotel restaurant customers based on a behavioral scoring model: An exploratory study
Yukyeong Chong*⋅Gunhee Lee†
College of Hospitality & Tourism, Sejong University, 209 Neungdong-ro, Gwangjin-gu, Seoul 05006, Republic of Korea
Sogang Business School, Sogang University, 35 Baekbeom-ro, Mapo-gu, Seoul 04107, Republic of Korea
Abstract
Segmentation, targeting, and positioning are the most important keys to a successful marketing strategy in the hospitality industry. Among these three keys, segmentation is the first step for a marketer to identify various customer needs and desires. Hospitality operators have been trying to increase customer satisfaction and corporate profits by utilizing mass marketing, database marketing, and individual marketing. Despite the increased interest in scoring consumer behavior, applications of the score remain difficult. The lack of understanding and utilization of scores has been an important issue in the hospitality industry. Analysis of customer behavior is not an easy problem to solve because dynamic modeling is required due to changes to customers’ records over time. The current study explores customer data in hotel restaurants and proposes an individual behavior scoring model (BSM) based on the traditional RFM (recency, frequency, monetary) concept. A comparison with the traditional profiling scoring model (PSM) shows that BSM provides high power in predicting customers' future behavior. However, PSM plays an important complementary role in identifying potential customers who have low behavior scores. This research proposes how to build and validate BSM and PSM, with a focus on using the two models to identify future potential customers efficiently.
Key words: Segmentation, Customer scoring, Behavior scoring model (BSM), Profiling scoring model (PSM), RFM measure
Prior to 1950, direct marketers used ‘mail orders’ to accomplish mass marketing. The purpose of mass marketing in the traditional approach was to reach a larger number of customers and a wider customer base. Traditional mass marketing processes have since been challenged by newer one-to-one marketing approaches (Rygielski, Wang, & Yen, 2002). Although the purpose of direct marketing or mass marketing has not changed, current practice has shifted toward database or relationship marketing, which emphasizes the individual customer and focuses on customers’ needs and wants (Petrison, Blattberg, & Wang, 1993). In other words, the recent marketing approach seeks to improve the satisfaction of individual customers by building a deep relationship that fills each customer's needs, rather than by reaching a wide customer base. A deep relationship with customers can be achieved through a more customized approach that utilizes individual customer data.
Database marketing is sometimes called integrated marketing, relationship marketing, or even maxi-marketing. Regardless of the name, all of these techniques seek to build information on customers' behavior (Nash, 2000). To establish a sound relationship with customers, scoring techniques based on historical transaction data are important for differentiating customers and developing relationship marketing strategies. The most common scoring method is to sort customers from those who are profitable to those who are not. Typical customer data available in this case are recency, frequency, and monetary data (Miglautsch, 2000). A more advanced form of customer data is the customer's transaction data.
The current research is an exploratory study that investigates customers' transaction data in hotel restaurants and examines whether the data can be applied to predicting customers' future behavior. It aims to provide a view of scoring modeling in the context of the hospitality industry. In particular, predicted expenditure estimates are used to assign a score to each individual. Several scoring techniques are proposed, and customers are segmented based on the scores. This paper is divided into three sections. First, traditional relationship marketing concepts and several scoring techniques are reviewed. In the next section, an empirical study of the behavior scoring model (BSM) and the profiling scoring model (PSM) is conducted, investigating prediction power and customer segmentation. Finally, conclusions that can provide a new approach to quantifying customer behavior are addressed.
Ⅱ. Literature review
It is important to understand data-driven relationship marketing. In general, four aspects of relationship marketing should be considered: a statistical model produced by quantitative analysis, customer information collected at the individual level, a designed linkage between analytic results and marketing activities that increases the effectiveness of customer contact, and the time and effort needed to build relationships (Roberts, 1992). Well-designed sets of customers' historical records that can track historical patterns of buying products or services are required in order to use scientific statistical methods that help relationship marketers maintain strong relationships. Identifying profitable customers in order to expand relationships with them is vital. Building strong relationships with loyal customers is also a key reason for marketing activities (Berson, Smith, & Thearling, 1999).
Recent studies have dealt with customer behavior scores. One study identified profitable customers based on RFM (recency, frequency, and monetary) behavior using the SOM (self-organizing map) technique in the u-Commerce industry (Cho, Moon, & Ryu, 2014). In that study, the customer behavior score was shown to support an effective recommendation service. Another interesting study comes from the banking industry. It proposed a customer behavior score based on mobile banking transaction history and divided customers into six groups in order to attract and retain customers while keeping customer satisfaction high (Noori, 2015).
In the hospitality industry, data-driven marketing has lately been emphasized as an important marketing issue. The integrated data sets and analytical techniques used by hospitality marketers have been stressed as the way to discover how to customize service for a customer (Dev, Buschman, & Bowen, 2010). Nearly three decades ago, building a customer database for micro-marketing in a hotel was already practiced and was reported to yield far superior financial performance (Francese & Renaghan, 1990). However, as Dev et al. (2010) described, marketing communication has changed as mobile marketing technologies have appeared. Individualized personal attention, incentives, and recognition have been especially important factors in cultivating brand loyalty in the hotel business (Francese & Renaghan, 1990). Building customer loyalty is an essential factor in creating relationships (Bowen & Shoemaker, 1998; Dube & Renaghan, 1999), and customer information weighs heavily in building such relationships between the hospitality business and the customer.
Researchers in the early 1990s were aware of how important frequent guest programs are in the hotel and airline industries (McCleary & Weaver, 1991; Toh, Rivers, & Withiam, 1991; Tou & Hu, 1988). Frequent guest profiles, that is, the demographic and psychographic characteristics of frequent customers, were used as pivotal sources for developing marketing strategies and targeting promotions in the restaurant business (Wilbourn, McCleary, & Phadeesuparit, 1997). Bowen (1990) indicated that using information available through existing databases prepares managers to be competitive in a radically changing industry, especially when such valuable customer information is provided through effective loyalty programs. People at the management level, including hoteliers and restaurateurs, should use databases for strategic purposes and not just for tactical ones (Bowen, 2000).
Database-driven marketing approaches to customer relationship management (CRM) (Berson et al., 1999) can be used interchangeably with relationship marketing or one-to-one marketing (Peppers et al., 1999). A database is a prerequisite of CRM, which requires a managerial philosophy that allows a company to become familiar with its customers. CRM also needs data-driven activities in order for a CRM system to run effectively (Piccoli, O'Connor, Capaccioli, & Alvarez, 2003). The major attraction of data mining is its capability to build predictive rather than retrospective models (Shmueli, Bruce, & Patel, 2016). In other words, data mining uses well-established statistical and machine learning techniques to build models that predict customer behavior and help marketers target campaigns more accurately. It also aligns campaigns more closely with the needs, wants, and attitudes of customers and prospects. Data mining therefore aims to create decision-making models that predict future behavior based on analyses of past activity (Berson et al., 1999; Magnini, Honeycutt, & Hodge, 2003). Although the technical terms are phrased differently, the ultimate goal of direct marketing, database marketing, CRM, and data mining is to differentiate customers, that is, to identify who are and will be the profitable, valuable customers with whom a company should try to build a strong relationship (Berson et al., 1999).
To make better decisions and identify more profitable customers, direct marketers have paid attention to both relationship strength and relationship quality (Schijns & Schröder, 1996). Relationship strength has frequently been measured by behavioral or descriptive indicators (e.g., RFM) that can easily be captured in a database. Such behavioral differences, drawn from transaction information, have been used as segmentation variables among different customer groups (Fader, Hardie, & Lee, 2005; Sarvari, Ustundag, & Takci, 2016). It is important to distinguish the best customers from the worst in order to provide customized service to the best customers. Predictive scoring models are one way to determine who will receive an upcoming personal marketing contact, such as a telephone call or e-mail, by predicting the likelihood of a response or the expected sales from a prospective customer (Schijns & Schröder, 1996).
By using data from a single piece of previous contact information (recency, frequency, and monetary value), scoring models can predict future revenue, and these predictions are the scores (Malthouse & Derenthal, 2008). RFM codes and customer lifetime value (CLV) have been studied as ways to quantify customer behaviors (Miglautsch, 2000; Borle, Singh, & Jain, 2008). According to Hughes’ (1996) calculation, customers are broken down by frequency (e.g., the number of visits to a store), and frequency is categorized into five quintile groups. Customers who visit the store many times are much more likely to visit again than those who seldom visit. Customers are also grouped by their monetary value: similarly, a quintile categorization by how much a customer spent can be used. After customers are broken down by recency, frequency, and monetary value, each customer is assigned a three-digit RFM code. Since each digit of the RFM code is a quintile number from 1 to 5, the total number of possible three-digit codes is 5 × 5 × 5 = 125. That is, the RFM code is a single cell among 125 possible cells such as 555, 554, …, 445, 444, …, 355, 354, …, 113, 112, and 111. The RFM code is not an ordinal scale with the property of order but a nominal scale assigned for the sole purpose of differentiating one object from another. By nature, therefore, the RFM code itself cannot be treated as a score with an order of high and low (Qiasi et al., 2012).
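The quintile coding described above can be illustrated with a minimal sketch. The function names, the rank-based way of forming equal-sized fifths, and the ten sample customers are assumptions made for illustration only; they are not taken from Hughes (1996) or from the study data.

```python
def quintile_scores(values, higher_is_better=True):
    """Return a 1-5 quintile score per customer (5 = best fifth)."""
    n = len(values)
    # Order customer indices from worst to best on this measure.
    order = sorted(range(n), key=lambda i: values[i],
                   reverse=not higher_is_better)
    scores = [0] * n
    for rank, idx in enumerate(order):
        scores[idx] = rank * 5 // n + 1  # equal-sized fifths by rank
    return scores


def rfm_codes(recency_days, frequency, monetary):
    """Combine three quintile scores into a three-digit code such as '555'."""
    r = quintile_scores(recency_days, higher_is_better=False)  # recent is better
    f = quintile_scores(frequency)
    m = quintile_scores(monetary)
    return [f"{a}{b}{c}" for a, b, c in zip(r, f, m)]


# Hypothetical customers (illustrative values only).
recency_days = [5, 40, 12, 90, 3, 60, 25, 200, 8, 150]
frequency = [12, 3, 8, 1, 15, 2, 6, 0, 10, 4]
monetary = [900, 150, 400, 50, 1200, 80, 300, 20, 700, 200]
codes = rfm_codes(recency_days, frequency, monetary)

# Every possible cell: five levels on each of three digits.
ALL_CELLS = [f"{r}{f}{m}" for r in range(5, 0, -1)
             for f in range(5, 0, -1) for m in range(5, 0, -1)]
print(codes)
print(len(ALL_CELLS))
```

The resulting strings are labels for cells: two customers with the same code cannot be ranked against each other, which is the limitation the text notes.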
Miglautsch (2000) discussed two common RFM scoring methods: customer quintile scoring and behavior quintile scoring. In customer quintile scoring, customers are sorted in descending order and broken into five equal groups on each RFM measure, generating 125 equal-sized segments. Behavior quintile scoring, on the other hand, forms the monetary quintiles so that each generates an equal amount of sales, instead of placing an equal number of individuals in each group as customer quintile scoring does. However, these scoring methods still only define individual cells such as 435 or 233 and fail to score the individual customers within each cell. Miglautsch also discussed different weighting methods for converting RFM values to a single score: adding up the three actual numbers, adding the three RFM codes, and multiplying each RFM value by a certain number.
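Two of the conversions mentioned above, adding the three code digits and multiplying each digit by a weight, can be sketched as follows. The weights (3, 2, 1) are purely hypothetical and are not values proposed by Miglautsch (2000).

```python
def rfm_digit_sum(code):
    """Add the three quintile digits of an RFM code, e.g. '435' -> 12."""
    return sum(int(d) for d in code)


def rfm_weighted(code, weights=(1, 1, 1)):
    """Multiply each RFM digit by its weight and add the results."""
    return sum(w * int(d) for w, d in zip(weights, code))


sum_435 = rfm_digit_sum("435")                 # 4 + 3 + 5
sum_543 = rfm_digit_sum("543")                 # 5 + 4 + 3
weighted_435 = rfm_weighted("435", (3, 2, 1))  # 3*4 + 2*3 + 1*5
weighted_543 = rfm_weighted("543", (3, 2, 1))  # 3*5 + 2*4 + 1*3
print(sum_435, sum_543, weighted_435, weighted_543)
```

The plain digit sum gives cells 435 and 543 the same score, while a weighted sum separates them; the result depends on the chosen weights, which is one source of the subjectivity in such scoring methods.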
In Rhee and McIntyre's (2008) study, a marketing firm's contact efforts were considered the essential variable in scoring modeling. Such an approach is reasonable in some industries; in hospitality operations, however, the contact efforts of a promotion campaign would largely be determined by which customers are valuable. Leaving aside whether the scoring methods are right or wrong, there will be considerable subjective variation among scoring methods.
Another type of information, in addition to behavioral transaction data, is customers' demographic data, which can be used to understand the current market situation. Sheth (1977) criticized using demographic factors as determinants or correlates of consumption behavior because of their lack of relevance and poor predictive ability. However, many researchers have used demographic profiles in various academic fields. Examples include the correlation of demographic variables with consumer alienation in the marketplace (Lambert, 1981), the effect of demographic variables in modeling segment membership using panel data (Gupta & Chintagunta, 1994), the influence of demographic characteristics on consumers' decisions about usage frequency in the banking industry using adoption theory (Branca, 2008), the role and effect of demographic and socioeconomic variables in travel choice (Kattiyapornpong & Miller, 2008), how demographic profiles affect consumers' online shopping behavior (Hashim, Ghani, & Said, 2009), and many others. Although marketers need much more information than demographic variables to comprehend customers' behavior in the marketplace, demographic profiles serve a basic yet important role in interpreting the characteristics of clusters, groups, or segments of customers (Yeh, Plante, & Agrawal, 2011).
In studies that use customer data, demographic profiles are essential, yet for the most part the information is used only for descriptive analysis. Demographic profiles could be used for far more than describing group characteristics; they could serve the same purpose as RFM in scoring individual customers. This paper introduces how to build a one-dimensional scoring model (BSM) that reflects customers' longitudinal behavior data and a scoring model (PSM) based on demographic profiles. These scoring models are intended to differentiate segments of customers and predict customers’ future contribution to a restaurant. In the following empirical section, after the data and the cleaning process are described, the monetary value analysis is presented based on expenditures and number of visits. The next part compares BSM and PSM in terms of model fitting and prediction power. Finally, the customers are distinguished by predicted expenditure estimates of their future contribution.
Ⅲ. Methodology
1. Data description
This research uses customer data from a hotel in Seoul, Korea. The hotel is a globally franchised five-star hotel that, at the time the data set was obtained, operated 10 different restaurant outlets: Italian Restaurant, Lounge, Bakery and Beverage, Banquet, Club, French Restaurant, Bar, Chinese Restaurant, Japanese Restaurant, and Buffet. The hotel records the types of restaurants that a customer visits, the gender and occupation of the customer, the time (month) of each visit, and how much money the customer spends on each visit. Most of the customers who have a membership reside domestically, so tourists are not included.
Of the hotel restaurant customers who hold memberships, information on 959 customers (11,466 transactions) was collected to identify customers' behavior. The data include longitudinal information that may reveal a customer's behavior pattern, in contrast to cross-sectional data that describe only a single transaction. To facilitate analysis, the individual expenditure data were transformed into an average expenditure per month. The frequency of visits to each restaurant and the purchasing expenditures were obtained. The frequency of restaurant visits is a discrete variable, and the monetary value is a continuous variable. Gender and occupation are the demographic variables available for this study.
2. Data cleaning process
Data cleaning is the next step after gathering data and
refers to a process of removal of noise, errors, and incorrect
input from a database (Adriaans & Zantinge, 1996). These
are inevitable problems that analysts encounter as they
begin to use a new data set. To some degree, any database
system may have inconsistent, incomplete, or erroneous
data. As much as 80 percent of the time associated with the
data mining process will be spent dealing with these
problems (Westphal & Blaxton, 1998). In this study, some
fields, such as birth date, contain very little customer data,
while other fields, such as joining date, have no data
recorded at all, although a field exists for them. After
removal of non-usable fields during the discovery stage,
gender and occupation were selected as usable demographic
variables. Frequency of restaurant visits and the expenditures
of 959 customers were also collected.
For the scoring modeling analysis, customers who did not indicate their occupation were removed. Because this unidentified group might belong to any of the other occupation groups, it was not considered for the study. Data for 30 customers were deleted because of a missing occupation value (23 cases) or unusual transactions (7 cases), leaving 929 customers for further analysis. Next, 340 customers who visited fewer than four times during the 12-month period were identified as a dormant group and excluded from the subsequent analysis. In addition, a month with no frequency, meaning that a customer did not visit any restaurant in this hotel in that month, was coded as ‘zero’ rather than treated as a missing value. As a result, 589 active customers who had four or more visits during the 12-month observation period were used for further analysis. Table 1 presents the proportions of removed, active, and dormant customers in this study.
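The cleaning rules above can be written as a short sketch. The record layout (an `occupation` field, an `unusual` flag, and a list of `(month, amount)` transactions) and the three sample records are assumptions made for illustration; the study itself was carried out in SAS. The final lines only re-check the arithmetic of the counts reported in the text.

```python
MONTHS = range(1, 13)


def monthly_visits(transactions):
    """Count visits per month; a month without a visit is 0, not missing."""
    counts = {m: 0 for m in MONTHS}
    for month, _amount in transactions:
        counts[month] += 1
    return counts


def classify(customer):
    """Return 'removed', 'dormant', or 'active' following the rules in the text."""
    if customer["occupation"] is None or customer.get("unusual", False):
        return "removed"
    total_visits = sum(monthly_visits(customer["transactions"]).values())
    return "active" if total_visits >= 4 else "dormant"


# Hypothetical records (not the study data).
customers = [
    {"occupation": "Lawyer",
     "transactions": [(1, 80.0), (2, 95.0), (5, 60.0), (9, 120.0)]},
    {"occupation": "Doctor", "transactions": [(3, 40.0), (7, 55.0)]},
    {"occupation": None, "transactions": [(1, 30.0)] * 6},
]
groups = [classify(c) for c in customers]

# Arithmetic check of the counts reported in the text.
total, removed, dormant = 959, 30, 340
active = total - removed - dormant
shares = {name: round(100 * n / total, 1)
          for name, n in [("active", active), ("dormant", dormant),
                          ("removed", removed)]}
print(groups, active, shares)
```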
3. Analysis
The monetary value is defined as how much a customer spends during a specified time interval. Unlike frequency, which represents the number of visits, monetary value can be treated as a continuous random variable. Two types of analytic models are used in this study: the behavioral scoring model (BSM) and the profiling scoring model (PSM). Both BSM and PSM provide an individual customer score that is equivalent to the customer's expected expenditure for the next month (December in this case). In the BSM, 589 individual scoring models are estimated, while one aggregated scoring model is used in the PSM, based on the past 12-month transaction data. All statistical analyses were performed using SAS (Statistical Analysis System).

Data group (N=959)   Active customers   Dormant customers   Removed customers
Frequency (%)        589 (61.4%)        340 (35.5%)         30 (3.1%)
Table 1. Distribution of active, dormant, and removed customers

In the BSM for expenditure analysis, 589 regression models are fitted to the transaction data for the 11-month period (January through November). The regression model for each customer is $y_t = \alpha + \beta_1(\text{time}) + \varepsilon_t$, where $y_t$ indicates expenditures per restaurant visit per month. In this model, $\alpha$ and $\beta_1$ represent an intercept and a slope for changes of $y_t$ over time (11 months), respectively. The term $\varepsilon_t$ represents individual variability treated as random error, assumed to have mean zero and constant variance $\sigma^2$.
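As a minimal sketch of the per-customer regression described above, the following fits an ordinary least squares line of expenditure on time (months 1 to 11) for each customer and extrapolates it to month 12 to obtain a score. The function names and the two sample customers are hypothetical, and coding time as 1, 2, ..., 11 is an assumption; the original analysis was done in SAS.

```python
def fit_line(y):
    """OLS of y on time = 1..len(y); returns (intercept, slope)."""
    n = len(y)
    t = list(range(1, n + 1))
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope


def bsm_score(y, next_time=12):
    """Predicted expenditure for the next month from one customer's own line."""
    intercept, slope = fit_line(y)
    return intercept + slope * next_time


# Hypothetical customers: 11 monthly values, January through November.
history = {
    "customer_A": [50 + 2 * t for t in range(1, 12)],  # rising by 2 per month
    "customer_B": [80.0] * 11,                         # flat spending
}
fits = {cid: fit_line(y) for cid, y in history.items()}
scores = {cid: bsm_score(y) for cid, y in history.items()}
print(fits)
print(scores)
```

One model per customer means that with 589 active customers there are 589 intercepts and 589 slopes, which is why the text summarizes them by their means and standard deviations.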
Means and standard deviations of slopes and intercepts
for each gender are shown in Table 2. The negative mean
value of slope for females implies that the expenditures per
restaurant visit of females decrease during the period from
January through November, while the positive mean value
of slope for males indicates increased expenditures over
time. High standard deviations of slopes and intercepts
indicate that large variability exists among individuals.
Also, the average value of the intercept for males is higher
than that for females. These means suggest that male customers
spend more money than female customers at restaurants in
this hotel, with expenditures increasing over time.
However, because of the large variability among
individuals, the differences in slope (p=0.1765) and
intercept (p=0.1219) between males and females are not
statistically significant at the 5% significance level.
Figure 1 presents the average values of the slope and intercept estimates for food purchases by occupation. The slopes of four occupations (housewives, doctors, business owners, and professors) are below zero, implying that expenditure per restaurant visit decreased from January through November. The slopes of the four other occupations (government officers, lawyers, presidents/chairmen, and businessmen) are above zero.
Table 3 summarizes the comparison of the mean values of the slope and intercept estimates. Although the slopes and intercepts for each occupation appear to differ, because of the large between-subject variability there is no statistically significant difference among occupations (p-value=0.8951 for intercept; p-value=0.8275 for slope) at the 5% significance level.

Note: Each position represents averages of estimates by each occupation
Figure 1. Means of slope and intercept of expenditure by occupation

Parameter estimate (N=589)   Male (N=445)     Female (N=144)
Intercept (Mean±SD)          62.31±84.49      52.50±73.19
Slope (Mean±SD)              0.35±10.33       -0.86±7.41
Table 2. Mean differences of intercept and slope of expenditures between male and female

In the PSM, the data of 589 customers over eleven months are used to build a predictive model. At first, an analysis of variance (ANOVA) model with two factors, gender and occupation, and one time covariate was used for the analysis of expenditures, including two interaction effects: (time×gender) and (gender×occupation). Since neither interaction effect is significant at the 5% significance level, only the main effects are retained. The final PSM is therefore $y_t = \alpha + \beta_1(\text{time}) + \beta_2(\text{gender}) + \beta_j(\text{occupation})_i + \varepsilon_t$, where $y_t$ indicates expenditures per month, $\alpha$ represents an intercept, and $\beta_1$ represents a slope for changes of $y_t$ over time (11 months). The term $(\text{occupation})_i$ represents seven dummy variables. The term $\varepsilon_t$ represents individual variability treated as random error, assumed to have mean zero and constant variance $\sigma^2$. According to the summary of the final PSM presented in Table 4, the main effects of gender and occupation are statistically significant at the 5% significance level. The results confirm that gender and occupation play a major role in building the PSM. The PSM is compared with the BSM in terms of model fitting and prediction power.
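The pooled main-effects model can be sketched as one least squares fit over all customers, with one gender dummy and seven occupation dummies for the eight occupations. Everything below is an illustrative assumption: the choice of the first occupation as reference category, the normal-equation solver, and the synthetic noise-free data, whose coefficients are invented so that the fit can be checked exactly.

```python
OCCUPATIONS = ["Businessmen", "Housewives", "Doctors", "Business owners",
               "Government officers", "Presidents/Chairmen", "Lawyers",
               "Professors"]


def design_row(time, gender, occupation):
    """[1, time, gender dummy, seven occupation dummies]; first occupation is the reference."""
    dummies = [1.0 if occupation == o else 0.0 for o in OCCUPATIONS[1:]]
    return [1.0, float(time), 1.0 if gender == "M" else 0.0] + dummies


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(rows, y):
    """Least squares via the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


# Synthetic, noise-free data with invented effects (not the study estimates).
true_occ = dict(zip(OCCUPATIONS, [0, -8, 5, -3, -30, 6, -2, 4]))
rows, y = [], []
for g in ("M", "F"):
    for occ in OCCUPATIONS:
        for t in range(1, 12):
            rows.append(design_row(t, g, occ))
            y.append(50 + 0.5 * t + (10 if g == "M" else 0) + true_occ[occ])

coef = ols(rows, y)
# Score for a hypothetical female doctor in month 12.
psm_score = sum(c * x for c, x in zip(coef, design_row(12, "F", "Doctors")))
print([round(c, 6) for c in coef], round(psm_score, 6))
```

Because the model is shared by all customers, two customers with the same gender and occupation receive the same score, in contrast to the individual models of the BSM.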
Ⅳ. Results
1. Model assessment and validation
The performance of the individual behavior models and the aggregate profiling model is evaluated in two ways: model fitting and assessment of prediction power on a test data set (December data). Validation is a way to evaluate how well the model predicts a data set. The validation process is important because the results of data mining are often used for strategic issues throughout an organization. In data mining, there is a danger of over-fitting the model; that is, a model can be highly predictive for a training set but less accurate with data not used in building it (Groth, 1998). Therefore, the model validation process required for data mining is to build the model on some historical data and then apply it to similar historical data that were not used to build it (Berson et al., 1999).

Occupation (N=589)              Intercept (Mean ± SD)   Slope (Mean ± SD)
Businessmen (n=128)             59.43 ± 81.95           0.18 ± 0.58
Housewives (n=108)              49.88 ± 72.48           -0.65 ± 7.32
Doctors (n=20)                  63.99 ± 53.54           -1.08 ± 6.57
Business owners (n=6)           56.14 ± 86.05           -3.63 ± 7.99
Government officers (n=2)       21.33 ± 20.17           4.02 ± 6.91
Presidents / Chairmen (n=296)   63.87 ± 88.37           0.42 ± 10.39
Lawyers (n=17)                  55.46 ± 62.34           1.16 ± 9.43
Professors (n=12)               63.61 ± 69.08           -2.55 ± 7.18
Table 3. Means and standard deviation of slope and intercept of expenditure by occupations

Source       df   Sum of squares   Mean squares   F value   p value
Month        1    200.70           200.70         0.03      0.8552
Gender       1    58859.15         58859.15       9.77      0.0018**
Occupation   7    139520.39        19931.48       3.31      0.0016**
Note: **p<0.05
Table 4. Analysis of variance of PSM

In the training and test method, the entire data set is divided into two data sets: a training set and a test set (or holdout sample). After the model is fitted using the training data set, the test set is used to evaluate the model. With the training and test method, the results of model assessment are known to be sensitive to how a small data set is split. To overcome this problem, Malthouse & Derenthal (2008) recommend stratified sampling to reduce the variation across splits. In cross validation, one case is excluded from the original sample, and the model is trained on the remaining sample. The trained model then predicts the excluded case. This procedure is repeated for each case, and the accuracy for each case is summed over the entire sample. The cross validation method may provide nearly unbiased estimators of the prediction accuracy (Sung, Chang, & Lee, 1999).
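The leave-one-out procedure described above can be sketched as follows. A simple straight-line model stands in for whichever model is being validated, the error is averaged rather than summed, and the data are invented; all three are assumptions made for illustration.

```python
def fit_line(x, y):
    """OLS of y on x; returns (intercept, slope)."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    return y_mean - slope * x_mean, slope


def leave_one_out_mae(x, y):
    """Exclude each case in turn, train on the rest, and predict the excluded case."""
    errors = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        intercept, slope = fit_line(x_train, y_train)
        errors.append(abs(y[i] - (intercept + slope * x[i])))
    return sum(errors) / len(errors)


# Hypothetical data: months 1..11 with an exact trend and a perturbed trend.
x = list(range(1, 12))
y_linear = [20 + 3 * t for t in x]
y_noisy = [20 + 3 * t + (5 if t % 2 else -5) for t in x]

loo_linear = leave_one_out_mae(x, y_linear)
loo_noisy = leave_one_out_mae(x, y_noisy)
print(loo_linear, loo_noisy)
```

Each case is predicted by a model that never saw it, so the averaged error reflects out-of-sample accuracy rather than goodness of fit on the training data.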
Figure 2 illustrates how model fitting and prediction power are evaluated in this study. The data from January through November are used to build the predictive models for both BSM and PSM. Using each predictive model, the performance in December is predicted and compared with the actual value. In this case, the 11 months of data act as the training set and the December data serve as the test set.
Model fitting is assessed using deviance and Pearson’s chi-square for the frequency of restaurant visits. The deviance and Pearson’s chi-square provide goodness-of-fit measures indicating the discrepancy between the actual frequency and the predicted frequency generated from the predictive model using the training set. For the monetary models, root MSE (mean square error), $R^2$, and adjusted $R^2$ are used as goodness-of-fit measures. The prediction power of the models for frequency of restaurant visits and monetary value is evaluated using MAE (mean absolute error) and MSE.
2. Assessment of prediction power
Three statistics, RMSE (root mean square error), $R^2$, and adjusted $R^2$, are employed to compare the model fitting of the BSM and the PSM. Table 5 shows that the BSM generates a smaller RMSE and larger $R^2$ and adjusted $R^2$ than the PSM does. Therefore, the BSM outperforms the PSM in model fitting. The prediction power of the two models is investigated in the next phase to detect potential over-fitting as well as to validate the individual models.

Model   RMSE            R-square      Adjusted R-square
PSM     77.64           0.012         0.011
BSM     38.15 ± 35.65   0.28 ± 0.25   0.19 ± 0.28
Table 5. Assessment of model fitting between PSM and BSM

Figure 2. An example of model fitting and prediction power

The two estimated models based on eleven months of data were used to predict the expenditures per restaurant visit in December in order to assess prediction power. Each predicted value is compared with the actual expenditure per restaurant visit in December. The results are displayed in Tables 6, 7, and 8. Table 6 shows that the BSM provides patterns of expenditures similar to the actual expenditures in December. The correlation coefficient between the true value and the predicted value is higher for the BSM (0.5904) than for the PSM (0.1516). Table 7 provides the mean values and standard deviations of MAE and MSE for the PSM and BSM. The BSM outperforms the PSM in prediction, with lower mean values of MAE and MSE. Table 8 summarizes the distributions of MAE and MSE in both models. The five-number summary statistics indicate that the BSM is superior to the PSM.
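The prediction-power measures used in this comparison (MAE, MSE, and the correlation between predicted and true values) can be computed as in the following sketch. The December values and the two sets of predictions are invented for illustration and do not reproduce the figures in Tables 6 to 8.

```python
from math import sqrt


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    a_mean, b_mean = sum(a) / n, sum(b) / n
    cov = sum((x - a_mean) * (y - b_mean) for x, y in zip(a, b))
    var_a = sum((x - a_mean) ** 2 for x in a)
    var_b = sum((y - b_mean) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# Hypothetical December expenditures and two sets of predictions.
actual = [100, 0, 50, 200, 75]
individual_pred = [90, 10, 60, 180, 70]  # varies by customer
pooled_pred = [85, 85, 85, 85, 85]       # same value for everyone

mae_individual = mae(actual, individual_pred)
mse_individual = mse(actual, individual_pred)
mae_pooled = mae(actual, pooled_pred)
r_check = pearson_r([1, 2, 3], [2, 4, 6])  # perfectly linear pair
print(mae_individual, mse_individual, mae_pooled, r_check)
```

MSE penalizes large misses more heavily than MAE, so a few customers with extreme December spending can dominate it; this is one reason the five-number summaries are reported alongside the means.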
3. Customer segmentation by predictive expenditures
Market segmentation describes the division of a market
into homogeneous groups, which will respond differently to
promotions, communications, advertising, and other
marketing mix variables. Direct marketers want to get away
from mass-marketing campaigns and use a more
consumer-oriented approach. This is done based on the behaviors
exhibited by the customers, such as using similar services
and products (Westphal & Blaxton, 1998). Segmenting
techniques look for similarities and differences within a data
set and group similar rows together into segments or clusters.
Similarity is expected to be high within a segment and low
between segments. There have been two traditional
approaches to specifying market segments. The first is to
classify customers by objective variables such as sex, age,
life cycle stage, and personality. The second is based on
situation-specific events, such as purchases and usage of
specific products, brand-loyal versus non-brand-loyal users,
attitude toward the brand, etc. (Frank, Massy, & Wind, 1972).
The optimal number of segments is a subject of continuous
research, although many approaches to segmentation allow
the user to decide the number of segments (Groth, 1998).
The customers are first scored by the predictive estimate of
their expected expenditure for the next month. Segmentation is
then performed on the scores with three groups: high
value (top 25% of scored customers), middle value (between
the top 25% and 50%), and low value (below 50%). The
distribution of the 589 scores is summarized in Table 9. The
average score is 153.04, implying that the expected
expenditure for the next month (December in this case) of the
589 active customers averages $153.04. Table 9 also shows that the
distribution of scores is highly skewed to the right, with a
few extremely high scores. It is interesting to note that about
16% of customers can be treated as high value customers,
spending at least $150 in the next month.
In fact, dormant and low value customers, about 68% of all
customers, can be regarded as groups that hardly
contribute to sales, spending less than $35 in the next
month. As shown in Figure 3, the scores are validated
through the relationship of the segments with the average
expenditure per visit and the number of restaurant visits per month.
High value customers clearly have both a high average
expenditure per visit and a high number of visits per month.

Variable                 N     Mean ± SD       Correlation coefficient(1)
Predicted value by PSM   589   65.55 ± 8.69    0.1516 (p-value = 0.0002)
Predicted value by BSM   589   62.90 ± 74.50   0.5904 (p-value < 0.0001)
True value               589   68.91 ± 86.94   -
Note: (1) indicates the correlation between predicted value and true value.
Table 6. Mean and standard deviation of predicted and true value

Model   N     MAE ± SD        MSE ± SD
PSM     589   60.26 ± 61.95   7462.39 ± 23933.84
BSM     589   44.50 ± 59.27   5487.17 ± 16670.38
Table 7. Evaluation of prediction power with December

Five number summary
Measure   Model   Maximum     75% (Q3)   50% (Q2)   25% (Q1)   Minimum
MAE       PSM     591.64      67.57      46.48      28.03      0.65
MAE       BSM     468.84      59.45      24.65      6.56       0
MSE       PSM     350042.89   4545.63    2160.67    785.58     0.42
MSE       BSM     21980.95    3534.84    607.47     43.05      0
Table 8. Five number summary of prediction power with December
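The score-based grouping described in this section can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function name, the customer identifiers, and the scores are invented, customers exactly on a cut point are placed in the higher group, and the quartiles are computed with the standard library's "inclusive" method. Dormant customers have no score and are therefore outside the input.

```python
from statistics import quantiles


def segment_by_score(scores):
    """Return {customer_id: segment} from {customer_id: predicted expenditure}."""
    # Quartile cut points of the score distribution (Q1 is not needed here).
    _, q2, q3 = quantiles(scores.values(), n=4, method="inclusive")
    segments = {}
    for customer_id, score in scores.items():
        if score >= q3:        # top 25% of scores
            segments[customer_id] = "high"
        elif score >= q2:      # between the top 25% and the top 50%
            segments[customer_id] = "middle"
        else:                  # bottom 50%
            segments[customer_id] = "low"
    return segments


# Invented scores (expected expenditure for the next month, in US dollars).
scores = {
    "c1": 0.0, "c2": 3.0, "c3": 10.0, "c4": 30.0,
    "c5": 40.0, "c6": 100.0, "c7": 200.0, "c8": 4000.0,
}

segments = segment_by_score(scores)
for customer_id in sorted(segments):
    print(customer_id, scores[customer_id], segments[customer_id])
```

Because the cut points are quantiles of the observed scores, a single extreme score such as the invented 4000.0 does not shift the group boundaries, which suits a right-skewed score distribution.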
Ⅴ. Conclusions
The purpose of this research was to make efficient
use of customers' historical transaction data through a scoring
model in the context of hotel restaurants.
Purchasing history is the source for the BSM, while
demographic information such as gender and occupation
provides the important factors of the PSM. Unlike traditional
behavior scores such as the RFM measure, we proposed a
behavior score defined as the predicted expenditure for the
next month. The score includes all historical information,
with emphasis on recent transactions. It is easy to
understand because the score itself is an expenditure. In
particular, the proposed behavior score is a powerful index for
predicting existing customers' future behavior.
Figure 3. Relationship of segmentations with expenditures per visit and the number of visits

Mean ± SD         Maximum   75% (Q3)   50% (Q2)   25% (Q1)   Minimum
153.04 ± 355.30   4466.24   149.48     35.07      3.16       0
Table 9. Distribution of customer score by purchasing pattern (Unit: US dollar)

Segment                  Cases   Share
Dormant customers        340     36.60%
High value customers     148     15.93%
Middle value customers   154     16.58%
Low value customers      287     30.89%
(Total N=929)

In the BSM, a customer's transactions over the past 11 months
can be summarized by the intercept and slope of a regression
model. A high intercept with a negative slope
indicates that the customer is leaving during the given time
period. Therefore, churn analysis is required for further
understanding. If customers have a medium or high intercept
with a positive slope, cross-selling or up-selling promotion
campaigns might be appropriate to increase their
expenditure. Figure 1 and Table 3 illustrate the averages of
slopes and intercepts for each occupation. High standard
deviations of both intercept and slope are observed, owing to
the large variability of individual customers within the same
occupation. Such variability in customers' behavior leads to
the poor prediction performance of the PSM. It is therefore natural
that behavior scores from the BSM have high prediction power.
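The intercept-and-slope summary of a customer's history can be sketched in Python as follows. This is a simplified illustration and not the BSM as estimated in this study: it fits an unweighted ordinary least squares line to monthly expenditure, floors the extrapolated value at zero, and uses a hypothetical cutoff (intercept_floor) to decide what counts as a medium or high intercept. The function names and the two 11-month histories are invented.

```python
def fit_line(monthly_spend):
    """Ordinary least squares of spend on month index 1..n; returns (intercept, slope)."""
    n = len(monthly_spend)
    if n < 2:
        raise ValueError("need at least two months of data")
    months = range(1, n + 1)
    t_mean = (n + 1) / 2
    y_mean = sum(monthly_spend) / n
    sxx = sum((t - t_mean) ** 2 for t in months)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(months, monthly_spend))
    slope = sxy / sxx
    return y_mean - slope * t_mean, slope


def predict_next_month(monthly_spend):
    """Extrapolate the fitted line one month ahead, floored at zero (an assumption)."""
    intercept, slope = fit_line(monthly_spend)
    return max(0.0, intercept + slope * (len(monthly_spend) + 1))


def suggest_action(intercept, slope, intercept_floor=40.0):
    """Map (intercept, slope) to a follow-up; intercept_floor is a hypothetical cutoff."""
    if intercept >= intercept_floor and slope < 0:
        return "churn analysis"
    if intercept >= intercept_floor and slope > 0:
        return "cross-selling or up-selling"
    return "no specific action"


# Invented 11-month histories for two customers.
leaving = [210.0 - 10.0 * t for t in range(1, 12)]  # spending falls each month
growing = [50.0 + 5.0 * t for t in range(1, 12)]    # spending rises each month

for name, history in (("leaving", leaving), ("growing", growing)):
    b0, b1 = fit_line(history)
    print(name, round(b0, 2), round(b1, 2),
          round(predict_next_month(history), 2), suggest_action(b0, b1))
```

A weighted fit that emphasizes recent months would be closer to a score that stresses recent transactions; the unweighted line is used here only to keep the example short.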
However, this BSM study has several limitations.
First, handling personally identifiable data in the process
of analyzing individual behavior raises an important
privacy issue. In order to comply with personal privacy protection
and privacy laws, all personally identifiable information was
deleted in the process of data handling. Since the members
of the restaurant being studied belong to a certain class of
customers, it was decided not to mention the name of the
restaurant, to prevent the possibility of personal
identification. It was also decided to limit the use of personal
behavior data to research purposes only. Therefore, a limitation
of this study is that the source
of the data cannot be disclosed in detail.
The second limitation is that the BSM cannot be applied to
new customers who do not have historical transaction data.
In other words, the BSM is only applicable to existing
customers. Lastly, it cannot identify potential customers in
the low value segment. In these cases, the profile score rather than
the behavior score plays an important role in overcoming the
difficulties. For example, according to the results in Figure
1 and Table 3, the occupations of government officers,
lawyers, presidents/chairmen, and businessmen have high
profile scores, so we can target these groups of people
as new or potential customers. Although this study compares
the prediction power of the BSM and PSM as competing models,
the PSM can be an excellent complement to the BSM
in distinguishing customers. In the management of new and
existing customers, marketers should consider how to
efficiently combine the BSM, based on individual transaction data, and
the PSM, based on aggregated demographic data,
as a powerful tool to understand customers and implement
strategies. In practice, the BSM can be used to identify and
retain the loyal customer group and avoid churn.
However, the BSM is difficult to apply to new
customers with no historical behavior data. In this case, the PSM
is a useful tool to identify potential customers who have poor
historical records. Promotion or
up-selling campaigns might then be applied to make them valued
customers.
References
Adriaans, P., & Zantinge, D. (1996). Data mining. New York, NY:
Addison-Wesley.
Berson, A., Smith, S., & Thearling, K. (1999). Building data mining applications for CRM. New York, NY: McGraw-Hill.
Borle, S., Singh, S. S., & Jain, D. C. (2008). Customer lifetime
value measurement. Management Science, 54(1), 100-112.
Bowen, J. T. (1990). Electronic information: Scanning the
environment. Hospitality Research Journal. Annual Conference Proceedings, 14(2), 95-101.
Bowen, J. T., & Shoemaker, S. (1998). Loyalty: A strategic
commitment. Cornell Hotel and Restaurant Administration Quarterly, 39(1), 12-25.
Bowen, J. T. (2000). A strategic approach to capturing and using
customer information. Journal of Restaurant and Foodservice Marketing, 4(1), 77-81.
Branca, A. S. (2008). Demographic influences on behavior: An update
to the adoption of bank delivery channels. International Journal of Bank Marketing, 26(4), 237-259.
Cho, Y. S., Moon, S. C., & Ryu, K. H. (2014). SOM clustering
method using user’s features to classify profitable customer
for recommender service in u-Commerce. In Park, J. J.,
Pan, Y., Kim, C., & Yang, Y. (Eds.), Future Information Technology, 273-281, Springer, Dordrecht.
Dev, C. S., Buschman, J. D., & Bowen, J. T. (2010). Hospitality
marketing: A retrospective analysis (1960-2010) and
predictions (2010-2020). Cornell Hospitality Quarterly,
51(4), 459-469.
Dube, L., & Renaghan, L. M. (1999). Building customer loyalty.
Cornell Hotel and Restaurant Administration Quarterly,
40(5), 78-88.
Fader, P. S., Hardie, B. G., & Lee, K. L. (2005). RFM and CLV:
Using iso-value curves for customer base analysis. Journal of Marketing Research, 42(4), 415-430.
Francese, P. A., & Renaghan, L. M. (1990). Data-base marketing:
Building customer profiles. Cornell Hotel and Restaurant Administration Quarterly, 31(1), 60-63.
Frank, R. E., Massy, W. F., & Wind, Y. (1972). Market segmentation.
Englewood Cliffs, NJ: Prentice Hall.
Groth, R. (1998). Data mining: A hands-on approach for business professionals. Upper Saddle River, NJ: Prentice Hall.
Gupta, S., & Chintagunta, P. K. (1994). On using demographic
variables to determine segment membership in logit mixture
models. Journal of Marketing Research, 31(1), 128-136.
Hashim, A., Ghani, E. K., & Said, J. (2009). Does consumers’
demographic profile influence online shopping?: An
examination using Fishbein's theory. Canadian Social Science, 5(6), 19-31.
Hughes, A. M. (1996). The complete database marketer. New York,
NY: McGraw-Hill.
Kattiyapornpong, U., & Miller, K. E. (2008). Socio-demographic
constraints to travel behavior. International Journal of Culture, Tourism and Hospitality Research, 3(3), 246-258.
Kotler, P. T., Bowen, J. T., & Makens, J. (2017). Marketing for hospitality and tourism (7th ed.). England: Pearson Education.
Lambert, Z. V. (1981). Profiling demographic characteristics of
alienated consumers. Journal of Business Research, 9(1),
65-86.
Magnini, V. P., Honeycutt, Jr. E. D., & Hodge, S. K. (2003). Data
mining for hotel firms: Use and limitations. Cornell Hotel and Restaurant Administration Quarterly, 44(2), 94-105.
Malthouse, E. C., & Derenthal, K. M. (2008). Improving predictive
scoring models through model aggregation. Journal of Interactive Marketing, 22(3), 51-68.
McCleary, K. W., & Weaver, P. A. (1991). Are frequent-guest programs
effective? Cornell Hotel and Restaurant Administration Quarterly, 32(2),
39-45.
Miglautsch, J. R. (2000). Thoughts on RFM scoring. Journal of Database Marketing, 8(1), 67-72.
Nash, E. (2000). Direct marketing: Strategy, planning, execution
(4th Ed.). New York, NY: McGraw-Hill.
Noori, B. (2015). An analysis of mobile banking user behavior
customer segmentation. International Journal of Global Business, 8(2), 55-64.
Peppers, D., Rogers, M., & Dorf, B. (1999). Is your company ready
for one-to-one marketing? Harvard Business Review,
151-160.
Petrison, L. A., Blattberg, R. C., & Wang, P. (1993). Database
marketing - past, present, and future. Journal of Direct Marketing, 7(3), 27-43.
Piccoli, G., O'Connor, P., Capaccioli, C., & Alvarez, R. (2003).
Customer relationship management-A driver for change
in the structure of the U.S. lodging industry. Cornell Hotel and Restaurant Administration Quarterly, 44(4), 61-73.
Qiasi, R., Baqeri, D. M., Minaei, M. B., & Amooee, G. (2012).
Developing a model for measuring customer’s loyalty and
value with RFM technique and clustering algorithms. The Journal of Mathematics and Computer Science, 4(2),
172-181.
Rhee, S., & McIntyre, S. (2008). Including the effects of prior and
recent contact effort in a customer scoring model for
database marketing. Journal of the Academy of Marketing Science, 36(4), 538-551.
Roberts, M. L. (1992). Expanding the role of the direct marketing
database. Journal of Direct Marketing, 6(2), 51-60.
Rygielski, C., Wang, J. C., & Yen, D. C. (2002). Data mining techniques
for customer relationship management. Technology in Society, 24(4), 483-502.
Sarvari, P. A., Ustundag, A., & Takci, H. (2016). Performance
evaluation of different customer segmentation approaches
based on RFM and demographics analysis. Kybernetes, 45(7), 1129-1157.
Schijns, J. M. C., & Schröder, G. J. (1996). Segment selection by
relationship strength. Journal of Direct Marketing, 10(3),
69-79.
Sheth, J. N. (1977). Demographics in consumer behavior. Journal of Business Research, 5(2), 129-138.
Shmueli, G., Bruce, P. C., & Patel, N. R. (2016). Data mining for business analytics: Concepts, techniques, and applications with XLMiner. New York, NY: John Wiley and Sons.
Sung, T. K., Chang, N., & Lee, G. (1999). Dynamics of modeling
in data mining: Interpretive approach to bankruptcy
prediction. Journal of Management Information Systems, 16(1), 63-85.
Toh, R. S., Rivers, M-J., & Withiam, G. (1991). Frequent-guest
programs: Do they fly? Cornell Hotel and Restaurant Administration Quarterly, 32(2), 46-52.
Toh, R. S., & Hu, M. Y. (1988). Frequent-flier programs: Passenger
attributes and attitudes. Transportation Journal, 28(2),
11-22.
Westphal, C., & Blaxton, T. (1998). Data mining solutions: Methods and tools for solving real-world problems. New York,
NY: John Wiley and Sons.
Wilbourn, L. C., McCleary, K. W., & Phadeesuparit, A. (1997).
Demographic and psychographic determinants of coupon
users at pizza restaurants. Journal of Restaurant and
Foodservice Marketing, 2(1), 45-61.
Yeh, R. S., Plante, R. D., & Agrawal, D. (2011). Consumer data
analysis and its managerial application for the grocery
industry. Journal of Promotion Management, 17(1), 96-113.
Received March 10, 2016
Revised September 4, 2017
Accepted September 18, 2017