• A survey on mobile data uses

      Colot, Christian; Linden, Isabelle; Baecke, Philippe (International Journal of Decision Support System Technology, 2016)
      Mobile devices leave an unprecedented volume and variety of digital traces of human beings. In this paper, the authors propose an overview of multiple uses of mobile data published in the scientific literature. The organization of the survey follows a typology built on two criteria: interaction level and focus of analysis. Crossing these two dimensions would suggest 8 research areas. Only 4 of them are actually covered by the collected pieces of work. These are discussed in turn, outlining their main characteristics. Finally, the discussion of the 4 remaining areas highlights new research areas with a special focus on the possibility of using mobile data to influence individual users towards efficient collective behaviors. To conclude, current and future research avenues suggest that mobile devices and their underlying data are likely to be employed in many domains and may be used not only to observe human life but also to influence it.
    • Bluetooth tracking of humans in an indoor environment: An application to shopping mall visits

      Oosterlinck, Dieter; Benoit, Dries F.; Baecke, Philippe; Van de Weghe, Nico (Applied Geography, 2017)
      Intelligence about the spatio-temporal behaviour of individuals is valuable in many settings. Generating tracking data is a necessity for this analysis and requires an appropriate methodology. In this study, the applicability of Bluetooth tracking in an indoor setting is investigated. A wide variety of applications can benefit from indoor Bluetooth tracking. This paper examines the value of the method in a marketing application. A Belgian shopping mall served as a real-life test setting for the methodology. A total of 56 Bluetooth scanners registered 18,943 unique MAC addresses during a 19-day period. The results indicate that Bluetooth tracking is a sound approach for capturing tracking data, which can be used to map and analyse the spatio-temporal behaviour of individuals. The methodology also provides a more efficient and more accurate way of obtaining a variety of relevant metrics in the field of consumer behaviour research. Bluetooth tracking can be implemented as a new and cost-effective practice for marketing research that provides fast and accurate results and insights. We conclude that Bluetooth tracking is a viable approach, but that certain technological and practical aspects need to be considered when applying Bluetooth tracking in new cases.
    • Data Augmentation by predicting spending pleasure using commercially available external data

      Baecke, Philippe; Van den Poel, Dirk (Journal of Intelligent Information Systems, 2011)
    • From one-class to two-class classification by incorporating expert knowledge: Novelty detection in human behaviour

      Oosterlinck, Dieter; Benoit, Dries F.; Baecke, Philippe (European Journal of Operational Research, 2020)
      One-class classification is the standard procedure for novelty detection. Novelty detection aims to identify observations that deviate from a determined normal behaviour. Only instances of one class are known, whereas so-called novelties are unlabelled. Traditional novelty detection applies methods from the field of outlier detection. These standard one-class classification approaches have limited performance in many real business cases. The traditional techniques are mainly developed for industrial problems such as machine condition monitoring. When applying these to human behaviour, the performance drops significantly. This paper proposes a method that improves existing approaches by creating semi-synthetic novelties in order to have labelled data for the two classes. Expert knowledge is incorporated in the initial phase of this data generation process. The method was deployed on a real-life test case where the goal was to detect fraudulent subscriptions to a telecom family plan. This research demonstrates that the two-class expert model outperforms a one-class model on the semi-synthetic dataset. In a next step, the model was validated on a real dataset. A fraud detection team of the company manually checked the top predicted novelties. The results show that incorporating expert knowledge to transform a one-class problem into a two-class problem is a valuable method.
    • Home location prediction with telecom data: Benchmarking heuristics with a predictive modelling approach

      Oosterlinck, Dieter; Baecke, Philippe; Benoit, Dries F. (Expert Systems with Applications, 2020)
      Correctly identifying the home location is crucial for human mobility analysis with telecom data, more specifically call detail record (CDR) data. To that end, multiple heuristics have been developed in the literature. Nevertheless, due to the lack of ground truth home location data, no study has thoroughly validated these widely used methods so far. We present a detailed performance analysis of existing home detection heuristics, using a unique dataset that enables this important validation at the lowest level, namely the level of the cell tower. Our research indicates that simple heuristics surprisingly outperform their more complex counterparts. The benchmark study revealed that the best heuristic is able to identify the home location with an average error of approximately 4.5 kilometres and selects the correct home tower in 60.69% of the cases. Based on the insights provided by our study, we propose a new heuristic that increases the accuracy to 61% and lowers the average distance error to 4.365 kilometres. Secondly, if the home location is known for possibly only a fraction of the instances, we propose a labelled predictive modelling approach. Adding social-network-based variables in this predictive model further enhances the predictive performance. Our best model reduces the average distance error to 2.848 kilometres and selects the correct home location in 72.08% of the cases. Furthermore, this result provides an indication of the upper bound for home detection with CDR data. Finally, models that only make use of social-network-based data are developed as well. Results show that even without using data of the focal individual, these models are able to select the correct home tower in 37.65% of the cases and achieve an average distance error of 8.1 kilometres.
    • Identifying influencers in a social network: the value of real referral data

      Roelens, Iris; Baecke, Philippe; Benoit, Dries F. (Decision Support Systems, 2016)
      Individuals influence each other through social interactions and marketers aim to leverage this interpersonal influence to attract new customers. It still remains a challenge to identify those customers in a social network that have the most influence on their social connections. A common approach to the influence maximization problem is to simulate influence cascades through the network based on the existence of links in the network using diffusion models. Our study contributes to the literature by evaluating these principles using real-life referral behaviour data. A new ranking metric, called Referral Rank, is introduced that builds on the game theoretic concept of the Shapley value for assigning each individual in the network a value that reflects the likelihood of referring new customers. We also explore whether these methods can be further improved by looking beyond the one-hop neighbourhood of the influencers. Experiments on a large telecommunication data set and referral data set demonstrate that using traditional simulation based methods to identify influencers in a social network can lead to suboptimal decisions as the results overestimate actual referral cascades. We also find that looking at the influence of the two-hop neighbours of the customers improves the influence spread and product adoption. Our findings suggest that companies can take two actions to improve their decision support system for identifying influential customers: (1) improve the data by incorporating data that reflects the actual referral behaviour of the customers or (2) extend the method by looking at the influence of the connections in the two-hop neighbourhood of the customers.
    • Improving customer acquisition models by incorporating spatial autocorrelation at different levels of granularity

      Baecke, Philippe; Van den Poel, Dirk (Journal of Intelligent Information Systems, 2013)
      Traditional CRM models often ignore the correlation that could exist among the purchasing behavior of surrounding prospects. Hence, a generalized linear autologistic regression model can be used to capture this interdependence and improve the predictive performance of the model. In particular, customer acquisition models can benefit from this. These models often suffer from a lack of data quality due to the limited amount of information available about potential new customers. Based on a customer acquisition model of a Japanese automobile brand, this study shows that the extra value resulting from incorporating neighborhood effects can vary significantly depending on the granularity level on which the neighborhoods are composed. A model based on a granularity level that is too coarse or too fine will incorporate too much or too little interdependence, resulting in a less than optimal predictive improvement. Since neighborhood effects can have several sources (i.e. social influence, homophily and exogenous shocks), this study suggests that the autocorrelation can be divided into several parts, each optimally measured at a different level of granularity. Therefore, a model is introduced that simultaneously incorporates multiple levels of granularity, resulting in even more accurate predictions. Further, the effect of the sample size is examined. This shows that including spatial interdependence using finer levels of granularity is only useful when enough data is available to construct stable spatial lag effects. As a result, extending a spatial model with multiple granularity levels becomes increasingly valuable when the data sample becomes larger.
    • Improving purchasing behavior predictions by data augmentation with situational variables

      Baecke, Philippe; Van den Poel, Dirk (International Journal of Information Technology & Decision Making, 2010)
    • Including spatial interdependence in customer acquisition models: a cross-category comparison

      Baecke, Philippe; Van den Poel, Dirk (Expert Systems with Applications, 2012)
    • Investigating the added value of integrating human judgement into statistical demand forecasting systems

      Baecke, Philippe; De Baets, Shari; Vanderheyden, Karlien (International Journal of Production Economics, 2017)
      Whilst the research literature points towards the benefits of a statistical approach, business practice continues in many cases to rely on judgmental approaches for demand forecasting. In today's dynamic environment, it is especially relevant to consider a combination of both approaches. However, the question remains as to how this combination should occur. This study compares two different ways of combining statistical and judgmental forecasting, employing real-life data from an international publishing company that produces weekly forecasts on regular and exceptional products. Two forecasting methodologies that are able to include human judgment are compared. In a 'restrictive judgement' model, expert predictions are incorporated as restrictions on the forecasting model. In an 'integrative judgment' model, this information is taken into account as a predictive variable in the demand forecasting process. The proposed models are compared on error metrics and analysed with regard to the properties of the adjustments (direction, size) and of the forecast itself (volatility, periodicity). The integrative approach has a positive effect on accuracy in all scenarios. However, in those cases where the restrictive approach proved to be beneficial, the integrative approach limited these beneficial effects. The study links with demand planning by using the forecasts as input for an optimization model to determine the ideal number of SKUs per Point of Sale (PoS), making a distinction between SKU forecasts and SKU per PoS forecasts. Importantly, this enables performance to be expressed as a measure of profitability, which proves to be higher for the integrative approach than for the restrictive approach.
    • Judgmental forecast adjustments over different time horizons

      Van den Broeke, Maud; De Baets, Shari; Vereecke, Ann; Baecke, Philippe; Vanderheyden, Karlien (Omega: The International Journal of Management Science, 2019)
      Accurate demand forecasting is the cornerstone of a firm’s operations. The statistical system forecasts are often judgmentally adjusted by forecasters who believe their knowledge can improve the final forecasts. While empirical research on judgmental forecast adjustments has been increasing, an important aspect is under-studied: the impact of these adjustments over different time horizons. Collecting data from 8 business cases, retrieving over 307,200 forecast adjustments, this work assesses how the characteristics (e.g., size and direction) and accuracy of consecutive adjustments change over different time horizons. We find that closer to the sales point, the number of adjustments increases and adjustments become larger and more positive; and that adjustments, both close and distant from the sales point, can deteriorate the final forecast accuracy. We discuss how these insights impact operational activities, such as production planning.
    • Mind the gap - Assessing maturity of demand planning, a cornerstone of S&OP

      Vereecke, Ann; Vanderheyden, Karlien; Baecke, Philippe; Van Steendam, Tom (International Journal of Operations and Production Management, 2018)
      The purpose of this paper is to develop and empirically validate a model for assessing demand planning maturity in organisations. The authors developed a maturity assessment model for demand planning through iterations of theoretical and empirical work, combining insights from literature and practitioners. An online survey is developed to validate the model using data from different industries. The authors identify six dimensions of demand planning maturity: data management, the use of forecasting methods, the forecasting system, performance management, the organisation and people management. The empirical study indicates that demand data are well managed and organisation readiness is high, yet improvements in the forecasting system and the management of forecast performance are needed. The results show a positive relationship between the size of an organisation and its demand planning maturity.
    • The value of vehicle telematics data in insurance risk selection processes

      Baecke, Philippe; Bocca, L. (Decision Support Systems, 2017)
      The advent of the Internet of Things enables companies to collect an increasing amount of sensor-generated data, which creates plenty of new business opportunities. This study investigates how this sensor data can improve the risk selection process in an insurance company. More specifically, several risk assessment models based on three different data mining techniques are augmented with driving behaviour data collected from In-Vehicle Data Recorders. This study proves that including standard telematics variables significantly improves the risk assessment of customers. As a result, insurers will be better able to tailor their products to the customers' risk profile. Moreover, this research illustrates the importance of including industry knowledge, combined with data expertise, in the variable creation process. Especially when a regulator forces the use of easily interpretable data mining techniques, expert-based telematics variables are able to improve the risk assessment model in addition to the standard telematics variables. Further, the results suggest that if a manager wants to implement Usage-Based Insurance, Pay-As-You-Drive related variables are most valuable to tailor the premium to the risk. Finally, the study illustrates that this new type of telematics-based insurance product can quickly be implemented, since three months of data is already sufficient to obtain the best risk estimations.