Forecasting Consumer Consumption Behavior of Water Bottles Using Generalized Linear Model for Supply Chain Resilience

Abdulaziz Alidrees, University of Texas at El Paso


Forecasting the amount of water consumed by individuals in Block 8 of Mubarak Al-Kabeer city, Kuwait, helps any water-producing company determine how much water it should produce for its target consumers. Interviews with a couple of residents of this block revealed that they had been unable to shop for their essentials, including water packs. Data on water consumption, in milliliters, were collected and recorded for different individuals. Generic scanners and smartphone scanners were installed in 20 households that rely on water packs instead of a central water supply, and every individual in each household was asked to scan each water bottle after consuming it. Each scan was transferred to an Excel spreadsheet as a record that the individual had consumed one 500 ml bottle. The data were collected over a period of 30 days, and other factors were also recorded for these individuals: age, weight, height, and whether they had any of the following conditions: hypertension, diabetes, kidney problems, thyroid problems, allergy, asthma, heart disease, post-traumatic stress disorder, and irritable bowel syndrome. In total, 79 individuals were recorded. Qualitative data were also collected from manufacturing employees about the circumstances they were facing. In this study, we used a generalized linear model (GLM) to forecast water consumption, since it can model a normally distributed response variable and accommodates categorical explanatory variables. We began by cleaning and preparing the data, treating missing values and outliers, then performed feature selection, and finally fitted the generalized linear model. The regression coefficients of the model were estimated by maximum likelihood.
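As a brief illustration of the maximum likelihood step for a normally distributed response, the following sketch writes out the log-likelihood and the resulting estimators. It assumes an identity link, which the abstract does not state; the symbols $y_i$ (consumption of individual $i$), $\mathbf{x}_i$ (explanatory variables), $\boldsymbol{\beta}$ (regression coefficients), and $\sigma^2$ (error variance) are notation introduced here for illustration only.

```latex
% Assumption: Gaussian response with identity link (the link is not stated in the abstract).
\[
  y_i \sim \mathcal{N}(\mu_i, \sigma^2), \qquad
  \mu_i = \mathbf{x}_i^{\top}\boldsymbol{\beta}, \qquad i = 1, \dots, n
\]
% Log-likelihood of the n observations.
\[
  \ell(\boldsymbol{\beta}, \sigma^2)
    = -\frac{n}{2}\log\bigl(2\pi\sigma^2\bigr)
      - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\bigl(y_i - \mathbf{x}_i^{\top}\boldsymbol{\beta}\bigr)^2
\]
% Maximizing over beta and sigma^2 gives the least-squares solution and the mean squared residual.
\[
  \hat{\boldsymbol{\beta}} = \bigl(X^{\top}X\bigr)^{-1}X^{\top}\mathbf{y}, \qquad
  \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - \mathbf{x}_i^{\top}\hat{\boldsymbol{\beta}}\bigr)^2
\]
```

Under these assumptions, maximizing the likelihood over the coefficients is equivalent to minimizing the residual sum of squares.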
The best regression model was selected by stepwise regression based on the Akaike Information Criterion (AIC): the model that reduced the AIC the most was chosen. This model was then checked for multicollinearity, which was absent, and for interactions between independent variables, where the interaction between the age and diabetes variables was considered. The model was finally used to forecast water consumption, and its accuracy, measured by the adjusted R-squared, was found to be 29%. This is because most observations in the training dataset lay in the range of 3-5, so the model predicted values within this range accurately but not values outside it. The study recommends clustering models, such as the K-means clustering algorithm, for further research on forecasting.
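The AIC-based selection and the adjusted R-squared check described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's actual code: it assumes a Gaussian response with identity link, uses backward elimination as one possible stepwise strategy, and runs on synthetic data. All function names, column names, and coefficients are hypothetical.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_rss(rows, y, cols):
    """Fit y on an intercept plus the chosen columns; return the residual sum of squares."""
    X = [[1.0] + [r[c] for c in cols] for r in rows]
    p = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(p)]
    beta = solve(xtx, xty)  # Gaussian MLE of the coefficients = least squares
    return sum((yi - sum(b * v for b, v in zip(beta, x))) ** 2 for x, yi in zip(X, y))


def model_aic(rows, y, cols):
    """AIC = 2k - 2 log L, with k = intercept + slopes + error variance."""
    n = len(y)
    rss = fit_rss(rows, y, cols)
    loglik = -0.5 * n * (math.log(2 * math.pi * rss / n) + 1)
    return 2 * (len(cols) + 2) - 2 * loglik


def backward_stepwise(rows, y, cols):
    """Drop one predictor at a time for as long as doing so lowers the AIC."""
    cols = list(cols)
    best = model_aic(rows, y, cols)
    while cols:
        cand_aic, drop = min(
            (model_aic(rows, y, [k for k in cols if k != c]), c) for c in cols
        )
        if cand_aic >= best:
            break
        best = cand_aic
        cols.remove(drop)
    return cols, best


def adjusted_r2(rss, y, n_pred):
    """Adjusted R-squared for a model with n_pred predictors plus an intercept."""
    n = len(y)
    mean = sum(y) / n
    tss = sum((v - mean) ** 2 for v in y)
    return 1 - (rss / (n - n_pred - 1)) / (tss / (n - 1))


# Synthetic data only: these values are invented and are not the study's measurements.
random.seed(0)
names = ["age", "weight", "diabetic"]  # hypothetical column names
rows, y = [], []
for _ in range(79):
    age = random.uniform(18, 70)
    weight = random.uniform(50, 110)
    diabetic = 1.0 if random.random() < 0.3 else 0.0
    rows.append([age, weight, diabetic])
    y.append(1.0 + 0.05 * age + random.gauss(0, 0.5))  # made-up relationship

full_aic = model_aic(rows, y, [0, 1, 2])
selected, best_aic = backward_stepwise(rows, y, [0, 1, 2])
adj_r2 = adjusted_r2(fit_rss(rows, y, selected), y, len(selected))
print("selected predictors:", [names[c] for c in selected])
print("AIC (full model, selected model):", full_aic, best_aic)
print("adjusted R-squared:", adj_r2)
```

Backward elimination is used here only because it keeps the sketch short; forward or bidirectional stepwise selection follows the same pattern of comparing AIC values between nested candidate models.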

Subject Area

Engineering|Sustainability|Behavioral psychology

Recommended Citation

Alidrees, Abdulaziz, "Forecasting Consumer Consumption Behavior of Water Bottles Using Generalized Linear Model for Supply Chain Resilience" (2020). ETD Collection for University of Texas, El Paso. AAI28262047.