This is the code repository for the Master's dissertation.
This dissertation investigates fuel usage patterns and anomalies in the Eastern Cape Province government fleet in South Africa from April 2021 to January 2022. Employing exploratory data analysis, clustering techniques, and predictive modeling, the analysis uncovers insights to optimize fuel consumption and detect fraudulent activities.
Univariate and bivariate analyses reveal patterns in fleet composition, transaction volumes, and fuel efficiency across vehicle makes, model derivatives, and departments. Clustering techniques identify distinct vehicle segments and transaction patterns, highlighting the importance of contextual factors in analyzing fuel usage.
Three key indicators - abnormally large transactions, frequent transactions, and fuel price differences - are developed to detect potential fraud. Predictive models, such as XGBoost, Multi-layer Perceptron, and Random Forest, automate the classification of transactions based on these fraud indicators. The Multi-layer Perceptron performed best, with 87% accuracy on the test set.
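To make the three indicators concrete, the following is a minimal sketch of how such flags could be computed per transaction. It is not the dissertation's implementation: the Transaction fields, the flag_transactions function, the tank-capacity rule, the one-transaction-per-day limit, and the price tolerance are all hypothetical choices made for illustration, and the sample values are made up.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class Transaction:
    # Hypothetical fields; the real dataset's column names may differ.
    vehicle_id: str
    day: date
    litres: float
    price_per_litre: float


def flag_transactions(transactions, tank_capacity, reference_price,
                      max_per_day=1, price_tolerance=0.50):
    """Return one dict of three boolean flags per transaction.

    All thresholds are illustrative assumptions, not values from the study.
    """
    # Count how many times each vehicle refuelled on each day.
    per_day = Counter((t.vehicle_id, t.day) for t in transactions)
    flags = []
    for t in transactions:
        flags.append({
            # Abnormally large: more litres than the vehicle's tank holds.
            "large": t.litres > tank_capacity[t.vehicle_id],
            # Frequent: more refuels on one day than the assumed limit.
            "frequent": per_day[(t.vehicle_id, t.day)] > max_per_day,
            # Price difference: price paid is far from a reference price.
            "price_diff": abs(t.price_per_litre - reference_price) > price_tolerance,
        })
    return flags


# Made-up sample data, for illustration only.
txns = [
    Transaction("V1", date(2021, 4, 1), 40.0, 17.00),
    Transaction("V1", date(2021, 4, 1), 75.0, 17.10),
    Transaction("V2", date(2021, 4, 2), 50.0, 19.00),
]
flags = flag_transactions(txns, {"V1": 60.0, "V2": 80.0}, reference_price=17.00)
for t, f in zip(txns, flags):
    print(t.vehicle_id, t.day, f)
```

Flags of this kind could then serve as labels or features for a classifier; which thresholds are appropriate depends on the vehicle type and the fuel prices in effect at the time of each transaction.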
The study is limited by the scope of the data and missing information for certain variables. Future research could expand the geographical and temporal range, incorporate qualitative data, explore real-time monitoring systems, and investigate vehicle maintenance and fuel efficiency.
This dissertation contributes to knowledge of fuel management and fraud detection in government fleets, offering a data-driven approach to uncovering inefficiencies and anomalies. The insights and methodologies presented serve as a foundation for future research and practical applications that support more efficient, cost-effective, and transparent fleet operations.