Wanderio is an online travel booking platform that allows users to compare flights, train and bus journeys, and connections to and from airports and train stations. The app’s network covers 700 airports across the world, more than 500 bus destinations in Europe, and two of Italy’s main rail companies.
Since 2013, Wanderio has been using advanced technology to plan door-to-door travel experiences, providing travellers with cutting-edge solutions and supporting users through every step of the planning and booking experience.
For an online platform such as Wanderio, payment fraud is one of the most common problems. Online payments are a frequent target because fraudsters can access the funds quickly and with little risk. Fraud is often discovered only days after it has occurred, by which point it is too late to identify and track the perpetrator.
Wanderio uses a third-party provider to score how likely each transaction is to be fraudulent, based on features such as the customer’s profile, their transaction history, and the transaction itself. The most suspicious transactions are then assessed by a human operator, which is a slow and expensive process.
Wanderio’s CTO, Luca Rossi, understood that Artificial Intelligence could provide a better solution to the fraud management process. He thus decided to work with Pi School on this project. Pi School’s Artificial Intelligence Programme allocated the project to software engineer Davide Poveromo, who was supported in his research by his mentor Simone Scardapane, researcher at La Sapienza University in Rome, and by Sébastien Bratières, Faculty Director of the Artificial Intelligence Programme.
Several methods are used for fraud detection, all of which attempt to increase the detection rate while keeping false alarms to a minimum. They are generally based on the assumption that every classification error costs the same, which is not true in many real-world scenarios.
In fraud detection, failing to identify a fraud (a false negative) is worse than flagging a legitimate transaction as a fraud (a false positive). It is therefore essential to take the different costs into account and to make the decisions that minimise the real-world costs incurred.
This project presents different cost-sensitive models with the aim of minimising the average cost of every transaction.
For every transaction, the third-party provider establishes a score from 1 to 100 based on a certain set of characteristics taken from the customer’s profile, their transaction history, and the transaction itself. Wanderio decided to flag every transaction with a score higher than 28 (around 30% of the total) as a potential case of fraud. After being reviewed by human operators, a large majority of these turn out to be legitimate, i.e. false positives. Only 4% of all transactions turn out to be fraudulent. Knowing the cost associated with each misclassification, we can calculate the average cost per transaction for the data set provided by Wanderio. Both the chosen threshold value of 28 and the associated average cost per transaction defined by Wanderio were considered good starting points to evaluate the performance of the chosen models.
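The baseline calculation described above can be sketched as follows. This is a minimal illustration, not Wanderio's actual code: the names `CostMatrix`, `flag_by_score` and `average_cost`, the cost values, and the toy data are all assumptions made for the example. Only the threshold of 28 comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostMatrix:
    """Cost of each outcome, in currency units (hypothetical values)."""
    tp: float  # fraud that was flagged
    fp: float  # legitimate transaction that was flagged
    fn: float  # fraud that was not flagged
    tn: float  # legitimate transaction that was not flagged


def flag_by_score(scores, threshold=28):
    """Flag every transaction whose provider score is above the threshold."""
    return [score > threshold for score in scores]


def average_cost(is_fraud, flagged, costs):
    """Mean cost per transaction for a given set of flag decisions."""
    if len(is_fraud) != len(flagged) or not is_fraud:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for fraud, flag in zip(is_fraud, flagged):
        if fraud:
            total += costs.tp if flag else costs.fn
        else:
            total += costs.fp if flag else costs.tn
    return total / len(is_fraud)


# Toy data, invented for illustration only.
scores = [10, 35, 50, 20, 90]
is_fraud = [False, False, True, True, False]
costs = CostMatrix(tp=5.0, fp=5.0, fn=100.0, tn=0.0)

baseline = average_cost(is_fraud, flag_by_score(scores), costs)
print(baseline)  # 23.0
```

Any candidate model can then be compared with the baseline by passing its own flag decisions to `average_cost` with the same cost matrix.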
Cost-sensitive models are a non-standard approach to classification that has not yet been fully explored or documented. To find the best model for Wanderio, four models were tested:
Logistic regression with Minimum Bayes Risk. This model has two parts. The first is a standard logistic regression model fitted to the training data. The second is a Minimum Bayes Risk classifier (abbreviated BMR, for Bayes Minimum Risk), which evaluates the predictions made by the logistic model. The BMR classifier is a decision model that quantifies the trade-offs between the possible decisions using their probabilities and their costs: it combines the probabilities predicted by the logistic model with the costs defined in the cost matrix, and returns the prediction with the lower risk.
Support Vector Machine with Minimum Bayes Risk. This model is almost identical to the logistic regression model with Minimum Bayes Risk, but it uses a support vector machine model instead of the logistic regression.
Random forest with Minimum Bayes Risk. As in the two previous models, the predictions of a random forest model are evaluated by a BMR classifier, and the prediction returned by this model is the one that presents the lower risk.
Cost-sensitive logistic regression. This model consists of a logistic regression with a customised loss function using the values of the cost matrix. When trained, the model attempts to minimise the average cost of every transaction.
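The decision step shared by the three Minimum Bayes Risk models can be sketched in a few lines. This is an illustration under stated assumptions rather than the project's implementation: the function name `bayes_minimum_risk` and the cost values are hypothetical, and the fraud probability is assumed to come from whichever base model is in use.

```python
def bayes_minimum_risk(p_fraud, c_tp, c_fp, c_fn, c_tn):
    """Return True (flag as fraud) when flagging has the lower expected cost."""
    if not 0.0 <= p_fraud <= 1.0:
        raise ValueError("p_fraud must be a probability")
    # Expected cost of flagging: correct if fraud, false alarm otherwise.
    risk_flag = p_fraud * c_tp + (1.0 - p_fraud) * c_fp
    # Expected cost of letting it through: missed fraud or correct pass.
    risk_pass = p_fraud * c_fn + (1.0 - p_fraud) * c_tn
    return risk_flag < risk_pass


# Hypothetical costs, in currency units.
COSTS = dict(c_tp=5.0, c_fp=5.0, c_fn=100.0, c_tn=0.0)

# Probabilities as they might be predicted by a base model.
probabilities = [0.02, 0.10, 0.60]
decisions = [bayes_minimum_risk(p, **COSTS) for p in probabilities]
print(decisions)  # [False, True, True]
```

With these example costs, a transaction with only a 10% predicted probability of fraud is already flagged, because a missed fraud is assumed to cost far more than a manual review.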
The random forest model with Minimum Bayes Risk performed well on both the validation and test sets. Its balance of true positives and false negatives is better than in the third-party provider baseline, which improves the final average cost per transaction. This model tends to be more conservative, flagging a greater number of transactions as potential frauds than the third-party provider prediction. Of all the models analysed, this one performed best.
This project systematically accounts for the cost of the outcomes of actions taken on the basis of predictions (the “cost-sensitive approach”). This makes it possible to establish operating regimes, such as the choice of thresholds.
Directly optimising the cost function is a non-differentiable combinatorial problem and is very hard to solve. Tentative experiments carried out as part of this project, using the evolutionary algorithm outlined in the thesis of A. Correa Bahnsen and included in the CostCla GitHub repository, were unsuccessful.
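A small sketch may help to show where the difficulty comes from. With hard flag/pass decisions, the average cost only changes when a prediction flips, so it is piecewise constant in the model parameters and gives gradient-based training nothing to follow. One common workaround, and one plausible form for the customised loss of the cost-sensitive logistic regression listed earlier, is to weight the costs by the predicted probabilities instead. The one-parameter model, the data, and the cost values below are all hypothetical.

```python
import math

# Hypothetical costs, in currency units.
C_TP, C_FP, C_FN, C_TN = 5.0, 5.0, 100.0, 0.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hard_cost(w, xs, ys):
    """Average cost when decisions are thresholded at probability 0.5."""
    total = 0.0
    for x, y in zip(xs, ys):
        flag = sigmoid(w * x) > 0.5
        if y:
            total += C_TP if flag else C_FN
        else:
            total += C_FP if flag else C_TN
    return total / len(xs)


def expected_cost(w, xs, ys):
    """Average cost with each decision replaced by its predicted probability."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x)
        if y:
            total += p * C_TP + (1.0 - p) * C_FN
        else:
            total += p * C_FP + (1.0 - p) * C_TN
    return total / len(xs)


# Toy one-feature data set: label 1 marks a fraud.
xs = [-2.0, -1.0, 0.5, 1.5]
ys = [0, 0, 1, 1]

for w in (0.5, 2.0):
    print(w, hard_cost(w, xs, ys), expected_cost(w, xs, ys))
```

Both weights give the same hard cost, because they produce the same decisions, while the probability-weighted cost decreases as the weight grows and can therefore be minimised with standard gradient methods.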
The costs used in this analysis should be examined and corrected where relevant, since optimal decisions are determined from cost ratios. For instance, the margin used to estimate the gain from a legitimate transaction recognised as such has been arbitrarily set to 5%. The costs associated with fraud in general should be tracked more accurately in order to improve the analysis.
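The dependence on cost ratios can be made explicit with a short derivation. This is the standard result for a two-class cost matrix, written with symbols introduced here for illustration: p is the predicted probability of fraud, and C_TP, C_FP, C_FN and C_TN are the costs of the four outcomes, assuming C_FP > C_TN and C_FN > C_TP.

```latex
\begin{align*}
R(\text{flag} \mid p) &= p\,C_{TP} + (1-p)\,C_{FP} \\
R(\text{pass} \mid p) &= p\,C_{FN} + (1-p)\,C_{TN} \\
R(\text{flag} \mid p) < R(\text{pass} \mid p)
  &\iff p > p^{*}
  = \frac{C_{FP} - C_{TN}}{(C_{FP} - C_{TN}) + (C_{FN} - C_{TP})}
  = \frac{1}{1 + r},
  \qquad r = \frac{C_{FN} - C_{TP}}{C_{FP} - C_{TN}}
\end{align*}
```

Multiplying every cost by the same constant leaves r, and therefore the threshold, unchanged. As a purely hypothetical example, C_TP = C_FP = 5, C_FN = 100 and C_TN = 0 give r = 19 and a threshold of 0.05. A gain on legitimate transactions, such as the 5% margin, enters as a negative C_TN, which lowers r and raises the threshold, so an error in that margin shifts every decision near the boundary.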
The researchers chose not to focus their work on feature engineering and feature enrichment from external sources, both of which are competitive advantages of the third-party provider. Doing so would have required the team to acquire external databases or to work with APIs from external data providers. In turn, this would have introduced external dependencies that the tight time frame of the project did not allow.
There are still promising areas for improvement. In particular, simple feature engineering could improve performance, since the data set is large enough to set aside a validation set for this purpose. In the experiments reported here, variance due to the choice of the test set may have a strong impact on the test error (e.g. a single undetected fraud might have a large impact on the average cost). Test error estimates should therefore be made more robust using k-fold cross-validation.
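As a rough sketch of what such a k-fold estimate could look like, the example below uses a deliberately simple stand-in for the model: the "training" step just picks the score threshold with the lowest cost on the training folds. The data are synthetic, and the cost values and all function names are assumptions made for the illustration.

```python
import random
import statistics

# Hypothetical costs, in currency units.
C_TP, C_FP, C_FN, C_TN = 5.0, 5.0, 100.0, 0.0


def average_cost(rows, threshold):
    """Mean cost of flagging rows whose score exceeds the threshold.

    Each row is a (score, is_fraud) pair.
    """
    total = 0.0
    for score, is_fraud in rows:
        flag = score > threshold
        if is_fraud:
            total += C_TP if flag else C_FN
        else:
            total += C_FP if flag else C_TN
    return total / len(rows)


def fit_threshold(rows, candidates=range(0, 101)):
    """Stand-in training step: pick the threshold with the lowest cost."""
    return min(candidates, key=lambda t: average_cost(rows, t))


def k_fold_cost(rows, k=5, seed=0):
    """Return the held-out average cost for each of the k folds."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    fold_costs = []
    for i, test_fold in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        threshold = fit_threshold(train)
        fold_costs.append(average_cost(test_fold, threshold))
    return fold_costs


# Synthetic data: about 4% frauds, which tend to receive higher scores.
rng = random.Random(42)
rows = []
for _ in range(500):
    is_fraud = rng.random() < 0.04
    score = rng.randint(40, 100) if is_fraud else rng.randint(1, 60)
    rows.append((score, is_fraud))

fold_costs = k_fold_cost(rows, k=5)
print(statistics.mean(fold_costs), statistics.stdev(fold_costs))
```

Reporting the spread across folds alongside the mean shows how much of a difference between two models could be explained by the choice of test set alone.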
Wanderio will continue to invest in the project, basing their strategic actions on the results presented.