Airline Passenger Name Record Generation using Generative Adversarial Networks
Passenger Name Records (PNRs) are at the heart of the travel industry. Created when an itinerary is booked, they contain travel and passenger information. It is usual for airlines and other actors in the industry to inter-exchange and access each other's PNR, creating the challenge of using them without infringing data ownership laws. To address this difficulty, we propose a method to generate realistic synthetic PNRs using Generative Adversarial Networks (GANs). Unlike other GAN applications, PNRs consist of categorical and numerical features with missing/NaN values, which makes the use of GANs challenging. We propose a solution based on Cramér GANs, categorical feature embedding and a Cross-Net architecture. The method was tested on a real PNR dataset, and evaluated in terms of distribution matching, memorization, and performance of predictive models for two real business problems: client segmentation and passenger nationality prediction. Results show that the generated data matches well with the real PNRs without memorizing them, and that it can be used to train models for real business applications.
READ FULL TEXT