Handcrafted vs Deep Learning Classification for Scalable Video QoE Modeling
Mobile video traffic dominates cellular and enterprise wireless networks. As applications diversify, network administrators face the challenge of providing high quality of experience (QoE) across varied wireless conditions and application content. Yet state-of-the-art networks lack QoE analytics, because measuring QoE requires support from the application or feedback from users. Existing techniques map quality of service (QoS) to QoE by training machine learning models without user feedback, but they cover only a few applications because ground-truth QoE annotations for training are scarce. To address these limitations, we focus on video telephony applications and model key artefacts of spatial and temporal video QoE. Our key contribution is the design of content- and device-independent metrics, trained across diverse WiFi conditions. We show that our metrics achieve a median accuracy of 90% when compared with mean opinion scores (MOS) from more than 200 users and 800 video samples over three popular video telephony applications: Skype, FaceTime and Google Hangouts. We further extend our metrics with deep neural networks, specifically a combined CNN and LSTM model. We achieve a median accuracy of 95%, which is a 38