A Holistic Survey of Wireless Multipath Video Streaming

06/14/2019 ∙ by Samira Afzal, et al. ∙ University of Campinas SAMSUNG 0

Most of today's mobile devices are equipped with multiple network interfaces, and one of the main bandwidth-hungry applications that would benefit from multipath communications is wireless video streaming. However, most current transport protocols do not match the requirements of video streaming applications or are not designed to address relevant issues, such as delay constraints, network heterogeneity, and head-of-line blocking. This article provides a holistic survey of multipath wireless video streaming, shedding light on the different alternatives from an end-to-end layered stack perspective, unveiling the trade-offs of each approach, and presenting a suitable taxonomy to classify the state of the art. Finally, we discuss open issues and avenues for future work.




I Introduction

Publication | Network environment | Protocol/feature | Performance improvements compared to single path
MRTP [1] | Mesh ad hoc network with high burst loss | RTP | PSNR gain of 1.26 dB over single path; 64.14% loss rate reduction, together with making packet losses more random.
MPRTP [2] | Two 3G links with bandwidth variations | RTP | In the quick bandwidth change scenario, PSNR is better than the single path with 0.5% and 1.0% loss rate. In the slow bandwidth change scenario, it is comparable to the single path with 1.0% loss rate.
RTRA [3] | WiFi and Bluetooth networks with bandwidth variations | DASH | RTRA shows better results for both slowly and rapidly changing bandwidth scenarios in terms of startup delay (reduced by up to half), average playback fluency (no segment missing in multipath but many misses in some single path scenarios), playback quality (PSNR improved by 1 to 3 dB), quality switches (up to 4 times reduction), and bandwidth utilization.
MPLOT [4] | Wireless mesh network with burst loss rate of 50% | TCP | MPLOT achieves 75%, with a mean of 50%, more bandwidth aggregation compared to the single path.
et al. [5] | Burst lossy wireless network | IP source routing/relay | The proposed approach results in drops of only 1.5 to 7 dB, whereas single path results in drops of 12 to 15 dB.

Reference | Year | Scope | Comments
Qadir et al. [6] | 2015 | Control and data plane | Multipath for data in general; focus on network-layer multipath solutions
Singh et al. [7] | 2015 | Control and data plane | Multipath for data in general; limited research on video streaming services
Li et al. [8] | 2016 | Data plane | Multipath for data in general; regarding video streaming, relevant aspects not covered
Trestian et al. [9] | 2018 | Data plane | Multimedia delivery solutions following three key directions: adaptation, energy efficiency, and multipath (limited to MPTCP and SCTP/CMT)
Current survey | 2018 | Data plane | Multipath investigation mainly for video streaming

Multimedia services (e.g., Skype, FaceTime) and on-demand mobile video content (e.g., Hulu, YouTube, Netflix) have become part of daily use. Likewise, online cloud gaming is a very popular form of entertainment [10]. Such applications require high-quality video streaming capabilities to meet end user expectations. The annual Cisco reports [11, 12] show that, since 2012, mobile video has represented more than half of global mobile data traffic and will remain responsible for the largest share of traffic growth going forward.

The increase of on-demand video is expected to affect mobile networks as much as it will affect fixed networks. Another trend is that Ultra HD (UHD)/4K video will be more prevalent in the network, as well as Multi-View Video (MVV) and even 8K, in the short-mid term. For example, current smartphones are already able to record 4K videos at a bit rate of around 42-48 Mbps [13, 14, 15, 16].

Delivering high-quality video streaming services makes the task of providing real-time wireless transmission of multimedia while ensuring Quality of Experience (QoE) quite challenging due to bandwidth and time constraints [17]. One approach to tackle this challenging scenario is to add multipath transmission, where video streaming can be delivered over IP broadcast and/or broadband with bidirectional connectivity between video sources and users. Table I presents published results on the potential performance gains of wireless video streaming when exploiting multiple network paths.

Several surveys in the literature have covered different aspects of multipath data communications in general, such as [6, 7, 8, 18, 19, 20]. However, to the best of our knowledge, this is the first survey focusing specifically on the multipath transmission of wireless video. The contributions of this survey include new insights, protocol technique discussions, and a taxonomy of existing solutions, altogether serving as a valuable source of information to researchers and developers in this field. More specifically, we focus on the data plane problem of how to schedule data on multiple paths and intentionally leave out works on multipath routing, i.e., the control plane aspects of how to compute routes, such as multipath proposals [21, 22] based on Software-Defined Networking (SDN) [23]. Readers interested in those aspects are referred to recent surveys [6, 7] covering control plane approaches. We also leave wireless sensor networks out of scope and refer to surveys on multipath video streaming in this type of network [24, 25]. Peer-to-peer (P2P) video streaming applications, which have been surveyed in depth [26, 27], are also not in the scope of this survey.

In the following, we provide a brief overview of the most related and recently published surveys on multipath data communications [6, 7, 8, 9], summarized in Table II.

Qadir et al. [6] investigated multipathing for data in general, mainly on the network layer. In addition, they investigated multipath transmission on the transport layer. Their investigation is organized around key aspects of network-layer multipathing: 1) route computation (source routing, hop-by-hop routing, overlay routing, and SDN-based routing), 2) routing metrics (e.g., delay, bandwidth), 3) load balancing techniques (static or dynamic), 4) the number of paths to use, and 5) how to use multiple paths together.

Singh et al. [7] cover multipathing for data communications in general, covering fundamentals of multipath routing, multipath computation algorithms, multipath forwarding algorithms, and traffic splitting algorithms. The work also reviews various multipath protocols following a layer-based structure, from the application layer to the physical layer.

Li et al. [8] investigate multipath solutions for data in general and present research problems at various protocol layers, including cross-layer approaches. Although some video streaming multipath solutions are discussed in this survey, many video streaming specific issues are not considered (e.g., importance and influence of video content). In addition, the work does not cover multipath attempts on key video streaming protocols, e.g., Dynamic Adaptive Streaming over HTTP (DASH) and MPEG Media Transport (MMT).

As a related survey, we should also consider Trestian et al. [9], which surveys seamless multimedia delivery within heterogeneous wireless network environments. The authors evaluate three key aspects of multimedia delivery: adaptation, energy efficiency, and multipath delivery. Regarding the latter, only proposals based on Multipath TCP (MPTCP) and Stream Control Transmission Protocol (SCTP)/Concurrent Multipath Transfer (CMT) are studied.

Unlike existing surveys, this survey is centered around multipath methods for wireless video streaming to mobile devices. In this survey, mobile means wireless communications, which may or may not include mobility scenarios (e.g., a wireless laptop at home or a wireless laptop on a train). We cover in depth the relevant approaches and new techniques in the field. We mainly categorize existing works considering two aspects. The first aspect relates to the protocol layer perspective of each work: application layer, transport layer, and/or network layer. We sub-group the approaches of each layer based on which standard protocol/feature is used in the schemes proposed by the authors. Such a classification is beneficial to understand the advantages, drawbacks, and trade-offs of each layer and protocol/feature. We also indicate which part of the network (server and/or client) requires adjustment in order to become compatible with the multipath transmission approach. In the second aspect, we analyze the approaches based on the specific scheduling functions used to transmit video data over wireless link technologies. The works are classified according to the following scheduling functions: packet selection, packet protection, and path selection. In addition to these two aspects, we also discuss primary research problems related to multipath video transmission, such as network heterogeneity, out-of-order packets, Head-of-Line (HOL) blocking, End-to-End delay, overdue packets, implementation aspects, and pros and cons of each approach.

Fig. 1: Visual representation of the organization of the survey.

The high-level organization of this survey is illustrated in Figure 1. An overview of video streaming protocols is provided in Section II. The benefits and challenges of adding multipath transmission in the video streaming scenario are presented in Section III. Surveyed works are then initially introduced in Section IV and classified based on the protocol layer stack position and on the used protocol/feature. In Section V, the works are then investigated based on the scheduling functions: choice of the next packet to be transmitted (packet selection), data packet protection method (packet protection), and selection of the proper network channel (path selection). Section VI provides additional information about the surveyed works that may also be of interest for the reader, such as packet loss differentiation, fairness consideration, video codecs, the employed network simulator, performance metrics, and video services. Section VII presents research issues and directions. Finally, Section VIII provides concluding remarks. There is also a list of abbreviations in the appendix to help readers track them easily.

II Historical Overview of Video Streaming

Fig. 2: Historical overview of video streaming protocols.

This section provides a general picture of video streaming development as presented by the timeline and milestones in Figure 2. Interested readers are referred to [28] for a deeper review of different MPEG standards.

The first widely used video streaming protocol is the Real-time Transport Protocol (RTP) [29], which was initially released in 1992 by the IETF. It is a UDP-based protocol used for unidirectional real-time video streaming. The advantage of this protocol is that it has very low overhead and works well in managed IP networks. However, RTP also has some disadvantages. For example, it requires a payload format for each media type or codec [30], lacks multiplexing, and has limited support for non-real-time video. Another disadvantage of RTP is that many CDNs do not support it because the server must manage a separate streaming session for each client, making large-scale deployment more resource intensive. Moreover, RTP cannot traverse firewalls and is connectionless. Therefore, RTP is generally employed in private managed networks where the number of packet losses is small, such as pay-TV cable networks. More technical details on RTP will be provided in Section IV-A.

The next widely known and adopted video streaming protocol shown in Figure 2 is the MPEG-2 Transport System (MPEG-2 TS) [31]. It has been widely used since 1995 in digital broadcasting, mobile broadcasting systems, and streaming over the Internet. Several standards have also adopted this protocol, such as Terrestrial Digital Multimedia Broadcasting (T-DMB), Digital Video Broadcasting Handheld (DVB-H), the Advanced Television Systems Committee (ATSC), and Internet Protocol TeleVision (IPTV) [32]. MPEG-2 TS is not only a format for fast and reliable packetized streaming delivery but also a format for storage. In addition, MPEG-2 TS is fully codec agnostic.

As the requirements for on-demand and personalized video delivery over the Internet have dramatically increased, it has become challenging for MPEG-2 TS to meet the high requirements of broadcasting over IP [30, 33]. For example, MPEG-2 TS is not appropriate for UHD delivery over packet networks due to its pre-multiplexing mechanism, inflexible packetization, and small fixed packet size (188 bytes).

The next protocol in the timeline of Figure 2 is the Real-Time Messaging Protocol (RTMP) [34]. It is an Adobe proprietary protocol standardized in 2002 that was initially developed by Macromedia. RTMP is a TCP-based protocol used for bidirectional video streaming. This protocol provides the advantage of multiplexing capability but requires the Flash Player plugin. All video and audio files must be sent in a Small Web Format (SWF) [35] file so that Flash Player can play them. Another disadvantage is that RTMP is not codec agnostic and does not support some newer video codecs, such as High Efficiency Video Coding (HEVC) [36]. Yet another disadvantage is that it is blocked by firewalls and not supported by all Content Delivery Networks (CDNs).

Multiple variants of RTMP were designed: RTMPS, RTMPE, RTMPT, and RTMFP. RTMPS is RTMP encrypted over a Transport Layer Security (TLS)/Secure Sockets Layer (SSL) connection. RTMPE uses Adobe's own encryption mechanism. RTMPT encapsulates RTMP, RTMPS, or RTMPE packets within Hypertext Transfer Protocol (HTTP) [37] requests in order to traverse firewalls. Finally, RTMFP replaces RTMP and runs over the User Datagram Protocol (UDP) [38] instead of TCP [39].

The next highlighted point in the timeline of Figure 2 is not a protocol, but a video streaming technique introduced in 2006 that became widely adopted in subsequent streaming protocols. Adaptive Bit Rate (ABR) streaming was introduced by Move Networks [40, 41] and runs over HTTP [37] with some adaptations, described next. At the server side, the video sequences are stored in various resolutions (bit rates) and are fragmented into small segments. The streaming logic is located on the client side and is responsible for selecting the suitable segment by considering different parameters (e.g., bandwidth availability, media playout situations [41]). Without such a flexible service, if only one video bit rate is available, a video bit rate lower than the network bandwidth would lead to smooth video but waste available resources. On the other hand, a video bit rate higher than the network bandwidth would impose delay.
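To make the client-side selection logic concrete, the following is a minimal sketch of a throughput-based segment selection rule. It is an illustration under stated assumptions, not the algorithm of Move Networks or of any specific player: the function name select_bitrate, the bitrate ladder, and the 0.8 safety factor are all hypothetical.

```python
from typing import Sequence


def select_bitrate(available_kbps: Sequence[int],
                   throughput_kbps: float,
                   safety: float = 0.8) -> int:
    """Pick the highest representation that fits within safety * throughput.

    Falls back to the lowest representation when even that one exceeds the
    budget, so that playback can still proceed (hypothetical policy).
    """
    if not available_kbps:
        raise ValueError("at least one representation is required")
    budget = safety * throughput_kbps
    ladder = sorted(available_kbps)
    chosen = ladder[0]
    for rate in ladder:
        if rate <= budget:
            chosen = rate
    return chosen


# Hypothetical bitrate ladder (kbit/s) stored at the server.
ladder = [400, 1200, 2500, 5000]
print(select_bitrate(ladder, throughput_kbps=3500))  # budget of 2800 kbit/s
print(select_bitrate(ladder, throughput_kbps=300))   # below the lowest rung
```

Real players typically combine such a throughput rule with buffer occupancy and switch-damping heuristics; the sketch only captures the basic trade-off between wasting bandwidth and imposing delay.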

One advantage of HTTP-based video streaming solutions is that they are easy to deploy in the current Internet architecture. In addition, HTTP flows can traverse middleboxes, such as firewalls, Network Address Translators (NATs), and network proxies. Moreover, the client can manage the streaming without the need to maintain a session state on the server, which improves scalability and lets servers supply a large number of clients at no additional cost [42]. Therefore, HTTP is supported by most CDNs [43], and the interest in using HTTP for video streaming has significantly increased in recent years. In 2015, a new version of HTTP, namely HTTP/2, was standardized [44] and received the attention of researchers in the multimedia community [45, 46, 47]. The results show that moving video streaming services built on top of HTTP/1.1 to HTTP/2 could improve video quality and performance. More technical details on HTTP/2 will be provided in Section IV-A.

The first commercially successful HTTP Adaptive Streaming (HAS) protocols [41] were Microsoft Silverlight Smooth Streaming (MSS) [48], developed by Microsoft in 2008, HTTP Live Streaming (HLS) [49], developed by Apple in 2009, and Adobe HTTP Dynamic Streaming (HDS) [50], developed by Adobe in 2010. Since all these protocols were proprietary and mutually incompatible, the Dynamic Adaptive Streaming over HTTP (MPEG-DASH) [51] protocol was developed in 2011 to become a unified, codec-agnostic standard [52]. Its flexible delivery and codec-agnostic properties have turned MPEG-DASH into a successful protocol widely adopted by content providers [53]. For example, Netflix and YouTube currently use MPEG-DASH with Hypertext Markup Language (HTML) as their core streaming technology [53]. Another advantage is that DASH supports both multiplexed and unmultiplexed encoded content. However, the protocol has some disadvantages. For instance, DASH cannot support low-latency delivery because the server must wait for the completion of movie fragments or of the whole file before transmission [30]. Besides, inaccurate bandwidth estimation, especially in mobile networks, causes frequent quality switches, freezes, and poor quality of experience (QoE) for DASH [45]. More technical details on DASH are provided in Section IV-A.

All ABR-based protocols employ similar versions of the previously explained technology and support live and video on demand (VoD) delivery. The differences lie in technical parameters [41]. For instance, MPEG-2 TS is used by HLS/DASH, while ISO Base Media File Format (ISOBMFF) is used by DASH and fragmented MP4 is used by HDS/MSS. Regarding video codec support, H.264 and H.265/HEVC are supported by all HAS protocols. In addition, MSS also supports VC-1 [54], HDS also supports VP6 [55], and DASH supports all video codecs. Finally, segment lengths are specified as 2 seconds for MSS, 10 seconds for HLS, and 2-5 seconds for HDS, and are not specified for DASH. More details on HAS protocols are discussed in [41].

The next protocol in the timeline of Figure 2 is MPEG Media Transport (MMT) [56]. It was standardized by MPEG in 2014, considering recent changes in multimedia delivery and requirements for Internet technologies, such as IP and HTML for Internet-based video streaming solutions [30]. This protocol also supports UHD resolution and the HEVC video codec. MMT was designed to inherit some MPEG-2 TS features, such as content-agnostic media delivery, easy conversion between storage and delivery formats, and support for multiplexing. In addition, MMT was developed due to a need for an international standard to support hybrid delivery in various heterogeneous network environments. Then, in 2015, implementation guidelines were standardized to provide technical guidance for implementing and deploying MMT systems.

The last highlighted point in the timeline is the MMT enhancement for mobile environments, which specifies multipath support and has already been added to the protocol and standardized [57]. MMT has been adopted by some recent standards, such as ATSC 3.0 [58], which has a hybrid delivery model that includes MMT and DASH. Specifically, in ATSC 3.0, the MMT protocol (MMTP) is proposed for broadcasting, and DASH over HTTP is proposed for broadband service.

One important difference between DASH and MMT is that, typically, DASH supports client-driven Quality of Service (QoS) control, while MMT supports server-driven QoS control [58]. More technical details on MMT are provided in Section IV-A.

In summary, multipath has been investigated in some of the above-mentioned video streaming protocols but not in all of them. In particular, it has not been investigated for the proprietary protocols due to their closed and incompatible design (Figure 2).

Commercial services. We already mentioned some companies using their own proprietary protocols, such as Move Networks, Microsoft (MSS), Apple (HLS), and Adobe (RTMP, HDS). Besides them, other companies have adopted or are in the process of developing streaming solutions. For example, Skype and WhatsApp are mobile application platforms providing video calls or video conferences for their users. These services use RTP for video streaming [59]. Hulu [60] is an online video service providing on-demand shows, movies, documentaries, and more. Hulu requires Flash Player for video streaming through the RTMP protocol [9].

A number of service providers use DASH. Among the most famous ones are YouTube [61], Netflix [62], Twitch [63] and Vimeo [64]. YouTube [61] is a video-sharing website providing live and on-demand video streaming. Netflix [62] allows watching on-demand movies and Twitch is the world’s leading live streaming platform for gamers. Vimeo also provides free video viewing services. Another commercial streaming service is Bitmovin, which provides adaptive streaming supporting MPEG-DASH and HLS [65]. Generally, DASH has gotten broad support from commercial companies – see DASH Industry Forum member list available in [66]. In addition, browsers such as Chrome and Firefox also support DASH [53].

NHK (Nippon Hoso Kyokai), Japan's public service broadcaster, uses MMT as the protocol of choice for 4K/8K Super Hi-Vision.

III Benefits and Challenges of Multipath Video Transmission over Wireless

As previously stated, providing high/optimal QoE for the final user in the wireless video streaming scenario requires high bandwidth and low transmission delay. This is a challenging task, considering the several aspects involved in wireless transmission, such as bandwidth constraints, lossy wireless channels, delay, lack of coverage and congested networks. Adding multipath transmission ability can help with this challenge and its benefits can be summarized as:

  • Reliability and seamless connectivity: using multipath allows the user to simultaneously utilize multiple available network connections. Better coverage is achieved and the probability of keeping an end-to-end connection alive is increased. In the case of failure or congestion in one network path, multipathing provides a resilient alternative, resulting in improved user video experience;

  • Throughput increase: by aggregating bandwidth and distributing video traffic over multiple network paths, faster transmission can be achieved, which is essential for real-time video streaming applications [67];

  • Load balancing: refers to efficiently distributing video traffic through the available network paths in order to relieve congestion [6]. Load balancing improves stability by achieving lower variability and inter-packet delay (jitter);

  • Reduction of burst loss length: burst loss length refers to the number of consecutive packet losses, which is harmful to perceived video streaming quality [5]. This is because the decoder can recover from the loss of a small number of video packets by exploiting correlations in the previously received video sequence to conceal the lost information. However, the effectiveness of this recovery decreases dramatically when a large number of consecutive video packets is lost. Multipath streaming helps convert burst losses into isolated losses, and consequently, the probability of recovering lost packets increases;

  • Delay decrease: using multiple paths contributes to having video data ready at the receiver faster, thus decreasing the effective delay, especially from an application perspective. Probing multiple paths makes it possible to get the data from the lowest-delay path, reducing the Time To First Byte (TTFB), i.e., the time between the video request being sent and the first packet received after the request [68];

  • Security: splitting video over multiple paths improves protection against some security threats [7], since each network path only carries part of the whole video stream.
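The burst loss benefit listed above can be illustrated with a small, idealized simulation. This is a sketch under stated assumptions rather than a model taken from the surveyed works: packets are spread round-robin over the paths, a burst drops a fixed number of consecutive transmissions on one path only, and the other path is lossless. All function names and parameters are hypothetical.

```python
from typing import List


def assign_round_robin(num_packets: int, num_paths: int) -> List[int]:
    """Map packet sequence numbers to paths in round-robin order."""
    return [seq % num_paths for seq in range(num_packets)]


def deliver(num_packets: int, num_paths: int, bad_path: int,
            burst_start: int, burst_len: int) -> List[bool]:
    """Return, per packet, whether it was received.

    The burst drops `burst_len` consecutive transmissions on `bad_path`,
    starting at that path's `burst_start`-th transmission (assumed model).
    """
    received = [True] * num_packets
    sent_on_bad = 0
    for seq, path in enumerate(assign_round_robin(num_packets, num_paths)):
        if path == bad_path:
            if burst_start <= sent_on_bad < burst_start + burst_len:
                received[seq] = False
            sent_on_bad += 1
    return received


def longest_loss_run(received: List[bool]) -> int:
    """Length of the longest run of consecutive losses in playout order."""
    longest = current = 0
    for ok in received:
        current = 0 if ok else current + 1
        longest = max(longest, current)
    return longest


single = deliver(20, num_paths=1, bad_path=0, burst_start=5, burst_len=4)
multi = deliver(20, num_paths=2, bad_path=0, burst_start=5, burst_len=4)
print("single path:", single.count(False), "lost, longest run", longest_loss_run(single))
print("two paths:  ", multi.count(False), "lost, longest run", longest_loss_run(multi))
```

In this toy setting both configurations lose the same number of packets, but with two paths every lost packet is surrounded by received neighbours, which is the situation in which decoder-side concealment works best.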

Fortunately, with recent developments, many current devices [69, 70] are already equipped with both cellular and WiFi interfaces. Devices with two or more interfaces that are able to connect simultaneously to multiple network paths are known as multihomed devices, as illustrated in Figure 3. Multihomed devices can utilize multipath communication by aggregating the available bandwidth from multiple Radio Access Technologies (RATs). With multiple interfaces, users can receive data through parallel paths with multiple IP addresses.

Fig. 3: Multipath wireless video streaming over LTE and WiFi networks.

Despite all of its benefits, attempting to deploy a multipath solution for video delivery can bring the following potential roadblocks from a practical perspective:

Compatibility. Implementing a general multipath solution usually requires changing the server side, the client side, or both, modifying standardized protocols, changing operating system kernels, and/or changing third-party network equipment.

Network heterogeneity. Heterogeneous wireless networks differ in bandwidth constraints, delay, jitter, and packet loss rate. These different physical properties cause asymmetric communication for video transmission and, consequently, may decrease the overall streaming quality. For instance, a large difference between LTE and WiFi bandwidth decreases the bandwidth aggregation performance [71]. Second and third generation (2G and 3G) cellular networks do not provide enough bandwidth to support live video streaming due to its high data rate [72, 73, 74]. In addition, the retransmission mechanism in 3G [74] may increase the Round-Trip Time (RTT) and the rate variability. 4G LTE offers higher data transmission rates and signal coverage than 2G/3G [75, 74]. When comparing 4G to WiFi, relevant differences in terms of bandwidth, packet loss, and round-trip time can be observed [76, 74]. These aspects, in addition to wireless losses being interpreted as congestion by some protocols (e.g., TCP), which decreases network throughput [77], make multipath communication over heterogeneous wireless networks a truly challenging task.

Out-of-order packets. Spreading data over heterogeneous paths with different RTTs, throughput fluctuations, and jitter introduces the out-of-order packet problem. This phenomenon causes unnecessary packet retransmissions, wasted bandwidth, and, consequently, network congestion. In addition, more time is required to recover the ordered data. A robust multipath transmission solution is required to cope with packet reordering in heterogeneous wireless networks [8] to avoid video quality degradation.

Fig. 4: Challenges of employing multipath transmission in wireless video streaming applications and possible adverse effects to be avoided.

Head-of-Line (HOL) blocking. When many packets are stored in the destination buffer waiting for delayed packets, the buffer may become full and blocked. This issue is referred to as Head-of-Line (HOL) blocking [78, 79]. Generally, buffer blocking occurs with reliable protocols that guarantee in-order packet delivery, such as TCP, and it may become worse in the case of multipath delivery. The bufferbloat phenomenon is the main cause of HOL blocking, contributing to high latency, especially in 3G/4G cellular networks [80, 79]. Bufferbloat occurs because of significantly large network buffers (e.g., large router queues) that avoid packet loss at the cost of adding high latencies under congestion. The problem can become worse in the case of multipath delivery because, if bufferbloat occurs in one of the paths, those packets arrive at the destination with high delay and out of order, resulting in HOL blocking. Consequently, HOL blocking not only increases End-to-End delay and jitter, but successfully arrived packets may also become obsolete (i.e., discarded) due to the long waiting time in the destination buffer.
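The effect can be reproduced with a toy receiver model. The sketch below is illustrative only and assumes strict in-order delivery, no losses, and an unbounded receive buffer; the path delays, the 10 ms sending interval, and the function name are hypothetical. Under these assumptions, the time at which a packet can be handed to the application is the running maximum of the arrival times of all packets up to and including it.

```python
from typing import List


def in_order_release_times(arrival_ms: List[float]) -> List[float]:
    """arrival_ms[i] is the arrival time of packet i (by sequence number).

    With strict in-order delivery, packet i is released only after every
    packet with a lower sequence number has arrived.
    """
    release = []
    latest = 0.0
    for t in arrival_ms:
        latest = max(latest, t)
        release.append(latest)
    return release


# Packets are sent every 10 ms, alternating between a fast path
# (20 ms one-way delay) and a slow, bufferbloated path (120 ms).
one_way_delay_ms = [20, 120]
arrival = [i * 10 + one_way_delay_ms[i % 2] for i in range(6)]
release = in_order_release_times(arrival)
hol_wait = [r - a for r, a in zip(release, arrival)]

print("arrival :", arrival)
print("release :", release)
print("HOL wait:", hol_wait)
```

Packets sent on the fast path arrive early but sit in the receive buffer until the preceding packet from the slow path shows up, so the application effectively experiences the delay of the slower path.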

End-to-End delay. Real-time video streaming requires a bounded End-to-End delay [81], which refers to the measured delay from the generation of a video frame to the moment when it can be decoded. End-to-End delay includes the holding time of a video frame at both sender and receiver sides, and the transmission delay. It can also include the queuing delay, propagation delay, access delay, and reordering delay. The queuing delay refers to packet buffering in the sender, receiver, and other nodes in the network during packet transmission. The transmission delay and radio access delay occur in the physical transmitter when mapping the data from packets to bits on the physical radio interface hardware. The distance between entities causes the propagation delay. The access points introduce transfer and propagation delay. In the case of video streaming over multipath networks, the reordering delay can be increased.


Overdue packets. Video data packets that arrive at the destination after their decoding deadlines have expired are known as overdue packets. While overdue packets in UDP-like transmissions may cause video distortions (i.e., degradation of the visual video fidelity [17]) similar to lost packets, in reliable transport protocols like TCP the effects surface as stalling (i.e., video freezes) or rebuffering. Avoiding stalls becomes most critical in live streaming scenarios. Thus, this kind of real-time application, even when based on TCP-like solutions, treats overdue packets as lost packets, since they are discarded. This concept is called liveness [17]. Therefore, suitable multipath streaming strategies need to consider the potential decoding deadlines of the receivers.
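As a simple illustration of how deadlines can enter a multipath strategy, the sketch below classifies received packets against their decoding deadlines and adds a sender-side check that avoids a path on which a packet would likely arrive overdue. It is a minimal sketch, not a scheme from the surveyed works; the function names and the assumption that a one-way delay estimate is available per path are hypothetical.

```python
from typing import Optional


def classify_packet(arrival_ms: Optional[float], deadline_ms: float) -> str:
    """Classify one packet relative to its decoding deadline.

    `arrival_ms` is None when the packet never arrived.
    """
    if arrival_ms is None:
        return "lost"
    return "on_time" if arrival_ms <= deadline_ms else "overdue"


def usable_for_live(arrival_ms: Optional[float], deadline_ms: float) -> bool:
    """Liveness: an overdue packet is discarded, i.e. treated as lost."""
    return classify_packet(arrival_ms, deadline_ms) == "on_time"


def worth_sending(now_ms: float, est_one_way_delay_ms: float,
                  deadline_ms: float) -> bool:
    """Sender-side check (assumes a per-path delay estimate exists)."""
    return now_ms + est_one_way_delay_ms <= deadline_ms


packets = [(90.0, 100.0), (130.0, 100.0), (None, 100.0)]
print([classify_packet(a, d) for a, d in packets])
print(worth_sending(now_ms=40, est_one_way_delay_ms=50, deadline_ms=100))
print(worth_sending(now_ms=40, est_one_way_delay_ms=80, deadline_ms=100))
```

Skipping transmissions that cannot meet their deadline saves bandwidth that would otherwise be spent on data the receiver discards anyway.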

Wrapping up the Challenges. Figure 4 aims at putting together all key issues and possible adverse effects of multipath wireless video streaming. In the design of any multipath wireless video streaming solution, it is important to avoid or at least minimize such effects. In other words, optimization of QoS-related parameters leads to improved QoE [83]. QoS requirements may differ based on the type of video streaming service [84, 17], such as VoD, live, or real-time. VoD is a video streaming service in which encoded media is pre-stored at the server, and the user can select and watch it at any time (e.g., Netflix movies). In contrast, in live and real-time video streaming services (e.g., live sport streaming, real-time video including interactive video calls, gaming, etc.) the video content is not pre-stored/available when the streaming starts. In live streaming, the buffer is smaller compared to VoD streaming to avoid long delays, and deadlines are stricter. Real-time video streaming has an even shorter delay constraint. For example, according to [85] and [86], a large delay of 5 seconds may be acceptable for VoD and a delay of around 1 second is acceptable for live streaming, but in order to achieve excellent real-time streaming quality, the End-to-End delay should not exceed 150 ms. Besides, packet loss rates higher than 1% are not acceptable for live video streaming solutions [87, 88]. In some applications with high scene variability, such as football, it was reported [89] that subjects already become uncomfortable at packet loss rates slightly above 0.3%. Finally, meeting all QoS requirements does not necessarily guarantee high(est) user QoE. The device's operating system, hardware, and battery, operator pricing, lighting, people around the user, and emotion are some examples of factors that impact the users' experience [9, 83].
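The delay and loss figures quoted above can be collected into a simple per-service check. This is only a sketch: the thresholds are the indicative values cited in the text (5 s for VoD, 1 s for live, 150 ms for real-time, and 1% loss for live), while the dictionary layout, the service labels, and the function name are assumptions made for illustration. No loss threshold is encoded for VoD or real-time services because the text does not state one.

```python
# Indicative End-to-End delay bounds per service type, in milliseconds.
MAX_DELAY_MS = {"vod": 5000, "live": 1000, "real_time": 150}

# Loss-rate bound; the text only gives a value for live streaming.
MAX_LOSS_RATE = {"live": 0.01}


def meets_qos(service: str, delay_ms: float, loss_rate: float) -> bool:
    """Return True if the measured delay and loss fit the indicative bounds."""
    if service not in MAX_DELAY_MS:
        raise ValueError(f"unknown service type: {service}")
    if delay_ms > MAX_DELAY_MS[service]:
        return False
    limit = MAX_LOSS_RATE.get(service)
    return limit is None or loss_rate <= limit


print(meets_qos("vod", delay_ms=3000, loss_rate=0.02))
print(meets_qos("live", delay_ms=800, loss_rate=0.02))
print(meets_qos("real_time", delay_ms=200, loss_rate=0.0))
```

As noted in the text, passing such a check is necessary rather than sufficient: meeting QoS bounds does not by itself guarantee a high QoE.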

IV Layer-based survey of wireless multipath video streaming approaches

In this section, the surveyed multipath wireless video streaming works are first introduced and classified in Table III according to protocol stack layers and protocols/features. The table also indicates which parts of the system (client, server, network or a combination of them) need to be modified in order to become compatible with multipath transmission schemes. The most flexible solutions require only client side modifications, because they are compatible with the current network infrastructure and need no change on the server or in the network. Other approaches require server side modifications, or modifications on both server and client. The solutions that are hardest to deploy are those that require changes to the network infrastructure.

IV-A Application Layer Approaches

Video streaming approaches focused on the application layer have the advantage of accessing the player buffer status and relevant video content information, such as frame priorities and coding dependencies. Application-specific information provides the multipath approach with richer inputs to define the video streaming scheduling strategies. One key advantage is that there is no need to change lower layer protocols. However, a big drawback of these solutions is that they commonly require modifications of the video software. In application layer approaches, an application level sequence number is generally used for loss detection, which often increases the overall protocol overhead. In addition, in order to make informed packet scheduling decisions, the application requires a mechanism to estimate the performance of the network paths, e.g., through application-specific probes or from TCP congestion control information [8].

In this subsection, we discuss relevant works that are based on RTP, DASH, MMT and other adaptive streaming approaches. All of these protocols were previously introduced in Section II and will be further detailed here. Figure 5 illustrates the protocol stack position of these protocols and Table III presents each category.

Fig. 5: Application layer protocols position in a network protocol stack.
Protocol layer Applied protocols/features Works Compatibility
MRTP [1] Server and Client
RTP MPRTP [2] Server and Client
MPRTP-AR [90] Server and Client
Xing et al. [91] Client
Chowrikoppalu et al. [92] Client
DASH RTRA [3] Client
Houzé et al. [93] Client
MMT Kolan et al. [94] Server and Client
Afzal et al. [95] Server and Client
Sohn et al. [96] Server and Client
Evensen et al. [97] Client
Evensen et al. [98] Client
Evensen et al. [99] Client
Application Layer Other Adaptive Streaming Approaches GreenBag [71] Client
BEMA*  [86] Server and Client
Freris et al. [100] Server and Client
Multipath UDP Correia et al. [101] Server and Client
Multipath TCP MPLOT [4] Server and Client
Multipath DCCP MP-DCCP [102] Server
ADMIT [103] Server and Client
MPTCP-SD [104] Server
MPTCP MPTCP-PR [104] Client
Xu et al. [105] Server and Client
PR-MPTCP [106] Server
Kelly et al. [107] Not defined
Okamoto et al. [108] Server
SRMT [109] Server
PR-SCTP [110] Server
CMT-QA [76] Server and Client
CMT-DA [111] Server and Client
Transport Layer SCTP and CMT (extension of SCTP) CMT-CA [112] Server and Client
Yap et al. [67] Server (depends on the application), Client and Network
SDN MARS [81] Network
Network Layer Proxy BAG [113] Client and Network
Corbillon et al. [78] Server
Ojanperä et al. [114] Server, Client and Network
GALTON [115] Server and Client
Application Layer Decision FRA-JSCC [116] Server and Client
MP-DASH [117] Server and Client
Nam et al. [118] Server (depends on the application), Client and Network
Cross Layer Transport Layer Decision CMT-CL/FD [119] Server
*BEMA: UDP (for video data transmission) and TCP (for connection establishment and feedback information).

IV-A1 RTP

The Real-time Transport Protocol (RTP) was first published in 1992 [29] and then updated in RFC 1889 [120], later obsoleted by RFC 3550 [121]. A more recent update to the specification is RFC 8108 [122], published in 2017. RTP is an application layer transport protocol that provides end-to-end network transport functions to support live, on-demand, and interactive multimedia applications. Next, we highlight more properties of the protocol, and then we survey multipath works based on RTP.

RTP Properties. Although RTP is designed to run over UDP, it can also carry data over other transport protocols such as TCP or SCTP. Another property of RTP is that it can be used in conjunction with the RTP Control Protocol (RTCP) to periodically send monitoring information and QoS parameters. RTP can also be used in conjunction with other protocols, such as the Real-time Streaming Protocol (RTSP) [123], which is used to control multimedia playback. A major problem of RTP running over UDP is that it lacks congestion control and is therefore unfair to competing flows. There is also no guarantee of reliable delivery, so a method is needed to protect high priority frames (I-frames). Furthermore, a challenge in extending RTP to support multipath streaming is that RTP operates at the media session level, with receiver reports sent per media (video or audio) flow [2].

Multipath support. Multiflow Real-Time Transport Protocol (MRTP) [1], Multipath RTP (MPRTP) [2] and Multipath Real-Time Transport Protocol Based on Application-Level Relay (MPRTP-AR) [90] improved RTP to support multipath video streaming.

MRTP and MPRTP are Constant Bit Rate (CBR) approaches. Since RTP lacks congestion control, a considerable receiver buffer is required to compensate for the different path latencies of RTP streams when playing a CBR video [7]. Both MRTP and MPRTP use QoS reports (e.g., sender reports and receiver reports), similar to RTCP reporting in RTP, to carry periodic per-flow and per-session statistics. In MRTP, the time interval between reports is set by the application and can be adapted by the receiver based on network conditions.

The goal of the Multiflow Real-time Transport Protocol (MRTP) [1] is to cope with failures and congestion in mobile wireless ad hoc networks. Its authors claim that the approach is also applicable to the Internet. MRTP is used in conjunction with the Multi-flow Realtime Transport Control Protocol (MRTCP). In essence, MRTP/MRTCP is an extension of RTP/RTCP that supports media delivery over multiple wireless paths. Unlike RTP, MRTP is a session-oriented protocol. Therefore, MRTCP establishes the session with a three-way handshake to exchange information (e.g., available paths). Data transmission can run over UDP, TCP or SCTP, and during transmission it is possible to add or remove paths based on the QoS reports. In particular, the media is divided into flows, and each flow is assigned to one path (in MRTP, the term flow denotes the series of video packets transmitted through an individual path). MRTCP manages flows by using ADD/DELETE acknowledgments (ACKs) for flows.

QoS reports are transferred through the best path or over multiple paths to guarantee reliable delivery. These reports help the sender to adapt to transmission errors, for example, by adding redundancy to increase error resilience and by assigning data to more suitable paths. A reassembly buffer at the receiver side compensates for jitter and reorders and reassembles packets using the session ID, flow ID and flow sequence number.

MRTP uses a retransmission mechanism to cope with the unreliability of UDP/IP. The retransmission timeout is derived from the RTT, and the maximum number of retransmissions is set by the application. Different error control schemes, including Forward Error Correction (FEC), Multiple Description Coding (MDC) or Automatic Repeat reQuest (ARQ), can be incorporated into MRTP. Finally, the results of the surveyed work show that MRTP outperforms single path RTP in terms of received video quality.

In MRTP, it is possible to choose the data distribution method. For example, it could be simple Round Robin, striping (over multiple servers), layered coding, multiple description coding or object-oriented coding (in which video or audio objects are encoded individually).

Multipath RTP (MPRTP) [2] is an RTP extension that adds multipath transmission for real-time media. The goal of MPRTP is to minimize latency. Initially, the scheduler distributes an equal traffic rate to each path; after gathering information about the path characteristics, it recalculates the data distribution for each path. It uses RTCP to monitor and convey control information (e.g., jitter and packet loss). Paths are categorized as congested, mildly congested or non-congested based on the packet loss information. The scheduler, which is responsible for packet distribution over the different paths, assigns more media data to non-congested paths and less to congested ones. I-frames have the highest priority and are transferred over the path with the highest bandwidth and the lowest delay and packet loss. The receiver requests retransmissions with NACKs, and the sender retransmits packets on the path with the highest bandwidth and the lowest delay and packet loss.

The approach is not integrated with congestion control but tries to balance the load by using network characteristics. The authors developed a de-jitter algorithm with an adaptive playback buffer at the receiver side to overcome RTT variation and packet reordering. An MPRTP sender assigns a subflow ID to each path (in MPRTP, the term subflow denotes the series of video packets transmitted over a single path) and subflow-specific sequence numbers, which are used to determine subflow-related packet jitter, packet loss and packet discards at the receiver side. Because it aims to balance the load and spread data over the paths, the approach is less unfair to competing traffic than RTP.
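The loss-based path classification and rate redistribution described for MPRTP can be sketched as follows. This is not the algorithm of [2]: the loss thresholds, the per-class weights and all function names are hypothetical values chosen only to illustrate the idea that non-congested paths receive a larger share of the media rate.

```python
# Hypothetical loss-rate thresholds and per-class weights (assumptions for
# illustration; the surveyed work defines its own values and update rules).
MILD_LOSS = 0.01
CONGESTED_LOSS = 0.05
WEIGHTS = {"non-congested": 1.0, "mildly congested": 0.5, "congested": 0.25}


def classify_path(loss_rate: float) -> str:
    """Map a reported packet loss rate to one of the three path classes."""
    if loss_rate >= CONGESTED_LOSS:
        return "congested"
    if loss_rate >= MILD_LOSS:
        return "mildly congested"
    return "non-congested"


def initial_split(total_rate_kbps: float, paths: list[str]) -> dict[str, float]:
    """Before any report arrives, every path gets an equal share."""
    return {name: total_rate_kbps / len(paths) for name in paths}


def rebalance(total_rate_kbps: float,
              reports: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Redistribute the media rate from (bandwidth_kbps, loss_rate) reports."""
    shares = {name: bw * WEIGHTS[classify_path(loss)]
              for name, (bw, loss) in reports.items()}
    total = sum(shares.values())
    if total == 0:  # no usable bandwidth estimate: fall back to an equal split
        return initial_split(total_rate_kbps, list(reports))
    return {name: total_rate_kbps * share / total
            for name, share in shares.items()}


if __name__ == "__main__":
    print(initial_split(1000, ["path_a", "path_b"]))
    print(rebalance(1000, {"path_a": (1000, 0.0), "path_b": (1000, 0.10)}))
```

With two paths of equal bandwidth, the path reporting 10% loss is classified as congested and ends up with a quarter of the weight of the loss-free path.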

Recently, the Multipath Real-Time Transport Protocol Based on Application-Level Relay (MPRTP-AR) [90] was defined by the IETF. As shown in Figure 6, the proposed MPRTP-AR protocol stack has two sub-layers: the RTP sub-layer and the multipath transport control (MPTC) sub-layer. The RTP sub-layer makes the protocol fully compatible with existing RTP applications. Therefore, there is no need to change the Application Programming Interface (API). The MPTC sub-layer is responsible for functions such as flow partitioning, subflow packaging and recombination, and subflow reporting.

At the sender side, data from the application layer are formatted in RTP packets which are sent to the MPTC sub-layer. Then, MPTC formats them into MPRTP-AR data packets. At the receiver side, MPTC extracts the fixed header of MPRTP-AR data packets and sends them to the RTP sub-layer. RTCP packets could also be generated by the RTP sub-layer for generating media transport statistics. RTCP data could be packaged in MPRTP-AR data packets which would be distributed over multiple paths by MPTC sub-layer.

In addition to MPRTP-AR data packets, MPRTP-AR control packets are defined to carry keep-alive packets and MPRTP-AR reports. MPRTP-AR reports (the MPRTP-AR Subflow Receiver Report (SRR) and the MPRTP-AR Flow Recombination Report (FRR)) contain the transport quality of the active paths (e.g., packet loss rate and jitter) and affect scheduling and flow partitioning. Flow partitioning methods are categorized into two groups: coding-aware methods and coding-unaware methods. Coding-aware methods are used for layered coding, multiple description coding or object-oriented coding, and reside in the RTP sub-layer. In these methods, each coding flow is assigned to a subflow, or several coding flows are multiplexed into one subflow. Coding-unaware methods reside in the MPTC sub-layer, where the RTP/RTCP packets passed down from the upper layer are spread over the active paths according to the quality of each path. Flow reporting is also optionally available for the whole recombined flow.

Fig. 6: MPRTP-AR protocol stack (source: adapted from  [90]).

IV-A2 DASH

Dynamic Adaptive Streaming over HTTP (MPEG-DASH) [51] was standardized in 2011 by MPEG. DASH supports both VoD and live video delivery. We first detail the DASH system and its main performance limitations. Then, we explain rate adaptation methods. Finally, we discuss the works that are based on this protocol.

DASH system. As explained in Section II, DASH shares the background technology of HTTP adaptive streaming, and its system is shown in Figure 7. In a DASH system, representations are fragmented into small segments at the server side. The characteristics of the DASH components (text, video, audio, etc.) are described in an XML document named the Media Presentation Description (MPD). Typically, the DASH client is responsible for choosing the next media segments and requesting the related HTTP URLs. Therefore, a rate adaptation method, named adaptation engine in Figure 7, is required to select the proper segment bit rate by considering the segment availability indicated by the MPD, the network conditions and the media playout situation (e.g., playout buffer level) [41].

Fig. 7: DASH system (source: adapted from [41]).

Performance limitations. The rate adaptation method is responsible for key issues that influence QoE, namely, startup delay, stalls, and video quality switches. Startup delay refers to the time from when the client requests a video until it starts to play, which is spent on pre-buffering. This delay occurs because, generally, one or more segments have to be downloaded completely before the video starts to play. Although this delay helps to prevent stalls under poor network conditions, studies show that it often causes users to abandon the video [124]. It is important to note that while VoD streaming applications can pre-buffer a few seconds of video, live and interactive video streaming can only pre-buffer a few hundred milliseconds of video [2].

A stall or interruption refers to a pause during video playback that occurs because the playback buffer has emptied and the player needs to wait to buffer more video (also called re-buffering) [125]. Studies show that stalling happens in 40-70% of all sessions [43]. Generally, this issue occurs because of insufficient bandwidth. In DASH, each segment becomes available for transmission only after it has been completely encoded. In addition, there is a dead time between receiving the last packet of one segment and requesting the next segment. For example, this processing time together with the transmission time over TCP adds up to at least 3 s when segments are 1 second long [126].

One approach to mitigate the stall problem is to use a dynamic method to find a reasonable segment size (segment duration). Studies in [47] and [127] show that segment size has a strong effect on live latency. With shorter segments, latency decreases significantly, but the number of HTTP requests increases and, consequently, the available bandwidth decreases [91].

Another approach to decrease latency, and consequently alleviate the stall issue, is subsegment transmission. In this approach, each segment is divided into several subsegments, and the receiver fetches subsegments before the encoding of the whole segment has finished [128]. This approach can be improved by sending subsegments over more than one link simultaneously, that is, by adding multipath transmission capability, to fetch segments faster. Multipath transmission is used in a few works [93, 91, 3] to achieve this goal. However, subsegment transmission also increases the HTTP request overhead [91]. In particular, the overhead arises because, after each request, the client has to wait on average one RTT to receive the response from the server. In the case of a large file with small segments/subsegments, this overhead causes a high latency. HTTP pipelining [129] is a technique to decrease both the number and the length of these idle times. In this technique, the client sends the next subsegment request before completing the download of the current subsegment. However, with pipelining, the server has to return the responses in the same order in which the requests arrived. Therefore, if one request takes a long time to process, the other requests are blocked. For this reason, pipelining is not widely adopted.
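The request overhead discussed above can be made concrete with a simple back-of-the-envelope model. The sketch below is an idealized assumption, not a result from the surveyed works: it ignores TCP slow start, server processing time and header sizes, and it assumes that each non-pipelined request costs one idle RTT while ideal pipelining leaves only the first RTT exposed. The function names are hypothetical.

```python
def sequential_fetch_time(n_subsegments: int, subsegment_bits: float,
                          bandwidth_bps: float, rtt_s: float) -> float:
    """Each request waits one RTT before its payload starts to arrive."""
    return n_subsegments * (rtt_s + subsegment_bits / bandwidth_bps)


def pipelined_fetch_time(n_subsegments: int, subsegment_bits: float,
                         bandwidth_bps: float, rtt_s: float) -> float:
    """Ideal pipelining: only the first request exposes an idle RTT."""
    return rtt_s + n_subsegments * subsegment_bits / bandwidth_bps


if __name__ == "__main__":
    # Ten subsegments of 1 Mbit each over a 10 Mbit/s link with a 50 ms RTT.
    args = (10, 1e6, 10e6, 0.05)
    print(f"sequential: {sequential_fetch_time(*args):.2f} s")
    print(f"pipelined:  {pipelined_fetch_time(*args):.2f} s")
```

Under this model the idle time grows linearly with the number of requests, which is why very small subsegments can cancel the latency gain they were meant to provide.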

To mitigate the overhead problem, the new version of HTTP (HTTP/2) could be used due to its ability to push content in advance, and consequently reduce live latency and network traffic [47].

Switching between different video quality representations is another problem that affects the perceived video quality and annoys viewers. Quality switches happen because of changes in network bandwidth or in buffer occupancy. Therefore, it is important to adopt a suitable rate adaptation method that can identify the available network resources and congestion in time, in order to provide an optimal user experience [114].

Rate adaptation methods. Typically, rate adaptation methods use throughput monitoring, receiver buffer status, or power level in the process of video segment bit rate decision [17].

Throughput-based methods estimate the available bandwidth by monitoring throughput. This type of method tries to avoid re-buffering while providing the highest possible video quality. With a throughput-based method, the video quality is unstable [130] due to throughput variations, which can be caused by TCP behavior [131, 114]. For example, throughput measured over TCP underestimates the bandwidth when segments are small (because the congestion window does not have time to grow), and bandwidth prediction is weak in networks with fast throughput variation such as cellular networks [132].
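A minimal sketch of a throughput-based selection rule is given below. It is a generic illustration rather than the method of any surveyed work: using the harmonic mean of recent samples as the estimator and a safety factor of 0.9 are assumptions, and the function name is hypothetical.

```python
from statistics import harmonic_mean


def select_bitrate_throughput(samples_kbps: list[float],
                              ladder_kbps: list[int],
                              safety: float = 0.9) -> int:
    """Pick the highest representation below a conservative throughput estimate.

    samples_kbps: throughput measured for the most recent segment downloads.
    ladder_kbps:  bit rates of the representations advertised in the MPD.
    safety:       assumed margin that keeps the choice below the estimate.
    """
    estimate = harmonic_mean(samples_kbps)
    candidates = [b for b in sorted(ladder_kbps) if b <= safety * estimate]
    # Fall back to the lowest representation when nothing fits the estimate.
    return candidates[-1] if candidates else min(ladder_kbps)


if __name__ == "__main__":
    ladder = [500, 1000, 2500]
    print(select_bitrate_throughput([2000, 2000, 2000], ladder))
    print(select_bitrate_throughput([400], ladder))
```

Because the decision follows the estimate directly, a fluctuating throughput translates into frequent representation changes, which is the instability noted above.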

Buffer-based methods choose the video segment bit rate based on the buffer characteristics and usage. The proposed algorithms try to provide a smooth video streaming, but often result in sudden changes in video quality, or freezing when the buffer level (number of unplayed segments in the queue) drops to zero [130, 133].
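For comparison, a buffer-based rule can be sketched as a mapping from the buffer level to a bit rate. The linear mapping and the two thresholds used here (a lower "reservoir" and an upper "cushion", both in seconds of buffered video) are assumptions for illustration and do not reproduce a specific algorithm from [130, 133].

```python
def select_bitrate_buffer(buffer_s: float, ladder_kbps: list[int],
                          reservoir_s: float = 5.0,
                          cushion_s: float = 20.0) -> int:
    """Map the playout buffer level to a representation bit rate.

    Below reservoir_s the lowest bit rate is chosen, above cushion_s the
    highest one, and in between the target grows linearly with the buffer.
    """
    ladder = sorted(ladder_kbps)
    if buffer_s <= reservoir_s:
        return ladder[0]
    if buffer_s >= cushion_s:
        return ladder[-1]
    fraction = (buffer_s - reservoir_s) / (cushion_s - reservoir_s)
    target = ladder[0] + fraction * (ladder[-1] - ladder[0])
    # Choose the highest representation that does not exceed the target.
    return max(b for b in ladder if b <= target)


if __name__ == "__main__":
    ladder = [500, 1000, 2500]
    for level in (2.0, 12.5, 25.0):
        print(level, select_bitrate_buffer(level, ladder))
```

Such a rule ignores throughput entirely, so a sudden bandwidth drop is only noticed once the buffer has already drained, which is consistent with the abrupt quality changes reported for buffer-based methods.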

Power-based methods select the video segment bit rate using the battery level. According to [134], video streaming consumes twice the energy of playing the same content offline. Therefore, power-based methods try to maximize the battery lifetime during a video streaming session.

To make bit rate selection more accurate, the work in [135] shifted the adaptation logic to the server side by deploying a mirrored client buffer on the server. Wilk et al. [136] use a proxy server, while Mao et al. [137] leverage Pensieve's neural network model on an ABR server to enforce or assist the mobile client adaptation. Such server side approaches achieve better network utilization compared to client side approaches [17]. However, server side approaches are not considered scalable. Rate adaptation methods also perform more efficiently if they can access network information [138]. For example, SDN is a technology that can be used to implement such a mechanism [139, 140]. Another example is Server and Network-assisted DASH (SAND) [141, 142], a system recently standardized by MPEG to collect and propagate network information for DASH bit rate adaptation decisions. The architecture proposed in [114] is built upon the Distributed Decision Engine (DDE) [143] framework to provide more network information (e.g., available capacity, load, QoS) for better rate adaptation decisions in multipath scenarios.

Works | Year | Separate TCP | MPTCP
Xing et al. [91] | 2012 | Y | N
Chowrikoppalu et al. [92] | 2013 | N | Y
RTRA [3] | 2014 | Y | N
Houzé et al. [93] | 2016 | Y | N
Corbillon et al. [78] | 2016 | N | Y
Ojanperä et al. [114] | 2016 | N | Y
MP-DASH [117] | 2016 | N | Y
Nam et al. [118] | 2016 | N | Y

Multipath support. The current DASH version lacks multipath support, although multipath delivery is being promoted as a future direction. Table IV presents some efforts to integrate DASH with multiple separate TCP connections (e.g., [91, 3, 93]) and with MPTCP (e.g., [78], [114], MP-DASH [117], [118] and [92]). The table shows that interest in combining DASH with MPTCP has increased recently. This is because MPTCP aggregates bandwidth and supports mobility. MPTCP is also middlebox friendly, and it is supported by the Linux kernel. Besides, MPTCP has received considerable attention in industry [8], [144]. More technical details about MPTCP will be provided in Section IV-B4.

James et al. [43] explored the question "Whether MPTCP would always benefit mobile video streaming?". This research analyzed the performance of DASH over MPTCP in different scenarios. The results show that having two paths with stable bandwidths is beneficial even when the secondary path has a small bandwidth capacity. An additional link also has a positive impact when the primary path has high bandwidth variability. However, there are some harmful cases too. Examples include adding an unstable secondary path to a stable primary path, and adding a secondary path whose bandwidth is insufficient to reach a higher video bit rate. MPTCP is thus highly sensitive to bandwidth fluctuation. The results also show that unnecessary use of multipath increases energy consumption, wastes resources, or increases the cost of quality switches.

One important question when providing multipath delivery for DASH is whether the client or the server is responsible for packet scheduling decisions. In all the surveyed works in Table IV that spread data over separate TCP connections, the client is responsible for choosing the proper path and fetching the suitable segments/subsegments over that path, because the DASH logic is on the client side. However, integrating DASH with MPTCP is challenging because the DASH logic resides on the client side while the MPTCP scheduler is on the server side. Besides, MPTCP is transparent to the application layer. Therefore, in the surveyed works of Table IV in which MPTCP is used as the transport protocol, the rate adaptation logic is kept at the client side, while the scheduling decisions related to selecting packets and distributing them over the paths are placed at the server side or at both the client and server sides. The surveyed works [78], [114], MP-DASH [117] and [118] are more closely related to the cross layer protocol stack, so we discuss them, with details about their scheduling strategies, in Section IV-D. The other works are explained in more detail below.

Xing et al. used a Markov Decision Process (MDP) to formulate the video streaming process as a reinforcement learning task in their works [91] and [3] for non-scalable video coding and Scalable Video Coding (SVC), respectively. The goals of these works are to decrease startup delay, improve video quality and achieve better smoothness. In each of these works, the implemented rate adaptation method selects the next segment based on the current queue length and the estimated available bandwidth. To estimate the available bandwidth accurately, a Markov channel model is used. In this way, the adaptation logic finds the transition probabilities of each link in real time and determines the best action (e.g., using both links, using only the WiFi link, making the client wait, or smoothing). A reward function is also implemented to reward each action with respect to the video QoS requirements (by measuring interruption rate, video quality, video smoothness and search time cost). However, the major problem of using an MDP is the high computational cost of solving the complex optimization problem, especially for online operation and highly mobile users [145]. In addition, the approach is not a content-aware solution.
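To illustrate what estimating the transition probabilities of a Markov channel model can look like, the sketch below counts transitions between discretized bandwidth states in an observed trace. It is only an assumption about one simple way to obtain such probabilities; the state labels and the function name are hypothetical, and the MDP formulation and reward function of [91] and [3] are not reproduced here.

```python
from collections import Counter


def estimate_transitions(states: list[str]) -> dict[tuple[str, str], float]:
    """Estimate P(next state | current state) from an observed state trace.

    states: sequence of discretized bandwidth states of one link, e.g. the
            labels "good" and "bad" obtained by thresholding measurements.
    """
    pair_counts = Counter(zip(states, states[1:]))  # (current, next) pairs
    from_counts = Counter(states[:-1])              # visits with a successor
    return {(cur, nxt): count / from_counts[cur]
            for (cur, nxt), count in pair_counts.items()}


if __name__ == "__main__":
    trace = ["good", "good", "bad", "good"]
    for (cur, nxt), p in sorted(estimate_transitions(trace).items()):
        print(f"P({nxt} | {cur}) = {p:.2f}")
```

An adaptation logic could refresh these estimates as new measurements arrive and use them to predict the bandwidth available for the next segment.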

Chowrikoppalu et al. [92] modified the DASH protocol in order to exploit multipath capability. In this work, the adaptation logic is fed by a proposed bandwidth estimation algorithm and some proposed parameters, including path stability, total path stability and buffer level. The bandwidth estimation algorithm is based on sniffing packets at the interface level. Path stability and total path stability are defined to capture the variation of bandwidth on each path and on the MPTCP connection, respectively. However, the main problem of this approach is that it has no access to the video content information.

Houzé et al. [93] implemented a video player that exploits multipath capability over multiple TCP connections. The goal of this scheme is to achieve low latency (below 100 ms) in DASH video delivery. In this approach, the server encodes the frames of each representation and puts them in the related segment every x ms (x depends on the frame rate; for example, x is 40 ms for 25 fps). The client has to fetch each whole frame before its deadline (the play time of the frame) and within x ms, before a new frame becomes available to fetch. To this end, the authors use video delivery over multiple paths as a way to reduce latency. Each frame is divided into byte ranges that are transferred over different paths, and the approach has a mechanism to find the best byte range sizes so that the ranges are received with a small inter-arrival time. The larger byte ranges are transferred over faster paths; in this way, the variation in transfer delay decreases and, consequently, the HOL problem is mitigated. Besides, another adapted mechanism is proposed to select the proper representation. In this mechanism, when a segment starts, the biggest frame of each representation is considered in making the decision. The biggest frame is commonly the first frame of each representation (the I-frame). Therefore, the selected representation is one whose biggest frame has a high probability of reaching the destination on time. The problem, however, is that the approach does not consider the video content information. In addition, while the work uses RTT to estimate the speed of each path, the scheduling strategy for managing the paths needs to be improved.
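The idea of giving larger byte ranges to faster paths can be sketched by splitting a frame in proportion to the estimated path rates, so that all ranges would finish at roughly the same time. The proportional rule, the path names and the function name are assumptions made for this example; the actual sizing mechanism of [93] may differ.

```python
def split_frame(frame_bytes: int,
                path_rates_bps: dict[str, float]) -> dict[str, tuple[int, int]]:
    """Split a frame into contiguous byte ranges proportional to path rates.

    Returns half-open ranges [start, end) per path; the ranges cover the
    whole frame without overlap.
    """
    total_rate = sum(path_rates_bps.values())
    ranges: dict[str, tuple[int, int]] = {}
    start = 0
    cumulative_rate = 0.0
    for name, rate in path_rates_bps.items():
        cumulative_rate += rate
        # Rounding the cumulative boundary keeps ranges ordered and bounded.
        end = round(frame_bytes * cumulative_rate / total_rate)
        ranges[name] = (start, end)
        start = end
    return ranges


if __name__ == "__main__":
    # Hypothetical 30 kB frame, WiFi estimated at 20 Mbit/s, LTE at 10 Mbit/s.
    print(split_frame(30_000, {"wifi": 20_000_000, "lte": 10_000_000}))
```

Each range could then be requested with an HTTP Range header on the connection bound to the corresponding interface.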

IV-A3 MMT

MPEG Media Transport (MMT) [56] was standardized by MPEG in 2014. MMT is a part of the ISO/IEC 23008 High Efficiency Coding and Media Delivery in Heterogeneous Environments (MPEG-H) standard [30]. This application layer transport protocol supports VoD and live video streaming. MMT has been widely used for Virtual Reality (VR) and Augmented Reality (AR) technologies, three-dimensional (3-D) scene communication, Multi-View Video (MVV), and for major advances in televisual technology worldwide [146, 147]. We previously explained some of the properties and behaviors of MMT in Section II. Here, we first present more properties of the protocol. Then, we explain the related technologies, MMT functions, and data transmission details. Finally, the surveyed works that are based on the MMT protocol are discussed.

MMT properties. MMT can be used for unidirectional, bidirectional, unicast, multicast, multisource and even multipath media delivery. Besides, MMT supports both broadcast and broadband video streaming [148], [149]. It also supports traditional IPTV broadcasting services and all-Internet Protocol (All-IP) networks.

The capability for hybrid media delivery is one of the most important properties of MMT. Hybrid media delivery [150] refers to the combination of media components delivered over different types of network. For example, it could combine one broadcast channel and one broadband channel, or two broadband channels. MMT has different hybrid service scenarios that are classified into three groups by ISO/IEC 23008-13 [151]: live and non-live, presentation and decoding, and same/different transport schemes. The first group, live and non-live, refers to the combination of live streaming components or the combination of live components with pre-stored components. The second group, presentation and decoding, is the combination of stream components for synchronized presentation or synchronized decoding. The third group, same transport schemes and different transport schemes, supports the combination of MMT components only, or of MMT components with components in other formats (e.g., MPEG-2 TS). An instance of a hybrid model comprising MMT (as a broadcast channel) and DASH (as a broadband channel) over heterogeneous networks is also presented in ISO/IEC 23008-13 [151].

The work in [152] compared two MMT broadband systems: a combination of MMT with HTTP versus a combination of MMT with Quick UDP Internet Connection (QUIC) [153]. QUIC is a transport protocol on top of UDP for broadband systems, which was developed by Google. QUIC aims to reduce latency because it achieves zero round trip connection establishment in many cases, for example, when the client has talked to the server before and some cached context exists (repeated connections). In addition, QUIC provides multiplexed transport with no HOL blocking. Other features of QUIC are congestion control, FEC protection, and its own retransmission mechanism. The results of [152] show that QUIC is more appropriate than HTTP in lossy networks and networks with high delay. This experiment covers only a single path, and there is room to evaluate it with the multipath transmission defined for QUIC [154].

Related technologies. ISO/IEC 23008-1 [56] defines some MMT-related technologies, for example, Application Layer Forward Error Correction (AL-FEC) to repair data, ARQ to retransmit lost data, the MMT data model and the built-in hypothetical buffer model.

Regarding the MMT data model [56], [155], an MMT package is a logical entity, illustrated in Figure 8, that comprises one or more assets and the information required for video delivery, such as Composition Information (CI), Presentation Information (PI) and Asset Delivery Characteristics (ADC). An asset is a logical data entity containing a number of Media Processing Units (MPUs). Video, audio, pictures and text are some examples of assets. CI, written in XML, provides information on temporal relationships among MPUs. PI is an HTML5 file that provides initial information on spatial relationships among media elements, and ADC contains QoS information for multiplexing.

Fig. 8: Logical MMT package (source: adapted from [33]).

The built-in hypothetical buffer model [56] aims to compensate for jitter and for delay differences in multipath delivery. In this model, the sending entity runs the hypothetical receiver buffer model (HRBM) to emulate the behavior of the receiving entity. In this way, the sending entity determines the required buffering delay and buffer size. Then, the sending entity signals this information to the receiving entity. Since several buffers exist at the receiving entity to reconstruct MPUs from the MMT packets, the received signal is used to define the operation of these buffers and to ensure that, at any time, the buffer occupancy stays within the buffer size requirement. These buffers are the FEC decoding buffer, which performs FEC decoding; the de-jitter buffer, which provides a fixed transmission delay; and the MMTP de-capsulation buffer, which performs MMT packet processing (e.g., de-encapsulation, de-fragmentation/de-aggregation).

MMT functions. MMT has three major functional areas, shown in Figure 9, that are independent of video codecs [150]: encapsulation, signaling, and delivery. The encapsulation functional area is responsible for encapsulating MPUs, which are compliant with ISOBMFF [156]. Thereby, it enables easy conversion between storage and delivery formats [30]. MMT is beneficial to broadcasting systems because MPUs are self-contained, which means that they can be completely decoded at the terminal without requiring any further information. The signaling functional area is responsible for signaling messages and delivery management (e.g., CI, PI and ADC). The delivery functional area defines the application layer protocol that supports packetized streaming, including the payload format, through a heterogeneous network environment. The delivery functional area also provides multiplexing, flow control and cross layer functions. The cross layer capability allows QoS information to be exchanged between the application layer and the transport layer.

Fig. 9: Major functional areas of MMT (source: adapted from [33]).

MMT data transmission. Regarding MMT data transmission, each MMTP session consists of one MMTP flow [150]. An MMTP flow is defined as all packet flows that are delivered to the same destination IP address and port. An MMTP flow may carry multiple assets, each identified by a unique packet_id. An MMTP packet uses two types of sequence numbers for different purposes: packet_counter and packet_sequence_number. packet_counter represents the sequence of packets in a delivery session, regardless of the value of packet_id, and enables packet loss detection. In contrast, packet_sequence_number is the sequence number specific to each packet_id (each asset).
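The two counters described above can be illustrated with a minimal Python sketch. The class and function names (MmtpHeader, MmtpSender, missing_counters) are hypothetical and are not taken from the MMT standard; counter wrap-around and the real header layout are ignored.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class MmtpHeader:
    packet_id: int                # identifies the asset
    packet_counter: int           # per delivery session, regardless of packet_id
    packet_sequence_number: int   # per packet_id (per asset)


class MmtpSender:
    """Hypothetical sender that numbers packets as described in the text."""

    def __init__(self):
        self._counter = 0
        self._per_asset = defaultdict(int)

    def next_header(self, packet_id: int) -> MmtpHeader:
        header = MmtpHeader(packet_id, self._counter, self._per_asset[packet_id])
        self._counter += 1
        self._per_asset[packet_id] += 1
        return header


def missing_counters(received: list[MmtpHeader]) -> list[int]:
    """Return packet_counter values absent between the lowest and highest seen."""
    seen = {h.packet_counter for h in received}
    if not seen:
        return []
    return [c for c in range(min(seen), max(seen) + 1) if c not in seen]


# Two assets (packet_id 1 and 2) interleaved in one flow; one packet is dropped.
sender = MmtpSender()
sent = [sender.next_header(pid) for pid in (1, 2, 1, 2, 1)]
received = [h for h in sent if h.packet_counter != 2]
lost = missing_counters(received)
```

Because packet_counter ignores packet_id, a gap in it reveals a loss anywhere in the flow, while packet_sequence_number tells the receiver which asset's sequence is affected.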

Initially, MMT was designed for broadcast networks (over UDP/IP) with reserved channel capacity. Therefore, congestion control was left to the implementation of the senders. However, MMT inherently supports receiver and sender feedback for stream thinning and bitstream switching. It may also support any Receiver-driven Layered Multicast (RLM)-based congestion control algorithm (e.g., WEBRC, TFMCC).

MMT has four payload format modes: MPU mode to transport MPU packetized streaming, Signaling mode to transfer signaling information, Repair symbol mode to carry FEC repair data, and Generic File Delivery (GFD) mode, which transports all types of files.

Multipath support. Regarding multipath delivery, Kolan et al. [94] defined a method to establish multipath delivery over MMT, Afzal et al. [95] proposed a path-and-content-aware scheduling strategy for packet distribution, and Sohn et al. [96] proposed a synchronization scheme for hierarchical video streams over heterogeneous networks. Next we explain each work in more detail.

Kolan et al. [94] defined a method to establish multipath delivery over MMT. In this method, the MMT protocol uses signaling protocols such as RTSP or HTTP to establish and control multipath sessions between sender and receiver (the transport connection can be either TCP or UDP). For example, in RTSP, the client and the server can learn about each other's multipath capability by sending OPTIONS requests to each other. A new option tag, called “multipath”, can be carried in the header of the OPTIONS request. Similarly, when HTTP is used to set up multipath sessions, the client includes a “Multipthid” header to tell the server about its multipath capability. It is also possible to add or drop a network path during the connection. While media is being delivered, MMT periodically sends feedback to the sender with path quality information (e.g., loss, delay and jitter). The sender therefore has a view of the state of the different paths and can dynamically select better performing paths for packets.

Afzal et al. [95] proposed a novel path-and-content-aware scheduling strategy for MMT to stream real-time video over heterogeneous wireless networks. The authors claim that this is the first work attempting to improve the MMT standard by adding multipath scheduling strategies. The path-and-content-aware scheduling strategy, implemented at the server side, improves the perceived video quality by combining adaptive video traffic split schemes and Markovian-based techniques with a discard strategy and a content-aware strategy. The adaptive video traffic split scheme allocates a proper bit rate to each transmission path considering the heterogeneous network context, with the aim of balancing load, relieving congestion, and properly utilizing the capacity of each path. The Markovian-based method estimates path conditions and transition probabilities. The discard strategy reduces congestion by not sending packets that would probably be lost. The content-aware strategy protects high-priority packets (I frames and the closest P frames, named near-I (NI) frames in this work) by duplicating them or assigning them to the best path. The client constantly monitors the path conditions and calculates the path metrics, which are sent as feedback information packets to the server through the best path. For this purpose, the feedback signaling mechanisms defined in the MMT standard are leveraged. Finally, the proposed path-and-content-aware scheduling strategy leads to QoE improvements of around 12 dB in PSNR and 0.15 in SSIM through significant packet loss rate reductions (90%). It is important to note that the approach does not require any change in the protocol itself since the scheduler can be implemented as part of the client/server applications.

Sohn et al. [96] proposed a synchronization scheme for hierarchical video streams over heterogeneous networks. This scheme combines MMT (for broadcasting) and HTTP (for broadband) video streaming. The work uses scalable video streaming. Each layer is segmented in time (in seconds), and the segment duration can vary according to the user’s definition. An SHVC-encoded stream with three layers is used in the experiment: a base layer (HD), a first enhanced layer (Full HD (FHD): 2K) and a second enhanced layer (UHD: 4K). The base layer and the first enhanced layer are transferred over the broadcast network (MMT supports multiplexing at the packet level), and the second enhanced layer is transferred over the broadband network. If the receiver’s display has HD resolution, it drops the first enhanced layer data delivered over the broadcasting channel, and it does not need a connection with the server for the second enhanced layer, even if it can connect to the network. PI contains essential information, such as the content resolution, the location of the content, and the MMT eXtension Document (MXD), and can also be transferred on broadcast paths. MXD is inserted in PI and mimics the MPD of DASH-SVC. MXD synchronizes the contents over heterogeneous networks and organizes content synchronization information. The synchronization scheme is implemented at the receiver side. The receiver requests the segments that can be delivered on time. To this end, the expected time to download each segment is computed from the estimated bandwidth and the segment size information in the MXD. This approach is not aware of video content, and there is no scheduling strategy to manage the paths.
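The on-time check at the receiver can be sketched as follows. This is only an illustration of the idea of dividing segment size by estimated bandwidth; the function names and the deadline parameter are assumptions and do not reproduce the exact procedure in [96].

```python
def expected_download_time(segment_bytes: int, bandwidth_bps: float) -> float:
    """Estimated seconds needed to fetch a segment at the measured bandwidth."""
    if bandwidth_bps <= 0:
        raise ValueError("bandwidth must be positive")
    return segment_bytes * 8 / bandwidth_bps


def can_deliver_on_time(segment_bytes: int, bandwidth_bps: float,
                        seconds_until_deadline: float) -> bool:
    """Hypothetical rule: request a segment only if it is expected to arrive
    before its playback deadline."""
    return expected_download_time(segment_bytes, bandwidth_bps) <= seconds_until_deadline


# A 1 MB enhancement-layer segment over an 8 Mbit/s broadband path takes 1 s.
eta = expected_download_time(1_000_000, 8_000_000)
request_now = can_deliver_on_time(1_000_000, 8_000_000, seconds_until_deadline=2.0)
```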

IV-A4 Other Adaptive Streaming Approaches

Here we discuss other adaptive streaming approaches that also use HTTP to retrieve data. For example, DAAVI [157] has the same core functionality as DASH: it prepares segments at different bit rates on the server, provides an MPD to the client, relies on client-side logic, and transfers data over HTTP. However, the MPD structure of DAAVI is different from that of DASH. Among our surveyed works, the approaches proposed in [97, 98] and [99] are all based on DAAVI. These DAAVI-based approaches target on-demand and live streaming, and the authors claimed that the solutions could also be implemented in a DASH approach.

All adaptive video streaming approaches face the same challenges explained for DASH-based protocols in Section IV-A2. One of these challenges is stalling during video playback. The works [97, 98, 99] and GreenBag [71] used multipath transmission of subsegments to decrease latency and consequently mitigate stalling. As previously explained in Section IV-A2, fetching subsegments over multiple paths can introduce overhead. These works used pipelining techniques [129] to mitigate this overhead.

The works [98, 99] and GreenBag [71] also proposed dynamic subsegment sizing methods that determine the size of each subsegment based on the throughput of each interface. As previously explained in Section IV-A2, large segments increase out-of-order packet delivery. In contrast, small segments provide smoother video but impose a higher overhead time [158]. Another problem with using fixed-size subsegments, instead of dynamic ones, is that a large buffer is required to compensate for path heterogeneity, which is not desirable. This problem exists in the approach proposed in [97].
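A simple way to picture dynamic subsegment sizing is to split each segment in proportion to the measured throughput of each interface, so that all subsegments are expected to finish at about the same time. The sketch below makes that assumption; it is not the specific algorithm of [98], [99] or GreenBag [71], and split_segment is a hypothetical name.

```python
def split_segment(segment_bytes: int, throughputs_bps: list[float]) -> list[int]:
    """Split a segment into one subsegment per interface, proportionally
    to each interface's measured throughput."""
    total = sum(throughputs_bps)
    if total <= 0:
        raise ValueError("at least one interface must have positive throughput")
    sizes = [int(segment_bytes * t / total) for t in throughputs_bps]
    # Bytes lost to rounding go to the fastest interface.
    fastest = max(range(len(sizes)), key=lambda i: throughputs_bps[i])
    sizes[fastest] += segment_bytes - sum(sizes)
    return sizes


# WiFi at 3 Mbit/s and LTE at 1 Mbit/s: WiFi fetches three quarters of the segment.
sizes = split_segment(1000, [3e6, 1e6])
```

With a fixed split, the slower interface would finish later and the player would need a larger buffer to absorb the difference, which is the drawback noted above for [97].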

A distinguishing feature of GreenBag [71] is that it is a middleware approach for video streaming over HTTP. Middleware approaches are designed to make multiple interfaces available to existing applications without modifying the applications. Therefore, middleware approaches are easy to deploy, but complex to implement [20]. GreenBag sits between a local video player and a remote server. The client requests a video file URL normally over HTTP. GreenBag extracts the URL, determines how to download portions of the video (segments/subsegments), and requests these portions over the chosen links. RTT is used to determine when to send the requests for the next segments. Therefore, GreenBag requires no modification to the Internet infrastructure or the server side.

GreenBag is also an energy-aware bandwidth aggregation approach. When a single path can provide the required QoS, GreenBag stops using multiple paths and switches to the single path to improve energy efficiency. Besides, the approach has a medium load balancing mechanism and a recovery mechanism. Recovery is triggered when a subsegment is lagging and may miss its deadline; in that case, the rest of the subsegment is downloaded over both links. As a result, GreenBag mitigates the packet reordering problem and decreases latency.

Notably, none of the surveyed adaptive streaming approaches considers video content features.

IV-B Transport Layer Approaches

Video streaming approaches focusing on transport layer protocols have direct access to the network information. Therefore, they can estimate End-to-End characteristics of each path, such as capacity and congestion [159], that are useful in multipath scenarios. However, the biggest challenge of these solutions is that they generally require modifications in the standardized multipath transport protocols, which may require changes even in the kernel of operating systems.

There are several works exploiting multipath transmission in transport layer, but MPTCP and SCTP are the two main employed transport protocols with multihomed support. In this subsection, we will discuss surveyed works that are implemented based on UDP, DCCP, TCP, MPTCP and SCTP/CMT. Table III presents each category.

IV-B1 UDP

The User Datagram Protocol (UDP) [38], standardized by IETF in 1980, is widely used for unidirectional, broadcast, unicast, multicast, and anycast communications. Next, we provide a brief recap of UDP basics and discuss relevant multipath efforts.

UDP overview. UDP was designed to use a single path for data transmission. It is a connectionless protocol, it does not use sequence numbers for data transmission [144], and it gives no guarantee of in-order or reliable delivery. UDP also has no congestion control for bandwidth adaptation. These properties make UDP a fast transmission protocol [160] upon which video streaming solutions can be easily implemented. However, the lack of bandwidth adaptation causes UDP to transmit the data at the same bit rate as sent by the application. Therefore, when the network is congested, unless the application holds back, packets get discarded, leading to video distortion and reduced QoE [161]. Moreover, without congestion control, UDP may occupy a high fraction of the available bandwidth and consequently act unfairly toward other congestion-avoiding network flows [102].

Multipath support. There are several efforts to add multipath transmission and bandwidth aggregation to UDP for video streaming [86, 100, 101]. Note that the approaches proposed in BEMA [86] and [100] introduced rate balancing methods to avoid network congestion.

Wu et al. [86] designed the Bandwidth-Efficient Multipath streAming (BEMA) protocol and claimed that it was the first work to employ Raptor coding and priority-aware scheduling to stream HD real-time video over heterogeneous wireless networks. This content-aware model sends higher-priority packets on the better-qualified paths and I-frame packets through all available paths. The approach also uses Forward Error Correction (FEC) to protect the transmitted data. BEMA also provides TCP-Friendly Rate Control (TFRC) in order to guarantee fairness toward TCP flows. TFRC [162] is an equation-based congestion control algorithm designed for unicast multimedia traffic. TFRC estimates the loss event rate at the receiver and reports it to the sender, which adapts its transmission rate based on the congestion estimation and on the equation that models TCP congestion control behavior. TFRC responds to congestion with less fluctuation than standard TCP congestion control and over longer periods of time [163]. However, TFRC may unnecessarily reduce the transmission rate in response to wireless losses. BEMA therefore adds a ZigZag scheme [163] in order to distinguish congestion losses from wireless losses. TFRC counts a packet loss only if ZigZag classifies it as a congestion loss [163]. Given the relevance of the feedback information for proper scheduling and its strong effect on performance, it is sent periodically from the client to the server over a reliable TCP connection.
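To make the "equation that models TCP congestion control behavior" concrete, the sketch below evaluates the TCP throughput equation as given in the TFRC specification (RFC 5348). The defaults b = 1 and t_RTO = 4·RTT follow the simplifications recommended there; the function name tfrc_rate is our own, and BEMA's ZigZag loss classification is not modeled (p is assumed to already count only congestion losses).

```python
import math


def tfrc_rate(s: float, rtt: float, p: float, b: int = 1, t_rto: float = None) -> float:
    """Allowed sending rate in bytes/second from the TFRC throughput equation.

    s     -- segment size in bytes
    rtt   -- round-trip time in seconds
    p     -- loss event rate, 0 < p <= 1
    b     -- packets acknowledged by a single ACK
    t_rto -- retransmission timeout in seconds (defaults to 4 * rtt)
    """
    if not 0 < p <= 1:
        raise ValueError("loss event rate must be in (0, 1]")
    if t_rto is None:
        t_rto = 4 * rtt
    denominator = (rtt * math.sqrt(2 * b * p / 3)
                   + t_rto * 3 * math.sqrt(3 * b * p / 8) * p * (1 + 32 * p ** 2))
    return s / denominator


# The allowed rate falls smoothly as the loss event rate grows.
rate_low_loss = tfrc_rate(s=1460, rtt=0.1, p=0.01)
rate_high_loss = tfrc_rate(s=1460, rtt=0.1, p=0.05)
```

This also shows why misclassified wireless losses are costly: every loss that inflates p lowers the allowed rate even when the path is not congested.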

Freris et al. [100] proposed a distortion-aware scalable video streaming to multiple multihomed clients. The authors claimed that their work is the first that simultaneously considered End-to-End rate control and scalable stream adaptation for multipath over heterogeneous access networks. In this approach, the requested video stream is divided into substreams on the server side. The authors developed an algorithm to determine the rate of each substream and the packets to be included in each substream considering network information (e.g., available bandwidth and RTT) and video content features in order to minimize video distortion. Besides that, different cost functions are proposed to provide service differentiation and fairness among users.

The authors also developed heuristic algorithms for deterministic packet scheduling. Since it is a scalable streaming approach, each packet is transmitted only if all related packets in lower layers have already been sent. The substreams are merged into a single scalable video stream at the client. The authors also studied the trade-off between performance and computational complexity and concluded that the approach works better for a small number of clients because of overhead.

Correia et al. [101] proposed a video streaming approach for networks with path diversity using MDC as an error resilience technique. The authors proposed a priority classification. A limited number of packets were classified as high priority because they minimize the distortion of the decoded video affected by packet loss. These packets are delivered without losses. Remaining low priority packets are prone to transmission losses.

IV-B2 TCP

Transmission Control Protocol (TCP) [39] is a transport protocol standardized by IETF in 1981. This protocol has been widely adopted for video streaming in Real-Time Communications (RTC) [164] and in HTTP-based applications. In Section IV-A2, we discussed TCP's lack of throughput stability [86] and its negative effect on adaptive bit rate streaming. Here, we provide more details about TCP and discuss one surveyed work that is based on this protocol.

TCP overview. TCP is designed to use a single path for data transmission. Regarding the data transmission process, TCP uses sequence numbers to detect losses, guarantee in-order packet delivery, and reconstruct the received data [144]. The receiver sends ACKs for the correctly received packets. These ACKs are used to provide reliable communication. Retransmission occurs in two cases: first, when no ACK arrives from the receiver, which is detected by a retransmission timer referred to as the Retransmission Time-Out (RTO); second, when the sender receives three duplicate ACKs, which indicates that a loss occurred. As also discussed in Section I, retransmission wastes bandwidth and adds significant delays. Several protocol improvements have been proposed. Examples are Selective Acknowledgements (SACK) [165], where the receiver informs the sender of all successfully received packets so that the sender retransmits only the segments that have actually been lost, and Cumulative ACK, which acknowledges the last successfully received packet to the sender. In addition, Explicit Congestion Notification (ECN) [166] has been proposed as an optional capability to collect congestion information hop by hop and inform the sender about the congestion levels.
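The two retransmission triggers described above can be sketched as follows. The class and function names are illustrative only; real TCP stacks track considerably more state (e.g., RTO estimation and back-off).

```python
class FastRetransmitDetector:
    """Counts duplicate ACKs and signals when the third one arrives."""

    DUP_ACK_THRESHOLD = 3

    def __init__(self):
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack_no: int) -> bool:
        """Return True exactly when the third duplicate ACK is received."""
        if ack_no == self.last_ack:
            self.dup_count += 1
            return self.dup_count == self.DUP_ACK_THRESHOLD
        # A new ACK number resets the duplicate counter.
        self.last_ack = ack_no
        self.dup_count = 0
        return False


def rto_expired(sent_at_s: float, now_s: float, rto_s: float) -> bool:
    """Timer-based trigger: no ACK arrived within the retransmission time-out."""
    return now_s - sent_at_s >= rto_s


# ACK 200 is repeated: the fourth copy is the third duplicate.
detector = FastRetransmitDetector()
signals = [detector.on_ack(a) for a in (100, 200, 200, 200, 200, 300)]
```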

By applying congestion control based on monitoring packet losses and/or delay variations [144], TCP adapts the data rate to network congestion and thereby minimizes packet loss [161]. When not enough network bandwidth is available, TCP sends video data at a lower bit rate than the required video bit rate. Thus, video transmission takes longer than video playback and may consequently cause the playback to stall. Although stalls have a severe effect on the perceived video quality, in VoD delivery stalls are typically preferred over video distortions [161]. In Section III, we explained HOL issues and the liveness strategies used in TCP-based applications for live or interactive video streaming to cope with stalls and delay constraints. Besides all these properties, TCP also has the advantage of traversing firewalls and NATs, which is a common issue for UDP. Altogether, this has turned TCP into a dominant transport protocol for video services [17].

Multipath support. Sharma et al. [4] proposed the MultiPath LOss-Tolerant (MPLOT) protocol based on SACK-based TCP and cumulative ACK. A framework named Hybrid-ARQ (HARQ)/FEC is defined for MPLOT. Based on HARQ/FEC, MPLOT uses adaptive FEC proactively and reactively, instead of a large number of retransmissions, to recover losses. Proactive FEC (PFEC) packets are used to recover losses, and when the PFEC packets in a block are not enough to recover the lost data, Reactive FEC (RFEC) packets are transmitted. This method improves goodput and decreases recovery latency in highly lossy channels [88]. Regarding packet scheduling, paths in MPLOT are categorized into good and bad paths. The channels with ranks higher than a threshold (the median rank) are categorized as good paths. Ranks are calculated from network parameters, such as congestion window, PLR and RTT. MPLOT provides uncoupled congestion control, which means that each path has its own congestion control. ECN is used to tell congestion losses apart from losses caused by faulty/lost channels and to change the congestion window size. However, MPLOT is designed for wireless mesh networks and is not easily extendable to the Internet due to scalability and compatibility issues. The authors assume that a buffer is enough to compensate for out-of-order packet delivery, which is important for video quality [88, 167]. Moreover, the approach uses a CBR coding scheme, which decreases performance when the path quality drops sharply [88].

IV-B3 DCCP

Datagram Congestion Control Protocol (DCCP) [168] is a transport protocol standardized by IETF in 2006. Here, we first provide an overview of DCCP, including its data transmission process and its properties. Then, we discuss one surveyed work that is based on this protocol.

DCCP overview. DCCP is designed to use a single path for data transmission, providing bidirectional unicast data delivery. Regarding the data transmission process, DCCP uses sequence numbers, so the client can detect losses and report them to the sender through ACKs. DCCP provides neither retransmission nor in-order data delivery. In addition, features such as ECN capability, ACK ratio, and the congestion control mechanism can be negotiated before or during transmission.

DCCP has different congestion control mechanisms, each identified by a Congestion Control IDentifier (CCID), for example, CCID2 and CCID3. CCID2 provides TCP-like congestion control. Thus, the sender has a congestion window and sends data until the window is full. Both dropped packets and ECN marks trigger the congestion algorithm and halve the congestion window. Acknowledgments contain a list of the packets received within some window, as in Selective Acknowledgements (SACK)-based TCP. Therefore, CCID2 [169] provides quick access to available bandwidth and copes with rapid bit rate changes [168, 102]. CCID3 [170] provides TFRC. CCID3 responds to congestion smoothly and maintains a steady bit rate [168, 102].
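The CCID2 behavior described above (send until the window is full, halve the window on a drop or an ECN mark) can be pictured with the minimal sketch below. It is an illustration only: TcpLikeWindow is a hypothetical name, window growth on ACKs is deliberately omitted, and none of the details of the CCID2 specification are reproduced.

```python
class TcpLikeWindow:
    """Toy congestion window counted in packets."""

    def __init__(self, cwnd: int = 10, min_cwnd: int = 1):
        self.cwnd = cwnd
        self.min_cwnd = min_cwnd
        self.in_flight = 0

    def can_send(self) -> bool:
        return self.in_flight < self.cwnd

    def on_send(self) -> None:
        self.in_flight += 1

    def on_ack(self) -> None:
        self.in_flight = max(0, self.in_flight - 1)

    def on_congestion(self) -> None:
        """Called for a dropped packet or an ECN mark: halve the window."""
        self.cwnd = max(self.min_cwnd, self.cwnd // 2)


# Fill the window, then react to one congestion signal.
window = TcpLikeWindow(cwnd=8)
sent = 0
while window.can_send():
    window.on_send()
    sent += 1
window.on_congestion()
```

The abrupt halving is what makes CCID2 react quickly to bit rate changes, in contrast to the smoother, equation-based response of CCID3.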

A comparison among UDP, TCP and DCCP variants (CCID2 and CCID3) for transferring MPEG4 video shows that DCCP provides higher throughput and less packet loss than UDP, while UDP provides much lower delay and jitter. Overall, DCCP achieves the best QoS compared with TCP and UDP over a congested network [171]. However, since the subjective results in [161] show that stalling caused by TCP is preferred over distortion caused by UDP for VoD streaming, DCCP without retransmission may also suffer from video distortion and may not outperform TCP and UDP for VoD in terms of QoE.

Multipath support. Among our surveyed works, Huang et al. [102] proposed the Multipath Datagram Congestion Control Protocol (MP-DCCP) to add multipath transmission to DCCP. In MP-DCCP, each link has its own DCCP connection, which means that each link maintains its own congestion control window, sending rate adjustment and CCID. The scheduling scheme proposed in MP-DCCP is called QoS-aware Order Prediction Scheduling (QOPS). QOPS assigns important frames, such as I-frames, to paths with a lower Packet Loss Rate (PLR). Besides, QOPS predicts the order of packets at the receiver side by estimating the path latency in order to deal with the out-of-order problem. Based on the final results, among the congestion control algorithms defined in the DCCP standard, combining CCID3 with MP-DCCP is recommended due to its steady transmission.
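The two ideas attributed to QOPS above can be sketched in a few lines: choosing the path with the lowest PLR for important frames, and predicting the arrival order from send time plus estimated path latency. This is not the QOPS algorithm of [102]; the Path fields, function names and example values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    plr: float        # estimated packet loss rate
    latency_s: float  # estimated one-way latency in seconds


def pick_path_for_important_frame(paths: list[Path]) -> Path:
    """Important frames (e.g., I-frames) go to the path with the lowest PLR."""
    return min(paths, key=lambda p: p.plr)


def predicted_arrival_order(assignments: list[tuple[int, float, Path]]) -> list[int]:
    """assignments holds (sequence number, send time in seconds, path).
    Returns sequence numbers sorted by estimated arrival time."""
    ordered = sorted(assignments, key=lambda a: a[1] + a[2].latency_s)
    return [seq for seq, _, _ in ordered]


wifi = Path("wifi", plr=0.02, latency_s=0.020)
lte = Path("lte", plr=0.005, latency_s=0.060)

i_frame_path = pick_path_for_important_frame([wifi, lte])
order = predicted_arrival_order([(0, 0.000, lte), (1, 0.010, wifi), (2, 0.020, wifi)])
```

In this example, packet 0 is sent first but over the slower path, so a sender that predicts the arrival order can anticipate the reordering instead of discovering it at the receiver.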

IV-B4 MPTCP

Multipath TCP (MPTCP) [172, 173] is a prominent protocol for multipath transmission that has been developed at IETF since 2009. MPTCP has been implemented in the Linux kernel [174], and also as an experimental kernel patch for FreeBSD-10.x [175]. Industry has also adopted MPTCP on smartphones [176][177]. Two major deployments are the voice recognition application Siri [178], since 2013, and support for any application on iOS 11 [179]. Another major MPTCP deployment, in high-end Android smartphones (e.g., Samsung Galaxy S6 and Galaxy S6 Edge), relies on network-operated SOCKS proxies and was launched by KT Corporation in Korea in 2015, reaching a bandwidth of 1 Gbps [180]. In the following, we first provide an overview of MPTCP. Then, we discuss performance problems. Finally, we survey relevant works based on this protocol.

MPTCP overview. MPTCP was designed to use multiple paths for data transmission. In particular, MPTCP establishes multiple subflows for a single MPTCP session. A subflow is a TCP flow over an individual path and looks similar to a regular TCP connection. An MP_CAPABLE option identifies the connection as MPTCP rather than TCP. Further, a token is associated with the MPTCP session and is used to add new subflows to this particular session. In MPTCP, the application layer sees an MPTCP connection as a single connection, as shown in Figure 10. The sender’s transport layer packetizes data into TCP packets, and the receiver’s transport layer reorders them and recreates the byte stream without the application layer knowing about it. As a result, the application layer stays unmodified and a standard socket API is used.

Fig. 10: MPTCP protocol stack (source: adapted from [172]).

Regarding the data transmission process, each packet contains two sequence numbers: the Subflow Sequence Number (SSN) for loss detection and an additional Data Sequence Number (DSN) to reconstruct the original data at the receiver. MPTCP also uses ACKs at both the subflow and the connection level. SACK/Cumulative ACKs are used at the subflow level and DSN-ACKs are used at the connection level [144]. To protect data transmission, MPTCP uses a retransmission mechanism as in regular TCP. Besides, when a packet is lost on one subflow, it can be retransmitted over another subflow.
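How the DSN lets the receiver rebuild the original stream from packets arriving on different subflows can be sketched as below. For simplicity, the DSN here counts packets rather than bytes, and DsnReorderBuffer is a hypothetical name; this is not how any particular MPTCP implementation is structured.

```python
import heapq


class DsnReorderBuffer:
    """Connection-level buffer that releases payloads in DSN order."""

    def __init__(self, first_dsn: int = 0):
        self.next_dsn = first_dsn
        self._heap = []

    def receive(self, dsn: int, payload: bytes) -> list[bytes]:
        """Store one packet and return every payload that is now in order."""
        if dsn >= self.next_dsn:
            heapq.heappush(self._heap, (dsn, payload))
        delivered = []
        while self._heap and self._heap[0][0] <= self.next_dsn:
            head_dsn, data = heapq.heappop(self._heap)
            if head_dsn == self.next_dsn:
                delivered.append(data)
                self.next_dsn += 1
            # Otherwise it is a duplicate of delivered data and is dropped.
        return delivered


# DSN 1 arrives first on the fast subflow and must wait for DSN 0.
buffer = DsnReorderBuffer()
first = buffer.receive(1, b"B")
second = buffer.receive(0, b"A")
third = buffer.receive(2, b"C")
```

The first call returns nothing because the packet with DSN 0 is still missing, which is the head-of-line blocking effect that heterogeneous paths aggravate.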

By default, MPTCP uses coupled congestion control, in which congestion control is performed for the MPTCP connection as a whole rather than independently for each subflow, to avoid being unfair to regular TCP connections. This algorithm provides better congestion balancing than simply running TCP congestion control on each subflow (uncoupled) [181, 182], because an MPTCP connection whose subflows each behave like a regular TCP connection could behave unfairly.

A shared MPTCP receiving buffer is used at the receiver side to receive and reorder packets of different paths [78]. In other words, there is a single window shared by all subflows at the receiver side.

Because in multipath approaches, packet scheduling strategy has an important role, there are different strategies introduced for MPTCP. Performance comparison of scheduling methods for multipath transfer is analyzed in [125] and different schedulers are implemented and evaluated in [183] for MPTCP. Default MPTCP packet scheduling strategy selects the packets in First-In First-Out (FIFO) order and maps them to the different paths according to RTT-based policy.
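The default behavior described above can be sketched as a loop that takes packets in FIFO order and maps each one to the subflow with the lowest RTT. The sketch additionally assumes that a subflow is eligible only while it has free space in its congestion window; the Subflow fields and the schedule function are hypothetical names.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Subflow:
    name: str
    srtt_ms: float     # smoothed RTT estimate
    cwnd: int          # congestion window in packets
    in_flight: int = 0

    def has_space(self) -> bool:
        return self.in_flight < self.cwnd


def schedule(queue: deque, subflows: list[Subflow]) -> list[tuple[int, str]]:
    """Map queued packets to subflows: FIFO order, lowest RTT first."""
    plan = []
    while queue:
        available = [s for s in subflows if s.has_space()]
        if not available:
            break  # every window is full; remaining packets stay queued
        best = min(available, key=lambda s: s.srtt_ms)
        packet = queue.popleft()
        best.in_flight += 1
        plan.append((packet, best.name))
    return plan


pending = deque(range(5))
plan = schedule(pending, [Subflow("wifi", 20.0, cwnd=2), Subflow("lte", 60.0, cwnd=2)])
```

Note that the scheduler never looks at packet contents, which is why the default strategy is content-agnostic.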

MPTCP supports middleboxes and is compatible with the current network infrastructure [144]. This is because the SSN provides consecutive sequence numbers for the packets of each subflow, so each subflow can pass through middleboxes [184]. In case of conflict, however, MPTCP handles middleboxes by falling back to regular TCP [185]. Moreover, MPTCP provides resilience, mobility and load balancing [160].

Performance challenges. Studies in [186] and [125] show that MPTCP presents performance issues most critically in the case of heterogeneous paths. The reasons of MPTCP performance limitations are discussed below:

  • Out-of-order packets: MPTCP suffers from the out-of-order packet problem. A comparison between Single Path TCP (SPTCP) and MPTCP in [118] shows that SPTCP outperforms MPTCP when paths are heavily imbalanced in terms of throughput. MPTCP performs poorly in this case due to a large number of packets delivered out of order. Such throughput imbalance could also occur frequently when a 5G network is used simultaneously with other wireless networks. Among our surveyed works, the approach proposed in [118] introduced a dynamic MPTCP path control to remedy the out-of-order problem.

  • HOL blocking due to ARQ mechanism: The ARQ mechanism used by MPTCP frequently causes HOL blocking, even more than in a single TCP connection [78]. As previously explained in Section I, HOL blocking incurs large End-to-End delay and low performance. Among our surveyed works, the approaches proposed in ADMIT [103], [104], [105] and [106] attempted to solve the retransmission problem in order to decrease End-to-End delay.

  • Frequent throughput fluctuation and unnecessary fast retransmission: MPTCP uses the Additive-Increase/Multiplicative-Decrease (AIMD) congestion control algorithm to set congestion window sizes. The problem is that AIMD causes frequent throughput fluctuation and significant End-to-End delay [103, 187]. For example, out-of-order packet delivery, which is common in multipath transmission, and losses, which could be wireless losses rather than congestion losses, could trigger unnecessary fast retransmissions, which in turn cause an undesirable reduction of the congestion window size and waste useful bandwidth [144]. Among our surveyed works, ADMIT [103] applied packet loss differentiation to mitigate this problem.

  • Content-agnostic traffic scheduling: In MPTCP, availability of multipath connections is unknown to the application. Therefore, MPTCP is unaware of application information and video content features. The approaches proposed in [78] and [117] introduced cross layer solutions to access the video content and deadlines, respectively.

  • Fully reliable and ordered service: MPTCP is an extension of the TCP protocol and inherits its fully reliable and ordered services, which are not required by video streaming. Among our surveyed works, some efforts ([104, 105] and PR-MPTCP [106]) apply the concept of partial reliability in MPTCP for real-time video delivery. This concept avoids retransmission for acceptable loss rates and provides partially reliable video data transmission to the upper layers [104, 105, 106].

    Partial reliability leads to improved network performance parameters (e.g., delay, bandwidth), and consequently, better QoE [104].

Improved scheduling mechanisms. There are several proposals to improve MPTCP with respect to the above-mentioned problems through scheduling functions that define the multipath decision. Next, we briefly review them and then provide more details. Cross layer works that adapt application/network layer protocols to MPTCP (e.g., [117], [78] and [118]) will be presented later in Section IV-D.

Wu J. et al. proposed the quAlity-Driven MultIpath TCP (ADMIT) protocol [103] for streaming high-quality mobile video with multipath TCP in heterogeneous wireless networks. ADMIT is an extension of MPTCP that inherits its basic mechanisms, including coupled congestion control, the same connection, subflow level acknowledgments, and the retransmission mechanism. The authors claimed that ADMIT is the first MPTCP scheme that incorporates quality-driven FEC coding and rate allocation to mitigate End-to-End video streaming distortion. The FEC coding proposed in ADMIT adaptively chooses the FEC redundancy and FEC packet sizes according to the network situation (e.g., RTT, bandwidth, and packet loss rate) and the delay constraint. By protecting video data, this adaptive FEC coding remedies the shortcomings of packet retransmission (e.g., serious delay and performance degradation [86]). Besides that, the proposed rate allocation algorithm is responsible for load balancing. The ZigZag scheme [163] is also used in ADMIT. ZigZag strongly affects the FEC coding and rate allocation results because it distinguishes congestion losses from wireless losses. Finally, the packet scheduling strategy maps FEC packets to the different paths according to the rate allocation vector. However, there is no mechanism to acknowledge lost packets that have been reconstructed by the FEC unit. Therefore, the ADMIT protocol keeps retransmitting the lost packets until it receives the ACK. Besides, the packet scheduling strategy is not aware of the different priorities of frames. Another problem is that all packets of the Group of Pictures (GoP) and the redundant packets must be received before the GoP frames are processed. Each video unit may consist of several packets and may also depend on other units.

The works [104, 105] and PR-MPTCP [106] apply the concept of partial reliability in MPTCP. These works demonstrate that MPTCP with partial reliability outperforms the default MPTCP for real-time video streaming. Comparing these works, one can note that in PR-MPTCP [106] switching between MPTCP and the partially reliable mode occurs dynamically based on the network situation. In [105], however, partial reliability is only activated in the initial handshake, and [104] does not explain how switching occurs. Besides, the works in [104] and PR-MPTCP [106] used old versions of MPTCP. Finally, these works defined different methods for applying partial reliability, which are explained in more detail below.

Diop et al. [104] introduced QoS-ORIENTED MPTCP in order to improve QoS in terms of End-to-End delay. In this work, two QoS-aware mechanisms are implemented with the concept of “partial reliability” in MPTCP for interactive video applications. The first one, MPTCP-SD (selective discarding), eliminates the least important packets (B-frames) at the sender side. This could decrease the network traffic and avoid latency and loss of I and P frames. The capability of gathering priority information for MPTCP is implemented by using Implicit Packet Meta Header (IPMH) interface [188].

In the second mechanism, a time-aware policy is used. In MPTCP-PR (time constrained partial reliability), the delay of each queued packet on the receiver side is calculated and, whenever it gets close to a time limit (400 ms), the packets are delivered to the application and acknowledgments are sent for the missing packets. In addition, packets delivered after a specific time limit are considered lost, but acknowledgments are still sent for them to the sender. The results show that MPTCP-SD provides better video QoS than MPTCP-PR and MPTCP.

Another MPTCP partial reliability extension is introduced in [105] to provide different levels of reliability, and it is recommended for video streaming. Each packet has a threshold on the maximum number of retransmission attempts or on the maximum transmission delay. In this approach, the sender and receiver negotiate the partial reliability function in the initialization phase. During data transmission, whenever a packet exceeds the defined threshold, the sender informs the receiver, which then stops waiting for that packet. Consequently, the receiver sends a forced acknowledgment and the sender removes that packet from its buffer, as if the packet had been delivered successfully. The forced acknowledgment also indicates losses and congestion in the network and triggers the congestion control algorithm.
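The abandonment rule described above can be illustrated with a minimal sketch. It is not the implementation from [105]: the names Packet, should_abandon, max_retx and max_delay_s are hypothetical, and the sketch assumes that either threshold alone is enough to give up on a packet.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    seq: int
    first_sent_at: float  # time of the first transmission, in seconds
    retx_count: int = 0   # retransmission attempts made so far


def should_abandon(pkt, now, max_retx=None, max_delay_s=None):
    """Return True when the sender should stop retransmitting pkt.

    Both thresholds are illustrative assumptions; None disables a threshold.
    """
    if max_retx is not None and pkt.retx_count >= max_retx:
        return True
    if max_delay_s is not None and (now - pkt.first_sent_at) >= max_delay_s:
        return True
    return False
```

In such a design, a True result would be the point at which the sender notifies the receiver and later drops the packet from its buffer once the forced acknowledgment arrives.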

Cao et al. [106] proposed the Context-aware QoE-oriented MPTCP Partial Reliability extension (PR-MPTCP). In this work, the sender monitors network congestion and receiver buffer blocking to determine when it should enable partial reliability. To detect network congestion, a function of the RTT of each path is proposed, and to detect buffer blocking, the advertised receiver window (rwnd) is used. In the case of a congested network, only the packets that can still meet their playback deadline are sent, and the packets with the highest priority can be retransmitted. In particular, in this work, the concept of context refers to the video content, where I-frames have the highest priority. Whenever buffer blocking is detected, a subset of paths is adaptively selected based on path quality (e.g., bandwidth). The approach switches to the full MPTCP mode (standard MPTCP) when there is no buffer blocking. The authors of PR-MPTCP demonstrate that this method outperforms the approach proposed in [104] in terms of video performance metrics.

IV-B5 SCTP and CMT (extension of SCTP)

The first SCTP specification was published in 2000 in the now obsolete RFC 2960 [189], and it was then updated in RFC 3309 [190] and RFC 4460 [191]. The current protocol specification is RFC 4960 [192], which contains these updates and was standardized by the IETF in 2007. SCTP provides multihoming and multistreaming, and it is supported by different operating systems and platforms (e.g., FreeBSD, Linux and Android). Here, we first give an overview of SCTP, including its data transmission process and properties. Then, we indicate its performance limitations. Finally, we discuss the surveyed works that are based on this protocol.

SCTP overview. SCTP is a message-oriented protocol like UDP and supports reliability by using congestion control and retransmission like TCP [192]. Default SCTP uses one path as the primary path for transferring data packets, and the other paths are used for redundancy (retransmissions and backup packets). The redundant paths make data transfer more resilient and reliable than using only a single path. In particular, SCTP sets up an association with different IP addresses for each end host [193]. An association, in SCTP, refers to the connection between SCTP end hosts.

SCTP provides multistreaming capabilities that reduce the HOL blocking problem. In SCTP, each stream is a subflow within the overall data flow, and multistreaming refers to the simultaneous transmission of several independent streams of data in an SCTP association. SCTP multistreaming works by adding stream sequence numbers to the chunks of each stream. Sequence numbering guarantees in-order delivery inside a stream, while unordered delivery can happen across streams. Therefore, data that has arrived on one stream can be delivered to the application layer even if other streams are blocked because of losses. Default SCTP also uses another sequence space, the Transmission Sequence Number (TSN), for each chunk, the unit of information within an SCTP packet [192]. The TSN is global across all streams and is used for loss detection and for reconstructing the original data at the receiver. Besides, SACK/Cumulative TSN ACK are leveraged as acknowledgment methods. Cumulative TSN ACK is a field of the SACK chunk that acknowledges to the sender the TSN of the last DATA chunk successfully received in sequence. To protect data transmission, SCTP retransmits upon two types of events: first, whenever the RTO expires; second, after four SACK chunks have reported gaps with the same data chunk missing. Besides, SCTP uses uncoupled congestion control, and a shared buffer is used for all paths on the receiver side.
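The effect of per-stream sequence numbers on HOL blocking can be illustrated with a small receiver-side sketch. This is a simplified model rather than an SCTP implementation: the class name StreamReassembler is hypothetical, sequence numbers are assumed to start at 0, and wrap-around of the 16-bit stream sequence number is ignored.

```python
from collections import defaultdict


class StreamReassembler:
    """Delivers chunks in order within a stream, independently across streams."""

    def __init__(self):
        self.next_ssn = defaultdict(int)   # next expected sequence number per stream
        self.pending = defaultdict(dict)   # stream_id -> {ssn: payload}

    def receive(self, stream_id, ssn, payload):
        """Store one chunk and return the payloads that are now deliverable."""
        self.pending[stream_id][ssn] = payload
        delivered = []
        # Drain the stream for as long as the next expected chunk is present.
        while self.next_ssn[stream_id] in self.pending[stream_id]:
            delivered.append(self.pending[stream_id].pop(self.next_ssn[stream_id]))
            self.next_ssn[stream_id] += 1
        return delivered
```

If chunk 0 of stream 0 is lost, chunk 1 of stream 0 is held back, but chunks of stream 1 are still delivered immediately, which is the behavior that limits HOL blocking to a single stream.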

SCTP performance limitations. SCTP presents performance limitations in heterogeneous paths and it is challenging to adopt it for video streaming:

  • Applications modification requirement: SCTP requires distinct socket API and applications modifications [184].

  • Lack of support in middleboxes: SCTP suffers from lack of support in middleboxes [184].

  • Frequent primary path exchange: SCTP is slow due to frequent primary path exchanges in case of failure. The primary path exchange takes a long time [109] because it requires, for example, detecting 6 lost packets. In SCTP, a packet is recognized as lost if the sender does not receive an ACK before the RTO expires. The RTO is set to 1 second at the start and doubles after each loss detection. Consequently, the minimum time to change the path is 1 + 2 + 4 + 8 + 16 + 32 = 63 seconds, which causes a high delay. This issue is considered in the works [107, 108] and SRMT [109].

  • Lack of load balancing support: Default SCTP does not balance load over multiple paths, although load balancing is an important factor in multipath transmission. Several efforts have been made to add bandwidth aggregation capability to SCTP and to adapt this protocol for video streaming. This issue is considered in the surveyed works CMT-DA [111], CMT-CA [112] and CMT-QA [76].

  • Unnecessary fast retransmission: Out-of-order packet delivery and wireless losses could trigger unnecessary fast retransmissions, decrease goodput sharply, and consequently reduce transmission efficiency [76]. This issue is considered in the surveyed works CMT-DA [111], CMT-CA [112] and CMT-QA [76].

  • Content-agnostic traffic scheduling: While considering video content features in the scheduling strategy could improve the QoE and network utilization, default SCTP scheduling treats all data in a content-agnostic fashion. This issue is considered in the surveyed work CMT-CA [112].

  • Fully reliable and ordered service: SCTP is a fully reliable, in-order protocol, which is not required by video streaming. Among our surveyed works, PR-SCTP [110] applied the concept of partial reliability in SCTP for real-time video delivery.
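The 63-second failover figure quoted in the list above follows from exponential RTO backoff. The sketch below reproduces the arithmetic under the stated assumptions (initial RTO of 1 second, doubling after each timeout, 6 consecutive timeouts); the function name failover_time and the optional rto_max_s cap are illustrative and not taken from the surveyed works.

```python
def failover_time(initial_rto_s=1.0, timeouts=6, rto_max_s=None):
    """Total waiting time before the primary path is declared failed."""
    rto = initial_rto_s
    total = 0.0
    for _ in range(timeouts):
        total += rto   # wait for the current RTO to expire
        rto *= 2       # exponential backoff after each timeout
        if rto_max_s is not None:
            rto = min(rto, rto_max_s)
    return total


print(failover_time())  # 1 + 2 + 4 + 8 + 16 + 32 = 63.0
```

Lowering the number of tolerated timeouts shortens the failover at the cost of more spurious path changes on lossy wireless links.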

Improved scheduling mechanisms. There are several approaches that improve SCTP to solve the above-mentioned problems and provide video streaming over this transport protocol. We briefly mentioned them above and provide more details next.

To reduce the long primary path exchange time in SCTP explained above, Kelly et al. [107] proposed a delay-centric strategy that sets the primary path based on the lowest End-to-End delay and RTT. The solution improves quality, but using this adaptive primary path selection in a lossy wireless environment makes SCTP slow due to frequent path exchanges. This approach does not use the full capacity of all paths: it uses the primary path for data transmission and the secondary paths as backup.

A more stable solution based on SCTP is presented in [108]. The authors dealt with packet loss by proposing a selective bicasting method. Instead of sending the same data through two different paths (bicasting), which would lead to significant congestion and reduce the throughput, the selective bicasting method duplicates only important packets. These important packets are retransmissions. However, this approach does not treat sensitive data, such as I-frames, as important packets.

Da Silva et al. [109] proposed a Selective-Redundancy Multipath Transfer (SRMT) scheme. In this approach, the primary path is used to transfer data and secondary paths are used to send redundant copies of the packets that have higher priority and stricter delay limits. This redundancy mitigates QoE degradation. Two key factors govern packet selection over the secondary paths. The first is the amount of redundant packets to be transferred, which is calculated based on the smoothed Round Trip Time (sRTT) of the primary path and the maximum delay tolerated by the application. The second is the selection of the packets to be sent redundantly, based on their importance for reconstructing the video (a content-aware approach). For example, I-frames have the highest priority and, among the I-frame packets, the earlier ones in the order have more priority than the others. P-frames come next, and B-frames have the lowest priority. Duplicated packets are discarded on the receiver side. SRMT uses the default SCTP handover scheme to avoid the HOL problem.
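The content-aware selection step can be sketched as a simple ranking. This is an illustration, not SRMT's actual algorithm: the redundancy budget is taken as an input because the formula that derives it from sRTT and the tolerated delay is not given here, the names FRAME_PRIORITY and select_redundant are hypothetical, and ordering by sequence number within every frame type (not only I-frames) is an assumption.

```python
# Lower value means higher priority: I before P before B.
FRAME_PRIORITY = {"I": 0, "P": 1, "B": 2}


def select_redundant(packets, budget):
    """Pick the packets to duplicate on secondary paths.

    packets: list of (sequence_number, frame_type) tuples.
    budget:  number of redundant packets allowed (assumed to be given).
    """
    ranked = sorted(packets, key=lambda p: (FRAME_PRIORITY[p[1]], p[0]))
    return [seq for seq, _ in ranked[:budget]]
```

For example, with packets [(0, "B"), (1, "I"), (2, "P"), (3, "I")] and a budget of 3, the two I-frame packets are selected first, followed by the P-frame packet, giving [1, 3, 2].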

In order to make the reliable SCTP protocol flexible for video streaming, the Partially Reliable SCTP (PR-SCTP) extension was first defined in [194], and additional policies were later specified in [195]. Similar to the concept of partial reliability explained for MPTCP in Section IV-B4, PR-SCTP introduced policies for choosing the reliability level. PR-SCTP supports choosing the retransmission policy by limiting either the number of retransmissions or the time allowed for them, after which the packet is no longer retransmitted. PR-SCTP shows benefits for time-sensitive applications involving video and audio streaming [196]. Among our surveyed works, the approach proposed in [110] utilized the partial reliability services of PR-SCTP for real-time H.264/AVC video streaming. H.264/AVC has a Network Abstraction Layer (NAL), which is a layer of abstraction over the actual encoded data. The NAL header contains decoding parameters and the level of importance of the unit for decoding. This information is used by PR-SCTP to decide the number of retransmissions for I, P and B-frames. A probabilistic model is developed to find optimal values for the maximum number of retransmissions for the different frame types in order to provide a trade-off between reliability and delay. Retransmissions are sent over the secondary paths. The results show that the proposed solution outperforms UDP and TCP.

Another extension of SCTP is Concurrent Multipath Transfer (CMT) [197]. Most CMT solutions use all the available paths simultaneously for data transfer to increase the throughput and network resiliency. Many schemes have been developed based on CMT, such as CMT-DA [111], CMT-CA [112] and CMT-QA [76]. In contrast to these works, the original CMT does not use any path selection method and uses Round Robin for data distribution. Using Round Robin in CMT not only increases out-of-order delivery and HOL blocking at the receiver, but also increases SACK overhead and causes unnecessary retransmissions. CMT evolved to better estimate the network conditions and choose qualified paths for data transmission in CMT-QA [76], CMT-DA [111] and CMT-CA [112]. CMT-CA [112] also takes video content properties into account besides the network conditions. These works also differ in the design of their congestion control and retransmission mechanisms. More details will be presented in Section VI-A.

Xu et al. [76] proposed a path and quality-aware adaptive concurrent multipath transfer (CMT-QA) approach for packet scheduling over network channels. The goal of this scheme is to mitigate the out-of-order problem by reducing unnecessary fast retransmissions and reordering delay. To achieve this, a Path Quality Estimation Model (PQEM), an Optimal Retransmission Policy (ORP) and a Data Distribution Scheduler (DDS) are introduced. PQEM calculates the quality of each path by estimating the rate of the distributed data, which is a function of the sending buffer size and the transmission delay. In PQEM, the shared sender buffer is divided into subbuffers. Each path has its own independently managed subbuffer, and buffer space is allocated dynamically. ORP handles packet loss differentiation and retransmits the lost packets over faster paths. DDS predicts the arrival time of the data distributed over each path and determines the amount of data to be transferred based on the congestion control parameters, including cwnd, rwnd and sender buffer size. Therefore, DDS distributes data per path in such a way that they arrive at the receiver in order. SACK is used as the acknowledgment method. However, the approach does not address TCP fairness toward other traffic flows [119], and it is not appropriate for video because it does not use video content parameters.

Wu et al. [111] proposed a distortion-aware concurrent multipath transfer (CMT-DA) scheme and claimed that this approach was the first work to introduce video distortion into SCTP for enhancing HD video quality in heterogeneous wireless environments. The goal of this approach is to decrease video distortion by mitigating the effective loss rate for variable bit rate video streaming. To achieve this goal, three main methods are proposed: path status estimation and congestion control, flow rate allocation, and data retransmission control. CMT-DA estimates the path conditions (e.g., RTT and available bandwidth) by processing ACK feedback, and applies a distortion-aware model at the flow level to schedule the packets. Aggregated feedback packets are sent after each packet delivery. The SACK/Cumulative ACK feedback packets return to the sender through the most reliable paths to avoid being lost or dropped in transit. In addition, congestion control is designed per path, with RTT, cwnd and RTO as its parameters. ECN detects path congestion and changes the congestion window size. A rate controller is proposed to choose a subset of paths dynamically and assign data transmission rates. The data retransmission control retransmits the packets that are estimated to arrive at the destination within the deadline. However, considering distortion only at the flow level, without analyzing frame priority and the decoding dependency of frames, is not adequate for video streaming.

In another surveyed work, Wu et al. [112] proposed a content-aware CMT (CMT-CA) scheme and claimed that this approach was the first SCTP scheme to incorporate video content analysis into the scheduling for enhancing HD video quality in heterogeneous wireless environments. The goal of CMT-CA is to accurately estimate the video content parameters and appropriately schedule the video frames to achieve the optimal quality. To achieve this goal, three main methods are proposed: quality evaluation based decision making, congestion control, and data distribution. Quality evaluation based decision making estimates the network conditions and the frame-level distortion, and this information is then used for packet scheduling. As explained for CMT-DA, SACK/Cumulative ACK feedback is used for path status estimation and is sent after each packet delivery through the most reliable paths. The congestion control of CMT-CA is designed per path, is Markov model-based (MDP), and is TCP-friendly. Its parameters are RTT, cwnd, RTO and ssthresh. The ZigZag scheme [163] detects path congestion and the MDP changes the congestion window size. Data distribution is responsible for packet scheduling, and I and P frames are transmitted differently. Therefore, high priority frames can be transmitted first, which helps to decrease video distortion. Besides that, the proposed algorithm drops a video frame if its parent frame cannot be delivered due to bandwidth restrictions, which conserves network resources. In addition to the proposed methods, CMT-CA also utilizes data retransmission methods similar to those designed in CMT-DA, for example, SACK [165], which provides the sender with a list of correctly/incorrectly received packets, and cumulative ACK, which informs the sender of the last successfully received packet.

IV-C Network Layer Approaches

Video streaming approaches focusing on the network layer have access to the IP level and to useful information in multipath scenarios, such as network, routing and data forwarding information. In addition, network layer multipath approaches spread data over different interfaces without the application being aware of this process. The biggest challenge of these solutions is that they generally require network changes, new infrastructure or modifications in the kernel of operating systems. Our surveyed works are categorized into two groups based on the required network technologies: SDN/OpenFlow-based and Proxy-based approaches. These surveyed works are discussed in this subsection. Table III presents each category.

IV-C1 SDN/OpenFlow

Software-Defined Networking (SDN) is a network architecture based on a logically centralized control plane [23] and programmatic abstractions (e.g., OpenFlow) to define the behaviour of the forwarding devices (e.g., routers, switches). SDN controllers gather network information, including the capacity and packet loss rate of links, in real-time and dynamically change routing paths based on the network conditions and policy definitions. In this survey, we leave the topic of how paths are computed out of scope. We only cover the work in [67] on refactoring and modifying the networking stack on Android and Linux devices so that multiple network interfaces can be used simultaneously, and we also discuss the SDN feedback approach for path decision actions proposed by MARS [81].

Yap et al. [67] explored how to make use of all the available networks around us. The approach provides seamless HTTP connectivity over heterogeneous networks. In this approach, to transfer data from one application over multiple interfaces, the application uses one source IP address. Then, the networking stack spreads the data over multiple interfaces and assigns an IP address to each one. This was implemented by using a virtual Ethernet interface to connect the application, with its local IP address, to a special gateway inside the Linux kernel. This gateway combines multiple interfaces without the application's knowledge. To implement the solution, the authors refactored the networking stack connectivity service of the Android kernel and added a controller and Open vSwitch (OVS) in the kernel of the mobile devices. OVS has an OpenFlow interface and can utilize flow table entries. Therefore, the controller and OVS helped to route and re-route the flows and to control packets.

The goal of Multiple Access Radio Scheduling (MARS) [81] is to solve the out-of-order problem and reduce the End-to-End delay. MARS is implemented on separate TCP connections. The authors used SDN for flow aggregation and flow splitting, and also designed a scheduling scheme, named MARS, which is based on relative RTT measurement (explained in Section V-C). The relative RTT is recalculated at fixed intervals to make sure it is always valid. Accordingly, the low-latency paths are chosen for data transmission. In MARS, the controller calculates the bandwidth and RTT of each path and notifies the sender of them. The sender can also query such information from the controller. This information is used by the scheduler to split video blocks across several paths. These flows are combined at an edge router close to the client for a receiver with one interface, but the scheme can also work for a receiver with two interfaces. However, the approach considers neither packet loss in the path quality calculation nor the priority of video data units.

IV-C2 Proxy solutions

It is possible to use a proxy at one side (client/server) or at both sides. Using a proxy at one side hides the multipath transmission from the other side. When proxies are used on both sides, each endpoint communicates with its proxy via a normal connection without being aware of the multipath communication. In proxy-based applications, an IP-in-IP tunneling mechanism (which encapsulates one IP packet as the payload of a new IP packet) is used to redirect data to different paths at the routing level. Consequently, proxy-based approaches are transparent to both the transport and application layers and do not require any changes in them [8].

Chebrolu et al. designed a network layer architecture, Bandwidth Aggregation (BAG) [113], to provide bandwidth aggregation for real-time applications. In BAG, the server streams video data to the client by using a UDP socket. In particular, there is a proxy at the client side, which is aware of the client interfaces and splits the flow over these network interfaces by using IP-in-IP tunneling (see Figure 11). The proposed scheduling algorithm, Earliest Delivery Path First (EDPF), estimates the delivery time of each packet over each path and sends each packet over the path that delivers it earliest, in order to prevent packets from missing their deadlines and to minimize packet reordering. The delay and wireless bandwidth between the proxy and the client are used for delivery time estimation. As a result, EDPF is more efficient than Round Robin in avoiding HOL [8]. The advantage of using a proxy at the client side is that no change is required at the server side [8].

Fig. 11: BAG [113] system architecture featuring the use of a proxy and IP-in-IP tunneling between a client and the proxy (source: adapted from [113]).
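The idea behind EDPF can be sketched in a few lines. This is a simplified illustration and not the BAG implementation: the delivery time estimate (time at which the path becomes free, plus transmission time, plus path delay) and the names Path, free_at and edpf_schedule are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    bandwidth_bps: float
    delay_s: float
    free_at: float = 0.0  # time at which the path can start sending a new packet


def edpf_schedule(packets, paths):
    """Assign each packet to the path with the earliest estimated delivery.

    packets: list of (arrival_time_s, size_bits) tuples, in sending order.
    Returns the list of chosen path names.
    """
    choices = []
    for arrival, size in packets:
        def delivery(p):
            # Estimated delivery time of this packet if sent over path p.
            return max(arrival, p.free_at) + size / p.bandwidth_bps + p.delay_s

        best = min(paths, key=delivery)
        # The chosen path stays busy until the packet has been transmitted.
        best.free_at = max(arrival, best.free_at) + size / best.bandwidth_bps
        choices.append(best.name)
    return choices
```

With two paths of equal bandwidth where one has a slightly higher delay, the scheduler alternates between them once the queueing on the faster path outweighs the delay difference, whereas Round Robin would alternate regardless of the path characteristics.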

IV-D Cross Layer Approaches

Although it is possible to estimate throughput or bandwidth and other network parameters at the application layer, they are not as accurate as the transport or network layer measurements. Different layers have different knowledge levels. For instance, the application layer is aware of video features, player buffer and deadlines. The transport layer is able to calculate the bandwidth and RTT, and it also has a congestion control mechanism. The network layer accesses IP level and routing paths, and the link layer has wireless parameter access.

Therefore, the interaction between different layers has the benefit of utilizing the advantages of different layers by signaling messages among them. This interaction is known as cross layer and was epitomized in the Transport Services (TAPS) working group by IETF [160]. Mostly, lower layers gather network information and feed them to higher layers [8].

In cross layer approaches, usually the application layer or the transport layer becomes the main layer. The main layer could decide a path for data transfer and manage load balancing, or apply a method to save energy. The main layer could even change the behavior of other layers. For example, the application layer could change the TCP window size in order to control throughput, modify routing tables, or disconnect and reconnect the interfaces to manage failures or save energy [8].

Therefore, we categorize our surveyed works into two groups: decision by application layer, and decision by transport layer, depending on which layer can be considered the main one, as discussed further in this subsection and summarized in Table III.

IV-D1 Application Layer Decision

Corbillon et al. [78] proposed a cross layer approach with interaction between the application and transport layers. In this approach, an adaptive mechanism is used to select the segments at the application layer and MPTCP is used as the transport protocol. The main goal of this approach is to maximize the amount of data that is received on time at the destination. Therefore, it exploits application awareness to estimate the playback deadline, and it only sends the video units that have a chance to arrive in time. As there is no cross layer feedback available in MPTCP, it is assumed that such feedback exists and can be used. The feedback should indicate which path MPTCP should select to send the next packet, and only after that would the cross layer scheduler give MPTCP the data to send on this selected path (only one packet at a time). Therefore, the scheduler, which is content-aware, can decide if and when a video unit is given to the transport layer.

Ojanpera et al. [114] proposed a cross layer approach with interaction between the application and network layers. The goal of this approach is to improve the quality and availability of video streaming. The approach utilizes DASH to provide transparent bit rate adaptation support and MPTCP with default settings (coupled congestion control and default scheduling strategy) to provide multipath transmission capability. As explained in Section IV-A2, the rate adaptation method available in a DASH system could perform more efficiently if it could access accurate network information. Therefore, in this work, a network management system built upon the Distributed Decision Engine (DDE) framework is proposed. DDE provides network information, including QoS, load, and capacity. The client is adjusted to support DDE so that it can incorporate the gathered network information into the bit rate adaptation decision and cope with changes in the available network bandwidth. Then, the MPTCP scheduler on the server side is responsible for mapping data onto the different paths. To achieve network load balancing, the operator network management (of DDE) can dynamically disable an access network for the client through DDE signaling. MPTCP reacts to the event by stopping the usage of the corresponding path and mapping the traffic to the other available paths. Finally, the results of the work show that using more network information for the client bit rate adaptation decision outperforms standalone throughput-based adaptation by improving the stability of the video.

Wu et al. [115] developed a model, Goodput-Aware Load distribuTiON (GALTON), at the application-network layer. GALTON optimizes the goodput performance of video streaming over multipath networks. Goodput is the application-level throughput, a key parameter for video QoS, and refers to the data successfully received at the receiver within the deadline. In GALTON, the receiver monitors the network status (e.g., available bandwidth, RTT, PLR) and reports this information to the sender via feedback. The sender estimates the path quality based on the reported network information and detects congested paths with the ZigZag scheme. There is also a proposed flow rate allocator, which is responsible for partitioning flows into several subflows and assigning them to the available paths to optimize the aggregated goodput. It is also responsible for load balancing. Then, packets scheduled to the same path are spread out within the imposed deadline through the UDP connections. Besides that, the scheduler dynamically adjusts the probe rate and probing packet sizes over the congested paths.

Wu et al. [116] proposed a flow rate allocation-based Joint Source and Channel Coding (FRA-JSCC) approach at the application-physical layer. Joint Source and Channel Coding (JSCC) is an efficient solution for improving error resilience in wireless video transmission. In this work, JSCC is therefore optimized into FRA-JSCC for mobile video broadcasting in multipath networks. The FRA-JSCC approach proposes three main methods. First, FEC redundancy estimation protects video data against channel losses. Second, source rate adaptation is based on the calculated encoding rate. The encoding rate matters because a high encoding rate causes more channel distortion and imposes high delay due to heavier load and network congestion. On the other hand, a low encoding rate cannot provide the video delay requirements. Third, flow rate allocation is responsible for dynamically selecting the appropriate paths out of all available access networks and assigning the transmission rates to them based on a Weighted Round Robin (WRR) scheduling strategy.
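As a rough illustration of Weighted Round Robin, the sketch below distributes packets over paths in proportion to integer weights. It shows generic WRR only and is not the FRA-JSCC allocator: the function name weighted_round_robin is hypothetical, and the weights are assumed to be given rather than derived from the allocated transmission rates.

```python
def weighted_round_robin(weights, n_packets):
    """Return the path chosen for each of n_packets.

    weights: dict mapping a path name to a positive integer weight.
    Each cycle sends weight-many consecutive packets on every path.
    """
    cycle = [path for path, w in weights.items() for _ in range(w)]
    if not cycle:
        raise ValueError("at least one path with a positive weight is required")
    return [cycle[i % len(cycle)] for i in range(n_packets)]
```

With weights {"wifi": 2, "lte": 1}, two out of every three packets go over WiFi. A smoother variant would interleave the paths within a cycle instead of sending bursts, which may matter for reordering at the receiver.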

IV-D2 Transport Layer Decision

Han et al. [117] proposed the MP-DASH framework, with the overall goal of enhancing MPTCP to support adaptive video streaming (DASH) under user-specified interface preferences. To this end, MP-DASH is designed as a cross layer approach with interaction between the application and transport layers. MP-DASH consists of two components: the MP-DASH scheduler and the MP-DASH video adapter, as shown in Figure 12.

The MP-DASH scheduler is implemented with the MPTCP scheduler, with knowledge of the user's network interface preferences and the aggregated throughput. The MP-DASH video adapter, a lightweight add-on, integrates the MP-DASH scheduler with DASH rate adaptation. The video adapter exchanges information between the video player and the MP-DASH scheduler (segment sizes and deadlines from the video player to the MP-DASH scheduler, and throughput from the MP-DASH scheduler to the video player). This way, DASH algorithms become multipath friendly and the MP-DASH scheduler becomes aware of delivery deadlines. Besides that, MP-DASH splits the scheduling functions into two parts: a decision function on the client and an enforcement function on the server. The decision function determines how to manage paths based on information from the video player (e.g., segment sizes and deadlines), and the enforcement function carries out the decisions. The knowledge of network interface preferences is used to reduce cellular data usage while maintaining video QoE. Therefore, the approach starts transferring data over the WiFi link and dynamically checks whether the WiFi throughput is sufficient. If WiFi cannot deliver the data before the deadline, the cellular network is enabled. The results of the work show that cellular usage is reduced by up to 99% and radio energy consumption by up to 85% compared with the default MPTCP.

Fig. 12: MP-DASH system architecture (source: adapted from [117]).
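The deadline check at the core of this preference-aware scheduling can be sketched as follows. This is a minimal illustration under simplifying assumptions (a single throughput estimate for WiFi, no safety margin), not the MP-DASH decision function itself; the name need_cellular and its parameters are hypothetical.

```python
def need_cellular(remaining_bytes, time_to_deadline_s, wifi_throughput_bps):
    """Return True when WiFi alone cannot finish the segment before its deadline."""
    if time_to_deadline_s <= 0:
        # The deadline has passed: any remaining data needs extra capacity.
        return remaining_bytes > 0
    deliverable_bytes = wifi_throughput_bps / 8 * time_to_deadline_s
    return deliverable_bytes < remaining_bytes
```

For instance, with 1,000,000 bytes remaining, 4 seconds to the deadline and a WiFi estimate of 1 Mbit/s, WiFi can deliver only 500,000 bytes, so the cellular path would be enabled. At 4 Mbit/s WiFi suffices and cellular stays off.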

The work in [118] proposed dynamic MPTCP path control using Software-Defined Networking (SDN), which makes it a cross layer approach between the transport and network layers. The goal of the approach is to cope with out-of-order packet delivery in order to speed up the download rate and improve video QoE in ABR streaming. In this work, the authors show the feasibility of using an SDN platform with MPTCP. The SDN controller monitors information and estimates path capacity. Then, the SDN controller communicates periodically with the SDN clients to inform them which paths are the best. The SDN platform on the client side removes poor, low-capacity links because they increase the MPTCP reordering queue size. The removed paths are attached again when they return to a proper capacity. Throughput measurement is used to find the available path capacity. Other factors, such as RTT and delay, may also be considered to compute the best paths depending on the application (e.g., video, VoIP or web surfing). Therefore, the SDN application dynamically selects the proper paths and adjusts the number of paths in real-time. The evaluation shows that dynamically switching between MPTCP and SPTCP increases download time. In addition, the results of the DASH implementation over the proposed dynamic MPTCP path control show fewer bit rate changes and less rebuffering than without dynamic MPTCP path control.

The cross-layer fairness-driven SCTP-based CMT solution (CMT-CL/FD) [119] is a path quality-aware approach over CMT. In CMT-CL/FD, a cross-layer module evaluates path quality by using loss rate information derived from the effective signal-to-noise ratio (ESNR), which is calculated at the link layer, and bandwidth or transmission rate information, which is estimated at the transport layer. ESNR is a refinement of the signal-to-noise ratio (SNR) for evaluating wireless communication quality, because the default SNR method has some shortcomings. For example, SNR is not accurate in real-time communication, and it is not able to capture co-channel interference, frequency-selective fading and signal multipath effects [198]. CMT-CL/FD then distributes data intelligently over the different paths depending on their estimated quality. A loss-cause dependent retransmission (RTX) policy is also introduced to distinguish wireless loss from congestion loss. Consequently, in the case of a congested network, cwnd is changed and retransmission occurs (as explained in Section IV-B5). Finally, this approach mitigates reordering and losses, and consequently reduces the HOL blocking problem. However, none of these works use video content features for the scheduling strategy.

V Scheduling, Resilience, and Path Selection

A key characteristic of video data is that, depending on the en/decoding technology, packets may have unequal importance (e.g., I-frames vs. P-frames). Considering the importance of each packet, different error protection levels can be applied. In addition, packets can be sent over different network paths based on path quality to meet real-time deadlines, increase reliability, minimize out-of-order packet delivery, and circumvent path heterogeneity issues [8], as discussed in Section IV. Therefore, wireless multipath video scheduling strategies need to consider at least three main functional aspects: packet selection, packet protection, and path selection.

We now revisit the works surveyed in Section IV through the new classification presented in Tables V, VI, VII, and VIII, based on the following questions:

  • Which packet should be sent next?

  • How to protect the packet?

  • Which is the best path to send the packet?

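Column groups in Tables V to VIII: "Which packet?" covers Content Awareness and Video Distortion (Frame Level); "How to protect the packet?" covers the Channel Level (ARQ, FEC) and the Source Level (Scalability, MDC); "Which path?" covers the remaining five columns.

Table V

| Works | Content Awareness | Video Distortion (Frame Level) | ARQ | FEC | Scalability | MDC | RTT/Delay | PLR | Bandwidth | Delay Constraint | Video Distortion (Flow Level) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MRTP [1] | N | N | Y | Y | Y | Y | Y | Y | Y | N | N |
| MPRTP [2] | Y | N | Y | N | N | N | Y | Y | Y | N | N |
| Xing et al. [91] | N | N | Y | N | N | N | N | N | Y | N | N |
| RTRA [3] | N | N | Y | N | Y | N | N | N | Y | N | N |
| Houzé et al. [93] | N | N | Y | N | N | N | Y | N | N | Y | N |
| Afzal et al. [95] | Y | N | N | N | N | N | Y | Y | Y | N | N |
| Sohn et al. [96] | N | N | Y | N | Y | N | Paths pre-selected | | | | |
| Evensen et al. [97] | N | N | Y | N | N | N | Y | N | Y | N | N |
| Evensen et al. [98] | N | N | Y | N | N | N | Y | N | Y | Y | N |
| Evensen et al. [99] | N | N | Y | N | N | N | Y | N | Y | Y | N |
| Greenbag [71] | N | N | Y | N | N | N | Y | N | Y | Y | N |

Table VI

| Works | Content Awareness | Video Distortion (Frame Level) | ARQ | FEC | Scalability | MDC | RTT/Delay | PLR | Bandwidth | Delay Constraint | Video Distortion (Flow Level) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| BEMA [86] | Y | Y | N | Y | N | N | Y | Y | Y | Y | N |
| Freris et al. [100] | Y | Y | N | N | Y | N | Y | Y | Y | N | N |
| Correia et al. [101] | Y | N | N | N | N | Y | Paths pre-selected | | | | |
| MPLOT [4] | N | N | Y | Y | N | N | Y | Y | N | N | N |
| MP-DCCP [102] | Y | N | N | N | N | N | Y | Y | N | N | N |
| ADMIT [103] | N | N | Y | Y | N | N | Y | Y | Y | Y | Y |
| MPTCP-SD [104] | Y | N | Y | N | N | N | Y | N | N | N | N |
| MPTCP-PR [104] | N | N | Y | N | N | N | Y | N | N | N | N |
| PR-MPTCP [106] | Y | N | Y | N | N | N | Y | N | Y | Y | N |
| SRMT [109] | Y | N | Y | N | N | N | Paths pre-selected | | | | |
| CMT-QA [76] | N | N | Y | N | N | N | Y | N | N | N | N |
| CMT-DA [111] | N | N | Y | N | Y | N | Y | Y | Y | N* | Y |
| CMT-CA [112] | Y | Y | Y | N | N | N | Y | Y | Y | Y | N |

Table VII

| Works | Content Awareness | Video Distortion (Frame Level) | ARQ | FEC | Scalability | MDC | RTT/Delay | PLR | Bandwidth | Delay Constraint | Video Distortion (Flow Level) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Yap et al. [67] | N | N | Y | N | N | N | Paths pre-selected | | | | |
| MARS [81] | N | N | Y | N | N | N | Y | N | Y | N | N |
| BAG [113] | N | N | N | N | N | N | Y | N | Y | N | N |

Table VIII

| Works | Content Awareness | Video Distortion (Frame Level) | ARQ | FEC | Scalability | MDC | RTT/Delay | PLR | Bandwidth | Delay Constraint | Video Distortion (Flow Level) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Corbillon et al. [78] | Y | N | Y | N | N | N | Y | Y | Y | Y | N |
| Ojanperä et al. [114] | N | N | Y | N | N | N | Y | N | N | N | N |
| GALTON [115] | N | N | N | Y | Y | N | Y | Y | Y | Y | N |
| FRA-JSCC [116] | N | N | N | Y | Y | N | Y | Y | Y | Y | Y |
| MP-DASH [117] | N | N | Y | N | N | N | N | N | Y | Y | N |
| Nam et al. [118] | N | N | Y | N | N | N | N | N | Y | N | N |
| CMT-CL/FD [119] | N | N | Y | N | N | N | Y | Y | Y | N | N |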

V-A Which packet should be sent next?

One important scheduling task is selecting the next packet to be sent. Content awareness and video distortion at the frame level are key features for selecting the proper packets. These features are discussed in this subsection. Tables V, VI, VII, and VIII present each category by protocol layer.

Note that ABR approaches that rely on HTTP over separate TCP connections generally do not schedule individual packets for transmission; instead, the proper path has to be determined for each DASH segment or subsegment (e.g., [91, 3, 93, 97, 98, 99, 71]). However, when MPTCP is used for HTTP-based ABR video, the MPTCP scheduler performs its own transport-level scheduling of the DASH data stream it receives.

V-A1 Content Awareness

Considering video content features in the scheduling strategy helps to define the priority of each packet and, subsequently, to send the packets of higher-priority frames first or via better-qualified paths. In video streaming, some frames have a greater effect on video quality because many other frames depend on them. For example, I-frames have the highest priority among all frames. These strategies are generally referred to as content-aware scheduling strategies. In addition, a content-aware scheduling strategy could use stronger packet protection for higher-priority packets than for lower-priority ones, for example, by applying adaptive FEC, which is explained in the next subsection. On the other hand, if the scheduler is unaware of the video content features, the sending buffer transmits data packets in the same order as they arrived in the buffer (First-In First-Out, FIFO) without considering packet priority (e.g., the MPTCP scheduler).
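
The difference between FIFO and content-aware ordering can be sketched in a few lines. This is an illustration only: the packet dictionaries and the priority table are hypothetical, and ranking P-frames above B-frames is an assumption (the text only states that I-frames have the highest priority).

```python
# Illustrative priorities, lower value = sent earlier.
# I > P > B is an assumption of this sketch.
FRAME_PRIORITY = {"I": 0, "P": 1, "B": 2}


def fifo_order(packets):
    """Content-unaware baseline: keep the arrival order."""
    return list(packets)


def content_aware_order(packets):
    """Send higher-priority frame packets first.

    sorted() is stable, so packets of equal priority keep their arrival order.
    """
    return sorted(packets, key=lambda p: FRAME_PRIORITY[p["frame"]])


buffer = [
    {"id": 0, "frame": "B"},
    {"id": 1, "frame": "P"},
    {"id": 2, "frame": "I"},
    {"id": 3, "frame": "B"},
]
print([p["id"] for p in fifo_order(buffer)])
print([p["id"] for p in content_aware_order(buffer)])
```

A real scheduler would additionally respect decoding deadlines, so that a late I-frame packet does not starve packets that can still be played out on time.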

Video content features are considered as inputs to the scheduling strategy in the following works: MPRTP [2], Afzal et al. [95], BEMA [86], Freris et al. [100], Correia et al. [101], MP-DCCP [102], MPTCP-SD [104], PR-MPTCP [106], CMT-CA [112], and Corbillon et al. [78]. In SRMT [109], the primary path is used for all data while the secondary paths are used to send redundant packets, which are, in turn, chosen based on their priority (e.g., I-frame packets have the highest priority).

V-A2 Video Distortion (Frame Level)

Video distortion impacts perceived video quality. Generally, video distortion is considered at both the frame level and the flow level. In this section, we study frame level video distortion because it assesses inter-frame dependencies and analyzes each specific video frame, including the frame priority and decoding dependency [100]. We discuss flow level video distortion in Section V-C5. In particular, frame level distortion refers to the quality degradation of each frame of a GoP after data transmission and video decoding [86]. The frame level distortion is calculated as the sum of the truncation and drifting distortion. The truncation distortion refers to the video quality degradation caused by packet drops during data transfer, and the drifting distortion refers to the video quality degradation caused by imperfect reconstruction of the parent frames used for inter-frame prediction. In the surveyed works, frame level distortion is used by BEMA [86] for calculating FEC coding parameters (e.g., code rate and symbol size), and by [100] to assign higher priority values to the pictures that minimize the distortion of the decoded video affected by packet loss. Such information is also used for path selection in CMT-CA [112].

V-B How to protect the packet?

Providing packet protection techniques to the scheduler decreases the data loss rate and, consequently, improves video streaming throughput and QoE. In fact, inter-dependency among video frames makes compressed video very sensitive to data loss. Following this idea [199], individual frames are grouped together into a Group of Pictures (GoP). Each GoP consists of one initial Intra (I)-frame, several Predicted (P)-frames and possibly Bidirectional (B)-frames [200]. An I-frame is encoded without reference to any other video frame, a P-frame is encoded with reference to previous I- or P-frames, and a B-frame is encoded with reference to both the immediately previous and the following I- or P-frames. Therefore, in the decoding process, the loss of some frames may preclude proper decoding, especially when I-frames are missing. Thus, it is important to protect frames (especially I-frames) in lossy wireless channels. For this purpose, some JSCC/Channel Level and Error Resilience/Source Level techniques have been implemented. These techniques are discussed in this subsection. Tables V, VI, VII, and VIII present each category of such techniques divided by protocol layer.

V-B1 Joint Source and Channel Coding (JSCC)/Channel Level techniques

The channel level techniques for JSCC are Automatic Repeat reQuest (ARQ) and Forward Error Correction (FEC).

Automatic Repeat reQuest (ARQ) relies on retransmission requests to provide reliable data transmission. A retransmission occurs when a packet is lost or received with bit errors. Inherently, all protocols that run atop TCP or extend it (e.g., HTTP, DASH, MPTCP) use ARQ. However, retransmission wastes bandwidth, which causes network congestion and, consequently, increases End-to-End delay. Several works try to mitigate these problems: CMT-QA [76] retransmits packets over the path with the minimum transfer delay; CMT-DA [111] and CMT-CA [112] retransmit only the packets that are estimated to arrive at the destination within the deadline; and CMT-CL/FD [119] selects the path with the largest cwnd for the retransmission and sends the lost packet before all the other packets in the path buffer. In addition, given the many clients in multicast communications, responding to the retransmission requests of all clients might be difficult for the server.

Other surveyed works that utilize ARQ as a JSCC technique are MRTP [1], MPRTP [2], Houzé et al. [93], Xing et al. [91], RTRA [3], Sohn et al. [96], Evensen et al. [97], [98], [99], Greenbag [71], MPLOT [4], ADMIT [103], MPTCP-SD [104], MPTCP-PR [104], PR-MPTCP [106], SRMT [109], Yap et al. [67], MARS [81], Corbillon et al. [78], Ojanperä et al. [114], MP-DASH [117], and Nam et al. [118].

Forward Error Correction (FEC) appeared as a remedy for the shortcomings of packet retransmission under delay constraints, especially for live video streaming. The technique is also applied in multicast communication and whenever retransmission is costly or impossible, for instance, on one-way communication links [201].

FEC can be applied to recover from packet erasures/losses through cross-packet FEC in the application or transport layer (inter-packet FEC), and/or to handle bit errors in the physical layer [202] (intra-packet FEC). In wired networks, packet loss and packet truncation can occur due to congestion: packets are dropped either by the network routers or by the receiver due to excessive delay. In wireless networks, besides packet loss and packet truncation, there are also bit errors due to noisy channels. Next, more details about inter- and intra-packet FEC techniques are provided.

In inter-packet FEC, redundant/parity packets are commonly generated in addition to source packets to perform cross-packet FEC, which is usually achieved by erasure codes. These allow the receiver to detect erroneous packets and correct data without retransmission. The capability of FEC to recover lost data depends on the added redundant symbols. Among the many existing erasure codes, the most commonly studied ones are Reed-Solomon (RS) [203], Low-Density Generator Matrix (LDGM) [204] and Raptor codes [205]. In our surveyed works, ADMIT [103], GALTON [115] and FRA-JSCC [116] utilize RS due to the stringent delay constraint. MPEG-H part 10 defines several MMT AL-FEC algorithms, including RS codes and LDGM. Raptor coding is used in BEMA [86] due to its low processing time and high error correction capability. Such erasure codes can be applied at the frame level, GoP level, or SubGoP level for video protection [116].

At the frame level [206], the frames in each GoP are classified in terms of their type and their distance from the leading I-frame. Then, FEC is applied to the frames according to their priority. In addition, low-priority frames can be dropped based on network conditions. At the GoP level (see Figure 13), each GoP is packetized into source packets, and FEC encoding maps the source packets to encoded packets. An FEC block of $n$ data packets contains $k$ source packets and $n-k$ redundant packets. The $n-k$ redundant packets make up the FEC redundancy, and the code rate is equal to $k/n$. At the SubGoP level [86], each GoP consists of several subgroups, each mapped to a source block. In our surveyed works, the GoP level is used in ADMIT [103], GALTON [115], and FRA-JSCC [116], and the SubGoP level is used in BEMA [86].
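
To make the idea of an FEC block concrete, the sketch below builds a block from k source packets plus a single XOR parity packet (n = k + 1 packets in total) and repairs one erasure at the receiver. This is deliberately the simplest possible erasure code and is not one of the codes used by the surveyed works (RS, LDGM, Raptor); function names and the equal-length packet assumption are illustrative.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_block(source_packets):
    """Append one XOR parity packet to k equal-length source packets."""
    parity = bytes(len(source_packets[0]))  # all-zero packet
    for pkt in source_packets:
        parity = xor_bytes(parity, pkt)
    return list(source_packets) + [parity]


def recover_block(received):
    """Return the k source packets from n received slots.

    A lost packet is represented by None. A single parity packet can
    repair at most one erasure per block.
    """
    missing = [i for i, pkt in enumerate(received) if pkt is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair at most one erasure")
    received = list(received)
    if missing:
        size = len(next(p for p in received if p is not None))
        repaired = bytes(size)
        for pkt in received:
            if pkt is not None:
                repaired = xor_bytes(repaired, pkt)
        received[missing[0]] = repaired
    return received[:-1]  # drop the parity packet


source = [b"abcd", b"efgh", b"ijkl"]   # k = 3
block = encode_block(source)            # n = 4, code rate 3/4
block[1] = None                         # one packet lost in transit
print(recover_block(block) == source)
```

Codes such as Reed-Solomon generalize this idea so that any k of the n packets suffice, at the cost of more complex encoding and decoding.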

Fig. 13: GoP level FEC technique.

A trade-off between bandwidth/End-to-End delay and FEC redundancy is required. In particular, a smaller FEC packet size indicates a larger FEC block size due to the larger number of redundant packets [103]. While higher redundancy leads to better recoverability, it also increases overhead rate and bandwidth consumption. Consequently, congestion, packet reordering, FEC decoding delay and End-to-End delay have their probability increased, especially in the presence of burst losses. Therefore, an adaptive FEC is required to minimize these problems (e.g., bandwidth consumption and End-to-End delay), and maximize the recoverability by adaptively changing FEC parameters (e.g., adequate FEC packet size and FEC redundancy) according to the network channel status, application delay characteristics, or based on the importance of content data. For example, a stronger FEC would be used in a more lossy channel while not required in a more stable channel with less loss rate percentage, or more robust FEC could also be used only for I-frames rather than B or P-frames.
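
The redundancy trade-off can be illustrated by picking the smallest block that meets a recovery target for a given loss rate. The sketch below is built on simplifying assumptions that do not come from the surveyed works: losses are independent (no bursts), the code can recover the block from any k of the n packets (as an ideal erasure code such as RS would), and the target probability and function names are hypothetical.

```python
from math import comb


def block_recovery_probability(n, k, loss_rate):
    """P(at least k of n packets arrive) under independent packet losses."""
    return sum(
        comb(n, r) * (1 - loss_rate) ** r * loss_rate ** (n - r)
        for r in range(k, n + 1)
    )


def choose_block_size(k, loss_rate, target=0.99, max_n=255):
    """Smallest n >= k whose recovery probability meets the target, else None."""
    for n in range(k, max_n + 1):
        if block_recovery_probability(n, k, loss_rate) >= target:
            return n
    return None


# More loss calls for more redundancy (larger n for the same k).
for loss in (0.0, 0.05, 0.2):
    print(loss, choose_block_size(10, loss))
```

Because every extra packet also consumes bandwidth and adds decoding delay, an adaptive scheme would rerun such a selection whenever the measured loss rate or the delay budget changes.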

In our surveyed works, adaptive FEC is used in several works, like FRA-JSCC [116] and GALTON [115] to find FEC redundancy, ADMIT [103] to adjust FEC redundancy and code rate, and BEMA [86] to set code rate and symbol size. Moreover, MPLOT [4] also adaptively chooses block sizes, considering the usage of large block sizes in order to reduce bursty loss for delay-tolerant applications. We also identified FEC usage in MRTP [1].

Besides using FEC, an adequate technique is also required to distinguish losses due to traffic congestion from those caused by wireless channel disturbances and impairments. The reason is that FEC redundancy in lossy wireless networks leads to better packet recovery, whereas adding more FEC redundancy in a congested network worsens the situation, since it causes higher congestion and more losses [207] due to bit stuffing operations. More technical details on packet loss differentiation are provided in Section VI-A.

In intra-packet FEC, channel coding is applied to correct bit errors in the physical layer. Turbo codes (parallel concatenated convolutional codes) and Low-Density Parity-Check (LDPC) codes are generally used. Error detection is performed at the link layer, based on the Cyclic Redundancy Check (CRC). As a result, only packets that pass the CRC stage are visible at the network/Internet layer.

Therefore, FEC provides a reliable access network and minimizes End-to-End video distortion. Moreover, a joint ARQ and FEC approach can enhance efficiency, depending on the strategy adopted to couple both techniques. For example, in our surveyed works, ADMIT [103] utilizes FEC for reconstructing data, which reduces delay and makes video data ready for fast playback. However, this does not reduce the number of retransmissions or the resulting bandwidth consumption, since no ACK message is sent to inform the server that the data has been successfully reconstructed. Therefore, the MPTCP protocol on the server keeps retransmitting each lost packet until it receives the ACK from the receiver. This scenario motivates a proper joint ARQ-and-FEC approach that uses FEC for data protection and retransmits only when data reconstruction is not possible.

V-B2 Error Resilience/Source Level

Besides employing JSCC techniques to recover from packet loss and bit errors, increasing the error resilience of the video sequence itself is also an important task. To provide this functionality, error resilience techniques include, among others, Scalable Video Coding (SVC) and Multiple Description Coding (MDC).

In SVC [208], the source video is encoded into one base layer and several enhancement layers. These layers are hierarchically dependent on each other. This means that, at the receiver, each layer can be decoded only when its lower layers have been correctly received. Therefore, video quality improves with the number of received enhancement layers. In order to improve the efficiency of SVC, the base layer is often protected by FEC or transmitted through more reliable paths, due to its importance. In the approach proposed in [100], each packet is transmitted to the network only if all related packets in lower layers have been sent before. Other surveyed works that utilize SVC for error resilience are MRTP [1], RTRA [3], Sohn et al. [96], CMT-DA [111], GALTON [115] and FRA-JSCC [116].

In MDC [208], source video is encoded into several independent compressed streams which are called descriptions. Each description can be decoded independently and shall provide acceptable quality. When one or more descriptions arrive at the receiver, a video with a certain quality level would be made by the decoder. MDC is a good alternative to retransmission in order to remedy the delay constraint in real-time video streaming.

According to a review of MDC techniques for video streaming [208], MDC is more useful than FEC in highly lossy networks, since FEC uses long code block sizes, which also increases bandwidth consumption. MDC also outperforms SVC in highly lossy networks, but SVC is more suitable than MDC in networks with a low loss rate, due to its lower overhead. MDC is also recommended for multicast with heterogeneous receivers [209]. Accordingly, works like Correia et al. [101] and MRTP [1] utilize MDC as an error resilience technique.

V-C Which is the best path to send the packet?

Before discussing how to select the proper path to transfer a packet, it is worthwhile to mention that using many paths for data transmission does not always lead to better QoE, since parallel connections for video delivery incur large overheads [144]. According to [210], the maximum multipath benefit can be achieved with just two paths, provided that a proper scheduling strategy is used.

The simplest scheduling strategy is Round Robin [8]. This strategy sorts paths and sends data to the next available path in circular order without taking into account the heterogeneous paths’ characteristics. In Round Robin strategy, slow channels would be overloaded while fast channels remain underutilized (e.g., CMT [197]).

Scheduling strategies that are aware of path characteristics (e.g., RTT, packet loss rate) make wiser scheduling decisions. These strategies are generally referred to as path-aware scheduling strategies. For example, Weighted Round Robin (WRR) is a scheduling strategy that assigns a weight to each path. The weight reflects the path capability in terms of available bandwidth, delay, or packet loss rate. This way, data distribution is proportional to the transmission capability of each path (e.g., MPTCP and FRA-JSCC [116]). Earliest Delivery Path First (EDPF) is another scheduling strategy, which estimates the delivery time of each packet over each path. The packets are then transmitted over the fastest path in order to avoid missing their deadlines and to minimize packet reordering (e.g., BAG [113] and MPLOT [4]).

Finding the End-to-End path capability for real-time video traffic leads to an estimate of path quality or path reliability [95], [86, 103, 76]. Therefore, a scheduling strategy could map higher-priority packets to the more reliable or better-qualified paths, which combines content-aware and path-aware scheduling.

It is important to note that mapping many packets to the most qualified or reliable paths causes congestion on those paths and consequently decreases video quality, which is called the load imbalance problem [111]. Therefore, a method to balance the data over the available paths is required. In our surveyed works, BEMA [86], Freris et al. [100], ADMIT [103], CMT-CA [112], CMT-DA [111] and GALTON [115] use a load balancing mechanism to avoid the imbalance problem.
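
The two path-aware strategies named above can be contrasted with a short sketch. Both functions are simplified illustrations, not the schedulers of the cited works: the integer weights, the path dictionaries, and the delivery-time estimate (queueing on the path plus transmission time plus one-way delay) are assumptions.

```python
def weighted_round_robin(num_packets, weights):
    """Assign packets to paths in proportion to integer path weights."""
    cycle = [path for path, w in weights.items() for _ in range(w)]
    return [cycle[i % len(cycle)] for i in range(num_packets)]


def edpf_schedule(packet_sizes_bits, paths):
    """Earliest Delivery Path First (simplified).

    paths maps a name to {"bandwidth_bps": float, "delay_s": float}.
    Each packet goes to the path with the earliest estimated delivery time.
    """
    free_at = {name: 0.0 for name in paths}  # when each path finishes its queue
    assignment = []
    for size in packet_sizes_bits:
        estimates = {
            name: free_at[name] + size / p["bandwidth_bps"] + p["delay_s"]
            for name, p in paths.items()
        }
        best = min(estimates, key=estimates.get)
        free_at[best] += size / paths[best]["bandwidth_bps"]
        assignment.append(best)
    return assignment


paths = {
    "wifi": {"bandwidth_bps": 8e6, "delay_s": 0.01},
    "lte": {"bandwidth_bps": 2e6, "delay_s": 0.03},
}
print(weighted_round_robin(8, {"wifi": 3, "lte": 1}))
print(edpf_schedule([8000] * 4, paths))
```

In this example EDPF keeps using the faster path until its queue grows enough that the slower path would deliver a packet sooner, which is how it limits reordering at the receiver.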

The network characteristics most commonly used to estimate the quality or reliability of network channels are RTT/Delay, PLR, and Available bandwidth/Throughput/Goodput. Some other metrics also lead to better path selection and scheduling decisions, such as the delay constraint and video distortion at the flow level. These network characteristics and metrics are discussed in this subsection. Tables V, VI, VII, and VIII present each category per protocol layer.

V-C1 RTT/Delay

Round-Trip Time (RTT) is the time required for a packet to be sent plus the time it takes to receive an ACK of that packet [115, 103]. Therefore, RTT consists of the packet transmission time and the path propagation delay [103]. In order to avoid sudden variations of RTT, some approaches (e.g., MPTCP and SCTP) apply a smoothing factor to the RTT, which yields the smoothed Round-Trip Time (sRTT). In approaches without an ACK mechanism, for example, UDP-based approaches, the one-way delay can be considered instead of RTT.
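
RTT smoothing is typically an exponentially weighted moving average. The sketch below uses a smoothing factor of 1/8, the value specified for the standard TCP and SCTP retransmission timer computation; whether a given surveyed approach uses exactly this value is an assumption, and the function name is illustrative.

```python
def update_srtt(srtt, sample, alpha=0.125):
    """Exponentially weighted moving average of RTT samples.

    srtt is None before the first measurement; the first sample then
    initializes the estimate.
    """
    if srtt is None:
        return sample
    return (1 - alpha) * srtt + alpha * sample


srtt = None
for rtt_ms in (100.0, 180.0, 100.0, 100.0):
    srtt = update_srtt(srtt, rtt_ms)
    print(srtt)
```

A single 180 ms spike moves the estimate only to 110 ms, which is why schedulers that rank paths by sRTT react more calmly than schedulers that use raw samples.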

Considering RTT/Delay for path scheduling decreases the probability of expired arrival packets, stall or out-of-order packet delivery. In our surveyed works, MARS [81], which is implemented over separate TCP connections, utilized a relative RTT measurement method based on OpenFlow protocol. In this approach, duplicated packets (probes) are sent through different interfaces. The probes would return to the sender through the common reverse path from the edge switch close to the client side. The transfer process can be implemented with the tables of OpenFlow at the edge switch. The approach measures the relative delay of forward paths instead of their absolute delay because, in case of absolute forward path delays, the tight clock synchronization between sender and receiver is required. More information and comparison details between relative and absolute delay can be found at [211].

In SCTP protocol, the acknowledgment of the sent packet (SACK) can be transmitted over different paths. Mostly the acknowledgment packet returns through the most reliable path to mitigate the probability of dropped or overdue feedback packets. Since paths have different delay characteristics, the estimated RTT is incorrect and using this estimated RTT to find the path quality leads to the wrong result. For this reason, CMT-QA [76] does not use RTT directly. Instead, it uses transmission delay. Transmission delay refers to the time difference between the time of the first chunk entering each path sender buffer from a group of distributed data chunks and the time of the last chunk leaving the path sender buffer. CMT-CL/FD [119] utilizes the SCTP heartbeat mechanism to calculate RTT. In this mechanism, the HEARTBEAT-ACKs have to return through the same path used to send the HEARTBEAT messages.

Since RTT can be calculated in RTCP, the control protocol that generally accompanies RTP, by using sender and receiver reports, multipath transmission approaches over RTP, such as MRTP [1] and MPRTP [2], extend RTCP in order to calculate RTT in multipath transmission.
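
For reference, the single-path RTCP computation that such extensions build on can be sketched as follows. In standard RTCP, a receiver report echoes the timestamp of the last sender report (LSR) and the delay since it was received (DLSR), both in units of 1/65536 s, and the sender subtracts both from the report's arrival time. The function name is hypothetical, and the per-path bookkeeping of MRTP or MPRTP is not shown.

```python
def rtcp_rtt_seconds(arrival_ntp32, lsr, dlsr):
    """Round-trip time from an RTCP receiver report.

    All inputs are 32-bit values in 1/65536-second units (the middle
    32 bits of a 64-bit NTP timestamp):
      arrival_ntp32: time the receiver report arrived at the sender
      lsr:           timestamp of the last sender report, echoed back
      dlsr:          delay at the receiver between that sender report
                     and this receiver report
    """
    rtt_units = (arrival_ntp32 - lsr - dlsr) & 0xFFFFFFFF  # modulo 2^32 for wrap-around
    return rtt_units / 65536.0


# Report arrives at 46864.500 s, the sender report was stamped 46853.125 s,
# and the receiver held it for 5.250 s before reporting.
print(rtcp_rtt_seconds(0xB7108000, 0xB7052000, 0x00054000))
```

In a multipath setting the same arithmetic would have to be kept per path, which requires reports to identify the path they refer to.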

FRA-JSCC [116] and BAG [113], which are the approaches that use UDP as transport protocol, utilize propagation delay. FRA-JSCC [116] calculates propagation delay network characteristic by using the existing time stamp in each header packet.

RTT/Delay is also used for packet loss differentiation decision in CMT-QA [76], CMT-CL/FD [119], BEMA [86], ADMIT [103], GALTON [115], and CMT-CA [112]. More technical details on packet loss differentiation are provided in Section VI-A. Besides, RTT/Delay can also be used for other tasks. For example, MRTP [1] sets retransmission timeout value by RTT, and Greenbag [71] utilizes RTT to determine when to send requests for the next segments.

Other surveyed works that consider the RTT/Delay network characteristic for their scheduling decision are Houzé et al. [93], Afzal et al. [95], Evensen et al. [97], [98], [99], Freris et al. [100], MPLOT [4], MP-DCCP [102], MPTCP-SD [104], MPTCP-PR [104], PR-MPTCP [106], CMT-DA [111], Corbillon et al. [78], and Ojanperä et al. [114].

V-C2 PLR

Packet Loss Rate (PLR) comprises packets lost in network transmission, i.e., packets that are lost or arrive with errors along the communication paths, and expired (overdue) arrivals [115]. Three basic reasons cause packet losses [76]: 1) congestion due to limited bandwidth or buffer size, 2) noise or interference in the wireless networks, 3) path failure or handover. Therefore, sending the packets of the highest-priority frames on the paths with lower PLR leads to better QoE. Besides, PLR and packet loss differentiation are key factors for adaptive FEC protection (Section V-B1), avoiding unnecessary fast retransmission (Section IV-B), and video distortion estimation (Section V-A2 and Section V-C5).

PLR is considered for the scheduling decision in the following works: MRTP [1], MPRTP [2], Afzal et al. [95], BEMA [86], Freris et al. [100], MPLOT [4], MP-DCCP [102], ADMIT [103], CMT-DA [111], CMT-CA [112], Corbillon et al. [78], GALTON [115], FRA-JSCC [116], and CMT-CL/FD [119].

V-C3 Available bandwidth/Throughput/Goodput

Available bandwidth is defined as the maximum video rate that can be transmitted over an End-to-End path [103]. Different methods to estimate available bandwidth have been introduced in the literature [212, 213, 214]. Some approaches utilize throughput or goodput for this purpose. Throughput is the amount of data that traverses a path; it covers all data, useful or not, including retransmissions and overhead (e.g., headers). If the scheduler considers only throughput among all network characteristics, it may distribute packets over channels with a high loss rate, which seriously degrades goodput and video quality [215]. Goodput refers to the amount of useful data (excluding protocol overhead and retransmissions) delivered successfully to the destination within the imposed deadline [115]. Goodput is also known as application-level throughput. According to [103], approaches over HTTP/TCP could estimate the available bandwidth by using the observed TCP throughput. In our surveyed works, [100] measures bandwidth by using Abing [216]. GALTON [115] and FRA-JSCC [116] implement the pathChirp algorithm [217] for this purpose. CMT-CL/FD [119] computes available bandwidth as the ratio between the average packet length and the average inter-packet sending time. CMT-CA [112] and CMT-DA [111] consider that cwnd affects bandwidth and therefore derive the available bandwidth from cwnd. In RTRA [3], once a segment has been successfully downloaded, the transmission bandwidth is calculated as the total size of the transmitted data divided by the transmission time, and a Markov channel model is then used to estimate the future available bandwidth.

Other surveyed works that consider the Available bandwidth/Throughput/Goodput network characteristic for their scheduling decision are MRTP [1], MPRTP [2], Xing et al. [91], Afzal et al. [95], Evensen et al. [97], [98], [99], Greenbag [71], BEMA [86], ADMIT [103], PR-MPTCP [106], MARS [81], BAG [113], Corbillon et al. [78], MP-DASH [117], and Nam et al. [118].
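
The distinction between throughput and goodput can be made concrete with a small sketch over a packet trace. The record layout (sequence number, bytes on the wire, payload bytes, arrival time) is hypothetical, and counting each payload once, no matter how many copies arrive, is this sketch's reading of "excluding retransmissions".

```python
def throughput_and_goodput(packets, duration_s, deadline_s):
    """Return (throughput_bps, goodput_bps) for a trace of packets.

    Each packet is a dict with:
      'seq'           sequence number (retransmissions reuse it)
      'wire_bytes'    payload plus headers
      'payload_bytes' application data only
      'arrival_s'     arrival time, or None if the packet was lost
    """
    total_bits = 0
    delivered = {}  # seq -> payload bytes, each payload counted once
    for p in packets:
        if p["arrival_s"] is None:
            continue  # lost packets never traversed the path
        total_bits += 8 * p["wire_bytes"]
        if p["arrival_s"] <= deadline_s:
            delivered.setdefault(p["seq"], p["payload_bytes"])
    useful_bits = 8 * sum(delivered.values())
    return total_bits / duration_s, useful_bits / duration_s


trace = [
    {"seq": 1, "wire_bytes": 1040, "payload_bytes": 1000, "arrival_s": 0.1},
    {"seq": 2, "wire_bytes": 1040, "payload_bytes": 1000, "arrival_s": None},  # lost
    {"seq": 2, "wire_bytes": 1040, "payload_bytes": 1000, "arrival_s": 0.4},   # retransmission
    {"seq": 3, "wire_bytes": 1040, "payload_bytes": 1000, "arrival_s": 1.5},   # overdue
]
print(throughput_and_goodput(trace, duration_s=1.0, deadline_s=1.0))
```

The overdue packet still counts toward throughput but not toward goodput, which is the gap a throughput-only scheduler cannot see.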

V-C4 Delay Constraint

A real-time video application imposes a decoding deadline. Overdue packets cannot be handled by the decoder even if they arrive successfully. Therefore, the End-to-End delay has to be less than the delay constraint [103]. Besides that, considering the delay constraint in the scheduling strategy can also avoid playback buffer starvation [86].

In our surveyed works, the delay constraints of GALTON [115], ADMIT [103], FRA-JSCC [116] and CMT-CA [112] are set to 300, 500, 250 and 100 ms for each video frame, respectively. In BEMA [86], this value is set equal to the playback duration of a frame, so the delay constraint is 40 ms if the video is encoded at 25 frames per second. GALTON [115] uses the delay constraint to compute transmission intervals in order to mitigate consecutive losses. ADMIT [103] calculates the rate allocation vector and FEC coding parameters with respect to the delay constraint. FRA-JSCC [116] performs source rate adaptation under the delay constraint. CMT-CA [112] finds the optimal congestion window sizes and frame scheduling vector to mitigate video distortion. CMT-DA [111] is not appropriate for video streaming with a stringent delay constraint, but its retransmission method is based on the delay constraint. In [100], the same deadline is assumed for all users, determined as a system parameter by the service provider, and is then used to find the packet loss probability.

The approaches proposed in [93], [98, 99], and GreenBag [71] are application-aware: they know the buffer level at the receiver and use it to calculate the delay constraint. These approaches utilize adaptive streaming over multiple separate TCP connections, and path selection is mostly integrated with the adaptation logic. In [98] and [99], the delay constraint is calculated by the client to select a suitable bit rate. The client calculates the amount of already received content that is waiting for playout in the buffer (transfer-deadline) and estimates how long it takes to receive the already requested data (pipeline-deadline). The difference between the pipeline-deadline and the transfer-deadline is the amount of time that the client can wait to receive the next segment without interruption. This estimate is then compared with the estimated times needed to receive the desired segment at the different bit rates, and the most suitable bit rate is selected. After that, the segment is divided into subsegments. The size of each subsegment is decided based on the measured throughput of the interface through which it will be requested. The approach in [93] finds a suitable segment bit rate by checking the size of the first frame in each segment representation. It chooses the representation with the highest bit rate that still has a high probability of delivering the frame on time. Then, it dynamically finds the best byte range size per path based on the paths' RTT. GreenBag [71] utilizes the paths' delay and available bandwidth to determine the subsegment size per path. If one path has received its subsegment of a segment but the other path is lagging significantly, the former path takes over some portion of the lagging path's subsegment to recover. These approaches can achieve zero or close to zero interruptions during playback.

Two other application-aware approaches that take the delay constraint into account are [78] and MP-DASH [117]. These approaches utilize adaptive streaming over MPTCP paths. MP-DASH [117] feeds the modified MPTCP with the deadline of each video data unit for later use in path selection. The approach in [78] derives the display time of each video unit from the Picture Order Count (POC) and the coding identifier of each frame (because it is content-aware). It then estimates the deadlines, skips the transmission of packets that will miss their playback deadline, and instead assigns higher priority to the packets whose deadline is close. The high-priority packets can be sent over paths with lower RTT. This helps to use bandwidth more efficiently and to reduce video distortion. In PR-MPTCP [106], when the network is detected as congested, only the packets that still have enough time before their playback deadline are sent.
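
The deadline-driven bit rate choice and the throughput-proportional subsegment split described for [98] and [99] can be sketched as follows. This is a simplified reading of the description above, not the authors' algorithm: all function and parameter names are hypothetical, the aggregate throughput is taken as the plain sum over interfaces, and falling back to the lowest bit rate when nothing fits is an assumption.

```python
def select_bitrate(bitrates_bps, segment_duration_s, buffered_s,
                   pending_bits, throughputs_bps):
    """Pick the highest bit rate whose next segment fits in the slack time."""
    total = sum(throughputs_bps.values())
    transfer_deadline = buffered_s            # playout time already in the buffer
    pipeline_deadline = pending_bits / total  # time to finish requested data
    slack = transfer_deadline - pipeline_deadline
    feasible = [
        b for b in bitrates_bps
        if b * segment_duration_s / total <= slack
    ]
    return max(feasible) if feasible else min(bitrates_bps)


def split_segment(segment_bytes, throughputs_bps):
    """Split a segment into per-interface subsegments proportional to throughput."""
    total = sum(throughputs_bps.values())
    names = list(throughputs_bps)
    sizes = {n: int(segment_bytes * throughputs_bps[n] / total) for n in names}
    sizes[names[-1]] += segment_bytes - sum(sizes.values())  # rounding remainder
    return sizes


throughputs = {"wifi": 3_000_000, "lte": 1_000_000}
rate = select_bitrate([1_000_000, 2_500_000, 5_000_000], 2.0,
                      buffered_s=4.0, pending_bits=4_000_000,
                      throughputs_bps=throughputs)
print(rate)
print(split_segment(1_000_000, throughputs))
```

With 4 s buffered and 1 s of pending transfers, the client has 3 s of slack, which is enough for the highest representation in this example; with a smaller buffer the same code would step down to a lower bit rate.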

V-C5 Video Distortion (Flow Level)

We previously discussed frame-level video distortion in Section V-A2. Here, we study flow-level video distortion. End-to-end video distortion at flow level (intra-coding) is calculated as the sum of source and channel distortion [111]. Source distortion is determined by the video source rate and the video sequence parameters, because both affect the efficiency of the video codec. For example, at the same video encoding rate, a more complex video sequence has higher distortion, whereas increasing the encoding rate decreases distortion. Channel distortion results from packet losses during network transmission and from expired arrivals. Other features, including the frame structure and the GoP size, also affect both the source and the channel distortion. Flow-level video distortion is considered in the scheduling strategy of the following surveyed works: ADMIT [103], CMT-DA [111] and FRA-JSCC [116].

Although the most important network characteristics and metrics for path selection have been discussed, other parameters are used directly or indirectly (to calculate RTT, PLR or other metrics) by different approaches. For example, cwnd is used in MPLOT [4], MP-DCCP [102] (CCID2), CMT-CA [112] and CMT-DA [111]; the sending rate is used in MP-DCCP [102] (CCID3) and CMT-CL/FD [119]; and a cost function is used in MP-DASH [117] and GreenBag [71]. In MP-DASH, cost can be data usage, energy consumption or both, and in GreenBag [71], cost refers to energy consumption. Other useful factors include buffer size, packet size and packet count.

VI Analysis and Comparison of Methods and Techniques

In the previous two sections, we analyzed different multipath wireless video streaming works based on layer dependency and scheduling functions. In this section, we study other relevant features and related methods used in these works. Table IX reclassifies the previously discussed surveyed works based on the features or methods their authors used.

Packet loss
MRTP  [1] Not used N Not defined Not defined OPNET PSNR, Bandwidth utilization, Buffer overflow probability, Playout buffer size Real-time
MPRTP  [2] Used Y H.264/AVC x264 Realistic testbed, NetEm, Disjoint paths, Client interfaces: WiFi and 3G or multiple 3G PSNR, Loss rate, Bandwidth utilization, Connection setup time
Xing et al.  [91] Not used N H.264/AVC Realistic testbed, Android framework, Disjoint paths, Client interfaces: WiFi and 3G Playback fluency average, Playback quality, Quality switch, Average 3G traffic, Playback traces, Buffer occupancy Not defined
Not used N H.264/SVC JSVM Realistic testbed, Android framework, Client interfaces: WiFi and Bluetooth Startup delay, Playback fluency average, Playback quality, Quality switch, Bandwidth utilization, Playback traces, Buffer occupancy
Houzé et al.  [93] Not used N HEVC HM Client interfaces: five homogeneous xDSL links Cumulative Distribution Function (CDF) of frame sizes, QoE (SAMVIQ method)
Afzal et al.  [95] Not used N H.264 Client interfaces: LTE, WiFi (802.11n) Loss rate, I and NI frame packet loss rate,
Sohn et al. Not used N SHVC JSVM Own visual studio Client interfaces: WiFi and Ethernet Play time for base layer, Quality switch
Evensen et al. Not used N Not defined Not defined Realistic testbed, Ubuntu framework, Client interfaces: WiFi (IEEE 802.11b) and Cellular (HSDPA) Quality distribution, Missed deadlines,
Evensen et al. Not used N Not defined Not defined Realistic testbed, Ubuntu framework, Client interfaces: WiFi (IEEE 802.11b) and Cellular (HSDPA) Quality distribution, Missed deadlines,
Evensen et al. Not used N Not defined Not defined Realistic testbed, Ubuntu framework, Client interfaces: WiFi (IEEE 802.11b) and Cellular (HSDPA) Quality distribution, Missed deadlines,
Not used N Not defined Not defined Realistic testbed, Own C and JAVA implementation, Android framework, Client interfaces: WiFi and LTE Playback time, Interruption time, Energy consumption, Buffer size, In-order data
ZigZag Y H.264/AVC JM Client interfaces: Cellular, WiFi (802.11a/g) and WiMAX (802.16) End-to-End delay, Streaming rate, Number of frames lost, Inter-packet delay, Bandwidth utilization, Loss rate
Freris et al. Not used H.264/SVC Not defined Matlab for subroutines, Client interfaces: Ethernet, WiFi (802.11b) and WiFi (802.11g) Streaming rate, Packet delivery delay, Delivery ratio, Run time, Cost functions evaluation (service differentiation)
Correia et al. Not used N H.264/AVC Not defined Not defined
ECN Y Not defined Not defined Bandwidth utilization, Congestion window size (fairness test), Effect of loss correlations
Not defined
H.264/AVC Not defined Disjoint paths, Client interfaces: WiFi, 3G and Ethernet Decodable ratio of transmitted frames
ZigZag Y H.264/AVC JM Client interfaces: WiFi, Cellular and WiMAX End-to-End delay, Congestion window size (fairness test), Inter-packet delay, FEC redundancy, Out-of-order packets
Not used Y H.264/AVC Not defined Disjoint paths, Client interfaces: 3G and 3G
MPTCP-PR  [104] Not used Y H.264/AVC Not defined NS2, Disjoint paths, Client Interfaces: 3G and 3G PSNR
PR-MPTCP  [106] Not used Y Not defined Not defined NS3, Disjoint paths, Client interfaces: WiFi and LTE PSNR, VQM, SSIM, Number of frames received or dropped Real-time
SRMT  [109] Not used N H.264/AVC Not defined Simulator not defined, Client interfaces: WiFi (802.11g), 3G Or WiFi, ADSL PSNR, SSIM, Goodput, Delay distribution
PR-SCTP  [110] Not used N H.264/AVC Not defined Realistic testbed, FreeBSD framework, Netem Successful frame transmission ratio, Frame late index Real-time
CMT-QA  [76] ORP N H.264/AVC Not defined Disjoint paths, Client interfaces: 3G, WiMAX (802.16) and WiFi (802.11) PSNR, VQM, SSIM, Number of frames lost, Out-of-order packets, Average retransmission, Average throughput Real-time
CMT-DA  [111] ECN N H.264/SVC JSVM EXata, Client interfaces: Cellular, WiFi and WiMAX PSNR, Inter-packet delay, Goodput, Loss rate, Out-of-order packets Real-time
CMT-CA  [112] ZigZag Y H.264/AVC FFmpeg Client interfaces: Cellular, WiFi and WiMAX PSNR, End-to-End delay, CDF of inter-packet delay, Out-of-order packets, Goodput, Number of frames (I,P) lost
Yap et al.  [67] Not used N Not defined Not defined Realistic testbed, Android and Ubuntu framework, Real access networks, Up to 10 client interfaces composed of: 3G (HSPA, CDMA), WiMAX and WiFi (802.11a/g) Throughput, Goodput, CPU load, Power consumption, RTT Not defined
MARS  [81] Not used N Not defined Not defined Own JAVA socket implementation, Four client interfaces composed of: WiFi and LTE Out-of-order packets, Reordering delay, End-to-End delay, Throughput Real-time
BAG  [113] Not used N H.263 Not defined Realistic testbed, Up to five client interfaces composed of 3G Delay distribution, Lost frame ratio, Required Bandwidth, Video disruption (glitch statistics) Real-time (interactive)
Corbillon et al.  [78] Not used Y HEVC FFmpeg Own C++ implementation, Disjoint paths, Client interfaces: 3G and WiFi PSNR, MS-SSIM, Received frame ratio, Received tile ratio
Ojanperä et al. Not used Y H.264/AVC FFmpeg Realistic testbed, Ubuntu framework, Client interfaces: WiFi (802.11g) and WiFi (802.11a) Quality switch, Not defined
GALTON  [115] ZigZag N H.264/SVC JSVM EXata, Client interfaces: WiFi, WiMAX, Cellular (HSDPA) or multiple wired interfaces PSNR, Goodput, End-to-End delay, Loss rate Real-time
Not used N H.264/SVC JSVM Client interfaces: WiFi (802.11b), WiMAX and Cellular End-to-end delay, Loss rate, Available bandwidth
MP-DASH  [117] Not used N H.264/AVC Not defined Realistic testbed, Ubuntu framework, Real access networks, Client interfaces: WiFi and Cellular Throughput, Energy consumption, Download time, Average 3G traffic Not defined
Nam et al.  [118] Not used Y H.264/AVC Not defined Realistic testbed, Ubuntu framework, Real MPEG-DASH platform, Mininet over WiFi for SDN, Real access networks, Client interfaces: WiFi (802.11g) and WiFi (802.11a) Played bit rate, Rebuffering, Out-of-order packets Real-time
CMT-CL/FD  [119] RTX Y Not defined Not defined NS2, Disjoint paths, Server and client interfaces: 3G (WCDMA), WiMAX (802.16) and WiFi (802.11) PSNR, Video buffer underflow, Throughput, Fairness test Real-time

VI-A Packet Loss Differentiation

A packet loss differentiation method distinguishes congestion losses from wireless losses. In heterogeneous wireless networks, packet losses caused by lossy channels, handover, noise or interference occur more often than losses caused by congestion [76]. Identifying the reason for losses is essential. For example, if losses occur because of congestion in the network, then retransmitting or adding more FEC redundancy worsens the congestion and causes more losses [207] (Section V-B), whereas decreasing cwnd mitigates congestion. On the other hand, if losses occur because of a lossy wireless network, then decreasing cwnd drops goodput sharply (Section IV-B), whereas adding more FEC redundancy leads to better recovery. Therefore, with an accurate loss differentiation method, a sender can react properly to the network situation.
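The two reactions contrasted above can be written as a small decision rule. This is a minimal sketch under stated assumptions: halving cwnd, counting FEC redundancy in repair packets per block, and the names `react_to_loss`, `fec_step` and `fec_max` are all illustrative and not taken from any surveyed protocol.

```python
def react_to_loss(cause, cwnd, fec_packets, fec_step=1, fec_max=8):
    """Return the updated (cwnd, fec_packets) after a loss with a known cause."""
    if cause == "congestion":
        # Back off and leave redundancy unchanged: extra FEC or retransmissions
        # would add load to an already congested path.
        return max(1, cwnd // 2), fec_packets
    if cause == "wireless":
        # Keep the window (shrinking it would only cut goodput) and add protection.
        return cwnd, min(fec_max, fec_packets + fec_step)
    raise ValueError(f"unknown loss cause: {cause!r}")
```

The value of the rule depends entirely on how reliably the cause is identified, which is the subject of the methods surveyed next.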

In our surveyed works, MPRTP [2] categorizes a path as lossy if feedback reports show only transmission losses and no discards (overdue packets) over that path. A path is categorized as mildly congested if feedback reports show both transmission losses and discards, either in a single report or in consecutive reports. If this behavior occurs in more than three consecutive reports, the path is considered congested. CMT-QA [76] handles packet loss differentiation by proposing an optimal retransmission policy (ORP). In ORP, when a loss occurs, a value is calculated and compared with a threshold, which is defined as the path quality. If the value exceeds the threshold, the loss is attributed to the wireless channel; otherwise, it is a congestion loss. If losses occur more than once and consecutively, congestion is taken as the reason. CMT-CL/FD [119] proposes a loss-cause dependent retransmission (RTX) policy. RTX considers two cases: 1) When the loss is detected by fast retransmission, the residual capacity of the path is calculated. If it is positive, the path is underused and a wireless loss has occurred. Otherwise, if the residual capacity is negative, congestion is the reason. 2) When the loss is detected by an expiring RTO, the path has failed or severe congestion has occurred. CMT-DA [111], MPLOT [4] and [101] use Explicit Congestion Notification (ECN) to differentiate losses. ECN was defined by the IETF [218] in 2001. ECN-aware routers signal congestion by setting a mark in the IP header, without dropping any packet. BEMA [86], ADMIT [103], GALTON [115] and CMT-CA [112] use the ZigZag scheme introduced in [163]. ZigZag classifies losses as wireless based on the number of losses and on the difference between the relative one-way trip time and the mean of relative one-way trip times.
For further information about the effect of different types of losses, such as random or bursty loss, on video streaming quality, refer to [5].
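Two of the rules described in this subsection, the MPRTP path classification and the CMT-CL/FD RTX cases, are simple enough to sketch directly from the text. The sketch below is an illustration only: the report format (a list of `(losses, discards)` tuples, oldest first), the label `"unclassified"` for cases the text does not name, and the treatment of a residual capacity of exactly zero as congestion are assumptions.

```python
def classify_path(reports):
    """Classify a path from MPRTP-style feedback reports, oldest first."""
    # Count the most recent consecutive reports showing both losses and discards.
    streak = 0
    for losses, discards in reversed(reports):
        if losses > 0 and discards > 0:
            streak += 1
        else:
            break
    if streak > 3:
        return "congested"
    if streak >= 1:
        return "mildly_congested"
    if reports and reports[-1][0] > 0 and reports[-1][1] == 0:
        return "lossy"  # transmission losses only, no overdue packets
    return "unclassified"  # not named in the text (assumption)


def classify_loss_rtx(detected_by, residual_capacity=0.0):
    """Classify a loss following the two RTX cases described for CMT-CL/FD."""
    if detected_by == "rto":
        return "path_failure_or_severe_congestion"
    if detected_by == "fast_retransmit":
        # Positive residual capacity: the path is underused, so the loss is wireless.
        # Zero is treated as congestion here (assumption; the text covers only < 0).
        return "wireless" if residual_capacity > 0 else "congestion"
    raise ValueError(f"unknown detection method: {detected_by!r}")
```

How the residual capacity is estimated is left out of the sketch, since the text above does not specify it.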

VI-B Fairness

Table IX summarizes the surveyed works that consider fairness, which was previously introduced in Section IV. These works address fairness in terms of the resources consumed by the proposed congestion control algorithms (e.g., MPRTP [2], MPLOT [4], CMT-CL/FD [119]), as adopted by TFRC (e.g., BEMA [86], CMT-CA [112]), or in terms of MPTCP coupled congestion control (e.g., ADMIT [103], MPTCP-SD/PR [104], PR-MPTCP [106], Corbillon et al. [78], Ojanperä et al. [114]). In addition, Freris et al. [100] consider fairness among users in the sharing of network resources.

VI-C Video Compression and Error Concealment

Several video codecs were used in the surveyed works cited in Table IX, such as H.263 [219], H.264/AVC [220], H.264/SVC [221], High Efficiency Video Coding (HEVC) [36], and SHVC [222]. After video transmission, if the protection methods are not able to recover the lost packets, the decoder itself can employ error concealment. In this way, the decoder exploits correlations in the previously received video sequence to conceal the lost information. JM, for instance, performs frame copy, while FFmpeg performs temporal interpolation. According to [223], in the case of whole-frame losses, when isolated B-frames were lost and concealed by either JM or FFmpeg, about 40% of the losses were not even noticed by observers. The surveyed works used JM [224], x264 [225], JSVM [226], FFmpeg [227] and HM [228] for error concealment.

VI-D Experimental Environment

Table IX shows that experimental evaluation is dominated by network simulators and emulators, such as OPNET [229], NS2 [230], NS3 [231], EXata [232] and NetEm [233, 234]. Only a few works, mainly due to cost, scale and scope, carried out their evaluation on real testbeds. Wireless-enabled network emulators such as Mininet-WiFi [235] form another category of experimental environments. We also cover some additional implementation details, for example, which types of network interfaces are used in the experiments, or whether the simulation uses disjoint paths (no common link or node). Using disjoint paths improves bandwidth aggregation and provides additional fault tolerance compared with non-disjoint paths [7], both of which contribute to the user's video experience.
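The disjointness condition mentioned above (no common link or node) can be checked mechanically when paths are known as node sequences. The following is a minimal sketch; representing a path as a list of node names and excluding the shared end points from the node comparison are assumptions made for illustration.

```python
def are_disjoint(path_a, path_b):
    """Return True if two paths share no intermediate node and no link."""
    # End points (e.g., the multihomed client and the server) are allowed to coincide.
    inner_a, inner_b = set(path_a[1:-1]), set(path_b[1:-1])
    # Links are undirected pairs of consecutive nodes.
    links_a = {frozenset(link) for link in zip(path_a, path_a[1:])}
    links_b = {frozenset(link) for link in zip(path_b, path_b[1:])}
    return inner_a.isdisjoint(inner_b) and links_a.isdisjoint(links_b)
```

For example, a WiFi path and a cellular path that meet at the same core router before the server are not disjoint under this definition, so a failure or bottleneck at that router affects both.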

VI-E Performance Metrics

Several performance metrics were used in the surveyed works cited in Table IX. Most of them are explained in Section IV-B. We have added some additional video quality metrics, such as Peak Signal-to-Noise Ratio (PSNR), Video Quality Metric (VQM), Structural SIMilarity (SSIM) [236], MultiScale Structural SIMilarity (MS-SSIM) [237], and Subjective Assessment Methodology for VIdeo Quality (SAMVIQ) [238].
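Since PSNR is the most frequently reported video quality metric in Table IX, a short sketch of its standard definition, 10 log10(MAX^2 / MSE), may be useful. The sketch assumes 8-bit samples given as flat sequences of equal length; the function name and signature are illustrative, and the surveyed works typically rely on existing tools rather than code of this kind.

```python
import math


def psnr(reference, received, max_value=255):
    """Peak Signal-to-Noise Ratio in dB between two equally sized sample sequences."""
    if len(reference) != len(received) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the received samples.
    mse = sum((a - b) ** 2 for a, b in zip(reference, received)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(max_value ** 2 / mse)
```

PSNR is a purely signal-level measure, which is one reason why perceptual metrics such as SSIM and subjective methods such as SAMVIQ are reported alongside it.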

VI-F Video Services

The last column of Table IX shows, for each surveyed work, which type of video service the authors considered: VoD, live, or real-time, the latter as a superset that includes interactive video streaming applications. As discussed in Sections III and IV, each type of video service has different QoS requirements, such as delay sensitivity.

VII Open Research Issues

Many research avenues around wireless multipath video streaming are open. In the following, we overview some relevant evolving aspects and present future work opportunities.
Standardization developments. MMT is a recent standard protocol whose potential capabilities were discussed in this survey. Future work could evaluate the performance of MMT over MPTCP or QUIC, using the multipath scheduling methods defined in these protocols for video streaming over heterogeneous networks. HTTP/2 provides notable features such as the ability to push content in advance and frame multiplexing. Therefore, multipath delivery over HTTP/2 deserves further attention [239]. Another standards-related topic is the use of HEVC, and especially SHVC, which does not seem to be widespread in the networking literature despite being widespread in the video coding community.
Network Softwarization. Attempts to integrate SDN with multipath video streaming (Section IV-C) promise to make path-aware strategies more effective, owing to the ability of SDN to programmatically define the end-to-end network behaviour. While OpenFlow is considered the most widely accepted interface between the control and data planes [23], alternative means for southbound interaction between controllers and datapath devices (e.g., P4 programmable data planes), including SDN protocol extensions relevant for wireless communications (e.g., [240, 241]), deserve further research efforts. SDN and NFV, as enabling technologies of multi-domain network service orchestration [242], will certainly keep attracting research attention and will play a critical role in the realization of multipath strategies for video streaming and other types of services.
5G. Fifth generation (5G) cellular wireless roll-outs will become reality over the next years. 5G aims to introduce new services that require extreme bandwidth and ultra-low latency [243]. One concept introduced in 5G to reach this goal is multihoming capability. There is some research on open challenges and services for 5G multihoming [244, 245]. Furthermore, there are several studies and efforts on MPTCP operation in 5G [246, 245]. In addition, studies show that applying emerging technologies such as SDN to MPTCP in 5G networks could improve transmission performance, owing to the ability of SDN to control the subflows by monitoring network conditions [247, 248]. Therefore, video streaming over 5G networks is an important emerging research area in which innovative solutions will be required, combining multihoming with SDN/NFV-based technologies.
WiFi Evolution. There are also major advances in WiFi technologies aimed at increasing wireless networking performance, such as 802.11ad and 802.11ay for 60 GHz, or 802.11ax for 2.4 GHz and 5 GHz concurrently, which aim to achieve very high throughput and ultra-low latency in the near future [249, 250]. Therefore, there is room to evaluate whether these new WiFi technologies can support video streaming on their own. Another open question is the impact of these new WiFi technologies on multipath video streaming.

Energy considerations. Power efficiency is an essential requirement. The work in [251] shows that LTE has high power consumption when video is streamed over HTTP. Energy consumption increases even further when multiple network interfaces are used. Therefore, optimizing power consumption needs further attention in the proposed approaches.
Security. Multipath delivery could inherently mitigate some security threats through the use of alternative paths throughout the network. There is little work on the security of multipath multimedia streaming. At the same time, Digital Rights Management (DRM) and licensing are also security-related issues that are critical for some video services.
Mobility and Internet of Vehicles (IoV). Although terminal mobility, velocity, degree of motion and related mobile aspects are factors affecting video quality, they are rarely discussed in the literature. These considerations are key to the delivery of wireless video in mobile environments in the scope of Intelligent Transport Systems (ITS) [252] and Vehicle-to-everything (V2X) communications [253].
Machine Learning and Artificial Intelligence. Artificial intelligence and machine learning methods are increasingly becoming key tools for network and service optimization [254] and can be used for advanced scheduling and adaptive coding decisions [255]. The importance of machine learning approaches for improving video quality has been recognized by Netflix, which proposed a new video quality assessment method named Video Multimethod Assessment Fusion (VMAF). VMAF is a machine learning-based model that is trained and tested using the results of a subjective experiment in order to deliver the best video quality to the user [9]. Besides, there are also several machine learning-based efforts to learn QoS measurements [256] or QoE from user reactions [257, 258] in order to solve various optimization and control problems for single-path video streaming. Among the works we surveyed, some approaches use machine learning to learn QoS from the user device and apply it to multipath scheduling decisions [91, 3]. Similarly, an interesting direction would be to use machine learning to learn QoE from user reactions and apply it to multipath scheduling decisions.

VIII Conclusions

Demand for live and on-demand video delivery has increased dramatically, along with ever higher user expectations of QoE. Mobile environments add to the challenges, since a moving client requires a seamless connection while throughput varies and unpredictable latencies and failures occur.

One promising approach to improving QoE for wireless video streaming is multipath delivery, which increases available bandwidth, resilience and load balancing. From the industry perspective, several companies have implemented their own multipath approaches, such as AVAYA [259] and Cisco [260]. Apple and SAMSUNG have also started to support multipath on smartphones [179, 70] for different services such as voice recognition, or to increase the download speed of specific software packages. Therefore, we expect a growth in multipath video streaming in the near future. However, there are still many issues to be solved, especially for solutions that are not compatible with each other or that require changes in servers, clients, or network equipment [179].

In this work, we have provided an in-depth survey of multipath wireless video streaming proposals, covering over forty relevant pieces of work. We have categorized and explained the surveyed works based on the layer in the protocol stack and the original protocol or feature dominating each work. Network equipment compatibility has also been discussed. In addition, scheduling, resilience and path selection techniques have been presented. Finally, we have studied different key related methods, such as packet loss differentiation, video compression and error concealment.

To conclude, we highlight some points and observations resulting from the literature survey. Several challenges exist when designing a multipath video streaming approach, as explained in Section I. We observe that, in order to overcome these challenges, a packet scheduling strategy should consider several factors. The first one is the layer dependency discussed in Section IV, on which the summary and categorization of the surveyed works in Table III is based. Research shows that the scheduler makes better decisions when it has complete and accurate information about video content, packet delivery deadlines, playback buffer, RTT, available bandwidth, and other network information. A scheduling function implemented on a specific layer can access only a part of this information. Therefore, cross-layer approaches receive considerable attention owing to their ability to gather information from different layers for better scheduling decisions.

Another important factor in designing a scheduler is client and network equipment compatibility. This topic is also discussed in Section IV and summarized in Table III. While the most flexible case to implement is when only client modification is required, some approaches require changing the server, or both server and client, or also the network infrastructure. The ability to traverse middleboxes is also important.

Fundamental aspects to be considered to improve the performance of scheduling functions, discussed in Section V and summarized in Tables V, VI, VII and VIII, include which packet should be sent next, through which path, and with which type of error protection. An adequate scheduling strategy should be content-aware and path-aware, and it should use a proper channel-level or source-level packet protection method. Such a scheduling approach improves QoE, bandwidth aggregation and load balancing, and mitigates HOL blocking and out-of-order video packet deliveries.

Section VI and Table IX show some related methods used in the surveyed works and the key performance indicators to evaluate the approaches. One observation is that while calculating video quality metrics is very useful to understand the performance of each approach, many of the works only consider network QoS metrics without assessing video performance in terms of QoE as the key performance indicators from an end-user perspective.

The path ahead towards the broad realization of wireless multipath video streaming solutions is not without issues. In Section VII, we overview a series of open challenges and point to some research opportunities.


This work is part of the results obtained through the project “Hyper Realistic Media”, sponsored by Samsung Eletrônica da Amazônia Ltda., in the framework of law No. 8,248/91. We thank CNPq (Brazilian National Council for Scientific and Technological Development) for grants #141778/2015-6 and #310930/2016-2.



  • [1] S. Mao, D. Bushmitch, S. Narayanan, and S. S. Panwar, “MRTP: a multiflow real-time transport protocol for ad hoc networks,” IEEE Transactions on Multimedia, vol. 8, no. 2, pp. 356–369, 2006.
  • [2] V. Singh, S. Ahsan, and J. Ott, “MPRTP: multipath considerations for real-time media,” Proceedings of the 4th ACM Multimedia Systems Conference, pp. 190–201, 2013.
  • [3] M. Xing, S. Xiang, and L. Cai, “A real-time adaptive algorithm for video streaming over multiple wireless access networks,” IEEE Journal on Selected Areas in communications, vol. 32, no. 4, pp. 795–805, 2014.
  • [4] V. Sharma, S. Kalyanaraman, K. Kar, K. Ramakrishnan, and V. Subramanian, “MPLOT: A transport protocol exploiting multipath diversity using erasure codes,” 27th Conference on Computer Communications, IEEE INFOCOM, pp. 121–125, 2008.
  • [5] J. G. Apostolopoulos, “Reliable video communication over lossy packet networks using multiple state encoding and path diversity,” Photonics West 2001-Electronic Imaging, pp. 392–409, 2000.
  • [6] J. Qadir, A. Ali, K.-L. A. Yau, A. Sathiaseelan, and J. Crowcroft, “Exploiting the power of multiplicity: a holistic survey of network-layer multipath,” IEEE Communications Surveys & Tutorials, vol. 17, no. 4, pp. 2176–2213, 2015.
  • [7] S. K. Singh, T. Das, and A. Jukan, “A survey on internet multipath routing and provisioning,” IEEE Communications Surveys & Tutorials, vol. 17, no. 4, pp. 2157–2175, 2015.
  • [8] M. Li, A. Lukyanenko, Z. Ou, A. Yla-Jaaski, S. Tarkoma, M. Coudron, and S. Secci, “Multipath transmission for the internet: A survey,” IEEE Communications Surveys & Tutorials, vol. PP, no. 99, pp. 1–41, 2016.
  • [9] R. Trestian, I.-S. Comsa, and M. F. Tuysuz, “Seamless multimedia delivery within a heterogeneous wireless networks environment: Are we there yet?” IEEE Communications Surveys & Tutorials, vol. 20, no. 2, pp. 945–977, 2018.
  • [10] M. Jarschel, D. Schlosser, S. Scheuring, and T. Hoßfeld, “An evaluation of QoE in cloud gaming based on subjective tests,” in Innovative mobile and internet services in ubiquitous computing (imis), 2011 fifth international conference on.   IEEE, 2011, pp. 330–335.
  • [11] “Cisco Visual Networking Index: Forecast and Methodology, 2016–2021,” https://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/complete-white-paper-c11-481360.html, accessed:15-JAN-2017.
  • [12] C. V. N. Index, “Cisco visual networking index: global mobile data traffic forecast update, 2014–2019,” Tech. Rep, 2015.
  • [13] “Samsung Galaxy S7 review,” http://www.gsmarena.com/samsung_galaxy_s7-review-1408p8.php, accessed:15-JUN-2017.
  • [14] “Apple Iphone 7 review,” http://www.gsmarena.com/apple_iphone_7-review-1497p8.php, accessed:15-JUN-2017.
  • [15] “Google Pixel Xl review,” http://www.gsmarena.com/google_pixel_xl-review-1513p8.php, accessed:15-JUN-2017.
  • [16] “LG G5 review,” http://www.gsmarena.com/lg_g5-review-1416p7.php, accessed:15-JUN-2017.
  • [17] Y. Sani, A. Mauthe, and C. Edwards, “Adaptive bitrate selection: A survey,” IEEE Communications Surveys & Tutorials, vol. 19, no. 4, pp. 2985–3014, 2017.
  • [18] J. Domzal, Z. Dulinski, M. Kantor, J. Rząsa, R. Stankiewicz, K. Wajda, and R. Wojcik, “A survey on methods to provide multipath transmission in wired packet networks,” Computer Networks, vol. 77, pp. 18–41, 2015.
  • [19] S. Addepalli, H. G. Schulzrinne, A. Singh, and G. Ormazabal, “Heterogeneous access: Survey and design considerations,” Department of Computer Science, Columbia University, Tech. Rep., 2013.
  • [20] K. Habak, K. A. Harras, and M. Youssef, “Bandwidth aggregation techniques in heterogeneous multi-homed devices: A survey,” Computer Networks, vol. 92, pp. 168–188, 2015.
  • [21] A. A. Barakabitze, I.-H. Mkwawa, L. Sun, and E. Ifeachor, “QualitySDN: Improving Video Quality using MPTCP and Segment Routing in SDN/NFV,” in IEEE Conference on Network Softwarization and Workshops (NetSoft), 2018, pp. 182–186.
  • [22] K. Herguner, R. S. Kalan, C. Cetinkaya, and M. Sayit, “Towards QoS-aware routing for DASH utilizing MPTCP over SDN,” in Network Function Virtualization and Software Defined Networks (NFV-SDN), IEEE Conference on, 2017, pp. 1–6.
  • [23] D. Kreutz, F. M. Ramos, P. E. Verissimo, C. E. Rothenberg, S. Azodolmolky, and S. Uhlig, “Software-defined networking: A comprehensive survey,” Proceedings of the IEEE, vol. 103, no. 1, pp. 14–76, 2015.
  • [24] M. Z. Hasan, H. Al-Rizzo, and F. Al-Turjman, “A survey on multipath routing protocols for QoS assurances in real-time wireless multimedia sensor networks,” IEEE Communications Surveys & Tutorials, vol. 19, no. 3, pp. 1424–1456, 2017.
  • [25] K. Sha, J. Gehlot, and R. Greve, “Multipath routing techniques in wireless sensor networks: A survey,” Wireless personal communications, vol. 70, no. 2, pp. 807–829, 2013.
  • [26] X. Zhang and H. Hassanein, “A survey of peer-to-peer live video streaming schemes - an algorithmic perspective,” Comput. Netw., vol. 56, no. 15, pp. 3548–3579, Oct. 2012. [Online]. Available: http://dx.doi.org/10.1016/j.comnet.2012.06.013
  • [27] Y. Liu, Y. Guo, and C. Liang, “A survey on peer-to-peer video streaming systems,” Peer-to-peer Networking and Applications, vol. 1, no. 1, pp. 18–28, 2008.
  • [28] L. B. Yuste, F. B. Segui, M. A. M. Climent, and H. Melvin, “Understanding timelines within MPEG standards,” in Communications Surveys and Tutorials, IEEE Communications Society, vol. 18, no. 1.   Institute of Electrical and Electronics Engineers (IEEE), 2015, pp. 368–400.
  • [29] H. Schulzrinne, “A Transport Protocol for Real-Time Applications,” 1992.
  • [30] Y. Lim, S. Aoki, I. Bouazizi, and J. Song, “New MPEG transport standard for next generation hybrid broadcasting system with IP,” IEEE Transactions on Broadcasting, vol. 60, no. 2, pp. 160–169, 2014.
  • [31] “Information technology – Generic coding of moving pictures and associated audio information: Part 1 Systems,” ISO/IEC 13818-1, 2007.
  • [32] C. K. Yie and Y. J. Lee, “Method for hybrid delivery of MMT package and content and method for receiving content,” US Patent No. 9,414,123, 2016.
  • [33] Y. Lim, K. Park, J. Y. Lee, S. Aoki, and G. Fernando, “MMT: An emerging MPEG standard for multimedia delivery over the internet,” IEEE MultiMedia, vol. 20, no. 1, pp. 80–85, 2013.
  • [34] “Real-Time Messaging Protocol (RTMP) specification,” http://www.adobe.com/devnet/rtmp.html, accessed:10-SEP-2017.
  • [35] “SWF and AMF Technology Center,” https://www.adobe.com/devnet/swf.html, accessed:10-SEP-2017.
  • [36] “High Efficiency Video Coding,” ITU-T Rec. H.265 and ISO/IEC 23008-2, 2018.
  • [37] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and T. Berners-Lee, “Hypertext transfer protocol–HTTP/1.1,” 1999.
  • [38] J. Postel, “User Datagram Protocol,” RFC 768, pp. 1–3, 1980.
  • [39] ——, “Transmission control protocol,” RFC 793, 1981.
  • [40] “Move Networks,” http://www.movenetworkshd.com, accessed:15-JUN-2017.
  • [41] M. Seufert, S. Egger, M. Slanina, T. Zinner, T. Hoßfeld, and P. Tran-Gia, “A survey on quality of experience of HTTP adaptive streaming,” IEEE Communications Surveys & Tutorials, vol. 17, no. 1, pp. 469–492, 2015.
  • [42] I. Sodagar, “The mpeg-dash standard for multimedia streaming over the internet,” IEEE MultiMedia, vol. 18, no. 4, pp. 62–67, 2011.
  • [43] C. James, E. Halepovic, M. Wang, R. Jana, and N. Shankaranarayanan, “Is Multipath TCP (MPTCP) Beneficial for Video Streaming over DASH?” 24th IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), pp. 331–336, 2016.
  • [44] M. Belshe, R. Peon, and M. Thomson, “Hypertext transfer protocol version 2 (HTTP/2),” RFC 7540, 2015.
  • [45] M. B. Yahia, Y. Le Louedec, L. Nuaymi, and G. Simon, “When HTTP/2 Rescues DASH: Video Frame Multiplexing,” 3rd Communication and Networking Techniques for Contemporary Video Workshop (CNTCV), IEEE INFOCOM, 2017.
  • [46] R. Huysegems, J. van der Hooft, T. Bostoen, P. Rondao Alface, S. Petrangeli, T. Wauters, and F. De Turck, “HTTP/2-based methods to improve the live experience of adaptive streaming,” Proceedings of the 23rd ACM international conference on Multimedia, pp. 541–550, 2015.
  • [47] M. Xiao, V. Swaminathan, S. Wei, and S. Chen, “Evaluating and improving push based video streaming with HTTP/2,” Proceedings of the 26th International Workshop on Network and Operating Systems Support for Digital Audio and Video, p. 3, 2016.
  • [48] “Smooth Streaming Transport Protocol,” https://docs.microsoft.com/en-us/iis/media/smooth-streaming/smooth-streaming-transport-protocol, accessed:15-AUG-2017.
  • [49] R. Pantos and W. May, “HTTP Live Streaming,” RFC 8216, August 2017.
  • [50] “Adobe Systems Inc. HTTP Dynamic Streaming,” http://www.adobe.com/products/hds-dynamic-streaming.html, accessed:15-JUN-2017.
  • [51] “MPEG systems technologies – Part 6: Dynamic adaptive streaming over HTTP (DASH),” ISO/IEC FCD 23001-6, January 2011.
  • [52] “MPEG-DASH vs. Apple HLS vs. Microsoft Smooth Streaming vs. Adobe HDS,” https://bitmovin.com/mpeg-dash-vs-apple-hls-vs-microsoft-smooth-streaming-vs-adobe-hds/, accessed:15-JUN-2017.
  • [53] “Why YouTube & Netflix use MPEG-DASH in HTML5,” https://bitmovin.com/status-mpeg-dash-today-youtube-netflix-use-html5-beyond/, accessed:09-SEP-2017.
  • [54] H. Kalva and J.-B. Lee, “The VC-1 video coding standard,” IEEE MultiMedia, vol. 14, no. 4, 2007.
  • [55] K. Rao, D. N. Kim, and J. J. Hwang, “VP6 Video Coding Standard,” in Video coding standards.   Springer, 2014, pp. 159–197.
  • [56] “Information technology – High efficiency coding and media delivery in heterogeneous environments – Part 1: MPEG media transport (MMT),” ISO/IEC 23008-1, 2014.
  • [57] “MMT Enhancements for mobile environments,” ISO/IEC 23008-1:2017/DAmd 2, 2017.
  • [58] Y. Ye, Y. He, Y.-K. Wang et al., “SHVC the Scalable Extensions of HEVC and Its Applications,” ZTE Communications, vol. 14, no. 1, 2016.
  • [59] O. Karya, S. Saesaria, and S. Budiyanto, “RTP analysis for the video transmission process on WhatsApp and Skype against signal strength variations in 802.11 network environments,” in IOP Conference Series: Materials Science and Engineering, vol. 453, no. 1.   IOP Publishing, 2018, p. 012062.
  • [60] “Hulu,” https://www.hulu.com/welcome, accessed:06-MAY-2019.
  • [61] “YouTube,” https://www.youtube.com/, accessed:06-MAY-2019.
  • [62] “Netflix,” https://www.netflix.com/br-en/, accessed:06-MAY-2019.
  • [63] “Twitch,” https://www.twitch.tv/, accessed:06-MAY-2019.
  • [64] “Vimeo,” https://vimeo.com/, accessed:06-MAY-2019.
  • [65] “Bitmovin,” https://github.com/bitmovin/bitmovin-player-web-samples, accessed:06-MAY-2019.
  • [66] “DASH industry forum,” https://dashif.org/members/, accessed:06-MAY-2019.
  • [67] K.-K. Yap, T.-Y. Huang, M. Kobayashi, Y. Yiakoumis, N. McKeown, S. Katti, and G. Parulkar, “Making use of all the networks around us: a case study in Android,” Proceedings of the 2012 ACM SIGCOMM workshop on Cellular networks: operations, challenges, and future design, pp. 19–24, 2012.
  • [68] C. Gutterman, K. Guo, S. Arora, X. Wang, L. Wu, E. Katz-Bassett, and G. Zussman, “Requet: Real-Time QoE Detection for Encrypted YouTube Traffic,” MMSys, Amherst, MA, US, 2019.
  • [69] “Mushroom networks Inc,” http://www.mushroomnetworks.com/, accessed:15-JUL-2017.
  • [70] “Samsung Galaxy S5 Download Booster,” https://galaxys5guide.com/samsung-galaxy-s5-features-explained/galaxy-s5-download-booster/, accessed:05-MAY-2019.
  • [71] D. H. Bui, K. Lee, S. Oh, I. Shin, H. Shin, H. Woo, and D. Ban, “Greenbag: Energy-efficient bandwidth aggregation for real-time streaming in heterogeneous mobile wireless networks,” Real-Time Systems Symposium (RTSS), 2013 IEEE 34th, pp. 57–67, 2013.
  • [72] S. Han, H. Joo, D. Lee, and H. Song, “An end-to-end virtual path construction system for stable live video streaming over heterogeneous wireless networks,” IEEE Journal on Selected Areas in Communications, vol. 29, no. 5, pp. 1032–1041, 2011.
  • [73] F.-L. Luo, Digital Front-End in Wireless Communications and Broadcasting: circuits and signal processing.   Cambridge University Press, 2011.
  • [74] Y.-C. Chen, Y.-s. Lim, R. J. Gibbens, E. M. Nahum, R. Khalili, and D. Towsley, “A measurement-based study of multipath TCP performance over wireless networks,” Proceedings of the 2013 conference on Internet measurement conference, pp. 455–468, 2013.
  • [75] J. Yoon, H. Zhang, S. Banerjee, and S. Rangarajan, “MuVi: A multicast video delivery scheme for 4G cellular networks,” Proceedings of the 18th annual international conference on Mobile computing and networking, pp. 209–220, 2012.
  • [76] C. Xu, T. Liu, J. Guan, H. Zhang, and G.-M. Muntean, “CMT-QA: Quality-aware adaptive concurrent multipath data transfer in heterogeneous wireless networks,” IEEE Transactions on Mobile Computing, vol. 12, no. 11, pp. 2193–2205, 2013.
  • [77] M. C. Chan and R. Ramjee, “TCP/IP performance over 3G wireless links with rate and delay variation,” Wireless Networks, vol. 11, no. 1-2, pp. 81–97, 2005.
  • [78] X. Corbillon, R. Aparicio-Pardo, N. Kuhn, G. Texier, and G. Simon, “Cross-layer scheduler for video streaming over MPTCP,” Proceedings of the 7th International Conference on Multimedia Systems, p. 7, 2016.
  • [79] S. Ferlin-Oliveira, T. Dreibholz, and Ö. Alay, “Tackling the challenge of bufferbloat in multi-path transport over heterogeneous wireless networks,” in Quality of Service (IWQoS), 22nd International Symposium of.   IEEE, 2014, pp. 123–128.
  • [80] Y.-C. Chen and D. Towsley, “On bufferbloat and delay analysis of multipath TCP in wireless networks,” in Networking Conference, IFIP.   IEEE, 2014, pp. 1–9.
  • [81] G. Sun, K. Eng, S. Yin, G. Liu, and G. Min, “MARS: Multiple Access Radio Scheduling for a Multi-homed Mobile Device in Soft-RAN,” KSII Transactions on Internet & Information Systems, vol. 10, no. 1, 2016.
  • [82] E. Brosh, S. A. Baset, V. Misra, D. Rubenstein, and H. Schulzrinne, “The delay-friendliness of TCP for real-time traffic,” IEEE/ACM Transactions On Networking, vol. 18, no. 5, pp. 1478–1491, 2010.
  • [83] S. Baraković and L. Skorin-Kapov, “Survey and challenges of QoE management issues in wireless networks,” Journal of Computer Networks and Communications, 2013.
  • [84] M. Van der Schaar and P. A. Chou, Multimedia over IP and wireless networks: compression, networking, and systems.   Elsevier, 2011.
  • [85] ITU-T Recommendation G.1010, “End-user multimedia QoS categories,” November 2001.
  • [86] J. Wu, C. Yuen, B. Cheng, Y. Yang, M. Wang, and J. Chen, “Bandwidth-efficient multipath transport protocol for quality-guaranteed real-time video over heterogeneous wireless networks,” IEEE Transactions on Communications, vol. 64, no. 6, pp. 2477–2493, 2016.
  • [87] D. Austerberry, The technology of video and audio streaming.   Taylor & Francis, 2005.
  • [88] A. L. Chow, H. Yang, C. H. Xia, M. Kim, Z. Liu, and H. Lei, “EMS: Encoded multipath streaming for real-time live streaming applications,” 17th IEEE International Conference on Network Protocols (ICNP), pp. 233–243, 2009.
  • [89] J. S. Mwela, “Impact of packet loss on the quality of video stream transmission,” Ph.D. dissertation, Blekinge Institute of Technology, 2010.
  • [90] W. Lei, W. Zhang, and S. Liu, “Multipath Real-Time Transport Protocol Based on Application-Level Relay (MPRTP-AR),” IETF Internet Draft, subsequent updates, 2017.
  • [91] M. Xing, S. Xiang, and L. Cai, “Rate adaptation strategy for video streaming over multiple wireless access networks,” IEEE Global Communications Conference (GLOBECOM), pp. 5745–5750, 2012.
  • [92] Y. Chowrikoppalu and P. Gowda, “Multipath Adaptive Video Streaming over Multipath TCP,” Ph.D. dissertation, Intel, 2013.
  • [93] P. Houzé, E. Mory, G. Texier, and G. Simon, “Applicative-layer multipath for low-latency adaptive live streaming,” IEEE International Conference on Communications (ICC), pp. 1–7, 2016.
  • [94] P. Kolan and I. Bouazizi, “Method and apparatus for multipath media delivery,” Dec. 22 2016, US Patent App. 15/001,018. [Online]. Available: https://www.google.com/patents/US20160373342
  • [95] S. Afzal, V. Testoni, J. F. F. de Oliveira, C. E. Rothenberg, P. Kolan, and I. Bouazizi, “A Novel Scheduling Strategy for MMT-based Multipath Video Streaming,” in IEEE Global Communications Conference (GLOBECOM), 2018, pp. 206–212.
  • [96] Y. Sohn, M. Cho, M. Seo, and J. Paik, “A Synchronization Scheme for Hierarchical Video Streams over Heterogeneous Networks,” KSII Transactions on Internet & Information Systems, vol. 9, no. 8, 2015.
  • [97] K. Evensen, T. Kupka, D. Kaspar, P. Halvorsen, and C. Griwodz, “Quality-adaptive scheduling for live streaming over multiple access networks,” Proceedings of the 20th international workshop on Network and operating systems support for digital audio and video, pp. 21–26, 2010.
  • [98] K. Evensen, D. Kaspar, C. Griwodz, P. Halvorsen, A. Hansen, and P. Engelstad, “Improving the performance of quality-adaptive video streaming over multiple heterogeneous access networks,” Proceedings of the second annual ACM conference on Multimedia systems, pp. 57–68, 2011.
  • [99] K. Evensen, D. Kaspar, C. Griwodz, P. Halvorsen, A. F. Hansen, and P. Engelstad, “Using bandwidth aggregation to improve the performance of quality-adaptive streaming,” Signal Processing: Image Communication, vol. 27, no. 4, pp. 312–328, 2012.
  • [100] N. M. Freris, C.-H. Hsu, J. P. Singh, and X. Zhu, “Distortion-aware scalable video streaming to multinetwork clients,” IEEE/ACM Transactions on Networking, vol. 21, no. 2, pp. 469–481, 2013.
  • [101] P. Correia, L. Ferreira, P. A. A. Assuncao, L. Cruz, and V. Silva, “Optimal priority mdc video streaming for networks with path diversity,” IEEE International Conference on Telecommunications and Multimedia (TEMU), pp. 54–59, 2012.
  • [102] C.-M. Huang, Y.-C. Chen, and S.-Y. Lin, “The QoS-Aware Order Prediction Scheduling (QOPS) Scheme for Video Streaming Using the Multi-path Datagram Congestion Control Protocol (MP-DCCP),” IEEE 5th International Conference on Network-Based Information Systems (NBiS), pp. 276–283, 2012.
  • [103] J. Wu, C. Yuen, B. Cheng, M. Wang, and J. Chen, “Streaming high-quality mobile video with multipath TCP in heterogeneous wireless networks,” IEEE Transactions on Mobile Computing, vol. 15, no. 9, pp. 2345–2361, 2016.
  • [104] C. Diop, G. Dugue, C. Chassot, and E. Exposito, “QoS-oriented MPTCP extensions for multimedia multi-homed systems,” IEEE 26th International Conference on Advanced Information Networking and Applications Workshops (WAINA), pp. 1119–1124, 2012.
  • [105] C. Xu, H. Huang, H. Zhang, C. Xiong, and L. Zhu, “Multipath Transmission Control Protocol (MPTCP) Partial Reliability Extension,” 2016.
  • [106] Y. Cao, Q. Liu, G. Luo, Y. Yi, and M. Huang, “PR-MPTCP+: Context-aware QoE-oriented multipath TCP partial reliability extension for real-time multimedia applications,” Visual Communications and Image Processing (VCIP), 2016, pp. 1–4, 2016.
  • [107] A. Kelly, G. Muntean, P. Perry, and J. Murphy, “Delay-centric handover in SCTP over WLAN,” Transactions on Automatic Control and Computer Science, vol. 49, no. 63, pp. 1–6, 2004.
  • [108] K. Okamoto, N. Yamai, K. Okayama, K. Kawano, M. Nakamura, and T. Yokohira, “Performance improvement of SCTP communication using selective bicasting on lossy multihoming environment,” IEEE 38th Annual Computer Software and Applications Conference (COMPSAC), pp. 551–557, 2014.
  • [109] C. A. G. da Silva, E. P. Ribeiro, and C. M. Pedroso, “Preventing quality degradation of video streaming using selective redundancy,” Computer Communications, vol. 91, pp. 120–132, 2016.
  • [110] H. Sanson, A. Neira, L. Loyola, and M. Matsumoto, “PR-SCTP for real time H.264/AVC video streaming,” IEEE 12th International Conference on Advanced Communication Technology (ICACT), vol. 1, pp. 59–63, 2010.
  • [111] J. Wu, B. Cheng, C. Yuen, Y. Shang, and J. Chen, “Distortion-aware concurrent multipath transfer for mobile video streaming in heterogeneous wireless networks,” IEEE Transactions on Mobile Computing, vol. 14, no. 4, pp. 688–701, 2015.
  • [112] J. Wu, C. Yuen, M. Wang, and J. Chen, “Content-aware concurrent multipath transfer for high-definition video streaming over heterogeneous wireless networks,” IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 3, pp. 710–723, 2016.
  • [113] K. Chebrolu and R. R. Rao, “Bandwidth aggregation for real-time applications in heterogeneous wireless networks,” IEEE Transactions on Mobile Computing, vol. 5, no. 4, pp. 388–403, 2006.
  • [114] T. Ojanperä and J. Vehkaperä, “Network-assisted multipath DASH using the distributed decision engine,” IEEE International Conference on Computing, Networking and Communications (ICNC), pp. 1–6, 2016.
  • [115] J. Wu, C. Yuen, B. Cheng, Y. Shang, and J. Chen, “Goodput-aware load distribution for real-time traffic over multipath networks,” IEEE Transactions on Parallel and Distributed Systems, vol. 26, no. 8, pp. 2286–2299, 2015.
  • [116] J. Wu, Y. Shang, J. Huang, X. Zhang, B. Cheng, and J. Chen, “Joint source-channel coding and optimization for mobile video streaming in heterogeneous wireless networks,” European Association for Signal Processing (EURASIP) Journal on Wireless Communications and Networking, vol. 2013, no. 1, p. 283, 2013.
  • [117] B. Han, F. Qian, L. Ji, and V. Gopalakrishnan, “MP-DASH: Adaptive Video Streaming Over Preference-Aware Multipath,” Proceedings of the 12th International Conference on emerging Networking EXperiments and Technologies, pp. 129–143, 2016.
  • [118] H. Nam, D. Calin, and H. Schulzrinne, “Towards dynamic MPTCP Path control using SDN,” IEEE NetSoft Conference and Workshops (NetSoft), pp. 286–294, 2016.
  • [119] C. Xu, Z. Li, J. Li, H. Zhang, and G.-M. Muntean, “Cross-layer fairness-driven concurrent multipath video delivery over heterogeneous wireless networks,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 25, no. 7, pp. 1175–1189, 2015.
  • [120] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, “RTP: A Transport Protocol for Real-Time Applications,” Internet Requests for Comments, January 1996.
  • [121] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, “RTP: A Transport Protocol for Real-Time Applications,” Internet Requests for Comments, July 2003.
  • [122] J. Lennox, M. Westerlund, Q. Wu, and C. Perkins, “Sending Multiple RTP Streams in a Single RTP Session,” Internet Requests for Comments, March 2017.
  • [123] H. Schulzrinne, “Real time streaming protocol (RTSP),” RFC 2326, 1998.
  • [124] S. S. Krishnan and R. K. Sitaraman, “Video stream quality impacts viewer behavior: inferring causality using quasi-experimental designs,” IEEE/ACM Transactions on Networking, vol. 21, no. 6, pp. 2001–2014, 2013.
  • [125] A. Singh, C. Goerg, A. Timm-Giel, M. Scharf, and T.-R. Banniza, “Performance comparison of scheduling algorithms for multipath transfer,” pp. 2653–2658, 2012.
  • [126] V. Swaminathan and S. Wei, “Low latency live video streaming using HTTP chunked encoding,” IEEE 13th International Workshop on Multimedia Signal Processing (MMSP), pp. 1–6, 2011.
  • [127] S. Wei and V. Swaminathan, “Low latency live video streaming over HTTP 2.0,” Proceedings of Network and Operating System Support on Digital Audio and Video Workshop, p. 37, 2014.
  • [128] J. Le Feuvre, C. Concolato, N. Bouzakaria, and V.-T.-T. Nguyen, “MPEG-DASH for low latency and hybrid streaming services,” Proceedings of the 23rd ACM international conference on Multimedia, pp. 751–752, 2015.
  • [129] D. Kaspar, K. Evensen, P. Engelstad, and A. F. Hansen, “Using HTTP pipelining to improve progressive download over multiple heterogeneous interfaces,” IEEE International Conference on Communications (ICC), pp. 1–5, 2010.
  • [130] H. T. Le, H. N. Nguyen, N. Pham Ngoc, A. T. Pham, and T. C. Thang, “A Novel Adaptation Method for HTTP Streaming of VBR Videos over Mobile Networks,” Mobile Information Systems, vol. 2016, 2016.
  • [131] T. C. Thang, H. T. Le, A. T. Pham, and Y. M. Ro, “An evaluation of bitrate adaptation methods for HTTP live streaming,” IEEE Journal on Selected Areas in Communications, vol. 32, no. 4, pp. 693–705, 2014.
  • [132] A. Samba, Y. Busnel, A. Blanc, P. Dooze, and G. Simon, “Instantaneous Throughput Prediction in Cellular Networks: Which Information Is Needed?” IFIP/IEEE International Symposium on Integrated Network Management (IM), 2017.
  • [133] Y. Zhou, Y. Duan, J. Sun, and Z. Guo, “Towards simple and smooth rate adaption for VBR video in DASH,” in Visual Communications and Image Processing Conference, 2014 IEEE.   IEEE, 2014, pp. 9–12.
  • [134] R. Trestian, A.-N. Moldovan, O. Ormond, and G.-M. Muntean, “Energy consumption analysis of video streaming to Android mobile devices,” in Network Operations and Management Symposium (NOMS), 2012 IEEE.   IEEE, 2012, pp. 444–452.
  • [135] Y. Shuai, G. Petrovic, and T. Herfet, “Server-driven rate control for adaptive video streaming using virtual client buffers,” in Consumer Electronics–Berlin (ICCE-Berlin), 2014 IEEE Fourth International Conference on.   IEEE, 2014, pp. 45–49.
  • [136] S. Wilk, D. Stohr, and W. Effelsberg, “VAS: a video adaptation service to support mobile video,” in Proceedings of the 25th ACM Workshop on Network and Operating Systems Support for Digital Audio and Video.   ACM, 2015, pp. 37–42.
  • [137] H. Mao, R. Netravali, and M. Alizadeh, “Neural adaptive video streaming with Pensieve,” in Proceedings of the Conference of the ACM Special Interest Group on Data Communication.   ACM, 2017, pp. 197–210.
  • [138] T. Ojanperä and H. Kokkoniemi-Tarkkanen, “Wireless bandwidth management for multiple video clients through network-assisted DASH,” in World of Wireless, Mobile and Multimedia Networks (WoWMoM).   IEEE, 2016, pp. 1–3.
  • [139] J. W. Kleinrouweler, S. Cabrero, and P. Cesar, “Delivering stable high-quality video: An SDN architecture with DASH assisting network elements,” in Proceedings of the 7th International Conference on Multimedia Systems.   ACM, 2016, p. 4.
  • [140] G. Cofano, L. D. Cicco, T. Zinner, A. Nguyen-Ngoc, P. Tran-Gia, and S. Mascolo, “Design and performance evaluation of network-assisted control strategies for HTTP adaptive streaming,” ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), vol. 13, no. 3s, p. 42, 2017.
  • [141] “Information technology – Dynamic adaptive streaming over HTTP (DASH) – part 5: Server and network assisted DASH (SAND),” ISO/IEC CD 23009-5, 2014.
  • [142] E. Thomas, M. van Deventer, T. Stockhammer, A. C. Begen, M.-L. Champel, and O. Oyman, “Applications and deployments of server and network assisted DASH (SAND),” 2016.
  • [143] M. Luoto, T. Rautio, T. Ojanperä, and J. Mäkelä, “Distributed decision engine: An information management architecture for autonomic wireless networking,” IFIP/IEEE International Symposium on Integrated Network Management (IM), pp. 713–719, 2015.
  • [144] S. Habib, J. Qadir, A. Ali, D. Habib, M. Li, and A. Sathiaseelan, “The past, present, and future of transport-layer multipath,” Journal of Network and Computer Applications, vol. 75, pp. 236–258, 2016.
  • [145] A. Bokani, M. Hassan, and S. Kanhere, “HTTP-based adaptive streaming for mobile clients using Markov decision process,” IEEE 20th International Packet Video Workshop (PV), pp. 1–8, 2013.
  • [146] S. Bae, “Method for configuring and transmitting m-unit,” Apr. 18 2013, US Patent App. 13/651,191. [Online]. Available: http://www.google.sr/patents/US20130094594
  • [147] S. Aoki, “Emerging 8K services and their applications towards 2020,” ITU-T 2nd mini-Workshop on Immersive Live Experience, 2017.
  • [148] ITU-R Recommendation BT.2074-1, “Service configuration, media transport protocol, and signalling information for MMT-based broadcasting systems,” 2017.
  • [149] I. Bouazizi, “MPEG Media Transport Protocol (MMTP),” Working Draft, IETF Secretariat, Internet-Draft draft-bouazizi-mmtp-01, September 2014. [Online]. Available: http://www.ietf.org/internet-drafts/draft-bouazizi-mmtp-01.txt
  • [150] T.-J. Jung, H.-r. Lee, and K.-d. Seo, “Overview on MPEG MMT Technology and Its Application to Hybrid Media Delivery over Heterogeneous Networks,” Pacific Rim Conference on Multimedia, pp. 660–669, 2015.
  • [151] “Information technology – High efficiency coding and media delivery in heterogeneous environments – Part 13: 3rd Edition MPEG Media Transport Implementation Guidelines,” ISO/IEC 23008-13, 2013.
  • [152] B. Li, C. Wang, Y. Xu, and Z. Ma, “An MMT based heterogeneous multimedia system using QUIC,” IEEE 2nd International Conference on Cloud Computing and Internet of Things (CCIOT), pp. 129–133, 2016.
  • [153] R. Hamilton, J. Iyengar, I. Swett, and A. Wilk, “QUIC: A UDP-Based Secure and Reliable Transport,” IETF Internet Draft, subsequent updates, 2016.
  • [154] H. Chan, A. Wei, F. Song, and H. Zhang, “One Way Latency Considerations for Multipath in QUIC,” IETF Internet Draft, subsequent updates, 2017.
  • [155] K. Park, N. Kim, and B.-D. Lee, “Performance evaluation of the emerging media-transport technologies for the next-generation digital broadcasting systems,” IEEE Access, vol. 5, pp. 17597–17606, 2017.
  • [156] “Information technology – Coding of audio-visual objects – Part 12: ISO base media file format,” ISO/IEC 14496-12, 2012.
  • [157] D. Johansen, H. Johansen, T. Aarflot, J. Hurley, Å. Kvalnes, C. Gurrin, S. Zav, B. Olstad, E. Aaberg, T. Endestad et al., “DAVVI: A prototype for the next generation multimedia entertainment platform,” Proceedings of the 17th ACM international conference on Multimedia, pp. 989–990, 2009.
  • [158] S. Lederer, C. Müller, and C. Timmerer, “Dynamic adaptive streaming over HTTP dataset,” Proceedings of the 3rd Multimedia Systems Conference, pp. 89–94, 2012.
  • [159] C. Raiciu, C. Paasch, S. Barre, A. Ford, M. Honda, F. Duchene, O. Bonaventure, and M. Handley, “How hard can it be? designing and implementing a deployable multipath TCP,” Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation, pp. 29–29, 2012.
  • [160] G. Fairhurst, B. Trammell, and M. Kuehlewind, “Services provided by IETF transport protocols and congestion control mechanisms,” RFC 8095, 2017.
  • [161] T. Hoßfeld, R. Schatz, and U. R. Krieger, “QoE of YouTube video streaming for current Internet transport protocols,” in Measurement, Modelling, and Evaluation of Computing Systems and Dependability and Fault Tolerance.   Springer, 2014, pp. 136–150.
  • [162] S. Floyd, M. Handley, J. Padhye, and J. Widmer, “Equation-based congestion control for unicast applications,” ACM SIGCOMM Computer Communication Review, vol. 30, no. 4, pp. 43–56, 2000.
  • [163] S. Cen, P. C. Cosman, and G. M. Voelker, “End-to-end differentiation of congestion and wireless losses,” IEEE/ACM Transactions on networking, vol. 11, no. 5, pp. 703–717, 2003.
  • [164] “Web Real-Time Communication,” http://www.webrtc.org/, accessed:15-JUL-2017.
  • [165] S. Floyd, J. Mahdavi, M. Mathis, and M. Podolsky, “An extension to the selective acknowledgement (SACK) option for TCP,” RFC 2883, 2000.
  • [166] D. Black, “Relaxing Restrictions on Explicit Congestion Notification (ECN) Experimentation,” RFC 8311, 2018.
  • [167] M. Li, A. Lukyanenko, S. Tarkoma, Y. Cui, and A. Ylä-Jääski, “Tolerating path heterogeneity in multipath TCP with bounded receive buffers,” Computer Networks, vol. 64, pp. 1–14, 2014.
  • [168] E. Kohler, M. Handley, and S. Floyd, “Datagram Congestion Control Protocol (DCCP),” RFC 4340, 2006.
  • [169] S. Floyd and E. Kohler, “Profile for Datagram Congestion Control Protocol (DCCP) Congestion Control ID 2: TCP-like Congestion Control,” RFC 4341, 2006.
  • [170] S. Floyd, E. Kohler, and J. Padhye, “Profile for Datagram Congestion Control Protocol (DCCP) Congestion Control ID 3: TCP-Friendly Rate Control (TFRC),” RFC 4342, 2006.
  • [171] M. A. Azad, R. Mahmood, and T. Mehmood, “A comparative analysis of DCCP variants (CCID2, CCID3), TCP and UDP for MPEG4 video applications,” International Conference on Information and Communication Technologies. ICICT, pp. 40–45, 2009.
  • [172] A. Ford, C. Raiciu, M. Handley, and O. Bonaventure, “TCP Extensions for Multipath Operation with Multiple Addresses,” RFC 6824, 2013.
  • [173] A. Ford, C. Raiciu, M. Handley, S. Barre, and J. Iyengar, “Architectural Guidelines for Multipath TCP Development,” RFC 6182, 2011.
  • [174] “MultiPath TCP - Linux Kernel implementation,” http://www.multipath-tcp.org, accessed:15-JUN-2017.
  • [175] “MPTCP Implementation for FreeBSD,” http://caia.swin.edu.au/urp/newtcp/mptcp/, accessed:05-MAY-2019.
  • [176] O. Bonaventure and S. Seo, “Multipath TCP deployments,” IETF Journal, vol. 12, no. 2, pp. 24–27, 2016.
  • [177] “MultiPath TCP - versions of the MPTCP kernel on smartphones,” https://multipath-tcp.org/pmwiki.php/Users/Android, accessed:05-MAY-2019.
  • [178] “Improving Network Reliability Using Multipath TCP,” https://developer.apple.com/documentation/foundation/nsurlsessionconfiguration/improving_network_reliability_using_multipath_tcp?language=objc, accessed:27-MAR-2018.
  • [179] S. C. et al., “Advances in networking, part 1,” https://developer.apple.com/videos/play/wwdc2017/707/, accessed:05-MAY-2019.
  • [180] S. Seo, “KT’s GiGA LTE,” IETF93, 2015.
  • [181] D. Wischik, C. Raiciu, A. Greenhalgh, and M. Handley, “Design, Implementation and Evaluation of Congestion Control for Multipath TCP,” Networked Systems Design and Implementation (NSDI), vol. 11, pp. 8–8, 2011.
  • [182] C. Raiciu, M. Handley, and D. Wischik, “Coupled Congestion Control for Multipath Transport Protocols,” RFC 6356, 2011.
  • [183] C. Paasch, S. Ferlin, O. Alay, and O. Bonaventure, “Experimental evaluation of multipath TCP schedulers,” Proceedings of the ACM SIGCOMM workshop on Capacity sharing workshop, pp. 27–32, 2014.
  • [184] S. Barré, C. Paasch, and O. Bonaventure, “Multipath TCP: from theory to practice,” International Conference on Research in Networking, pp. 444–457, 2011.
  • [185] B. Hesmans, F. Duchene, C. Paasch, G. Detal, and O. Bonaventure, “Are TCP extensions middlebox-proof?” Proceedings of the workshop on Hot topics in middleboxes and network function virtualization, pp. 37–42, 2013.
  • [186] S. Deng, R. Netravali, A. Sivaraman, and H. Balakrishnan, “WiFi, LTE, or both?: Measuring multi-homed wireless internet performance,” Proceedings of the Conference on Internet Measurement Conference, pp. 181–194, 2014.
  • [187] Y.-s. Lim, Y.-C. Chen, E. M. Nahum, D. Towsley, and K.-W. Lee, “Cross-layer path management in multi-path transport protocol for mobile devices,” pp. 1815–1823, 2014.
  • [188] E. Exposito, M. Gineste, L. Dairaine, and C. Chassot, “Building self-optimized communication systems based on applicative cross-layer information,” Computer Standards & Interfaces, vol. 31, no. 2, pp. 354–361, 2009.
  • [189] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H. Schwarzbauer, T. Taylor, I. Rytina, M. Kalia, L. Zhang, and V. Paxson, “Stream control transmission protocol,” RFC 2960, 2000.
  • [190] J. Stone, R. Stewart, and D. Otis, “Stream Control Transmission Protocol (SCTP) Checksum Change,” RFC 3309, 2002.
  • [191] R. Stewart, I. Arias-Rodriguez, K. Poon, A. Caro, and M. Tuexen, “Stream Control Transmission Protocol (SCTP) Specification Errata and Issues,” RFC 4460, 2006.
  • [192] R. Stewart, “Stream control transmission protocol,” RFC 4960, 2007.
  • [193] S. Fu and M. Atiquzzaman, “SCTP: State of the art in research, products, and technical challenges,” IEEE Communications Magazine, vol. 42, no. 4, pp. 64–76, 2004.
  • [194] R. Stewart, M. Ramalho, Q. Xie, M. Tuexen, and P. Conrad, “Stream Control Transmission Protocol (SCTP) Partial Reliability Extension,” RFC 3758, 2004.
  • [195] M. Tuexen, R. Seggelmann, R. Stewart, and S. Loreto, “Additional Policies for the Partially Reliable Stream Control Transmission Protocol Extension,” RFC 7496, 2015.
  • [196] H. Wang, Y. Jin, W. Wang, J. Ma, and D. Zhang, “The performance comparison of PRSCTP, TCP and UDP for MPEG-4 multimedia traffic in mobile network,” IEEE International Conference on Communication Technology Proceedings (ICCT), vol. 1, pp. 403–406, 2003.
  • [197] J. R. Iyengar, P. D. Amer, and R. Stewart, “Concurrent multipath transfer using SCTP multihoming over independent end-to-end paths,” IEEE/ACM Transactions on networking, vol. 14, no. 5, pp. 951–964, 2006.
  • [198] K. Wu, J. Xiao, Y. Yi, D. Chen, X. Luo, and L. M. Ni, “CSI-based indoor localization,” IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 7, pp. 1300–1309, 2013.
  • [199] D. Le Gall, “MPEG: A video compression standard for multimedia applications,” Communications of the ACM, vol. 34, no. 4, pp. 46–58, 1991.
  • [200] T. Fang and L.-P. Chau, “Robust group-of-picture architecture for video transmission over error-prone channels,” in Circuits and Systems, 2005. ISCAS 2005. IEEE International Symposium on.   IEEE, 2005, pp. 3447–3450.
  • [201] Y. Huo, C. Hellge, T. Wiegand, and L. Hanzo, “A tutorial and review on inter-layer FEC coded layered video streaming,” IEEE Communications Surveys & Tutorials, vol. 17, no. 2, pp. 1166–1207, 2015.
  • [202] F. Zhai, Y. Eisenberg, T. N. Pappas, R. Berry, and A. K. Katsaggelos, “Rate-distortion optimized product code forward error correction for video transmission over IP-based wireless networks,” in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 5, 2004, pp. V–857.
  • [203] P. Frossard, “FEC performance in multimedia streaming,” IEEE Communications Letters, vol. 5, no. 3, pp. 122–124, 2001.
  • [204] T. Nakachi, T. Yamaguchi, Y. Tonomura, and T. Fujii, “Next-generation media transport MMT for 4K/8K video transmission,” NTT Technical Review, vol. 12, no. 5, pp. 1–7, 2013.
  • [205] A. Shokrollahi, “Raptor codes,” IEEE transactions on information theory, vol. 52, no. 6, pp. 2551–2567, 2006.
  • [206] C.-I. Kuo, C.-H. Shih, C.-K. Shieh, W.-S. Hwang, and C.-H. Ke, “Modeling and analysis of frame-level forward error correction for MPEG video over burst-loss channels,” Applied Mathematics & Information Sciences, vol. 8, no. 4, p. 1845, 2014.
  • [207] T. D. Wallace and A. Shami, “A review of multihoming issues using the stream control transmission protocol,” IEEE Communications Surveys & Tutorials, vol. 14, no. 2, pp. 565–578, 2012.
  • [208] M. Kazemi, S. Shirmohammadi, and K. H. Sadeghi, “A review of multiple description coding techniques for error-resilient video delivery,” Multimedia Systems, vol. 20, no. 3, pp. 283–309, 2014.
  • [209] M. Kobayashi, H. Nakayama, N. Ansari, and N. Kato, “Robust and efficient stream delivery for application layer multicasting in heterogeneous networks,” IEEE Transactions on Multimedia, vol. 11, no. 1, pp. 166–176, 2009.
  • [210] M. Mitzenmacher, “The power of two choices in randomized load balancing,” IEEE Transactions on Parallel and Distributed Systems, vol. 12, no. 10, pp. 1094–1104, 2001.
  • [211] E. P. Ribeiro and V. C. Leung, “Minimum delay path selection in multi-homed systems with path asymmetry,” IEEE Communications Letters, vol. 10, no. 3, pp. 135–137, 2006.
  • [212] A. K. Paul, A. Tachibana, and T. Hasegawa, “An enhanced available bandwidth estimation technique for an end-to-end network path,” IEEE Transactions on Network and Service Management, vol. 13, no. 4, pp. 768–781, 2016.
  • [213] M. Jain and C. Dovrolis, “Pathload: A measurement tool for end-to-end available bandwidth,” in Proceedings of Passive and Active Measurements (PAM) Workshop, 2002.
  • [214] A. Zhou, M. Liu, Y. Song, Z. Li, H. Deng, and Y. Ma, “A new method for end-to-end available bandwidth estimation,” in IEEE Global Communications Conference (GLOBECOM), 2008, pp. 1–5.
  • [215] J. Wu, C. Yuen, N.-M. Cheung, and J. Chen, “Delay-constrained high definition video transmission in heterogeneous wireless networks with multi-homed terminals,” IEEE Transactions on Mobile Computing, vol. 15, no. 3, pp. 641–655, 2016.
  • [216] “Abing project page,” Stanford University, CA, http://iphome.hhi.de/suehring/tml/download/, accessed:10-JUL-2017.
  • [217] V. J. Ribeiro, R. H. Riedi, R. G. Baraniuk, J. Navratil, and L. Cottrell, “pathChirp: Efficient available bandwidth estimation for network paths,” in Passive and Active Measurement Workshop, 2003.
  • [218] K. Ramakrishnan, S. Floyd, and D. Black, “The Addition of Explicit Congestion Notification (ECN) to IP,” RFC 3168, 2001.
  • [219] K. Rijkse, “H.263: video coding for low-bit-rate communication,” IEEE Communications Magazine, vol. 34, no. 12, pp. 42–45, 1996.
  • [220] “Advanced Video Coding for Generic Audio-Visual Services,” ITU-T Rec. H.264 and ISO/IEC 14496-10 (MPEG-4 AVC), 2017.
  • [221] H. Schwarz, D. Marpe, and T. Wiegand, “Overview of the scalable video coding extension of the H.264/AVC standard,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 17, no. 9, pp. 1103–1120, 2007.
  • [222] J. Boyce, J. Chen, Y. Chen, D. Flynn, M. Hannuksela, M. Naccari, C. Rosewarne, K. Sharman, J. Sole, G. Sullivan et al., “Edition 2 Draft Text of High Efficiency Video Coding (HEVC), Including Format Range (RExt), Scalability (SHVC), and Multi-View (MV-HEVC) Extensions,” document JCTVC-R1013, 2014.
  • [223] Y.-L. Chang, T.-L. Lin, and P. C. Cosman, “Network-based H.264/AVC whole-frame loss visibility model and frame dropping methods,” IEEE Transactions on Image Processing, vol. 21, no. 8, pp. 3353–3363, 2012.
  • [224] Joint Video Team (JVT), “H.264/MPEG4-AVC Model (JM) repository,” http://iphome.hhi.de/suehring/tml/download/, accessed:5-JUL-2017.
  • [225] “x264,” http://www.videolan.org/developers/x264.html, accessed:10-JUL-2017.
  • [226] Joint Video Team (JVT), “Joint Scalable Video Model (JSVM) repository,” https://www.hhi.fraunhofer.de/en/departments/vca/research-groups/image-video-coding/research-topics/svc-extension-of-h264avc/jsvm-reference-software.html, accessed:10-JUL-2017.
  • [227] “FFmpeg,” https://ffmpeg.org, accessed:10-JUL-2017.
  • [228] Joint Collaborative Team On Video Coding (JCT-VC), “HEVC Model (HM) repository,” https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/, accessed:15-JUL-2017.
  • [229] “OPNET,” https://www.riverbed.com/br/products/steelcentral/opnet.html?redirect=opnet, accessed:3-JUN-2017.
  • [230] “Network Simulator - NS-2,” http://www.isi.edu/nsnam/ns/, accessed:3-JUN-2017.
  • [231] “Network Simulator - NS-3,” https://www.nsnam.org/, accessed:3-JUN-2017.
  • [232] “EXata,” http://web.scalable-networks.com/exata, accessed:15-FEB-2017.
  • [233] “Netem,” https://wiki.linuxfoundation.org/networking/netem, accessed:15-JUN-2017.
  • [234] S. Hemminger, “Network emulation with NetEm,” Linux conf au, pp. 18–23, 2005.
  • [235] R. R. Fontes, S. Afzal, S. H. B. Brito, M. A. S. Santos, and C. E. Rothenberg, “Mininet-WiFi: Emulating software-defined wireless networks,” in 2015 11th International Conference on Network and Service Management (CNSM), Nov 2015, pp. 384–389.
  • [236] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.
  • [237] Z. Wang, E. P. Simoncelli, and A. C. Bovik, “Multiscale structural similarity for image quality assessment,” IEEE Thirty-Seventh Asilomar Conference on Signals, Systems and Computers, vol. 2, pp. 1398–1402, 2003.
  • [238] J.-L. Blin, “New quality evaluation method suited to multimedia context: SAMVIQ,” Proceedings of the Second International Workshop on Video Processing and Quality Metrics, VPQM, vol. 6, 2006.
  • [239] A. Frömmgen, A. Rizk, T. Erbshäußer, M. Weller, B. Koldehofe, A. Buchmann, and R. Steinmetz, “A programming model for application-defined multipath TCP scheduling,” in Proceedings of the 18th ACM/IFIP/USENIX Middleware Conference. ACM, 2017, pp. 134–146.
  • [240] M. Yan, J. Casey, P. Shome, A. Sprintson, and A. Sutton, “Ætherflow: Principled wireless support in SDN,” in 2015 IEEE 23rd International Conference on Network Protocols (ICNP), Nov 2015, pp. 432–437.
  • [241] J. Lee, M. Uddin, J. Tourrilhes, S. Sen, S. Banerjee, M. Arndt, K.-H. Kim, and T. Nadeem, “meSDN: Mobile extension of SDN,” in Proceedings of the Fifth International Workshop on Mobile Cloud Computing & Services, ser. MCS ’14. New York, NY, USA: ACM, 2014, pp. 7–14. [Online]. Available: http://doi.acm.org/10.1145/2609908.2609948
  • [242] N. F. S. de Sousa, D. A. L. Perez, R. V. Rosa, M. A. Santos, and C. E. Rothenberg, “Network service orchestration: A survey,” Computer Communications, vol. 142-143, pp. 69–94, 2019. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0140366418309502
  • [243] G. Dandachi, “Multihoming in heterogeneous wireless networks,” Ph.D. dissertation, Evry, Institut national des télécommunications, 2017.
  • [244] S. Ibnalfakih, E. Sabir, and M. Sadik, “Multi-homing as an Enabler for 5G Networks: Survey and Open Challenges,” in International Symposium on Ubiquitous Networking. Springer, 2016, pp. 347–356.
  • [245] P. Karimi, M. Sherman, F. Bronzino, I. Seskar, D. Raychaudhuri, and A. Gosain, “Evaluating 5G multihoming services in the MobilityFirst future internet architecture,” in IEEE Vehicular Technology Conference (VTC Spring). IEEE, 2017, pp. 1–5.
  • [246] D. Purkayastha, M. Perras, and A. Rahman, “Considerations for MPTCP operation in 5G,” Working Draft, IETF Secretariat, Internet-Draft draft-purkayastha-mptcp-considerations-for-nextgen-00, October 2017. [Online]. Available: http://www.ietf.org/internet-drafts/draft-purkayastha-mptcp-considerations-for-nextgen-00.txt
  • [247] K. Lei, S. Zhong, F. Zhu, K. Xu, and H. Zhang, “An NDN IoT Content Distribution Model With Network Coding Enhanced Forwarding Strategy for 5G,” IEEE Transactions on Industrial Informatics, vol. 14, no. 6, pp. 2725–2735, 2018.
  • [248] A. A. Barakabitze, L. Sun, I.-H. Mkwawa, and E. Ifeachor, “A Novel QoE-Centric SDN-based Multipath Routing Approach for Multimedia Services over 5G Networks,” in IEEE International Conference on Communications (ICC), 2018, pp. 1–7.
  • [249] Y. Ghasempour, C. R. da Silva, C. Cordeiro, and E. W. Knightly, “IEEE 802.11ay: Next-generation 60 GHz communication for 100 Gb/s Wi-Fi,” IEEE Communications Magazine, vol. 55, no. 12, pp. 186–192, 2017.
  • [250] “AirEngine: A Leading Wi-Fi 6 Product Among Future 5G Networks,” https://e.huawei.com/za/products/enterprise-networking/wlan/wifi-6?utm_medium=cpc&utm_source=google&utm_campaign=EEBGHQ197121L&utm_content=General&source=Adcopy&gclid=CjwKCAjw_MnmBRAoEiwAPRRWWzheigpJFt_8HWSpn_trjC7EpWSUXtdQCkU9bdGC5rRAcDMqElpGyBoCEs4QAvD_BwE, accessed:06-may-2019.
  • [251] X. Li, M. Dong, Z. Ma, and F. C. Fernandes, “Greentube: power optimization for mobile videostreaming via dynamic cache management,” Proceedings of the 20th ACM international conference on Multimedia, pp. 279–288, 2012.
  • [252] A. Kaul, K. Obraczka, M. A. S. Santos, C. E. Rothenberg, and T. Turletti, “Dynamically distributed network control for message dissemination in ITS,” in 2017 IEEE/ACM 21st International Symposium on Distributed Simulation and Real Time Applications (DS-RT), Oct 2017, pp. 1–9.
  • [253] R. D. R. Fontes, C. Campolo, C. E. Rothenberg, and A. Molinaro, “From theory to experimental evaluation: Resource management in software-defined vehicular networks,” IEEE Access, vol. 5, pp. 3069–3076, 2017.
  • [254] M. Latah and L. Toker, “Artificial Intelligence Enabled Software Defined Networking: A Comprehensive Overview,” arXiv preprint arXiv:1803.06818, 2018.
  • [255] H. Mao, R. Netravali, and M. Alizadeh, “Neural adaptive video streaming with Pensieve,” in Proceedings of the Conference of the ACM Special Interest Group on Data Communication, ser. SIGCOMM ’17. New York, NY, USA: ACM, 2017, pp. 197–210. [Online]. Available: http://doi.acm.org/10.1145/3098822.3098843
  • [256] A. Al-Jawad, P. Shah, O. Gemikonakli, and R. Trestian, “LearnQoS: a learning approach for optimizing QoS over multimedia-based SDNs,” in IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB), 2018, pp. 1–6.
  • [257] V. Vasilev, J. Leguay, S. Paris, L. Maggi, and M. Debbah, “Predicting QoE factors with machine learning,” in IEEE International Conference on Communications (ICC), 2018, pp. 1–6.
  • [258] I. R. Alzahrani, N. Ramzan, S. Katsigiannis, and A. Amira, “Use of Machine Learning for Rate Adaptation in MPEG-DASH for Quality of Experience Improvement,” in 5th International Symposium on Data Mining Applications. Springer, 2018, pp. 3–11.
  • [259] “AVAYA,” https://downloads.avaya.com/css/P8/documents/100134063, accessed:05-MAY-2019.
  • [260] “EtherChannels,” https://www.cisco.com/c/en/us/td/docs/switches/lan/catalyst4500/12-2/31sg/configuration/guide/conf/channel.pdf, accessed:05-MAY-2019.