Towards Seamless Authentication for Zoom-Based Online Teaching and Meeting

05/21/2020 ∙ by Manoranjan Mohanty, et al.

The lockdowns and travel restrictions of the current coronavirus pandemic have replaced face-to-face teaching and meetings with online teaching and meetings. Recently, the video conferencing tool Zoom has become extremely popular for its ease of use and low network bandwidth requirement. However, Zoom has serious security and privacy issues. Due to weak authentication mechanisms, unauthorized persons are invading Zoom sessions and creating disturbances (known as Zoom bombing). In this paper, we propose a preliminary work towards a seamless authentication mechanism for Zoom-based teaching and meetings. Our method is based on PRNU (Photo Response Non-Uniformity)-based camera authentication, which can authenticate the camera of a device used in a Zoom meeting without requiring any assistance from the participants (e.g., without needing the participant to provide a biometric). Results from a small-scale experiment validate the proposed method.




1 Introduction

The current coronavirus (COVID-19) pandemic has brought many changes to how teaching and meetings happen around the world. Due to lockdowns and travel restrictions, regular face-to-face teaching (classroom teaching) and face-to-face meetings are being replaced with video-conference-based online teaching and meetings. Recently, the video conferencing tool Zoom has become extremely popular for its ease of use and low network bandwidth requirement, resulting in the so-called Zoom booming [1].

However, as has been reported by many media houses and acknowledged by Zoom, Zoom has serious security and privacy issues [2]. To lower the network bandwidth and network latency requirements, Zoom does not use end-to-end encryption (as encryption introduces extra overheads). Although a password-based authentication mechanism is provided, the use of a password is optional in order to make Zoom more user-friendly, and the password setting is not enabled in the default Zoom setup. As a result, many Zoom sessions are password-less. Some Zoom sessions are attended by unauthorized persons, leading to Zoom bombing (teachers being racially abused and students being shown pornographic videos) and Zoom eavesdropping (confidential conversations being secretly heard) [3, 4].

A combination of Zoom features can be used to address the security and privacy issues to some extent [5]. Besides using the password, the waiting room feature can be used by the teacher or the meeting host to accept or reject joining requests from students or meeting participants [6]. Although this feature can be useful in controlling a small classroom or meeting, it can fail for a bigger classroom or meeting (where hundreds of participants can join). It could be nearly impossible for the teacher or the meeting host to remember hundreds of names and online identities of students and meeting participants. This solution can also create a lot of hassle, as some joining requests have to be handled in the middle of the Zoom session. Besides the waiting room feature, other guidelines can be followed, such as not sharing the meeting links in the public domain, sharing the meeting links just before the start of the meeting, not using recurring meeting options, and controlling the screen share option. None of them, however, can provide a foolproof defense against a savvy attacker (e.g., a hacker).

Taking Zoom as an example, in this paper we present a preliminary work towards a scalable, automatic, and hassle-free authentication scheme for video-conferencing-based online meetings and teaching. The proposed scheme combines PRNU [7] (Photo Response Non-Uniformity)-based authentication with password-based authentication. The PRNU-based authentication can seamlessly authenticate a meeting participant (or a student) by authenticating the camera of the device she uses in the meeting (or online classroom). In such cases, the meeting participant enjoys the same usability that she enjoys with the default Zoom setup (no password to remember, no keys to press, no biometric required). Whenever the PRNU-based scheme cannot authenticate, the participant is asked to enter the password. The proposed scheme was experimentally validated by assessing the effectiveness of the PRNU-based method for the webcams of desktops and the selfie cameras of mobile phones and tablets (as these cameras are typically used in Zoom sessions). The experiment with a set of cameras shows promising results.

The rest of the paper is organized as follows. Section 2 presents related work and provides an overview of how Zoom works. In Section 3, we discuss the proposed method. Section 4 presents experimental results. Finally, Section 5 concludes and discusses how we intend to extend this work.

In the rest of the paper, we will use the term Zoom meeting in the place of Zoom-based meeting or Zoom-based teaching.

2 Background and Related Work

2.1 Video Conferencing Using Zoom

Zoom is arguably one of the most popular video conferencing tools at present. One of the main reasons for this popularity is Zoom's ease of use. To join an online meeting room, the participant does not need to go through the hassle of registering with Zoom. Only a meeting ID (which is a multi-digit number) is enough. By default, Zoom does not authenticate a participant. This leads to a number of security and privacy issues, including the authentication issue.

Figure 1: Threat Model.

Figure 1 shows how an authentication threat can arise for a Zoom meeting with the default meeting setup. It is assumed that the attacker can know the meeting ID. This is a safe assumption, as people share meeting IDs in public forums (e.g., social media). It is also assumed that the host and legitimate participants will often not detect that the attacker has joined. This is also a safe assumption, as the attacker can join in the middle of the meeting when the host and participants are busy. After joining the meeting, the attacker can launch Zoom bombing and Zoom eavesdropping.

After a series of security issues, Zoom has encouraged the use of its waiting room feature. Although this feature can address the authentication issue for a small meeting, authentication is still a concern for a large meeting. For a large meeting, it would be difficult for the host to remember the identities of all participants (e.g., hundreds of names). In such a case, it is safe to assume that the host will either not use the waiting room feature or randomly approve connection requests. This keeps the threat model presented in Figure 1 valid.

Figure 2: Camera attribution using two videos.

2.2 PRNU-Based Camera Fingerprinting

The PRNU (Photo Response Non-Uniformity)-based source camera attribution is an effective method for identifying if two different videos (or images) belong to the same camera [7, 8, 9, 10]. Figure 2 shows how this method works. PRNU-based camera attribution is based on the fact that the output of the camera sensor, $I$, can be modeled as

$$I = I^{(0)} + I^{(0)} K + \Psi,$$

where $I^{(0)}$ is the noise-free video frame (or an image), $K$ is the PRNU noise, and $\Psi$ is the combination of additional noise, such as readout noise, dark current, and quantization noise. The multiplicative PRNU noise pattern, $K$, is unique for each camera and can be used as a camera fingerprint which enables the attribution of a video to its source camera. Using a denoising filter $F$ (such as a Wavelet filter) on a set of video frames of a camera (where it is known that the video belongs to the camera; physical access to the camera is not required), we can estimate a known camera fingerprint by first getting the noise residual, $W_k$ (i.e., the estimated PRNU), of the $k$-th frame $I_k$ as $W_k = I_k - F(I_k)$, and then averaging the noise residuals of all the frames. For determining if a specific camera has taken a given query video, we similarly obtain a query fingerprint and then match (correlate) this fingerprint with the known fingerprint. The matching is typically done using Peak-to-Correlation Energy (PCE) with a matching threshold. If the matching score is above the threshold, it is concluded that both videos belong to the same camera. So far, this PRNU method has been used for camera verification, camera identification, image/video clustering, etc.
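The estimation and matching steps described above can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch under stated assumptions, not the implementation used in this paper: a simple mean filter stands in for the Wavelet denoiser, frames are assumed to be grayscale, equally sized, and pixel-aligned (no cropping or scaling), the correlation peak is therefore read at zero shift, and the function names (`box_denoise`, `noise_residual`, `estimate_fingerprint`, `pce`) are hypothetical.

```python
import numpy as np


def box_denoise(frame: np.ndarray, radius: int = 1) -> np.ndarray:
    """Stand-in denoiser (ASSUMPTION): mean filter with wrap-around edges.

    The paper mentions a Wavelet filter; a mean filter is used here only
    to keep the sketch dependency-free.
    """
    acc = np.zeros_like(frame, dtype=np.float64)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            acc += np.roll(frame, (dy, dx), axis=(0, 1))
    return acc / (2 * radius + 1) ** 2


def noise_residual(frame: np.ndarray) -> np.ndarray:
    """Noise residual of one frame: the frame minus its denoised version."""
    frame = np.asarray(frame, dtype=np.float64)
    return frame - box_denoise(frame)


def estimate_fingerprint(frames) -> np.ndarray:
    """Camera fingerprint: average of the noise residuals of all frames."""
    return np.mean([noise_residual(f) for f in frames], axis=0)


def pce(fp_a: np.ndarray, fp_b: np.ndarray, exclude: int = 5) -> float:
    """Signed Peak-to-Correlation Energy between two fingerprints.

    The peak is taken at zero shift (ASSUMPTION: aligned, uncropped frames).
    The energy is the mean squared correlation outside a small
    neighbourhood of the peak.
    """
    if fp_a.shape != fp_b.shape:
        raise ValueError("fingerprints must have the same shape")
    a = fp_a - fp_a.mean()
    b = fp_b - fp_b.mean()
    # Circular cross-correlation computed through the FFT.
    xcorr = np.real(np.fft.ifft2(np.fft.fft2(a) * np.conj(np.fft.fft2(b))))
    peak = xcorr[0, 0]
    # Mask out the (2*exclude+1)^2 neighbourhood around the peak;
    # negative indices wrap around, matching the circular correlation.
    mask = np.ones(xcorr.shape, dtype=bool)
    idx = np.arange(-exclude, exclude + 1)
    mask[np.ix_(idx, idx)] = False
    energy = np.mean(xcorr[mask] ** 2)
    return float(np.sign(peak) * peak ** 2 / energy)
```

A fingerprint computed from a registration video and one computed from a later video of the same camera should give a PCE far above the value obtained for two different cameras; the decision is then a comparison of the PCE against the chosen threshold.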

The PRNU-based method has also been used for authenticating users via the camera they use [11, 12]. The existing schemes, however, have been designed for images. In contrast, our method considers video. Also, unlike previous schemes, our method combines the PRNU-based method with a password.

3 Proposed Method

Figure 3: Camera fingerprint registration. ’User 1’ to ’User N’ are N legitimate participants.
Figure 4: Camera fingerprint matching. ’User 1’ to ’User N’ are N legitimate participants.

The proposed authentication method uses both PRNU-based authentication and password-based authentication to make the authentication process as seamless as possible. First, the PRNU-based authentication is invoked. If a participant is authenticated using this method, she is admitted to the Zoom meeting. Otherwise, the password-based authentication is invoked. If the participant enters a valid password, she is also allowed to join the meeting. Otherwise, the participant is not admitted to the meeting. The PRNU-based authentication is a truly seamless method. In this method, the participant's direct involvement is not required, as she is not asked to provide a biometric (e.g., facial expression or fingerprint), to enter a password, or to respond to a security question. Rather, the participant's camera (of the device that the participant uses in Zoom sessions) is authenticated without prompting her for anything. The PRNU-based method does not achieve a perfect true positive rate. Thus, a few legitimate participants will not be authenticated using this method. Only these participants will be asked to enter a password.

Our method has two main steps: Camera fingerprint registration and Camera fingerprint matching.

Figure 3 provides an overview of the fingerprint registration step. In this step, each legitimate participant of a Zoom meeting must register the fingerprint of her camera with the meeting host. This step is done only once, before the start of a meeting. For recurring meetings, registration is likewise required only once, before the start of the meeting series. This step can be implemented in a number of ways. One possible way, which is more suitable for recurring meetings, is to integrate the registration step into a customized Zoom installation app. The host sends the installation app to each participant. During installation, a short video (say, one minute) is automatically taken (after obtaining the participant's permission) using the camera of the device on which the app is being installed. Then a camera fingerprint is computed from the video using the method described in Section 2. The computed fingerprint (e.g., the fingerprint of Participant N in Figure 3) is sent to the host. This implementation suits an online teaching setup well, where the students can be asked by the educational institute to install a customized Zoom app. Another possible implementation is to ask the participants to send a short video from their camera using a secure communication method (such as secure email). The host can then compute the camera fingerprint of each participant.
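As an illustration of the host-side bookkeeping in the registration step, the following sketch keeps one fingerprint per participant and accumulates the noise residuals frame by frame, so that the registration video never has to be held in memory in full. It is a sketch under assumptions rather than part of the proposed system: the class name `FingerprintRegistry`, its methods, and the participant identifiers are hypothetical, and the denoising filter is passed in as a callable so that any filter (for example, a Wavelet filter) can be plugged in.

```python
from typing import Callable, Dict, Iterable, Optional

import numpy as np

# Any function that maps a frame to its denoised version (ASSUMPTION).
Denoiser = Callable[[np.ndarray], np.ndarray]


class FingerprintRegistry:
    """Hypothetical host-side store of registered camera fingerprints."""

    def __init__(self, denoise: Denoiser) -> None:
        self._denoise = denoise
        self._fingerprints: Dict[str, np.ndarray] = {}

    def register(self, participant_id: str, frames: Iterable[np.ndarray]) -> int:
        """Average the noise residuals of the frames and store the result.

        Returns the number of frames that were used.
        """
        total: Optional[np.ndarray] = None
        count = 0
        for frame in frames:
            frame = np.asarray(frame, dtype=np.float64)
            residual = frame - self._denoise(frame)
            total = residual if total is None else total + residual
            count += 1
        if total is None:
            raise ValueError("registration video contains no frames")
        self._fingerprints[participant_id] = total / count
        return count

    def get(self, participant_id: str) -> Optional[np.ndarray]:
        """Return the registered fingerprint, or None if not registered."""
        return self._fingerprints.get(participant_id)
```

Because `register` accepts any iterable, the frames can be produced lazily by a video decoder. Registering the same participant again simply overwrites the stored fingerprint, which is one possible policy for a participant who changes devices.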

Figure 4 provides an overview of the fingerprint matching step. In this step, the authenticity of each participant is checked. This step is performed each time a participant wants to join a Zoom meeting. From the initial few seconds of the participant's video, a camera fingerprint is computed. The camera fingerprint (e.g., the fingerprint for Participant N in Figure 4) is then sent to the host. Note that no assistance from the participant is required for computing the fingerprint and sending it to the host. For each participant, the host matches the registered fingerprint with the recently obtained fingerprint (for example, the registered fingerprint of Participant N is matched with the fingerprint just obtained from Participant N). If the match result is above a threshold, the participant is authenticated, and her Zoom joining request is approved. Otherwise, the user is asked to enter a password that was set up by the meeting host. The current version of Zoom allows the meeting host to set up a password. This password needs to be sent to the participants using other communication methods (such as email or phone).
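The admission logic of the matching step (camera fingerprint first, password only as a fallback) can be summarized in a short sketch. The names `Decision` and `authenticate`, as well as the way the password prompt is passed in as a callable, are assumptions made for illustration; the match score is assumed to be a PCE value computed elsewhere, and `None` stands for the case where no registered fingerprint or no usable video is available.

```python
import hmac
from enum import Enum
from typing import Callable, Optional


class Decision(Enum):
    ADMIT_BY_CAMERA = "admitted via camera fingerprint"
    ADMIT_BY_PASSWORD = "admitted via password"
    REJECT = "rejected"


def authenticate(
    match_score: Optional[float],
    threshold: float,
    ask_password: Callable[[], str],
    meeting_password: str,
) -> Decision:
    """Hypothetical two-stage admission check for one joining request."""
    # Stage 1: seamless check. The participant is not prompted at all
    # when the match score is above the threshold.
    if match_score is not None and match_score > threshold:
        return Decision.ADMIT_BY_CAMERA
    # Stage 2: fallback. Only now is the participant asked for the
    # meeting password; compare_digest avoids leaking timing information.
    entered = ask_password()
    if hmac.compare_digest(entered.encode(), meeting_password.encode()):
        return Decision.ADMIT_BY_PASSWORD
    return Decision.REJECT
```

Passing the prompt as a callable makes the seamless property explicit: `ask_password` is never invoked for a participant whose camera is recognized.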

4 Experiments

| Device | Resolution | Register time, I-frames (secs) | Register time, FP (secs) | Verify time, 100 frames | First frame |
| Huawei Honor 8 | 1280 x 720 | | | | |
| Samsung S9+ | 1280 x 720 | | | | |
| Dell Laptop XPS | 1280 x 720 | | | | |
| Dell Desktop | 1280 x 720 | | | | |
| HP laptop | 1280 x 720 | | | | |
| iPhone 8 Plus | 1920 x 1088 | | | | |
| Dell Laptop XPS | 1280 x 720 | | | | |
| Samsung Note 9 | 1280 x 720 | | | | |
| Samsung Note 10 | 1280 x 720 | | | | |
| Dell Laptop Inspiron | 1280 x 720 | | | | |
Table 1: Run time performance of registration and verification of various cameras

The proposed method is experimentally validated by assessing the performance of the PRNU-based method on short videos taken by the front cameras of various computing devices (PC, laptop, mobile phone, tablet). In this small-scale experiment, the goal is to study the feasibility of the proposed method.

Figure 5: Various Cameras PCE in the first 100 frames of video based on added Noise of subsequent frames

The PRNU-based method (both fingerprint computation and matching) is implemented in MATLAB on a Windows system. Videos from different computing devices were used. Table 1 provides the list of computing devices considered for the experiment. From each computing device, an HD video (as Zoom uses HD video [13]) is considered. The videos have not gone through stabilization or out-of-camera processing (such as scaling or cropping). For computing the camera fingerprint used in the registration step, I-frames from the registration video are used. For computing the camera fingerprint used in the matching step (i.e., the query fingerprint), frames (I, P, and B frames) from a shorter video are used. This short video is used to keep the delay (due to PRNU matching) on meeting requests small. The PRNU matching is done using PCE with a threshold.

Figure 5 shows the performance of the proposed method. Matching attempts whose obtained PCE was larger than the threshold were counted as successful, which determines the true positive rate for this small-scale experiment. The achieved true positive rate is by no means generalisable. A large-scale experiment is required to find the error rates. However, this small-scale experiment shows that the proposed approach works.

Table 1 shows the computation cost of computing the query fingerprint and performing matching (Verify time) from 100 frames. This cost is the main contributor to the run-time delay that a user needs to bear due to authentication. The table also reports the computation cost required in the registration phase. This cost, however, is a one-time offline cost. Note that for a shorter video all the computation costs will be lower, and for a longer video they will be higher. This trade-off arises because fewer frames require fewer denoising operations (denoising is the main contributor to the delay).

5 Conclusion and Future Work

Zoom-based online meetings have become very popular due to the current coronavirus scenario. Zoom, however, has serious security and privacy issues. One of the main security issues is the poor authentication mechanism, due to which an unauthorized person is able to join a meeting and create disturbances, leading to Zoom bombing and Zoom eavesdropping. In this paper, we have proposed our preliminary work towards a seamless authentication scheme. The proposed scheme uses a PRNU-based camera authentication method to authenticate meeting participants. Those who are not authenticated using this method need to provide a password. A small-scale experiment shows that the proposed method works as expected. However, a number of improvements are needed to make the method usable. We describe some of the possible improvements below.

5.1 Future Work

Large Scale Experiment: A large-scale experiment needs to be done for finding better estimates of error rates (e.g., true positive rate and false positive rate).

Protection against attacks: There can be a number of attacks. For example, if an attacker collects a video of a legitimate participant from another source (e.g., social media), she can impersonate the participant. Techniques need to be developed for withstanding such attacks.

Developing an app: An app implementing the proposed idea needs to be developed. A large-scale user study needs to be conducted by asking users (both computer experts and non-computer users) to use the app. This study will provide a better evaluation of the proposed idea in terms of performance, usability, and security.


This work has been supported by University of Technology MaPS Startup funding 263010-0226628.