
Performance Evaluation of Selective Fixed-filter Active Noise Control based on Different Convolutional Neural Networks

Due to its rapid response time and a high degree of robustness, the selective fixed-filter active noise control (SFANC) method appears to be a viable candidate for widespread use in a variety of practical active noise control (ANC) systems. In comparison to conventional fixed-filter ANC methods, SFANC can select the pre-trained control filters for different types of noise. Thus, deep learning technologies can be used in SFANC methods to enable more flexible selection of the most appropriate control filters for attenuating various noises. Furthermore, with the assistance of a deep neural network, the selection strategy can be learned automatically from noise data rather than through trial and error, which significantly simplifies and improves the practicability of ANC design. Therefore, this paper investigates the performance of SFANC based on different one-dimensional and two-dimensional convolutional neural networks. Additionally, we conducted comparative analyses of several network training strategies and discovered that fine-tuning could improve selection performance.



1 Introduction

Acoustic noise problems are becoming more prevalent as the quantity of industrial equipment increases [4]. The attenuation of low-frequency noises is quite difficult and expensive for passive noise control techniques such as enclosures, barriers, silencers, etc. Different from passive techniques, active noise control (ANC) involves the electro-acoustic generation of a sound field to cancel an unwanted existing sound field [7]. Moreover, ANC can offer a possible lower-cost alternative for the control of low-frequency noises, so it attracts much interest from industry. When dealing with different types of noises, traditional ANC systems typically use adaptive algorithms to adjust control filter coefficients to minimize the error signal [17]. Among adaptive algorithms, the filtered-X least mean square (FxLMS) and filtered-X normalized least mean square (FxNLMS) algorithms are commonly used since they can compensate for the delay introduced by the secondary path, increasing the system's robustness [25].
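As background, the FxLMS update mentioned above can be sketched in a few lines. The following single-channel simulation is illustrative only: the filter length, step size, and simulated paths are our assumptions, not the paper's configuration, and the secondary-path estimate is reused as the true path for simplicity.

```python
import numpy as np

def fxlms(x, d, s_hat, L=64, mu=0.002):
    """Minimal single-channel FxLMS sketch.
    x: reference signal, d: disturbance at the error microphone,
    s_hat: FIR estimate of the secondary path (also used to simulate it)."""
    w = np.zeros(L)                          # adaptive control filter
    y = np.zeros(len(x))                     # anti-noise signal
    e = np.zeros(len(x))                     # residual error
    xf = np.convolve(x, s_hat)[:len(x)]      # filtered reference x'(n)
    M = len(s_hat)
    for n in range(L, len(x)):
        xb = x[n - L + 1:n + 1][::-1]        # latest L reference samples
        y[n] = w @ xb                        # anti-noise sample
        ys = s_hat @ y[n - M + 1:n + 1][::-1]  # anti-noise through secondary path
        e[n] = d[n] - ys                     # residual at the error microphone
        w += mu * e[n] * xf[n - L + 1:n + 1][::-1]  # FxLMS coefficient update
    return w, e
```

Because the reference is filtered through the secondary-path estimate before entering the update, the algorithm compensates for the secondary-path delay, which is the robustness property noted above.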

Figure 1: Block diagram of the CNN-based SFANC algorithm.

However, due to the inherently slow convergence and poor tracking ability of least mean square (LMS) based algorithms [16], FxLMS and FxNLMS are less capable of dealing with rapidly varying or non-stationary noises. Their slow responses to noises may impact customers’ perceptions of the noise reduction effect [20]. Fixed-filter ANC methods [19] can be adopted to tackle slow convergence, where the control filter coefficients are pre-trained rather than adaptively updated. However, a pre-trained control filter is only suitable for a specific noise type, resulting in degraded noise reduction performance for other types of noises. To rapidly select different pre-trained control filters given different noise types, a selective fixed-filter active noise control (SFANC) method based on frequency band matching was proposed in [21].

Though the SFANC method [21] selects the most suitable pre-trained control filters in response to different noise types, several critical parameters of the method can only be determined through trial and error. Given these limitations, deep learning techniques, particularly convolutional neural networks (CNNs) [8, 23, 11, 12], appear to be powerful tools for classifying noises in SFANC methods. Automatically learning the SFANC algorithm’s critical parameters through deep learning would broaden its applications in real-world scenarios.

With the learning ability of CNN models, the SFANC algorithm can automatically learn its parameters from noise datasets and select the best control filter for different noise types without extra human effort [22]. Additionally, a CNN model implemented on a co-processor can decouple the computational load from the real-time noise controller. Therefore, in this paper, we compared the performance of several one-dimensional (1D) CNNs and two-dimensional (2D) CNNs in the SFANC method. In addition, different network training strategies are compared to choose the best one for training the networks. Experiments show that the CNN-based SFANC method not only achieves faster responses than FxLMS and FxNLMS but also exhibits good robustness. Thus, it is expected to be useful for attenuating dynamic noises such as traffic and urban noises.

2 CNN-based SFANC Algorithm

The overall architecture of the CNN-based SFANC algorithm is depicted in Figure 1. Throughout the control process, the real-time controller conducts filtering to generate anti-noise while simultaneously sending the primary noise to a co-processor (e.g., a mobile phone). Given the primary noise, the co-processor employs a pre-trained CNN to produce the index for the most appropriate control filter and delivers it to the real-time controller. The controller then adjusts the control filter coefficients based on the received filter index. Notably, if the network is a 1D CNN, its input is the raw waveform [14]. However, if the network is a 2D CNN, its input is the Log Mel-spectrogram [18].
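The division of labor between the co-processor and the real-time controller can be sketched as follows. This is a toy illustration: the pre-trained CNN is replaced here by a simple spectral-centroid matcher (`classify`), and the filter bank and centroid values are made-up stand-ins, not the paper's components.

```python
import numpy as np

def classify(frame, centroids):
    """Toy stand-in for the pre-trained CNN: pick the filter whose stored
    spectral centroid is closest to the frame's centroid."""
    spec = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame))          # cycles per sample
    c = (freqs * spec).sum() / (spec.sum() + 1e-12)
    return int(np.argmin(np.abs(centroids - c)))

def sfanc_step(frame, filter_bank, centroids):
    """One SFANC cycle: classify the primary noise, load the fixed
    control filter, and generate the anti-noise by filtering."""
    idx = classify(frame, centroids)             # co-processor: filter index
    w = filter_bank[idx]                         # controller: fixed coefficients
    y = np.convolve(frame, w)[:len(frame)]       # anti-noise output
    return idx, y
```

In the real system the classifier runs on a co-processor (e.g., a mobile phone), so only the index travels back to the real-time controller.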

Figure 2: Markov model of the ANC progress.

2.1 Concise Explanation of SFANC

An ANC process can be abstracted as a first-order Markov chain [9], as shown in Figure 2, where $W^o$ represents the optimal control filter to attenuate the disturbance $d$. To achieve the best noise reduction performance, the best control filter can be selected from a pre-trained filter set $\{W_1, W_2, \ldots, W_N\}$. Hence, the SFANC method can be represented as

$$W^o = \operatorname*{arg\,min}_{W_i \in \{W_1,\ldots,W_N\}} \left\| d - s * (W_i * x) \right\|^2, \quad (1)$$

where the $\arg\min$ operator returns the input value for the minimum output; $*$, $x$, and $s$ represent the linear convolution, the reference signal, and the impulse response of the secondary path, respectively. The reference signal is assumed to be the same as the primary noise.
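The selection rule described above can be checked offline by simulating each candidate filter and keeping the one with the smallest residual. The NumPy sketch below is illustrative: the filter bank, paths, and signals are placeholders.

```python
import numpy as np

def select_filter(x, d, s, filter_bank):
    """Return the index of the pre-trained control filter minimizing the
    residual ||d - s * (w_i * x)||^2 over the filter bank."""
    errs = []
    for w in filter_bank:
        y = np.convolve(x, w)[:len(x)]       # anti-noise w_i * x
        ys = np.convolve(y, s)[:len(x)]      # anti-noise through secondary path s
        errs.append(np.sum((d - ys) ** 2))   # residual energy for this filter
    return int(np.argmin(errs))
```

This brute-force search is what the class labels in the training stage encode; at run time the CNN replaces it with a single forward pass.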

In practice, $d$ is typically seen as a linear combination of $\{s * (W_i * x)\}$. Thus, Equation (1) is equivalent to

$$W^o = \operatorname*{arg\,max}_{W_i} P(W_i \mid x), \quad (2)$$

which means that the selected control filter is the one with the maximum posterior probability given the reference signal $x$. Moreover, according to Bayes’ theorem,

$$P(W_i \mid x) = \frac{P(x \mid W_i)\, P(W_i)}{P(x)}, \quad (3)$$

the posterior probability can be replaced with a conditional probability as

$$W^o = \operatorname*{arg\,max}_{W_i} P(x \mid W_i)\, P(W_i), \quad (4)$$

which predicts the most suitable control filter directly from the primary noise $x$.

A classifier model $f_{\theta}$ can be developed to approximate $P(W_i \mid x)$ from a pre-recorded sampling set $\mathcal{D} = \{(x_j, y_j)\}$, where $y_j$ is the index of the best control filter for $x_j$. Here $\theta$ denotes the parameters of the classifier and can be obtained through maximum likelihood estimation (MLE) [24] as

$$\theta^{*} = \operatorname*{arg\,max}_{\theta} \prod_{j} P(y_j \mid x_j; \theta). \quad (5)$$

Therefore, we can utilize deep learning approaches to learn the classifier model from the training set $\mathcal{D}$.
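In practice, maximizing the likelihood product above is done by minimizing the negative log-likelihood of the labels, i.e., the cross-entropy loss commonly used to train classifiers. A small NumPy illustration (the probability values are arbitrary):

```python
import numpy as np

def nll(probs, labels):
    """Negative log-likelihood of integer labels under predicted class
    probabilities; maximizing the MLE product is equivalent to
    minimizing this sum."""
    picked = probs[np.arange(len(labels)), labels]  # P(y_j | x_j; theta)
    return -np.sum(np.log(picked + 1e-12))          # eps guards log(0)
```

Gradient-based optimizers such as Adam (used in Section 3) minimize exactly this quantity over the training set.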

2.2 CNN-based SFANC Algorithm

Motivated by the work in [22], this paper compares several 1D CNNs and 2D CNNs used for classifying noises in the time domain and frequency domain, respectively. A min-max operation first normalizes the input of the network:

$$\bar{x} = \frac{x}{\max\{\lvert \max(x) \rvert,\, \lvert \min(x) \rvert\}},$$

where $\max(\cdot)$ and $\min(\cdot)$ return the maximum and minimum values of $x$. This rescales the input range into $[-1, 1]$ while retaining the signal's negative part, which contains phase information. Phase information is quite critical for ANC applications.
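One plausible form of this normalization divides by the largest absolute extreme value, preserving the sign of every sample; the exact formula used by the paper is our assumption.

```python
import numpy as np

def min_max_normalize(x):
    """Rescale x into [-1, 1] while keeping its sign (phase) intact."""
    denom = max(abs(x.max()), abs(x.min()))
    return x / (denom + 1e-12)   # eps avoids division by zero on silence
```

Unlike a shift-based min-max scaling into [0, 1], this sign-preserving variant keeps the waveform's negative excursions, which matters for ANC.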

A lightweight 1D CNN, illustrated in Figure 3, is proposed. Every residual block in the network comprises two convolutional layers, each followed by batch normalization and ReLU non-linearity. Note that a shortcut connection adds the input to the output of each residual block, since residual architectures have been demonstrated to be easy to optimize [5]. Additionally, the network uses a broad receptive field (RF) in the first convolutional layer and narrow RFs in the remaining convolutional layers to fully exploit both global and local information.
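The residual computation can be illustrated with a single-channel NumPy sketch; batch normalization and multi-channel convolutions are omitted for brevity, and the kernels are placeholders rather than the paper's learned weights.

```python
import numpy as np

def relu(v):
    return np.maximum(v, 0.0)

def residual_block(x, k1, k2):
    """Single-channel sketch of a residual block: two 'same'-padded
    convolutions with ReLU, plus the shortcut adding the input back."""
    h = relu(np.convolve(x, k1, mode="same"))  # first conv + ReLU
    h = np.convolve(h, k2, mode="same")        # second conv
    return relu(h + x)                         # shortcut connection + ReLU
```

The shortcut means the block only has to learn a residual correction to its input, which is what makes such architectures easy to optimize.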

Figure 3:

Architecture of the proposed 1D CNN. The configuration of each convolutional layer is denoted as (kernel size, channels, stride, padding).

Figure 4: The frequency bands of the noise tracks for pre-training control filters.

2.3 Training of CNNs

The primary and secondary paths used in the training stage of the control filters are band-pass filters with a frequency range of Hz-Hz. Broadband noises with frequency ranges shown in Figure 4 are used to pre-train control filters. The FxLMS algorithm is adopted to obtain the optimal control filters for these broadband noises due to its low computational complexity. Subsequently, the pre-trained control filters are saved in the control filter database.

A noise dataset including synthetic and real noise tracks is used in this work. Specifically, synthetic noise tracks and real noise tracks are used for training, real noise tracks for validation, and real noise tracks for testing. The synthetic noise tracks are randomly generated with various frequency bands, amplitudes, and background noise levels.

The SFANC system’s sampling rate is Hz, so each 1-second noise track consists of samples. Each noise track is taken as the primary noise to generate the disturbance. The class label of a noise track corresponds to the index of the control filter that achieves the best noise reduction performance on that disturbance.

3 Experiments

The Adam algorithm was employed to optimize the networks during training. The number of training epochs was set to . Glorot initialization [3] was used to avoid exploding or vanishing gradients. Additionally, to prevent overfitting, the weights of the CNNs were subjected to regularization with a coefficient of .

3.1 Comparison of Different Training Schemes

Four different training schemes are compared for training the proposed 1D CNN. The comparison results are summarized in Table 1. According to Table 1, training first on the synthetic noise tracks and then fine-tuning on the real noise tracks achieves the highest testing accuracy. Note that simultaneously using the synthetic and real datasets for training did not obtain a superior testing accuracy, since the characteristics of synthetic noises and real noises are quite different. Hence, in the SFANC system, the CNN models can first be trained with the synthetic dataset and then fine-tuned with the real noise dataset.

Training Scheme	Testing Accuracy (%)
Only using synthetic dataset	46.4
Only using real dataset	94.6
Fine-tuning method*	95.3
Simultaneously using synthetic and real datasets	94.5
* Training first on the synthetic dataset and then fine-tuning on the real dataset.

Table 1: The performance of different network training schemes.

3.2 Comparison of Different Networks

Based on the above fine-tuning training scheme, we compared several 1D networks utilizing raw acoustic waveforms: the proposed 1D CNN, M3 [2], M5 [2], M11 [2], M18 [2], and M34-res [2]. Also, some lightweight 2D networks, including ShuffleNet v2 [15], MobileNet v2 [1], and the Attention Network [13], are compared in the SFANC method. The performance of these networks on the real testing dataset is summarized in Table 2.

Network	Testing Accuracy (%)	Network Parameters
1D Convolutional Neural Networks
Proposed 1D Network	95.3	0.21M
M3 Network	93.7	0.22M
M5 Network	94.9	0.56M
M11 Network	94.5	1.79M
M18 Network	93.8	3.69M
M34-res Network	94.4	3.99M
2D Convolutional Neural Networks
ShuffleNet v2	95.5	0.25M
MobileNet v2	95.6	2.89M
Attention Network	94.9	4.95M
Table 2: Comparisons of different networks used in the SFANC system.

As shown in Table 2, the proposed 1D network obtains the highest classification accuracy (95.3%) with the fewest network parameters among the 1D networks. As for the 2D networks, ShuffleNet v2 achieves a classification accuracy similar to MobileNet v2 while requiring far fewer parameters. Considering both testing accuracy and the number of parameters, ShuffleNet v2 performs best among the 2D networks. Compared to the proposed 1D network, ShuffleNet v2 obtains a slight improvement in classification accuracy but requires slightly more network parameters. Therefore, the proposed 1D network and ShuffleNet v2 perform better at classifying noises in the SFANC system. Both lightweight networks can be implemented on mobile platforms, but building acoustic models directly from raw waveform data is more convenient [10]. Hence, the proposed 1D network is preferred.

3.3 Non-stationary Noise Cancellation

This section uses the SFANC algorithm based on the proposed 1D network, the FxLMS algorithm, and the FxNLMS algorithm to attenuate a recorded aircraft noise. The aircraft noise is non-stationary, has a frequency range of 50 Hz-14,000 Hz, and is not included in the training dataset. The step size of the FxLMS and FxNLMS algorithms is set to , and the control filter length is taps. The noise reduction results of the different ANC methods on the aircraft noise are shown in Figure 5.

From the results in Figure 5, we can observe that the SFANC method responds to the aircraft noise much faster than the FxLMS and FxNLMS algorithms. Also, the SFANC method consistently outperforms the FxLMS and FxNLMS algorithms throughout the noise reduction process. In particular, during s-s, the averaged noise reduction level achieved by the SFANC algorithm is about 7 dB and 8 dB higher than that of FxLMS and FxNLMS, respectively. Therefore, the results on the aircraft noise confirm that the SFANC method can rapidly select the most suitable pre-trained control filter for a given noise type. In contrast, the adaptive algorithms respond slowly to the aircraft noise due to their adaptive updating.

Figure 5: (a)-(c): Error signals of different ANC algorithms, (d): Averaged noise reduction level of every 1 second, on the aircraft noise.
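One common definition of the averaged noise reduction level shown in Figure 5(d) is the block-wise power ratio between the disturbance and the residual error, expressed in dB. The exact definition used in the paper may differ; this sketch is an assumption.

```python
import numpy as np

def nr_level_db(d, e, fs, seconds=1):
    """Averaged noise reduction level per block of `seconds` duration:
    10*log10(disturbance power / residual-error power) for each block."""
    n = fs * seconds
    levels = []
    for i in range(0, min(len(d), len(e)) - n + 1, n):
        pd = np.mean(d[i:i + n] ** 2)           # disturbance power
        pe = np.mean(e[i:i + n] ** 2) + 1e-12   # residual power (eps-guarded)
        levels.append(10 * np.log10(pd / pe))
    return np.array(levels)
```

With this convention, a 7-8 dB advantage means the SFANC residual power is roughly 5-6 times lower than that of the adaptive algorithms over the same interval.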

4 Conclusions

Active noise control (ANC) technologies have been widely used to deal with low-frequency noises. However, adaptive ANC algorithms are typically limited by slow convergence speed. In this paper, CNNs are used to automatically select the best pre-trained control filters for different noises. Also, lightweight CNNs implemented on a co-processor can decouple the computational load from the real-time noise controller. Numerical simulations show that the CNN-based SFANC method improves response time while maintaining low computational complexity and high robustness. Additionally, the effectiveness of the proposed 1D network and the fine-tuning training strategy is confirmed in the SFANC method. In future work, we will explore more efficient and robust ANC algorithms based on deep learning.


  • [1] S. Adapa (2019) Urban sound tagging using convolutional neural networks. In Proceedings of the Detection and Classification of Acoustic Scenes and Events 2019 Workshop (DCASE2019), pp. 5–9. Cited by: §2.
  • [2] W. Dai, C. Dai, S. Qu, J. Li, and S. Das (2017) Very deep convolutional neural networks for raw waveforms. In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 421–425. Cited by: §2.
  • [3] X. Glorot and Y. Bengio (2010) Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 249–256. Cited by: §3.
  • [4] C. N. Hansen (1999) Understanding active noise cancellation. CRC Press. Cited by: §1.
  • [5] K. He, X. Zhang, S. Ren, and J. Sun (2016) Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778. Cited by: §2.
  • [6] S. M. Kay (1993) Fundamentals of statistical signal processing: estimation theory. Prentice-Hall, Inc.. Cited by: §1.
  • [7] S. M. Kuo and D. R. Morgan (1999) Active noise control: a tutorial review. Proceedings of the IEEE 87 (6), pp. 943–973. Cited by: §1.
  • [8] Y. LeCun, Y. Bengio, and G. Hinton (2015) Deep learning. nature 521 (7553), pp. 436–444. Cited by: §1.
  • [9] P. A. Lopes and M. S. Piedade (2000) A Kalman filter approach to active noise control. In 2000 10th European Signal Processing Conference, pp. 1–4. Cited by: §1.
  • [10] E. Loweimi, P. Bell, and S. Renals (2020) On the robustness and training dynamics of raw waveform models.. In INTERSPEECH, pp. 1001–1005. Cited by: §2.
  • [11] Z. Luo, Q. Gu, G. Qi, S. Liu, Y. Zhu, and Z. Bai (2019) A robust single-sensor face and iris biometric identification system based on multimodal feature extraction network. In 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI), pp. 1237–1244. Cited by: §1.
  • [12] Z. Luo, Q. Gu, G. Su, Y. Zhu, and Z. Bai (2021) An adaptive face-iris multimodal identification system based on quality assessment network. In MultiMedia Modeling, pp. 87–98. Cited by: §1.
  • [13] Z. Luo, J. Li, and Y. Zhu (2021) A deep feature fusion network based on multiple attention mechanisms for joint iris-periocular biometric recognition. IEEE Signal Processing Letters 28, pp. 1060–1064. Cited by: §2.
  • [14] Z. Luo, D. Shi, and W. Gan (2022) A hybrid sfanc-fxnlms algorithm for active noise control based on deep learning. IEEE Signal Processing Letters 29, pp. 1102–1106. Cited by: §2.
  • [15] N. Ma, X. Zhang, H. Zheng, and J. Sun (2018) Shufflenet v2: practical guidelines for efficient cnn architecture design. In Proceedings of the European conference on computer vision (ECCV), pp. 116–131. Cited by: §2.
  • [16] R. Ranjan and W. Gan (2015) Natural listening over headphones in augmented reality using adaptive filtering techniques. IEEE/ACM Transactions on Audio, Speech, and Language Processing 23 (11), pp. 1988–2002. Cited by: §1.
  • [17] R. Ranjan, T. Murao, B. Lam, and W. Gan (2016) Selective active noise control system for open windows using sound classification. In INTER-NOISE and NOISE-CON Congress and Conference Proceedings, Vol. 253, pp. 1921–1931. Cited by: §1.
  • [18] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L. Chen (2018) MobileNetV2: inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4510–4520. Cited by: §2.
  • [19] C. Shi, R. Xie, N. Jiang, H. Li, and Y. Kajikawa (2019) Selective virtual sensing technique for multi-channel feedforward active noise control systems. In ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 8489–8493. Cited by: §1.
  • [20] D. Shi, W. Gan, B. Lam, S. Wen, and X. Shen (2020) Active noise control based on the momentum multichannel normalized filtered-x least mean square algorithm. In INTER-NOISE and NOISE-CON Congress and Conference Proceedings, Vol. 261, pp. 709–719. Cited by: §1.
  • [21] D. Shi, W. Gan, B. Lam, and S. Wen (2020) Feedforward selective fixed-filter active noise control: algorithm and implementation. IEEE/ACM Transactions on Audio, Speech, and Language Processing 28, pp. 1479–1492. Cited by: §1, §1.
  • [22] D. Shi, B. Lam, K. Ooi, X. Shen, and W. Gan (2022) Selective fixed-filter active noise control based on convolutional neural network. Signal Processing 190, pp. 108317. Cited by: §1, §2.
  • [23] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich (2015) Going deeper with convolutions. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1–9. Cited by: §1.
  • [24] S. van Ophem and A. P. Berkhoff (2013) Multi-channel kalman filters for active noise control. The Journal of the Acoustical Society of America 133 (4), pp. 2105–2115. Cited by: §1.
  • [25] F. Yang, J. Guo, and J. Yang (2020) Stochastic analysis of the filtered-x lms algorithm for active noise control. IEEE/ACM Transactions on Audio, Speech, and Language Processing 28, pp. 2252–2266. Cited by: §1.