Signal Processing and Machine Learning Techniques for Terahertz Sensing: An Overview

Following the recent progress in Terahertz (THz) signal generation and radiation methods, joint THz communications and sensing applications are shaping the future of wireless systems. Towards this end, THz spectroscopy is expected to be carried over user equipment devices to identify material and gaseous components of interest. THz-specific signal processing techniques should complement this resurgent interest in THz sensing for efficient utilization of the THz band. In this paper, we present an overview of these techniques, with an emphasis on signal pre-processing (standard normal variate normalization, min-max normalization, and Savitzky-Golay filtering), feature extraction (principal component analysis, partial least squares, t-distributed stochastic neighbor embedding, and nonnegative matrix factorization), and classification techniques (support vector machines, k-nearest neighbor, discriminant analysis, and naive Bayes). We also address the effectiveness of deep learning techniques by exploring their promising sensing capabilities at the THz band. Lastly, we investigate the performance and complexity trade-offs of the studied methods in the context of joint communications and sensing, we motivate the corresponding use cases, and we present a few future research directions in the field.




I Introduction

I-A History and Motivation for THz Sensing

The introduction of Terahertz (THz) technology has boosted research on exploring the atomic behavior of materials, fostering diverse opportunities in wireless sensing [1] and imaging [2]. The THz band, which extends from 100 Gigahertz (GHz) to 10 THz (wavelengths from 3 mm down to 0.03 mm), is sandwiched between the well-studied microwave and infrared bands of the electromagnetic spectrum. This far-infrared region is thus seen as a transition region where optics meets electronics. In the optical domain, THz radiation, also known as T-rays, is treated as a beam of light that can be manipulated by mirrors and lenses and whose intensity can be measured. In electronics, on the other hand, T-rays are treated as an electrical wave whose electric-field phase can be measured. Due to the lack of efficient and reliable sources, the THz band has long been referred to as the THz gap. However, the development of high-energy strong-field THz generation and detection devices is bridging this gap. In particular, THz generation and detection technologies [3] are classified as optical, electronic, hybrid photonic-electronic, and plasmonic (graphene-based).

THz wireless sensing leverages the inherent spectral fingerprints of molecules at the THz band for various sensing and imaging applications, in the areas of quality control, food safety [4], security [5], health [6], astronomy [7], and environmental monitoring [8]. Owing to their low photon energy (on the order of milli-electron-volts), T-rays are non-ionizing; therefore, they can analyze the content of biological bodies without posing significant harm. Many biological and chemical materials exhibit unique spectral fingerprint information in the THz range. For instance, THz vibrational modes enable the study of molecular structures and vibrational dynamics such as scissoring, wagging, and twisting that arise from both intermolecular and intramolecular interactions [9]. THz radiation also has special interactions and strong molecular coupling with hydrogen-bonded networks, which maximizes the potential of observing water dynamics [10]. Furthermore, T-rays can penetrate various non-conducting, amorphous, and dielectric materials and are highly reflected from metals, which gives T-rays the capability to inspect metallic components and detect weapons in security-related applications [5]. Most recently, THz technology has been celebrated as a key enabler for future wireless communications [11, 12], which introduces unique applications in the area of joint THz communications and sensing [13, 14]. Note that wireless sensing can also denote localization functionalities, such as in radar sensing. Accurate THz-band localization is a promising research topic, but it is outside the scope of this paper.

I-B Comparison with Infrared and Microwave Bands

The THz-band’s sensing capabilities are superior to their counterparts in the neighboring microwave and infrared bands. In particular, THz spectroscopy allows faster analysis with higher precision than infrared spectroscopy, which is further degraded by aerosol-induced light scattering and weather conditions [15]. Many materials are transparent in the THz range. However, in the microwave region, most materials and living tissues are opaque, and the spectral data produces lower-definition images, despite being less sensitive to the weather conditions (due to larger wavelengths). In the THz region, spectral data can reconstruct higher resolution images at arbitrary directions due to antenna array beamforming capabilities and reduced scattering loss. However, THz sensing and imaging also have shortcomings. For instance, at room temperature, blackbody radiation is dominant at THz frequencies, complicating imaging-based object detection. THz signals are also severely absorbed by water vapor, making reliable material identification at long distances in high humidity conditions challenging. Moreover, the time response of a material in the reflected or absorbed THz radiation is calibrated by a sampling technique linked to the THz spectroscopy’s pulsed nature, which limits its resolution. For temporal waveform acquisition, the spectral resolution depends on the temporal window inverse; a trade-off between spectral resolution and temporal measurement time is thus unavoidable. Such promising gains and shortcomings motivate research on signal processing techniques for THz sensing.

I-C Advancements in THz Devices

A typical THz system comprises a source, a detector, and intermediate optical components, with some systems consisting of two or more of these elements. The source produces broad-spectrum THz radiation. The components, i.e., lenses, mirrors, waveguides, and polarizers, manipulate the radiation. Then, a detector sensitive to THz rays measures the radiation reaching it as depicted in Fig. 1. THz systems are divided into two categories: Pulsed-wave and continuous-wave. Both systems are suitable for spectroscopic applications in imaging and sensing. Pulsed THz systems generate short broadband or near-single-cycle THz signals, whereas continuous-wave THz systems generate narrow-linewidth and frequency-tunable THz waves. Continuous-wave electronic sources include frequency-multiplied Gunn diode oscillators and backward wave oscillators (BWOs), whereas promising THz photonic sources include photomixing techniques and cryogenically cooled quantum cascade lasers (QCL).

THz time-domain spectroscopy (THz-TDS) is a powerful spectroscopic tool that has long been utilized for the characterization and identification of materials. Photoconductive generation and detection of short, broadband THz pulses is at the heart of THz-TDS systems. An ultrafast laser emits femtosecond optical pulses, which are converted into picosecond THz pulses, and photoconductive antennas serve as the emitter and detector. The choice of generation and detection techniques mainly depends on the application.

I-D Significance of Signal Processing for THz Sensing

The literature lacks a holistic approach for THz signal processing for communications and sensing, despite the progress made on THz system design. It is still unclear whether we can mitigate quasi-optical THz propagation to mimic favorable microwave propagation characteristics and enable seamless connectivity and sensing. Consequently, the questions of whether THz sensing can extend beyond application-specific settings and whether joint THz sensing and communications are of real practical value remain open at this early stage of research on the topic.

The progress on THz channel modeling paves the way for solid research in signal processing techniques for THz wireless sensing. Such research is particularly significant because several challenges have not yet been addressed to achieve THz technology’s full potential. For example, THz signals are susceptible to high propagation losses and molecular absorption losses, which severely limit the achievable communication distances. Therefore, THz communications should be complemented by infrastructure-level and algorithmic-level enablers. At the infrastructure level, emerging beyond-5G technologies such as intelligent reflecting surfaces (IRSs) and ultramassive multiple-input multiple-output (UM-MIMO) configurations [16] are vital to expand the THz communication gains and overcome the distance problem. At the algorithmic level, novel signal processing techniques can optimize infrastructure utilization to get around THz quasi-optical propagation limitations and realize seamless wireless sensing. Efficient signal processing techniques are crucial to mitigate the noise and hardware impairments that distort measurements and worsen the sensing performance. This paper, therefore, presents a holistic overview of signal processing and machine learning techniques for efficient THz sensing, with an emphasis on signal pre-processing, feature extraction, and classification techniques. The paper further highlights the role of deep learning techniques by exploring their promising sensing capabilities at the THz band.

I-E Contributions of this Work

Several review papers on THz sensing exist in the literature, such as [17, 18, 19, 20]. To the best of the authors' knowledge, however, this paper is the first article that presents a holistic overview of signal processing and machine learning techniques for efficient THz sensing. More specifically, the paper introduces a roadmap for future THz sensing use cases, with a special focus on the role of signal processing. We promote the importance of both THz-TDS and frequency-domain spectroscopy in future reconfigurable THz systems. The frequency-domain spectroscopy approach, in particular, introduces significant flexibility in the choice of carriers, thus converging on sensing information with the minimum required measurements. We address the effect of THz channels on sensing performance. We perform a numerical merit investigation in which simulations validate our analyses. Such scope is significantly different from the treatment of THz sensing in the literature, which mainly assumes controlled laboratory scenarios or specific use cases. We note that, at this early stage of research on the topic, the paper generates comprehensive numerical simulations using realistic data from existing databases [21, 22], which would pave the way towards physical experimentation in future studies. The paper's main contributions can be summarized as follows:

  1. We overview the studied signal processing and machine learning techniques in detail, assuming a unified THz system model, which facilitates the analyses of performance and complexity trade-offs. The presented overview particularly covers pre-processing (standard normal variate normalization, min-max normalization, and Savitzky-Golay filtering), feature extraction (principal component analysis, partial least squares, t-distributed stochastic neighbor embedding, and nonnegative matrix factorization), and classification techniques (support vector machines, k-nearest neighbor, discriminant analysis, and naive Bayes).

  2. We motivate the importance of machine learning solutions compared to conventional signal processing.

  3. We motivate the importance of deep learning compared to conventional machine learning, especially in autonomous joint communications and sensing applications, where THz systems can generate a lot of data very fast.

  4. We comment on the performance and complexity trade-offs of different techniques to come up with clear recommendations.

  5. We motivate and formulate joint THz communication and sensing system models.

I-F Organization of The Paper

The remainder of this paper is organized as follows. THz-TDS use cases and the corresponding spectroscopy system models are overviewed in Sec. II. Sec. III reviews the different methods for the pre-processing of THz spectral data. The qualitative (feature extraction) techniques and quantitative (classification) analysis of materials are then discussed in Sec. IV and Sec. V, respectively. We also investigate deep learning techniques that rely on raw recovered measurements in Sec. VI. Sec. VII highlights the relative performance and complexity trade-offs of the illustrated techniques based on numerical simulations. We motivate prospective use cases and open research directions in Sec. VIII, detailing joint THz communication and sensing applications and highlighting the importance of carrier-based spectroscopy. We conclude the paper in Sec. IX.

Regarding notation, bold upper case, bold lower case, and lower case letters correspond to matrices, vectors, and scalars, respectively. $\Pr(A)$ is the probability that event $A$ occurs. $(\cdot)^{T}$ and $(\cdot)^{-1}$ stand for the transpose and inverse, respectively. Scalar norms (or absolute values) are denoted by $|\cdot|$ and Frobenius norms are denoted by $\|\cdot\|_{F}$. The notation $j = \sqrt{-1}$ is the imaginary number.

II Terahertz Time-Domain Spectroscopy

Spectroscopy is a reliable technique for material and gas identification over a wide range of spectral data, allowing the extraction of material parameters without requiring multiple samples of different thicknesses. THz-TDS [23, 2] characterizes the refractive indices and absorption coefficients, which correspond to the static and transient optical properties of materials. In a conventional THz-TDS measurement, the temporal profiles of THz pulses are recorded twice, once without and once with the medium under examination. These pulses are referred to as the incident and transmitted pulses, or reference and sample pulses, respectively. The recorded pulse traces correspond to the temporal dependence of the electric field associated with the THz waves, i.e., the transient THz electric field at the detector. The two profiles are then Fourier-transformed to obtain their complex-valued spectral behavior, where the ratio of the two pulses' electric field strengths in the frequency domain determines the optical properties of the sample material. The complex spectrum yields amplitude and phase information and allows direct calculation of the frequency-dependent index of refraction, absorption coefficient, and sample thickness. Compared to conventional Fourier-transform-based infrared and Raman spectroscopy, THz-TDS infers more useful information and gives direct access to the electric field amplitude.

II-A Transmission- and Reflection-Based THz-TDS

The necessary spectroscopic measurements that provide the diagnostic tools for determining the optical constants of materials are transmission-based and reflection-based. Transmission spectroscopy is correlated with absorption spectroscopy, which is based on analyzing the amount of light absorbed by a sample material at a given wavelength. On the other hand, reflection spectroscopy studies the reflected or scattered light from a material as a function of wavelength. For both reflection and transmission measurements, the Fresnel equations describe the energy transfer at the interface between two media, usually air and another homogeneous material. Most THz sensing and imaging applications are conducted in a transmission geometry (absorbance THz spectroscopy), owing to its simple set-up design and the high contrast of the transmitted THz signal. Several reasons, however, make THz reflection-based TDS more appealing for material identification. Reflection geometry is capable of measuring the spectrum of thick or highly absorptive materials that are opaque in the THz band. In particular, reflection geometry makes detecting targets on non-transparent substrates possible, such as in the case of threat items concealed in opaque envelopes and paint on the bodies of vehicles. Furthermore, reflection spectroscopy is capable of reconstructing full three-dimensional (3D) images of objects, where the short THz pulse duration provides precise means of calculating the distance to reflecting surfaces. Despite providing low-contrast spectroscopic imaging, reflection geometry offers a unique solution to applications at long distances, such as standoff distance THz sensing and imaging in open field environments.

Fig. 1: A conventional layout of a THz-TDS setup with reflection geometry and transmission geometry.

II-B Gas, Liquid, and Solid THz-TDS

THz-TDS supports inspection and identification of different states of matter through electromagnetic interactions with gas, liquid, and solid components:

  1. Gas Spectroscopy [15]: Used to detect air pollution, flammable gases, atmospheric pressure, and humidity, or to monitor molecules blended in unfavorable aerosols like smoke, fog, haze, dust, fume, and others.

  2. Liquid Spectroscopy [24]: Used to characterize liquid and aqueous solutions. The dielectric properties of liquids depend on interactions and dipole relaxations. T-rays are sensitive to relaxational, oscillatory, and collective motions and can be used to study ionic liquid solutions. The high sensitivity to water and non-destructive penetration abilities of THz radiation are appealing for a plethora of liquid spectroscopy applications, such as the identification of water content, alcohol, liquid fuels, and petrochemicals. THz-TDS can also investigate the color, carbonation, and flavor of commercial beverages.

  3. Solid Spectroscopy [25]: Used to investigate a wide variety of spectral fingerprints of molecular solids such as explosives, dielectrics, polymers, ferroelectrics, semiconductors, and photonic crystals, for security screening, quality inspection, or pharmaceutical applications. Several THz-TDS methods can quantify the concentration, index of refraction, absorption peak, dielectric constants, and polarizability of materials. There are also different kinds of low-energy excitations in solid materials that THz-TDS can successfully detect.

II-C THz-TDS Imaging

In THz material sensing, the inspection is not consistently feasible if the samples under investigation have no apparent features such as absorption peaks. Nevertheless, the capability to form images and provide spectral information by exploiting other T-ray unique aspects enables alternative THz-based material characterization techniques. In particular, there has been a resurged interest in THz time-domain spectroscopic imaging because of the low power and low energy interaction of THz waves with some materials [2]. The image data and spectral information can determine the type and the shape of the material. For instance, a non-contact method uses THz pulse imaging to quantify the coated pharmaceutical tablets’ thickness [26]. By setting a THz-TDS system with an intermediate focus, one can measure the THz waveform of the pulse traversing an object placed at the focal plane. Because of low THz scattering, THz-TDS imaging systems produce high contrast images that enable effective material analysis.

There are several THz-TDS imaging methods, each of which can extract different relevant information from samples. Such methods are amplitude-based, phase-based, or a combination of the two. With amplitude imaging, the THz waveform’s magnitude is measured using fast numerical Fourier transform over a particular frequency band to extract the waveform’s peak signal at each pixel. To form an image pixel by pixel, the transmitted THz wave is measured for each position of the object under study. After THz signal acquisition, THz imaging requires the processing of the entire waveform. Each waveform holds a wealth of information (both in amplitude and phase) that can be extracted at each image pixel with integrated digital signal processors; detailed chemical or physical information can be obtained at each pixel. In our paper’s context, signal processing for THz sensing does not account for imaging techniques, which we rather consider an extension of this current work.

II-D Reflection Spectroscopy System Model

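THz reflection spectroscopy can measure the real and imaginary parts of both the refractive index and dielectric constant of a sample material to obtain distinct absorption coefficients. The Kramers-Kronig transform can be applied to calculate the sample's optical conductivity, and the reflection coefficient can be determined using Fresnel's equations. When a THz beam propagating in a uniform medium (air) encounters a material with complex refractive index $\hat{n}$ at incident angle $\theta_i$, the ratio of the reflected beam to the incident beam yields the complex reflectivity of the s-polarized component (denoted $r_s$) and the p-polarized component (denoted $r_p$). We have

$$ r_s = \frac{E_s^{r}}{E_s^{i}} = \frac{n_{\mathrm{air}}\cos\theta_i - \hat{n}\cos\theta_t}{n_{\mathrm{air}}\cos\theta_i + \hat{n}\cos\theta_t}, \qquad r_p = \frac{E_p^{r}}{E_p^{i}} = \frac{\hat{n}\cos\theta_i - n_{\mathrm{air}}\cos\theta_t}{\hat{n}\cos\theta_i + n_{\mathrm{air}}\cos\theta_t}, $$

where $E_s^{r}$ and $E_s^{i}$ are the reflected and incident beams of the s-polarized component, and $E_p^{r}$ and $E_p^{i}$ are the reflected and incident beams of the p-polarized component, respectively. The refractive index of air is $n_{\mathrm{air}} = 1$, and for both polarizations we have $n_{\mathrm{air}}\sin\theta_i = \hat{n}\sin\theta_t$ (the incident angle $\theta_i$ and the refraction angle $\theta_t$ are related via Snell's law). Reflection at normal incidence ($\theta_i = \theta_t = 0$, for which $r_p = -r_s$ under this sign convention) facilitates deriving the reflection coefficient through the complex refractive index of the material $\hat{n} = n + j\kappa$, with $\kappa$ being the extinction coefficient, and where we have

$$ r = r_p = \frac{\hat{n} - 1}{\hat{n} + 1} = \frac{n + j\kappa - 1}{n + j\kappa + 1}. $$

Therefore, the reflection coefficient is $r = \sqrt{R}\,e^{j\phi}$, where $R = |r|^2$ is the reflectance. Assuming both the phase shift $\phi$ and $R$ are known, the refractive index and the extinction coefficient are calculated as

$$ n = \frac{1 - R}{1 + R - 2\sqrt{R}\cos\phi}, \qquad \kappa = \frac{2\sqrt{R}\sin\phi}{1 + R - 2\sqrt{R}\cos\phi}. $$

The absorption coefficient can then be computed as

$$ \alpha(f) = \frac{4\pi f \kappa}{c}, $$

where $f$ is the frequency, and $c$ is the speed of light in vacuum.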
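As a minimal sketch of the normal-incidence inversion described above, the following snippet recovers the refractive index, extinction coefficient, and absorption coefficient from a reflectance and phase pair. It assumes the conventions r = (n_hat - 1)/(n_hat + 1) and n_hat = n + j*kappa; the function name and the test material (n = 2, kappa = 0.1 at 1 THz) are hypothetical choices made only for illustration.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)

def optical_constants_from_reflection(R, phi, f_hz):
    """Invert r = sqrt(R) * exp(j*phi) = (n_hat - 1) / (n_hat + 1) at normal incidence.

    Assumes n_hat = n + j*kappa; returns (n, kappa, alpha) with alpha in 1/m.
    """
    denom = 1.0 + R - 2.0 * math.sqrt(R) * math.cos(phi)
    n = (1.0 - R) / denom
    kappa = 2.0 * math.sqrt(R) * math.sin(phi) / denom
    alpha = 4.0 * math.pi * f_hz * kappa / C
    return n, kappa, alpha

# Round trip on a hypothetical material: forward Fresnel model, then inversion.
n_hat = complex(2.0, 0.1)
r = (n_hat - 1) / (n_hat + 1)
R, phi = abs(r) ** 2, cmath.phase(r)
n, kappa, alpha = optical_constants_from_reflection(R, phi, 1e12)
```

The round trip is only a consistency check of the formulas; measured reflectance and phase would additionally carry noise and phase-calibration errors.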

II-E Transmission Spectroscopy System Model

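Transmission THz-TDS probes collective vibrational modes in materials. In particular, the frequency-dependent complex refractive index can be measured entirely from the time-dependent electric field of the THz waveform without the need for Kramers-Kronig transform equations. Using the sample thickness $d$ and the complex refractive index $\hat{n}(f) = n(f) + j\kappa(f)$, the amplitude of THz transmittance, denoted by $T(f)$, can be obtained (neglecting multiple internal reflections and assuming $\kappa \ll n$) as

$$ T(f) = \frac{|E_t(f)|}{|E_i(f)|} = \frac{4n(f)}{\left(n(f) + 1\right)^2}\exp\left(-\frac{2\pi f \kappa(f) d}{c}\right), $$

where $E_i(f)$ and $E_t(f)$ are the incident and transmitted amplitudes, respectively, $f$ is the frequency, and $c$ is the speed of light in vacuum. Consequently, we can obtain the extinction coefficient $\kappa(f)$ and the refractive index $n(f)$ as

$$ \kappa(f) = \frac{c}{2\pi f d}\ln\left(\frac{4n(f)}{T(f)\left(n(f) + 1\right)^2}\right), \qquad n(f) = 1 + \frac{c\,\phi(f)}{2\pi f d}, $$

where $\phi(f)$ is the phase difference between the reference and sample. The absorption coefficient of the sample material can then be calculated as

$$ \alpha(f) = \frac{4\pi f \kappa(f)}{c} = \frac{2}{d}\ln\left(\frac{4n(f)}{T(f)\left(n(f) + 1\right)^2}\right). $$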
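The extraction of the refractive index and extinction coefficient from a reference and a sample spectrum can be sketched as follows. This is a hedged illustration that assumes the common thick-sample model (no multiple internal reflections, kappa much smaller than n), a phase that is not wrapped at the lowest frequency point, and synthetic frequency-domain data; the function name and all numerical values are hypothetical.

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum (m/s)

def extract_optical_constants(f, E_ref, E_sam, d):
    """Estimate n(f), kappa(f), alpha(f) from complex reference/sample spectra."""
    H = E_sam / E_ref                      # complex transmission function
    T = np.abs(H)                          # transmittance amplitude
    # A delay appears as a negative phase under the e^{-j 2 pi f t} convention.
    phi = -np.unwrap(np.angle(H))
    n = 1.0 + C * phi / (2.0 * np.pi * f * d)
    kappa = C / (2.0 * np.pi * f * d) * np.log(4.0 * n / (T * (n + 1.0) ** 2))
    alpha = 4.0 * np.pi * f * kappa / C    # absorption coefficient (1/m)
    return n, kappa, alpha

# Synthetic 1 mm slab with hypothetical constants n = 1.5, kappa = 0.01.
f = np.linspace(0.1e12, 2.0e12, 400)
d, n_true, kappa_true = 1e-3, 1.5, 0.01
E_ref = np.ones_like(f, dtype=complex)
E_sam = (4.0 * n_true / (n_true + 1.0) ** 2
         * np.exp(-2.0 * np.pi * f * kappa_true * d / C)
         * np.exp(-1j * 2.0 * np.pi * f * (n_true - 1.0) * d / C))
n_est, kappa_est, alpha_est = extract_optical_constants(f, E_ref, E_sam, d)
```

With measured time-domain traces, `E_ref` and `E_sam` would come from `np.fft.rfft` of the two pulses, and the usable band would be limited to frequencies where both spectra sit above the noise floor.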


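In gas spectroscopy, the absorption coefficient can be extracted from the path gain, which is expressed as

$$ G(f, d) = \left(\frac{c}{4\pi f d}\right)^{2} e^{-k(f)\,d}, $$

where $d$ is the distance between the transmitter and receiver, $f$ is the frequency, and $c$ is the speed of light in vacuum. The absorption coefficient $k(f)$ can be expressed as the sum of absorption contributions from isotopes ($i$) of gases ($g$) that the medium is composed of, and it is expressed as

$$ k(f) = \sum_{i,g} k^{i,g}(f). $$

Using radiative transfer theory, the molecular absorption coefficient for an individual isotope can be expressed as

$$ k^{i,g}(f) = \frac{p}{p_0}\,\frac{T_{\mathrm{STP}}}{T}\,\frac{p}{R\,T}\,q^{i,g} N_A\, \sigma^{i,g}(f), $$

where $p$ is the system pressure, $p_0$ is the reference pressure, $T$ is the temperature, $T_{\mathrm{STP}}$ is the temperature at standard pressure, $R$ is the gas constant, $q^{i,g}$ is the mixing ratio for the isotope $i$ of gas $g$, $N_A$ is the Avogadro constant, and $\sigma^{i,g}(f)$ is the absorption cross section that is given as

$$ \sigma^{i,g}(f) = S^{i,g}\,\frac{f}{f_c^{i,g}}\,\frac{\tanh\left(\frac{h c f}{2 k_B T}\right)}{\tanh\left(\frac{h c f_c^{i,g}}{2 k_B T}\right)}\,F^{i,g}(f), $$

where $F^{i,g}(f)$ is the Van Vleck-Weisskopf line shape, $S^{i,g}$ is the absorption strength, $h$ is the Planck constant, and $k_B$ is the Boltzmann constant. $f_c^{i,g} = f_{c0}^{i,g} + \delta^{i,g}\,p/p_0$ is the resonant frequency of the isotope $i$ of gas $g$, where $f_{c0}^{i,g}$ is the zero-pressure position of the resonant frequency and $\delta^{i,g}$ is the linear pressure shift. The Van Vleck-Weisskopf line shape can be further decomposed as a function of the Lorentz half-width $\alpha_L^{i,g}$ for the isotope $i$ of gas $g$ as follows:

$$ F^{i,g}(f) = 100\,c\,\frac{\alpha_L^{i,g}}{\pi}\,\frac{f}{f_c^{i,g}} \left[\frac{1}{\left(f - f_c^{i,g}\right)^2 + \left(\alpha_L^{i,g}\right)^2} + \frac{1}{\left(f + f_c^{i,g}\right)^2 + \left(\alpha_L^{i,g}\right)^2}\right]. $$

The Lorentz half-width ($\alpha_L^{i,g}$) can be obtained as a function of the broadening coefficient of both air ($\alpha_0^{\mathrm{air}}$) and the isotope $i$ of gas $g$ ($\alpha_0^{i,g}$) as

$$ \alpha_L^{i,g} = \left[\left(1 - q^{i,g}\right)\alpha_0^{\mathrm{air}} + q^{i,g}\,\alpha_0^{i,g}\right]\frac{p}{p_0}\left(\frac{T_0}{T}\right)^{\gamma}, $$

where $T_0$ is the reference temperature and $\gamma$ is the temperature broadening coefficient parameter. Note that all the aforementioned parameters can be extracted from the high-resolution transmission molecular absorption (HITRAN) database [21]. Although the complexity of the radiative transfer theory approach is high, the existing alternative gas absorption models are not accurate enough for sensing purposes; they are rather approximations that can be used in THz communication scenarios [12].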
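The pressure and temperature dependence of a single absorption line can be sketched as below. The snippet only illustrates the Lorentz half-width, the pressure-shifted resonant frequency, and the Van Vleck-Weisskopf line shape in the form commonly used in radiative-transfer-based THz channel models; the function names are hypothetical and the numerical line parameters are placeholders, not values taken from HITRAN.

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)

def lorentz_half_width(q, a0_air, a0_gas, p, p0, T, T0, gamma):
    """Half-width: mixing-ratio-weighted broadening, scaled by pressure and temperature."""
    return ((1.0 - q) * a0_air + q * a0_gas) * (p / p0) * (T0 / T) ** gamma

def resonant_frequency(fc0, delta, p, p0):
    """Resonant frequency with a linear pressure shift from its zero-pressure position."""
    return fc0 + delta * p / p0

def vvw_line_shape(f, fc, a_l):
    """Van Vleck-Weisskopf line shape around resonance fc with half-width a_l."""
    lorentz_pair = 1.0 / ((f - fc) ** 2 + a_l ** 2) + 1.0 / ((f + fc) ** 2 + a_l ** 2)
    return 100.0 * C * (a_l / math.pi) * (f / fc) * lorentz_pair

# Placeholder line parameters (hypothetical, for illustration only).
a_l_1atm = lorentz_half_width(q=0.01, a0_air=3e9, a0_gas=1.5e10,
                              p=1.0, p0=1.0, T=296.0, T0=296.0, gamma=0.7)
a_l_2atm = lorentz_half_width(q=0.01, a0_air=3e9, a0_gas=1.5e10,
                              p=2.0, p0=1.0, T=296.0, T0=296.0, gamma=0.7)
fc = resonant_frequency(fc0=5.57e11, delta=-1e8, p=1.0, p0=1.0)
on_peak = vvw_line_shape(fc, fc, a_l_1atm)
off_peak = vvw_line_shape(fc + 5.0 * a_l_1atm, fc, a_l_1atm)
```

Doubling the pressure doubles the half-width, which is the line broadening that limits how well neighboring gas signatures can be separated at high pressure.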

In our system model, for material sensing, we assume the THz transmission spectra to be accumulated in a vector $\mathbf{x}^{(\mathrm{tr})} = [x^{(\mathrm{tr})}_1, \dots, x^{(\mathrm{tr})}_N]^{T}$, whose $i$-th entry is the transmittance sampled at the $i$-th frequency point and where $\mathbf{x}^{(\mathrm{tr})}$ represents the input data. For the gas sensing scenario, the absorption coefficient spectra are accumulated in a vector $\mathbf{x}^{(\mathrm{abs})} = [x^{(\mathrm{abs})}_1, \dots, x^{(\mathrm{abs})}_N]^{T}$, where $\mathbf{x}^{(\mathrm{abs})}$ is used as input data. In the remainder of this paper, for simplicity, we drop the superscripts $(\mathrm{tr})$ and $(\mathrm{abs})$ when the discussion applies equally to both absorption and transmission spectra. We next detail the signal processing and machine learning techniques that can be applied to $\mathbf{x}$ to retrieve sensing information. We start by overviewing the pre-processing techniques in Sec. III. Then, we present the feature extraction techniques, quantitative analysis of materials, and deep learning prospects in Sec. IV, Sec. V, and Sec. VI, respectively. Table I lists sample works from the literature that utilize a variety of signal processing techniques for different THz sensing applications.

III Signal Pre-Processing Techniques

Perfect recovery of the THz signal at the detector is vital in spectroscopic systems. However, several factors prevent such ideal conditions. For instance, atmospheric attenuation significantly influences the propagating THz radiation in the air. Consequently, loss of information, slope variations, baseline shifts, and redundancy in the acquired data are expected. The quality of classification depends not only on the effectiveness of the feature extraction techniques but also on the pre-processed data’s quality and quantity. Employing signal pre-processing techniques for spectroscopic THz systems can thus significantly improve the accuracy, noise robustness, and resolution of THz sensing and imaging systems.

We denote the spectral output after applying pre-processing techniques on an input $\mathbf{x}$ by the vector $\tilde{\mathbf{x}}$. Out of a vast range of signal pre-processing techniques, a few methods have been more commonly applied to THz spectral data, namely, Savitzky-Golay (SG) filtering, de-trending (DT), first derivative (FD), standard normal variate (SNV), baseline correction (BC), wavelet transform, and min-max normalization. Other spectral pre-treatment techniques have also been studied to pre-process THz and near-infrared (NIR) spectra, such as logarithmic transformation, mean centering (MC), and multiplicative scatter correction (MSC). In this work, we detail and implement SNV, min-max normalization, and SG filtering.

III-1 Standard normal variate

In material identification experiments, normalization is widely used to compensate for variations in the sample surface optical properties, i.e., density, scattering, and roughness/smoothness. SNV correction and normalization pre-treatment is a transformation that eliminates scatter and baseline drift effects from the spectral data by centering and scaling individual spectra. The average ($\bar{x}$) and standard deviation ($\sigma_x$) of all data points of a given spectrum $\mathbf{x} = [x_1, \dots, x_N]^{T}$ are first calculated. Then, the SNV transformation centers and scales the spectrum by subtracting the mean from every data point and dividing the result by the standard deviation. The SNV-processed spectral values, ($\tilde{x}_i$), are computed as

$$ \tilde{x}_i = \frac{x_i - \bar{x}}{\sigma_x}, \qquad \bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad \sigma_x = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}. $$
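A minimal sketch of the SNV transformation on a single spectrum is given below. It assumes the sample standard deviation (normalization by N - 1); the function name and the spectrum values are hypothetical.

```python
from statistics import mean, stdev

def snv(spectrum):
    """Standard normal variate: subtract the spectrum mean, divide by its std."""
    mu = mean(spectrum)
    sigma = stdev(spectrum)  # sample standard deviation (N - 1 in the denominator)
    if sigma == 0:
        raise ValueError("SNV is undefined for a constant spectrum")
    return [(x - mu) / sigma for x in spectrum]

# Hypothetical spectrum and a copy with a baseline offset and a gain change.
raw = [0.82, 0.91, 1.10, 0.95, 0.78]
shifted = [2.0 * x + 0.3 for x in raw]
snv_raw = snv(raw)
snv_shifted = snv(shifted)
```

Both inputs map to the same normalized spectrum, which illustrates why SNV removes additive baseline shifts and multiplicative scatter effects.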


III-2 Min-max normalization

Min-max normalization is another pre-processing scaling technique that is frequently used in THz and NIR spectroscopy for constraining the range of each input feature of the spectrum. The features or spectral data are most often rescaled to fit a target range of $[0, 1]$ or $[-1, 1]$; all the data relationships are preserved, and no bias is introduced. Such normalization allows more flexibility in designing classifiers and determining which features are more prominent. For the target range $[0, 1]$, the rescaling of each spectral value $x_i$ is given by

$$ \tilde{x}_i = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}}, $$

where $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of the feature data, respectively.
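The rescaling can be sketched as follows for an arbitrary target range. The function name and the `lo`/`hi` parameters are hypothetical; the default range of 0 to 1 corresponds to subtracting the minimum and dividing by the spread between maximum and minimum.

```python
def min_max_normalize(spectrum, lo=0.0, hi=1.0):
    """Linearly rescale a spectrum so that its minimum maps to lo and its maximum to hi."""
    x_min, x_max = min(spectrum), max(spectrum)
    if x_max == x_min:
        raise ValueError("min-max normalization is undefined for a constant spectrum")
    scale = (hi - lo) / (x_max - x_min)
    return [lo + (x - x_min) * scale for x in spectrum]

unit_range = min_max_normalize([2.0, 4.0, 6.0])
symmetric_range = min_max_normalize([2.0, 4.0, 6.0], lo=-1.0, hi=1.0)
```

Because the mapping is linear, the ordering and relative spacing of the spectral values are preserved; a single outlier, however, compresses all other values, which is worth checking before applying it to noisy spectra.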

III-3 Savitzky-Golay filter

SG filtering is a simplified least-squares-fit convolution low-pass filter, primarily used for noise reduction and smoothing. The SG filter improves the signal-to-noise ratio (SNR) without strongly distorting or deforming the signal. The commonly used SG algorithm consists of approximating the signal in a window (or frame) by a polynomial function of a certain degree, which is constructed by the least-squares method. The best-fit polynomial order and frame size can be estimated by trial and error. However, the polynomial filtering coefficients can be calculated in advance since they are independent of the input data, ensuring high computational efficiency. The SG-smoothed data is computed as

$$ \tilde{x}_i = \sum_{k=-m}^{m} c_k\, x_{i+k}, $$

where $c_k$ is the (normalized) smoothing coefficient and $W = 2m + 1$ is the number of data points in the smoothing frame; $m$ is the half-width of the smoothing frame.
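The data-independent smoothing coefficients can be obtained from a least-squares polynomial fit over the frame, as in the following sketch. It uses NumPy only; the function names and the edge handling (repeating the end values) are assumptions made for illustration.

```python
import numpy as np

def savgol_coefficients(half_width, poly_order):
    """Weights that give the least-squares polynomial fit evaluated at the frame centre."""
    k = np.arange(-half_width, half_width + 1)
    A = np.vander(k, poly_order + 1, increasing=True)  # columns k^0, k^1, ..., k^p
    # Row 0 of the pseudo-inverse maps frame samples to the constant term,
    # i.e. the fitted value at k = 0.
    return np.linalg.pinv(A)[0]

def savgol_smooth(x, half_width=2, poly_order=2):
    """Apply the weighted moving sum over every frame of 2*half_width + 1 points."""
    x = np.asarray(x, dtype=float)
    c = savgol_coefficients(half_width, poly_order)
    padded = np.pad(x, half_width, mode="edge")        # assumption: repeat end values
    return np.correlate(padded, c, mode="valid")       # out[i] = sum_k c[k] * x[i + k]

coeffs = savgol_coefficients(half_width=2, poly_order=2)
quadratic = np.arange(10, dtype=float) ** 2
smoothed = savgol_smooth(quadratic, half_width=2, poly_order=2)
```

For a five-point frame and a quadratic fit, the weights reduce to the classic (-3, 12, 17, 12, -3)/35 pattern, and a noiseless quadratic passes through the filter unchanged away from the padded edges.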

IV Feature Extraction

Inconclusive classification results can be obtained in THz spectroscopy due to unquantifiable scattering effects which form spurious structures. Towards mitigating such THz-TDS problems, following signal pre-processing, we target extracting critical features from the THz spectral data. In particular, we consider the following feature extraction techniques: principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE), and non-negative matrix factorization (NMF). Following our system model, feature extraction techniques operate on an initial set of THz spectral data in a matrix $\mathbf{X} \in \mathbb{R}^{M \times N}$, or a pre-processed matrix $\tilde{\mathbf{X}} \in \mathbb{R}^{M \times N}$, where $M$ denotes the number of data samples (observations) and $N$ corresponds to a range of THz features (variables) that are subsequently used as inputs to the classifiers. The dataset matrix is expressed as

$$ \mathbf{X} = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,N} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,N} \\ \vdots & \vdots & \ddots & \vdots \\ x_{M,1} & x_{M,2} & \cdots & x_{M,N} \end{bmatrix}, $$

and the output feature vector is expressed as $\mathbf{y} = [y_1, y_2, \dots, y_K]^{T}$, where $K \le N$.

IV-A Principal Component Analysis

PCA is a dimensionality reduction technique that reduces a multi-dimensional dataset of many correlated variables into a smaller set with few comprehensive indicators. The new indicators are known as principal components (PCs). In THz material classification, PCA is abundantly used as an unsupervised, non-parametric method for extracting relevant data features and eliminating overlapping data from original THz spectral datasets. The $M \times N$ data matrix $\mathbf{X}$ ($M$ samples, $N$ features) is converted into a vector of synthesis indicators (PCs), $\mathbf{y} = [y_1, y_2, \dots, y_K]^{T}$, the entries of which are sorted in descending order of their respective variances. The PCs correspond to the eigenvectors of the $K$ largest eigenvalues of the covariance matrix

$$ \mathbf{C} = \frac{1}{M-1}\,\bar{\mathbf{X}}^{T}\bar{\mathbf{X}}, $$

where $\bar{\mathbf{X}}$ is obtained from $\mathbf{X}$ by subtracting the mean of each column (feature).
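A minimal sketch of PCA through the eigendecomposition of the sample covariance matrix is shown below. The function name and the synthetic three-feature dataset (two perfectly correlated features plus a small noise feature) are hypothetical and serve only to make the ordering of the components visible.

```python
import numpy as np

def pca(X, num_components):
    """Project mean-centred data onto the eigenvectors of the largest covariance eigenvalues."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                       # centre each feature (column)
    cov = Xc.T @ Xc / (X.shape[0] - 1)            # N x N sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:num_components]
    components = eigvecs[:, order]                # N x K loading vectors
    scores = Xc @ components                      # M x K principal-component scores
    return scores, components, eigvals[order]

# Hypothetical data: 50 observations, features 1 and 2 identical, feature 3 weak noise.
rng = np.random.default_rng(0)
t = rng.normal(size=50)
X = np.column_stack([t, t, 0.01 * rng.normal(size=50)])
scores, components, variances = pca(X, num_components=2)
```

The first component aligns with the shared direction of the two correlated features, and the variance of the first score column equals the largest eigenvalue. For real spectra with thousands of frequency points, a singular value decomposition of the centred matrix is usually preferred for numerical stability.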


IV-B t-Distributed Stochastic Neighbor Embedding

t-SNE is a non-linear learning method that reflects low-dimensional data by optimally positioning data points in a projection map. The t-SNE algorithm utilizes the joint probability distribution between high-dimensional data points and their corresponding synthetic data points in a low-dimensional space, minimizing the Kullback-Leibler (KL) divergence to obtain optimal low-dimensional data. For the same input

-dimensional data , the similarity conditional probabilities are first computed as


where is the similarity of data point to data point in , and

is the variance of the Gaussian-distributed THz spectral value centered over

. Using a heavy-tailed Student t-Distribution with one degree of freedom in the low-dimensional space, the joint probability

between synthetic data points in is calculated as


Then, the divergence between the synthetic data points in low-dimensional space and their corresponding data points in high-dimensional space is minimized as


with . Finally, the KL divergence is optimized via the gradient descent method as:


The corresponding t-SNE output is denoted by .
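The t-SNE quantities above can be sketched in a few lines of NumPy. The sketch below is illustrative only: it assumes a single fixed Gaussian width `sigma` for all points instead of the usual perplexity-based search, uses plain gradient descent without momentum or early exaggeration, and runs on synthetic two-cluster data. The step size and all names are assumptions.

```python
import numpy as np

def pairwise_sq_dists(A):
    """Squared Euclidean distances between all rows of A."""
    sq = np.sum(A ** 2, axis=1)
    return np.maximum(sq[:, None] + sq[None, :] - 2.0 * A @ A.T, 0.0)

def joint_p(X, sigma=1.0):
    """Symmetrized high-dimensional similarities p_ij (fixed sigma, an assumption)."""
    logits = -pairwise_sq_dists(X) / (2.0 * sigma ** 2)
    np.fill_diagonal(logits, -np.inf)                 # enforce p_{i|i} = 0
    logits -= logits.max(axis=1, keepdims=True)       # numerical stability
    p_cond = np.exp(logits)
    p_cond /= p_cond.sum(axis=1, keepdims=True)       # row i holds p_{j|i}
    return (p_cond + p_cond.T) / (2.0 * X.shape[0])

def kl_and_gradient(P, Y, eps=1e-12):
    """KL(P || Q) and its gradient with respect to the map points Y."""
    inv = 1.0 / (1.0 + pairwise_sq_dists(Y))          # Student-t kernel, one degree of freedom
    np.fill_diagonal(inv, 0.0)
    Q = inv / inv.sum()
    kl = np.sum(P * np.log((P + eps) / (Q + eps)))
    weights = (P - Q) * inv
    grad = 4.0 * (np.diag(weights.sum(axis=1)) - weights) @ Y
    return kl, grad

# Synthetic data: two well-separated clusters in 10 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(30, 10)),
               rng.normal(5.0, 0.3, size=(30, 10))])
P = joint_p(X)
Y = 1e-2 * rng.normal(size=(60, 2))                   # random initial 2-D map
kl_history = []
for _ in range(200):
    kl, grad = kl_and_gradient(P, Y)
    kl_history.append(kl)
    Y -= 5.0 * grad                                   # plain gradient descent step
```

The repeated pairwise computations in every iteration illustrate why t-SNE is the most time-consuming of the studied feature extraction techniques.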

Fig. 2: A map of candidate machine learning techniques for THz sensing.
| Materials | THz Range | THz-TDS Setup Details | Feature Extraction | Classification | Ref. |
|---|---|---|---|---|---|
| Amino acids, saccharides, and inorganic substances | 0.9-6 THz | GaAs photoconductive antenna, femtosecond laser, transmission mode | PCA | Fuzzy Pattern | [27] |
| Flavonols (myricetin, quercetin, and kaempferol) | 0.6-2.7 THz | Mode-locked Ti-sapphire laser, transmission mode | PLS | Random forest, LS-SVM | [28] |
| Rice | 0-6.4 THz | TERA K15 all fiber-coupled spectrometer, femtosecond laser, transmission mode | PCA | PLS-DA, SVM, BPNN | [29] |
| Protein (bovine serum albumin) | 0.2-1.2 THz | LT-GaAs photoconductive antenna, femtosecond laser, lock-in amplifier | PCA, t-SNE | Random forest, NB, SVM, XGBoost | |
| Benzoic acid | 1.6-2.8 THz | TAS7500SU system, transmission mode, ultra-short pulse fibre lasers | PCA | GRNN, BPNN | [31] |
| Pure analytes (citric acid, fructose, and lactose) | 0.05-3 THz | TPS Spectra 3000, mode-locked Ti-sapphire laser, transmission mode | PCA | PLSR, ANN | [32] |
| Transgenic rice and Cry1Ab protein | 0.1-2.6 THz | Z-3 THz-TDS system, LT-GaAs photoconductive antenna, ZnTe electro-optical crystal detector | PCA | PLSR, DA | [33] |
| Rice and imidacloprid pesticide | 0.3-1.7 THz | Mode-locked Ti-sapphire laser, femtosecond laser, ZnTe photoconductive antenna, transmission mode | PLS | SVR | [34] |
| Aflatoxins B1 in acetonitrile solution | 0.4-1.6 THz | Mode-locked Ti-sapphire laser, photoconductive switches, transmission mode | PCA, PLS, PCR | SVM | [35] |
| Fuel oils (lubricant, gasoline, and diesel) | 0.2-1.5 THz | Mode-locked femtosecond Ti-sapphire laser, lock-in amplifier, transmission mode | PCA | SVM, BPNN | [36] |
| Extra-virgin olive oil (EVOO) | 0.1-4 THz | TAS7500TS HF THz-TDS system, femtosecond laser, transmission mode | PCA, Genetic algorithm | LS-SVM, BPNN, Random Forest | [37] |
| Adulterated dairy products (skim, low fat milk) | 0.1-1.5 THz | | PCA | SVM-DA | [38] |
| Oral lichen planus (OLP) | 0.3-3.5 THz | T-SPEC THz spectrometer, absorption mode | PCA | SVM | [6] |
TABLE I: THz-TDS Chemometrics

IV-C Non-negative Matrix Factorization

NMF is another dimensionality reduction and feature extraction technique suitable for high-dimensional multivariate THz spectral data analyses. NMF approximates the data by iterative additive combinations of non-negative basis vectors, making it a good candidate when other tools cannot guarantee non-negativity and may thus return values that contradict physical realities. For a non-negative matrix $\mathbf{X} \in \mathbb{R}_{\geq 0}^{N \times M}$ and a positive integer factorization rank $r$, we find two non-negative matrices, $\mathbf{W} \in \mathbb{R}_{\geq 0}^{N \times r}$ and $\mathbf{H} \in \mathbb{R}_{\geq 0}^{r \times M}$, the product of which approximates $\mathbf{X}$ via non-negative factorization:

$$\mathbf{X} \approx \mathbf{W}\mathbf{H}.$$

The number of columns in $\mathbf{W}$, namely $r$, is the number of latent features and represents the reduced feature space dimension.
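The text does not specify which NMF solver is used. As one common choice, and purely as an assumption, the sketch below uses the Lee-Seung multiplicative update rules for the Frobenius-norm objective on a synthetic non-negative matrix; the function name is hypothetical.

```python
import numpy as np

def nmf_multiplicative(X, rank, n_iter=1000, eps=1e-9, seed=0):
    """Approximate X >= 0 by W @ H with W, H >= 0 (multiplicative updates)."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], rank)) + eps          # N x r, non-negative start
    H = rng.random((rank, X.shape[1])) + eps          # r x M, non-negative start
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so signs never flip.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic non-negative data with an exact rank-3 structure.
rng = np.random.default_rng(1)
X = rng.random((50, 3)) @ rng.random((3, 20))

W, H = nmf_multiplicative(X, rank=3)
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(W.shape, H.shape, float(rel_error))
```

The rows of `W` then serve as the reduced, non-negative feature vectors passed to a classifier.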

V Classification

Following the discussion on feature extraction, we investigate complementary machine learning techniques that classify materials based on their THz spectral absorption and transmission coefficients. The candidate combinations of signal processing techniques are illustrated in Fig. 2. We next detail the following candidate classification techniques: naive Bayes (NB), support vector machine (SVM), K-nearest neighbor (KNN), and partial least squares-discriminant analysis (PLS-DA). Unless otherwise stated, for an input set $\mathbf{Y}$ of features, the vector $\mathbf{z}$ denotes the output data classes.

V-A Naive Bayes

The NB classifier is a supervised probabilistic machine learning classifier that applies the maximum a posteriori (MAP) decision rule. NB assumes that the features are conditionally independent of each other given the class. Each class $c_l$, $l = 1, \dots, L$, has a prior probability $P(c_l)$ that is estimated from the training feature dataset. The class with the highest posterior probability is the resulting target class. Using Bayes' theorem, the NB classifier computes the conditional probability $P(c_l | \mathbf{y})$ for each of the possible classes as

$$P(c_l | \mathbf{y}) = \frac{P(c_l) P(\mathbf{y} | c_l)}{P(\mathbf{y})}.$$

Since $P(\mathbf{y})$ is invariant for all classes, the naive classification rule can be further simplified as

$$P(c_l | \mathbf{y}) \propto P(c_l) \prod_{j=1}^{D} P(y_j | c_l).$$

For the final decision, the classifier model incorporates a MAP rule to predict the class with the largest posterior probability:

$$\hat{z} = \arg\max_{l} \; P(c_l) \prod_{j=1}^{D} P(y_j | c_l).$$


The NB classifier can be used alongside 2-D cross-correlation for classifying THz signals. The correlation between the background time-domain pulse and the sample ensures considerable noise suppression in a THz dataset. Consequently, any phase differences due to sample dispersion are preserved and discriminated between the two signals. The cross-correlation sequences associated with every class sample can be computed using the reference and sample signal. Then, statistical features can be extracted from each cross-correlation sequence and forwarded to the NB classifier.
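The MAP rule of the NB classifier can be illustrated with a small Gaussian NB sketch, in which each feature is assumed to be normally distributed within each class. The class name `GaussianNaiveBayes` and the synthetic feature data are assumptions for illustration; the cross-correlation features discussed above are not reproduced here.

```python
import numpy as np

class GaussianNaiveBayes:
    """Gaussian NB: per-class priors plus independent per-feature Gaussians."""

    def fit(self, Y, z):
        self.classes = np.unique(z)
        self.priors = np.array([np.mean(z == c) for c in self.classes])
        self.means = np.array([Y[z == c].mean(axis=0) for c in self.classes])
        self.vars = np.array([Y[z == c].var(axis=0) + 1e-9 for c in self.classes])
        return self

    def log_posterior(self, Y):
        # log P(c) + sum_j log N(y_j | mean, var); the evidence P(y) is dropped
        # because it is the same for every class.
        diff = Y[:, None, :] - self.means[None, :, :]
        log_lik = -0.5 * np.sum(np.log(2.0 * np.pi * self.vars)[None]
                                + diff ** 2 / self.vars[None], axis=2)
        return np.log(self.priors)[None, :] + log_lik

    def predict(self, Y):
        # MAP decision: class with the largest (log) posterior
        return self.classes[np.argmax(self.log_posterior(Y), axis=1)]

# Synthetic 4-feature data for three classes.
rng = np.random.default_rng(0)
centers = np.array([[0, 0, 0, 0], [4, 4, 0, 0], [0, 4, 4, 0]], dtype=float)

def sample(n_per_class):
    Y = np.vstack([c + rng.normal(size=(n_per_class, 4)) for c in centers])
    return Y, np.repeat(np.arange(3), n_per_class)

Y_train, z_train = sample(50)
Y_test, z_test = sample(50)
model = GaussianNaiveBayes().fit(Y_train, z_train)
accuracy = float(np.mean(model.predict(Y_test) == z_test))
print(accuracy)
```

Working with log-probabilities avoids numerical underflow when the product runs over many features.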

V-B Support Vector Machine

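SVM is a supervised learning model that classifies data based on a set of support vectors, i.e., a subset of the training dataset, that construct a hyperplane in feature space. For both linear and nonlinear problems, SVM classifies data using a boundary hyperplane that separates data into different classes. Most material recognition problems consisting of multi-class pulsed signals (signals belonging to three or more classes) use SVM classifiers to analyze and discriminate data.

A set of learning data $\{(\mathbf{y}_t, z_t)\}$ is used for building the SVM model, where $\mathbf{y}_t \in \mathbb{R}^{D}$ and $z_t$ denotes the class label corresponding to each input feature vector $\mathbf{y}_t$. A total of $L(L-1)/2$ classifiers are constructed, where $L$ denotes the number of classes of the input data. Each classifier is trained on input data from two classes. By training data from the $i$th and the $j$th classes, the classification problem can be expressed as

$$\min_{\mathbf{w}^{ij},\, b^{ij},\, \boldsymbol{\xi}^{ij}} \; \frac{1}{2} (\mathbf{w}^{ij})^T \mathbf{w}^{ij} + C \sum_{t} \xi_t^{ij}$$

subject to

$$(\mathbf{w}^{ij})^T \phi(\mathbf{y}_t) + b^{ij} \geq 1 - \xi_t^{ij} \;\; \text{if } z_t = i, \qquad (\mathbf{w}^{ij})^T \phi(\mathbf{y}_t) + b^{ij} \leq -1 + \xi_t^{ij} \;\; \text{if } z_t = j, \qquad \xi_t^{ij} \geq 0,$$

where $\mathbf{w}^{ij}$ is the normal vector of the hyperplane, $b^{ij}$ is a real-valued bias, $\xi_t^{ij}$ is the slack variable, $C$ is the penalty parameter, $t$ is the index of the combined set of the $i$th and $j$th class samples of the training data, and $\phi(\cdot)$ is the function that maps the training data (input space) to a higher dimensional space (feature space). Both $\mathbf{w}^{ij}$ and $b^{ij}$ denote the optimization variables for the optimal hyperplane. Consequently, the voting strategy for the SVM classification function is defined as

$$f^{ij}(\mathbf{y}) = \mathrm{sgn}\left((\mathbf{w}^{ij})^T \phi(\mathbf{y}) + b^{ij}\right),$$

where $\mathrm{sgn}(\cdot)$ is the signum function that extracts the sign that determines the class to assign to the point. A positive sign casts a vote for class $i$ and a negative sign casts a vote for class $j$; the point is assigned to the class that collects the most votes.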
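The pairwise (one-versus-one) training and voting strategy can be sketched as follows. This is an illustrative simplification: it assumes a linear kernel (the mapping to feature space is the identity), trains each pairwise classifier by plain subgradient descent on the soft-margin objective rather than with a dedicated quadratic programming solver, and uses synthetic data. All function names and hyperparameters are assumptions.

```python
import numpy as np
from itertools import combinations

def train_linear_svm(Y, s, C=1.0, lr=1e-3, n_epochs=500):
    """Subgradient descent on 0.5*||w||^2 + C*sum(max(0, 1 - s*(w.y + b))), s in {-1, +1}."""
    w, b = np.zeros(Y.shape[1]), 0.0
    for _ in range(n_epochs):
        violated = s * (Y @ w + b) < 1.0              # samples with non-zero slack
        grad_w = w - C * (s[violated][:, None] * Y[violated]).sum(axis=0)
        grad_b = -C * s[violated].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def train_one_vs_one(Y, z):
    """One classifier per pair of classes, L(L-1)/2 in total."""
    models = {}
    for i, j in combinations(np.unique(z), 2):
        mask = (z == i) | (z == j)
        s = np.where(z[mask] == i, 1.0, -1.0)         # class i -> +1, class j -> -1
        models[(i, j)] = train_linear_svm(Y[mask], s)
    return models

def predict_one_vs_one(models, Y, n_classes):
    votes = np.zeros((Y.shape[0], n_classes), dtype=int)
    for (i, j), (w, b) in models.items():
        positive = np.sign(Y @ w + b) > 0             # positive sign votes for class i
        votes[positive, i] += 1
        votes[~positive, j] += 1                      # otherwise vote for class j
    return votes.argmax(axis=1)

# Synthetic 2-feature data for three classes.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])

def sample(n_per_class):
    Y = np.vstack([c + 0.5 * rng.normal(size=(n_per_class, 2)) for c in centers])
    return Y, np.repeat(np.arange(3), n_per_class)

Y_train, z_train = sample(40)
Y_test, z_test = sample(40)
models = train_one_vs_one(Y_train, z_train)
accuracy = float(np.mean(predict_one_vs_one(models, Y_test, 3) == z_test))
print(len(models), accuracy)
```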

V-C K-Nearest Neighbor

KNN is a distance-based learning algorithm that is favorable due to its simple mathematical formulation and relatively short training time. In KNN, data points close to each other are referred to as neighbors, and the class of a test point is inferred from its distances to nearby labeled data points. The algorithm assigns a class by a majority vote among the $k$ nearest neighbors, based on a specific distance metric (Euclidean, Mahalanobis, Chebychev, or correlation distance). The controlling variable $k$ is chosen after preliminary validation or hyperparameter optimization, depending on the dataset requirements. The KNN algorithm selects the predicted class $\hat{z}$ for which the distance to the test data is minimized as

$$d(\mathbf{y}, \mathbf{y}') = \sqrt{\sum_{j=1}^{D} (y_j - y'_j)^2},$$

where $d(\mathbf{y}, \mathbf{y}')$ is the (here Euclidean) distance between the training input $\mathbf{y}$ and the test input $\mathbf{y}'$, and $D$ is the total number of features.
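A minimal KNN sketch with the Euclidean distance and an unweighted majority vote is given below. The function name and the toy data are assumptions; a weighted variant would scale each vote by a function of the distance.

```python
import numpy as np

def knn_predict(Y_train, z_train, Y_test, k=3):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    diff = Y_test[:, None, :] - Y_train[None, :, :]
    dists = np.sqrt(np.sum(diff ** 2, axis=2))        # distance over all D features
    nearest = np.argsort(dists, axis=1)[:, :k]        # indices of the k nearest neighbors
    n_classes = int(z_train.max()) + 1
    return np.array([np.bincount(z_train[row], minlength=n_classes).argmax()
                     for row in nearest])

# Toy data: two classes with three labeled points each.
Y_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
z_train = np.array([0, 0, 0, 1, 1, 1])
Y_test = np.array([[0.2, 0.1], [5.5, 5.2]])

z_hat = knn_predict(Y_train, z_train, Y_test, k=3)
print(z_hat)  # -> [0 1]
```

There is no training step beyond storing the data, but every prediction requires distances to all training points, which becomes costly for large sample sizes.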

V-D Partial Least Squares-Discriminant Analysis

PLS-DA is derived from the PLS regression (PLS-R) algorithm and combines the properties of PLS-R with the discrimination power of a classification technique. PLS-DA thus sharpens and maximizes the separation between groups or classes of observations. PLS is a commonly used supervised feature extraction technique for THz data that reduces the data to a low-dimensional space via a linear transformation. When combined with DA, however, it forms a hybrid classification method that can be used for predictive modeling (we thus discuss PLS-DA under classification techniques). Let the input matrix $\mathbf{Y}$ of THz spectral feature data for several classes consist of $D$ predictor variables, and let the output measured parameter matrix $\mathbf{Z}$ consist of $L$ response variables. The fundamental PLS-DA paradigm is formulated as

$$\mathbf{Y} = \mathbf{T}\mathbf{P}^T + \mathbf{E}, \qquad \mathbf{Z} = \mathbf{U}\mathbf{Q}^T + \mathbf{F},$$

where $\mathbf{T}$ is the Y-score matrix, $\mathbf{P}$ is the orthogonal Y-loading matrix, $\mathbf{U}$ is the Z-score matrix, $\mathbf{Q}$ is the orthogonal Z-loading matrix, and $\mathbf{E}$ and $\mathbf{F}$ are error matrices. The linear regression model of $\mathbf{Y}$ and $\mathbf{Z}$ is expressed as

$$\mathbf{Z} = \mathbf{Y}\mathbf{B} + \mathbf{F},$$

with $\mathbf{B}$ being the regression coefficient matrix, which is computed as

$$\mathbf{B} = \mathbf{W} (\mathbf{P}^T \mathbf{W})^{-1} \mathbf{Q}^T,$$

where $\mathbf{W}$ is the PLS weight matrix. The prediction value, $\hat{\mathbf{z}}$, of the tested sample $\mathbf{y}_{\mathrm{test}}$ is then computed as

$$\hat{\mathbf{z}} = \mathbf{y}_{\mathrm{test}} \mathbf{B}.$$

If the prediction value of class membership is above zero, the corresponding sample is considered a member of that class.
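As a hedged sketch of how PLS-DA can be realized, the code below runs a standard PLS2 regression with deflation on a one-hot class membership matrix and assigns each sample to the class with the largest predicted membership. The one-hot coding, the argmax decision (instead of a fixed threshold), the mean-centering, and all names are assumptions made for illustration.

```python
import numpy as np

def plsda_fit(Y, z, n_components):
    """PLS2 regression of one-hot class memberships Z on features Y."""
    Z = np.eye(int(z.max()) + 1)[z]                   # one-hot membership matrix
    y_mean, z_mean = Y.mean(axis=0), Z.mean(axis=0)
    E, F = Y - y_mean, Z - z_mean                     # centered residual matrices
    W, P, Q = [], [], []
    for _ in range(n_components):
        u_left = np.linalg.svd(E.T @ F, full_matrices=False)[0]
        w = u_left[:, 0]                              # weight: direction of maximal covariance
        t = E @ w                                     # score vector
        p = E.T @ t / (t @ t)                         # feature loading
        q = F.T @ t / (t @ t)                         # response loading
        E = E - np.outer(t, p)                        # deflate features
        F = F - np.outer(t, q)                        # deflate responses
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q).T
    B = W @ np.linalg.inv(P.T @ W) @ Q.T              # regression coefficient matrix
    return B, y_mean, z_mean

def plsda_predict(Y, B, y_mean, z_mean):
    Z_hat = (Y - y_mean) @ B + z_mean                 # predicted class memberships
    return Z_hat.argmax(axis=1), Z_hat

# Synthetic 6-feature data for three classes.
rng = np.random.default_rng(0)
centers = np.zeros((3, 6))
centers[1, 0] = 5.0
centers[2, 1] = 5.0

def sample(n_per_class):
    Y = np.vstack([c + 0.5 * rng.normal(size=(n_per_class, 6)) for c in centers])
    return Y, np.repeat(np.arange(3), n_per_class)

Y_train, z_train = sample(50)
Y_test, z_test = sample(50)
B, y_mean, z_mean = plsda_fit(Y_train, z_train, n_components=2)
z_hat, Z_hat = plsda_predict(Y_test, B, y_mean, z_mean)
accuracy = float(np.mean(z_hat == z_test))
print(B.shape, accuracy)
```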

VI Deep Learning

Given that THz spectral data sets can be extensive and complex, neural networks can be explored as robust classification tools that improve learning efficiency compared with the conventional classification models discussed earlier in this paper. We distinguish various neural network architectures depending on how the network is trained and how it classifies. In particular, deep learning techniques can be supervised, unsupervised, or reinforcement-based. One of the main advantages of neural networks is that they can create new features by themselves, unlike traditional shallow learning techniques in which features need to be identified accurately by other techniques. Therefore, deep learning classifiers can operate directly on THz training data, enabling the faster learning that is much needed in fast-changing THz conditions. We highlight two particular types of supervised neural networks because of their popular usage in existing THz material sensing literature: generalized regression neural networks (GRNN) and backpropagation neural networks (BPNN).


VI-A Generalized Regression Neural Network

The most extensively used deep learning model in THz material classification studies is GRNN, both for theoretical and practical applications [39]. GRNN is a typical feed-forward neural network that provides a powerful variation of the conventional radial basis function neural network. As a single-pass efficient learning technique, GRNN avoids tedious iterative training. Unlike BPNN, GRNN requires selecting only one training parameter (the smoothing factor, i.e., the width of the radial basis functions). The corresponding performance differs significantly with the choice of smoothing factor, where the estimated density takes a multivariate Gaussian form with larger smoothing factors. GRNN consists of four layers: input, pattern, summation, and output.
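The four GRNN layers can be sketched in a few lines, assuming a Gaussian radial basis kernel and one-hot class targets so that the output can be read as smoothed class memberships. The function name, the smoothing factor value, and the toy data are assumptions for illustration.

```python
import numpy as np

def grnn_predict(Y_train, T_train, Y_test, sigma=1.0):
    """One-pass GRNN: kernel-weighted average of the stored training targets."""
    # Input layer: the test features. Pattern layer: one Gaussian unit per training sample.
    d2 = np.sum((Y_test[:, None, :] - Y_train[None, :, :]) ** 2, axis=2)
    K = np.exp(-d2 / (2.0 * sigma ** 2))
    # Summation layer: kernel-weighted target sum and plain kernel sum.
    numerator = K @ T_train
    denominator = K.sum(axis=1, keepdims=True)
    # Output layer: ratio of the two sums.
    return numerator / np.maximum(denominator, 1e-300)

# Toy data: two classes with three stored patterns each.
Y_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
z_train = np.array([0, 0, 0, 1, 1, 1])
T_train = np.eye(2)[z_train]                          # one-hot targets
Y_test = np.array([[0.2, 0.1], [5.5, 5.2]])

memberships = grnn_predict(Y_train, T_train, Y_test, sigma=1.0)
z_hat = memberships.argmax(axis=1)
print(z_hat)  # -> [0 1]
```

There is no iterative training: the only quantity to tune is `sigma`, which matches the single smoothing parameter mentioned above.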


VI-B Backpropagation Neural Network

BPNN is widely used to train multilayer feedforward neural networks, and it mainly consists of the input layer, one or more hidden layers, and the output layer. The BPNN principle adjusts the weight parameter controlling the degrees of connections between the neuron nodes of different layers to produce the desired output layer. Proper adjustment of weights allows minimizing the total network error. The number of input features determines the number of input neurons. Furthermore, the number of neurons in the output layer is related to the number of classes. However, the number of hidden intermediate layers between the input and output layers can be customized.
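A minimal backpropagation sketch with one hidden layer is shown below. The tanh hidden activation, softmax output, cross-entropy loss, full-batch gradient descent, the hyperparameter values, and the synthetic standardized data are all assumptions chosen for illustration, not the configuration evaluated in this paper.

```python
import numpy as np

def forward(Y, params):
    W1, b1, W2, b2 = params
    H = np.tanh(Y @ W1 + b1)                          # hidden layer activations
    logits = H @ W2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)
    P = np.exp(logits)
    return H, P / P.sum(axis=1, keepdims=True)        # softmax class probabilities

def train_bpnn(Y, z, n_hidden=10, lr=0.1, n_epochs=1000, seed=0):
    rng = np.random.default_rng(seed)
    n, d = Y.shape
    T = np.eye(int(z.max()) + 1)[z]                   # one output neuron per class
    params = [rng.normal(scale=0.1, size=(d, n_hidden)), np.zeros(n_hidden),
              rng.normal(scale=0.1, size=(n_hidden, T.shape[1])), np.zeros(T.shape[1])]
    losses = []
    for _ in range(n_epochs):
        H, P = forward(Y, params)                     # forward pass
        losses.append(-np.mean(np.sum(T * np.log(P + 1e-12), axis=1)))
        d_logits = (P - T) / n                        # output-layer error
        d_hidden = (d_logits @ params[2].T) * (1.0 - H ** 2)  # error propagated backward
        grads = [Y.T @ d_hidden, d_hidden.sum(axis=0),
                 H.T @ d_logits, d_logits.sum(axis=0)]
        for p, g in zip(params, grads):
            p -= lr * g                               # in-place weight update
    return params, losses

# Synthetic 2-feature data for three classes, standardized before training.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
Y = np.vstack([c + 0.5 * rng.normal(size=(40, 2)) for c in centers])
z = np.repeat(np.arange(3), 40)
Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)

params, losses = train_bpnn(Y, z)
accuracy = float(np.mean(forward(Y, params)[1].argmax(axis=1) == z))
print(losses[0], losses[-1], accuracy)
```

In contrast to the one-pass GRNN, every epoch here requires a forward and a backward pass over the training data.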

Fig. 3: Transmittance waveform of five materials.
Fig. 4: Absorption spectra of five gases.

VII Performance and Complexity Tradeoffs

This section analyzes the performance and complexity trade-offs of the feature extraction and classification techniques presented in the previous sections. We consider publicly available THz spectral data for different materials, namely solids (Fig. 3) and gases (Fig. 4), and we compare the classification success rates of the techniques.

VII-A Terahertz Spectroscopy Datasets

For material identification, we consider the sub-THz spectral data of 20 sample materials (such as alumina, aspirin, baking powder, baking soda, and chalk), as obtained from the THz database provided by the National Institute of Standards and Technology (NIST) [22]. The chemical materials are assumed to be ground into a fine powder and pressed into polyethylene-diluted solid pellets. By investigating the transmittance spectrum, it can be seen from Fig. 3 that the THz transmittance amplitudes and peak locations differ across the materials, which facilitates classification and identification. In the particular case of gas spectroscopy, the gas absorption spectra are modeled using radiative transfer theory. The molecular absorption for an individual isotope can be expressed as a function of system pressure and temperature. We extract all parameters from the HITRAN database [21].

VII-B Performances of Feature Extraction Techniques

Since smaller datasets are easier and faster to analyze and visualize than rich datasets, dimension reduction can be an essential step before implementing machine learning algorithms. For sample materials, we use 1000 observations to establish the calibration model over 430 transmission spectral coefficients (variables). We use the Savitzky-Golay function to pre-process the data for smoothing (assuming noise-corrupted data). The qualitative chemometric analysis is first performed using four different feature extraction and dimension reduction techniques: PCA, PLS regression, NMF, and t-SNE.

For each SNR value, PCA is executed 10 times, where a small number of PCs (up to 10 PCs) is required to reach good classification performance. The THz spectral window under investigation is reduced while still retaining relevant data that captures most of the THz spectral fingerprint. At an SNR of 20 dB, the accumulated contribution rate of the first 10 principal components of PCA is high enough to yield good clustering performance. PCA thus constructs PCs that convey the most variation in the available dataset. Similarly, the first 10 output components of each of the studied techniques are selected to train the classifiers. In t-SNE, we set the effective number of local neighbors of each point, known as perplexity, to 5. The best feature extraction model is the one that minimizes computational complexity and dimensionality. We note that PCA has the best clustering efficacy (see Fig. 5(a) and Fig. 6(a)) for complex high-dimensional spectral datasets and that t-SNE has the highest computational and time complexities. At lower SNR, PCA outperforms NMF, PLS, and t-SNE in terms of stability, complexity, and interpretability of the spectral features. In this work, t-SNE was performed using slightly fewer samples, as it requires considerably more time to execute on the same sample size of spectral data than the other feature extraction techniques.

(a) Relative classification performance with PCA
(b) Relative classification performance with PLS
(c) Relative classification performance with t-SNE
(d) Relative classification performance with NMF
Fig. 5: Performance of studied classifiers with respect to signal-to-noise ratio for solids.
(a) Relative classification performance with PCA
(b) Relative classification performance with PLS
(c) Relative classification performance with t-SNE
(d) Relative classification performance with NMF
Fig. 6: Performance of studied classifiers with respect to signal-to-noise ratio for gases.

VII-C Performances of Classification Techniques

We next illustrate the classification rate performance of several materials. We utilize linear DA (LDA), linear SVM (LSVM), weighted KNN (WKNN), and Gaussian NB (GNB) to predict the materials following each feature extraction technique. The output feature data are divided into two sets, training and testing, both of which are composed of the sample material (or class) data and a sample label. We use a K-fold cross-validation scheme to evaluate the classification models’ performance. The input data of features is partitioned into 10 equal-size sub-samples (folds), each used as a testing set and as validation data. In the first iteration, the first fold of the sub-samples is used to test the classifier, while the remaining folds are used to train the classifier. In the second iteration, the second fold is used as the testing set, and the rest serve as training sets. This process is repeated until each of the 10 folds is used as a testing set.
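The 10-fold cross-validation procedure can be sketched as follows. The nearest-centroid classifier is only a hypothetical stand-in for the studied classifiers, and the synthetic 10-feature data mimics the use of the first 10 extracted components; both are assumptions.

```python
import numpy as np

def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) pairs; each fold is the testing set exactly once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_folds)
    for i in range(n_folds):
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        yield train_idx, folds[i]

def nearest_centroid(Y_train, z_train, Y_test):
    """Stand-in classifier: assign each test point to the closest class mean."""
    classes = np.unique(z_train)
    centroids = np.array([Y_train[z_train == c].mean(axis=0) for c in classes])
    d2 = ((Y_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[d2.argmin(axis=1)]

def cross_validate(fit_predict, Y, z, n_folds=10):
    accuracies = []
    for train_idx, test_idx in k_fold_indices(len(z), n_folds):
        z_hat = fit_predict(Y[train_idx], z[train_idx], Y[test_idx])
        accuracies.append(np.mean(z_hat == z[test_idx]))
    return float(np.mean(accuracies))                 # average over the folds

# Synthetic data: 3 classes, 100 samples each, 10 extracted features.
rng = np.random.default_rng(0)
centers = 4.0 * np.eye(3, 10)
Y = np.vstack([c + rng.normal(size=(100, 10)) for c in centers])
z = np.repeat(np.arange(3), 100)

cv_accuracy = cross_validate(nearest_centroid, Y, z, n_folds=10)
print(cv_accuracy)
```

Any of the classifiers discussed in this paper can be evaluated in the same way by passing a function with the same `(Y_train, z_train, Y_test)` signature.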

We measure the root mean square error of calibration (RMSEC) for the testing set to capture the classification model’s performance. The results in Fig. 5 suggest that LDA can distinguish the solid materials accurately at low SNR values with low processing time. This observation is expected because LDA is suitable for problems that deal with linearly separable THz spectral data. Gaussian NB (GNB) gave similar good classification results, which is also expected because the THz material fingerprints have continuous features and approximate Gaussian distributions. However, the GNB classifier (Fig. 5d) did not exhibit a good representation of the NMF data, mainly due to inaccuracies in its simple hypothesis function. Furthermore, the accuracy of KNN is degraded due to noisy feature data and large sample sizes, resulting in an expensive cost to calculate the distance to the nearest neighbor. The relative degradation in the accuracy of the SVM classifier, further, might be caused by overfitting.

For BPNN, the best performance is achieved with 10 PCs, 10 hidden nodes, and a 0.01 learning rate. However, the best GRNN performance is achieved with 5 PCs and a spread value of 10 for the radial basis function, attaining smoother function approximation. Compared to the BPNN model, the GRNN model results suggest that GRNN is more favorable in predicting the THz spectral data, mainly because GRNN has low computational complexity, intermittent flow estimation, and high computing speed. Each GRNN trains fast in one-pass learning, whereas BPNN takes much more time on average over forward and backward passes. GRNN can further converge to the THz data’s underlying function with just a few training samples, unlike BPNN. Moreover, GRNN results in less classification error (better generalization ability) due to its ability to handle noise in the input data.

GRNN achieves a good balance between the high classification accuracy and speed for both solid and gaseous materials datasets (Figures 5 and 6, respectively). The corresponding computational efficiency is proven in [42] for all kinds of gas molecules supported by a spectral database. The results show that the multivariate discriminative model of LDA and GRNN, in association with THz spectroscopy, provides a cost-effective and low-time-consuming alternative to the commonly used models in the literature for material classification, suggesting a commercial and regulatory potential.

VIII Joint THz Communications and Sensing

The progress made on THz system design triggers a plethora of promising research directions, especially on the topic of convergence of communications and sensing. To this end, we next present some particular timely applications of joint communications and sensing at the THz band. The section also suggests some future research directions, which are expected to be at the forefront of THz sensing studies.

VIII-A Joint THz Communications and Sensing Applications

The prospective use cases of THz sensing are mainly in the context of joint communications and sensing [13, 14]. A unified THz system for communication and sensing can support various applications, from in-home digital health to building analytics (e.g., residential security). For instance, the THz band can establish reliable communication links for unmanned aerial vehicles (UAVs) that build on accurate localization and sensing capabilities. The THz band can also provide high-rate virtual reality services, enabling good visual perception. In particular, THz signals can support extended reality (XR) interfaces capable of interacting with sensory information in indoor environments without intervention. Sub-THz vehicular communications and sensing can further enable data exchange in vehicle-to-everything (V2X) communication systems. The deployment of such unified systems requires exceptionally high data rates, low latencies, and high reliability. Several contemporary approaches to joint vehicular communication and sensing are being explored, such as time-division duplex (TDD), telecom messages over radar transmissions (ToR), and radar sensing over telecom transmissions (RoT). Moreover, joint THz communications and sensing applications are particularly useful in body-centric systems where THz-TDS can be used to sense infections and mitigate virus outbreaks.

Intelligent THz systems should achieve high-rate communications and robust high-resolution sensing at the same time. Towards this end, efficient signal processing techniques are critical. Recent advances in machine learning and compressive sensing can significantly accelerate THz network applications by enhancing situational awareness and facilitating fast and low latency configurations. In this regard, network intelligentization is a new trend that aims at leveraging machine learning techniques to empower 6G communication systems with artificial intelligence (AI) algorithms. The use of AI in THz-based 6G networks is envisioned to pave the way towards enhanced localization, communication, and sensing. However, unleashing machine learning’s full potential requires addressing significant challenges, especially in waveform design. The differences in performance metrics between communication and sensing functionalities (achievable data rates, level of interference, sensing accuracy, and reliability) should also be taken into consideration. For instance, deep learning methods can be applied for target classification, waveform recognition, material sensing, and optimal selection of antennas and radio frequency chains in a joint THz communications and sensing setup.

VIII-B THz Frequency-Domain Spectroscopy

The THz band provides the possibility of designing both carrier-based and pulse-based setups that offer robust wireless sensing functionalities in communication system frameworks. However, transmitting short-time THz pulses in THz-TDS that cover large THz frequency bands is inefficient for communications. A carrier-based THz sensing setup can offer better flexibility in jointly meeting the continuously increasing bandwidth demands and sensing capabilities. We refer to the latter as THz frequency-domain spectroscopy (THz-FDS). THz-FDS can be conducted to recover the channel's molecular, spatial, and temporal characteristics when probing the environment with a discrete set of carrier frequencies. THz-FDS wireless sensing comprises three steps: (1) selectively probing a medium with a few THz waves, (2) estimating the channel response to identify THz fingerprints, and (3) correlating the estimated response with a reference spectral database of the constituents of the medium under investigation.
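The three THz-FDS steps can be illustrated with a toy sketch. Everything in it is synthetic and assumed for illustration: the carrier grid, the Lorentzian line shapes, the line positions, the medium names, and the noise level do not come from HITRAN or from the simulations in this paper.

```python
import numpy as np

def lorentzian_lines(freqs, line_centers, width=0.02):
    """Sum of unit-strength Lorentzian absorption lines (synthetic stand-in spectrum)."""
    return sum(width ** 2 / ((freqs - c) ** 2 + width ** 2) for c in line_centers)

# Step 1: probe the medium at a discrete set of carriers (hypothetical grid, in THz).
freqs = np.linspace(0.5, 1.5, 100)

# Reference database of absorption fingerprints (made-up line positions).
database = {
    "medium_A": lorentzian_lines(freqs, [0.56, 0.75, 0.99]),
    "medium_B": lorentzian_lines(freqs, [0.62, 1.10, 1.41]),
    "medium_C": lorentzian_lines(freqs, [0.70, 0.92, 1.23]),
}

def identify(measured, database):
    """Step 3: normalized correlation of the estimate with every reference fingerprint."""
    def ncc(a, b):
        a, b = a - a.mean(), b - b.mean()
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    scores = {name: ncc(measured, ref) for name, ref in database.items()}
    return max(scores, key=scores.get), scores

# Step 2: a noisy estimate of the channel absorption at the probed carriers.
rng = np.random.default_rng(0)
measured = database["medium_B"] + 0.05 * rng.normal(size=freqs.size)

best, scores = identify(measured, database)
print(best, scores)
```

In this toy setting, carriers that fall far from any absorption line contribute little to the correlation score.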

However, the spectral lines that form fingerprints for testing occur at specific resonance frequencies that are typically avoided when allocating carriers for communication purposes. On the one hand, this observation complicates the joint communications and sensing problem because near-absorption-free communication spectra hold little sensing information. On the other hand, if sensing is not fully piggybacked over communication resources and time-sharing is enabled, the carriers can be directly tuned to the target resonant frequencies of specific gases/materials to achieve better accuracy.

We test THz-FDS in the context of environmental electronic smelling (gas sensing). We assume several realistic mixtures of gases (N2, O2, H2O, CO2, and CH4), representing different possible media with molecular concentrations that correspond to dry, humid, and polluted air profiles. The corresponding results in Fig. 7 demonstrate that carrier-based sensing of gas mixtures is feasible. However, higher SNR values are required for convergence to a 100% classification success rate because estimating a mixture of gases is complicated. Note that we used 100 carriers, randomly distributed over a specific frequency range, which can be allocated in a single channel use with orthogonal frequency-division multiplexing (OFDM), for example, given the large THz bandwidths. We also test simple heuristic search algorithms to illustrate the importance of tuning carriers to resonant frequencies in joint communication and sensing setups, as illustrated in Fig. 8. Tuning 10 or 100 carriers to resonant frequencies of water vapor or oxygen introduces significant gains compared to uniformly distributing these carriers over the same frequency range. Note that the high SNR values are caused by the high propagation losses of THz channels that are accounted for in this simulation. Lower SNR ranges can be achieved by adding substantial antenna and beamforming gains.

Towards introducing beamforming gains, UM-MIMO systems can be deployed. High beamforming gains increase the received signal power and provide the required high-resolution spatial focusing at a specific distance (molecular absorption is also distance-dependent). Furthermore, UM-MIMO systems can realize multiple measurements in a single channel use. However, the high correlation between absorption spectra and the inherent high correlation of UM-MIMO channels results in low-rank measurement matrices. Spatial tuning of antenna separations can guarantee orthogonality of THz channels to achieve high multiplexing gains [43]. The level of accuracy also depends on the application requirements and the assumptions on how many gases and isotopes of gases can exist in a medium. Therefore, the problems under study can easily get prohibitively complex for simple signal processing techniques, hence the motivation for machine learning.

Instead of comparing the exact values of channel measurements, we can set thresholds to check the presence or absence of specific spikes and build decision trees for classification [44]. Furthermore, when a small number of gases/materials is being tested, the corresponding sparsity can be leveraged in compressive sensing techniques. Note that the transmitted information-bearing symbols over the channel can be assumed to be random for sensing purposes. However, in cooperative sensing and communications setups, such symbols would belong to a specific constellation with a specific structure, such as quadrature amplitude modulation. Knowledge of the modulation format can further be exploited to enhance the sensing performance. Note that in adaptive THz UM-MIMO systems, the set of active antennas, the carrier frequencies, and the modulation modes can be tuned in real-time while being efficiently blindly estimated at the receiver side [45].

Fig. 7: Performance of studied classifiers with THz-FDS sensing of a mixture of gases.
Fig. 8: Performance of carrier-based joint communications and sensing of O2 (solid) and H2O (dotted).

VIII-C Future Research Directions

The general need for automating analytical modeling, empowered by the developments in machine learning techniques, is a current trend, from both numerical and theoretical perspectives. Since a pattern recognition procedure and, by extension, a designated function approximator lie at the core of the THz sensing problem, as discussed earlier in this paper, we expect ongoing machine learning developments to play a major role in future THz sensing research directions. For instance, in sensing applications characterized by long-term variable dependencies, one can investigate the merits of using convolutional neural networks, recurrent neural networks [47], or echo-state networks [48] as potential sensing solutions. In sensing problems where a distributed, yet privacy-preserving, implementation is of practical importance, we can adopt federated learning-based sensing techniques [49]. Lastly, in problems involving interactive learning frameworks, e.g., in sensing situations where an agent senses one state of the environment and consequently executes an action that impacts the next state so as to maximize a certain score (reward), one can use reinforcement learning-based techniques [50]. Such a reinforcement learning paradigm would indeed be well suited to THz multi-purpose platforms, e.g., those exploiting the THz band as a powerful enabler of joint communications, sensing, and localization, which promises to be an exciting area of future research in the field.

IX Conclusions

In conclusion, non-destructive THz spectroscopic sensing with chemometrics is useful for material discrimination and has the potential to address real-life problems and applications. Machine learning methods provide a powerful tool for both qualitative and quantitative analyses of THz spectral fingerprints. In this paper, we have successfully applied several relevant dimension reduction and classification techniques to identify solid and gaseous materials measured by transmission and absorption THz-TDS, respectively. We have demonstrated the feasibility of PCA, PLS, NMF, and t-SNE to reduce the high-dimensional THz data and extract its most prominent features. Furthermore, we employed SVM, LDA, KNN, and NB classifiers to predict the sample materials. Our results confirm that PCA-GRNN and PLS-GRNN outperform the other studied machine learning models in identifying solid materials in the THz range. However, for the special case of gaseous materials, NMF-GRNN, comparable to PCA-GRNN, provides the best classification results at lower SNR values. To the best of the authors' knowledge, this paper is the first article that presents a holistic overview of signal processing and machine learning techniques for efficient THz sensing and introduces a roadmap for several exciting future THz use cases.


  • [1] R. Bogue, “Sensing with terahertz radiation: A review of recent progress,” Sensor Review, 2018.
  • [2] P. Jepsen, D. Cooke, and M. Koch, “Terahertz spectroscopy and imaging – modern techniques and applications,” Laser & Photonics Reviews, vol. 5, pp. 124–166, Jan. 2011.
  • [3] K. Sengupta, T. Nagatsuma, and D. M. Mittleman, “Terahertz integrated electronic and hybrid electronic-photonic systems,” Nature Electronics, vol. 1, no. 12, p. 622, 2018.
  • [4] A. Ren, A. Zahid, D. Fan, X. Yang, M. A. Imran, A. Alomainy, and Q. H. Abbasi, “State-of-the-art in terahertz sensing for food and water security–a comprehensive review,” Trends in Food Science & Technology, 2019.
  • [5] R. Li, C. Li, H. Li, S. Wu, and G. Fang, “Study of automatic detection of concealed targets in passive terahertz images for intelligent security screening,” IEEE Transactions on Terahertz Science and Technology, vol. 9, no. 2, pp. 165–176, Mar. 2019.
  • [6] Y. V. Kistenev, A. V. Borisov, M. A. Titarenko, O. D. Baydik, and A. V. Shapovalov, “Diagnosis of oral lichen planus from analysis of saliva samples using terahertz time-domain spectroscopy and chemometrics,” Journal of Biomedical Optics, vol. 23, no. 4, pp. 1 – 8, 2018.
  • [7] K. E. K. Coppin, J. E. Geach, I. Smail, L. Dunne, A. C. Edge, R. J. Ivison, S. Maddox, R. Auld, M. Baes, S. Buttiglione, A. Cava, D. L. Clements, A. Cooray, A. Dariush, G. De Zotti, S. Dye, S. Eales, J. Fritz, R. Hopwood, E. Ibar, M. Jarvis, M. J. Michałowski, D. N. A. Murphy, M. Negrello, E. Pascale, M. Pohlen, E. Rigby, G. Rodighiero, D. Scott, S. Serjeant, D. J. B. Smith, P. Temi, and P. van der Werf, “Herschel-Astrophysical Terahertz Large Area Survey: Detection of a far-infrared population around galaxy clusters,” Monthly Notices of the Royal Astronomical Society, vol. 416, no. 1, pp. 680–688, Aug. 2011.
  • [8] P. Strobbia, R. Odion, and T. Vo-Dinh, “Spectroscopic chemical sensing and imaging: From plants to animals and humans,” Chemosensors, vol. 6, p. 11, Feb. 2018.
  • [9] J. Xu, K. W. Plaxco, and S. J. Allen, “Probing the collective vibrational dynamics of a protein in liquid water by terahertz absorption spectroscopy,” Protein Science, vol. 15, no. 5, pp. 1175–1181, 2006.
  • [10] B. Breitenstein, M. Scheller, M. Shakfa, T. Kinder, T. Müller-Wirts, M. Koch, and D. Selmar, “Introducing terahertz technology into plant biology: A novel method to monitor changes in leaf water status,” Journal of Applied Botany and Food Quality, vol. 84, no. 2, pp. 158–161, Dec. 2011.
  • [11] I. F. Akyildiz, J. M. Jornet, and C. Han, “Terahertz band: Next frontier for wireless communications,” Physical Communication, vol. 12, pp. 16–32, 2014.
  • [12] H. Sarieddeen, M.-S. Alouini, and T. Y. Al-Naffouri, “An overview of signal processing techniques for terahertz communications,” arXiv preprint arXiv:2005.13176, 2020.
  • [13] H. Sarieddeen, N. Saeed, T. Y. Al-Naffouri, and M. Alouini, “Next generation terahertz communications: A rendezvous of sensing, imaging, and localization,” IEEE Communications Magazine, vol. 58, no. 5, pp. 69–75, 2020.
  • [14] C. Chaccour, M. N. Soorki, W. Saad, M. Bennis, P. Popovski, and M. Debbah, “Seven defining features of terahertz (THz) wireless systems: A fellowship of communication and sensing,” arXiv preprint arXiv:2102.07668, 2021.
  • [15] Y.-D. Hsieh, S. Nakamura, D. Ibrahim, T. Minamikawa, Y. Mizutani, H. Yamamoto, T. Iwata, F. Hindle, and T. Yasui, “Dynamic terahertz spectroscopy of gas molecules mixed with unwanted aerosol under atmospheric pressure using fibre-based asynchronous-optical-sampling terahertz time-domain spectroscopy,” Scientific Reports, vol. 6, p. 28114, 06 2016.
  • [16] A. Faisal, H. Sarieddeen, H. Dahrouj, T. Y. Al-Naffouri, and M. S. Alouini, “Ultramassive MIMO systems at terahertz bands: Prospects and challenges,” vol. 15, no. 4, pp. 33–42, 2020.
  • [17] M. T. Ruggiero, “Invited review: Modern methods for accurately simulating the terahertz spectra of solids,” Journal of Infrared, Millimeter, and Terahertz Waves, pp. 1–38, 2020.
  • [18] J. Qin, Y. Ying, and L. Xie, “The detection of agricultural products and food using terahertz spectroscopy: A review,” Applied Spectroscopy Reviews, vol. 48, no. 6, pp. 439–457, 2013.
  • [19] M. Yin, S. Tang, and M. Tong, “The application of terahertz spectroscopy to liquid petrochemicals detection: A review,” Applied Spectroscopy Reviews, vol. 51, no. 5, pp. 379–396, 2016.
  • [20] J. El Haddad, B. Bousquet, L. Canioni, and P. Mounaix, “Review in terahertz spectral analysis,” TrAC Trends in Analytical Chemistry, vol. 44, pp. 98–105, 2013.
  • [21] I. E. Gordon, L. S. Rothman, C. Hill, R. V. Kochanov, Y. Tan, P. F. Bernath, M. Birk, V. Boudon, A. Campargue, K. Chance et al., “The HITRAN2016 molecular spectroscopic database,” Journal of Quantitative Spectroscopy and Radiative Transfer, vol. 203, pp. 3–69, 2017.
  • [22] E. Heilweil and M. Campbell, “THz spectral database,” 2011.
  • [23] T. Wang, E. A. Romanova, N. Abdel-Moneim, D. Furniss, A. Loth, Z. Tang, A. Seddon, T. Benson, A. Lavrinenko, and P. U. Jepsen, “Time-resolved terahertz spectroscopy of charge carrier dynamics in the chalcogenide glass AsSeTe,” Photon. Res., vol. 4, no. 3, pp. A22–A28, Jun. 2016.
  • [24] D. K. George and A. G. Markelz, Terahertz Spectroscopy of Liquids and Biomolecules.   Berlin, Heidelberg: Springer Berlin Heidelberg, 2013, pp. 229–250.
  • [25] M. Hangyo, M. Tani, and T. Nagashima, “Terahertz time-domain spectroscopy of solids: A review,” International journal of infrared and millimeter waves, vol. 26, no. 12, pp. 1661–1690, 2005.
  • [26] Y.-C. Shen and P. F. Taday, “Development and application of terahertz pulsed imaging for nondestructive inspection of pharmaceutical tablet,” IEEE Journal of Selected Topics in Quantum Electronics, vol. 14, no. 2, pp. 407–415, 2008.
  • [27] B. S. Ferguson, H. Liu, S. Hay, D. Findlay, X.-C. Zhang, and D. Abbott, “In vitro osteosarcoma biosensing using THz time domain spectroscopy,” in BioMEMS and Nanotechnology, D. V. Nicolau, U. R. Muller, and J. M. Dell, Eds., vol. 5275, International Society for Optics and Photonics.   SPIE, 2004, pp. 304 – 316.
  • [28] L. Yan, C. Liu, H. Qu, W. Liu, Y. Zhang, J. Yang, and L. Zheng, “Discrimination and measurements of three flavonols with similar structure using terahertz spectroscopy and chemometrics,” Journal of Infrared, Millimeter, and Terahertz Waves, vol. 39, 03 2018.
  • [29] C. Li, B. Li, and D. Ye, “Analysis and identification of rice adulteration using terahertz spectroscopy and pattern recognition algorithms,” IEEE Access, vol. 8, pp. 26 839–26 850, 2020.
  • [30] C. Cao, Z. Zhang, X. Zhao, and T. Zhang, “Terahertz spectroscopy and machine learning algorithm for non-destructive evaluation of protein conformation,” Optical and Quantum Electronics, vol. 52, 04 2020.
  • [31] X. Sun, J. Liu, K. Zhu, J. Hu, X. Jiang, and Y. Liu, “Generalized regression neural network association with terahertz spectroscopy for quantitative analysis of benzoic acid additive in wheat flour,” Royal Society Open Science, vol. 6, no. 7, p. 190485, 2019.
  • [32] T. Bowman, T. Chavez, K. Khan, J. Wu, A. Chakraborty, N. Rajaram, K. Bailey, and M. O. El-Shenawee, “Pulsed terahertz imaging of breast cancer in freshly excised murine tumors,” Journal of Biomedical Optics, vol. 23, no. 2, pp. 1–13, 2018.
  • [33] W. Xu, L. Xie, Z. Ye, W. Gao, Y. Yao, M. Chen, J. Qin, and Y. Ying, “Discrimination of transgenic rice containing the cry1ab protein using terahertz spectroscopy and chemometrics,” Scientific reports, vol. 5, p. 11115, 07 2015.
  • [34] Z. Chen, Z. Zhang, R. Zhu, Y. Xiang, Y. Yang, and P. B. Harrington, “Application of terahertz time-domain spectroscopy combined with chemometrics to quantitative analysis of imidacloprid in rice samples,” Journal of Quantitative Spectroscopy and Radiative Transfer, vol. 167, pp. 1 – 9, 2015.
  • [35] H. Ge, Y. Jiang, F. Lian, Y. Zhang, and S. Xia, “Quantitative determination of aflatoxin B1 concentration in acetonitrile by chemometric methods using terahertz spectroscopy,” Food Chemistry, vol. 209, pp. 286–292, 2016.
  • [36] H. Zhan, K. Zhao, H. Zhao, Q. Li, S. Zhu, and L. Xiao, “The spectral analysis of fuel oils using terahertz radiation and chemometric methods,” Journal of Physics D: Applied Physics, vol. 49, no. 39, p. 395101, sep 2016.
  • [37] W. Liu, C. Liu, J. Yu, Y. Zhang, J. Li, Y. Chen, and L. Zheng, “Discrimination of geographical origin of extra virgin olive oils using terahertz spectroscopy combined with chemometrics,” Food Chemistry, vol. 251, pp. 86 – 92, 2018.
  • [38] J. Liu, “Terahertz spectroscopy and chemometric tools for rapid identification of adulterated dairy product,” Optical and Quantum Electronics, vol. 49, Jan. 2017.
  • [39] X. Sun, J. Liu, K. Zhu, J. Hu, X. Jiang, and Y. Liu, “Generalized regression neural network association with terahertz spectroscopy for quantitative analysis of benzoic acid additive in wheat flour,” Royal Society open science, vol. 6, no. 7, p. 190485, 2019.
  • [40] D. Ye, W. Wang, H. Zhou, H. Fang, J. Huang, Y. Li, H. Gong, and Z. Li, “Characterization of thermal barrier coatings microstructural features using terahertz spectroscopy,” Surface and Coatings Technology, vol. 394, p. 125836, 2020.
  • [41] X. Sun, J. Liu, K. Zhu, J. Hu, X. Jiang, and Y. Liu, “Generalized regression neural network association with terahertz spectroscopy for quantitative analysis of benzoic acid additive in wheat flour,” Royal Society Open Science, vol. 6, no. 7, p. 190485, 2019.
  • [42] J. Cui, J. Zhang, C. Dong, D. Liu, and X. Huang, “An ultrafast and high accuracy calculation method for gas radiation characteristics using artificial neural network,” Infrared Physics & Technology, vol. 108, p. 103347, 2020.
  • [43] H. Sarieddeen, A. Abdallah, M. M. Mansour, M.-S. Alouini, and T. Y. Al-Naffouri, “Terahertz-band MIMO-NOMA: Adaptive superposition coding and subspace detection,” arXiv preprint arXiv:2103.02348, 2021.
  • [44] R. Ryniec, P. Zagrajek, and N. Palka, “Terahertz frequency domain spectroscopy identification system based on decision trees,” Acta Physica Polonica-Series A General Physics, vol. 122, no. 5, p. 891, 2012.
  • [45] M. H. Loukil, H. Sarieddeen, M. S. Alouini, and T. Y. Al-Naffouri, “Terahertz-band MIMO systems: Adaptive transmission and blind parameter estimation,” vol. 25, no. 2, pp. 641–645, 2021.
  • [46] K. O’Shea and R. Nash, “An introduction to convolutional neural networks,” arXiv preprint, Nov. 2015. [Online]. Available:
  • [47] S. Hochreiter, Y. Bengio, P. Frasconi, and J. Schmidhuber, Gradient flow in recurrent nets: The difficulty of learning long-term dependencies.   Wiley-IEEE Press, 2001, ch. 14, pp. 237–243.
  • [48] S. P. Chatzis and Y. Demiris, “Echo state gaussian process,” vol. 22, pp. 1435–1445, Sep. 2011.
  • [49] Q. Yang, Y. Liu, T. Chen, and Y. Tong, “Federated machine learning: Concept and applications,” ACM Trans. Intell. Syst. Technol., vol. 10, no. 2, pp. 1–19, Feb. 2019.
  • [50] D. Silver et al., “A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play,” Sci. J., vol. 362, no. 6419, p. 1140–1144, Dec. 2018.


Sara Helal is a senior Electrical and Computer Engineering student at Effat University, Jeddah, Saudi Arabia. Her research interests are in the areas of machine learning, signal and image processing, and wireless communications.

Hadi Sarieddeen (S’13-M’18) received his B.E. degree in Computer and Communications Engineering from Notre Dame University-Louaize, Lebanon, in 2013, and his Ph.D. degree in Electrical and Computer Engineering from the American University of Beirut (AUB), Lebanon, in 2018. He is currently a postdoctoral fellow at King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia. His research interests are in the areas of wireless communications and signal processing for wireless communications.

Hayssam Dahrouj (S’02-M’11-SM’15) received his Computer and Communications Engineering degree from AUB in 2005, and his Ph.D. degree in Electrical and Computer Engineering from the University of Toronto (UofT) in 2010. In July 2020, he joined the Center of Excellence for NEOM Research at KAUST as a senior research scientist. His main research interests include cloud radio access networks, cross-layer optimization, cooperative networks, convex optimization, distributed algorithms, machine learning, and optical communications networks.

Tareq Y. Al-Naffouri (M’10-SM’18) received his Ph.D. degree in Electrical Engineering from Stanford University in 2004. He is currently a Professor at the Electrical and Computer Engineering department at KAUST. His research interests lie in the areas of sparse, adaptive, and statistical signal processing, localization, machine learning, and their applications.

Mohamed-Slim Alouini (S’94-M’98-SM’03-F’09) was born in Tunis, Tunisia. He received his Ph.D. degree in Electrical Engineering from Caltech, Pasadena, CA, in 1998. He served as a faculty member at the University of Minnesota, Minneapolis, then at Texas A&M University at Qatar, Education City, Doha, Qatar before joining KAUST as a professor of Electrical Engineering in 2009. His current research interests include the modeling, design, and performance analysis of wireless communication systems.