I Introduction
I-A History and Motivation for THz Sensing
The introduction of terahertz (THz) technology has boosted research on the atomic behavior of materials, fostering diverse opportunities in wireless sensing [1] and imaging [2]. The THz band, which extends from 100 gigahertz (GHz) to 10 THz (wavelengths from 3 mm down to 0.03 mm), is sandwiched between the well-studied microwave and infrared bands of the electromagnetic spectrum. This far-infrared region is thus seen as a transition region where optics meets electronics. In the optical domain, THz radiation, also known as T-rays, is treated as a beam of light that can be manipulated by mirrors and lenses and whose intensity can be measured. In electronics, on the other hand, T-rays are treated as an electrical wave whose electric-field phase can be measured. Due to the lack of efficient and reliable sources, the THz band has long been perceived as the "THz gap". However, the development of high-energy, strong-field THz generation and detection devices is bridging this gap. In particular, THz generation and detection technologies [3] are classified as optical, electronic, hybrid photonic-electronic, and plasmonic (graphene-based).
THz wireless sensing leverages the inherent spectral fingerprints of molecules in the THz band for various sensing and imaging applications in the areas of quality control, food safety [4], security [5], health [6], astronomy [7], and environmental monitoring [8]. Owing to their low photon energy (approximately 4.1 meV at 1 THz), T-rays are non-ionizing; they can therefore analyze the content of biological bodies without posing significant harm. Many biological and chemical materials exhibit unique spectral fingerprint information in the THz range. For instance, THz vibrational modes enable the study of molecular structures and vibrational dynamics, such as scissoring, wagging, and twisting, that arise from both intermolecular and intramolecular interactions [9]. THz radiation also has special interactions and strong molecular coupling with hydrogen-bonded networks, which maximizes the potential of observing water dynamics [10]. Furthermore, T-rays can penetrate various non-conducting, amorphous, and dielectric materials and are highly reflected from metals, which gives T-rays the capability to inspect metallic components and detect weapons in security-related applications [5]. Most recently, THz technology has been celebrated as a key enabler for future wireless communications [11, 12], which introduces unique applications in the area of joint THz communications and sensing [13, 14]. Note that wireless sensing can also denote localization functionalities, such as in radar sensing. However, accurate THz-band localization is a promising research topic that is outside the scope of this paper.
I-B Comparison with Infrared and Microwave Bands
The THz band's sensing capabilities are superior to those of the neighboring microwave and infrared bands. In particular, THz spectroscopy allows faster analysis with higher precision than infrared spectroscopy, which is further degraded by aerosol-induced light scattering and weather conditions [15]. Many materials are transparent in the THz range. In the microwave region, however, most materials and living tissues are opaque, and the spectral data produce lower-definition images, despite being less sensitive to weather conditions (due to the larger wavelengths). In the THz region, spectral data can reconstruct higher-resolution images in arbitrary directions thanks to antenna array beamforming capabilities and reduced scattering loss. However, THz sensing and imaging also have shortcomings. For instance, at room temperature, blackbody radiation is dominant at THz frequencies, complicating imaging-based object detection. THz signals are also severely absorbed by water vapor, which makes reliable material identification at long distances challenging in high-humidity conditions. Moreover, the time response of a material in the reflected or absorbed THz radiation is calibrated by a sampling technique linked to the pulsed nature of THz spectroscopy, which limits its resolution. For temporal waveform acquisition, the spectral resolution is the inverse of the temporal window length; a tradeoff between spectral resolution and measurement time is thus unavoidable. Such promising gains and shortcomings motivate research on signal processing techniques for THz sensing.
I-C Advancements in THz Devices
A typical THz system comprises a source, a detector, and intermediate optical components, with some systems consisting of two or more of these elements. The source produces broad-spectrum THz radiation. The components, namely lenses, mirrors, waveguides, and polarizers, manipulate the radiation. A detector sensitive to T-rays then measures the radiation reaching it, as depicted in Fig. 1. THz systems are divided into two categories: pulsed-wave and continuous-wave. Both are suitable for spectroscopic applications in imaging and sensing. Pulsed THz systems generate short broadband or near-single-cycle THz signals, whereas continuous-wave THz systems generate narrow-linewidth and frequency-tunable THz waves. Continuous-wave electronic sources include frequency-multiplied Gunn diode oscillators and backward wave oscillators (BWOs), whereas promising THz photonic sources include photomixing techniques and cryogenically cooled quantum cascade lasers (QCLs).
THz time-domain spectroscopy (THz-TDS) is a powerful spectroscopic tool that has long been utilized for the characterization and identification of materials. Photoconductive generation and detection of short, broadband THz pulses is at the heart of THz-TDS systems. An ultrafast laser emits femtosecond optical pulses, which are converted into picosecond THz pulses, and photoconductive antennas serve as the emitter and detector. The generation and detection techniques mainly depend on the application.
I-D Significance of Signal Processing for THz Sensing
The literature lacks a holistic approach for THz signal processing for communications and sensing, despite the progress made on THz system design. It is still unclear whether we can mitigate quasi-optical THz propagation to mimic favorable microwave propagation characteristics and enable seamless connectivity and sensing. Consequently, the questions of whether THz sensing can extend beyond application-specific settings and whether joint THz sensing and communications are of real practical value remain open at this early stage of research on the topic.
The progress on THz channel modeling paves the way for solid research in signal processing techniques for THz wireless sensing. Such research is particularly significant because several challenges have not yet been addressed to achieve the full potential of THz technology. For example, THz signals are susceptible to high propagation losses and molecular absorption losses, which severely limit the achievable communication distances. Therefore, THz communications should be complemented by infrastructure-level and algorithmic-level enablers. At the infrastructure level, emerging beyond-5G technologies such as intelligent reflecting surfaces (IRSs) and ultra-massive multiple-input multiple-output (UM-MIMO) configurations [16] are vital to expand the THz communication gains and overcome the distance problem. At the algorithmic level, novel signal processing techniques can optimize infrastructure utilization to get around THz quasi-optical propagation limitations and realize seamless wireless sensing. Efficient signal processing techniques are crucial to mitigate the noise and hardware impairments that distort measurements and worsen the sensing performance. This paper, therefore, presents a holistic overview of signal processing and machine learning techniques for efficient THz sensing, with an emphasis on signal preprocessing, feature extraction, and classification techniques. The paper further highlights the role of deep learning techniques by exploring their promising sensing capabilities at the THz band.
I-E Contributions of this Work
Several review papers on THz sensing exist in the literature, such as [17, 18, 19, 20]. To the best of the authors' knowledge, however, this paper is the first to present a holistic overview of signal processing and machine learning techniques for efficient THz sensing. More specifically, the paper introduces a roadmap for future THz sensing use cases, with a special focus on the role of signal processing. We promote the importance of both THz-TDS and frequency-domain spectroscopy in future reconfigurable THz systems. The frequency-domain spectroscopy approach, in particular, introduces significant flexibility in the choice of carriers, thus converging on sensing information with the minimum required measurements. We address the effect of THz channels on sensing performance. We perform a numerical merit investigation in which simulations validate our analyses. Such scope is significantly different from the treatment of THz sensing in the literature, which mainly assumes controlled laboratory scenarios or specific use cases. We note that, at this early stage of research on the topic, the paper generates comprehensive numerical simulations using realistic data from existing databases [21, 22], which would pave the way towards physical experimentation in future studies. The paper's main contributions can be summarized as follows:
We overview the studied signal processing and machine learning techniques in detail, assuming a unified THz system model, which facilitates the analyses of performance and complexity tradeoffs. The presented overview particularly covers preprocessing (standard normal variate normalization, min-max normalization, and Savitzky-Golay filtering), feature extraction (principal component analysis, partial least squares, t-distributed stochastic neighbor embedding, and non-negative matrix factorization), and classification techniques (support vector machines, k-nearest neighbor, discriminant analysis, and naive Bayes).

We motivate the importance of machine learning solutions compared to conventional signal processing.

We motivate the importance of deep learning compared to conventional machine learning, especially in autonomous joint communications and sensing applications, where THz systems can generate large volumes of data at high rates.

We comment on the performance and complexity tradeoffs of the different techniques to arrive at clear recommendations.

We motivate and formulate joint THz communication and sensing system models.
I-F Organization of the Paper
The remainder of this paper is organized as follows. THz-TDS use cases and the corresponding spectroscopy system models are overviewed in Sec. II. Sec. III reviews the different methods for the preprocessing of THz spectral data. The qualitative (feature extraction) techniques and quantitative (classification) analysis of materials are then discussed in Sec. IV and Sec. V, respectively. We also investigate deep learning techniques that rely on raw recovered measurements in Sec. VI. Sec. VII highlights the relative performance and complexity tradeoffs of the illustrated techniques based on numerical simulations. We motivate prospective use cases and open research directions in Sec. VIII, detailing joint THz communication and sensing applications and highlighting the importance of carrier-based spectroscopy. We conclude the paper in Sec. IX.
Regarding notation, bold upper case, bold lower case, and lower case letters correspond to matrices, vectors, and scalars, respectively. $\Pr(A)$ is the probability that event $A$ occurs. $(\cdot)^{\mathsf{T}}$ and $(\cdot)^{-1}$ stand for the transpose and inverse, respectively. Scalar norms (or absolute values) are denoted by $|\cdot|$ and Frobenius norms are denoted by $\|\cdot\|_{F}$. The notation $j = \sqrt{-1}$ is the imaginary number.
II Terahertz Time-Domain Spectroscopy
Spectroscopy is a reliable technique for material and gas identification over a wide range of spectral data, allowing the extraction of material parameters without requiring multiple samples of different thicknesses. THz-TDS [23, 2] characterizes the refractive indices and absorption coefficients, which correspond to the static and transient optical properties of materials. In a conventional THz-TDS measurement, the temporal profiles of THz pulses are recorded twice, without and with the medium under examination. These pulses are referred to as the incident and transmitted pulses, or the reference and sample pulses, respectively. The recorded pulse traces correspond to the temporal dependence of the electric field associated with the THz waves, i.e., the transient THz electric field at the detector. The two profiles are then Fourier-transformed to obtain their complex-valued spectral behavior, where the ratio of the electric field strengths of the two pulses in the frequency domain determines the optical properties of the sample material. The complex spectrum yields amplitude and phase information and allows direct calculation of the frequency-dependent index of refraction, absorption coefficient, and sample thickness. Compared to conventional Fourier-transform-based infrared and Raman spectroscopy, THz-TDS provides more useful information and gives direct access to the electric field amplitude.
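The measurement procedure described above can be sketched numerically. The following Python snippet is a minimal illustration, not a definitive implementation: the Gaussian pulse shape, the sampling step, and the attenuation and delay attributed to the sample are synthetic placeholders, and `transfer_function` is a hypothetical helper name. It Fourier-transforms a reference and a sample pulse, takes their ratio, and reads off the amplitude and phase.

```python
import numpy as np

def transfer_function(ref_pulse, sam_pulse, dt):
    """Complex ratio of the sample spectrum to the reference spectrum."""
    freqs = np.fft.rfftfreq(len(ref_pulse), d=dt)   # frequency axis in Hz
    ref_spec = np.fft.rfft(ref_pulse)
    sam_spec = np.fft.rfft(sam_pulse)
    return freqs, sam_spec / ref_spec

# Synthetic pulses (assumed values, for illustration only).
dt = 0.05e-12                        # 50 fs sampling step
t = np.arange(1024) * dt             # 51.2 ps time window
width = 0.3e-12                      # pulse width parameter

def gaussian_pulse(center):
    return np.exp(-((t - center) / width) ** 2)

attenuation, delay = 0.6, 1.0e-12    # hypothetical sample response
ref_pulse = gaussian_pulse(10e-12)
sam_pulse = attenuation * gaussian_pulse(10e-12 + delay)

freqs, ratio = transfer_function(ref_pulse, sam_pulse, dt)

# Keep only the band where the reference spectrum carries usable energy.
band = (freqs > 0.2e12) & (freqs < 2.0e12)
amplitude = np.abs(ratio[band])
phase = np.unwrap(np.angle(ratio[band]))

# A pure delay gives a phase of -2*pi*f*delay, so the slope recovers the delay.
slope = np.polyfit(freqs[band], phase, 1)[0]
estimated_delay = -slope / (2 * np.pi)
```

Restricting the ratio to a band with sufficient reference energy is a design choice of this sketch: outside that band, the division amplifies numerical noise.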
II-A Transmission- and Reflection-Based THz-TDS
The necessary spectroscopic measurements that provide the diagnostic tools for determining the optical constants of materials are transmission-based and reflection-based. Transmission spectroscopy is correlated with absorption spectroscopy, which is based on analyzing the amount of light absorbed by a sample material at a given wavelength. On the other hand, reflection spectroscopy studies the reflected or scattered light from a material as a function of wavelength. For both reflection and transmission measurements, the Fresnel equations describe the energy transfer at the interface between two media, usually air and another homogeneous material. Most THz sensing and imaging applications are conducted in a transmission geometry (absorbance THz spectroscopy), owing to its simple setup design and the high contrast of the transmitted THz signal. Several reasons, however, make THz reflection-based TDS more appealing for material identification. Reflection geometry is capable of measuring the spectrum of thick or highly absorptive materials that are opaque in the THz band. In particular, reflection geometry makes it possible to detect targets on non-transparent substrates, such as threat items concealed in opaque envelopes and paint on the bodies of vehicles. Furthermore, reflection spectroscopy is capable of reconstructing full three-dimensional (3D) images of objects, where the short THz pulse duration provides a precise means of calculating the distance to reflecting surfaces. Despite providing low-contrast spectroscopic imaging, reflection geometry offers a unique solution to applications at long distances, such as standoff distance THz sensing and imaging in open field environments.
II-B Gas, Liquid, and Solid THz-TDS
THz-TDS supports inspection and identification of different states of matter through electromagnetic interactions with gas, liquid, and solid components:

Gas Spectroscopy [15]: Used to detect air pollution, flammable gases, atmospheric pressure, and humidity, or to monitor molecules blended in unfavorable aerosols like smoke, fog, haze, dust, fume, and others.

Liquid Spectroscopy [24]: Used to characterize liquid and aqueous solutions. The dielectric properties of liquids depend on interactions and dipole relaxations. T-rays are sensitive to relaxational, oscillatory, and collective motions and can be used to study ionic liquid solutions. The high sensitivity to water and the non-destructive penetration abilities of THz radiation are appealing for a plethora of liquid spectroscopy applications, such as the identification of water content, alcohol, liquid fuels, and petrochemicals. THz-TDS can also investigate the color, carbonation, and flavor of commercial beverages.

Solid Spectroscopy [25]: Used to investigate a wide variety of spectral fingerprints of molecular solids such as explosives, dielectrics, polymers, ferroelectrics, semiconductors, and photonic crystals, for security screening, quality inspection, or pharmaceutical applications. Several THz-TDS methods can quantify the concentration, index of refraction, absorption peak, dielectric constants, and polarizability of materials. There are also different kinds of low-energy excitations in solid materials that THz-TDS can successfully detect.
II-C THz-TDS Imaging
In THz material sensing, the inspection is not consistently feasible if the samples under investigation have no apparent features such as absorption peaks. Nevertheless, the capability to form images and provide spectral information by exploiting other unique aspects of T-rays enables alternative THz-based material characterization techniques. In particular, there has been renewed interest in THz time-domain spectroscopic imaging because of the low-power and low-energy interaction of THz waves with some materials [2]. The image data and spectral information can determine the type and the shape of the material. For instance, a non-contact method uses THz pulse imaging to quantify the thickness of coated pharmaceutical tablets [26]. By setting up a THz-TDS system with an intermediate focus, one can measure the THz waveform of the pulse traversing an object placed at the focal plane. Because of low THz scattering, THz-TDS imaging systems produce high-contrast images that enable effective material analysis.
There are several THz-TDS imaging methods, each of which can extract different relevant information from samples. Such methods are amplitude-based, phase-based, or a combination of the two. With amplitude imaging, the magnitude of the THz waveform is measured using a fast numerical Fourier transform over a particular frequency band to extract the peak signal of the waveform at each pixel. To form an image pixel by pixel, the transmitted THz wave is measured for each position of the object under study. After THz signal acquisition, THz imaging requires the processing of the entire waveform. Each waveform holds a wealth of information (both in amplitude and phase) that can be extracted at each image pixel with integrated digital signal processors; detailed chemical or physical information can thus be obtained at each pixel. In the context of this paper, signal processing for THz sensing does not account for imaging techniques, which we rather consider an extension of the current work.
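II-D Reflection Spectroscopy System Model
THz reflection spectroscopy can measure the real and imaginary parts of both the refractive index and the dielectric constant of a sample material to obtain distinct absorption coefficients. The Kramers-Krönig transform technique can be applied to calculate the sample's optical conductivity, and the reflection coefficient can be determined using Fresnel's equations. When a THz beam propagating in a uniform medium (air) encounters a material with complex refractive index $\hat{n}$ at incident angle $\theta_i$, the ratio of the reflected beam to the incident beam yields the complex reflectivity of the s-polarized component (denoted $\hat{r}_s$) and the p-polarized component (denoted $\hat{r}_p$). We have
$$\hat{r}_s = \frac{E_{r,s}}{E_{i,s}} = \frac{n_{\mathrm{air}}\cos\theta_i - \hat{n}\cos\theta_t}{n_{\mathrm{air}}\cos\theta_i + \hat{n}\cos\theta_t}, \tag{1}$$
$$\hat{r}_p = \frac{E_{r,p}}{E_{i,p}} = \frac{\hat{n}\cos\theta_i - n_{\mathrm{air}}\cos\theta_t}{\hat{n}\cos\theta_i + n_{\mathrm{air}}\cos\theta_t}, \tag{2}$$
where $E_{r,s}$ and $E_{i,s}$ are the reflected and incident beams of the s-polarized component, and $E_{r,p}$ and $E_{i,p}$ are the reflected and incident beams of the p-polarized component, respectively. The refractive index of air is $n_{\mathrm{air}} = 1$, and for both polarizations we have $n_{\mathrm{air}}\sin\theta_i = \hat{n}\sin\theta_t$ (the incident angle $\theta_i$ and the refracted angle $\theta_t$ are related via Snell's law). Reflection at normal incidence facilitates deriving the reflection coefficient through the complex refractive index of the material, $\hat{n} = n + j\kappa$, with $\kappa$ being the extinction coefficient, and where, with the sign convention of (2) at $\theta_i = \theta_t = 0$, we have
$$\hat{r} = \frac{\hat{n} - 1}{\hat{n} + 1} = \frac{n + j\kappa - 1}{n + j\kappa + 1}. \tag{3}$$
Therefore, the reflection coefficient is $\hat{r} = \sqrt{R}\,e^{j\phi}$, where $R$ is the reflectance. Assuming both the phase shift $\phi$ and $R$ are known, the refractive index and the extinction coefficient are calculated as
$$n = \frac{1 - R}{1 + R - 2\sqrt{R}\cos\phi}, \tag{4}$$
$$\kappa = \frac{2\sqrt{R}\sin\phi}{1 + R - 2\sqrt{R}\cos\phi}. \tag{5}$$
The absorption coefficient can then be computed as
$$\alpha = \frac{4\pi f \kappa}{c}, \tag{6}$$
where $f$ is the frequency, and $c$ is the speed of light in vacuum.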
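As a minimal sketch of the normal-incidence inversion, the snippet below recovers the refractive index, extinction coefficient, and absorption coefficient from a reflectance and a phase shift. It assumes the conventions $\hat{n} = n + j\kappa$ and $\hat{r} = (\hat{n} - 1)/(\hat{n} + 1) = \sqrt{R}\,e^{j\phi}$; the function name and the sample values are hypothetical and serve only as a round-trip check.

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum (m/s)

def optical_constants_from_reflection(R, phi, f):
    """Invert r = sqrt(R) * exp(j*phi) = (n_hat - 1)/(n_hat + 1), n_hat = n + j*kappa."""
    sqrt_R = np.sqrt(R)
    denom = 1.0 + R - 2.0 * sqrt_R * np.cos(phi)
    n = (1.0 - R) / denom
    kappa = 2.0 * sqrt_R * np.sin(phi) / denom
    alpha = 4.0 * np.pi * f * kappa / C        # absorption coefficient (1/m)
    return n, kappa, alpha

# Round trip with assumed material values (illustrative, not measured).
n_true, kappa_true, freq = 1.8, 0.05, 1.0e12
n_hat = n_true + 1j * kappa_true
r = (n_hat - 1.0) / (n_hat + 1.0)              # forward model at normal incidence
R, phi = np.abs(r) ** 2, np.angle(r)

n_est, kappa_est, alpha_est = optical_constants_from_reflection(R, phi, freq)
```

The opposite sign convention for the reflection coefficient or for the imaginary part of $\hat{n}$ would flip the sign of the recovered $\kappa$, so the convention should be fixed before applying such an inversion to measured data.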
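II-E Transmission Spectroscopy System Model
Transmission THz-TDS probes collective vibrational modes in materials. In particular, the frequency-dependent complex refractive index can be measured entirely from the time-dependent electric field of the THz waveform without the need for Kramers-Krönig transform equations. Using the sample thickness $d$ and the complex refractive index $\hat{n} = n + j\kappa$, the amplitude of the THz transmittance, denoted by $\mathcal{T}(f)$, can be obtained for a weakly absorbing sample ($\kappa \ll n$) as
$$\mathcal{T}(f) = \left|\frac{E_t(f)}{E_i(f)}\right| = \frac{4n}{(n+1)^2}\exp\left(-\frac{2\pi f \kappa d}{c}\right), \tag{7}$$
where $E_i$ and $E_t$ are the incident and transmitted amplitudes, respectively, $f$ is the frequency, and $c$ is the speed of light in vacuum. Consequently, we can obtain the extinction coefficient $\kappa$ and the refractive index $n$ as
$$\kappa = \frac{c}{2\pi f d}\ln\left(\frac{4n}{\mathcal{T}(f)\,(n+1)^2}\right), \tag{8}$$
$$n = 1 + \frac{c\,\phi(f)}{2\pi f d}, \tag{9}$$
where $\phi(f)$ is the phase difference between the reference and sample. The absorption coefficient of the sample material can then be calculated as
$$\alpha(f) = \frac{4\pi f \kappa}{c} = \frac{2}{d}\ln\left(\frac{4n}{\mathcal{T}(f)\,(n+1)^2}\right). \tag{10}$$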
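The extraction of the optical constants from a transmission measurement can be sketched as follows. This is an illustrative sketch under stated assumptions: a weakly absorbing slab, no multiple internal reflections, a transmittance amplitude of the form $\frac{4n}{(n+1)^2}e^{-2\pi f \kappa d / c}$, and a phase difference $\phi = 2\pi f d (n-1)/c$. The function name and the numerical values are hypothetical.

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum (m/s)

def optical_constants_from_transmission(T_amp, phi, f, d):
    """Recover n, kappa and alpha from the transmittance amplitude and phase difference."""
    omega = 2.0 * np.pi * f
    n = 1.0 + C * phi / (omega * d)                           # from the phase delay
    kappa = C / (omega * d) * np.log(4.0 * n / (T_amp * (n + 1.0) ** 2))
    alpha = 2.0 * omega * kappa / C                           # equals 4*pi*f*kappa/c
    return n, kappa, alpha

# Round trip with assumed sample parameters (illustrative, not measured).
n_true, kappa_true = 1.5, 0.01
freq, thickness = 1.0e12, 1.0e-3                              # 1 THz, 1 mm
omega = 2.0 * np.pi * freq
phi = omega * thickness * (n_true - 1.0) / C
T_amp = 4.0 * n_true / (n_true + 1.0) ** 2 * np.exp(-omega * kappa_true * thickness / C)

n_est, kappa_est, alpha_est = optical_constants_from_transmission(T_amp, phi, freq, thickness)
```

In practice the measured phase must be unwrapped before it is used, since the phase difference across a millimeter-thick sample spans several multiples of $2\pi$ at THz frequencies.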
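In gas spectroscopy, the absorption coefficient $\mathcal{K}(f)$ can be extracted from the path gain, which is expressed as
$$\beta(f, d) = \frac{c}{4\pi f d}\exp\left(-\frac{1}{2}\mathcal{K}(f)\,d\right), \tag{11}$$
where $d$ is the distance between the transmitter and receiver. $\mathcal{K}(f)$ is the sum of the absorption contributions from the isotopes ($i$) of the gases ($g$) that the medium is composed of:
$$\mathcal{K}(f) = \sum_{i,g}\mathcal{K}^{i,g}(f). \tag{12}$$
Using radiative transfer theory, the molecular absorption coefficient for an individual isotope can be expressed as
$$\mathcal{K}^{i,g}(f) = \frac{p}{p_0}\,\frac{T_{\mathrm{STP}}}{T}\,\frac{p}{RT}\,q^{i,g}\,N_A\,\sigma^{i,g}(f), \tag{13}$$
where $p$ is the system pressure, $p_0$ is the reference pressure, $T$ is the temperature, $T_{\mathrm{STP}}$ is the temperature at standard pressure, $R$ is the gas constant, $q^{i,g}$ is the mixing ratio for the isotope $i$ of gas $g$, $N_A$ is the Avogadro constant, and $\sigma^{i,g}(f)$ is the absorption cross section that is given as
$$\sigma^{i,g}(f) = S^{i,g}\,\frac{f}{f_c^{i,g}}\,\frac{\tanh\left(\frac{hf}{2k_BT}\right)}{\tanh\left(\frac{hf_c^{i,g}}{2k_BT}\right)}\,F^{i,g}(f), \tag{14}$$
where $F^{i,g}(f)$ is the Van Vleck-Weisskopf line shape, $S^{i,g}$ is the absorption strength, $h$ is the Planck constant, and $k_B$ is the Boltzmann constant. $f_c^{i,g} = f_{c0}^{i,g} + \delta^{i,g}\,p/p_0$ is the resonant frequency of the isotope $i$ of gas $g$, where $f_{c0}^{i,g}$ is the zero-pressure resonant frequency and $\delta^{i,g}$ is the linear pressure shift. The Van Vleck-Weisskopf line shape can be further decomposed as a function of the Lorentz half-width $\alpha_L^{i,g}$ for gas $g$ as follows:
$$F^{i,g}(f) = 100\,c\,\frac{\alpha_L^{i,g}}{\pi}\,\frac{f}{f_c^{i,g}}\left[\frac{1}{\left(f - f_c^{i,g}\right)^2 + \left(\alpha_L^{i,g}\right)^2} + \frac{1}{\left(f + f_c^{i,g}\right)^2 + \left(\alpha_L^{i,g}\right)^2}\right], \tag{15}$$
where the factor $100\,c$ accounts for the wavenumber (cm$^{-1}$) units of the tabulated line parameters. The Lorentz half-width ($\alpha_L^{i,g}$) can be obtained as a function of the broadening coefficients of both air ($\alpha_0^{\mathrm{air}}$) and the isotope $i$ of gas $g$ ($\alpha_0^{i,g}$) as
$$\alpha_L^{i,g} = \left[\left(1 - q^{i,g}\right)\alpha_0^{\mathrm{air}} + q^{i,g}\,\alpha_0^{i,g}\right]\left(\frac{p}{p_0}\right)\left(\frac{T_0}{T}\right)^{\gamma}, \tag{16}$$
where $\gamma$ is the temperature broadening coefficient and $T_0$ is the reference temperature. Note that all the aforementioned parameters can be extracted from the high-resolution transmission molecular absorption (HITRAN) database [21]. Although the complexity of the radiative transfer theory approach is high, the existing alternative gas absorption models are not accurate enough for sensing purposes; they are rather approximations that can be used in THz communication scenarios [12].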
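To illustrate how an absorption coefficient could be read off a measured path gain, the following sketch inverts a line-of-sight amplitude gain. The gain model used here, free-space spreading $c/(4\pi f d)$ multiplied by $\exp(-\mathcal{K}(f)\,d/2)$, is an assumption of this example, as are the function names and the numerical values.

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum (m/s)

def path_gain(absorption, f, d):
    """Assumed LoS amplitude gain: spreading loss times molecular absorption loss."""
    return C / (4.0 * np.pi * f * d) * np.exp(-0.5 * absorption * d)

def absorption_from_path_gain(gain, f, d):
    """Invert the assumed gain model for the absorption coefficient (1/m)."""
    spreading = C / (4.0 * np.pi * f * d)
    return -2.0 / d * np.log(np.abs(gain) / spreading)

# Round trip with hypothetical values.
freq, distance, k_true = 0.55e12, 10.0, 0.01
gain = path_gain(k_true, freq, distance)
k_est = absorption_from_path_gain(gain, freq, distance)
```

If the measured quantity is a power gain rather than an amplitude gain, the factor of 2 in the inversion disappears; the sketch would need to be adapted accordingly.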
In our system model, for material sensing, we assume the THz transmission spectra to be accumulated in a vector $\mathbf{x}^{\mathrm{tr}} \in \mathbb{R}^{N}$ of transmittance values at $N$ frequency points, where $\mathbf{x}^{\mathrm{tr}}$ represents the input data. For the gas sensing scenario, the absorption coefficient spectra are accumulated in a vector $\mathbf{x}^{\mathrm{abs}} \in \mathbb{R}^{N}$, where $\mathbf{x}^{\mathrm{abs}}$ is used as input data. For simplicity, in the remainder of this paper, we drop the superscripts $\mathrm{tr}$ and $\mathrm{abs}$ when the discussion applies equally to both absorption and transmission spectra. We next detail the signal processing and machine learning techniques that can be applied on $\mathbf{x}$ to retrieve sensing information. We first overview the preprocessing techniques in Sec. III. Then, we present the feature extraction techniques, quantitative analysis of materials, and deep learning prospects in Sec. IV, Sec. V, and Sec. VI, respectively. Table I lists sample works from the literature that utilize a variety of signal processing techniques for different THz sensing applications.
III Signal Preprocessing Techniques
Perfect recovery of the THz signal at the detector is vital in spectroscopic systems. However, several factors prevent such ideal conditions. For instance, atmospheric attenuation significantly influences the propagating THz radiation in the air. Consequently, loss of information, slope variations, baseline shifts, and redundancy in the acquired data are expected. The quality of classification depends not only on the effectiveness of the feature extraction techniques but also on the preprocessed data’s quality and quantity. Employing signal preprocessing techniques for spectroscopic THz systems can thus significantly improve the accuracy, noise robustness, and resolution of THz sensing and imaging systems.
We denote the spectral output after applying preprocessing techniques on an input $\mathbf{x}$ by the vector $\tilde{\mathbf{x}}$. Out of a vast range of signal preprocessing techniques, a few methods have been more commonly applied to THz spectral data, namely, Savitzky-Golay (SG) filtering, detrending (DT), first derivative (FD), standard normal variate (SNV), baseline correction (BC), wavelet transform, and min-max normalization. Other spectral pretreatment techniques have also been studied to preprocess THz and near-infrared (NIR) spectra, such as the logarithmic transform, mean centering (MC), and multiplicative scatter correction (MSC). In this work, we detail and implement SNV, min-max normalization, and SG filtering.
III-1 Standard normal variate
In material identification experiments, normalization is widely used to compensate for variations in the optical properties of the sample surface, such as density, scattering, and roughness/smoothness. SNV correction and normalization pretreatment is a transformation that eliminates scatter and baseline drift effects from the spectral data by centering and scaling individual spectra. The average $\bar{x}$ and the standard deviation $\sigma_x$ of all data points of the given spectrum are first calculated. Then, the SNV transformation centers and scales the spectrum by subtracting the mean from every data point and dividing the result by the standard deviation. The SNV-processed spectral values, $\tilde{x}_n$ ($n = 1, \ldots, N$), are computed as
$$\tilde{x}_n = \frac{x_n - \bar{x}}{\sigma_x}. \tag{17}$$
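A minimal sketch of the SNV transformation is given below, applied row-wise so that each spectrum is centered and scaled by its own statistics. The function name `snv`, the use of the sample standard deviation (`ddof=1`), and the random test spectra are assumptions of this example.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) individually."""
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)   # sample standard deviation (assumed)
    return (spectra - mean) / std

# Synthetic spectra (random placeholders, not measured data).
rng = np.random.default_rng(0)
raw = rng.random((3, 50))                 # 3 spectra, 50 frequency points each
processed = snv(raw)

# A multiplicative scatter factor and an additive baseline offset are removed.
distorted = snv(2.5 * raw + 0.7)
```

Because the statistics are computed per spectrum, any positive gain and constant offset applied to a spectrum leave its SNV output unchanged, which is the property exploited to suppress scatter and baseline drift.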
III-2 Min-max normalization
Min-max normalization is another preprocessing scaling technique that is frequently used in THz and NIR spectroscopy for constraining the range of each input feature of the spectrum. The features or spectral data are most often rescaled to fit a target range of $[0, 1]$ or $[-1, 1]$; all the data relationships are preserved, and no bias is introduced. Such normalization allows more flexibility in designing classifiers and determining which features are more prominent. For the target range $[0, 1]$, the rescaling is given by
$$\tilde{x}_n = \frac{x_n - x_{\min}}{x_{\max} - x_{\min}}, \tag{18}$$
where $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of the feature data, respectively.
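The rescaling can be sketched in a few lines. The function name `min_max` and its `lo`/`hi` arguments are hypothetical; mapping to a general target range is a straightforward affine extension of the $[0, 1]$ case.

```python
import numpy as np

def min_max(x, lo=0.0, hi=1.0):
    """Rescale x linearly so that its minimum maps to lo and its maximum to hi."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    return lo + (hi - lo) * (x - x_min) / (x_max - x_min)

values = np.array([2.0, 4.0, 6.0, 10.0])    # illustrative feature values
unit_range = min_max(values)                # [0.0, 0.25, 0.5, 1.0]
symmetric_range = min_max(values, -1.0, 1.0)
```

Note that the sketch assumes a non-constant input; a constant spectrum would lead to a division by zero and needs separate handling.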
III-3 Savitzky-Golay filter
SG filtering is a simplified least-squares-fit convolution low-pass filter, primarily used for noise reduction and smoothing. The SG filter improves the signal-to-noise ratio (SNR) without strongly distorting or deforming the signal. The commonly used SG algorithm consists of approximating the signal in a window (or frame) by a polynomial function of a certain degree, which is constructed by the least-squares method. The best-fit polynomial order and frame size can be estimated by trial and error. However, the polynomial filtering coefficients can be calculated in advance since they are independent of the input data, ensuring high computational efficiency. The SG-smoothed data is computed as
$$\tilde{x}_n = \sum_{i=-m}^{m} h_i\,x_{n+i}, \tag{19}$$
where $h_i$ is the (normalized) smoothing coefficient and $W = 2m + 1$ is the number of data points in the smoothing frame; $m$ is the half-width of the smoothing frame.
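The fact that the SG coefficients depend only on the frame size and polynomial order, and not on the data, can be illustrated with a short NumPy sketch. The helper names are hypothetical, and the sketch returns only the interior points for which a full frame is available; edge handling is left out.

```python
import numpy as np

def savgol_coefficients(half_width, poly_order):
    """Least-squares smoothing coefficients h_i for offsets i = -m, ..., m."""
    offsets = np.arange(-half_width, half_width + 1)
    design = np.vander(offsets, poly_order + 1, increasing=True)  # columns: i^0, i^1, ...
    # The smoothed value is the fitted polynomial at offset 0, i.e. its constant term,
    # which is given by the first row of the pseudo-inverse of the design matrix.
    return np.linalg.pinv(design)[0]

def savgol_smooth(x, half_width, poly_order):
    """Apply the filter to the interior points of x (no edge handling)."""
    h = savgol_coefficients(half_width, poly_order)
    # np.convolve flips its second argument, so reverse h to obtain sum_i h_i * x[n+i].
    return np.convolve(np.asarray(x, dtype=float), h[::-1], mode="valid")

coeffs = savgol_coefficients(half_width=2, poly_order=2)   # classic 5-point quadratic filter

# A signal that is itself a quadratic polynomial is reproduced exactly.
t = np.arange(20, dtype=float)
quadratic = t ** 2 - 3.0 * t + 1.0
smoothed = savgol_smooth(quadratic, half_width=2, poly_order=2)
```

For the five-point quadratic case, the coefficients reduce to the well-known values $(-3, 12, 17, 12, -3)/35$, which can be precomputed once and reused for every spectrum.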
IV Feature Extraction
Inconclusive classification results can be obtained in THz spectroscopy due to unquantifiable scattering effects which form spurious structures. To mitigate such THz-TDS problems, following signal preprocessing, we target extracting critical features from the THz spectral data. In particular, we consider the following feature extraction techniques: principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE), and non-negative matrix factorization (NMF). Following our system model, feature extraction techniques operate on an initial set of THz spectral data in a matrix $\mathbf{X} \in \mathbb{R}^{M \times N}$, or a preprocessed matrix $\tilde{\mathbf{X}} \in \mathbb{R}^{M \times N}$, where $M$ denotes the number of data samples (observations) and $N$ corresponds to a range of THz features (variables) that are subsequently used as inputs to the classifiers. The dataset matrix is expressed as
$$\mathbf{X} = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,N} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,N} \\ \vdots & \vdots & \ddots & \vdots \\ x_{M,1} & x_{M,2} & \cdots & x_{M,N} \end{bmatrix}, \tag{20}$$
and the output feature vector is expressed as $\mathbf{y} = [y_1, y_2, \ldots, y_K]^{\mathsf{T}}$, where $K \leq N$.
IV-A Principal Component Analysis
PCA is a dimensionality reduction technique that reduces a multidimensional dataset of many correlated variables into a smaller set with few comprehensive indicators. The new indicators are known as principal components (PCs). In THz material classification, PCA is abundantly used as an unsupervised, non-parametric method for extracting relevant data features and eliminating overlapping data from original THz spectral datasets. The matrix $\mathbf{X}$ is converted into a vector of synthesis indicators (PCs), $\mathbf{y} = [y_1, y_2, \ldots, y_K]^{\mathsf{T}}$, the entries of which are sorted in descending order of their respective variances. The PCs correspond to the eigenvectors of the $K$ largest eigenvalues of the covariance matrix of $\mathbf{X}$.
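The eigen-decomposition view of PCA described above can be sketched as follows. The function name `pca`, the column centering, the $1/(M-1)$ covariance normalization, and the synthetic data are assumptions of this example rather than choices stated in the text.

```python
import numpy as np

def pca(X, n_components):
    """Project the rows of X onto the eigenvectors of the largest covariance eigenvalues."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)                      # remove the mean of each feature
    cov = centered.T @ centered / (X.shape[0] - 1)     # N x N sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)             # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]   # indices of the K largest
    components = eigvecs[:, order]                     # N x K loading matrix
    scores = centered @ components                     # M x K principal-component scores
    return scores, components, eigvals[order]

# Synthetic correlated data standing in for THz spectra (random placeholders).
rng = np.random.default_rng(0)
mixing = rng.normal(size=(5, 5))
X = rng.normal(size=(200, 5)) @ mixing
scores, components, variances = pca(X, n_components=2)
```

The variance of each score column equals the corresponding eigenvalue, which is why sorting the eigenvalues in descending order also sorts the PCs by the variance they explain.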
IV-B t-Distributed Stochastic Neighbor Embedding
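t-SNE is a nonlinear learning method that represents high-dimensional data in a low-dimensional space by optimally positioning data points in a projection map. The t-SNE algorithm utilizes the joint probability distribution between high-dimensional data points and their corresponding synthetic data points in a low-dimensional space, minimizing the Kullback-Leibler (KL) divergence to obtain the optimal low-dimensional data. For the same input $N$-dimensional data $\mathbf{X}$, with $\mathbf{x}_i$ denoting the $i$th observation, the similarity conditional probabilities are first computed as
$$p_{j|i} = \frac{\exp\left(-\|\mathbf{x}_i - \mathbf{x}_j\|^2 / 2\sigma_i^2\right)}{\sum_{k \neq i}\exp\left(-\|\mathbf{x}_i - \mathbf{x}_k\|^2 / 2\sigma_i^2\right)}, \tag{21}$$
$$p_{ij} = \frac{p_{j|i} + p_{i|j}}{2M}, \tag{22}$$
where $p_{j|i}$ is the similarity of data point $\mathbf{x}_j$ to data point $\mathbf{x}_i$ in $\mathbf{X}$, and $\sigma_i^2$ is the variance of the Gaussian-distributed THz spectral value centered over $\mathbf{x}_i$. Using a heavy-tailed Student t-distribution with one degree of freedom in the low-dimensional space, the joint probability $q_{ij}$ between synthetic data points $\mathbf{y}_i$ and $\mathbf{y}_j$ in $\mathbf{Y}$ is calculated as
$$q_{ij} = \frac{\left(1 + \|\mathbf{y}_i - \mathbf{y}_j\|^2\right)^{-1}}{\sum_{k \neq l}\left(1 + \|\mathbf{y}_k - \mathbf{y}_l\|^2\right)^{-1}}. \tag{23}$$
Then, the divergence between the synthetic data points in the low-dimensional space and their corresponding data points in the high-dimensional space is minimized as
$$\min_{\mathbf{Y}}\ \mathrm{KL}(P\,\|\,Q) = \sum_{i}\sum_{j} p_{ij}\log\frac{p_{ij}}{q_{ij}}, \tag{24}$$
with $p_{ii} = q_{ii} = 0$. Finally, the KL divergence is optimized via the gradient descent method as:
$$\frac{\partial\,\mathrm{KL}(P\,\|\,Q)}{\partial \mathbf{y}_i} = 4\sum_{j}\left(p_{ij} - q_{ij}\right)\left(\mathbf{y}_i - \mathbf{y}_j\right)\left(1 + \|\mathbf{y}_i - \mathbf{y}_j\|^2\right)^{-1}. \tag{25}$$
The corresponding t-SNE output is denoted by $\mathbf{Y} \in \mathbb{R}^{M \times K}$.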
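The quantities involved in t-SNE can be sketched compactly in NumPy. This is a simplified illustration rather than a full implementation: it assumes a single fixed Gaussian bandwidth `sigma` for all points (practical implementations calibrate one bandwidth per point from a perplexity target), uses plain gradient descent without momentum or early exaggeration, and runs on synthetic two-cluster data. All function names are hypothetical.

```python
import numpy as np

def squared_distances(Z):
    """Pairwise squared Euclidean distances between the rows of Z."""
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def joint_p(X, sigma):
    """Symmetrized high-dimensional similarities p_ij for a fixed bandwidth."""
    M = X.shape[0]
    cond = np.exp(-squared_distances(X) / (2.0 * sigma ** 2))
    np.fill_diagonal(cond, 0.0)                   # p_{i|i} = 0
    cond /= cond.sum(axis=1, keepdims=True)       # row i holds p_{j|i}
    return (cond + cond.T) / (2.0 * M)

def joint_q(Y):
    """Student-t similarities q_ij and the unnormalized kernel (1 + d^2)^-1."""
    kernel = 1.0 / (1.0 + squared_distances(Y))
    np.fill_diagonal(kernel, 0.0)                 # q_ii = 0
    return kernel / kernel.sum(), kernel

def kl_divergence(P, Q):
    mask = P > 0
    return np.sum(P[mask] * np.log(P[mask] / Q[mask]))

def kl_gradient(P, Q, kernel, Y):
    """Gradient of the KL divergence with respect to every low-dimensional point."""
    A = (P - Q) * kernel
    return 4.0 * (np.diag(A.sum(axis=1)) - A) @ Y

# Two synthetic clusters in five dimensions (random placeholders).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (15, 5)), rng.normal(4.0, 1.0, (15, 5))])
P = joint_p(X, sigma=2.0)

Y = rng.normal(0.0, 1e-2, (30, 2))                # random low-dimensional start
learning_rate = 10.0                              # assumed step size
for _ in range(200):
    Q, kernel = joint_q(Y)
    Y -= learning_rate * kl_gradient(P, Q, kernel, Y)
```

The matrix form used in `kl_gradient` is an equivalent rewriting of the per-point sum over neighbors, which avoids an explicit double loop.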
| Materials | THz Range | THz-TDS Setup Details | Feature Extraction | Classification | Ref. |
|---|---|---|---|---|---|
| Amino acids, saccharides, and inorganic substances | 0.9-6 THz | GaAs photoconductive antenna, femtosecond laser, transmission mode | PCA | Fuzzy pattern | [27] |
| Flavonols (myricetin, quercetin, and kaempferol) | 0.6-2.7 THz | Mode-locked Ti:sapphire laser, transmission mode | PLS | Random forest, LS-SVM | [28] |
| Rice | 0-6.4 THz | TERA K15 all-fiber-coupled spectrometer, femtosecond laser, transmission mode | PCA | PLS-DA, SVM, BPNN | [29] |
| Protein (bovine serum albumin) | 0.2-1.2 THz | LT-GaAs photoconductive antenna, femtosecond laser, lock-in amplifier | PCA, t-SNE | Random forest, NB, SVM, XGBoost | [30] |
| Benzoic acid | 1.6-2.8 THz | TAS7500SU system, transmission mode, ultrashort-pulse fibre lasers | PCA | GRNN, BPNN | [31] |
| Pure analytes (citric acid, fructose, and lactose) | 0.05-3 THz | TPS Spectra 3000, mode-locked Ti:sapphire laser, transmission mode | PCA | PLSR, ANN | [32] |
| Transgenic rice and Cry1Ab protein | 0.1-2.6 THz | Z3 THz-TDS system, LT-GaAs photoconductive antenna, ZnTe electro-optical crystal detector | PCA | PLSR, DA | [33] |
| Rice and imidacloprid pesticide | 0.3-1.7 THz | Mode-locked Ti:sapphire laser, femtosecond laser, ZnTe photoconductive antenna, transmission mode | PLS | SVR | [34] |
| Aflatoxin B1 in acetonitrile solution | 0.4-1.6 THz | Mode-locked Ti:sapphire laser, photoconductive switches, transmission mode | PCA, PLS, PCR | SVM | [35] |
| Fuel oils (lubricant, gasoline, and diesel) | 0.2-1.5 THz | Mode-locked femtosecond Ti:sapphire laser, lock-in amplifier, transmission mode | PCA | SVM, BPNN | [36] |
| Extra-virgin olive oil (EVOO) | 0.1-4 THz | TAS7500TS HF THz-TDS system, femtosecond laser, transmission mode | PCA, genetic algorithm | LS-SVM, BPNN, random forest | [37] |
| Adulterated dairy products (skim, low-fat milk) | 0.1-1.5 THz | n/a | PCA | SVM-DA | [38] |
| Oral lichen planus (OLP) | 0.3-3.5 THz | TSPEC THz spectrometer, absorption mode | PCA | SVM | [6] |
IV-C Nonnegative Matrix Factorization
NMF is another dimensionality reduction and feature extraction technique suitable for high-dimensional multivariate THz spectral data analyses. NMF approximates the data by iterative additive combinations of nonnegative basis vectors, making it a good candidate when other tools cannot guarantee nonnegativity and may thus return negative values that contradict physical reality. For a nonnegative matrix $\mathbf{Z}$ and a positive integer factorization rank $r$, we find two nonnegative matrices, $\mathbf{W}$ and $\mathbf{H}$, the product of which approximates $\mathbf{Z}$ via nonnegative factorization:
$$\mathbf{Z} \approx \mathbf{W}\mathbf{H}. \tag{26}$$
The number of columns in $\mathbf{W}$, $r$, is the number of latent features, i.e., the dimension of the reduced feature space.
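As an illustrative sketch only (not the implementation behind the results reported later), the factorization can be computed with the classical multiplicative update rules for the squared Frobenius error. The names `Z`, `W`, `H`, the toy "spectra", the iteration count, and the small constant `eps` are all assumptions made for this example.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frob_error(z, w, h):
    """Squared Frobenius norm of Z - WH."""
    wh = matmul(w, h)
    return sum((p - q) ** 2 for zr, ar in zip(z, wh) for p, q in zip(zr, ar))


def nmf(z, rank, iters=200, seed=0, eps=1e-12):
    """Factor a nonnegative n x m matrix into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(z), len(z[0])
    # Strictly positive random initialization keeps every update well defined.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T Z) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, z), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (Z H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(z, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# Hypothetical data: four observations mixed from two nonnegative basis spectra.
basis = [[1.0, 0.0, 2.0, 0.5, 0.0], [0.0, 1.5, 0.0, 0.5, 1.0]]
mixing = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [2.0, 1.0]]
Z = matmul(mixing, basis)
W, H = nmf(Z, rank=2)
W1, H1 = nmf(Z, rank=2, iters=1)  # same start, stopped after one update
```

Because every update multiplies a nonnegative entry by a nonnegative ratio, `W` and `H` stay nonnegative throughout, which is the property that motivates NMF for physically nonnegative spectral coefficients. The rows of `W` then serve as the reduced features.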
V Classification
Following the discussion on feature extraction, we investigate complementary machine learning techniques that classify materials based on their THz spectral absorption and transmission coefficients. The candidate combinations of signal processing techniques are illustrated in Fig. 2. We next detail the following candidate classification techniques: Naive Bayes (NB), support vector machine (SVM), K-nearest neighbor (KNN), and partial least squares-discriminant analysis (PLS-DA). Unless otherwise stated, for an input set $\mathbf{x} = \{x_1, \ldots, x_n\}$ of $n$ features, the vector $\mathbf{y}$ denotes the output data classes.
V-A Naive Bayes
The NB classifier is a supervised probabilistic machine learning classifier that applies the maximum a posteriori (MAP) decision rule. NB assumes that the features are conditionally independent given the class. Each class $y_k$, $k = 1, \ldots, K$, has a prior probability $P(y_k)$ that is estimated from the training feature dataset. The class with the highest posterior probability is the resulting target class. Using Bayes' theorem, the NB classifier computes the conditional probability $P(y_k \mid \mathbf{x})$ for each of the $K$ possible classes as
$$P(y_k \mid \mathbf{x}) = \frac{P(y_k)\, P(\mathbf{x} \mid y_k)}{P(\mathbf{x})}. \tag{27}$$
Since $P(\mathbf{x})$ is invariant for all classes, the naive classification rule can be further simplified as
$$P(y_k \mid \mathbf{x}) \propto P(y_k) \prod_{i=1}^{n} P(x_i \mid y_k). \tag{28}$$
For the final decision, the classifier model incorporates a MAP rule to predict the class with the largest posterior probability:
$$\hat{y} = \arg\max_{k \in \{1, \ldots, K\}} P(y_k) \prod_{i=1}^{n} P(x_i \mid y_k). \tag{29}$$
The NB classifier can be used alongside 2D cross-correlation for classifying THz signals. The correlation between the background time-domain pulse and the sample ensures considerable noise suppression in a THz dataset. Consequently, any phase differences due to sample dispersion are preserved and discriminated between the two signals. The cross-correlation sequences associated with every class sample can be computed using the reference and sample signal. Then, statistical features can be extracted from each cross-correlation sequence and forwarded to the NB classifier.
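The MAP rule can be sketched in a few lines for the Gaussian variant of NB. This is a minimal illustration under stated assumptions: per-feature Gaussian likelihoods, a small variance floor to avoid division by zero, log-domain scores for numerical stability, and hypothetical two-feature data with placeholder class names.

```python
import math
from collections import defaultdict


def fit_gnb(X, y):
    """Estimate class priors and per-feature Gaussian means and variances."""
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(features)
    model = {}
    for label, rows in by_class.items():
        prior = len(rows) / len(X)
        columns = list(zip(*rows))
        means = [sum(col) / len(col) for col in columns]
        # The 1e-9 floor is an assumption that guards against zero variance.
        variances = [
            sum((v - mu) ** 2 for v in col) / len(col) + 1e-9
            for col, mu in zip(columns, means)
        ]
        model[label] = (prior, means, variances)
    return model


def log_posteriors(model, x):
    """Unnormalized log-posterior per class: log prior + sum of log-likelihoods."""
    scores = {}
    for label, (prior, means, variances) in model.items():
        log_lik = sum(
            -0.5 * math.log(2 * math.pi * var) - (xi - mu) ** 2 / (2 * var)
            for xi, mu, var in zip(x, means, variances)
        )
        scores[label] = math.log(prior) + log_lik
    return scores


def predict_gnb(model, x):
    """MAP decision: return the class with the largest posterior score."""
    scores = log_posteriors(model, x)
    return max(scores, key=scores.get)


# Hypothetical features (e.g., statistics of cross-correlation sequences).
X_train = [[0.9, 0.2], [1.1, 0.3], [1.0, 0.25], [0.2, 0.8], [0.3, 0.9], [0.25, 1.0]]
y_train = ["material A"] * 3 + ["material B"] * 3
gnb = fit_gnb(X_train, y_train)
pred_a = predict_gnb(gnb, [1.0, 0.3])
pred_b = predict_gnb(gnb, [0.2, 0.95])
```

The evidence term is never computed because it is the same for every class, so comparing the unnormalized scores is sufficient for the decision.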
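V-B Support Vector Machine
SVM is a supervised learning model that classifies data based on a set of support vectors, subsets of the training dataset, that construct a hyperplane in feature space. For both linear and nonlinear problems, SVM classifies data using a boundary hyperplane that separates data into different classes. Most material recognition problems consisting of multi-class pulsed signals (signals belonging to three or more classes) use SVM classifiers to analyze and discriminate data.
A set of learning data $\{(\mathbf{x}_t, y_t)\}$ is used for building the SVM model, where $\mathbf{x}_t \in \mathbb{R}^{n}$ and $y_t$ denotes the class label corresponding to each input feature vector $\mathbf{x}_t$. A total of $K(K-1)/2$ classifiers are constructed, where $K$ denotes the class number of the input data. Each classifier is trained on input data from two classes. By training data from the $i$th and the $j$th classes, the classification problem can be expressed as
$$\min_{\mathbf{w}^{ij},\, b^{ij},\, \boldsymbol{\xi}^{ij}} \ \frac{1}{2} (\mathbf{w}^{ij})^{T} \mathbf{w}^{ij} + C \sum_{t} \xi_t^{ij} \tag{30}$$
subject to
$$\begin{aligned} (\mathbf{w}^{ij})^{T} \phi(\mathbf{x}_t) + b^{ij} &\geq 1 - \xi_t^{ij}, && \text{if } y_t = i, \\ (\mathbf{w}^{ij})^{T} \phi(\mathbf{x}_t) + b^{ij} &\leq -1 + \xi_t^{ij}, && \text{if } y_t = j, \\ \xi_t^{ij} &\geq 0, \end{aligned}$$
where $\mathbf{w}^{ij}$ is the normal vector of the hyperplane, $b^{ij}$ is a real-valued bias, $\xi_t^{ij}$ is the slack variable, $C$ is the penalty parameter, $t$ is the index of the combined set of the $i$th and $j$th samples of the training data, and $\phi(\cdot)$ is the function that maps the training data (input space) to a higher dimensional space (feature space). Both $\mathbf{w}^{ij}$ and $b^{ij}$ denote the optimization variables for the optimal hyperplane. Consequently, the voting strategy for the SVM classification function is defined as
$$\operatorname{sign}\!\left( (\mathbf{w}^{ij})^{T} \phi(\mathbf{x}) + b^{ij} \right), \tag{31}$$
where $\operatorname{sign}(\cdot)$ is the signum function that extracts the sign that determines the class to assign to the point. A positive sign adds one vote for the $i$th class and a negative sign adds one vote for the $j$th class; the point is assigned to the class that collects the most votes.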
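The pairwise voting step can be sketched as follows. The example assumes a linear kernel (the feature map is the identity) and uses hand-picked, hypothetical weights and biases instead of trained ones, so it only illustrates how the K(K-1)/2 binary decisions are combined into one multi-class prediction.

```python
from collections import Counter
from itertools import combinations


def sign(value):
    """Signum with ties at zero resolved as positive (an assumption)."""
    return 1 if value >= 0 else -1


def ovo_predict(x, classifiers, classes):
    """Combine pairwise linear decisions by max-wins voting."""
    votes = Counter()
    for i, j in combinations(classes, 2):
        w, b = classifiers[(i, j)]
        decision = sum(wk * xk for wk, xk in zip(w, x)) + b
        # A positive sign votes for class i, a negative sign for class j.
        votes[i if sign(decision) > 0 else j] += 1
    return votes.most_common(1)[0][0]


# Three hypothetical classes centered near (0, 0), (2, 0), and (0, 2).
classes = ["A", "B", "C"]
classifiers = {
    ("A", "B"): ([-1.0, 0.0], 1.0),  # positive when x0 < 1
    ("A", "C"): ([0.0, -1.0], 1.0),  # positive when x1 < 1
    ("B", "C"): ([1.0, -1.0], 0.0),  # positive when x0 > x1
}
label_1 = ovo_predict([1.8, 0.1], classifiers, classes)
label_2 = ovo_predict([0.1, 0.2], classifiers, classes)
label_3 = ovo_predict([0.2, 1.9], classifiers, classes)
```

With three classes, three pairwise classifiers are needed; in practice each weight vector and bias would come from solving the corresponding two-class training problem.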
V-C K-Nearest Neighbor
KNN is a distance-based learning algorithm that is favorable due to its simple mathematical formulation and relatively short training time. In KNN, data points close to each other are referred to as neighbors, and the class of a test point is determined from its distances to nearby labeled data points. The algorithm finds the $K$ nearest neighbors under a specific distance metric (Euclidean, Mahalanobis, Chebyshev, or correlation distance) and assigns the class by a majority vote decision. The controlling variable $K$ is chosen after preliminary validation or hyperparameter optimization, depending on the dataset requirements. The KNN algorithm selects the predicted class $\hat{y}$ for which the distance to the test data is minimized as
$$\hat{y} = \arg\min_{y_t} d(\mathbf{x}_t, \mathbf{x}), \qquad d(\mathbf{x}_t, \mathbf{x}) = \sqrt{\sum_{i=1}^{n} \left( x_{t,i} - x_i \right)^2}, \tag{32}$$
where $d(\mathbf{x}_t, \mathbf{x})$ is the distance (Euclidean in this form) between the training input $\mathbf{x}_t$ with label $y_t$ and the test input $\mathbf{x}$, and $n$ is the total number of features.
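A minimal sketch of the majority vote decision is given below, assuming the Euclidean metric and an unweighted vote; the training points and labels are hypothetical. Other metrics can be passed through the `distance` argument, which is a name introduced here for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def knn_predict(train_X, train_y, x, k=3, distance=euclidean):
    """Majority vote among the k training samples closest to x."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: distance(pair[0], x))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature training set with two classes.
knn_X = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.15], [0.9, 0.8], [0.8, 0.9], [0.85, 0.85]]
knn_y = ["A", "A", "A", "B", "B", "B"]
near_a = knn_predict(knn_X, knn_y, [0.2, 0.2], k=3)
near_b = knn_predict(knn_X, knn_y, [0.7, 0.9], k=3)
```

There is no training phase beyond storing the data, but every prediction sorts all training distances, which is why the cost grows with the sample size.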
V-D Partial Least Squares-Discriminant Analysis
PLS-DA is derived from the PLS regression (PLSR) algorithm and combines the properties of PLSR with the discrimination power of a classification technique. PLS-DA thus sharpens and maximizes the separation between groups or classes of observations. PLS is a commonly used supervised feature extraction technique for THz data. It projects data onto a low-dimensional space via a linear transformation. However, it forms a hybrid classification method when combined with DA, which can be used for predictive modeling (we thus discuss PLS-DA under classification techniques). Let the input matrix $\mathbf{Z}$ of THz spectral feature data for several classes consist of $n$ predictor variables, and let the output measured parameter matrix $\mathbf{Y}$ consist of $m$ response variables. The fundamental PLS-DA paradigm is formulated as
$$\mathbf{Y} = \mathbf{U}\mathbf{Q}^{T} + \mathbf{F}, \tag{33}$$
$$\mathbf{Z} = \mathbf{T}\mathbf{P}^{T} + \mathbf{E}, \tag{34}$$
where $\mathbf{U}$ is the Y-score matrix, $\mathbf{Q}$ is the orthogonal Y-loading matrix, $\mathbf{T}$ is the Z-score matrix, $\mathbf{P}$ is the orthogonal Z-loading matrix, and $\mathbf{E}$ and $\mathbf{F}$ are error matrices. The linear regression model of $\mathbf{U}$ and $\mathbf{T}$ is expressed as
$$\mathbf{U} = \mathbf{T}\mathbf{B}, \tag{35}$$
with $\mathbf{B}$ being the regression coefficient matrix, which is computed as
$$\mathbf{B} = \left( \mathbf{T}^{T}\mathbf{T} \right)^{-1} \mathbf{T}^{T}\mathbf{U}. \tag{36}$$
The prediction value, $\hat{\mathbf{Y}}$, of the tested sample is then computed as
$$\hat{\mathbf{Y}} = \mathbf{T}\mathbf{B}\mathbf{Q}^{T}. \tag{37}$$
If the prediction value of class membership is above zero, the corresponding sample is considered a member of that class.
VI Deep Learning
Given that THz spectral data sets can be extensive and complex, neural networks can be explored as robust classification tools that improve the learning efficiency compared to the conventional classification models discussed earlier in this paper. We distinguish various neural network architectures depending on the approach to network training and classification. In particular, deep learning neural network techniques can be supervised, unsupervised, or reinforcement-based. One of the main advantages of neural networks is that they can create new features by themselves, unlike traditional shallow learning techniques in which features need to be identified accurately by other techniques. Therefore, deep learning classifiers can operate directly on THz training data, enabling faster learning that is much needed in fast-changing THz conditions. We highlight two particular types of supervised neural networks because of their popular usage in existing THz material sensing literature: generalized regression neural networks (GRNN) [39] and backpropagation neural networks (BPNN) [40].
VI-A Generalized Regression Neural Network
The most extensively used deep learning model in THz material classification studies is GRNN, both for theoretical and practical applications [39]. GRNN is a typical feedforward neural network that provides a powerful variation of the conventional radial basis function neural network. As an efficient single-pass learning technique, GRNN addresses efficiency and flexibility issues. Unlike BPNN, GRNN requires selecting only one training parameter (the smoothing factor, or width, of the radial basis functions). The corresponding performance differs significantly with the smoothing factor choice, where the estimated density takes a multivariate Gaussian form with larger smoothing factors. GRNN consists of four layers: input, pattern, summation, and output [41].
VI-B Backpropagation Neural Network
BPNN is widely used to train multilayer feedforward neural networks, and it mainly consists of the input layer, one or more hidden layers, and the output layer. The BPNN principle adjusts the weights that control the strength of the connections between the neuron nodes of different layers to produce the desired output. Proper adjustment of the weights minimizes the total network error. The number of input features determines the number of input neurons. Furthermore, the number of neurons in the output layer is related to the number of classes. However, the number of hidden intermediate layers between the input and output layers can be customized.
VII Performance and Complexity Tradeoffs
This section analyzes the performance and complexity tradeoffs of the feature extraction and classification techniques presented in the previous sections. We consider publicly available THz spectral data for different materials, solids (Fig. 3) and gases (Fig. 4), and we compare the techniques' classification success rates.
VII-A Terahertz Spectroscopy Datasets
For material identification, we consider the sub-THz spectral data of 20 sample materials (such as alumina, aspirin, baking powder, baking soda, and chalk), as obtained from the THz database provided by the National Institute of Standards and Technology (NIST) [22]. The chemical materials are assumed to be ground to a fine powder and pressed into polyethylene-diluted solid pellets. By investigating the transmittance spectrum, it can be seen from Fig. 3 that the THz transmittance amplitude and peak locations across the materials are different, which facilitates classification and identification. In the particular case of gas spectroscopy, the gas absorption spectra are modeled using radiative transfer theory. The molecular absorption for an individual isotope can be expressed as a function of system pressure and temperature. We extract all parameters from the HITRAN database [21].
VII-B Performances of Feature Extraction Techniques
Since smaller datasets are easier and faster to analyze and visualize than rich datasets, dimension reduction can be an essential step before implementing machine learning algorithms. For sample materials, we use 1000 observations to establish the calibration model over 430 transmission spectral coefficients (variables). We use the Savitzky-Golay function to preprocess the data for smoothing (assuming noise-corrupted data). The qualitative chemometric analysis is first performed using four different feature extraction and dimension reduction techniques: PCA, PLS regression, NMF, and t-SNE.
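To make the smoothing step concrete, the following sketch applies a Savitzky-Golay filter with a 5-point window and a quadratic fit, for which the convolution coefficients are the standard values (-3, 12, 17, 12, -3)/35. The window length, the polynomial order, and the choice to leave the two samples at each edge unchanged are assumptions of this example, since the settings used for the reported results are not stated.

```python
# Standard 5-point quadratic Savitzky-Golay smoothing coefficients.
SG_COEFFS = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol_smooth(spectrum):
    """Smooth a 1-D spectrum; edge samples are copied unchanged (assumption)."""
    half = len(SG_COEFFS) // 2
    smoothed = list(spectrum)
    for i in range(half, len(spectrum) - half):
        window = spectrum[i - half:i + half + 1]
        smoothed[i] = sum(c * v for c, v in zip(SG_COEFFS, window))
    return smoothed


# A quadratic trend passes through the filter unchanged, whereas an isolated
# noise spike is attenuated.
quadratic = [float(i * i) for i in range(8)]
quadratic_out = savgol_smooth(quadratic)
spike = [1.0, 1.0, 1.0, 6.0, 1.0, 1.0, 1.0]
spike_out = savgol_smooth(spike)
```

Unlike a plain moving average, the local polynomial fit preserves the height and width of smooth spectral peaks better, which matters when peak locations carry the fingerprint information.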
For each SNR value, PCA is executed 10 times, where a small number of PCs (up to 10 PCs) is required to reach good classification performance. The THz spectral window under investigation is reduced while still retaining relevant data that captures most of the THz spectral fingerprint. At an SNR of 20 dB, the accumulated contribution rate of the first 10 principal components of PCA reaches , which results in good clustering performance. Eventually, PCA constructs PCs that convey the most variation in the available dataset. Similarly, the first 10 output components of each of the studied techniques are selected to train the classifiers. In t-SNE, we set the effective number of local neighbors of each point, known as perplexity, to 5. The best feature extraction model is the one that minimizes the computational complexity and dimensionality. We note that PCA has the best clustering efficacy (see Fig. (a)a and Fig. (a)a) for complex high-dimensional spectral datasets and that t-SNE has the highest computational and time complexities. At lower SNR, PCA outperforms NMF, PLS, and t-SNE in terms of stability, complexity, and interpretability of the spectral features. In this work, t-SNE was performed using slightly fewer samples, as it takes considerably longer to execute on the same sample size of spectral data than the other feature extraction techniques.
VII-C Performances of Classification Techniques
We next illustrate the classification rate performance for several materials. We utilize linear DA (LDA), linear SVM (LSVM), weighted KNN (WKNN), and Gaussian NB (GNB) to predict the materials following each feature extraction technique. The output feature data are divided into two sets, training and testing, both of which are composed of the sample material (or class) data and a sample label. We use a K-fold cross-validation scheme to evaluate the classification models' performance. The input data of features is partitioned into 10 equal-size subsamples (folds), each of which is used once as the testing (validation) set. In the first iteration, the first fold of the subsamples is used to test the classifier, while the remaining folds are used to train the classifier. In the second iteration, the second fold is used as the testing set, and the rest serve as training sets. This process is repeated until each of the 10 folds is used as a testing set.
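The fold rotation described above can be sketched as follows. The helper names, the shuffling seed, the nearest-centroid stand-in classifier, and the toy data are all hypothetical; any of the classifiers discussed earlier could be plugged in through the `fit` and `predict` arguments.

```python
import random


def kfold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train, test) index lists so that each fold is the test set once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[f::n_folds] for f in range(n_folds)]
    for f in range(n_folds):
        train = [i for g, fold in enumerate(folds) if g != f for i in fold]
        yield train, folds[f]


def cross_validate(X, y, fit, predict, n_folds=10):
    """Average classification success rate over all folds."""
    rates = []
    for train, test in kfold_indices(len(X), n_folds):
        model = fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(model, X[i]) == y[i] for i in test)
        rates.append(hits / len(test))
    return sum(rates) / len(rates)


def fit_centroids(X, y):
    """Stand-in classifier: store the mean feature vector of each class."""
    grouped = {}
    for features, label in zip(X, y):
        grouped.setdefault(label, []).append(features)
    return {label: [sum(col) / len(col) for col in zip(*rows)]
            for label, rows in grouped.items()}


def predict_centroid(model, x):
    """Assign x to the class with the closest stored centroid."""
    return min(model, key=lambda c: sum((a - b) ** 2 for a, b in zip(model[c], x)))


# Hypothetical, well-separated two-class data with 100 observations.
cv_X = [[0.01 * (i % 5), 0.0] for i in range(50)] + \
       [[1.0 + 0.01 * (i % 5), 1.0] for i in range(50)]
cv_y = [0] * 50 + [1] * 50
splits = list(kfold_indices(len(cv_X)))
success_rate = cross_validate(cv_X, cv_y, fit_centroids, predict_centroid)
```

Every observation is tested exactly once and never appears in its own training set, so the averaged rate is an estimate of out-of-sample performance.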
We measure the root mean square error of calibration (RMSEC) for the testing set to capture the classification model's performance. The results in Fig. 5 suggest that LDA can distinguish the solid materials accurately at low SNR values with low processing time. This observation is expected because LDA is suitable for problems that deal with linearly separable THz spectral data. Gaussian NB (GNB) gave similarly good classification results, which is also expected because the THz material fingerprints have continuous features and approximately Gaussian distributions. However, the GNB classifier (Fig. 5d) did not exhibit a good representation of the NMF data, mainly due to inaccuracies in its simple hypothesis function. Furthermore, the accuracy of KNN is degraded due to noisy feature data and large sample sizes, which make computing the distances to the nearest neighbors expensive. The relative degradation in the accuracy of the SVM classifier might, in turn, be caused by overfitting.
For BPNN, the best performance is achieved with 10 PCs, 10 hidden nodes, and a 0.01 learning rate. However, the best GRNN performance is achieved with 5 PCs and a spread value of 10 for the radial basis function, attaining smoother function approximation. Compared to the BPNN model, the GRNN model results suggest that GRNN is more favorable in predicting the THz spectral data, mainly because GRNN has low computational complexity, intermittent flow estimation, and high computing speed. Each GRNN trains fast in one-pass learning, whereas BPNN takes much more time on average over forward and backward passes. GRNN can further converge to the THz data's underlying function with just a few training samples, unlike BPNN. Moreover, GRNN results in less classification error (better generalization ability) due to its ability to handle noise in the input data.
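The one-pass nature of GRNN can be seen in a short sketch: there is no iterative training, and a prediction is a Gaussian-kernel weighted average of the stored training targets. The function name, the one-hot class encoding, and the toy data below are assumptions for illustration; the spread values are chosen for this toy feature scale and are not comparable to the setting reported above.

```python
import math


def grnn_predict(train_X, train_Y, x, spread=1.0):
    """GRNN output for x: kernel-weighted average of the training targets."""
    # Pattern layer: one Gaussian unit per stored training sample.
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(xi, x)) / (2 * spread ** 2))
        for xi in train_X
    ]
    # Summation layer: weighted target sums and the sum of the weights.
    denom = sum(weights)
    if denom == 0.0:
        raise ValueError("spread is too small for this input (all kernels underflow)")
    n_out = len(train_Y[0])
    numer = [sum(w * yi[o] for w, yi in zip(weights, train_Y)) for o in range(n_out)]
    # Output layer: normalize each weighted sum.
    return [v / denom for v in numer]


# Hypothetical two-class problem with one-hot targets.
grnn_X = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
grnn_Y = [[1, 0], [1, 0], [0, 1], [0, 1]]
sharp = grnn_predict(grnn_X, grnn_Y, [0.05, 0.1], spread=0.3)
smooth = grnn_predict(grnn_X, grnn_Y, [0.05, 0.1], spread=10.0)
predicted_class = max(range(len(sharp)), key=sharp.__getitem__)
```

A small spread makes the output follow the nearest training samples closely, while a large spread averages over all of them and flattens the class scores, which illustrates why the smoothing factor is the single parameter that has to be selected.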
GRNN achieves a good balance between high classification accuracy and speed for both solid and gaseous materials datasets (Figures 5 and 6, respectively). The corresponding computational efficiency is proven in [42] for all kinds of gas molecules supported by a spectral database. The results show that the multivariate discriminative model of LDA and GRNN, in association with THz spectroscopy, provides a cost-effective and time-efficient alternative to the commonly used models in the literature for material classification, suggesting a commercial and regulatory potential.
VIII Joint THz Communications and Sensing
The progress made on THz system design triggers a plethora of promising research directions, especially on the topic of convergence of communications and sensing. To this end, we next present some particular timely applications of joint communications and sensing at the THz band. The section also suggests some future research directions, which are expected to be at the forefront of THz sensing studies.
VIII-A Joint THz Communications and Sensing Applications
The prospective use cases of THz sensing are mainly in the context of joint communications and sensing [13, 14]. A unified THz system for communication and sensing can support various applications, from in-home digital health to building analytics (e.g., residential security). For instance, the THz band can establish reliable communication links for unmanned aerial vehicles (UAVs) that build on accurate localization and sensing capabilities. The THz band can also provide high-rate virtual reality services, enabling good visual perception. In particular, THz signals can support extended reality (XR) interfaces capable of interacting with sensory information in indoor environments without intervention. Sub-THz vehicular communications and sensing can further enable data exchange in vehicle-to-everything (V2X) communication systems. The deployment of such unified systems requires exceptionally high data rates, low latencies, and high reliability. Several contemporary approaches to joint vehicular communication and sensing are explored, such as time-domain duplex (TDD), telecom messages over radar transmissions (ToR), and radar sensing over telecom transmissions (RoT). Moreover, joint THz communications and sensing applications are particularly useful in body-centric systems where THz-TDS can be used to sense infections and mitigate virus outbreaks.
Intelligent THz systems should achieve high-rate communications and robust high-resolution sensing at the same time. Towards this end, efficient signal processing techniques are critical. Recent advances in machine learning and compressive sensing can significantly accelerate THz network applications by enhancing situational awareness and facilitating fast and low-latency configurations. In this regard, network intelligentization is a new trend that aims at leveraging machine learning techniques to empower 6G communication systems with artificial intelligence (AI) algorithms. The use of AI in THz-based 6G networks is envisioned to pave the way towards enhanced localization, communication, and sensing. However, unleashing machine learning's full potential requires addressing significant challenges, especially in waveform design. The differences in performance metrics between communication and sensing functionalities (achievable data rates, level of interference, sensing accuracy, and reliability) should also be taken into consideration. For instance, deep learning methods can be applied for target classification, waveform recognition, material sensing, and optimal selection of antennas and radio frequency chains in a joint THz communications and sensing setup.
VIII-B THz Frequency-Domain Spectroscopy
The THz band provides the possibility of designing both carrier-based and pulse-based setups that offer robust wireless sensing functionalities in communication system frameworks. However, transmitting short-time THz pulses in THz-TDS that cover large THz frequency bands is inefficient for communications. A carrier-based THz sensing setup can offer better flexibility in jointly meeting the continuously increasing bandwidth demands and sensing capabilities. We refer to the latter as THz frequency-domain spectroscopy (THz-FDS). THz-FDS can be conducted to recover the channel's molecular, spatial, and temporal characteristics when probing the environment with a discrete set of carrier frequencies. THz-FDS wireless sensing comprises three steps: (1) selectively probing a medium with a few THz carriers, (2) estimating the channel response to identify THz fingerprints, and (3) correlating the estimated response with a reference spectral database of the constituents of the medium under investigation.
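The third step, correlating an estimated response with a reference database, can be sketched as follows. The Pearson correlation is used here as one possible similarity measure (an assumption), and the database entries, the six carrier positions they implicitly correspond to, and the estimated values are hypothetical numbers rather than HITRAN data.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                     sum((y - mean_b) ** 2 for y in b))
    return cov / norm if norm else 0.0


def identify_medium(estimated, database):
    """Return the best-matching database entry and all correlation scores."""
    scores = {name: pearson(estimated, ref) for name, ref in database.items()}
    return max(scores, key=scores.get), scores


# Hypothetical absorption values at six probed carrier frequencies.
reference_db = {
    "dry air": [0.1, 0.1, 0.2, 0.1, 0.1, 0.3],
    "humid air": [0.1, 0.9, 0.2, 0.8, 0.1, 0.3],
    "polluted air": [0.5, 0.1, 0.2, 0.1, 0.7, 0.3],
}
# Step (2) output: a noisy channel-response estimate (made up for this sketch).
estimated_response = [0.12, 0.85, 0.18, 0.83, 0.09, 0.33]
best_match, match_scores = identify_medium(estimated_response, reference_db)
```

The more carriers fall on resonance lines that differ between candidate media, the larger the gap between the best and second-best scores, which is the intuition behind tuning carriers to resonant frequencies.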
However, the spectral lines that form fingerprints for testing occur at specific resonance frequencies that are typically avoided when allocating carriers for communication purposes. On the one hand, this observation complicates the joint communications and sensing problem because near-absorption-free communication spectra hold little sensing information. On the other hand, if sensing is not fully piggybacked over communication resources and time-sharing is enabled, the carriers can be directly tuned to the target resonant frequencies of specific gases/materials to achieve better accuracy.
We test THz-FDS in the context of environment electronic smelling (gas sensing). We assume several realistic mixtures of gases (N2, O2, H2O, CO2, and CH4), demonstrating different possible mediums with molecular concentrations representing dry, humid, and polluted air profiles. The corresponding results in Fig. 7 demonstrate that carrier-based sensing of gas mixtures is feasible. However, higher SNR values are required for convergence to a 100% classification success rate because estimating a mixture of gases is complicated. Note that we used 100 carriers, randomly distributed over a specific frequency range, which can be allocated in a single channel use with orthogonal frequency-division multiplexing (OFDM), for example, given the large THz bandwidths. We also test simple heuristic search algorithms to illustrate the importance of tuning carriers to resonant frequencies in joint communication and sensing setups, as illustrated in Fig. 8. Tuning 10 or 100 carriers to resonant frequencies of water vapor or oxygen introduces significant gains compared to uniformly distributing these carriers between and . Note that the high SNR values are caused by the high propagation losses of THz channels (over a distance of ) that are accounted for in this simulation. Lower SNR ranges can be achieved by adding substantial antenna and beamforming gains.
Towards introducing beamforming gains, UM-MIMO systems can be deployed. High beamforming gains increase the received signal power and provide the required high-resolution spatial focusing at a specific distance (molecular absorption is also distance-dependent). Furthermore, UM-MIMO systems can realize multiple measurements in a single channel use. However, the high correlation between absorption spectra and the inherent high correlation of UM-MIMO channels result in low-rank measurement matrices. Spatial tuning of antenna separations can guarantee orthogonality of THz channels to achieve high multiplexing gains [43]. The level of accuracy also depends on the application requirements and the assumptions on how many gases and isotopes of gases can exist in a medium. Therefore, the problems under study can easily get prohibitively complex for simple signal processing techniques, hence the motivation for machine learning.
Instead of comparing the exact values of channel measurements, we can set thresholds to check the presence or absence of specific spikes and build decision trees for classification [44]. Furthermore, when a small number of gases/materials is being tested, the corresponding sparsity can be leveraged in compressive sensing techniques. Note that the transmitted information-bearing symbols over the channel can be assumed to be random for sensing purposes. However, in cooperative sensing and communications setups, such symbols would belong to a specific constellation with a specific structure, quadrature amplitude modulation, for example. The knowledge of the modulation format can further be exploited to enhance the sensing performance. Note that in adaptive THz UM-MIMO systems, the set of active antennas, the carrier frequencies, and modulation modes can be tuned in real-time while being efficiently blindly estimated at the receiver side [45].
VIII-C Future Research Directions
The general need for automating analytical modeling, empowered by the developments in machine learning techniques, is trending nowadays, both from numerical and theoretical perspectives. Since at the core of the THz sensing problem lies a pattern recognition procedure and, by transitivity, a designated function approximator, as detailed earlier in this paper, we expect the dynamic machine learning developments to play a major role in future THz sensing research directions. For instance, in sensing applications characterized by long-term variable dependencies, one can investigate the merits of using convolutional neural networks [46, 47], or echo-state networks [48] as potential sensing solutions. In sensing problems where distributed, yet privacy-preserving, implementation is of practical importance, we can adopt federated learning-based sensing techniques [49]. Lastly, in problems involving interactive learning frameworks, e.g., in sensing situations where one agent would be able to sense one state of the environment and consequently execute an action that impacts the next state for maximizing a certain score (reward), one can use reinforcement learning-based techniques [50]. Such a reinforcement learning paradigm would indeed be quite fitting for THz multi-purpose platforms, e.g., the THz band capabilities as a powerful enabler of joint communications, sensing, and localization, which promises to be an exciting area of future research in the field.
IX Conclusions
In conclusion, nondestructive THz spectroscopic sensing with chemometrics is useful for material discrimination and has the potential to deal with real-life problems and applications. Employing machine learning methods can provide a powerful tool for both qualitative and quantitative analyses of THz spectral fingerprints. In this paper, we have successfully applied several relevant dimension reduction and classification techniques to identify solid and gaseous materials measured by transmission and absorption THz-TDS, respectively. We have demonstrated the feasibility of PCA, PLS, NMF, and t-SNE to reduce the high-dimensional THz data and extract its most prominent features. Furthermore, we employed SVM, LDA, KNN, and NB classifiers for the quantitative determination or prediction of the sample materials. Our results confirm that PCA-GRNN and PLS-GRNN have superior performances among other machine learning models in identifying solid materials in the THz range. However, for the special case of gaseous materials, NMF-GRNN, comparable to PCA-GRNN, provides the best classification results over lower SNR values. To the best of the authors' knowledge, this paper is the first article that presents a holistic overview of signal processing and machine learning techniques for efficient THz sensing and introduces a roadmap for several exciting future THz use cases.
References
 [1] R. Bogue, “Sensing with terahertz radiation: A review of recent progress,” Sensor Review, 2018.
 [2] P. Jepsen, D. Cooke, and M. Koch, “Terahertz spectroscopy and imaging – modern techniques and applications,” Laser & Photonics Reviews, vol. 5, pp. 124–166, Jan. 2011.
 [3] K. Sengupta, T. Nagatsuma, and D. M. Mittleman, “Terahertz integrated electronic and hybrid electronic-photonic systems,” Nature Electronics, vol. 1, no. 12, p. 622, 2018.
 [4] A. Ren, A. Zahid, D. Fan, X. Yang, M. A. Imran, A. Alomainy, and Q. H. Abbasi, “State-of-the-art in terahertz sensing for food and water security–a comprehensive review,” Trends in Food Science & Technology, 2019.
 [5] R. Li, C. Li, H. Li, S. Wu, and G. Fang, “Study of automatic detection of concealed targets in passive terahertz images for intelligent security screening,” IEEE Transactions on Terahertz Science and Technology, vol. 9, no. 2, pp. 165–176, Mar. 2019.
 [6] Y. V. Kistenev, A. V. Borisov, M. A. Titarenko, O. D. Baydik, and A. V. Shapovalov, “Diagnosis of oral lichen planus from analysis of saliva samples using terahertz time-domain spectroscopy and chemometrics,” Journal of Biomedical Optics, vol. 23, no. 4, pp. 1–8, 2018.
 [7] K. E. K. Coppin, J. E. Geach, I. Smail, L. Dunne, A. C. Edge, R. J. Ivison, S. Maddox, R. Auld, M. Baes, S. Buttiglione, A. Cava, D. L. Clements, A. Cooray, A. Dariush, G. De Zotti, S. Dye, S. Eales, J. Fritz, R. Hopwood, E. Ibar, M. Jarvis, M. J. Michałowski, D. N. A. Murphy, M. Negrello, E. Pascale, M. Pohlen, E. Rigby, G. Rodighiero, D. Scott, S. Serjeant, D. J. B. Smith, P. Temi, and P. van der Werf, “Herschel-Astrophysical Terahertz Large Area Survey: Detection of a far-infrared population around galaxy clusters,” Monthly Notices of the Royal Astronomical Society, vol. 416, no. 1, pp. 680–688, Aug. 2011.
 [8] P. Strobbia, R. Odion, and T. Vo-Dinh, “Spectroscopic chemical sensing and imaging: From plants to animals and humans,” Chemosensors, vol. 6, p. 11, Feb. 2018.
 [9] J. Xu, K. W. Plaxco, and S. J. Allen, “Probing the collective vibrational dynamics of a protein in liquid water by terahertz absorption spectroscopy,” Protein Science, vol. 15, no. 5, pp. 1175–1181, 2006.
 [10] B. Breitenstein, M. Scheller, M. Shakfa, T. Kinder, T. Müller-Wirts, M. Koch, and D. Selmar, “Introducing terahertz technology into plant biology: A novel method to monitor changes in leaf water status,” Journal of Applied Botany and Food Quality, vol. 84, no. 2, pp. 158–161, Dec. 2011.
 [11] I. F. Akyildiz, J. M. Jornet, and C. Han, “Terahertz band: Next frontier for wireless communications,” Physical Communication, vol. 12, pp. 16–32, 2014.
 [12] H. Sarieddeen, M.-S. Alouini, and T. Y. Al-Naffouri, “An overview of signal processing techniques for terahertz communications,” arXiv preprint arXiv:2005.13176, 2020.
 [13] H. Sarieddeen, N. Saeed, T. Y. Al-Naffouri, and M. Alouini, “Next generation terahertz communications: A rendezvous of sensing, imaging, and localization,” IEEE Communications Magazine, vol. 58, no. 5, pp. 69–75, 2020.
 [14] C. Chaccour, M. N. Soorki, W. Saad, M. Bennis, P. Popovski, and M. Debbah, “Seven defining features of terahertz (THz) wireless systems: A fellowship of communication and sensing,” arXiv preprint arXiv:2102.07668, 2021.
 [15] Y.-D. Hsieh, S. Nakamura, D. Ibrahim, T. Minamikawa, Y. Mizutani, H. Yamamoto, T. Iwata, F. Hindle, and T. Yasui, “Dynamic terahertz spectroscopy of gas molecules mixed with unwanted aerosol under atmospheric pressure using fibre-based asynchronous-optical-sampling terahertz time-domain spectroscopy,” Scientific Reports, vol. 6, p. 28114, Jun. 2016.
 [16] A. Faisal, H. Sarieddeen, H. Dahrouj, T. Y. Al-Naffouri, and M.-S. Alouini, “Ultramassive MIMO systems at terahertz bands: Prospects and challenges,” IEEE Vehicular Technology Magazine, vol. 15, no. 4, pp. 33–42, 2020.
 [17] M. T. Ruggiero, “Invited review: Modern methods for accurately simulating the terahertz spectra of solids,” Journal of Infrared, Millimeter, and Terahertz Waves, pp. 1–38, 2020.
 [18] J. Qin, Y. Ying, and L. Xie, “The detection of agricultural products and food using terahertz spectroscopy: A review,” Applied Spectroscopy Reviews, vol. 48, no. 6, pp. 439–457, 2013.
 [19] M. Yin, S. Tang, and M. Tong, “The application of terahertz spectroscopy to liquid petrochemicals detection: A review,” Applied Spectroscopy Reviews, vol. 51, no. 5, pp. 379–396, 2016.
 [20] J. El Haddad, B. Bousquet, L. Canioni, and P. Mounaix, “Review in terahertz spectral analysis,” TrAC Trends in Analytical Chemistry, vol. 44, pp. 98–105, 2013.
 [21] I. E. Gordon, L. S. Rothman, C. Hill, R. V. Kochanov, Y. Tan, P. F. Bernath, M. Birk, V. Boudon, A. Campargue, K. Chance et al., “The HITRAN2016 molecular spectroscopic database,” Journal of Quantitative Spectroscopy and Radiative Transfer, vol. 203, pp. 3–69, 2017.
 [22] E. Heilweil and M. Campbell, “THz spectral database,” 2011.
 [23] T. Wang, E. A. Romanova, N. Abdel-Moneim, D. Furniss, A. Loth, Z. Tang, A. Seddon, T. Benson, A. Lavrinenko, and P. U. Jepsen, “Time-resolved terahertz spectroscopy of charge carrier dynamics in the chalcogenide glass AsSeTe,” Photon. Res., vol. 4, no. 3, pp. A22–A28, Jun. 2016.
 [24] D. K. George and A. G. Markelz, Terahertz Spectroscopy of Liquids and Biomolecules. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013, pp. 229–250.
 [25] M. Hangyo, M. Tani, and T. Nagashima, “Terahertz timedomain spectroscopy of solids: A review,” International journal of infrared and millimeter waves, vol. 26, no. 12, pp. 1661–1690, 2005.
 [26] Y.C. Shen and P. F. Taday, “Development and application of terahertz pulsed imaging for nondestructive inspection of pharmaceutical tablet,” IEEE Journal of Selected Topics in Quantum Electronics, vol. 14, no. 2, pp. 407–415, 2008.
 [27] B. S. Ferguson, H. Liu, S. Hay, D. Findlay, X.C. Zhang, and D. Abbott, “In vitro osteosarcoma biosensing using THz time domain spectroscopy,” in BioMEMS and Nanotechnology, D. V. Nicolau, U. R. Muller, and J. M. Dell, Eds., vol. 5275, International Society for Optics and Photonics. SPIE, 2004, pp. 304 – 316.
 [28] L. Yan, C. Liu, H. Qu, W. Liu, Y. Zhang, J. Yang, and L. Zheng, “Discrimination and measurements of three flavonols with similar structure using terahertz spectroscopy and chemometrics,” Journal of Infrared, Millimeter, and Terahertz Waves, vol. 39, Mar. 2018.
 [29] C. Li, B. Li, and D. Ye, “Analysis and identification of rice adulteration using terahertz spectroscopy and pattern recognition algorithms,” IEEE Access, vol. 8, pp. 26839–26850, 2020.
 [30] C. Cao, Z. Zhang, X. Zhao, and T. Zhang, “Terahertz spectroscopy and machine learning algorithm for nondestructive evaluation of protein conformation,” Optical and Quantum Electronics, vol. 52, Apr. 2020.
 [31] X. Sun, J. Liu, K. Zhu, J. Hu, X. Jiang, and Y. Liu, “Generalized regression neural network association with terahertz spectroscopy for quantitative analysis of benzoic acid additive in wheat flour,” Royal Society Open Science, vol. 6, no. 7, p. 190485, 2019.
 [32] T. Bowman, T. Chavez, K. Khan, J. Wu, A. Chakraborty, N. Rajaram, K. Bailey, and M. O. El-Shenawee, “Pulsed terahertz imaging of breast cancer in freshly excised murine tumors,” Journal of Biomedical Optics, vol. 23, no. 2, pp. 1–13, 2018.
 [33] W. Xu, L. Xie, Z. Ye, W. Gao, Y. Yao, M. Chen, J. Qin, and Y. Ying, “Discrimination of transgenic rice containing the cry1ab protein using terahertz spectroscopy and chemometrics,” Scientific Reports, vol. 5, p. 11115, Jul. 2015.
 [34] Z. Chen, Z. Zhang, R. Zhu, Y. Xiang, Y. Yang, and P. B. Harrington, “Application of terahertz time-domain spectroscopy combined with chemometrics to quantitative analysis of imidacloprid in rice samples,” Journal of Quantitative Spectroscopy and Radiative Transfer, vol. 167, pp. 1–9, 2015.
 [35] H. Ge, Y. Jiang, F. Lian, Y. Zhang, and S. Xia, “Quantitative determination of aflatoxin B1 concentration in acetonitrile by chemometric methods using terahertz spectroscopy,” Food Chemistry, vol. 209, pp. 286–292, 2016.
 [36] H. Zhan, K. Zhao, H. Zhao, Q. Li, S. Zhu, and L. Xiao, “The spectral analysis of fuel oils using terahertz radiation and chemometric methods,” Journal of Physics D: Applied Physics, vol. 49, no. 39, p. 395101, Sep. 2016.
 [37] W. Liu, C. Liu, J. Yu, Y. Zhang, J. Li, Y. Chen, and L. Zheng, “Discrimination of geographical origin of extra virgin olive oils using terahertz spectroscopy combined with chemometrics,” Food Chemistry, vol. 251, pp. 86–92, 2018.
 [38] J. Liu, “Terahertz spectroscopy and chemometric tools for rapid identification of adulterated dairy product,” Optical and Quantum Electronics, vol. 49, Jan. 2017.
 [39] X. Sun, J. Liu, K. Zhu, J. Hu, X. Jiang, and Y. Liu, “Generalized regression neural network association with terahertz spectroscopy for quantitative analysis of benzoic acid additive in wheat flour,” Royal Society Open Science, vol. 6, no. 7, p. 190485, 2019.
 [40] D. Ye, W. Wang, H. Zhou, H. Fang, J. Huang, Y. Li, H. Gong, and Z. Li, “Characterization of thermal barrier coatings microstructural features using terahertz spectroscopy,” Surface and Coatings Technology, vol. 394, p. 125836, 2020.
 [41] X. Sun, J. Liu, K. Zhu, J. Hu, X. Jiang, and Y. Liu, “Generalized regression neural network association with terahertz spectroscopy for quantitative analysis of benzoic acid additive in wheat flour,” Royal Society Open Science, vol. 6, no. 7, p. 190485, 2019.
 [42] J. Cui, J. Zhang, C. Dong, D. Liu, and X. Huang, “An ultrafast and high accuracy calculation method for gas radiation characteristics using artificial neural network,” Infrared Physics & Technology, vol. 108, p. 103347, 2020.
 [43] H. Sarieddeen, A. Abdallah, M. M. Mansour, M.-S. Alouini, and T. Y. Al-Naffouri, “Terahertz-band MIMO-NOMA: Adaptive superposition coding and subspace detection,” arXiv preprint arXiv:2103.02348, 2021.
 [44] R. Ryniec, P. Zagrajek, and N. Palka, “Terahertz frequency domain spectroscopy identification system based on decision trees,” Acta Physica Polonica-Series A General Physics, vol. 122, no. 5, p. 891, 2012.
 [45] M. H. Loukil, H. Sarieddeen, M.-S. Alouini, and T. Y. Al-Naffouri, “Terahertz-band MIMO systems: Adaptive transmission and blind parameter estimation,” IEEE Communications Letters, vol. 25, no. 2, pp. 641–645, 2021.
 [46] K. O’Shea and R. Nash, “An introduction to convolutional neural networks,” arXiv preprint, Nov. 2015. [Online]. Available: https://arxiv.org/abs/1511.08458
 [47] S. Hochreiter, Y. Bengio, P. Frasconi, and J. Schmidhuber, Gradient flow in recurrent nets: The difficulty of learning long-term dependencies. Wiley-IEEE Press, 2001, ch. 14, pp. 237–243.
 [48] S. P. Chatzis and Y. Demiris, “Echo state Gaussian process,” IEEE Transactions on Neural Networks, vol. 22, pp. 1435–1445, Sep. 2011.
 [49] Q. Yang, Y. Liu, T. Chen, and Y. Tong, “Federated machine learning: Concept and applications,” ACM Trans. Intell. Syst. Technol., vol. 10, no. 2, pp. 1–19, Feb. 2019.
 [50] D. Silver et al., “A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play,” Science, vol. 362, no. 6419, pp. 1140–1144, Dec. 2018.
Biographies
Sara Helal is a senior Electrical and Computer Engineering student at Effat University, Jeddah, Saudi Arabia. Her research interests are in the areas of machine learning, signal and image processing, and wireless communications.
Hadi Sarieddeen (S’13–M’18) received his B.E. degree in Computer and Communications Engineering from Notre Dame University-Louaize, Lebanon, in 2013, and his Ph.D. degree in Electrical and Computer Engineering from the American University of Beirut (AUB), Lebanon, in 2018. He is currently a postdoctoral fellow at King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia. His research interests are in the areas of wireless communications and signal processing for wireless communications.
Hayssam Dahrouj (S’02–M’11–SM’15) received his Computer and Communications Engineering degree from AUB in 2005, and his Ph.D. degree in Electrical and Computer Engineering from the University of Toronto (UofT) in 2010. In July 2020, he joined the Center of Excellence for NEOM Research at KAUST as a senior research scientist. His main research interests include cloud radio access networks, cross-layer optimization, cooperative networks, convex optimization, distributed algorithms, machine learning, and optical communications networks.
Tareq Y. Al-Naffouri (M’10–SM’18) received his Ph.D. degree in Electrical Engineering from Stanford University in 2004. He is currently a Professor in the Electrical and Computer Engineering department at KAUST. His research interests lie in the areas of sparse, adaptive, and statistical signal processing, localization, machine learning, and their applications.
Mohamed-Slim Alouini (S’94–M’98–SM’03–F’09) was born in Tunis, Tunisia. He received his Ph.D. degree in Electrical Engineering from Caltech, Pasadena, CA, in 1998. He served as a faculty member at the University of Minnesota, Minneapolis, then at Texas A&M University at Qatar, Education City, Doha, Qatar, before joining KAUST as a professor of Electrical Engineering in 2009. His current research interests include the modeling, design, and performance analysis of wireless communication systems.