The emerging industrial use-cases of sixth-generation (6G) and beyond wireless networks are envisaged to include industrial automation, autonomous vehicles, and smart infrastructure. These applications require significant improvements in data capacity, system latency, and quality-of-service reliability over current 5G networks. In this context, the reconfigurable intelligent surface (RIS) has been identified as a key enabling technology to program the smart radio environment (SRE), increase link quality, and reduce hardware complexity [55, 26]. The RIS is made up of a metasurface (MTS), a two-dimensional (2-D) reconfigurable electromagnetic (EM) layer composed of a large periodic array of subwavelength scattering elements (meta-atoms) with specially designed spatial features [28, 51]. Compared to electrically large arrays, the nearly passive meta-atoms offer lower cost and power consumption. The radio-frequency (RF) MTS performs customized transformations, such as beamforming, on an incident wave upon reflection by modifying the surface boundary conditions in accordance with Huygens' principle. For example, the MTS shifts the phase of the reflected signal by creating a field discontinuity at the boundary of the surface. The arrangement and subwavelength structure of each meta-atom and, in turn, the array of space- and time-varying meta-atoms determine the MTS aperture field distribution and control the direction and strength of the reflected signal.
In conventional wireless communication systems, network optimization has been limited to control at the transmitter and receiver. This paradigm assumes that the wireless fading channel is uncontrollable and is a significant factor limiting performance because of random signal reflections, diffraction, and scattering in the wireless environment. The RIS overcomes many of these fading channel limitations through the ability of the MTS to manipulate waves, achieve arbitrary aperture beamforming, and perform real-time analog spatial signal processing. This has spawned novel MTS-based RF applications such as intelligent beamforming, anomalous refraction and reflection, frequency selective and high-impedance surfaces, scattering reduction, polarization conversion, leaky-wave antennas, surface wave control, beam focusing, transmit-array antennas, reflect-array antennas, and holographic imaging. Initial applications of RIS were limited to wireless communications for interference suppression, joint wireless information and power transmission, physical layer security, and multi-beam design. However, more recent works have introduced RIS to radar remote sensing [14, 66, 13] and joint radar-communications systems [67, 10].
In a wireless link, the RIS functions as either an electrically large antenna array at the endpoints or as an amplify-and-forward relay (Fig. 14.1). By actively controlling and optimizing the amplitude/phase of each meta-atom across the aperture, the RIS maximizes the receive signal-to-noise ratio and provides adaptive beamforming to coherently focus the reflected signal on the receiver. Through joint optimization of the wireless channel and endpoints, RIS-assisted links are able to realize SRE. Each scattering element typically includes an active tuning element, such as a varactor or PIN diode, whose bias voltage is software-controlled to change the EM response of the surface. The bias voltage for each meta-atom is pre-computed and modulated by a digital control module employing a field programmable gate array (FPGA). Each meta-atom is controlled by tuning its EM properties (susceptibility or impedance), which affects the spectral response of the reflected signal. This aids in producing tailored radiation patterns for diverse functions, such as beam steering, anomalous reflection, focusing, beam splitting, absorption, and direct modulation of the reflected signal.
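As a minimal illustration of the coherent focusing described above, the following sketch co-phases a cascaded link through an RIS. It assumes hypothetical unit-modulus channel coefficients `h` (transmitter to RIS) and `g` (RIS to receiver), phase-only control, no direct path, and no noise; none of these names or simplifications come from the chapter itself.

```python
import cmath
import math
import random

def align_ris_phases(h, g):
    """Per-element phase shifts that cancel the phase of the cascaded channel h[n]*g[n]."""
    return [-cmath.phase(hn * gn) for hn, gn in zip(h, g)]

def received_amplitude(h, g, phases):
    """Magnitude of the sum over elements of h[n] * exp(j*phase[n]) * g[n]."""
    return abs(sum(hn * cmath.exp(1j * p) * gn for hn, gn, p in zip(h, g, phases)))

def random_unit_channel(n, rng):
    """Hypothetical channel: unit magnitude, uniformly random phase per element."""
    return [cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi)) for _ in range(n)]

rng = random.Random(0)
N = 64                                   # number of meta-atoms (illustrative)
h = random_unit_channel(N, rng)
g = random_unit_channel(N, rng)

aligned = received_amplitude(h, g, align_ris_phases(h, g))
unaligned = received_amplitude(h, g, [0.0] * N)
```

With unit-modulus coefficients, the aligned amplitude equals the number of elements, so the received power scales with the square of the aperture size, whereas the unaligned sum adds with random phases.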
There are several challenges in the design, fabrication, deployment, and processing of RIS. In applications such as radar and communications that have precise radiation pattern constraints, the RIS design often involves optimization of several complicated and irregular geometry parameters to meet the required resonant frequency, gain, polarization, bandwidth, and size constraints. The conventional design process can be very tedious. Further, post-deployment, the processing of RIS signals and optimized beamforming is also challenging because of the high-dimensional nature of the problem arising from the use of several antennas. In this context, machine learning (ML) techniques have recently shown unprecedented performance in problems where it is challenging to develop an accurate mathematical model for feature representation. These methods are now also transforming the above-mentioned tedious approaches to designing RIS and processing its signals. In particular, as a class of machine learning techniques, deep learning (DL) methods have gained much interest recently for solving many challenging problems such as speech recognition, visual object recognition, rainfall estimation, and language processing [36, 1, 70]. These techniques offer advantages such as low computational complexity while solving optimization-based or combinatorial search problems as well as the ability to extrapolate new features from a limited set of features contained in a training set. Recently, DL for MTS inverse design, wherein a meta-atom design is synthesized from a specific response, has become very popular. This has been applied for semi-automated inverse design of metamaterials, MTS [72, 54], and nanophotonic structures. Note that the above ML/DL application to MTS/RIS design is different from using DL to perform signal processing functions in RIS-aided communications, which is surveyed elsewhere in the literature. In the following, we describe these aspects in detail.
14.1.1 ML/DL for RIS Design
The design and optimization of RIS hardware at the physical layer remains a formidable challenge. To date, RIS/MTS implementations remain quite limited. To realize the promise of RIS-assisted networks and SRE, more robust and automated MTS design techniques are required. Without capable RIS hardware, the benefit of RIS-assisted networks will be significantly reduced due to EM limitations. In general, canonical structures such as v-antennas, loaded dipoles, and split-ring resonators are used to fabricate RIS. However, meta-atoms based on these geometries usually fall short of the desired performance, particularly when anisotropic, broadband, and/or wide-angle responses are required. As a result, traditional MTS design approaches exhibit performance limitations, especially given the complexity of MTS hardware requirements and the increasing functionality required of wireless nodes in next-generation networks.
Designing a user-defined, arbitrary wave-front RIS or metagrating [41, 32, 25] is a challenging, labor-intensive, and lengthy process. In general, a new MTS design entails numerous rounds of manual tuning and full-wave simulations that iteratively solve Maxwell's equations until a locally optimized design is achieved. Initial designs are typically based on physical instincts and intuitive arguments. However, the final geometric structure and material characteristics are attained through iterative analyses.
The ML/DL approaches expend computational time and resources upfront as a fixed cost to generate training data sets of device geometries and their associated spectral responses, but this investment pays off during the prediction stage. Deep neural networks are trained to map the nonlinear relationships between meta-atom geometry and spectral response. The power of deep neural networks comes from their multi-layered composition, which allows them to learn the relationships between data with multiple levels of abstraction. Once trained, a deep neural network efficiently produces the geometry of a meta-atom given a desired spectral response. The application of deep learning to the inverse design of MTS and nanophotonic structures is still in its early stages, and much more work is required to realize more generalized complex designs, reduce the amount of required training data, and increase efficacy. Nearly all of these works rely on supervised learning techniques for metamaterial performance predictions, which map known input-output pairs based on large sets of training examples. In MTS design, applying such techniques does not result in new shapes different from the ones used in training. This severely limits the ability to generate customized MTS patterns. In [25, 22], we introduced the use of generative adversarial networks (GANs) for microwave MTS design, which aids in discovering new shapes of meta-atoms.
14.1.2 ML/DL for RIS Applications
The next-generation millimeter wave (mm-Wave) massive multiple-input multiple-output (MIMO) systems require large antenna arrays with a dedicated radio-frequency (RF) chain for each antenna. This results in expensive and large system architectures that consume high power and processing resources. To reduce the number of RF chains while also maintaining sufficient beamforming gains, hybrid analog and digital beamforming architectures were introduced. However, the resulting cost and energy overheads of these systems remain a concern. Recently, RISs have emerged as a feasible, low-cost, and light-weight alternative to complex large arrays in both outdoor and indoor applications, usually with separate operating frequencies or spectral bands (Fig. 14.2).
The RISs reflect the incoming signal by introducing a pre-determined phase shift. This phase shift is controlled via external signals by the base station (BS) through a backhaul control link. As a result, the incoming signal from the BS can be manipulated in real time and the received signal reflected toward the users. Hence, the use of RIS enhances the signal energy received by distant users and expands the coverage of the BS. It is, therefore, necessary to jointly design the beamformer parameters at both the RIS and the BS. This achieves the desired channel conditions, wherein the BS conveys the information to multiple users through the RIS. Unlike amplify-and-forward (AF) relay systems, an RIS can combine active and passive components in a flexible configuration; it therefore uses fewer active transmit modules or, as a fully passive surface, simply reflects the received signal. Thus, the RIS is much more energy- and spectrum-efficient.
The accuracy of beamformer design strongly relies on knowledge of the channel information. In fact, RIS-assisted systems include multiple communications links, i.e., a direct channel from the BS to the users and a cascaded channel from the BS to the users through the RIS. This makes the RIS scenario even more challenging than conventional massive MIMO systems. Furthermore, the wireless channel is dynamic and uncertain because of changing RIS configurations. Consequently, there exists an inherent uncertainty stemming from the RIS configuration and the channel dynamics. These characteristics of RIS make the system design very challenging [38, 15].
To address the aforementioned uncertainties and non-linearities imposed by channel equalization, hardware impairments, and sub-optimality of high-dimensional problems, model-free techniques have become common in wireless communications. In this context, DL is particularly powerful in extracting features from raw data and providing a “meaning” to the input by constructing a model-free data mapping with a huge number of learnable parameters. Furthermore, DL is helpful when modeling the channel characteristics thanks to its data-driven structure. A learning model constructs a non-linear mapping between the raw input data and the desired output to approximate a problem from a model-free perspective. Thus, its prediction performance is robust against corruptions and imperfections in the wireless channel data. DL learns the feature patterns, which are easily updated for new data and adapted to environmental changes. In the long run, this results in lower computational complexity than model-based optimization. DL-based solutions have significantly reduced run-times because of parallel processing capabilities. On the other hand, it is not straightforward to achieve parallel implementations of conventional optimization and signal processing algorithms. The aforementioned advantages have led to DL superseding optimization-based techniques in RIS system design for the physical layer of wireless communications.
This chapter provides an overview of recent developments in using ML/DL for designing, deploying, and processing the physical layer of RIS. The rest of the chapter is organized as follows. In the next section, we discuss various ML techniques for inverse RIS design. Then, we introduce various DL techniques for RIS design in Section 14.3 and provide a few case studies in Section 14.4. Next, we focus on DL-aided RIS applications for wireless systems in Section 14.5, including signal detection and channel estimation. For the more widely used application of RIS beamforming, we discuss various DL frameworks in Section 14.6. We also discuss current challenges in using ML/DL for RIS systems and highlight related future research directions in Section 14.7. We conclude in Section 14.8.
14.2 Inverse RIS Design
Communications-based analysis of RIS without physics-based EM-compliant models is a major limitation of current research. Until recently, prior works did not consider such realistic RIS implementations. As the parameter spaces of meta-atom geometry and constituent materials have grown, the conventional approaches to achieving the targeted EM response have become more tedious. In this context, learning models have demonstrated the ability to implicitly learn Maxwell's equations from training data within a constrained design space. ML techniques have witnessed increased use in research to create surrogate models for MTS performance prediction, inverse design, and optimization. For an inverse MTS design problem, the input is an arbitrary design spectrum and the network finds or synthesizes a geometry to closely approximate the desired spectral response (Fig. 14.3).
Major benefits of DL-based RIS design for wireless communications include:
EM-based surrogate models: DL constructs a nonlinear mapping between the raw input data (meta-atom design) and the desired output to approximate the MTS response.
Inverse design: Deep generative models are utilized to learn geometric features from training data and generate new meta-atom designs to achieve the spectral response.
Diverse EM surface representations: DL-based MTS design admits flexible design representation. The input could be either vectors of discrete parameters describing the geometry, material, frequency, and angular design parameters or pixelated images to represent the geometry or phases of the meta-atom design. Whereas a fully-connected neural network is well-suited to process the simple designs specified by the former representation, a convolutional network handles images appropriately to yield more complex MTS geometries.
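The two input representations mentioned in the last item can be contrasted with a short sketch. The parameter names, dimensions, and the use of mirror symmetry here are illustrative assumptions rather than values taken from any cited design.

```python
import random

def mirror_quadrant(quadrant):
    """Build a full binary pixel map from one quadrant using two mirror planes."""
    top = [row + row[::-1] for row in quadrant]        # mirror left to right
    return top + [row[:] for row in top[::-1]]         # mirror top to bottom

def flatten(pixels):
    """Flatten a 2-D pixel map into a 1-D list, e.g. for a fully-connected network."""
    return [bit for row in pixels for bit in row]

# Representation 1: a short vector of discrete design parameters.
# Hypothetical entries: (patch length, gap width, substrate thickness), all in mm.
param_vector = [2.5, 0.3, 1.6]

# Representation 2: a pixelated binary image (1 = metal, 0 = no metal).
rng = random.Random(1)
quadrant = [[rng.randint(0, 1) for _ in range(4)] for _ in range(4)]
pixels = mirror_quadrant(quadrant)                      # 8-by-8 image-style input
```

The parameter vector is compact but ties the model to one geometry family, while the image form can describe arbitrary shapes at the cost of a much larger input.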
|Algorithm||Frequency||MTS layers||Data||Key features||Drawbacks|
|Evolutionary optimization techniques|
|GA||- GHz|| ||Parameter Vector||Pixelized meta-atoms with discrete input design space when a contiguous structure is not required||Optimization from scratch for each design; output structures may be too complex to fabricate|
|PSO||- GHz|| ||Binary Matrix (2-D)||Swarm-based GO technique for pixelized meta-atom design; outperforms GA for various EM designs||Optimization from scratch for each design with parameter tuning|
|ACO||- GHz|| ||Binary Matrix (3-D)||MTS, including 3-D structures and wire grid arrays, with discrete design space and a contiguous structure||Optimization from scratch for each design; output structures may be too complex to fabricate|
|ANN||- THz|| ||Parameter Vector||Performance prediction, inverse design, and optimization of nanophotonic particles||Limited design variables; applicable to only spherical dielectric nanoparticles|
|ANN||THz|| ||Parameter Vector||Performance prediction and inverse design of metagratings||Limited set of parametric inputs; significant training overload|
|DNN||- THz|| ||Parameter Vector||Inverse design of chiral and multi-layer MTS||Design-specific architecture; limited design space|
|CNN||GHz|| ||Binary Matrix (2-D)||Anisotropic digital coding MTS; PSO for beamforming||Significant training overload|
|CNN||GHz|| ||Binary Matrix (2-D)||Hybrid CNN-GA for space-time modulation of programmable MTS; multi-beam steering||Binary phase coding limits beamforming performance; limited tunability|
|cDC-GAN||- THz|| ||Image Matrix (2-D)||Generative inverse design of transmission MTS||Significant training overload; limited to single layer designs and passive structures|
|cDC-GAN||- GHz|| ||Image (2-D)||Reflective RF MTS; training set with published meta-atom structures to improve learning||Limited to single layer; post-processing required|
|cDC-GAN||- GHz|| ||Image (3-D)||Multi-layer MTS; RGB-style matrix to represent multiple layers||No active elements; additional validation required|
|cDC-GAN||- GHz|| ||Image (3-D)||Federated learning for multi-layer design||Significant training overload|
|cDC-VAE||- THz||1||Image (2-D)||Anisotropic MTS; encodes input into low-dimensional latent space||Significant training overload; post-processing required|
|TO-GAN||- THz||1||Image (2-D)||Free-form diffractive metagrating design for select wavelength-deflection angle pairs with topology refinement||Additional optimization required|
|GLOnet||- THz||1||Image (2-D)||Dielectric MTS design without training sets||Limited to single objective optimization; requires solving Maxwell's equations inside training loop|
Table 14.1 summarizes prior works on various techniques for RIS inverse design. The non-DL methods typically comprise several evolutionary optimization algorithms, as described below. The drawback of these traditional optimization techniques is that they start from scratch with each new design, which often requires hundreds of additional full-wave simulations per design.
Genetic algorithm (GA)
This is an iterative global optimization (GO) algorithm that has been used extensively in the design of pixelated coded MTS. GA is a nature-inspired algorithm that uses binary strings (chromosomes) to represent candidate designs. During the optimization, the GA selects the best subset of design candidates from the previous generation to serve as starting points for mutation and crossover in the next design iteration. Recent GA applications include coding MTS that demonstrate channel response modification, efficient polarization conversion, and phase-graded beam steering.
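The selection, crossover, and mutation loop described above can be sketched as follows. This is a generic textbook GA, not the implementation of any cited work; the `fitness` function is a toy stand-in (bit agreement with a hypothetical target pattern) for what would in practice be a full-wave simulation, and all hyperparameters are illustrative.

```python
import random

def fitness(chromosome, target):
    """Toy stand-in for an EM solver: number of bits matching a target pattern."""
    return sum(1 for a, b in zip(chromosome, target) if a == b)

def run_ga(target, pop_size=30, generations=60, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    n = len(target)
    # Each chromosome is a binary string encoding one pixelated candidate design.
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda c: fitness(c, target), reverse=True)
        parents = pop[: pop_size // 2]               # selection: keep the best half
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, n)                # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda c: fitness(c, target))

target = [1, 0] * 8            # hypothetical 16-pixel target pattern
best = run_ga(target)
```

Because every call to `run_ga` begins from a fresh random population, the sketch also makes the "start from scratch for each design" drawback concrete: nothing learned for one target carries over to the next.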
Particle swarm optimization (PSO)
A popular stochastic evolutionary computation technique, PSO is inspired by the movement and intelligence of swarms. Recently, it has been employed for shaping EM waves using pixelized coded metasurfaces. The design procedure using PSO is tied to a full-wave EM solver and is completely automatic. The software yields both microscopic meta-atom designs and the macroscopic aperture coding matrix. By changing the reflection phase difference between cells, this approach has produced designs of functional metasurfaces with circularly- and elliptically-shaped radiation beams and multi-beam patterns. This is useful for achieving customized radiation patterns to enhance link performance in the wireless communication channel. Similar efforts have used a simulated annealing algorithm for the design and optimization of a broadband diffusion MTS using anisotropic elements for scattering reduction. In another study, binary PSO (BPSO) was used to automate the macroscopic layout of both passive and active apertures to realize user-defined dual-beam scattering radiation patterns. For example, BPSO was used to realize a reflecting MTS with a left-handed circular polarization (LHCP) beam and a right-handed circular polarization (RHCP) beam, and the results were experimentally verified. This digital coding approach has been applied to both passive and active R-MTS.
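A compact sketch of binary PSO applied to a macroscopic coding layout is given below. It is a generic BPSO with a sigmoid transfer function, not the procedure of the study mentioned above; the objective is a hypothetical one (the weaker of two beams of a 1-bit coded linear array under normal incidence, using an analytic array factor instead of a full-wave solver), and the target angles and hyperparameters are assumptions.

```python
import cmath
import math
import random

def array_power(bits, angle_deg, spacing=0.5):
    """|array factor|^2 of a linear array with 1-bit (0 or pi) element phases."""
    k = 2.0 * math.pi * spacing * math.sin(math.radians(angle_deg))
    af = sum(cmath.exp(1j * (math.pi * b + k * n)) for n, b in enumerate(bits))
    return abs(af) ** 2

def dual_beam_score(bits, angles=(-20.0, 40.0)):
    """Toy objective: power in the weaker of the two target beams."""
    return min(array_power(bits, a) for a in angles)

def binary_pso(n_bits=16, n_particles=20, iters=40, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    x = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n_particles)]
    v = [[0.0] * n_bits for _ in range(n_particles)]
    pbest = [p[:] for p in x]
    pbest_f = [dual_beam_score(p) for p in x]
    g = max(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    history = [gbest_f]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(n_bits):
                # Velocity update: inertia + pull toward personal and global bests.
                v[i][d] = (w * v[i][d]
                           + c1 * rng.random() * (pbest[i][d] - x[i][d])
                           + c2 * rng.random() * (gbest[d] - x[i][d]))
                v[i][d] = max(-6.0, min(6.0, v[i][d]))
                prob_one = 1.0 / (1.0 + math.exp(-v[i][d]))   # sigmoid transfer
                x[i][d] = 1 if rng.random() < prob_one else 0
            f = dual_beam_score(x[i])
            if f > pbest_f[i]:
                pbest[i], pbest_f[i] = x[i][:], f
                if f > gbest_f:
                    gbest, gbest_f = x[i][:], f
        history.append(gbest_f)
    return gbest, gbest_f, history

gbest, gbest_f, history = binary_pso()
```

In binary PSO the velocity is interpreted as a log-odds value for each bit rather than a displacement, which is what allows the swarm update to operate on a discrete coding matrix.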
Ant colony optimization (ACO)
This is another swarm-based algorithm, inspired by stigmergy in ant colonies, that searches for optimal solutions to graph-based problems. Here, a number of artificial ants build solutions to an optimization problem and exchange information on their quality using a cooperation scheme similar to that utilized by real ants. In one study, inverse MTS design was performed based on multi-objective lazy ACO (MOLACO) to synthesize 3-D nano-antenna geometries with low-loss transmission performance and broad phase tunability. ACO is generally most useful for a discrete input design space and when a contiguous structure is required.
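The pheromone-based cooperation can be illustrated with a deliberately simplified, single-objective ACO over a binary design vector. This is not MOLACO; the pheromone update rule, the `score` callback, and all parameter values are assumptions chosen for brevity.

```python
import random

def aco_binary(score, n_bits=12, n_ants=10, iters=30, evaporation=0.1, seed=0):
    """Simplified ACO: each ant picks 0 or 1 per position, guided by pheromone."""
    rng = random.Random(seed)
    tau = [[1.0, 1.0] for _ in range(n_bits)]     # pheromone for choosing 0 or 1
    best, best_score = None, float("-inf")
    for _ in range(iters):
        for _ in range(n_ants):
            # Construct a solution: choose each bit with probability
            # proportional to the pheromone on that choice.
            sol = [1 if rng.random() < t1 / (t0 + t1) else 0 for t0, t1 in tau]
            s = score(sol)
            if s > best_score:
                best, best_score = sol, s
        for d in range(n_bits):
            tau[d][0] *= (1.0 - evaporation)      # evaporation on both choices
            tau[d][1] *= (1.0 - evaporation)
            tau[d][best[d]] += 1.0                # deposit along the best-so-far solution
    return best, best_score

# Hypothetical toy objective: agreement with a target bit pattern.
aco_target = [1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0]

def toy_score(sol):
    return sum(1 for a, b in zip(sol, aco_target) if a == b)

aco_best, aco_best_score = aco_binary(toy_score)
```

The ants never communicate directly; information about solution quality is shared only through the pheromone table, which is the stigmergy mechanism referred to above.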
14.3 DL-Based Inverse Design and Optimization
The computational power and time required for evolutionary optimization algorithms grow exponentially with the number of design parameters. This is mitigated by DL-based inverse design for RIS. Prior works have employed a variety of network structures and algorithms based on the availability of data, RIS topology, and desired EM spectral response.
14.3.1 Artificial Neural Network (ANN)
ANNs were first used to approximate light scattering by multi-layer nanoparticles (meta-atoms). Similar to MTS, nanophotonic particles derive their frequency response from the physical structure and size of the constituent scatterers. A similar technique was subsequently used for metagratings. Typical inverse design problems require optimization in a high-dimensional space, which involves lengthy calculations, and are typically solved using genetic algorithms or adjoint methods. However, the computational power and time required for GA optimization grow exponentially with the number of design parameters.
The primary application of ANNs in MTS design is performance approximation. The feedforward ANN is trained to be a high-fidelity surrogate model for performance prediction. Using training data consisting of meta-atom physical design parameters as inputs and frequency response as labels, the ANN is trained to approximate a complex physics simulation (such as a finite-element method (FEM), method of moments (MoM), or finite-difference time-domain (FDTD) simulation). Through the training data, the ANN learns to map the scattering function of the meta-atom into a continuous, higher-order space where the derivative is found analytically through backpropagation. In one study, a trained ANN simulated spectral responses orders of magnitude faster than conventional full-wave simulations. This study used a fully-connected ANN consisting of four layers, resulting in 239,500 parameters. The inputs were the thicknesses of the meta-atom layers (the materials were fixed) and the outputs were the spectrum sampled at discrete wavelengths. The results suggest that the ANN was not simply fitting the data, but rather discovered the underlying structure of the input-to-output mapping, generalizing the physics of the system from the training set to solve problems not yet encountered.
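To make the parameter count of a fully-connected surrogate concrete, the helper below counts the weights of a feedforward network from its layer widths. The widths used in the example are an assumption made here, not figures stated in the text: 8 inputs, four hidden layers of 250 neurons, and 200 outputs. They were chosen only because their bias-free weight count happens to equal the 239,500 figure quoted above.

```python
def count_weights(layer_sizes, include_bias=False):
    """Number of trainable parameters in a fully-connected feedforward network."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out          # weight matrix between consecutive layers
        if include_bias:
            total += n_out             # one bias per output neuron
    return total

# Assumed layer widths (hypothetical, see the note above).
hypothetical_sizes = [8, 250, 250, 250, 250, 200]

n_weights = count_weights(hypothetical_sizes)
n_with_bias = count_weights(hypothetical_sizes, include_bias=True)
```

Nearly all of the parameters sit in the hidden-to-hidden matrices, so widening the hidden layers grows the model quadratically while adding inputs or outputs grows it only linearly.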
A significant drawback of this approach is that the inputs are limited to the thicknesses of the meta-atom layers with fixed materials. This results in a lack of generalizability for the ANN that vastly limits the possible meta-atom design structures. While fixing the input parameters reduces the complexity of the ANN architecture, it limits the design space and the attainable optimal designs. Another drawback of this approach is that it required examples produced with conventional simulation methods to generate the training data. However, unlike evolutionary optimization methods such as GA or PSO, simulation of the training dataset is an upfront fixed cost because it only needs to be performed once and is then leveraged for other designs. Additionally, the simulations for training data generation are highly parallelizable, unlike serial optimization techniques.
Once trained, the ANN was shown to solve inverse design problems more quickly than its numerical counterparts because the gradient is found analytically, through backpropagation, rather than numerically. Similar to inverse design, the ANN also optimizes for a desired property by altering the cost function used for the design without retraining the ANN. The results indicate that the ANN performs inverse design and optimization more accurately than traditional numerical nonlinear optimization techniques.
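The idea of inverse design by backpropagating to the inputs of a frozen network can be sketched as follows. The tiny two-layer network with random weights stands in for a trained surrogate (an assumption; no training is performed here), and the backtracking step-size rule is a convenience added for this sketch rather than part of the cited method.

```python
import math
import random

def make_surrogate(n_in=3, n_hidden=8, n_out=5, seed=0):
    """Random fixed weights standing in for a trained surrogate (hypothetical)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
    return w1, w2

def forward(x, w1, w2):
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    y = [sum(w * hi for w, hi in zip(row, h)) for row in w2]
    return h, y

def input_gradient(x, target, w1, w2):
    """Loss 0.5*||y - target||^2 and its analytic gradient with respect to x."""
    h, y = forward(x, w1, w2)
    err = [yi - ti for yi, ti in zip(y, target)]
    # Backpropagate through the output layer, then through tanh.
    dh = [sum(w2[k][j] * err[k] for k in range(len(err))) * (1.0 - h[j] ** 2)
          for j in range(len(h))]
    grad = [sum(w1[j][i] * dh[j] for j in range(len(h))) for i in range(len(x))]
    return 0.5 * sum(e * e for e in err), grad

def inverse_design(target, w1, w2, x0, lr=0.1, steps=200):
    """Gradient descent on the design vector x; the network weights stay fixed."""
    x = list(x0)
    loss, grad = input_gradient(x, target, w1, w2)
    losses = [loss]
    for _ in range(steps):
        trial = [xi - lr * gi for xi, gi in zip(x, grad)]
        trial_loss, trial_grad = input_gradient(trial, target, w1, w2)
        if trial_loss < loss:
            x, loss, grad = trial, trial_loss, trial_grad
        else:
            lr *= 0.5                  # step too long: shrink it and retry
        losses.append(loss)
    return x, losses

w1, w2 = make_surrogate()
_, target_spectrum = forward([0.4, -0.2, 0.7], w1, w2)   # reachable target response
x_found, losses = inverse_design(target_spectrum, w1, w2, [0.0, 0.0, 0.0])
```

Changing the loss inside `input_gradient` (for example, to reward a property of the response instead of matching a full spectrum) changes the design goal without touching the network weights, which mirrors the remark above about altering the cost function without retraining.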
Deep Neural Networks (DNN)
To model more complex meta-atom structures and increase performance prediction accuracy, DL has been applied to the on-demand design of chiral (a form of anisotropy) MTS. Here, a deep neural network (DNN), i.e., an ANN comprising many hidden layers to significantly expand learning and generalization ability, was employed to automatically design and optimize 3-D chiral MTS with strong anisotropic spectra at predetermined wavelengths. The model comprised two bidirectional networks that were constructed using a partial stacking technique. This study limited the input design space (and hence the structures obtained) and predicted the reflection spectral response at discrete frequency points for two orthogonal polarizations and the cross-polarization coupling term, which together formed the spectral output. By fixing the inputs to be five specific design parameters, this DNN design approach is also limited in its generalizability to other physical structures in the design space. Full-wave simulation of example meta-atoms was used to generate the training data set. The DNN achieved high efficiency and high accuracy for performance prediction and inverse design of anisotropic MTS, where the meta-atom design space is limited.
14.3.2 Convolutional Neural Networks (CNN)
To improve on the lack of generalization and increase performance prediction accuracy, convolutional neural networks (CNN) are used to design anisotropic digital coding metasurfaces. CNNs are a class of ANNs that use convolution functions to learn hierarchical patterns within data. These models learn generalized patterns across many spatial scales from their input data and are widely used on image data. In one study, a CNN predicted the reflection phase response of binary coded meta-atoms, where each meta-atom comprises a square grid of sub-pixels mirrored with two-fold symmetry. The CNN used in this study is a 101-layer deep residual network, known as ResNet-101. The authors found that other networks with fewer layers resulted in less precise and less robust performance predictions.
The results report the accuracy as the fraction of phase responses predicted to within a given phase error. A drawback of this binary coding approach is that the number of potential design combinations grows exponentially with the number of pixels in the meta-atom. This study generated training data by simulating randomized pixel matrices. However, it was fundamentally inefficient, in a manner analogous to GA, because the training data is essentially random and does not contain knowledge of canonical structures. This likely results in significantly more required training data and greater network complexity. Another drawback of this study is that it required full-wave simulation of 70,000 training examples and 10,000 test examples to generate the dataset.
A significant CNN advantage is that the meta-atom shape is directly input into the network rather than shape-specific design parameters. The convolutional filters allow the CNN to learn the physical structure that leads to a given EM response, leading to broader applicability of the model.
In another work, the element phases of a reconfigurable MTS were computed by an 11-layer CNN for multiple beam steering applications. The input was the parameter vector representing the target beam pattern and the output was a matrix that carried the 1-bit codes for a programmable MTS. This technique produced beam patterns closely matching those obtained with conventional methods while reducing the time to obtain the phase matrices to a few milliseconds.
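For context on what a 1-bit code matrix encodes, the sketch below computes codes for a linear array by quantizing the ideal phase gradient and then evaluates the resulting pattern. This is the simple closed-form baseline, not the CNN described above, and it assumes normal incidence, half-wavelength spacing, isotropic elements, and illustrative values for the array size and steering angle.

```python
import cmath
import math

def one_bit_codes(n_elements, steer_deg, spacing=0.5):
    """Quantize the ideal linear phase gradient to two states: 0 -> 0 rad, 1 -> pi rad."""
    codes = []
    for n in range(n_elements):
        ideal = -2.0 * math.pi * spacing * n * math.sin(math.radians(steer_deg))
        ideal %= 2.0 * math.pi
        codes.append(1 if math.pi / 2 <= ideal < 3 * math.pi / 2 else 0)
    return codes

def pattern(codes, angles_deg, spacing=0.5):
    """Array-factor magnitude of the coded array at each observation angle."""
    out = []
    for a in angles_deg:
        k = 2.0 * math.pi * spacing * math.sin(math.radians(a))
        af = sum(cmath.exp(1j * (math.pi * c + k * n)) for n, c in enumerate(codes))
        out.append(abs(af))
    return out

codes = one_bit_codes(32, 20.0)                 # hypothetical 32-element array
angles = list(range(-90, 91))
mags = pattern(codes, angles)
peak_angle = angles[mags.index(max(mags))]
```

Because the two phase states correspond to real weights of +1 and -1, the pattern under normal incidence is mirror-symmetric about broadside, so a second beam of equal strength appears at the opposite angle. This is one concrete way in which binary phase coding limits beamforming performance.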
14.3.3 Deep Generative Models (DGMs)
Generative models are unsupervised or semi-supervised learning models that infer a function to describe hidden structure from unlabeled data. Their functions include clustering, density estimation, feature learning, and dimension reduction. Whereas discriminative networks capture the relationship between meta-atom geometry and spectral response from a training set, DGMs focus on learning the properties of meta-atom geometry distributions [41, 25, 33, 31]. Major classes of DGMs (Fig. 14.4) applied to MTS inverse design are as follows.
Generative adversarial networks (GANs)
In a GAN system, two ANNs compete to improve each of their models: the generative network learns to create inputs indistinguishable from the training data while the discriminative network learns to distinguish true data from the output of the generative network. Training GANs involves jointly training a generator network and a discriminator network in a game-theoretic approach to find a local Nash equilibrium. The goal of a generative model is to observe a collection of training examples and learn the underlying probability distribution that generates them. GANs are able to generate new samples from the estimated probability distribution. GANs were initially applied to generate photos but have since been applied to many domains, including speech and video generation. Very recently, GANs have been applied to generate new MTS hardware designs, including those not explicitly seen in the training dataset or current literature. At the end of a successful training process, GANs are able to produce realistic meta-atom designs, even for very complicated datasets and spectral responses.
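The two competing objectives can be written down compactly. The sketch below uses the standard binary cross-entropy discriminator loss and the commonly used non-saturating generator loss; the specific loss variants used in the works discussed here are not stated in the text, so this choice is an assumption. The inputs are discriminator output probabilities, not networks.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: training samples are labeled 1, generated samples 0."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating loss: small when the discriminator scores fakes as real."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# A confident discriminator: real designs score high, generated designs score low.
early_d = discriminator_loss([0.9, 0.95], [0.1, 0.05])
early_g = generator_loss([0.1, 0.05])

# An undecided discriminator that outputs 0.5 for every sample.
balanced_d = discriminator_loss([0.5, 0.5], [0.5, 0.5])
balanced_g = generator_loss([0.5, 0.5])
```

When the discriminator can no longer tell the two sources apart and outputs 0.5 everywhere, its loss settles at 2 ln 2 and the generator loss at ln 2, which is the balance point the adversarial game aims for.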
In our earlier work, we introduced GANs to microwave MTS design. GANs are promising for low-cost MTS design with complex frequency- and polarization-dependent scattering responses. In one approach, an input set of user-defined EM spectra is fed to a GAN that generates candidate patterns to match the on-demand spectra with high fidelity. Here, DNNs are employed to approximate the spectra of the MTS and perform inverse design by generating meta-atom structures that yield user-defined input spectra. Once the model is trained, extensive parameter scans and trial-and-error procedures are bypassed. This conditional deep convolutional GAN (cDC-GAN) architecture uses three interconnected CNNs: generator, discriminator, and simulator. The simulator is a pretrained network that serves as a surrogate model for fast spectral performance prediction. In this study, the simulator is a five-layer CNN with three fully-connected layers at the output. The conditional generator network accepts the desired spectral response and a latent noise vector to output potential meta-atom designs. The discriminator serves to train the generator by evaluating the distance between the distributions of geometric patterns from the training data and from the generator. At the end of successful training, the discriminator is unable to distinguish batches from the generator and the training set. This approach is shown to exhibit high accuracy in inverse design of meta-atoms.
In a related study, a deep convolutional GAN (DC-GAN) was employed to generate anisotropic RF meta-atom designs. Using a small set of simulated spectra, the network learned the relationship between the physical structure of meta-atoms and their reflection spectra for vertical and horizontal polarizations. The DC-GAN generated meta-atom structures that resembled design features in the training data. To speed up training, the training set was built from parametric variations of twelve published meta-atom designs, which were evaluated with a full-wave EM simulator. Starting out with parametric variations of canonical meta-atom scatterers, the network learned more efficiently than it would have from training with responses of randomized pixel data.
Conditional variational autoencoder (cVAE)
As an alternative to GAN approaches,  presents a probabilistic DGM that solves both forward and inverse problems at the same time. It is trained in an end-to-end manner and uses a deep convolutional cVAE (cDC-VAE) architecture (Fig. 14.4) comprising an encoder-decoder network structure. The encoder maps the meta-atom structure to a multivariate Gaussian distribution in the latent space, and the conditional decoder network takes the reflection spectra and the latent variable as inputs to generate meta-atom designs.
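The sampling step of such an encoder-decoder structure can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the architecture of the cited work: the function names, the latent size, and the spectrum length are hypothetical, and the encoder is assumed to output a mean and a log-variance per latent dimension.

```python
import numpy as np

def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def decoder_input(z, spectrum):
    """Conditional decoder input: latent sample concatenated with the target spectrum."""
    return np.concatenate([z, spectrum], axis=-1)

def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(sigma^2)) to the N(0, I) prior, summed over latent dims."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)
```

At design time, one would typically draw several latent samples for the same target spectrum, which is how a model of this kind can return more than one candidate meta-atom for a single desired response.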
In , the RIS inverse design is modeled in a probabilistic generative manner to investigate the complex structure–performance relationship in an interpretable way and to solve the one-to-many mapping issue that is intractable in deterministic models. The work develops a semi-supervised learning strategy that allows the model to utilize unlabeled data in addition to labeled data in end-to-end training. The RIS design and spectral response are encoded into a low-dimensional latent space with a predefined prior distribution, from which the latent variables are sampled. The DGM, comprising prediction, recognition, and generation models, serves as a tool to accelerate the design, characterization, and even new discovery of MTS.
Global topology optimization networks (GLOnets)
Recently, GANs have been utilized to learn structural features of topology-optimized (TO) metagratings for inverse design [33, 31]. TO is a method of optimizing a material layout or an array of pixels to maximize system performance given a set of constraints and boundary conditions. Unlike other approaches, the simulation overhead for TO does not increase with the number of RIS units. In , free-form diffractive metagratings were designed using TO-GAN. Here, DGMs were trained from images of periodic, TO metagratings to produce efficient scattering structures with the desired performance over a broad range of frequencies and angles. The network employed training examples for each angle. However, the performance of the best structures was not robust and additional refinement was needed to meet the desired performance. In , dielectric metasurface optimization was performed using a physics-informed cGAN. Global optimization-based generative networks (GLOnets) are able to search the design space for the global optimum design. Unlike other GAN approaches, GLOnets seek to fit a narrow-peaked function centered on the optimal solution without a training set. The GLOnet generates a distribution of meta-atoms to sample the global design space and then shifts the distribution toward a more optimal design. Training requires computing forward and adjoint EM simulations of the output structures and backpropagating the resulting gradients through the network. In this work, GLOnets are shown to be a successful and computationally efficient global TO method for MTS and metagratings.
14.4 Case Studies
We perform two case studies for the design of single- and multi-layer RIS based on  and , respectively. The design approach in  introduced a cDC-GAN-based method for jointly designing several layers of tensorial RIS. It represented three RIS layers with a red-green-blue (RGB) image matrix. In this data representation, the top-layer meta-atom design is the first channel, the second-layer design is the second channel, and the third-layer design is the third channel of the conventional RGB image format. The advantages of the cDC-GAN are that it trains classifiers in a semi-supervised manner and generates new free-form shapes not previously shown in the literature. However, GANs can be unstable and challenging to train. We validated the designs by simulating their spectrum using a full-wave EM solver and comparing the results to the desired spectrum.
14.4.1 MTS Characterization Model
Consider a two-dimensional (2-D) MTS lying in the x-y plane with z-axis being the direction of propagation. According to Huygens’ principle, the EM fields created by arbitrary sources in an arbitrary volume are found as the fields created by equivalent surface currents on the volume surface . Therefore, a known incident EM source, such as a plane wave, can be transformed into a desired transmitted or reflected wave using an MTS. The MTS creates the desired aperture field distribution or phase shift by modifying the effective boundary conditions of the EM surface.
The amplitude and phases of transmitted and reflected waves from MTS are functions of surface-averaged induced electric and magnetic current densities, and , respectively. These effective surface current densities induced on MTS are described by average tangential electric () and magnetic () fields on each side of MTS as
where is the unit vector normal to the MTS. Examples of passive implementations of Huygens’ MTS (HMS) include reflectionless refraction, perfect anomalous reflection, and arbitrary antenna beamforming .
The induced surface currents and are related to their respective average tangential fields (applied on a thin slab of polarizable particles) by spatially-varying electric surface impedance and magnetic surface admittance ,
In case of single polarization, the tensor quantities above reduce to scalars. Specifying the desired incident fields and and desired output fields and leads to computation of the required electric impedance and magnetic admittance at each (spatial) location of the MTS. The bianisotropy is included in these boundary conditions by introducing the tensor magneto-electric coupling coefficient as 
The transmission and reflection spectral responses of MTS, described by vectors and , respectively, are functions of the surface impedances at a particular incidence angle and frequency (GHz). For instance, [5, 11] where is a non-linear function. In this chapter, we fix (broadside incidence) so that , where we have omitted the arguments for simplicity. From here on, we focus on only because the design procedure using is identical.
The EM wave is also characterized by its polarization. We consider two polarizations - ‘’ and ‘’ - wherein the electric field is parallel to the x- and y-directions, respectively. For an incident wave with a particular polarization, the MTS produces responses in both polarizations. For example, the response in () polarization when the incident wave is also -polarized (-polarized) is the co-polar response (). Similarly, cross-polar responses and are defined. A multi-layer meta-atom consists of multiple layers of different shapes separated by dielectric spacers for structural support. Consider a 3-layer MTS (Fig. 14.5) whose composite response is the superposition of the responses of individual layers. Our goal is to train the MD-GAN to implicitly learn physical quantities , , and by mapping various design geometries to transmission spectra and produce new meta-atom designs for each layer to realize composite responses , , , and .
14.4.2 Training and Design
We evaluated the proposed inverse design approach by implementing our distributed cMD-GAN architecture using PyTorch and performing simulations on an NVIDIA Tesla T4 GPU. During training, we included parametric variations of only those meta-atom shapes that have been extensively studied in the literature. Fig. 14.6 lists these shapes and enumerates variations in the physical parameters used to generate training data. CNNs process matricized data, e.g., a color image composed of three matrices, each of which contains pixel intensities in the red, green, and blue (RGB) color channels. Rather than feeding a three-channel image matrix representing physical RGB colors, as is conventionally done in image recognition, we exploit the three-channel matrix input of the CNN to represent the spatial design of meta-atom scatterers in different layers of a multi-layer MTS. Prior works do not employ this technique of representing multiple MTS layers as channels of an image matrix.
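The layers-as-channels representation can be illustrated with a short NumPy sketch. It assumes, purely for illustration, that each layer is a binary pixel pattern on a shared grid and that the network consumes a channels-first array; the helper names `stack_layers` and `split_layers` are hypothetical and not taken from the implementation described above.

```python
import numpy as np

def stack_layers(top, middle, bottom):
    """Stack three layer patterns (H x W each) into one channels-first array (3 x H x W)."""
    layers = [np.asarray(p, dtype=np.float32) for p in (top, middle, bottom)]
    if len({p.shape for p in layers}) != 1:
        raise ValueError("all layers must share the same pixel grid")
    return np.stack(layers, axis=0)

def split_layers(image, threshold=0.5):
    """Recover per-layer binary patterns from a generated 3-channel array."""
    return tuple((image[c] > threshold).astype(np.uint8) for c in range(3))
```

Thresholding the generator output in `split_layers` is one simple way to return to a metal/no-metal pattern per layer before full-wave validation.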
In the first case study, we generated single-layer meta-atom designs using cDC-GAN. The co-polarization and cross-polarization transmission responses of the resulting meta-atom designs (Fig. 14.7) differed from EM simulators by less than 1 dB. One of the most exciting features of cDC-GAN is its ability to discover new geometries not previously found in the literature. This suggests that the model implicitly learned the physical relationships of Maxwell’s equations rather than simply interpolating from past designs. We perform a second case study for multi-layer meta-atom design. Building on this technique, the federated learning approach in  employed a conditional multi-discriminator distributed GAN (cMD-GAN) (see Fig. 14.4) for multi-layer RF MTS discovery (Fig. 14.8). The results show the feasibility of GAN-based approaches for meta-atom discovery.
Lately, RIS-aided wireless systems have exploited DL to handle very challenging problems. For instance, signal detection in RIS requires the development of end-to-end learning systems that account for the effect of the channel and beamformers . The channel needs to be estimated for multiple communication links, i.e., BS-user and BS-RIS-user . Finally, beamformers are designed (by solving complex optimization problems) for the phase shifters at the BS and the passive elements of the RIS . DL-based techniques are able to handle the huge, multidimensional datasets in all these problems and may also be employed for channel modeling , where conventional model-based approaches are not very useful. There have been recent surveys on applying DL  and RIS  individually to wireless communications. Here, we provide an overview of systems that jointly employ both approaches. In particular, we describe DL techniques (Table 14.2) for three important RIS problems: signal detection, channel estimation, and beamforming. Each of these requires different DL architectures, which have so far included supervised learning (SL), unsupervised learning (UL), reinforcement learning (RL), and federated learning (FL). UL and RL do not require labeling, SL needs a labeled dataset, and FL has a distributed structure for model training. We provide a detailed synopsis of the advantages and shortcomings of each algorithm for these three applications in the subsequent sections.
Consider an RIS-assisted scenario wherein the BS with antennas transmits data symbols by using a baseband precoder . Hence, the downlink transmitted signal becomes . The transmitted signal reaches the -th user through two components: one through the direct path from the BS and the other through the RIS. The received signal at the -th user is given by
where and denotes the direct channel between the BS and the -th user. The vector expresses the RIS-assisted channel between the RIS and the -th user. is a diagonal matrix, i.e., . Here, represents the on/off state of the RIS elements. In practice, the RIS elements cannot be perfectly turned on/off. Hence, can be modeled as for . is the phase shift of the reflective elements. Finally, the channel between the RIS and the BS is represented by .
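The two-path signal model described here can be written as a small NumPy sketch. All names and the conjugation convention are assumptions made for illustration (`h_direct` for the BS-user channel, `h_ris` for the RIS-user channel, `H_bs_ris` for the BS-RIS channel, `beta` and `phases` for the element amplitudes and phase shifts, `F` for the precoder); the chapter's own notation may order or conjugate these terms differently.

```python
import numpy as np

def received_signal(h_direct, h_ris, H_bs_ris, phases, F, s, beta=None, noise=0.0):
    """Single-user received sample: direct path plus RIS-reflected path.

    h_direct: (M,) BS-user channel, h_ris: (L,) RIS-user channel,
    H_bs_ris: (L, M) BS-RIS channel, phases: (L,) phase shifts in radians,
    F: (M, K) baseband precoder, s: (K,) data symbols.
    """
    phases = np.asarray(phases, dtype=float)
    beta = np.ones_like(phases) if beta is None else np.asarray(beta, dtype=float)
    psi = np.diag(beta * np.exp(1j * phases))          # diagonal reflection matrix
    h_eff = h_direct.conj() + h_ris.conj() @ psi @ H_bs_ris
    return h_eff @ (F @ s) + noise
```

Setting `beta` to zeros models all elements being switched off, which leaves only the direct-path contribution.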
In mm-Wave transmission, the channel can be represented by the Saleh-Valenzuela (SV) model where a geometric channel model is adopted with limited scattering. Hence, we assume that the mm-Wave channels, i.e., and , include the contributions of , and paths, respectively. Thus, we can represent the channels and as and where and are the complex channel gains and received path angles for the corresponding channels, respectively. and are and steering vectors of the path angles as , where and is the array spacing for the wavelength . Further, the mm-Wave channel between the BS and the RIS is given by
where denotes the complex gain and are the angle-of-departure (AOD) and angle-of-arrival (AOA) angles of the paths, respectively. and are the steering vectors. Let be the cascaded channel matrix between the BS and the -th user as where . Then, we can write , for which we have .
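A geometric channel of this kind can be generated numerically as follows. This is a sketch under assumptions: a uniform linear array with half-wavelength spacing, unit-norm steering vectors, and a common normalization by the number of paths; the function names and the sign convention of the phase progression are illustrative and may differ from the definitions used in this chapter.

```python
import numpy as np

def steering_vector(theta, n, spacing=0.5):
    """Unit-norm ULA steering vector; spacing is the element spacing in wavelengths."""
    m = np.arange(n)
    return np.exp(-1j * 2 * np.pi * spacing * m * np.sin(theta)) / np.sqrt(n)

def sv_vector_channel(gains, angles, n):
    """Vector channel as a sum of path gains times steering vectors."""
    paths = len(gains)
    return np.sqrt(n / paths) * sum(g * steering_vector(t, n) for g, t in zip(gains, angles))

def sv_matrix_channel(gains, aoa, aod, n_rx, n_tx):
    """Matrix channel: each path contributes a rank-one outer product of steering vectors."""
    paths = len(gains)
    H = np.zeros((n_rx, n_tx), dtype=complex)
    for g, ta, td in zip(gains, aoa, aod):
        H += g * np.outer(steering_vector(ta, n_rx), steering_vector(td, n_tx).conj())
    return np.sqrt(n_rx * n_tx / paths) * H
```

Because each path adds a rank-one term, the rank of the matrix channel is at most the number of paths, which is the limited-scattering property that sparse estimators exploit.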
14.5.1 DL-Based Signal Detection in RIS
The signal detection comprises mapping the received symbols under the effect of channel and beamformers to transmit symbols (Fig. 14.9). The signal detection problem can be formulated as
which requires the knowledge of the channel, i.e., and . Instead, a DL-based model accepts the input data , where is the number of collected observations. Then, the DL model is trained to construct a non-linear mapping between the corrupted data and the clean symbols .
To leverage DL for signal detection,  devised a multi-layer perceptron (MLP) for mapping the data symbols affected by the channel and reflecting beamformer to the transmit symbols. The MLP is a feedforward neural network (NN) composed of multiple hidden layers. The framework in  uses three fully connected layers. Once the MLP is trained on a dataset composed of received-transmitted data symbols, each user feeds the learning model with the block of received symbols. These blocks account for the effect of the channel and beamformers. Then, the MLP yields the estimated transmit symbols.
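The forward pass of such a detector can be sketched with NumPy. The layer widths, the ReLU activation, the random initialization, and the choice of stacking real and imaginary parts as real-valued features are assumptions for illustration; only the use of three fully connected layers comes from the description above.

```python
import numpy as np

def to_real_features(y):
    """Stack real and imaginary parts of a complex block into one real feature vector."""
    y = np.asarray(y)
    return np.concatenate([y.real, y.imag], axis=-1)

def init_mlp(sizes, rng):
    """Create (weight, bias) pairs for consecutive layer sizes."""
    return [(rng.standard_normal((i, o)) * np.sqrt(2.0 / i), np.zeros(o))
            for i, o in zip(sizes[:-1], sizes[1:])]

def mlp_forward(params, x):
    """ReLU hidden layers followed by a linear output layer."""
    for W, b in params[:-1]:
        x = np.maximum(x @ W + b, 0.0)
    W, b = params[-1]
    return x @ W + b   # real and imaginary parts of the symbol estimates
```

With `sizes = [16, 32, 32, 16]`, the network has three fully connected layers and maps a block of 8 complex received symbols to 8 complex symbol estimates.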
A major advantage of this approach is its simplicity: the learning model estimates the data symbols directly, without a prior stage for channel estimation. Thus, this method is helpful in reducing the cost of channel acquisition. In , a bit-error-rate (BER) analysis has shown that the DL-based RIS signal detection (DeepRIS) provides a lower BER than the minimum mean-squared-error (MMSE) estimator and performance close to that of the maximum likelihood estimator.
However, a few challenges remain in achieving reliable performance. The training data should be collected under several channel conditions and different beamformer configurations so that the trained model learns the environment well and performs accurately in different scenarios. This is a particularly challenging task because it requires collecting training data for different user locations. As a result, DL-based signal detection demands a huge training dataset collected under different channel conditions.
14.5.2 DL-Based RIS Channel Estimation
The RIS is composed of a huge number of reflecting elements and, therefore, channel state acquisition is a major task in RIS-assisted wireless systems. A common approach is to turn on and off each individual RIS element one-by-one while also using orthogonal pilot signals to estimate the channel between the BS and the users through RIS. In particular, RIS channel estimation via DL involves constructing a mapping between the received input signals at the user and the channel information of direct and cascaded links (Fig. 14.9). In this way, DL-based techniques reduce the pilot percentage and complexity in channel estimation stage .
The SL approach proposed in  estimates both direct and cascaded channels via twin convolutional neural networks (CNNs). First, the received pilot signals at the user are collected by sequentially turning on the individual RIS elements. Then, the collected data are used to find the least squares (LS) estimates of the cascaded and the direct channels. Both CNNs are trained to map the LS channel estimates to the true channel data. The upshot is that each user estimates its own channels only once and feeds the received pilot data (LS estimate) to the trained CNN models. The CNNs have a higher tolerance than the MLP against channel data uncertainties and imperfections of the RIS elements (such as switching mismatch).
When the model training is conducted at the user with huge datasets as in , the system may lack sufficient computational capability. This is overcome by FL-based training , where the learning model updates are computed at the devices (nodes) and aggregated at the BS (central server) (Fig. 3), thereby eliminating the transmission of raw data. FL significantly reduces the transmission overhead since the size of the datasets is usually larger than the size of the learning model, and its performance improves as the number of users increases [7, 42]. Furthermore, instead of using two CNNs as in , a single CNN in  jointly estimates both cascaded and direct channels.
Although FL reduces the transmission overhead during model training, its training performance is upper bounded by that of centralized model training, i.e., training the model with the whole dataset at once. Therefore, the prediction performance of FL is usually poorer than that of centralized learning (CL). Fig. 14.10 compares the CL and FL frameworks with MMSE and LS estimation. We note that FL performs slightly poorer than CL in high SNR regimes. Despite this, FL significantly reduces the transmission overhead, e.g., an approximately ten-fold reduction in the number of transmitted symbols . The performance of FL improves with the increase in the number of users or edge devices because this reduces the variance of the model updates aggregated at the BS. The diversity of the local datasets of the users also affects the training/prediction performance, and better performance is obtained if the local datasets are close to uniform.
Both SL- and FL-based channel estimation techniques suffer from high channel training overhead. In this context, compressive channel estimation with deep denoising neural networks (DDNNs) is very effective . It employs a hybrid passive/active RIS architecture, where the active RIS elements are used for uplink pilot training and the passive ones for reflecting the signal from the BS to the users. Once the BS collects the compressed received pilot measurements, the complete channel matrix is recovered through sparse reconstruction algorithms such as orthogonal matching pursuit (OMP). Then, the DDNN is used to improve the channel estimation accuracy by exploiting the correlation between the real and imaginary parts of the mm-Wave channel in the angular-delay domain. During training, the input is the OMP-reconstructed channel matrix and the output is the noise, i.e., the difference between the OMP estimate and the ground truth channel data. This method leverages both CS and DL, yielding better performance than using either technique individually. The major drawback is the additional hardware complexity introduced by the active RIS elements. Furthermore, the OMP output is used in place of the raw received pilot measurements for constructing the input. This requires repeated execution of the OMP algorithm, thereby increasing the prediction complexity over the DL methods in  and .
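For readers unfamiliar with the sparse reconstruction step, a generic OMP routine is sketched below. It is a textbook version, not the implementation of the cited work: the sensing matrix `A`, the measurement vector `y`, and the known sparsity level are assumptions, and in the RIS setting `A` would be built from the pilot configuration and an angular-domain dictionary.

```python
import numpy as np

def omp(A, y, sparsity):
    """Orthogonal matching pursuit: greedily recover a sparse x from y = A @ x."""
    residual = y.astype(complex)
    support = []
    coef = np.zeros(0, dtype=complex)
    for _ in range(sparsity):
        corr = np.abs(A.conj().T @ residual)      # correlate residual with every column
        if support:
            corr[support] = 0.0                   # never pick the same column twice
        support.append(int(np.argmax(corr)))
        # Least-squares fit on the selected columns, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1], dtype=complex)
    x[support] = coef
    return x
```

In the denoising scheme described above, the output of a routine like this would form the network input, and the network would be trained to predict the remaining estimation error.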
Consider the downlink scenario where the BS transmits the orthogonal pilot signals , one at a single coherence time , with and . Hence, the total number of channel uses to estimate the direct channel is . The received signal at the -th user can be given by
where is the pilot signal matrix while and are row vectors and . We assume that the pilot training has two phases: direct channel estimation (i.e., ) and the cascaded channel estimation (i.e., ). In phase I, we assume that all of the RIS elements are turned off, i.e., , by using the BS backhaul link. We note here that setting as does not affect the direct and cascaded channels since they do not depend on the reflect beamformer as seen in (14.12). Then, the received baseband signal at the -th user becomes
Here, the direct channel is selected as the label of the deep network with the corresponding input data of .
Once , being the estimated channel, is obtained, in the second phase of the training stage, the cascaded channel can be estimated. This can be achieved via two approaches. In the first approach, pilot signals are transmitted when each of the RIS elements is turned on one by one. In this case, the BS sends a request to RIS via the micro-controller device in the backhaul link to turn on a single RIS element at a time. For the -th frame, the reflect beamforming vector becomes where and the received signal from the cascaded channel at the -th user becomes
where and are row vectors. In (14.14), represents the -th column of as . Then the least-squares (LS) estimate of becomes
By using , (14.15) can be solved for . Then, we can construct the estimated cascaded matrix as .
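The two-phase LS procedure can be sketched as follows. The notation is an assumption for illustration: the received pilot row is modeled as `y = h @ X + noise` with `X` of size antennas by pilots, the direct channel is estimated first with the RIS off, and each cascaded component is obtained by subtracting the direct-path contribution from the frame in which one element is on. Here the per-element estimates are stacked as rows, whereas the text above arranges them as columns.

```python
import numpy as np

def ls_channel_estimate(y, X):
    """Least-squares estimate of h from y = h @ X + noise, X being (antennas x pilots)."""
    return y @ np.linalg.pinv(X)

def cascaded_estimate(Y_on, h_direct_hat, X):
    """Estimate one cascaded-channel vector per RIS element.

    Y_on[l] is the received pilot row when only element l is turned on.
    """
    rows = [ls_channel_estimate(y_l - h_direct_hat @ X, X) for y_l in Y_on]
    return np.stack(rows, axis=0)   # shape: (ris_elements, antennas)
```

With orthogonal pilots the pseudo-inverse reduces to a scaled conjugate transpose, so the per-frame cost is a single matrix-vector product.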
Then, the deep network accepts the received signals as input at the preamble stage. As a result, the input-output pairs become and for direct and cascaded channel estimation, respectively.
Now, let us consider model training via CL for channel estimation, wherein the training is performed by collecting the local datasets from the users. Once the BS has collected the whole dataset , the training is performed by solving the following problem
where is the number of training samples and denotes the loss function defined as
which is the MSE between the label data and the prediction of the CNN, .
On the other hand, in FL, the local datasets are preserved at the users and not transmitted to the BS. Hence, FL-based model training is performed at the user side as
where . Notice that the FL-based model training in (14.5.2) is solved at the user while the CL problem in (14.5.2) is handled at the BS. To efficiently solve (14.5.2) and (14.5.2), gradient descent (GD) is employed and the problems are solved iteratively. In CL, the gradient is computed over the whole dataset as and the parameter update is performed as
where is the learning rate.
In FL, each user computes the gradients individually as to solve (14.5.2), then sends them to the BS, where the model parameters are updated as
Once the model is trained, each user can feed its received pilot signals to the CNN to predict its channel data.
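The difference between the two update rules can be made concrete with a toy example. The sketch below replaces the CNN with a linear model and an MSE loss so that the gradients are explicit; the function names are hypothetical, and the plain averaging of user gradients at the BS is an assumption consistent with the description above.

```python
import numpy as np

def mse_gradient(theta, X, Y):
    """Gradient of the mean squared error of the linear model X @ theta."""
    return 2.0 * X.T @ (X @ theta - Y) / len(X)

def centralized_step(theta, X, Y, lr):
    """CL: one gradient step computed at the BS over the whole dataset."""
    return theta - lr * mse_gradient(theta, X, Y)

def federated_step(theta, local_sets, lr):
    """FL: users compute local gradients; the BS averages them and updates the model."""
    grads = [mse_gradient(theta, Xk, Yk) for Xk, Yk in local_sets]
    return theta - lr * np.mean(grads, axis=0)
```

When all users hold equally sized local datasets, the averaged gradient equals the centralized one, so the two steps coincide; differences in practice arise from unequal or non-uniform local data and from communication noise on the uploaded gradients.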
| Learning Scheme | NN Architecture | Benefits | Drawbacks |
|---|---|---|---|
| SL | MLP with layers | No need for channel estimation algorithm | Still needs to design beamformers and requires huge datasets and deeper NN architectures |
| SL | Twin CNNs with convolutional, fully connected layers | Each user estimates its own channel with the trained model | Data collection requires channel training by turning on/off each RIS element |
| FL | A single CNN with convolutional, fully connected layers | Less transmission overhead for training. A single CNN estimates both cascaded and direct channels | Performance depends on the number of users and the diversity of the local datasets |
| SL | DDNN with convolutional layers | Leverages both compressed sensing (CS) and DL methods | Requires active RIS elements. High prediction complexity arising from CS algorithms |
| SL | MLP with layers | Reduced pilot training overhead | Requires active RIS elements for channel training |
| UL | MLP with layers | Reduced complexity at the model training stage | Implicitly needs the reflect beamformers as labels |
| RL | DQN with layers | Provides standalone operation since RL does not require labels like SL | Longer training. Active RIS elements needed for channel acquisition |
| RL | DDPG with -layered actor and critic networks | Better performance than DQN | Large number of NN parameters are involved |
| RL | DDPG with actor and critic networks | Accelerated learning performance with the aid of optimization, shrinking the search space | Additional optimization tools needed |
| FL | MLP with layers | Less transmission overhead involved during model training | RIS must be connected to the PS |
| RL | DQN with layers | Robust against eavesdropping | High model training complexity |
| RL | DQN | Energy-efficient and robust against channel uncertainty | RIS beamforming only |
| SL | MLP with layers | Reduces hardware complexity of multiple BSs and improves RSS for indoor environments | Learning model performance relies on room conditions |
14.6 DL-Aided Beamforming for RIS Applications
Beamforming in RIS-based communications has diverse applications such as RIS-only beamforming (passive), BS-RIS beamforming (active/passive), secure beamforming (eavesdroppers included), energy-efficient beamforming, and indoor RIS beamforming. There are specific DL challenges and solutions to each one of these problems.
In general, the beamforming design problem in an RIS-assisted scenario maximizes the spectral efficiency of the system as
where is the total transmit power and denotes the set of discrete phase-shifts. Also, we define the signal-to-interference-plus-noise ratio (SINR) as , wherein only the RIS-reflected channel is assumed.
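The objective and the discrete phase constraint can be evaluated numerically as sketched below. The names are assumptions for illustration: `G` holds one effective (RIS-reflected) channel per user as a row, `F` holds one precoding vector per user as a column, and the discrete phase set is taken to be uniform with `2**bits` levels, which the chapter does not specify.

```python
import numpy as np

def sinr(G, F, noise_power):
    """Per-user SINR for effective channels G (users x antennas) and precoder F (antennas x users)."""
    gains = np.abs(G.conj() @ F) ** 2            # gains[k, j] = |g_k^H f_j|^2
    signal = np.diag(gains)
    interference = gains.sum(axis=1) - signal
    return signal / (interference + noise_power)

def spectral_efficiency(G, F, noise_power):
    """Sum of log2(1 + SINR_k) over all users, in bits/s/Hz."""
    return float(np.sum(np.log2(1.0 + sinr(G, F, noise_power))))

def quantize_phases(phases, bits):
    """Project continuous phases onto a uniform set of 2**bits discrete phase shifts."""
    levels = 2 ** bits
    step = 2 * np.pi / levels
    return (np.round(np.mod(phases, 2 * np.pi) / step) % levels) * step
```

A learning-based design would search over `F` and the RIS phases to increase `spectral_efficiency`, subject to the transmit power budget and the discrete phase set.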
14.6.1 Beamforming at the RIS
RIS beamforming requires the passive elements to continuously and reliably reflect the BS signal to the users. Here, the MLP architecture  is helpful in designing the reflect beamforming weights using active RIS elements . These elements are randomly distributed across the RIS. They are used for pilot training, after which compressed channel estimation is carried out using OMP. During data collection, the reflect beamforming weights are optimized by using the estimated channel data. Finally, a training dataset is constructed with channel data and reflect beamformers as the input-output pairs for an SL framework. Note that the active RIS elements present similar shortcomings as in . However, the method in  excels by leveraging DL for designing beamformers.
The labeling process in  demands solving an optimization problem for each channel instance in the training data generation stage. One possible way to mitigate this is to use label-free techniques, such as UL. The UL approach in  for reflect beamforming design employs an MLP with five fully connected layers. The network maps the vectorized cascaded and direct channel data at the input to the phase values of the reflect beamformers at the output. The loss function is selected as the negative of the norm of the channel vector, which may seem like an unsupervised approach because it does not minimize the error between the label and the learning model prediction. However, this technique yields the phase information at the output uniquely for each training sample. Consequently, the beamformers implicitly behave like labels in the training process. In conventional UL, the training data is clustered into smaller sets without prior knowledge about the “meaning” of each cluster. However, in , the output of the NN is a design parameter, i.e., the reflect beamformer phases, which carries the complexity of beamformer optimization for each input.
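The label-free loss described above can be illustrated with a short NumPy sketch. The composition of the effective channel from a direct term and a cascaded term weighted by unit-modulus reflection coefficients is an assumption about the notation, and the function names are hypothetical.

```python
import numpy as np

def effective_channel(h_direct, G_cascaded, phases):
    """Direct channel plus the cascaded channel weighted by the reflection coefficients.

    h_direct: (M,), G_cascaded: (L, M), phases: (L,) in radians.
    """
    v = np.exp(1j * np.asarray(phases))
    return h_direct + v @ G_cascaded

def unsupervised_loss(h_direct, G_cascaded, phases):
    """Negative norm of the effective channel; lower is better."""
    return -np.linalg.norm(effective_channel(h_direct, G_cascaded, phases))
```

For a single-antenna BS, the loss is minimized when every reflected term is phase-aligned with the direct channel, which is the solution a trained network should approach.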
In order to eliminate the expensive labeling process of the SL-based techniques,  employed RL to design the reflect beamformers for single-antenna users and BS. RL is a promising approach that directly yields the output by optimizing the objective function of the learning model. First, the channel state is estimated by using two orthogonal pilot signals. An action vector is selected either by exploitation (using prior experience of the learning model) or exploration (using a predefined codebook). After computing the achievable rate from the environment based on the selected action vector, a reward or penalty is imposed by comparing the achievable rate with a threshold. Upon reward calculation, a deep Q-network (DQN) (Fig. 14.12) updates the map from the input state (channel data) to the output action (action vector composed of reflect beamformer weights). This process is repeated for several input states until the learning model converges. While RL is not an RIS-specific technique, it is particularly useful in lowering the overhead of the labeling process as compared to SL architectures based on RNN or CNN models, which require labeled datasets. The RL algorithm learns the reflect beamformer weights based on the optimization of the achievable rate. Thus, RL presents a solution for online learning schemes, where the model effectively adapts to changes in the propagation environment. However, RL techniques have longer training times than the SL approaches because the reward mechanism and discrete action spaces make it difficult to reach the global optimum. The label-free process also implies that RL usually has slightly poorer performance than SL.
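The action selection and reward steps of this loop can be sketched as follows. This is a deliberately simplified stand-in: a table of action values replaces the deep Q-network, the codebook is any finite list of candidate reflect beamformers, and the reward values of +1 and -1 are assumptions; none of the names come from the cited work.

```python
import numpy as np

def select_action(q_values, epsilon, rng):
    """Epsilon-greedy choice over a codebook of candidate reflect beamformers."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))   # exploration: random codeword
    return int(np.argmax(q_values))               # exploitation: best known codeword

def achievable_rate(h_direct, h_cascaded, codeword, snr):
    """Rate for a single-antenna BS and user with reflection vector `codeword`."""
    gain = np.abs(h_direct + h_cascaded @ codeword) ** 2
    return float(np.log2(1.0 + snr * gain))

def reward(rate, threshold):
    """Reward if the rate exceeds the threshold, penalty otherwise."""
    return 1.0 if rate > threshold else -1.0

def q_update(q_values, action, r, lr=0.1):
    """Move the stored value of the chosen action toward the observed reward."""
    q_values = q_values.copy()
    q_values[action] += lr * (r - q_values[action])
    return q_values
```

In a DQN, the table is replaced by a network that maps the channel state to one value per codeword, and `q_update` becomes a gradient step on the network parameters.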
To accelerate the training stage by the use of continuous action spaces, a deep deterministic policy gradient (DDPG) (Fig. 14.12) was introduced in . Here, actor-critic network architectures are used to compute actions and target values, respectively. First, the learning stage is initialized with an input state formed from the cascaded and direct channels. Given the state information, a deep policy network (DPN) (actor) constructs the actions (reflect beamformer phases). Here, the DPN provides a continuous action space that converges faster than the DQN architecture in . The action vector is used by the critic network architecture to estimate the received signal-to-noise ratio (SNR) as the objective. This SNR then yields the target beamformer vector under the learning policy. Using the gradient of the DPN, the network parameters are updated and the next state is constructed as the combination of the received SNR and the reflect beamformers. This process is repeated until it converges. An additional benefit of this approach is that it outperforms fixed-point iteration (FPI) algorithms used to solve reflect beamforming optimization. Moreover, the continuous action space representation with DPN in DDPG provides robustness of the learning model against changes in channel data. However, multiple NN architectures (actor and critic networks) increase the number of learning parameters and aggravate model update requirements for each architecture.
The model initialization in both DQN and DDPG may force the learning models to start far from the optimum point during the early stages of learning. This leads to slow convergence and poor reward performance. In order to accelerate the learning process,  devised a joint learning and optimization technique. The key idea is to use DDPG to search for the optimal action for each decision epoch during training. Then, a feasible beamformer vector is found via optimization in a convex-approximation setting. This reduces the search space of the DDPG algorithm and shortens training times.
Even though RL is a label-free approach that reduces the overhead during training data generation, the training approaches in [15, 62, 20] incur a large transmission overhead because they are trained on huge datasets. This is mitigated in FL techniques. The FL approach in  learns the RIS reflect beamformers by training an MLP, with the model updates computed at each user on its local dataset. The model updates are aggregated in a parameter server (PS), which is connected to the RIS. The MLP input is the cascaded channel information and the output labels are the RIS beamformer weights. The federated architecture lowers the transmission overhead during training. However, it is assumed that the PS is connected to the RIS. The simple architecture of the RIS could make this infeasible. It is more practical to access the PS via the BS for model training.
Physical layer security in wireless systems is largely achieved through signal processing techniques, such as cooperative relaying and cooperative jamming. Hardware complexity is a major issue in these methods. Low-cost, less complex RIS-based systems have the potential to mitigate these problems. The RL-based secure beamforming  maximizes the secrecy rate by jointly designing the beamformers at the RIS and BS to serve multiple legitimate users in the presence of eavesdroppers. The RL algorithm accepts as states the channel information of all users, the secrecy rate, and the transmission rate. Similar to , the action vector comprises the beamformers at the BS and RIS. The reward function is designed based on the secrecy rate of the users. A DQN is trained to learn the beamformers by maximizing the secrecy rate while guaranteeing the quality-of-service requirements. The model training takes place at the BS, which is responsible for collecting the environment information (channel data) and making decisions for secure beamforming. This scheme is more realistic and reliable than those of [62, 15], which ignore the effect of eavesdroppers. The learning model includes high-dimensional state and action information, such as the channels of all users and the beamformers of the BS and RIS. This may necessitate more computing resources for training than non-secure RIS [62, 15] and conventional SL techniques [9, 63].
14.6.3 Energy-Efficient Beamforming
The RIS configuration changes dynamically depending on the network status. It is very demanding for the BS to re-optimize the transmit power every time the on/off status of the RIS elements is updated. This could be addressed by accounting for energy efficiency in the beamformer design problem. In , a self-powered RIS scenario maximizes the energy efficiency by optimizing the transmit power and the RIS beamformer phases. In this DQN-based RL approach, the BS learns the outcome of the system performance while updating the model parameters. Thus, the BS makes decisions to allocate the radio resources by relying only on the estimated channel information. The RL framework selects as its states the estimated channels from the users and the energy level of the RIS. Meanwhile, the action vector includes the transmit power, the RIS beamformer phases, and the on/off status of the RIS elements. The learning policy is based on the reward, which is selected as the energy efficiency of the overall system. However, this work considers only RIS beamforming and ignores beamforming at the BS.
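The energy-efficiency reward can be illustrated with a short sketch. The power model below (transmit power plus a fixed consumption per active RIS element plus a static term) is an assumption made for illustration and is not taken from the cited work; the function and parameter names are hypothetical.

```python
from typing import Sequence


def energy_efficiency(sum_rate: float,
                      tx_power_w: float,
                      element_on: Sequence[bool],
                      per_element_power_w: float,
                      static_power_w: float) -> float:
    """Reward = achieved sum rate divided by total consumed power.

    Assumption: each active RIS element draws a fixed power and inactive
    elements draw none. Units are rate units per watt.
    """
    active = sum(1 for is_on in element_on if is_on)
    total_power = tx_power_w + active * per_element_power_w + static_power_w
    if total_power <= 0.0:
        raise ValueError("total power must be positive")
    return sum_rate / total_power


# One action of the agent: a transmit power and an on/off pattern.
reward = energy_efficiency(
    sum_rate=10.0,
    tx_power_w=1.0,
    element_on=[True, False, True, True],
    per_element_power_w=0.25,
    static_power_w=0.25,
)
```

Switching an element off lowers the denominator but typically also lowers the achievable rate, which is the trade-off the agent has to learn.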
14.6.4 Beamforming for Indoor RIS
Different from the above scenarios,  addresses the RIS beamformer design problem in an indoor communications scenario to increase the received signal strength (RSS) (see Fig. 14.2). This is particularly useful from the perspective of low hardware complexity because it eliminates the need to deploy multiple BSs to improve the RSS. The MLP architecture in  accepts a two-dimensional user position vector and yields the RIS beamformer phases at the output. Since channel data is not employed as input, the network does not have to deal with severe environmental fluctuations. However, the learning model is trained on specific room environments and may perform poorly for different room conditions or a different obstacle distribution in the same room. This is mitigated in RL-based solutions, which are highly adaptive to different environments [62, 15].
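The position-to-phase mapping can be sketched as a small MLP forward pass. The layer sizes, the ReLU hidden activation, the scaled-sigmoid output that keeps phases within one period, and the random (untrained) weights are all assumptions for illustration; the cited work's architecture may differ.

```python
import math
import random
from typing import List, Sequence, Tuple

Layer = Tuple[List[List[float]], List[float]]  # (weight rows, biases)


def init_layer(n_in: int, n_out: int, rng: random.Random) -> Layer:
    """Random small weights; a trained model would load learned values."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x: Sequence[float], layer: Layer) -> List[float]:
    weights, biases = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def position_to_phases(position: Sequence[float], layers: Sequence[Layer]) -> List[float]:
    """Map a 2-D user position to one phase per RIS element."""
    hidden = list(position)
    for layer in layers[:-1]:
        hidden = [max(0.0, v) for v in dense(hidden, layer)]  # ReLU
    outputs = dense(hidden, layers[-1])
    # Scaled sigmoid keeps each phase between 0 and 2*pi.
    return [2.0 * math.pi / (1.0 + math.exp(-v)) for v in outputs]


rng = random.Random(0)
num_elements = 8  # hypothetical RIS size
layers = [init_layer(2, 16, rng), init_layer(16, 16, rng), init_layer(16, num_elements, rng)]
phases = position_to_phases((1.5, 2.0), layers)
```

Because the input is only the user position, inference needs no channel estimate, which matches the motivation given above; the price is that the learned mapping is tied to the room in which the training data was collected.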
14.7 Challenges and Future Outlook
The techniques for RIS inverse design and processing are constantly evolving. Major challenges include reduction of training cost, gathering of labeled data, effective handling of system imperfections, and better data representations.
New approaches are needed to increase the computational efficiency and reduce the amount of training required for DL-based RIS design. As mentioned below, reducing the design time and achieving full EM-compliance remain major challenges.
Hybrid physics-based models
Hybrid models, where the training set is supplemented by physics-based analytical models, reduce the amount of required training data and increase learning efficiency. Analytical RF circuit-based models are available to predict the performance of several canonical meta-atom designs. To speed up training data generation, these analytical circuit-based models could be used to supplement the training data set and reduce the number of time-consuming full-wave EM simulations. It may also be feasible to create innovative DL design and optimization architectures that utilize physics-based analytical models within the ANN architecture. Another method to reduce the amount of required training data for multi-layer MTS designs is to use T-matrix data to analytically cascade MTS designs from single-layer training data.
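The T-matrix cascading idea can be sketched for two-port layers as follows. The sketch assumes that each layer is described by a 2x2 scattering matrix with nonzero transmission, that higher-order (evanescent) coupling between layers is negligible, and that any spacer is itself included as a layer. It uses one common S-to-T convention, in which the waves at port 1 are expressed through the waves at port 2; other conventions exist. The helper names are hypothetical.

```python
import cmath
from typing import Tuple

Matrix2 = Tuple[Tuple[complex, complex], Tuple[complex, complex]]


def s_to_t(s: Matrix2) -> Matrix2:
    """Scattering matrix to transfer matrix (requires S21 != 0)."""
    (s11, s12), (s21, s22) = s
    if s21 == 0:
        raise ValueError("S21 must be nonzero to form a T-matrix")
    det_s = s11 * s22 - s12 * s21
    return ((-det_s / s21, s11 / s21), (-s22 / s21, 1 / s21))


def t_to_s(t: Matrix2) -> Matrix2:
    """Inverse of s_to_t (requires T22 != 0)."""
    (t11, t12), (t21, t22) = t
    if t22 == 0:
        raise ValueError("T22 must be nonzero to recover S-parameters")
    det_t = t11 * t22 - t12 * t21
    return ((t12 / t22, det_t / t22), (1 / t22, -t21 / t22))


def matmul2(a: Matrix2, b: Matrix2) -> Matrix2:
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def cascade(*layers: Matrix2) -> Matrix2:
    """S-matrix of layers stacked in order, via a product of T-matrices."""
    total: Matrix2 = ((1 + 0j, 0j), (0j, 1 + 0j))
    for s in layers:
        total = matmul2(total, s_to_t(s))
    return t_to_s(total)


def delay_layer(theta: float) -> Matrix2:
    """Toy matched, lossless layer that only adds a transmission phase."""
    phase = cmath.exp(-1j * theta)
    return ((0j, phase), (phase, 0j))


stack = cascade(delay_layer(0.3), delay_layer(0.5))
```

For the two toy phase-delay layers, the cascaded transmission phase is the sum of the individual phases, which gives a quick sanity check before the same routine is applied to simulated single-layer data.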
Other learning techniques
TL may also be used to expedite and improve the learning of a new task by using the weights and biases of a previously trained neural network as the initialization for the new ANN. Since all ANNs for meta-atom performance prediction and inverse design implicitly learn Maxwell’s equations, it is sensible for a network trained for one meta-atom design or frequency band to be scaled and transferred to a related design. DQNs have also been studied to increase the efficiency of MTS holograms and automated multi-layer RIS design.
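The initialization step of TL can be sketched as below. The sketch copies all layers of a previously trained network and, as an assumption not stated in the text, re-initializes only the output layer so that the new task may have a different number of outputs. The layer container and function names are hypothetical, and the "pre-trained" weights are random stand-ins.

```python
import copy
import random
from typing import Dict, List

Layer = Dict[str, list]  # {"weights": list of rows, "biases": list of floats}


def init_layer(n_in: int, n_out: int, rng: random.Random) -> Layer:
    return {
        "weights": [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)],
        "biases": [0.0] * n_out,
    }


def transfer_init(pretrained: List[Layer], n_out_new: int, rng: random.Random) -> List[Layer]:
    """Start a new network from pre-trained weights and biases.

    Hidden layers are deep-copied so that fine-tuning the new network does
    not modify the source model; the output layer is freshly initialized.
    """
    model = copy.deepcopy(pretrained)
    n_in_last = len(model[-1]["weights"][0])
    model[-1] = init_layer(n_in_last, n_out_new, rng)
    return model


rng = random.Random(1)
# Stand-in for a network trained on one meta-atom design or frequency band.
source_model = [init_layer(4, 8, rng), init_layer(8, 8, rng), init_layer(8, 2, rng)]
# New network for a related design with four outputs.
target_model = transfer_init(source_model, n_out_new=4, rng=rng)
```

Training would then continue on the smaller data set of the new design, with the copied layers either fine-tuned or kept fixed.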
Improved data representation
More complex input data structures and representations are increasingly studied for DL-based RIS. While this chapter focused on discrete input parameters and image data structures as RIS design representations, graphical and sequential data structures have recently been proposed as alternatives. The graphical model has been used to represent EM systems with near-field coupling (as in coupled resonators). In this arrangement, graph nodes contain resonator attributes, such as material, geometry, and location, and graph edges represent the near-field coupling factors. These graphical data structures are processed using graph neural networks (GNNs). While yet to be extensively explored in the domain of MTS, GNNs have been applied to model a broad range of physical systems. GNNs have the potential to handle the additional complexities of jointly optimizing RIS design and operation in wireless communication networks. Sequential data structures are another representation that is yet to be extensively explored in the context of MTS. They are useful for representing time-sequence data in dynamic EM systems (as in RIS filters). In other domains, such as natural language processing (NLP), sequential data is often learned using recurrent neural networks (RNNs), which are ANNs that use forward or backward connections to enable a memory of internal states between successive passes through the network. As dynamic operation of RIS becomes increasingly important in the development of wireless networks and SRE, it is likely that RNNs will become increasingly useful to model dynamic RIS.
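The graph representation described above, with resonator attributes stored on the nodes and near-field coupling factors on the connections between them, can be sketched as a small data structure together with one neighborhood-aggregation step of the kind a GNN layer performs. The scalar `feature` field, the unweighted sum aggregation without learned parameters, and all names and values are simplifying assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Resonator:
    material: str
    geometry: str
    location: Tuple[float, float]
    feature: float  # hypothetical scalar embedding of the attributes


def message_pass(nodes: Dict[str, Resonator],
                 coupling: Dict[Tuple[str, str], float]) -> Dict[str, float]:
    """One aggregation step: each node adds its neighbors' features,
    scaled by the coupling factor on the connecting (undirected) edge."""
    updated = {name: node.feature for name, node in nodes.items()}
    for (u, v), factor in coupling.items():
        updated[u] += factor * nodes[v].feature
        updated[v] += factor * nodes[u].feature
    return updated


nodes = {
    "a": Resonator("copper", "split-ring", (0.0, 0.0), 1.0),
    "b": Resonator("copper", "split-ring", (1.0, 0.0), 2.0),
    "c": Resonator("copper", "patch", (2.0, 0.0), 4.0),
}
coupling = {("a", "b"): 0.5, ("b", "c"): 0.25}
updated = message_pass(nodes, coupling)
```

A trained GNN would replace the plain weighted sum with learned message and update functions and would stack several such steps, so that the influence of more distant resonators propagates through the graph.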
Deep Reinforcement Learning (DRL)
Similar to evolutionary optimization techniques, RL is an area of ML concerned with how software agents ought to take actions in an environment in order to maximize a notion of cumulative reward through trial and error. Without the use of labeled training data, RL algorithms learn system dynamics through exploration to maximize a reward function. DRL algorithms, such as DQNs, have produced ML advances in a broad range of applications including robotics, strategy games, NLP, and computer vision. To date, DQNs have been studied to increase the efficiency of MTS holograms and automated multi-layer MTS design. However, research using DRL for MTS design and optimization is very limited, and further research is needed to develop these techniques for MTS applications. DRL-based networks hold promise for automated self-learning RISs in SREs that are able to adapt and optimize themselves for dynamic RF environments and modulations.
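The trial-and-error principle can be sketched with tabular Q-learning, the update rule that a DQN approximates with a neural network instead of a table. The single-state toy problem, in which an agent picks one of four hypothetical RIS configurations with fixed made-up rewards, is an assumption chosen only to keep the example small.

```python
import random
from typing import Dict, List


def td_update(q: Dict[str, List[float]], state: str, action: int, reward: float,
              next_state: str, alpha: float = 0.5, gamma: float = 0.9) -> float:
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


def epsilon_greedy(q_row: List[float], epsilon: float, rng: random.Random) -> int:
    """Explore with probability epsilon, otherwise take the best-known action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)


# Hypothetical reward (e.g., normalized rate) of four RIS configurations.
TOY_REWARDS = [0.1, 0.5, 0.9, 0.3]

rng = random.Random(0)
q_table = {"s0": [0.0] * len(TOY_REWARDS)}
for _ in range(2000):
    action = epsilon_greedy(q_table["s0"], epsilon=0.1, rng=rng)
    td_update(q_table, "s0", action, TOY_REWARDS[action], "s0")

greedy_action = max(range(len(TOY_REWARDS)), key=q_table["s0"].__getitem__)
```

No labels are used anywhere: the agent only observes the reward of the actions it tries. In a DQN, the table `q_table` would be replaced by a network that maps a state, such as channel information, to the action values.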
Several challenges remain for DL architectures to reach their full potential in realizing significant performance gains and efficiency for RIS-assisted wireless systems. Given that RIS is an emerging technology, large sets of real data are not yet available. In addition, model training consumes considerable time and resources, including parallel processing and storage. Further, to achieve commercial viability of DL-based RIS-aided communications, dynamically adapting to changes in the environment is crucial. Finally, new RIS-specific implementation challenges have also been identified within emerging technologies such as terahertz communications, cell-free massive MIMO, drone operations, and open radio access network (RAN).
14.7.3 Channel Modeling
Channel modeling is a challenging task in communication systems, especially with a large number of antennas, because of the complexity of the system architecture. To provide reliable channel modeling performance, DL can help construct a data-driven model based on field measurements. In this case, SL schemes can be used to construct the channel model as a relationship between the input and output of a learning model . Thus, DL-based methods are expected to become more frequently used in RIS-assisted wireless networks for channel modeling.
Data Collection
The need for massive data collection hampers the successful performance of DL-based techniques for all wireless communications tasks: signal detection, channel estimation, and beamformer design. Signal detection requires the collection and storage of transmit and receive data symbols for different channel conditions. The prerequisites for channel estimation and beamforming are even more tedious because of the additional labeling process. This is difficult to overcome, especially in online scenarios. Unlike SL, the label-free structure of RL is particularly helpful, but at the cost of longer training times. It is possible to relax the data collection requirements by realizing the propagation environment in a numerical electromagnetic (EM) simulation tool  and then using more realistically simulated data. This is helpful in constructing the training dataset offline, but chances of failure remain in a real-world scenario. Very recently, public datasets for the channel estimation problem in RIS-aided communications were made available in the 2021 IEEE Signal Processing Cup competition.
Model Training
The models are usually trained offline, prior to their online deployment, at a PS connected to the BS. In addition, the model training complexity increases with the number of RIS elements and the number of RISs deployed between the users and the BS. This introduces a huge transmission overhead for model training. FL has the potential to reduce this cost and enable communication-efficient model training (see, e.g., Fig. 14.10). Here, combining the label-free structure of RL and the communication efficiency of FL, i.e., federated reinforcement learning, could be the next step.
Environment Adaptation and Robustness
The behavior of the channel affects all DL-based tasks including channel estimation, beamforming, user scheduling, power allocation, and antenna selection/switching. Addressing the trade-off between the bias and the variance of the model output is essential for robust performance. This is usually achieved using validation data so that the learning model neither over-fits nor under-fits the training data. Nonetheless, this does not generalize the learning performance to different environments. Moreover, the current DL architectures for wireless systems remain environment-specific because the input data space of their learning model is limited. As a result, the performance degrades significantly when the learning model is fed with input from an unlearned/uncovered data space. To cover larger data spaces and provide robust performance against changes in the environment, wider and deeper learning models are required. But the current DL architectures for wireless communications comprise fewer than a million neurons and are composed of only a few layers (Table 1). The giant learning models for image recognition or natural language processing consist of millions to billions of parameters, e.g., VGG (138 million), AlexNet (60 million), and GPT-3 (175 billion). Clearly, going wider and deeper in designing the learning models is of great interest for future DL-based RIS-aided systems.
We surveyed DL-based techniques for designing RIS hardware to be deployed for future wireless communications. When the design space and scale of the RIS arrays increase, learning-based architectures outperform evolutionary optimization techniques for both surrogate performance modeling and inverse design. The DL inverse design is flexible in admitting a variety of RIS unit structures. The DGMs are the most useful because of their ability to generate new designs not previously seen in the published literature. While active research and techniques in this area are still evolving, DL is a promising solution for the inverse design of RIS.
We also investigated DL architectures for RIS-assisted wireless systems for key applications of signal detection, channel estimation, and beamforming. We extensively discussed various learning schemes and model architectures, such as SL, UL, FL and RL for RIS applications. The SL exhibits better performance than UL and RL because of label usage. The UL and RL are label-free schemes that provide less complexity during training data generation. However, UL still involves an optimization stage for each data instance. Among all, the RL is the most promising technique because of its standalone operation and the consequent ability to adapt to environmental changes at the cost of longer training times.
The FL reduces the transmission overhead significantly and can be integrated with the other learning methods. The combination of FL- and RL-based learning policies not only exhibits a communication-efficient model training but also provides environmental adaptation. Major research challenges include data collection, model training, and environment adaptation. These should be addressed simultaneously to provide a reliable DL architecture for the next-generation RIS-assisted wireless systems. Specifically, the combination of FL and RL should be fed with the collection of huge datasets and massive neural networks so that a robust DL architecture is achieved.
The authors warmly acknowledge valuable contributions of Dr. John A. Hodge (Amazon) for the inverse design portion of this chapter, when he was a graduate student at Virginia Tech.
- Bengio et al.  Y. Bengio, A. Courville, and P. Vincent. Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8):1798–1828, 2013.
- Campbell et al.  Sawyer D Campbell, David Sell, Ronald P Jenkins, Eric B Whiting, Jonathan A Fan, and Douglas H Werner. Review of numerical optimization techniques for meta-device design. Optical Materials Express, 9(4):1842–1863, 2019.
- Chen  Hou-Tong Chen. Interference theory of metamaterial perfect absorbers. Optics Express, 20(7):7165–7172, 2012.
- Chen et al.  Hou-Tong Chen, Antoinette J Taylor, and Nanfang Yu. A review of metasurfaces: Physics and applications. Reports on Progress in Physics, 79(7):076401, 2016.
- Chen et al.  Michael Chen, Minseok Kim, Alex MH Wong, and George V Eleftheriades. Huygens’ metasurfaces from microwaves to optics: A review. Nanophotonics, 7(6):1207–1231, 2018.
- Dai et al.  Linglong Dai, Ruicheng Jiao, Fumiyuki Adachi, H Vincent Poor, and Lajos Hanzo. Deep learning for wireless communications: An emerging interdisciplinary paradigm. IEEE Wireless Communications, 27(4):133–139, 2020.
- Elbir and Coleri  Ahmet M. Elbir and Sinem Coleri. Federated learning for channel estimation in conventional and RIS-assisted massive MIMO. IEEE Transactions on Wireless Communications, 21(6):4255–4268, 2021.
- Elbir and Mishra  Ahmet M Elbir and Kumar Vijay Mishra. A survey of deep learning architectures for intelligent reflecting surfaces. arXiv preprint arXiv:2009.02540, 2020.
- Elbir et al.  Ahmet M Elbir, Anastasios Papazafeiropoulos, Pandelis Kourtessis, and Symeon Chatzinotas. Deep channel learning for large intelligent surfaces aided mm-Wave massive MIMO systems. IEEE Wireless Communications Letters, 9(9):1447–1451, 2020.
- Elbir et al.  Ahmet M. Elbir, Kumar Vijay Mishra, M. R. Bhavani Shankar, and Symeon Chatzinotas. The rise of intelligent reflecting surfaces in integrated sensing and communications paradigms. arXiv preprint arXiv:2204.07265, 2022.
- Epstein and Eleftheriades [2016a] Ariel Epstein and George V Eleftheriades. Arbitrary power-conserving field transformations with passive lossless omega-type bianisotropic metasurfaces. IEEE Transactions on Antennas and Propagation, 64(9):3880–3895, 2016a.
- Epstein and Eleftheriades [2016b] Ariel Epstein and George V Eleftheriades. Huygens' metasurfaces via the equivalence principle: Design and applications. Journal of the Optical Society of America B, 33(2):A31–A50, 2016b.
- Esmaeilbeig et al. [2022a] Zahra Esmaeilbeig, Kumar Vijay Mishra, Arian Eamaz, and Mojtaba Soltanalian. Cramér–Rao lower bound optimization for hidden moving target sensing via multi-IRS-aided radar. arXiv preprint arXiv:2210.05812, 2022a.
- Esmaeilbeig et al. [2022b] Zahra Esmaeilbeig, Kumar Vijay Mishra, and Mojtaba Soltanalian. IRS-aided radar: Enhanced target parameter estimation via intelligent reflecting surfaces. In IEEE Sensor Array and Multichannel Signal Processing Workshop, pages 286–290, 2022b.
- Feng et al.  Keming Feng, Qisheng Wang, Xiao Li, and Chao-Kai Wen. Deep reinforcement learning based intelligent reflecting surface optimization for MISO communication systems. IEEE Wireless Communications Letters, 9(5):745–749, 2020.
- Fong et al.  Bryan H Fong, Joseph S Colburn, John J Ottusch, John L Visher, and Daniel F Sievenpiper. Scalar and tensor holographic artificial impedance surfaces. IEEE Transactions on Antennas and Propagation, 58(10):3212–3221, 2010.
- Gao et al.  Jiabao Gao, Caijun Zhong, Xiaoming Chen, Hai Lin, and Zhaoyang Zhang. Unsupervised learning for passive beamforming. IEEE Communications Letters, 24(5):1052–1056, 2020.
- Glybovski et al.  Stanislav B Glybovski, Sergei A Tretyakov, Pavel A Belov, Yuri S Kivshar, and Constantin R Simovski. Metasurfaces: From microwaves to visible. Physics Reports, 634:1–72, 2016.
- Gong et al.  Shimin Gong, Xiao Lu, Dinh Thai Hoang, Dusit Niyato, Lei Shu, Dong In Kim, and Ying-Chang Liang. Towards smart wireless communications via intelligent reflecting surfaces: A contemporary survey. IEEE Communications Surveys & Tutorials, 22(4):2283–2314, 2020.
- Gong et al.  Shimin Gong, Jiaye Lin, Beichen Ding, Dusit Niyato, Dong In Kim, and Mohsen Guizani. When optimization meets machine learning: The case of IRS-assisted wireless networks. IEEE Network, 36(2):190–198, 2022.
- Hodge et al.  John A Hodge, Theodore Anthony, and Amir I Zaghloul. Enhancement of the dipole antenna using a capacitively loaded loop (CLL) structure. In IEEE International Symposium on Antennas and Propagation and USNC-URSI Radio Science Meeting, pages 1544–1545, 2014.
- Hodge et al. [2019a] John A Hodge, Kumar Vijay Mishra, and Amir I Zaghloul. Joint multi-layer GAN-based design of tensorial RF metasurfaces. In IEEE International Workshop on Machine Learning for Signal Processing, pages 1–6, 2019a.
- Hodge et al. [2019b] John A Hodge, Kumar Vijay Mishra, and Amir I Zaghloul. Multi-discriminator distributed generative model for multi-layer RF metasurface discovery. In IEEE Global Conference on Signal and Information Processing, pages 1–5, 2019b.
- Hodge et al. [2019c] John A Hodge, Kumar Vijay Mishra, and Amir I Zaghloul. Reconfigurable metasurfaces for index modulation in 5G wireless communications. In IEEE International Applied Computational Electromagnetics Society Symposium, pages 1–2, 2019c.
- Hodge et al. [2019d] John A Hodge, Kumar Vijay Mishra, and Amir I Zaghloul. RF metasurface array design using deep convolutional generative adversarial networks. In IEEE International Symposium on Phased Array Systems and Technology, pages 1–6, 2019d.
- Hodge et al.  John A Hodge, Kumar Vijay Mishra, and Amir I Zaghloul. Intelligent time-varying metasurface transceiver for index modulation in 6G wireless networks. IEEE Antennas and Wireless Propagation Letters, 19(11):1891–1895, 2020.
- Hodge et al.  John A Hodge, Kumar Vijay Mishra, and Amir I Zaghloul. Deep inverse design of reconfigurable metasurfaces for future communications. arXiv preprint arXiv:2101.09131, 2021.
- Holloway et al.  Christopher L Holloway, Edward F Kuester, Joshua A Gordon, John O’Hara, Jim Booth, and David R Smith. An overview of the theory and applications of metasurfaces: The two-dimensional equivalents of metamaterials. IEEE Antennas and Propagation Magazine, 54(2):10–35, 2012.
- Huang et al.  Chongwen Huang, George C Alexandropoulos, Chau Yuen, and Mérouane Debbah. Indoor signal focusing with deep learning designed reconfigurable intelligent surfaces. In IEEE International Workshop on Signal Processing Advances in Wireless Communications, pages 1–5, 2019.
- Inampudi and Mosallaei  Sandeep Inampudi and Hossein Mosallaei. Neural network based design of metagratings. Applied Physics Letters, 112(24):241102, 2018.
- Jiang and Fan  Jiaqi Jiang and Jonathan A Fan. Global optimization of dielectric metasurfaces using a physics-driven neural network. Nano Letters, 19(8):5366–5372, 2019.
- Jiang et al.  Jiaqi Jiang, David Sell, Stephan Hoyer, Jason Hickey, Jianji Yang, and Jonathan A Fan. Data-driven metasurface discovery. arXiv preprint arXiv:1811.12436, 2018.
- Jiang et al.  Jiaqi Jiang, David Sell, Stephan Hoyer, Jason Hickey, Jianji Yang, and Jonathan A Fan. Free-form diffractive metagrating design based on generative adversarial networks. ACS Nano, 13(8):8872–8878, 2019.
- Khan et al.  Saud Khan, Salman Durrani, and Xiangyun Zhou. Transfer learning based detection for intelligent reflecting surface aided communications. In IEEE Annual International Symposium on Personal, Indoor and Mobile Radio Communications, pages 13–16, 2021.
- Kumar et al.  Chandan Kumar, Salil Kashyap, Rimalapudi Sarvendranath, and Supreet Kumar Sharma. On the feasibility of wireless energy transfer based on low complexity antenna selection and passive IRS beamforming. IEEE Transactions on Communications, 70(8):5663–5678, 2022.
- Lecun et al.  Yann Lecun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436–444, 2015.
- LeCun et al.  Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436, 2015.
- Lee et al.  Gilsoo Lee, Minchae Jung, Ali Taleb Zadeh Kasgari, Walid Saad, and Mehdi Bennis. Deep reinforcement learning for energy-efficient networking with reconfigurable intelligent surfaces. In IEEE International Conference on Communications, pages 1–6, 2020.
- Li et al.  Haipeng Li, Guangming Wang, He-Xiu Xu, Tong Cai, and Jiangang Liang. X-band phase-gradient metasurface for high-gain lens antenna application. IEEE Transactions on Antennas and Propagation, 63(11):5144–5149, 2015.
- Liu et al.  Shicong Liu, Zhen Gao, Jun Zhang, Marco Di Renzo, and Mohamed-Slim Alouini. Deep denoising neural network assisted compressive channel estimation for mmWave intelligent reflecting surfaces. IEEE Transactions on Vehicular Technology, 69(8):9223–9228, 2020.
- Liu et al.  Zhaocheng Liu, Dayu Zhu, Sean P Rodrigues, Kyu-Tae Lee, and Wenshan Cai. Generative model for the inverse design of metasurfaces. Nano Letters, 18(10):6570–6576, 2018.
- Ma et al.  Donghui Ma, Lixin Li, Huan Ren, Dawei Wang, Xu Li, and Zhu Han. Distributed rate optimization for intelligent reflecting surface with federated learning. In IEEE International Conference on Communications Workshops, pages 1–6, 2020.
- Ma et al.  Wei Ma, Feng Cheng, and Yongmin Liu. Deep-learning-enabled on-demand design of chiral metamaterials. ACS Nano, 12(6):6326–6334, 2018.
- Ma et al.  Wei Ma, Feng Cheng, Yihao Xu, Qinlong Wen, and Yongmin Liu. Probabilistic representation and inverse design of metamaterials based on a deep generative model with semi-supervised learning strategy. Advanced Materials, 31(35):1901111, 2019.
- Maci et al.  Stefano Maci, Gabriele Minatti, Massimiliano Casaletti, and Marko Bosiljevac. Metasurfing: Addressing waves on impenetrable metasurfaces. IEEE Antennas and Wireless Propagation Letters, 10:1499–1502, 2011.
- Mencagli et al.  Mario Mencagli, Enrica Martini, and Stefano Maci. Surface wave dispersion for anisotropic metasurfaces constituted by elliptical patches. IEEE Transactions on Antennas and Propagation, 63(7):2992–3003, 2015.
- Minatti et al.  Gabriele Minatti, Marco Faenzi, Enrica Martini, Francesco Caminita, Paolo De Vita, David González-Ovejero, Marco Sabbadini, and Stefano Maci. Modulated metasurface antennas for space: Synthesis analysis and realizations. IEEE Transactions on Antennas and Propagation, 63(4):1288–1300, 2015.
- Mishra et al.  Kumar Vijay Mishra, John A Hodge, and Amir I Zaghloul. Reconfigurable metasurfaces for radar and communications systems. In URSI Asia-Pacific Radio Science Conference, pages 1–4, 2019.
- Mishra et al.  Kumar Vijay Mishra, Arpan Chattopadhyay, Siddharth Sankar Acharjee, and Athina P Petropulu. OptM3Sec: Optimizing multicast IRS-aided multiantenna DFRC secrecy channel with multiple eavesdroppers. In IEEE International Conference on Acoustics, Speech and Signal Processing, pages 9037–9041, 2022.
- Nguyen and Zaghloul  Quang Nguyen and Amir I Zaghloul. Impedance matching metamaterials composed of ELC and NB-SRR. In IEEE Antennas and Propagation Society International Symposium, pages 1–2, 2018.
- Nguyen et al.  Quang Nguyen, K V Mishra, and Amir I Zaghloul. Retrieval of polarizability matrix for metamaterials. In IEEE International Conference on Microwaves, Communications, Antennas and Electronic Systems, pages 1–5, 2019.
- Pereda et al.  Amagoia Tellechea Pereda, Francesco Caminita, Enrica Martini, Iñigo Ederra, Juan Carlos Iriarte, Ramón Gonzalo, and Stefano Maci. Dual circularly polarized broadside beam metasurface antenna. IEEE Transactions on Antennas and Propagation, 64(7):2944–2953, 2016.
- Peurifoy et al.  John Peurifoy, Yichen Shen, Li Jing, Yi Yang, Fidel Cano-Renteria, Brendan G DeLacy, John D Joannopoulos, Max Tegmark, and Marin Soljačić. Nanophotonic particle simulation and inverse design using artificial neural networks. Science Advances, 4(6):eaar4206, 2018.
- Qiu et al.  Tianshuo Qiu, Xin Shi, Jiafu Wang, Yongfeng Li, Shaobo Qu, Qiang Cheng, Tiejun Cui, and Sai Sui. Deep learning: A rapid and efficient route to automatic metasurface design. Advanced Science, 2019.
- Renzo et al.  Marco Di Renzo, Merouane Debbah, Dinh-Thuy Phan-Huy, Alessio Zappone, Mohamed-Slim Alouini, Chau Yuen, Vincenzo Sciancalepore, George C Alexandropoulos, Jakob Hoydis, Haris Gacanin, et al. Smart radio environments empowered by reconfigurable AI meta-surfaces: An idea whose time has come. EURASIP Journal on Wireless Communications and Networking, 2019(1):1–20, 2019.
- Schurig et al.  D Schurig, J J Mock, and D R Smith. Electric-field-coupled resonators for negative permittivity metamaterials. Applied Physics Letters, 88(4):041109, 2006.
- Shan et al.  Tao Shan, Xiaotian Pan, Maokun Li, Shenheng Xu, and Fan Yang. Coding programmable metasurfaces based on deep learning techniques. IEEE Journal on Emerging and Selected Topics in Circuits and Systems, 10(1):114–125, 2020.
- Sievenpiper et al.  Dan Sievenpiper, Lijun Zhang, Romulo FJ Broas, Nicholas G Alexopolous, Eli Yablonovitch, et al. High-impedance electromagnetic surfaces with a forbidden frequency band. IEEE Transactions on Microwave Theory and Techniques, 47(11):2059–2074, 1999.
- Su et al.  Jianxun Su, Yao Lu, Hui Zhang, Zengrui Li, Yaoqing Lamar Yang, Yongxing Che, and Kainan Qi. Ultra-wideband, wide angle and polarization-insensitive specular reflection reduction by metasurface based on parameter-adjustable meta-atoms. Scientific Reports, 7:42283, 2017.
- Su et al.  Pei Su, Yongjiu Zhao, Shengli Jia, Wenwen Shi, and Hongli Wang. An ultra-wideband and polarization-independent metasurface for RCS reduction. Scientific Reports, 6:20387, 2016.
- Sun et al.  Shulin Sun, Qiong He, Shiyi Xiao, Qin Xu, Xin Li, and Lei Zhou. Gradient-index meta-surfaces as a bridge linking propagating waves and surface waves. Nature Materials, 11(5):426–431, 2012.
- Taha et al.  Abdelrahman Taha, Yu Zhang, Faris B Mismar, and Ahmed Alkhateeb. Deep reinforcement learning for intelligent reflecting surfaces: Towards standalone operation. In IEEE International Workshop on Signal Processing Advances in Wireless Communications, pages 1–5, 2020.
- Taha et al.  Abdelrahman Taha, Muhammad Alrabeiah, and Ahmed Alkhateeb. Enabling large intelligent surfaces with compressive sensing and deep learning. IEEE Access, 9:44304–44321, 2021.
- Torkzaban and Khojastepour  Nariman Torkzaban and Mohammad A Amir Khojastepour. Shaping mmwave wireless channel via multi-beam design using reconfigurable intelligent surfaces. In IEEE Global Communications Conference Workshops, pages 1–6, 2021.
- Tretyakov  SA Tretyakov. Metasurfaces for general transformations of electromagnetic fields. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 373(2049):20140362, 2015.
- Wang et al.  Zhaolin Wang, Xidong Mu, and Yuanwei Liu. STARS enabled integrated sensing and communications. arXiv preprint arXiv:2207.10748, 2022.
- Wei et al.  Tong Wei, Linlong Wu, Kumar Vijay Mishra, and MR Bhavani Shankar. IRS-aided wideband dual-function radar-communications with quantized phase-shifts. In IEEE Sensor Array and Multichannel Signal Processing Workshop, pages 465–469, 2022.
- Wu and Zhang  Qingqing Wu and Rui Zhang. Towards smart and reconfigurable environment: Intelligent reflecting surface aided wireless network. IEEE Communications Magazine, 58(1):106–112, 2019.
- Yang et al.  Helin Yang, Zehui Xiong, Jun Zhao, Dusit Niyato, Liang Xiao, and Qingqing Wu. Deep reinforcement learning based intelligent reflecting surface for secure wireless communications. IEEE Transactions on Wireless Communications, 20(1):375–388, 2021.
- Yu and Deng  D. Yu and L. Deng. Deep learning and its applications to signal and information processing [exploratory dsp]. IEEE Signal Processing Magazine, 28(1):145–154, Jan 2011. ISSN 1053-5888. doi: 10.1109/MSP.2010.939038.
- Yu et al.  Nanfang Yu, Patrice Genevet, Mikhail A Kats, Francesco Aieta, Jean-Philippe Tetienne, Federico Capasso, and Zeno Gaburro. Light propagation with phase discontinuities: Generalized laws of reflection and refraction. Science, page 1210713, 2011.
- Zhang et al.  Qian Zhang, Che Liu, Xiang Wan, Lei Zhang, Shuo Liu, Yan Yang, and Tie Jun Cui. Machine-learning designs of anisotropic digital coding metasurfaces. Advanced Theory and Simulations, page 1800132, 2018.
- Zhang et al.  Qian Zhang, Che Liu, Xiang Wan, Lei Zhang, Shuo Liu, Yan Yang, and Tie Jun Cui. Machine-learning designs of anisotropic digital coding metasurfaces. Advanced Theory and Simulations, 2(2):1800132, 2019.
- Zhu et al.  HL Zhu, SW Cheung, Kwok Lun Chung, and Tong I Yuk. Linear-to-circular polarization conversion using metasurface. IEEE Transactions on Antennas and Propagation, 61(9):4615–4623, 2013.