I Introduction
I-A Motivation
With the proliferation of wireless devices and services, wireless communication networks are becoming increasingly complex. Beyond 5G (B5G) networks are expected to provide uninterrupted connectivity to devices ranging from sensors and cell phones to vehicles and robots, calling for the development of novel interference management strategies via radio resource management (RRM). However, most RRM problems are NP-hard, making it challenging to derive an optimal solution in all but the simplest scenarios [mollanoori2013uplink].
Solutions to this problem run the gamut from classical optimization techniques [lei2015joint] to information and game theory [yang2017mean, riaz2018power]. As emerging applications demand growth in scale and complexity, modern machine learning techniques have also been explored as alternatives for solving RRM problems in the presence of model and/or algorithmic deficits [simeone2018very]. The performance of trained models generally depends on how representative the training data are of the channel conditions encountered at deployment time. As a result, when conditions in the network change, these rigid models are often no longer useful [nair2019covariate], [quinonero2009dataset].

A fundamental RRM problem is the optimization of transmission power levels at distributed links that share the same spectral resources in the presence of time-varying channel conditions [chiang2008power]. This problem was addressed by the data-driven methodology introduced in [eisen2020optimal], and later studied in [eisen2020transferable, naderializadeh2020wireless, chowdhury2020unfolding]. In it, the power control policy mapping channel state information (CSI) to a power vector is parametrized by a graph neural network (GNN). The GNN encodes information about the network topology through its underlying graph, whose edge weights are tied to the channel realizations. The design problem consists of training the weights of the graph filters, while tying the spatial weights applied by the GNN to the CSI. As a result, the solution, referred to as a random edge GNN (REGNN), automatically adapts to time-varying CSI conditions.
In this paper, we focus on the higher-level problem of facilitating adaptation to time-varying topologies. To this end, as illustrated in Fig. 1, we assume that the topology of the network varies across periods of operation of the system, with each period being characterized by time-varying channel conditions as in [eisen2020optimal]. As such, the operation within each period is well reflected by the model studied in [eisen2020optimal, naderializadeh2020wireless], and we adopt an REGNN architecture for within-period adaptation. At the beginning of each period, the network designer is given limited CSI data that can be used to adapt the REGNN-based power control policy to the changed topology. In order to facilitate fast adaptation, in terms of data and iteration requirements, we integrate meta-learning with REGNN training.
I-B Meta-learning
The goal of meta-learning is to extract shared knowledge, in the form of an inductive bias, from data sets corresponding to distinct learning tasks in order to solve held-out tasks more efficiently [schmidhuber1987evolutionary, thrun1998lifelong]. The inductive bias may refer to parameters of a general-purpose learning procedure, such as the learning rate [maclaurin2015gradient] or the initialization [finn2017model], [nichol2018firstorder], [grant2018recasting] of (stochastic) gradient descent ((S)GD). These schemes can be credited for much of the reinvigorated interest in meta-learning over the past decade. We will refer to them as black-box meta-learning methods, given their model-agnostic applicability via fast parametric generalization.
In contrast, modular meta-learning aims at fast combinatorial generalization [chomsky2014aspects], making, in a sense, "infinite use of finite means" [von1999humboldt]. Modular meta-learning generalizes to new tasks by optimizing a set of neural network modules that can be composed in different ways to solve a new task, without changing their internal parameters [alet2018modular], [alet2019neural]. Modularity is a key property of engineered systems, due to its fault tolerance, interpretability, and flexibility [baldwin2006modularity], but it is generally lacking in data-driven solutions, which often amount to large black-box input-output mappings. The few existing modular meta-learning approaches rely on simulated annealing to find a suitable module composition for each task given the current neural network modules [alet2018modular]. Simulated annealing is, however, a notoriously inefficient optimization method in terms of computation time, and more recent techniques integrate learnt proposal functions in order to speed up training [alet2019neural].
I-C Contributions
As illustrated in Fig. 1, the main goal of this paper is to optimize fast adaptation procedures for the power control policy under time-varying network configurations. To do so, we consider both black-box and modular meta-learning methods. In particular, the contributions of this paper can be summarized as follows:

We integrate first-order model-agnostic meta-learning (FOMAML) [finn2017model], as a state-of-the-art representative of black-box meta-learning methods, with REGNN training;

We introduce a novel modular meta-learning method that constructs a repository of fixed graph filters that can be combined to define REGNN-based power control models for new network configurations. In contrast to existing modular meta-learning schemes that rely on variants of simulated annealing, the proposed method adopts a stochastic module assignment based on the Gumbel-softmax reparametrization trick [maddison2016concrete], which enables optimization via standard SGD;

We validate the performance of all meta-learning methods with extensive experiments that provide comparisons with joint training schemes [eisen2020optimal]. The use of meta-learning for power control problems in wireless networks is validated, and a comparative study of the performance of the considered meta-learning solutions is presented.
I-D Prior Work
GNNs are enjoying increasing popularity in the wireless communication community. In addition to power allocation [eisen2020optimal, eisen2020transferable, naderializadeh2020wireless, chowdhury2020unfolding], GNNs have been used to address cellular [zhao2020cellular] and satellite [yang2020noval] traffic prediction, link scheduling [lee2020graph], channel control [tekbiyik2020channel], and localization [yan2021graph]. Due to their localized nature, GNNs have also been applied to cooperative [dong2020drl] and decentralized [lee2021decentralized] control problems in networked systems. A review of the use of GNNs in wireless communication can be found in [he2021overview].
Meta-learning has been shown to improve the training and adaptation efficiency in various problems in wireless communications, ranging from demodulation [park2020learning] and decoding [jiang2019mind] to channel estimation [mao2019roemnet] and beamforming [yuan2020transfer]. In particular, in [park2020learning] the authors use pilots from previous transmissions of Internet of Things (IoT) devices in order to adapt a demodulator to new channel conditions using few pilot symbols. The authors of [mao2019roemnet] train a neural network-based channel estimator for orthogonal frequency-division multiplexing (OFDM) systems with FOMAML in order to obtain an effective solution given a small number of samples. Reference [yuan2020transfer] studies fast beamforming in multi-user multiple-input single-output (MISO) downlink systems. An overview of meta-learning methods, with applications to wireless communication networks, is available in [simeone2020learning].

The application of meta-learning to GNN-based power control was presented for the first time in the conference version of this paper [nikoloska2021fast]. In particular, [nikoloska2021fast] considers black-box methods and offers preliminary experimental results. In contrast to the preliminary conference version, in this paper we consider both black-box and modular meta-learning solutions, and we provide a more comprehensive numerical evaluation of all considered meta-learning schemes. To the best of the authors' knowledge, this is the first work investigating the use of modular meta-learning in communication engineering problems.
The rest of the paper is organized as follows. The considered model and problem are presented in Section II, and REGNNs are reviewed in Section III. Meta-learning is introduced in Section IV, and black-box methods and the proposed modular solution are given in Section V and Section VI, respectively. All meta-learning schemes are evaluated in Section VII. Section VIII concludes the paper.
II Model and Problem
As illustrated in Fig. 1, we consider a wireless network operating over a sequence of periods, with the topology possibly changing at each period. During a period, the network is comprised of a number of communication links. Transmissions on the links are assumed to occur at the same time using the same frequency band. The resulting interference graph includes an edge for any pair of links whose transmissions interfere with one another, and each link is associated with the subset of links that interfere with it in the given period. Both the number of links and the topology defined by the edge set generally vary across periods.
Each period contains a number of time slots. In each time slot of a period, the channel between the transmitter of a link and its intended receiver is distinguished from the cross-channels between that transmitter and the receivers of other links. Channels account for both slow and fast fading effects, and, by definition of the interference graph, the cross-channel between a pair of non-interfering links is zero. The channels for a given slot are arranged in a channel matrix. Channel states vary across time slots, with a constant marginal distribution within each period. This distribution generally changes across periods and is a priori unknown to the network.
To manage inter-link interference, it is useful to adjust the transmit powers such that a global network-wide objective function is optimized (see, e.g., [douros2011review]). For each channel realization, we collect the power allocation variables in a vector whose entries represent the per-symbol transmit powers of the transmitters at the given time slot of the period. The resulting achievable rate in bits per channel use for a link is given by
(1) \( r_i(\mathbf{H}, \mathbf{p}) = \log_2\left(1 + \dfrac{|h_{ii}|^2 p_i}{\sigma^2 + \sum_{j \in \mathcal{N}_i} |h_{ji}|^2 p_j}\right), \)
where \( \sigma^2 \) denotes the per-symbol noise power, \( h_{ii} \) is the direct channel of link \( i \), \( h_{ji} \) is the interference channel from the transmitter of link \( j \), \( \mathcal{N}_i \) is the set of links interfering with link \( i \), and \( p_i \) is the transmit power of link \( i \). By (1), interference is treated as worst-case additive Gaussian noise.
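As an illustration, the sum-rate objective can be evaluated in a few lines of NumPy. The function name and the gain-matrix convention, with entry (i, j) holding the squared channel gain from transmitter i to receiver j, are assumptions of this sketch, not part of the paper's notation:

```python
import numpy as np

def sum_rate(H, p, noise_power=1.0):
    """Achievable sum-rate (bits/channel use) when interference is
    treated as worst-case additive Gaussian noise.

    H : (K, K) gain matrix, H[i, j] = squared channel gain from
        transmitter i to receiver j (illustrative convention).
    p : (K,) per-symbol transmit powers.
    """
    signal = np.diag(H) * p                 # desired-link received power
    interference = H.T @ p - signal         # sum over j != i of H[j, i] p_j
    sinr = signal / (noise_power + interference)
    return np.sum(np.log2(1.0 + sinr))
```

For two non-interfering unit-gain links with unit power and unit noise, each link sees an SINR of 1, so the sum-rate is 2 bits per channel use.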
The goal of the system is to determine a power allocation policy in each period that maps the channel matrix to a power allocation vector as
(2) 
by maximizing the average achievable sumrate. This yields the stochastic optimization problem
(3) 
where each link is subject to its own power constraint. Note that problem (3) is defined separately for each period. Since the channel distribution is unknown, problem (3) cannot be addressed directly.
III Power Allocation by Training a REGNN
In this section, we review the solution proposed in [eisen2020optimal], which tackles problem (3) separately for each period. The approach in [eisen2020optimal] parametrizes the power allocation function in (2) by a REGNN as
(5) 
where the trainable parameters collect the filter taps of the REGNN. In the rest of this section, we first describe the mapping implemented by a REGNN, and then we review the problem of optimizing the parameter vector. Unless stated otherwise, in this section we drop the period index, which is fixed, in order to simplify notation.
III-A REGNN Model
To introduce the REGNN model, let us first describe the key operation of graph filtering. Consider a graph with a set of nodes and a set of edges. We associate to the graph a matrix known as the graph shift operator (GSO), with the property that its (i, j)-th entry can be nonzero only if the graph contains an edge between nodes i and j. Note that the channel matrix satisfies this condition for the interference graph. A graph signal is a vector with each entry assigned to one of the nodes in the graph. Given a vector of filter taps, a graph filter applies the graph convolution [sandryhaila2013discrete]
(6) \( \mathbf{y} = \sum_{k=0}^{K-1} \alpha_k \mathbf{S}^k \mathbf{x} \)
to an input graph signal \( \mathbf{x} \), where \( \mathbf{S} \) is the GSO and \( \alpha_0, \ldots, \alpha_{K-1} \) are the filter taps. The filter is a polynomial of the matrix \( \mathbf{S} \).
As illustrated in Fig. 2, each k-th power of the GSO in (6) performs a k-hop shift of the elements of the input vector on the graph. Specifically, the one-hop term is a vector whose entries aggregate, for each node, the input entries corresponding to single-hop neighbouring nodes, each weighted by the corresponding channel element of the GSO; the two-hop term aggregates, for each node, the input contributions associated with two-hop neighbouring nodes; and so on. As the order increases, node inputs from larger neighborhoods are incorporated. Thus, the graph convolution implements a local message-passing procedure, with information from larger neighbourhoods being aggregated as the filter size in (6) increases.
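The graph convolution in (6) amounts to repeated applications of the GSO to the signal, one application per hop; a minimal sketch (function name assumed) is:

```python
import numpy as np

def graph_filter(S, x, alpha):
    """Polynomial graph filter: sum_k alpha[k] * S^k @ x, as in Eq. (6).

    S     : (n, n) graph shift operator (e.g. the channel matrix).
    x     : (n,) input graph signal.
    alpha : (K,) filter taps; tap k weights the k-hop shifted signal.
    """
    y = np.zeros_like(x, dtype=float)
    shifted = x.astype(float)          # S^0 x
    for a in alpha:
        y += a * shifted
        shifted = S @ shifted          # next hop: S^{k+1} x
    return y
```

Computing the powers incrementally, rather than forming each matrix power explicitly, keeps the cost at one matrix-vector product per tap.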
An REGNN consists of a layered architecture in which each layer is a composition of a graph convolution and a per-node nonlinearity. The graph convolution in each layer uses the current channel matrix as the GSO in (6). Due to its dependence on the random fading channels, the graph convolution is characterized by "random edges" according to the terminology used in [eisen2020optimal]. Given the current channel matrix, the output of each intermediate layer is given as
(7) 
where the nonlinear function, such as a rectified linear unit (ReLU) or a sigmoid, is applied separately to each entry of its input. The REGNN is defined by the recursive application of (7) across its layers, with the input to the first layer given by the input graph signal. In this paper, the input signal is set to an all-one vector [naderializadeh2020wireless], but it may more generally include a variable describing the state of each link [eisen2020optimal]. The transmit power in (5) is found as the output of the final layer of the REGNN as
(8) 
where the output nonlinearity is applied element-wise to produce valid power levels, and the model parameters collect the convolution taps of all layers. By (8), specifying the REGNN architecture requires defining the number of layers and the number of filter taps per layer. Assuming all layers have an equal number of taps, the total number of trainable parameters is the product of these two quantities, a number considerably smaller than what would be required to train a fully-connected neural network.
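Putting the pieces together, a forward pass of the REGNN in (7)-(8) can be sketched as follows. The function name, the inline filter loop, and the unit maximum power implied by the bare sigmoid are assumptions of this illustration:

```python
import numpy as np

def regnn_forward(H, taps):
    """Sketch of an L-layer REGNN forward pass (Eqs. (7)-(8)).

    H    : (n, n) current channel matrix, used as the GSO.
    taps : list of per-layer tap vectors; layer l applies the graph
           filter with taps[l] followed by a ReLU, except the last
           layer, which uses a sigmoid to map outputs to valid powers
           (max power normalized to 1 here, an assumption).
    """
    z = np.ones(H.shape[0])                  # all-one input graph signal
    for layer, alpha in enumerate(taps):
        y = np.zeros_like(z)
        shifted = z.copy()                    # H^0 z
        for a in alpha:                       # graph convolution, Eq. (6)
            y += a * shifted
            shifted = H @ shifted
        if layer < len(taps) - 1:
            z = np.maximum(y, 0.0)            # ReLU on hidden layers
        else:
            z = 1.0 / (1.0 + np.exp(-y))      # sigmoid on output layer
    return z
```

The same channel matrix enters every layer as the GSO, which is precisely what makes the architecture adapt to the per-slot channel realization without retraining.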
III-B Training a REGNN
Given a set of channel realizations for a given period, the REGNN parameters are trained by tackling the unsupervised learning problem [eisen2020optimal]
(9)
via (S)GD. Note that problem (9) restricts the optimization in (3) to the class of REGNNs in (8). By incorporating the channel matrices in the structure of the REGNN-based power control policy, the method proposed in [eisen2020optimal] automatically adapts to the different per-slot channel realizations.
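The training loop for problem (9) can be sketched generically as gradient ascent on the empirical sum-rate over the filter taps. The `rate_fn` callback, mapping a channel matrix and a tap vector to the sum-rate of the induced policy, is an assumed interface, and a finite-difference gradient stands in for backpropagation to keep the sketch dependency-free:

```python
import numpy as np

def train_taps(channels, taps0, rate_fn, lr=0.01, iters=200, eps=1e-4):
    """Gradient-ascent sketch of the unsupervised objective (9):
    maximize the empirical sum-rate over the filter taps.

    channels : iterable of channel matrices (one per slot).
    rate_fn  : rate_fn(H, taps) -> sum-rate of the REGNN policy with
               the given taps on channel H (assumed interface).
    """
    taps = np.array(taps0, dtype=float)
    for _ in range(iters):
        grad = np.zeros_like(taps)
        for k in range(taps.size):
            up, dn = taps.copy(), taps.copy()
            up[k] += eps
            dn[k] -= eps
            # central finite difference of the empirical objective
            diffs = [rate_fn(H, up) - rate_fn(H, dn) for H in channels]
            grad[k] = np.mean(diffs) / (2 * eps)
        taps += lr * grad                     # ascent step
    return taps
```

In practice the gradient would be obtained by automatic differentiation through the REGNN; the structure of the loop, a per-slot objective averaged over the period's channel realizations, is the point of the sketch.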
IV Meta-learning Power Control
Our main goal in this paper is to improve the data efficiency of the REGNN solution reviewed in the previous section by enabling the explicit adaptation of the power control policy to the interference graph of each period, and hence across the changing topologies (see Fig. 1). To this end, we propose to transfer knowledge across a number of previously observed topologies in the form of an adaptation procedure for the power control policy. This is done by meta-learning.
In order to enable meta-learning, we assume the availability of channel information from previous periods. The meta-training data set collects the channel matrices available for each such period. Following standard practice in meta-learning, each per-period data set is split into training data and testing data [finn2017model], [simeone2020learning]. At test time, during deployment, the network observes a new topology for which it has access to a generally small data set to optimize the power allocation strategy.
The idea underlying meta-learning is to leverage the historical data in order to optimize a learning algorithm that uses the training data of any new period to obtain a well-performing REGNN parameter vector, even when the training data set is of limited size. In practice, the training algorithm is either explicitly or implicitly defined by the solution of the learning problem (9) using the per-period training data. The meta-training objective is represented as the optimization problem
(10) 
where the testing part of the per-period data set is used to obtain an unbiased estimate of the sum-rate in (3). In the next two sections, we propose two approaches to formulate and solve the meta-learning problem (10). First, we adopt black-box meta-learning strategies based on a model-agnostic optimization approach [finn2017model], [nichol2018firstorder]. Then, we introduce a novel modular meta-learning method, which aims at discovering common structural elements of the power allocation strategies across different interference graphs.
V Black-box Meta-learning
Black-box meta-learning addresses the meta-learning problem (10) by adopting a general-purpose optimizer for the per-period learning problem (9) as the adaptation procedure. Specifically, we adopt model-agnostic meta-learning (MAML), a state-of-the-art meta-learning technique whose key idea is parametrizing the algorithm with an initialization vector used to tackle the inner problem (9) via SGD. In this section, we first develop MAML, as well as its simplified version, FOMAML, for power allocation via REGNNs. Then, we observe that black-box meta-learning does not affect the permutation equivariance of REGNNs highlighted in [eisen2020optimal].
V-A MAML and FOMAML
MAML and FOMAML parametrize the adaptation algorithm with the initialization vector. Accordingly, assuming for simplicity a single step of gradient descent for problem (9), we have the training algorithm
(11) 
where the step size is the inner learning rate, and the dependence on the initialization is made explicit in the notation. The update (11) can be directly generalized to include multiple GD steps, as well as a reduced mini-batch size to implement SGD. Furthermore, the same update, and generalizations thereof, apply also to the meta-test period, yielding the adapted model parameters.
With definition (11) of the training algorithm, MAML addresses the optimization problem (10), which is restated as the maximization
(12) 
over the initialization .
For the single GD update in (11), the meta-training problem in (12) is addressed by MAML using GD, which updates the initialization in the outer loop as
(13) 
where the update involves the identity matrix and an outer-loop learning rate. Extensions to SGD are straightforward. The MAML update in (13) requires computing the Hessian of the REGNN mapping (8) with respect to the model parameters, which can be expensive. First-order methods, such as FOMAML [finn2017model], circumvent the computation of higher-order derivatives. In particular, FOMAML ignores the Hessian terms in the update (13) of the shared parameters, obtaining the update
(14) 
Algorithm 1 provides a summary of FOMAML for power allocation. The algorithm has a nested loop structure, with the outer loop updating the shared initialization parameters and the inner loop carrying out the local model updates in (11).
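The nested loop of Algorithm 1 can be sketched with a generic per-task gradient oracle. The task interface (`grad_fn` and per-task train/test data) and the hyperparameter defaults are assumptions of this illustration, and the gradient-descent sign convention corresponds to minimizing the negated sum-rate:

```python
import numpy as np

def fomaml(theta0, tasks, grad_fn, lr_in=0.1, lr_out=0.05, meta_iters=100):
    """First-order MAML sketch following the structure of Algorithm 1.

    tasks   : list of (train_data, test_data) pairs, standing in for
              the per-period CSI splits (illustrative).
    grad_fn : grad_fn(theta, data) -> gradient of the negated per-period
              objective, so descent maximizes the sum-rate.
    """
    theta = np.array(theta0, dtype=float)
    for _ in range(meta_iters):
        meta_grad = np.zeros_like(theta)
        for train, test in tasks:
            # inner loop: one GD step from the shared init, Eq. (11)
            adapted = theta - lr_in * grad_fn(theta, train)
            # FOMAML outer gradient: evaluated at the adapted parameters,
            # with the Hessian terms of (13) dropped, Eq. (14)
            meta_grad += grad_fn(adapted, test)
        theta -= lr_out * meta_grad / len(tasks)   # outer-loop update
    return theta
```

On toy one-dimensional quadratic tasks with optima at 1 and 3, the learned initialization settles near 2, the point from which a single inner step best serves both tasks.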
V-B Permutation Equivariance and Invariance
An important property of REGNNs is their equivariance to permutations [sandryhaila2013discrete]. In the context of wireless networks, this implies that a relabelling or reordering of the transmitters in the network produces the same permutation of the power allocation obtained via the REGNN (8). In this subsection, we briefly review this important property, and observe that the solution provided by black-box meta-learning is also permutation invariant.
Formally, consider a permutation matrix whose product with a vector reorders the entries of that vector, and whose conjugation of a matrix reorders the rows and columns of that matrix. The output of the REGNN is permutation equivariant in the sense that, for any permutation matrix and channel matrix, we have
(15) 
This structural property is not satisfied by general fully-connected models, in which a restructuring of the network would require an equivalent permutation of the weights. The equivariance of the optimal power control policy encodes the structure of the problem, as the labeling of the nodes is generally arbitrary.
By (15), the meta-learning objective in (12) is permutation invariant in the sense that, for any permutation matrix and any realizations of the channel matrices, we have
(16) 
where the channel matrices are permuted accordingly. As a consequence of the invariance of the objective in (16), the initialization produced by MAML in (13) is also invariant to permutations.
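The equivariance property (15) is easy to verify numerically for the polynomial graph filter at the core of the REGNN; the following self-contained check (helper name assumed) permutes a random network and confirms that the filter output is permuted identically:

```python
import numpy as np

def graph_filter(S, x, alpha):
    # polynomial graph filter: sum_k alpha[k] * S^k @ x, as in Eq. (6)
    y, shifted = np.zeros_like(x, dtype=float), x.astype(float)
    for a in alpha:
        y += a * shifted
        shifted = S @ shifted
    return y

rng = np.random.default_rng(0)
n = 5
H = rng.random((n, n))                 # random "channel" GSO
x = rng.random(n)                      # arbitrary graph signal
alpha = rng.random(3)                  # arbitrary filter taps

perm = rng.permutation(n)              # random relabelling of the nodes
Pi = np.eye(n)[perm]                   # corresponding permutation matrix

lhs = graph_filter(Pi @ H @ Pi.T, Pi @ x, alpha)   # relabelled network
rhs = Pi @ graph_filter(H, x, alpha)               # relabelled output
assert np.allclose(lhs, rhs)           # equivariance, Eq. (15)
```

The identity holds because conjugating the GSO by a permutation matrix conjugates each of its powers, so the permutation factors out of the polynomial.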
VI Modular Meta-learning
The black-box meta-learning method described in the previous section aims at fast parametric generalization, sharing an initialization of the model parameters across periods. In this section, we propose a modular approach that aims at combinatorial generalization, finding a set of reusable modules that can form components of a solution for a new period. The distinction between the two approaches is illustrated in Fig. 3. As seen in the figure, in modular meta-learning, the adaptation algorithm selects the filters to be applied at each layer of the REGNN (8) from a shared module set, representing a repository of filter taps. The key idea is that the module set is optimized during meta-training, while it is fixed at runtime, enabling efficient adaptation based on limited data via the selection of modules from the set.
VI-A Modular Meta-learning
A module assignment is a mapping between the layers of the REGNN and the modules in the module set. Mathematically, the assignment is a vector with one entry per layer, each entry indicating the module assigned to that layer in the given period. The assignment vector can thus take as many values as the number of modules raised to the number of layers. Representing each categorical entry using a one-hot vector, whose entries equal one for the selected module and zero otherwise, we can write the output (7) of each layer of the modular REGNN as
(17)
Using a recursive application of (17), for a given module set and module assignment vector , the transmit power can be found as the output of the modular REGNN as
(18) 
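A single layer of the modular REGNN in (17) differs from (7) only in that the filter taps are selected from the shared repository by the one-hot assignment; a sketch (names assumed) is:

```python
import numpy as np

def modular_layer(H, z, modules, onehot):
    """One layer of the modular REGNN (Eq. (17) sketch): the one-hot
    assignment selects which filter taps from the module set apply.

    H       : (n, n) channel matrix used as the GSO.
    z       : (n,) input signal to the layer.
    modules : list of tap vectors of equal length (shared repository).
    onehot  : one-hot assignment vector over the modules.
    """
    # select the taps: a weighted sum that reduces to one module
    alpha = sum(b * np.asarray(m, dtype=float)
                for b, m in zip(onehot, modules))
    y, shifted = np.zeros_like(z, dtype=float), z.astype(float)
    for a in alpha:                    # graph convolution with the
        y += a * shifted               # selected taps, Eq. (6)
        shifted = H @ shifted
    return np.maximum(y, 0.0)          # per-node ReLU
```

Writing the selection as a weighted sum over the repository is what later allows the hard assignment to be relaxed into a differentiable soft assignment (Section VI-B).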
The objective during meta-training is to optimize a module set that allows the system to find a combination of effective modules for any new topology during deployment. This is done by formulating problem (10) as the maximization
(19) 
over the module set, where the learning algorithm selects the best possible assignment from the set given the CSI data. Accordingly, the training algorithm is given as a function of the module set as
(20) 
where the optimized assignment vector is
(21) 
VI-B Determining the Module Assignment
The optimization in (19) is a mixed continuous-discrete problem over the module set and the assignment variables. To address this challenging problem, we define a stochastic module assignment given by a conditional distribution that assigns probabilities to each of the possible assignment vectors, given the module set and the training data for the current period. We can now redefine the bilevel optimization problem in (19) as
(22)
where the inner optimization is over the assignment distributions. Problems (22) and (19) are equivalent in the sense that they have the same solution. This is because the optimal distributions concentrate at the optimal module assignment vector (21). As detailed next, we propose to leverage the reparametrization trick to tackle the stochastic optimization in (22) via SGD.
To start, we model the module assignment distribution by using a mean-field factorization across the layers of the REGNN, i.e.,
(23) 
where the factors give the per-layer assignment probabilities. This does not affect the equivalence of problems (19) and (22), since the deterministic solution given by (21) can be realized by (23). Then, we associate with each layer a vector of logits that parametrizes the assignment probabilities through the softmax function as
(24) 
The Gumbel-Max trick [gumbel1954statistical], [hazan2012partition], [maddison2014sampling] provides a simple and efficient way to draw a sample from a categorical distribution with given logits as
(25) 
where the indicator function equals one if its argument is true and zero otherwise, and the perturbations are independent Gumbel variables obtained as
(26) 
with the underlying samples being independent uniform random variables on the unit interval. Thereby, using the Gumbel-Max trick (25), the sampling of a discrete random variable is reduced to applying a deterministic function of the logits to noise variables drawn from a fixed distribution. The argmax operation in (25) is not differentiable, making the optimization of the logit vectors via SGD infeasible. To address this issue, references [maddison2016concrete], [jang2016categorical] adopt the softmax function as a continuous, differentiable approximation. Samples from the resulting concrete distribution can be drawn according to
(27) 
where the noise variables are drawn according to (26). The temperature parameter controls the extent to which the relaxed sample resembles the one-hot representation in (25): as the temperature tends to zero, the relaxed sample becomes identical to the exact one-hot sample.
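The Gumbel-softmax sampling in (26)-(27) takes only a few lines; the clipping of the uniform draws and the max-subtraction are numerical-stability assumptions of this sketch, not part of the formulation:

```python
import numpy as np

def gumbel_softmax_sample(logits, tau, rng):
    """Relaxed one-hot sample from the concrete distribution,
    Eq. (27): softmax((logits + Gumbel noise) / tau).

    As tau -> 0 the sample approaches the exact Gumbel-Max draw (25).
    """
    # uniform draws, clipped away from 0 and 1 for numerical safety
    u = rng.uniform(size=np.shape(logits)).clip(1e-12, 1 - 1e-12)
    g = -np.log(-np.log(u))              # standard Gumbel noise, Eq. (26)
    z = (np.asarray(logits, dtype=float) + g) / tau
    z -= z.max()                         # stabilize the exponentials
    e = np.exp(z)
    return e / e.sum()                   # lies on the simplex
```

Because the sample is a deterministic, differentiable function of the logits given the noise, gradients of any downstream objective flow back to the logits, which is exactly what enables the SGD updates of Section VI-B.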
Regardless of the value of the temperature, substituting the concrete distribution for the categorical distribution in (22) allows us to address the inner optimization in (22) over the assignment probabilities. To this end, the objective in (22) is estimated by drawing noise samples via (26) and plugging (27) into the objective function. As a result, we obtain a function that is differentiable with respect to the logits, which can now be optimized via SGD.
To elaborate, consider for simplicity a single sample of the Gumbel random variables in (26). For a fixed module set, the inner optimization problem in (22) can be written as
(28) 
where we have defined
(29) 
The gradient of (28) with respect to the logits can be easily calculated to carry out the updates of the inner problem in (22). For later reference, a single step of gradient descent given the current module set yields the update
(30) 
for a given learning rate.
Tackling the outer optimization problem in (22) is more challenging. Specifically, the optimal parameters of the assignment distribution, e.g., via (30), are a function of the module set, and hence updating the module set would also require differentiating through the logits optimized in the inner maximization in (22). However, in a manner similar to FOMAML (and other first-order black-box methods such as [nichol2018firstorder]), we ignore the higher-order derivatives and update the parameters in the module set as
(31) 
where a separate learning rate is used and the gradient with respect to the module parameters is computed at the previous iterate. Using (30) and (31), we can address (22) by alternating between optimizing the assignment probability given the current module set, and optimizing the module parameters given the optimized assignment probability.
VI-C Optimization During Runtime
During meta-testing, we consider the obtained module set as fixed. Using the training portion of the meta-test data set, we only optimize the parameters of the assignment distribution using (30), or, more practically, multiple gradient descent steps. The final REGNN is constructed by using the mode of the assignment distribution as
(32) 
yielding the REGNN
(33) 
Modular meta-learning is summarized in Algorithm 2.
VI-D Permutation Equivariance and Invariance
The modular nature of the REGNN in (18) does not violate the invariance properties of the individual filters, and of the module set by extension. To elaborate, observe that only a single element of each per-layer assignment vector is nonzero, and, as a result, the output of the individual layers (17) is equivalent to (7), whose equivariance properties have been established in [eisen2020optimal]. Therefore, the composition in (18) is also equivariant, as in (15), and the objective in (19) is invariant to permutations for any realization of the channel matrix, as in (16). We conclude that the optimal module set is invariant to permutations. In other words, any relabelling of the transmitters in the network will produce the same permutation of the power allocation without any modification of the taps in the module set.
VII Experiments
In this section, we provide numerical results to elaborate on the advantages of black-box and modular meta-learning for power control in distributed wireless networks.
VII-A Network and Channel Model
As in [eisen2020optimal], a random geometric graph in two dimensions is drawn in each period by dropping each transmitter uniformly at random in the deployment area, with its paired receiver placed uniformly at random in the transmitter's vicinity. Given the geometric placement, the fading channel state between a transmitter and a receiver is given by
(34) 
where the subscript p denotes the path-loss gain, which is invariant during a period, and the subscript f denotes the fast-fading component, which depends on the time slot. The path-loss gain follows a power law of the transmitter-receiver distance with a fixed path-loss exponent. The fast-fading component is random and is drawn i.i.d. across links and time slots. Thereby, at each time slot, fading conditions change, and the instantaneous channel information is used by the model to generate the power allocation. The noise power and the maximum transmit power are fixed for all devices. The corresponding maximum average SINR over the topology generation is
(35) 
where the transmitter-receiver distance is a uniform random variable, and the expression follows from applying Cavalieri's quadrature formula. The large SNR implies that the system operates in the interference-limited regime, justifying the need for optimized power control policies. All details of the network and channel model are summarized in Table I in Appendix B.
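The drop-and-fade procedure of this subsection can be sketched as follows. The deployment-area size, the receiver offset range, the path-loss exponent value, the unit gain cap, and the Rayleigh fast fading are illustrative assumptions of this sketch rather than the exact values listed in Table I:

```python
import numpy as np

def draw_channels(K, rng, pathloss_exp=2.2, area=1.0):
    """Sample one geometric drop and one slot of fading (Eq. (34) sketch).

    The path-loss term is fixed for the period (it depends only on the
    drop geometry), while the fast-fading term is redrawn every slot.
    """
    tx = rng.uniform(0.0, area, size=(K, 2))
    # receivers dropped near their paired transmitters (assumed range)
    rx = tx + rng.uniform(-0.1, 0.1, size=(K, 2))
    # pairwise distances between every transmitter and every receiver
    d = np.linalg.norm(tx[:, None, :] - rx[None, :, :], axis=-1)
    # power-law path loss, capped at unit gain for very short distances
    h_path = np.minimum(1.0, d ** (-pathloss_exp))
    # Rayleigh-style fast fading, i.i.d. across links (assumption)
    h_fast = rng.rayleigh(scale=1.0, size=(K, K))
    return h_path * h_fast
```

Redrawing only `h_fast` across the slots of a period, while keeping `h_path` fixed, reproduces the within-period variability that the REGNN adapts to through its GSO.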
VII-B Model Architecture and Hyperparameters
As in [naderializadeh2020wireless], we consider a REGNN comprised of several hidden layers, each containing a graph filter of fixed size. The nonlinearity in (7) and (8) is a ReLU, except for the output layer, where we use a sigmoid. Unless stated otherwise, the number of modules is fixed. In all experiments, we set the input signal to an all-one vector. We define an annealing schedule for the temperature in (27) over the training epochs, whereby the temperature is decreased in every epoch until it reaches a predetermined minimal value [jang2016categorical]. All model hyperparameters are summarized in Table II in Appendix B.
VII-C Data Sets
We study the case in which the number of nodes in the network is fixed but the topology changes across periods, as well as the case in which the number of nodes in the network is also time-varying.
VII-C1 Fixed network size
In the first scenario, for a fixed number of links, each meta-training data set corresponds to a realization of the random drop of the transmitter-receiver pairs in the given period. Each drop is then run for multiple slots, whereby the fading coefficients are sampled i.i.d. at each slot.
VII-C2 Dynamic network size
In the second scenario, the size of the network is chosen uniformly at random within a fixed range. Each meta-training data set corresponds to a realization of the network size and to a random drop of the transmitter-receiver pairs, as discussed above.
In both scenarios, unless stated otherwise, we fix the number of meta-training periods, and the training and testing portions of each per-period data set contain an equal number of slots. The meta-learning hyperparameters are summarized in Table III in Appendix B.
VII-D Schemes and Benchmarks
We compare the performance of the following schemes:
VII-D1 Joint learning [eisen2020optimal]
Adopted in [eisen2020optimal], joint learning pools together the data from all meta-training periods in order to address problem (9) with an additional outer sum over periods. The model parameters are then fine-tuned at runtime using the samples available for the new topology.
VII-D2 Black-box meta-learning (Black-box ML)
As a representative black-box meta-learning method, we investigate the performance of FOMAML, as detailed in Algorithm 1. The same fixed number of gradient descent updates is used for both the task-specific and the shared parameters.
VII-D3 Modular meta-learning (Modular ML)
We consider the proposed modular meta-learning method, as detailed in Algorithm 2. The numbers of gradient descent updates for the assignment parameters and the module parameters are set separately.
VII-E Results
VII-E1 Runtime adaptation speed
To start, we evaluate the requirements in terms of the number of samples available for the new, meta-test topology at runtime by plotting the sum-rate as a function of the size of the adaptation data set in Fig. 4. We consider the more challenging case of networks with dynamic size. Fig. 4 confirms that meta-learning can adapt quickly to a new topology, using a much reduced number of samples as compared to joint learning [eisen2020optimal]. This validates the application of meta-learning to challenging communication problems like power control. Furthermore, modular meta-learning with both considered module-set sizes is observed to outperform black-box methods when few adaptation samples are available, with the caveat that a single adaptation sample is insufficient to determine a suitable module assignment when the number of modules is sufficiently large. This points to the benefits of a stronger (meta-)inductive bias in the regime where data availability is very limited. In particular, in modular ML, the adaptation samples are only used to determine the module assignment at runtime, and not to optimize the module parameters. As the number of samples for adaptation increases, the number of required modules grows, and eventually black-box ML becomes advantageous. Overall, the results in Fig. 4 reveal a tension between the sample efficiency of modular ML and the flexibility of black-box methods.