A Generalization Method of Partitioned Activation Function for Complex Number

02/08/2018 ∙ by HyeonSeok Lee, et al. ∙ Yonsei University

A method to convert a real-number partitioned activation function into a complex-number one is provided. The method has 4 variations: 1 has the potential to yield a holomorphic activation, 2 have the potential to conserve the complex angle, and the last 1 guarantees interaction between the real and imaginary parts. The method is applied to LReLU and SELU as examples. A complex-number activation function is a building block of complex-number ANNs, which have the potential to properly deal with complex-number problems, but complex activation is not yet well established. We therefore propose a way to extend partitioned real activations to complex numbers.

1 Motivation

Complex Number Artificial Neural Net (CANN) is a natural consequence of applying ANNs to complex numbers, which appear in many problems. Such problems include transform-based methods (Fourier, Laplace, Z, etc.) (Haberman, 2013), steady-state problems (Banerjee, 1994), 2D problems mapped to the complex plane (Greenberg, 1998), and electromagnetic signals such as MRI images (Virtue et al., 2017). Due to such necessity, CANNs have appeared since the 1990's, long before the days of deep learning, as discussed in (Reichert and Serre, 2013). As a result of various efforts, the building blocks of CANN have been developed: Convolution, Fully-connection, Back-propagation, Batch-normalization, Initialization, Pooling, and Activation. Among them, complex Convolution and Fully-connection are essentially multiplication and addition, which are clearly established for complex numbers in mathematics (Trabelsi et al., 2017). Back-propagation for complex numbers has been developed for general complex functions (Georgiou and Koutsougeras, 1992; Nitta, 1997) and for holomorphic functions (La Corte and Zou, 2014). Complex Batch-normalization has also been developed, following the idea of real Batch-normalization (Trabelsi et al., 2017). Complex-number Initialization is suggested to be done in polar form, not Cartesian (Trabelsi et al., 2017). Max Pooling for complex numbers is another necessity, but the authors are not aware of any research on it. Complex Activation is in active development (Arjovsky et al., 2015; Guberman, 2016; Virtue et al., 2017; Georgiou and Koutsougeras, 1992); see (La Corte and Zou, 2014) for a brief review of the various types of complex activation functions.

For complex activation, holomorphic (a.k.a. analytic) functions have been an important topic (Hitzer, 2013; Jalab and Ibrahim, 2011; Vitagliano et al., 2003; Kim and Adali, 2001; Mandic and Goh, 2009; Tripathi and Kalra, 2010), since holomorphy makes back-propagation simpler and thus training faster (Amin et al., 2011; La Corte, 2014). The importance of the complex phase angle in CANN has also been suggested, including its similarity to biological neurons (Reichert and Serre, 2013). This importance leads to the idea of phase-preserving complex activation functions (Georgiou and Koutsougeras, 1992; Virtue et al., 2017). In contrast, (Hirose, 2013; Kim and Adalı, 2003) suggested that phase-preservation makes training more difficult.

We focus on complex activation, and propose a method to generalize partitioned activations such as LReLU and SELU to complex numbers. The method has 4 variations to accommodate various cases: 1 of them is potentially holomorphic and alters the phase, 1 is not holomorphic and alters the phase while guaranteeing interaction between the real and imaginary parts, and 2 are not holomorphic and potentially keep the complex phase angle.

This article is structured as follows. Sec. 2 is a brief review of existing complex activation functions. Sec. 3 then presents our generalization method along with examples based on LReLU (Maas et al., 2013) and SELU (Klambauer et al., 2017). Finally, Sec. 4 summarizes the article.

2 Review of Current Complex Activation Functions

Certain complex-argument activation functions are derived from real-argument counterparts in 3 ways, depending on the characteristics of each function: no change, modification, and generalization. We will briefly discuss them with particular weight on generalization, since our approach is to generalize.

2.1 No change

Certain non-partitioned activation functions can be used for complex arguments without any change. Such functions include the logistic function, tanh, etc. These are already defined for complex arguments, and we can just use them. But they suffer from poles near the origin (La Corte and Zou, 2014).
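As a quick illustration of the pole issue (a small NumPy check of our own, not from the paper): tanh accepts complex arguments directly, but its nearest poles sit at z = ±iπ/2, well within the typical operating range of an activation.

import numpy as np

print(np.tanh(0.3 + 0.2j))   # ordinary point: well-behaved
print(np.tanh(1.5707j))      # just below the pole at i*pi/2: magnitude blows up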

2.2 Modification

Partitioned activation functions are not straightforward for complex arguments, due to the partition points on the real axis. So the approach of "just take the idea of the real activation, and create a corresponding complex activation" has been tried. We call it modification. One example is modReLU (Arjovsky et al., 2015):

modReLU(z) = ReLU(|z| + b) e^{iθ_z}     (1)

where θ_z is the phase angle of z, |z| is its magnitude, and b is a trainable parameter. The modReLU takes the idea of an inactive region from ReLU, and forms the inactive region around the origin. Unfortunately, the modReLU is not holomorphic, although it keeps the complex phase.
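For concreteness, a minimal NumPy sketch of the modReLU rule described above (the function name and array handling are ours):

import numpy as np

def modrelu(z, b):
    # Rectify the offset magnitude, keep the phase of z untouched;
    # b is the trainable real-valued offset of Eq. (1).
    return np.maximum(np.abs(z) + b, 0.0) * np.exp(1j * np.angle(z))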

2.3 Generalization

We call a complex activation Generalized if its values on the real axis coincide with those of the real-argument counterpart. The easiest generalization of a partitioned activation function would be to separately apply the activation function to the real and imaginary parts (Nitta, 1997; Faijul Amin and Murase, 2009). One example is the Separate Complex ReLU (SCReLU):

SCReLU(z) = ReLU(x) + i ReLU(y)     (2)

where z = x + iy. Another simple generalization of ReLU is to activate only when both the real and imaginary parts are positive, which is called zReLU (Guberman, 2016):

zReLU(z) = z if x > 0 and y > 0, and 0 otherwise     (3)

Both SCReLU and zReLU are holomorphic, and are essentially not complex but 2 separate real activations (La Corte and Zou, 2014). Another generalization of ReLU is the Complex Cardioid (CC) (Virtue et al., 2017):

CC(z) = (1/2) (1 + cos θ_z) z     (4)

which keeps the complex angle, but is not holomorphic. Our approach is inspired by the CC.
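The three generalizations above can be stated compactly in code (a NumPy sketch; the function names are ours):

import numpy as np

def screlu(z):
    # Separate Complex ReLU: ReLU applied to real and imaginary parts independently.
    return np.maximum(z.real, 0.0) + 1j * np.maximum(z.imag, 0.0)

def zrelu(z):
    # zReLU (Guberman, 2016): pass z only where both parts are positive.
    return np.where((z.real > 0) & (z.imag > 0), z, 0.0)

def complex_cardioid(z):
    # Complex Cardioid (Virtue et al., 2017): real-valued cardioid scale, phase preserved.
    return 0.5 * (1.0 + np.cos(np.angle(z))) * z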

3 New Generalization Method

Consider a partitioned activation function for real numbers, which has the typical form

f(x) = f_+(x) for x ≥ 0, and f_-(x) for x < 0     (5)

where f_+ and f_- are the local functions for the positive and negative regions. Activation functions like LReLU and SELU are examples of the above case. Now, replace the partitioning with the Heaviside unit-step function H(x), and f can be rewritten as

f(x) = H(x) f_+(x) + H(-x) f_-(x)     (6)

Next, select complex-argument functions whose values on the real axis coincide with H(x) and H(-x). We pick

(7a)
(7b)

where θ_z is the phase angle of the complex number z, and n (an integer) is a parameter. Then we can get a generalized complex-argument function

(8)

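As a sketch of how (6)-(8) fit together (our own illustration: we assume a cardioid-like, real-valued step (1 + cos θ_z)/2 in place of the Heaviside factors, i.e. one simple member of the family rather than the paper's exact Eq. (7)):

import numpy as np

def h_step(z):
    # Coincides with the Heaviside step on the real axis:
    # equals 1 for positive real z, 0 for negative real z.
    return 0.5 * (1.0 + np.cos(np.angle(z)))

def generalize(f_pos, f_neg):
    # Build a complex activation from the two local functions of Eq. (5)/(6).
    return lambda z: h_step(z) * f_pos(z) + h_step(-z) * f_neg(z)

# Example: a complex LReLU-like activation from f_+(z) = z and f_-(z) = 0.1 z.
clrelu_like = generalize(lambda z: z, lambda z: 0.1 * z)
print(clrelu_like(np.array([2.0 + 0j, -2.0 + 0j, 1.0 + 1.0j])))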
This approach can be easily extended to more complicated cases, like the S-shaped ReLU (SReLU), which has 3 partitions. The typical form

(9)

is rewritten as

(10)

and then generalized to

(11)

where θ_{z-p} denotes the phase angle of the complex number z − p, and p is the location of the partition boundary.
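For concreteness, one plausible reading of the 3-partition construction (our notation and an assumption, since Eqs. (9)-(11) are not reproduced here) with boundaries p_1 < p_2 and local functions f_l, f_m, f_r is

f(x) = H(p_1 − x) f_l(x) + H(x − p_1) H(p_2 − x) f_m(x) + H(x − p_2) f_r(x),

after which each factor H(x − p) is replaced by the chosen complex-argument step evaluated with the phase angle θ_{z-p}.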

In case a real-valued scale is preferred (e.g., to keep the complex angle), we make a simple modification so that the replacement of the Heaviside step is real-valued. The modifications are

(12a)

and

(12b)

and the generalizations using them are

(13a)

and

(13b)

respectively.
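To illustrate what such real-valued replacements can look like (our assumption for illustration; the exact forms of (12)-(13) are not reproduced here), two candidates that coincide with H(x) and H(-x) on the real axis are a cosine-based scale and an absolute-value-based scale:

import numpy as np

def step_cos(z):
    # (1 + cos(theta_z)) / 2: equals 1 on the positive real axis, 0 on the negative one.
    return 0.5 * (1.0 + np.cos(np.angle(z)))

def step_abs(z):
    # |cos(theta_z / 2)|: also equals 1 at theta = 0 and 0 at theta = pi.
    return np.abs(np.cos(0.5 * np.angle(z)))

def real_scaled(f_pos, f_neg, step):
    # A real-valued scale keeps whatever phase f_pos(z) and f_neg(z) produce.
    return lambda z: step(z) * f_pos(z) + step(-z) * f_neg(z)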

Another approach is to approximate H(x) on the real axis with a complex-argument function. A few well-known candidates are based on the hyperbolic tangent, arc tangent, sine integral, and error functions. Among them, we pick the tanh-based sigmoid (logistic) function below, since the other candidates are less suitable, e.g., due to poles on the imaginary axis or oscillatory behavior on the real axis.

σ(z) = 1 / (1 + e^{-z/ε})     (14)

where ε is just a parameter such that a smaller ε more closely approximates H on the real axis. Then the approximately generalized functions are

(15)
(16)

for the 2-partition and 3-partition cases, respectively. An important advantage of this approximate generalization is that the generalized activation is holomorphic if all the local functions are holomorphic. The reason is simple: a) the logistic function is holomorphic, and b) products and sums of holomorphic functions are also holomorphic. Then each term in (15) and (16) is holomorphic, being a product of holomorphic functions, and the whole activation is holomorphic, being a sum of holomorphic terms. Using normalization (e.g., Batch Renormalization (Ioffe, 2017)) is suggested to prevent exploding feature values.
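A minimal sketch of this approximate, potentially holomorphic generalization, assuming the logistic form of Eq. (14) and the 2-partition case:

import numpy as np

def sigma(z, eps=0.1):
    # Logistic step: approximates the Heaviside function on the real axis for small eps,
    # and is holomorphic in z (away from its poles on the imaginary axis).
    return 1.0 / (1.0 + np.exp(-z / eps))

def holomorphic_generalize(f_pos, f_neg, eps=0.1):
    # sigma(-z) = 1 - sigma(z), so the two factors still sum to one.
    return lambda z: sigma(z, eps) * f_pos(z) + sigma(-z, eps) * f_neg(z)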

3.1 Example 1: Leaky Rectified Linear Unit (LReLU)

The LReLU (Maas et al., 2013) is one of the most popular activation functions (the ReLU is just an LReLU with α = 0). The real-argument LReLU is

LReLU(x) = x for x ≥ 0, and αx for x < 0     (17)

and is generalized using (8) with n=0 to “Complex LReLU” (CLReLU) as

(18)

which alters both the phase and the magnitude of the input and has cross-influence between the real and imaginary components. To keep the complex angle of the argument, we use (13a) and (13b) with n = 0 to get

(19a)

and

(19b)

which we call “cos LReLU” (cLReLU) and “abs LReLU” (aLReLU), respectively. Meanwhile, a “Holomorphic LReLU” (HLReLU) can be derived using (15) as

(20)

which is holomorphic, since both local functions, z and αz, are polynomials, which are holomorphic. Please note that the HLReLU alters the phase angle of the argument, since the argument z is multiplied by a complex number.
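A self-contained sketch of an HLReLU-like function under the same assumptions (the logistic step of Eq. (14); the values of α and ε are illustrative, not the paper's):

import numpy as np

def hlrelu_like(z, alpha=0.1, eps=0.1):
    # Logistic factors replace H(x) and H(-x); note sigma(-z) = 1 - sigma(z).
    s = 1.0 / (1.0 + np.exp(-z / eps))
    return s * z + (1.0 - s) * alpha * z

# Near the real axis this approximates the LReLU; it is a combination of
# holomorphic factors, hence holomorphic away from the poles of the logistic.
print(hlrelu_like(np.array([3.0 + 0j, -3.0 + 0j, 1.0 + 1.0j])))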

3.2 Example 2: Scaled Exponential Linear Unit (SELU)

The SELU (Klambauer et al., 2017) was recently developed, and has rapidly become popular due to its self-normalizing property (the ELU is just a SELU with λ = 1). The real-argument SELU is

SELU(x) = λx for x ≥ 0, and λα(e^x − 1) for x < 0     (21)

where λ and α are parameters with optimum values λ ≈ 1.0507 and α ≈ 1.6733. The SELU is generalized using (8), (13a), and (13b) with n = 0 to "Complex SELU" (CSELU), "cos SELU" (cSELU), and "abs SELU" (aSELU), respectively, as

(22a)
(22b)
(22c)

The complex phase is altered by all 3 of CSELU, cSELU, and aSELU, unlike cLReLU (19a) and aLReLU (19b). Instead, all of them have interaction between the real and imaginary components of the argument.

Meanwhile, a "Holomorphic SELU" (HSELU) can be derived using (15) as

(23)

which is also holomorphic, since both the scaled shifted exponential and polynomial are holomorphic.
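For completeness, analogous sketches for the SELU case, under the same assumed step functions used above (the cardioid-like real scale and the logistic step; the exact Eqs. (22)-(23) are not reproduced here):

import numpy as np

LAM, ALPHA = 1.0507, 1.6733  # SELU constants (Klambauer et al., 2017)

def cselu_like(z):
    # Cardioid-like real scale; the phase is still altered because
    # f_-(z) = alpha * (exp(z) - 1) is itself complex-valued.
    c = np.cos(np.angle(z))
    return LAM * (0.5 * (1 + c) * z + 0.5 * (1 - c) * ALPHA * (np.exp(z) - 1.0))

def hselu_like(z, eps=0.1):
    # Logistic-smoothed variant; every factor is holomorphic (away from the
    # poles of the logistic), so the whole function is as well.
    s = 1.0 / (1.0 + np.exp(-z / eps))
    return LAM * (s * z + (1.0 - s) * ALPHA * (np.exp(z) - 1.0))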

4 Concluding Remark

A generalization of partitioned real-number activation functions to complex numbers has been proposed. The generalization process has 2 steps: a) replace the partitioning of the activation with Heaviside step functions H, and b) generalize H with a complex-number function. The generalization has 4 variations, and 1 of them is potentially holomorphic. The generalization scheme has been demonstrated using 2 popular partitioned activations, LReLU and SELU. The properties of the generalized complex activations are summarized in Table 1.

Original Real Activation | Generalized Complex Activation | Holomorphic | Real-Imaginary Interaction | Phase Preserving
LReLU (17)               | CLReLU (18)                    | X           | O                          | X
                         | cLReLU (19a)                   | X           | X                          | O
                         | aLReLU (19b)                   | X           | X                          | O
                         | HLReLU (20)                    | O           | O                          | X
SELU (21)                | CSELU (22a)                    | X           | O                          | X
                         | cSELU (22b)                    | X           | O                          | X
                         | aSELU (22c)                    | X           | O                          | X
                         | HSELU (23)                     | O           | O                          | X
Table 1: Properties of the Generalized Complex Activations

Furthermore, the method can be applied to various partitioned real activations to turn them into complex activations. We hope that this humble research adds another building block for complex ANNs, which are important for complex-number problems.

References