1 Motivation
Complex Number Artificial Neural Nets (CANN) are a natural consequence of using ANNs for complex numbers, and complex numbers appear in many problems. Complex-number problems include transform-based methods (Fourier, Laplace, Z, etc.) (Haberman, 2013), steady-state problems (Banerjee, 1994), mapping of 2D problems to the complex plane (Greenberg, 1998), and electromagnetic signals such as MRI images (Virtue et al., 2017). Due to such necessity, CANNs have appeared since the 1990's, long before the days of deep learning, as discussed in
(Reichert and Serre, 2013). As a result of various efforts, the building blocks of CANN have been developed: convolution, fully-connected layers, backpropagation, batch normalization, initialization, pooling, and activation. Among them, complex convolution and fully-connected layers are essentially multiplication and addition, which are clearly established for complex numbers in mathematics (Trabelsi et al., 2017). Backpropagation for complex numbers has been developed for general complex functions (Georgiou and Koutsougeras, 1992; Nitta, 1997) and for holomorphic functions (La Corte and Zou, 2014). Complex batch normalization has also been developed, following the idea of real batch normalization (Trabelsi et al., 2017). Complex initialization is suggested to be done in polar form (Trabelsi et al., 2017), not in Cartesian form. Max pooling for complex numbers is another necessity, but the authors are not aware of any research on it. Complex activation is in active development (Arjovsky et al., 2015; Guberman, 2016; Virtue et al., 2017; Georgiou and Koutsougeras, 1992); see (La Corte and Zou, 2014) for a brief review of the various types of complex activation functions. For complex activation, holomorphic (a.k.a. analytic) functions have been an important topic (Hitzer, 2013; Jalab and Ibrahim, 2011; Vitagliano et al., 2003; Kim and Adali, 2001; Mandic and Goh, 2009; Tripathi and Kalra, 2010), since they make backpropagation simpler and thus training faster (Amin et al., 2011; La Corte, 2014).
The importance of the complex phase angle in CANN has been suggested, including its similarity to biological neurons (Reichert and Serre, 2013). This importance leads to the idea of phase-preserving complex activation functions (Georgiou and Koutsougeras, 1992; Virtue et al., 2017). In contrast, (Hirose, 2013; Kim and Adalı, 2003) suggested that phase preservation makes training more difficult. We focus on complex activation, and propose a method to generalize partitioned activations such as LReLU and SELU to complex numbers. The method has 4 variations to accommodate various cases: one is potentially holomorphic and alters the phase, one is not holomorphic and alters the phase while guaranteeing interaction between the real and imaginary parts, and two are not holomorphic and potentially keep the complex phase angle.
2 Review of Current Complex Activation Functions
Certain complex-argument activation functions are derived from real-argument counterparts in 3 ways, depending on the characteristics of each function: no change, modification, and generalization. We briefly discuss them, with particular weight on generalization, since our approach is to generalize.
2.1 No change
Certain non-partitioned activation functions can be used for complex arguments without any change. Such functions include the logistic, tanh, and arctan functions. These are already defined for complex arguments, and we can simply use them. But they suffer from poles near the origin (La Corte and Zou, 2014).
2.2 Modification
Partitioned activation functions are not straightforward for complex arguments, due to the partition points on the real axis. So the approach of "just take the idea of the real activation, and create a corresponding complex activation" has been tried. We call it modification. One example is modReLU (Arjovsky et al., 2015):

    modReLU(z) = ReLU(|z| + b) e^{iθ_z}    (1)

where z ∈ ℂ, θ_z is the phase angle of z, and b is a trainable parameter. The modReLU takes the idea of the inactive region from ReLU, and forms the region as a disc around the origin. Unfortunately, the modReLU is not holomorphic, although it keeps the complex phase.
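As a concrete sketch, the formula modReLU(z) = ReLU(|z| + b) e^{iθ_z} takes only a few lines of NumPy (function and variable names here are ours, not from the cited paper):

```python
import numpy as np

def modrelu(z, b):
    """modReLU (Arjovsky et al., 2015): ReLU(|z| + b) * exp(i*theta_z).

    With b < 0, the inactive region is a disc of radius -b around the
    origin; the phase of z is preserved."""
    r = np.abs(z)
    scale = np.maximum(r + b, 0.0)      # ReLU applied to the magnitude
    phase = np.exp(1j * np.angle(z))    # unit vector e^{i theta_z}
    return scale * phase

z = np.array([3 + 4j, 0.1 + 0.1j])
out = modrelu(z, b=-1.0)
# |3+4j| = 5 -> magnitude shrinks to 4, phase kept;
# |0.1+0.1j| < 1 -> inside the disc, deactivated to 0
```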
2.3 Generalization
We call a complex activation generalized if its values on the real axis coincide with those of its real-argument counterpart. The easiest generalization of a partitioned activation function is to apply the activation separately to the real and imaginary parts (Nitta, 1997; Faijul Amin and Murase, 2009). One example is the Separate Complex ReLU (SCReLU):

    SCReLU(z) = ReLU(x) + i ReLU(y)    (2)
where z = x + iy. Another simple generalization of ReLU is to activate only when both the real and imaginary parts are positive, which is called zReLU (Guberman, 2016):

    zReLU(z) = { z,  if x ≥ 0 and y ≥ 0;  0,  otherwise }    (3)
Both SCReLU and zReLU are holomorphic, and are essentially not complex but 2 separate real activations (La Corte and Zou, 2014). Another generalization of ReLU is the Complex Cardioid (CC) (Virtue et al., 2017):

    CC(z) = (1/2)(1 + cos θ_z) z    (4)

which keeps the complex angle, but is not holomorphic. Our approach is inspired by the CC.
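The three generalizations reviewed in this section are easy to compare side by side. Below is a small sketch (function names are ours) checking that each of them reduces to ReLU for real-axis inputs:

```python
import numpy as np

def screlu(z):
    # Separate Complex ReLU: ReLU applied to real and imaginary parts
    return np.maximum(z.real, 0.0) + 1j * np.maximum(z.imag, 0.0)

def zrelu(z):
    # zReLU (Guberman, 2016): pass z only in the first quadrant
    keep = (z.real >= 0) & (z.imag >= 0)
    return np.where(keep, z, 0.0)

def cardioid(z):
    # Complex Cardioid (Virtue et al., 2017): real scale, phase kept
    return 0.5 * (1.0 + np.cos(np.angle(z))) * z

x = np.array([2.0 + 0j, -2.0 + 0j])   # points on the real axis
for f in (screlu, zrelu, cardioid):
    y = f(x)
    # positive reals pass, negative reals vanish: ReLU behaviour
```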
3 New Generalization Method
Consider a partitioned activation function for real numbers, which has the typical form

    f(x) = { f_+(x),  x ≥ 0;  f_-(x),  x < 0 }    (5)

where f_+ and f_- are the local functions for the positive and negative regions. Activation functions like LReLU and SELU are examples of this form. Now, replace the partitions with the Heaviside unit-step function H(x), and we can rewrite (5) as

    f(x) = H(x) f_+(x) + H(−x) f_-(x)    (6)
Next, select a complex-argument function whose real-axis values coincide with H(x) and H(−x). We pick

    σ_n(z) = (1 + e^{i(2n+1)θ_z}) / 2    (7a)

    σ_n(−z) = (1 − e^{i(2n+1)θ_z}) / 2    (7b)

where θ_z is the phase angle of the complex number z, and n (an integer) is a parameter; on the real axis θ_z is 0 or π, so σ_n(z) = H(x) and σ_n(−z) = H(−x) for any integer n. Then, we get a generalized complex-argument function

    f(z) = σ_n(z) f_+(z) + σ_n(−z) f_-(z)    (8)
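The two-partition recipe can be sketched numerically. A minimal sketch, assuming the complex step σ(z) = (1 + e^{iθ_z})/2, which equals the Heaviside step for real arguments (the helper names are ours):

```python
import numpy as np

def sigma(z):
    # Complex "step": equals 1 on the positive real axis, 0 on the negative
    return 0.5 * (1.0 + np.exp(1j * np.angle(z)))

def generalize(f_pos, f_neg):
    # Build a complex activation from the two local real-activation branches
    return lambda z: sigma(z) * f_pos(z) + sigma(-z) * f_neg(z)

# Example: LReLU branches f_+(x) = x, f_-(x) = 0.1 x
clrelu = generalize(lambda z: z, lambda z: 0.1 * z)

x = np.array([1.5 + 0j, -1.5 + 0j])
y = clrelu(x)   # on the real axis this must match the real LReLU
```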
This approach easily extends to more complicated cases, like the S-shaped ReLU (SReLU), which has 3 partitions. The typical form

    f(x) = { f_1(x),  x < p_1;  f_2(x),  p_1 ≤ x < p_2;  f_3(x),  x ≥ p_2 }    (9)

is rewritten as

    f(x) = H(p_1 − x) f_1(x) + H(x − p_1) H(p_2 − x) f_2(x) + H(x − p_2) f_3(x)    (10)

and then generalized to

    f(z) = σ_n(p_1 − z) f_1(z) + σ_n(z − p_1) σ_n(p_2 − z) f_2(z) + σ_n(z − p_2) f_3(z)    (11)

where σ_n(z − p) = (1 + e^{i(2n+1)θ_{z−p}}) / 2, θ_{z−p} denotes the phase angle of the complex number z − p, and p is the location of a partition boundary.
In case a real-valued scale is preferred (e.g., to keep the complex angle), we make simple modifications so that the replacement of H is real-valued. The modifications are

    c_n(z) = (1 + cos((2n+1)θ_z)) / 2    (12a)

and

    a_n(z) = |1 + e^{i(2n+1)θ_z}| / 2 = |cos((2n+1)θ_z / 2)|    (12b)

and the generalizations using them are

    f(z) = c_n(z) f_+(z) + c_n(−z) f_-(z)    (13a)

and

    f(z) = a_n(z) f_+(z) + a_n(−z) f_-(z)    (13b)

respectively.
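When both local functions are proportional to z (as in LReLU), a real-valued scale preserves the phase. A sketch assuming the cosine-based scale c(z) = (1 + cos θ_z)/2 as the real-valued replacement of H (names and the α value are ours):

```python
import numpy as np

def c_scale(z):
    # Real-valued "step": (1 + cos(theta_z))/2, equal to H(x) on the real axis
    return 0.5 * (1.0 + np.cos(np.angle(z)))

def clrelu_cos(z, alpha=0.1):
    # cos-variant LReLU: both branches proportional to z -> phase preserved
    return c_scale(z) * z + c_scale(-z) * (alpha * z)

z = 2.0 * np.exp(1j * 0.7)   # arbitrary nonzero input with phase 0.7
out = clrelu_cos(z)
# the combined scale is real and positive, so the output phase stays 0.7
```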
Another approach is to approximate H(x) on the real axis with a complex-argument function. A few well-known candidates are based on the tanh (hyperbolic tangent), arctan (arc tangent), Si (sine integral), and erf (error function) functions. Among them, we pick the tanh-based sigmoid (logistic) function below, since the arctan- and erf-based forms behave badly (singularities or divergence) on the imaginary axis, and the Si-based form is oscillatory on the real axis:

    S_T(z) = 1 / (1 + e^{−z/T})    (14)

where T is just a parameter such that a smaller T more closely approximates H(x) on the real axis. Then the approximately generalized functions are

    f(z) = S_T(z) f_+(z) + S_T(−z) f_-(z)    (15)

    f(z) = S_T(p_1 − z) f_1(z) + S_T(z − p_1) S_T(p_2 − z) f_2(z) + S_T(z − p_2) f_3(z)    (16)

respectively for the 2-partition case (5) and the 3-partition case (9). An important advantage of this approximate generalization is that f(z) is holomorphic if all the f_i are holomorphic. The reason is simple: a) the logistic function is holomorphic, b) products and sums of holomorphic functions are also holomorphic. Then, each term in (15) and (16) is holomorphic, being a product of holomorphic functions, and f(z) is holomorphic, being a sum of holomorphic functions. Using normalization (e.g., Batch Renormalization (Ioffe, 2017)) is suggested to prevent exploding feature values toward infinity.
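The quality of the approximation on the real axis is easy to check. A sketch assuming the logistic step S_T(x) = 1/(1 + e^{−x/T}), which tends to the Heaviside step as T → 0:

```python
import numpy as np

def logistic_step(x, T):
    # Smooth, holomorphic replacement of the Heaviside step H(x)
    return 1.0 / (1.0 + np.exp(-x / T))

x = np.array([-1.0, -0.1, 0.1, 1.0])
for T in (1.0, 0.1, 0.01):
    s = logistic_step(x, T)
    # as T shrinks, s approaches H(x) = [0, 0, 1, 1]
```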
3.1 Example 1: Leaky Rectified Linear Unit (LReLU)
The LReLU (Maas et al., 2013) is one of the most popular activation functions (the ReLU is just a LReLU with α = 0). The real-argument LReLU is

    LReLU(x) = { x,  x ≥ 0;  αx,  x < 0 }    (17)

and is generalized using (8) with n = 0 into the "Complex LReLU" (CLReLU):

    CLReLU(z) = ((1 + e^{iθ_z}) / 2) z + ((1 − e^{iθ_z}) / 2) αz    (18)

which alters both the phase and the magnitude of the input, and has cross-influence between the real and imaginary components. To keep the complex angle of the argument, we use (13a) and (13b) with n = 0 to get

    cLReLU(z) = ((1 + cos θ_z) / 2) z + ((1 − cos θ_z) / 2) αz    (19a)

and

    aLReLU(z) = |cos(θ_z / 2)| z + |sin(θ_z / 2)| αz    (19b)

which we call the "cos LReLU" (cLReLU) and the "abs LReLU" (aLReLU), respectively. Meanwhile, a "Holomorphic LReLU" (HLReLU) can be derived using (15):

    HLReLU(z) = z / (1 + e^{−z/T}) + αz / (1 + e^{z/T})    (20)

which is holomorphic, since both z and αz are polynomials, which are all holomorphic. Please note that the HLReLU alters the phase angle of the argument, since the argument z is multiplied by a complex number.
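As a numeric sanity check, assume the concrete blended form HLReLU(z) = z/(1 + e^{−z/T}) + αz/(1 + e^{z/T}) from the logistic-step recipe (the T and α values below are arbitrary choices of ours):

```python
import numpy as np

def hlrelu(z, alpha=0.1, T=0.05):
    # Smooth blend of the branches z and alpha*z via the logistic step;
    # built from holomorphic pieces, unlike the e^{i theta}-based CLReLU
    return z / (1.0 + np.exp(-z / T)) + alpha * z / (1.0 + np.exp(z / T))

x = np.array([3.0 + 0j, -3.0 + 0j])
y = hlrelu(x)
# for real x well away from 0, this approximates LReLU: ~3.0 and ~-0.3
```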
3.2 Example 2: Scaled Exponential Linear Unit (SELU)
The SELU (Klambauer et al., 2017) was developed recently, and has rapidly become popular due to its self-normalizing feature (the ELU is just a SELU with λ = 1). The real-argument SELU is

    SELU(x) = λ { x,  x ≥ 0;  α(e^x − 1),  x < 0 }    (21)

where λ and α are parameters with optimum values λ ≈ 1.0507 and α ≈ 1.6733. The SELU is generalized using (8), (13a), and (13b) with n = 0 into the "Complex SELU" (CSELU), the "cos SELU" (cSELU), and the "abs SELU" (aSELU), respectively:

    CSELU(z) = λ [ ((1 + e^{iθ_z}) / 2) z + ((1 − e^{iθ_z}) / 2) α(e^z − 1) ]    (22a)

    cSELU(z) = λ [ ((1 + cos θ_z) / 2) z + ((1 − cos θ_z) / 2) α(e^z − 1) ]    (22b)

    aSELU(z) = λ [ |cos(θ_z / 2)| z + |sin(θ_z / 2)| α(e^z − 1) ]    (22c)

The complex phase is altered by all 3 of CSELU, cSELU, and aSELU, unlike cLReLU (19a) and aLReLU (19b), because the negative-side branch α(e^z − 1) is not proportional to z. Instead, all 3 have interaction between the real and imaginary components of the argument.
Meanwhile, a "Holomorphic SELU" (HSELU) can be derived using (15):

    HSELU(z) = λ [ z / (1 + e^{−z/T}) + α(e^z − 1) / (1 + e^{z/T}) ]    (23)

which is also holomorphic, since both the scaled shifted exponential α(e^z − 1) and the polynomial z are holomorphic.
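As with the LReLU family, the complex SELU variants should reproduce the real SELU on the real axis. A sketch (assuming the blended form with the n = 0 complex step; the constants are the published SELU optima, the function names are ours):

```python
import numpy as np

LAM, ALPHA = 1.0507, 1.6733   # optimum SELU parameters (Klambauer et al., 2017)

def selu(x):
    # Real-argument SELU
    return LAM * np.where(x >= 0, x, ALPHA * (np.exp(x) - 1.0))

def cselu(z):
    # Complex SELU sketch: blend the branches z and alpha*(e^z - 1)
    # with the complex step (1 + e^{i theta_z}) / 2
    step = lambda w: 0.5 * (1.0 + np.exp(1j * np.angle(w)))
    return LAM * (step(z) * z + step(-z) * ALPHA * (np.exp(z) - 1.0))

x = np.array([1.0, -1.0])
y = cselu(x)   # on the real axis, cselu reduces to the real selu
```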
4 Concluding Remark
A generalization of partitioned real-number activation functions to complex numbers has been proposed. The generalization process has 2 steps: a) replace the partitions in the activation with the Heaviside step H, and b) generalize H with a complex-number function. The generalization has 4 variations, and one of them is potentially holomorphic. The generalization scheme has been demonstrated on 2 popular partitioned activations, LReLU and SELU. The properties of the generalized complex activations are summarized in Table 1.
Table 1: Properties of the generalized complex activations.

Original Real Activation | Generalized Complex Activation | Holomorphic | Real-Complex Interaction | Phase Preserving
-------------------------|--------------------------------|-------------|--------------------------|-----------------
LReLU (17)               | CLReLU (18)                    | X           | O                        | X
LReLU (17)               | cLReLU (19a)                   | X           | X                        | O
LReLU (17)               | aLReLU (19b)                   | X           | X                        | O
LReLU (17)               | HLReLU (20)                    | O           | O                        | X
SELU (21)                | CSELU (22a)                    | X           | O                        | X
SELU (21)                | cSELU (22b)                    | X           | O                        | X
SELU (21)                | aSELU (22c)                    | X           | O                        | X
SELU (21)                | HSELU (23)                     | O           | O                        | X
Furthermore, the method can be applied to various other partitioned real activations to turn them into complex activations. We hope that this humble research adds another building block for complex ANNs, which are important for complex-number problems.
References

Amin et al. [2011] Md Faijul Amin, Muhammad Ilias Amin, A. Y. H. Al-Nuaimi, and Kazuyuki Murase. Wirtinger Calculus Based Gradient Descent and Levenberg-Marquardt Learning Algorithms in Complex-Valued Neural Networks. In Neural Information Processing, Lecture Notes in Computer Science, pages 550–559. Springer, Berlin, Heidelberg, November 2011. ISBN 9783642249549, 9783642249556. doi: 10.1007/9783642249556_66. URL https://link.springer.com/chapter/10.1007/9783642249556_66.

Arjovsky et al. [2015] Martin Arjovsky, Amar Shah, and Yoshua Bengio. Unitary Evolution Recurrent Neural Networks. arXiv:1511.06464 [cs, stat], November 2015. URL http://arxiv.org/abs/1511.06464.
Banerjee [1994] Prasanta Kumar Banerjee. The Boundary Element Methods in Engineering. McGraw-Hill College, London; New York, rev sub edition, January 1994. ISBN 9780077077693.

Faijul Amin and Murase [2009] Md. Faijul Amin and Kazuyuki Murase. Single-layered complex-valued neural network for real-valued classification problems. Neurocomputing, 72(4):945–955, January 2009. ISSN 09252312. doi: 10.1016/j.neucom.2008.04.006. URL http://www.sciencedirect.com/science/article/pii/S0925231208002439.

Georgiou and Koutsougeras [1992] G. M. Georgiou and C. Koutsougeras. Complex domain backpropagation. IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, 39(5):330–334, May 1992. ISSN 10577130. doi: 10.1109/82.142037. URL http://ieeexplore.ieee.org/document/142037/.

Greenberg [1998] Michael Greenberg. Advanced Engineering Mathematics. Prentice Hall, Upper Saddle River, N.J, 2nd edition, January 1998. ISBN 9780133214314. GoogleBooksID: CpoUngEACAAJ.
 Guberman [2016] Nitzan Guberman. On Complex Valued Convolutional Neural Networks. arXiv:1602.09046 [cs], February 2016. URL http://arxiv.org/abs/1602.09046. arXiv: 1602.09046.

Haberman [2013] Richard Haberman. Applied Partial Differential Equations: With Fourier Series and Boundary Value Problems. Pearson, 2013. ISBN 9780321797056. GoogleBooksID: hGNwLgEACAAJ.

Hirose [2013] Akira Hirose, editor. Complex-valued neural networks: advances and applications. IEEE Press series on computational intelligence. John Wiley & Sons Inc, Hoboken, N.J, 2013. ISBN 9781118344606. OCLC: ocn812254892.
Hitzer [2013] Eckhard Hitzer. Non-constant bounded holomorphic functions of hyperbolic numbers – Candidates for hyperbolic activation functions. arXiv:1306.1653 [cs, math], June 2013. URL http://arxiv.org/abs/1306.1653.
Ioffe [2017] Sergey Ioffe. Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models. arXiv:1702.03275 [cs], February 2017. URL http://arxiv.org/abs/1702.03275.
Jalab and Ibrahim [2011] Hamid A. Jalab and Rabha W. Ibrahim. New activation functions for complex-valued neural network. International Journal of Physical Sciences, 6(7):1766–1772, 2011.
Kim and Adali [2001] Taehwan Kim and Tülay Adali. Complex backpropagation neural network using elementary transcendental activation functions. In Proceedings of the 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '01), volume 2, pages 1281–1284. IEEE, 2001.

Kim and Adalı [2003] Taehwan Kim and Tülay Adalı. Approximation by fully complex multilayer perceptrons. Neural Computation, 15(7):1641–1666, 2003. URL https://www.mitpressjournals.org/doi/abs/10.1162/089976603321891846.

Klambauer et al. [2017] Günter Klambauer, Thomas Unterthiner, Andreas Mayr, and Sepp Hochreiter. Self-Normalizing Neural Networks. arXiv:1706.02515 [cs, stat], June 2017. URL http://arxiv.org/abs/1706.02515.
La Corte [2014] Diana Thomson La Corte. Newton's Method Backpropagation for Complex-Valued Holomorphic Neural Networks: Algebraic and Analytic Properties. Theses and Dissertations, August 2014. URL https://dc.uwm.edu/etd/565.

La Corte and Zou [2014] Diana Thomson La Corte and Yi Ming Zou. Newton's Method Backpropagation for Complex-Valued Holomorphic Multilayer Perceptrons. 2014 International Joint Conference on Neural Networks (IJCNN), pages 2854–2861, June 2014. doi: 10.1109/IJCNN.2014.6889384. URL http://arxiv.org/abs/1406.5254.
 Maas et al. [2013] Andrew L. Maas, Awni Y. Hannun, and Andrew Y. Ng. Rectifier nonlinearities improve neural network acoustic models. In Proc. icml, volume 30, page 3, 2013.
 Mandic and Goh [2009] Danilo P. Mandic and Vanessa Su Lee Goh. Complex Valued Nonlinear Adaptive Filters: Noncircularity, Widely Linear and Neural Models. John Wiley & Sons, April 2009. ISBN 9780470742631. GoogleBooksID: MaW8MaIkztUC.
Nitta [1997] Tohru Nitta. An Extension of the Back-Propagation Algorithm to Complex Numbers. Neural Networks, 10(8):1391–1415, November 1997. ISSN 08936080. doi: 10.1016/S08936080(97)000361. URL http://www.sciencedirect.com/science/article/pii/S0893608097000361.
Reichert and Serre [2013] David P. Reichert and Thomas Serre. Neuronal Synchrony in Complex-Valued Deep Networks. arXiv:1312.6115 [cs, q-bio, stat], December 2013. URL http://arxiv.org/abs/1312.6115.
 Trabelsi et al. [2017] Chiheb Trabelsi, Olexa Bilaniuk, Ying Zhang, Dmitriy Serdyuk, Sandeep Subramanian, João Felipe Santos, Soroush Mehri, Negar Rostamzadeh, Yoshua Bengio, and Christopher J. Pal. Deep Complex Networks. arXiv:1705.09792 [cs], May 2017. URL http://arxiv.org/abs/1705.09792. arXiv: 1705.09792.
 Tripathi and Kalra [2010] B. K. Tripathi and P. K. Kalra. High Dimensional Neural Networks and Applications. In Intelligent Autonomous Systems, Studies in Computational Intelligence, pages 215–233. Springer, Berlin, Heidelberg, 2010. ISBN 9783642116759 9783642116766. URL https://link.springer.com/chapter/10.1007/9783642116766_10. DOI: 10.1007/9783642116766_10.
 Virtue et al. [2017] Patrick Virtue, Stella X. Yu, and Michael Lustig. Better than Real: Complexvalued Neural Nets for MRI Fingerprinting. arXiv:1707.00070 [cs], June 2017. URL http://arxiv.org/abs/1707.00070. arXiv: 1707.00070.
 Vitagliano et al. [2003] Francesca Vitagliano, Raffaele Parisi, and Aurelio Uncini. Generalized splitting 2d flexible activation function. In Italian Workshop on Neural Nets, pages 85–95. Springer, 2003.