1. Introduction
The deep neural network (DNN) is a popular learning paradigm that can generalize to tasks from disparate domains while achieving state-of-the-art performance. However, these networks are computationally heavyweight with regard to both compute and memory resources. For example, an outrageously large neural network with 32-bit floating point parameters, such as an LSTM with a mixture of experts (Shazeer et al., 2017), requires approximately 137 billion parameters. To manage the training and batch inference of these networks, hardware accelerators are employed, such as Google's Tensor Processing Unit to decrease latency and increase throughput, embedded and/or reconfigurable devices to mitigate power bottlenecks, or targeted ASICs to optimize the overall performance. A predominant factor contributing to the computational cost is the large footprint of primitives, known as multiply-and-accumulate (MAC) operations, which perform weighted summations of the neuronal inputs. Techniques such as sparsity and low-precision representation (Han et al., 2016; Wu et al., 2018; Chung et al., 2018; Colangelo et al., 2018) have been extensively studied to reduce the cost associated with MACs. For example, substituting 8-bit fixed-point for 32-bit fixed-point when performing inference on CIFAR-10 with AlexNet reduces the energy consumption by 6× (Hashemi et al., 2017). These techniques become a necessity when deploying DNNs on end devices, such as AI on the edge or IoT devices.

Of the methods used to mitigate these constraints, low-precision techniques have shown the most promise. For example, linear and nonlinear quantization have been able to match 32-bit floating point performance with 8-bit fixed-point and 8-bit floating point accelerators (Jouppi et al., 2017; Reagen et al., 2016; Chung et al., 2018). However, quantizing to an ultra-low bit-precision, i.e. 8 bits or fewer, can necessitate an increase in computational complexity. For example, a DNN has to be retrained or the number of hyperparameters significantly increased (Mishra and Marr, 2018) to maintain performance. A more lightweight solution is to perform DNN training and inference in a low-precision numerical format (fixed-point, floating point, or posit (Gysel, 2016)) rather than quantizing a trained network (e.g. one trained with 32-bit floating point). Previous studies have compared DNN inference with low-precision (e.g. 8-bit) to high-precision floating point (e.g. 32-bit) (Hashemi et al., 2017). However, these works compare numerical formats with disparate bit-widths and thereby do not provide a fair, comprehensive study of network efficiency.

The recently proposed posit numerical format offers a wider dynamic range, better accuracy, and improved closure over IEEE-754 floating point (Gustafson and Yonemoto, 2017). Fig. 1 gives the intuition that a natural posit distribution (e.g., an 8-bit posit) may be an optimal fit for representing DNN parameters (e.g., those of ConvNet). In this work, we investigate the effectiveness of ultra-low precision posits for DNN inference. The designs of several multiply-and-accumulate units for the posit, fixed-point, and floating point formats at low precision are analyzed for resource utilization, latency, power consumption, and energy-delay product. We carry out various classification tasks and compare the trade-offs between accuracy degradation and hardware efficacy. Our results indicate that posits outperform at ultra-low precision and can be realized at a similar cost to floating point in DNN accelerators.
2. Related Work
Since the late 1980s, low-precision fixed-point and floating point computation have been studied (Iwata et al., 1989; Hammerstrom, 1990). In recent years, research attention has turned towards deep learning applications. Multiple groups have demonstrated that 16-bit fixed-point DNNs can perform inference with trivial degradation in performance
(Courbariaux et al., 2014; Bengio, 2013). However, most of these works study DNN inference at varying bit-precisions. There is a need for a fairer comparison between different number formats of the same bit-width paired with FPGA soft cores. For instance, Hashemi et al. analyze 32-bit fixed-point and 32-bit floating point DNN inference on three DNN architectures (LeNet, ConvNet, and AlexNet) and show that fixed-point reduces the energy consumption by 12% while suffering a mere 0–1% accuracy drop (Hashemi et al., 2017). Recently, Chung et al. proposed a DNN accelerator (Brainwave) that increases inference throughput within a Stratix 10 FPGA by 3× by substituting 8-bit msfp8, a novel spatial floating point format, in place of 8-bit fixed-point (Chung et al., 2018).

Several groups have previously studied the usage of the posit format in DNNs. Langroudi et al. study the efficacy of posit representations of DNN parameters and activations (Langroudi et al., 2018). The work demonstrates that DNN inference using 7-bit posits endures less than 1% accuracy degradation on ImageNet classification using AlexNet, and that posits have a 30% smaller memory footprint than fixed-point for multiple DNNs while maintaining less than a 1% drop in accuracy. Cococcioni et al. review the effectiveness of posits for autonomous driving functions (Cococcioni et al., 2018). A discussion of a posit processing unit as an alternative to a floating point processing unit develops into an argument for posits, as they exhibit a better trade-off between accuracy and implementation complexity. Most recently, Johnson proposed a log float format which couples posits with a logarithmic EMAC operation referred to as exact log-linear multiply-add (ELMA) (Johnson, 2018). Use of this novel format within ResNet-50 incurs about 1% accuracy deterioration for ImageNet classification, and the ELMA shows much lower power consumption than IEEE-754 floating point.

In this work, we demonstrate that posit arithmetic at ultra-low bit-width is an innate fit for DNN inference. The EMAC-equipped, parameterized Deep Positron architecture is mounted on an FPGA soft processor and is used to carefully compare the fixed-point, floating point, and posit formats at the same bit-width.
3. Background
3.1. Deep Neural Networks
The DNN is a connectionist, predictive model used commonly for classification and regression. These networks learn a nonlinear input-to-output mapping in a supervised, unsupervised, or semi-supervised manner. Before it can perform inference, a DNN is trained via backpropagation to minimize a cost function by updating its parameters, called weights and biases. Customarily, either 16-bit or 32-bit floating point arithmetic is used for DNN inference. However, the 32-bit IEEE-754 floating point representation maintains a massive dynamic range of over 80 decades, far beyond what DNNs require. Such a numerical distribution therefore yields low information-per-bit in the sense of Shannon maximum entropy (Shannon, 1948). 16-bit floating point, often present in NVIDIA accelerators, unveils the format's limitations: nontrivial exception cases, underflow and overflow to infinity or zero, and redundant NaN and zero representations. Posit arithmetic offers an elegant solution to these limitations at a generic bit-width.

3.2. Posit Numerical Format
The posit numerical format, a Type III unum, was proposed to improve upon the deficiencies of the IEEE-754 floating point format and to address complaints about Type I and II unums (Gustafson and Yonemoto, 2017; Tichy, 2016). The posit format offers better dynamic range, accuracy, and program reproducibility than IEEE floating point. A posit number comprises n bits, of which es are exponent bits that control the dynamic range. The primary divergence posit takes from floating point is the introduction of a signed, run-length encoded regime bit-field. The longer this field, the larger the magnitude but the lower the precision of the represented number, and vice versa for shorter run-lengths. Two posit bit-strings are reserved: all zeros for zero and a one followed by all zeros for "Not a Real," which can denote infinity, division by zero, etc. A posit bit-string is interpreted as a sign bit, followed by the regime bits, then up to es exponent bits, with any remaining bits forming the fraction.
The numerical value a posit represents is then given by (1):

$$x = (-1)^{s} \times \left(2^{2^{es}}\right)^{k} \times 2^{e} \times (1 + f) \qquad (1)$$

where $k$ is the regime, $e$ is the unsigned exponent ($0 \leq e \leq 2^{es} - 1$), and $f$ is the value of the fraction bits. If a posit number is negative, the 2's complement is taken before decoding. We recommend reviewing (Gustafson and Yonemoto, 2017) for a more thorough introduction and intuition to the posit format.
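As a concrete illustration of (1), the following is a minimal Python sketch (not the Deep Positron hardware) that decodes an n-bit posit bit-string with es exponent bits into its value; the function name and structure are illustrative only.

```python
def decode_posit(bits: int, n: int, es: int) -> float:
    """Decode an n-bit posit with es exponent bits into a Python float (sketch)."""
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return 0.0                            # all zeros encodes zero
    if bits == 1 << (n - 1):
        return float("nan")                   # 100...0 encodes "Not a Real"
    sign = bits >> (n - 1)
    if sign:                                  # negative posits decode from the 2's complement
        bits = (-bits) & mask
    body = bits & ((1 << (n - 1)) - 1)        # drop the sign bit
    # Decode the run-length encoded regime.
    regime_bit = (body >> (n - 2)) & 1
    run = 0
    for i in range(n - 2, -1, -1):
        if (body >> i) & 1 == regime_bit:
            run += 1
        else:
            break
    k = run - 1 if regime_bit else -run
    # Bits remaining after the regime and its terminating bit: exponent, then fraction.
    rem = max(n - 2 - run, 0)
    tail = body & ((1 << rem) - 1) if rem else 0
    e_bits = min(es, rem)
    e = (tail >> (rem - e_bits)) << (es - e_bits) if e_bits else 0
    f_bits = rem - e_bits
    f = (tail & ((1 << f_bits) - 1)) / (1 << f_bits) if f_bits else 0.0
    useed = 2 ** (2 ** es)                    # scale factor base from (1)
    value = (useed ** k) * (2 ** e) * (1.0 + f)
    return -value if sign else value

# Example: the 8-bit, es = 0 posit 0b01000000 decodes to 1.0.
print(decode_posit(0b01000000, n=8, es=0))
```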
4. Methodology
We build off of (Carmichael et al., 2019), using the proposed Deep Positron architecture. The framework is parameterized by bit-width, numerical type, and DNN hyperparameters, so networks of arbitrary width and depth can be constructed for the fixed-point, floating point, and posit formats. The following sections further describe the EMAC operation and detail the EMAC algorithms for each numerical format.
4.1. Exact Multiply-and-Accumulate (EMAC)
The multiply-and-accumulate (MAC) operation is ubiquitous within DNNs – each neuron computes a weighted sum of its inputs. In most implementations, this operation is inexact, meaning rounding or truncation results in an accumulation of error. The EMAC mitigates this issue by implementing a variant of the Kulisch accumulator (Kulisch, 2013) and deferring rounding until every product of a layer has been accumulated. This minimization of local error becomes substantial at low precision. In each EMAC module, a wide register accumulates fixed-point values and rounds in a deferred stage. For $k$ multiplications, the accumulator width is computed using (2):

$$w_{acc} = \left\lceil \log_2(k) \right\rceil + 2\left\lceil \log_2\!\left(\frac{\mathrm{max}}{\mathrm{min}}\right) \right\rceil + 2 \qquad (2)$$

where max and min are the maximum and minimum value magnitudes for a given numerical system, respectively. Each EMAC is pipelined into three stages: multiplication, accumulation, and rounding. A fourth stage, implementing the activation function (e.g., ReLU), is present for hidden layer neurons. For further introduction to EMACs and the exact dot product, we recommend reviewing (Kulisch, 2013; Carmichael et al., 2019).

4.2. Fixed-Point EMAC
We parameterize the fixed-point EMAC by n, the bit-width, and f, the number of fractional bits, where f < n. Fig. 2 shows the block diagram design of the EMAC with signal bit-widths indicated. The functionality of the unit is described by Algorithm 1. The general characteristics of a fixed-point number follow directly from n and f: the smallest representable nonzero magnitude is $2^{-f}$ and the largest is $2^{n-1-f} - 2^{-f}$.
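To make the deferred-rounding idea concrete, the following is a minimal Python sketch of a fixed-point EMAC under this parameterization; the function name, the saturating final stage, and the round-half-up shortcut are illustrative assumptions rather than the exact hardware behavior (which rounds to nearest even).

```python
def fixed_point_emac(weights, activations, n=8, f=4):
    """Sketch of an exact fixed-point MAC: operands are n-bit signed fixed-point
    values with f fractional bits, stored as integers scaled by 2**f. Products
    are accumulated exactly in a wide register; rounding happens only once."""
    acc = 0
    for w, a in zip(weights, activations):
        # Each product is exact in 2n bits (2f fractional bits); the wide
        # accumulator also carries ceil(log2(k)) guard bits for k additions.
        acc += w * a
    # Deferred rounding: drop the extra f fractional bits.
    result = (acc + (1 << (f - 1))) >> f
    # Saturate to the representable n-bit signed range.
    lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    return max(lo, min(hi, result))

# Example with Q3.4 operands (scale 2**4): 0.5*2.0 + 1.25*1.0 = 2.25 -> 36/16.
print(fixed_point_emac([8, 20], [32, 16], n=8, f=4))   # prints 36
```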
4.3. Floating Point EMAC
The floating point EMAC is parameterized by we, the number of exponent bits, and wf, the number of fractional bits. As all inputs and intermediate values in Deep Positron are real-valued, we do not consider "Not a Number" (NaN) or "±Infinity" in this implementation. Fig. 3 shows the floating point EMAC block diagram with labeled bit-widths of signals. A leading-zeros detector (LZD) is used in converting from fixed-point back to floating point. The EMAC functionality is expressed in Algorithm 2, and the relevant characteristics of the floating point format (e.g., the exponent bias and the maximum and minimum value magnitudes) are computed from we and wf.
Algorithm 2: Floating point EMAC operation. The listing proceeds through the stages: subnormal detection, multiplication, conversion to fixed-point, accumulation, and conversion back to floating point.¹

¹ Overflow handling during the conversion back to floating point is omitted for simplicity.
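To illustrate the middle stages of Algorithm 2, here is a minimal Python sketch, under assumed (we, wf) field widths, of how each product can be converted to fixed-point and accumulated exactly; the encoding back to floating point (and its overflow handling) is omitted, and all names are illustrative.

```python
def float_emac_accumulate(operand_pairs, we=4, wf=3):
    """Sketch of the multiply / convert-to-fixed-point / accumulate stages of a
    floating point EMAC. Each operand is a (sign, exponent_field, fraction_field)
    triple with a we-bit exponent and wf-bit fraction; subnormals are handled,
    NaN/Infinity are not (they never arise in Deep Positron)."""
    bias = (1 << (we - 1)) - 1
    acc = 0                                   # wide, Kulisch-style fixed-point register
    for (sa, ea, fa), (sb, eb, fb) in operand_pairs:
        # Subnormal detection: a zero exponent field means no hidden bit and
        # an effective exponent of 1 - bias.
        ma = fa if ea == 0 else (1 << wf) | fa
        mb = fb if eb == 0 else (1 << wf) | fb
        exp = (ea if ea else 1) + (eb if eb else 1) - 2 * bias
        prod = ma * mb                        # exact significand product
        # Conversion to fixed-point: shift by the product exponent; the 2*bias
        # offset keeps the shift non-negative even for the smallest products.
        acc += (-1) ** (sa ^ sb) * (prod << (exp + 2 * bias))
    # acc now equals the exact sum of products times 2**(2*wf + 2*bias);
    # converting back to floating point would use a leading-zeros detector and rounding.
    return acc
```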
4.4. Posit EMAC
The posit EMAC, shown in Fig. 4, is parameterized by n, the bit-width, and es, the number of exponent bits. In this implementation, we do not consider "Not a Real," as all DNN parameters and data are real-valued and posits do not overflow to infinity. Algorithm 3 describes the data extraction process for each EMAC input, which is more involved due to the dynamic-length regime. The EMAC employs this process as outlined in Algorithm 4. The relevant attributes of a given posit format are calculated from n and es: $useed = 2^{2^{es}}$, $\mathrm{max} = useed^{\,n-2}$, and $\mathrm{min} = useed^{\,-(n-2)}$, where useed can be thought of as the scale factor base, as shown in (1).
Algorithm 4: Posit EMAC operation for n-bit inputs, each with es exponent bits. The listing proceeds through the stages: multiplication, accumulation, fraction & scale factor (SF) extraction, and convergent rounding & encoding.
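The following minimal Python sketch mirrors the multiplication and accumulation stages of Algorithm 4 for operands already decoded (per Algorithm 3) into sign, scale factor, and significand; the fraction & SF extraction and convergent rounding stages that re-encode the result as a posit are omitted, and all names are illustrative.

```python
def posit_emac_accumulate(decoded_pairs, n=8, es=0):
    """Sketch of the posit EMAC multiply and accumulate stages. Each operand is a
    (sign, scale_factor, significand) triple where scale_factor = k * 2**es + e
    from (1) and significand is the value 1.f stored as an integer scaled by 2**frac_bits."""
    frac_bits = n - es - 3                    # maximum number of posit fraction bits
    sf_bound = (n - 1) * (1 << es)            # |scale factor| never exceeds this
    acc = 0                                   # wide fixed-point (quire-like) register
    for (sa, sfa, ma), (sb, sfb, mb) in decoded_pairs:
        prod = ma * mb                        # exact significand product
        sf = sfa + sfb                        # scale factors add under multiplication
        # Align the product into the accumulator; the 2*sf_bound offset keeps
        # the shift amount non-negative.
        acc += (-1) ** (sa ^ sb) * (prod << (sf + 2 * sf_bound))
    # acc equals the exact dot product times 2**(2*frac_bits + 2*sf_bound).
    return acc
```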
5. Experimental Results
In all experiments, we synthesize the EMACs onto a Virtex-7 FPGA (xc7vx485t-2ffg1761c) using Vivado 2017.2, and expand upon the results from (Carmichael et al., 2019). With regard to energy and latency, the posit EMAC is competitive with the floating point EMAC. While it uses more resources at the same bit-precision, the posit format offers a wider dynamic range at fewer bits while maintaining a faster maximum operational frequency. Moreover, the energy-delay products of the floating point and posit EMACs are comparable. The fixed-point EMAC, as expected, is uncontested in resource utilization and latency; its lack of an exponential parameter results in a far narrower accumulation register. However, fixed-point offers poor dynamic range compared with the other formats at the same bit-precision.
Table 1. Deep Positron inference accuracy on five datasets; the parenthesized value is the best-performing format parameter found in the sweep.

| Dataset | Inference Size | Posit Acc. | Floating Point Acc. | Fixed-Point Acc. | 32-bit Float Acc. |
|---|---|---|---|---|---|
| WI Breast Cancer (Street et al., 1993) | 190 | 85.9% (2) | 77.4% (4) | 57.8% (5) | 90.1% |
| Iris (Fisher, 1936) | 50 | 98.0% (1) | 96.0% (3) | 92.0% (4) | 98.0% |
| Mushroom (Schlimmer, 1987) | 2,708 | 96.4% (1) | 96.4% (4) | 95.9% (5) | 96.8% |
| MNIST (LeCun, 1998) | 10,000 | 98.5% (1) | 98.4% (4) | 98.3% (5) | 98.5% |
| Fashion MNIST (Xiao et al., 2017) | 10,000 | 89.6% (1) | 89.6% (4) | 89.2% (4) | 89.5% |
The term "dense" is synonymous with a fully-connected feedforward layer in a DNN.
The quantization error of a tensor is computed as the mean-squared error, as shown in (3):

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N} \left(x_i - \hat{x}_i\right)^2 \qquad (3)$$

where $x_i$ are the original 32-bit floating point values, $\hat{x}_i$ their quantized counterparts, and $N$ the number of elements in the tensor.
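As a small illustration (not the authors' evaluation code), the per-layer error in (3) can be computed as below; the 8-bit fixed-point quantizer in the example is an assumption chosen only for demonstration.

```python
import numpy as np

def quantization_mse(x: np.ndarray, quantize) -> float:
    """Mean-squared quantization error of a tensor, per (3); `quantize` maps a
    scalar to its nearest representable value in the target format."""
    x_hat = np.vectorize(quantize)(x.astype(np.float64))
    return float(np.mean((x - x_hat) ** 2))

# Example: an 8-bit fixed-point quantizer with 6 fractional bits (Q1.6).
q8_6 = lambda v: np.clip(np.round(v * 64), -128, 127) / 64
weights = 0.1 * np.random.randn(256, 128).astype(np.float32)
print(quantization_mse(weights, q8_6))
```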
Fig. 5 shows a layer-wise heatmap of the quantization error for each format on the MNIST and Fashion MNIST classification tasks. It is clear that posits suffer the least from quantization, which is especially noticeable at 5-bit precision.
We evaluate the inference accuracy of several feedforward three- or four-layer neural networks, instantiated on the Deep Positron accelerator, on five datasets. The baseline results are taken from networks trained and evaluated using standard IEEE-754 floating point at 32-bit precision. The inputs and weights of the trained networks are quantized from the 32-bit floating point format to the desired numerical format (posit, floating point, or fixed-point at the target bit-width) via round-to-nearest with ties to even. The best performance is selected for each bit-width by sweeping the es parameter for posit, the exponent-bit width for floating point, and the fraction-bit width for fixed-point, as sketched below. Across all tasks, posit either outperforms or matches the performance of fixed-point and floating point, as shown in Table 1. In some cases, an 8-bit posit matches the performance of the 32-bit floating point baseline. An interesting result is that both posit and floating point at 8-bit precision improve upon the baseline performance for the Fashion MNIST task.
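A hedged sketch of this selection procedure: quantize the trained 32-bit weights into each candidate format and keep the parameterization with the highest accuracy. The dictionary-of-quantizers interface and function names are illustrative assumptions, not the authors' tooling.

```python
def select_best_format(weights, candidate_quantizers, evaluate):
    """Sweep candidate low-precision formats (e.g. posit es values, floating point
    exponent/fraction splits, fixed-point Q-points at one bit-width), quantize a
    trained 32-bit float network with each, and keep the most accurate format.
    `candidate_quantizers` maps a format name to an element-wise quantizer;
    `evaluate` runs inference with the quantized weights and returns accuracy."""
    best_name, best_acc = None, float("-inf")
    for name, quantize in candidate_quantizers.items():
        quantized = {layer: quantize(w) for layer, w in weights.items()}
        acc = evaluate(quantized)
        if acc > best_acc:
            best_name, best_acc = name, acc
    return best_name, best_acc
```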
We compare energy, delay, and the energy-delay product against the average Deep Positron performance across all formats and bit-precisions. Figs. 6 and 7 depict the average accuracy degradation across the five classification tasks against these metrics for each bit-width. Posit consistently outperforms, at a slight cost in power. Fixed-point maintains the lowest delay across all bit-widths, as expected, but offers the worst performance. While the floating point EMAC generally uses less power than the posit EMAC, the posit EMAC enjoys lower latencies across all bit-widths whilst maintaining lower accuracy degradation.
Table 2. Comparison with other posit hardware implementations.

| Design | (Jaiswal and So, 2018b) | (Chaurasiya et al., 2018) | (Podobas and Matsuoka, 2018) | (Chen et al., 2018) | (Lehóczky et al., 2018) | (Johnson, 2018) | This Work |
|---|---|---|---|---|---|---|---|
| Device | Virtex-6 FPGA / ASIC | Zynq-7000 SoC / ASIC | Stratix V GX 5SGXA7 FPGA | Virtex-7 VX690 & UltraScale+ VU3P FPGAs | Artix-7 FPGA | ASIC | Virtex-7 (xc7vx485t-2ffg1761c) FPGA |
| Task | — | FIR Filter | — | — | — | Image Classification | Image Classification |
| Dataset | — | — | — | — | — | ImageNet | WI Breast Cancer, Iris, Mushroom, MNIST, Fashion MNIST |
| Bit-precision | All | All | All | 32 | All | All, emphasis on 8 | All, emphasis on ≤8 |
| Operations | Mul, Add/Sub | Mul, Add/Sub | Mul, Add/Sub | Quire | Quire | Quire | Quire |
| Programming Language | Verilog | Verilog | C++ / OpenCL | Verilog | C# | OpenCL | VHDL |
| Technology Node | 40 nm / 90 nm | 28 nm / 90 nm | 28 nm | 28 nm / 20 nm | 28 nm | 28 nm | 28 nm |
5.1. Exploiting the Posit es Parameter
Experimental results in this paper are evaluated for posit formats with several es settings across five data sets. As shown in Fig. 6, the energy-delay product of the posit EMAC depends on the es parameter: on average, the most efficient setting has an energy-delay product roughly 3× and 1.4× lower than the other two settings considered. On the other hand, the setting with the best average DNN inference accuracy across the five datasets and bit-precisions outperforms the other two by roughly 2% and 4%. Thus, Deep Positron equipped with a posit EMAC of intermediate es offers a better trade-off between energy-delay product and accuracy at the lower bit-widths. For 8-bit, the results suggest choosing es by application: the most efficient setting for energy-constrained applications and the most accurate setting for accuracy-critical applications.
5.2. Comparison with Other Posit Hardware Implementations
A summary of previous studies that design posit arithmetic hardware is shown in Table 2. Several groups implement basic posit arithmetic, such as addition, subtraction, multiplication, and the exact dot product (quire), on FPGAs for various applications (Jaiswal and So, 2018b, a; Chaurasiya et al., 2018; Chen et al., 2018; Podobas and Matsuoka, 2018; Lehóczky et al., 2018; Johnson, 2018). Jaiswal and So provide a hardware generator for posit addition, subtraction, and multiplication and show reduced latency and area consumption of 32-bit posit addition over IEEE-754 floating point addition (Jaiswal and So, 2018b, a). However, the comparison is between two different FPGA platforms, which diminishes its merit. They also ignore several characteristic requirements of posit arithmetic, such as round-to-nearest with ties to even (unbiased rounding). To better realize the advantages of posit arithmetic over IEEE-754 floating point with complete posit arithmetic features, Chaurasiya et al. proposed a parameterized posit arithmetic hardware generator (Chaurasiya et al., 2018). They emphasize that the resource utilization and energy of the posit arithmetic unit are comparable with IEEE-754 float when the same number of bits is considered for both formats, while the area consumption of the posit hardware is less than IEEE-754 float at similar precision and dynamic range. To simplify and expedite hardware design, as well as improve the usability of posits on heterogeneous platforms, researchers in (Lehóczky et al., 2018) and (Podobas and Matsuoka, 2018) use high-level languages, such as C# and OpenCL, to generate posit arithmetic hardware for FPGAs.
Most of the previous works do not support the exact-dot-product operation and do not design specialized posit arithmetic for deep learning applications as we present in this paper. In (Carmichael et al., 2019), a parameterized FPGA-mounted DNN accelerator is constructed which employs exact-dot-product algorithms for the posit, fixed-point, and floating point formats. The paper shows strong preliminary results that posits are a natural fit for low-precision inference. Subsequent to this work, Johnson proposed an exact log-linear multiply-add arithmetic algorithm for deep learning applications using a posit multiplier in the log domain and a Kulisch adder (Johnson, 2018). The results indicate better performance of 8-bit posit multiply-add over 8-bit fixed-point multiply-add, with similar accuracy, for the ResNet-50 neural network on the ImageNet dataset. However, that paper targets an ASIC platform and a convolutional neural network at 8-bit precision, whereas we study an FPGA implementation and fully-connected neural networks at multiple ultra-low bit-precisions.

6. Conclusions
We demonstrate that the recent posit numerical system has a high affinity for deep neural network inference at 8-bit precision. The proposed posit hardware is shown to be competitive with its floating point counterpart in terms of resource utilization and energy-delay product. Moreover, the posit EMAC offers a superior maximum operating frequency over that of floating point. With regard to performance degradation, direct quantization to ultra-low precision heavily favors posits, which vastly surpass fixed-point. Moreover, the performance of floating point is consistently matched or surpassed by posits across multiple datasets. The success of prospective new classes of learning algorithms will be contingent in part on the underlying hardware.
References
 Bengio (2013) Bengio, Y. 2013. Deep Learning of Representations: Looking Forward. In Statistical Language and Speech Processing, Adrian-Horia Dediu, Carlos Martín-Vide, Ruslan Mitkov, and Bianca Truthe (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 1–37.
 Carmichael et al. (2019) Carmichael, Z. et al. 2019. Deep Positron: A Deep Neural Network Using the Posit Number System. In Design, Automation & Test in Europe (DATE) Conference & Exhibition. IEEE.
 Chaurasiya et al. (2018) Chaurasiya, R. et al. 2018. Parameterized Posit Arithmetic Hardware Generator. In 2018 IEEE 36th International Conference on Computer Design (ICCD). IEEE, 334–341.
 Chen et al. (2018) Chen, J. et al. 2018. A matrix-multiply unit for posits in reconfigurable logic leveraging (open)CAPI. In Proceedings of the Conference for Next Generation Arithmetic. ACM, 1.
 Chung et al. (2018) Chung, E. et al. 2018. Serving DNNs in Real Time at Datacenter Scale with Project Brainwave. IEEE Micro 38, 2 (2018), 8–20.
 Cococcioni et al. (2018) Cococcioni, M. et al. 2018. Exploiting Posit Arithmetic for Deep Neural Networks in Autonomous Driving Applications. In 2018 International Conference of Electrical and Electronic Technologies for Automotive. 1–6. https://doi.org/10.23919/EETA.2018.8493233
 Colangelo et al. (2018) Colangelo, P. et al. 2018. Exploration of Low Numeric Precision Deep Learning Inference Using Intel® FPGAs. 2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) (2018), 73–80.
 Courbariaux et al. (2014) Courbariaux, M. et al. 2014. Low precision arithmetic for deep learning. CoRR abs/1412.7024 (2014).
 Fisher (1936) Fisher, R.A. 1936. The use of multiple measurements in taxonomic problems. Annals of eugenics 7, 2 (1936), 179–188.
 Gustafson and Yonemoto (2017) Gustafson, J.L. and Yonemoto, I.T. 2017. Beating Floating Point at its Own Game: Posit Arithmetic. Supercomputing Frontiers and Innovations 4, 2 (2017), 71–86.
 Gysel (2016) Gysel, P. 2016. Ristretto: Hardware-Oriented Approximation of Convolutional Neural Networks. CoRR abs/1605.06402 (2016).
 Hammerstrom (1990) Hammerstrom, D. 1990. A VLSI architecture for high-performance, low-cost, on-chip learning. In 1990 IJCNN International Joint Conference on Neural Networks. 537–544 vol. 2. https://doi.org/10.1109/IJCNN.1990.137621
 Han et al. (2016) Han, S. et al. 2016. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding. ICLR (2016), 1–13.
 Hashemi et al. (2017) Hashemi, S. et al. 2017. Understanding the impact of precision quantization on the accuracy and energy of neural networks. In Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017. 1474–1479. https://doi.org/10.23919/DATE.2017.7927224
 Iwata et al. (1989) Iwata, A. et al. 1989. An artificial neural network accelerator using general purpose 24 bits floating point digital signal processors. In IJCNN, Vol. 2. 171–182.
 Jaiswal and So (2018a) Jaiswal, M.K. and So, H.K.H. 2018a. Architecture generator for type-3 unum posit adder/subtractor. In 2018 IEEE International Symposium on Circuits and Systems (ISCAS). IEEE, 1–5.
 Jaiswal and So (2018b) Jaiswal, M.K. and So, H.K.H. 2018b. Universal number posit arithmetic generator on FPGA. In 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 1159–1162.
 Johnson (2018) Johnson, J. 2018. Rethinking floating point for deep learning. arXiv preprint arXiv:1811.01721 (2018).
 Jouppi et al. (2017) Jouppi, N.P. et al. 2017. In-Datacenter Performance Analysis of a Tensor Processing Unit. (2017), 1–17.
 Kulisch (2013) Kulisch, U. 2013. Computer arithmetic and validity: theory, implementation, and applications. Vol. 33. Walter de Gruyter.
 Langroudi et al. (2018) Langroudi, S.H.F. et al. 2018. Deep Learning Inference on Embedded Devices: Fixed-Point vs Posit. In 2018 1st Workshop on Energy Efficient Machine Learning and Cognitive Computing for Embedded Applications (EMC2). 19–23. https://doi.org/10.1109/EMC2.2018.00012
 LeCun (1998) LeCun, Y. 1998. The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/ (1998).
 Lehóczky et al. (2018) Lehóczky, Z. et al. 2018. High-level .NET software implementations of unum type I and posit with simultaneous FPGA implementation using Hastlayer. In Proceedings of the Conference for Next Generation Arithmetic. ACM, 4.
 Mishra and Marr (2018) Mishra, A. and Marr, D. 2018. WRPN & Apprentice: Methods for Training and Inference using Low-Precision Numerics. arXiv preprint arXiv:1803.00227 (2018).
 Podobas and Matsuoka (2018) Podobas, A. and Matsuoka, S. 2018. Hardware Implementation of POSITs and Their Application in FPGAs. In 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). IEEE, 138–145.
 Reagen et al. (2016) Reagen, B. et al. 2016. Minerva: Enabling low-power, highly-accurate deep neural network accelerators. In Proceedings of the 43rd International Symposium on Computer Architecture. IEEE Press, 267–278.
 Schlimmer (1987) Schlimmer, J.C. 1987. Concept acquisition through representational adjustment. (1987).
 Shannon (1948) Shannon, C.E. 1948. A mathematical theory of communication. Bell Syst. Tech. J. 27 (1948), 623–656.
 Shazeer et al. (2017) Shazeer, N. et al. 2017. Outrageously large neural networks: The sparsely-gated mixture-of-experts layer. arXiv preprint arXiv:1701.06538 (2017).
 Street et al. (1993) Street, W.N. et al. 1993. Nuclear feature extraction for breast tumor diagnosis. In Biomedical Image Processing and Biomedical Visualization, Vol. 1905. International Society for Optics and Photonics, 861–871.
 Tichy (2016) Tichy, W. 2016. Unums 2.0: An Interview with John L. Gustafson. Ubiquity 2016, September (2016), 1.
 Wu et al. (2018) Wu, S. et al. 2018. Training and inference with integers in deep neural networks. arXiv preprint arXiv:1802.04680 (2018).
 Xiao et al. (2017) Xiao, H. et al. 2017. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747 (2017).