1. Introduction
High-precision arithmetic is necessary for several scientific and engineering applications, such as Monte Carlo simulations, Kalman filters, and circuit simulations (Kapre and DeHon, 2012; Liao et al., 2019). These applications require high-precision arithmetic hardware for efficient execution (Chow et al., 2012). Floating-point arithmetic hardware has been challenging to design due to the intricacies involved in staying within the limits of the desired area and power budget (Leeser et al., 2014; Volkova et al., 2019). Specifically, designing a portable floating-point unit (FPU) has been a complex task due to the precise requirements posed by the IEEE 754-2008 (floating-point) standard. Recently, researchers and computer architects have either compromised on compliance with the standard or devised their own formats to overcome the design challenges (Gustafson and Yonemoto, 2017; Burgess et al., 2019). Posit is one such data representation, proposed by John L. Gustafson in 2017, which aims to overcome shortcomings of the floating-point format (Gustafson and Yonemoto, 2017). Posit arithmetic and the posit data representation have several clear advantages over floating-point arithmetic and its format: simpler hardware, smaller area and energy footprints, and higher dynamic range and numerical accuracy. In general, an N-bit posit has a better dynamic range than an N-bit floating-point number (Guntoro et al., 2020). In the past, researchers have shown that n-bit floating-point arithmetic units can be replaced by m-bit posit arithmetic units with m < n. It has also been shown empirically that the replacement does not cause loss of accuracy, yet improves the area and energy footprints (Chaurasiya et al., 2018). The posit representation is a superset of the floating-point format and can serve as a drop-in replacement for floating-point arithmetic.
Due to the advantages of the posit number system, several academic and industrial research labs have started exploring and studying applications that can benefit from posits. The SoftPosit library supports early-stage investigation of posits for different applications in software (Leong, 2018). However, no such framework exists for hardware exploration. There is a dire need for an easily reconfigurable hardware platform for early-stage design-space exploration of posit arithmetic for various applications. With its ever-increasing popularity and a conducive open-source ecosystem, we believe that RISC-V (1) is an excellent vehicle for a quintessential framework supporting posit arithmetic empiricism. We chose the BSV high-level HDL (Bluespec Inc., 2020a) as the implementation language to enable rapid design-space exploration through easy reconfiguration of the hardware platform. The position of the proposed framework, called Clarinet, in the platform design cycle is delineated in Fig. 1 along with the posit arithmetic core, Melodica. The major contributions of this paper are:

- We present Clarinet, a floating-point-arithmetic-enabled, CPU-based framework for posit arithmetic empiricism. Clarinet is based on the RISC-V ISA (with custom instructions for posit arithmetic), and is derived from the open-source Flute core developed by Bluespec Inc. (Bluespec Inc., 2020b). The Clarinet framework also features a customized RISC-V gcc toolchain to support the new instructions.
- We present Melodica, a reconfigurable posit arithmetic core that supports fused multiply-accumulate (FMA) with quire functionality, and type converters between the floating-point, posit and quire data representations.
- Through Clarinet, we also present a new usage model where posits and floating-point can coexist cleanly as independent types, allowing applications to be ported more easily to posits when they offer an advantage.
- Finally, we investigate applications in the domains of linear algebra and computer vision to show the effectiveness of Clarinet as an experimental platform. For five different applications, we demonstrate that Clarinet supports trade-off analyses between performance, power, area, and accuracy. We also outline the ease-of-use aspect of Clarinet.
Why an add-on?
We prefer Melodica as an add-on feature in Flute rather than a replacement for the floating-point arithmetic hardware. With Clarinet, we are trying to enable researchers to study the advantages and disadvantages of posit arithmetic. Posit arithmetic empiricism, that is, reasoning based on empirical data about posit arithmetic, is needed to quantify the benefits. Furthermore, we see an opportunity for floats and posits to coexist on a single platform to trade off among power, performance, area, and accuracy.
As of now, we support a limited number of operations and the quire to carry out experimental studies for our applications. We plan to extend Melodica with more functionality; the Melodica core is extensible to support operations demanded by the applications.
To the best of our knowledge, this is the first-ever quire-enabled RISC-V CPU. The organization of the paper is as follows. In Section 2, we discuss the posit, quire and float formats, the Flute core, and some of the recent implementations of posit arithmetic. Clarinet is described in Section 3, and Melodica in Section 4. Application analyses and benchmarking are presented in Section 5. The experimental setup and results are discussed in Section 6. We conclude our work in Section 7.
2. Background and Related Work
2.1. Background
2.1.1. Posits
A posit number is defined by two parameters: the width of the posit number, N, and the maximum width of the exponent field, es. An important advantage of the posit number format is that we can vary es to trade off between greater dynamic range (larger es) and greater precision (smaller es).
The posit format has four fields: a sign bit indicating positive or negative numbers, a regime and exponent field that together represent the scale, and finally, a fraction.

- Sign (s): The MSB of the number. If this bit is set, the posit value is negative, and all remaining fields are represented in two's complement notation.
- Regime field (r): The regime is used to compute the scale factor, k. In a posit number, this field starts just after the sign bit and is terminated by a bit opposite to its leading bits. The computation of k is as per equation 1, where r is the number of identical leading bits in the regime.
- Exponent field (exp): The exponent begins after the regime field; the maximum width of the exponent field is es.
- Fraction field (f): The remaining bits after the exponent make up the fraction. The fraction field is preceded by an implied hidden bit which is always 1.
For a number represented in the posit format, its value is as per equation 2.

(1)   k = -r, if the leading regime bits are 0;   k = r - 1, if the leading regime bits are 1 (r is the regime run length)

(2)   x = (-1)^s × (2^(2^es))^k × 2^exp × (1 + f / 2^fw), where fw is the number of fraction bits
Bit pattern   Value
0000_0000     0
1000_0000     NaR (±∞)
All others    as per equation 2
Posits do not have a representation for NaNs, or separate representations for +∞ and −∞. Posits recognize only two special cases – zero and not-a-real (NaR) – and support one rounding mode, Round-to-Nearest-Even (RNE). The posit number system shows better accuracy around 1 than floating-point of the same size (de Dinechin et al., 2019). Table 1 summarizes the different bit representations with posits, using 8-bit posits as an example. The posit and floating-point formats are depicted in Fig. 2.
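As a worked illustration of equations 1 and 2 (the parameters here are chosen only for this example), consider the 8-bit pattern 0110_1000 with es = 1:

sign: s = 0 (positive); regime: 110, a run of two 1s terminated by a 0, so k = 2 - 1 = 1; exponent: the next es = 1 bit is 1, so exp = 1; fraction: the remaining bits are 000, so f = 0.

Per equation 2, the value is (2^(2^1))^1 × 2^1 × (1 + 0) = 4 × 2 = 8.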
2.1.2. Quire
The quire is a fixed-point register that serves the purpose of accumulation, like a Kulisch accumulator (Kulisch, 2002). The quire for a given posit width is sized to represent both the smallest posit squared and the largest posit squared without any overflow. When the quire is used as an accumulator over a series of steps, it allows computation without intermediate rounding. The size of the quire grows quadratically with the posit width N (Fig. 2); for example, a 512-bit quire is used with 32-bit posits.
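The benefit of deferring rounding can be seen with a small software model. Here the quire is modeled as an exact rational accumulator (`fractions.Fraction`); this is an illustrative sketch of the accumulation semantics, not the SoftPosit or Melodica implementation:

```python
from fractions import Fraction

def quire_dot(xs, ys):
    """Dot product with quire-style accumulation: every product is added
    exactly, and rounding happens only when the final value is read out."""
    q = Fraction(0)                      # stands in for the wide fixed-point quire
    for x, y in zip(xs, ys):
        q += Fraction(x) * Fraction(y)   # exact fused multiply-accumulate
    return float(q)                      # one rounding, at quire read-out time

# Float accumulation rounds after every step and can lose small addends:
xs, ys = [1e16, 1.0, -1e16], [1.0, 1.0, 1.0]
print(quire_dot(xs, ys))                         # 1.0 (exact)
print(sum(x * y for x, y in zip(xs, ys)))        # 0.0 (the 1.0 is absorbed by 1e16)
```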
Numerical Examples:
Numerical examples of a π calculation and a dot product are shown in Fig. 3. For the π calculation, the 32-bit quire (q32) converges better than the 32-bit posit (p32) or the 32-bit floating-point (f32). In all our experiments, we use 64-bit floating-point as the reference. At iteration 11 there is a dramatic increase in normalized error (relative to 64-bit floating-point) for p32 and f32, but only a marginal increase in error for q32.
For the dot product, we use randomly generated vectors over two input ranges, chosen such that the numbers are representable in all the formats. q32 outperforms p32, f32 and q24. For dot products of 10000-element vectors, the absolute error observed is 9.9534475E-08 in f32, 1.0127508E-08 in p32, 8.14790212E-07 in q24, and 2.676927E-09 in q32. The loss of precision in q24 is close to only one digit, while the reduction in total bit width is 8 bits compared to f32. Elaborated experiments on the dot product are presented in Section 5.2. These preliminary experiments on the π calculation and the dot product outline the superiority of using the quire over 32-bit floating-point arithmetic.

2.1.3. Flute – A RISC-V CPU
Flute is an in-order, open-source CPU based on the RISC-V ISA, implemented using the BSV HL-HDL. The Flute pipeline is nominally 5 stages, but longer for instructions like memory loads/stores, integer multiplies, or floating-point operations. The core is parameterized, can be configured for 32-bit or 64-bit operation, and supports up to the RV64GC variant of the RISC-V ISA (1). The Flute core also supports a memory management unit (MMU) and is capable of booting the Linux operating system. The pipeline stages in Flute are:
- F: Issues fetch requests to the instruction memory. The fetch stage can also handle compressed instructions.
- D: Decodes the fetched instruction and checks for illegal instructions.
- E1: The first execution stage. Reads the register files or accepts forwarded values from earlier instructions. Executes all single-cycle opcodes meant for the integer ALU. Branches are resolved here, and speculative instructions are discarded.
- E2: Executes multi-cycle operations, including floating-point operations. Multi-cycle operations are dispatched to their individual pipelines from this stage. If the instruction was executed in E1, this stage is just a pass-through.
- WB: Collects responses from the various multi-cycle pipelines, handles exceptions and asynchronous events like interrupts, and commits the instruction.
2.2. Related work
Since the inception of the posit data representation and arithmetic, there have been several implementations of posit arithmetic in the literature. Early, open-source hardware implementations of a posit adder and multiplier were presented in (Jaiswal and So, 2018). In one work, the authors covered the design of a parametric adder/subtractor, while in the other they presented parametric designs of float-to-posit and posit-to-float converters and a multiplier, along with the design of an adder/subtractor. PACoGen, an open-source framework that can generate a pipelined adder/subtractor, multiplier, and divider, is presented in (Jaiswal and So, 2019). PACoGen is capable of generating hardware units that can adapt precision at runtime. A more reliable implementation of a parametric posit adder and multiplier generator is presented in (Chaurasiya et al., 2018). A major drawback of that generator is that it is a non-pipelined design, resulting in low operating frequencies for large bit-width adders and multipliers.
Cheetah, presented in (Langroudi et al., 2019), discusses the training of deep neural networks (DNNs) using posits. We believe that the architecture presented in (Langroudi et al., 2019) is promising, and some of its features can be incorporated into Melodica in the future. Apart from the mentioned efforts, there have been several other implementations of posit hardware units (Zhang et al., 2019; Lu et al., 2019). More recently, (Tiwari et al., 2019) integrated a posit numeric unit as a functional unit with the Shakti C-Class RISC-V processor. That implementation does not support the quire and reuses the floating-point infrastructure (including the register file) to implement posit arithmetic, which limits the system to using 32-bit or 64-bit posits.
Unfortunately, none of the previous efforts are directed toward the consolidation of posit research. Further, they do not include an easy-to-use software framework which allows floating-point and posit types to cohabit cleanly in an application. We see here a need and an opportunity to consolidate research in the domain of computer arithmetic by providing an open-source testbed, Clarinet. We also address the need for a software framework by introducing a programming model that allows floating-point and posit types to coexist as independent types in an application.
3. Clarinet
The system comprises two main components – Melodica, a parameterizable posit numeric unit that implements the quire, described in Section 4; and Clarinet, a RISC-V CPU that is enhanced with special instructions for posit arithmetic and a dedicated posit register file (PRF).
3.1. Clarinet organization
Clarinet's organization is illustrated in Fig. 4(a). The starting point for Clarinet was a Flute CPU core, configured with the RV32IMAFC variant of the RISC-V ISA (1).
Clarinet integrates Melodica as a functional execution unit parallel to the existing floating-point unit. A new module hierarchy, Fpipe, encapsulates both the existing floating-point core and the new Melodica core. A thin layer of logic in Fpipe directs the five new instructions to Melodica, while all other floating-point instructions continue to be serviced by the FPU. Fpipe also routes responses from Melodica back to the Clarinet pipeline. Except for instructions that update the quire, all instructions result in outputs from Melodica destined for the FPR, PRF or CSR register file.
3.2. Custom Instructions
In order to use the integrated Melodica execution unit, we added five new instructions to the existing instruction set implemented in Flute. As shown in their bit representations in Fig. 4(b), all the instructions belong to the R-format type of the RISC-V ISA. All five instructions use the FP-OP value defined in (1) for their seven-bit opcodes. In order to handle posit types, a new binary encoding, 10, was introduced for the fmt field; in R-format instructions, these bits occupy the LSBs of the funct7 instruction field. New Rs2 binary encodings were also introduced for the posit (10000) and quire (10001) source types.
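The R-format fields can be assembled with a few shifts. The sketch below packs an R-format word and checks itself against instruction encodings taken from the worked example in Section 5.1 (Table 2); the funct7 values are read off those encodings, not quoted from a specification:

```python
def r_format(funct7, rs2, rs1, funct3, rd, opcode=0b1010011):
    """Pack RISC-V R-format fields into a 32-bit instruction word.
    The opcode defaults to the FP-OP major opcode used by all five instructions."""
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode

# Encodings from the example in Section 5.1 (Table 2):
assert r_format(0b0100010, 0b00000, 0, 0, 0) == 0x44000053  # fcvt.p.s p0, ft0
assert r_format(0b0011010, 0b00001, 0, 0, 0) == 0x34100053  # fma.p   p0, p1
assert r_format(0b0100000, 0b10000, 2, 0, 2) == 0x41010153  # fcvt.s.p ft2, p2
```

Note how, in fcvt.p.s, funct7 ends in the new fmt bits 10 (posit destination), while in fcvt.s.p the rs2 field carries the posit source-type encoding 10000.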

- FMA.P: Multiplies the two posit operands in the PRF at Rs1 and Rs2, and accumulates the result into the quire. Does not update FCSR.FFLAGS.
- FCVT.S.P: Converts the posit value in the PRF at Rs1 to a floating-point value, which is written to the FPR at Rd. This instruction may update FCSR.FFLAGS.
- FCVT.P.S: Converts the floating-point value in the FPR at Rs1 to a posit value, which is written to the PRF at Rd. This instruction may update FCSR.FFLAGS.
- FCVT.R.P: Converts the posit value in the PRF at Rs1 to a quire value, which is written to the quire. Does not update FCSR.FFLAGS.
- FCVT.P.R: Converts the value in the quire to a posit value, which is written to the PRF at Rd. This instruction may update FCSR.FFLAGS.
The decision to add new instructions, instead of reusing existing opcodes belonging to the F subset of the RISC-V ISA, was driven by two requirements – integrating quire functionality (which does not exist in floating-point), and providing type-converter instructions that allow posits and floating-point to coexist in an application as independent types.
The new type-converter instructions allow existing programs to be run on Clarinet without the need to modify their original data segments, as demonstrated in Section 5.1. From our experiments, illustrated in Fig. 3, we realized that applications can see significant reductions in normalized error through the introduction of quire-based accumulation, even when most of the computation remains in floating-point. When an application can benefit from the use of posits (be it greater dynamic range or accuracy), the type-converter instructions allow the user to convert part of the computation to posits and accumulate into the quire register. To do so, the user first converts the intermediate floating-point data to posits using the type-converter instructions, then executes the FMA.P instruction, which accumulates into the quire. Eventually, the results are converted back to the floating-point format before being written out to memory.
3.3. Integrating the quire
As indicated in Fig. 2, the recommended size of the quire grows very rapidly with increasing posit width. This implies that treating the quire register like an entry in one of the register files would be quite expensive in hardware resources. For instance, using 32-bit posits would mean making a 512-bit quire value available on the forwarding paths and from the register files. Further, providing a path from the quire to memory (via modified load and store instructions) would require extensive modifications to the memory pipeline.
Clarinet takes a novel approach to integrating the quire. The quire can be updated directly using the new instructions (FCVT.R.P and FMA.P). However, in order to save hardware resources, there are no instructions to read the quire directly, or to move the quire to and from memory. To read the quire's value, it must first be converted to a posit using the FCVT.P.R instruction, which brings the converted value into the PRF. These decisions allow us to contain the cost of integrating the quire to just the actual storage for the quire register.
3.4. The posit register file
A key advantage of posits (especially with the quire) is that it may be profitable to implement non-standard widths for posit numbers while still retaining most of the precision advantages of operating with posits and the quire. To this end, we introduced a new PRF into Clarinet – one that is sized to the width of the posit variables being handled by the Melodica core. While it would have been possible to reuse the floating-point register file for posit operations, this would not have permitted the flexibility of benefiting from narrower posit widths for applications that allow lower bit widths. The registers in the PRF may only be accessed by instructions which directly take posits as inputs or produce a posit output. A new register file implies the creation of a new bypass path to forward in-flight posit operands from the output of the Fpipe to the input of E1. This new path is marked as p-bypass in Fig. 4(a) and handles only posit values.
4. Melodica
Melodica is a posit arithmetic unit implemented using the BSV HL-HDL. Melodica accepts three high-level parameters: the posit width (N), the maximum width of the exponent field (es) and the float width (FW). Melodica supports any float input size, but for Clarinet the float width is set to 32. For an N-bit Melodica architecture, a quire of width qw is integrated with the operation pipelines as a special-purpose register; depending on N, the quire may not be sized to a multiple of a byte. Melodica delivers accumulator functionality using posit fused multiply-accumulate (FMA) into the quire, and is meant to be used alongside a single-precision floating-point implementation for all other compute operations. In addition to the FMA computation, Melodica implements a complete set of type converters between the floating-point and posit formats, and between the quire and posit formats.
Melodica's organization is illustrated in Fig. 5. There are three computational steps involved in Melodica's operation: i) extract: interpret the posit operands to extract the sign, regime, exponent and fraction fields and the infinity/zero flag, ii) operate: perform the appropriate mathematical operation using one or more of the extracted posit or float operands, and iii) normalize: convert the output posit fields back into an N-bit posit word.
4.1. Extract
The extractor unpacks a posit operand into sign, scale and fraction bit fields, essentially converting from a format with variable-width fields to one with fixed-width fields. This conversion is essential for the subsequent pipelines to compute efficiently on posit fields. The scaling factor, scale, is determined from the r and exp fields as given by equation 3, where the maximum posit scale width is psw and the maximum posit fraction width is pfw.

(3)   scale = 2^es × k + exp
Extraction operates on the N-bit input posit word and generates four outputs, as illustrated in Fig. 6(a). In Fig. 6 and Fig. 7, detection is shown using the det block. The steps involved are: i) check for special cases like 0 and NaR to determine the zero-infinity flag (zif); if the sign of the posit number is negative, take the two's complement of the remaining N-1 bits, ii) compute k using equation 1 (the r field in general ends with a flipped bit, but when the number of exponent and fraction bits is zero there may not be a flipped bit), iii) determine the value of the exponent. The exp field may have up to es bits; we multiplex between the case where its field size is exactly es and the case where its size is variable (less than es), in which the exp field continues until the end of the posit, iv) calculate scale using equation 3, and form the f field by extracting the remaining bits (if any) after the exp.
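The extract steps can be sketched in software. The following is an illustrative reference model (not the Melodica RTL) that decodes an N-bit posit word into its fields and value per equations 1-3; the handling of cut-off exponent fields follows step iii above:

```python
def posit_extract(bits, n=8, es=0):
    """Decode an n-bit posit word into (sign, zif, scale, value)."""
    mask = (1 << n) - 1
    if bits & mask == 0:
        return +1, True, 0, 0.0                      # zero
    if bits & mask == 1 << (n - 1):
        return +1, True, 0, float("nan")             # NaR
    sign = -1 if bits >> (n - 1) & 1 else +1
    if sign < 0:                                     # fields are 2's-complemented
        bits = -bits & mask
    body = format(bits & mask, f"0{n}b")[1:]         # the n-1 bits after the sign
    r = len(body) - len(body.lstrip(body[0]))        # regime run length
    k = r - 1 if body[0] == "1" else -r              # equation 1
    rest = body[r + 1:]                              # skip the terminating bit
    exp = int((rest[:es] + "0" * es)[:es] or "0", 2) # cut-off exp bits read as 0s
    frac = rest[es:]
    f = int(frac, 2) / (1 << len(frac)) if frac else 0.0
    scale = (k << es) + exp                          # equation 3
    return sign, False, scale, sign * 2.0 ** scale * (1.0 + f)   # equation 2

# 8-bit examples consistent with Table 1 and equations 1-2 (es = 0 unless given):
assert posit_extract(0b01000000)[3] == 1.0
assert posit_extract(0b00100000)[3] == 0.5
assert posit_extract(0b11000000)[3] == -1.0
assert posit_extract(0b01101000, 8, 1)[3] == 8.0
```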
4.2. Normalize
The normalization, illustrated in Fig. 6(b), is the reverse of extraction. It constructs a posit value from the constituent fields available after computation on the operands. There may be a loss of accuracy due to rounding of the fraction bits. The four steps involved in normalization are: i) compute k and the exp bits from the scale value based on equation 3, ii) construct and concatenate the r and exp fields, where the regime bits are calculated from the run length of 0s and 1s, iii) shift the f field past the r and exp fields; the concatenated value is rounded to nearest-even based on the f bits truncated in the previous stage and the truncation flag (tf), iv) check for special cases (zero and NaR). If the sign bit is set, the final value is the two's complement of the remaining N-1 bits.
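Normalization with RNE rounding can be specified compactly (if inefficiently) as "pick the nearest representable posit; on a tie, pick the encoding with an even last bit". The brute-force reference model below does exactly that over all N-bit patterns; it is a semantic sketch for small N, not how the hardware computes the result:

```python
def posit_value(bits, n=8, es=0):
    """Value of an n-bit posit word (None for NaR)."""
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return 0.0
    if bits == 1 << (n - 1):
        return None                                  # NaR
    sign = -1 if bits >> (n - 1) else 1
    if sign < 0:
        bits = -bits & mask
    body = format(bits, f"0{n}b")[1:]
    r = len(body) - len(body.lstrip(body[0]))
    k = r - 1 if body[0] == "1" else -r              # equation 1
    rest = body[r + 1:]
    exp = int((rest[:es] + "0" * es)[:es] or "0", 2)
    frac = rest[es:]
    f = int(frac, 2) / (1 << len(frac)) if frac else 0.0
    return sign * 2.0 ** ((k << es) + exp) * (1.0 + f)

def posit_normalize(x, n=8, es=0):
    """Nearest n-bit posit word to x, ties broken toward an even last bit."""
    return min((p for p in range(1 << n) if posit_value(p, n, es) is not None),
               key=lambda p: (abs(posit_value(p, n, es) - x), p & 1))

assert posit_normalize(1.0) == 0b01000000
assert posit_normalize(2.0) == 0b01100000
# 1.015625 lies midway between the posits 1.0 and 1.03125; RNE picks 1.0:
assert posit_normalize(1.015625) == 0b01000000
```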
4.3. Operate
This stage in Melodica performs computations on the input operands. The particular operation performed is based on the opcode dispatched to Melodica. The five operations that are supported by Melodica are divided into two categories – type converters and compute.
Compute Operation – Fused Multiply-Accumulate (FMA)
The FMA, illustrated in Fig. 7(a), computes the product of two input posit numbers and adds the result to the quire. Using the quire as an accumulator preserves the overflow and underflow bits without the need to round intermediate results.
The FMA is performed as follows: i) the hidden bit is prepended to the input f fields depending on the posit value, and corner cases (NaR and 0) are checked, ii) the f and sign fields are multiplied using integer multipliers, and the scales are added to create the scale of the output, iii) the product fraction is shifted using the new scale value to align with the quire's integer and fraction fields; if the product is negative, the appropriate two's complement is taken, iv) the quire is added to the product of the operands using signed addition, and if there is overflow or underflow, the sum is rounded to nearest-even.
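Steps ii-iv amount to integer arithmetic once the product is aligned to the quire's binary point. The sketch below models the quire as a Python integer with F fraction bits; F = 240 is only a stand-in wide enough for the examples, not the exact Melodica quire layout:

```python
from fractions import Fraction

F = 240   # assumed number of fraction bits right of the quire's binary point

def fma_into_quire(q, a, b):
    """q: integer quire scaled by 2**F; a, b: exact operand values.
    Returns q + a*b with the product aligned to the quire's binary point."""
    prod = Fraction(a) * Fraction(b) * (1 << F)   # step iii: align to the quire
    assert prod.denominator == 1, "product must fit the quire's fraction field"
    return q + prod.numerator                     # step iv: signed integer add

q = 0                                 # quire initialized to zero
q = fma_into_quire(q, 2.5, 4.0)       # accumulate 10.0
q = fma_into_quire(q, 2.5, 4.0)       # accumulate another 10.0
print(q == 20 << F)                   # True: the quire holds exactly 20.0
```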
Type-converter – Float-to-Posit (FtoP)
The FtoP block converts a float input to the posit format, as depicted in Fig. 7(b). The conversion may result in a loss of precision when using narrower posit types. For a number represented in the float format, its value is as per equation 4, where fsw is the exponent width and ffw is the fraction width of the float.
The fixed-length f field is interpreted directly from the input operand and mapped to the corresponding field in the posit format using equations 2 and 4. The scale field, after subtracting the bias, is bounded between (-p, p), where p = 2^es × (N - 2) is the magnitude of the largest posit scale. A truncation flag (tf) is asserted by the block depending on the conversion between different-size float and posit values; these flags are retained to perform rounding in later stages. The output of the FtoP needs normalization before write-back to the PRF.
(4)   x = (-1)^s × 2^(exp - bias) × (1 + f / 2^ffw), where bias = 2^(fsw-1) - 1
Type-converter – Posit-to-Float (PtoF)
The PtoF block converts a posit input to the float format, as illustrated in Fig. 7(c). The PtoF block receives its input after the extract stage, and its output is sent directly to the FPR in Clarinet. Depending on the configuration parameters for Melodica, this operation may result in a change in width between the source and target types.
The main operations involved in the conversion are bounding the scale between (-bias, bias) for the target float format, and truncating the f field if the field width of the target format is narrower than that of the source format, using equations 2 and 4. In addition, the converter checks for the special cases of the target format (zero, NaN, and ±∞).
Type-converter – Quire-to-Posit (QtoP)
The QtoP block converts a value in the quire format to the posit format so that it can be written to the PRF after normalization. After adjusting for a negative value, the scale and f fields are extracted from the quire, as illustrated in Fig. 7(d). The truncation flag (tf), generated from truncating the fraction value, is sent to the normalize block to control RNE rounding.
Type-converter – Posit-to-Quire (PtoQ)
The PtoQ block converts an input posit number, after extraction, to the quire format, thereby initializing the quire. The fraction from the extractor block is shifted (based on the scale) and extended to occupy the corresponding field in the quire, as shown in Fig. 7(e).
5. Case Studies
We cover case studies on some linear algebra kernels and on optical flow in computer vision. We look into application kernels that are rich in floating-point arithmetic operations. For matrix operations, we develop a subset of the basic linear algebra subprograms (BLAS) and the linear algebra package (LAPACK) using SoftPosit for the analyses (Leong, 2018). Based on this investigation, we arrive at a suitable arithmetic size for each of the kernels in BLAS and LAPACK, and for optical flow estimation using the Lucas-Kanade method. We use this information to tweak parameters in Clarinet to arrive at a customized Clarinet instance.
5.1. Using Clarinet – A Simple Example
Sl. No. | Instruction | Disassembly      | RF/Quire Updates                 | Comments
1       | 00052007    | flw ft0, 0(a0)   | FPR[0](ft0) <- 0x40200000        | Load 2.50 to FPR ft0 from memory
2       | 00452087    | flw ft1, 4(a0)   | FPR[1](ft1) <- 0x40800000        | Load 4.00 to FPR ft1 from memory
3       | 44000053    | fcvt.p.s p0, ft0 | PRF[0](p0) <- 0x5400             | Execute FtoP on ft0. Result in PRF p0
4       | 440080d3    | fcvt.p.s p1, ft1 | PRF[1](p1) <- 0x6000             | Execute FtoP on ft1. Result in PRF p1
5       | c5110053    | fcvt.r.p p2      | Quire <- 0x0                     | Execute PtoQ on p2. Result in quire
6       | 34100053    | fma.p p0, p1     | Quire <- 0x00..0a00000000000000  | Accumulate (p0*p1) into quire
7       | 34100053    | fma.p p0, p1     | Quire <- 0x00..1400000000000000  | Accumulate (p0*p1) into quire
8       | d5000153    | fcvt.p.r p2      | PRF[2] <- 0x7100                 | Execute QtoP on quire. Result in PRF p2
9       | 41010153    | fcvt.s.p ft2, p2 | FPR[2] <- 0x41a00000             | Execute PtoF on p2. Result in FPR ft2
Clarinet-Melodica introduces a new usage model for the posit programmer by focusing on quire functionality. While Clarinet-Melodica does not offer dedicated instructions for operations like posit addition, subtraction, and multiplication, these can be performed via the FMA.P instruction. The example presented in Table 2 is a simple case where a user loads two 32-bit floating-point numbers from memory and performs a series of operations on them using the quire; Clarinet-Melodica is configured here to use 16-bit posits. In particular, instruction number 6 illustrates how a user could use the FMA.P instruction to multiply two posit operands (after initializing the quire to zero). Furthermore, this form of multiplication does not suffer rounding error, as the result accumulates into the quire. Similarly, substituting 1.0 for the first or second operand of FMA.P allows the user to add posits to (or, with a negated operand, subtract posits from) the quire.
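This usage model can be mirrored in a small software model of the quire (exact accumulation, rounding only on read-out). The instruction names in the comments map to Section 3.2; the helper names are ours for illustration, not part of any toolchain:

```python
from fractions import Fraction

class QuireModel:
    """Software stand-in for the quire: exact accumulate, round on read."""
    def __init__(self):
        self.q = Fraction(0)
    def init_from(self, p):          # FCVT.R.P: posit -> quire
        self.q = Fraction(p)
    def fma(self, a, b):             # FMA.P: quire += a * b, no rounding
        self.q += Fraction(a) * Fraction(b)
    def read(self):                  # FCVT.P.R: quire -> posit (rounds here)
        return float(self.q)

def posit_mul(a, b):                 # multiply = FMA into a zeroed quire
    q = QuireModel()
    q.fma(a, b)
    return q.read()

def posit_add(a, b):                 # add = init quire with a, then FMA b * 1.0
    q = QuireModel()
    q.init_from(a)
    q.fma(b, 1.0)
    return q.read()

# The sequence of Table 2: 2.5 * 4.0 accumulated twice gives exactly 20.0
q = QuireModel()
q.fma(2.5, 4.0)
q.fma(2.5, 4.0)
print(q.read())    # 20.0
```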
5.2. BLAS and LAPACK
BLAS and LAPACK routines are encountered in a wide range of engineering and scientific applications. From BLAS, we consider the dot product (xDot), matrix-vector (xGemv) and matrix-matrix (xGemm) operations, and from LAPACK we consider the Givens rotation (xGivens), where x denotes the data type used for the implementation. For all the matrix operations, we implement nine different versions using different data types for comparison, and use a 64-bit floating-point implementation as the reference. We generate numbers randomly using the rand() function. Since Clarinet supports the quire and FMA, we emphasize quire-based implementations, using SoftPosit for our analyses. To calculate the error in xDot, we average the relative error over 100K runs. To calculate the error in xGemv and xGemm, we use relative error norms computed against the corresponding operations performed in 64-bit floating-point.
The accurate digits for the different implementations are shown in Fig. 8. In the dot product, the 32-bit quire (q32Dot) yields 8.8 accurate digits for small (10-element) input vectors in the range 0 to 1 (Fig. 8a). For large vectors (10000 elements) in the same range, the number of accurate digits drops to 8.2, a drop of 6.8%. In the same input range, we observe a drop of 12.3% in fDot, 17.4% in p32Dot, and 9.3% in q24Dot. For the input vector range of 0 to 10 and sizes of 10 to 10000, we observe a similar trend (Fig. 8b). Varying the range of the input vectors impacts the accuracy heavily, especially for large vectors: we observe a drop in the number of accurate digits of 55.6% in q32Dot, 53.94% in p32Dot, and 36.3% in q24Dot, versus 18.63% in fDot (Fig. 8c). The drop in accuracy is due to the fact that posits and the quire are most accurate for values around 1.0; as the input range shifts away from 1.0, the accuracy deteriorates.
A similar trend is observed in the xGemv, xGemm, and xGivens routines for increasing matrix sizes and varying ranges (Fig. 8d). A key observation is that in the p32Givens and q32Givens routines the number of accurate digits is significantly higher (8.2 and 8.8, respectively) compared to fGivens (6.79). The shaded region in Fig. 8 represents the routines that can be executed on the current version of Clarinet, given the absence of posit addition, multiplication, and division hardware. For the software implementations of the remaining routines, we use floating-point in conjunction with the quire. For example, q32f32Givens is an implementation of the Givens rotation using a combination of 32-bit quire and 32-bit floating-point arithmetic; it yields accuracy similar to q32Givens, since the majority of the operations are dominated by the quire. In the BLAS routines, 100% of the arithmetic operations can be implemented using only the quire. Based on the accurate digits and the arithmetic supported in Clarinet, we can assess the quality of results achievable on Clarinet, which is further discussed in Section 6.2.
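The "accurate digits" metric used in Fig. 8 and Fig. 11 can be computed as the negative decimal log of the relative error against the 64-bit reference; the exact definition below is our reading of the figures, not a formula quoted from the text:

```python
import math

def accurate_digits(ref, approx):
    """Number of accurate decimal digits of approx relative to ref."""
    if approx == ref:
        return math.inf              # exact match: all digits accurate
    return -math.log10(abs(approx - ref) / abs(ref))

d = accurate_digits(1.0, 1.0 + 1e-8)
print(round(d, 1))                   # ~8.0 accurate digits
```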
5.3. Lucas-Kanade Optical Flow
Lucas-Kanade is a differential method of tracking features given a sequence of frames. Given the brightness per pixel I at (x, y), with spatial and temporal derivatives Ix, Iy and It, the local optical flow (velocity) vector (u, v) is given by equation 5.

(5)   [u, v]^T = [[Σ(Ix²), Σ(IxIy)], [Σ(IxIy), Σ(Iy²)]]^(-1) × [-Σ(IxIt), -Σ(IyIt)]^T

where the sums run over the pixels of the local window.
The Lucas-Kanade method is used to calculate the optical flow for consecutive frames of the rotating objects shown in Fig. 9. We compare different posit and single-precision floating-point configuration combinations against 64-bit floating-point values using SoftPosit, and generate heat maps of the absolute error for both u and v. The three configurations compared are: i) 32-bit single-precision floating-point arithmetic (f32), ii) 32-bit single-precision float arithmetic combined with N-bit quire arithmetic (f32qN), and iii) N-bit posit arithmetic with N-bit quire arithmetic (pNqN). Furthermore, owing to the better accuracy of posits around 1.0, we normalize (norm) the greyscale pixel values (0 to 255) to the range 0.0 to 16.0.
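A direct transcription of equation 5 solves one 2×2 linear system per window. The sketch below (plain Python over flattened gradient lists, without SoftPosit) recovers (u, v) from gradients constructed to satisfy the brightness-constancy constraint exactly:

```python
def lucas_kanade(Ix, Iy, It):
    """Solve the 2x2 normal equations of equation 5 for one window."""
    sxx = sum(x * x for x in Ix)
    sxy = sum(x * y for x, y in zip(Ix, Iy))
    syy = sum(y * y for y in Iy)
    sxt = sum(x * t for x, t in zip(Ix, It))
    syt = sum(y * t for y, t in zip(Iy, It))
    det = sxx * syy - sxy * sxy          # must be nonzero (textured window)
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v

# Gradients built so that Ix*u + Iy*v + It = 0 with u = 2, v = 3:
Ix, Iy = [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]
It = [-(x * 2 + y * 3) for x, y in zip(Ix, Iy)]
print(lucas_kanade(Ix, Iy, It))          # (2.0, 3.0)
```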
From the heatmaps in Fig. 10, the effects of normalization and q32 on error become obvious. When working with normalized data, the p32q32 configuration clearly outperforms all other configurations. For data which is not normalized, the performance of p32q32 depends on whether the data naturally falls around 1.0. However, even in the non-normalized case, f32q32 performs consistently better than f32. The general trends in maximum and RMS error for the Rubik's cube and sphere object frames under the different configurations are shown in Fig. 11. The y-axis value for RMS error in Fig. 11 gives the number of accurate digits for each configuration compared to 64-bit floating-point. Allowing one decimal place of tolerance to error, the p24q24norm configuration gives results close in accuracy to f32. With a penalty of two more decimal places, p16q16norm can be a feasible alternative. When optical flow is computed with posit configurations on values not around 1.0, the accuracy falls.
Table 3. RMS error with respect to 64-bit floating-point for the Rubik's cube and sphere datasets.

Configurations  Rubik's cube  Sphere
p32q32norm  1.3415103400712e-09  7.683256359993e-09
f32q32  1.5047392902563e-09  1.6771673416184e-08
f32  1.9613096794474e-09  2.4623573707996e-08
As summarised in Table 3, the p32q32norm configuration results in an order-of-magnitude improvement in accuracy compared to f32 for the sphere dataset. The f32q32 configuration for greyscale pixel values (0-255) improves accuracy by 23% and 32% for the Rubik's cube and sphere datasets respectively.
6. Experimental Results
6.1. Implementation Setup
Different configurations of Clarinet-Melodica were synthesized using Synopsys Design Compiler. All designs were synthesized at a clock frequency of 200 MHz on a Faraday 90 nm CMOS process. No special memory cells were used to synthesize the register files or branch target buffers.
Melodica is not a complete posit implementation. It delivers accumulator functionality using the quire, and is meant to be used alongside a 32-bit floating-point implementation. The baseline for comparisons is a 32-bit RISC-V Clarinet processor with support for 32-bit floating-point arithmetic and no Melodica. This is the minimum functionality required in a RISC-V CPU to integrate Melodica.
For the purpose of comparison, the following five implementations were evaluated:

ClarinetBase: This is the baseline implementation, which features a 32-entry, 32-bit-wide floating-point register file (FPR) with bypass logic, and an FPU capable of single-precision arithmetic. ClarinetBase does not integrate a Melodica core, but does support the new posit-related custom instructions described in Section 3.2.

ClarinetDouble: Support for 64-bit floating-point arithmetic is added to the ClarinetBase implementation. The FPR is widened to 64 bits and the bypass paths for floating-point values are widened to match. The FPU is now capable of processing 64-bit floating-point operands.

ClarinetP161: Melodica, configured with N=16 and es=1, is integrated into the ClarinetBase configuration. In this configuration, Melodica features a 128-bit quire. The PRF has 32 16-bit registers and associated bypass logic.

ClarinetP242: Melodica, configured with N=24 and es=2, is integrated into the ClarinetBase configuration. In this configuration, Melodica features a 288-bit quire. The PRF and bypass logic are widened to 24 bits.

ClarinetP322: Melodica, configured with N=32 and es=2, is integrated into the ClarinetBase configuration. In this configuration, Melodica features a 512-bit quire. The PRF and bypass logic are widened to 32 bits.
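The quire widths listed above (128, 288 and 512 bits) follow the N²/2-bit convention, which makes the quire wide enough to accumulate sums of posit products exactly and round only once. A minimal sketch of that behaviour, using Python's exact rational arithmetic to stand in for the fixed-point quire:

```python
from fractions import Fraction

def quire_width(n):
    """Quire width in bits for an N-bit posit (N^2/2 convention)."""
    return n * n // 2

# Matches the Melodica configurations: 128, 288, 512 bits
widths = [quire_width(n) for n in (16, 24, 32)]

def fused_dot_product(xs, ys):
    """Accumulate sum(x*y) exactly and round to float only once at the
    end -- the behaviour the quire provides in hardware."""
    acc = Fraction(0)  # exact accumulator standing in for the quire
    for x, y in zip(xs, ys):
        acc += Fraction(x) * Fraction(y)  # each product added exactly
    return float(acc)  # single rounding at the end

# A case where naive sequential float accumulation loses the small terms:
xs = [1e16, 1.0, 1.0, -1e16]
ys = [1.0, 1.0, 1.0, 1.0]
naive = sum(x * y for x, y in zip(xs, ys))  # 0.0: the 1.0 terms vanish
fused = fused_dot_product(xs, ys)           # 2.0: exact
```

The example illustrates why quire-based accumulation dominates accuracy in the BLAS-style kernels discussed in Section 5: intermediate roundings are eliminated entirely.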
As indicated in Table 4, adding support for 64-bit floating-point leads to a nearly 72% increase in area over ClarinetBase. In comparison, adding Melodica configured with N=16, es=1 adds approximately 9% area. Interestingly, the area overhead of moving to wider values of N (24 and 32) is marginal (around 3%). The reason lies in Clarinet's organization: moving from ClarinetBase to ClarinetP161 introduces the new PRF and bypass logic for posit types, apart from the Melodica pipes themselves, whereas moving to wider posit widths introduces no new structures but simply widens existing ones.
Table 4 also notes the cell switching power, which does not include the power dissipated due to net switching. Net switching, in particular in the clock network, dominates overall power dissipation: between 70% and 80% of the power is dissipated in the clock tree alone, and this is largely unchanged across the configurations. For this reason, we found it more instructive to highlight the cell switching power, which amplifies the effect each configuration has on dynamic power dissipation.
Table 4. Gate count, area, and cell switching power for the evaluated configurations.

Implementation  Melodica Gates  Total Gates  Total Area  Cell Switching Power (mW)
ClarinetBase
ClarinetDouble
ClarinetP161
ClarinetP242
ClarinetP322
6.2. Quality of Clarinet
Quality of Clarinet (QoC) is defined using equation 6.
(6) 
where $A_{base}$ is the area of ClarinetBase, $A_{inst}$ is the area of the instance under consideration, and $n_{acc}$ is the number of accurate digits delivered by the instance under consideration. The QoC is a metric that incorporates platform configuration and application accuracy to measure the quality of a Clarinet instance. A high-quality implementation is one with a low area footprint that supports high-precision computations. An implementation supporting high-precision computations can still be of low quality if it incurs a high area footprint. A similar equation can be formulated for the power footprint of the Clarinet instances.
We segregate the QoC into three zones: green, blue and yellow. The QoC in the green zone is superior to the other zones, with a platform quality of more than 0.85 (85%); in the blue zone the QoC lies between 0.75 and 0.85, and in the yellow zone it is less than 0.75. The QoC for different BLAS and LAPACK routines is shown in Fig. 12. The routines marked with a red star cannot be executed on the current implementation of Clarinet due to the presence of posit square root and division in those routines. The QoC is not a constant for any instance; it is a function of the accuracy of the executing software application. The QoC for the Rubik's cube and sphere frames is depicted in Fig. 13. The routines that involve addition and multiplication of posits are implemented using the FMA operation of Clarinet, as described in Section 5.1. The QoC for Clarinetf32 is better than that of the other Clarinet instances since it uses less area. However, using the Clarinetp32q32normRubik's and Clarinetp32q32normSphere implementations, which have QoC of 87.2% and 86% respectively, we observe an order-of-magnitude improvement in accuracy. As the accuracy varies, the QoC varies; depending on the accuracy requirements of the application, a suitable instance of Clarinet can be chosen.
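The zone classification can be expressed directly in code; the sketch below classifies a QoC value using the boundaries stated above (the QoC values themselves come from equation 6):

```python
def qoc_zone(qoc):
    """Classify a Quality-of-Clarinet value into the zones of
    Section 6.2: green > 0.85, blue in [0.75, 0.85], yellow < 0.75."""
    if qoc > 0.85:
        return "green"
    if qoc >= 0.75:
        return "blue"
    return "yellow"

# The p32q32norm instances reported in the text land in the green zone:
rubiks = qoc_zone(0.872)  # -> "green"
sphere = qoc_zone(0.860)  # -> "green"
```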
6.3. Disclaimer and Limitations
Disclaimer: We note that the implementation of qX2_fdp_add/sub in SoftPosit uses 32-bit posit storage underneath while performing 24-bit posit accumulations, which results in a 512-bit quire register. In hardware, we provide a 288-bit quire register for the accumulation of 24-bit posits. Based on our interactions with the developers of the SoftPosit library, we determined that it is fair to compare the accuracy of the 24-bit quire (accumulation of 24-bit posits) in software and hardware.
Limitation 1: In its present form, Clarinet can execute applications that contain multiply, add, and multiply-accumulate operations on posits; division and square root are not supported. The absence of these operations limits application analyses and execution unless square root and division are implemented in software.
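Until hardware division and square root are available, both can be emulated with the multiply and multiply-accumulate operations Clarinet does support, for example via Newton-Raphson iteration. A sketch in Python (plain floats stand in for posit arithmetic; the seeds and iteration counts are illustrative assumptions, not a Clarinet library routine):

```python
import math

def reciprocal(a, iters=8):
    """Approximate 1/a (a > 0) using only multiply and add:
    Newton-Raphson iteration x <- x * (2 - a*x)."""
    m, e = math.frexp(a)        # a = m * 2**e with 0.5 <= m < 1
    x = math.ldexp(1.0, -e)     # seed 2**-e, relative error < 50%
    for _ in range(iters):
        x = x * (2.0 - a * x)   # quadratic convergence
    return x

def sqrt_nr(a, iters=8):
    """Approximate sqrt(a) (a > 0) via the reciprocal-square-root
    iteration y <- y * (1.5 - 0.5*a*y*y), then sqrt(a) = a * y."""
    m, e = math.frexp(a)
    y = math.ldexp(1.0, -(e // 2))  # rough seed for 1/sqrt(a)
    for _ in range(iters):
        y = y * (1.5 - 0.5 * a * y * y)
    return a * y

# reciprocal(3.0) ≈ 0.333333..., sqrt_nr(2.0) ≈ 1.414213...
```

Both loops use only multiplies and adds, so each iteration maps onto the FMA-style operations already present in Melodica.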
Limitation 2: While Melodica has seen extensive unit-level verification, further system-level tests are in progress on Clarinet.
7. Conclusion
We presented Clarinet, an open-source hardware-software framework that allows posit and floating-point arithmetic to coexist in experiments. A posit arithmetic core called Melodica was presented, and the design components of the core were described. Melodica is the first posit arithmetic core supporting the quire to be integrated into a RISC-V CPU. We delved into case studies on basic matrix operations and the Lucas-Kanade optical flow method. The analyses of these kernels and applications helped us quantify the quality of different Clarinet instances, and the advantages of the different arithmetic formats were identified based on the accuracy of the numerical results. Finally, we presented synthesis results for Clarinet and outlined some limitations of the current implementation. The result is a consolidated framework that researchers can use for experimental studies on posit arithmetic. We demonstrated high-, medium- and low-quality implementations on Clarinet by segregating them into green, blue and yellow zones, using a quality metric that incorporates the area footprint of the platform. In the future, we plan to extend Melodica to support more operations, and to explore its use as a posit-enabled accelerator.
References
- BSV High-Level HDL. GitHub. https://github.com/B-Lang-org/bsc
- Flute RISC-V Core. GitHub. https://github.com/bluespec/Flute
- Bfloat16 processing for neural networks. In 2019 IEEE 26th Symposium on Computer Arithmetic (ARITH), pp. 88–91.
- Parameterized posit arithmetic hardware generator. In 2018 IEEE 36th International Conference on Computer Design (ICCD), pp. 334–341.
- A mixed precision Monte Carlo methodology for reconfigurable accelerator systems. In Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA '12), New York, NY, USA, pp. 57–66.
- Posits: the good, the bad and the ugly. In Proceedings of the Conference for Next Generation Arithmetic (CoNGA '19), New York, NY, USA.
- Next generation arithmetic for edge computing. In 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1357–1365.
- Beating floating point at its own game: posit arithmetic. Supercomputing Frontiers and Innovations 4(2), pp. 71–86.
- Universal number posit arithmetic generator on FPGA. In DATE 2018, pp. 1159–1162.
- Architecture generator for type-3 unum posit adder/subtractor. In ISCAS 2018, pp. 1–5.
- PACoGen: a hardware posit arithmetic core generator. IEEE Access 7, pp. 74586–74601.
- Spatial processors interconnected for concurrent execution for accelerating the SPICE circuit simulator using an FPGA. IEEE TCAD 31(1), pp. 9–22.
- Advanced arithmetic for the digital computer: design of arithmetic units. Springer-Verlag, Berlin, Heidelberg.
- Cheetah: mixed low-precision hardware & software co-design framework for DNNs on the edge. arXiv:1908.02386.
- Make it real: effective floating-point reasoning via exact arithmetic. In DATE 2014, pp. 1–4.
- FPGA implementation of a Kalman-based motion estimator for levitated nanoparticles. IEEE Transactions on Instrumentation and Measurement 68(7), pp. 2374–2386.
- Training deep neural networks using posit number system. arXiv:1909.03831.
- PERI: a posit enabled RISC-V core. arXiv:1908.01466.
- Towards hardware IIR filters computing just right: direct form I case study. IEEE Transactions on Computers 68(4), pp. 597–608.
- Efficient posit multiply-accumulate unit generator for deep learning applications. In ISCAS 2019, pp. 1–5.