Improving Accuracy, Area and Speed of Approximate Floating-Point Multiplication Using Carry Prediction
Subject Areas : Communication Systems & Devices
Marziye Fathi 1 , Hooman Nikmehr 2
1 - Najafabad Branch, Islamic Azad University
2 - University of Isfahan, Isfahan, Iran
Keywords: estimated arithmetic, partial product matrix, rounding, truncated multiplier, error correction
Abstract :
Arithmetic units are among the most essential building blocks of digital circuits, and improving their operation benefits the whole digital system. Among them, multipliers are the most important, used in a wide range of digital systems such as telecommunication signal processing, embedded systems and mobile devices. The main drawback of a multiplication unit is its high computational load, which leads to considerable power consumption and silicon area. It also lowers speed, which degrades the performance of the host system. Estimated (approximate) arithmetic is a new branch of computer arithmetic in which a portion of the arithmetic circuit and/or the intermediate computations is discarded or modified. Applying estimated arithmetic in arithmetic units improves speed, power consumption and implementation area at the cost of a slight loss of result accuracy. This article develops and analyzes an estimated truncated floating-point multiplier for single-precision operands that compensates errors to a desired level by using the least significant columns of the partial product matrix. These errors are caused by removing a number of carry digits in the partial product matrix that contribute directly to rounding the floating-point result. The evaluation results indicate that the proposed method improves speed, accuracy and silicon area compared with common truncated multiplication methods.
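The general idea of truncated multiplication with column-based compensation can be illustrated with a small software model. The sketch below is not the circuit proposed in the article. It is a minimal, generic model that assumes unsigned n-bit significands (n = 24 for single precision), discards the lowest columns of the partial product matrix, and sums a configurable number of columns just below the cut to estimate the carry into the retained upper half. The names `partial_product_columns`, `truncated_multiply` and the parameter `kept` are hypothetical and chosen only for illustration. Exponent handling and rounding are left out.

```python
def partial_product_columns(a: int, b: int, n: int) -> list[int]:
    """Count the 1 bits in each column of the n x n partial product matrix.

    cols[k] is the number of partial product bits a_i * b_j with i + j == k,
    i.e. the bits of weight 2**k.
    """
    cols = [0] * (2 * n)
    for i in range(n):
        if (a >> i) & 1:
            for j in range(n):
                if (b >> j) & 1:
                    cols[i + j] += 1
    return cols


def truncated_multiply(a: int, b: int, n: int = 24, kept: int = 2) -> int:
    """Approximate the upper n bits of the 2n-bit product a * b.

    Illustrative assumption: columns below n - kept are dropped entirely,
    while the `kept` columns just below the cut are still summed so that
    their carries reach the upper half. With kept == n the result is exact.
    """
    if not 0 <= kept <= n:
        raise ValueError("kept must be between 0 and n")
    cols = partial_product_columns(a, b, n)
    lowest_column = n - kept
    total = sum(count << k for k, count in enumerate(cols) if k >= lowest_column)
    return total >> n


if __name__ == "__main__":
    a, b = 0xC00001, 0xFFFFFF  # two 24-bit significands (hidden bit set)
    exact = (a * b) >> 24
    for kept in (0, 2, 4, 8, 24):
        approx = truncated_multiply(a, b, 24, kept)
        print(f"kept={kept:2d}  error={exact - approx}")
```

In this model the approximate result never exceeds the exact upper half, because only non-negative terms are dropped, and the error cannot grow as `kept` increases. This mirrors the trade-off described in the abstract: more retained columns cost more hardware but recover more of the carries that affect rounding.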