Block floating point

Block floating point (BFP) is a method used to provide arithmetic approaching floating point while using a fixed-point processor. BFP assigns a single shared exponent to a group of significands (the non-exponent part of a floating-point number), rather than assigning each significand its own exponent. By reusing the exponent, BFP can reduce the hardware space needed to perform the same functions as floating-point algorithms; some operations over multiple values between blocks can also be done with a reduced amount of computation.[1]
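
As a rough illustration of the shared-exponent idea, the following Python sketch encodes a block of values as integer significands with one common exponent, and computes a dot product between two blocks using only integer multiply-accumulates plus a single exponent addition. The function names (bfp_encode, bfp_decode, bfp_dot) and the 8-bit significand width are assumptions chosen for this example; they are not taken from any particular BFP implementation.

```python
import math

def bfp_encode(values, mantissa_bits=8):
    """Encode floats as signed integer significands sharing one exponent.

    Hypothetical helper: each value is approximated as s * 2**shared_exp.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0] * len(values), 0
    # frexp returns (m, e) with max_abs == m * 2**e and 0.5 <= m < 1.
    _, e = math.frexp(max_abs)
    # Choose the exponent so the largest value uses the full significand width.
    shared_exp = e - (mantissa_bits - 1)
    limit = (1 << (mantissa_bits - 1)) - 1
    significands = []
    for v in values:
        q = round(math.ldexp(v, -shared_exp))
        # Clamp in case rounding pushes the largest value past the limit.
        significands.append(max(-limit, min(limit, q)))
    return significands, shared_exp

def bfp_decode(significands, shared_exp):
    """Convert a block back to floats."""
    return [math.ldexp(s, shared_exp) for s in significands]

def bfp_dot(block_a, block_b):
    """Dot product of two blocks: integer accumulation, one exponent add."""
    (sig_a, exp_a), (sig_b, exp_b) = block_a, block_b
    acc = sum(x * y for x, y in zip(sig_a, sig_b))
    return math.ldexp(acc, exp_a + exp_b)

a = bfp_encode([1.0, 0.5, -0.25, 0.75])
b = bfp_encode([2.0, 1.0, 0.5, -1.5])
print(a)              # ([64, 32, -16, 48], -6)
print(bfp_dot(a, b))  # 1.25
```

Smaller values in a block lose precision relative to the largest one, because every element is quantized on the same scale; the example values above were picked so that they happen to be represented exactly.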

The common exponent is determined by the data element with the largest amplitude in the block. To find the value of the exponent, the number of leading zeros of that element must be found (count leading zeros). This count gives the number of left shifts needed to normalize the data to the dynamic range of the processor used. Some processors provide dedicated means of doing this, such as exponent detection and normalization instructions.[2][3]
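
The exponent search described above can be sketched in Python for a block of signed 16-bit fixed-point values. This is a minimal sketch under stated assumptions: the helper names (headroom, normalize_block), the 16-bit word size, and the convention that a left shift by s corresponds to a block exponent of -s are all choices made for this example, and the code emulates in software what a processor's exponent detection and normalization instructions would do.

```python
def headroom(value, word_bits=16):
    """Left shifts available before a signed word_bits value would overflow.

    Equivalent to counting the redundant leading sign bits of the value.
    """
    # For a negative value, ~value has the same number of redundant sign bits.
    magnitude = value if value >= 0 else ~value
    return (word_bits - 1) - magnitude.bit_length()

def normalize_block(block, word_bits=16):
    """Shift every element left by the headroom of the largest element.

    Returns the normalized block and its exponent, so that
    original == normalized * 2**exponent for every element.
    """
    # The element with the largest amplitude has the least headroom.
    shift = min(headroom(v, word_bits) for v in block)
    return [v << shift for v in block], -shift

block = [100, -300, 25, 7]
normalized, exponent = normalize_block(block)
print(normalized, exponent)  # [6400, -19200, 1600, 448] -6
```

Taking the minimum headroom over the block is the same as inspecting only the largest-amplitude element, which is why a single pass over the data is enough to fix the common exponent.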

Block floating-point algorithms were extensively studied by James Hardy Wilkinson.[4][5][6]

BFP can also be implemented in software, though with smaller performance gains than dedicated hardware support provides.

Hardware support

The following hardware supports BFP operations:

  • d-Matrix Jayhawk II[7][8]
  • Tenstorrent Grayskull e75 and e150 (BFP8, BFP4 and BFP2)[9]
  • Tenstorrent Wormhole n150 and n300 (BFP8, BFP4 and BFP2)[9]

References

  1. "Block floating point". BDTI DSP Dictionary. Berkeley Design Technology, Inc. (BDTI). Archived from the original on 2018-07-11. Retrieved 2015-11-01.
  2. Chhabra, Arun; Iyer, Ramesh (December 1999). "TMS320C55x A Block Floating Point Implementation on the TMS320C54x DSP" (PDF) (Application report). Digital Signal Processing Solutions. Texas Instruments. SPRA610. Archived (PDF) from the original on 2018-07-11. Retrieved 2018-07-11.
  3. Elam, David; Iovescu, Cesar (September 2003). "A Block Floating Point Implementation for an N-Point FFT on the TMS320C55x DSP" (PDF) (Application report). TMS320C5000 Software Applications. Texas Instruments. SPRA948. Archived (PDF) from the original on 2018-07-11. Retrieved 2015-11-01.
  4. Wilkinson, James Hardy (1963). Rounding Errors in Algebraic Processes (1 ed.). Englewood Cliffs, NJ, USA: Prentice-Hall, Inc. MR 0161456.
  5. Muller, Jean-Michel; Brisebarre, Nicolas; de Dinechin, Florent; Jeannerod, Claude-Pierre; Lefèvre, Vincent; Melquiond, Guillaume; Revol, Nathalie; Stehlé, Damien; Torres, Serge (2010). Handbook of Floating-Point Arithmetic (1 ed.). Birkhäuser. doi:10.1007/978-0-8176-4705-6. ISBN 978-0-8176-4704-9. LCCN 2009939668.
  6. Overton, Michael L. (2001). Numerical Computing with IEEE Floating Point Arithmetic - Including One Theorem, One Rule of Thumb and One Hundred and One Exercises (1 ed.). Society for Industrial and Applied Mathematics (SIAM). ISBN 0-89871-482-6. 9-780898-714821-90000.
  7. Clarke, Peter (2023-08-28). "Chiplet-base generative AI platform raises LLM performance". eeNews Europe. Retrieved 2024-04-23.
  8. [SPCL_Bcast] A chiplet based generative inference architecture with block floating point datatypes. Retrieved 2024-04-23 – via www.youtube.com.
  9. "Tenstorrent AI Accelerators" (PDF).

Further reading