SPRACN6 July   2019 TMS320F28384D , TMS320F28384D-Q1 , TMS320F28384S , TMS320F28384S-Q1 , TMS320F28386D , TMS320F28386D-Q1 , TMS320F28386S , TMS320F28386S-Q1 , TMS320F28388D , TMS320F28388S , TMS320F28P550SJ , TMS320F28P559SJ-Q1 , TMS320F28P650DH , TMS320F28P650DK , TMS320F28P650SH , TMS320F28P650SK , TMS320F28P659DK-Q1

 

Fast Integer Division – A Differentiated Offering From C2000 Product Family

Cycle Count

Table 4 lists the cycle counts for a wide variety of division operations and operand sizes. For each operation, the improvement factor compares the cycles needed with the fast integer division (FASTINTDIV) hardware against the cycles needed for the same operation on the C28x CPU without it. These numbers can also be profiled using the examples provided in C2000Ware.

Table 4. Performance Improvement Comparison

| Division Operation | Using C Operator '/' Without FASTINTDIV Hardware on C28x (cycles) | Using Intrinsics With FASTINTDIV Hardware + C28x (cycles) | Improvement Factor |
|---|---|---|---|
| i16/i16 traditional | 52 | 16 | 3.3 |
| i16/i16 Euclidean | 56 | 14 | 4.0 |
| i16/i16 Modulo | 56 | 14 | 4.0 |
| u16/u16 | 56 | 14 | 4.0 |
| i32/i32 traditional | 59 | 13 | 4.5 |
| i32/i32 Euclidean | 63 | 14 | 4.5 |
| i32/i32 Modulo | 63 | 14 | 4.5 |
| i32/u32 traditional | 37 | 14 | 2.6 |
| i32/u32 Modulo | 41 | 14 | 2.9 |
| u32/u32 | 37 | 12 | 3.1 |
| i32/i16 traditional | 60 | 18 | 3.3 |
| i32/i16 Euclidean | 64 | 16 | 4.0 |
| i32/i16 Modulo | 64 | 16 | 4.0 |
| u32/u16 | 38 | 13 | 2.9 |
| i64/i64 traditional (1) | 78-2631 | 42 | 1.9-62.6 |
| i64/i64 Euclidean (1) | 82-2635 | 42 | 2.0-62.7 |
| i64/i64 Modulo (1) | 82-2635 | 42 | 2.0-62.7 |
| i64/u64 traditional (1) | 54-2605 | 42 | 1.3-62.0 |
| i64/u64 Euclidean (1) | 58-2609 | 42 | 1.4-62.1 |
| i64/u64 Modulo (1) | 58-2609 | 42 | 1.4-62.1 |
| u64/u64 (1) | 53-2548 | 42 | 1.3-60.7 |
(1) The FASTINTDIV hardware implements 64-bit integer division in a fixed number of cycles, which gives fast, deterministic behavior. MCUs without such hardware acceleration implement 64-bit integer division either with generic CPU instructions that are not optimized for division or with algorithms that adapt their execution to the values of the numerator and denominator. For instance, if both the numerator and the denominator fit in 32 bits, the software executes a 32-bit division instead. Hence, the number of cycles can vary significantly, and for large numerator and denominator values the overall cycle count is much higher than that of the FASTINTDIV accelerator.