SPRUIG8I January 2018 – December 2023
A set of utilities is provided in the compiler library for writing vector-width independent code for C7000. To use these utilities, #include c7x_scalable.h in source code.
These utilities are available for use in C++ code only due to use of C++ language features in their implementation.
These utilities are available when using the TI C7000 compiler or when compiling with TI C7000 Host Emulation.
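Currently, __C7X_UNSTABLE_API must be defined in order to use the MMA scalable vector programming utilities (for example, by defining it before c7x_scalable.h is included in source code). When the MMA scalable vector programming utilities are ready for general use in a future release, defining __C7X_UNSTABLE_API will no longer be required.
The following APIs are available, all of which are described in further detail in the c7x_scalable.h file: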
c7x::max_simd<T>::value
c7x::element_count_of<T>::value
c7x::element_type_of<T>::type
c7x::component_type_of<T>::type
c7x::make_vector<T,N>::type
c7x::make_full_vector<T>::type
c7x::is_target_vector<T>::value
c7x::char_vec, c7x::short_vec, etc.
c7x::char_hvec, c7x::short_hvec, etc.
c7x::char_qvec, c7x::short_qvec, etc.
c7x::char_vec_ptr, c7x::const_short_vec_ptr, etc.
c7x::reinterpret<T>(v)
c7x::convert<T>(v)
c7x::as_char_vec(v), c7x::convert_short_vec(v), etc.
c7x::se_veclen<T>::value
c7x::se_eletype<T>::value
c7x::sa_veclen<T>::value
c7x::strm_eng<I,T>::get()
c7x::strm_eng<I,T>::get_adv()
c7x::strm_agen<I,T>::get(p)
c7x::strm_agen<I,T>::get_adv(p)
c7x::strm_agen<I,T>::get_vpred()
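To illustrate how trait-style utilities such as these support vector-width independent code, the following is a minimal host-side sketch in standard C++. Everything in the sketch namespace is a hypothetical stand-in written for illustration, including the vec type, the 64-byte kVectorBytes constant, and the horizontal_sum helper. It is not the implementation found in c7x_scalable.h, and the real utilities operate on the compiler's native vector types.

```cpp
#include <cstddef>

// Hypothetical host-side stand-ins for illustration only.
// These are NOT the definitions provided by c7x_scalable.h.
namespace sketch {

// Toy vector type: N lanes of element type T.
template <typename T, std::size_t N>
struct vec { T lane[N]; };

// Assumed full vector width in bytes (hypothetical value).
constexpr std::size_t kVectorBytes = 64;

// Number of lanes in a vector type.
template <typename V> struct element_count_of;
template <typename T, std::size_t N>
struct element_count_of<vec<T, N>> { static constexpr std::size_t value = N; };

// Element type of a vector type.
template <typename V> struct element_type_of;
template <typename T, std::size_t N>
struct element_type_of<vec<T, N>> { using type = T; };

// Widest vector of T that fits in kVectorBytes.
template <typename T>
struct make_full_vector { using type = vec<T, kVectorBytes / sizeof(T)>; };

// Width-independent helper: the lane count comes from the trait,
// so no vector width is hard-coded in the function body.
template <typename V>
typename element_type_of<V>::type horizontal_sum(const V &v)
{
    typename element_type_of<V>::type sum = 0;
    for (std::size_t i = 0; i < element_count_of<V>::value; i++)
        sum += v.lane[i];
    return sum;
}

} // namespace sketch
```

Because horizontal_sum queries the lane count through the trait, changing the assumed vector width only requires changing kVectorBytes. This mirrors the way code written against the c7x traits avoids hard-coding a vector width.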
The following macros are defined by c7x_mma.h and can be used to determine information about the MMA for use with the scalable vector programming model:
Macro Syntax | Description |
---|---|
__MMA_A_MAT_BYTES__ | The size of an A matrix in bytes. Currently, each A matrix contains one row. |
__MMA_A_ROW_WIDTH_BYTES__ | The size of a row in an A matrix in bytes. |
__MMA_A_ROWS__ | The number of rows in an A matrix. |
__MMA_A_COLS(ebytes) | The number of columns in an A matrix given the number of bytes in each element of A. Often useful with sizeof(). For example, __MMA_A_COLS(sizeof(short)). |
__MMA_A_ENTRIES__ | The number of A entries that can be contained in the A storage. |
__MMA_B_MAT_BYTES__ | The size of a B matrix in bytes. |
__MMA_B_ROW_WIDTH_BYTES__ | The size of a row in a B matrix in bytes. |
__MMA_B_ROWS(ebytes) | The number of rows in a B matrix given the number of bytes in each element of B. Often useful with sizeof(). For example, __MMA_B_ROWS(sizeof(short)). |
__MMA_B_COLS(ebytes) | The number of columns in a B matrix given the number of bytes in each element of B. Often useful with sizeof(). For example, __MMA_B_COLS(sizeof(short)). |
__MMA_C_MAT_BYTES__ | The size of a C matrix in bytes. Currently, each C matrix contains one row, and the C matrix is 4 times wider than the A matrix for larger accumulators. |
__MMA_C_ROW_WIDTH_BYTES__ | The size of a row in a C matrix in bytes. |
__MMA_C_ROWS__ | The number of rows in a C matrix. |
__MMA_C_COLS(ebytes) | The number of columns in a C matrix given the number of bytes in each element of C. Often useful with sizeof(). For example, __MMA_C_COLS(sizeof(short)). |
__MMA_C_ENTRIES__ | The number of C entries that can be contained in the C storage. |
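The element-size parameter of the COLS and ROWS macros can be pictured with a small host-side sketch. It assumes, without confirmation from c7x_mma.h, that the column count equals the row width in bytes divided by the element size in bytes. The constant kRowWidthBytes and the helpers cols_for and cols_of are hypothetical names. On a real target, the values should be taken from the macros in the table above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a row-width macro such as
// __MMA_A_ROW_WIDTH_BYTES__; the real value comes from c7x_mma.h.
constexpr std::size_t kRowWidthBytes = 64;

// Assumed relationship (not confirmed by the header):
// columns = row width in bytes / bytes per element.
constexpr std::size_t cols_for(std::size_t ebytes)
{
    return kRowWidthBytes / ebytes;
}

// Convenience wrapper that mirrors the sizeof() usage shown in the table.
template <typename T>
constexpr std::size_t cols_of()
{
    return cols_for(sizeof(T));
}
```

Under these assumptions, cols_of<std::int16_t>() plays the role that __MMA_A_COLS(sizeof(short)) plays in target code: the element type determines how many columns fit in one row.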
As a moderately complex example, the following C++ function template implements memcpy and is parameterized on the element type of the buffers. This example uses a streaming engine and a streaming address generator (see Section 4.15).
#include <c7x_scalable.h>

using namespace c7x;

/* memcpy_scalable_strm<typename S>(const S *in, S *out, int len)
 *
 * S   - A basic data type such as short or float.
 * in  - The input buffer.
 * out - The output buffer.
 * len - The number of elements to copy.
 *
 * Defaulted template arguments:
 * V   - A full vector type of S
 */
template<typename S,
         typename V = typename make_full_vector<S>::type>
void memcpy_scalable_strm(const S *restrict in, S *restrict out, int len)
{
    /*
     * Find the maximum number of vector loads/stores needed to copy the buffer,
     * including any remainder.
     */
    int cnt = len / element_count_of<V>::value;
    cnt += (len % element_count_of<V>::value > 0);

    /* Initialize the SE for a linear read in and the SA for a linear write out. */
    __SE_TEMPLATE_v1 in_tmplt = __gen_SE_TEMPLATE_v1();
    __SA_TEMPLATE_v1 out_tmplt = __gen_SA_TEMPLATE_v1();

    in_tmplt.VECLEN = se_veclen<V>::value;
    in_tmplt.ELETYPE = se_eletype<V>::value;
    in_tmplt.ICNT0 = len;

    out_tmplt.VECLEN = sa_veclen<V>::value;
    out_tmplt.ICNT0 = len;

    __SE0_OPEN(in, in_tmplt);
    __SA0_OPEN(out_tmplt);

    /* Perform the copy. If there is remainder, the last store will be predicated. */
    int i;
    for (i = 0; i < cnt; i++)
    {
        V tmp = strm_eng<0, V>::get_adv();
        __vpred pred = strm_agen<0, V>::get_vpred();
        V *addr = strm_agen<0, V>::get_adv(out);
        __vstore_pred(pred, addr, tmp);
    }

    __SE0_CLOSE();
    __SA0_CLOSE();
}
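The iteration count and the predicated final store in the example above can be modeled on a host with plain C++. The sketch below is only a behavioral model. The function copy_with_predicated_tail and its Lanes parameter are hypothetical, scalar loops stand in for vector loads and stores, and a per-lane bounds check stands in for the predicate that the streaming address generator derives from ICNT0.

```cpp
#include <cstddef>

// Host-side behavioral model only; not C7000 code.
// Lanes is a hypothetical stand-in for element_count_of<V>::value.
template <typename S, std::size_t Lanes>
int copy_with_predicated_tail(const S *in, S *out, int len)
{
    const int lanes = static_cast<int>(Lanes);

    // Number of vector-sized iterations, rounded up to cover any remainder.
    int cnt = len / lanes;
    cnt += (len % lanes > 0);

    for (int i = 0; i < cnt; i++)
    {
        const int base = i * lanes;
        for (int lane = 0; lane < lanes; lane++)
        {
            // Model of the store predicate: a lane is written only if it
            // falls within the first len elements.
            const bool enabled = (base + lane) < len;
            if (enabled)
                out[base + lane] = in[base + lane];
        }
    }

    // Return the iteration count so the rounding can be inspected.
    return cnt;
}
```

For example, copying 10 elements with 4 lanes takes 3 iterations, and in the last iteration only the first 2 lanes are written, so memory past the end of the output buffer is left untouched.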