The NUCLEO-U575ZI-Q is built around a Cortex-M33 core. According to the official ST documentation:
The Cortex®-M33 core features a single-precision FPU (floating-point unit), that supports all the Arm® single-precision data-processing instructions and all the data types.
The Cortex®-M33 core also implements a full set of DSP (digital signal processing) instructions and a MPU (memory protection unit) that enhances the application security.
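The M33 core implements the Armv8-M architecture, so add the DSP library arm_ARMv8MMLldfsp_math.lib from STM32CubeU5 to the project. The ARMv8MMLldfsp part of the name breaks down as follows:
ARMv8MML is Armv8-M Mainline (the Cortex-M33 core), l means little-endian, d means the DSP instructions are used, f means an FPU is present, and sp means single-precision floating point.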
Enable Event Recorder debug support in the RTE (Run-Time Environment):
FFT test function:
#include <math.h>
#include <stdio.h>
#include "arm_math.h"
#include "arm_const_structs.h"
#include "EventRecorder.h"   // EventStartA / EventStopA
#include "main.h"

#define TEST_LENGTH_SAMPLES 1024

static float32_t testOutput_f32[TEST_LENGTH_SAMPLES];     // one magnitude per bin
static float32_t testInput_f32[TEST_LENGTH_SAMPLES*2];    // interleaved complex samples

void arm_cfft_f32_app(void)
{
    uint32_t fftSize = TEST_LENGTH_SAMPLES;
    uint32_t testIndex = 0;
    uint32_t ifftFlag = 0;        // 0 = forward FFT
    uint32_t doBitReverse = 1;    // 1 = output in natural order
    float32_t maxValue;
    uint16_t i;

    // Data is stored interleaved: real, imaginary, real, imaginary, ...
    for(i=0; i<TEST_LENGTH_SAMPLES; i++)
    {
        // DC component plus three cosines at 50000 Hz, 1000 Hz and 12000 Hz
        // with initial phases of 60, 90 and 30 degrees; sample rate is 1024 Hz.
        // All three tones are above the 512 Hz Nyquist limit, so they alias.
        testInput_f32[i*2] = 1.0f
            + cosf(2*3.1415926f*50000*i/1024 + 3.1415926f/3)
            + cosf(2*3.1415926f*1000*i/1024 + 3.1415926f/2)
            + cosf(2*3.1415926f*12000*i/1024 + 3.1415926f/6);
        testInput_f32[i*2+1] = 0.0f;
    }

    // Dump the input before timing so printf does not affect the measurement
    for(i=0; i<TEST_LENGTH_SAMPLES; i++)
    {
        printf("%d %f %f\r\n",i,testInput_f32[i*2],testInput_f32[i*2+1]);
    }

    // Assumes the DWT cycle counter is already enabled and the
    // Event Recorder has been initialized during startup.
    uint32_t start = DWT->CYCCNT;
    EventStartA(0);
    arm_cfft_f32(&arm_cfft_sR_f32_len1024, testInput_f32, ifftFlag, doBitReverse);
    EventStopA(0);
    uint32_t end = DWT->CYCCNT;

    // Cycles to microseconds, assuming a 160 MHz core clock
    float t = (float)(end - start) / 160.0f;
    printf("t=%fus %fms\r\n",t,t/1e3f);

    // Magnitude of each complex bin, then the largest bin and its index
    arm_cmplx_mag_f32(testInput_f32, testOutput_f32, TEST_LENGTH_SAMPLES);
    arm_max_f32(testOutput_f32, fftSize, &maxValue, &testIndex);
    printf("maxValue=%f testIndex=%lu\r\n",maxValue,(unsigned long)testIndex);
}
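The three tones in the test signal (50000 Hz, 1000 Hz, 12000 Hz) are all far above the 512 Hz Nyquist limit of a 1024 Hz sample rate, so their spectral peaks do not appear at the nominal frequencies. The sketch below estimates which bin in the lower half of the spectrum each tone should fold into. The helper alias_bin is hypothetical (it is not part of CMSIS-DSP) and assumes the bin width fs_hz / n is a whole number of Hz.

```c
#include <stdint.h>

/* Hypothetical helper: bin index (0 .. n/2) at which a real tone of
 * freq_hz is expected in an n-point FFT sampled at fs_hz.
 * Assumes fs_hz is an integer multiple of n. */
uint32_t alias_bin(uint32_t freq_hz, uint32_t fs_hz, uint32_t n)
{
    uint32_t folded = freq_hz % fs_hz;   /* wrap into [0, fs) */
    if (folded > fs_hz / 2u) {
        folded = fs_hz - folded;         /* mirror into [0, fs/2] */
    }
    return folded / (fs_hz / n);         /* Hz to bin index */
}
```

With fs_hz = 1024 and n = 1024 this gives bins 176, 24 and 288 for the three tones, each with a mirror image at 1024 minus that index. Because the DC term has amplitude 1 while each cosine splits its energy across two mirrored bins, arm_max_f32 should be expected to report bin 0 for this input.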
Debug result: the 1024-point floating-point FFT takes roughly 1 ms.
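The timing figure comes from the difference of two DWT->CYCCNT reads divided by 160, which implies a 160 MHz core clock; at that rate 1 ms corresponds to about 160,000 cycles. As a minimal sketch, the conversion can be isolated into two small helpers. The names elapsed_cycles and cycles_to_us are hypothetical, and the clock frequency is passed in rather than hard-coded.

```c
#include <stdint.h>

/* Elapsed cycles between two reads of a free-running 32-bit counter.
 * Unsigned subtraction stays correct across a single wraparound. */
uint32_t elapsed_cycles(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Convert a cycle count to microseconds for a core clock given in MHz.
 * 160 MHz is the value assumed by the test function in this post. */
float cycles_to_us(uint32_t cycles, uint32_t core_mhz)
{
    return (float)cycles / (float)core_mhz;
}
```

A 32-bit counter at 160 MHz wraps after about 26.8 seconds, so this approach only suits intervals shorter than that.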
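For comparison, FFT performance on the STM32F407 as measured by another user:
Summary: the M33 core does not appear to offer much of an advantage over the M4 core in DSP performance.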