[Infineon CY8CKIT-062S2-AI Review] Sound Recognition
This article presents a sound-recognition project built on the Infineon CY8CKIT-062S2-AI development board: the on-board audio sensor collects environmental sound data, and a machine-learning model performs inference to recognize specific sound signals.
Project Overview
The project uses the on-board audio sensor (PDM microphone) to capture sound data, which is sent to an ML model to detect specific sounds such as coughing, crying, laughter, and birdsong. The work is organized as follows:

- Environment setup: install the required software and machine-learning tools used to generate the model code;
- Project creation: use ModusToolbox to quickly load, build, and debug the firmware;
- Project code: the key code of the implementation, including a flowchart;
- Demonstration: the serial terminal displays the probability of each target sound and the inferred prediction.
Environment Setup
- Download the development tools and IDE software from the official CY8CKIT-062S2-AI website, including:
- ModusToolbox;
- DEEPCRAFT™ Studio (Imagimob);
- The ModusToolbox Setup utility can be used to install the related software and toolchains;
- Flash the firmware with the ModusToolbox Programmer tool.
Project Test
Load the CY8CKIT-062S2-AI demo project, which demonstrates deploying a machine-learning (ML) model generated by DEEPCRAFT™ Studio.
- Uses an acoustic model / keyword detector that takes pulse-density modulation (PDM) audio data as input;
- Detects a variety of keywords, such as digits, laughter, directions, birds, dogs, and cats;
- The microphone is tuned for a preset detection distance of 1 m;
- Run the example project: it captures microphone audio, feeds it to the ML model, and prints the recognition results to a serial terminal.
Project Creation
- Open the Eclipse IDE for ModusToolbox;
- In the Quick Panel, select Start - New Application;
- Once the device catalog has loaded (an unrestricted network connection may be required), enter CY8CKIT-062S2-AI in the search box to locate the board;
- Under the Machine Learning category, check the DEEPCRAFT Deploy Model Audio project and click the Create button;
- After the demo is created, right-click the project and build it, confirming that there are no errors.
See: Infineon/mtb-example-ml-deepcraft-deploy-audio.
Flowchart

Project Code
Open the main.c file in the project directory; the code is as follows:
#include "cyhal.h"
#include "cybsp.h"
#include "cy_retarget_io.h"
#include <float.h>
#include <string.h> /* memset */
#include <math.h>   /* fabs */
/* Model to use */
#include <models/model.h>
/*******************************************************************************
* Macros
********************************************************************************/
/* Desired sample rate. Typical values: 8/16/22.05/32/44.1/48 kHz */
#define SAMPLE_RATE_HZ 16000
/* Audio Subsystem Clock. Typical value depends on the desired sample rate:
- 8/16/48kHz : 24.576 MHz
- 22.05/44.1kHz : 22.579 MHz */
#define AUDIO_SYS_CLOCK_HZ 24576000
/* Decimation Rate of the PDM/PCM block. Typical value is 64 */
#define DECIMATION_RATE 64
/* Microphone sensitivity
* PGA in 0.5 dB increment, for example a value of 5 would mean +2.5 dB. */
#define MICROPHONE_GAIN 20
/* Multiplication factor of the input signal.
* This should ideally be 1. Higher values will have a negative impact on
* the sampling dynamic range. However, it can be used as a last resort
* when MICROPHONE_GAIN is already at maximum and the ML model was trained
* with data at a higher amplitude than the microphone captures.
* Note: If you use the same board for recording training data and
* deployment of your own ML model set this to 1.0. */
#define DIGITAL_BOOST_FACTOR 10.0f
/* Specifies the dynamic range in bits.
* PCM word length, see the A/D specific documentation for valid ranges. */
#define AUIDO_BITS_PER_SAMPLE 16
/* PDM/PCM Pins */
#define PDM_DATA P10_5
#define PDM_CLK P10_4
/* Size of audio buffer */
#define AUDIO_BUFFER_SIZE 512
/* Converts given audio sample into range [-1,1] */
#define SAMPLE_NORMALIZE(sample) (((float) (sample)) / (float) (1 << (AUIDO_BITS_PER_SAMPLE - 1)))
/* DEEPCRAFT compatibility defines to support all versions of code generation APIs */
#ifndef IPWIN_RET_SUCCESS
#define IPWIN_RET_SUCCESS (0)
#endif
#ifndef IPWIN_RET_NODATA
#define IPWIN_RET_NODATA (-1)
#endif
#ifndef IPWIN_RET_ERROR
#define IPWIN_RET_ERROR (-2)
#endif
#ifndef IMAI_DATA_OUT_SYMBOLS
#define IMAI_DATA_OUT_SYMBOLS IMAI_SYMBOL_MAP
#endif
/* End DEEPCRAFT compatibility defines */
/*******************************************************************************
* Function Prototypes
*******************************************************************************/
static void init_board(void);
static void init_audio(cyhal_pdm_pcm_t* pdm_pcm);
static void halt_error(int code);
static void pdm_frequency_fix(void);
/**********************************************
* Function Name: main
***********************************************/
int main(void)
{
    int16_t audio_buffer[AUDIO_BUFFER_SIZE] = {0};
    float label_scores[IMAI_DATA_OUT_COUNT];
    char *label_text[] = IMAI_DATA_OUT_SYMBOLS;
    cy_rslt_t result;
    size_t audio_count;
    cyhal_pdm_pcm_t pdm_pcm;
    int16_t prev_best_label = 0;
    int16_t best_label = 0;
    float sample = 0.0f;
    float sample_abs = 0.0f;
    float max_score = 0.0f;
    float sample_max = 0;
    float sample_max_slow = 0;

    /* Basic board setup */
    init_board();

    /* Initialize model */
    result = IMAI_init();
    halt_error(result);

    /* Initialize audio sampling */
    init_audio(&pdm_pcm);

    /* ANSI ESC sequence for clear screen */
    printf("\x1b[2J\x1b[;H\x1b[?25l;");

    for (;;)
    {
        /* Move cursor home */
        printf("\033[H");
        printf("DEEPCRAFT Studio Audio Model Example\r\n\n");

        /* Initialize the audio_buffer to zeroes and read data
         * from the pdm mic into it */
        audio_count = AUDIO_BUFFER_SIZE;
        memset(audio_buffer, 0, AUDIO_BUFFER_SIZE * sizeof(int16_t));
        result = cyhal_pdm_pcm_read(&pdm_pcm, (void *) audio_buffer, &audio_count);
        halt_error(result);

        sample_max_slow -= 0.0005f;
        sample_max = 0;

        for (int i = 0; i < (int)audio_count; i++)
        {
            /* Convert integer sample to float and pass it to the model */
            sample = SAMPLE_NORMALIZE(audio_buffer[i]) * DIGITAL_BOOST_FACTOR;
            if (sample > 1.0)
            {
                sample = 1.0;
            }
            else if (sample < -1.0)
            {
                sample = -1.0;
            }
            result = IMAI_enqueue(&sample);
            halt_error(result);

            /* Used to tune gain control. sample_max should be near 1.0
             * when shouting directly into the microphone */
            sample_abs = fabs(sample);
            if (sample_abs > sample_max)
            {
                sample_max = sample_abs;
            }
            if (sample_max > sample_max_slow)
            {
                sample_max_slow = sample_max;
            }

            /* Check if there is any model output to process */
            best_label = 0;
            max_score = -1000.0f;
            switch (IMAI_dequeue(label_scores))
            {
                case IMAI_RET_SUCCESS: /* We have data, display it */
                    for (int j = 0; j < IMAI_DATA_OUT_COUNT; j++)
                    {
                        printf("label: %-10s: score: %.4f\r\n", label_text[j], label_scores[j]);
                        if (label_scores[j] > max_score)
                        {
                            max_score = label_scores[j];
                            best_label = j;
                        }
                    }
                    printf("\r\n");

                    /* Post processing:
                     * If the previous best label still has a confidence score
                     * above 0.05, keep it as the best label. */
                    if (prev_best_label != 0 && label_scores[prev_best_label] > 0.05)
                    {
                        best_label = prev_best_label;
                        printf("Output: %-30s\r\n", label_text[best_label]);
                    }
                    /* Otherwise, if the best label is not "unlabeled" and its
                     * confidence score is at least 0.5, use it as the best label. */
                    else if (best_label != 0 && max_score >= 0.50)
                    {
                        prev_best_label = best_label;
                        printf("Output: %-30s\r\n", label_text[best_label]);
                    }
                    /* Else the best label is "unlabeled" */
                    printf("\r\n");
                    printf("Volume: %.4f (%.2f)\r\n", sample_max, sample_max_slow);
                    printf("Audio buffer utilization: %.3f\r\n", audio_count / (float)AUDIO_BUFFER_SIZE);
                    break;

                case IMAI_RET_NODATA: /* No new output, continue with sampling */
                    break;

                case IMAI_RET_ERROR: /* Abort on error */
                    halt_error(IMAI_RET_ERROR);
                    break;
            }
        }
    }
}
Save the file.
Firmware Upload
- Connect the board to the PC and click the Run button in the toolbar to flash the firmware;
- Alternatively, flash the firmware with the ModusToolbox Programmer tool;
- The firmware image is located in the .../DEEPCRAFT_..._Audio/build/APP_CY8CKIT-062S2-AI/Debug folder;
- Load the firmware, then configure the programmer and target board;
- Click Program to flash.

Results
- Run Tera Term, open the board's serial port, and set the baud rate to 115200;
- Press the on-board RESET button; the terminal prints the audio example banner and sound inference begins;
- Play sounds corresponding to the model's labels; the board recognizes them with the audio model and displays the matching label.

Summary
This article presented a sound-recognition project on the Infineon CY8CKIT-062S2-AI development board: the on-board audio sensor collects environmental sound data, and a machine-learning model performs inference to recognize specific sound signals. It provides a reference for the rapid development of related products in the edge-AI field.