[ModusToolbox™] [Infineon CY8CKIT-062S2-AI Review] Sound Recognition

Posted by 无垠的广袤 on 2025-11-20 14:11


This article presents a sound-recognition project built on the Infineon CY8CKIT-062S2-AI development board: the onboard audio sensor collects ambient sound data, and a machine-learning model performs prediction and inference to detect specific sound signals.

Project Overview

The project uses the onboard audio sensor to collect sound data, which is fed to an ML model to detect specific sounds such as coughing, crying or laughter, and bird song.

audio_recog_cover.jpg

  • Environment setup: install the required software and machine-learning tools used to generate the model code;
  • Project creation: use ModusToolbox to quickly load, build, and debug the firmware;
  • Project code: the key code for implementing the design, including a flowchart;
  • Demonstration: display the probability of each target sound over the serial port and output the inferred prediction.

Environment Setup

  • Download the development tools and IDE software from the CY8CKIT-062S2-AI official website, including

    • ModusToolbox;
    • DEEPCRAFT™ Studio (formerly Imagimob Studio);
  • The ModusToolbox Setup tool can install the related software and toolchains;

  • Use ModusToolbox Programmer to flash the firmware.

Project Test

Load the CY8CKIT-062S2-AI board's demo project, which demonstrates deploying a machine-learning (ML) model generated by DEEPCRAFT™ Studio.

  • Uses an acoustic model / keyword detector that takes pulse-density-modulated (PDM) audio data as input;
  • Detects various keywords such as digits, laughter, directions, birds, dogs, and cats;
  • The microphone is tuned for a preset detection distance of 1 meter;
  • Run the example project: microphone audio is passed to the ML model, and recognition results are printed to a serial terminal.
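The demo's PDM front end ties three numbers together: the audio subsystem clock, the decimation rate, and the PCM sample rate. For a PDM microphone the bit-clock is sample_rate × decimation_rate, and the subsystem clock must divide evenly into that bit-clock, which is why 24.576 MHz suits 8/16/48 kHz while 22.579 MHz suits 22.05/44.1 kHz. A minimal sketch of this arithmetic (the helper names are illustrative, not part of the example project):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative helper: the PDM bit-clock (Hz) required for a given
 * PCM sample rate and decimation rate. */
uint32_t pdm_clk_hz(uint32_t sample_rate_hz, uint32_t decimation)
{
    return sample_rate_hz * decimation;
}

/* Returns 1 if the audio subsystem clock is an exact integer
 * multiple of the required PDM bit-clock, 0 otherwise. */
int pdm_clk_is_achievable(uint32_t audio_sys_clk_hz,
                          uint32_t sample_rate_hz,
                          uint32_t decimation)
{
    uint32_t clk = pdm_clk_hz(sample_rate_hz, decimation);
    return clk != 0 && (audio_sys_clk_hz % clk) == 0;
}
```

With the demo's values (16 kHz, decimation 64), the bit-clock is 1.024 MHz, and 24.576 MHz divides into it exactly 24 times; a 22.05 kHz rate would not divide evenly, which is why a different subsystem clock is listed for that family of rates.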

Project Creation

  • Open Eclipse IDE for ModusToolbox;
  • In the Quick Panel, select New Application under Start;
  • Once the device catalog has loaded (internet access required; a proxy may be needed in some regions), type CY8CKIT-062S2-AI in the search box to find the device;

ML_motion_create.jpg

  • Under the Machine Learning category, check the DEEPCRAFT Deploy Model Audio project and click the Create button;

deploy_audio_test.jpg

  • After the demo is created, right-click the project and build it, confirming there are no errors;

See: Infineon/mtb-example-ml-deepcraft-deploy-audio.

Flowchart

flowchart_ML_audio.png

Project Code

Open the main.c file in the project directory; the code is as follows:

#include "cyhal.h"
#include "cybsp.h"
#include "cy_retarget_io.h"
#include <float.h>
#include <math.h>    /* fabs */
#include <stdio.h>   /* printf */
#include <string.h>  /* memset */

/* Model to use */
#include <models/model.h>

/*******************************************************************************
* Macros
********************************************************************************/
/* Desired sample rate. Typical values: 8/16/22.05/32/44.1/48 kHz */
#define SAMPLE_RATE_HZ              16000

/* Audio subsystem clock. Typical value depends on the desired sample rate:
- 8/16/48kHz    : 24.576 MHz
- 22.05/44.1kHz : 22.579 MHz */
#define AUDIO_SYS_CLOCK_HZ          24576000

/* Decimation Rate of the PDM/PCM block. Typical value is 64 */
#define DECIMATION_RATE             64

/* Microphone sensitivity
 * PGA in 0.5 dB increment, for example a value of 5 would mean +2.5 dB. */
#define MICROPHONE_GAIN             20

/* Multiplication factor of the input signal.
 * This should ideally be 1. Higher values will have a negative impact on
 * the sampling dynamic range. However, it can be used as a last resort 
 * when MICROPHONE_GAIN is already at maximum and the ML model was trained
 * with data at a higher amplitude than the microphone captures.
 * Note: If you use the same board for recording training data and 
 * deployment of your own ML model set this to 1.0. */
#define DIGITAL_BOOST_FACTOR            10.0f

/* Specifies the dynamic range in bits.
 * PCM word length, see the A/D specific documentation for valid ranges. */
#define AUDIO_BITS_PER_SAMPLE       16

/* PDM/PCM Pins */
#define PDM_DATA                    P10_5
#define PDM_CLK                     P10_4

/* Size of audio buffer */
#define AUDIO_BUFFER_SIZE           512

/* Converts given audio sample into range [-1,1] */
#define SAMPLE_NORMALIZE(sample)        (((float) (sample)) / (float) (1 << (AUDIO_BITS_PER_SAMPLE - 1)))

/* DEEPCRAFT compatibility defines to support all versions of code generation APIs */
#ifndef IPWIN_RET_SUCCESS
#define IPWIN_RET_SUCCESS (0)
#endif
#ifndef IPWIN_RET_NODATA
#define IPWIN_RET_NODATA (-1)
#endif
#ifndef IPWIN_RET_ERROR
#define IPWIN_RET_ERROR (-2)
#endif
#ifndef IMAI_DATA_OUT_SYMBOLS
#define IMAI_DATA_OUT_SYMBOLS IMAI_SYMBOL_MAP
#endif
/* End DEEPCRAFT compatibility defines */

/*******************************************************************************
* Function Prototypes
*******************************************************************************/
static void init_board(void);
static void init_audio(cyhal_pdm_pcm_t* pdm_pcm);
static void halt_error(int code);
static void pdm_frequency_fix(void);


/**********************************************
* Function Name: main
***********************************************/
int main(void)
{
    int16_t audio_buffer[AUDIO_BUFFER_SIZE] = {0};
    float label_scores[IMAI_DATA_OUT_COUNT];
    char *label_text[] = IMAI_DATA_OUT_SYMBOLS;

    cy_rslt_t result;
    size_t audio_count;
    cyhal_pdm_pcm_t pdm_pcm;
    int16_t prev_best_label = 0;
    int16_t best_label = 0;
    float sample = 0.0f;
    float sample_abs = 0.0f;
    float max_score = 0.0f;
    float sample_max = 0;
    float sample_max_slow = 0;

    /* Basic board setup */
    init_board();

    /* Initialize model */
    result = IMAI_init();
    halt_error(result);

    /* Initialize audio sampling */
    init_audio(&pdm_pcm);

    /* ANSI ESC sequence for clear screen */
    printf("\x1b[2J\x1b[;H\x1b[?25l;");

    for (;;)
    {
        /* Move cursor home */
        printf("\033[H");
        printf("DEEPCRAFT Studio Audio Model Example\r\n\n");

        /* Initialize the audio_buffer to zeroes and read data
         * from the pdm mic into it */
        audio_count = AUDIO_BUFFER_SIZE;
        memset(audio_buffer, 0, AUDIO_BUFFER_SIZE * sizeof(int16_t));
        result = cyhal_pdm_pcm_read(&pdm_pcm, (void *) audio_buffer, &audio_count);
        halt_error(result);

        sample_max_slow -= 0.0005;
        sample_max = 0;
        for(int i = 0; i < audio_count; i++)
        {
            /* Convert integer sample to float and pass it to the model */
            sample = SAMPLE_NORMALIZE(audio_buffer[i]) * DIGITAL_BOOST_FACTOR;
            if (sample > 1.0)
            {
                sample = 1.0;
            }
            else if (sample < -1.0)
            {
                sample = -1.0;
            }
            result = IMAI_enqueue(&sample);
            halt_error(result);

            /* Used to tune gain control. sample_max should be near 1.0 
             * when shouting directly into the microphone */
            sample_abs = fabs(sample);
            if(sample_abs > sample_max)
            {
                sample_max = sample_abs;
            }

            if(sample_max > sample_max_slow)
            {
                sample_max_slow = sample_max;
            }
            /* Check if there is any model output to process */
            best_label = 0;
            max_score = -1000.0f;
            switch(IMAI_dequeue(label_scores))
            {
                case IMAI_RET_SUCCESS:      /* We have data, display it */

                    for(int i = 0; i < IMAI_DATA_OUT_COUNT; i++)
                    {
                        printf("label: %-10s: score: %.4f\r\n", label_text[i], label_scores[i]);
                        if (label_scores[i] > max_score)
                        {
                            max_score = label_scores[i];
                            best_label = i;
                        }
                    }
                    printf("\r\n");

                    /* Post processing
                     * If the previous best label still has a confidence score above 0.05,
                     * keep it as the best label. */
                    if(prev_best_label != 0 && label_scores[prev_best_label] > 0.05)
                    {
                        best_label = prev_best_label;
                        printf("Output: %-30s\r\n", label_text[best_label]);
                    }
                    /* Otherwise, if the best label is not "unlabeled", and conf score is above 0.5
                     * use it as best label. */
                    else if(best_label != 0 && max_score >= 0.50)
                    {
                        prev_best_label = best_label;
                        printf("Output: %-30s\r\n", label_text[best_label]);
                    }
                    /* Else the best label is "unlabeled" */
                    printf("\r\n");
                    printf("Volume: %.4f    (%.2f)\r\n", sample_max, sample_max_slow);
                    printf("Audio buffer utilization: %.3f\r\n", audio_count / (float)AUDIO_BUFFER_SIZE);
                    break;
                case IMAI_RET_NODATA:   /* No new output, continue with sampling */
                    break;
                case IMAI_RET_ERROR:    /* Abort on error */
                    halt_error(IMAI_RET_ERROR);
                    break;
            }
        }
    }
}
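The per-sample conditioning step in the loop above (normalize to [-1, 1], apply the digital boost, clip back into range) can be isolated into a small, testable helper. A sketch mirroring the example's macros (the function name is hypothetical, not part of the example project):

```c
#include <assert.h>
#include <stdint.h>

#define AUDIO_BITS_PER_SAMPLE  16     /* PCM word length, as in the example */
#define DIGITAL_BOOST_FACTOR   10.0f  /* same boost as the example */

/* Maps a signed 16-bit PCM sample into [-1, 1], applies the digital
 * boost, then clips the result back into [-1, 1] before it is passed
 * to the model. */
float prepare_sample(int16_t raw)
{
    float sample = ((float) raw / (float) (1 << (AUDIO_BITS_PER_SAMPLE - 1)))
                   * DIGITAL_BOOST_FACTOR;
    if (sample > 1.0f)
    {
        sample = 1.0f;
    }
    else if (sample < -1.0f)
    {
        sample = -1.0f;
    }
    return sample;
}
```

With a boost factor of 10, any raw sample beyond about one tenth of full scale saturates at ±1.0, which is why the source comments recommend a factor of 1.0 when training data and deployment use the same board.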

Save the code. The definitions of the helper functions init_board(), init_audio(), halt_error(), and pdm_frequency_fix() are provided by the example project and are omitted here.
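The post-processing branch near the end of main() implements a simple hysteresis: a previously reported label stays "sticky" while its score remains above 0.05, and a new label is adopted only when it is not "unlabeled" (index 0) and scores at least 0.5. A standalone sketch of that logic (function name and signature are illustrative):

```c
#include <assert.h>

/* Sketch of the example's label hysteresis. Label index 0 is
 * "unlabeled". Returns the label to report and updates *prev_label
 * when a new confident detection is adopted. */
int postprocess_label(const float *scores, int best_label,
                      float max_score, int *prev_label)
{
    /* Keep the previous label while its score stays above 0.05. */
    if (*prev_label != 0 && scores[*prev_label] > 0.05f)
    {
        return *prev_label;
    }
    /* Adopt a new label only on a confident, non-"unlabeled" hit. */
    if (best_label != 0 && max_score >= 0.50f)
    {
        *prev_label = best_label;
        return best_label;
    }
    /* Otherwise report "unlabeled". */
    return 0;
}
```

The low release threshold (0.05) relative to the high attack threshold (0.5) keeps the displayed label from flickering while a sound is fading out.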

Firmware Upload

  • Connect the board to the PC and click the Run button in the toolbar to upload the firmware;
  • Alternatively, flash the firmware with the ModusToolbox Programmer tool;
  • The firmware is located in the .../DEEPCRAFT_..._Audio/build/APP_CY8CKIT-062S2-AI/Debug folder;

hex_localization.jpg

  • Load the firmware and configure the programmer and the board model;
  • Click Program.

programmer_modustoolbox_audio.jpg

Results

  • Run Tera Term, connect to the device's serial port, and set the baud rate to 115200;
  • Short-press the onboard RESET button; the terminal displays the Audio example output and begins sound inference;

audio_analysis.gif

  • When sounds corresponding to the trained labels occur in the environment, the board recognizes them with the audio model and displays the matching label;

audio_test.gif

Summary

This article presented a sound-recognition design on the Infineon CY8CKIT-062S2-AI development board, in which the onboard audio sensor collects ambient sound data and a machine-learning model predicts and infers specific sound signals, providing a reference for rapid development of related products in the edge-AI field.
