Is it typical at least 2 cycles taken for load from

社畜一枚 | Original poster | 2018-9-9 07:22
I expected load and store instructions that access zero-wait-state memory to take only 1 cycle each (on average, with the pipeline filled), but that does not seem to be the case. Is it typical for a load or store to take at least 2 cycles even with zero-wait-state memory?
(Here, by zero-wait-state memory I mean, for example, an internal RAM whose operating clock frequency is higher than that of the processor core.)
Below are the test code I used and the assembly code generated from it. (I tested this on an STM32F429ZITx board.)
for (i=0; i<20000; i++) {
    data = test_data[i];
    test_data[20000-1-i] = data;
}
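For anyone who wants to reproduce the loop, here is a minimal self-contained sketch of it. The declarations are assumptions, not taken from the post: test_data is assumed to be an array of unsigned 16-bit values (the generated code uses LDRH/STRH), and data is assumed to be a volatile local (the generated code stores it to [sp] and reloads it). The names TEST_LEN and run_test_loop are hypothetical.

```c
#include <stdint.h>

#define TEST_LEN 20000u            /* loop bound from the post */

/* Assumed type: unsigned 16-bit elements (halfword loads and stores). */
static uint16_t test_data[TEST_LEN];

/* Copies each element to its mirrored position, one element per iteration. */
static void run_test_loop(void)
{
    /* Assumed volatile: forces a store to the stack and a reload,
       matching the STRH/LDRH pair through [sp] in the listing. */
    volatile uint16_t data;
    uint32_t i;

    for (i = 0; i < TEST_LEN; i++) {
        data = test_data[i];
        test_data[TEST_LEN - 1u - i] = data;
    }
}
```

With these assumptions, each source iteration performs two loads and two stores: one load and one store on test_data, plus the store and reload of data on the stack.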
Below is the generated assembly code for the test loop (the compiler unrolled it so that each pass covers two iterations; optimization options -O3 -Otime). This 14-instruction loop is measured to take 36 cycles, so it averages about 2.6 cycles per instruction.
0x080019E0 F8343011  LDRH     r3,[r4,r1,LSL #1]
0x080019E4 F8AD3000  STRH     r3,[sp,#0x00]
0x080019E8 F8BDC000  LDRH     r12,[sp,#0x00]
0x080019EC 1A53      SUBS     r3,r2,r1
0x080019EE F824C013  STRH     r12,[r4,r3,LSL #1]
0x080019F2 EB040341  ADD      r3,r4,r1,LSL #1
0x080019F6 885B      LDRH     r3,[r3,#0x02]
0x080019F8 F8AD3000  STRH     r3,[sp,#0x00]
0x080019FC F8BDC000  LDRH     r12,[sp,#0x00]
0x08001A00 1A43      SUBS     r3,r0,r1
0x08001A02 F824C013  STRH     r12,[r4,r3,LSL #1]
0x08001A06 1C89      ADDS     r1,r1,#2
0x08001A08 42A9      CMP      r1,r5
0x08001A0A D3E9      BCC      0x080019E0
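The cycles-per-instruction figure can be double-checked with a short calculation. This is only a sketch of the arithmetic: the function names are hypothetical, and the constants are the numbers quoted in the post (36 cycles per pass, 14 instructions per pass, 2 source iterations per pass, 20000 iterations).

```c
#include <stdint.h>

/* Figures quoted in the post. */
enum {
    LOOP_CYCLES       = 36,     /* measured cycles per pass of the unrolled loop */
    LOOP_INSTRUCTIONS = 14,     /* instructions per pass */
    UNROLL_FACTOR     = 2,      /* source iterations covered by one pass */
    SOURCE_ITERATIONS = 20000   /* loop bound in the C code */
};

/* Cycles per instruction times 1000, rounded to nearest (no floating point). */
static uint32_t milli_cpi(uint32_t cycles, uint32_t instructions)
{
    return (cycles * 1000u + instructions / 2u) / instructions;
}

/* Cycles spent on one iteration of the original (non-unrolled) loop. */
static uint32_t cycles_per_source_iteration(uint32_t cycles, uint32_t unroll)
{
    return cycles / unroll;
}

/* Total cycles for the whole loop, assuming every pass costs the same. */
static uint32_t total_loop_cycles(uint32_t cycles, uint32_t unroll,
                                  uint32_t iterations)
{
    return (iterations / unroll) * cycles;
}
```

Calling milli_cpi(LOOP_CYCLES, LOOP_INSTRUCTIONS) gives 2571, i.e. about 2.57 cycles per instruction, which rounds to the 2.6 quoted above. Of the 14 instructions, 8 are loads or stores (4 LDRH and 4 STRH), and one pass works out to 18 cycles per source iteration.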
