Partial register dependency neon

河童 | Original poster | 2018-9-9 10:30
I'm having trouble finding any information on partial NEON register dependencies.
Take for example the following code:

Does the second load have to wait for the previous one to complete or may it continue right away?
I'm working with image data that needs to be palettised from a table of 256 16-bit entries, and I want to process it further with NEON. Unfortunately, because of the table size, tbl instructions are not an option, since the table would take up all 32 registers. Would it be faster to do the lookup with ARM instructions first, then combine the results and transfer them in four 64-bit registers?
If it helps, I'm targeting Cortex-A57.
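
To make the second option concrete, here is a minimal sketch in plain C of the "look up with ARM first, then combine" idea. The function names `pack4_palette` and `depalettize` are hypothetical, as is the assumption that the indices are 8-bit and the pixel count is a multiple of four. It only illustrates the packing; it says nothing about which approach is actually faster on Cortex-A57.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: look up four 8-bit indices in a 256-entry table of
 * 16-bit values and pack the results into one 64-bit word.
 * Pixel 0 ends up in the lowest 16 bits, pixel 3 in the highest. */
static uint64_t pack4_palette(const uint16_t palette[256], const uint8_t idx[4])
{
    return  (uint64_t)palette[idx[0]]
         | ((uint64_t)palette[idx[1]] << 16)
         | ((uint64_t)palette[idx[2]] << 32)
         | ((uint64_t)palette[idx[3]] << 48);
}

/* Hypothetical driver: converts `count` indexed pixels into count / 4
 * packed 64-bit words. Assumes count is a multiple of 4. */
static void depalettize(const uint16_t palette[256], const uint8_t *src,
                        uint64_t *dst, size_t count)
{
    assert(count % 4 == 0);
    for (size_t i = 0; i < count; i += 4)
        dst[i / 4] = pack4_palette(palette, src + i);
}
```

On a little-endian AArch64 target, each packed word could then be moved into a vector register in a single transfer, for example with the `vcreate_u16` intrinsic, so every write replaces the whole 64-bit register rather than one lane of it.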
