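I. Introduction
        NEON was introduced with the ARMv7-A architecture, and on cores such as the ARM Cortex-A5 it was offered as an optional module.
        Only after the later ARM Cortex-A7 brought broad support for the NEON coprocessor did it become widely used in real project development. Because 32-bit registers are limiting, ARM's engineers wanted to raise the CPU's data-processing capability by widening the registers. With the wider data path, an instruction that used to process a single data item can now process several at once.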
II. What is NEON
      1) NEON is a wide SIMD data processing architecture
      2) An extension of the ARM instruction set
      3) 32 registers, 64 bits wide (dual view as 16 registers, 128 bits wide)
      4) Registers are considered as vectors of elements of the same data type

      5) Data types can be: signed/unsigned 8-bit, 16-bit, 32-bit, 64-bit, and single-precision float
      6) Instructions perform the same operation in all lanes

[Figure: Introduction to NEON Programming]
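
To illustrate the statement above that instructions perform the same operation in all lanes, here is a minimal sketch in C that is not taken from the original article. It adds two byte arrays 16 lanes at a time with the `vld1q_u8`, `vaddq_u8`, and `vst1q_u8` intrinsics from `arm_neon.h`. The function name `add_u8`, the `HAVE_NEON` macro, and the scalar fallback used when the compiler does not target NEON are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Use the NEON intrinsics only when the compiler targets a NEON-capable core. */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* dst[i] = a[i] + b[i] (modulo 256) for n bytes.
 * Simplification for this sketch: n must be a multiple of 16,
 * so there is no tail loop for leftover elements. */
void add_u8(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    assert(n % 16 == 0);
    for (size_t i = 0; i < n; i += 16) {
#if HAVE_NEON
        /* One 128-bit register holds 16 unsigned 8-bit lanes. */
        uint8x16_t va = vld1q_u8(a + i);
        uint8x16_t vb = vld1q_u8(b + i);
        /* A single vector add performs all 16 additions. */
        vst1q_u8(dst + i, vaddq_u8(va, vb));
#else
        /* Scalar fallback with the same per-lane wrap-around result. */
        for (size_t lane = 0; lane < 16; ++lane)
            dst[i + lane] = (uint8_t)(a[i + lane] + b[i + lane]);
#endif
    }
}
```

Calling `add_u8(dst, a, b, 16)` with `a[i] = i` and `b[i] = 10` leaves `dst[i] = i + 10`. Each lane wraps independently, so a lane holding 250 plus 10 yields 4.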
III. NEON Instructions
1) Vectors and Scalars
  Registers hold one or more elements of the same data type.
  Vn can be used to reference either a 64-bit Dn or a 128-bit Qn register.
  A register and data type combination describes a vector of elements.
Some instructions can reference individual scalar elements
Scalar elements are referenced using the array notation Vn[x]

Array ordering is always from the least significant bit.
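
The register views and the `Vn[x]` lane numbering described above can be sketched with a small portable model. This is an illustrative assumption, not NEON code: the type `qreg_model` and the helpers `d_lane` and `q_lane` are hypothetical names that treat a D register as a 64-bit integer and a Q register as a pair of D registers, with lane 0 taken from the least significant bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one 128-bit Q register as its two 64-bit D halves. */
typedef struct {
    uint64_t d_lo; /* lower-numbered D register, bits 0..63 of the Q view   */
    uint64_t d_hi; /* higher-numbered D register, bits 64..127 of the Q view */
} qreg_model;

/* Returns element x of a D register viewed as lanes of `bits` width.
 * Lane 0 is the least significant `bits` bits. */
uint64_t d_lane(uint64_t d, unsigned bits, unsigned x)
{
    assert(bits == 8 || bits == 16 || bits == 32 || bits == 64);
    assert(x < 64 / bits);
    if (bits == 64)
        return d; /* a single 64-bit lane is the whole register */
    return (d >> (x * bits)) & ((UINT64_C(1) << bits) - 1);
}

/* Returns element x of a Q register: the low lanes come from d_lo,
 * the remaining lanes from d_hi. */
uint64_t q_lane(qreg_model q, unsigned bits, unsigned x)
{
    assert(bits == 8 || bits == 16 || bits == 32 || bits == 64);
    unsigned lanes_per_d = 64 / bits;
    assert(x < 2 * lanes_per_d);
    return x < lanes_per_d ? d_lane(q.d_lo, bits, x)
                           : d_lane(q.d_hi, bits, x - lanes_per_d);
}
```

With a D register holding `0x0807060504030201`, the 8-bit view gives `0x01` for lane 0 and `0x08` for lane 7, while the 16-bit view gives `0x0403` for lane 1. In real NEON code the `arm_neon.h` intrinsics of the `vget_lane` / `vgetq_lane` family play the role of these helpers.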
2) NEON Operations

● Arithmetic
○ VABA, VABD, VABS, VNEG, VADD, VSUB, VADDHN, VSUBHN, VHADD, VHSUB, VPADD, VPADAL, VMAX, VMIN, VPMAX, VPMIN, VCLS, VCLZ, VCNT
● Multiplication
○ VMUL, VMLA, VMLS, VQDMULL, VQDMLAL, VQDMLSL, VQDMULH
● Shifts
○ VSHL, VSHR, VSRA, VSLI, VSRI
● Comparison and Selection
○ VCEQ, VCGE, VCGT, VCLE, VCLT, VTST, VBIF, VBIT, VBSL
● Logical
○ VAND, VBIC, VEOR, VORN, VORR, VMVN
● Reciprocal Estimate/Step, Reciprocal Square Root Estimate/Step
○ VRECPE, VRSQRTE, VRECPS, VRSQRTS
● Miscellaneous
○ VMOV, VDUP, VCVT, VEXT, VREV, VSWP, VTBL, VTBX, VTRN, VUZP, VZIP
● Load/Store
○ VLD1, VLD2, VLD3, VLD4, VST1, VST2, VST3, VST4
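
As a hedged illustration of a few entries in the list above, the sketch below shows a halving add (VHADD) and a compare followed by a bitwise select (VCGT plus VBSL) on 8-bit lanes, using the `vhaddq_u8`, `vcgtq_u8`, and `vbslq_u8` intrinsics from `arm_neon.h`. The function names, the `HAVE_NEON` macro, and the scalar fallback for builds that do not target NEON are assumptions for this example and do not come from the original article.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* VHADD: dst[i] = (a[i] + b[i]) >> 1, with the sum formed at full
 * precision so the 8-bit lanes cannot overflow.
 * Sketch restriction: n must be a multiple of 16. */
void halving_add_u8(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    assert(n % 16 == 0);
    for (size_t i = 0; i < n; i += 16) {
#if HAVE_NEON
        vst1q_u8(dst + i, vhaddq_u8(vld1q_u8(a + i), vld1q_u8(b + i)));
#else
        for (size_t k = 0; k < 16; ++k)
            dst[i + k] = (uint8_t)(((unsigned)a[i + k] + b[i + k]) >> 1);
#endif
    }
}

/* VCGT + VBSL: dst[i] = a[i] > b[i] ? a[i] : b[i].
 * The compare writes all-ones into lanes where a > b and zero elsewhere;
 * the select then takes bits of a where the mask is 1 and bits of b where it is 0. */
void select_greater_u8(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    assert(n % 16 == 0);
    for (size_t i = 0; i < n; i += 16) {
#if HAVE_NEON
        uint8x16_t va = vld1q_u8(a + i);
        uint8x16_t vb = vld1q_u8(b + i);
        uint8x16_t mask = vcgtq_u8(va, vb);
        vst1q_u8(dst + i, vbslq_u8(mask, va, vb));
#else
        for (size_t k = 0; k < 16; ++k) {
            uint8_t mask = a[i + k] > b[i + k] ? 0xFF : 0x00;
            dst[i + k] = (uint8_t)((mask & a[i + k]) |
                                   ((uint8_t)~mask & b[i + k]));
        }
#endif
    }
}
```

The compare-and-select pair computes the same result as VMAX here. It is shown only because it makes the mask produced by the comparison visible, and the same pattern applies to any per-lane choice between two vectors.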
