SME1 kernel support for AI workloads using ArmNN #1220

@hpc-qti

Hi @morgolock,

Following on from PR 1206, we need support for SME1 kernels in ComputeLibrary's arm_gemm.

AI workloads that use the ArmNN library need SME1 equivalents of these kernels in ComputeLibrary. Adding them would enable SME-based acceleration for applications such as the Geekbench AI workloads (computer vision tests, NLP tests, etc.).

The kernels identified as needed for this use case are listed below.

sme1_interleaved_nomerge_bf16fp32_mopa_1VLx4VL
sme1_interleaved_nomerge_bf16fp32_mopa_2VLx2VL
sme1_interleaved_nomerge_bf16fp32_mopa_4VLx1VL
sme1_interleaved_nomerge_fp16fp32_mopa_1VLx4VL
sme1_interleaved_nomerge_fp16fp32_mopa_2VLx2VL
sme1_interleaved_nomerge_fp16fp32_mopa_4VLx1VL
sme1_interleaved_nomerge_fp16fp32fp16_mopa_1VLx4VL
sme1_interleaved_nomerge_fp16fp32fp16_mopa_2VLx2VL
sme1_interleaved_nomerge_fp16fp32fp16_mopa_4VLx1VL
sme1_interleaved_nomerge_s8q_mopa_1VLx4VL
sme1_interleaved_nomerge_s8q_mopa_2VLx2VL
sme1_interleaved_nomerge_s8q_mopa_4VLx1VL
sme1_interleaved_nomerge_s8qfp32_mopa_1VLx4VL
sme1_interleaved_nomerge_s8qfp32_mopa_2VLx2VL
sme1_interleaved_nomerge_s8qfp32_mopa_4VLx1VL
sme1_interleaved_nomerge_s8s32_mopa_1VLx4VL
sme1_interleaved_nomerge_s8s32_mopa_2VLx2VL
sme1_interleaved_nomerge_s8s32_mopa_4VLx1VL
sme1_interleaved_nomerge_u8q_mopa_1VLx4VL
sme1_interleaved_nomerge_u8q_mopa_2VLx2VL
sme1_interleaved_nomerge_u8q_mopa_4VLx1VL

sme1_gemv_bf16fp32_dot_16VL
sme1_gemv_fp16_mla_16VL
sme1_gemv_fp16fp32fp16_dot_16VL
sme1_gemv_fp32_mla_16VL
sme1_gemv_fp32bf16fp32_dot_16VL
sme1_gemv_s8qa_dot_16VL
sme1_gemv_u8qa_dot_16VL
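
To make the list easier to track, here is a minimal Python sketch that splits the requested names into fields and groups them by data-type tag. The pattern `sme1_<family>_<types>_<op>_<tile>` is an assumption inferred from the names above, not taken from ComputeLibrary documentation, and the helpers `parse_kernel_name` and `group_by_types` are hypothetical.

```python
import re

# Assumed naming pattern, inferred from the list in this issue:
#   sme1_<family>_<types>_<op>_<tile>
KERNEL_NAME = re.compile(
    r"^sme1_(?P<family>interleaved_nomerge|gemv)"
    r"_(?P<types>[a-z0-9]+)"
    r"_(?P<op>mopa|dot|mla)"
    r"_(?P<tile>\d+VL(?:x\d+VL)?)$"
)


def parse_kernel_name(name):
    """Split a kernel name into its fields; return None if it does not fit the pattern."""
    match = KERNEL_NAME.match(name)
    return match.groupdict() if match else None


def group_by_types(names):
    """Map (family, types) to the list of tile shapes, preserving input order."""
    groups = {}
    for name in names:
        fields = parse_kernel_name(name)
        if fields is None:
            raise ValueError(f"unrecognised kernel name: {name}")
        key = (fields["family"], fields["types"])
        groups.setdefault(key, []).append(fields["tile"])
    return groups


# A few names copied from the list above; paste the full list to cover all of them.
sample = [
    "sme1_interleaved_nomerge_bf16fp32_mopa_1VLx4VL",
    "sme1_interleaved_nomerge_bf16fp32_mopa_2VLx2VL",
    "sme1_interleaved_nomerge_bf16fp32_mopa_4VLx1VL",
    "sme1_gemv_fp32_mla_16VL",
    "sme1_gemv_s8qa_dot_16VL",
]

for key, tiles in group_by_types(sample).items():
    print(key, tiles)
```

Grouping by `(family, types)` makes it easy to see that each interleaved variant is requested in three tile shapes, while each gemv variant is requested in one.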
