
Incorrect live sgpr analysis #124

@kirill146


RGA 2.11, win11
kernel.cl:

__kernel __attribute__((reqd_work_group_size(8, 8, 1))) void foo(__global uchar *buf) {
  uint a = get_global_id(0);
  uint b = get_global_id(1);
  buf[a] = a - b;
}

Compiled as
rga -s opencl -c gfx1030 --O3 --livereg-sgpr . kernel.cl
Analysis result:

 Line | Rn  |          | Instruction
--------------------------------------------------------------------------------
    1 |   3 |     :::  | s_clause 0x1
    2 |   5 | ^^^^vv:  | s_load_dwordx4 s[0:3], s[4:5], 0x30
    3 |   5 | : : xx:  | s_load_dwordx2 s[4:5], s[4:5], null
    4 |   5 | : : ::v  | v_lshl_or_b32 v0, s6, 3, v0
    5 |   5 | : : :: ^ | s_mulk_i32 s7, 0xf8
    6 |   5 | : : :: : | s_waitcnt lgkmcnt(0)
    7 |   5 | : v :: : | v_add_nc_u32 v1, s2, v1
    8 |   4 | v   :: : | v_add_nc_u32 v0, s0, v0
    9 |   3 |     :: v | v_sub_nc_u32 v1, s7, v1
   10 |   2 |     ::   | v_add_nc_u32 v2, v1, v0
   11 |   3 | ^   v:   | v_add_co_u32 v0, s0, s4, v0
   12 |   2 | v    v   | v_add_co_ci_u32 v1, null, s5, 0, s0
   13 |   0 |          | global_store_byte v[0:1], v2, off
   14 |   0 |          | s_endpgm

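At the start of the kernel, register s7 contains get_group_id(1). The 5th instruction, s_mulk_i32 s7, 0xf8, both reads and writes it (the semantics of s_mulk_i32 are D = D * simm16). So s7 must be shown as live from the first instruction and marked as read and written (x) at line 5, not as write-only (^). The correct analysis should therefore start with: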

 Line | Rn  |          | Instruction
--------------------------------------------------------------------------------
    1 |   3 |     :::: | s_clause 0x1
    2 |   5 | ^^^^vv:: | s_load_dwordx4 s[0:3], s[4:5], 0x30
    3 |   5 | : : xx:: | s_load_dwordx2 s[4:5], s[4:5], null
    4 |   5 | : : ::v: | v_lshl_or_b32 v0, s6, 3, v0
    5 |   5 | : : :: x | s_mulk_i32 s7, 0xf8
   ...
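To illustrate the expected behavior, here is a minimal sketch of a backward liveness pass over an abridged version of the listing above (lines 1 to 5 and line 9). It is not RGA's implementation. The function names and the per-instruction read/write sets are my own reading of the disassembly. The only difference between the two runs is whether s_mulk_i32 is modeled as reading its destination.

```python
def live_in_sets(instrs):
    """Backward liveness over straight-line code.

    instrs: list of (text, uses, defs). Returns the live-in set of each
    instruction, using live_in = (live_out - defs) | uses.
    """
    live = set()
    result = []
    for _text, uses, defs in reversed(instrs):
        live = (live - defs) | uses
        result.append(frozenset(live))
    result.reverse()
    return result


def program(mulk_reads_dest):
    """Abridged SGPR view of the kernel (hypothetical model, not RGA output)."""
    # s_mulk_i32 computes D = D * simm16, so D is a use as well as a def.
    mulk_uses = {"s7"} if mulk_reads_dest else set()
    return [
        ("s_clause 0x1", set(), set()),
        ("s_load_dwordx4 s[0:3], s[4:5], 0x30",
         {"s4", "s5"}, {"s0", "s1", "s2", "s3"}),
        ("s_load_dwordx2 s[4:5], s[4:5], null", {"s4", "s5"}, {"s4", "s5"}),
        ("v_lshl_or_b32 v0, s6, 3, v0", {"s6"}, set()),
        ("s_mulk_i32 s7, 0xf8", mulk_uses, {"s7"}),
        ("v_sub_nc_u32 v1, s7, v1", {"s7"}, set()),
    ]


correct = live_in_sets(program(mulk_reads_dest=True))
buggy = live_in_sets(program(mulk_reads_dest=False))

print(sorted(correct[0]))  # ['s4', 's5', 's6', 's7']
print(sorted(buggy[0]))    # ['s4', 's5', 's6']
```

With the destination treated as write-only, s7 drops out of the live set before line 5, which matches the reported output. Treating it as read and written keeps s7 live from line 1, as in the expected table.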
