VectorBlox vnnx_tflite.py Fix: Problem 3 - SLICE/STRIDED_SLICE 5D Tensor Processing

3 minute read

Published:

Problem 3: SLICE/STRIDED_SLICE 5D Tensor Processing

Compilation Failure Log

struct.pack error: unpack requires a buffer of X bytes

File "vnnx_tflite.py", line XXXX

# struct field count mismatch

Root Cause

There were operations in the SPNv2 model that sliced 5-dimensional tensors [N, H, W, A, K]. For example, selecting only specific anchors or keypoints. TFLite supports 5-dimensional SLICE/STRIDED_SLICE, so it converts normally.

However, VectorBlox SDK’s SLICE subnode structure only supports 4 dimensions. The structure defines begin, end, and stride as arrays of 4 elements each:

// VectorBlox structure
struct SliceOptions {
    int32_t begin[4];   
    int32_t end[4];     
    int32_t stride[4];  
};

When trying to put 5-dimensional tensor parameters (5 elements) into this structure, struct.pack() raises an error saying “expected 4 but got 5”.

Solution Process

A method was needed to safely convert 5-dimensional tensors to 4-dimensional tensors.

Solution strategy: Fold the last two axes (A, K) into a single axis (A×K) to make it 4-dimensional.

[N, H, W, A, K][N, H, W, A×K]

Mapping 5D index (a, k) to 4D index c = a×K + k results in the same memory location:

# 5D: offset = ... + a*K + k
# 4D: offset = ... + c = ... + (a*K + k)
# → same location!

# Example:
# 5D [N,H,W,A=9,K=4] at (a=2, k=3)
# → 4D [N,H,W,C=36] at c = 2*4 + 3 = 11
# Memory offset = ... + 2*4 + 3 = ... + 11
# → identical! ✓

Therefore, axis folding maintains the memory layout while simply simplifying the notation to 4D.

Modified Code

elif subcode == "SLICE":
    begin = get_numpy_data_from_index(subop['inputs'][1], tensors, buffers)
    size = get_numpy_data_from_index(subop['inputs'][2], tensors, buffers)
    ishape = i_tensor['shape']
    
    # 5D detection
    if len(begin) > 4 or len(size) > 4 or len(ishape) > 4:
        # Pad to 5 dimensions
        begin_pad = pad_list(list(begin), 5, 0)
        size_pad = pad_list(list(size), 5)
        ishape_pad = pad_list(list(ishape), 5)
        
        # Fold last two axes
        K = ishape_pad[4]
        begin4 = [begin_pad[0], begin_pad[1], begin_pad[2], 
                  begin_pad[3] * K + begin_pad[4]]  # (a, k) → a*K + k
        size4_last = size_pad[3] * K if size_pad[4] == -1 else \
                     size_pad[3] * K + size_pad[4]
        size4 = [size_pad[0], size_pad[1], size_pad[2], size4_last]
        
        # Set 4D parameters
        sn.SliceOptions.begin = begin4
        sn.SliceOptions.end = [b + s for b, s in zip(begin4, size4)]
        sn.SliceOptions.stride = [1, 1, 1, 1]
    else:
        # 4D and below → existing logic
        [...]

Index mapping (a, k) → a×K + k converts 5D indices to 4D.

Solution Result

After modification, 5-dimensional SLICE compiles successfully:

[SLICE 5D tensor detected]
5D → 4D conversion (fold last two axes)
Adjusted begin/end/stride to 4 elements
Compilation successful ✓

Conclusion

The SLICE/STRIDED_SLICE 5D tensor processing problem occurred due to a mismatch between VectorBlox SDK’s hardware constraints (4-dimensional structure) and SPNv2’s use of 5-dimensional tensors.

The solution used a method of folding the last two axes to convert to 4 dimensions, which is a safe approach that maintains memory layout while ensuring hardware compatibility.


Series Posts

Language: 한국어 (Korean)