Features:
- Support fp32, fp16, bf16 tensors in Paddle.
- Implicitly pad with zeros if dimension is not a power of 2.
- Pure Paddle implementation with autograd support and no PyTorch/CUDA extension dependency.
cd fast-hadamard-transform
pip install -v .
import paddle
from fast_hadamard_transform import hadamard_transform
x = paddle.randn([16, 1024], dtype="float32")
y = hadamard_transform(x, scale=1.0)
def hadamard_transform(x, scale=1.0):
"""
Arguments:
x: (..., dim) Paddle Tensor.
scale: float. Multiply the output by this number.
Returns:
out: (..., dim)
Multiply each row of x by the Hadamard transform matrix.
If dim is not a power of 2, x is implicitly padded with zeros to the next power of 2.
"""
Use the Paddle benchmark script:
python benchmarks/benchmark_fast_hadamard_transform.py