adaSoftmax is currently slower in wall-clock time than the PyTorch model's forward pass (the naive counterpart of adaSoftmax), even though its sample complexity is much lower than that of the naive version.
Currently optimizing adaSoftmax with a Numba decorator, but compiling adaSoftmax in nopython mode raises an error. Work on resolving this issue is in progress.
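The cause of the nopython error is not recorded here. One common source of such errors is passing objects Numba cannot type (for example PyTorch tensors) into the compiled function. The sketch below assumes that is the case and shows a kernel written only against NumPy arrays. The function name `softmax_kernel` and its arguments are hypothetical, and it computes a plain (naive) softmax rather than adaSoftmax itself. It only illustrates the shape of a nopython-compatible function.

```python
import numpy as np

try:
    from numba import njit  # equivalent to jit(nopython=True)
except ImportError:
    # Fallback so the sketch still runs without Numba installed (no speed-up).
    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit
def softmax_kernel(A, x):
    """Hypothetical kernel: softmax of A @ x using only NumPy arrays and scalars."""
    n, d = A.shape
    logits = np.empty(n, dtype=np.float64)
    for i in range(n):
        acc = 0.0
        for j in range(d):
            acc += A[i, j] * x[j]  # explicit loops compile well in nopython mode
        logits[i] = acc
    shifted = np.exp(logits - logits.max())  # subtract the max for numerical stability
    return shifted / shifted.sum()
```

If the inputs live in PyTorch tensors, they can be converted before the call with `tensor.detach().cpu().numpy()`, assuming they are float64 or are cast to it first. Note that the first call to a Numba-compiled function includes compilation time, so timing comparisons should exclude it.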