1

Let's say we have an ordered 1D tensor of ints/longs t.

I want to produce a new tensor a with size max(t) where each term a[i] contains the first occurrence of the value i in the tensor t.

We could easily compute the tensor a using standard Python code, but it would be too slow for large input as the one I'm using.

I'm looking for a fast solution that can run on GPU using the CUDA backend of PyTorch or simply a fast solution.

1 Answer 1

1

You can do this:

def first_occurrence_indices(t):
    max_val = torch.max(t) + 1
    a = torch.full((max_val,), -1, dtype=torch.long, device=t.device)

    # Get the first occurrence index using `scatter_reduce`
    indices = torch.arange(len(t), device=t.device)
    a.scatter_reduce_(0, t, indices, reduce="amin", include_self=False)

    return a

Now you can use it like so:

>>> t = torch.tensor([2, 1, 3, 2, 3, 0, 1, 5], device="cuda")
>>> first_occurrence_indices(t)
tensor([ 5,  1,  0,  2, -1,  7])

Notice that "4" is not in our tensor, so the index is still the default fill value of -1.

That said, I would test to see how efficient scatter_reduce is with your full tensors.

Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.