0 votes
0 answers
22 views

I am trying to implement this rewrite rule from the TASO paper with ONNX Script rewriter. However, I cannot figure out how to implement a pattern with multiple outputs X and Y. The ONNX Script does ...
Anita Hailey
2 votes
1 answer
75 views

I am trying to follow along with this webpage: https://jtr13.github.io/cc21fall2/tutorial-on-r-torch-package.html I am trying to understand R's implementation of PyTorch. I am having some trouble with ...
Huy Pham
  • 173
1 vote
2 answers
126 views

After converting module A to CPU, does the original parameter tensor still stay on the GPU? When is it released? Is it wrong to reuse the parameter? My code: import torch.nn as nn class A(nn.Module): ...
jiwei zhang
2 votes
1 answer
25 views

In Torch, .view() reshapes the tensor. However, there are multiple ways to reshape a multi-dimensional tensor to a target shape. How does it decide between those different ways? For example, in Torch, ...
Sanchit
  • 21
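
For a contiguous tensor, `.view()` does not move any data: it reinterprets the same flat, row-major buffer (last index varies fastest) under the new shape, so the result is fully determined by the target shape. The sketch below mimics that rule in pure Python without calling torch; `view_row_major` is a hypothetical helper name for illustration, not a PyTorch function.

```python
from math import prod

def view_row_major(flat, shape):
    """Reinterpret a flat row-major buffer as nested lists of the given shape."""
    if prod(shape) != len(flat):
        raise ValueError("shape is incompatible with the number of elements")
    if len(shape) == 1:
        return list(flat)
    step = prod(shape[1:])  # elements in one slice along the first axis
    return [view_row_major(flat[i * step:(i + 1) * step], shape[1:])
            for i in range(shape[0])]

flat = list(range(6))                  # stands in for contiguous storage 0..5
as_2x3 = view_row_major(flat, (2, 3))
as_3x2 = view_row_major(flat, (3, 2))
print(as_2x3)  # [[0, 1, 2], [3, 4, 5]]
print(as_3x2)  # [[0, 1], [2, 3], [4, 5]]
```

Both shapes read the elements in the same order; only the grouping changes.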
0 votes
0 answers
25 views

Using the einsteinpy package for Python, I am defining the electromagnetic tensor (or any other arbitrary tensor). I define it as a 'uu' tensor using the BaseRelativityTensor class. ...
ASarkar
  • 559
0 votes
0 answers
66 views

I have a matrix A of size n^2 by n^2, and I want to know whether, for a given accuracy (or a given number R), there is a way to express A as the sum of Bi kron Ci for i=1...R, where Bi and Ci are n by n, i.e. ...
Brice
  • 9
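
One standard route to this kind of decomposition, not taken from the question itself, is the rearrangement of Van Loan and Pitsianis: reorder the n by n blocks of A into an n^2 by n^2 matrix in which every Kronecker product B kron C becomes the rank-one outer product of the flattened B and C. The pure-Python sketch below only checks that identity on a small example; `kron` and `rearrange` are hypothetical helper names.

```python
def kron(B, C):
    """Kronecker product of two n-by-n matrices given as nested lists."""
    n = len(B)
    return [[B[i][j] * C[k][l] for j in range(n) for l in range(n)]
            for i in range(n) for k in range(n)]

def rearrange(A, n):
    """Row (i, j) of the result is block (i, j) of A, flattened row-major."""
    return [[A[i * n + k][j * n + l] for k in range(n) for l in range(n)]
            for i in range(n) for j in range(n)]

B = [[1, 2], [3, 4]]
C = [[0, 1], [1, 0]]
A = kron(B, C)
RA = rearrange(A, 2)

# Outer product of the row-major flattenings of B and C.
flat_B = [b for row in B for b in row]
flat_C = [c for row in C for c in row]
outer = [[b * c for c in flat_C] for b in flat_B]

print(RA == outer)  # True
```

Because the rearrangement only permutes entries, it preserves the Frobenius norm, so a truncated SVD of the rearranged matrix (for example via `numpy.linalg.svd`) with R terms gives the best R-term Kronecker sum in that norm, with each singular vector pair reshaped back into one (Bi, Ci).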
1 vote
1 answer
316 views

I am writing PTX assembly code in CUDA C++ for research. This is my setup: I downloaded the latest CUDA C++ toolkit (13.0) yesterday on WSL Linux. The local compilation environment does not ...
Junhao Liu
3 votes
1 answer
66 views

I have the following code in Python 3.11 using PyTorch: arr = np.array([[0, 0, 0], [0, -1, 0], [0, 1, 0]]) arr_tensor = torch.tensor(arr, dtype=torch.float32, device=...
BBB
  • 105
0 votes
1 answer
36 views

When I run: from transformers import AutoProcessor, BarkModel import os from scipy.io.wavfile import write as write_wav CUDA_VISIBLE_DEVICES=0 os.environ["SUNO_OFFLOAD_CPU"] = "True"...
rocky
  • 1
1 vote
0 answers
106 views

This is a crosspost from the Math Exchange forum. It seems to me that this question can be approached in two different ways, so I am curious about different approaches. https://math.stackexchange.com/...
glowl
  • 51
1 vote
0 answers
41 views

I have a model that given a configuration, or state (of a Rubik's cube, but whatever, it is a sequence of integers) generates a movement (from 0 to 5). This movement can be used to bring the ...
Nikio
  • 111
1 vote
1 answer
63 views

I have a vector, M, with size N and a tensor, d, with size NxNxD. My aim is to perform the matrix multiplication M*d[i,:,:] for each i to get a new matrix with size NxD. Now I could just do it like this: ...
william paine
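
The contraction described here is out[i][k] = sum over j of M[j] * d[i][j][k]. The pure-Python sketch below spells it out with nested lists as a reference; `batched_vec_mat` is a hypothetical helper name. With PyTorch tensors of the stated shapes, the same contraction can be written as `torch.einsum('j,ijk->ik', M, d)`.

```python
def batched_vec_mat(M, d):
    """out[i][k] = sum_j M[j] * d[i][j][k]; shapes (N,), (N, N, D) -> (N, D)."""
    N = len(M)
    D = len(d[0][0])
    return [[sum(M[j] * d[i][j][k] for j in range(N)) for k in range(D)]
            for i in range(len(d))]

M = [1, 2]                        # N = 2
d = [[[1, 0, 2], [0, 1, 3]],      # d[0] is N x D with D = 3
     [[2, 2, 2], [1, 1, 1]]]      # d[1]
out = batched_vec_mat(M, d)
print(out)  # [[1, 2, 8], [4, 4, 4]]
```

Writing the index pattern out first makes it easy to check a vectorized version against the loop on small inputs.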
0 votes
1 answer
38 views

Why does TensorDataset divide the data into minibatches? For example, when given a 2D array, instead of yielding 2D tensors as batches, it sets the required batches to be minibatches, and its actual ...
J. Doe
  • 305
0 votes
1 answer
343 views

I'm confused about what exactly is handled by CuTe and what by CUTLASS. From my understanding, CUTLASS handles the following: GEMM computation of CuTe tensors; communication between CPU and GPU; abstract memory ...
jonithani123
1 vote
0 answers
43 views

My code involves slicing large tensors on the CPU by index and asynchronously transmitting them back to the GPU. However, through the Profiler debugging tool, I found that this step would seriously ...
Ponytail
