Newest 'parallel-processing' Questions

2 votes

1 answer

55 views

Dask client connects successfully but no workers are available

I am using Dask for some processing. The client starts successfully, but I am seeing zero workers. This is how I am creating the client: client = Client("tls://localhost:xxxx") This is the ...

martian muonhunter

23

asked Dec 25, 2025 at 11:20

-3 votes

0 answers

47 views

How do multiple kernels and streams affect gpu utilisation if single kernel is not enough [closed]

I’m trying to reason about GPU utilisation and I feel like I’m missing something. If kernels in the default stream run sequentially, then how do we actually fully utilise the GPU? A single kernel ...

Pratisha Bista

1

asked Dec 22, 2025 at 13:25

1 vote

1 answer

148 views

Why does placing the I/O task before the CPU-bound task run faster than placing it after the CPU-bound task?

using System.Diagnostics; const int TASKS = 100; var mainSw = Stopwatch.StartNew(); var tasks = Enumerable.Range(0, TASKS).Select(i => Task.Run(async () => { await Task.Delay(...

yuri

29

asked Dec 17, 2025 at 10:58

Best practices

0 votes

5 replies

111 views

Parallel.ForEach Returning Values

I need to process a list of objects (not the same as those shown in the sample), which I thought could be greatly sped up by running it in a Parallel.ForEach loop. However, the result is not what I expected. ...

Jlong101

3

asked Dec 13, 2025 at 3:24

Advice

1 vote

0 replies

45 views

What is the best pattern for triggering N sub-workflows in parallel and resuming main workflow when all complete?

I need to trigger a dynamic number of sub-workflows in parallel (around 100) and wait for ALL of them to complete before continuing the main workflow. I’ve implemented a solution but I’m wondering if ...

Michal

121

asked Dec 10, 2025 at 8:09

-4 votes

0 answers

44 views

What is the Global Interpreter Lock (GIL) in Python and why does it prevent true multithreading? [duplicate]

I’ve been reading about Python’s Global Interpreter Lock (GIL), and I’m a bit confused about how it actually works behind the scenes. From what I understand, the GIL allows only one thread to execute ...

Yash Gupta

1

asked Dec 7, 2025 at 12:42

Advice

2 votes

2 replies

63 views

Efficient MPI Parallelization Strategies for Localized PDE Subproblems within a Globally Decomposed Domain

I am working on a global PDE problem that is solved using a standard domain-decomposition strategy (e.g., Scotch, METIS). This part of the computation is well balanced across all MPI processes. ...

hrx71

1

asked Dec 6, 2025 at 12:46

Tooling

1 vote

3 replies

77 views

Using persistent-memory gawk, how can variables be created to be local and isolated from other execution instances?

The idea of Persistent-Memory gawk is fabulous because it improves the performance, size, and clarity of many scripts on static and reference data. However, I have a significant problem in adopting ...

Sergio Albert

1

asked Dec 1, 2025 at 15:05

1 vote

0 answers

84 views

How to share a large CustomObject to workers in Python multiprocessing on Windows (spawn)?

I'm trying to run calculations using multiple cores in Python on multiple platforms (Linux, macOS, Windows). I need to pass a large CustomClass Object and a dict (both readonly) to all workers. So far ...

polyte

459

asked Nov 30, 2025 at 13:00

0 votes

0 answers

46 views

Attribution Error when using Huggingface transformers Trainer with FSDP

I am now trying to use FSDP in Huggingface transformers Trainer. The training script is something like train_dataset = Mydataset(...) args = TrainingArguments(...) model = LlamaForCausalLM....

xuehao-049

11

asked Nov 28, 2025 at 4:11

0 votes

0 answers

22 views

OptimisticLockingException when using multiInstanceLoopCharacteristics for parallel execution of subprocess

I have the following process definition that I try to execute on Camunda 7.24 / CibSeven 2.1, which currently logs many OptimisticLockingExceptions during execution. I could already trace it down that it ...

BigMichi1

308

asked Nov 21, 2025 at 14:30

0 votes

1 answer

126 views

Why are items not written to console immediately after being processed?

I have the following C# code : var rand = new Random(1); var range = Enumerable.Range(1, 8); var partition = Partitioner.Create(range, EnumerablePartitionerOptions.NoBuffering); foreach (var x in ...

tigrou

4,596

asked Nov 20, 2025 at 12:37

0 votes

1 answer

100 views

Taking advantage of memory contiguousness in HLSL

This is a bit of a slog so bear with me. I'm currently writing a 3D S(moothed) P(article) H(ydrodynamics) simulation in Unity with a parallel HLSL backend. It's a Lagrangian method of fluid simulation,...

Ben Williams

13

asked Nov 18, 2025 at 14:50

Tooling

0 votes

0 replies

36 views

ComfyUI + Flux 1 dev + limited RAM + same workflow: With 2 GPUs?

I am running Flux 1 dev text to image model through ComfyUI in Kaggle. Everything works but I noticed that Kaggle offers a second GPU inside the notebook. If I try to run two instances of the ComfyUI ...

Bram Fran

133

asked Nov 17, 2025 at 15:03

1 vote

0 answers

81 views

Intuition over TBB parallel scan/parallel prefix requirements

I am reading a paragraph about the tbb::parallel_scan algorithm from the book Intel Threading Building Blocks, and I understood what the operation does serially, but I am not understanding what are ...

luczzz

446

asked Nov 14, 2025 at 10:48
