How can I limit Parallel.ForEach?

I have a Parallel.ForEach() loop that downloads some webpages. My bandwidth is limited, so I can download only x pages at a time, but Parallel.ForEach runs through the whole list of desired webpages at once.

Is there a way to limit the number of threads, or apply any other limiter, while running Parallel.ForEach?

Demo code:

Parallel.ForEach(listOfWebpages, webpage => {
  Download(webpage);
});

The real task has nothing to do with webpages, so creative web crawling solutions won't help.

@jKlaus If the list isn't modified e.g. it's just a set of URLs, I can't really see the issue?
@Shiv, given enough time you will... Count your number of executions and compare it to the count of the list.
@jKlaus What are you saying will go wrong?
@jKlaus you are modifying a non-threadsafe element (the integer). I would expect it to not work in that scenario. The OP on the other hand is not modifying anything that needs to be threadsafe.
@jKlaus Here is an example of Parallel.ForEach that sets the count correctly: dotnetfiddle.net/moqP2C. MSDN link: msdn.microsoft.com/en-us/library/dd997393(v=vs.110).aspx

Nick Butler

You can specify a MaxDegreeOfParallelism in a ParallelOptions parameter:

Parallel.ForEach(
    listOfWebpages,
    new ParallelOptions { MaxDegreeOfParallelism = 4 },
    webpage => { Download(webpage); }
);

MSDN: Parallel.ForEach

MSDN: ParallelOptions.MaxDegreeOfParallelism
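
To see the limit in action, here is a minimal sketch that counts how many loop bodies run at the same time. The names ThrottleDemo and MeasurePeakConcurrency are made up for illustration, and Thread.Sleep stands in for the Download(webpage) call from the question. With MaxDegreeOfParallelism = 4 the observed peak should not exceed 4, although it may be lower if the scheduler decides to use fewer threads.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleDemo
{
    // Hypothetical helper: runs a fake workload over `itemCount` items and
    // returns the highest number of loop bodies seen running at once.
    public static int MeasurePeakConcurrency(int itemCount, int maxDegree)
    {
        int running = 0; // loop bodies currently executing
        int peak = 0;    // highest value of `running` observed so far

        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };

        Parallel.ForEach(Enumerable.Range(0, itemCount), options, _ =>
        {
            int now = Interlocked.Increment(ref running);

            // Lock-free "peak = max(peak, now)": retry until either another
            // thread has recorded a value >= now, or our swap succeeds.
            int seen;
            while (now > (seen = Volatile.Read(ref peak)) &&
                   Interlocked.CompareExchange(ref peak, now, seen) != seen)
            {
            }

            Thread.Sleep(20); // stand-in for Download(webpage)
            Interlocked.Decrement(ref running);
        });

        return peak;
    }

    public static void Main()
    {
        int peak = MeasurePeakConcurrency(itemCount: 40, maxDegree: 4);
        Console.WriteLine($"Peak concurrency: {peak}");
    }
}
```

Every item is still processed exactly once. The option only caps how many iterations are in flight at any moment, not how many iterations each thread ends up handling.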


It may not apply to this particular case, but I figured I'd throw it out there in case anyone wanders across this and finds it useful. Here I am using 75% (rounded up) of the processor count: var opts = new ParallelOptions { MaxDegreeOfParallelism = Convert.ToInt32(Math.Ceiling((Environment.ProcessorCount * 0.75) * 1.0)) };
Just to save anyone else having to look it up in the documentation, passing a value of -1 is the same as not specifying it at all: "If [the value] is -1, there is no limit on the number of concurrently running operations"
It's not clear to me from the documentation: does setting MaxDegreeOfParallelism to 4 (for instance) mean there will be 4 threads each running a quarter of the loop iterations (one round of 4 threads dispatched), or does each thread still do one loop iteration at a time and we're just limiting how many run in parallel?
To be clear, cores and threads are not the same thing. Depending on the CPU, the number of threads per core varies; it is usually 2 per core. For example, a 4-core CPU with 2 threads per core gives you a maximum of 8 threads. To adjust @jKlaus's comment: var opts = new ParallelOptions { MaxDegreeOfParallelism = Convert.ToInt32(Math.Ceiling((Environment.ProcessorCount * 0.75) * 2.0)) };. Link to threads vs cores: askubuntu.com/questions/668538/…
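
The processor-count arithmetic from the comments can be sketched as a small helper. DegreeHelper and FractionOfProcessors are hypothetical names, not part of the framework. One thing to check before adding a threads-per-core multiplier: Environment.ProcessorCount is documented as returning the number of logical processors available to the process, so on a CPU with two hardware threads per core it most likely already includes them. The sketch below therefore applies only the fraction.

```csharp
using System;
using System.Threading.Tasks;

public static class DegreeHelper
{
    // Hypothetical helper: a fraction of the given processor count,
    // rounded up, and never less than 1.
    public static int FractionOfProcessors(int processorCount, double fraction)
    {
        int degree = (int)Math.Ceiling(processorCount * fraction);
        return Math.Max(1, degree);
    }

    public static void Main()
    {
        int degree = FractionOfProcessors(Environment.ProcessorCount, 0.75);
        var opts = new ParallelOptions { MaxDegreeOfParallelism = degree };

        Console.WriteLine(
            $"{Environment.ProcessorCount} logical processors -> " +
            $"MaxDegreeOfParallelism = {opts.MaxDegreeOfParallelism}");
    }
}
```

For example, FractionOfProcessors(8, 0.75) gives 6 and FractionOfProcessors(1, 0.75) gives 1, so the option never receives 0, which ParallelOptions rejects.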
Uwe Keim

You can use ParallelOptions and set MaxDegreeOfParallelism to limit the number of concurrent threads:

Parallel.ForEach(
    listOfWebpages,
    new ParallelOptions { MaxDegreeOfParallelism = 2 },
    webpage => { Download(webpage); }
);

Richard

Use another overload of Parallel.ForEach that takes a ParallelOptions instance, and set MaxDegreeOfParallelism to limit how many iterations execute in parallel.


user3496060

And for VB.NET users (the syntax is weird and difficult to find):

Parallel.ForEach(listOfWebpages, New ParallelOptions() With {.MaxDegreeOfParallelism = 8}, Sub(webpage)
    ' ... loop body ...
End Sub)