Proper use of 'yield return'

c# yield-return

The yield keyword is one of those keywords in C# that continues to mystify me, and I've never been confident that I'm using it correctly.

Of the following two pieces of code, which is preferred, and why?

Version 1: Using yield return

public static IEnumerable<Product> GetAllProducts()
{
    using (AdventureWorksEntities db = new AdventureWorksEntities())
    {
        var products = from product in db.Product
                       select product;

        foreach (Product product in products)
        {
            yield return product;
        }
    }
}

Version 2: Return the list

public static IEnumerable<Product> GetAllProducts()
{
    using (AdventureWorksEntities db = new AdventureWorksEntities())
    {
        var products = from product in db.Product
                       select product;

        return products.ToList<Product>();
    }
}

yield is tied to IEnumerable<T> and its kin. It is, in a way, a form of lazy evaluation.

Here is a great answer to a similar question: stackoverflow.com/questions/15381708/…

Here's a good usage example: stackoverflow.com/questions/3392612/…

I see a good case for using yield return if the code that iterates through the results of GetAllProducts() allows the user a chance to prematurely cancel the processing.

I found this thread really helpful: programmers.stackexchange.com/a/97350/148944

abelenky

I tend to use yield-return when I calculate the next item in the list (or even the next group of items).

Using your Version 2, you must have the complete list before returning. By using yield-return, you really only need to have the next item before returning.

Among other things, this helps spread the computational cost of complex calculations over a larger time-frame. For example, if the list is hooked up to a GUI and the user never goes to the last page, you never calculate the final items in the list.

Another case where yield-return is preferable is if the IEnumerable represents an infinite set. Consider the list of Prime Numbers, or an infinite list of random numbers. You can never return the full IEnumerable at once, so you use yield-return to return the list incrementally.
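To illustrate the infinite-set case, here is a minimal sketch of a prime number sequence. The names PrimeDemo and Primes are made up for this example, and the trial-division approach is just one simple way to do it. The caller decides how many items to pull; nothing beyond that is ever computed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrimeDemo
{
    // Hypothetical helper: yields primes one at a time, for as long as the
    // caller keeps asking (unbounded in principle, limited by int range).
    public static IEnumerable<int> Primes()
    {
        var found = new List<int>();   // primes discovered so far
        for (int candidate = 2; ; candidate++)
        {
            bool isPrime = true;
            foreach (int p in found)
            {
                if (p > candidate / p) break;              // p * p > candidate
                if (candidate % p == 0) { isPrime = false; break; }
            }
            if (isPrime)
            {
                found.Add(candidate);
                yield return candidate;   // pause here until the next request
            }
        }
    }

    public static void Main()
    {
        // Take(5) stops the enumeration after five items.
        Console.WriteLine(string.Join(", ", Primes().Take(5)));   // 2, 3, 5, 7, 11
    }
}
```

Returning a List here is impossible, because the loop never terminates on its own.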

In your particular example, you have the full list of products, so I'd use Version 2.

I'd nitpick that the example in your third paragraph conflates two benefits: 1) it spreads out computational cost (sometimes a benefit, sometimes not), and 2) it may lazily avoid computation indefinitely in many use cases. You fail to mention the potential drawback that it keeps intermediate state around. If you have significant amounts of intermediate state (say, a HashSet for duplicate elimination), then the use of yield can inflate your memory footprint.
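The intermediate-state drawback can be sketched with a duplicate-eliminating iterator. DistinctDemo and DistinctLazy are hypothetical names for this illustration; the point is only that the HashSet stays alive for as long as the enumeration is in progress.

```csharp
using System;
using System.Collections.Generic;

public static class DistinctDemo
{
    // Hypothetical duplicate-eliminating iterator. The 'seen' set is the
    // intermediate state: it grows with every distinct item and is kept
    // alive until the caller finishes or abandons the enumeration.
    public static IEnumerable<T> DistinctLazy<T>(IEnumerable<T> source)
    {
        var seen = new HashSet<T>();
        foreach (T item in source)
        {
            if (seen.Add(item))      // Add returns false for duplicates
                yield return item;
        }
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", DistinctLazy(new[] { 1, 2, 2, 3, 1 })));   // 1, 2, 3
    }
}
```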

Also, if each individual element is very large, but they only need to be accessed sequentially, a yield is better.

And finally... there's a slightly wonky but occasionally effective technique for using yield to write asynchronous code in a very serialized form.

Another example that might be interesting is reading rather large CSV files. You want to read each element, but you also want to abstract your dependency away. Yield returning an IEnumerable<> will allow you to return each row and process each row individually. No need to read a 10 MB file into memory. Just one line at a time.
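A minimal sketch of the CSV case, assuming a simple comma-separated file without quoted fields. CsvDemo and ReadRows are names invented for this example, and the Split(',') parsing is deliberately naive.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CsvDemo
{
    // Hypothetical reader: yields one parsed row at a time, so only the
    // current line is held in memory.
    public static IEnumerable<string[]> ReadRows(string path)
    {
        using (var reader = new StreamReader(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return line.Split(',');   // naive parsing, no quoted fields
            }
        } // the reader is disposed when enumeration ends or the enumerator is disposed
    }

    public static void Main()
    {
        // Write a small temporary file so the sketch can run on its own.
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, new[] { "id,name", "1,Widget", "2,Gadget" });

        foreach (string[] row in ReadRows(path))
            Console.WriteLine(row[1]);

        File.Delete(path);
    }
}
```

The caller only sees an IEnumerable of rows and does not need to know that a file is involved.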

Yield return seems to be shorthand for writing your own custom iterator class (implement IEnumerator). Hence, the mentioned benefits also apply to custom iterator classes. Anyway, both constructs keep intermediate state. In its most simple form it's about holding a reference to the current object.
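To make the "shorthand" point concrete, here is a sketch that puts an iterator method next to a roughly equivalent hand-written class. All names (CountdownDemo, Countdown, CountdownWithYield) are invented for the comparison, and the hand-written version is simplified; it is not what the compiler actually generates.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class CountdownDemo
{
    // Iterator method: the compiler generates the enumerator class for us.
    public static IEnumerable<int> CountdownWithYield(int from)
    {
        for (int i = from; i > 0; i--)
            yield return i;
    }

    // Roughly equivalent hand-written version (simplified).
    public sealed class Countdown : IEnumerable<int>
    {
        private readonly int _from;
        public Countdown(int from) { _from = from; }

        public IEnumerator<int> GetEnumerator() { return new Enumerator(_from); }
        IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

        private sealed class Enumerator : IEnumerator<int>
        {
            private int _next;      // intermediate state: the next value to hand out
            private int _current;   // the value most recently handed out

            public Enumerator(int from) { _next = from; }

            public int Current { get { return _current; } }
            object IEnumerator.Current { get { return _current; } }

            public bool MoveNext()
            {
                if (_next <= 0) return false;
                _current = _next--;
                return true;
            }

            public void Reset() { throw new NotSupportedException(); }
            public void Dispose() { }
        }
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", CountdownWithYield(3)));   // 3, 2, 1
        Console.WriteLine(string.Join(", ", new Countdown(3)));        // 3, 2, 1
    }
}
```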

Ajay

Populating a temporary list is like downloading the whole video, whereas using yield is like streaming that video.

I am perfectly aware that this answer is not a technical answer but I believe that the resemblance between yield and video streaming serves as a good example when understanding the yield keyword. Everything technical has already been said about this subject, so I tried to explain "in other words". Is there a community rule that says you cannot explain your ideas in non-technical terms?

Kache

As a conceptual example for understanding when you ought to use yield, let's say the method ConsumeLoop() processes the items returned/yielded by ProduceList():

void ConsumeLoop() {
    foreach (Consumable item in ProduceList())        // might have to wait here
        item.Consume();
}

IEnumerable<Consumable> ProduceList() {
    while (KeepProducing())
        yield return ProduceExpensiveConsumable();    // expensive
}

Without yield, the call to ProduceList() might take a long time because you have to complete the list before returning:

//pseudo-assembly
Produce consumable[0]                   // expensive operation, e.g. disk I/O
Produce consumable[1]                   // waiting...
Produce consumable[2]                   // waiting...
Produce consumable[3]                   // completed the consumable list
Consume consumable[0]                   // start consuming
Consume consumable[1]
Consume consumable[2]
Consume consumable[3]

Using yield, it becomes rearranged, sort of interleaved:

//pseudo-assembly
Produce consumable[0]
Consume consumable[0]                   // immediately yield & Consume
Produce consumable[1]                   // ConsumeLoop iterates, requesting next item
Consume consumable[1]                   // consume next
Produce consumable[2]
Consume consumable[2]                   // consume next
Produce consumable[3]
Consume consumable[3]                   // consume next
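The two orderings above can be reproduced with a small runnable sketch. InterleaveDemo and its methods are hypothetical stand-ins for ProduceList() and ConsumeLoop(); a shared log records the order in which things happen.

```csharp
using System;
using System.Collections.Generic;

public static class InterleaveDemo
{
    // Lazy producer: each item is produced only when the consumer asks for it.
    static IEnumerable<int> ProduceLazily(int count, List<string> log)
    {
        for (int i = 0; i < count; i++)
        {
            log.Add("Produce " + i);
            yield return i;
        }
    }

    // Eager producer: the whole list is built before anything is returned.
    static List<int> ProduceEagerly(int count, List<string> log)
    {
        var items = new List<int>();
        for (int i = 0; i < count; i++)
        {
            log.Add("Produce " + i);
            items.Add(i);
        }
        return items;
    }

    // Consumes two items and returns the order of events.
    public static List<string> Run(bool lazy)
    {
        var log = new List<string>();
        IEnumerable<int> items = lazy ? ProduceLazily(2, log) : ProduceEagerly(2, log);
        foreach (int item in items)
            log.Add("Consume " + item);
        return log;
    }

    public static void Main()
    {
        // Produce 0 | Consume 0 | Produce 1 | Consume 1
        Console.WriteLine(string.Join(" | ", Run(lazy: true)));
        // Produce 0 | Produce 1 | Consume 0 | Consume 1
        Console.WriteLine(string.Join(" | ", Run(lazy: false)));
    }
}
```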

And lastly, as many before have already suggested, you should use Version 2 because you already have the completed list anyway.

Adam W. McKinley

I know this is an old question, but I'd like to offer one example of how the yield keyword can be creatively used. I have really benefited from this technique. Hopefully this will be of assistance to anyone else who stumbles upon this question.

Note: Don't think about the yield keyword as merely being another way to build a collection. A big part of the power of yield comes in the fact that execution is paused in your method or property until the calling code iterates over the next value. Here's my example:

Using the yield keyword (alongside Rob Eisenberg's Caliburn.Micro coroutines implementation) allows me to express an asynchronous call to a web service like this:

public IEnumerable<IResult> HandleButtonClick() {
    yield return Show.Busy();

    var loginCall = new LoginResult(wsClient, Username, Password);
    yield return loginCall;
    this.IsLoggedIn = loginCall.Success;

    yield return Show.NotBusy();
}

What this will do is turn my BusyIndicator on, call the Login method on my web service, set my IsLoggedIn flag to the return value, and then turn the BusyIndicator back off.

Here's how this works: IResult has an Execute method and a Completed event. Caliburn.Micro grabs the IEnumerator from the call to HandleButtonClick() and passes it into a Coroutine.BeginExecute method. The BeginExecute method starts iterating through the IResults. When the first IResult is returned, execution is paused inside HandleButtonClick(), and BeginExecute() attaches an event handler to the Completed event and calls Execute(). IResult.Execute() can perform either a synchronous or an asynchronous task and fires the Completed event when it's done.

LoginResult looks something like this:

public class LoginResult : IResult {
    // Constructor to set private members...

    public void Execute(ActionExecutionContext context) {
        wsClient.LoginCompleted += (sender, e) => {
            this.Success = e.Result;
            Completed(this, new ResultCompletionEventArgs());
        };
        wsClient.Login(username, password);
    }

    public event EventHandler<ResultCompletionEventArgs> Completed = delegate { };
    public bool Success { get; private set; }
}

It may help to set up something like this and step through the execution to watch what's going on.
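If you want something small to step through, here is a stripped-down sketch of the idea. It is not Caliburn.Micro's actual API: IStep, LogStep and MiniCoroutine are hypothetical, simplified stand-ins for IResult and Coroutine.BeginExecute, and the steps here complete synchronously.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for an IResult-style step.
public interface IStep
{
    void Execute();
    event EventHandler Completed;
}

// A step that records a message and then reports completion.
public sealed class LogStep : IStep
{
    private readonly string _message;
    private readonly List<string> _log;
    public LogStep(string message, List<string> log) { _message = message; _log = log; }

    public event EventHandler Completed = delegate { };

    public void Execute()
    {
        _log.Add(_message);
        Completed(this, EventArgs.Empty);   // synchronous here; a real step could fire later
    }
}

public static class MiniCoroutine
{
    // Advances the iterator one step at a time. The iterator method stays
    // paused at its yield return until the current step reports completion.
    // Note: synchronous steps recurse, so this sketch suits short workflows only.
    public static void Run(IEnumerator<IStep> steps)
    {
        if (!steps.MoveNext()) { steps.Dispose(); return; }

        IStep step = steps.Current;
        EventHandler handler = null;
        handler = (sender, e) =>
        {
            step.Completed -= handler;
            Run(steps);                     // resume the iterator method
        };
        step.Completed += handler;
        step.Execute();
    }
}

public static class CoroutineDemo
{
    public static IEnumerable<IStep> Workflow(List<string> log)
    {
        yield return new LogStep("busy on", log);
        log.Add("between steps");           // runs only after the first step completed
        yield return new LogStep("busy off", log);
    }

    public static void Main()
    {
        var log = new List<string>();
        MiniCoroutine.Run(Workflow(log).GetEnumerator());
        Console.WriteLine(string.Join(" | ", log));   // busy on | between steps | busy off
    }
}
```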

Hope this helps someone out! I've really enjoyed exploring the different ways yield can be used.

Your code sample is an excellent example of how to use yield OUTSIDE of a for or foreach block. Most examples show yield return within a loop. Very helpful, as I was just about to ask the question on SO: how to use yield outside of a loop!

It has never occurred to me to use yield in this way. It seems like an elegant way to emulate the async/await pattern (which I assume would be used instead of yield if this were rewritten today). Have you found that these creative uses of yield have yielded (no pun intended) diminishing returns over the years as C# has evolved since you answered this question? Or are you still coming up with modernized clever use-cases such as this? And if so, would you mind sharing another interesting scenario for us?

This is reactive programming in a nutshell (just less elegant and flexible).

pogosama

Yield return can be very powerful for algorithms where you need to iterate through millions of objects. Consider the following example where you need to calculate possible trips for ride sharing. First we generate possible trips:

    static IEnumerable<Trip> CreatePossibleTrips()
    {
        for (int i = 0; i < 1000000; i++)
        {
            yield return new Trip
            {
                Id = i.ToString(),
                Driver = new Driver { Id = i.ToString() }
            };
        }
    }

Then iterate through each trip:

    static void Main(string[] args)
    {
        foreach (var trip in CreatePossibleTrips())
        {
            // possible trip is actually calculated only at this point, because of yield
            if (IsTripGood(trip))
            {
                // match good trip
            }
        }
    }

If you use a List instead of yield, you will need to allocate 1 million objects in memory (~190 MB), and this simple example will take ~1400 ms to run. However, if you use yield, you don't need to hold all these temporary objects in memory at once, and the algorithm gets significantly faster: this example takes only ~400 ms to run, with almost no additional memory consumption.

Under the covers, what is yield? I would have thought it was a list, so how would it improve memory usage?

@rolls yield works under the covers by implementing a state machine internally. Here's an SO answer with 3 detailed MSDN blog posts that explain the implementation in great detail. Written by Raymond Chen @ MSFT

maples

This is what Chris Sells says about those statements in The C# Programming Language:

I sometimes forget that yield return is not the same as return, in that the code after a yield return can be executed. For example, the code after the first return here can never be executed:

int F() {
    return 1;
    return 2; // Can never be executed
}

In contrast, the code after the first yield return here can be executed:

IEnumerable F() {
    yield return 1;
    yield return 2; // Can be executed
}

This often bites me in an if statement:

IEnumerable F() {
    if(...) {
        yield return 1; // I mean this to be the only thing returned
    }
    yield return 2; // Oops!
}

In these cases, remembering that yield return is not “final” like return is helpful.

To cut down on ambiguity, please clarify: when you say "can", do you mean "will" or "might"? Is it possible for the first yield to return and the second never to execute?

@JohnoCrawford The second yield statement will only execute if the second/next value of the IEnumerable is enumerated. It's entirely possible that it won't be, e.g. F().Any() will return after trying to enumerate the first result only. In general, you shouldn't rely on a yield inside an IEnumerable method to change program state, because it may not actually get triggered.
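A minimal sketch of the point about program state. SideEffectDemo and its Calls counter are invented for this illustration; the counter shows how much of the iterator body has actually run.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SideEffectDemo
{
    public static int Calls;   // counts how many yield sites were reached

    public static IEnumerable<int> F()
    {
        Calls++;
        yield return 1;
        Calls++;
        yield return 2;
    }

    public static void Main()
    {
        var unused = F();            // nothing runs yet: the body is deferred
        Console.WriteLine(Calls);    // 0

        bool any = F().Any();        // stops after the first element
        Console.WriteLine(Calls);    // 1

        int count = F().Count();     // enumerates everything
        Console.WriteLine(Calls);    // 3
    }
}
```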

Jason Baker

The two pieces of code are really doing two different things. The first version will pull members as you need them. The second version will load all the results into memory before you start to do anything with it.

There's no right or wrong answer to this one. Which one is preferable just depends on the situation. For example, if there's a time limit within which you have to complete your query, and you need to do something semi-complicated with the results, the second version could be preferable. But beware of large result sets, especially if you're running this code in 32-bit mode. I've been bitten by OutOfMemory exceptions several times when using this approach.

The key thing to keep in mind is this though: the differences are in efficiency. Thus, you should probably go with whichever one makes your code simpler and change it only after profiling.

Shivprasad Koirala

Yield has two great uses:

It helps to provide custom iteration without creating temporary collections (loading all the data and then looping).

It helps to do stateful iteration (streaming).

Below is a simple video which I have created, with a full demonstration to support the above two points:

http://www.youtube.com/watch?v=4fju3xcm21M

Soviut

Assuming your products LINQ class uses a similar yield for enumerating/iterating, the first version is more efficient because it's only yielding one value each time it's iterated over.

The second example is converting the enumerator/iterator to a list with the ToList() method. This means it manually iterates over all the items in the enumerator and then returns a flat list.

Mark A. Nicolosi

This is kinda beside the point, but since the question is tagged best-practices I'll go ahead and throw in my two cents. For this type of thing I greatly prefer to make it into a property:

public static IEnumerable<Product> AllProducts
{
    get {
        using (AdventureWorksEntities db = new AdventureWorksEntities()) {
            var products = from product in db.Product
                           select product;

            return products;
        }
    }
}

Sure, it's a little more boiler-plate, but the code that uses this will look much cleaner:

prices = Whatever.AllProducts.Select (product => product.price);

versus

prices = Whatever.GetAllProducts().Select (product => product.price);

Note: I wouldn't do this for any methods that may take a while to do their work.

petr k.

And what about this?

public static IEnumerable<Product> GetAllProducts()
{
    using (AdventureWorksEntities db = new AdventureWorksEntities())
    {
        var products = from product in db.Product
                       select product;

        return products.ToList();
    }
}

I guess this is much cleaner. I do not have VS2008 at hand to check, though. In any case, if Products implements IEnumerable (as it seems to - it is used in a foreach statement), I'd return it directly.

shA.t

I would have used Version 2 of the code in this case. Since the full list of products is available and that is what the consumer of this method call expects, the complete information has to be sent back to the caller.

If the caller of this method requires one piece of information at a time and consumes the next piece on demand, then it would be beneficial to use yield return, which makes sure control is returned to the caller as soon as a unit of information is available.

Some examples where one could use yield return:

Complex, step-by-step calculation where the caller is waiting for the data of one step at a time

Paging in a GUI, where the user might never reach the last page and only a subset of the information needs to be shown on the current page

To answer your question, I would have used Version 2.

recursive

Return the list directly. Benefits:

It's more clear

The list is reusable (the iterator is not). Edit: this is not actually true; thanks, Jon.

You should use the iterator (yield) when you think you probably won't have to iterate all the way to the end of the list, or when it has no end. For example, if the calling client is going to be searching for the first product that satisfies some predicate, you might consider using the iterator, although that's a contrived example, and there are probably better ways to accomplish it. Basically, if you know in advance that the whole list will need to be calculated, just do it up front. If you think that it won't, then consider using the iterator version.

Don't forget that it's returning an IEnumerable, not an IEnumerator - you can call GetEnumerator again.

Even if you know in advance that the whole list will need to be calculated, it might still be beneficial to use yield return. One example is when the collection contains hundreds of thousands of items.
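A short sketch of what calling GetEnumerator again means in practice. ReenumerateDemo and the Runs counter are made up for this illustration: each enumeration of the returned IEnumerable runs the iterator body again from the start.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReenumerateDemo
{
    public static int Runs;   // counts how often the iterator body was started

    public static IEnumerable<int> Numbers()
    {
        Runs++;
        yield return 1;
        yield return 2;
    }

    public static void Main()
    {
        IEnumerable<int> numbers = Numbers();

        Console.WriteLine(numbers.Sum());   // 3
        Console.WriteLine(numbers.Sum());   // 3 again: the sequence is reusable
        Console.WriteLine(Runs);            // 2: the body ran once per enumeration
    }
}
```

So the sequence can be reused, but any work inside the iterator is repeated each time. That is why callers who iterate more than once often materialize the result with ToList().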

dns

Given the exact two code snippets, I think Version 1 is the better one as it can be more efficient. Let's say there are a lot of products and the caller wants to convert to DTOs.

var dtos = GetAllProducts().Select(ConvertToDto).ToList();

With Version 2 first a list of Product objects would be created, and then another list of ProductDto objects. With Version 1 there is no list of Product objects, only the list of the required ProductDto objects gets built.

Even without converting, Version 2 has a problem in my opinion: The list is returned as IEnumerable. The caller of GetAllProducts() does not know how expensive the enumeration of the result is. And if the caller needs to iterate more than once, she will probably materialize once by using ToList() (tools like ReSharper also suggest this). Which results in an unnecessary copy of the list already created in GetAllProducts(). So if Version 2 should be used, the return type should be List and not IEnumerable.

123

The usage of yield is similar to the keyword return, except that the method returns a generator (a lazily evaluated sequence). Each enumerator obtained from that generator traverses the sequence only once.

yield has two benefits:

You do not need to read these values twice.

You can get many child nodes but do not have to put them all in memory.

There is another clear explanation that may help you:

Proper use of 'yield return'
