ChatGPT解决这个技术问题 Extra ChatGPT

C# 5 async CTP: why is internal "state" set to 0 in generated code before EndAwait call?

Yesterday I was giving a talk about the new C# "async" feature, in particular delving into what the generated code looked like, and the GetAwaiter() / BeginAwait() / EndAwait() calls.

We looked in some detail at the state machine generated by the C# compiler, and there were two aspects we couldn't understand:

Why the generated class contains a Dispose() method and a $__disposing variable, which never appear to be used (and the class doesn't implement IDisposable).

Why the internal state variable is set to 0 before any call to EndAwait(), when 0 normally appears to mean "this is the initial entry point".

I suspect the first point could be answered by doing something more interesting within the async method, although if anyone has any further information I'd be glad to hear it. This question is more about the second point, however.

Here's a very simple piece of sample code:

using System.Threading.Tasks;

class Test
{
    static async Task<int> Sum(Task<int> t1, Task<int> t2)
    {
        return await t1 + await t2;
    }
}

... and here's the code which gets generated for the MoveNext() method which implements the state machine. This is copied directly from Reflector - I haven't fixed up the unspeakable variable names:

public void MoveNext()
{
    try
    {
        this.$__doFinallyBodies = true;
        switch (this.<>1__state)
        {
            case 1:
                break;

            case 2:
                goto Label_00DA;

            case -1:
                return;

            default:
                this.<a1>t__$await2 = this.t1.GetAwaiter<int>();
                this.<>1__state = 1;
                this.$__doFinallyBodies = false;
                if (this.<a1>t__$await2.BeginAwait(this.MoveNextDelegate))
                {
                    return;
                }
                this.$__doFinallyBodies = true;
                break;
        }
        this.<>1__state = 0;
        this.<1>t__$await1 = this.<a1>t__$await2.EndAwait();
        this.<a2>t__$await4 = this.t2.GetAwaiter<int>();
        this.<>1__state = 2;
        this.$__doFinallyBodies = false;
        if (this.<a2>t__$await4.BeginAwait(this.MoveNextDelegate))
        {
            return;
        }
        this.$__doFinallyBodies = true;
    Label_00DA:
        this.<>1__state = 0;
        this.<2>t__$await3 = this.<a2>t__$await4.EndAwait();
        this.<>1__state = -1;
        this.$builder.SetResult(this.<1>t__$await1 + this.<2>t__$await3);
    }
    catch (Exception exception)
    {
        this.<>1__state = -1;
        this.$builder.SetException(exception);
    }
}

It's long, but the important lines for this question are these:

// End of awaiting t1
this.<>1__state = 0;
this.<1>t__$await1 = this.<a1>t__$await2.EndAwait();

// End of awaiting t2
this.<>1__state = 0;
this.<2>t__$await3 = this.<a2>t__$await4.EndAwait();

In both cases the state is changed again afterwards before it's next obviously observed... so why set it to 0 at all? If MoveNext() were called again at this point (either directly or via Dispose) it would effectively start the async method again, which would be wholly inappropriate as far as I can tell... if and MoveNext() isn't called, the change in state is irrelevant.

Is this simply a side-effect of the compiler reusing iterator block generation code for async, where it may have a more obvious explanation?

Important disclaimer

Obviously this is just a CTP compiler. I fully expect things to change before the final release - and possibly even before the next CTP release. This question is in no way trying to claim this is a flaw in the C# compiler or anything like that. I'm just trying to work out whether there's a subtle reason for this that I've missed :)

The VB compiler produces a similar state machine (don't know if that's expected or not, but VB didn't have iterator blocks before)
@Rune: MoveNextDelegate is just a delegate field which refers to MoveNext. It's cached to avoid creating a new Action to pass into the awaiter each time, I believe.
I think the answer is: this is a CTP. The high order bit for the team was getting this out there and the language design validated. And they did so amazingly quickly. You should expect the shipped implementation (of the compilers, not MoveNext) to differ significantly. I think Eric or Lucian to come back with an answer along the lines that there's nothing deep here, just a behavior/bug that doesn't matter in most cases and no one noticed. Because it's a CTP.
@Stilgar: I've just checked with ildasm, and it really is doing this.
@JonSkeet: Notice how no one upvotes the answers. 99% of us can't really tell if the answer even sounds about right.

I
Igor Lyadov

Okay, I finally have a real answer. I sort of worked it out on my own, but only after Lucian Wischik from the VB part of the team confirmed that there really is a good reason for it. Many thanks to him - and please visit his blog (on archive.org), which rocks.

The value 0 here is only special because it's not a valid state which you might be in just before the await in a normal case. In particular, it's not a state which the state machine may end up testing for elsewhere. I believe that using any non-positive value would work just as well: -1 isn't used for this as it's logically incorrect, as -1 normally means "finished". I could argue that we're giving an extra meaning to state 0 at the moment, but ultimately it doesn't really matter. The point of this question was finding out why the state is being set at all.

The value is relevant if the await ends in an exception which is caught. We can end up coming back to the same await statement again, but we mustn't be in the state meaning "I'm just about to come back from that await" as otherwise all kinds of code would be skipped. It's simplest to show this with an example. Note that I'm now using the second CTP, so the generated code is slightly different to that in the question.

Here's the async method:

static async Task<int> FooAsync()
{
    var t = new SimpleAwaitable();
    
    for (int i = 0; i < 3; i++)
    {
        try
        {
            Console.WriteLine("In Try");
            return await t;
        }                
        catch (Exception)
        {
            Console.WriteLine("Trying again...");
        }
    }
    return 0;
}

Conceptually, the SimpleAwaitable can be any awaitable - maybe a task, maybe something else. For the purposes of my tests, it always returns false for IsCompleted, and throws an exception in GetResult.

Here's the generated code for MoveNext:

public void MoveNext()
{
    int returnValue;
    try
    {
        int num3 = state;
        if (num3 == 1)
        {
            goto Label_ContinuationPoint;
        }
        if (state == -1)
        {
            return;
        }
        t = new SimpleAwaitable();
        i = 0;
      Label_ContinuationPoint:
        while (i < 3)
        {
            // Label_ContinuationPoint: should be here
            try
            {
                num3 = state;
                if (num3 != 1)
                {
                    Console.WriteLine("In Try");
                    awaiter = t.GetAwaiter();
                    if (!awaiter.IsCompleted)
                    {
                        state = 1;
                        awaiter.OnCompleted(MoveNextDelegate);
                        return;
                    }
                }
                else
                {
                    state = 0;
                }
                int result = awaiter.GetResult();
                awaiter = null;
                returnValue = result;
                goto Label_ReturnStatement;
            }
            catch (Exception)
            {
                Console.WriteLine("Trying again...");
            }
            i++;
        }
        returnValue = 0;
    }
    catch (Exception exception)
    {
        state = -1;
        Builder.SetException(exception);
        return;
    }
  Label_ReturnStatement:
    state = -1;
    Builder.SetResult(returnValue);
}

I had to move Label_ContinuationPoint to make it valid code - otherwise it's not in the scope of the goto statement - but that doesn't affect the answer.

Think about what happens when GetResult throws its exception. We'll go through the catch block, increment i, and then loop round again (assuming i is still less than 3). We're still in whatever state we were before the GetResult call... but when we get inside the try block we must print "In Try" and call GetAwaiter again... and we'll only do that if state isn't 1. Without the state = 0 assignment, it will use the existing awaiter and skip the Console.WriteLine call.

It's a fairly tortuous bit of code to work through, but that just goes to show the kinds of thing that the team has to think about. I'm glad I'm not responsible for implementing this :)


@Shekhar_Pro: Yes, it's a goto. You should expect to see plenty of goto statements in auto-generated state machines :)
@Shekhar_Pro: Within manually written code, it is - because it makes the code hard to read and follow. No-one reads autogenerated code though, except fools like me who decompile it :)
So what does happen when we await again after an exception? We start all over again?
@configurator: It calls GetAwaiter on the awaitable, which is what I'd expect it to do.
gotos don't always make code harder to read. In fact, sometimes they even make sense to use (sacrilege to say, I know). For example, sometimes you may need to break multiple nested loops. I lesser used feature of goto (and IMO uglier use) is to cause switch statements to cascade. On a seperate note, I remember a day and age when gotos were a primary underpinning of some programming languages and because of that I fully realize why the mere mention of goto makes developers shudder. They can make things ugly if used poorly.
R
Rob

if it was kept at 1 (first case) you would get a call to EndAwait without a call to BeginAwait. If it's kept at 2 (second case) you'd get the same result just on the other awaiter.

I'm guessing that calling the BeginAwait returns false if it has be started already (a guess from my side) and keeps the original value to return at the EndAwait. If that's the case it would work correctly whereas if you set it to -1 you might have an uninitialized this.<1>t__$await1 for the first case.

This however assumes that BeginAwaiter won't actually start the action on any calls after the first and that it will return false in those cases. Starting would of course be unacceptable since it could have side effect or simply give a different result. It also assumpes that the EndAwaiter will always return the same value no matter how many times it's called and that is can be called when BeginAwait returns false (as per the above assumption)

It would seem to be a guard against race conditions If we inline the statements where movenext is called by a different thread after the state = 0 in questions it woule look something like the below

this.<a1>t__$await2 = this.t1.GetAwaiter<int>();
this.<>1__state = 1;
this.$__doFinallyBodies = false;
this.<a1>t__$await2.BeginAwait(this.MoveNextDelegate)
this.<>1__state = 0;

//second thread
this.<a1>t__$await2 = this.t1.GetAwaiter<int>();
this.<>1__state = 1;
this.$__doFinallyBodies = false;
this.<a1>t__$await2.BeginAwait(this.MoveNextDelegate)
this.$__doFinallyBodies = true;
this.<>1__state = 0;
this.<1>t__$await1 = this.<a1>t__$await2.EndAwait();

//other thread
this.<1>t__$await1 = this.<a1>t__$await2.EndAwait();

If the assumptions above are correct the there's some unneeded work done such as get sawiater and reassigning the same value to <1>t__$await1. If the state was kept at 1 then the last part would in stead be:

//second thread
//I suppose this un matched call to EndAwait will fail
this.<1>t__$await1 = this.<a1>t__$await2.EndAwait();

further if it was set to 2 the state machine would assume it already had gotten the value of the first action which would be untrue and a (potentially) unassigned variable would be used to calculate the result


Bear in mind that the state isn't actually being used between the assignment to 0 and the assignment to a more meaningful value. If it's meant to guard against race conditions, I'd expect some other value to indicate that, e.g. -2, with a check for that at the start of MoveNext to detect inappropriate use. Bear in mind that a single instance should never actually be used by two threads at a time anyway - it's meant to give the illusion of a single synchronous method call which manages to "pause" every so often.
@Jon I agree it shouldn't be an issue with a race condition in the async case but could be in iteration block and could be a left over
@Tony: I think I'll wait until the next CTP or beta comes out, and check that behaviour.
G
GaryMcAllister

Could it be something to do with stacked/nested async calls ?..

i.e:

async Task m1()
{
    await m2;
}

async Task m2()
{
    await m3();
}

async Task m3()
{
Thread.Sleep(10000);
}

Does the movenext delegate get called multiple times in this situation ?

Just a punt really?


There would be three different generated classes in that case. MoveNext() would be called once on each of them.
f
fixagon

Explanation of actual states:

possible states:

0 Initialized (i think so) or waiting for end of operation

>0 just called MoveNext, chosing next state

-1 ended

Is it possible that this implementation just wants to assure that if another Call to MoveNext from whereever happens (while waiting) it will reevaluate the whole state-chain again from the beginning, to reevaluate results which could be in the mean time already outdated?


But why would it want to start from the beginning? That's almost certainly not what you'd actually want to happen - you'd want an exception thrown, because nothing else should be calling MoveNext.