Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Narrowing inconsistencies #9260

Closed
yortus opened this issue Jun 20, 2016 · 11 comments
Closed

Narrowing inconsistencies #9260

yortus opened this issue Jun 20, 2016 · 11 comments
Labels
Question An issue which isn't directly actionable in code

Comments

@yortus
Copy link
Contributor

yortus commented Jun 20, 2016

TypeScript Version:

nightly (1.9.0-dev.20160619-1.0)

Code

Three functions that narrow three types to see how narrowing behaves. foo1 and foo2 are identical apart from which type guards they use.

// A discriminated union with three cases
interface A { type: 'A' }
interface B { type: 'B' }
interface C { type: 'C' }

// Custom type guards for A/B/C
function isA(x: A|B|C): x is A { return x.type === 'A'; }
function isB(x: A|B|C): x is B { return x.type === 'B'; }
function isC(x: A|B|C): x is C { return x.type === 'C'; }

// Using if/else with type guard functions
function foo1(x: A|B|C): any {
    x // x is A | B | C
    if (isA(x)) {
        return x; // x is A
    }
    x // x is B | C
    if (isB(x)) {
        return x; // x is B
    }
    x // x is C
    if (isC(x)) {
        return x; // x is C
    }
    x // x is C, shouldn't x be 'never' here?           <===== (1)

    // If x was 'never', #8548 should apply here:
    if (isA(x)) {
        return x; // x is C & A                         <===== (2)
    }
    else {
        x // x is C                                     <===== (3)
    }
}

// Using if/else with discriminated unions
function foo2(x: A|B|C): any {
    x // x is A | B | C
    if (x.type === 'A') {
        return x; // x is A
    }
    x // x is B | C
    if (x.type === 'B') {
        return x; // x is B
    }
    x // x is C
    if (x.type === 'C') {
        return x; // x is C
    }
    x // x is A | B, shouldn't x be 'never' here?       <===== (4)

    // If x was 'never', #8548 should apply here:
    if (x.type === 'A') {
        return x; // x is A                             <===== (5)
    }
    else {
        x // x is B                                     <===== (6)
    }
}

// Using switch/case with discriminated unions
function foo3(x: A|B|C): any {
    x // x is A | B | C
    switch (x.type) {
        case 'A': return x; // x is A
        case 'B': return x; // x is B
        case 'C': return x; // x is C
        default:
            x // x is never                                 <===== (7)
            if (isA(x)) {
                x // x is never, why not A due to #8548?    <===== (8)
            }
    }
}

Expected behavior:

  • The inferred types at (1)/(2)/(3) should be the same as the inferred types at (4)/(5)/(6)
  • The inferred type at (1) and at (4) should be never, like it is at (7)
  • Type of x should be A at (8) due to Type guards as assertions #8548

Actual behavior:
See the comments in the code.


I wrote this code trying to understand what was going on in #9246 and #9254, and now I think I'm more confused. Firstly, foo1 and foo2 have identical logic, but tsc infers types differently in each. Secondly, I couldn't work out how to invoke the 'type guards as assertions' (#8548) behaviour. I thought it should kick in after (1) and (4), where the types have been exhausted and then another type guard appears. But control flow analysis never infers never in either function. We do get never inferred at (7), but then #8548 doesn't seem to apply at (8) where I expected it should.

@ahejlsberg
Copy link
Member

#8548 works a bit differently from what you assume. It basically says that in either the when true or when false branch of a type guard for some x, the resulting type becomes never only when the type guard produces never for both the current control flow type of x and the declared type of x. In other words, you only get never if it's the result whether we trust the control flow or view the guard as an assertion. However, once we've produced a never, nothing but a manifest type assertion will make it go away.

Let me start with (4). Prior to the x.type === 'C' guard, the control flow type of x is C. Therefore, in the when false branch of that type guard (i.e. at the (4) marker), the control flow type would become never. That causes #8548 to kick in (because we view the type guard as an assertion) and repeat the type guard with the declared type instead. So, the control flow type of x becomes A | B in the when false branch.

I think it is fair to assume you should see the same type at (1). The reason you don't is that (for historical reasons) we don't narrow in the when false branch of a typeof, instanceof, or user defined type guard unless the current type is a union type. We probably should, which would produce a never, which would then cause #8548 to kick in and produce A | B.

Regarding the switch statement, here we view the entire construct as a single type guard, so whether we apply it to the current control flow type or the declared type, we still end up with type never in the default clause. So we let that stand.

@ahejlsberg ahejlsberg added the Question An issue which isn't directly actionable in code label Jun 21, 2016
@yortus
Copy link
Contributor Author

yortus commented Jun 22, 2016

Thanks @ahejlsberg, that helps me understand #8548 better. So it is applying in the above example, just not where or how I expected.

I can see the problem #8548 solves by following its link to #8513. However, by applying to the example above, it seems to introduce problems of its own. Here is a simpler example showing what I mean:

type Coin = {type: 'heads', h} | {type: 'tails', t};

function f1(c: Coin) {
    if (c.type === 'heads') {
        return c.h;
    }
    else /* must be 'tails' */ {
        return c.t;
    }
    c; // ERROR: unreachable code detected
}

function f2(c: Coin) {
    if (c.type === 'heads') {
        return c.h;
    }
    else if (c.type === 'tails') {
        return c.t;
    }
    c; // c is 'heads'
}

Assuming that the argument c meets its preconditions (i.e. that it is either 'heads' or 'tails' at runtime, and nothing else), then f1 and f2 do the same thing, with just different implicit/explicit styles. Some people favour one style and some the other.

With #8548 in play, tsc sees f2 differently. It no longer detects unreachable code. Also, the inference that c is 'heads' after the if/else block does not agree with either runtime or intuitive behaviour. What if we had checked for 'tails' first and then 'heads'? Then tsc would infer that c was 'tails' after the if/else. So depending on the order of checks, we may or may not get a compiler warning about a typo in the following version of f2:

function f2b(c: Coin) {
    if (c.type === 'heads') {
        return c.h;
    }
    else if (c.type === 'tails') {
        return c.t;
    }

    c.h; // No compiler error
    c.t; // ERROR: 't' doesn't exist

    // --- If we had checked for 'tails' before 'heads' in the if/else above, we'd get the opposite result: ---
    c.h; // ERROR: 'h' doesn't exist
    c.t; // No compiler error
}

These errors are really artifacts of #8548. They arise from a type inference that does not reflect runtime behaviour, and they are sensitive to code order that has no logical relevance. Personally, I'd rather see the 'unreachable code' error in all the examples above.

I'd like to think that there is some exception that could be implemented around #8548 so that it still applies in situations like #8513, but no longer produces artifacts like those here and in #9246 and #9254.

@yortus
Copy link
Contributor Author

yortus commented Jun 22, 2016

Suggestion: don't apply #8548 when encountering a type guard whose asserted type exactly matches the control flow type at that point in the code.

Here is a example demonstrating the type inference behaviour from before #8548 was merged:

// compiled with [email protected]
let x: number | string;
x // x is number | string
x = 1234;
x // x is number
if (typeof x === "string") {
    return x; // x is nothing           <===== #8513 problem
}
x // x is number
if (typeof x === "number") {
    return x; // x is number
}
x // x is nothing

Here is the behaviour of the same example currently (i.e. with #8548):

// compiled with [email protected]
let x: number | string;
x // x is number | string
x = 1234;
x // x is number
if (typeof x === "string") {
    return x; // x is string            <===== fixed #8513
}
x // x is number
if (typeof x === "number") {
    return x; // x is number
}
x // x is string                        <===== new #9246/#9254/#9260 problem

Here is what I think the behaviour would be if #8548 is not applied when a type assertion exactly matches the current control flow type:

// Compiled with hypothetical change to #8548 behaviour
let x: number | string;
x // x is number | string
x = 1234;
x // x is number
if (typeof x === "string") {
    return x; // x is string            <===== fixed #8513 (still)
}
x // x is number
if (typeof x === "number") { //         <===== DON'T apply #8548 here
    return x; // x is number
}
x // x is never                         <===== fixed #9246/#9254/#9260

The last example still fixes #8513, but now also produces the 'expected behaviour' for #9246 and #9254 and #9260.

@ahejlsberg
Copy link
Member

I see how that might work for this example, but I'm not sure we could formulate a notion of exactly matches for all the different kinds of type guards we support. For example, it's not clear what would exactly match a truthy or falsy check. Also, I'm not really sure it works in all cases. Here's a twist on the motivating example from #8513:

let cond: boolean;
function foo() {
    let x: string | number = 0;
    x;  // number
    while (cond) {
        x;  // number, then string | number
        x = typeof x === "number" ? "abc" : x.slice();
        x;  // string
    }
}

I think you're arguing x should be never in the false branch of the conditional operator in the initial analysis of the loop body (before the flow back is considered). That would cause an error to be reported, which is what we're trying to avoid.

We could potentially solve #8513 by only applying #8548 when we know we're analyzing a loop body with incomplete data. But that would require more closely tracking the distinction between the phases of loop analysis. And it still wouldn't solve the initial example in #8548, i.e. the cases where you really want the type guard to function as an assertion because control flow analysis isn't aware of an assignment that has occurred in a nested function.

@yortus
Copy link
Contributor Author

yortus commented Jun 22, 2016

@ahejlsberg I don't know the details of the compiler's loop analysis. So speaking purely from my meat-powered type analysis of your loop example, it's unclear to me why the above suggestion would infer never anywhere?

Here is the same code expanded a little:

let cond: boolean;
function foo() {
    let x: string | number = 0;
    x;  // number
    while (cond) {
        x;  // number | string              <===== (1)
        if (typeof x === 'number') {
            x;  // number                   <===== (2)
            x = "abc";
            x;  // string
        }
        else {
            x;  // string                   <===== (3)
            x = x.slice();
            x;  // string
        }
        x;  // string                       <===== (4)
    }
    x;  // number | string
}

In your example at (1) you've commented x; // number, then string | number, whereas intellisense just shows number | string. Is the 'x, then y' type tracking important to tsc's loop analysis? Because if it's just analysed as number | string at (1), then #8548 would not apply to the type guard, and we'd get number at (2), string at (3), and string at (4), and no never anywhere. I suppose I am missing something?

@yortus
Copy link
Contributor Author

yortus commented Jun 23, 2016

I'm not sure we could formulate a notion of exactly matches for all the different kinds of type guards we support

I'm not sure why that would even be necessary, at least initially. If 'exactly matches' meant literally 'the same type', it would fix all cases of the 'type exhaustion' pattern where a series if if/thens check for every possible type in a union, much like the way discriminated unions now work with switch/case.

That would resolve #9246, #9254, and this issue. Anything more sophisticated than that would be a further improvement that could be addressed later.

@ahejlsberg
Copy link
Member

Fixed in #10118.

@yortus
Copy link
Contributor Author

yortus commented Aug 4, 2016

@ahejlsberg can this please be re-opened? Some of the reported inconsistencies are still there with 2.1.0-dev.20160804 - it seems like the improvements from #10118 affect some parts of the code but not others.

Remaining issues with the original example:

  • The inferred types at (1)/(2)/(3) should be the same as the inferred types at (4)/(5)/(6), but they are still different. The type of x at (4), (5), and (6) is now never, so Limit "type guards as assertions" behavior #10118 seems to have fixed that. But inference at (1)/(2)/(3) is unchanged, and x at (1) remains inferred as C. Shouldn't it be never, like it is in (4)? The two functions have the same logical control flow, after all.

Remaining issues with the heads | tails example from this comment above:

  • The f2 function looks better - c is now infered as never after the if/else. But why does f1 report unreachable code on the last line, but f2 does not? If c is inferred as never on the last line of function f2, doesn't that imply unreachable code?

Another example with unreachable code:

In #8652 it says "never is the return type for functions that never return". So the following example, where both functions behave identically, surprises me:

function nonReturningA() { // inferred return type is string|number. Shouldn't it be never?
    let x = Math.random() > 0.5 ? 'foo' : 42; // x is string|number
    if (typeof x === 'string') {
        throw x; // x is string
        x; // ERROR: unreachable code detected
    }
    else /* x must be a number */ {
        throw x; // x is number
        x; // ERROR: unreachable code detected
    }
    return x; // ERROR: unreachable code detected. Doesn't that imply return type is never?
}

function nonReturningB() { // inferred return type is never
    let x = Math.random() > 0.5 ? 'foo' : 42;
    if (typeof x === 'string') {
        throw x; // x is string
        x; // ERROR: unreachable code detected
    }
    else if (typeof x === 'number') {
        throw x; // x is number
        x; // ERROR: unreachable code detected
    }
    return x; // x is never, and return type is never, implying this is unreachable
              // no error reported here. Why doesn't tsc report this as unreachable code?
}
  1. tsc correctly infers that nonReturningA doesn't return, reporting 'unreachable code' on the last line, so shouldn't the inferred return type be never?
  2. In nonReturningB, tsc infers x is never on the last line, implying that line is unreachable according to the above definition of 'never'. And indeed, the function's inferred return type is never. So shouldn't it report the last line as unreachable code like the other function does?

EDIT: I should add that I really care about getting the 'unreachable code detected' error because it instantly proves that the prior control flow branches are exhaustive. Conversely, 'reachable' code below what should be an exhaustive control flow block prompts me to look for errors in my control flow logic.

@ahejlsberg
Copy link
Member

But inference at (1)/(2)/(3) is unchanged, and x at (1) remains inferred as C. Shouldn't it be never, like it is in (4)? The two functions have the same logical control flow, after all.

Agreed. If !isFoo(x) applied to x: Foo | Bar produces Bar, then !isFoo(x) applied to x: Foo ought to produce never. The current behavior is a holdover from before we had control flow analysis and never types and is unrelated to #10118. I think we need a separate issue for it.

But why does f1 report unreachable code on the last line, but f2 does not? If c is inferred as never on the last line of function f2, doesn't that imply unreachable code?

We only report unreachable code errors based on the structure of the code, i.e. when we are 100% certain the code is unreachable in all circumstances. A never type does indeed imply unreachable code, but only if the assumptions made by the type checker are correct. There are many ways the assumptions could be violated, such as incorrect type assertions or bad parameters passed from unchecked code. You want to be able to write code to guard against that (e.g. assertNever calls), and that wouldn't be possible if we always flagged such code as unreachable.

@yortus
Copy link
Contributor Author

yortus commented Aug 4, 2016

@ahejlsberg I've submitted issue #10145 for the remaining type guard narrowing issue.

For the unreachable code part, I understand the rationale you've just given, the compiler is being conservative with its inferences. But the different compiler behaviour for nonReturningA and nonReturningB still do seem surprising when put side-by-side, given the meaning of never. Maybe there's some room for future improved 'strict analysis mode' if enough people wanted it.

@tekacs
Copy link

tekacs commented Sep 9, 2016

I was about to post about 'never' in switch statements and unreachable code in a new issue and found this.

I guess I'll instead take the time to +1 the idea of a strict analysis mode (or perhaps assume-TypeScript-caller mode?). In cases where TypeScript calls TypeScript, it's somewhat frustrating to make concessions for where the type system 'could have been given the wrong information' (see also my issue #10765 in a similar vein, where all code is TS but we rely upon the user providing valid type information).

In line with A. Hejlsberg's comments, perhaps we could at least /warn/ here? I'll see if it's possible to add a lint to TSLint, for now.

Edit: a related issue is palantir/tslint#661


Original issue:

type X = 'a' | 'b';

function sw(x: X): number {
  switch (x) {
    case 'a':
      return 1;
    case 'b':
      return 2;
    default:
      return 'string' // Error: Type 'string' is not assignable to type 'number'.
  }
}

Expected behavior: Error on the line 'default' indicating unreachable code (or better, an over-exhaustive match).

Actual behavior: Error on the returned string line.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Question An issue which isn't directly actionable in code
Projects
None yet
Development

No branches or pull requests

3 participants