PowerShell: Suggestion: implement null-coalescing, null-conditional access (null-soaking), null-conditional assignment

Created on 2 Mar 2017  ·  71 Comments  ·  Source: PowerShell/PowerShell

Null-coalescing and null-conditional access (null-soaking) would be handy additions to the language.

_Update_: @BrucePay additionally suggests null-conditional _assignment_, with ?= - see comment below.

For instance, instead of writing:

if ($null -ne $varThatMayBeNull) { $varThatMayBeNull } else { $fallbackValue }
#
if ($null -ne $varThatMayBeNull) { $varThatMayBeNull.Name } else { $null }

one might be able to write:

$varThatMayBeNull ?? $fallbackValue  # null-coalescing
# 
$varThatMayBeNull?.Name   # null-conditional access, for Set-StrictMode -Version 2+
$varThatMayBeNull?[42]  # ditto, with indexing

Re null-conditional access: With Set-StrictMode being OFF (the default), you can just use $varThatMayBeNull.Name - no special syntax needed; however, if Set-StrictMode -Version 2 or higher is in effect, $varThatMayBeNull.Name would _break_ and that's where the null-conditional operator (?.) is helpful, to signal the explicit intent to ignore the $null in a concise manner.
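To make the difference concrete, here is a minimal sketch of the behavior described above, using only features that exist today. The variable names $looseResult and $strictError are illustrative, not part of the proposal.

```powershell
$varThatMayBeNull = $null

Set-StrictMode -Off
$looseResult = $varThatMayBeNull.Name       # quietly evaluates to $null

Set-StrictMode -Version 2
$strictError = $null
try {
    $null = $varThatMayBeNull.Name          # statement-terminating error under strict mode
}
catch {
    $strictError = $_                       # keep the error record for inspection
}
finally {
    Set-StrictMode -Off                     # restore the default for the rest of the scope
}
```

The try/catch is the verbose opt-out that a concise ?. is meant to replace.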


Open question:

$varThatMayBeNull?[42] handles the case where the variable is $null, but if it isn't, an array element with the specified index _must exist_.

It would therefore also be helpful to make _indexing_ null-conditional - something that C# does _not_ support, incidentally (you have to use .ElementAtOrDefault()).

The two basic choices are:

  • Come up with additional syntax that explicitly signals the intent to ignore a non-existent _index_:

    • The question is what syntax to choose for this, given that ?[...] and [...]? are not an option due to ambiguity.
    • Perhaps [...?], but that seems awkward.
  • Rely on the existing behavior with respect to accessing non-existent indices, as implied by the Set-StrictMode setting - see table below.
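For comparison, the intent behind both choices can be approximated today with a helper function. This is only a sketch: Get-ElementOrDefault is a hypothetical name, and it assumes the input is $null, a scalar, or an ordinary array or list.

```powershell
# Hypothetical helper: returns $Default when the collection is $null
# or when the index is out of range, regardless of the strict mode in effect.
function Get-ElementOrDefault {
    param($Collection, [int] $Index, $Default = $null)

    if ($null -eq $Collection) { return $Default }

    $items = @($Collection)                 # normalize scalars and lists to an array
    if ($Index -lt 0 -or $Index -ge $items.Count) { return $Default }

    return $items[$Index]
}

$second  = Get-ElementOrDefault (0, 1) 1
$missing = Get-ElementOrDefault (0, 1) 2 'none'
$fromNull = Get-ElementOrDefault $null 42
```

Unlike dedicated syntax, the helper always evaluates its arguments up front and cannot be chained inside a longer member-access expression.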


Related: implement ternary conditionals

Issue-Enhancement Resolution-Fixed WG-Language

Most helpful comment

@rjmholt I did an analysis over here. A quick breakdown: Out of nearly 400,000 scripts with 22,000,000+ variable names (not unique), there were only ~70 unique variables that ended with ?.

All 71 comments

@mklement0 It seems the second sample works by design without "?".
Maybe change to name()?

@iSazonov: Good point, but it only works without the ? if Set-StrictMode -Version 2 or higher is _not_ in effect - I've updated the initial post to make that clear.

I think syntactic sugar is something powershell could use that other languages enjoy. +1 on the proposal.

Consider the following samples from other languages:

a="${b:-$c}"
a = b || c;
a = b or c
a := b ? c
a = b ? b : c;

even

a = b if b else c

is better than

if (b) { a = b } else { a = c }

and often times you need to clarify with the additionally verbose

a=(if (b -ne $null) { b } else { c })

Makes one feel all dirty and bashed.

Found myself doing this today as work around.

```powershell
$word = ($null, "two", "three").Where({$_ -ne $null}, "First")
```

I use a similar pattern:

$word = ($null, "two", "three" -ne $null)[0]
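For readers unfamiliar with the trick, here is the same pattern split into steps. It relies on two existing behaviors: the comma operator binds more tightly than -ne, and a comparison operator with an array on its left-hand side acts as a filter. The intermediate variable names are illustrative.

```powershell
$candidates = $null, 'two', 'three'

# With an array on the left, -ne returns the elements that are not equal to $null.
$nonNull = $candidates -ne $null

# The first surviving element is the "coalesced" value.
$word = $nonNull[0]
```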

@kfsone There is a small error in your example $a=(if ($b -ne $null) { $b } else { $c }). It should be $a=$(if ($b -ne $null) { $b } else { $c }) however the only version of PowerShell where $( ) was required was version 1. From v2 on, you can simply do:

$a = if ($b) { $b } else { $c }
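One caveat worth illustrating: if ($b) tests truthiness, not nullness, so the short form and the explicit $null test disagree for values such as 0 or an empty string. A small sketch, with illustrative variable names:

```powershell
$c = 'fallback'
$b = 0                                               # non-null, but falsy

$viaTruthiness = if ($b) { $b } else { $c }          # falls through to $c
$viaNullTest   = if ($null -ne $b) { $b } else { $c } # keeps the 0
```

A true null-coalescing operator would be expected to behave like the second form.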

@bgshacklett Don't you mean $word=($null,'two')[$null -ne 'three']?

It's a bit unfortunate this has gone from 6.1 to 6.2 to "Future".

@TheIncorrigible1 No, if you copy and paste what I added above, you should see that the value of $word is set to "two". There's more detail in this Stack Overflow answer where I first saw the pattern:
https://stackoverflow.com/a/17647824/180813

While perlesque hack-arounds appeal to the 20-year ago me that wrote a shebang that registered and/or queried a RIPE-DB user or organization record, what I'm hoping for from powershell is something that encourages my colleagues to use the language rather than instill fear in them.

My litmus test is this: Would I want to read this via my tablet at 3am New Years morning while hung over with the ceo on the phone crying at me how many millions of dollars we are losing a second.

(Aside: this was only a loose exaggeration of an actual experience until I worked at Facebook and came back from an urgent quick leak to be told that, in the 2 minutes I was gone, more people than the population of Holland had gotten an empty news feed. It wasn't my code and the mistake was a minute change in the semantics of return x vs return (x) in the c++ standard for very specific template cases, and "would you want to read this code with a 2 minute deadline, a full bladder and the fate of every clog wearing Dutch person's cat pictures on your shoulder???" didn't sound as cool)

Yup. There are some neat tricks possible in PS that are great in a pinch or if you need a one-time shorthand.

For maintainable code, ideally we should have explicit null-coalescing operators like C#. The main trouble there for me is -- what do we use for those? ? is already aliased to Where-Object (much as I'd love for that to be erased, it's in very common use). Mind you, % is aliased to ForEach-Object but that doesn't hinder modulus operations, so in theory at least having ? be a null-coalescing operator as well would potentially be fine; it would only be interpreted as such in expressions, where Where-Object isn't really valid anyway.

@vexx32:

At least _syntactically_ ?? should be fine, because we're talking about _expression_ mode, whereas ?, as a command [alias], is only recognized in _argument_ mode.
Not sure if reuse of the symbol causes confusion, but it wouldn't be the first time that a symbol does double duty in different contexts.
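The double duty of % that was mentioned can be seen in a single pipeline today, which suggests the same separation could hold for ?. A minimal sketch:

```powershell
# '%' starts a command here (alias of ForEach-Object); inside the script block,
# in expression mode, '%' is the modulus operator.
$remainders = 1..4 | % { $_ % 2 }

# '?' starts a command here (alias of Where-Object).
$odd = 1..4 | ? { $_ % 2 -eq 1 }
```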

Surprisingly, using ?. and ?[] for null-soaking would technically be a breaking change, because PowerShell currently allows ? as a non-initial character in a variable name.

PS> $foo? = @{ bar = 1 }; $foo?.bar   # !! $foo? is a legal variable name
1

However, I hope this would be considered a Bucket 3: Unlikely Grey Area change.

I can't say I've ever seen anyone use ? in a variable name... Nor would I, because chances are it would be misread. But yeah, hopefully that should be perfectly useable.

I suspect it's a bit late to bring this up, but Bash has handled this (and some other cases) with its Parameter Substitution features: https://www.tldp.org/LDP/abs/html/parameter-substitution.html.

While it can be a bear to learn due to the sheer number of things that can be done with it, it's incredibly powerful. I understand that it would not be possible to use this exact notation due to the way PowerShell uses braces with variables, nor would it necessarily fit with the general feel of the language, but it seems like a useful data point.

@bgshacklett:

Yes, parameter substitution is powerful, but, unfortunately, it's not only a bear to _learn_, but also to _remember_.

So while Bash's _features_ are often interesting, their _syntactic form_ is often arcane, hard to remember, and not a good fit for PowerShell.

_Brace expansion_ (e.g., a{1,2,3} in bash expanding to a1 a2 a3) is an example of an interesting feature whose expressiveness I'd love to see in PowerShell, but with PowerShell-appropriate syntax - see #4286

I completely agree. I brought it up more as an example of how this issue has been solved elsewhere than an exact solution.

There's one other operator to possibly consider:

$x ?= 12

which would set $x if it's not set (doesn't exist). This is part of the "initializer pattern" which is not common in conventional languages but for (dynamically scoped) languages like shells, make tools, etc. it's pretty common to have a script that sets a default if the user hasn't specified it. (Though in fact parameter initializers are pretty widespread.)

Extending this to properties:

$obj.SomeProperty ?= 13

would add and initialize a note property SomeProperty on the object if it didn't exist.

And - for fun - one more variation on initializing a variable using -or:

$x -or ($x = 3.14) > $null
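Note that the -or variation keys off truthiness, so it also overwrites values such as 0, whereas an explicit $null test (which is roughly what ?= is being proposed to express) does not. A sketch with illustrative variable names:

```powershell
$x = 0
$x -or ($x = 3.14) > $null          # 0 is falsy, so the assignment runs
$viaOr = $x

$y = 0
if ($null -eq $y) { $y = 3.14 }     # 0 is not $null, so $y is left alone
$viaNullTest = $y
```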

$x ?= 12 sounds like a great idea.

It occurred to me that we probably should apply all of these operators not only if the LHS doesn't _exist_, but also if it _does exist but happens to contain_ $null (with [System.Management.Automation.Internal.AutomationNull]::Value treated like $null in this context).

add and initialize a note property SomeProperty

In that vein, $obj.SomeProperty ?= 13 makes sense to me _only_ if .SomeProperty exists and contains $null, given that you cannot implicitly create properties even with regular assignments (by contrast, for _hashtables_ the implicit entry creation makes sense).
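The asymmetry between hashtables and objects under regular assignment can be checked directly. A short sketch; the variable names are illustrative:

```powershell
# Hashtable: assigning to a missing key creates the entry.
$table = @{}
$table.SomeProperty = 13

# Custom object: assigning to a missing property is an error.
$obj = [pscustomobject]@{ Existing = 1 }
$assignError = $null
try {
    $obj.SomeProperty = 13
}
catch {
    $assignError = $_
}
```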

All operators discussed will need to exempt the LHS from strict-mode existence checking.

I'm wondering about the intent / reasoning behind why strictmode stops you from accessing a property which doesn't exist. Is the intent to catch typos like $x.pretnd? Is the intent to check assumptions like assert(x != null)? Is the intent to block accidentally propagating $null further on in your code?

the null-soaking (?.) is helpful, to signal the explicit intent to ignore the $null in a concise manner.

Is that not what . signals already in nonstrict mode? If you had wanted to be strict, you could have checked for .Name existing first, by not checking first, you already signalled explicit intent that checking wasn't important to you.

If StrictMode has a purpose where accessing a non-existent property and returning $null is forbidden, then won't the same reasoning apply to ?., and that should throw the same exception for the same reason. If not, why not?

With null-conditional access, you could write:

$result = Invoke-RestMethod -Uri 'https://example.org/api/test'
$test = $result?.Name

Will it quickly become recommended or idiomatic to always use ?. for every property reference, because it "behaves like . does, and won't throw exceptions for people using strictmode"?

Is the real desire to have a configurable strictmode with variable settings, so that e.g. you could have a declaration at the top of your script # StrictMode SkipMemberExistenceCheck, or an escape hatch attribute for indicating a block of unstrict code generally? [I suspect the desire is terse C# style syntax, but you know what I mean]

Thanks for the thoughts, @HumanEquivalentUnit.

Is the intent to catch typos like $x.pretnd?

I can't speak to the design intent, but that's what makes sense to me - analogous to what Set-StrictMode -Version 1 does for _variables_ (only).

With variables it's about their _existence_, not whether the value happens to be $null, and checking for property existence follows the same pattern.

I think of it as the closest a dynamic scripting language can get to finding errors that a statically typed compiled language would report at _compile time_.
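A small sketch of the existence-versus-value distinction for variables under -Version 1. The name $undefinedVariableForDemo is an assumption: it stands for any variable that was never assigned.

```powershell
Set-StrictMode -Version 1

$defined = $null
$okValue = $defined                       # fine: the variable exists, its value is $null

$undefinedError = $null
try {
    $null = $undefinedVariableForDemo     # error: the variable was never set
}
catch {
    $undefinedError = $_
}
finally {
    Set-StrictMode -Off
}
```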


Is that not what . signals already in nonstrict mode? If you had wanted to be strict, you could have checked for .Name existing first

Yes, but in non-strict mode you forgo the aforementioned existence checks, and testing manually is obviously quite cumbersome - and invariably slower than a built-in feature.

by not checking first, you already signalled explicit intent that checking wasn't important to you.

The point is that, with Set-StrictMode -Version 2 or higher, you get:

  • the benefit of member existence checking _by default_
  • the option to opt out, _on demand, concisely_ with ?.

you could have a declaration at the top of your script # StrictMode SkipMemberExistenceCheck, or an escape hatch attribute for indicating a block of unstrict code generally

Related: The current dynamic scoping of Set-StrictMode is problematic; see @lzybkr's RFC draft for implementing a _lexical_ strict mode: https://github.com/PowerShell/PowerShell-RFC/blob/master/1-Draft/RFC0003-Lexical-Strict-Mode.md

the benefit of member existence checking by default

PS contains code which does member checking, and returns $null when they don't exist. That's a benefit by default, which you can also avoid by writing your own member existence test with $x.psobject.properties... if you choose to. StrictMode takes away that benefit and leaves the hindrance of having to write existence tests. Then instead of giving a nice way to write exactly a member existence test, ?. would give you a concise workaround to get what . always did. And if it matters that the member exists, you still have to code that separately.
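For reference, a hand-written existence test along the lines of $x.psobject.properties might look like the sketch below. Test-PropertyExists is a hypothetical helper name, and it only considers properties, not methods.

```powershell
# Hypothetical helper: $true if the object exposes a property with the given name.
function Test-PropertyExists {
    param($InputObject, [string] $Name)

    if ($null -eq $InputObject) { return $false }

    # The Properties collection returns $null for a name it does not contain.
    return ($null -ne $InputObject.psobject.Properties[$Name])
}

$x = [pscustomobject]@{ Name = 'demo' }
$hasName  = Test-PropertyExists $x 'Name'
$hasTypo  = Test-PropertyExists $x 'pretnd'
$onNull   = Test-PropertyExists $null 'Name'
```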

But this does make me ask what about method calls:

$x.pretend     -> $null
$x.pretend()   -> Exception MethodNotFound
$x?.pretend  -> $null
$x?.pretend()   -> ____ what goes here? $null?

set-strictmode -version 2
$x.pretend -> Exception PropertyNotFoundStrict
$x.pretend()  -> Exception MethodNotFound
$x?.pretend  -> $null
$x?.pretend()   -> ____  ?

i.e ?. would behave the same as . for property lookups in non-strictmode, but would behave differently to . for method calls in non-strictmode, making it nuanced and not a drop-in replacement. Does that mean that when using ?. you would have no way at all to tell whether a method was invoked, or nothing at all happened? Properties can be backed by getter methods, but I think it's convention in DotNet for getters not to mutate internal state, but normal for method calls to mutate state. With ?. you would have no idea if you mutated an object's state?
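The existing half of that comparison (plain . with strict mode off) can be reproduced as follows. This sketch covers both a non-null object with a missing member and a $null receiver; the variable names are illustrative.

```powershell
Set-StrictMode -Off
$x = [pscustomobject]@{ Name = 'demo' }

$missingProperty = $x.pretend              # $null, no error

$missingMethodError = $null
try { $null = $x.pretend() } catch { $missingMethodError = $_ }    # error even in non-strict mode

$nullReceiverError = $null
try { $null = $null.pretend() } catch { $nullReceiverError = $_ }  # calling a method on $null also fails
```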

I wonder if you can get the desired benefits, and better, by making ?? able to handle PropertyNotFoundException and MethodNotFound Exceptions directly from the expression on the left (but not other Exceptions raised inside methods), as well as handling $nulls, and not have ?. at all. e.g.

$varThatMayBeNull ?? $fallbackValue # null-coalescing
#
$varThatMayBeNull.Name ?? $fallbackvalue    # catching PropertyNotFoundStrict exception
$varThatMayBeNull.Name ??  # default fallback is $null
$varThatMayBeNull.Method() ??  # catching MethodNotFound Exception

# and potentially addressing the index case too
$varThatMayBeNull.first.second[3].Method() ??  # catching MethodNotFound Exception, or index not found Exception

And then, instead of ?. calling a member or accessing a property, it could be a concise way of asking if a member exists, returning a [bool].

With the same syntax addition of ?? and ?. you could handle more situations, with less confusing overlap.
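As a rough illustration of the semantics suggested here, the idea can be emulated with a function that takes the left-hand expression as a script block. This is a sketch only: Get-ValueOrFallback is a hypothetical name, and the list of error identifiers is an assumption about which lookup failures should be swallowed.

```powershell
# Hypothetical emulation of a '??' that also absorbs member-lookup failures.
function Get-ValueOrFallback {
    param([scriptblock] $Expression, $Fallback = $null)

    try {
        $value = & $Expression
    }
    catch {
        # Assumed set of lookup-related error IDs; anything else is rethrown.
        $lookupErrors = 'PropertyNotFoundStrict', 'MethodNotFound', 'InvokeMethodOnNull'
        if ($lookupErrors -notcontains $_.FullyQualifiedErrorId) { throw }
        return $Fallback
    }

    if ($null -eq $value) { $Fallback } else { $value }
}

$item = $null
$fromNull    = Get-ValueOrFallback { $item.Method() } 'AnythingFailed'
$fromMissing = Get-ValueOrFallback { (42).NoSuchMethod() } 'AnythingFailed'
$fromValue   = Get-ValueOrFallback { 'abc'.Length } 'AnythingFailed'
```

The script block wrapper is what a built-in operator would make unnecessary, since the operator could defer evaluation of its left-hand side itself.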

StrictMode takes away that benefit and leaves the hindrance of having to write existence tests

One person's hindrance is another person's benefit:
It all depends on what you want to have happen _by default_:

  • If you're confident that there are no _typos_ in your code - by all means, take advantage of the _default_ behavior (strict mode OFF) - and you'll have no need for ?. (for _property_ access - see next comment re _method calls_ and _indexing_).

    • On a _personal_ note: I tend to use Set-StrictMode -Version 1 to avoid typos in _variable_ names, but I find -Version 2 or higher too annoying, primarily due to #2798 (which is a bug that is independent of this proposal).
  • If you want to make sure that (a) you haven't misspelled property names and/or (b) your code doesn't accidentally operate on data types other than those you meant to operate on, set strict mode to version 2 or higher.

    • Indeed, that _currently_ makes it cumbersome to _opt out_ of these checks _on demand_,
    • which is precisely why introducing null-conditional access is being proposed here: to make it _easy_ to opt out on demand.

Explicitly testing for the presence of a member is really a separate use case that can be a necessity irrespective of the strict mode in effect.

As for _method calls_, which the OP doesn't cover:

_Method calls_ (e.g., $varThatMayBeNull.Method()) and _indexing_ (e.g., $varThatMayBeNull[$index]) are two cases where even strict mode being off or at version 1 could benefit, given that both cases currently _invariably_ cause an error if the value being accessed is $null.

The syntax for methods would be the same as for properties - $varThatMayBeNull?.Method() - and analogous for indexing - $varThatMayBeNull?[$index] - as in C#.

_Update_: There is a second aspect to indexing, namely if the variable is non-null, but the element at the given index doesn't exist (e.g., $arr = 0,1; $arr[2]). With strict mode version 2 or below, this evaluates to $null, but in version 3 or higher it causes an error, and it would be nice to be able to opt out of this error as well. However, the syntax for this case presents a challenge - see the updated OP, which currently proposes the perhaps awkward [...?].
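The version-dependent behavior described in the update can be observed with the following sketch; the variable names are illustrative.

```powershell
$arr = 0, 1

Set-StrictMode -Version 2
$underVersion2 = $arr[2]                  # out of range, but still evaluates to $null

Set-StrictMode -Version 3
$underVersion3Error = $null
try {
    $null = $arr[2]                       # out-of-range index is now an error
}
catch {
    $underVersion3Error = $_
}
finally {
    Set-StrictMode -Off
}
```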

And, yes, I would make such access default to $null as well, even for methods - again, you're _explicitly_ signaling that it's fine to do nothing if the object being accessed is $null or if no such array element exists.

Note that C# has no null-conditional indexing in the second sense.

It has null-conditional indexing, just not quite like how you're proposing. C# allows (I'm pretty sure, at least...) array?[0] which bypasses the usual "index into null array/collection" issue and just returns null.

But yeah, I agree that this is a good idea. It saves us going through exception handling, and it keeps code pretty clear and concise.

@vexx32, that is _one_ aspect of null-conditional indexing: if array in array?[0] is null, the indexing operation is ignored - C# has that too, as you state.

I was (update: _also_) talking about the _other_ aspect of null-conditional indexing: the case where array in array?[0] is _not_ null but it is _the element at index 0_ that doesn't exist.

It is the latter that cannot be ignored in C#, to my knowledge, yet I think it would be useful too.

Can you think of terminology to give these two aspects distinct names?

The code from another JavaScript issue would go from $_.Description[0].Text to $_?.Description[0?]?.Text which I think is not pretty; and my other suggestion for how ?? could behave would take it to $_.Description[0].Text ?? which is subjectively nicer, and would - if possible to implement - handle failures at any lookup point in the chain all in one go, and still let you choose what you want to happen by default, and have a different fallback at will instead of being limited to only $null.

As an aside, would this ?. syntax have an analogue for static properties and methods, i.e. $t = [int]; $t::MaxValu; $t::Pars("1"), would that become ?: or ?:: ?

What would be the desired behaviour for array indexing with more than one index specified? $array[0,4?,1,2] where each one can be allowed to fail, or $array[0,4,1,2?] with only the entire lot allowed to succeed or fail?

I first have to take a step back, because I realize that I've conflated aspects that are worth separating, at least conceptually:

  • _Null-value_-conditional member or index access, as in C#:

    • If a value being accessed is $null, ignore attempts to access its members or index into it and evaluate to $null.
  • _Member-existence_-conditional member or index access, specific to PowerShell:

    • If a value being accessed is _non-$null_, ignore attempts to access _members_ (properties, method calls, indexing) that don't exist.

Note: For brevity I'm using _member_ somewhat loosely here to also encompass _elements_ of objects that happen to be collections, which are accessed with _indexing_ rather than dot notation.

If you're confident that there are no typos in your code - by all means, take advantage of the default behavior (strict mode OFF) - and you'll have no need for ?. (for property access

@mklement0 I think you're discussing from the view "one person using strict mode to improve their own code" and I from the view "someone else can run my code in strictmode, what changes would make my code work for them as well?" and that's leading me down different reasoning. With ?. it would be so convenient to use it instead of . and make code also work in strictmode that I may as well use it everywhere habitually; I lose nothing and gain compatibility. I fear all PowerShell code will uglify in that way because it "works" in more cases.

But it isn't a drop-in replacement, because it silences failed method call lookup in non-strictmode, and silences failed property lookups in strictmode - it adds a fiddly nuance to member lookup, an incredibly common operation. It doesn't help anyone who wants to benefit from strictmode checks by making that boilerplate easier. It doesn't help enough people in enough situations, for the complexity it adds. And it's too convenient to make code more compatible, I don't think "ignore it if you don't want it" is strong enough to hold it at bay.

PowerShell can do things C# doesn't, like setting automatic variables, or having a precedent for having exceptions be thrown and silenced behind the scenes. There's got to be a better approach which doesn't complexify member lookup at all, but simplifies strictmode safety check boilerplate so that people who do want to use the checks have easier lives and people who were annoyed by the checks have easier lives as well.

@HumanEquivalentUnit How does it silence these things? You would have to write a load of exception handling and null checks previously, instead:

$prop = $item?.Method().PropIwant ?? 'PropActionFailed'

v.

$prop = if ($null -ne $item) {
    $propIwant = $item.Method().PropIwant
    if ($null -eq $propIwant) {
        'PropActionFailed'
    }
    else {
        $propIwant
    }
}
else {
    # $item itself is $null: same fallback as the one-liner above
    'PropActionFailed'
}

And this case isn't uncommon. Or you just throw $ErrorActionPreference = 'Stop' and wrap the whole thing in try/catch to simplify the null checking which is a bad pattern (exception-driven logic).

Also, this is the future of the language, it's not like we're going back to 5.1 and changing things. If you want more adopters, you need to make it easier to read code, which I feel this suggestion does.

@mklement0 I think you're discussing from the view "_one person using strict mode to improve their own code_" and I from the view "_someone else can run my code in strictmode, what changes would make my code work for them as well?_"

Are we talking about "set -euo pipefail" (bash strict mode) equivalent?

a- Limit to variable look-ups (although I don't see why you think a simple null check on member lookups would be a bad thing, the actual member lookup is way more expensive than if (result is null))
b- Use [Strict] on scopes/codeblocks to tell the parser to treat them as such,
c- Strictness is based on scope: code that isn't marked strict isn't strict.

If I wrote strict code that fails when you use it, then one of us made a mistake. Either I referenced a missing variable in code I never tested or you've misunderstood my badly designed API that depends on the existence of variables that aren't passed in to my function and you failed to populate a variable I expect to be populated...

# Their code
Function Log($Message) { Write-Host $Massege }  # not strict

# My code
[Strict]
Function DebugMsg
{
  Param([String] $Message)
  if ($DEBUG) { Log $Message }
}

# Your code
DebugMsg "Hello"  # you didn't define DEBUG, and by saying strict I expressed a contract whereby the variable MUST be defined.

The "Log" function wasn't marked strict so it doesn't complain about the typo

🤔 Now that you mention it, a [Strict()] attribute would be an _awesome_ way to implement strict mode in a scope-restricted way. That needs to be mentioned in that other issue thread on strict mode issues, I'll go dig that up and link back here...

@TheIncorrigible1 "How does it silence these things?" in non-strict mode $null.x is silent, $null.x() is an exception. With ?. as discussed, $null?.x is silent, $null?.x() is silent.

So you'd have a . which works like C#'s ?., except in strict-mode where it throws exceptions. Then you'd have a ?. which works in strict mode like C#'s ?. and like PowerShell's . works in non-strict mode, but which works differently to PowerShell's . in non-strict mode for method calls. What a horrible combination, (and a combination where ?. is clearly better and more useful in more of those scenarios).

You would have to write a load of exception handling and null checks previously, instead: $prop = $item?.Method().PropIwant ?? 'PropActionFailed'

Your method call might throw, or return $null and PropIWant might be missing, you'd still have to write:

$prop = try {
    $item?.Method()?.PropIwant ?? 'PropActionFailed'  
} catch [someMethodException] {
    'PropActionFailed'
}

vs with my proposed ?? (if it was possible to build), and no need for ?., and it covers possible cases of Method exceptions in case all you care about is "success or not" and no detailed error handling.

$prop = $item.Method().PropIWant ?? 'AnythingFailed'

If you want more adopters, you need to make it easier to read code, which I feel this suggestion does.

How often do you type . in PowerShell? How often do you want to explain to new users what the difference is between . and ?. and tell them they have to consider the difference every time they type a . forever?

"Oh, how did that come about?" "Well in C# . threw an exception for missing members, so they invented ?. which didn't. That was great, so they did that in PowerShell by default, the . doesn't throw. Then they invented strict-mode which makes . behave like C# and throw but only in that context, But the way to manually check members was really wordy and bothersome and nobody wanted to type it; instead of addressing that problem head-on and making that more convenient to benefit from the strictmode, they dodged that and aped C#'s ?. instead to gain a sneaky way out of the strictness that you asked for, which is mildly convenient for you, but sucks for everyone else. And you still have no member existence test, and benefiting from strictmode is still wordy and bothersome, but at least you don't have to suffer that because 'strictmode' has a 'normalmode' escape hatch now". Ick.

If I wrote strict code that fails when you use it, then one of us made a mistake.

although I don't see why you think a simple null check on member lookups would be a bad thing, the actual member lookup is way more expensive than if (result is null)

I don't think that would be bad, PS non-strictmode already does that, and I think that would be a desirable alternative to the proposed ?. - very different, useful in more situations.

But note that in PS you can do something C# cannot:

(1..3)[1kb..100kb]

and no exceptions in non-strictmode. Change the numbers and see how long it takes; judging by the performance and how I think it must work behind the scenes to index into arbitrary objects, it seems there actually are ~100,000 exceptions being raised and silenced, for our convenience. PS doesn't always make the "must be as fast as possible, push the inconvenience on the user" choice, and I like that about it.

@HumanEquivalentUnit There are many features in every language you may never use in your career and if you don't like the style, that's fine, you don't need to use those features. I don't see why you're writing strict mode off; it's a good practice in scripts so you're consciously dealing with problems rather than letting the language swallow them (and in essence, having implicit empty catch blocks everywhere). Again, you're still welcome to not use the features (e.g. I've never used Out-Default).

Also, the operator is ?, not ?.; it would work with index access as well.

To continue to lay the groundwork for further discussion, let me summarize how the existing strict modes affect null-value-conditional and member-existence-conditional access (the terms introduced above):

The columns below represent the Set-StrictMode settings, and the column values indicate:

  • 👍 ... allowed
  • 🚫 ... prohibited (causes statement-terminating error)

As an aside: the fact the errors are only _statement-_, not _script_-terminating, relates to the larger discussion around PowerShell's error handling - see https://github.com/PowerShell/PowerShell-Docs/issues/1583.

  • _Null-value_-conditional access - $obj _itself_ is $null:

Construct | -Off | -Version 1 | -Version 2 | -Version 3+
---------- | ---- | ---------- | ---------- | ------------
$obj | 👍 | 🚫 | 🚫 | 🚫
$obj.Prop | 👍 | 👍 | 🚫 | 🚫
$obj[42] | 🚫 | 🚫 | 🚫 | 🚫
$obj.Method() | 🚫 | 🚫 | 🚫 | 🚫

It is curious that with -Off and -Version 1, $obj.Prop is allowed, whereas $obj[42] isn't.
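The inconsistency noted for the -Off column is easy to reproduce. A minimal sketch with illustrative variable names:

```powershell
Set-StrictMode -Off
$obj = $null

$prop = $obj.Prop                         # allowed: evaluates to $null

$indexError = $null
try {
    $null = $obj[42]                      # not allowed: indexing into $null is an error
}
catch {
    $indexError = $_
}
```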

  • _Member-existence_-conditional access - $obj is non-$null, but the member / element being accessed doesn't exist:

Construct | -Off | -Version 1 | -Version 2 | -Version 3+
---------- | ---- | ---------- | ---------- | ------------
$obj.Prop | 👍 | 👍 | 🚫 | 🚫
$obj[42] (indexable) | 👍 | 👍 | 👍 | 🚫
$obj[42] (non-indexable) | 👍 | 👍 | 🚫 | 🚫
$obj.Method() | 🚫 | 🚫 | 🚫 | 🚫

It is curious that inherently indexable objects (e.g., arrays) allow access to nonexistent elements even with -Version 2, whereas scalars - where PowerShell provides indexing capabilities for the sake of unified handling with collections - do not.

The null-soaking (null-conditional or Elvis in C#) operator is almost mandatory when running in StrictMode Latest and trying to access a property that may or may not exist. Currently you have to resort to stupid tricks like this:

function Get-OptionalPropertyValue($object, [string] $propertyName) {
    if (-not ([bool] (Get-Member -InputObject $object -MemberType Properties -Name $propertyName))) {
        return $null
    }

    $object.$propertyName
}

$value = Get-OptionalPropertyValue $foo "bar" # if $foo.bar exists, $value will contain its data; else $value will be $null

when you _really_ should just be able to write:

$value = $foo?.bar # if $foo.bar exists, $value will contain its data; else $value will be $null

Please guys, if you could get only null-conditional from this proposal implemented and only for properties, it would save so many people so much pain.

I started having a look at this and found a few problems.

Let's consider ?? and ?=

$x ?= 1 and $x ?? 1 seem to be straightforward. But, since they are operators we need to support $x?=1 and $x??1.

The problem with that is $x? and $x?? are both valid variable names in PowerShell. The only way to disambiguate would be ${x}?=1 and ${x}??1.

Moreover, for conditional member access ?. and ?[], we would have to have ${x}?.name and ${x}?[0].
For these, $x ?.name and $x ?[0] will not be supported.
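The ambiguity can be inspected with the public parser API: tokenizing $x?=1 yields a variable token whose name includes the question mark. A sketch, assuming PowerShell 3.0 or later, where System.Management.Automation.Language.Parser is available; the variable names are illustrative.

```powershell
$tokens = $null
$errors = $null

# ParseInput fills $tokens and $errors through its [ref] parameters.
$null = [System.Management.Automation.Language.Parser]::ParseInput(
    '$x?=1', [ref] $tokens, [ref] $errors)

$firstToken   = $tokens[0]
$variableName = $firstToken.Name          # 'x?': the '?' is consumed as part of the name
```

Writing ${x} instead ends the variable name at the closing brace, which is why the braces disambiguate.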

The other option is to introduce a breaking change and not allow ? in the variable name at all.

What do you think? @mklement0

/cc @rjmholt @daxian-dbw @JamesWTruher

I have never particularly _liked_ nor understood why ? is a valid variable name. It opens the door for names like $Safe? which is generally more clear and less awkward when written as $IsSafe anyway.

I think that change has probably been a long time coming.

It opens the door for names like $Safe? which is generally more clear and less awkward when written as $IsSafe anyway

Interestingly I expressed the opposite sentiment. It's a classic Scheme/Ruby tradition.

if ($questionMark?)
{
    Write-Host "That's silly!"
}

Didn't the ternary operator have this same problem, where the resolution was to require spaces around the operator or curly braces around the variable name in order for it to work properly with variables whose names end with a ? character?

I would expect that with these operators being introduced, PowerShell would prioritize recognizing the ? as part of the operator in syntax such as $x?[0], such that users who use variable names ending in ? to store a collection would have to use this syntax instead (since spaces would be invalid in this case): ${x?}?[0]. Ditto for $x?.Name and ${x?}?.Name.

The other option is to introduce a breaking change and not allow ? in the variable name at all.

👍 Given this is a major version change (6 to 7), I think it is worth a breaking change here. I wish GitHub had better code search to see how many instances there are of PS variables that end with a ?. I suspect this wouldn't be that impactful but it would be nice to verify that against GitHub. In 14 years of using PowerShell, I've never used a ? in a variable name but maybe that's just me. :-)

That ship has sailed long ago, and I wouldn't be surprised to see variables ending with ? in use in scripts today.

At this point if we simply prioritize parsing ? as part of an operator rather than as part of a variable name when it comes to these new operators, folks who use variable names ending in ? would need to use {} or spaces (where spaces are allowed) to use those variables with these operators.

It's strange to hear the phrase "that ship has sailed" in this context. This is a _major version_ change. It's not unreasonable for this to change here, I think.

@rkeithhill I've used it in personal stuff, but thought it would be unclear in collaborative work, since having symbols as part of variable names is so counterintuitive to programmers (similar to using emojis as variables).

@KirkMunro having "prioritized parsing" sounds like an open door for bugs.

@vexx32: It's not unreasonable for a breaking change since this is a major version change; however, the bar for such breaking changes should remain very high, and I don't think this comes close to passing that bar, since users could use variables ending in ? just fine as long as they use the {} enclosures to identify the variable name.

Note that you can have a property name that ends in ? as well. Currently if you try to view such a property in PowerShell without wrapping the name in quotes, you'll get a parser error.

For example:

PS C:\> $o = [pscustomobject]@{
    'DoesItBlend?' = $true
}
PS C:\> $o.DoesItBlend?
At line:1 char:15
+ $o.DoesItBlend?
+               ~
Unexpected token '?' in expression or statement.
+ CategoryInfo          : ParserError: (:) [], ParentContainsErrorRecordException
+ FullyQualifiedErrorId : UnexpectedToken

PS C:\> $o.'DoesItBlend?'
True

I'm a little surprised that doesn't parse today (why doesn't that parse?), but regardless, for such properties you need to enclose their name in quotes, in which case you could follow the quote with a ternary, null-coalescing, etc. operator without spaces and it would work just fine. I find this syntax very similar to ${x?}?.name, and I'm ok with the stance that you can use such variable/property names if you want, but such names may require extra syntax or spacing to work with ternary or null-* operators.

@KirkMunro Nothing stops people from using variable-bounds going forward if they want esoteric variable names. I do agree with the others in wondering how widely that behavior is actually used today.

People on PowerShell 7 are likely already enthusiasts and will be aware of the breaking changes. People who are not, are still using <=v5.1 and will continue to for a long time; likely until msft removes it from Windows 10 (never).

@KirkMunro Sure, but removing it from the standard permissible variable characters doesn't prevent users from just doing ${Valid?} as the variable name anyway. Since they'd have to do that with these operators regardless, I think it'd be better to just have it consistent, rather than have ? become a character that's _only sometimes_ considered part of a variable name.

That already _is_ going to be a breaking change, and I'd think it best to at least be consistent about it and go all the way rather than introduce another level of ambiguity. 🙂

@adityapatwardhan I think it would be better for the language to remove ? as a valid variable character. It would easily enable both null-soak/-coalesce and ternary operators in a familiar syntax which add a lot of ergonomics to the script authoring process.

@KirkMunro having "prioritized parsing" sounds like an open door for bugs.

@TheIncorrigible1: Any code can open the door for bugs if it's not implemented properly. I'm just talking about a simple single-character lookahead to identify if PowerShell runs into a ternary or null-* operator when it parses a variable name that is not enclosed in variable bounds and encounters a ? character. That's not complicated, and doesn't open the door for more bugs than any other code change does.

People on PowerShell 7 are likely already enthusiasts and will be aware of the breaking changes. People who are not, are still using <=v5.1 and will continue to for a long time; likely until msft removes it from Windows 10 (never).

@TheIncorrigible1: What basis/evidence do you have for that statement? PowerShell 7 is in preview, so today it's only used by enthusiasts. That's a given. But beyond preview, if PowerShell 7 or later offers compelling features that companies need, while supporting the functionality they need, they'll use those versions. That is especially true if PowerShell 7 gets installed with the OS. The only place where enthusiasts come into play is in organizations that don't have a business need for what PowerShell 7 brings to the table.

That already is going to be a breaking change, and I'd think it best to at least be consistent about it and go all the way rather than introduce another level of ambiguity. 🙂

@vexx32 That's stretching it. It would be a breaking change if you had ?? in a variable name, but the likelihood of that is much more remote than having a variable whose name ends in a single ?. Other than that, how would the introduction of null-* operators while still supporting ? as a standard permissible variable character break scripts today?

In my experience, breaking changes for nice-to-haves (which is what this seems to be) are far more often than not rejected, and the discussion/arguments around them only serve to slow down the process of getting things done dramatically, to the point where things just stall or miss getting into a release, etc. The slowdown is often necessary, because if you're proposing a breaking change, evidence is needed to be able to assess the impact and justify such a change. It's hard to gather that evidence today. I'm just saying this because I'll choose getting features done now over arguing about a nice-to-have breaking change any day.

I never use ? in variable names, nor would I. I expect some folks do, though, because it can read very well in a script, and putting up unnecessary barriers to entry for PowerShell 7 just slows down adoption, especially when many people working with PowerShell aren't developers who are more accustomed to working through breaking changes.

Anyway, it is not my intent to slow down this process -- rather the opposite. But I've shared my thoughts and experience, so I won't comment further on whether or not we should push for a breaking change here.

Consider this contrived example:

$ValueIsValid? = @( $true, $false, $false, $true )

$ValueIsValid?[0]
# old behaviour? gets `$true`
# new behaviour? gets nothing, because the `?[0]` is interpreted as a null-conditional access.

This behaviour would _already_ break with the proposed changes. I would prefer a consistent, clean break than a confusing break that needs a half-page explanation to list all the possible exceptions and when and where ? suddenly isn't valid, and where it still is.

@KirkMunro if PowerShell 7 or later offer compelling features that companies need, while supporting the functionality they need, they'll use those versions. That is especially true if PowerShell 7 gets installed with the OS.

Having worked in a few Fortune 50s with some employee bases being in the hundreds of thousands, even getting away from what was default on the OS was a challenge (i.e., moving to v5.x). I have yet to see any place adopt Core; they'd rather just move to Python for cross-compatibility. Enabling Windows optional features was also a pretty seldom task.

I think companies work with what they have and the knowledge base of their employees over seeking out new technology or language versions to solve their problems. Some would be perfectly content staying on v2 forever and never touching the Win10/Server16 cmdlets that make life dramatically easier.

My point with all of this, is that these features are not in a vacuum. If you make a language more ergonomic, it will see greater adoption by people being interested in the tool to solve the problems they have faster. (See: C# and the growth in popularity with more/better language features)

Regarding variables ending in ? - that could be a significant breaking change because of the automatic variable $?.

@lzybkr I suspect the best case for dealing with that is a special case like what already exists for $^ and $$.

@lzybkr @TheIncorrigible1 As far as I remember, _all_ of those variables are explicitly special-cased in the tokenizer. $^ _already_ isn't a valid variable name. The tokenizer has special cases for all of those before it starts looking for standard variable names.

The only solution here is to use ¿:

$silly?¿=1

I dunno Steve, I'm firmly in the interrobang crowd on this one.

$silly?‽=1

@lzybkr - to my knowledge $? $$ and $^ are treated in a special way.

https://github.com/PowerShell/PowerShell/blob/8b9f4124cea30cfcd52693cb21bcd8100d39a796/src/System.Management.Automation/engine/parser/tokenizer.cs#L3001-L3004
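One way to check how a given PowerShell version tokenizes these cases is to ask the parser directly. The sketch below uses the public Parser.ParseInput API; the helper name Get-VariableTokenName is hypothetical, and the results depend on the PowerShell version you run it on.

```powershell
# Hypothetical helper: returns the names of all variable tokens found in a piece of source text.
function Get-VariableTokenName {
    param([string] $Source)

    $tokens = $null
    $errors = $null
    [void][System.Management.Automation.Language.Parser]::ParseInput($Source, [ref]$tokens, [ref]$errors)

    $tokens |
        Where-Object { $_ -is [System.Management.Automation.Language.VariableToken] } |
        ForEach-Object { $_.Name }
}

Get-VariableTokenName '$?'         # the automatic variable
Get-VariableTokenName '$x?.name'   # shows whether '?' was absorbed into the variable name
```

Running the second call on different versions is a quick way to see which of the parsing choices discussed here a release actually ships with.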

To summarize, we have these four options:

  • Big breaking change - not allow ? in the variable names.
  • Prefer ?. as an operator. This means variable names ending with ? must use ${variablename} syntax. Still a breaking change.
  • Prefer old behavior. This means to use ?. and ?[], ${variablename}?.property must be used. No breaking change, but makes using the new operators clumsy.
  • Do not implement the new operators.

Personally, I do not favor the first and the last options.

This is a major version change. It's not unreasonable for this to change here, I think.

I think an important point to make is that the major version is being incremented to signal readiness to replace Windows PowerShell, which is to say it indicates compatibility, not breakage. From the original announcement:

Note that the major version does not imply that we will be making significant breaking changes.

Are options 1 and 2 not the same? Or would variables still be permitted to use ? in the case that there is no null-coalescing or ternary operator following them?

If so, I think we may have a lot of trouble handling the tokenizing / parsing logic without a lot of backtracking. This might lead to performance degradations when variables end with ?.

Given that, from what I can see, option 2 seems to be the general preference, I'm not really sure I understand the reluctance to make the breaking change here. Doing it that way will actively discourage use of ? in variable names anyway, simply by introducing scenarios where they aren't usable without enclosing the variable name.

I think this is a fairly minor change as breaking changes go, and having a break in the consistency of what can and can't be used in a variable name is probably the worse option, in my opinion.

These things should have clear rules that apply in pretty much all situations. Currently, this is the case. We're proposing to muddy the waters here and make the behaviour _less_ clear. I can't see anything good coming from it, except perhaps that variable names containing ? become less used simply because they're harder to use in some scenarios -- which (effectively) brings us right back to option 1 by default, almost... so I don't see any particular reason not to take the break now and just avoid the ambiguous behaviour.

@vexx32 There is a slight difference between 1 and 2.

For #1, we disallow ? in variable names altogether. This means $x? = 1, $x?? = 1, and $x?x?x = 1 will not parse.

For #2, $x? = 1 is still valid but $x?.name is equivalent to ${x}?.name. This only breaks variable names with ? at the end, which are accessing members. So, $x? = 1, $x?? = 1, and $x?x?x = 1 would still be valid.

Yeah, I'd be a little concerned about tokenizing overhead there. Might be worth implementing both possibilities and doing some benchmarks on a script that uses similar variable names reasonably heavily.

My preference is definitely option 1... a one-time breaking change is, to me at least, way more preferable to having to deal with the inconsistencies in how that can be parsed way into the future.

If so, I think we may have a lot of trouble handling the tokenizing / parsing logic without a lot of backtracking. This might lead to performance degradations when variables end with ?.

I share this concern about that possibility. Having ? as a sometimes-token-separator feels like a jagged line in the language to me.

If so, I think we may have a lot of trouble handling the tokenizing / parsing logic without a lot of backtracking. This might lead to performance degradations when variables end with ?.

It depends on how you implement it. As long as you take a lookahead approach rather than a tokenize and then back-up approach, you should be fine. You can look ahead at what characters are next when you encounter a ? in a variable name, and make a decision on how you want to "wrap up" the variable token based on what's next. That isn't a lot of trouble and shouldn't require backtracking or result in noticeable performance degradations.
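The lookahead idea can be sketched as a toy scanner. This is not the actual PowerShell tokenizer: Split-VariableToken is a hypothetical helper that only handles letters, digits, _ and ?, and it implements the "option 2" rule described above, where a ? ends the variable name only if the next character is . or [.

```powershell
# Toy illustration of single-character lookahead; NOT the real tokenizer.
# Input: the text that follows the '$' sigil, e.g. 'x?.name'.
function Split-VariableToken {
    param([string] $Text)

    $i = 0
    while ($i -lt $Text.Length) {
        $c = $Text[$i]

        if ([char]::IsLetterOrDigit($c) -or $c -eq '_') {
            $i++
            continue
        }

        if ($c -eq '?') {
            # Peek at the next character without consuming it.
            $next = if ($i + 1 -lt $Text.Length) { $Text[$i + 1] } else { [char]0 }

            # '?.' and '?[' start an operator, so the name ends before the '?'.
            if ($next -eq '.' -or $next -eq '[') { break }

            # Otherwise '?' stays part of the variable name.
            $i++
            continue
        }

        break   # any other character ends the name
    }

    [pscustomobject]@{
        Name = $Text.Substring(0, $i)
        Rest = $Text.Substring($i)
    }
}

Split-VariableToken 'x?.name'   # Name = x,     Rest = ?.name
Split-VariableToken 'x? = 1'    # Name = x?,    Rest =  = 1
Split-VariableToken 'x?x?x'     # Name = x?x?x, Rest is empty
```

The scanner never backs up: each ? costs one extra character comparison, which is the basis for the claim that lookahead need not hurt performance.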

@PowerShell/powershell-committee reviewed this one today, we have a couple thoughts:

  • No matter what we do, we're going to do some analysis of our corpus of scripts to see how often folks use ? in variable names
  • Some of us have a hypothesis (that others would like to validate) that the users who are using ? in their variable names may be less advanced users (as we agree in this room we'd stay away from it because of the potential problems that could arise). On the other hand, anyone using the functionality described here will be able to understand a slightly more complicated syntax (like ${foo}?.bar). Therefore, we prefer option 3 because it avoids breaking changes on these less experienced users. (But again, we will validate it per the first bullet.)
  • @daxian-dbw raised a good point about making it harder to find all variable references when scripters are mixing usage of $foo and ${foo}. However, we agreed that this can be fixed in tooling (like the VS Code PowerShell extension), and users like @JamesWTruher who use editors like Vim can easily match on both with their search syntax.

harder to find all variable references when scripters are mixing usage of $foo and ${foo}

That depends on whether the AST differentiates the variable name based on whether it saw the braces. I suspect that any tool that uses the PowerShell AST will have no trouble with braces here.

But for regex, it certainly means you have to be more aware: varname -> \{?varname\}?
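Both approaches can be compared with a small sketch. The sample script text and the variable name foo are made up for illustration; the regex is a slightly tightened variant of the pattern above that also refuses to match longer names such as $foobar.

```powershell
# Sample source that mixes $foo and ${foo}, plus an unrelated $foobar.
$script = '$foo = 1; ${foo} + 1; $foobar = 2'

# AST approach: the variable path is the same whether or not braces were used.
$ast = [System.Management.Automation.Language.Parser]::ParseInput($script, [ref]$null, [ref]$null)
$astHits = $ast.FindAll({
    param($node)
    $node -is [System.Management.Automation.Language.VariableExpressionAst] -and
        $node.VariablePath.UserPath -eq 'foo'
}, $true)

# Regex approach: optional braces, and no word character directly after the name.
$regexHits = [regex]::Matches($script, '\$\{?foo(?!\w)\}?')

@($astHits).Count    # references to 'foo' found via the AST
$regexHits.Count     # references to 'foo' found via the regex
```

The AST route needs no knowledge of the brace syntax, whereas the regex has to encode it explicitly.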

@rjmholt I did an analysis over here. A quick breakdown: Out of nearly 400,000 scripts with 22,000,000+ variable names (not unique), there were only ~70 unique variables that ended with ?.

Thanks, @mburszley - to me that makes it a Bucket 3: Unlikely Grey Area change, and burdening users with the need for {...} is highly unfortunate, paralleling the unfortunate need for $(...) around exit / return / throw statements in the context of && and ||.

To borrow the reasoning from that issue:

If we force the use of {...} around variable names just so you can use ?.:

  • New users will not expect the need for {...} and won't necessarily know that _it_ is what is required when the following fails (possibly _undetected_, unless Set-StrictMode -Version 2 or higher is in effect): $o = [pscustomobject] @{ one = 1 }; $o?.one

  • Even once users _do_ know about the need for {...}:

    • They will forget to use it on occasion, because of the counter-intuitive need for it.
    • When they do remember, this seemingly artificial requirement will be an ongoing source of frustration, especially since {...} is hard to type.
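The silent failure described in the first bullet can be reproduced with a short sketch. It assumes a PowerShell version in which the ?. operator is enabled (experimental in 7.0 per this thread; 7.1 or later is assumed here) and that no variable named o? exists in the session.

```powershell
Set-StrictMode -Off

$o = [pscustomobject]@{ one = 1 }

# Parsed as member access on a variable named 'o?', which does not exist,
# so with strict mode off this quietly yields $null.
$withoutBraces = $o?.one

# Braces end the variable name at 'o', so '?.' is the null-conditional operator.
$withBraces = ${o}?.one

$null -eq $withoutBraces   # True
$withBraces                # 1
```

Under Set-StrictMode -Version 2 or higher, the first access would instead raise an error about the unset variable, which is the only case in which the mistake becomes visible.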

This issue has been marked as fixed and has not had any activity for 1 day. It has been closed for housekeeping purposes.

@rjmholt, quoting you from https://github.com/PowerShell/PowerShell/issues/10967#issuecomment-561843650:

not breaking the way we parse existing valid syntax was the right way to go. Again, a change there isn't just a case of us making a decision to support a small number of wild programs out there — we've made numerous breaking changes in cases where we thought the pros outweighed the cons (not that I personally agree with all of them, or that those do or don't justify others). The issue is that changing some aspect of the tokenizer or parser means that two PowerShell versions will no longer generate the same AST for some scripts, so two PowerShells will "see" the same script differently. That means we can't even read the syntax the same way, so there's no PSScriptAnalyzer rule you can write to pick up possible issues around it; PSScriptAnalyzer will see a different syntax from one PowerShell version to the next.

Unlike C#, PowerShell can't pre-compile a different syntax to a compatible format for later execution, meaning that a syntactic change is tethered to the runtime. Once PowerShell makes a syntactic break, we reduce the number of scripts that can work against different versions, meaning users are forced to write two scripts, which is a serious issue for a shell and a scripting language. PowerShell is supposed to be dynamic enough to bend past all the differences with logic inside one script, but the syntax is the one non-negotiably static thing we have. And making a syntax change where both before and after are valid is especially pernicious, since there's no simple way to detect it, there's no warning for it and even after executing it in both environments, users might not know that a different behaviour occurred.

In general, I can appreciate that such changes can be very problematic and should only be made if the pros outweigh the cons, an important factor of which is how much existing code breaks.

we reduce the number of scripts that can work against different versions
meaning users are forced to write two scripts

We're talking about a _new_ syntactic feature here (?.), so by definition scripts that use it _cannot_ (meaningfully) run on older versions - unless you specifically add conditionals that provide legacy-version-compatible code paths, but that seems hardly worth it - you would then just stick with the legacy features.

Yes, the interpretation of $var?.foo in old scripts that used $var? as a variable name would break, but:

  • ? should never have been allowed as part of an identifier _without enclosing it in {...}_.

  • Since being able to do so is unexpected and probably even _unknown_ to many, such variable names are exceedingly rare in the wild, speaking from personal experience, but @mburszley's analysis provides more tangible evidence.

    • Even Ruby only allows ? (and !) at the _end_ of identifiers, and there only of _method_ identifiers, and I suspect that most Ruby users are aware that they should _not_ assume that other languages support the same thing.

So, pragmatically speaking:

  • The vast majority of existing code will not be affected - a token such as $var?.foo will simply not be encountered.

  • If you write $var?.foo with the new semantics, then yes, running that on older versions could result in different behavior (rather than breaking in an obvious manner), depending on what strict mode is in effect - but _you should always enforce the minimum version required to run your code as intended anyway_ (#requires -Version, module-manifest keys).
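As a sketch of the version guard mentioned in the last bullet: the minimum version 7.1 is an assumption about where ?. is generally available, and $user is a made-up variable.

```powershell
#requires -Version 7.1

# With the guard above, an older engine refuses to run the script
# instead of silently interpreting '?.' differently.
$user = $null

# Null-conditional access: yields $null rather than failing when $user is $null.
$name = ${user}?.Name
```

A module would express the same requirement through the PowerShellVersion key in its manifest.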

All in all, to me this is a clear case of a bucket 3 change: a technically breaking change that breaks very little existing code while offering real benefits (or, conversely, avoiding perennial headaches due to unexpected behavior).

@mklement0 we're keeping the Null-Conditional access experimental for PS7.0. Perhaps you can open a new issue to continue this discussion for 7.1?

Sounds good, @SteveL-MSFT - please see #11379.

I encourage everyone here to again show support (or, gasp, non-support) there.
