Calling -split on an empty string results in an array with one element which is an empty string. I expect it to return a zero-length array or a $Null which is not an array.

What am I doing wrong?

# split an empty string
$a = '' -split ','
# the result is an array with one element which is an empty string:
$a -is [array]     # true
$a.Count           # 1
"a[0]='$($a[0])'"  # result: a[0]=''

EDIT: In the comments, people ask how the empty string is different from a string with no comma at all, e.g.:

$s = 'a'
$s -split ','
# returns 'a'

It returns "the first element of the array expressed as a string" which is the string 'a'.

So in the case of an empty string, it also returns "the first element of the array expressed as a string", which in this case is the empty string.

So I guess it's the round trip through -join that seems not consistent. If you -join then -split a zero-element array, you don't get a zero-element array back. This seems asymmetrical.

# empty array
$a_src = @()
$a_src.Count
# returns 0

# join
$a_src_str = $a_src -join ','
"a_src_str='$a_src_str'"  # empty string

# then split
$a = $a_src_str -split ','

# non-empty array
$a.Count
# returns 1

So I guess the problem is that two array states collapse into the same "array expressed as string", so state is lost and -split can never determine the original state. A zero-length array and an array of one element that is the empty string both result in the same state when -join'ed: an empty string.

# array of zero elements
$a_empty = @()
$a_empty_str = $a_empty -join ','
"a_empty_str='$a_empty_str'"  # empty string

# array of one element an empty string
$a_one = @('')
$a_one_str = $a_one -join ','
"a_one_str='$a_one_str'"  # empty string

I suppose if you needed this round trip to work (I do) you'd have to create and manage another bit of state to differentiate.
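
One way to carry that extra bit of state, as a minimal sketch rather than anything built in: let $null stand for "no items" and let '' stand for "one item that is the empty string". The function names Join-List and Split-List are hypothetical helpers invented for this illustration.

```powershell
# Hypothetical helpers (not part of PowerShell):
# $null means "zero items", '' means "one item that is the empty string".
function Join-List {
    param(
        [AllowEmptyCollection()][AllowEmptyString()][string[]]$Items = @(),
        [string]$Separator = ','
    )
    if ($Items.Count -eq 0) { return $null }   # keep "empty array" distinct from ''
    return $Items -join $Separator
}

function Split-List {
    param(
        $Text,                     # deliberately untyped: a [string] parameter would turn $null into ''
        [string]$Separator = ','
    )
    if ($null -eq $Text) {
        $empty = [string[]]@()
        return ,$empty             # unary comma keeps the empty array from being unrolled away
    }
    # -split takes a regex, so escape the separator to match it literally
    return ,([string]$Text -split [regex]::Escape($Separator))
}

(Split-List (Join-List @())).Count          # 0
(Split-List (Join-List @(''))).Count        # 1
(Split-List (Join-List @('a', 'b'))).Count  # 2
```

This only works if whatever stores the joined string can also store $null, which is an assumption about the surrounding code.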

[Also, a meta aside: I posted this at 11pm one day, and by 9am the next day it was already closed. It sorta all happened while I was sleeping ;)]

EDIT:

It's not just "join". Any feature that expresses items as a list in a string separated by commas is going to be misrepresented by -split.

Here's one: wsman (WinRM) TrustedHosts:

get-item 'wsman:\localhost\client\trustedhosts'
# returns
('node1,node2' -split ',').Count  # 2
('node1'       -split ',').Count  # 1
(''            -split ',').Count  # 1 (not 0)
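
For list-in-a-string values like this, one possible workaround (a sketch, not the only option) is to filter out empty tokens after splitting. The variable $value below is a stand-in for the string read from the setting, not a real lookup. Note the assumption that an empty entry is never meaningful, since this also drops empty tokens in the middle of a list such as 'a,,b'.

```powershell
# $value is a placeholder for a comma-separated setting value
$value = ''

# With an array on the left, -ne acts as a filter and returns the non-empty tokens;
# @(...) guarantees an array even when zero or one token remains.
$hosts = @(($value -split ',') -ne '')
$hosts.Count   # 0

$hosts = @(('node1,node2' -split ',') -ne '')
$hosts.Count   # 2
```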
  • Can you elaborate on why you expect a $null or an empty array here? What you're describing is the exact same behavior as in 'a' -split ',': the first array element contains the first match a. In the case of an empty string as input, the first array element also contains the first match: an empty string. This is exactly what I would expect. Commented Sep 30 at 6:30
  • the -split operator will split on any matching delimiter ... and return the result in a string array. any delimiter will give the same result since the result of the split will ALWAYS be stuffed into a string array. Commented Sep 30 at 6:32
  • Simply speaking, if the delimiter is not found in the string, the string itself is returned as the only item in the array. It applies to the empty string as well :) Commented Sep 30 at 6:43
  • I'm voting to close this question because the behavior in question is expected and normal. Commented Sep 30 at 6:44
  • The updated version of this post is... just a list of observations. What is the question? Commented Oct 1 at 8:30

1 Answer


There's good information in the comments on the question, but let me try to provide a systematic summary:

The -split operator by design always returns an array (of type [string[]]) containing at least one element:

  • Specifically, a single-element array containing the input string as-is is returned if the LHS (input string) does not contain any instances of the separator described by the RHS regex.

    • This is what happened in your case: since the empty string as the LHS by definition contains no separators, a single-element array whose only element is the empty string is returned.
  • There is one exception, however:

    • Unlike the underlying .NET [regex]::Split() method that the -split operator is based on, it can operate on an array as the LHS, in which case the split operation is performed on each element of the array, and a flat array of strings is returned with the results across all elements.[1]

    • Providing an empty array (or other empty enumerable) as the LHS returns an empty [string[]] array, e.g. @() -split ','
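
The cases above can be compared side by side in a short sketch; the variable names are only for illustration:

```powershell
# String LHS without a separator: one element, the input itself
$single = '' -split ','
$single.Count   # 1

# Empty array LHS: nothing to split, so nothing is returned
$none = @() -split ','
$none.Count     # 0

# Array LHS: each element is split and the results are flattened
$flat = 'a,b', 'c,d' -split ','
$flat.Count     # 4
```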


"So I guess it's the round trip through -join that seems not consistent. If you -join then -split a zero-element array, you don't get a zero-element array back. This seems asymmetrical."

  • The purpose of the -join operator, conceptually the inverse of -split, is to join the elements of the LHS input array to form a single string using the specified separator (delimiter).

  • Indeed, @() -join ',' and @('') -join '' both result in '', i.e. the empty string (as does '' -join ','), making it impossible to infer from a '' result which of the two inputs caused that result.

  • Even though this behavior isn't documented, de facto it has always been in place and is unlikely to change, given that backward-compatibility could be broken.

As for what could be done, hypothetically:

  • -join could be modified so that @() -join ',' returns the enumerable null ([System.Management.Automation.Internal.AutomationNull]::Value) rather than '' - but note that, strictly speaking, doing so would violate -join's mandate of returning a string.

  • -split's existing behavior would then suffice: with the enumerable null as input (which is treated the same as an empty array), it returns an empty [string[]] array. For example (& {} is the most concise way to produce an enumerable null):

    $result = (& {}) -split ','
    $result.GetType().Name  # -> String[]
    $result.Count # -> 0, i.e. an *empty string array*
    

[1] E.g., 'a,b', 'c,d' -split ',' returns @('a', 'b', 'c', 'd'), i.e. a flat array of result tokens across all input strings. This flattening may be the reason why the feature is rarely used in practice, in contrast with the often-used filtering behavior exhibited by PowerShell's comparison operators with array LHSs.


2 Comments

I agree with the answer. I edited the OP question to consider cases where items are expressed as lists in strings, e.g. wsman TrustedHosts. That doesn't change the answer; it just helps show how the default behavior can be a hardship for coders. Another option is that -split could take an option, RETURN_ZERO_LENGTH_ARRAY_ON_EMPTY_STRING or some such.
Thanks. Re adding an option to -split: You could post a feature request in PowerShell's GitHub repo (though I wouldn't hold my breath that action will be taken).
