Perl 5.18: getline and $/ = \N
Perl 5.18 will ship with a change in behaviour when using getline() (aka the <$handle> operator) on handles marked as returning Unicode where $/ is a reference to an integer.
If you're not familiar with the behaviour, for a file with no PerlIO layers:
$/ = \500; my $x = <$fh>; # read 500 bytes
This won't change in 5.18, but it will if the stream has a layer that internally returns Unicode, such as any of:
:encoding(utf-8) (ok, any :encoding stream)
In 5.16 and earlier, getline() will read the specified number of bytes from the stream, even if that would fall in the middle of a character.
This leads to a few problems:
- the result can be a UTF-8 marked scalar that doesn't contain valid UTF-8.
- the input stream can be left on a non-character boundary.
- the record read corresponds to bytes in the file only if the file itself is UTF-8 encoded.
In 5.18, getline($fh) on such a handle will behave like read($fh, $out, $$/): the specified number of characters will be read from the stream instead of the specified number of bytes.
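To make the character-based behaviour concrete, here is a minimal sketch that reads a fixed-size record from an in-memory UTF-8 stream, once with $/ = \N and once with read(). The helper names record_via_getline and record_via_read, and the sample data, are made up for this illustration; they are not part of Perl.

```perl
use strict;
use warnings;

# Five copies of U+00E9 as raw UTF-8 bytes: 5 characters, 10 bytes.
my $bytes = "\xC3\xA9" x 5;

# Hypothetical helper: read one fixed-size record via $/ = \N.
sub record_via_getline {
    my ($data, $n) = @_;
    open my $fh, '<:encoding(UTF-8)', \$data or die "open failed: $!";
    local $/ = \$n;    # reference to an integer: fixed-size record mode
    my $record = <$fh>;
    close $fh;
    return $record;
}

# Hypothetical helper: read the same amount with read().
sub record_via_read {
    my ($data, $n) = @_;
    open my $fh, '<:encoding(UTF-8)', \$data or die "open failed: $!";
    my $record = '';
    read($fh, $record, $n);    # on a decoding layer, $n counts characters
    close $fh;
    return $record;
}

my $n = 3;
printf "getline: %d characters\n", length(record_via_getline($bytes, $n));
printf "read:    %d characters\n", length(record_via_read($bytes, $n));
```

On 5.18 or later both lines should report 3 characters, since the two reads are meant to be equivalent. On 5.16 and earlier the getline result would be expected to differ, because 3 bytes ends in the middle of the second character.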
For more details (and arguments), see perlbug ticket 79960.