How to remove a carriage return and line feed (\r\n)

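use strict;
use warnings;

# Each block deletes every CR (\r) and LF (\n) character from the string,
# so \r\n, \n, and \r line endings are all handled the same way.

{
    # CR followed by LF
    my $str = "abcd\r\n";
    $str =~ s/[\r\n]//g;
    print "[$str]\n";    # prints [abcd]
}

{
    # LF only
    my $str = "abcd\n";
    $str =~ s/[\r\n]//g;
    print "[$str]\n";    # prints [abcd]
}

{
    # CR only
    my $str = "abcd\r";
    $str =~ s/[\r\n]//g;
    print "[$str]\n";    # prints [abcd]
}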
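
The same cleanup can probably also be written with the transliteration operator instead of a regex. A minimal sketch, where the helper name strip_crlf is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of the string with every
# CR and LF character deleted.
sub strip_crlf {
    my ($str) = @_;
    $str =~ tr/\r\n//d;    # tr with /d deletes the listed characters
    return $str;
}

for my $in ("abcd\r\n", "abcd\n", "abcd\r") {
    print "[", strip_crlf($in), "]\n";    # prints [abcd] each time
}
```

Like the substitutions above, this removes CR and LF anywhere in the string, not only at the end.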

4 Comments

There is also \v (Perl 5.10 and later), which matches any vertical whitespace character, including both \r and \n.

$g =~ s/\v//g;

$g =~ s/\R//;
Takes care of all Unicode line endings.
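
One thing to keep in mind: without /g, the substitution removes only the first line ending it finds. A small sketch of the difference, assuming Perl 5.10 or later (where \R is available); the sub names are illustrative only:

```perl
use strict;
use warnings;

# Remove only the first line ending (what s/\R// does on its own).
sub strip_first_eol {
    my ($s) = @_;
    $s =~ s/\R//;
    return $s;
}

# Remove every line ending in the string.
sub strip_all_eol {
    my ($s) = @_;
    $s =~ s/\R//g;
    return $s;
}

my $text = "one\r\ntwo\nthree\r";
print "[", strip_first_eol($text), "]\n";    # the later "\n" and "\r" are still there
print "[", strip_all_eol($text), "]\n";      # prints [onetwothree]
```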

Abigail says:

$g =~ s/\R//;

Takes care of all Unicode line endings.

Do you know of any difference between the behaviour of \v and \R? They seem to match exactly the same characters on Perl 5.12.3:
\v:
000A: '^J' 
000B: '^K'
000C: '^L'
000D: '^M'
0085: '…'
2028: '\x{2028}' LINE SEPARATOR
2029: '\x{2029}' PARAGRAPH SEPARATOR

\R:
000A: '^J' 
000B: '^K'
000C: '^L'
000D: '^M'
0085: '…'
2028: '\x{2028}' LINE SEPARATOR
2029: '\x{2029}' PARAGRAPH SEPARATOR

See the perlre, perlrecharclass, and perlrebackslash man pages.

\v matches a single vertical whitespace character. As of Unicode 6.0 these are the characters listed above: [\x0A-\x0D\x85\x{2028}\x{2029}]

\R matches these characters and also multi-character newline sequences. Because it can match more than one character, it can't be used inside a bracketed character class (e.g. [h\R]). It matches the CRLF sequence as a single unit (and could match others if more are defined in later versions of Unicode). As of Unicode 6.0 it is equivalent to (?>\x0D\x0A?|\v).
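
The practical difference shows up when counting line endings in CRLF text. A rough sketch, assuming Perl 5.10 or later; the helper count_matches is a name made up here, not something from the Perl docs:

```perl
use strict;
use warnings;

# Hypothetical helper: count how many times a pattern matches in a string.
sub count_matches {
    my ($str, $re) = @_;
    my $n = () = $str =~ /$re/g;    # list assignment in scalar context yields the count
    return $n;
}

my $crlf_text = "a\r\nb\r\nc";

print "\\v:    ", count_matches($crlf_text, qr/\v/), "\n";                  # 4: each \r and each \n separately
print "\\R:    ", count_matches($crlf_text, qr/\R/), "\n";                  # 2: each \r\n pair counts once
print "equiv: ", count_matches($crlf_text, qr/(?>\x0D\x0A?|\v)/), "\n";    # 2: same as \R
```

So s/\v//g and s/\R//g leave the same string behind here, but \R treats each CRLF pair as one line ending.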


Looks like the perlrebackslash man page has a typo in the equivalent regex. It might be tempting to think it would be (?:(?>\x0D\x0A)|\v), but this example shows it isn't implemented that way:

use feature 'say';    # say needs Perl 5.10 or later
say "v match" if "\x0d\n" =~ /^\v\v$/m;
say "R match" if "\x0d\n" =~ /^\R\R$/m;

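This prints "v match" but not "R match". The first \R consumes the whole CRLF sequence and never backtracks to give up the \n, so the second \R has nothing left to match.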
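
To see that the atomic grouping is what makes the difference, the two candidate expansions can be spelled out by hand and compared with \R itself. This is only a sketch: the variable and sub names are invented for the example, and it anchors with \A and \z rather than ^ and $ under /m.

```perl
use strict;
use warnings;

my $crlf = "\x0D\x0A";

# Non-atomic alternation: the engine may backtrack from CRLF to a lone CR.
my $loose  = qr/(?:\x0D\x0A|\v)/;
# Atomic group: once CRLF is consumed, the choice is never revisited.
my $atomic = qr/(?>\x0D\x0A?|\v)/;

# Hypothetical helper: does the whole string consist of exactly two units?
sub matches_twice {
    my ($str, $unit) = @_;
    return $str =~ /\A$unit$unit\z/ ? 1 : 0;
}

print "loose:  ", matches_twice($crlf, $loose),  "\n";    # 1
print "atomic: ", matches_twice($crlf, $atomic), "\n";    # 0
print "\\R:     ", matches_twice($crlf, qr/\R/), "\n";    # 0
```

The hand-written atomic version behaves like \R, while the backtracking version behaves like \v\v.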

About Michael Li

I blog about issues resolved at work