Francisco Obispo

  • About: I blog about Perl.
  • Commented on Perl Regular Expression Awesomeness
    I realized that the comment section is very small and won't fit the script, so I've uploaded it to https://github.com/fobispo-link/tools/blob/master/words/grammar instead. The script is really fast; it will detect words in the dictionary....
  • Commented on Perl Regular Expression Awesomeness
    I had a similar problem not so long ago, and ended up using Regexp::Grammars. This is how it works:

        #!/usr/bin/env perl
        use common::sense;
        use Regexp::Grammars;
        use Data::Dumper;
        use File::Slurp;
        use Benchmark qw{timethese};
        my $text = $ARGV[0];
        my %dict = (...
Recent Actions from Francisco Obispo

  • ysth commented on Perl Regular Expression Awesomeness

    Less efficient, but simple and works:

    my @words = $input =~ /\G($list)(?=(?:$list)*\z)/g;

  • Ingy döt Net commented on Perl Regular Expression Awesomeness

    Genius!

    ysth, would you mind writing a couple paragraphs explaining why list capturing works in your regexp?

    My best guess is that `\G` with `/g` forms some kind of `pos` loop where `$1` is returned over and over. The assertion makes certain that we parse correctly to the end in each iteration, thus triggering the needed backtracking, thus getting the proper next `$1`.

    FWIW, the `\G` stuff is another regexp thing you don't see used outside of Perl. In fact (last I checked) it doesn't work with alternate re::engine implementations inside Perl. It seems like something t…

  • Ingy döt Net commented on Perl Regular Expression Awesomeness

    Peter,

    I was playing around with it last night. This happened:

        echo minus the message for every tried word | perl word-parse.pl
        minus them ess age fore very tried word

    I fixed it by putting the words `the` and `for` at the front of the regexp. I suppose a weighting of common words first combined with the long word weighting might yield more optimal results, but this is just an interview question, right? :)

  • ysth commented on Perl Regular Expression Awesomeness

    These are really features of perl's match operator, not the regex engine. Though I am surprised alternate engines don't handle \G; that seems like a glaring bug.

    The behavior of perl's match operator depends not just on the normal scalar vs list context distinction, but also (orthogonally) on /g vs non-/g and on capturing parentheses vs no capturing parentheses. It's worthwhile learning how all 8 resulting flavors work.

    See the couple paragraphs before and the paragraph after http://perldoc.perl.org/perlop.html#\G-_assertion_

  • Nova Patch commented on Perl Regular Expression Awesomeness

    \G is documented in Mastering Regular Expressions by Jeffrey Friedl, published in 2006, as being supported by Perl, .NET, and Java, as well as PHP and Ruby, with the latter two having slightly different semantics (which make them less useful than the former three). Perl’s implementation is especially powerful in that the last matching position is associated with the string and not the regex, so it can be used with multiple different regexes on the same string. I’m sure \G support among modern regex engines has changed in the last decade and would be…
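
For readers following the thread, here is a minimal sketch of ysth's one-liner, of the scalar-context `pos` loop Ingy guesses at, and of how the order of the alternation changes the split. The helper names `segment` and `segment_with_positions`, the tiny word lists, and the shortened input `minusthemessage` are assumptions made for illustration; they are not taken from `word-parse.pl` or from the linked script.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): split $input into dictionary
# words with ysth's pattern. $words is an array ref; its order becomes the
# order of the alternation.
sub segment {
    my ($input, $words) = @_;
    my $list = join '|', map { quotemeta($_) } @$words;

    # List context + /g + one capture group returns every $1 in turn.
    # \G pins each match to where the previous one ended, and the
    # lookahead only accepts a word if the rest of the string can still
    # be split into dictionary words all the way to the end.
    my @found = $input =~ /\G($list)(?=(?:$list)*\z)/g;
    return @found;
}

# The same walk in scalar context: each successful match advances
# pos($input), and \G resumes from there on the next iteration.
sub segment_with_positions {
    my ($input, $words) = @_;
    my $list = join '|', map { quotemeta($_) } @$words;
    my @steps;
    while ($input =~ /\G($list)(?=(?:$list)*\z)/g) {
        push @steps, [ $1, pos($input) ];
    }
    return @steps;
}

my $input = 'minusthemessage';

# "them" is listed before "the": the first alternative that lets the
# lookahead succeed wins, so the "them ess age" split is accepted.
print join(' ', segment($input, [qw(minus them ess age the message)])), "\n";

# "the" is listed first, as in Ingy's fix.
print join(' ', segment($input, [qw(the minus them ess age message)])), "\n";

# Show where pos() sits after each word in the scalar-context loop.
printf "%-8s ends at offset %d\n", @$_
    for segment_with_positions($input, [qw(the minus them ess age message)]);
```

With these word lists the first call should print `minus them ess age` and the second `minus the message`, which mirrors the behavior Ingy describes: both splits reach the end of the string, so the regex keeps whichever one the alternation order tries first.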

Responses to Comments from Francisco Obispo

About blogs.perl.org

blogs.perl.org is a common blogging platform for the Perl community. Written in Perl and offering the modern features you’ve come to expect in blog platforms, the site is hosted by Dave Cross and Aaron Crane, with a design donated by Six Apart, Ltd.