The sad state of syntax highlighting libraries on CPAN
I just posted reviews of several code syntax highlighting libraries on CPAN. In short, most of them are crap and there is nothing remotely similar to Python's Pygments or Ruby's CodeRay (oh how the mighty CPAN has fallen). I found that Syntax::SourceHighlight, a Perl interface to GNU Source-highlight, is the only usable one. The one downside is that I need to pull in over 100MB worth of Debian packages to install it, a huge dependency given that my original requirement is merely to colorize the terminal output of some JSON and YAML. It's also too bad that it doesn't support YAML out of the box yet. So I don't plan on using it anytime soon.
Currently I'm investigating (via the lazy web) using Emacs' or Vim's syntax-highlighting capability. It's a good bet that one of those two editors is available on a standard Linux box. A pure-Perl library would be ideal, though.
BTW, I've just hacked and released JSON::Color and YAML::Tiny::Color to fulfill my specific needs.
Darn, MT won't show images in comments: json-color.png.
I rather like Syntax-Highlight-Engine-Kate. Some languages are better highlighted than others, but it does the job for me.
Yeah, but did you also notice how slow it is?
Hi Steven,
Talking about Emacs, here is a "weirdo" distribution of mine on github:
https://github.com/benkasminbullock/Emacs-HTMLize
This is what I use to make the syntax highlighting on my website, like the following:
http://www.lemoda.net/perl/hash-ref-or-copy/index.html
http://www.lemoda.net/images/sizes/index.html
It is really slow and not conceptually great, which is why it is not released to CPAN. There are all sorts of reasons for that code being like that, most of which I don't remember. One problem was making it run without a terminal process, since Emacs doesn't usually bother doing the syntax highlighting unless there is a controlling terminal. I got the idea to use "expect" from Stack Overflow. On the plus side, it is very good at adapting to any language, e.g. I can write a program in Octave or JavaScript and it "just works":
http://www.lemoda.net/games/othello/index.html
http://www.lemoda.net/octave/normal-probability/index.html
I also have a module which is specifically for C programs:
https://metacpan.org/release/C-Tokenize
There is a script included in the distribution:
https://metacpan.org/module/c2html
This is quite fast, faster than running an Emacs process and then killing it again.
Hi Ben,
Thanks. Someone on SO also pointed out a similar command, htmlfontify-buffer. I played around with it a bit. It's a bit painful, like you said. And it's not exactly what I need, because I want ANSI escape output and not HTML.
My current favorite is Syntax::SourceHighlight, since it's fast and covers a lot of languages (YAML is not yet in the list though) and output formats (ANSI escape and HTML are supported, among others).
My current needs are so far met by the two modules I wrote today. They're not exactly syntax highlighters, more like dumpers (with fixed formatting), but they're just what I need.
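To illustrate the "dumper with fixed formatting" idea, here is a minimal sketch in Python (not the actual JSON::Color code). It walks a data structure and emits indented JSON with ANSI escapes around keys, strings, numbers and literals. The function names and the color choices are assumptions made up for this example.

```python
import json

# ANSI SGR escape codes; the color assignments are arbitrary assumptions.
RESET = "\033[0m"
COLORS = {
    "key": "\033[35m",      # magenta
    "string": "\033[32m",   # green
    "number": "\033[36m",   # cyan
    "literal": "\033[33m",  # yellow (true, false, null)
}

def paint(kind, text):
    """Wrap text in the color for the given token kind."""
    return f"{COLORS[kind]}{text}{RESET}"

def dump_color(value, indent=0, step=2):
    """Serialize value as indented JSON, coloring each scalar and key."""
    pad = " " * (indent + step)
    if isinstance(value, dict):
        if not value:
            return "{}"
        items = [
            f"{pad}{paint('key', json.dumps(str(k)))}: "
            f"{dump_color(v, indent + step, step)}"
            for k, v in value.items()
        ]
        return "{\n" + ",\n".join(items) + "\n" + " " * indent + "}"
    if isinstance(value, (list, tuple)):
        if not value:
            return "[]"
        items = [pad + dump_color(v, indent + step, step) for v in value]
        return "[\n" + ",\n".join(items) + "\n" + " " * indent + "]"
    if isinstance(value, str):
        return paint("string", json.dumps(value))
    # bool must be tested before int, because bool is a subclass of int.
    if value is None or isinstance(value, bool):
        return paint("literal", json.dumps(value))
    if isinstance(value, (int, float)):
        return paint("number", json.dumps(value))
    raise TypeError(f"cannot dump value of type {type(value).__name__}")

if __name__ == "__main__":
    print(dump_color({"name": "example", "tags": ["a", "b"], "ok": True}))
```

Because the formatting is generated rather than parsed, this only works for data you already hold in memory; it cannot re-color an arbitrary JSON file while preserving its original layout.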
I keep meaning to write an article/example/description/something on how you can use Parser::MGC to parse up the input text and yield a syntax tree annotated to give the positions it found the various constructs in the input. This would make it easy to drive a syntax highlight engine from it.
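As a rough illustration of that idea (in Python, and not using Parser::MGC or its API), the sketch below separates the two steps: one function annotates the input with (kind, start, end) positions, and another applies colors purely from those positions. The regex tokenizer is a stand-in for a real parser, and all names here are hypothetical.

```python
import re

# A stand-in "parser": recognizes a few JSON-like constructs.
TOKEN_RE = re.compile(r"""
    (?P<string>"(?:\\.|[^"\\])*")
  | (?P<number>-?\d+(?:\.\d+)?)
  | (?P<literal>\b(?:true|false|null)\b)
""", re.VERBOSE)

# ANSI SGR escape codes; the color assignments are arbitrary assumptions.
ANSI = {"string": "\033[32m", "number": "\033[36m", "literal": "\033[33m"}
RESET = "\033[0m"

def annotate(text):
    """Return a list of (kind, start, end) spans found in text."""
    return [(m.lastgroup, m.start(), m.end()) for m in TOKEN_RE.finditer(text)]

def highlight(text, spans, colors=ANSI):
    """Color text using only the position annotations in spans.

    Spans are assumed to be sorted and non-overlapping.
    """
    out, pos = [], 0
    for kind, start, end in spans:
        out.append(text[pos:start])  # untouched text between constructs
        out.append(colors[kind] + text[start:end] + RESET)
        pos = end
    out.append(text[pos:])
    return "".join(out)

if __name__ == "__main__":
    src = '{"a": 1, "b": true}'
    print(highlight(src, annotate(src)))
```

The point of the split is that `highlight` knows nothing about the language; any parser that can report where each construct starts and ends could feed it.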
You're right! Maybe using vim is viable after all (emacs is currently out of the question though, its startup overhead is too much).
Hoping Parser::MGC (or Marpa, or whatever) will help form a basis for the next Perl syntax highlighting library project :)
PPI::HTML is very, very good, but only highlights Perl code of course.
Not a Perl solution, but I typically use vim to get syntax highlighting for code which I post on websites:
This will create an HTML file with syntax highlighting and quit right afterwards.