Sunday, March 6, 2011

Syntax Highlighting

I just burned about an hour of my life going back and retrofitting syntax highlighting to some of my old posts. The fault lies entirely with Bo Jensen, who first suggested it.  Some of the code to be highlighted (including what triggered the suggestion) is in R, some in Java, a bit in Bash.  (Thank goodness I got over APL decades ago; that would be a font nightmare.)  Obviously, for blogging purposes, I need a highlighter that generates HTML, not just a syntax highlighting code editor.  Given my rather modest output rate, an online highlighter would be just fine (no need to install it locally).  Finally, and this turns out to disqualify several of the available R highlighters, I like having function names highlighted, not just keywords.

So I scrounged around the 'Net a bit and found two very useful sites:
Thanks to both sites for making my life easier.

One side note:  This morning, by sheer coincidence, I received a couple of tweets from Hakan Kjellerstrand indicating that he was experimenting with a GPL version of the J programming language.  Curious, I took a look at some code samples on the Jsoftware site and started having flashbacks to APL.  While perusing the Wikipedia page for APL (linked above), I discovered that the flashback was not just random:  Kenneth Iverson, the designer of APL, was also a designer of J.

I need to stare at some FORTRAN for a while to clear my head.

Update (9 March 2011):  It's official -- I'm stupid. On my Windows box, I've been using Notepad++ for some time now (not so much for programming as for general editing of text files).  As it turns out, Notepad++ does syntax highlighting for a variety of languages, including both R and Java, and can export to an HTML file.

Fine, but I do most of my work on Linux Mint these days, and Notepad++ is a Windows-only program.  It's based on Scintilla, though, as is SciTE, which I use for similar purposes on my Mint PC and laptop.  (SciTE is also available for Windows, but I'm already using NP++ and, as we say here, if it ain't broke, don't fix it.)  SciTE does Java highlighting out of the box, and with a small tweak, it does R syntax highlighting.  It also exports to HTML.  (The tweak: run SciTE via sudo, open the global options file, scroll down near the bottom and uncomment "import r", then save.)
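
For anyone who would rather script the SciTE tweak than click through it, here is a minimal sketch. It assumes the commented-out line reads "#import r" (possibly with stray spaces) and that you know where your distribution keeps the global options file; the path in the usage comment is a guess, and the function name is made up for illustration.

```shell
# Hypothetical helper: uncomment "import r" in a SciTE global options file.
# A backup copy is left beside the original as <file>.bak.
enable_r_import() {
    props="$1"
    cp "$props" "$props.bak" || return 1
    # Only a line consisting of "#import r" is touched; "#import ruby" is not.
    sed 's/^#[[:space:]]*import r[[:space:]]*$/import r/' "$props.bak" > "$props"
}

# Example (the path is an assumption; run it from a root shell):
#   enable_r_import /usr/share/scite/SciTEGlobal.properties
```

Restart SciTE afterwards so it rereads the options file.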

So I'm set for highlighting with tools I'm already using.

Update (13 March 2011):  I discovered a Linux command line utility named (shockingly) highlight.  It converts code files in a variety of languages (including AMPL, which I needed today, but sadly not including R) to a variety of output formats (notably HTML, but also LaTeX). The utility is available from the Ubuntu universe repository, so you can load it via Synaptic without having to add a new source.
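
A rough sketch of a typical invocation, wrapped in a small function. The wrapper name is hypothetical, the syntax name you pass (for example "ampl") is an assumption about how your installed version names its language definitions, and HTML is taken to be the default output format.

```shell
# Hypothetical wrapper: convert one source file to an HTML page.
# Usage: to_html SYNTAX SOURCE_FILE   (writes SOURCE_FILE.html)
to_html() {
    # -i names the input file, -o the output file, -S the input language.
    highlight -i "$2" -o "$2.html" -S "$1"
}

# Example (file name is made up):
#   to_html ampl model.mod
```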

Of course, life can't be quite that simple.  The executable is installed as /usr/bin/highlight.  I already have a program of the same name at /usr/local/bin/highlight.  I don't know where it came from or what it does, but it seems to expect input from stdin regardless of any command line switches.  Since it's in the local bin directory, it loads ahead of the one I want (grrr).  Not knowing whether it's part of a larger package, I'm reluctant to nuke it.  So I've added alias highlight=/usr/bin/highlight to my .bashrc file, which gives me a safe (I think) workaround.
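
To see the name clash for yourself, something like the sketch below lists every executable with a given name in the order the shell searches the path. The function name is made up; in Bash, the built-in "type -a highlight" reports much the same thing.

```shell
# Print every executable file called "$1" on PATH, in search order.
list_on_path() (
    set -f          # no globbing while PATH is split on colons
    IFS=:
    for dir in $PATH; do
        p="${dir:-.}/$1"    # an empty PATH entry means the current directory
        [ -f "$p" ] && [ -x "$p" ] && printf '%s\n' "$p"
    done
    true
)

list_on_path highlight

# The workaround from the post, as it would appear in ~/.bashrc:
alias highlight=/usr/bin/highlight
```

One caveat: Bash expands aliases only in interactive shells by default, so a script that needs the packaged version should call /usr/bin/highlight by its full path.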

Update (10 July 2012): A reader tipped me off that Java code in one of my posts did not show up in Internet Explorer.  When I looked at the blog in IE, I discovered one post where nothing appeared except the title and the footers!  It turns out that in some cases I had inline CSS styles, but when I switched to highlight I was using a <style> tag to provide the style details.  Although this worked fine in Firefox, Chrome and (for all I know) every other browser, it was enough to confuse IE. I couldn't find a way to generate inline styles with highlight, so I switched to Pygments, also available via Synaptic (and recommended by my namesake in the comments below). It provides both a command line program (which I use) and a Python library. The syntax I use looks like

pygmentize -f html -l java -o myfile.html -O noclasses,nobackground,cssstyles="background: #CCFFFF;"

where the first option specifies HTML output, the second specifies Java input, the third (lower case "o") specifies the output file, the fourth (upper case "O") specifies options, and the last argument (left off the line above) is the source code file to highlight. The noclasses option is the key: it forces inline CSS. The other two options suppress the usual background color in the <div> tag that surrounds the code and replace it with a color of my choosing.
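
Since the same options get reused for every post, the command can be wrapped in a small function. This is only a sketch: the function name and the output naming scheme (source name plus ".html") are made up for illustration, and the source file, which goes last on the pygmentize command line, is passed as the second argument.

```shell
# Hypothetical helper: highlight one source file to HTML with inline CSS.
# Usage: hl_inline LEXER SOURCE_FILE   (writes SOURCE_FILE.html)
hl_inline() {
    pygmentize -f html -l "$1" -o "$2.html" \
        -O 'noclasses,nobackground,cssstyles=background: #CCFFFF;' "$2"
}

# Example (file name is made up):
#   hl_inline java MyClass.java
# To see which lexer names your installation accepts:
#   pygmentize -L lexers
```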


  1. Yep I am the one to blame.

    "I need to stare at some FORTRAN for a while to clear my head."

    May I suggest FORTRAN 77? That always convinces me I could be doing worse.

  2. FORTRAN 77 was my all-time favorite programming language. Attempts to turn it into an object oriented language (was that FORTRAN 90?) were misguided. I'd still be using FORTRAN if CPLEX (before the ILOG acquisition) had not dropped support for it and forced me into the evil clutches of C (and later C++).

    My second favorite language might have been SNOBOL. Good luck finding a syntax highlighter for that!

  3. Oh you could have tried Pygments as well:

    Pygments highlights almost anything....

    You probably don't use Vim, but Vim highlights most languages and generates HTML code from your code.

  4. I use highlighting in my online documentation; in that case it's better to have an engine that converts on the fly. This eliminates the need to copy converted HTML back whenever the code examples change. SyntaxHighlighter does a good job, if you can settle for the few supported languages (I only need C/C++, Python and C#). In a blog the examples rarely change, so it's a one-time job.

  5. @Paul: (Good name, by the way.) I took a look at Pygments and was quite impressed, but at the time I looked for it I was looking for an R highlighter, and unfortunately Pygments doesn't do R. When I went looking for a Java highlighter (which Pygments does), I decided to go with a free online highlighter rather than a download. And you're right about my not using Vim (tried once, but with great power comes a long learning curve). I should check if Notepad++ (which does syntax highlighting and saves to HTML) will work, though. Hmmm.

    @Bo: Hadn't thought about online documentation, but I see your point.

  6. That sneaky Paul...I missed the missing Rubin, thought it was the almighty blogger himself :-)

  7. Yes, when I first saw the comment, I thought for a moment that one of my alternate personalities had surfaced while I was working on the blog.

