As some of you might have noticed from my old tech blog, I often have problems with my colleagues in the Humanities because they use Word and I use LaTeX. This recently cost me a day, as I had to translate a PDF by hand into OpenOffice so I could send it in Word to my editor.

In my search to make this process much less arduous, I believe that I have found the panacea: pandoc. I tried it on my XeLaTeX file for my current article and it worked very, very well. It outputs in OpenOffice format and from there, it is easy to translate into Word.
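
For anyone who wants to try the same conversion, here is a minimal sketch of it wrapped in Python. It assumes pandoc is installed and on your PATH; the function names and the file name `article.tex` are placeholders of mine, not anything pandoc defines. The only pandoc options used are `--from`, `--to` and `--output`.

```python
import subprocess
from pathlib import Path


def pandoc_command(tex_path, out_format="odt"):
    """Build the pandoc argument list for converting a LaTeX file."""
    source = Path(tex_path)
    # Write the result next to the source, e.g. article.tex -> article.odt
    target = source.with_suffix("." + out_format)
    return [
        "pandoc", str(source),
        "--from", "latex",
        "--to", out_format,
        "--output", str(target),
    ]


def convert(tex_path, out_format="odt"):
    """Run pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_command(tex_path, out_format), check=True)
```

Calling `convert("article.tex")` should leave an `article.odt` beside the source, which OpenOffice can then save as a Word document.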

There are two problems (as there always are). First, it does not handle BibTeX at all, so you must copy and paste that information by hand from your PDF. Second, it mangled the Greek that I had in the file, which suggests that pandoc does not handle UTF-8 very well at some point in the process of producing the OpenOffice file. I will need to file a bug report. Other than that, however, I am very impressed.


That looks really interesting, thanks! :-)

Check out Citeproc-hs with Pandoc support. It’s not BibTeX, but it does integrate somewhat with Zotero, a rather good Firefox-based reference manager.

The stuff in the .bbl file looks very much like TeX source. You might be able to \include it in the version you run through pandoc…
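
One way to try this idea is to paste the .bbl contents in place of the `\bibliography{...}` command before handing the file to pandoc. The sketch below is untested against pandoc itself: the function name and the choice to inline the text (rather than use `\include`) are my own assumptions, and whether pandoc renders the resulting `thebibliography` environment well would need checking.

```python
import re

# Matches \bibliography{...} but not \bibliographystyle{...}
BIB_COMMAND = re.compile(r"\\bibliography\{[^}]*\}")


def inline_bbl(tex_source, bbl_source):
    """Replace the bibliography command with the contents of the .bbl file."""
    # A function replacement stops re.sub from interpreting the
    # backslashes in the .bbl text as escape sequences.
    return BIB_COMMAND.sub(lambda _match: bbl_source, tex_source)
```

The merged text can then be written to a new .tex file and converted as usual.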

If you get citeproc-hs up and running I don’t suppose you’d like to post back a quick HOWTO from installation to usage? At the moment my workflow is like this, but it would be nice to be able to eliminate the Microsoft Word step even further :)

I had a reply here which I made into a post and expanded slightly instead.

If you get citeproc-hs working with pandoc, is there any chance you could post a quick HOWTO?

The Oxford University Computing Services developed an interesting (REST) web service to convert between various document formats.

They are still developing the service, but it already covers many formats, probably more than have so far been published to SourceForge, so do drop the project admins a line if you are interested.

It’s implemented in Java and XSLT, I believe.

