Sunday, 4 October 2009

The Amazing Ruby: Precompiled Header Hack for Eclipse CDT

With a good IDE, such as Eclipse, and good libraries, such as boost, programming in C++ is almost bearable. Unfortunately, boost's magic templates take a long time to compile. Like most compilers, g++ provides precompiled headers to mitigate this problem, but support for these in Eclipse has so far been limited. This post describes a hack to get some level of precompiled header support in Eclipse with the g++ tool chain on Linux.

How it Works

The usual build process for a C++ project in Eclipse is:

  1. Eclipse writes Makefiles based on the source files in your project, and your project's configuration settings.
  2. Eclipse calls make, and make builds your project.

Here we'll replace step 2 with:

  1. Eclipse calls a ruby script.
  2. The ruby script edits Eclipse's generated Makefiles to include build rules for the precompiled header.
  3. The ruby script calls make, and make builds your project, using the precompiled header as needed.

Using this approach, you can add and remove files from your project or change configuration settings, and things will work mostly as expected.
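
The wrapper idea can be sketched in a few lines of Ruby. This is only a minimal illustration, not the full script given later in this post: ensure_line_after is a hypothetical helper name, and the sample makefile lines are an assumption about what Eclipse generates. The point it shows is that each edit is idempotent, so the script can run on every build without piling up duplicate lines.

```ruby
# Insert new_line directly after the first line equal to anchor, unless it is
# already there. Returns true if the list was changed.
# (ensure_line_after is a hypothetical helper, not part of cdtgch.rb.)
def ensure_line_after(lines, anchor, new_line)
  i = lines.index(anchor)
  raise "cannot find #{anchor.inspect}" unless i
  return false if lines[i + 1] == new_line
  lines.insert(i + 1, new_line)
  true
end

# Assumed excerpt of a generated makefile.
lines = ['-include sources.mk', '-include objects.mk', 'all: demo']
dep   = 'CPP_DEPS += src/stdafx.h.gch.d'

first  = ensure_line_after(lines, '-include objects.mk', dep) # edits the list
second = ensure_line_after(lines, '-include objects.mk', dep) # finds it, no-op

puts lines
# A real wrapper would now write the lines back if anything changed and then
# hand over to make, for example with: exec 'make', *ARGV
```

Because the second call is a no-op, the makefile only needs to be rewritten after Eclipse has regenerated it.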

Caveats

The current limitations of this method are:

  1. The script only supports one precompiled header file per project. It works like Visual Studio, in this regard: you have to supply a header file (stdafx.h) that includes all the headers you want to precompile.

  2. You can only add precompiled header support to one configuration for each project; for example, if you have precompiled headers in the Debug configuration, they won't be used in the Release configuration. This is because the script puts the precompiled header into the source folder, rather than in the build folder.

  3. This is a hack. Your mileage may vary. It may not work in future versions of Eclipse.

So, with those caveats in mind, the next section describes how to get it set up.

Tutorial

  1. Make sure you have ruby installed and that you can run it from a terminal (command prompt). These instructions have been tested with Ruby 1.8.7 on Ubuntu 9.04 (Jaunty). You can install ruby via the Synaptic package manager.

  2. Start Eclipse. These instructions have been tested with Eclipse 3.5 SR1 (Galileo). I've also used this script with Eclipse 3.4 and 3.5.

  3. You can easily add this script to an existing project using the steps that follow. For this demo, I'll create a new Hello World project, to make things more concrete.

  4. Run the project to make sure it works, before changing things.

  5. Download the cdtgch.rb script. The script is also included in this post, below. Put a copy in the root of your project folder (workspace/cdtgchdemo in this case).

  6. Add a header file that includes the headers you want precompiled. The script assumes that you call it stdafx.h, for consistency with Visual Studio. It also assumes that you have put it in a folder called src, for consistency with the default Eclipse C++ project structure. (See step 8 for a picture of the file structure used in this demo.) You can use any name and folder structure you want, but you will have to edit the first few lines of cdtgch.rb accordingly.

  7. Now it's time to tell Eclipse to call cdtgch.rb instead of make. Right click on the project in the Project Explorer and choose Properties. On the left, choose C/C++ Build. Uncheck the Use default build command option, and enter

    ruby ../cdtgch.rb

    as the build command. The ../ is important, because the command runs in the build directory, and the script is in the project root. Click OK.

    Note: This hack only allows you to use precompiled headers with one configuration per project. Here, we'll put it into the Debug configuration, which makes sense, because that's the one that gets built most often (sigh).

  8. Build the project. (Choose Project > Build All.) If you refresh the Project Explorer, you should find a stdafx.h.gch file alongside your stdafx.h file; this is the precompiled header. It will be rebuilt every time you change stdafx.h.

  9. Make sure you change your source files to include stdafx.h. It should be the first file that you include. For example, my main C++ file now reads:


    #include "stdafx.h" // was #include <iostream>
    // add other includes here...


    using namespace std;

    int main() {
        cout << "!!!Hello World!!!" << endl; // prints !!!Hello World!!!
        return 0;
    }

    and my stdafx.h now reads:


    #ifndef STDAFX_H_
    #define STDAFX_H_

    #include <iostream> // was in main file

    #endif /* STDAFX_H_ */

  10. Continue developing as normal (but somewhat faster!).
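
Step 9 matters because g++ only considers a precompiled header when it is included before any other code in the file. A quick way to check your sources is sketched below. This is a rough check under stated assumptions: first_include is a hypothetical helper, and the simple pattern it uses does not understand comments or conditional compilation.

```ruby
# Return the name of the first header that a C++ source includes, or nil if
# it includes nothing. (first_include is a hypothetical helper; it looks at
# the text only and ignores comments and #if blocks.)
def first_include(source)
  source[/^\s*#\s*include\s*[<"]([^>"]+)[>"]/, 1]
end

good = <<'CPP'
#include "stdafx.h"
#include <vector>
int main() { return 0; }
CPP

bad = <<'CPP'
#include <vector>
#include "stdafx.h"
int main() { return 0; }
CPP

good_first = first_include(good)
bad_first  = first_include(bad)

puts "good: #{good_first}" # the precompiled header can be used
puts "bad:  #{bad_first}"  # stdafx.h comes too late to help
```

To scan a project, you could apply the same function to File.read(path) for each file matched by Dir.glob('src/**/*.cpp').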

The Ruby Script (cdtgch.rb)


#
# Edits Eclipse's generated Makefiles to make them support a precompiled header.
#
# To use this script:
# 1. Create a file that includes the headers you want to precompile; this script
# assumes that you use my_project/src/stdafx.h, but you can change the PCH
# constant, below, if you want to do it differently.
# 2. Put this script in your project root.
# 3. Edit your project's properties; under C/C++ build, set the build command to
# ruby ../cdtgch.rb
# (the "../" is important because eclipse runs the script from the build
# directory, and the script is in the project root).
# 4. Make sure your source files include "stdafx.h"; it's a good idea to make
# this the first include.
# 5. Everything else should be as normal.
#
# Note that the script only allows you to have *one precompiled header file*.
# Note that you only get to use the gch in *one configuration*.
#
# See
# http://jdleesmiller.blogspot.com/2009/10/amazing-ruby-precompiled-header-hack.html
# for more info.
#

#
# Header file that you want to precompile.
# Path must be relative to the project root.
# The directory containing the stdafx.h must contain at least one .cpp file.
#
PCH = 'src/stdafx.h'

# Look at existing makefile to see if it needs hacking; we will only rewrite
# it if necessary.
rewrite_makefile = false
makefile_lines = IO.readlines('makefile')
makefile_lines.each {|l| l.chomp!}

# Need to import dependency file for the pch. This has to happen after we've
# imported subdir.mk.
dep_line = "CPP_DEPS += #{PCH}.gch.d"
objects_line = makefile_lines.index("-include objects.mk")
raise "cannot find objects.mk include line" unless objects_line
unless makefile_lines[objects_line+1] == dep_line
  makefile_lines.insert(objects_line+1, dep_line)
  rewrite_makefile = true
end

# Make all objects depend on the precompiled header (even if not all of them
# really do).
gch_o_rule = "$(OBJS):%.o:../#{PCH}.gch"
unless makefile_lines.member?(gch_o_rule)
  makefile_lines << ""
  makefile_lines << gch_o_rule
  makefile_lines << ""

  rewrite_makefile = true
end

# Look for the rule to build the gch. We need to use the same g++ arguments as
# everywhere else; we can get these from a subdir.mk file. The dependencies on
# the project files ensure that the PCH gets rebuilt when the project settings
# change; otherwise, this happens for all the other files, but not the PCH, for
# reasons that I don't fully understand.
gch_rule = "../#{PCH}.gch: ../#{PCH} ../.cproject ../.project"
unless makefile_lines.find {|l| l =~ /^#{Regexp.escape(gch_rule)}/}
  # Need to look up the command in the subdir.mk file.
  subdir_mk = File.read(File.join(File.dirname(PCH),'subdir.mk'))
  subdir_mk =~ /^(\tg\+\+.*)$/ or raise "cannot find g++ command in subdir.mk"
  cmd = $1

  # Make the command do dependencies for the gch file.
  cmd.gsub!(/-MF"[^"]*"/, "-MF\"#{PCH}.gch.d\"")
  cmd.gsub!(/-MT"[^"]*"/, "-MT\"#{PCH}.gch.d\"")

  # Append a rule for building the precompiled header.
  makefile_lines << ""
  makefile_lines << gch_rule
  makefile_lines << cmd

  rewrite_makefile = true
end

# Add a command to the clean rule so we get rid of the gch-related files.
clean_line = (0...makefile_lines.size).find{|i| makefile_lines[i] =~ /^clean:/}
raise "couldn't find clean: line in makefile" unless clean_line
unless makefile_lines[clean_line+1] =~ /#{Regexp.escape(PCH)}\.gch/
  makefile_lines.insert(clean_line+1,
    "\trm -f #{PCH}.gch.d ../#{PCH}.gch")
  rewrite_makefile = true
end

# Save changes, if any.
if rewrite_makefile
  File.open('makefile', 'w') do |f|
    f.write makefile_lines.join("\n")
  end
end

# Now run make.
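# Pass each argument through separately; joining them into one string would
# hand make a single argument, which breaks as soon as there are two or more.
exec "make", *ARGV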

#
# Copyright (c) 2009 John Lees-Miller
#
# Permission is hereby granted, free of charge, to any person
# obtaining a copy of this software and associated documentation
# files (the "Software"), to deal in the Software without
# restriction, including without limitation the rights to use,
# copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following
# conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
# OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
# WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
# OTHER DEALINGS IN THE SOFTWARE.
#
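
To see what the script ends up adding to the makefile, it can help to generate the extra rules on their own. The sketch below does that for a sample compile command. The command line is an assumption about what Eclipse writes into src/subdir.mk (yours will depend on your project settings), and pch_command and pch_rules are hypothetical names used only for this illustration.

```ruby
pch = 'src/stdafx.h'

# Assumed example of a compile command from src/subdir.mk.
sample_cmd = "\tg++ -O0 -g3 -Wall -c -MMD -MP " \
             "-MF\"$(@:%.o=%.d)\" -MT\"$(@:%.o=%.d)\" -o\"$@\" \"$<\""

# Point the dependency-file options at a .d file named after the header,
# in the same way as the script above.
def pch_command(cmd, pch)
  cmd.gsub(/-MF"[^"]*"/, "-MF\"#{pch}.gch.d\"").
      gsub(/-MT"[^"]*"/, "-MT\"#{pch}.gch.d\"")
end

# The rule lines appended to the makefile: every object depends on the
# precompiled header, and the header is rebuilt when stdafx.h or the
# project files change.
def pch_rules(cmd, pch)
  [
    "$(OBJS):%.o:../#{pch}.gch",
    "",
    "../#{pch}.gch: ../#{pch} ../.cproject ../.project",
    pch_command(cmd, pch)
  ]
end

rules = pch_rules(sample_cmd, pch)
puts rules
```

The recipe reuses the ordinary compile command, so the precompiled header is built with the same flags as the rest of the project, which g++ requires before it will use the header.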

Thursday, 21 May 2009

The Amazing Ruby: Rake + EtherPad + LaTeX


Update (20120208): I recently launched writeLaTeX, a service that combines EtherPad with LaTeX preview. Try it out — it's free!


Update (20100518): EtherPad has finally shut down its servers after being acquired by Google, last year. The scripts below may still work (with some changes) with PiratePad, or with a pad that you host yourself, but I haven't tried it out myself.


Update (20090713): The epview AppJet application that I originally used seems to have disappeared. The code has been updated to use EtherPad's new export feature, so it should now be working again.

When I was in primary school, I spent most of my time on the 3 R's: reading, writing and arithmetic. Now I'm a PhD student studying maths; guess how I spend most of my time! The more things change, the more they stay the same, as they say.


One thing that has changed is that I rarely write things on my own, now. Most of my written work is collaborative, and almost all of it involves math. A lot of scientific writing fits this pattern, but I've found that tool support remains limited.


The `old fashioned' way is to pass LaTeX files around by e-mail, but this requires either one-at-a-time editing or manual merging. A better way is to put the document into a shared version control system, but this has high administration overhead, and it still requires lots of external coordination. (It's also almost totally unknown outside of the software engineering community, perhaps because it's not very user-friendly — did you forget to update before you committed?)


Google Docs is pretty good for general collaborative writing, because the document is stored on a central server. The authors can edit simultaneously via their browsers, from wherever in the world they happen to be. The key advantage is that there's one current version of your document; it's always up to date (plus or minus about 10s) and there is no manual merging. As an added bonus, you can easily view the state of your document at any time in the past, so you don't have to worry about losing text if you delete it. The main problem for me is that it has no support for typesetting math, so I just end up writing LaTeX markup in Google Docs. The editor is also a bit clumsy; for example (at least when I last used it), my undo stack reset itself every time someone else made a change.


Enter EtherPad. Again, the document is stored on a central server and edited via a browser, so there is one current version. Unlike Google Docs, it doesn't support images or formatting; it's just a plain text document, but that's fine when writing LaTeX markup. This simplicity allows the collaborative editor to be very responsive; updates are distributed so quickly that two authors can work on the same sentence without stepping on each other's toes. So, it's quite possible to write LaTeX collaboratively in EtherPad. But, there's a problem: you have to get your LaTeX source out of EtherPad so you can compile it; copying and pasting quickly gets tedious.


Enter Ruby. The following Ruby code grabs the current content of your pad. (Note that this code is linked below, so you don't have to copy and paste it.)

require 'net/http'
require 'uri'

# Get the plain text content of an etherpad.
def get_etherpad pad
  # Based on http://forums.etherpad.com/viewtopic.php?id=168
  url = URI.parse("http://etherpad.com/ep/pad/export/#{pad}/latest?format=txt")
  $stderr.print "Getting #{url}... "
  s = Net::HTTP.get(url)
  $stderr.puts "done."
  s.strip
end

For extra streamlining, combine this with a custom Rake (Ruby make) task. (If you're not familiar with the rake or make tools, you might be interested in this article on dependency-based programming, by Martin Fowler.)


# See etherpad_file.
class EtherpadFileTask < Rake::FileTask
  attr_accessor :pad

  # Fetch the pad once and cache it for the rest of the run.
  def remote_pad
    @remote_pad = get_etherpad(pad) unless @remote_pad
    @remote_pad
  end

  # The task needs to run if there is no local file, or if it differs
  # from the pad.
  def needed?
    return true unless File.exist?(name)
    local_pad = File.open(name) {|f| f.read}
    return remote_pad != local_pad
  end
end

#
# Task to copy an etherpad to a local file.
# Each time it is invoked, it checks whether the etherpad has changed;
# if it has, the local file is updated; if it hasn't, the local file is
# left alone.
#
def etherpad_file(file_name, pad, &block)
  eft = EtherpadFileTask.define_task({file_name => []}) do |t|
    raise unless t.remote_pad
    File.open(t.name, 'w') {|f| f.write(t.remote_pad)}
    $stderr.puts "Wrote pad #{t.pad} to #{t.name}."
  end
  eft.pad = pad
  eft
end


Then add the following call to your Rakefile, where the EtherPad you want to use has the URL http://etherpad.com/your_pad_id.


etherpad_file 'your_local_file.tex', 'your_pad_id'


Now, when you run the command


rake your_local_file.tex


the script will grab the text from the pad and, if it differs from the local version (if any), it will overwrite the local version with the one from the pad. For even more convenience, add a LaTeX rule to your Rakefile, like:


rule '.pdf' => %w(.tex) do |t|
  tex = t.prerequisites.first
  dvi = tex.sub(/\.tex$/,'.dvi')
  ps = tex.sub(/\.tex$/,'.ps')
  sh <<SH
latex #{tex}
latex #{tex}
latex #{tex}
dvips -o #{ps} #{dvi}
ps2pdf #{ps} #{t}
SH
end


Now, you can run the command


rake your_local_file.pdf


and all the right stuff will happen. When the pad changes, and you run rake, the PDF will be rebuilt. Magic!
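
The reason the PDF is only rebuilt when the pad changes is that the local .tex file is left untouched, and so keeps its old modification time, whenever the content is the same. That update-only-when-changed behaviour can be sketched outside Rake as follows. This is a minimal illustration: write_if_changed is a hypothetical helper, not part of the Rakefile above, and the text is supplied directly rather than fetched from a pad.

```ruby
require 'tmpdir'

# Write text to path only if the file is missing or its content differs.
# Returns true if the file was written.
# (write_if_changed is a hypothetical helper for illustration.)
def write_if_changed(path, text)
  return false if File.exist?(path) && File.read(path) == text
  File.open(path, 'w') { |f| f.write(text) }
  true
end

results = []
Dir.mktmpdir do |dir|
  path = File.join(dir, 'paper.tex')
  results << write_if_changed(path, "\\documentclass{article}\n") # new file
  results << write_if_changed(path, "\\documentclass{article}\n") # unchanged
  results << write_if_changed(path, "\\documentclass{book}\n")    # changed
end

p results
```

Only the first and third calls write the file, so a dependency-based tool that compares timestamps sees the .tex file as newer than the PDF only in those cases.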


Of course, this is still far from the ideal collaborative LaTeX editing solution.

  • There is no syntax highlighting or reference auto-completion in the EtherPad editor.
  • There is no way to attach figures to the document; you have to exchange them some other way.
  • There's no safe way to import the pad into your favorite editor and then export it back again.
  • This method relies on EtherPad's export feature (originally the `epview' application), which may change or cease to exist at any time.
  • This method isn't very efficient; it goes out and gets the whole content of the document to compare with the local copy. All that's needed is the latest modification time, or maybe a hash of the document content.
  • EtherPad currently does not support private pads; anyone who knows the URL can read and edit your pad.
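
The hash idea from the list above could look something like the sketch below. It assumes that the pad server could report a SHA-256 digest of the document without sending the whole text, which is not something this post claims EtherPad offers. The names local_digest and pad_changed? are hypothetical, and the "remote" digests are computed locally just to make the example self-contained.

```ruby
require 'digest'
require 'tmpdir'

# Digest of the local copy, or nil if there is no local copy yet.
def local_digest(path)
  File.exist?(path) ? Digest::SHA256.file(path).hexdigest : nil
end

# True if the local file is missing or differs from the remote digest.
# (remote_digest is assumed to come from the server; see above.)
def pad_changed?(path, remote_digest)
  local_digest(path) != remote_digest
end

checks = []
Dir.mktmpdir do |dir|
  path = File.join(dir, 'paper.tex')
  same_digest  = Digest::SHA256.hexdigest("Hello, pad\n")
  other_digest = Digest::SHA256.hexdigest("Hello, pad!\n")

  checks << pad_changed?(path, same_digest)   # no local file yet
  File.open(path, 'w') { |f| f.write("Hello, pad\n") }
  checks << pad_changed?(path, same_digest)   # identical content
  checks << pad_changed?(path, other_digest)  # pad was edited
end

p checks
```

With a check like this in needed?, the full text would only have to be downloaded when the digests differ.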

On the plus side, the method works for other text-based data you might stick on a pad. You could conceivably put any kind of code on several different pads and then use rake to compile them together. It would be rather slow, though.


For longer or more serious stuff (like the thesis I should be writing instead of this blog post), my preferred solution is the LyX `what you see is what you mean' (WYSIWYM) editor with version control (e.g. subversion). It may be that LyX will one day support collaborative editing, but so far there doesn't seem to be much progress on this.


For convenience, I've posted a demo Rakefile.rb file on my website, but all of the code you need is on this page.