
Harry R. Schwartz

Software engineer, nominal scientist, gentleman of the internet.


When is Literate Programming Appropriate?

Published 19 May 2016. Tags: computer-science.

We’ve been chatting a lot about Org-mode at the last few EmacsBoston meetups. The topic of literate programming—the practice of structuring code as a narrative through copious comments—keeps coming up. Org makes it easy to structure programs literately, and that ease encourages users to try it out.

This usually leads to a question: should programs be structured literately? I think it depends.

Working software engineers—especially folks (like me) who’ve been influenced by the XP movement—often dislike literate programs. Since real-world projects are guaranteed to change, “good code” means “code that can be changed easily.” Literate programs are usually much harder to change.

For example, an XP practitioner might argue that comments are just lies waiting to happen: eventually someone will change the code without updating the comments that describe it, and that discrepancy will deceive comment-readers forever after.
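A minimal sketch of how that drift happens, in Python. The function `shipping_cost` and its numbers are hypothetical, invented purely for illustration: suppose the free-shipping threshold was raised from 50 to 75 at some point, and nobody touched the comment.

```python
def shipping_cost(order_total):
    # Orders over $50 ship free.
    # (The comment above is now stale: the check below uses 75.)
    if order_total > 75:
        return 0.0
    return 5.0
```

A reader who trusts the comment expects a $60 order to ship free, but `shipping_cost(60)` charges the flat 5.0. Nothing fails and nothing warns; the comment is simply wrong from then on.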

This mindset encourages the use of small, thoughtfully named methods instead of comments. Good names provide a sort of “executable documentation.” When changing the function of the code, the programmer almost has to update the appropriate method and variable names, so “documentation” (in the form of names) automatically follows function.
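As a rough illustration in Python, here is one way a pricing rule might be written with names doing the work of comments. Every name and number below is hypothetical, chosen only to show the style.

```python
# Hypothetical values; the names carry the explanation a comment would have.
FREE_SHIPPING_THRESHOLD = 75.0
FLAT_SHIPPING_RATE = 5.0


def qualifies_for_free_shipping(order_total):
    return order_total > FREE_SHIPPING_THRESHOLD


def shipping_cost(order_total):
    if qualifies_for_free_shipping(order_total):
        return 0.0
    return FLAT_SHIPPING_RATE
```

Changing the rule now means editing `FREE_SHIPPING_THRESHOLD` or the predicate itself, so there is no separate sentence of prose left behind to go stale.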

Additionally, big refactorings might require discarding or restructuring huge swathes of comments. This discourages refactoring and ossifies the code.

On the other hand, folks from a more research-oriented or academic background often like literate programming.

Research is the business of producing and communicating ideas, not maintaining long-running projects. Research projects are often prototypes built to test an idea, so long-term maintenance is less important than communicating the ideas expressed in the code.

Literate programming is also pedagogically useful. It’s easier for a student or colleague to understand code when it’s accompanied by a prose explanation.
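For a sense of what that can look like, below is a small sketch in Python rather than Org, with the narrative carried by comments. The function `gcd` and the choice of Euclid's algorithm are assumptions made for this example, picked only because the algorithm is familiar.

```python
def gcd(a, b):
    # Euclid's observation: any number that divides both a and b also
    # divides the remainder a % b. So the pair (a, b) has exactly the
    # same common divisors as the pair (b, a % b).
    #
    # Each step replaces the pair with a smaller one, which means the
    # loop has to end, and it ends when the remainder reaches zero.
    while b != 0:
        a, b = b, a % b

    # Every number divides zero, so the common divisors of (a, 0) are
    # just the divisors of a, and the greatest of those is a itself.
    return a
```

Here the prose explains why the loop is correct, which the four lines of code cannot say on their own. For example, `gcd(48, 18)` steps through (18, 12), (12, 6), (6, 0) and returns 6.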

Some researchers who are also devotees of reproducible research might go so far as to structure whole papers as literate programs. I haven’t done this myself, but I hear it works well.

So, in summary, when should you use literate programming?

If your program needs to change often (which, admittedly, describes most production code), literate programming may be a hindrance.

But if your code is relatively static, and especially if you’re mostly interested in communicating ideas, literate programming can be a valuable tool.