Wednesday, February 13, 2013

PyRTF, Python library to create RTF documents

By Vasudev Ram

PyRTF is a Python library that enables programmatic creation of RTF (Rich Text Format) documents. RTF files are compatible with Microsoft Word and many other leading word processors such as OpenOffice, and also compress well. PyRTF makes it fairly easy to generate RTF content programmatically, with many features such as sections, paragraphs, headers and footers, tables, etc.

Some years ago I did some interesting work with RTF in Java, as part of developing a product at a startup. The work involved reverse-engineering part of the RTF specification / format, and then writing custom Java code to generate RTF from the data in a J2EE application. The RTF files could be imported into MS Word and Adobe InDesign.
The code was written to keep style and content separate, so that each could be varied independently. It worked, to an extent.

RTF page in Wikipedia.

PyRTF download page on SourceForge.

Below is a simplified version of examples.py from the PyRTF package; I modified examples.py to create only one simple file instead of 7 increasingly complex ones. Save this file as small_example.py in the examples subdirectory of the directory where you extract PyRTF:

# small_example.py

import sys
sys.path.append( '../' )

from PyRTF import *


def MakeExample1() :
 doc     = Document()
 ss      = doc.StyleSheet
 section = Section()
 doc.Sections.append( section )

 # text can be added directly to the section
 # a paragraph object is created as needed
 section.append( 'Example 1' )

 # blank paragraphs are just empty strings
 section.append( '' )

 # a lot of useful documents can be created
 # with little more than this
 section.append(
 'A lot of useful documents can be created '
 'in this way, more advance formating is available '
 'but a lot of users just want to see their data come out '
 'in something other than a text file.' )
 return doc

def OpenFile( name ) :
 # open() works in both Python 2 and 3; file() exists only in Python 2
 return open( '%s.rtf' % name, 'w' )

if __name__ == '__main__' :
 DR = Renderer()
 doc1 = MakeExample1()
 DR.Write( doc1, OpenFile( '1' ) )
 print( "Finished" )

Then run it with the command: python small_example.py

It will create a file called 1.rtf in the same directory.
When opened in a word processor, the file will show this text (stored in RTF format):

Example 1

A lot of useful documents can be created in this way, more advance formating is available but a lot of users just want to see their data come out in something other than a text file.

So you can open the file in MS Word, OpenOffice, or any other word processor that supports RTF, and also save it to other formats like .DOC if you want to.
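
For a rough idea of what the markup inside an RTF file looks like, here is a minimal sketch that writes a similar document by hand, without PyRTF. The control words used (\rtf1, \ansi, \deff0, \fonttbl, \pard, \par) are standard RTF, but the function names and the output file name are made up for illustration, the text is assumed to be plain ASCII, and the result is not the same markup that PyRTF itself generates.

```python
# raw_rtf_sketch.py
# Hypothetical helper names; only the RTF control words are standard.

def escape_rtf(text):
    # Backslash and braces are special characters in RTF.
    # Backslash is handled first so the escapes added for braces
    # are not escaped a second time.
    for ch in ('\\', '{', '}'):
        text = text.replace(ch, '\\' + ch)
    return text

def make_rtf(paragraphs):
    # Header: RTF version 1, ANSI character set, default font 0,
    # then a font table with a single entry.
    parts = [r'{\rtf1\ansi\deff0{\fonttbl{\f0 Times New Roman;}}']
    for para in paragraphs:
        # \pard resets paragraph formatting, \par ends the paragraph.
        parts.append(r'\pard ' + escape_rtf(para) + r'\par')
    # Close the group opened by {\rtf1
    parts.append('}')
    # Line breaks between the parts are only there for readability.
    return '\n'.join(parts)

if __name__ == '__main__':
    rtf = make_rtf([
        'Example 1',
        '',  # an empty string gives a blank paragraph
        'Plain text with {braces} and a \\ backslash.',
    ])
    with open('raw_example.rtf', 'w') as f:
        f.write(rtf)
    print('Finished')
```

Run it with: python raw_rtf_sketch.py. It writes raw_example.rtf, which you can open in a text editor to see the markup, or in a word processor to see the rendered text.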



- Vasudev Ram - Dancing Bison Enterprises


4 comments:

Mark said...

PyRTF hasn't been updated since 2005, which, whilst not a problem, suggests you'd be pretty much on your own using it.

It turns out raw RTF isn't that hard to write, and the specs are readily available, as is a useful pocket guide. My app churns out RTF exports which can be potentially hundreds of pages long with a fair bit of structure by using an RTF Django template.

Ricardo Duarte said...

Nice post. If you want unicode support, you can take a look at pyrtf-ng.

Vasudev Ram said...

@Mark: Thanks for the tip about PyRTF not being updated nowadays. But yes, that does not prevent us from using it for stuff it already supports well.

Yes, raw RTF is not hard to write. In fact I mentioned in the post that I did that, in that Java project I worked on earlier. IIRC, I did look at the RTF spec but it was not too user-friendly, which is why I resorted to reverse-engineering the format by creating an RTF doc incrementally in Word (having first just a single letter as the content, then adding a word, a word in bold, then a paragraph, etc.) and then looking at it in a hex editor. This enabled me to figure out what characters were used as RTF markup for different types of content, such as a paragraph, bold text, italic text, etc. The rest was straightforward: just intersperse that markup as needed with the content pulled from the DB via Java.

Will check out the RTF Pocket Guide, thanks.

@Ricardo: Thanks for the tip about pyrtf-ng. Will check it out.


Unknown said...

Actually, I have just been looking into doing this too. pyrtf-ng on Google Code (http://code.google.com/p/pyrtf-ng/) is the old version. There is a newer version of the pyrtf-ng code base on Launchpad: https://launchpad.net/pyrtf