Friday, January 9, 2015

Convert TSV (Tab Separated Values) to PDF with xtopdf

By Vasudev Ram

I wrote this program as a demo of how to convert TSV data to PDF using my xtopdf toolkit.

TSV, which stands for Tab Separated Values, is a common data format. From the Wikipedia article linked in the previous sentence:

"TSV is a simple file format that is widely supported, so it is often used to move tabular data between different computer programs that support the format. For example, a TSV file might be used to transfer information from a database program to a spreadsheet.

TSV is an alternative to the common comma-separated values (CSV) format, which often causes difficulties because of the need to escape commas – literal commas are very common in text data, but literal tab stops are infrequent in running text. The IANA standard for TSV achieves simplicity by simply disallowing tabs within fields."

The program uses the TSVReader module for reading TSV data and the PDFWriter module for writing the PDF output. Both are part of my xtopdf toolkit for PDF creation in Python.
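
To illustrate the point in the quote about commas versus tabs, here is a minimal sketch in plain Python 3. The function name parse_tsv_line and the sample data are my own assumptions for illustration; this is not how TSVReader is implemented.

```python
def parse_tsv_line(line):
    """Split one TSV line into fields; the tab is the only delimiter."""
    return line.rstrip("\n").split("\t")

# Hypothetical sample record whose fields contain literal commas.
line = "Smith, John\tNew York, NY\t42\n"

# Splitting on tabs keeps the three fields intact.
fields = parse_tsv_line(line)
print(fields)

# The same data joined with commas and split naively on commas
# falls apart into five pieces, which is why CSV needs escaping.
naive = ",".join(fields).split(",")
print(naive)
```

Splitting on tabs is enough here only because the TSV rules quoted above disallow tabs inside fields.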

Here is the program:

"""
A demo program to show how to convert TSV data to PDF,
where TSV stands for Tab Separated Values, a data format commonly
used on Unix and other operating systems.
Author: Vasudev Ram
Copyright 2015 Vasudev Ram
"""

import sys
from TSVReader import TSVReader
from PDFWriter import PDFWriter

def usage():
    sys.stderr.write("Usage: python " + sys.argv[0] + " tsv_file pdf_file\n")

def main():
    # check for the right number of args
    if len(sys.argv) != 3:
        usage()
        sys.exit(1)

    # extract tsv and pdf filenames from args -
    # using Python's parallel assignment
    tsv_fn, pdf_fn = sys.argv[1:3]

    # create and open the TSVReader instance
    tr = TSVReader(tsv_fn)

    # create the PDFWriter instance
    # and set some of its fields:
    pw = PDFWriter(pdf_fn)
    pw.setFont("Courier", 10)
    pw.setHeader("Conversion of TSV data to PDF: Input: " + tsv_fn)
    pw.setFooter("Generated by xtopdf:")

    sep = '=' * 68

    # print the TSV data to PDF
    rec_num = 0
    try:
        while True:
            row = tr.next_row()
            s = ""
            for col in row:
                s = s + col + " "
            pw.writeLine(str(rec_num).rjust(5) + ": " + s)
            rec_num += 1
    except StopIteration:
        # end of the TSV data: close the PDFWriter to save the output
        pw.close()


if __name__ == '__main__':
    main()

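The loop in the program depends on the reader raising StopIteration when the input is exhausted. Here is a minimal sketch of that pattern in Python 3 that runs without xtopdf. The class SimpleTSVReader and the function format_rows are hypothetical stand-ins written for illustration; they are not the actual TSVReader code, and the sketch collects the formatted lines in a list instead of writing them to a PDF.

```python
import os
import tempfile

class SimpleTSVReader:
    """Hypothetical stand-in for a TSV reader; not xtopdf's TSVReader."""

    def __init__(self, filename):
        self._file = open(filename, "r")

    def next_row(self):
        """Return the next row as a list of fields.

        Raises StopIteration at end of file.
        """
        line = self._file.readline()
        if line == "":
            raise StopIteration
        return line.rstrip("\n").split("\t")

    def close(self):
        self._file.close()

def format_rows(reader):
    """Number each row and join its fields with single spaces."""
    lines = []
    rec_num = 0
    try:
        while True:
            row = reader.next_row()
            # Unlike the demo's loop, join() leaves no trailing space.
            lines.append(str(rec_num).rjust(5) + ": " + " ".join(row))
            rec_num += 1
    except StopIteration:
        # End of input: release the file handle.
        reader.close()
    return lines

# Try it on a small temporary TSV file with made-up data.
with tempfile.NamedTemporaryFile("w", suffix=".tsv", delete=False) as tmp:
    tmp.write("id\tname\n1\tAlice\n")
    path = tmp.name

output = format_rows(SimpleTSVReader(path))
os.remove(path)
print("\n".join(output))
```
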
I ran the demo program like this:
python file1.tsv file1.pdf
where file1.tsv was a TSV file that I created for the purpose of testing.
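
The contents of file1.tsv are not shown in this post. As a rough sketch, a small TSV test file could be created with Python 3's standard csv module as below. The file name sample.tsv, the function write_tsv, and the rows are all made up for illustration.

```python
import csv

# Made-up sample data: a header row followed by two records.
rows = [
    ["Name", "City", "Age"],
    ["Alice", "London", "37"],
    ["Bob", "New York", "42"],
]

def write_tsv(path, rows):
    """Write rows to path as tab-separated values, one record per line."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerows(rows)

write_tsv("sample.tsv", rows)
```

A file written this way can then be passed to the demo program as its first argument.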

And here is a screenshot of the output PDF file, in Foxit PDF Reader:

- Vasudev Ram - Dancing Bison Enterprises
