jugad2 - Vasudev Ram on software innovation: Linux-utilities


Saturday, March 4, 2017

m, a Unix shell utility to save cleaned-up man pages as text


I was using this Unix utility called m today, as I often do, when working on Linux. It's a shell script that lets you save the man pages for one or more Unix commands, system calls or other topics, to text files, after cleaning up the man command output to remove formatting meant for emphasis, printing, etc.

I had first written m a while ago, on Unix boxes that I used to work on earlier, as opposed to Linux, which I use these days.
At that time it was needed, because on some Unix versions, man page output used to be formatted for printers (by text-processing tools such as nroff and troff). These tools would insert extra formatting characters in the output, for effects like bold and underscore, that made it less easy to read the text file on screen, if you simply redirected it to a file, and opened it in a text editor. (Reading the page via the man command itself would work fine.)
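To make the formatting issue concrete: nroff-style output typically produces bold by overstriking a character with itself (X, backspace, X) and underline by overstriking an underscore with the character (_, backspace, X). The Python 3 sketch below shows roughly what stripping those sequences involves. It is only an illustration under that assumption; the function name strip_overstrikes is made up for this example, and it is not how col itself is implemented.

```python
def strip_overstrikes(text):
    """Drop backspace overstrikes, keeping the last character written at each position."""
    out = []
    for ch in text:
        if ch == "\b":
            # A backspace cancels the previous character on the same line, if any.
            if out and out[-1] != "\n":
                out.pop()
        else:
            out.append(ch)
    return "".join(out)


bold = "N\bNA\bAM\bME\bE"      # "NAME" in bold, written as overstrikes
underlined = "_\bf_\bo_\bo"    # "foo" underlined
print(strip_overstrikes(bold))
print(strip_overstrikes(underlined))
```

Opened in a text editor, the raw strings above would show the backspace characters as clutter; after stripping, only the plain text remains.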

Here is the m script, shown by cat [1]:

cat ~/bin/m

cat displays:

mkdir -p ~/man
for i
do
    man "$i" | col -bx > ~/man/"$i".m
done

[1] Check out "useless use of cat" at the cat link above.

(A less-known fact is that "for i" is shorthand for 'for i in "$@"', i.e. it iterates over all the command-line arguments to the script, keeping each argument intact. Not to be confused with "for i in *" which will iterate over all the filenames in the current directory, because the * expands to that.)

m uses the convention of putting all the text files that it creates (one per command-line argument), into a directory called man, under your home directory, i.e. ~/man. If the directory does not exist, it will be created.

You have to save the above script as a file called m in a directory that is in your Unix PATH. Creating a directory called ~/bin is a good choice - your local bin directory:

mkdir ~/bin
cp m ~/bin  # Assumes you created m in your current directory.

and make it executable using chmod:

chmod u+x ~/bin/m

Now if I run m as follows, to generate (as cleaned-up text) the man pages for, say, the fopen and fclose C stdio library functions,

m fopen fclose

it creates the text files fopen.m and fclose.m in my ~/man directory.
I can then open fopen.m with the view command (vi in read-only mode):

view ~/man/fopen.m


Enjoy.

P.S. If you are new to vi and want to get up and running with it fast, check out my vi quickstart tutorial. I first wrote it at the request of a couple of friends, Windows system administrator colleagues of mine who had been given additional charge of a few Unix boxes, to help them get up to speed with vi. They later said it helped with that.

P.P.S. If you like short words and commands like m, check out the Japanese word mu for some interesting points.

A few excerpts:
[
The Japanese and Korean term mu (Japanese: 無; Korean: 무) or Chinese wú (traditional Chinese: 無; simplified Chinese: 无) meaning "not have; without" is a key word in Buddhism, especially Zen traditions.
...
Some English translation equivalents of wú or mu 無 are:
"no", "not", "nothing", or "without"[2]
nothing, not, nothingness, un-, is not, has not, not any[3]
[1] Nonexistence; nonbeing; not having; a lack of, without. [2] A negative. [3] Caused to be nonexistent. [4] Impossible; lacking reason or cause. [5] Pure human awareness, prior to experience or knowledge. This meaning is used especially by the Chan school.
...
The character wu 無 originally meant "dance" and was later used as a graphic loan for wu "not". The earliest graphs for 無 pictured a person with outstretched arms holding something (possibly sleeves, tassels, ornaments) and represented the word wu "dance; dancer".
...
The Gateless Gate, which is a 13th-century collection of Chan or Zen kōans, uses the word wu or mu in its title (Wumenguan or Mumonkan 無門關) and first kōan case ("Joshu's Dog" 趙州狗子). Chinese Chan calls the word mu 無 "the gate to enlightenment".[9] The Japanese Rinzai school classifies the Mu Kōan as hosshin 発心 "resolve to attain enlightenment", that is, appropriate for beginners seeking kenshō "to see the Buddha-nature"'.[10]
...
In the original text, the question is used as a conventional beginning to a question-and-answer exchange (mondo). The reference is to the Mahāyāna Mahāparinirvāṇa Sūtra[14] which says for example:
In this light, the undisclosed store of the Tathagata is proclaimed: "All beings have the Buddha-Nature".[15]
...
In Robert M. Pirsig's 1974 novel Zen and the Art of Motorcycle Maintenance, mu is translated as "no thing", saying that it meant "unask the question". He offered the example of a computer circuit using the binary numeral system, in effect using mu to represent high impedance:
...
"Mu" may be used similarly to "N/A" or "not applicable," a term often used to indicate the question cannot be answered because the conditions of the question do not match the reality.
...
Because of this meaning, programming language Perl 6 uses "Mu" for the root of its type hierarchy.[23]
]

The image at the top of the post is the character mu in seal script.

P.P.P.S. Really the last this time :) Another m-word:

Check out Muji - simplicity is deceptively complex.

Enjoy.

- Vasudev Ram - Online Python training and consulting


Sunday, January 8, 2017

An Unix seq-like utility in Python

By Vasudev Ram

Due to a chain (or sequence - pun intended :) of thoughts, I got the idea of writing a simple version of the Unix seq utility (command-line) in Python. (Some Unix versions have a similar command called jot.)

Note: I wrote this program just for fun. As the seq Wikipedia page says, modern versions of bash can do the work of seq. But this program may still be useful on Windows - not sure if the CMD shell has seq-like functionality or not. PowerShell probably has it, is my guess.

The seq command lets you specify one or two or three numbers as command-line arguments (some of which are optional): the start, stop and step values, and it outputs all numbers in that range and with that step between them (default step is 1). I have not tried to exactly emulate seq; instead, I've written my own version. One difference is that mine does not support the step argument (so it can only be 1), at least in this version. That can be added later. Another is that I print the numbers with spaces in between them, not newlines. Another is that I don't support floating-point numbers in this version (again, can be added).

The seq command has more uses than the above description might suggest (in fact, it is mainly used for other things than just printing a sequence of numbers - after all, who would have a need to do that much). Here is one example, on Unix (from the Wikipedia article about seq):

# Remove file1 through file17:
for n in `seq 17`
do
    rm file$n
done

Note that those are backquotes or grave accents around seq 17 in the above code snippet. It uses sh / bash syntax, so requires one of them, or a compatible shell.

Here is the code for seq1.py:

'''
seq1.py
Purpose: To act somewhat like the Unix seq command.
Author: Vasudev Ram
Copyright 2017 Vasudev Ram
Web site: https://vasudevram.github.io
Blog: https://jugad2.blogspot.com
Product store: https://gumroad.com/vasudevram
'''

import sys

def main():
    sa, lsa = sys.argv, len(sys.argv)
    if lsa < 2:
        sys.exit(1)
    try:
        start = 1
        if lsa == 2:
            end = int(sa[1])
        elif lsa == 3:
            start = int(sa[1])
            end = int(sa[2])
        else: # lsa > 3
            sys.exit(1)
    except ValueError as ve:
        sys.exit(1)

    for num in xrange(start, end + 1):
        print num, 
    sys.exit(0)
    
if __name__ == '__main__':
    main()

And here are a few runs of seq1.py, and the output of each run, below:

$ py -2 seq1.py

$ py -2 seq1.py 1
1

$ py -2 seq1.py 2
1 2

$ py -2 seq1.py 3
1 2 3

$ py -2 seq1.py 1 1
1

$ py -2 seq1.py 1 2
1 2

$ py -2 seq1.py 1 3
1 2 3

$ py -2 seq1.py 4
1 2 3 4

$ py -2 seq1.py 1 4
1 2 3 4

$ py -2 seq1.py 2 2
2

$ py -2 seq1.py 5 3

$ py -2 seq1.py -6 -2
-6 -5 -4 -3 -2

$ py -2 seq1.py -4 -0
-4 -3 -2 -1 0

$ py -2 seq1.py -5 5
-5 -4 -3 -2 -1 0 1 2 3 4 5
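The step argument that seq1.py leaves out could be added roughly as follows. This is a minimal sketch, not a tested replacement for seq1.py: it is written for Python 3 (unlike the Python 2 listing above), the names seq_numbers, parse_args and main are my own for this example, and the three-argument order (first, increment, last) is borrowed from seq.

```python
def seq_numbers(start, end, step=1):
    """Return the integers from start to end inclusive, moving by step."""
    if step == 0:
        raise ValueError("step must not be zero")
    # range() excludes its stop value, so move it one past end in the
    # direction of travel to make end inclusive.
    stop = end + 1 if step > 0 else end - 1
    return list(range(start, stop, step))


def parse_args(args):
    """Map 1, 2 or 3 integer arguments to (start, end, step)."""
    nums = [int(a) for a in args]
    if len(nums) == 1:
        return 1, nums[0], 1
    if len(nums) == 2:
        return nums[0], nums[1], 1
    if len(nums) == 3:
        # Same order as seq: first, increment, last.
        return nums[0], nums[2], nums[1]
    raise ValueError("expected 1 to 3 integer arguments")


def main(argv):
    """Print the sequence for argv; return an exit status like seq1.py does."""
    try:
        start, end, step = parse_args(argv[1:])
        numbers = seq_numbers(start, end, step)
    except ValueError:
        return 1
    print(" ".join(str(n) for n in numbers))
    return 0


print(seq_numbers(1, 9, 2))
print(seq_numbers(5, 1, -2))
```

To use it as a script, one would call main(sys.argv) and pass its return value to sys.exit(). A negative step counts downward, and an empty range (such as 5 to 3 with a positive step) yields no numbers, as in seq1.py.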

There are many other possible uses for seq, if one uses one's imagination, such as rapidly generating various filenames or directory names, with numbers in them (as a prefix, suffix or in the middle), for testing or other purposes, etc.

- Enjoy.

- Vasudev Ram - Online Python training and consulting


Friday, May 8, 2015

tabtospaces, utility to change tabs to spaces in Python files

By Vasudev Ram

Near the end of a recent blog post, asciiflow.com: Draw flowcharts online, in ASCII, I showed how this small snippet of Python code can be used to make a Python program usable as a component in a Unix pipeline:

for lin in sys.stdin:
    sys.stdout.write(process(lin))

Today I saw Raymond Hettinger (@raymondh)'s tweet about the -t and -tt command line options of Python:

#python tip: In Python 2, the -tt option raises an error when you foolishly mix spaces and tabs. In Python 3, that is always an error.

That made me think of writing a simple Python 2 tool to change the tabs in a Python file to spaces. Yes, I know it can be easily done in Unix or Windix [1] with any of sed / awk / tr etc. That's not the point. So here is tabtospaces.py:

import sys
for lin in sys.stdin:
    sys.stdout.write(lin.replace("\t", "    "))

[ Note: this code converts each tab into 4 spaces. It can be parameterized by passing a command-line option that specifies the number of spaces, such as 4 or 8, and then replacing each tab with that many spaces. Also note that I have not tested the program on many sets of data, just one for now. ]
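The parameterization mentioned in the note could look something like this. It is a sketch only, written for Python 3; the function names tabs_to_spaces and convert and the width parameter are made up for this example, and it is demonstrated on in-memory text rather than on real files.

```python
import io


def tabs_to_spaces(line, width=4):
    """Replace every tab in line with width spaces."""
    return line.replace("\t", " " * width)


def convert(in_stream, out_stream, width=4):
    """Copy in_stream to out_stream, converting tabs line by line."""
    for lin in in_stream:
        out_stream.write(tabs_to_spaces(lin, width))


# Small demonstration using in-memory streams instead of stdin/stdout.
src = io.StringIO("for arg in args:\n\tprint(arg)\n")
dst = io.StringIO()
convert(src, dst, width=8)
print(repr(dst.getvalue()))
```

As a pipeline filter, the call would be convert(sys.stdin, sys.stdout, width), with width read from the command line. Note that this replaces each tab with a fixed number of spaces; Python's built-in str.expandtabs() instead pads to the next tab stop, which gives different results when a tab follows other text on the line.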

I created a simple Python file, test1.py, that has mixed tabs and spaces to use as input to tabtospaces.py. Then I ran the following commands:

$ py -tt test1.py
  File "test1.py", line 4
    print arg,
              ^
TabError: inconsistent use of tabs and spaces in indentation

$ py tabtospaces.py < test1.py > test2.py

$ py -tt test2.py
0 1 2 3 4 5 6 7 8 9

which shows that tabtospaces.py does convert the tabs to spaces.

And you can see from this diff that the original test1.py and the test2.py generated by tabtospaces.py differ only in the use of tabs vs. spaces:

$ fc /l test1.py test2.py
Comparing files test1.py and TEST2.PY
***** test1.py
    for arg in args:
                print arg,

***** TEST2.PY
    for arg in args:
        print arg,

*****

[1] Windix is the latest upcoming Unix-compatible OS from M$, due Real Soon Now. You heard it here first - TM.

- Vasudev Ram - Online Python training and programming

Dancing Bison Enterprises


Friday, October 24, 2014

Print selected text pages to PDF with Python, selpg and xtopdf on Linux

By Vasudev Ram

In a recent blog post, titled My IBM developerWorks article, I talked about a tutorial that I had written for IBM developerWorks a while ago. The tutorial showed some of the recommended techniques and practices to follow when writing a Linux command-line utility that is intended for production use, and how to write it in such a way that it can easily cooperate with existing UNIX command-line tools, when used in a UNIX command pipeline.

This ability of properly written command-line tools to cooperate with each other when used in a pipeline, is, as I said in that IBM article, one of the keys to the power of Linux (and UNIX) as a development environment. (See the classic book The UNIX Programming Environment, for much more on this topic.)

The utility I wrote and discussed (in that IBM article), called selpg (for SELect PaGes), allows the user to select a specified range of pages from a text file. At the end of the aforementioned blog post, I had said that I would show some practical uses of the selpg utility later. I describe one such use case below, involving a combination of selpg and my xtopdf toolkit, which is a Python library for PDF creation.

(The xtopdf toolkit contains a PDF creation library, and also includes some sample applications that show how to use the library to create PDF output in various ways, and from various input sources, which is why I tend to call xtopdf a toolkit instead of just a library.)

I had written one such application of xtopdf a while ago, called StdinToPDF(.py) (for standard input to PDF). I blogged about it at the time, here:

[xtopdf] PDFWriter can create PDF from standard input. (PDFWriter is a module of xtopdf, which provides the core PDF creation functionality.)

The selpg utility can be used with StdinToPDF, in a pipeline, to select a range of pages (by starting and ending page numbers) from a (possibly large) text file, and write only those selected pages to a PDF file. Here is an example of how to do that:

First, build the selpg utility from source, for your Linux OS. selpg is only meant to work on Linux, since it uses some Linux C standard library functions, such as from stdio.h, and popen(); but you can try to run it on Windows (at your own risk), since Windows does have (had?) a POSIX subsystem, from Windows NT onward. I have used it in the past. (Update: I checked - according to this section of the Wikipedia article about POSIX, Windows may have had POSIX support only from Windows NT up to Windows 2000.) Anyway, to build selpg on Linux, follow the steps below (the $ sign is the shell prompt and not to be typed):

1. Download the source code from the sources section of the selpg project repository on Bitbucket.

Download all of these files: makefile, mk, selpg.c and showsyserr.c .

2. Make the (shell script) file mk executable, with the command:

$ chmod u+x mk

3. Then run the file mk, with:

$ ./mk

That will run the makefile that builds the selpg executable using the C compiler on your Linux box. The C compiler (invoked as cc or gcc) is installed on most mainstream Linux distributions. If it is not, you will need to install it from the repository for your Linux distribution. Sometimes only a minimal version of a C compiler is installed, which is only enough to (re)compile the kernel after making kernel parameter changes, such as for performance tuning. Consult your local Linux expert for help if such is the case.

4. Now make the file selpg executable, with the command:

$ chmod u+x selpg

5. (Optional) You can check the usage of selpg by reading the IBM tutorial article and/or running selpg without any command-line arguments:

$ ./selpg

which will show a usage message.

6. (Optional) You can run selpg a few times with some text file(s) as input, and different values for the -s and -e command-line options, to get a feel for how it works.

Now download xtopdf (which includes StdinToPDF) from here:

xtopdf on Bitbucket.

To install it, follow the steps given in this post:

Guide to installing and using xtopdf, including creating simple PDF e-books

That post was written a while ago, when xtopdf was hosted on SourceForge. So you need to make one change to the instructions given in that guide: instead of downloading xtopdf from SourceForge, as stated in Step 5 of the guide, get it from the xtopdf Bitbucket link I gave above.

(To make xtopdf work, you also have to install ReportLab, which xtopdf uses internally; the steps for that are given in my xtopdf installation guide linked above, or you can also look at the instructions in the ReportLab distribution. It is easy, just a couple of steps - download, unzip, configure a setting or two.)

Once you have both selpg and xtopdf installed, you can use selpg and StdinToPDF together. Here is an example run, to select only pages 2 through 4 from an input text file:

I wrote a simple Python program, gen_selpg_test_file.py, to create a text file that can be used to test the selpg and StdinToPDF programs together.

Here is an excerpt of the core logic of gen_selpg_test_file.py, omitting argument and error handling for brevity (I have those in the actual code):

# Generate the test file with the given filename and number of lines of text.
    try:
        out_fil = open(out_filename, "w")
    except IOError as ioe:
        sys.stderr.write("Error: Could not open output file {}.\n".format(out_filename))
        sys.exit(1)
    for line_num in range(1, num_lines + 1):
        line = "Line #" + str(line_num).zfill(10) + "\n"
        out_fil.write(line)
    out_fil.close()

I ran it like this:

$ python gen_selpg_test_file.py selpg_test_file_1000.txt 1000

to generate a text file with 1000 lines, in the file selpg_test_file_1000.txt .

Then I could run the pipeline using selpg and StdinToPDF, as described above:

$ ./selpg -s2 -e4 selpg_test_file_1000.txt | python StdinToPDF.py p2-p4.pdf

This command extracts only the specified pages (2 to 4) from the input file, and pipes them to StdinToPDF, which converts those pages only, to PDF, in the filename specified at the end of the command.

After doing the above, you can open the file p2-p4.pdf in your favorite PDF reader (Evince is one PDF reader for Linux), to confirm that it contains all (and only) the lines from page 2 to 4 of the input file selpg_test_file_1000.txt (considering 72 lines per page, which is the default that selpg uses).

Read the IBM article to see how that default can be changed - to either another number of lines per page, e.g. 66 or 80 or whatever, or to specify form feeds (ASCII code 12) as the page delimiter. Form feeds are often used as a page delimiter in text file reports generated by programs, when the reports are destined for a printer, since the form feed character causes the printer to advance the print head to the top of the next page/form (that's how the character got its name).
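To illustrate the two ways of delimiting pages described above, here is a rough Python 3 sketch of the page-selection arithmetic. It is not selpg's actual C code, just my approximation of the idea; the function names are hypothetical, and the form-feed version assumes that each form feed simply ends a page.

```python
def select_pages_by_lines(lines, start_page, end_page, page_len=72):
    """Return the lines on pages start_page..end_page (1-based, inclusive)."""
    first = (start_page - 1) * page_len   # index of the first line of start_page
    last = end_page * page_len            # one past the last line of end_page
    return lines[first:last]


def select_pages_by_form_feed(text, start_page, end_page):
    """Same idea, but pages are the chunks between form feed characters."""
    pages = text.split("\f")
    return "\f".join(pages[start_page - 1:end_page])


# Mimic the 1000-line test file generated earlier in this post.
lines = ["Line #" + str(n).zfill(10) + "\n" for n in range(1, 1001)]
selected = select_pages_by_lines(lines, 2, 4)
print(len(selected), selected[0].strip(), selected[-1].strip())
```

With 72 lines per page, pages 2 to 4 cover lines 73 through 288 of the test file, i.e. 216 lines, which is what the PDF generated by the pipeline above should contain.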

Though this post seemed long, note that a lot of it was either background information or instructions on how to build selpg and install xtopdf. Those are both one-time jobs. Once those are done, you can select the needed pages from any text file and print them to PDF with a single command line, as shown in the last command above.

This is useful when you printed the entire file earlier, and some pages didn't print properly because the printer jammed. Just use selpg with xtopdf to print only the needed pages again.

The image above is from the Wikipedia article on Printing, and titled:

Jikji, "Selected Teachings of Buddhist Sages and Son Masters" from Korea, the earliest known book printed with movable metal type, 1377. Bibliothèque Nationale de France, Paris

- Enjoy.

- Vasudev Ram - Dancing Bison Enterprises


Thursday, October 9, 2014

The Linux Foundation's new Linux Certification program

By Vasudev Ram

Saw this recently via the newsletter I get from The Linux Foundation:

The Linux Foundation is introducing a new Linux certification program. It will be available anywhere, online.

Jim Zemlin, the executive director of the Linux Foundation, has details about it in this blog post:

Linux Growth Demands Bigger Talent Pool

There are two certifications:

Linux Foundation Certified System Administrator (LFCS)

Linux Foundation Certified Engineer (LFCE)

These Linux certifications are likely to be a good value addition to anyone seeking to start or grow a career involving Linux, since they are from the official foundation that is behind Linux - the Linux Foundation, which does a lot of work related to sponsoring Linux development (*), conducting conferences like LinuxCon, etc.

In fact, the Linux Foundation sponsors the work of Linus Torvalds, the founder of Linux - Linus is a Linux Foundation Fellow. See this page about the Linux Fellow Program - Linus's name is at the top of the list of Linux Fellows.
On a related note, if you are into Linux and would like to learn how to write Linux command-line utilities in C, check out this blog post by me on the topic of Developing a Linux command-line utility in C, an article I wrote for IBM developerWorks a while ago. It got many views and a 4-star rating, and some people have told me they used the article (which is a tutorial) as a guide to developing command-line utilities on Linux for production use.

- Vasudev Ram - Python and Linux training and consulting - Dancing Bison Enterprises


Sunday, September 28, 2014

My IBM developerWorks article: Developing a Linux command-line utility

By Vasudev Ram

I had written an article about Developing a Linux command-line utility for IBM developerWorks (IBM dW), some years ago. It was a tutorial on how to write Linux command-line utilities in C. It used a real-life Linux utility that I had earlier written [1], to show some of the techniques involved in writing such utilities for general-purpose use.

[1] I had originally written the utility for production use for one of the largest motorcycle manufacturers in the world.

The article was fairly well-received while it was on the site (for a long time) and received multiple four-star ratings (out of a possible five stars). It was viewed over 35,000 times. Since it was recently archived from the IBM dW site, I thought of putting up the article - as a PDF file [2], with the accompanying source code, in a project on my Bitbucket account, for the benefit of those interested in learning how to write Linux command-line utilities in C. The name of the utility was selpg (for select pages), so I named the project selpg on Bitbucket too.

[2] I got to know that the article had been archived from the IBM dW site, and wrote to them asking for a copy of the PDF of the article, which they kindly sent me.

Here is the selpg project on Bitbucket:

Developing a Linux command-line utility (selpg)

And you can get the article and all the source files here:

selpg source

In an upcoming post, I'll show a few practical uses of the selpg utility.

Enjoy.

- Vasudev Ram - Dancing Bison Enterprises. Python, C, Linux and open source consulting and training.

