Showing posts with label Linux. Show all posts
Wednesday, February 6, 2019
Exploring the /proc filesystem: an article by me in Linux Pro Magazine
- By Vasudev Ram - Online Python training / SQL training / Linux training
Hi, readers,
Somewhat recently, I wrote this article which was published in Linux Pro Magazine:
Exploring the /proc filesystem with Python and shell commands
As the title suggests, it is about getting information from the Linux /proc file system, which is a pseudo-file system that contains different kinds of information about running processes. The article shows some ways of getting a few kinds of information of interest about one or more specified processes from /proc, using both Python programs and Linux shell commands or scripts. It also shows a bit of shell quoting magic.
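To give a flavour of the approach, here is a minimal sketch, assuming a Linux system with /proc mounted. It is not the code from the article, and the helper names parse_status and read_status are made up for this illustration. It reads the status file of a process and picks out a few fields.

```python
import os


def parse_status(text):
    """Parse the 'Key:<tab>value' lines of a /proc/<pid>/status file into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip any line that has no colon
            info[key.strip()] = value.strip()
    return info


def read_status(pid="self"):
    """Read and parse /proc/<pid>/status (Linux only); 'self' is the current process."""
    path = "/proc/{}/status".format(pid)
    with open(path) as f:
        return parse_status(f.read())


if __name__ == "__main__":
    if os.path.exists("/proc/self/status"):
        status = read_status()
        for key in ("Name", "Pid", "PPid", "State", "VmRSS"):
            print("{}: {}".format(key, status.get(key, "(not available)")))
    else:
        print("No /proc file system here; this sketch needs Linux.")
```

Passing a numeric process ID instead of "self" reads the status of another process, subject to the usual permissions.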
(The article has a few small errors that crept in, late in the publishing process, but any programmer with a bit of Python knowledge will be able to spot them and still understand the article.)
Check it out.
Enjoy.
- Vasudev
- Vasudev Ram - Online Python training and consulting
I conduct online courses on Python programming, Unix / Linux commands and shell scripting and SQL programming and database design, with course material and personal coaching sessions.
The course details and testimonials are here.
Contact me for details of course content, terms and schedule.
Try FreshBooks: Create and send professional looking invoices in less than 30 seconds.
Getting a new web site or blog, and want to help preserve the environment at the same time? Check out GreenGeeks.com web hosting.
Sell your digital products via DPD: Digital Publishing for Ebooks and Downloads.
Learning Linux? Hit the ground running with my vi quickstart tutorial. I wrote it at the request of two Windows system administrator friends who were given additional charge of some Unix systems. They later told me that it helped them to quickly start using vi to edit text files on Unix. Of course, vi/vim is one of the most ubiquitous text editors around, and works on most other common operating systems and on some uncommon ones too, so the knowledge of how to use it will carry over to those systems too.
Check out WP Engine, powerful WordPress hosting.
Creating online products for sale? Check out ConvertKit, email marketing for online creators.
Teachable: feature-packed course creation platform, with unlimited video, courses and students.
Posts about: Python * DLang * xtopdf
My ActiveState Code recipes
Follow me on:
Friday, January 18, 2019
Announcing PIaaS - Python Interviewing as a Service
Hello, readers,
Announcing Python Interviewing as a Service:
I'm now officially offering PIaaS - Python Interviewing as a Service. I have done this informally for clients before. Recently a couple of companies asked me for help with it again, so I am now adding it to my list of offered services, the others being consulting (software design and development, code review, technology evaluation and recommendation) and software training.
I can help your organization interview and hire Python developer candidates, offloading (some of) that work from your core technical and HR / recruitment staff.
I can also interview on related areas like SQL and RDBMS, and Unix and Linux commands and shell scripting.
I have long-term experience in all the above areas.
To hire me for PIaaS or to learn more about it, contact me via the Gmail address on my site's contact page.
- Vasudev Ram
My Codementor profile: Vasudev Ram on Codementor
Saturday, December 29, 2018
The Zen of Python is well sed :)
- By Vasudev Ram - Online Python training / SQL training / Linux training
$ python -c "import this" | sed -n "4,4p;15,16p"
Explicit is better than implicit.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
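For comparison, here is a rough Python sketch of what the sed part of the one-liner does: print only the lines whose numbers fall in the given ranges. The function names select_lines and zen_lines are made up for this example, and it assumes the ranges do not overlap (sed would print a line twice if they did).

```python
import subprocess
import sys


def select_lines(lines, ranges):
    """Return the lines whose 1-based numbers fall in any (first, last) range,
    in input order, roughly like sed -n 'first,lastp'."""
    return [line for num, line in enumerate(lines, start=1)
            if any(first <= num <= last for first, last in ranges)]


def zen_lines():
    """Capture the output of 'import this' from a child Python process."""
    result = subprocess.run([sys.executable, "-c", "import this"],
                            stdout=subprocess.PIPE, universal_newlines=True,
                            check=True)
    return result.stdout.splitlines()


if __name__ == "__main__":
    for line in select_lines(zen_lines(), [(4, 4), (15, 16)]):
        print(line)
```

The range (4, 4) mirrors the "4,4p" in the sed command, which selects the single line 4.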
Labels:
command-line,
humor,
humour,
Linux,
one-liners,
python,
Python-one-liners,
The-Zen-of-Python,
UNIX,
Zen-of-Python
Saturday, November 10, 2018
VIDEO: The History of Unix, by Rob Pike
By Vasudev Ram
This is a video about Unix history by Rob Pike, Unix veteran and co-creator of the Go programming language.
The History of Unix, Rob Pike
The video is also embedded below.
There is also a current HN thread about it.
- Enjoy.
Labels:
Linux,
operating-systems,
Rob-Pike,
tech-videos,
UNIX,
Unix-history,
videos
Saturday, May 5, 2018
A Python version of the Linux watch command
By Vasudev Ram
Watcher image attribution: Yours truly
Hi readers,
[ Update: A note to those reading this post via Planet Python or other aggregators:
Before first publishing it, I reviewed the post in Blogger's preview mode, and it appeared okay, regarding the use of the less-than character, so I did not escape it. I did not know (or did not remember) that Planet Python's behavior may be different. As a result, the code had appeared without the less-than signs in the Planet, thereby garbling it. After noticing this, I fixed the issue in the post. Apologies to those seeing the post twice as a result. ]
I was browsing Linux command man pages (section 1) for some work, and saw the page for an interesting command called watch. I had not come across it before. So I read the watch man page, and after understanding how it works (it's pretty straightforward [1]), thought of creating a Python version of it. I have not tried to implement exactly the same functionality as watch, though, just something similar to it. I called the program watch.py.
[1] The one-line description of the watch command is:
watch - execute a program periodically, showing output fullscreen
How watch.py works:
It is a command-line Python program. It takes an interval argument (in seconds), followed by a command with optional arguments. It runs the command with those arguments, repeatedly, at that interval. (The Linux watch command has a few more options, but I chose not to implement those in this version. I may add some of them [2], and maybe some other features that I thought of, in a future version.)
[2] For example, the -t, -b and -e options should be easy to implement. The -p (--precise) option is interesting. The idea here is that there is always some time "drift" [3] when trying to run a command periodically at some interval, due to unpredictable and variable overhead of other running processes, OS scheduling overhead, and so on. I had experienced this issue earlier when I wrote a program that I called pinger.sh, at a large company where I worked earlier.
[3] You can observe the time drift in the output of the runs of the watch.py program, shown further below, after its code. Compare the interval with the times shown for successive runs of the same command.
I had written pinger.sh at the request of some sysadmin friends there, who wanted a tool like that to monitor the uptime of multiple Unix servers on the company network. So I wrote the tool, using a combination of Unix shell, Perl and C. They later told me that it was useful, and they used it to monitor the uptime of multiple servers of the company in different cities. The C part was where the more interesting stuff was, since I used C to write a program (used in the overall shell script) that tried to compensate for the time drift, by calculating the time remaining until the next run and sleeping for that long. It worked somewhat okay, in that it reduced the drift a good amount. I don't remember the exact logic I used for it right now, but do remember finding out later that the gettimeofday function might have been usable in place of the custom code I wrote to solve the issue. Good fun. I later published the utility and a description of it in the company's Knowledge Management System.
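One common way to reduce this kind of drift can be sketched in Python as follows. This is only an illustration of the general idea (sleep for the time remaining until the next scheduled slot, instead of for the full interval each time). It is not the logic of the old C program, which is not recalled above, nor necessarily how watch's -p option is implemented. The names next_delay and run_periodically are made up for this sketch.

```python
import time


def next_delay(start, now, interval):
    """Seconds to sleep so that the next run lands on the next multiple of
    interval after start; start and now must come from the same clock."""
    elapsed = now - start
    runs_due = int(elapsed // interval) + 1  # index of the next scheduled slot
    return start + runs_due * interval - now


def run_periodically(task, interval, count):
    """Call task() count times, aiming for start, start + interval, ..."""
    start = time.monotonic()
    for n in range(count):
        if n:
            # Sleep only for what is left of the current slot, so the time
            # taken by task() itself does not accumulate as drift.
            time.sleep(next_delay(start, time.monotonic(), interval))
        task()


if __name__ == "__main__":
    t0 = time.monotonic()
    run_periodically(lambda: print("{:.2f}".format(time.monotonic() - t0)), 0.5, 4)
```

If a run overshoots one or more whole intervals, next_delay skips ahead to the next future slot rather than trying to catch up.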
Anyway, back to watch.py: each time, it first prints a header line with the interval, the command string (truncated if needed), and the current date and time, followed by some initial lines of the output of that command (this is what "watching" the command means). It does this by creating a pipe with the command, using subprocess.Popen and then reading the standard output of the command, and printing the first num_lines lines, where num_lines is an argument to the watch() function in the program.
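The pipe-reading step on its own can be sketched like this. It is a variation for illustration, not the watch.py code itself: the function name first_lines is made up here, and itertools.islice is used to stop after num_lines lines.

```python
import sys
from itertools import islice
from subprocess import Popen, PIPE


def first_lines(command, num_lines):
    """Run command through the shell and return at most num_lines lines
    of its standard output, as text."""
    proc = Popen(command, shell=True, stdout=PIPE, universal_newlines=True)
    try:
        # islice stops reading once num_lines lines have been taken.
        return list(islice(proc.stdout, num_lines))
    finally:
        proc.stdout.close()
        if proc.poll() is None:
            # The command may still be running (or blocked writing to the
            # closed pipe); stop it so that wait() cannot hang.
            proc.terminate()
        proc.wait()


if __name__ == "__main__":
    # Quoting here assumes a POSIX shell; adjust the command for Windows.
    demo = '"{}" -c "for i in range(100): print(i)"'.format(sys.executable)
    for line in first_lines(demo, 5):
        print(line, end="")
```

Since shell=True hands the string to the shell, it should only be used with commands you trust, such as ones you typed yourself.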
The screen is cleared with "clear" on Linux and "cls" on Windows. Using "echo ^L" instead of "clear" works on some Linux systems, so switching to it may make the program a little faster on systems where echo is a shell built-in, since the clear command then does not have to be loaded into memory each time [4]. (As a small aside: on some earlier Unix systems I worked on, there was no clear command, or it was not installed. As a workaround, I used to write a small C program that printed 25 newlines to the screen, and compile and install it as a command called clear or cls :)
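Another way to avoid starting a process for each clear, not used in watch.py, is to write an ANSI escape sequence straight to the terminal. The sketch below assumes an ANSI-capable terminal (most Linux terminals are; older Windows consoles may not be), and the names clear_command and clear_screen are made up for it. It also shows a platform check that uses startswith, since a plain substring test for "win" would match "darwin" (macOS) as well.

```python
import sys

# Erase the whole display, then move the cursor to the top-left corner.
ANSI_CLEAR = "\033[2J\033[H"


def clear_command(platform=sys.platform):
    """Name of the external clear-screen command for a platform string."""
    return "cls" if platform.startswith("win") else "clear"


def clear_screen(stream=sys.stdout):
    """Clear an ANSI-capable terminal without starting another process."""
    stream.write(ANSI_CLEAR)
    stream.flush()


if __name__ == "__main__":
    clear_screen()
    print("External command on this platform would be:", clear_command())
```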
[4] Although, on recent Windows and Linux systems, I've noticed that after a program is run once, it starts up faster if you run it again a short while later. This is most likely the OS file cache at work (the page cache, on Linux): the files holding the program's code stay cached in memory after the first run, so later runs do not have to read them from disk again. I've noticed for sure that when running Python programs, for example, the first time you run:
python some_script_name.py
it takes a bit of time - maybe a second or three, but after the first time, it starts up faster. Of course this speedup disappears when you run the same program after a bigger gap, say the next day, or after a reboot. Presumably this is because that program cache has been cleared.
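A rough way to look at this yourself is to time a few consecutive runs of the same command. The sketch below is only illustrative; the function name time_runs is made up. Note that timing the Python interpreter from within Python will mostly show warm runs, because the interpreter's files are already cached by the time the script is running. To see the cold-start effect, pass a command that has not been run recently.

```python
import subprocess
import sys
import time


def time_runs(argv, runs=5):
    """Return the wall-clock seconds taken by each of several consecutive
    runs of the command given as an argument list."""
    timings = []
    for _ in range(runs):
        begin = time.perf_counter()
        subprocess.run(argv, check=True)
        timings.append(time.perf_counter() - begin)
    return timings


if __name__ == "__main__":
    for n, seconds in enumerate(time_runs([sys.executable, "-c", "pass"]), 1):
        print("run {}: {:.3f}s".format(n, seconds))
```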
Here is the code for watch.py.
"""
------------------------------------------------------------------
File: watch.py
Version: 0.1
Purpose: To work somewhat like the Linux watch command.
See: http://man7.org/linux/man-pages/man1/watch.1.html
Does not try to replicate its functionality exactly.
Author: Vasudev Ram
Copyright 2018 Vasudev Ram
Web site: https://vasudevram.github.io
Blog: https://jugad2.blogspot.com
Product store: https://gumroad.com/vasudevram
Twitter: https://mobile.twitter.com/vasudevram
------------------------------------------------------------------
"""

from __future__ import print_function
import sys
import os
from subprocess import Popen, PIPE
import time

from error_exit import error_exit

# Assuming 25-line terminal. Adjust if different.
# If on Unix / Linux, can get value of environment variable
# LINES (if defined) and use that instead of 25.
DEFAULT_NUM_LINES = 20

def usage(args):
    lines = [
        "Usage: python {} interval command [ argument ... ]".format(
            args[0]),
        "Run command with the given arguments every interval seconds,",
        "and show some initial lines from command's standard output.",
        "Clear screen before each run.",
    ]
    for line in lines:
        sys.stderr.write(line + '\n')

def watch(command, interval, num_lines):
    # Truncate command for display in the header of watch output.
    if len(command) > 50:
        command_str = command[:50] + "..."
    else:
        command_str = command
    hdr_part_1 = "Every {}s: {} ".format(interval, command_str)
    # Assuming 80 columns terminal width. Adjust if different.
    # If on Unix / Linux, can get value of environment variable
    # COLUMNS (if defined) and use that instead of 80.
    columns = 80
    # Compute pad_len only once, before the loop, because
    # neither len(hdr_part_1) nor len(hdr_part_2) change,
    # even though hdr_part_2 is recomputed each time in the loop.
    hdr_part_2 = time.asctime()
    pad_len = columns - len(hdr_part_1) - len(hdr_part_2) - 1
    while True:
        # Clear screen based on OS platform.
        # Use startswith, because "win" is also a substring of "darwin".
        if sys.platform.startswith("win"):
            os.system("cls")
        else:
            os.system("clear")
        hdr_str = hdr_part_1 + (" " * pad_len) + hdr_part_2
        print(hdr_str + "\n")
        # Run the command, read and print its output up to num_lines lines.
        # os.popen is the old deprecated way, Python docs recommend to use
        # subprocess.Popen.
        #with os.popen(command) as pipe:
        # universal_newlines=True makes the pipe yield text rather than
        # bytes on Python 3.
        with Popen(command, shell=True, stdout=PIPE,
                   universal_newlines=True).stdout as pipe:
            for line_num, line in enumerate(pipe):
                # Check before printing, so at most num_lines lines appear.
                if line_num >= num_lines:
                    break
                print(line, end='')
        time.sleep(interval)
        hdr_part_2 = time.asctime()

def main():
    sa, lsa = sys.argv, len(sys.argv)
    # Check arguments and exit if invalid.
    if lsa < 3:
        usage(sa)
        error_exit(
            "At least two arguments are needed: interval and command;\n"
            "optional arguments can be given following command.\n")
    try:
        # Get the interval argument as an int.
        interval = int(sa[1])
        if interval < 1:
            error_exit("{}: Invalid interval value: {}".format(sa[0],
                interval))
        # Build the command to run from the remaining arguments.
        command = " ".join(sa[2:])
        # Run the command repeatedly at the given interval.
        watch(command, interval, DEFAULT_NUM_LINES)
    except ValueError as ve:
        error_exit("{}: Caught ValueError: {}".format(sa[0], str(ve)))
    except OSError as ose:
        error_exit("{}: Caught OSError: {}".format(sa[0], str(ose)))
    except Exception as e:
        error_exit("{}: Caught Exception: {}".format(sa[0], str(e)))

if __name__ == "__main__":
    main()

Here is the code for error_exit.py, which watch imports.
# error_exit.py
# Author: Vasudev Ram
# Web site: https://vasudevram.github.io
# Blog: https://jugad2.blogspot.com
# Product store: https://gumroad.com/vasudevram

# Purpose: This module, error_exit.py, defines a function with
# the same name, error_exit(), which takes a string message
# as an argument. It prints the message to sys.stderr, or
# to another file object open for writing (if given as the
# second argument), and then exits the program.
# The function error_exit can be used when a fatal error condition occurs,
# and you therefore want to print an error message and exit your program.

import sys

def error_exit(message, dest=sys.stderr):
    dest.write(message)
    sys.exit(1)

def main():
    # error_exit() exits by raising SystemExit, so catch it after each of
    # the first two tests; otherwise only the first call would ever run.
    try:
        error_exit("Testing error_exit with dest sys.stderr (default).\n")
    except SystemExit:
        pass
    try:
        error_exit("Testing error_exit with dest sys.stdout.\n", sys.stdout)
    except SystemExit:
        pass
    with open("temp1.txt", "w") as fil:
        error_exit("Testing error_exit with dest temp1.txt.\n", fil)

if __name__ == "__main__":
    main()

Here are some runs of watch.py and their output:
(BTW, the dfs command shown, is from the Quick-and-dirty disk free space checker for Windows post that I had written recently.)
$ python watch.py 15 ping google.com

Every 15s: ping google.com        Fri May 04 21:15:56 2018

Pinging google.com [2404:6800:4007:80d::200e] with 32 bytes of data:
Reply from 2404:6800:4007:80d::200e: time=117ms
Reply from 2404:6800:4007:80d::200e: time=109ms
Reply from 2404:6800:4007:80d::200e: time=117ms
Reply from 2404:6800:4007:80d::200e: time=137ms

Ping statistics for 2404:6800:4007:80d::200e:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 109ms, Maximum = 137ms, Average = 120ms

Every 15s: ping google.com        Fri May 04 21:16:14 2018

Pinging google.com [2404:6800:4007:80d::200e] with 32 bytes of data:
Reply from 2404:6800:4007:80d::200e: time=501ms
Reply from 2404:6800:4007:80d::200e: time=56ms
Reply from 2404:6800:4007:80d::200e: time=105ms
Reply from 2404:6800:4007:80d::200e: time=125ms

Ping statistics for 2404:6800:4007:80d::200e:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 56ms, Maximum = 501ms, Average = 196ms

Every 15s: ping google.com        Fri May 04 21:16:33 2018

Pinging google.com [2404:6800:4007:80d::200e] with 32 bytes of data:
Reply from 2404:6800:4007:80d::200e: time=189ms
Reply from 2404:6800:4007:80d::200e: time=141ms
Reply from 2404:6800:4007:80d::200e: time=245ms
Reply from 2404:6800:4007:80d::200e: time=268ms

Ping statistics for 2404:6800:4007:80d::200e:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 141ms, Maximum = 268ms, Average = 210ms

$ python watch.py 15 c:\ch\bin\date

Every 15s: c:\ch\bin\date        Tue May 01 00:33:00 2018

Tue May 1 00:33:00 India Standard Time 2018

Every 15s: c:\ch\bin\date        Tue May 01 00:33:15 2018

Tue May 1 00:33:16 India Standard Time 2018

Every 15s: c:\ch\bin\date        Tue May 01 00:33:31 2018

Tue May 1 00:33:31 India Standard Time 2018

Every 15s: c:\ch\bin\date        Tue May 01 00:33:46 2018

Tue May 1 00:33:47 India Standard Time 2018

In one CMD window:

$ d:\temp\fill-and-free-disk-space

In another:

$ python watch.py 10 dfs d:\

Every 10s: dfs d:\        Tue May 01 00:43:25 2018

Disk free space on d:\
37666.6 MiB = 36.78 GiB

Every 10s: dfs d:\        Tue May 01 00:43:35 2018

Disk free space on d:\
37113.7 MiB = 36.24 GiB

$ python watch.py 20 dir /b "|" sort

Every 20s: dir /b | sort        Fri May 04 21:29:41 2018

README.txt
runner.py
watch-outputs.txt
watch-outputs2.txt
watch.py
watchnew.py

$ python watch.py 10 ping com.nosuchsite

Every 10s: ping com.nosuchsite        Fri May 04 21:30:49 2018

Ping request could not find host com.nosuchsite. Please check the name and try again.

$ python watch.py 20 dir z:\

Every 20s: dir z:\        Tue May 01 00:54:37 2018

The system cannot find the path specified.

$ python watch.py 2b echo testing
watch.py: Caught ValueError: invalid literal for int() with base 10: '2b'

$ python watch.py 20 foo

Every 20s: foo        Fri May 04 21:33:35 2018

'foo' is not recognized as an internal or external command,
operable program or batch file.

$ python watch.py -1 foo
watch.py: Invalid interval value: -1

- Enjoy.
- Vasudev Ram - Online Python training and consulting
Labels:
command-line,
command-line-utilities,
Linux,
python,
Python-utilities,
UNIX,
utilities,
watch.py
Saturday, March 4, 2017
m, a Unix shell utility to save cleaned-up man pages as text
By Vasudev Ram
mu image attribution
I was using this Unix utility called m today, as I often do, when working on Linux. It's a shell script that lets you save the man pages for one or more Unix commands, system calls or other topics, to text files, after cleaning up the man command output to remove formatting meant for emphasis, printing, etc.
I had first written m a while ago, on Unix boxes that I used to work on earlier, as opposed to Linux, which I use these days.
At that time it was needed, because on some Unix versions, man page output used to be formatted for printers (by text-processing tools such as nroff and troff). These tools would insert extra formatting characters in the output, for effects like bold and underscore. If you simply redirected the man output to a file and opened it in a text editor, those characters made the text harder to read on screen. (Reading the page via the man command itself would work fine.)
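To make the problem concrete: nroff-style output typically produces bold by printing a character, a backspace, and the same character again, and underlining by printing an underscore, a backspace, and then the character. Below is a minimal Python sketch of the clean-up. It is a simplified imitation of what col -bx does for these two cases, not a full replacement for col, and the function name strip_overstrikes is made up for this example.

```python
import re


def strip_overstrikes(text):
    """Drop each character that is immediately followed by a backspace,
    keeping only the last character written at each position, and expand
    tabs to spaces (roughly what col -b and col -x do, respectively)."""
    return re.sub(r".\x08", "", text).expandtabs(8)


if __name__ == "__main__":
    bold = "N\x08NA\x08AM\x08ME\x08E"    # "NAME" in overstruck bold
    underlined = "_\x08f_\x08o_\x08o"    # "foo" underlined
    print(strip_overstrikes(bold))
    print(strip_overstrikes(underlined))
```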
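Here is the m script, shown by cat [1]:
cat ~/bin/m
cat displays:
mkdir -p ~/man
for i
do
    man $i | col -bx > ~/man/$i.m
done
[1] Check out "useless use of cat" at the cat link above.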
(A less-known fact is that "for i" is shorthand for 'for i in "$@"', i.e. it iterates over all the command-line arguments to the script. This behaves the same as "for i in $*" as long as no argument contains whitespace. Not to be confused with "for i in *" which will iterate over all the filenames in the current directory, because the * expands to that.)
m uses the convention of putting all the text files that it creates (one per command-line argument), into a directory called man, under your home directory, i.e. ~/man. If the directory does not exist, it will be created.
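For readers more at home in Python than in shell, the same flow might be sketched as below. This is an illustrative translation under assumptions, not a replacement for the script: the function names are made up, the render parameter exists only so the file-handling part can be tried without man installed, and render_with_man assumes that the man and col commands are available.

```python
import os
import sys
from subprocess import Popen, PIPE


def render_with_man(topic):
    """Return the text of 'man topic | col -bx' (needs man and col installed)."""
    man = Popen(["man", topic], stdout=PIPE)
    col = Popen(["col", "-bx"], stdin=man.stdout, stdout=PIPE,
                universal_newlines=True)
    man.stdout.close()  # so that man sees a closed pipe if col exits early
    text = col.communicate()[0]
    man.wait()
    return text


def save_page(topic, man_dir, render=render_with_man):
    """Write the rendered page for topic to man_dir/topic.m; return the path."""
    os.makedirs(man_dir, exist_ok=True)  # like mkdir -p
    path = os.path.join(man_dir, topic + ".m")
    with open(path, "w") as out:
        out.write(render(topic))
    return path


if __name__ == "__main__":
    man_dir = os.path.expanduser("~/man")
    # sys.argv[1:] plays the role of the shell's bare "for i".
    for topic in sys.argv[1:]:
        print(save_page(topic, man_dir))
```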
You have to save the above script as a file called m, in a directory that is in your Unix PATH, and make it executable. A directory called ~/bin, your local bin directory, is a good choice.
After running m on fopen, for example, I can then open the resulting ~/man/fopen.m with the view command (vi in read-only mode).
Enjoy.
P.S. If you are new to vi and want to get up and running with it fast, check out my vi quickstart tutorial. I first wrote it for a couple of friends, Windows system administrator colleagues of mine, who had been given additional charge of a few Unix boxes, at their request, to help them to get up to speed with vi. They later said it helped with that.
P.P.S. If you like short words and commands like m, check out the Japanese word mu for some interesting points.
A few excerpts:
[
The Japanese and Korean term mu (Japanese: 無; Korean: 무) or Chinese wú (traditional Chinese: 無; simplified Chinese: 无) meaning "not have; without" is a key word in Buddhism, especially Zen traditions.
...
Some English translation equivalents of wú or mu 無 are:
"no", "not", "nothing", or "without"[2]
nothing, not, nothingness, un-, is not, has not, not any[3]
[1] Nonexistence; nonbeing; not having; a lack of, without. [2] A negative. [3] Caused to be nonexistent. [4] Impossible; lacking reason or cause. [5] Pure human awareness, prior to experience or knowledge. This meaning is used especially by the Chan school.
...
The character wu 無 originally meant "dance" and was later used as a graphic loan for wu "not". The earliest graphs for 無 pictured a person with outstretched arms holding something (possibly sleeves, tassels, ornaments) and represented the word wu "dance; dancer".
...
The Gateless Gate, which is a 13th-century collection of Chan or Zen kōans, uses the word wu or mu in its title (Wumenguan or Mumonkan 無門關) and first kōan case ("Joshu's Dog" 趙州狗子). Chinese Chan calls the word mu 無 "the gate to enlightenment".[9] The Japanese Rinzai school classifies the Mu Kōan as hosshin 発心 "resolve to attain enlightenment", that is, appropriate for beginners seeking kenshō "to see the Buddha-nature"'.[10]
...
In the original text, the question is used as a conventional beginning to a question-and-answer exchange (mondo). The reference is to the Mahāyāna Mahāparinirvāṇa Sūtra[14] which says for example:
In this light, the undisclosed store of the Tathagata is proclaimed: "All beings have the Buddha-Nature".[15]
...
In Robert M. Pirsig's 1974 novel Zen and the Art of Motorcycle Maintenance, mu is translated as "no thing", saying that it meant "unask the question". He offered the example of a computer circuit using the binary numeral system, in effect using mu to represent high impedance:
...
"Mu" may be used similarly to "N/A" or "not applicable," a term often used to indicate the question cannot be answered because the conditions of the question do not match the reality.
...
Because of this meaning, programming language Perl 6 uses "Mu" for the root of its type hierarchy.[23]
]
The image at the top of the post, is the character mu in seal script.
P.P.P.S. Really the last this time :) Another m-word:
Check out Muji - simplicity is deceptively complex.
Enjoy.
- Vasudev Ram - Online Python training and consulting Get updates (via Gumroad) on my forthcoming apps and content. Jump to posts: Python * DLang * xtopdf Subscribe to my blog by email My ActiveState Code recipesFollow me on: LinkedIn * Twitter Managed WordPress Hosting by FlyWheel
mu image attribution
I was using this Unix utility called m today, as I often do, when working on Linux. It's a shell script that lets you save the man pages for one or more Unix commands, system calls or other topics, to text files, after cleaning up the man command output to remove formatting meant for emphasis, printing, etc.
I had first written m a while ago, on Unix boxes that I used to work on earlier, as opposed to Linux, which I use these days.
At that time it was needed because, on some Unix versions, man page output was formatted for printers by text-processing tools such as nroff and troff. These tools would insert extra formatting characters in the output, for effects like bold and underscore. If you simply redirected the output to a file and opened it in a text editor, those characters made the text harder to read on screen. (Reading the page via the man command itself would work fine.)
Here is the m script, shown by cat [1]:

    cat ~/bin/m

cat displays:

    mkdir -p ~/man
    for i
    do
        man "$i" | col -bx > ~/man/"$i".m
    done

[1] Check out "useless use of cat" at the cat link above.
(A less-known fact is that "for i" is shorthand for 'for i in "$@"', i.e. it iterates over all the command-line arguments to the script. Not to be confused with "for i in *", which iterates over all the filenames in the current directory, because the * expands to that.)
m uses the convention of putting all the text files that it creates (one per command-line argument), into a directory called man, under your home directory, i.e. ~/man. If the directory does not exist, it will be created.
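For readers curious about the cleanup step, here is a minimal Python sketch of roughly what col -bx does to nroff-style output in the simple cases. Bold is typically encoded as a character, a backspace, and the same character again; underline as an underscore, a backspace, and the character. The function name strip_overstrike is made up for this illustration, and the real col handles more cases (reverse line feeds, for example) than this sketch does.

```python
import re

# Any single character followed by a backspace: the "overstrike"
# sequences that nroff-style output uses for bold and underline.
_OVERSTRIKE = re.compile(r".\x08")


def strip_overstrike(text, tab_size=8):
    """Keep only the last character written at each position and
    expand tabs to spaces, roughly like `col -bx` for simple input."""
    return _OVERSTRIKE.sub("", text).expandtabs(tab_size)


bold_name = "N\bNA\bAM\bME\bE"      # "NAME" in bold
underlined = "_\bf_\bo_\bo"         # "foo" underlined
print(strip_overstrike(bold_name))   # prints NAME
print(strip_overstrike(underlined))  # prints foo
```

A Python port of m could run man through subprocess and pass its output through a function like this before writing the .m file, but the shell pipeline is shorter.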
You have to save the above script as a file called m in a directory that is in your Unix PATH. Creating a directory called ~/bin is a good choice - your local bin directory:

    mkdir ~/bin
    cp m ~/bin    # Assumes you created m in your current directory.

and make it executable using chmod:

    chmod u+x ~/bin/m

Now if I run m as follows, to generate (as cleaned-up text) the man pages for, say, the fopen and fclose C stdio library functions,

    m fopen fclose

it creates the text files fopen.m and fclose.m in my ~/man directory.
I can then open fopen.m with the view command (vi in read-only mode):

    view ~/man/fopen.m

Here is a screenshot of the file opened in vi(ew):
Enjoy.
P.S. If you are new to vi and want to get up and running with it fast, check out my vi quickstart tutorial. I first wrote it for a couple of friends, Windows system administrator colleagues of mine, who had been given additional charge of a few Unix boxes, at their request, to help them to get up to speed with vi. They later said it helped with that.
P.P.S. If you like short words and commands like m, check out the Japanese word mu for some interesting points.
A few excerpts:
[
The Japanese and Korean term mu (Japanese: 無; Korean: 무) or Chinese wú (traditional Chinese: 無; simplified Chinese: 无) meaning "not have; without" is a key word in Buddhism, especially Zen traditions.
...
Some English translation equivalents of wú or mu 無 are:
"no", "not", "nothing", or "without"[2]
nothing, not, nothingness, un-, is not, has not, not any[3]
[1] Nonexistence; nonbeing; not having; a lack of, without. [2] A negative. [3] Caused to be nonexistent. [4] Impossible; lacking reason or cause. [5] Pure human awareness, prior to experience or knowledge. This meaning is used especially by the Chan school.
...
The character wu 無 originally meant "dance" and was later used as a graphic loan for wu "not". The earliest graphs for 無 pictured a person with outstretched arms holding something (possibly sleeves, tassels, ornaments) and represented the word wu "dance; dancer".
...
The Gateless Gate, which is a 13th-century collection of Chan or Zen kōans, uses the word wu or mu in its title (Wumenguan or Mumonkan 無門關) and first kōan case ("Joshu's Dog" 趙州狗子). Chinese Chan calls the word mu 無 "the gate to enlightenment".[9] The Japanese Rinzai school classifies the Mu Kōan as hosshin 発心 "resolve to attain enlightenment", that is, appropriate for beginners seeking kenshō "to see the Buddha-nature"'.[10]
...
In the original text, the question is used as a conventional beginning to a question-and-answer exchange (mondo). The reference is to the Mahāyāna Mahāparinirvāṇa Sūtra[14] which says for example:
In this light, the undisclosed store of the Tathagata is proclaimed: "All beings have the Buddha-Nature".[15]
...
In Robert M. Pirsig's 1974 novel Zen and the Art of Motorcycle Maintenance, mu is translated as "no thing", saying that it meant "unask the question". He offered the example of a computer circuit using the binary numeral system, in effect using mu to represent high impedance:
...
"Mu" may be used similarly to "N/A" or "not applicable," a term often used to indicate the question cannot be answered because the conditions of the question do not match the reality.
...
Because of this meaning, programming language Perl 6 uses "Mu" for the root of its type hierarchy.[23]
]
The image at the top of the post, is the character mu in seal script.
P.P.P.S. Really the last this time :) Another m-word:
Check out Muji - simplicity is deceptively complex.
Enjoy.
Labels:
bash,
Japanese-words,
Linux,
Linux-utilities,
man-pages,
mu-word,
muji,
shell-utilities,
UNIX,
Unix-utilities,
Zen
Sunday, January 8, 2017
An Unix seq-like utility in Python
By Vasudev Ram
Due to a chain (or sequence - pun intended :) of thoughts, I got the idea of writing a simple version of the Unix seq utility (command-line) in Python. (Some Unix versions have a similar command called jot.)
Note: I wrote this program just for fun. As the seq Wikipedia page says, modern versions of bash can do the work of seq. But this program may still be useful on Windows - not sure if the CMD shell has seq-like functionality or not. PowerShell probably has it, is my guess.
The seq command lets you specify one or two or three numbers as command-line arguments (some of which are optional): the start, stop and step values, and it outputs all numbers in that range and with that step between them (default step is 1). I have not tried to exactly emulate seq; instead, I've written my own version. One difference is that mine does not support the step argument (so the step can only be 1), at least in this version. That can be added later. Another is that I print the numbers with spaces in between them, not newlines. Another is that I don't support floating-point numbers in this version (again, can be added).
The seq command has more uses than the above description might suggest (in fact, it is mainly used for other things than just printing a sequence of numbers - after all, who would have a need to do that much). Here is one example, on Unix (from the Wikipedia article about seq):

    # Remove file1 through file17:
    for n in `seq 17`
    do
        rm file$n
    done

Note that those are backquotes or grave accents around seq 17 in the above code snippet. It uses sh / bash syntax, so requires one of them, or a compatible shell.
Here is the code for seq1.py:

    '''
    seq1.py
    Purpose: To act somewhat like the Unix seq command.
    Author: Vasudev Ram
    Copyright 2017 Vasudev Ram
    Web site: https://vasudevram.github.io
    Blog: https://jugad2.blogspot.com
    Product store: https://gumroad.com/vasudevram
    '''

    import sys

    def main():
        sa, lsa = sys.argv, len(sys.argv)
        if lsa < 2:
            sys.exit(1)
        try:
            start = 1
            if lsa == 2:
                end = int(sa[1])
            elif lsa == 3:
                start = int(sa[1])
                end = int(sa[2])
            else: # lsa > 3
                sys.exit(1)
        except ValueError as ve:
            sys.exit(1)
        for num in xrange(start, end + 1):
            print num,
        sys.exit(0)

    if __name__ == '__main__':
        main()

And here are a few runs of seq1.py, and the output of each run, below:

    $ py -2 seq1.py
    $ py -2 seq1.py 1
    1
    $ py -2 seq1.py 2
    1 2
    $ py -2 seq1.py 3
    1 2 3
    $ py -2 seq1.py 1 1
    1
    $ py -2 seq1.py 1 2
    1 2
    $ py -2 seq1.py 1 3
    1 2 3
    $ py -2 seq1.py 4
    1 2 3 4
    $ py -2 seq1.py 1 4
    1 2 3 4
    $ py -2 seq1.py 2 2
    2
    $ py -2 seq1.py 5 3
    $ py -2 seq1.py -6 -2
    -6 -5 -4 -3 -2
    $ py -2 seq1.py -4 -0
    -4 -3 -2 -1 0
    $ py -2 seq1.py -5 5
    -5 -4 -3 -2 -1 0 1 2 3 4 5
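The step argument mentioned earlier as a possible addition could be sketched as below, in Python 3. This is only one way to do it, not part of seq1.py: the function name seq_numbers is made up for this example, and the argument order for the three-argument form (first, increment, last) follows the Unix seq command. It still handles integers only.

```python
def seq_numbers(args):
    """Return the list of integers described by 1 to 3 string arguments:
    LAST, or FIRST LAST, or FIRST INCREMENT LAST (as in Unix seq)."""
    nums = [int(a) for a in args]  # raises ValueError on non-integers
    if len(nums) == 1:
        first, step, last = 1, 1, nums[0]
    elif len(nums) == 2:
        first, step, last = nums[0], 1, nums[1]
    elif len(nums) == 3:
        first, step, last = nums
    else:
        raise ValueError("expected 1 to 3 arguments")
    if step == 0:
        raise ValueError("increment must not be zero")
    # range() excludes its stop value, so move it one past LAST
    # in the direction of travel to make LAST inclusive.
    stop = last + 1 if step > 0 else last - 1
    return list(range(first, stop, step))


print(" ".join(str(n) for n in seq_numbers(["1", "2", "7"])))  # 1 3 5 7
```

Returning a list instead of printing directly keeps the logic easy to test; a command-line wrapper would only need to pass sys.argv[1:] to it and print the result.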
There are many other possible uses for seq, if one uses one's imagination, such as rapidly generating various filenames or directory names, with numbers in them (as a prefix, suffix or in the middle), for testing or other purposes, etc.
- Enjoy.
Thursday, March 31, 2016
Microsoft supporting Ubuntu apps running on Windows
By Vasudev Ram
WINDOWS <-> UBUNTU
Seen today on HN:
Ubuntu on Windows (dustinkirkland.com)
(It's the top post on HN at the time I'm writing this, and for a while before.)
Original post here: Ubuntu on Windows -- The Ubuntu Userspace for Windows Developers by Dustin Kirkland of Canonical, the maker of Ubuntu.
I commented a few times and asked a few questions too.
It's a pretty interesting thread, IMO, for those with interest in the Windows and Linux operating systems.
There are a lot of technical topics discussed and also some business ones, related to this move. Senior people from the Linux and Windows camps are participating.
E.g.:
[ > So do Cygwin and/or MSYS emulate the fork() system call
Yes. That's one thing we spent considerable engineering effort on in this first version of the Windows Subsystem for Linux: We implement fork in the Windows kernel, along with the other POSIX and Linux syscalls.
This allows us to build a very efficient fork() and expose it to the GNU/Ubuntu user-mode apps via the fork(syscall).
We'll be publishing more details on this very soon. ]
There was also discussion of the POSIX subsystem that was there on Windows for a few Windows versions (from NT). I had used it to run some of my Unix command-line utilities (that used mainly the stdio and stdlib C libraries [1]) on Windows, in the Windows NT and Windows 2000 days.
[1] Because the POSIX subsystem support on Windows was limited.
Here is another HN thread about it, at around the same time, though this one is off the front page now:
Microsoft and Canonical partner to bring Ubuntu to Windows 10 (zdnet.com)
- Vasudev Ram - Online Python training and programming Signup to hear about new products and services I create. Posts about Python Posts about xtopdf My ActiveState recipes
Thursday, January 7, 2016
Code for recent post about PDF from a Python pipeline
By Vasudev Ram
In this recent post:
Generate PDF from a Python-controlled Unix pipeline,
I forgot to include the code for the program PopenToPDF.py. Here it is now:

    # PopenToPDF.py
    # Demo program to read text from a shell pipeline using
    # subprocess.Popen, and write the text to PDF using xtopdf.
    # Author: Vasudev Ram
    # Copyright (C) 2016 Vasudev Ram - http://jugad2.blogspot.com

    import sys
    import subprocess
    from PDFWriter import PDFWriter

    def error_exit(message):
        sys.stderr.write(message + '\n')
        sys.stderr.write("Terminating.\n")
        sys.exit(1)

    def main():
        # Defined before the try block so that the finally clause
        # can tell whether the PDFWriter was actually created.
        pw = None
        try:
            # Create and set up a PDFWriter instance.
            pw = PDFWriter("PopenTo.pdf")
            pw.setFont("Courier", 12)
            pw.setHeader("Use subprocess.Popen to read pipe and write to PDF.")
            pw.setFooter("Done using selpg, xtopdf, Python and ReportLab, on Linux.")

            # Set up a pipeline with nl and selpg such that we can read from its stdout.
            # nl numbers the lines of the input.
            # selpg extracts pages 3 to 5 from the input.
            pipe = subprocess.Popen("nl -ba 1000-lines.txt | selpg -s3 -e5",
                shell=True, bufsize=-1, stdout=subprocess.PIPE,
                stderr=sys.stderr).stdout

            # Read from the pipeline and write the data to PDF, using the PDFWriter instance.
            for idx, line in enumerate(pipe):
                pw.writeLine(str(idx).zfill(8) + ": " + line)
        except IOError as ioe:
            error_exit("Caught IOError: {}".format(str(ioe)))
        except Exception as e:
            error_exit("Caught Exception: {}".format(str(e)))
        finally:
            if pw is not None:
                pw.close()

    main()

I ran it in the usual way with:

    $ python PopenToPDF.py

to get the output shown in the previous post describing PopenToPDF.
Also, this is the one-off script, gen-file.py, that created the 1000 line input file:

    with open("1000-lines.txt", "w") as fil:
        for i in range(1000):
            fil.write("This is a line of text.\n")
- Vasudev
Labels:
Linux,
PDF-creation,
PDF-generation,
PopenToPDF,
python,
Python-pipes,
UNIX,
utilities,
xtopdf
Generate PDF from a Python-controlled Unix pipeline
By Vasudev Ram
This post is about a new xtopdf app I wrote, called PopenToPDF.py.
(xtopdf is my PDF generation toolkit, written in Python. The toolkit consists of a core library and multiple applications built using it.)
This program, PopenToPDF, shows how to use xtopdf to generate PDF output from any Python-controlled Unix pipeline. It uses the subprocess Python module.
I had written a few posts earlier about the uses of StdinToPDF.py, another xtopdf app [1].
(There are many kinds of pipeline; it is a powerful concept.)
StdinToPDF is an application of xtopdf that can be used at the end of a Unix or Windows pipeline, to publish the text output of the pipeline to PDF.
[1] Here are some of those posts about StdinToPDF:
a) PDFWriter can create PDF from standard input
b) Print selected text pages to PDF with Python, selpg and xtopdf on Linux
c) Generate Windows Task List to PDF with xtopdf
PopenToPDF has the same general goal as StdinToPDF (to allow creation of a pipeline whose final output is PDF), but works somewhat differently.
Instead of just being used passively (like StdinToPDF) as the last component in a pipeline run from the command line, PopenToPDF is a Python program that itself sets up and runs a pipeline (of all the preceding commands, excepting itself), using subprocess.Popen, and then reads the output of that pipeline, programmatically, and converts the text it reads to PDF. So it is a different approach that may allow for other possibilities for customization.
For the example, I created an input text file of 1000 lines, via a small one-off script. The file is called 1000-lines.txt.
The pipeline (created by PopenToPDF) runs "nl -ba" to add sequential line numbers to each line of the input file. (nl is a Unix command to number lines.) Then the output is passed to my selpg utility (a command-line utility in C), which is a filter that reads its input and selects only a specified range of pages to pass on to the output. (Full details of the selpg utility, including explanation of its logic, source code, and the build steps, are at the URL in the previous sentence, or at links accessible from that URL.)
(This page on sites.harvard.edu is a good resource for Linux command line utility development, and also references my IBM dW article about selpg.)
PopenToPDF sets up the above pipeline (nl -ba piped to selpg), and then reads all the lines from it, adds its own line numbers to the input, and writes it all to a PDF file.
Thus we end up with two sets of line numbers prefixed to each line (in the PDF): the original line numbers added by the nl command, which represent the position of each extracted line in the original file, and the serial numbers (starting from 0) for the subset of lines that PopenToPDF sees.
I did this so that we could verify that the pipeline is extracting the right lines that we specified, by looking at the relative and absolute line numbers in the output (screenshots below).
Here is a screenshot of the first page of the PDF output:
And here is a screenshot of the last page, page 4, of the PDF output:
You can see that the last relative line number (added by PopenToPDF, in the extreme left number column) is 215, and the first was 0 (on the first page), so the number of lines extracted by selpg is 216. That corresponds to what we asked selpg for by specifying a start page of 3 (-s3) and an end page of 5 (-e5), since there are 72 lines per page (the default) and 72 * (5 - 3 + 1) = 72 * 3 = 216. You can do a similar calculation for the absolute line numbers shown, to verify that we have extracted not only the right number of pages, but also the right pages.
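The same calculation for the absolute line numbers can be written as a short Python sketch. The function name selected_lines is made up for this illustration; it assumes fixed-length pages of 72 lines, the selpg default mentioned above, and line numbers starting at 1, as nl produces.

```python
def selected_lines(start_page, end_page, lines_per_page=72):
    """Return (first_line, last_line, line_count) for a page range,
    with pages and lines both numbered from 1."""
    first = (start_page - 1) * lines_per_page + 1
    last = end_page * lines_per_page
    return first, last, last - first + 1


first, last, count = selected_lines(3, 5)
print(first, last, count)  # 145 360 216
```

So the nl column should run from 145 on the first page of the PDF to 360 on the last, alongside the relative numbers 0 to 215.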
So this approach (using Popen) can be used to run a pipeline under control of a Python program, read the output of the pipeline, and do some further processing on it. Obviously, it is a generic approach, not limited to producing PDF. It could be used for any other purpose where you want to run a pipeline under program control and do something with the output of the pipeline, in your own Python code.
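As a rough illustration of that generic approach, here is a Python 3 sketch that chains commands with subprocess.Popen without going through a shell, then post-processes the output. It is a sketch under assumptions, not the PopenToPDF code: the names run_pipeline and number_lines are invented for this example, and the two demo commands are small Python one-liners standing in for nl and selpg so that the example runs anywhere Python does.

```python
import subprocess
import sys


def run_pipeline(commands):
    """Run the commands (each an argv list) as a pipeline, like
    `cmd1 | cmd2 | ...`, and return the final output as text lines."""
    if not commands:
        raise ValueError("need at least one command")
    procs = []
    prev_stdout = None
    for argv in commands:
        proc = subprocess.Popen(argv, stdin=prev_stdout,
                                stdout=subprocess.PIPE)
        if prev_stdout is not None:
            # The new child holds its own copy of the pipe's read end,
            # so the parent's copy can be closed.
            prev_stdout.close()
        prev_stdout = proc.stdout
        procs.append(proc)
    output, _ = procs[-1].communicate()
    for proc in procs[:-1]:
        proc.wait()
    return output.decode().splitlines()


def number_lines(lines):
    """Prefix each line with a zero-padded serial number starting at 0."""
    return ["{:08d}: {}".format(idx, line) for idx, line in enumerate(lines)]


# Hypothetical stand-ins for "nl -ba 1000-lines.txt" and "selpg -s3 -e5".
PRODUCER = [sys.executable, "-c", "print('alpha'); print('beta')"]
UPPERCASE = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]

for line in number_lines(run_pipeline([PRODUCER, UPPERCASE])):
    print(line)
```

Passing argv lists instead of one shell string avoids shell quoting issues, at the cost of wiring the pipes by hand. The numbered lines could then go to a PDF writer, a file, or any other destination.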
I'll end with a few points about related topics:
This program is actually an example of a category of data processing operations commonly used in organizations, which can be broadly described as starting with some data source, and passing it through a series of transformations until we have the final output we want.
Often, but not always, the input for these transformations is downloaded from some database or application (of the organization), and/or the output is uploaded to another database or application (also of the organization).
In some of these cases, the process is called ETL, meaning Extract, Transform, Load. This operation is also related to IT system integration.
In general, these tasks can consist of a combination of the use of existing components (programs) and purpose-written code in a compiled or interpreted language. The operation can also consist of a combination of manual and automated steps.
When there is enough uniformity in the data and needed processing rules across runs, using more automation leads to more time and cost savings. Some amount of variation in the data or rules can be handled by parameterization of input and output filenames, database connections, table names, use of conditional logic, etc.
Finally, in the process of writing this program and post (across a couple of sessions), I came across mentions of microservices in tech forums. Microservices have been in the news for a while. So I looked up definitions of microservices and realized that they are in some ways similar to Unix pipelines and the Unix philosophy of creating small tools that do one thing well, and then combining them to achieve bigger tasks.
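The idea of small single-purpose steps chained into a larger transformation can also be shown within one Python program, using generators. This is only a toy sketch: the stage names extract, transform and load are borrowed from the ETL term above, and what each stage does here (strip newlines, drop blanks and uppercase, append to a list) is made up for the illustration.

```python
def extract(lines):
    """Source stage: yield each input line without its trailing newline."""
    for line in lines:
        yield line.rstrip("\n")


def transform(records):
    """Middle stage: drop empty records and uppercase the rest."""
    for record in records:
        if record:
            yield record.upper()


def load(records, sink):
    """Final stage: append each record to sink; return how many were loaded."""
    count = 0
    for record in records:
        sink.append(record)
        count += 1
    return count


destination = []
loaded = load(transform(extract(["first\n", "\n", "second\n"])), destination)
print(loaded, destination)  # 2 ['FIRST', 'SECOND']
```

As with a Unix pipeline, each stage pulls records from the previous one as needed, so no stage has to hold the whole data set in memory.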
If you're interested in pipes and Python and their intersection, also check out this HN comment by me, which lists multiple other Python pipe-like tools, including one (pipe_controller) by me:
Yes, pyp is interesting. So are some other roughly similar Python tools
- Enjoy.
Labels:
command-line,
Linux,
pdf,
PDF-creation,
PDF-generation,
pipes,
programming,
python,
Python-pipes,
UNIX,
utilities
Wednesday, December 23, 2015
Generate Windows Task List to PDF with xtopdf
By Vasudev Ram
While working at the DOS command line in Windows, I had the idea of using the DOS TASKLIST command along with xtopdf, my PDF generation toolkit, to generate a list of currently running Windows tasks to a PDF file, along with some other info, such as whether a task is a service or a console process, the process id, the memory usage, etc. The TASKLIST command shows all that information, by default.
I also sorted the output in ascending order by the Mem Usage field, by passing it through the DOS SORT command. (I could have sorted it by any other field such as the Image Name or the PID, of course.) I starred out some of the fields in the output.
Here are the steps to generate a Windows task list as a PDF, using xtopdf:
( I use $ as the prompt, even in DOS :)
1: Run TASKLIST and redirect its output to a text file.
$ tasklist > tasklist.out
2: Sort the file into another file.
$ sort /+65 tasklist.out > tasklist.srt
(Sort the output of TASKLIST by the character position of the Mem Usage field.)
3: Edit tasklist.srt to put the header lines back at the top :)
[ They get dislodged by the sort. ]
[ This is not Unix, so you can't easily do the fast, fluid command-line data munging that you can on Unix, unless you use something like Cygwin or UWin.
UWin was developed by David Korn, creator of the Korn Shell, for Windows. You can get UWin from the AT&T site here (after doing a convoluted license agreement dance, last time I checked). But IMO, the dance is not too long, and is worth it, to get a suite of Unix tools that work well on Windows, and UWin is also smaller & lighter than Cygwin, though not so comprehensive.
Be sure to read the section "Korn shell and Microsoft" at the David Korn link above :-) ]
4: Pipe the sorted task list to StdinToPDF, to generate the PDF output.
$ type tasklist.srt | python StdinToPDF.py tasklist.pdf
We pipe the sorted output of TASKLIST to StdinToPDF.py (an xtopdf app), which can be used at the end of any command pipeline that generates text (on Unix / Windows / Linux / Mac OS X), to convert that text to PDF.
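Steps 2 and 3 above (sort, then put the header lines back by hand) could be combined in a few lines of Python. The sketch below rests on assumptions: the function name sort_tasklist is made up, the default of 3 header lines (a blank line, the column titles and the ===== separator) is a guess at what TASKLIST prints, and the comparison is a plain string comparison from column 65 onward, which only approximates what the DOS SORT /+65 command does.

```python
def sort_tasklist(lines, header_lines=3, sort_column=65):
    """Sort the body of TASKLIST-style output by the text starting at
    sort_column (1-based), leaving the first header_lines untouched."""
    header = lines[:header_lines]
    body = lines[header_lines:]
    body_sorted = sorted(body, key=lambda line: line[sort_column - 1:])
    return header + body_sorted


sample = [
    "",
    "Image Name".ljust(64) + "   Mem Usage",
    "=" * 76,
    "big.exe".ljust(64) + "    20,000 K",
    "small.exe".ljust(64) + "     1,500 K",
]
for line in sort_tasklist(sample):
    print(line)
```

Because the Mem Usage values are right-aligned, a string comparison orders them from smaller to larger when the field widths are equal, which is the same property the SORT-based approach relies on.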
A screenshot of the PDF output I got (viewed in Foxit PDF Reader), is shown at the top of this post.
- Enjoy.
Monday, December 21, 2015
Microsoft to acquire Linux ...
... skills. :-)
So says:
The Linux Foundation (https://mobile.twitter.com/linuxfoundation/status/678665931434815490)
- Vasudev Ram
jugad2.blogspot.com
Sunday, November 1, 2015
data_dump, a Python tool like Unix od (octal dump)
By Vasudev Ram
In a future post, I'll make some improvements, and also show and discuss some interesting and possibly anomalous results that I got when testing data_dump.py with different inputs.
Happy dumping! :)
Details of the above image are available here:
Truck image credits
The Unix od command, which stands for octal dump, should be known to regular Unix users. Though the name includes the word octal (for historical reasons) [1], it supports other numeric systems as well; see below.
[1] See:
The Wikipedia page for od, which says that "od is one of the earliest Unix programs, having appeared in version 1 AT&T Unix."
od is a handy tool. It dumps the contents of a file (or standard input) to standard output, in "unambiguous" ways, such as the ability to show the file contents as numeric values (ASCII codes), interpreted as bytes / two-byte words / etc. It can do this in octal, decimal, binary or hexadecimal format. It can also show the content as characters. But the Unix cat command does that already, so the od command is more often used to show characters along with their numeric codes. It also shows the byte offset (from the start of the file) of every, say, 10th character in the file, in the left column of its output, so the user can keep track of where any content occurs in the file.
All this is useful because it allows Unix users (programmers and system administrators as well as end users) to inspect the contents of files in different ways (hex, binary, character, etc.). The files thus inspected could be text files or binary files of any kind. Often, programmers use the output of od to debug their application, by viewing a file that their program is either reading from or writing to, to verify that it contains what they expect, or to find that it contains something that they do not expect - which could be due either to invalid input or to a bug in their program causing incorrect output.
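To make the kind of output being described concrete, here is a small Python 3 sketch that formats bytes the way such a dump might look: a byte offset, the hex codes, and the printable characters. It is not the od format or the data_dump.py format; the function name dump_lines and the layout (decimal offset, hex, then characters between | marks) are choices made for this illustration.

```python
def dump_lines(data, width=16):
    """Return one formatted line per `width` bytes of data:
    decimal offset, hex codes, then characters ('.' if non-printable)."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join("{:02X}".format(b) for b in chunk)
        char_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append("{:08d}  {}  |{}|".format(offset, hex_part, char_part))
    return lines


for line in dump_lines(b"Hi\n"):
    print(line)  # 00000000  48 69 0A  |Hi.|
```

Seeing the newline as 0A next to a "." placeholder is the sort of detail that a plain cat of the file would not show.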
I needed to use od recently. Doing so made me think of writing a simple version of it in Python, for fun and practice. So I did it. I named it data_dump.py. Here is the code for it:
    '''
    Program name: data_dump.py
    Author: Vasudev Ram.
    Copyright 2015 Vasudev Ram.

    Purpose: To dump the contents of a specified file or standard input,
    to the standard output, in one or more formats, such as:
        - as characters
        - as decimal numbers
        - as hexadecimal numbers
        - as octal numbers

    Inspired by the od (octal dump) command of Unix, and intended to work,
    very roughly, like it. Will not attempt to replicate od exactly or even
    closely. May diverge from od's way of doing things, as desired.
    '''

    # Imports:

    from __future__ import print_function
    import sys

    # Global constants:

    # Maximum number of characters (from the input) to output per line.
    MAX_CHARS_PER_LINE = 16

    # Global variables:

    # Functions:

    def data_dump(infil, line_len=MAX_CHARS_PER_LINE, options=None):
        '''
        Dumps the data from the input source infil to the standard output.
        '''
        byte_addr = 0
        buf = infil.read(line_len)
        # While not EOF.
        while buf != '':
            # Print the offset of the first character to be output on this line.
            # The offset refers to the offset of that character in the input,
            # not in the output. The offset is 0-based.
            sys.stdout.write("{:>08s}: ".format(str(byte_addr)))

            # Print buf in character form, with . for control characters.
            # TODO: Change to use \n for line feed, \t for tab, etc., for
            # those control characters which have unambiguous C escape
            # sequences.
            byte_addr += len(buf)
            for c in buf:
                sys.stdout.write('  ') # Left padding before c as char.
                # Compare the character code, not the character itself, with 127 (DEL).
                if (0 <= ord(c) <= 31) or (ord(c) == 127):
                    sys.stdout.write('.')
                else:
                    sys.stdout.write(c)
            sys.stdout.write('\n')

            # Now print buf in hex form.
            sys.stdout.write(' ' * 10) # Padding to match that of byte_addr above.
            for c in buf:
                sys.stdout.write(' ') # Left padding before c in hex.
                sys.stdout.write('{:>02s}'.format((hex(ord(c))[2:].upper())))
            sys.stdout.write('\n')
            buf = infil.read(line_len)
        infil.close()

    def main():
        '''
        Checks the arguments, sets option flags, sets input source.
        Then calls data_dump() function with the input source and options.
        '''
        try:
            lsa = len(sys.argv)
            if lsa == 1:
                # Input from standard input.
                infil = sys.stdin
            elif lsa == 2:
                # Input from a file.
                infil = open(sys.argv[1], "rb")
            else:
                # Too many arguments: infil would otherwise be undefined.
                sys.stderr.write("Usage: data_dump.py [file]\n")
                sys.exit(1)
            data_dump(infil)
            sys.exit(0)
        except IOError as ioe:
            print("Error: IOError: " + str(ioe))
            sys.exit(1)

    if __name__ == '__main__':
        main()

And here is the output of a sample run, on a small text file:
    $ data_dump.py t3
    00000000:   T  h  e     q  u  i  c  k     b  r  o  w  n
               54 68 65 20 71 75 69 63 6B 20 62 72 6F 77 6E 20
    00000016:   f  o  x     j  u  m  p  e  d     o  v  e  r
               66 6F 78 20 6A 75 6D 70 65 64 20 6F 76 65 72 20
    00000032:   t  h  e     l  a  z  y     d  o  g  .  .  .  T
               74 68 65 20 6C 61 7A 79 20 64 6F 67 2E 0D 0A 54
    00000048:   h  e     q  u  i  c  k     b  r  o  w  n     f
               68 65 20 71 75 69 63 6B 20 62 72 6F 77 6E 20 66
    00000064:   o  x     j  u  m  p  e  d     o  v  e  r     t
               6F 78 20 6A 75 6D 70 65 64 20 6F 76 65 72 20 74
    00000080:   h  e     l  a  z  y     d  o  g  .  .  .  T  h
               68 65 20 6C 61 7A 79 20 64 6F 67 2E 0D 0A 54 68
    00000096:   e     q  u  i  c  k     b  r  o  w  n     f  o
               65 20 71 75 69 63 6B 20 62 72 6F 77 6E 20 66 6F
    00000112:   x     j  u  m  p  e  d     o  v  e  r     t  h
               78 20 6A 75 6D 70 65 64 20 6F 76 65 72 20 74 68
    00000128:   e     l  a  z  y     d  o  g  .
               65 20 6C 61 7A 79 20 64 6F 67 2E
    $

Note that I currently replace control / non-printable characters by a dot, in the output. Another option could be to replace (at least some of) them with C escape sequences, such as \r (carriage return, ASCII 13), \n (line feed, ASCII 10), etc. That is the way the original od does it.
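The escape-sequence alternative mentioned above could look roughly like the sketch below. It is not part of data_dump.py: the names C_ESCAPES, show_byte and show_bytes are made up for illustration, and the choice to show every byte value of 127 or above as a dot is an assumption of this sketch (data_dump.py itself only replaces 0 to 31 and 127).

```python
# Hypothetical helpers, not part of data_dump.py: render byte values with
# C escape sequences where an unambiguous one exists, and a dot otherwise.
C_ESCAPES = {
    0: "\\0", 7: "\\a", 8: "\\b", 9: "\\t",
    10: "\\n", 11: "\\v", 12: "\\f", 13: "\\r",
}

def show_byte(value):
    """Return a printable form of one byte value (0-255)."""
    if value in C_ESCAPES:
        return C_ESCAPES[value]
    # Assumption of this sketch: other control and non-ASCII bytes become a dot.
    if value < 32 or value >= 127:
        return "."
    return chr(value)

def show_bytes(data):
    """Return a list with the printable form of each byte in data."""
    # bytearray() yields integers on both Python 2 and Python 3.
    return [show_byte(b) for b in bytearray(data)]

# Demo on the bytes around the first line break of the sample file.
print(" ".join(show_bytes(b"dog.\r\nThe")))
```

Because some entries are now two characters wide, a dump that uses this would need a wider column per byte to keep the character and hex rows aligned.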
In a future post, I'll make some improvements, and also show and discuss some interesting and possibly anomalous results that I got when testing data_dump.py with different inputs.
Happy dumping! :)
Details of the above image are available here:
Truck image credits
- Vasudev Ram - Online Python training and programming Signup to hear about new products and services I create. Posts about Python Posts about xtopdf My ActiveState recipes
Labels:
command-line,
command-line-utilities,
data_dump,
Linux,
octal-dump,
python,
Python-utilities,
UNIX,
utilities,
windows
Sunday, July 5, 2015
Nine dollar Linux computer
CHIP – The World’s First $9 Computer | Ardevon
The game's afoot, Watson!
- Vasudev Ram
jugad2.blogspot.com
dancingbison.com (site down a while due to changing host taking time)
Friday, May 8, 2015
tabtospaces, utility to change tabs to spaces in Python files
By Vasudev Ram
Near the end of a recent blog post:
asciiflow.com: Draw flowcharts online, in ASCII
, I showed how this small snippet of Python code can be used to make a Python program usable as a component in a Unix pipeline:
    for lin in sys.stdin:
        sys.stdout.write(process(lin))
Today I saw Raymond Hettinger (@raymondh)'s tweet about the -t and -tt command line options of Python:
    #python tip: In Python 2, the -tt option raises an error when you foolishly mix spaces and tabs. In Python 3, that is always an error.

That made me think of writing a simple Python 2 tool to change the tabs in a Python file to spaces. Yes, I know it can be easily done in Unix or Windix [1] with any of sed / awk / tr etc. That's not the point. So here is tabtospaces.py:
    import sys
    for lin in sys.stdin:
        sys.stdout.write(lin.replace("\t", "    "))

[ Note: this code converts each tab into 4 spaces. It can be parameterized by passing a command-line option that specifies the number of spaces, such as 4 or 8, and then replacing each tab with that many spaces. Also note that I have not tested the program on many sets of data, just one for now. ]
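The parameterization described in the note could be sketched as follows. The function names tabs_to_spaces and convert, and the default width of 4, are assumptions made for illustration; reading the width from a command-line option is left out.

```python
def tabs_to_spaces(line, width=4):
    """Replace every tab in line with `width` spaces (hypothetical helper)."""
    return line.replace("\t", " " * width)

def convert(lines, width=4):
    """Apply tabs_to_spaces to each line of an iterable of lines."""
    return [tabs_to_spaces(line, width) for line in lines]

# Demo: convert a tab-indented line using 8 spaces per tab.
sample = ["for arg in args:\n", "\tprint(arg)\n"]
print("".join(convert(sample, width=8)), end="")
```

In a pipeline, convert(sys.stdin, width) could feed sys.stdout.writelines(). Python's built-in str.expandtabs(tabsize) is a related option, but it pads to the next tab stop rather than substituting a fixed number of spaces, so the two only agree for tabs at the start of a line.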
I created a simple Python file, test1.py, that has mixed tabs and spaces to use as input to tabtospaces.py. Then I ran the following commands:
    $ py -tt test1.py
      File "test1.py", line 4
        print arg,
                 ^
    TabError: inconsistent use of tabs and spaces in indentation
    $ py tabtospaces.py < test1.py > test2.py
    $ py -tt test2.py
    0 1 2 3 4 5 6 7 8 9

which shows that tabtospaces.py does convert the tabs to spaces.
And you can see from this diff that the original test1.py and the test2.py generated by tabtospaces.py, differ only in the use of tabs vs. spaces:
    $ fc /l test1.py test2.py
    Comparing files test1.py and TEST2.PY
    ***** test1.py
    for arg in args:
        print arg,
    ***** TEST2.PY
    for arg in args:
        print arg,
    *****
[1] Windix is the latest upcoming Unix-compatible OS from M$, due Real Soon Now. You heard it here first - TM.
- Vasudev Ram - Online Python training and programming. Dancing Bison Enterprises. Signup to hear about new software products or info-products that I create. Posts about Python. Posts about xtopdf. Contact Page.
Sunday, April 19, 2015
asciiflow.com: Draw flowcharts online, in ASCII
By Vasudev Ram
Saw this today: asciiflow.com
asciiflow.com is a site that allows you to draw flowcharts online, on their site, using the metaphor of a drag-and-drop paint program like MS Paint, but the flowcharts are drawn entirely using ASCII characters.
I tried it out a bit. Innovative.
One point is that to save the flowchart, it requires access to your Google Drive account.
The image at the top of this page, is of a flowchart that I created with asciiflow.com. I did not use the Save feature, but instead took a screenshot and saved it as a PNG file (using MS Paint, ha ha). The flowchart shows a diagram that illustrates the concept of a UNIX command pipeline, where the standard output of a preceding program becomes the standard input of a succeeding one (in the pipeline). (How's that for using web-based and Windows software to illustrate something about UNIX? :)
For another example of the innovative use of ASCII characters, check out this post I wrote somewhat recently, about the Python library called PrettyTable, which lets you generate visually appealing tables of data, bordered and boxed by ASCII characters:
PrettyTable to PDF is pretty easy with xtopdf
Also, since we're talking about standard input and output and UNIX pipelines, these two posts may be of interest:
1) [xtopdf] PDFWriter can create PDF from standard input
(The post at the above link also has an example of eating your own dog food.)
2) Print selected text pages to PDF with Python, selpg and xtopdf on Linux
Generalizing from a fragment of code in post 1) above, I'll also note that making a Python program usable as a component of a UNIX pipeline, can, in some cases, be as simple as having something like this in your code:
    import sys
    # ...
    for lin in sys.stdin:
        lin = process(lin)
        sys.stdout.write(lin)

which could be shortened to:

    for lin in sys.stdin:
        sys.stdout.write(process(lin))

Due to this (being able to easily make a Python program into a component of a UNIX pipeline), you can do things like this (and more):
$ foo | bar | baz
where foo may be a built-in UNIX command (a filter) or a shell script, bar may be (for example) a Perl program that leverages some powerful Perl features, and baz may be a Python program that leverages some powerful Python features. This applies the UNIX philosophy of writing small programs, each of which does one thing well; in this case, it also lets you write each component in the language best suited to it, since each language may do some things better than others. The possibilities are limitless ...
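To make the filter pattern concrete, here is a minimal sketch of what a component like baz might contain. The names run_filter and to_upper are hypothetical, and the upper-casing is just a stand-in for whatever process() does in a real program. In-memory streams are used so the example runs without a pipeline.

```python
import io

def run_filter(process, infile, outfile):
    """Read lines from infile, transform each with process, write to outfile."""
    for lin in infile:
        outfile.write(process(lin))

def to_upper(lin):
    """Example transformation: upper-case the line."""
    return lin.upper()

# In a real pipeline this would be: run_filter(to_upper, sys.stdin, sys.stdout)
source = io.StringIO(u"foo\nbar\n")
sink = io.StringIO()
run_filter(to_upper, source, sink)
print(sink.getvalue(), end="")
```

Passing the streams in as arguments, rather than hard-coding sys.stdin and sys.stdout, keeps the same function usable both in a pipeline and from other Python code.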
- Enjoy.
- Vasudev Ram - Online Python and Linux training;
freelance Python programming. Dancing Bison Enterprises. Signup to hear about new software products that I create. Posts about Python. Posts about xtopdf. Contact Page.
Labels:
ASCII,
ASCII-flowcharts,
asciiflow.com,
command-line,
command-line-utilities,
filters,
flowcharts,
innovation,
Linux,
python,
UNIX,
utilities
Friday, April 17, 2015
Linux skills in high demand in 2015, says Linux Foundation newsletter
By Vasudev Ram
Just saw this news via the newsletter that I get from the Linux Foundation:
Linux skills are going to be in high demand in 2015, according to a survey carried out by Dice.com and the Linux Foundation. This is the 4th year in a row that the survey has been done.
Excerpts from the report:
[
“Competition for Linux talent is accelerating, as the software becomes more ubiquitous,” said Shravan Goli, President of Dice.
“Demand for Linux talent continues apace, and it’s becoming more important for employers to be able to verify candidates have the skillsets they need,” said Jim Zemlin, executive director at The Linux Foundation.
]
- Vasudev Ram - Online Python and Linux training and programming. Dancing Bison Enterprises. Signup to hear about new software or info products that I create. Posts about Python. Posts about xtopdf. Contact Page.
Labels:
Dice.com,
Linux,
Linux-Foundation,
Linux-jobs,
Linux-training
Friday, March 20, 2015
A simple UNIX-like "which" command in Python
By Vasudev Ram
UNIX users are familiar with the which command. Given an argument called name, it checks the system PATH environment variable, to see whether that name exists (as a file) in any of the directories specified in the PATH. (The directories in the PATH are colon-separated on UNIX and semicolon-separated on Windows.)
I'd written a Windows-specific version of the which command some time ago, in C.
Today I decided to write a simple version of the which command in Python. In the spirit of YAGNI and incremental development, I tried to resist the temptation to add more features too early; but I did give in once and add the exit code stuff near the end :)
Here is the code for which.py:
    from __future__ import print_function

    # which.py
    # A minimal version of the UNIX which utility, in Python.
    # Author: Vasudev Ram - www.dancingbison.com
    # Copyright 2015 Vasudev Ram - http://www.dancingbison.com

    import sys
    import os
    import os.path
    import stat

    def usage():
        sys.stderr.write("Usage: python which.py name\n")
        sys.stderr.write("or: which.py name\n")

    def which(name):
        found = 0
        for path in os.getenv("PATH").split(os.path.pathsep):
            full_path = path + os.sep + name
            if os.path.exists(full_path):
                """
                if os.stat(full_path).st_mode & stat.S_IXUSR:
                    found = 1
                    print(full_path)
                """
                found = 1
                print(full_path)
        # Return a UNIX-style exit code so it can be checked by calling scripts.
        # Programming shortcut to toggle the value of found: 1 => 0, 0 => 1.
        sys.exit(1 - found)

    def main():
        if len(sys.argv) != 2:
            usage()
            sys.exit(1)
        which(sys.argv[1])

    if "__main__" == __name__:
        main()

And here are a few examples of using the command:
(Note: the tests are done on Windows, though the command prompt is a $ sign (UNIX default); I just set it to that because I like $'s and UNIX :)
$ which vim
\vim
$ which vim.exe
C:\Ch\bin\vim.exe
$ set PATH | grep -i vim73
$ addpath c:\vim\vim73
$ which.py vim.exe
C:\Ch\bin\vim.exe
c:\vim\vim73\vim.exe
$ which metapad.exe
C:\util\metapad.exe
$ which pscp.exe
C:\util\pscp.exe
C:\Ch\bin\pscp.exe
$ which dostounix.exe
C:\util\dostounix.exe
$ which pythonw.exe
C:\Python278\pythonw.exe
D:\Anaconda-2.1.0-64\pythonw.exe
# Which which is which? All four combinations:
$ which which
.\which
$ which.py which
.\which
$ which which.py
.\which.py
$ which.py which.py
.\which.py
As you can see, calling the which Python command with different arguments, gives various results, including sometimes finding one instance of vim.exe and sometimes two instances, depending on the values in the PATH variable (which I changed, using my addpath.bat script, to add the \vim\vim73 directory to it).
Also, it works when invoked either as which.py or just which.
I'll discuss my interpretation of these variations in an upcoming post - including a variation that uses os.stat(full_path).st_mode - see the commented part of the code under the line:
if os.path.exists(full_path):
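As a rough sketch of that commented-out variation, the function below returns the matches instead of printing them and can optionally require the owner-execute bit. The names find_in_path and require_exec are hypothetical, not taken from which.py, and the search path is passed in as a string so the function is easy to try on its own.

```python
import os
import stat

def find_in_path(name, path_value, require_exec=True):
    """Return every path_value directory entry that holds a file called name.

    If require_exec is true, keep only files whose owner-execute bit is set.
    """
    matches = []
    for directory in path_value.split(os.pathsep):
        if not directory:
            continue  # Skip empty entries such as a trailing separator.
        full_path = os.path.join(directory, name)
        if not os.path.isfile(full_path):
            continue
        if require_exec and not (os.stat(full_path).st_mode & stat.S_IXUSR):
            continue
        matches.append(full_path)
    return matches

# Demo: search the current PATH for a name that is unlikely to exist.
print(find_in_path("no-such-command-here", os.environ.get("PATH", "")))
```

On Windows, the execute bit reported by os.stat() is derived from the file extension rather than from file permissions, so the check means something different there. Python 3.3 and later also ship shutil.which(), which returns only the first match.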
Meanwhile, did you know that YAGNI was written about long before agile was a thing? IIRC, I've seen it described in either Kernighan and Ritchie (The C Programming Language) or in Kernighan and Pike (The UNIX Programming Environment). It could possibly be older than that, say from the mainframe era.
Finally, as I was adding labels to this blog post, Blogger showed me "pywhich" as a label, after I typed "which" in the labels box. That reminded me that I had written another post earlier about a Python which utility (not by me), so I found it on my blog by typing in this URL:
http://jugad2.blogspot.in/search/label/pywhich
which finds all posts on my blog with the label 'pywhich' (and the same approach works for any other label); the resulting post is:
pywhich, like the Unix which tool, for Python modules.
- Enjoy.
- Vasudev Ram - Online Python training and programming. Dancing Bison Enterprises. Signup to hear about new products that I create. Posts about Python. Posts about xtopdf. Contact Page.
Friday, October 24, 2014
Print selected text pages to PDF with Python, selpg and xtopdf on Linux
By Vasudev Ram
In a recent blog post, titled My IBM developerWorks article, I talked about a tutorial that I had written for IBM developerWorks a while ago. The tutorial showed some of the recommended techniques and practices to follow when writing a Linux command-line utility that is intended for production use, and how to write it in such a way that it can easily cooperate with existing UNIX command-line tools, when used in a UNIX command pipeline.
This ability of properly written command-line tools to cooperate with each other when used in a pipeline, is, as I said in that IBM article, one of the keys to the power of Linux (and UNIX) as a development environment. (See the classic book The UNIX Programming Environment, for much more on this topic.)
The utility I wrote and discussed (in that IBM article), called selpg (for SELect PaGes), allows the user to select a specified range of pages from a text file. At the end of the aforementioned blog post, I had said that I would show some practical uses of the selpg utility later. I describe one such use case below, involving a combination of selpg and my xtopdf toolkit, which is a Python library for PDF creation.
(The xtopdf toolkit contains a PDF creation library, and also includes some sample applications that show how to use the library to create PDF output in various ways, and from various input sources, which is why I tend to call xtopdf a toolkit instead of just a library.)
I had written one such application of xtopdf a while ago, called StdinToPDF(.py) (for standard input to PDF). I blogged about it at the time, here:
[xtopdf] PDFWriter can create PDF from standard input. (PDFWriter is a module of xtopdf, which provides the core PDF creation functionality.)
The selpg utility can be used with StdinToPDF, in a pipeline, to select a range of pages (by starting and ending page numbers) from a (possibly large) text file, and write only those selected pages to a PDF file. Here is an example of how to do that:
First, build the selpg utility from source, for your Linux OS. selpg is only meant to work on Linux, since it uses some Linux C standard library functions, such as from stdio.h, and popen(); but you can try to run it on Windows (at your own risk), since Windows does have (had?) a POSIX subsystem, from Windows NT onward. I have used it in the past. (Update: I checked - according to this section of the Wikipedia article about POSIX, Windows may have had POSIX support only from Windows NT up to Windows 2000.) Anyway, to build selpg on Linux, follow the steps below (the $ sign is the shell prompt and not to be typed):
1. Download the source code from the sources section of the selpg project repository on Bitbucket.
Download all of these files: makefile, mk, selpg.c and showsyserr.c .
2. Make the (shell script) file mk executable, with the command:
    $ chmod u+x mk
3. Then run the file mk, with:
    $ ./mk
That will run the makefile that builds the selpg executable using the C compiler on your Linux box. The C compiler (invoked as cc or gcc) is installed on most mainstream Linux distributions. If it is not, you will need to install it from the repository for your Linux distribution. Sometimes only a minimal version of a C compiler is installed, which is only enough to (re)compile the kernel after making kernel parameter changes, such as for performance tuning. Consult your local Linux expert for help if such is the case.
4. Now make the file selpg executable, with the command:
    $ chmod u+x selpg
5. (Optional) You can check the usage of selpg by reading the IBM tutorial article and/or running selpg without any command-line arguments:
    $ ./selpg
which will show a usage message.
6. (Optional) You can run selpg a few times with some text file(s) as input, and different values for the -s and -e command-line options, to get a feel for how it works.
Now download xtopdf (which includes StdinToPDF) from here:
xtopdf on Bitbucket.
To install it, follow the steps given in this post:
Guide to installing and using xtopdf, including creating simple PDF e-books
That post was written a while ago, when xtopdf was hosted on SourceForge. So you need to make one change to the instructions given in that guide: instead of downloading xtopdf from SourceForge, as stated in Step 5 of the guide, get it from the xtopdf Bitbucket link I gave above.
(To make xtopdf work, you also have to install ReportLab, which xtopdf uses internally; the steps for that are given in my xtopdf installation guide linked above, or you can also look at the instructions in the ReportLab distribution. It is easy, just a couple of steps - download, unzip, configure a setting or two.)
Once you have both selpg and xtopdf installed, you can use selpg and StdinToPDF together. The example run below selects only pages 2 through 4 from an input text file.
I wrote a simple Python program, gen_selpg_test_file.py, to create a text file that can be used to test the selpg and StdinToPDF programs together.
Here is an excerpt of the core logic of gen_selpg_test_file.py, omitting argument and error handling for brevity (I have those in the actual code):
# Generate the test file with the given filename and number of lines of text.
try:
    out_fil = open(out_filename, "w")
except IOError as ioe:
    sys.stderr.write("Error: Could not open output file {}.\n".format(out_filename))
    sys.exit(1)
for line_num in range(1, num_lines + 1):
    line = "Line #" + str(line_num).zfill(10) + "\n"
    out_fil.write(line)
out_fil.close()
I ran it like this:
$ python gen_selpg_test_file.py selpg_test_file_1000.txt 1000
to generate a text file with 1000 lines, in the file selpg_test_file_1000.txt .
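A self-contained sketch of such a generator might look like the following. It keeps the same line format as the excerpt, but the function names generate_test_file and main are hypothetical, and the argument checks are assumptions made for illustration; the actual gen_selpg_test_file.py may handle these differently.

```python
import sys


def generate_test_file(out_filename, num_lines):
    """Write num_lines numbered lines of text to out_filename."""
    # The with statement closes the file even if a write fails.
    with open(out_filename, "w") as out_fil:
        for line_num in range(1, num_lines + 1):
            # Zero-pad the number to 10 digits, e.g. "Line #0000000001".
            out_fil.write("Line #" + str(line_num).zfill(10) + "\n")


def main(argv):
    """Hypothetical argument handling; returns a process exit code."""
    if len(argv) != 3:
        sys.stderr.write(
            "Usage: python {} out_filename num_lines\n".format(argv[0]))
        return 1
    out_filename = argv[1]
    try:
        num_lines = int(argv[2])
    except ValueError:
        sys.stderr.write("Error: num_lines must be an integer.\n")
        return 1
    try:
        generate_test_file(out_filename, num_lines)
    except IOError:
        sys.stderr.write(
            "Error: Could not open output file {}.\n".format(out_filename))
        return 1
    return 0
```

To use this sketch as a script, call sys.exit(main(sys.argv)) at the bottom of the file.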
Then I could run the pipeline using selpg and StdinToPDF, as described above:
$ ./selpg -s2 -e4 selpg_test_file_1000.txt | python StdinToPDF.py p2-p4.pdf
This command extracts only the specified pages (2 to 4) from the input file and pipes them to StdinToPDF, which converts only those pages to PDF, writing the result to the filename given at the end of the command.
After doing the above, you can open the file p2-p4.pdf in your favorite PDF reader (Evince is one PDF reader for Linux), to confirm that it contains all (and only) the lines from pages 2 to 4 of the input file selpg_test_file_1000.txt (considering 72 lines per page, which is the default that selpg uses).
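The page arithmetic behind that check can be sketched in a few lines of Python. This is only an illustration of fixed-length paging, assuming pages are counted from 1 and every page has exactly lines_per_page lines; it is not selpg's actual C code, and the names page_line_range and select_pages are made up for this example.

```python
def page_line_range(start_page, end_page, lines_per_page=72):
    """Return (first_line, last_line), 1-based and inclusive, for the pages."""
    if start_page < 1 or end_page < start_page or lines_per_page < 1:
        raise ValueError("need 1 <= start_page <= end_page, lines_per_page >= 1")
    first_line = (start_page - 1) * lines_per_page + 1
    last_line = end_page * lines_per_page
    return first_line, last_line


def select_pages(lines, start_page, end_page, lines_per_page=72):
    """Return the sublist of lines that falls on the given pages."""
    first_line, last_line = page_line_range(start_page, end_page, lines_per_page)
    # Convert the 1-based inclusive range to a 0-based Python slice.
    return lines[first_line - 1:last_line]


# Same line format as the 1000-line test file described above.
test_lines = ["Line #" + str(n).zfill(10) + "\n" for n in range(1, 1001)]
selected = select_pages(test_lines, 2, 4)

print(page_line_range(2, 4))   # (73, 288)
print(len(selected))           # 216
print(selected[0], end="")     # Line #0000000073
print(selected[-1], end="")    # Line #0000000288
```

So with 72 lines per page, pages 2 to 4 of the 1000-line test file should be lines 73 through 288, that is, 216 lines, and the PDF should begin with Line #0000000073 and end with Line #0000000288.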
Read the IBM article to see how the default of 72 lines per page can be changed - to either another number of lines per page, e.g. 66 or 80 or whatever, or to specify form feeds (ASCII code 12) as the page delimiter. Form feeds are often used as a page delimiter in text file reports generated by programs, when the reports are destined for a printer, since the form feed character causes the printer to advance the print head to the top of the next page/form (that's how the character got its name).
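When form feeds delimit the pages, the page length can vary, so selection has to be done by splitting on the form feed character rather than by counting lines. A minimal sketch of that idea in Python follows; the function name select_ff_pages is hypothetical, and selpg itself may differ in details such as whether a trailing form feed is written after the last selected page.

```python
FORM_FEED = "\f"  # ASCII code 12


def select_ff_pages(text, start_page, end_page):
    """Return pages start_page..end_page of text, pages being separated by
    form feeds. Pages are counted from 1; the result keeps the form feeds
    between the selected pages, but not one after the last page."""
    if start_page < 1 or end_page < start_page:
        raise ValueError("need 1 <= start_page <= end_page")
    pages = text.split(FORM_FEED)
    return FORM_FEED.join(pages[start_page - 1:end_page])


# A made-up five-page report; page two is longer than the others.
report = (
    "page one\n" + FORM_FEED +
    "page two\nmore of page two\n" + FORM_FEED +
    "page three\n" + FORM_FEED +
    "page four\n" + FORM_FEED +
    "page five\n"
)

print(repr(select_ff_pages(report, 2, 4)))
```

Asking for pages beyond the end of the input simply returns whatever pages exist in that range, which may be an empty string.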
Though this post seemed long, note that a lot of it was either background information or instructions on how to build selpg and install xtopdf. Those are both one-time jobs. Once those are done, you can select the needed pages from any text file and print them to PDF with a single command line, as shown in the last command above.
This is useful when you printed the entire file earlier, and some pages didn't print properly because the printer jammed. Just use selpg with xtopdf to print only the needed pages again.
The image above is from the Wikipedia article on Printing, and titled:
Jikji, "Selected Teachings of Buddhist Sages and Son Masters" from Korea, the earliest known book printed with movable metal type, 1377. Bibliothèque Nationale de France, Paris
- Enjoy.
- Vasudev Ram - Dancing Bison Enterprises