Showing posts with label command-line. Show all posts

Monday, April 1, 2019

rmline: Python command-line utility to remove lines from a file [Rosetta Code solution]



- By Vasudev Ram - Online Python training / SQL training / Linux training



Pipeline image attribution

Hi readers,

Long time no post. Sorry.

I saw this programming problem about removing lines from a file on Rosetta Code.

Rosetta Code (Wikipedia) is a programming chrestomathy site.

It's a simple problem, so I thought it would make a good example for Python beginners.

So I wrote a program to solve it. To get the benefits of reuse and composition (at the command line), I wrote it as a Unix-style filter.

Here it is, in file rmline.py:
# Author: Vasudev Ram
# Copyright Vasudev Ram
# Product store:
#    https://gumroad.com/vasudevram
# Training (course outlines and testimonials):
#    https://jugad2.blogspot.com/p/training.html
# Blog:
#    https://jugad2.blogspot.com
# Web site:
#    https://vasudevram.github.io
# Twitter:
#    https://twitter.com/vasudevram

# Problem source:
# https://rosettacode.org/wiki/Remove_lines_from_a_file

from __future__ import print_function
import sys

from error_exit import error_exit

# globals 
sa, lsa = sys.argv, len(sys.argv)

def usage():
    print("Usage: {} start_line num_lines file".format(sa[0]))
    print("Usage: other_command | {} start_line num_lines".format(
    sa[0]))

def main():
    # Check number of args.
    if lsa < 3:
        usage()
        sys.exit(1)

    # Convert number args to ints.
    try:
        start_line = int(sa[1])
        num_lines = int(sa[2])
    except ValueError as ve:
        error_exit("{}: ValueError: {}".format(sa[0], str(ve)))

    # Validate int ranges.
    if start_line < 1:
        error_exit("{}: start_line ({}) must be > 0".format(sa[0], 
        start_line))
    if num_lines < 1:
        error_exit("{}: num_lines ({}) must be > 0".format(sa[0], 
        num_lines))

    # Decide source of input (stdin or file).
    if lsa == 3:
        in_fil = sys.stdin
    else:
        try:
            in_fil = open(sa[3], "r")
        except IOError as ioe:
            error_exit("{}: IOError: {}".format(sa[0], str(ioe)))

    end_line = start_line + num_lines - 1

    # Read input, skip unwanted lines, write others to output.
    for line_num, line in enumerate(in_fil, 1):
        if line_num < start_line:
            sys.stdout.write(line)
        elif line_num > end_line:
            sys.stdout.write(line)

    in_fil.close()

if __name__ == '__main__':
    main()
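
As a side note, the skipping logic in the loop above can be pulled out into a small generator, which makes it easy to try out without any files. This is only a sketch: the name remove_lines is hypothetical and is not part of rmline.py.

```python
import sys

def remove_lines(lines, start_line, num_lines):
    """Yield every line except the num_lines lines starting at start_line (1-based)."""
    end_line = start_line + num_lines - 1
    for line_num, line in enumerate(lines, 1):
        # Keep only the lines outside the closed range [start_line, end_line].
        if line_num < start_line or line_num > end_line:
            yield line

# Hypothetical sample input, similar to f5.txt.
sample = ["line {}\n".format(i) for i in range(1, 6)]
sys.stdout.writelines(remove_lines(sample, 1, 2))
```

Since the function accepts any iterable of lines, it works the same way on an open file, on sys.stdin or on a list.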

Here are a few test text files I tried it on:
$ dir f?.txt/b
f0.txt
f5.txt
f20.txt
f0.txt has 0 bytes.
Contents of f5.txt:
$ type f5.txt
line 1
line 2
line 3
line 4
line 5
f20.txt is similar to f5.txt, but with 20 lines.

Here are a few runs of the program, with output:
$ python rmline.py
Usage: rmline.py start_line num_lines file
Usage: other_command | rmline.py start_line num_lines

$ dir | python rmline.py
Usage: rmline.py start_line num_lines file
Usage: other_command | rmline.py start_line num_lines
Both the above runs show that when called with an invalid set of
arguments (none, in this case), it prints a usage message and exits.
$ python rmline.py f0.txt
Usage: rmline.py start_line num_lines file
Usage: other_command | rmline.py start_line num_lines
Same result, except I gave an invalid first (and only) argument, a file name. See the usage() function in the code to know the right order and types of arguments.
$ python rmline.py -3 4 f0.txt
rmline.py: start_line (-3) must be > 0

$ python rmline.py 2 0 f0.txt
rmline.py: num_lines (0) must be > 0
The above two runs show that it checks for invalid values for the
first two expected integer arguments, start_line and num_lines.
$ python rmline.py 1 2 f0.txt
For an empty input file, as expected, it both removes and prints nothing.
$ python rmline.py 1 2 f5.txt
line 3
line 4
line 5
The above run shows it removing lines 1 through 2 (start_line = 1, num_lines = 2) of the input from the output.
$ python rmline.py 7 4 f5.txt
line 1
line 2
line 3
line 4
line 5
The above run shows that if you give a starting line number larger than the last input line number, it removes no lines of the input.
$ python rmline.py 1 10 f20.txt
line 11
line 12
line 13
line 14
line 15
line 16
line 17
line 18
line 19
line 20
The above run shows it removing the first 10 lines of the input.
$ python rmline.py 6 10 f20.txt
line 1
line 2
line 3
line 4
line 5
line 16
line 17
line 18
line 19
line 20
The above run shows it removing the middle 10 lines of the input.
$ python rmline.py 11 10 f20.txt
line 1
line 2
line 3
line 4
line 5
line 6
line 7
line 8
line 9
line 10
The above run shows it removing the last 10 lines of the input.

Read more:

Pipeline (computing)

Redirection (computing)

The image at the top of the post is of a Unix-style pipeline, with standard input (stdin), standard output (stdout) and standard error (stderr) streams of programs, all independently redirectable, and with the standard output of a preceding command piped to the standard input of the succeeding command in the pipeline. Pipelines and I/O redirection are among the powerful features of the Unix operating system and shell.

Read a brief introduction to those concepts in an article I wrote for IBM developerWorks:

Developing a Linux command-line utility

The above link is to a post about that utility on my blog. For the
actual code for the utility (in C), and for the PDF of the article,
follow the relevant links in the post.

I had originally written the utility for production use for one of the
largest motorcycle manufacturers in the world.

Enjoy.


Saturday, December 29, 2018

The Zen of Python is well sed :)





- By Vasudev Ram - Online Python training / SQL training / Linux training

$ python -c "import this" | sed -n "4,4p;15,16p"
Explicit is better than implicit.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
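
For comparison, here is a rough Python sketch of what the sed expression "4,4p;15,16p" does: print only the lines whose numbers fall in the given ranges. The function name select_lines and the sample data are made up for illustration.

```python
def select_lines(lines, ranges):
    """Yield lines whose 1-based number falls in any (first, last) range."""
    for line_num, line in enumerate(lines, 1):
        if any(first <= line_num <= last for first, last in ranges):
            yield line

# Hypothetical input: twenty numbered lines.
text = ["line {}".format(i) for i in range(1, 21)]
picked = list(select_lines(text, [(4, 4), (15, 16)]))
print(picked)
```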


- Vasudev Ram - Online Python training and consulting

I conduct online courses on Python programming, Unix / Linux commands and shell scripting and SQL programming and database design, with course material and personal coaching sessions.

The course details and testimonials are here.

Contact me for details of course content, terms and schedule.

Or if you're a self-starter, check out my Python programming course by email.

Learning Linux? Hit the ground running with my vi quickstart tutorial.

Posts about: Python * DLang * xtopdf

My ActiveState Code recipes



Saturday, May 5, 2018

A Python version of the Linux watch command

By Vasudev Ram



Watcher image attribution: Yours truly

Hi readers,

[ Update: A note to those reading this post via Planet Python or other aggregators:

Before first publishing it, I reviewed the post in Blogger's preview mode, and it appeared okay, regarding the use of the less-than character, so I did not escape it. I did not know (or did not remember) that Planet Python's behavior may be different. As a result, the code had appeared without the less-than signs in the Planet, thereby garbling it. After noticing this, I fixed the issue in the post. Apologies to those seeing the post twice as a result. ]


I was browsing Linux command man pages (section 1) for some work, and saw the page for an interesting command called watch. I had not come across it before. So I read the watch man page, and after understanding how it works (it's pretty straightforward [1]), thought of creating a Python version of it. I have not tried to implement exactly the same functionality as watch, though, just something similar to it. I called the program watch.py.

[1] The one-line description of the watch command is:

watch - execute a program periodically, showing output fullscreen

How watch.py works:

It is a command-line Python program. It takes an interval argument (in seconds), followed by a command with optional arguments. It runs the command with those arguments, repeatedly, at that interval. (The Linux watch command has a few more options, but I chose not to implement those in this version. I may add some of them [2], and maybe some other features that I thought of, in a future version.)

[2] For example, the -t, -b and -e options should be easy to implement. The -p (--precise) option is interesting. The idea here is that there is always some time "drift" [3] when trying to run a command periodically at some interval, due to unpredictable and variable overhead of other running processes, OS scheduling overhead, and so on. I had experienced this issue earlier when I wrote a program that I called pinger.sh, at a large company where I worked earlier.

[3] You can observe the time drift in the output of the runs of the watch.py program, shown after its code below. Compare the interval with the times shown for successive runs of the same command.

I had written pinger.sh at the request of some sysadmin friends there, who wanted a tool like that to monitor the uptime of multiple Unix servers on the company network. So I wrote the tool, using a combination of Unix shell, Perl and C. They later told me that it was useful, and that they used it to monitor the uptime of multiple servers of the company in different cities. The C part was where the more interesting stuff was, since I used C to write a program (used in the overall shell script) that tried to compensate for the time drift, by calculating the time remaining in the current interval and sleeping for only that long. It worked somewhat okay, in that it reduced the drift a good amount. I don't remember the exact logic I used for it right now, but do remember finding out later that the gettimeofday function might have been usable in place of the custom code I wrote to solve the issue. Good fun. I later published the utility and a description of it in the company's Knowledge Management System.
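
To make the idea of compensating for drift concrete, here is a minimal sketch in Python. It is not the logic of the original C program (which is not shown here), just one common approach: aim at fixed deadlines (start + k * interval) instead of sleeping for the full interval after each run. The names sleep_time and run_periodically are hypothetical.

```python
import time

def sleep_time(start, interval, now):
    """Return seconds to sleep so the next run lands on start + k * interval."""
    elapsed = now - start
    return interval - (elapsed % interval)

def run_periodically(action, interval, count, clock=time.time, sleep=time.sleep):
    """Call action() count times, aiming at fixed deadlines instead of fixed sleeps."""
    start = clock()
    for _ in range(count):
        action()
        # Sleep only for what is left of the current interval, so the time
        # taken by action() does not accumulate as drift.
        sleep(sleep_time(start, interval, clock()))
```

The clock and sleep parameters are there only so that the function can be exercised with fake versions; in normal use the defaults would do. Some drift remains even so, since sleeping is never exact.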

Anyway, back to watch.py: each time, it first prints a header line with the interval, the command string (truncated if needed), and the current date and time, followed by some initial lines of the output of that command (this is what "watching" the command means). It does this by creating a pipe with the command, using subprocess.Popen and then reading the standard output of the command, and printing the first num_lines lines, where num_lines is an argument to the watch() function in the program.
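
The pipe-reading step described above can be sketched on its own. Note the assumptions: the helper name first_lines is hypothetical, and the command is passed as a list of arguments rather than as a shell string.

```python
import sys
from itertools import islice
from subprocess import Popen, PIPE

def first_lines(command, num_lines):
    """Run command (a list of arguments) and return its first num_lines output lines."""
    proc = Popen(command, stdout=PIPE, universal_newlines=True)
    try:
        # islice stops after num_lines lines without reading the rest.
        return list(islice(proc.stdout, num_lines))
    finally:
        proc.stdout.close()
        proc.wait()

# Use the running Python interpreter as a portable test command.
cmd = [sys.executable, "-c", "print('a'); print('b'); print('c')"]
print(first_lines(cmd, 2))
```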

The screen is cleared with "clear" for Linux and "cls" for Windows. Using "echo ^L" instead of "clear" works on some Linux systems, so changing the clear screen command to that may make the program a little faster, on systems where echo is a shell built-in, since there will be no need to load the clear command into memory each time [4]. (As a small aside, on earlier Unix systems I've worked on, on which there was sometimes no clear command (or it was not installed), as a workaround, I used to write a small C program that printed 25 newlines to the screen, and compile and install that as a command called clear or cls :)
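
One thing to watch for when picking the clear-screen command: sys.platform is "darwin" on macOS, which contains the substring "win", so a substring test can pick the wrong command there. A small sketch of a safer check follows (the helper names are hypothetical):

```python
import os
import sys

def clear_command(platform=sys.platform):
    """Return the shell command used to clear the screen on the given platform."""
    # startswith("win") matches "win32" but not "darwin" or "cygwin".
    return "cls" if platform.startswith("win") else "clear"

def clear_screen():
    """Clear the terminal by running the platform's clear command."""
    os.system(clear_command())
```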

[4] Although, on recent Windows and Linux systems, after a program is run once, if you run it multiple times a short while later, I've noticed that the startup time is faster from the second time onwards. I guess this is because the OS loads the program code into a memory cache in some way, and runs it from there for the later times it is called. Not sure if this is the same as the OS buffer cache, which I think is only for data. I don't know if there is a standard name for this technique. I've noticed for sure, that when running Python programs, for example, the first time you run:

python some_script_name.py

it takes a bit of time - maybe a second or three, but after the first time, it starts up faster. Of course this speedup disappears when you run the same program after a bigger gap, say the next day, or after a reboot. Presumably this is because that program cache has been cleared.

Here is the code for watch.py.
"""
------------------------------------------------------------------
File: watch.py
Version: 0.1
Purpose: To work somewhat like the Linux watch command.
See: http://man7.org/linux/man-pages/man1/watch.1.html
Does not try to replicate its functionality exactly.

Author: Vasudev Ram
Copyright 2018 Vasudev Ram
Web site: https://vasudevram.github.io
Blog: https://jugad2.blogspot.com
Product store: https://gumroad.com/vasudevram
Twitter: https://mobile.twitter.com/vasudevram
------------------------------------------------------------------
"""

from __future__ import print_function

import sys
import os
from subprocess import Popen, PIPE
import time

from error_exit import error_exit

# Assuming 25-line terminal, so show at most 20 lines of command 
# output below the header. Adjust if different.
# If on Unix / Linux, can get value of environment variable 
# LINES (if defined) and base the value on that instead of 25.
DEFAULT_NUM_LINES = 20

def usage(args):
    lines = [
        "Usage: python {} interval command [ argument ... ]".format(
            args[0]),
        "Run command with the given arguments every interval seconds,",
        "and show some initial lines from command's standard output.",
        "Clear screen before each run.",
    ]
    for line in lines:
        sys.stderr.write(line + '\n')

def watch(command, interval, num_lines):
    # Truncate command for display in the header of watch output.
    if len(command) > 50:
        command_str = command[:50] + "..."
    else:
        command_str = command
    hdr_part_1 = "Every {}s: {} ".format(interval, command_str)
    # Assuming 80 columns terminal width. Adjust if different.
    # If on Unix / Linux, can get value of environment variable 
    # COLUMNS (if defined) and use that instead of 80.
    columns = 80
    # Compute pad_len only once, before the loop, because 
    # neither len(hdr_part_1) nor len(hdr_part_2) change, 
    # even though hdr_part_2 is recomputed each time in the loop.
    hdr_part_2 = time.asctime()
    pad_len = columns - len(hdr_part_1) - len(hdr_part_2) - 1
    while True:
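        # Clear screen based on OS platform.
        # (A test like "win" in sys.platform would also match "darwin",
        # i.e. macOS, so check the start of the string instead.)
        if sys.platform.startswith("win"):
            os.system("cls")
        else:
            os.system("clear")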
        hdr_str = hdr_part_1 + (" " * pad_len) + hdr_part_2
        print(hdr_str + "\n")
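        # Run the command, read and print its output up to num_lines lines.
        # os.popen is the old deprecated way, Python docs recommend to use 
        # subprocess.Popen.
        #with os.popen(command) as pipe:
        # universal_newlines=True makes the pipe yield text lines (not 
        # bytes) on Python 3 as well.
        with Popen(command, shell=True, stdout=PIPE,
                universal_newlines=True).stdout as pipe:
            # Count from 1, so that exactly num_lines lines are printed.
            for line_num, line in enumerate(pipe, 1):
                print(line, end='')
                if line_num >= num_lines:
                    break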
        time.sleep(interval)
        hdr_part_2 = time.asctime()

def main():

    sa, lsa = sys.argv, len(sys.argv)

    # Check arguments and exit if invalid.
    if lsa < 3:
        usage(sa)
        error_exit(
        "At least two arguments are needed: interval and command;\n"
        "optional arguments can be given following command.\n")

    try:
        # Get the interval argument as an int.
        interval = int(sa[1])
        if interval < 1:
            error_exit("{}: Invalid interval value: {}".format(sa[0],
                interval))
        # Build the command to run from the remaining arguments.
        command = " ".join(sa[2:])
        # Run the command repeatedly at the given interval.
        watch(command, interval, DEFAULT_NUM_LINES)
    except ValueError as ve:
        error_exit("{}: Caught ValueError: {}".format(sa[0], str(ve)))
    except OSError as ose:
        error_exit("{}: Caught OSError: {}".format(sa[0], str(ose)))
    except Exception as e:
        error_exit("{}: Caught Exception: {}".format(sa[0], str(e)))

if __name__ == "__main__":
    main()
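
The header arithmetic in watch() can be checked in isolation with a small sketch like the one below. The function name make_header is hypothetical; the padding formula is the one used in the code above, with an extra guard against negative padding.

```python
def make_header(interval, command, timestamp, columns=80):
    """Return the header line: interval and command on the left, timestamp on the right."""
    # Truncate long commands, as watch() does.
    if len(command) > 50:
        command = command[:50] + "..."
    left = "Every {}s: {} ".format(interval, command)
    # Leave one column free at the right edge; never pad by a negative amount.
    pad_len = max(0, columns - len(left) - len(timestamp) - 1)
    return left + " " * pad_len + timestamp

header = make_header(15, "ping google.com", "Fri May 04 21:15:56 2018")
print(header)
```

On Python 3.3 and later, shutil.get_terminal_size().columns could be passed as the columns argument instead of assuming 80.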
Here is the code for error_exit.py, which watch imports.
# error_exit.py

# Author: Vasudev Ram
# Web site: https://vasudevram.github.io
# Blog: https://jugad2.blogspot.com
# Product store: https://gumroad.com/vasudevram

# Purpose: This module, error_exit.py, defines a function with 
# the same name, error_exit(), which takes a string message 
# as an argument. It prints the message to sys.stderr, or 
# to another file object open for writing (if given as the 
# second argument), and then exits the program.
# The function error_exit can be used when a fatal error condition occurs, 
# and you therefore want to print an error message and exit your program.

import sys

def error_exit(message, dest=sys.stderr):
    dest.write(message)
    sys.exit(1)

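def main():
    # error_exit() ends by raising SystemExit, so trap it for the first 
    # two tests; otherwise only the first call would ever run.
    try:
        error_exit("Testing error_exit with dest sys.stderr (default).\n")
    except SystemExit:
        pass
    try:
        error_exit("Testing error_exit with dest sys.stdout.\n", 
            sys.stdout)
    except SystemExit:
        pass
    with open("temp1.txt", "w") as fil:
        error_exit("Testing error_exit with dest temp1.txt.\n", fil)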

if __name__ == "__main__":
    main()
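
Because error_exit() finishes by calling sys.exit(), which raises SystemExit, it can be tested by trapping that exception and capturing the message in an in-memory file. A minimal sketch follows; error_exit is repeated here only to keep the example self-contained, and check_error_exit is a hypothetical helper.

```python
import io
import sys

def error_exit(message, dest=sys.stderr):
    # Same behaviour as error_exit.py: write the message, then exit with status 1.
    dest.write(message)
    sys.exit(1)

def check_error_exit():
    """Return (captured message, exit code) from one trapped error_exit() call."""
    buf = io.StringIO()
    try:
        error_exit(u"fatal: something failed\n", buf)
    except SystemExit as exc:
        return buf.getvalue(), exc.code
    return buf.getvalue(), None

print(check_error_exit())
```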
Here are some runs of watch.py and their output:
(BTW, the dfs command shown, is from the Quick-and-dirty disk free space checker for Windows post that I had written recently.)

$ python watch.py 15 ping google.com

Every 15s: ping google.com                             Fri May 04 21:15:56 2018

Pinging google.com [2404:6800:4007:80d::200e] with 32 bytes of data:
Reply from 2404:6800:4007:80d::200e: time=117ms
Reply from 2404:6800:4007:80d::200e: time=109ms
Reply from 2404:6800:4007:80d::200e: time=117ms
Reply from 2404:6800:4007:80d::200e: time=137ms

Ping statistics for 2404:6800:4007:80d::200e:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 109ms, Maximum = 137ms, Average = 120ms

Every 15s: ping google.com                             Fri May 04 21:16:14 2018

Pinging google.com [2404:6800:4007:80d::200e] with 32 bytes of data:
Reply from 2404:6800:4007:80d::200e: time=501ms
Reply from 2404:6800:4007:80d::200e: time=56ms
Reply from 2404:6800:4007:80d::200e: time=105ms
Reply from 2404:6800:4007:80d::200e: time=125ms

Ping statistics for 2404:6800:4007:80d::200e:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 56ms, Maximum = 501ms, Average = 196ms

Every 15s: ping google.com                             Fri May 04 21:16:33 2018

Pinging google.com [2404:6800:4007:80d::200e] with 32 bytes of data:
Reply from 2404:6800:4007:80d::200e: time=189ms
Reply from 2404:6800:4007:80d::200e: time=141ms
Reply from 2404:6800:4007:80d::200e: time=245ms
Reply from 2404:6800:4007:80d::200e: time=268ms

Ping statistics for 2404:6800:4007:80d::200e:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 141ms, Maximum = 268ms, Average = 210ms

$ python watch.py 15 c:\ch\bin\date

Every 15s: c:\ch\bin\date                              Tue May 01 00:33:00 2018

Tue May  1 00:33:00 India Standard Time 2018

Every 15s: c:\ch\bin\date                              Tue May 01 00:33:15 2018

Tue May  1 00:33:16 India Standard Time 2018

Every 15s: c:\ch\bin\date                              Tue May 01 00:33:31 2018

Tue May  1 00:33:31 India Standard Time 2018

Every 15s: c:\ch\bin\date                              Tue May 01 00:33:46 2018

Tue May  1 00:33:47 India Standard Time 2018

In one CMD window:

$ d:\temp\fill-and-free-disk-space

In another:

$ python watch.py 10 dfs d:\

Every 10s: dfs d:\                                     Tue May 01 00:43:25 2018
Disk free space on d:\
37666.6 MiB = 36.78 GiB

Every 10s: dfs d:\                                     Tue May 01 00:43:35 2018
Disk free space on d:\
37113.7 MiB = 36.24 GiB

$ python watch.py 20 dir /b "|" sort

Every 20s: dir /b | sort                               Fri May 04 21:29:41 2018

README.txt
runner.py
watch-outputs.txt
watch-outputs2.txt
watch.py
watchnew.py

$ python watch.py 10 ping com.nosuchsite

Every 10s: ping com.nosuchsite                         Fri May 04 21:30:49 2018

Ping request could not find host com.nosuchsite. Please check the name and try again.

$ python watch.py 20 dir z:\

Every 20s: dir z:\                                     Tue May 01 00:54:37 2018
The system cannot find the path specified.

$ python watch.py 2b echo testing
watch.py: Caught ValueError: invalid literal for int() with base 10: '2b'

$ python watch.py 20 foo

Every 20s: foo                                         Fri May 04 21:33:35 2018

'foo' is not recognized as an internal or external command,
operable program or batch file.

$ python watch.py -1 foo
watch.py: Invalid interval value: -1
- Enjoy.

Interested in a Python programming or Linux commands and shell scripting course? I have many years of real-life experience, as well as teaching experience, in both those subject areas. Contact me for course details via my contact page here.

- Vasudev Ram - Online Python training and consulting

Get updates (via Gumroad) on my forthcoming apps and content.

Jump to posts: Python * DLang * xtopdf

Subscribe to my blog by email

My ActiveState Code recipes

Follow me on: LinkedIn * Twitter




Wednesday, May 10, 2017

Python utility like the Unix cut command - Part 1 - cut1.py

By Vasudev Ram

Regular readers of my blog might have noticed that I sometimes write (and write about) command line utilities in posts here.

Recently, I thought of implementing a utility like the Unix cut command in Python.

Here is my first cut at it (pun intended :), below.

However, instead of just posting the final code and some text describing it, this time, I had the idea of doing something different in a post (or a series of posts).

I thought it might be interesting to show some of the stages of development of the utility, such as incremental versions of it, with features or code improvements added in each successive version, and a bit of discussion on the design and implementation, and also on the thought processes occurring during the work.

(For beginners: incidentally, speaking of thought processes, during interviews for programming jobs at many companies, open-ended questions are often asked, where you have to talk about your thoughts as you work your way through to a solution to a problem posed. I know this because I've been on both sides of that many times. This interviewing technique helps the interviewers gauge your thought processes, and thereby, helps them decide whether they think you are good at design and problem-solving or not, which helps decide whether you get the job or not.)

One reason for doing this format of post, is just because it can be fun (for me to write about, and hopefully for others to read - I know I myself like to read such posts), and another is because one of my lines of business is training (on Python and other areas), and I've found that beginners sometimes have trouble going from a problem or exercise spec to a working implementation, even if they understand well the syntax of the language features needed to implement a solution. This can happen because 1) there can be many possible solutions to a given programming problem, and 2) the way to break down a problem into smaller pieces (stepwise refinement), that are more amenable to translating into programming statements and constructs, is not always self-evident to beginners. It is a skill one acquires over time, as one keeps on programming for months and years.

So I'm going to do it that way (with more explanation and multiple versions), over the course of a few posts.

For this first post, I'll just describe the rudimentary first version that I implemented, and show the code and the output of a few runs of the program.

The Wikipedia link about Unix cut near the top of this post describes its behavior and command line options.

In this first version, I only implement a subset of those:
- reading from a file (only a single file, and not reading from standard input (stdin))
- only support the -c (for cut by column) option (not the -b (by byte) or -f (by field) options)
- only support one column specification, i.e. -cm-n, not forms like -cm1-n1,m2-n2,...

In subsequent versions, I'll add support for some of the omitted features, and also fix any errors that I find in previous versions, by testing.

I'll call the versions cutN.py, where 1 <= N <= the highest version I implement. So this current post is about cut1.py.

Here is the code for cut1.py:
"""
File: cut1.py
Purpose: A Python tool somewhat similar to the Unix cut command.
Does not try to be exactly the same or implement all the features 
of Unix cut. Created for educational purposes.
Author: Vasudev Ram
Copyright 2017 Vasudev Ram
Web site: https://vasudevram.github.io
Blog: https://jugad2.blogspot.com
Product store: https://gumroad.com/vasudevram
"""

from __future__ import print_function

import sys
from error_exit import error_exit

def usage(args):
    #"Cuts the specified columns from each line of the input.\n", 
    lines = [
    "Print only the specified columns from lines of the input.\n", 
    "Usage: {} -cm-n file\n".format(args[0]), 
    "or: cmd_generating_text | {} -cm-n\n".format(args[0]), 
    "where -c means cut by column and\n", 
    "m-n means select (character) columns m to n\n", 
    "For each line, the selected columns will be\n", 
    "written to standard output. Columns start at 1.\n", 
    ]
    for line in lines:
        sys.stderr.write(line)

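def cut(in_fil, start_col, end_col):
    for lin in in_fil:
        # Strip the line's own newline first, so that a slice reaching 
        # the end of the line does not print an extra blank line.
        print(lin.rstrip("\n")[start_col:end_col])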

def main():

    sa, lsa = sys.argv, len(sys.argv)
    # Support only one input file for now.
    # Later extend to support stdin or more than one file.
    if lsa != 3:
        usage(sa)
        sys.exit(1)

    prog_name = sa[0]

    # If first two chars of first arg after script name are not "-c",
    # exit with error.
    if sa[1][:2] != "-c":
        usage(sa)
        error_exit("{}: Expected -c option".format(prog_name))

    # Get the part of first arg after the "-c".
    c_opt_arg = sa[1][2:]
    # Split that on "-".
    c_opt_flds = c_opt_arg.split("-")
    if len(c_opt_flds) != 2:
        error_exit("{}: Expected two field numbers after -c option, like m-n".format(prog_name))

    try:
        start_col = int(c_opt_flds[0])
        end_col = int(c_opt_flds[1])
    except ValueError as ve:
        error_exit("{}: Conversion of either start_col or end_col to int failed".format(
        prog_name))

    if start_col < 1:
        error_exit("Error: start_col ({}) < 1".format(start_col))
    if end_col < 1:
        error_exit("Error: end_col ({}) < 1".format(end_col))
    if end_col < start_col:
        error_exit("Error: end_col < start_col")
    
    try:
        in_fil = open(sa[2], "r")
        cut(in_fil, start_col - 1, end_col)
        in_fil.close()
    except IOError as ioe:
        error_exit("Caught IOError: {}".format(repr(ioe)))

if __name__ == '__main__':
    main()
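
The mapping from a -cm-n option to a Python slice, which main() and cut() do between them above, can also be tried out in isolation. This is a sketch under my own assumptions: parse_col_spec is a hypothetical helper that raises ValueError instead of calling error_exit().

```python
def parse_col_spec(opt):
    """Turn an option like "-c6-12" into a (start, end) pair usable as a slice."""
    if not opt.startswith("-c"):
        raise ValueError("expected -c option")
    fields = opt[2:].split("-")
    if len(fields) != 2:
        raise ValueError("expected a spec like m-n")
    start_col, end_col = int(fields[0]), int(fields[1])
    if start_col < 1 or end_col < start_col:
        raise ValueError("need 1 <= m <= n")
    # Columns are 1-based and inclusive; Python slices are 0-based and
    # exclude the end index, so only the start needs adjusting.
    return start_col - 1, end_col

start, end = parse_col_spec("-c6-12")
print("this is a line with many words in it. how is it."[start:end])
```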
Here are the outputs of a few runs of cut1.py. I used this text file for the tests.
The line of digits at the top acts like a ruler :) which helps you know what character is at what column:
$ type cut-test-file-01.txt
12345678901234567890123456789012345678901234567890
this is a line with many words in it. how is it.
here is another line which also has many words.
now there is a third line that has some words.
can you believe it, a fourth line exists here.
$ python cut1.py
Print only the specified columns from lines of the input.
Usage: cut1.py -cm-n file
or: cmd_generating_text | cut1.py -cm-n
where -c means cut by column and
m-n means select (character) columns m to n
For each line, the selected columns will be
written to standard output. Columns start at 1.

$ python cut1.py -c
Print only the specified columns from lines of the input.
Usage: cut1.py -cm-n file
or: cmd_generating_text | cut1.py -cm-n
where -c means cut by column and
m-n means select (character) columns m to n
For each line, the selected columns will be
written to standard output. Columns start at 1.

$ python cut1.py -c a
cut1.py: Expected two field numbers after -c option, like m-n

$ python cut1.py -c0-0
Print only the specified columns from lines of the input.
Usage: cut1.py -cm-n file
or: cmd_generating_text | cut1.py -cm-n
where -c means cut by column and
m-n means select (character) columns m to n
For each line, the selected columns will be
written to standard output. Columns start at 1.

$ python cut1.py -c0-0 a
Error: start_col (0) < 1

$ python cut1.py -c1-0 a
Error: end_col (0) < 1

$ python cut1.py -c1-1 a
Caught IOError: IOError(2, 'No such file or directory')

$ python cut1.py -c1-1 cut-test-file-01.txt
1
t
h
n
c

$ python cut1.py -c6-12 cut-test-file-01.txt
6789012
is a li
is anot
here is
ou beli

$ python cut1.py -20-12 cut-test-file-01.txt
Print only the specified columns from lines of the input.
Usage: cut1.py -cm-n file
or: cmd_generating_text | cut1.py -cm-n
where -c means cut by column and
m-n means select (character) columns m to n
For each line, the selected columns will be
written to standard output. Columns start at 1.
cut1.py: Expected -c option

$ python cut1.py -c20-12 cut-test-file-01.txt
Error: end_col < start_col
- Vasudev Ram - Online Python training and consulting



Friday, April 14, 2017

Quick CMD one-liner like "which python" command - and more

By Vasudev Ram


Today, while browsing StackOverflow, I saw this question:

Is there an equivalent of 'which' on the Windows command line?

And I liked the second answer.

The first part of that answer had this solution:
c:\> for %i in (cmd.exe) do @echo.   %~$PATH:i
   C:\WINDOWS\system32\cmd.exe

c:\> for %i in (python.exe) do @echo.   %~$PATH:i
   C:\Python25\python.exe
Then I modified the one-liner above to add a few more commands to search for, and ran it:
$ for %i in (cmd.exe python.exe metapad.exe file_sizes.exe tp.exe alarm_clock.py) do @echo.   %~$PATH:i
   C:\Windows\System32\cmd.exe
   D:\Anaconda2-4.2.0-32bit\python.exe
   C:\util\metapad.exe
   C:\util\file_sizes.exe
   C:\util\tp.exe
   C:\util\alarm_clock.py
Notice that I also included a .py file in the list of commands to search for, and it worked for that too. That is because the %~$PATH:i modifier just looks for a file with the given name in each directory listed in PATH; it does not check whether the file is an executable.

Of course, as a user comments in that SO post, this is not exactly the same as the which command, since you have to specify the .exe extension (for the commands you search for), and there are other extensions for executable files, such as .bat and others. But since the Python interpreter is normally named python.exe, this can be used as a quick-and-dirty way to find out which Python executable is going to be run when you type "python", if you have more than one of them installed on your system.

I had also written a simple Python tool roughly like the Unix which command, earlier:
A simple UNIX-like "which" command in Python
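
In the same spirit, here is a minimal Python sketch of a PATH search. It is not the tool from the post linked above; find_on_path is a hypothetical name used for illustration.

```python
import os

def find_on_path(name, path=None):
    """Return the full path of the first file called name found on PATH, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        # This checks only that the file exists, so the full file name
        # (including any extension) must be given.
        if directory and os.path.isfile(candidate):
            return candidate
    return None

for name in ("python.exe", "python", "no-such-command"):
    print(name, "->", find_on_path(name))
```

On Python 3.3 and later, the standard library's shutil.which() does a more complete job, including checking that the file is executable and, on Windows, trying the extensions listed in PATHEXT.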

- Vasudev Ram - Online Python training and consulting




Friday, March 31, 2017

A Python class like the Unix tee command


By Vasudev Ram


Tee image attribution

Hi readers,

A few days ago, while doing some work with Python and Unix (which I do a lot of), I got the idea of trying to implement something like the Unix tee command, but within Python code - i.e., not as a Python program but as a small Python class that Python programmers could use to get tee-like functionality in their code.

Today I wrote the class and a test program and tried it out. Here is the code, in file tee.py:
# tee.py
# Purpose: A Python class with a write() method which, when 
# used instead of print() or sys.stdout.write(), for writing 
# output, will cause output to go to both sys.stdout and 
# the filename passed to the class's constructor. The output 
# file is called the teefile in the below comments and code.

# The idea is to do something roughly like the Unix tee command, 
# but from within Python code, using this class in your program.

# The teefile will be overwritten if it exists.

# The class also has a writeln() method which is a convenience 
# method that adds a newline at the end of each string it writes, 
# so that the user does not have to.

# Python's string formatting language is supported (without any 
# effort needed in this class), since Python's strings support it, 
# not the print method.

# Author: Vasudev Ram
# Web site: https://vasudevram.github.io
# Blog: https://jugad2.blogspot.com
# Product store: https://gumroad.com/vasudevram

from __future__ import print_function
import sys
from error_exit import error_exit

class Tee(object):
    def __init__(self, tee_filename):
        try:
            self.tee_fil = open(tee_filename, "w")
        except IOError as ioe:
            error_exit("Caught IOError: {}".format(repr(ioe)))
        except Exception as e:
            error_exit("Caught Exception: {}".format(repr(e)))

    def write(self, s):
        sys.stdout.write(s)
        self.tee_fil.write(s)

    def writeln(self, s):
        self.write(s + '\n')

    def close(self):
        try:
            self.tee_fil.close()
        except IOError as ioe:
            error_exit("Caught IOError: {}".format(repr(ioe)))
        except Exception as e:
            error_exit("Caught Exception: {}".format(repr(e)))

def main():
    if len(sys.argv) != 2:
        error_exit("Usage: python {} teefile".format(sys.argv[0]))
    tee = Tee(sys.argv[1])
    tee.write("This is a test of the Tee Python class.\n")
    tee.writeln("It is inspired by the Unix tee command,")
    tee.write("which can send output to both a file and stdout.\n")
    i = 1
    s = "apple"
    tee.writeln("This line has interpolated values like {} and '{}'.".format(i, s))
    tee.close()

if __name__ == '__main__':
    main()
And when I ran it, I got this output:
$ python tee.py test_tee.out
This is a test of the Tee Python class.
It is inspired by the Unix tee command,
which can send output to both a file and stdout.
This line has interpolated values like 1 and 'apple'.

$ type test_tee.out
This is a test of the Tee Python class.
It is inspired by the Unix tee command,
which can send output to both a file and stdout.
This line has interpolated values like 1 and 'apple'.

$ python tee.py test_tee.out > main.out

$ fc /l main.out test_tee.out
Comparing files main.out and TEST_TEE.OUT
FC: no differences encountered
As you can see, I compared the teefile with the redirected stdout output, and they are the same.

I have not implemented the exact same features as the Unix tee. E.g. I did not implement the -a option (to append to a teefile if it exists, instead of overwriting it), and did not implement the option of multiple teefiles. Both are straightforward.
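To give an idea of how those two features might look, here is a minimal sketch. It is a separate class (the name MultiTee and the append parameter are my assumptions, not part of tee.py above), and it leaves out the error_exit handling to stay self-contained:

```python
from __future__ import print_function
import os
import sys
import tempfile

class MultiTee(object):
    """Write to sys.stdout and to one or more teefiles."""

    def __init__(self, filenames, append=False):
        # "a" mimics tee -a (append); "w" overwrites, as tee.py does.
        mode = "a" if append else "w"
        self.tee_files = [open(name, mode) for name in filenames]

    def write(self, s):
        sys.stdout.write(s)
        for fil in self.tee_files:
            fil.write(s)

    def writeln(self, s):
        self.write(s + '\n')

    def close(self):
        for fil in self.tee_files:
            fil.close()

# Demo: two teefiles in a temporary directory, written in two runs.
tmpdir = tempfile.mkdtemp()
names = [os.path.join(tmpdir, n) for n in ("copy1.out", "copy2.out")]
tee = MultiTee(names)                # first run overwrites
tee.writeln("first run")
tee.close()
tee = MultiTee(names, append=True)   # second run appends
tee.writeln("second run")
tee.close()
```

After the demo, both teefiles contain the two lines, and the same two lines have gone to stdout.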

Ideas for the use of this Tee class and programs using it:

- the obvious one - use it like the Unix tee, to both make a copy of some program's output in a file, and show the same output on the screen. We could even pipe the screen (i.e. stdout) output to a Python (or other) text file pager :-)

- to capture intermediate output of some of the commands in a pipeline, before the later commands change it. For another way of doing that, see:

Using PipeController to run a pipe incrementally

- use it to make multiple copies of a file, by implementing the Unix tee command's multiple output file option in the Tee class.

Then we can even use it like this, so we don't get any screen output, and also copy some data to multiple files in a single step:

program_using_tee_class.py >/dev/null # or >NUL if on Windows.

Assuming that multiple teefiles were specified when creating the Tee object that the program uses, this will make multiple copies of the program's output in the specified teefiles, while the screen output is thrown away. In other words, it will act like a command that copies some data (the output of the Python program) to multiple locations at the same time: one could be in a directory on your hard disk, another on a USB thumb/pen drive, a third on a network share, etc. The advantage is that the source is read or generated only once for all the destinations, instead of once for the copy to each destination. This can be more efficient, particularly for large outputs / copies.

For more fun Unixy / Pythonic stdin / stdout / pipe stuff, check out:

[xtopdf] PDFWriter can create PDF from standard input

and a follow-up post, that shows how to use the StdinToPDF program in that post, along with my selpg Unix C utility, to print only selected pages of text to PDF:

Print selected text pages to PDF with Python, selpg and xtopdf on Linux

Enjoy your tea :)

- Vasudev Ram - Online Python training and consulting





Wednesday, March 1, 2017

Show error numbers and codes from the os.errno module

By Vasudev Ram

While browsing the Python standard library docs, in particular the module os.errno, I got the idea of writing this small utility to display os.errno error codes and error names, which are stored in the dict os.errno.errorcode.

Here is the program, os_errno_info.py:
from __future__ import print_function
'''
os_errno_info.py
To show the error codes and 
names from the os.errno module.
Author: Vasudev Ram
Copyright 2017 Vasudev Ram
Web site: https://vasudevram.github.io
Blog: https://jugad2.blogspot.com
Product store: https://gumroad.com/vasudevram
'''

import sys
import os

def main():
    
    print("Showing error codes and names\nfrom the os.errno module:")
    print("Python sys.version:", sys.version[:6])
    print("Number of error codes:", len(os.errno.errorcode))
    print("{0:>4}{1:>8}   {2:<20}    {3:<}".format(\
        "Idx", "Code", "Name", "Message"))
    for idx, key in enumerate(sorted(os.errno.errorcode)):
        print("{0:>4}{1:>8}   {2:<20}    {3:<}".format(\
            idx, key, os.errno.errorcode[key], os.strerror(key)))

if __name__ == '__main__':
    main()
And here is the output on running it:
$ py -2 os_errno_info.py >out2 && gvim out2
Showing error codes and names
from the os.errno module:
Python sys.version: 2.7.12
Number of error codes: 86
 Idx    Code   Name                    Message
   0       1   EPERM                   Operation not permitted
   1       2   ENOENT                  No such file or directory
   2       3   ESRCH                   No such process
   3       4   EINTR                   Interrupted function call
   4       5   EIO                     Input/output error
   5       6   ENXIO                   No such device or address
   6       7   E2BIG                   Arg list too long
   7       8   ENOEXEC                 Exec format error
   8       9   EBADF                   Bad file descriptor
   9      10   ECHILD                  No child processes
  10      11   EAGAIN                  Resource temporarily unavailable
  11      12   ENOMEM                  Not enough space
  12      13   EACCES                  Permission denied
  13      14   EFAULT                  Bad address
  14      16   EBUSY                   Resource device
  15      17   EEXIST                  File exists
  16      18   EXDEV                   Improper link
  17      19   ENODEV                  No such device
  18      20   ENOTDIR                 Not a directory
  19      21   EISDIR                  Is a directory
  20      22   EINVAL                  Invalid argument
  21      23   ENFILE                  Too many open files in system
  22      24   EMFILE                  Too many open files
  23      25   ENOTTY                  Inappropriate I/O control operation
  24      27   EFBIG                   File too large
  25      28   ENOSPC                  No space left on device
  26      29   ESPIPE                  Invalid seek
  27      30   EROFS                   Read-only file system
  28      31   EMLINK                  Too many links
  29      32   EPIPE                   Broken pipe
  30      33   EDOM                    Domain error
  31      34   ERANGE                  Result too large
  32      36   EDEADLOCK               Resource deadlock avoided
  33      38   ENAMETOOLONG            Filename too long
  34      39   ENOLCK                  No locks available
  35      40   ENOSYS                  Function not implemented
  36      41   ENOTEMPTY               Directory not empty
  37      42   EILSEQ                  Illegal byte sequence
  38   10000   WSABASEERR              Unknown error
  39   10004   WSAEINTR                Unknown error
  40   10009   WSAEBADF                Unknown error
  41   10013   WSAEACCES               Unknown error
  42   10014   WSAEFAULT               Unknown error
  43   10022   WSAEINVAL               Unknown error
  44   10024   WSAEMFILE               Unknown error
  45   10035   WSAEWOULDBLOCK          Unknown error
  46   10036   WSAEINPROGRESS          Unknown error
  47   10037   WSAEALREADY             Unknown error
  48   10038   WSAENOTSOCK             Unknown error
  49   10039   WSAEDESTADDRREQ         Unknown error
  50   10040   WSAEMSGSIZE             Unknown error
  51   10041   WSAEPROTOTYPE           Unknown error
  52   10042   WSAENOPROTOOPT          Unknown error
  53   10043   WSAEPROTONOSUPPORT      Unknown error
  54   10044   WSAESOCKTNOSUPPORT      Unknown error
  55   10045   WSAEOPNOTSUPP           Unknown error
  56   10046   WSAEPFNOSUPPORT         Unknown error
  57   10047   WSAEAFNOSUPPORT         Unknown error
  58   10048   WSAEADDRINUSE           Unknown error
  59   10049   WSAEADDRNOTAVAIL        Unknown error
  60   10050   WSAENETDOWN             Unknown error
  61   10051   WSAENETUNREACH          Unknown error
  62   10052   WSAENETRESET            Unknown error
  63   10053   WSAECONNABORTED         Unknown error
  64   10054   WSAECONNRESET           Unknown error
  65   10055   WSAENOBUFS              Unknown error
  66   10056   WSAEISCONN              Unknown error
  67   10057   WSAENOTCONN             Unknown error
  68   10058   WSAESHUTDOWN            Unknown error
  69   10059   WSAETOOMANYREFS         Unknown error
  70   10060   WSAETIMEDOUT            Unknown error
  71   10061   WSAECONNREFUSED         Unknown error
  72   10062   WSAELOOP                Unknown error
  73   10063   WSAENAMETOOLONG         Unknown error
  74   10064   WSAEHOSTDOWN            Unknown error
  75   10065   WSAEHOSTUNREACH         Unknown error
  76   10066   WSAENOTEMPTY            Unknown error
  77   10067   WSAEPROCLIM             Unknown error
  78   10068   WSAEUSERS               Unknown error
  79   10069   WSAEDQUOT               Unknown error
  80   10070   WSAESTALE               Unknown error
  81   10071   WSAEREMOTE              Unknown error
  82   10091   WSASYSNOTREADY          Unknown error
  83   10092   WSAVERNOTSUPPORTED      Unknown error
  84   10093   WSANOTINITIALISED       Unknown error
  85   10101   WSAEDISCON              Unknown error

In the above Python command line, you can of course skip the "&& gvim out2" part. It is just there to automatically open the output file in gVim (text editor) after the utility runs.

The above output was from running it with Python 2.
The utility is written to also work with Python 3.
To change the command line to use Python 3, just change 2 to 3 everywhere in the above Python command :)
(You need to install or already have py, the Python Launcher for Windows, for the py command to work. If you don't have it, or are not on Windows, use python instead of py -2 or py -3 in the above python command line - after having set your OS PATH to point to Python 2 or Python 3 as wanted.)

The only differences in the output are the version message (2.x vs 3.x), and the number of error codes - 86 in Python 2 vs. 101 in Python 3.
Unix people will recognize many of the messages (EACCES, ENOENT, EBADF, etc.) as being familiar ones that you get while programming on Unix.
The error names starting with WSA are Windows Sockets (Winsock) error codes, so they are Windows-specific. Not sure how to get the messages for those, need to look it up. (os.strerror() currently returns "Unknown error" for them.)
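A note on portability: os.errno is just the standard errno module as imported by os, and Python 3.7 and later no longer expose it under that name. Here is a minimal sketch of the same listing that imports errno directly, so it should run on both older and newer versions. The helper name errno_rows is made up for illustration:

```python
from __future__ import print_function
import errno
import os

def errno_rows():
    """Return (code, name, message) tuples, sorted by error code."""
    return [(code, errno.errorcode[code], os.strerror(code))
            for code in sorted(errno.errorcode)]

def main():
    fmt = "{0:>4}{1:>8}   {2:<20}    {3:<}"
    print("Number of error codes:", len(errno.errorcode))
    print(fmt.format("Idx", "Code", "Name", "Message"))
    for idx, (code, name, msg) in enumerate(errno_rows()):
        print(fmt.format(idx, code, name, msg))

if __name__ == '__main__':
    main()
```

The number of codes and the message texts vary by platform and Python version.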

The above Python utility was inspired by an earlier auxiliary utility I wrote, called showsyserr.c, as part of my IBM developerWorks article, Developing a Linux command-line utility (not the main utility described in the article). Following (recursively) the link in the previous sentence will lead you to the code for both the auxiliary and the main utility, as well as the PDF version of the article.

Enjoy.

- Vasudev Ram - Online Python training and consulting




Sunday, January 8, 2017

A Unix seq-like utility in Python

By Vasudev Ram


Due to a chain (or sequence - pun intended :) of thoughts, I got the idea of writing a simple version of the Unix seq utility (command-line) in Python. (Some Unix versions have a similar command called jot.)

Note: I wrote this program just for fun. As the seq Wikipedia page says, modern versions of bash can do the work of seq. But this program may still be useful on Windows - not sure if the CMD shell has seq-like functionality or not. PowerShell probably has it, is my guess.

The seq command lets you specify one, two or three numbers as command-line arguments (some of which are optional): the start, stop and step values. It outputs all numbers in that range, with that step between them (the default step is 1). I have not tried to exactly emulate seq; instead, I've written my own version. One difference is that mine does not support the step argument (so the step can only be 1), at least in this version. That can be added later. Another is that I print the numbers with spaces between them, not newlines. Another is that I don't support floating-point numbers in this version (again, this can be added).

The seq command has more uses than the above description might suggest (in fact, it is mainly used for other things than just printing a sequence of numbers - after all, who would have a need to do that much). Here is one example, on Unix (from the Wikipedia article about seq):
# Remove file1 through file17:
for n in `seq 17`
do
    rm file$n
done
Note that those are backquotes or grave accents around seq 17 in the above code snippet. It uses sh / bash syntax, so requires one of them, or a compatible shell.

Here is the code for seq1.py:
'''
seq1.py
Purpose: To act somewhat like the Unix seq command.
Author: Vasudev Ram
Copyright 2017 Vasudev Ram
Web site: https://vasudevram.github.io
Blog: https://jugad2.blogspot.com
Product store: https://gumroad.com/vasudevram
'''

import sys

def main():
    sa, lsa = sys.argv, len(sys.argv)
    if lsa < 2:
        sys.exit(1)
    try:
        start = 1
        if lsa == 2:
            end = int(sa[1])
        elif lsa == 3:
            start = int(sa[1])
            end = int(sa[2])
        else: # lsa > 3
            sys.exit(1)
    except ValueError as ve:
        sys.exit(1)

    for num in xrange(start, end + 1):
        print num, 
    sys.exit(0)
    
if __name__ == '__main__':
    main()
And here are a few runs of seq1.py, and the output of each run, below:
$ py -2 seq1.py

$ py -2 seq1.py 1
1

$ py -2 seq1.py 2
1 2

$ py -2 seq1.py 3
1 2 3

$ py -2 seq1.py 1 1
1

$ py -2 seq1.py 1 2
1 2

$ py -2 seq1.py 1 3
1 2 3

$ py -2 seq1.py 4
1 2 3 4

$ py -2 seq1.py 1 4
1 2 3 4

$ py -2 seq1.py 2 2
2

$ py -2 seq1.py 5 3

$ py -2 seq1.py -6 -2
-6 -5 -4 -3 -2

$ py -2 seq1.py -4 -0
-4 -3 -2 -1 0

$ py -2 seq1.py -5 5
-5 -4 -3 -2 -1 0 1 2 3 4 5

There are many other possible uses for seq, if one uses one's imagination, such as rapidly generating various filenames or directory names, with numbers in them (as a prefix, suffix or in the middle), for testing or other purposes, etc.
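As for the step argument that seq1.py leaves out, here is a minimal sketch of how it might be added, along with the filename-generating use mentioned above. The function name seq and its signature are my own choice here, not a port of the real seq, and it handles integers only:

```python
from __future__ import print_function

def seq(start, end, step=1):
    """Return the integers from start to end inclusive, step apart."""
    if step == 0:
        raise ValueError("step must not be zero")
    # range() excludes its stop value, so move it one past end
    # in the direction of travel.
    stop = end + 1 if step > 0 else end - 1
    return list(range(start, stop, step))

print(" ".join(str(n) for n in seq(1, 10, 3)))      # 1 4 7 10
print(" ".join(str(n) for n in seq(5, 1, -2)))      # 5 3 1
print(["file{}.txt".format(n) for n in seq(1, 3)])
```

As in seq1.py, a start greater than the end (with a positive step) simply gives an empty result.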

- Enjoy.

- Vasudev Ram - Online Python training and consulting




Thursday, January 5, 2017

Give your Python function a {web,CLI} hug!

By Vasudev Ram



I came across this interesting Python framework called hug recently:

www.hug.rest

Hug is interesting because it allows you to create a function in Python and then expose it via both the web and the command-line. It also does some data type validation using Python 3's annotations (not shown in my example, but see the hug quickstart below). Hug is for Python 3 only, and builds upon the Falcon web framework (which is "a low-level high performance framework" for, among other things, "building other frameworks" :).

Here is the hug quickstart.

The hug site says it is "compiled with Cython" for better performance. It makes some claims about being one of the faster Python frameworks. Haven't checked that out.

Here is an HN thread about hug from about a year ago, with some interesting comments, in which the author of hug also participated, replying to questions by readers, explaining some of his claims, etc. Some benchmark results for hug vs. other tools are also linked to in that thread.

I tried out some of the features of hug, using Python 3.5.2, with a small program I wrote.

Below is the test program I wrote, hug_pdp.py. The pdp in the filename stands for psutil disk partitions, because it uses the psutil disk_partitions() function that I blogged about here recently:

Using psutil to get disk partition information with Python

Here is hug_pdp.py:
"""
hug_pdp.py
Use hug with psutil to show disk partition info 
via Python, CLI or Web interfaces.
Copyright 2017 Vasudev Ram
Web site: https://vasudevram.github.io
Blog: http://jugad2.blogspot.com
Product store: https://gumroad.com/vasudevram
"""

import sys
import psutil
import hug

def get_disk_partition_data():
    dps = psutil.disk_partitions()
    fmt_str = "{:<8} {:<7} {:<7}"

    result = {}
    result['header'] = fmt_str.format("Drive", "Type", "Opts")
    result['detail'] = {}
    for i in (0, 2):
        dp = dps[i]
        result['detail'][str(i)] = fmt_str.format(dp.device, dp.fstype, dp.opts)
    return result

@hug.cli()
@hug.get(examples='drives=0,1')
@hug.local()
def pdp():
    """Get disk partition data"""
    result = get_disk_partition_data()
    return result

@hug.cli()
@hug.get(examples='')
@hug.local()
def pyver():
    """Get Python version"""
    pyver = sys.version[:6]
    return pyver

if __name__ == '__main__':
    pdp.interface.cli()
Note the use of hug decorators in the code to enable different kinds of user interfaces and HTTP methods.

Here are some different ways of running this hug-enabled program, with their outputs:

As a regular Python command-line program, using the python command:
$ python  hug_pdp.py
{'detail': {'0': 'C:\\      NTFS    rw,fixed', '2': 'E:\\      CDFS    ro,cdrom'
}, 'header': 'Drive    Type    Opts   '}

As a command-line program, using the hug command:
$ hug -f hug_pdp.py -c pdp
{'detail': {'2': 'E:\\      CDFS    ro,cdrom', '0': 'C:\\      NTFS    rw,fixed'}, 
'header': 'Drive    Type    Opts   '}
You can see that this command gives the same output as the previous one.
But you can also run the above command with the "-c pyver" argument instead of "-c pdp", giving:
$ hug -f hug_pdp.py -c pyver
3.5.2
(I added the pyver() function to the program later, after the initial runs with just the pdp() function, to figure out how using the hug command to run the program was different from using the python command to run it. The answer can be seen from the above output (the -c argument selects which function to run), though there is another difference too, shown below (the web interface).) Next, I ran it this way:
$ hug -f hug_pdp.py
which started a web server (running on port 8000), giving this output on the console:
/#######################################################################\
          `.----``..-------..``.----.
         :/:::::--:---------:--::::://.
        .+::::----##/-/oo+:-##----:::://
        `//::-------/oosoo-------::://.       ##    ##  ##    ##    #####
          .-:------./++o/o-.------::-`   ```  ##    ##  ##    ##  ##
             `----.-./+o+:..----.     `.:///. ########  ##    ## ##
   ```        `----.-::::::------  `.-:::://. ##    ##  ##    ## ##   ####
  ://::--.``` -:``...-----...` `:--::::::-.`  ##    ##  ##   ##   ##    ##
  :/:::::::::-:-     `````      .:::::-.`     ##    ##    ####     ######
   ``.--:::::::.                .:::.`
         ``..::.                .::         EMBRACE THE APIs OF THE FUTURE
             ::-                .:-
             -::`               ::-                   VERSION 2.2.0
             `::-              -::`
              -::-`           -::-
\########################################################################/

 Copyright (C) 2016 Timothy Edmund Crosley
 Under the MIT License


Serving on port 8000...
Then I went to this URL in my browser:
http://localhost:8000/pdp
which gave me this browser output:
{"detail": {"0": "C:\\      NTFS    rw,fixed", "2": "E:\\      CDFS    ro,cdrom"}, 
"header": "Drive    Type    Opts   "}
which is basically the same as the earlier command-line interface output I got.
Next I went to this URL:
http://localhost:8000/pyver
which gave me this:
"3.5.2 "
which again is the same as the earlier corresponding command-line output of the hug command.

Of course, the output from both the web and CLI interfaces is either JSON or a dict, so in a real-life app we would have to take that output and process it further, for example to format it better for human consumption. With a JavaScript front-end that is easy to do; with the command-line mode used as is, we need to figure out a way to do it. The hug module may have some support for that.
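For the command-line case, one simple option that does not depend on hug at all is to format the dict ourselves before printing it. Here is a minimal sketch, assuming the dict has the same shape as the pdp() output above. The function name format_partition_data is made up for illustration, and the sample data is copied from that output:

```python
from __future__ import print_function

def format_partition_data(result):
    """Turn the dict returned by pdp() into printable lines."""
    lines = [result['header']]
    # The detail keys are drive indices stored as strings;
    # sort them numerically so the order is stable.
    for key in sorted(result['detail'], key=int):
        lines.append(result['detail'][key])
    return "\n".join(lines)

sample = {
    'header': 'Drive    Type    Opts   ',
    'detail': {'2': 'E:\\      CDFS    ro,cdrom',
               '0': 'C:\\      NTFS    rw,fixed'},
}
print(format_partition_data(sample))
```

This prints the header line followed by one line per drive, instead of the raw dict.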

What is also interesting is that when I run it this way:
http://localhost:8000/
I get this browser output:
{
    "404": "The API call you tried to make was not defined. Here's a definition 
of the API to help you get going :)",
    "documentation": {
        "overview": "\nhug_pdp.py\nUse hug with psutil to show disk partition 
info \nvia Python, CLI or Web interfaces.\nCopyright 2017 Vasudev Ram\nWeb site: 
https://vasudevram.github.io\nBlog: http://jugad2.blogspot.com\nProduct store: 
https://gumroad.com/vasudevram\n",
        "handlers": {
            "/pdp": {
                "GET": {
                    "usage": "Get disk partition data",
                    "examples": [
                        "http://localhost:8000/pdp?drives=0,1"
                    ],
                    "outputs": {
                        "format": "JSON (Javascript Serialized Object Notation)",
                        "content_type": "application/json"
                    }
                }
            },
            "/pyver": {
                "GET": {
                    "usage": "Get Python version",
                    "examples": [
                        "http://localhost:8000/pyver"
                    ],
                    "outputs": {
                        "format": "JSON (Javascript Serialized Object Notation)",
                        "content_type": "application/json"
                    }
                }
            }
        }
    }
}
which shows that trying to access an unsupported route gives, as output, an overview, the supported URLs/routes and HTTP methods, and documentation about how to use them and their output formats - almost none of which I had to write code for, mind.

Go give your Python code a hug!

- Vasudev Ram - Online Python training and consulting




Friday, December 30, 2016

Jal-Tarang, and a musical alarm clock in Python

By Vasudev Ram

Hi readers,

Season's greetings!

After you check out this video (Ranjana Pradhan playing the Jal Tarang in Sydney, 2006), read the rest of the post below ...



Here is a program that acts as a musical command-line alarm clock. It is an adaptation of the one I created here a while ago:

A simple alarm clock in Python (command-line)

This one plays a musical sound (using the playsound Python library) when the alarm time is reached, instead of beeping like the above one does. I had recently come across and used the playsound library in a Python course I conducted, so I thought of enhancing the earlier alarm clock app to use playsound. Playsound is a nice simple Python module with just one function, also called playsound, which can play either WAV or MP3 audio files on Windows.

The playsound library is described as "a pure Python, cross platform, single function module with no dependencies for playing sounds."

Excerpt from its PyPI page:

[ On Windows, uses windll.winmm. WAVE and MP3 have been tested and are known to work. Other file formats may work as well.

On OS X, uses AppKit.NSSound. WAVE and MP3 have been tested and are known to work. In general, anything QuickTime can play, playsound should be able to play, for OS X.

On Linux, uses ossaudiodev. I don’t have a machine with Linux, so this hasn’t been tested at all. Theoretically, it plays WAVE files. ]

Here is the code for the program, musical_alarm_clock.py:
from __future__ import print_function

'''
musical_alarm_clock.py

Author: Vasudev Ram
Copyright 2016 Vasudev Ram
Web site: https://vasudevram.github.io
Blog: http://jugad2.blogspot.com
Product store: https://gumroad.com/vasudevram

Description: A simple program to make the computer act like 
a musical alarm clock. Start it running from the command line 
with a command line argument specifying the number of minutes 
after which to give the alarm. It will wait for that long, and 
then play a musical sound a few times.
'''

import sys
import string
from playsound import playsound, PlaysoundException
from time import sleep, asctime

sa = sys.argv
lsa = len(sys.argv)
if lsa != 2:
    print("Usage: python {} duration_in_minutes.".format(sys.argv[0]))
    print("Example: python {} 10".format(sys.argv[0]))
    print("Use a value of 0 minutes for testing the alarm immediately.")
    print("The program plays a musical sound a few times after the duration is over.")
    sys.exit(1)

try:
    minutes = int(sa[1])
except ValueError:
    print("Invalid value {} for minutes.".format(sa[1]))
    print("Should be an integer >= 0.")
    sys.exit(1)

if minutes < 0:
    print("Invalid value {} for minutes.".format(minutes))
    print("Should be an integer >= 0.")
    sys.exit(1)

seconds = minutes * 60

if minutes == 1:
    unit_word = "minute"
else:
    unit_word = "minutes"

try:
    print("Current time is {}.".format(asctime()))
    if minutes > 0:
        print("Alarm set for {} {} later.".format(str(minutes), unit_word))
        sleep(seconds)
    else:
        print("Running in immediate test mode, with no delay.")
    print("Alarm time reached at {}.".format(asctime()))
    print("Wake up.")
    for i in range(5):
        playsound(r'c:\windows\media\chimes.wav')        
        #sleep(1.00)
        sleep(0.50)
        #sleep(0.25)
        #sleep(0.10)
except PlaysoundException as pe:
    print("Error: PlaysoundException: message: {}".format(pe))
    sys.exit(1)
except KeyboardInterrupt:
    print("Interrupted by user.")
    sys.exit(1)
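Since playsound may not be installed everywhere (and, per the excerpt above, is untested on Linux), one option is to fall back to the plain terminal bell. Here is a minimal sketch of that idea. The function names bell, get_player and ring are made up for illustration, and the demo at the end uses a list's append method as a stand-in player, so that it runs silently:

```python
from __future__ import print_function
import sys
from time import sleep

def bell(_sound_file=None):
    """Fallback 'player': write the ASCII BEL character to the terminal."""
    sys.stdout.write('\a')
    sys.stdout.flush()

def get_player():
    """Return playsound if it is installed, else the terminal bell."""
    try:
        from playsound import playsound
        return playsound
    except ImportError:
        return bell

def ring(player, sound_file, times=5, pause=0.5):
    """Call player(sound_file) the given number of times, pausing between."""
    for _ in range(times):
        player(sound_file)
        sleep(pause)

# Real use would be something like:
#     ring(get_player(), r'c:\windows\media\chimes.wav')
# Silent demo: record the calls instead of playing anything.
calls = []
ring(calls.append, "chimes.wav", times=3, pause=0)
print(calls)
```

Passing the player in as an argument also makes the alarm loop easy to test without any sound hardware.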



Jal Tarang image attribution

The picture above is of a Jal Tarang.

The Jal Tarang is an ancient Indian melodic percussion instrument. Brief description adapted from the Wikipedia article: It consists of a set of bowls filled with water. The bowls can be of different sizes and contain different amounts of water. The instrument is tuned by adjusting the amount of water in each bowl. The music is played by striking the bowls with two sticks.

I've watched live jal-tarang performances only a few times as a kid (it was probably somewhat uncommon even then, and there are very few people who play it nowadays), so it was interesting to see this video and read the article.

Enjoy.

- Vasudev Ram - Online Python training and consulting




Saturday, December 3, 2016

Simple directory lister with multiple wildcard arguments

By Vasudev Ram

$ python file_glob.py f[!2-5]*xt

I was browsing the Python standard library (got to use those batteries!) and thought of writing this simple utility - a command-line directory lister that supports multiple wildcard filename arguments, like *, ?, etc. - as the OS shells bash on Unix and CMD on Windows do. It uses the glob function from the glob module. The Python documentation for glob says:

[ No tilde expansion is done, but *, ?, and character ranges expressed with [] will be correctly matched. This is done by using the os.listdir() and fnmatch.fnmatch() functions in concert. ]

Note: no environment variable expansion is done either, but see os.path.expanduser() and os.path.expandvars() in the stdlib.
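Here is a minimal sketch of combining those two functions with glob(), so that patterns containing ~ or environment variables also work. The helper name expand_and_glob is made up for illustration:

```python
from __future__ import print_function
import glob
import os

def expand_and_glob(pattern):
    """Expand ~ and environment variables, then apply glob()."""
    expanded = os.path.expandvars(os.path.expanduser(pattern))
    return sorted(glob.glob(expanded))

# Example: list the .txt files in the user's home directory.
for filename in expand_and_glob(os.path.join("~", "*.txt")):
    print(filename)
```

The results are sorted only to make the output order predictable; glob() itself does not promise any particular order.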

I actually wrote this program just to try out glob (and fnmatch before it), not to make a production or even throwaway tool, but it turns out that, due to the functionality of glob(), even this simple program is somewhat useful, as the examples of its use below show, particularly with multiple arguments, etc.

Of course, this is not a full-fledged directory lister like DIR (Windows) or ls (Unix), but many of those features can be implemented in this or similar tools by using os.stat() and the stat module (which I've used in my PySiteCreator and other programs). Python's os.stat() must be a wrapper over the C library function of the same name, at least on Unix. (AFAIK the native Windows SDK's directory and file system manipulation functions are different from the POSIX ones, though the C standard library on Windows has many C stdio functions for compatibility and convenience.)
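To give a rough idea of what such an extension might look like, here is a minimal sketch of an "ls -l"-like listing built on glob, os.lstat() and the stat module. The function name long_listing and the column layout are my assumptions; os.lstat() is used rather than os.stat() so that a broken symbolic link does not raise an error:

```python
from __future__ import print_function
import glob
import os
import stat
import time

def long_listing(pattern):
    """Return one formatted line per path matching pattern."""
    lines = []
    for path in sorted(glob.glob(pattern)):
        st = os.lstat(path)  # does not follow symbolic links
        kind = "d" if stat.S_ISDIR(st.st_mode) else "-"
        mtime = time.strftime("%Y-%m-%d %H:%M",
                              time.localtime(st.st_mtime))
        lines.append("{} {:>10} {} {}".format(
            kind, st.st_size, mtime, path))
    return lines

for line in long_listing("*"):
    print(line)
```

Each line shows a directory flag, the size in bytes, the modification time and the name.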

Here is the code for the program, file_glob.py:
from __future__ import print_function
'''
file_glob.py
Lists filenames matching one or more wildcard patterns.
Author: Vasudev Ram
Copyright 2016 Vasudev Ram
Web site: https://vasudevram.github.io
Blog: http://jugad2.blogspot.com
Product store: https://gumroad.com/vasudevram
'''

import sys
import glob

sa = sys.argv
lsa = len(sys.argv)

if lsa < 2:
    print("{}: Must give one or more filename wildcard arguments.".
        format(sa[0]))
    sys.exit(1)

for arg in sa[1:]:
    print("Files matching pattern {}:".format(arg))
    for filename in glob.glob(arg):
        print(filename)

I ran it multiple times with these files in my current directory. All of them are regular files except dir1 which is a directory.
$ dir /b
dir1
f1.txt
f2.txt
f3.txt
f4.txt
f5.txt
f6.txt
f7.txt
f8.txt
f9.txt
file_glob.py
o1.txt
out1
out3
test_fnmatch1.py
test_fnmatch2.py
test_glob.py~
Here are a few different runs of the program with 0, 1 or 2 arguments, giving different outputs based on the patterns used.
$ python file_glob.py
Must give one or more filename wildcard arguments.
$ python file_glob.py a
Files matching pattern a:

$ python file_glob.py *1
Files matching pattern *1:
dir1
out1
$ python file_glob.py *txt
Files matching pattern *txt:
f1.txt
f2.txt
f3.txt
f4.txt
f5.txt
f6.txt
f7.txt
f8.txt
f9.txt
o1.txt
$ python file_glob.py *txt *1
Files matching pattern *txt:
f1.txt
f2.txt
f3.txt
f4.txt
f5.txt
f6.txt
f7.txt
f8.txt
f9.txt
o1.txt
Files matching pattern *1:
dir1
out1
$ python file_glob.py *txt *py
Files matching pattern *txt:
f1.txt
f2.txt
f3.txt
f4.txt
f5.txt
f6.txt
f7.txt
f8.txt
f9.txt
o1.txt
Files matching pattern *py:
file_glob.py
test_fnmatch1.py
test_fnmatch2.py
$ python file_glob.py f[2-5]*xt
Files matching pattern f[2-5]*xt:
f2.txt
f3.txt
f4.txt
f5.txt
$ python file_glob.py f[!2-5]*xt
Files matching pattern f[!2-5]*xt:
f1.txt
f6.txt
f7.txt
f8.txt
f9.txt
$ python file_glob.py *mat*
Files matching pattern *mat*:
test_fnmatch1.py
test_fnmatch2.py
$ python file_glob.py *e* *[5-8]*
Files matching pattern *e*:
file_glob.py
test_fnmatch1.py
test_fnmatch2.py
test_glob.py~
Files matching pattern *[5-8]*:
f5.txt
f6.txt
f7.txt
f8.txt
$ python file_glob.py *[1-4]*
Files matching pattern *[1-4]*:
dir1
f1.txt
f2.txt
f3.txt
f4.txt
o1.txt
out1
out3
test_fnmatch1.py
test_fnmatch2.py
$ python file_glob.py a *txt b
Files matching pattern a:
Files matching pattern *txt:
f1.txt
f2.txt
f3.txt
f4.txt
f5.txt
f6.txt
f7.txt
f8.txt
f9.txt
o1.txt
Files matching pattern b:
As you can see from the runs, it works, including for character ranges in the patterns, and for the negation of those ranges too (using the ! character inside the square brackets, before the range).
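
The range and negation behaviour can also be checked without touching the file system, since glob relies on fnmatch for the matching. The sketch below runs fnmatch.filter() over a hard-coded list of names copied from the directory listing above; the NAMES list and the select helper are only for illustration.

```python
import fnmatch

# Some of the names from the directory listing shown earlier in the post.
NAMES = ["dir1", "f1.txt", "f2.txt", "f3.txt", "f4.txt", "f5.txt",
         "f6.txt", "f7.txt", "f8.txt", "f9.txt", "file_glob.py",
         "o1.txt", "out1", "out3"]


def select(pattern, names=NAMES):
    """Return the names that match the shell-style wildcard pattern."""
    return fnmatch.filter(names, pattern)


print(select("f[2-5]*xt"))   # names whose second character is 2 to 5
print(select("f[!2-5]*xt"))  # names whose second character is not 2 to 5
```

One difference to keep in mind: glob.glob() skips names starting with a dot unless the pattern itself starts with one, while plain fnmatch has no such special case.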

Enjoy.

- Vasudev Ram - Online Python training and consulting

Get updates on my software products / ebooks / courses.

Jump to posts: Python   DLang   xtopdf

Subscribe to my blog by email

My ActiveState recipes

Managed WordPress Hosting by FlyWheel


Friday, October 14, 2016

Command line D utility - find files matching a pattern under a directory

By Vasudev Ram

Hi readers,

I wrote this utility in D recently, to find files matching a pattern (a wildcard) under a specified directory. Here is the program, find_files.d:
/************************************************************************
find_files.d
Author: Vasudev Ram
Compile with Digital Mars D compiler using the command:
dmd find_files.d
Usage: find_files dir patt
Finds files under directory 'dir' that match file wildcard 
pattern 'patt'.
The usual file wildcard patterns that an OS like Windows or 
Linux supports, such as *.txt, budget*.xls, my_files*, etc.,
are supported.
Copyright 2016 Vasudev Ram
Web site: https://vasudevram.github.io
Blog: http://jugad2.blogspot.com
Product updates: https://gumroad.com/vasudevram/follow
************************************************************************/

import std.stdio; 
import std.file;

string __version__ = "0.1";

void usage(string[] args) {
    debug {
        stderr.write("You gave the command: ");
        foreach(arg; args)
            stderr.write(arg, " ");
        stderr.writeln;
    }
    
    stderr.writeln(args[0], " (Find Files) v", __version__);
    stderr.writeln("Copyright 2016 Vasudev Ram" ~
    " - https://vasudevram.github.io");
    stderr.writeln("Usage: ", args[0], " dir pattern");
    stderr.writeln("Recursively find filenames, under directory 'dir', ");
    stderr.writeln("that match the literal or wildcard string 'pattern'.");
}

int main(string[] args) {

    // Check for and get correct command-line arguments.
    if (args.length != 3) {
        usage(args);
        return 1;
    }
    auto top_dir = args[1];
    if (!exists(top_dir) || !isDir(top_dir)) {
        stderr.writeln("The name ", top_dir, 
        " does not exist or is not a directory.");
        usage(args);
        return 1;
    }
    auto pattern = args[2];

    try {
        debug writeln(args[0], ": Finding files under directory: ", 
        top_dir, " that match pattern: ", pattern);
        foreach (de; dirEntries(top_dir, pattern, SpanMode.breadth)) {
            writeln(de.name);
        }
    }
    catch (Exception e) {
        stderr.writeln("Caught Exception: msg = ", e.msg);
        return 1;
    }
    return 0;
}
Compile command:
$ dmd -offf find_files.d
The -of flag stands for "output file" and is used to specify the .EXE file to be created. This command creates ff.exe on Windows.

Compile command to create debug version (the statements prefixed 'debug' will also run):
$ dmd -debug -offf find_files.d

Here is the output from a few test runs of the find_files utility:

Run with no arguments - shows usage:
$ ff
ff (Find Files) v0.1
Copyright 2016 Vasudev Ram - https://vasudevram.github.io
Usage: ff dir pattern
Recursively find filenames, under directory 'dir',
that match the literal or wildcard string 'pattern'.
Run with one argument, a non-existent directory - shows usage, since the number of arguments is invalid:
$ ff z
ff (Find Files) v0.1
Copyright 2016 Vasudev Ram - https://vasudevram.github.io
Usage: ff dir pattern
Recursively find filenames, under directory 'dir',
that match the literal or wildcard string 'pattern'.
Run with two arguments, a non-existent directory and a file wildcard, * - gives message that the directory does not exist, and shows usage:
$ ff z *
The name z does not exist or is not a directory.
ff (Find Files) v0.1
Copyright 2016 Vasudev Ram - https://vasudevram.github.io
Usage: ff dir pattern
Recursively find filenames, under directory 'dir',
that match the literal or wildcard string 'pattern'.
Run with two arguments, an existing directory and a file wildcard, * - gives the output of the files matching the wildcard under that directory (recursively):
$ ff . *
.\dir1
.\dir1\a.txt
.\dir1\dir11
.\dir1\dir11\b.txt
.\dir2
.\dir2\c.txt
.\dir2\dir21
.\dir2\dir21\d.txt
.\ff.exe
.\ff.obj
.\ffd.exe
.\find_files.d
.\find_files.d~
Note: It will find hidden files too, i.e. those with the H attribute, as shown by the DOS command DIR /A. If you find any issue with the utility, please describe it, with the error message, in the comments.
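
For readers who know Python better than D, here is a rough Python analogue of the core loop, using os.walk() and fnmatch. It is only a sketch for comparison, not a port of find_files.d: the function name is reused for convenience, the pattern is matched against the base name of each entry, and the order of the results may differ from what the D program prints.

```python
import fnmatch
import os


def find_files(top_dir, pattern):
    """Yield paths under top_dir whose base name matches pattern.

    Both directories and files are reported, as in the sample run above.
    """
    for dirpath, dirnames, filenames in os.walk(top_dir):
        for name in sorted(dirnames + filenames):
            if fnmatch.fnmatch(name, pattern):
                yield os.path.join(dirpath, name)
```

Something like "for path in find_files('.', '*.txt'): print(path)" would then list the matching files under the current directory.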

You can get the find_files utility (as source code) from here on my Gumroad product store:

https://gum.co/find_files

Compile the find_files.d program with the Digital Mars D compiler using the command:

dmd find_files.d

which will create the executable file find_files.exe.

Run:

find_files

to get help on the usage of the utility. The examples above in this post have more details.

- Enjoy.

Drawing of magnifying glass at top of post, by:

Yours truly.

- Vasudev Ram - Online Python training and consulting




Saturday, October 1, 2016

min_fgrep: minimal fgrep command in D

By Vasudev Ram

min_fgrep pattern < file

The Unix fgrep command finds fixed-string patterns (i.e. not regular expression patterns) in its input and prints the lines containing them. It is part of the grep family of Unix commands, which also includes grep itself, and egrep.

All three of grep, fgrep and egrep originally had slightly different behaviors (with respect to the number and types of patterns supported). According to the Open Group link in the previous paragraph, fgrep is now marked as LEGACY. That is because its functionality is available in grep itself via the -F option, and likewise egrep's via grep -E.

Here is a minimal fgrep-like program written in D. It does not support reading from a filename given as a command-line argument, in this initial version; it only supports reading from stdin (standard input), which means you have to either redirect its standard input to come from a file, or pipe the output of another command to it.
/*
File: min_fgrep.d
Purpose: A minimal fgrep-like command in D.
Author: Vasudev Ram
Copyright 2016 Vasudev Ram
Web site: https://vasudevram.github.io
Blog: http://jugad2.blogspot.com
Product store: https://gumroad.com/vasudevram
*/

import std.stdio;
import std.algorithm.searching;

void usage(string[] args) {
    stderr.writeln("Usage: ", args[0], " pattern");
    stderr.writeln("Prints lines from standard input that contain pattern.");
}

int main(string[] args) {
    if (args.length != 2) {
        stderr.writeln("No pattern given.");
        usage(args);
        return 1;
    }

    string pattern = args[1];
    try {
        foreach(line; stdin.byLine)
        {
            if (canFind(line, pattern)) {
                writeln(line);
            }
        }
    } catch (Exception e) {
        stderr.writeln("Caught an Exception. ", e.msg);
        // Report failure to the caller instead of falling through to return 0.
        return 1;
    }
    return 0;
}
Compile with:
$ dmd min_fgrep.d
I ran it with its standard input redirected to come from its own source file:
$ min_fgrep < min_fgrep.d string
and got the output:
void usage(string[] args) {
int main(string[] args) {
    string pattern = args[1];
(Bold style added by me in the post, not part of the min_fgrep output, unlike with some greps.)

Running it as part of a pipeline:

$ type min_fgrep.d | min_fgrep string > c
gave the same output.

The interesting thing here is not the min_fgrep program itself - that is very simple, as can be seen from the code. What is interesting is that the canFind function is not specific to type string; instead it is from the std.algorithm.searching module of the D standard library (Phobos), and it is generalized - i.e. it works on D ranges, which are a rather cool and powerful feature of D. This is done using D's template metaprogramming / generic programming features.

And since the version of the function specialized to the string type is generated at compile time, there is no run-time overhead due to genericity (I need to verify the exact details, but I think that is what happens).
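
For comparison, here is a minimal sketch of the same fixed-string filter in Python. The function name fgrep_lines is my own choice for illustration. It accepts any iterable of lines (a list, an open file, sys.stdin), which is a loose, run-time counterpart of the range-based genericity described above rather than an equivalent of D's compile-time templates.

```python
import sys


def fgrep_lines(pattern, lines):
    """Yield each line that contains pattern as a plain substring.

    No regular expression handling is done, so characters such as
    "." and "*" in pattern are matched literally.
    """
    for line in lines:
        if pattern in line:
            yield line


def main(argv):
    """Filter standard input, printing the lines that contain argv[1]."""
    if len(argv) != 2:
        sys.stderr.write("Usage: {} pattern\n".format(argv[0]))
        return 1
    for line in fgrep_lines(argv[1], sys.stdin):
        sys.stdout.write(line)
    return 0
```

To use it as a command, the file would end with a call like sys.exit(main(sys.argv)), and it would then be run with its standard input redirected or piped, just like min_fgrep.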

- Vasudev Ram - Online Python training and consulting
