python-awk

pawk is a Python-based replacement for awk.

It uses Python for line-by-line processing of files.

Examples:

#pawk automatically reads lines as csv rows and stores the result as a list in "r"
#-g ("grep") keeps a subset of lines satisfying a given condition

#Selects lines from input.txt with at least 3 csv fields
> pawk -f input.txt -g 'len(r) > 2'

#Keep a subset of lines where the second csv field is non-empty
> pawk -f input.txt -g 'r[1]'

#The above may crash if some lines have only one csv field
#Use this instead:
> pawk -f input.txt -g 'len(r) > 1 and r[1]'

#The raw line is stored in the "l" variable
#Keep a subset of lines where l isn't empty and the first character is "a"
> pawk -f input.txt -g 'l != "" and l[0] == "a"'
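
The -g examples above can be pictured as a plain Python loop. The following is only a minimal sketch of that idea, not pawk's actual implementation: the function name grep_lines is hypothetical, and it assumes each line is parsed with the standard csv module and the condition is evaluated with "l" and "r" in scope.

```python
import csv

def grep_lines(lines, condition, delimiter=","):
    """Keep the lines for which `condition` is truthy (hypothetical helper)."""
    kept = []
    for raw in lines:
        l = raw.rstrip("\n")                                # raw line, as in "l"
        r = next(csv.reader([l], delimiter=delimiter), [])  # csv fields, as in "r"
        # Assumption: the condition is a Python expression that sees l and r.
        if eval(condition, {"l": l, "r": r}):
            kept.append(l)
    return kept

sample = ["a,b,c\n", "x\n", "a,,z\n", "\n"]
print(grep_lines(sample, "len(r) > 1 and r[1]"))
```

Note how the guarded condition 'len(r) > 1 and r[1]' never indexes past the end of a short row, which is why it is the safer form.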

#Run certain code for each input line using -p
#Using -p prevents the default printing of the line

#For each line of the input, print that line with whitespace stripped
> pawk -f input.txt -p 'print(l.strip())'

#The default value of -f is /dev/stdin, so input can be piped in
> less input.txt | pawk -p 'print(len(r))'

#-d sets the input delimiter
#the output delimiter is ",", so this command converts a tsv to a csv
> pawk -f input.txt -d '\t'
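
For comparison, the same tsv-to-csv conversion can be written with Python's standard csv module. This is a sketch of the equivalent behaviour, not pawk's own code: the function name tsv_to_csv is hypothetical, and it assumes fields containing a comma should be quoted on output, which pawk itself may or may not do.

```python
import csv
import io

def tsv_to_csv(text):
    """Convert tab-separated text to comma-separated text (hypothetical helper)."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    out = io.StringIO()
    # QUOTE_MINIMAL (the default) quotes only fields that contain the delimiter.
    writer = csv.writer(out, delimiter=",", lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()

print(tsv_to_csv("name\tcity\nAda\tLondon\n"), end="")
```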

#pawk stores the line number (zero-indexed) in the "i" variable
#only keep lines starting with the 1133rd
> pawk -f input.txt -g 'i>=1132'

#replace matches of a regular expression in each line (python re module imported by default)
> pawk -f input.txt -p 'print(re.sub("U_C_Rate","firearm_rate",l))'

#-b runs code before any lines are processed
#-e runs code after all lines are processed
#Add up a list of floats (one per input line)
> pawk -f input.txt -b "cnt=0" -p "cnt += float(l)" -e "print(cnt)"
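
The before / per-line / after pattern can be sketched in plain Python as three exec calls that share one namespace. This is an illustration under stated assumptions, not pawk's real code: the function name run_pawk_like and its parameters are hypothetical, and it assumes "i", "l" and "r" are refreshed for every line.

```python
import csv
import re

def run_pawk_like(lines, before="", per_line="", after="", delimiter=","):
    """Run `before` once, `per_line` for each line, then `after` (hypothetical helper)."""
    env = {"re": re}                  # the text says re is imported by default
    exec(before, env)                 # like -b
    for i, raw in enumerate(lines):   # i is the zero-indexed line number
        l = raw.rstrip("\n")
        r = next(csv.reader([l], delimiter=delimiter), [])
        env.update(i=i, l=l, r=r)
        exec(per_line, env)           # like -p
    exec(after, env)                  # like -e
    return env                        # shared namespace, handy for inspection

env = run_pawk_like(["1.5\n", "2\n", "0.5\n"],
                    before="cnt=0",
                    per_line="cnt += float(l)",
                    after="print(cnt)")
```

Because all three snippets run in the same namespace, a variable set in the "before" code (here cnt) is still visible in the per-line and "after" code.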

Writing multi-line python in pawk:
Heavily inspired by a source I can't find right now, pawk can process strings representing multi-line python.

Examples:
#a semicolon, or a colon followed by whitespace, causes a line break
'import random; print(random.random())'
-->
import random;
print(random.random())

#after lines with (colon+whitespace) successive lines are automatically indented:
'if i>3: print("hello world!"); a += 1; b = 0'
-->
if i>3:
    print("hello world!");
    a += 1;
    b = 0

#use the 'end;' keyword to force indent level to decrease (compare this example with the above)
'if i>3: print("hello world!"); end; a += 1; b = 0'
-->
if i>3:
    print("hello world!");
a += 1;
b = 0

#"elif:", "else:" and "except:" automatically cause indenting to decrease
'if i>3: print("a"); elif i>1: print("b"); else: print("c")'
-->
if i>3:
    print("a");
elif i>1:
    print("b");
else:
    print("c")

#you can define functions!
'def test123(): print("hello world!"); end; test123(); test123(); test123();'
-->
def test123():
    print("hello world!");
test123();
test123();
test123();
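
The expansion rules described above can be sketched in a few lines of Python. This is a naive illustration, not pawk's actual parser: the function name expand and the four-space indent are assumptions, and the sketch splits on every semicolon and every colon followed by whitespace, so it would mis-handle string literals, dict literals or slices that contain those characters.

```python
import re

# Keywords the text above says reduce the indent level automatically.
DEDENT_KEYWORDS = ("elif", "else", "except")

def expand(one_liner, indent="    "):
    """Expand a pawk-style one-liner into multi-line Python (hypothetical helper)."""
    text = re.sub(r";\s*", ";\n", one_liner)   # break after each semicolon
    text = re.sub(r":[ \t]+", ":\n", text)     # break after colon + whitespace
    level = 0
    out = []
    for piece in text.split("\n"):
        stmt = piece.strip()
        if not stmt:
            continue
        if stmt in ("end;", "end"):            # explicit dedent marker, not emitted
            level = max(level - 1, 0)
            continue
        word = re.match(r"[A-Za-z_]\w*", stmt)
        if word and word.group() in DEDENT_KEYWORDS:
            level = max(level - 1, 0)          # elif/else/except close the block
        out.append(indent * level + stmt)
        if stmt.endswith(":"):                 # a block header indents what follows
            level += 1
    return "\n".join(out)

print(expand('if i>3: print("a"); elif i>1: print("b"); else: print("c")'))
```

Trailing semicolons are kept, as in the examples above; they are harmless in Python, so the expanded text can be passed straight to compile() or exec().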

Release history

0.0.10: Aug 4, 2020
0.0.5: Apr 16, 2018
0.0.4: Jun 25, 2017 (this version)
0.0.3: Jun 25, 2017
0.0.2: Feb 13, 2016
0.0.1: Jan 2, 2016

Download files

Source Distribution

python-awk-0.0.4.tar.gz (5.0 kB view details)

Uploaded Jun 25, 2017 Source

File hashes

Hashes for python-awk-0.0.4.tar.gz
Algorithm	Hash digest
SHA256	`75f98c6b5940cff2fb123729ab600bf193585e40fe7dff1bfc893cf3babd746c`
MD5	`c182e03b9e1ad81aa7905025d80627f6`
BLAKE2b-256	`d8f580848afe4e0dcc9d4f597711a7978bd5c018902f3949c97df1a896fce8b3`
