Wednesday, 23 February 2011

Finding the second occurrence of a regexp with sed

This is something that I needed, but couldn't find. So for the benefit of the next person, here's what I did.

I have files that each contain two <<EOT heredocs, and I needed a simple script to pull out the body of the second one. The files look something like this:


#!/bin/bash

# various commands chomped for simplicity

ftp -n ftp.example.net <<EOT
user uid pwd
type image
get importantfile.tgz
quit
EOT

# lots more code

cat >scriptfile.sh <<EOD
#!/bin/sh
ftp -n ftp.example.net <<EOT
user uid pwd
get file1.sh
get file2.sh
EOT

# more code

EOD

# more code



After some searching and poking around with sed, I came up with this:


sed '1,/<<EOT/d;1,/<<EOT/d;/^EOT$/,$d' "$1" | ftp -n ftp.example.net


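The first 1,/<<EOT/d deletes everything from the top of the file through the first line containing <<EOT, and the second does the same again through the second <<EOT. After that nothing is deleted until a line consisting solely of EOT turns up, at which point /^EOT$/,$d wipes out that line and everything after it. What's left is the body of the second heredoc, which is piped straight into ftp. There's probably a simpler way to do some of this, but this is what I managed to hack together.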
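
One more conservative variant, offered as a sketch rather than a drop-in replacement: whether the second 1,/<<EOT/d ever starts once line 1 has already been deleted may differ between sed implementations, so this version runs sed twice and lets each process count from its own line 1. The function name second_heredoc is made up for illustration. It assumes the first <<EOT is not on line 1 of the file and that the two <<EOT lines are not adjacent, because a 1,/re/ range only starts looking for the regexp on line 2.

```shell
#!/bin/sh
# Sketch: print the body of the second <<EOT heredoc in a script file.
# second_heredoc is a hypothetical helper name, not part of the original script.
second_heredoc() {
    # Stage 1: delete from line 1 through the first line containing <<EOT.
    # Stage 2: in what remains, delete from its line 1 through the next
    #          <<EOT (the second one in the original file), then delete
    #          from the closing EOT line through the end of input.
    sed '1,/<<EOT/d' "$1" | sed '1,/<<EOT/d;/^EOT$/,$d'
}

# Usage, mirroring the one-liner above (not executed here):
#   second_heredoc "$1" | ftp -n ftp.example.net
```

The cost is an extra process, but each sed only relies on the plain meaning of a range that starts at its own first input line.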
