Monday, November 16, 2015

A pure shell way to extract a numbered line from a file

I feel almost ashamed to write this down, but since it took me a long time to figure it out, I'll write it down anyway. Here's the simplest portable shell one-liner I've found to extract only, say, the 5th line from a file:

~ cat file | tail -n +5 | head -n1
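To see what each stage does, here is a small sketch on a made-up seven-line input (the printf producer is only a stand-in for cat file, and the variable name fifth is mine):

```shell
# printf reuses its format for every argument, so this emits "line 1" to "line 7".
# tail -n +5 starts its output at line 5 (so lines 5, 6 and 7 get through),
# and head -n1 keeps only the first of those.
fifth=$(printf 'line %d\n' 1 2 3 4 5 6 7 | tail -n +5 | head -n1)
echo "$fifth"    # prints "line 5"
```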

Hope it helps...

Update: following the comments on this post, here are a couple of other solutions. Thanks to all who contributed!

~ cat file | sed -ne 5p
~ cat file | sed -ne '5{p;q}' 

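The second solution has the advantage that sed quits, and therefore closes its input, right after line 5. So if the data comes from an expensive command rather than from cat, that command gets killed by a SIGPIPE soon after it produces line 5 (as soon as it tries to write to the closed pipe again). Other ones: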

~ cat file | awk 'NR==5 {print; exit}'
~ cat file | head -n5 | tail -n1 

The last one, while simpler, is slightly more expensive because all the lines before the one you're interested in are copied twice (first from cat to head and then from head to tail). This matters less with the first solution: tail drops the first four lines itself, and although it would go on to pass along every line after the one you're interested in, it is killed by a SIGPIPE once head has exited.
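For what it's worth, the sed variant wraps up nicely into a small function. Take this as a sketch only: nth_line is a name I made up, the extra semicolon before the closing brace is there to be on the safe side with sed implementations other than GNU's (an assumption on my part), and the demonstration assumes the yes command is available on your system.

```shell
# nth_line N: print only line N of standard input, then stop reading.
# (nth_line is a made-up helper name, not a standard command.)
nth_line() {
    # Accept only a positive integer, so the argument can't be read as sed code.
    case $1 in
        ''|0*|*[!0-9]*) echo "usage: nth_line N" >&2; return 2 ;;
    esac
    sed -n "${1}{p;q;}"
}

# yes never stops by itself: the pipeline only ends because sed quits after
# line 5, and yes then dies on its next write to the closed pipe.
fifth=$(yes 'some line' | nth_line 5)
echo "$fifth"    # prints "some line"
```

The same function works on the output of any command, for instance some_command | nth_line 5, where some_command stands for whatever produces your data.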

10 comments:

olasd said...

"sed -ne 5p" works pretty well too

Vincent Fourmond said...

Mmmm. I knew there was an easy sed way too. Just a question, though: does sed close the input after line 5? If you are generating a huge amount of data but only need the beginning, the tail | head trick would kill the original process with SIGPIPE relatively early.

olasd said...

Ah, that's a fair point, sed reads the whole input file unconditionally.

camh said...

sed -ne '5{p;q}'

That will have sed exit after printing the 5th line.

awk 'NR==5 {print; exit}'

does the same with awk

Anonymous said...

I prefer this version

$ cat file | head -n5 | tail -n1

Bruno BEAUFILS said...

If the line to be extracted is near the end, tail will read a lot of lines. IMHO using cat is not a good idea either, since it spawns a useless process.

Is something like this not more efficient?

head -5 < file | tail -1

Anyhow, the sed way is the cleanest from my point of view.

Vincent Fourmond said...

Now that's an interesting series of command lines, thanks! @Bruno, the cat is just here as a demonstration. Mostly I'm using this to get the nth line of the output of some command, and I find it more readable (i.e. left to right) than the redirection...

josch said...

Even with cat just being there for demonstration purposes, the more people see it being used this way, the more they will use it in their real scripts. You can search for "useless uses of cat" to get some rationales.

If you want to read from left to right, then consider these two variations:

$ head -n5 file | tail -n1

or

$ < file head -n5 | tail -n1

Both preserve the left-to-right reading: the first relies on head accepting one or more files as arguments, the second on the redirection operator also being allowed in front of the command.

Erik J said...

Using cat is a matter of preference; there is nothing inherently wrong with either version. Most of the time cat needs to be replaced with some other command like gzip/curl/pv, or something that doesn't take input from standard in.


There are absolutely no good arguments against using cat.

Anonymous said...

Putting a redirection on the far left of a command (<foo bar |...) breaks in some shell contexts (such as inside a while, etc.).