I had a bunch of CSV files I wanted to join together, and they all had the same header line at the start. I didn't want the header repeating in the concatenated result, so I needed to grab everything except the first line of each file for my concatenation process (using cp).
Turns out there are several ways to do this. Let's say you have a file a.txt containing the following:
1
2
3
4
5
And your goal is to get:
2
3
4
5
You can't use tail in the usual way, since you might not know how long the file is. But you can give -n a number prefixed with +, which tells tail to start output at that line instead of counting back from the end:
tail -n +2 a.txt
You can also use awk, telling it to print only the lines whose line number (NR) is greater than 1:
awk 'NR>1' a.txt
Or you could use sed to delete the first line before printing the rest of the file:
sed '1d' a.txt
Each approach has pros and cons, mostly around which one you can remember in the moment, or whether you already want awk or sed to make other adjustments along the way. It's handy to have options here.
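If you want to convince yourself that the three commands really are interchangeable for this job, here is a small sketch that runs each one on the example file and compares the results. The temporary directory and the .out file names are just placeholders I picked for the illustration.

```shell
# Work in a throwaway directory so nothing real gets overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the five-line example file.
printf '%s\n' 1 2 3 4 5 > a.txt

# Run each approach and save its output (file names are arbitrary).
tail -n +2 a.txt > tail.out
awk 'NR>1' a.txt > awk.out
sed '1d' a.txt > sed.out

# cmp -s prints nothing and exits 0 only when the two files are identical.
if cmp -s tail.out awk.out && cmp -s tail.out sed.out; then
    echo "all three match"
fi
```

On a file like this one the outputs should be identical, so the choice really does come down to taste and what else you need the tool to do.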
To conclude, I took my CSV files and joined them like so:
cp 1.csv all.csv
tail -n +2 2.csv >> all.csv
tail -n +2 3.csv >> all.csv
tail -n +2 4.csv >> all.csv
There are smarter ways to do this all in one go, but I only had a few files to do anyway! :-)
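For the curious, one possible "all in one go" version uses awk's two line counters. This is only a sketch, not what I actually ran: the sample CSV contents below are made up for the demo, and it assumes every file starts with the same single header line.

```shell
# Work in a throwaway directory so nothing real gets overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Made-up sample files that share the same header line.
printf 'id,name\n1,a\n' > 1.csv
printf 'id,name\n2,b\n' > 2.csv
printf 'id,name\n3,c\n' > 3.csv

# NR counts lines across all input files; FNR restarts at 1 for each file.
# FNR==1 && NR!=1 is true only for the first line of the second and later
# files, so those repeated headers are skipped and everything else is printed.
awk 'FNR==1 && NR!=1 {next} {print}' 1.csv 2.csv 3.csv > all.csv

cat all.csv
```

One thing to watch for if you swap the explicit file names for a glob like *.csv: write the output somewhere the glob won't match (a different directory or extension), otherwise the output file can end up being read as one of its own inputs.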