Tuesday, June 4, 2013

Compare two files and show the additional lines in one that aren't in the other.

Quite often I have two sorted files and I want to "subtract" one from the other.  I've used a few different tools like perl, grep, and awk with success, but they can be slow.

In typical Linux fashion, there is a tool called join that does exactly what I need, and does it very quickly.

Here is a quick example: join -1 3 -2 1 -v 1 file1 file2 > output

That says to join on column 3 of file1 ( -1 3 ) and column 1 of file2 ( -2 1 ).  It will then show only the lines of file1 whose join column has no match in file2 ( -v 1 ).

The files must already be sorted on those columns before you run join.
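
Here is a small end-to-end sketch of the above, sorting included. The file names and their contents are made up purely for illustration, and I set LC_ALL=C on the assumption that sort and join should agree on the same ordering:

```shell
#!/bin/sh
# Hypothetical sample data; names and contents are invented for this sketch.
export LC_ALL=C                      # keep sort and join on the same collation

workdir=$(mktemp -d)                 # scratch directory for the demo files

# file1 carries the key in column 3
printf '%s\n' 'carol 41 u300' 'alice 30 u100' 'bob 25 u200' > "$workdir/file1.raw"
# file2 carries the key in column 1
printf '%s\n' 'u300 active' 'u100 active' > "$workdir/file2.raw"

# Sort each file on its own join column, ignoring leading blanks in the key
sort -k3b,3 "$workdir/file1.raw" > "$workdir/file1"
sort -k1b,1 "$workdir/file2.raw" > "$workdir/file2"

# Keep only the file1 lines whose column 3 has no match in column 1 of file2
join -1 3 -2 1 -v 1 "$workdir/file1" "$workdir/file2" > "$workdir/output"

cat "$workdir/output"
```

With this sample data the only key missing from file2 is u200, so the output is the single line "u200 bob 25". Note that join prints the join column first and then the remaining columns, so the line does not come out in exactly the same column order as it appears in file1.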
