Remove duplicates from each cell
Since the input looks fixed-width, you can use unpack to split each line into columns. Then split each cell on commas and use uniq (from List::Util) to remove the duplicates while preserving order. Finally, write the columns back out with pack, using the same template.
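use warnings;
use strict;
use List::Util qw(uniq);

# Column widths of the fixed-width input; the last column takes the rest.
my $tmpl = 'A6A6A7A5A6A10A8A15A*';

while (<DATA>) {
    # "A" strips trailing whitespace, including the newline.
    my @cols = unpack $tmpl, $_;
    for my $c (@cols) {
        $c =~ s/^\s+//;
        # split drops trailing empty fields, so "Case1,," yields ("Case1").
        my @items = split /,/, $c;
        $c = join ',', uniq(@items);
    }
    # Re-pad every column to its original width.
    print pack($tmpl, @cols), "\n";
}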
__DATA__
Sl.no Name1 Name2  Dis  From  Type      item    Animal         Code
2     qw    wsa    12   23    car,car   Case    CAT1,CAT1,Dog  p.12>a,p.12>a
23    as    swe    34   2,2   Bus,Bus   Case1,, Dog1,Dog1,,    N.12>a,N.12>a
23    ks    awe    35   .     Bike,Bike Case1,, rat4,rat4,,    5.16>b,5.16>b
Output:
Sl.no Name1 Name2  Dis  From  Type      item    Animal         Code
2     qw    wsa    12   23    car       Case    CAT1,Dog       p.12>a
23    as    swe    34   2     Bus       Case1   Dog1           N.12>a
23    ks    awe    35   .     Bike      Case1   rat4           5.16>b
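
If the installed List::Util does not export uniq (as far as I know it was added in version 1.45), the per-cell step can be sketched with a %seen hash instead. The helper name dedupe_cell is hypothetical and not part of the answer above; it also drops empty items explicitly rather than relying on split discarding trailing empty fields.

```perl
use strict;
use warnings;

# Hypothetical helper: order-preserving de-duplication of one
# comma-separated cell, without List::Util::uniq.
sub dedupe_cell {
    my ($cell) = @_;
    my %seen;
    # length($_) skips empty items left by stray commas ("Case1,,");
    # !$seen{$_}++ is true only the first time an item is seen.
    return join ',', grep { length($_) && !$seen{$_}++ } split /,/, $cell;
}

print dedupe_cell($_), "\n" for 'car,car', 'CAT1,CAT1,Dog', 'Dog1,Dog1,,';
```

In the loop above, `$c = dedupe_cell($c);` would then stand in for the split/uniq/join lines.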
With sed, assuming the input is tab-separated (GNU sed, for \t and -E):
$ sed -E 's/\t(.*),\1/\t\1/g;s/,+\t/\t/g' file | column -ts$'\t'
Sl.no  Name1  Name2  Dis  From  Type  item   Animal    Code
2      qw     wsa    12   23    car   Case   CAT1,Dog  p.12>a
23     as     swe    34   2     Bus   Case1  Dog1      N.12>a
23     ks     awe    35   .     Bike  Case1  rat4      5.16>b
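
The sed expression relies on a backreference, so it collapses one adjacent repeat per cell, which is all the sample needs. For tab-separated input where a value may repeat more than twice or non-adjacently, a rough Perl sketch could look like the following. The sub name dedupe_row and the sample rows are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: de-duplicate every comma-separated cell
# of one tab-separated row, keeping first occurrences in order.
sub dedupe_row {
    my ($row) = @_;
    my @cells = map {
        my %seen;    # fresh per cell, so cells do not affect each other
        join ',', grep { length($_) && !$seen{$_}++ } split /,/, $_;
    } split /\t/, $row, -1;    # -1 keeps trailing empty cells
    return join "\t", @cells;
}

# Made-up rows; real input would be read line by line and chomped.
print dedupe_row($_), "\n"
    for "2\tqw\tcar,car,car\tCAT1,Dog,CAT1", "23\tas\tBus,Bus\tDog1,Dog1,,";
```

The result can be piped through column -ts$'\t' in the same way as the sed output.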