Remove diacritics (accents) only in the first parameter of awk command

I have a main shell script that reformats text from a source file into a target file:

Source:

Libelléacte;CHAR(20);Libellé de l'acte;

Target:

 * Libellé de l'acte.
   
    05 Libelléacte PIC X(20).

What I want is to remove diacritics only from the first parameter. I tried converting my file to ascii//TRANSLIT//IGNORE with the iconv command, but that removes all diacritics, which is not what I want.

This is my reformatting code:

for f in $TEMP_DIRECTORY
do
    b=$(basename "$f")
    echo "Generating $f file in copy.."
    awk -F ';' '
toupper($1)=="TABLE" {printf "01 %s.\n\n", $2; next} 
toupper($1)=="EXTRACTION" {printf "01 %s.\n\n", $2; next} 
{
  result = $2
  if ($2 ~ /^Numérique [0-9]+(\.[0-9]+)?$/) {
    nr=split($2,a,"[ .]")
    result = "PIC 9(" a[2] ")"
    if (nr == 3) {
      result = result ".v9(" a[3] ")"
    }    
  }
  sub(/CHAR/,"PIC X", result);
  sub(/Char/,"PIC X", result);
  sub(/char/,"PIC X", result);
  sub(/Entier/,"PIC 9(9)", result);
  sub(/entier/,"PIC 9(9)", result);
  gsub(/user/,"user-field");
  gsub(/User/,"user-field");
  gsub("/","_");
  printf "   * %s.\n\n     05 %s %s.\n\n", $3, $1, result;
}' "$f" > "$TARGET_DIRECTORY/${b%%.*}.cpy"
done

I need to change only the first parameter so that I get this output:

 * Libellé de l'acte.

    05 Libelleacte PIC X(20).

Solution 1:

First I would use cut to get the first parameter (the field before the first semicolon), then iconv to transliterate it into ASCII, and finally tr to remove any punctuation left behind by the transliteration, by deleting the [:punct:] POSIX character class.

cat test | cut -d \; -f 1 | iconv -f UTF-8  -t ASCII//TRANSLIT | tr -d "[:punct:]"

Solution 2:

My original answer below calls iconv once per line of input; this would be much more efficient:

$ iconv -f utf8 -t ascii//ignore file |
    awk 'BEGIN{FS=OFS=";"} NR==FNR{a[NR]=$1; next} {$1=a[FNR]; print}' - file
Libellacte;CHAR(20);Libellé de l'acte;

or if you prefer:

$ paste -d';' <(cut -d';' -f1 file | iconv -f utf8 -t ascii//ignore) <(cut -d';' -f2- file)
Libellacte;CHAR(20);Libellé de l'acte;

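or, if you always know the number of input fields (4 here, counting the empty field after the trailing semicolon, so the pasted line holds the converted fields as 1-4 and the original fields as 5-8):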

$ iconv -f utf8 -t ascii//ignore file | paste -d';' - file | cut -d';' -f1,6-
Libellacte;CHAR(20);Libellé de l'acte;

Lots of options.

If the iconv call above isn't the right one, change it to whatever you already know it should be (your question says you tried converting to ascii//TRANSLIT//IGNORE and that it removed the diacritics).
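If no iconv invocation gives the result you want on your system, the replacement can also be done inside awk itself, on $1 only. This is a different technique from the iconv pipelines above, shown as a minimal sketch: it assumes UTF-8 input, the function name unaccent_field1 is just illustrative, and the letter mapping is a hand-picked subset of French accented letters, not a complete table.

```shell
#!/bin/sh
# Sketch: strip accents from the first ';'-separated field only, without iconv.
# Assumptions: UTF-8 input; the mapping below is a hypothetical, incomplete
# list of French letters that you would extend for your own data.
unaccent_field1() {
    awk 'BEGIN { FS = OFS = ";" }
    {
        # Alternation (a|b) is used instead of a bracket expression so the
        # match also works in awks that treat the input as plain bytes.
        gsub(/é|è|ê|ë/, "e", $1); gsub(/É|È|Ê|Ë/, "E", $1)
        gsub(/à|â|ä/,   "a", $1); gsub(/À|Â|Ä/,   "A", $1)
        gsub(/î|ï/,     "i", $1); gsub(/Î|Ï/,     "I", $1)
        gsub(/ô|ö/,     "o", $1); gsub(/Ô|Ö/,     "O", $1)
        gsub(/ù|û|ü/,   "u", $1); gsub(/Ù|Û|Ü/,   "U", $1)
        gsub(/ç/,       "c", $1); gsub(/Ç/,       "C", $1)
        print   # fields 2 and later are printed untouched
    }' "$@"
}

# Demo on the sample line from the question
printf '%s\n' "Libelléacte;CHAR(20);Libellé de l'acte;" | unaccent_field1
```

On the sample line this prints Libelleacte;CHAR(20);Libellé de l'acte; so the accent is replaced rather than dropped, and the third field keeps its accents. The same gsub calls could be placed at the top of the main block of your existing awk script instead of running as a separate step.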


Original answer:

#!/usr/bin/env bash
while IFS=';' read -r f1 rest; do
    printf '%s;%s\n' "$(iconv -f utf8 -t ascii//ignore <<<"$f1")" "$rest"
done < file
Libellacte;CHAR(20);Libellé de l'acte;