Search for text in a file, then rename the file with that text

We have a number of job reports from a system, which are saved on our Linux server with names of the form O*****.TXT. Each file contains the report's process ID.

What script can I run to search all O*****.TXT files in a directory and rename each file with the process ID it contains? For example:

Search OAAJWNZN.TXT for ProcID:0000019324, rename OAAJWNZN.TXT to 0000019324.TXT.

Once this file is renamed, the script should move on to the next file in the directory and do the same.


An awk command that reads the file and renames it:

awk -F '[:)]' '/ProcID/{printf "echo mv %s %s.TXT\n", FILENAME, $2 | "/bin/sh"; nextfile}' O*.TXT

The ProcID line actually looks like this:

JOB 04508907 (ProcID:0000019324) START AT 22.12.2016 / 09:10:45

To extract the ID, we need to split on both : and ), which gives us 0000019324 as the second field.

Remove echo to actually execute the move. Use this only if the filenames and ProcID don't contain spaces or special characters, because the generated command is passed to /bin/sh unquoted.

This command gets the ProcID by splitting the matching line on : and ) and taking the second field, then uses it and the special variable FILENAME to construct the mv command. GNU awk's documentation suggests the printf ... | "/bin/sh" method for running commands. nextfile then skips to the next file.
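To check the field splitting before touching any files, you can feed awk a single sample line. This is only a sketch: the variable names line and procid are made up for illustration, and the sample line is the one quoted above.

```shell
# Sample line in the report format quoted above.
line='JOB 04508907 (ProcID:0000019324) START AT 22.12.2016 / 09:10:45'

# Split on ':' and ')' and print every field with its index.
printf '%s\n' "$line" |
  awk -F '[:)]' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'

# Capture only the second field, the one the rename command uses.
procid=$(printf '%s\n' "$line" | awk -F '[:)]' '/ProcID/ { print $2; exit }')
echo "$procid"
```

If the second field is not the bare number on your real files, the field separator needs adjusting before you remove echo.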


You can use for, rename and sed in the directory where the files are (this relies on the Perl-based rename, which takes an s/old/new/ expression):

for i in O*.TXT; do rename -n "s/.*\.TXT/$(sed -nr 's/.*( |^)ProcID:([0-9]+)( |$).*/\2/p' "$i").TXT/" "$i"; done

After testing, remove -n from the rename command to actually rename the files.

Explanation

  • for i in O*.TXT; do do something with all the matching files
  • rename -n just report what the new names will be, don't actually rename them yet (remove -n after testing)
  • "s/old/new/" replace old with new (use double instead of single quotes so we can pass variables with $ expansion)
  • $(command) command substitution - pass output of command to something else
  • .*\.TXT match any characters followed by literal . and then TXT
  • sed invoke our friend sed to read the file and extract things from it
  • -n don't print anything until we ask for it
  • -r use ERE so we don't have to escape () or +
  • ( |^) space or start of line
  • ProcID:([0-9]+) at least one digit after ProcID... save the number for later
  • ( |$) space or end of line
  • \2 back reference to pattern saved earlier with ()
  • p print the result after editing
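It is worth trying the sed expression on a sample line first, because ( |^) and ( |$) require a space or a line boundary right next to the ProcID. The sketch below compares the pattern above with a looser one on two line formats. The variable names and the looser pattern are my own assumptions, not part of the command above, and -E is used as the equivalent of -r in GNU sed.

```shell
# Format the pattern above expects: ProcID delimited by spaces or line ends.
plain='ProcID:0000019324'
# Format quoted elsewhere in this thread: ProcID wrapped in parentheses.
wrapped='JOB 04508907 (ProcID:0000019324) START AT 22.12.2016 / 09:10:45'

strict='s/.*( |^)ProcID:([0-9]+)( |$).*/\2/p'   # pattern from the command above
loose='s/.*ProcID:([0-9]+).*/\1/p'              # assumed variant, no delimiter check

from_plain=$(printf '%s\n' "$plain" | sed -nE "$strict")
from_wrapped=$(printf '%s\n' "$wrapped" | sed -nE "$strict")
from_loose=$(printf '%s\n' "$wrapped" | sed -nE "$loose")

printf 'strict/plain=[%s] strict/wrapped=[%s] loose/wrapped=[%s]\n' \
  "$from_plain" "$from_wrapped" "$from_loose"
```

If the strict pattern prints nothing for your files, the rename would produce a bare .TXT name, so keep -n on until the output looks right.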

  1. Loop through the files.
  2. You can use grep for finding the ProcID
  3. Move the file to the new filename

Use cp instead of mv if you want to keep the original files or you are not sure about the result.

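for file in O*.TXT; do
  # Take the digits that follow "ProcID:" (first matching line only).
  procid=$(grep -m1 -Po "(?<=ProcID:)[0-9]+" "$file")
  # Skip files that contain no ProcID.
  [ -n "$procid" ] || continue
  new_filename="/outputpath/${procid}.TXT"
  # Do not overwrite a file that already exists.
  if [ ! -f "$new_filename" ]; then
    mv "$file" "$new_filename"
  fi
done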
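If you would rather try the loop on throwaway data first, something like the following works as a rough test. Everything here is an assumption for illustration: the scratch directories, the variable names work and out, and the fake report. It also swaps grep -Po for grep -o plus cut, which does not need PCRE support.

```shell
# Scratch directories so no real report is touched (names are illustrative).
work=$(mktemp -d)
out="$work/out"
mkdir "$out"

# One fake report using the line format quoted in this thread.
printf '%s\n' 'JOB 04508907 (ProcID:0000019324) START AT 22.12.2016 / 09:10:45' \
  > "$work/OAAJWNZN.TXT"

for file in "$work"/O*.TXT; do
  # First "ProcID:<digits>" match, then keep the part after the colon.
  procid=$(grep -o 'ProcID:[0-9][0-9]*' "$file" | head -n 1 | cut -d: -f2)
  # Skip files without a ProcID rather than creating ".TXT".
  [ -n "$procid" ] || continue
  new_filename="$out/${procid}.TXT"
  # Only move when the target does not exist yet.
  if [ ! -e "$new_filename" ]; then
    mv "$file" "$new_filename"
  fi
done

ls "$out"
```

Once the listing shows the expected name, point the loop at the real directory and output path.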