How to replace text randomly from file?
How can I randomly replace specific strings in one text file with strings from another file? For example:
file1.txt (more than 200 lines):
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
file2.txt (10-20 lines):
@adress1.com
@adress2.com
@adress3.com
@adress4.com
@adress5.com
output.txt:
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
You could implement this algorithm:
- Load the content of file2.txt into an array
- For each line in file1.txt:
  - Extract the name part
  - Get a random address
  - Print the output correctly formatted
Like this:
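# Read every line of file2.txt into the array "addresses"
mapfile -t addresses < file2.txt
# The "|| [[ -n ... ]]" part also handles a last line without a trailing newline
while IFS='' read -r orig || [[ -n "$orig" ]]; do
    ((index = RANDOM % ${#addresses[@]}))   # random index into the array
    name=${orig%%@*}                        # drop everything from the first "@"
    echo "$name${addresses[index]}"
done < file1.txt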
(Special thanks to @GlennJackman and @dessert for the improvements.)
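To see what the two key expressions in that loop do in isolation, here is a minimal sketch. The sample values (`alice@old.example` and the three-element array) are hypothetical stand-ins, not data from the question.

```shell
#!/usr/bin/env bash
# Hypothetical sample values, standing in for one line of file1.txt
# and for the array that mapfile would fill from file2.txt.
orig='alice@old.example'
addresses=('@adress1.com' '@adress2.com' '@adress3.com')

# ${orig%%@*} strips the longest suffix matching "@*",
# i.e. everything from the first "@" onwards.
name=${orig%%@*}

# RANDOM expands to an integer in 0..32767; the modulo maps it
# onto a valid array index (0..2 here).
index=$((RANDOM % ${#addresses[@]}))

echo "$name${addresses[index]}"
```

Because 32768 is not an exact multiple of most list sizes, the modulo is very slightly biased toward the lower indexes. For a list of 10-20 addresses that bias is negligible.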
If you really want a random selection, then here's one way using awk:
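awk '
BEGIN{srand(); FS="@"; OFS=""}       # seed rand(); otherwise many awks repeat the same sequence on every run
NR==FNR{a[NR]=$0; n++; next}         # first file (file2.txt): store the addresses
{$2=a[int(1 + n * rand())]; print}   # second file: replace the domain with a random address
' file2.txt file1.txt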
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
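The expression `int(1 + n * rand())` is what turns awk's `rand()` into a valid index for the array `a`. As a rough sanity check, the sketch below draws 1000 indexes for a hypothetical list size of 5 and prints the smallest and largest one seen. The function name `index_range` is made up for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: draw 1000 indexes for a list of size $1 and
# print the smallest and largest index that came up.
index_range() {
    awk -v n="$1" '
    BEGIN {
        srand()                        # seed from the current time
        min = n + 1; max = 0
        for (i = 0; i < 1000; i++) {
            k = int(1 + n * rand())    # rand() is in [0,1), so k is in 1..n
            if (k < min) min = k
            if (k > max) max = k
        }
        print min, max
    }'
}

index_range 5
```

Both printed numbers should stay between 1 and 5, which matches the range of keys stored while file2.txt is read.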
OTOH, if you want a random permutation of the addresses, I'd suggest something like:
paste -d '' <(cut -d'@' -f1 file1.txt) <(sort -R file2.txt)
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
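A permutation only yields as many addresses as file2.txt has lines, while the question says file1.txt is much longer. One possible variation, sketched here under the assumption that GNU `shuf` is available, draws addresses with replacement so that every name gets one. The function name `pair_with_repeats` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pair every name in $1 with a randomly drawn
# line from $2, allowing repeats so that $2 may be shorter than $1.
pair_with_repeats() {
    local names=$1 domains=$2 count
    # wc -l counts newline characters, so this assumes the last
    # line of the names file ends with a newline.
    count=$(wc -l < "$names")
    # shuf -r -n COUNT (GNU coreutils) outputs COUNT lines drawn
    # with replacement from its input.
    paste -d '' <(cut -d'@' -f1 "$names") <(shuf -r -n "$count" "$domains")
}
```

It could be called as `pair_with_repeats file1.txt file2.txt > output.txt`. Unlike the permutation, this behaves like the random selection shown earlier: an address may appear several times or not at all.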
You could use shuf (it is part of GNU coreutils, so it should already be installed) to shuffle the lines of the second file and then use them to replace:
$ awk -F'@' 'NR==FNR{a[NR]=$1;next}{print a[FNR]"@"$2} ' file1 <(shuf file2)
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
shuf simply randomizes the order of its input lines. The awk command there will first read all of file1 (NR==FNR will only be true while the first file is being read) and save the first field (fields are separated by @, so this is the name) in the associative array a, whose keys are the line numbers and whose values are the names. Then, when it gets to the second input (the shuffled file2), it will simply print whatever was stored in a for this line number, followed by @ and the second field of the current line, which is the domain.
Note that this assumes both files have exactly the same number of lines, and it isn't truly "random", since no address can be repeated. But that looks like what you were asking for.
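Since the awk command prints one line per line of the shuffled input, a shorter file2 would silently truncate the output. A small guard like the following could check the line counts first. This is only a sketch, and the function name `same_line_count` is made up.

```shell
#!/usr/bin/env bash
# Hypothetical guard: succeed only if both files have the same number
# of lines (as counted by wc -l, i.e. newline characters).
same_line_count() {
    [ "$(wc -l < "$1")" -eq "$(wc -l < "$2")" ]
}
```

For example, `same_line_count file1 file2 || echo "line counts differ" >&2` would print a warning before the awk command is run on mismatched files.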