How can I compare two columns in Excel to highlight words that don't match?
(I'm using Microsoft excel 2010)
Lts say I have a list of phrases in both column A and column B (see screen shot below)
What I would like to happen whether it be with a macro, VBA or formula is:
If there is a word in any cell in column A that isn't any of the words in any cell in column B to highlight that word in red.
For example: in cell A9 the word "buy" is there, but the word buy isn't mentioned anywhere in column B so i would like the word buy to highlight in red.
How can I accomplish this?
(I think a macro/vba would be the best option but I have no idea how to create it, or even if its possible.)
Solution 1:
Insert the following code into a VBA module.
Sub highlightWords()
Application.ScreenUpdating = False
Dim rng2HL As Range, rngCheck As Range, dictWords As Object
Dim a() As Variant, b() As Variant, wordlist As Variant, wordStart As Long
Set r = Selection
'Change the addresses below to match your data.
Set rng2HL = Range("A1:A9")
Set rngCheck = Range("B1:B9")
a = rng2HL.Value
b = rngCheck.Value
Set dictWords = CreateObject("Scripting.Dictionary")
'Load unique words from second column into a dictionary for easy checking
For i = LBound(b, 1) To UBound(b, 1)
wordlist = Split(b(i, 1), " ")
For j = LBound(wordlist) To UBound(wordlist)
If Not dictWords.Exists(wordlist(j)) Then
dictWords.Add wordlist(j), wordlist(j)
End If
Next j
Next i
'Reset range to highlight to all black font.
rng2HL.Font.ColorIndex = 1
'Check words one by one against dictionary.
For i = LBound(a, 1) To UBound(a, 1)
wordlist = Split(a(i, 1), " ")
For j = LBound(wordlist) To UBound(wordlist)
If Not dictWords.Exists(wordlist(j)) Then
wordStart = InStr(a(i, 1), wordlist(j))
'Change font color of word to red.
rng2HL.Cells(i).Characters(wordStart, Len(wordlist(j))).Font.ColorIndex = 3
End If
Next j
Next i
Application.ScreenUpdating = True
End Sub
Just be sure to change the addresses in the lines below to match your worksheet.
Set rng2HL = Range("A1:A9")
Set rngCheck = Range("B1:B9")
Results:
EDIT:
Since you added requirements in the comments below, I modified the code to also print out the list of red-highlighted phrases in column C. If you want this list elsewhere, you'll have to adjust the address in the last section of the code. I also improved the highlighting code -- I noticed that it would do weird things like only highlight the first instance of a non-matching word.
Sub highlightWords()
Application.ScreenUpdating = False
Dim rng2HL As Range, rngCheck As Range, dictWords As Object, dictRed As Object
Dim a() As Variant, b() As Variant, wordlist As Variant, wordStart As Long, phraseLen As Integer
Dim re As Object, consec As Integer, tmpPhrase As String
'Change the addresses below to match your data.
Set rng2HL = Range("A1:A9")
Set rngCheck = Range("B1:B9")
a = rng2HL.Value
b = rngCheck.Value
Set dictWords = CreateObject("Scripting.Dictionary")
'Load unique words from second column into a dictionary for easy checking
For i = LBound(b, 1) To UBound(b, 1)
wordlist = Split(b(i, 1), " ")
For j = LBound(wordlist) To UBound(wordlist)
If Not dictWords.Exists(wordlist(j)) Then
dictWords.Add wordlist(j), wordlist(j)
End If
Next j
Next i
Erase b
'Reset range to highlight to all black font.
rng2HL.Font.ColorIndex = 1
Set dictRed = CreateObject("Scripting.Dictionary")
Set re = CreateObject("vbscript.regexp")
'Check words one by one against dictionary.
For i = LBound(a, 1) To UBound(a, 1)
wordlist = Split(a(i, 1), " ")
consec = 0
tmpPhrase = ""
For j = LBound(wordlist) To UBound(wordlist)
If Not dictWords.Exists(wordlist(j)) Then
consec = consec + 1
If consec > 1 Then tmpPhrase = tmpPhrase & " "
tmpPhrase = tmpPhrase & wordlist(j)
Else
If consec > 0 Then
If Not dictRed.Exists(tmpPhrase) Then dictRed.Add tmpPhrase, tmpPhrase
re.Pattern = "(^| )" & tmpPhrase & "( |$)"
Set matches = re.Execute(a(i, 1))
For Each m In matches
wordStart = m.FirstIndex
phraseLen = m.Length
'Change font color of word to red.
rng2HL.Cells(i).Characters(wordStart + 1, phraseLen).Font.ColorIndex = 3
Next m
consec = 0
tmpPhrase = ""
End If
End If
Next j
'Highlight any matches that appear at the end of the line
If consec > 0 Then
If Not dictRed.Exists(tmpPhrase) Then dictRed.Add tmpPhrase, tmpPhrase
re.Pattern = "(^" & tmpPhrase & "| " & tmpPhrase & ")( |$)"
Set matches = re.Execute(a(i, 1))
For Each m In matches
wordStart = m.FirstIndex
phraseLen = m.Length
'Change font color of word to red.
rng2HL.Cells(i).Characters(wordStart + 1, phraseLen).Font.ColorIndex = 3
Next m
End If
Next i
Erase a
'Output list of unique red phrases to column C.
redkeys = dictRed.Keys
For k = LBound(redkeys) To UBound(redkeys)
Range("C1").Offset(k, 0).Value = redkeys(k)
Next k
Erase redkeys
Application.ScreenUpdating = True
End Sub