How can I compare two columns in Excel to highlight words that don't match?

(I'm using Microsoft excel 2010)

Lts say I have a list of phrases in both column A and column B (see screen shot below)

enter image description here

What I would like to happen whether it be with a macro, VBA or formula is:

If there is a word in any cell in column A that isn't any of the words in any cell in column B to highlight that word in red.

For example: in cell A9 the word "buy" is there, but the word buy isn't mentioned anywhere in column B so i would like the word buy to highlight in red.

How can I accomplish this?

(I think a macro/vba would be the best option but I have no idea how to create it, or even if its possible.)


Solution 1:

Insert the following code into a VBA module.

Sub highlightWords()
Application.ScreenUpdating = False
Dim rng2HL As Range, rngCheck As Range, dictWords As Object
Dim a() As Variant, b() As Variant, wordlist As Variant, wordStart As Long
Set r = Selection
'Change the addresses below to match your data.
Set rng2HL = Range("A1:A9")
Set rngCheck = Range("B1:B9")
a = rng2HL.Value
b = rngCheck.Value
Set dictWords = CreateObject("Scripting.Dictionary")
'Load unique words from second column into a dictionary for easy checking
For i = LBound(b, 1) To UBound(b, 1)
    wordlist = Split(b(i, 1), " ")
    For j = LBound(wordlist) To UBound(wordlist)
        If Not dictWords.Exists(wordlist(j)) Then
            dictWords.Add wordlist(j), wordlist(j)
        End If
    Next j
Next i
'Reset range to highlight to all black font.
rng2HL.Font.ColorIndex = 1
'Check words one by one against dictionary.
For i = LBound(a, 1) To UBound(a, 1)
    wordlist = Split(a(i, 1), " ")
    For j = LBound(wordlist) To UBound(wordlist)
        If Not dictWords.Exists(wordlist(j)) Then
            wordStart = InStr(a(i, 1), wordlist(j))
            'Change font color of word to red.
            rng2HL.Cells(i).Characters(wordStart, Len(wordlist(j))).Font.ColorIndex = 3
        End If
    Next j
Next i
Application.ScreenUpdating = True
End Sub

Just be sure to change the addresses in the lines below to match your worksheet.

Set rng2HL = Range("A1:A9")
Set rngCheck = Range("B1:B9")

Results:

enter image description here

EDIT:

Since you added requirements in the comments below, I modified the code to also print out the list of red-highlighted phrases in column C. If you want this list elsewhere, you'll have to adjust the address in the last section of the code. I also improved the highlighting code -- I noticed that it would do weird things like only highlight the first instance of a non-matching word.

Sub highlightWords()
Application.ScreenUpdating = False
Dim rng2HL As Range, rngCheck As Range, dictWords As Object, dictRed As Object
Dim a() As Variant, b() As Variant, wordlist As Variant, wordStart As Long, phraseLen As Integer
Dim re As Object, consec As Integer, tmpPhrase As String
'Change the addresses below to match your data.
Set rng2HL = Range("A1:A9")
Set rngCheck = Range("B1:B9")
a = rng2HL.Value
b = rngCheck.Value
Set dictWords = CreateObject("Scripting.Dictionary")
'Load unique words from second column into a dictionary for easy checking
For i = LBound(b, 1) To UBound(b, 1)
    wordlist = Split(b(i, 1), " ")
    For j = LBound(wordlist) To UBound(wordlist)
        If Not dictWords.Exists(wordlist(j)) Then
            dictWords.Add wordlist(j), wordlist(j)
        End If
    Next j
Next i
Erase b
'Reset range to highlight to all black font.
rng2HL.Font.ColorIndex = 1
Set dictRed = CreateObject("Scripting.Dictionary")
Set re = CreateObject("vbscript.regexp")
'Check words one by one against dictionary.
For i = LBound(a, 1) To UBound(a, 1)
    wordlist = Split(a(i, 1), " ")
    consec = 0
    tmpPhrase = ""
    For j = LBound(wordlist) To UBound(wordlist)
        If Not dictWords.Exists(wordlist(j)) Then
            consec = consec + 1
            If consec > 1 Then tmpPhrase = tmpPhrase & " "
            tmpPhrase = tmpPhrase & wordlist(j)
        Else
            If consec > 0 Then
                If Not dictRed.Exists(tmpPhrase) Then dictRed.Add tmpPhrase, tmpPhrase
                re.Pattern = "(^| )" & tmpPhrase & "( |$)"
                Set matches = re.Execute(a(i, 1))
                For Each m In matches
                    wordStart = m.FirstIndex
                    phraseLen = m.Length
                    'Change font color of word to red.
                    rng2HL.Cells(i).Characters(wordStart + 1, phraseLen).Font.ColorIndex = 3
                Next m
                consec = 0
                tmpPhrase = ""
            End If
        End If
    Next j
    'Highlight any matches that appear at the end of the line
    If consec > 0 Then
        If Not dictRed.Exists(tmpPhrase) Then dictRed.Add tmpPhrase, tmpPhrase
        re.Pattern = "(^" & tmpPhrase & "| " & tmpPhrase & ")( |$)"
        Set matches = re.Execute(a(i, 1))
        For Each m In matches
            wordStart = m.FirstIndex
            phraseLen = m.Length
            'Change font color of word to red.
            rng2HL.Cells(i).Characters(wordStart + 1, phraseLen).Font.ColorIndex = 3
        Next m
    End If
Next i
Erase a
'Output list of unique red phrases to column C.
redkeys = dictRed.Keys
For k = LBound(redkeys) To UBound(redkeys)
    Range("C1").Offset(k, 0).Value = redkeys(k)
Next k
Erase redkeys
Application.ScreenUpdating = True
End Sub

new example