Convert HTML hyperlink code to Markdown (MD) in pure AppleScript?

Solution 1:

Here is one example, which calls sed through the do shell script command:

set htmlString to "This is a link: <a href=\"https://duck.com\">link</a>"

set mdString to do shell script "/usr/bin/sed -E -e 's|<a href=\"|[link](|g' -e 's|\">link</a>|)|g' <<< " & htmlString's quoted form

Result:

"This is a link: [link](https://duck.com)"

This can also be done without the do shell script command, as in this example:

set htmlString to "This is a link: <a href=\"https://duck.com\">link</a>"

set htmlString to findAndReplaceInText(htmlString, "<a href=\"", "[link](")
set htmlString to findAndReplaceInText(htmlString, "\">link</a>", ")")

on findAndReplaceInText(theText, theSearchString, theReplacementString)
    -- Split the text wherever the search string occurs
    set AppleScript's text item delimiters to theSearchString
    set theTextItems to every text item of theText
    -- Rejoin the pieces with the replacement string between them
    set AppleScript's text item delimiters to theReplacementString
    set theText to theTextItems as string
    -- Reset the delimiters so later code is unaffected
    set AppleScript's text item delimiters to ""
    return theText
end findAndReplaceInText

Result:

"This is a link: [link](https://duck.com)"

If This is a link: <a href="https://duck.com">link</a> is read from a file or the clipboard, the double quotes need no manual escaping, because the text is stored as-is when it is assigned to a variable. You then only need to escape the " characters inside the sed command string, as shown in the first example above.


Other examples:

If This is a link: <a href="https://duck.com">link</a> is in a file:

set htmlFile to "/path/to/filename.ext"
set htmlString to read htmlFile
set mdString to do shell script "/usr/bin/sed -E -e 's|<a href=\"|[link](|g' -e 's|\">link</a>|)|g' <<< " & htmlString's quoted form

Or, processing the file directly:

set htmlFile to "/path/to/filename.ext"
set mdString to do shell script "/usr/bin/sed -E -e 's|<a href=\"|[link](|g' -e 's|\">link</a>|)|g'" & space & htmlFile's quoted form

If This is a link: <a href="https://duck.com">link</a> is on the clipboard:

set htmlString to (the clipboard as text)
set mdString to do shell script "/usr/bin/sed -E -e 's|<a href=\"|[link](|g' -e 's|\">link</a>|)|g' <<< " & htmlString's quoted form

Note: The findAndReplaceInText() handler can also be used in place of the do shell script command in these other examples.
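The replacements in the examples above are hard-coded to the link text "link", so an anchor with any other text would not be converted. A possible generalization, offered as a sketch rather than a complete HTML parser, is to let sed capture the URL and the link text and swap them in the replacement. It assumes simple anchors of the form <a href="URL">text</a>, with href as the only attribute and no nested tags inside the link text; the sample string is made up for illustration.

```applescript
-- Sample input (hypothetical); note the link text is not "link"
set htmlString to "Search with <a href=\"https://duck.com\">DuckDuckGo</a> today"

-- Capture group 1 is the URL (everything up to the closing quote),
-- capture group 2 is the link text (everything up to the next "<").
-- \\2 and \\1 reach sed as \2 and \1, which reorders them as [text](URL).
set mdString to do shell script "/usr/bin/sed -E 's|<a href=\"([^\"]*)\">([^<]*)</a>|[\\2](\\1)|g' <<< " & htmlString's quoted form
```

Result:

"Search with [DuckDuckGo](https://duck.com) today"

Because of the g flag, every anchor on a line that matches this pattern is converted. Anchors with extra attributes (for example target or class) are left unchanged.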

  • See also, Manipulating Text in the Mac Automation Scripting Guide.