Get list of all fonts containing a specific character
Solution 1:
You can use the command-line tool albatross (https://gitlab.com/islandoftex/albatross/; also included in TeX distributions such as TeX Live and MiKTeX).
If you run albatross ਠ, you'll get a list of all your fonts that contain the character:
        __ __           __
.---.-.|  |  |--.---.-.|  |_.----.-----.-----.-----.
|  _  ||  |  _  |  _  ||   _|   _|  _  |__ --|__ --|
|___._||__|_____|___._||____|__| |_____|_____|_____|
Unicode code point [A20] mapping to ਠ
┌─────────────────────────────────────────────────────────────────────────────┐
│ Font name │
├─────────────────────────────────────────────────────────────────────────────┤
│ .LastResort │
├─────────────────────────────────────────────────────────────────────────────┤
│ Arial Unicode MS │
├─────────────────────────────────────────────────────────────────────────────┤
│ Gurmukhi MN │
├─────────────────────────────────────────────────────────────────────────────┤
│ Gurmukhi MT │
├─────────────────────────────────────────────────────────────────────────────┤
│ Gurmukhi Sangam MN │
├─────────────────────────────────────────────────────────────────────────────┤
│ Mukta Mahee,MuktaMahee Bold │
├─────────────────────────────────────────────────────────────────────────────┤
│ Mukta Mahee,MuktaMahee ExtraBold │
├─────────────────────────────────────────────────────────────────────────────┤
│ Mukta Mahee,MuktaMahee ExtraLight │
├─────────────────────────────────────────────────────────────────────────────┤
│ Mukta Mahee,MuktaMahee Light │
├─────────────────────────────────────────────────────────────────────────────┤
│ Mukta Mahee,MuktaMahee Medium │
├─────────────────────────────────────────────────────────────────────────────┤
│ Mukta Mahee,MuktaMahee Regular │
├─────────────────────────────────────────────────────────────────────────────┤
│ Mukta Mahee,MuktaMahee SemiBold │
└─────────────────────────────────────────────────────────────────────────────┘
Solution 2:
It's still not clear to me how this is done by macOS itself, but in the meantime here's what I ended up doing.
The solutions I found were all of the following form:
- Get a list of all fonts available.
- Loop over the list to find fonts that contain the selected character.
Listing all fonts
As in this question, there are two approaches (plus a third one I found here):
- system_profiler SPFontsDataType, to which you can add -xml to get output in XML.
- fc-list, which can take a pattern (: is the empty pattern that matches all fonts) and a format specifier.
- Install python-fontconfig, then run import fontconfig; fontconfig.query() to get a list of font paths.
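For scripting, the fc-list route can be wrapped in a few lines of Python. This is a minimal sketch, assuming fc-list is on the PATH; the helper names (fc_list_command, parse_paths, list_font_paths) are made up for illustration. It asks fc-list for one file path per line via --format and removes duplicates, since a collection file can be reported once per face.

```python
import subprocess


def fc_list_command(pattern=":"):
    """Build an fc-list invocation that prints one font file path per line."""
    # ":" is the empty pattern, which matches every font.
    return ["fc-list", "--format", "%{file}\n", pattern]


def parse_paths(output):
    """Turn fc-list output into a sorted list of unique, non-empty paths."""
    return sorted({line for line in output.splitlines() if line})


def list_font_paths(pattern=":"):
    """Run fc-list (assumed to be installed) and return the font paths."""
    result = subprocess.run(
        fc_list_command(pattern),
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return parse_paths(result.stdout)


if __name__ == "__main__":
    for path in list_font_paths():
        print(path)
```

Keeping the parsing separate from the subprocess call makes it easy to check the parsing on saved output without having fontconfig installed.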
Comparing the two approaches (I wrote this before I had noticed the third one) is interesting:
- Speed: On my computer and for my set of fonts, fc-list takes about 24 seconds the first time and 0.04 seconds each time after that, while system_profiler consistently takes about 3 seconds each time.
- Comprehensiveness: On my current system, system_profiler lists 702 fonts while fc-list lists 770: all those 702 plus 68 more. On the one hand, system_profiler seems to be the "official" way, and matches the fonts visible in Font Book, the ones that show up in "Font Variation" in the character/symbol viewer (as in the question), the menu in TextEdit etc. On the other hand, at least some of the fonts that it misses are genuinely usable fonts. This includes not just the 5 fonts /Library/Fonts/{Athelas.ttc,Charter.ttc,Marion.ttc,Seravek.ttc,SuperClarendon.ttc} about which you can find some confusing pages online (e.g. this and this), but also /Library/Fonts/{DIN Alternate Bold.ttf,DIN Condensed Bold.ttf,Iowan Old Style.ttc} and 57 of the 177 Noto Sans fonts I have installed on my system. For example, I have Noto Sans Brahmi installed, but this font doesn't show up in Font Book or in "Font Variation" when I search for a Brahmi letter (say 𑀅), yet it does get used in TextEdit (and displayed in my browser). Whatever the reason for this weirdness, I'm happy that I can get the full list with fc-list.
- Ease of use: With either method a little bit of parsing the output is required. With fc-list I can specify the format (e.g. fc-list --format="%{family}\n%{file}\n%{lang}\n\n" but I couldn't find a reference for the names of the fields!); with system_profiler I can either just grep for Location: or output to XML and parse the XML (examples with xml.etree.ElementTree, with plistlib).
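The XML route can be handled with plistlib from the Python 3 standard library. The sketch below is an illustration, not a recipe checked against every macOS version: it assumes each font entry stores its file location under a key named path (an assumption about the system_profiler output), so it simply walks the whole property list and collects every string found under that key. The function names are hypothetical.

```python
import plistlib
import subprocess


def collect_paths(node, key="path"):
    """Recursively collect string values stored under `key` in a parsed plist."""
    found = []
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key and isinstance(value, str):
                found.append(value)
            else:
                found.extend(collect_paths(value, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_paths(item, key))
    return found


def system_profiler_font_paths():
    """Run `system_profiler -xml SPFontsDataType` (macOS only) and return paths."""
    xml = subprocess.run(
        ["system_profiler", "-xml", "SPFontsDataType"],
        check=True,
        stdout=subprocess.PIPE,
    ).stdout
    # The key name "path" is an assumption; adjust it if your output differs.
    return sorted(set(collect_paths(plistlib.loads(xml))))
```

Walking the tree generically avoids hard-coding the nesting of the report, at the cost of picking up any unrelated entry that happens to use the same key.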
Does this font cover this character?
However we get the list of fonts, next we have to check whether a character is covered in a specific font (given by name or path). Again, the ways I found:
- Use one of the FreeType bindings. For Python, there is freetype-py, but I couldn't figure out in a few minutes how to use it.
- Dump the font's cmap table with ttx/fonttools, then loop over the table. This is certainly doable and I've used such dumping many times (one can just run ttx foo.ttf to get the foo.ttx XML file, which is even human-readable), but for this use case (searching over all fonts) it's not the best, as it takes seconds per font.
- Look up the cmap table with a library written for that: use Font::TTF::Font in Perl, or from fontTools.ttLib import TTFont in Python -- this would be something like:

      def has_char(font_path, c):
          """Does font at `font_path` contain the character `c`?"""
          from fontTools.ttLib import TTFont
          try:
              font = TTFont(font_path)
              for table in font['cmap'].tables:
                  for char_code, glyph_name in table.cmap.items():
                      if char_code == ord(c):
                          font.close()
                          return True
          except Exception as e:
              print('Error while looking at font %s: %s' % (font_path, e))
          return False

  Unfortunately it fails on far too many fonts to be useful.
- If you use the python-fontconfig solution, there's a has_char method, used like: font = fontconfig.FcFont(path); return font.has_char(c)
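One possible reason the fontTools loop above fails on many files is that TTFont needs a face index to open a .ttc collection; that is a guess, not something verified against the fonts in question. A minimal sketch that opens collections through TTCollection and uses getBestCmap() instead of looping over every cmap subtable might look like this. The helper names are hypothetical, and fontTools is imported lazily (as in the original snippet) so the pure lookup helper works without it.

```python
def cmap_covers(cmap, c):
    """True if the code point -> glyph name mapping `cmap` contains `c`."""
    # getBestCmap() may return None when a font has no Unicode cmap.
    return bool(cmap) and ord(c) in cmap


def load_cmaps(font_path):
    """Yield the best Unicode cmap of every face in a font file (needs fontTools)."""
    from fontTools.ttLib import TTCollection, TTFont

    if font_path.lower().endswith((".ttc", ".otc")):
        # A collection bundles several faces; visit each of them.
        with TTCollection(font_path, lazy=True) as collection:
            for face in collection.fonts:
                yield face.getBestCmap()
    else:
        with TTFont(font_path, lazy=True) as font:
            yield font.getBestCmap()


def has_char_fonttools(font_path, c):
    """Does any face in the font at `font_path` map the character `c`?"""
    try:
        return any(cmap_covers(cmap, c) for cmap in load_cmaps(font_path))
    except Exception as exc:  # unreadable or unsupported font file
        print("Error while looking at font %s: %s" % (font_path, exc))
        return False
```

Whether this is fast or robust enough for a full font directory is untested here; it is meant only to show the shape of a collection-aware lookup.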
Summary
I ended up using the solution from here, which I've lightly rewritten to keep it minimal:
#!/usr/bin/env python

def find_fonts(c):
    """Finds fonts containing the (Unicode) character c."""
    import fontconfig
    fonts = fontconfig.query()
    for path in sorted(fonts):
        font = fontconfig.FcFont(path)
        if font.has_char(c):
            yield path

if __name__ == '__main__':
    import sys
    search = sys.argv[1]
    # On Python 2 the argument arrives as bytes; decode it to a Unicode string.
    char = search.decode('utf-8') if isinstance(search, bytes) else search
    for path in find_fonts(char):
        print(path)
Example usage:
% python3 find_fonts.py 'ಠ'
/Library/Fonts/Arial Unicode.ttf
/Library/Fonts/Kannada MN.ttc
/Library/Fonts/Kannada MN.ttc
/Library/Fonts/Kannada Sangam MN.ttc
/Library/Fonts/Kannada Sangam MN.ttc
/System/Library/Fonts/LastResort.ttf
/Users/shreevatsa/Library/Fonts/Kedage-b.TTF
/Users/shreevatsa/Library/Fonts/Kedage-i.TTF
/Users/shreevatsa/Library/Fonts/Kedage-n.TTF
/Users/shreevatsa/Library/Fonts/Kedage-t.TTF
/Users/shreevatsa/Library/Fonts/NotoSansKannada-Bold.ttf
/Users/shreevatsa/Library/Fonts/NotoSansKannada-Regular.ttf
/Users/shreevatsa/Library/Fonts/NotoSansKannadaUI-Bold.ttf
/Users/shreevatsa/Library/Fonts/NotoSansKannadaUI-Regular.ttf
/Users/shreevatsa/Library/Fonts/NotoSerifKannada-Bold.ttf
/Users/shreevatsa/Library/Fonts/NotoSerifKannada-Regular.ttf
/Users/shreevatsa/Library/Fonts/akshar.ttf
(Works with both python3 and python2, whichever python you have. Takes about 29 seconds on my computer, for the set of fonts I have installed.)