How to capitalize the first letter in a String in Ruby
The upcase method capitalizes the entire string, but I need to capitalize only the first letter.
Also, I need to support several popular languages, like German and Russian.
How do I do it?
It depends on which Ruby version you use:
Ruby 2.4 and higher:
It just works, since Ruby 2.4.0 and later support Unicode case mapping:
"мария".capitalize #=> Мария
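As a quick check of the 2.4+ behaviour, here is a minimal sketch that runs capitalize on a few non-ASCII words. The German and French sample words are my own illustrations, not from the question. Note that capitalize also downcases everything after the first character.

```ruby
# Assumes Ruby 2.4 or newer, where String#capitalize uses Unicode case mapping.
# The sample words besides "мария" are illustrative only.
words = ["мария", "österreich", "émile"]
capitalized = words.map(&:capitalize)
puts capitalized.inspect

# capitalize also lowercases the remainder of the string.
mixed = "мАРИЯ".capitalize
puts mixed
```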
Ruby 2.3 and lower:
"maria".capitalize #=> "Maria"
"мария".capitalize #=> мария
The problem is that it doesn't do what you want: it outputs мария instead of Мария.
If you're using Rails there's an easy workaround:
"мария".mb_chars.capitalize.to_s # requires ActiveSupport::Multibyte
Otherwise, you'll have to install the unicode gem and use it like this:
require 'unicode'
Unicode::capitalize("мария") #=> Мария
Ruby 1.9:
Be sure to use the coding magic comment:
#!/usr/bin/env ruby
puts "мария".capitalize
gives invalid multibyte char (US-ASCII), while:
#!/usr/bin/env ruby
#coding: utf-8
puts "мария".capitalize
works without errors, but also see the "Ruby 2.3 and lower" section for real capitalization.
Capitalize the first letter of the first word of a string
"kirk douglas".capitalize
#=> "Kirk douglas"
Capitalize the first letter of each word
In Rails:
"kirk douglas".titleize
#=> "Kirk Douglas"
OR
"kirk_douglas".titleize
#=> "Kirk Douglas"
In Ruby:
"kirk douglas".split(/ |\_|\-/).map(&:capitalize).join(" ")
#=> "Kirk Douglas"
OR
require 'active_support'
require 'active_support/core_ext'
"kirk douglas".titleize
#=> "Kirk Douglas"
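If you don't want to pull in ActiveSupport, the plain-Ruby one-liner above can be wrapped in a small helper. This is only a sketch: capitalize_words is a hypothetical name, not a Ruby or Rails method, and it makes no attempt to match every rule that titleize applies.

```ruby
# Hypothetical helper (not part of Ruby or Rails): capitalize each word,
# treating runs of spaces, underscores and hyphens as word separators.
def capitalize_words(text)
  text.split(/[ _\-]+/).map(&:capitalize).join(" ")
end

puts capitalize_words("kirk douglas")
puts capitalize_words("kirk_douglas")
puts capitalize_words("jean-luc picard")
```

Like capitalize itself, this lowercases the rest of each word, so "NASA" would come out as "Nasa".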
Unfortunately, it is impossible for a machine to upcase/downcase/capitalize properly. It needs way too much contextual information for a computer to understand.
That's why Ruby's String class (in versions before 2.4) only supports capitalization for ASCII characters, because there it's at least somewhat well-defined.
What do I mean by "contextual information"?
For example, to capitalize i properly, you need to know which language the text is in. English, for example, has only two i's: capital I without a dot and small i with a dot. But Turkish has four i's: capital I without a dot, capital İ with a dot, small ı without a dot, and small i with a dot. So, in English 'i'.upcase # => 'I' and in Turkish 'i'.upcase # => 'İ'. In other words: since 'i'.upcase can return two different results, depending on the language, it is obviously impossible to correctly capitalize a word without knowing its language.
But Ruby doesn't know the language; it only knows the encoding. Therefore it is impossible to properly capitalize a string with Ruby's built-in functionality.
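For what it's worth, Ruby 2.4 and later let the caller supply the missing language hint by hand: upcase, downcase and capitalize accept a :turkic option. Ruby still does not detect the language on its own, so the sketch below only illustrates the two different results for the same input. The variable names are mine.

```ruby
# Assumes Ruby 2.4 or newer. The :turkic option selects Turkic case mapping;
# without it, Ruby uses the default (language-neutral) mapping.
default_upper = "i".upcase            # dotless capital I
turkic_upper  = "i".upcase(:turkic)   # dotted capital İ (U+0130)
turkic_lower  = "I".downcase(:turkic) # dotless small ı (U+0131)

puts default_upper
puts turkic_upper
puts turkic_lower
```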
It gets worse: even with knowing the language, it is sometimes impossible to do capitalization properly. For example, in German, 'Maße'.upcase # => 'MASSE' (Maße is the plural of Maß, meaning measurement). However, 'Masse'.upcase # => 'MASSE' (meaning mass). So, what is 'MASSE'.capitalize? In other words: correctly capitalizing requires a full-blown Artificial Intelligence.
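The loss of information in the German example can be seen directly on a Ruby version with Unicode case mapping. A minimal sketch, assuming Ruby 2.4 or newer; the variable names are mine:

```ruby
# Assumes Ruby 2.4 or newer (full Unicode case mapping, where ß upcases to SS).
upper_measures = "Maße".upcase
upper_mass     = "Masse".upcase

puts upper_measures
puts upper_mass
puts upper_measures == upper_mass  # both words collapse to the same string

# Going back cannot restore the ß: nothing in "MASSE" says which word it was.
round_trip = upper_measures.capitalize
puts round_trip
```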
So, instead of sometimes giving the wrong answer, Ruby (up to version 2.3) chooses to sometimes give no answer at all, which is why non-ASCII characters simply get ignored in downcase/upcase/capitalize operations. (Which of course also leads to wrong results, but at least it's easy to check.)
Just so we know, here is how to capitalize only the first letter and leave the rest alone, because sometimes that is what is desired:
['NASA', 'MHz', 'sputnik'].collect do |word|
  letters = word.split('')
  letters.first.upcase!  # upcase the first letter in place; the rest is untouched
  letters.join
end
=> ["NASA", "MHz", "Sputnik"]
Calling capitalize would result in ["Nasa", "Mhz", "Sputnik"].
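The same idea can be written without splitting the word into an array, and with a guard for empty strings (the block version raises on "" because letters.first is nil). A minimal sketch; upcase_first_letter is a hypothetical helper name, not a core Ruby method:

```ruby
# Hypothetical helper: upcase only the first character and keep the rest as-is.
# Returns a new string and does not modify its argument.
def upcase_first_letter(word)
  return word if word.empty?
  word[0].upcase + word[1..-1]
end

result = ['NASA', 'MHz', 'sputnik', ''].map { |word| upcase_first_letter(word) }
puts result.inspect
```

If ActiveSupport 5.0 or newer is already loaded, its String#upcase_first does much the same thing.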