YAML: Do I need quotes for strings in YAML?
After a brief review of the YAML cookbook cited in the question and some testing, here's my interpretation:
- In general, you don't need quotes.
- Use quotes to force a string, e.g. if your key or value is `10` but you want it to return a String and not an Integer (a Fixnum in older Rubies), write `'10'` or `"10"`.
- Use quotes if your value includes special characters (e.g. `:`, `{`, `}`, `[`, `]`, `,`, `&`, `*`, `#`, `?`, `|`, `-`, `<`, `>`, `=`, `!`, `%`, `@`, `\`).
- Single quotes let you put almost any character in your string, and won't try to parse escape codes. `'\n'` would be returned as the string `\n`.
- Double quotes parse escape codes. `"\n"` would be returned as a line feed character.
- The exclamation mark introduces a tag, e.g. `!ruby/sym`, which tells the parser to return a Ruby symbol.
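The points above can be checked with a short script. This is a minimal sketch, assuming Ruby's bundled `yaml` library (Psych); the key names are made up for illustration, and other parsers may resolve some values differently.

```ruby
require 'yaml'

# Hypothetical sample document; the key names are made up for illustration.
# The quoted heredoc tag ('DOC') keeps Ruby itself from touching the backslashes,
# so the YAML parser sees them exactly as written.
doc = <<~'DOC'
  plain_number: 10
  quoted_number: '10'
  with_colon: 'key: value'
  single_quoted: 'a\nb'
  double_quoted: "a\nb"
DOC

data = YAML.load(doc)

p data['plain_number'].class    # Integer (Fixnum before Ruby 2.4)
p data['quoted_number'].class   # String
p data['with_colon']            # "key: value"
p data['single_quoted'].length  # 4 (a, backslash, n, b)
p data['double_quoted'].length  # 3 (a, line feed, b)
```

The length check is just a convenient way to see that the single-quoted value kept the backslash while the double-quoted one turned it into a line feed.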
Seems to me that the best approach would be to not use quotes unless you have to, and then to use single quotes unless you specifically want to process escape codes.
Update
"Yes" and "No" should be enclosed in quotes (single or double) or else they will be interpreted as TrueClass and FalseClass values:
    en:
      yesno:
        'yes': 'Yes'
        'no': 'No'
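To see the difference, here is a small sketch that loads the same mapping with and without quotes. It assumes Ruby's bundled `yaml` library (Psych), which applies YAML 1.1-style boolean resolution; a parser that follows the YAML 1.2 core schema may keep the unquoted `yes` and `no` as plain strings.

```ruby
require 'yaml'

# Unquoted: Psych resolves yes/Yes/no/No to booleans, for keys as well as values.
unquoted = YAML.load(<<~DOC)
  en:
    yesno:
      yes: Yes
      no: No
DOC

# Quoted: keys and values stay strings.
quoted = YAML.load(<<~DOC)
  en:
    yesno:
      'yes': 'Yes'
      'no': 'No'
DOC

p unquoted['en']['yesno'].keys    # [true, false]
p unquoted['en']['yesno'].values  # [true, false]
p quoted['en']['yesno'].keys      # ["yes", "no"]
p quoted['en']['yesno'].values    # ["Yes", "No"]
```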
While Mark's answer nicely summarizes when quotes are needed according to the YAML language rules, I think what many developers and administrators ask themselves when working with strings in YAML is "what should be my rule of thumb for handling strings?"
It may sound subjective, but the number of rules you have to remember, if you want to use quotes only when they are really needed as per the language spec, is somewhat excessive for something as simple as specifying one of the most common data types. Don't get me wrong, you will eventually remember them if you work with YAML regularly, but what if you use it only occasionally and haven't developed the habit of writing YAML? Do you really want to spend time remembering all the rules just to specify a string correctly?
The whole point of a "rule of thumb" is to save cognitive resources and to handle a common task without thinking about it. Our "CPU" time can arguably be used for something more useful than handling strings correctly.
From this purely practical perspective, I think the best rule of thumb is to single-quote strings. The rationale behind it:
- Single quoted strings work for all scenarios, except when you need to use escape sequences.
- The only special character you have to handle within single-quoted string is the single quote itself.
These are just 2 rules to remember for some occasional YAML user, minimizing the cognitive effort.
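Both rules can be seen in a short sketch, assuming Ruby's bundled `yaml` library (Psych); the keys and values are made up for illustration. Backslashes, colons and `#` pass through untouched inside single quotes, and a literal single quote is written by doubling it.

```ruby
require 'yaml'

# Hypothetical values; the quoted heredoc tag keeps Ruby from interpreting
# the backslashes before the YAML parser sees them.
doc = <<~'DOC'
  path: 'C:\temp\new'
  pattern: '^\d+: #[a-z]*$'
  message: 'it''s done'
DOC

data = YAML.load(doc)

puts data['path']     # C:\temp\new
puts data['pattern']  # ^\d+: #[a-z]*$
puts data['message']  # it's done
```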
There have been some great answers to this question. However, I would like to extend them and provide some context from the official YAML v1.2.2 specification (released October 1st, 2021), which is the "true source" for all things concerning YAML.
There are three different styles that can be used to represent strings, each of them with their own (dis-)advantages:
YAML provides three flow scalar styles: double-quoted, single-quoted and plain (unquoted). Each provides a different trade-off between readability and expressive power.
Double-quoted style:
- The double-quoted style is specified by surrounding `"` indicators. This is the only style capable of expressing arbitrary strings, by using `\` escape sequences. This comes at the cost of having to escape the `\` and `"` characters.
Single-quoted style:
- The single-quoted style is specified by surrounding `'` indicators. Therefore, within a single-quoted scalar, such characters need to be repeated. This is the only form of escaping performed in single-quoted scalars. In particular, the `\` and `"` characters may be freely used. This restricts single-quoted scalars to printable characters. In addition, it is only possible to break a long single-quoted line where a space character is surrounded by non-spaces.
Plain (unquoted) style:
- The plain (unquoted) style has no identifying indicators and provides no form of escaping. It is therefore the most readable, most limited and most context sensitive style. In addition to a restricted character set, a plain scalar must not be empty or contain leading or trailing white space characters. It is only possible to break a long plain line where a space character is surrounded by non-spaces. Plain scalars must not begin with most indicators, as this would cause ambiguity with other YAML constructs. However, the `:`, `?` and `-` indicators may be used as the first character if followed by a non-space “safe” character, as this causes no ambiguity.
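As a rough illustration of the three styles side by side, the sketch below loads the same text written in each style. It assumes Ruby's bundled `yaml` library (Psych), and all key names are hypothetical. It also shows a plain scalar starting with `-`, a double-quoted escape, and how leading spaces survive only inside quotes.

```ruby
require 'yaml'

# Hypothetical keys; the quoted heredoc tag leaves the \t for the YAML parser.
doc = <<~'DOC'
  plain: hello world
  single: 'hello world'
  double: "hello world"
  dash_first: -v
  tab_escape: "a\tb"
  padded_plain:    text
  padded_single: '   text'
DOC

data = YAML.load(doc)

p data['plain'] == data['single']  # true
p data['single'] == data['double'] # true
p data['dash_first']               # "-v"
p data['tab_escape'].length        # 3 (a, tab, b)
p data['padded_plain']             # "text"
p data['padded_single']            # "   text"
```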
TL;DR
With that being said, according to the official YAML specification one should:
- Whenever applicable, use the unquoted style since it is the most readable.
- Use the single-quoted style (`'`) if characters such as `"` and `\` are being used inside the string, to avoid escaping them and therefore improve readability.
- Use the double-quoted style (`"`) when the first two options aren't sufficient, i.e. in scenarios where more complex line breaks are required or non-printable characters are needed.