Regex to split camel case

Solution 1:

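My suggestion is to replace /([A-Z])/ with /([a-z])([A-Z])/ and ' $1' with '$1 $2', so that a space is inserted only between a lowercase letter and the uppercase letter that follows it: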

"MyCamelCaseString"
    .replace(/([a-z])([A-Z])/g, '$1 $2');

Use /([a-z0-9])([A-Z])/ if digits should count as lowercase characters:

console.log("MyCamelCaseStringID".replace(/([a-z0-9])([A-Z])/g, '$1 $2'))
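
The two patterns above could be wrapped in one small helper. This is only a sketch: the name splitCamelCase and the treatDigitsAsLower parameter are made up for illustration and are not part of the answer. Note that a run of capitals is never split, so an acronym stays attached to the word that follows it.

```javascript
// Hypothetical helper: inserts a space at each lower-to-upper boundary.
// treatDigitsAsLower is an illustrative parameter, not from the original answer.
function splitCamelCase(input, treatDigitsAsLower = true) {
  const boundary = treatDigitsAsLower
    ? /([a-z0-9])([A-Z])/g   // digits count as lowercase characters
    : /([a-z])([A-Z])/g;     // only letters count
  return input.replace(boundary, '$1 $2');
}

console.log(splitCamelCase('MyCamelCaseStringID')); // "My Camel Case String ID"
console.log(splitCamelCase('version2Beta'));        // "version2 Beta"
console.log(splitCamelCase('version2Beta', false)); // "version2Beta"
console.log(splitCamelCase('XMLHttpRequest'));      // "XMLHttp Request"
```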

Solution 2:

"MyCamelCaseString".replace(/([a-z](?=[A-Z]))/g, '$1 ')

outputs:

"My Camel Case String"

Solution 3:

If you want an array of lower case words:

"myCamelCaseString".split(/(?=[A-Z])/).map(s => s.toLowerCase());

If you want a string of lower case words:

"myCamelCaseString".split(/(?=[A-Z])/).map(s => s.toLowerCase()).join(' ');

If you want to separate the words but keep the casing:

"myCamelCaseString".replace(/([a-z])([A-Z])/g, '$1 $2')

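One thing to watch with the split versions: /(?=[A-Z])/ matches before every uppercase letter, so an acronym is broken into single letters. Below is a sketch of a variant that only splits after a lowercase letter or digit. It assumes an engine with lookbehind support (ES2018), and the name toLowerWords is illustrative. The trade-off is that an acronym then stays glued to the word after it.

```javascript
// Plain lookahead: splits before every capital, including inside acronyms.
console.log('parseHTMLString'.split(/(?=[A-Z])/));
// ["parse", "H", "T", "M", "L", "String"]

// Hypothetical variant: split only where a lowercase letter or digit
// is followed by a capital.
function toLowerWords(s) {
  return s.split(/(?<=[a-z0-9])(?=[A-Z])/).map(w => w.toLowerCase());
}

console.log(toLowerWords('myCamelCaseString')); // ["my", "camel", "case", "string"]
console.log(toLowerWords('parseHTMLString'));   // ["parse", "htmlstring"]
```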
Solution 4:

Sometimes camelCase strings include abbreviations, for example:

PDFSplitAndMergeSamples
PDFExtractorSDKSamples
PDFRendererSDKSamples
BarcodeReaderSDKSamples

In this case the following function works. It splits the string and leaves each abbreviation as a separate element:

function SplitCamelCaseWithAbbreviations(s){
   return s.split(/([A-Z][a-z]+)/).filter(function(e){return e});
}

Example:

console.log(SplitCamelCaseWithAbbreviations('PDFSplitAndMergeSamples'));
console.log(SplitCamelCaseWithAbbreviations('PDFExtractorSDKSamples'));
console.log(SplitCamelCaseWithAbbreviations('PDFRendererSDKSamples'));
console.log(SplitCamelCaseWithAbbreviations('BarcodeReaderSDKSamples'));
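
The split-based function relies on every ordinary word starting with a capital, so a lowercase-first string such as 'myPDFFile' comes back as ['myPDF', 'File']. If that matters, a match-based alternative might look like the sketch below. It is a different technique from the one above, the function name is hypothetical, and it only handles ASCII letters and digits.

```javascript
// Hypothetical alternative: collect words with match() instead of split().
function splitWordsWithAbbreviations(s) {
  // 1) a run of capitals not followed by a lowercase letter (an abbreviation)
  // 2) an optional capital followed by lowercase letters (an ordinary word)
  // 3) a run of digits
  return s.match(/[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+/g) || [];
}

console.log(splitWordsWithAbbreviations('PDFSplitAndMergeSamples'));
// ["PDF", "Split", "And", "Merge", "Samples"]
console.log(splitWordsWithAbbreviations('myPDFFile'));
// ["my", "PDF", "File"]
console.log(splitWordsWithAbbreviations('userID'));
// ["user", "ID"]
```

The `|| []` guard is there because match() returns null when nothing matches, for example on an empty string.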

Solution 5:

I found that none of the answers to this question worked in all cases, and none of them worked at all for Unicode strings, so here's one that handles both, including splitting on dashes and underscores.

let samples = [
  "ThereIsWay_too  MuchCGIInFilms These-days",
  "UnicodeCanBeCAPITALISEDTooYouKnow",
  "CAPITALLetters at the StartOfAString_work_too",
  "As_they_DoAtTheEND",
  "BitteWerfenSie-dieFußballeInDenMüll",
  "IchHabeUberGesagtNichtÜber",
  "2BeOrNot2Be",
  "ICannotBelieveThe100GotRenewed. It-isSOOOOOOBad"
];

samples.forEach(sample => console.log(sample.replace(/([^\p{L}\d]+|(?<=[\p{Ll}\d])(?=\p{Lu})|(?<=\p{Lu})(?=\p{Lu}[\p{Ll}\d])|(?<=[\p{L}\d])(?=\p{Lu}[\p{Ll}\d]))/gu, '-').toUpperCase()));

If you don't want numbers treated as lower case letters, then:

let samples = [
  "2beOrNot2Be",
  "ICannotBelieveThe100GotRenewed. It-isSOOOOOOBad"
];

samples.forEach(sample => console.log(sample.replace(/([^\p{L}\d]+|(?<=\p{L})(?=\d)|(?<=\d)(?=\p{L})|(?<=[\p{Ll}\d])(?=\p{Lu})|(?<=\p{Lu})(?=\p{Lu}\p{Ll})|(?<=[\p{L}\d])(?=\p{Lu}\p{Ll}))/gu, '-').toUpperCase()));
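
For reuse, the first pattern (digits treated like lowercase letters) could be wrapped in a function that takes the separator as an argument. This is a sketch under a few assumptions: the names WORD_BOUNDARY and joinWords are made up here, the separator class is written as [^\p{L}\d], and the engine must support lookbehind and Unicode property escapes (ES2018).

```javascript
// Hypothetical wrapper around the Unicode-aware pattern.
// Alternatives, in order: a run of non-letter/non-digit separators;
// lower-or-digit followed by upper; upper followed by upper + lower-or-digit;
// any letter or digit followed by upper + lower-or-digit.
const WORD_BOUNDARY = /([^\p{L}\d]+|(?<=[\p{Ll}\d])(?=\p{Lu})|(?<=\p{Lu})(?=\p{Lu}[\p{Ll}\d])|(?<=[\p{L}\d])(?=\p{Lu}[\p{Ll}\d]))/gu;

function joinWords(input, separator = '-') {
  // Every boundary (real separator or zero-width case change) becomes `separator`.
  return input.replace(WORD_BOUNDARY, separator);
}

console.log(joinWords('2BeOrNot2Be'));                       // "2-Be-Or-Not2-Be"
console.log(joinWords('As_they_DoAtTheEND'));                // "As-they-Do-At-The-END"
console.log(joinWords('UnicodeCanBeCAPITALISEDTooYouKnow', ' '));
// "Unicode Can Be CAPITALISED Too You Know"
```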