Find character with most occurrences in string?
Solution 1:
input.GroupBy(x => x).OrderByDescending(x => x.Count()).First().Key
Notes:
- If you need this to work on ancient (2.0) versions of .NET, consider LinqBridge. If you can't use C# 3.0 (targeting .NET 2.0), you are probably better off with other solutions due to missing lambda support. Another .NET 2.0+ option is covered in xanatos's answer.
- For the case of "aaaabbbb", only one of those will be returned (thanks xanatos for the comment). If you need all of the elements with maximum count, use Albin's solution instead.
- Due to sorting, this is an O(n log n) solution. If you need better than that, find the max value by linear search instead of sorting first, which gives O(n). See LINQ: How to perform .Max() on a property of all objects in a collection and return the object with maximum value
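The linear-search idea from the last note can be sketched roughly as follows. This is only an illustration, not the linked answer's code: the MostFrequent helper and the MostFrequentDemo class are hypothetical names, and the sketch assumes that on a tie you want the character that appears first in the input.

```csharp
using System;
using System.Collections.Generic;

public static class MostFrequentDemo
{
    // Hypothetical helper: one pass to count, one pass to pick the maximum. O(n) overall.
    public static char MostFrequent(string input)
    {
        if (string.IsNullOrEmpty(input))
            throw new ArgumentException("Input must contain at least one character.", nameof(input));

        // Pass 1: count each character.
        var counts = new Dictionary<char, int>();
        foreach (char c in input)
        {
            counts.TryGetValue(c, out int n); // n stays 0 when c has not been seen yet
            counts[c] = n + 1;
        }

        // Pass 2: linear search for the maximum instead of sorting.
        // Walking the input (not the dictionary) makes ties resolve to the earliest character.
        char best = input[0];
        int bestCount = 0;
        foreach (char c in input)
        {
            if (counts[c] > bestCount)
            {
                best = c;
                bestCount = counts[c];
            }
        }
        return best;
    }

    public static void Main()
    {
        Console.WriteLine(MostFrequent("abbbbccd")); // b
        Console.WriteLine(MostFrequent("aaaabbbb")); // a (first of the tied characters)
    }
}
```

Like the one-liner, this still returns a single character, so the tie caveat above applies.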
Solution 2:
This is here because someone asked for a 2.0 version, so no LINQ.
Dictionary<char, int> dict = new Dictionary<char, int>();
int max = 0;
foreach (char c in "abbbbccccd")
{
int i;
dict.TryGetValue(c, out i);
i++;
if (i > max)
{
max = i;
}
dict[c] = i;
}
foreach (KeyValuePair<char, int> chars in dict)
{
if (chars.Value == max)
{
Console.WriteLine("{0}: {1}", chars.Key, chars.Value);
}
}
And this is the LINQ version. It extracts tied "bests" (aaaabbbb == a, b). It WON'T work if str == String.Empty, because First() throws on an empty sequence.
var str = "abbbbccccd";
var res = str.GroupBy(p => p).Select(p => new { Count = p.Count(), Char = p.Key }).GroupBy(p => p.Count, p => p.Char).OrderByDescending(p => p.Key).First();
foreach (var r in res) {
Console.WriteLine("{0}: {1}", res.Key, r);
}
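If the empty-string case matters, one possible workaround is to swap First() for FirstOrDefault() and check for null. A minimal sketch under that assumption; the MostFrequentChars helper and the TiedModesDemo class are hypothetical names, not part of the answer above.

```csharp
using System;
using System.Linq;

public static class TiedModesDemo
{
    // Hypothetical helper: returns every character tied for the highest count,
    // or an empty array when the input string is empty.
    public static char[] MostFrequentChars(string str)
    {
        var best = str.GroupBy(c => c)                        // one group per character
                      .GroupBy(g => g.Count(), g => g.Key)    // regroup characters by their count
                      .OrderByDescending(g => g.Key)          // highest count first
                      .FirstOrDefault();                      // null instead of an exception on ""

        return best == null ? new char[0] : best.ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", MostFrequentChars("aaaabbbb"))); // a, b
        Console.WriteLine(MostFrequentChars("").Length);                     // 0
    }
}
```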
Solution 3:
string testString = "abbbbccd";
var charGroups = (from c in testString
group c by c into g
select new
{
c = g.Key,
count = g.Count(),
}).OrderByDescending(c => c.count);
foreach (var group in charGroups)
{
Console.WriteLine(group.c + ": " + group.count);
}
Solution 4:
Inspired by Stephen's answer, almost the same:
public static IEnumerable<T> Mode<T>(this IEnumerable<T> input)
{
var dict = input.ToLookup(x => x);
if (dict.Count == 0)
return Enumerable.Empty<T>();
var maxCount = dict.Max(x => x.Count());
return dict.Where(x => x.Count() == maxCount).Select(x => x.Key);
}
var modes = "".Mode().ToArray(); //returns { }
var modes = "abc".Mode().ToArray(); //returns { a, b, c }
var modes = "aabc".Mode().ToArray(); //returns { a }
var modes = "aabbc".Mode().ToArray(); //returns { a, b }
Update: I did a quick benchmark of this answer vs Jodrell's answer (release build, debugger detached, oh yes):
source = "";
iterations = 1000000
result:
this - 280 ms
Jodrell's - 900 ms
source = "aabc";
iterations = 1000000
result:
this - 1800 ms
Jodrell's - 3200 ms
source = fairly large string - 3500+ char
iterations = 10000
result:
this - 3200 ms
Jodrell's - 3000 ms
Solution 5:
EDIT 3
Here is my last answer which I think (just) shades Nawfal's for performance on longer sequences.
However, given the reduced complexity of Nawfal's answer, and its more universal performance, especially in relation to the question, I'd choose that.
public static IEnumerable<T> Mode<T>(
this IEnumerable<T> source,
IEqualityComparer<T> comparer = null)
{
var counts = source.GroupBy(t => t, comparer)
.Select(g => new { g.Key, Count = g.Count() })
.ToList();
if (counts.Count == 0)
{
return Enumerable.Empty<T>();
}
var maxes = new List<int>(5);
int maxCount = 1;
for (var i = 0; i < counts.Count; i++)
{
if (counts[i].Count < maxCount)
{
continue;
}
if (counts[i].Count > maxCount)
{
maxes.Clear();
maxCount = counts[i].Count;
}
maxes.Add(i);
}
return maxes.Select(i => counts[i].Key);
}
EDIT 2
EDIT
If you want an efficient generic solution, that accounts for the fact that multiple items could have the same frequency, start with this extension,
public static IOrderedEnumerable<KeyValuePair<int, IEnumerable<T>>> Frequency<T>(
this IEnumerable<T> source,
IEqualityComparer<T> comparer = null)
{
return source.GroupBy(t => t, comparer)
.GroupBy(
g => g.Count(),
(k, s) => new KeyValuePair<int, IEnumerable<T>>(
k,
s.Select(g => g.First())))
.OrderByDescending(f => f.Key);
}
This extension works in all of the following scenarios:
var mostFrequent = string.Empty.Frequency().FirstOrDefault();
var mostFrequent = "abbbbccd".Frequency().First();
or,
var mostFrequent = "aaacbbbcdddceee".Frequency().First();
Note that mostFrequent is a KeyValuePair<int, IEnumerable<char>>.
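To make the shape of that result concrete, here is a self-contained sketch of consuming it. It repeats the Frequency extension so that it compiles on its own, and it assumes the comparer parameter is an IEqualityComparer<T>, which is what GroupBy accepts; the class names FrequencyExtensions and FrequencyDemo are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FrequencyExtensions
{
    // Groups equal items, then groups those groups by size, largest count first.
    public static IOrderedEnumerable<KeyValuePair<int, IEnumerable<T>>> Frequency<T>(
        this IEnumerable<T> source,
        IEqualityComparer<T> comparer = null) // assumption: equality comparer, as GroupBy expects
    {
        return source.GroupBy(t => t, comparer)
            .GroupBy(
                g => g.Count(),
                (k, s) => new KeyValuePair<int, IEnumerable<T>>(
                    k,
                    s.Select(g => g.First())))
            .OrderByDescending(f => f.Key);
    }
}

public static class FrequencyDemo
{
    public static void Main()
    {
        // Key is the occurrence count, Value holds every item that occurs that many times.
        var mostFrequent = "aaacbbbcdddceee".Frequency().First();
        Console.WriteLine("{0}: {1}", mostFrequent.Key, string.Join(", ", mostFrequent.Value));
        // prints "3: a, c, b, d, e"

        var single = "abbbbccd".Frequency().First();
        Console.WriteLine("{0}: {1}", single.Key, string.Join(", ", single.Value));
        // prints "4: b"
    }
}
```

For an empty source, FirstOrDefault() yields a default KeyValuePair whose Value is null, so check for that before enumerating it.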
If so minded you could simplify this to another extension,
public static IEnumerable<T> Mode<T>(
this IEnumerable<T> source,
IEqualityComparer<T> comparer = null)
{
var mode = source.GroupBy(
t => t,
(t, s) => new { Value = t, Count = s.Count() }, comparer)
.GroupBy(f => f.Count)
.OrderByDescending(g => g.Key).FirstOrDefault();
return mode == null ? Enumerable.Empty<T>() : mode.Select(g => g.Value);
}
which obviously could be used thus,
var mostFrequent = string.Empty.Mode();
var mostFrequent = "abbbbccd".Mode();
var mostFrequent = "aaacbbbcdddceee".Mode();
Here, mostFrequent is an IEnumerable<char>.