Find character with most occurrences in string?

Solution 1:

input.GroupBy(x => x).OrderByDescending(x => x.Count()).First().Key

Notes:

  • if you need this to work on ancient (2.0) versions of .Net consider LinqBridge. If you can't use C# 3.0 (targeting .Net 2.0) you probably better off with other solutions due to missing lambda support. Another .Net 2.0+ option is covered in xanatos answer.
  • for the case of "aaaabbbb" only one of those will be returned (thanks xanatos for comment). If you need all of the elements with maximum count, use Albin's solution instead.
  • due to sorting this if O(n log n) solution. If you need better than that - find Max value by linear search instead of sorting first which will give O(n). See LINQ: How to perform .Max() on a property of all objects in a collection and return the object with maximum value

Solution 2:

This because someone asked for a 2.0 version, so no LINQ.

Dictionary<char, int> dict = new Dictionary<char, int>();

int max = 0;

foreach (char c in "abbbbccccd")
{
    int i;
    dict.TryGetValue(c, out i);
    i++;
    if (i > max)
    {
        max = i;
    }
    dict[c] = i;
}

foreach (KeyValuePair<char, int> chars in dict)
{
    if (chars.Value == max)
    {
        Console.WriteLine("{0}: {1}", chars.Key, chars.Value);
    }
}

Instead this for the LINQ version. It will extract paired "bests" (aaaabbbb == a, b). It WON'T work if str == String.Empty.

var str = "abbbbccccd";

var res = str.GroupBy(p => p).Select(p => new { Count = p.Count(), Char = p.Key }).GroupBy(p => p.Count, p => p.Char).OrderByDescending(p => p.Key).First();

foreach (var r in res) {
    Console.WriteLine("{0}: {1}", res.Key, r);
}

Solution 3:

string testString = "abbbbccd";
var charGroups = (from c in testString
                    group c by c into g
                    select new
                    {
                        c = g.Key,
                        count = g.Count(),
                    }).OrderByDescending(c => c.count);
foreach (var group in charGroups)
{
    Console.WriteLine(group.c + ": " + group.count);
}

Solution 4:

Inspired from Stephen's answer, almost the same:

public static IEnumerable<T> Mode<T>(this IEnumerable<T> input)
{
    var dict = input.ToLookup(x => x);
    if (dict.Count == 0)
        return Enumerable.Empty<T>();
    var maxCount = dict.Max(x => x.Count());
    return dict.Where(x => x.Count() == maxCount).Select(x => x.Key);
}

var modes = "".Mode().ToArray(); //returns { }
var modes = "abc".Mode().ToArray(); //returns { a, b, c }
var modes = "aabc".Mode().ToArray(); //returns { a }
var modes = "aabbc".Mode().ToArray(); //returns { a, b }

Update: Did a quick benchmarking of this answer vs Jodrell's answer (release build, debugger detached, oh yes)

source = "";

iterations = 1000000

result:

this - 280 ms
Jodrell's - 900 ms

source = "aabc";

iterations = 1000000

result:

this - 1800 ms
Jodrell's - 3200 ms

source = fairly large string - 3500+ char

iterations = 10000

result:

this - 3200 ms
Jodrell's - 3000 ms

Solution 5:

EDIT 3

Here is my last answer which I think (just) shades Nawfal's for performance on longer sequences.

However, given the reduced complexity of Nawfal's answer, and its more universal performance, especially in relation to the question, I'd choose that.

public static IEnumerable<T> Mode<T>(
    this IEnumerable<T> source,
    IEqualityComparer<T> comparer = null)
{
    var counts = source.GroupBy(t => t, comparer)
        .Select(g => new { g.Key, Count = g.Count() })
        .ToList();

    if (counts.Count == 0)
    {
        return Enumerable.Empty<T>();
    }

    var maxes = new List<int>(5);
    int maxCount = 1;

    for (var i = 0; i < counts.Count; i++)
    {
        if (counts[i].Count < maxCount)
        {
            continue;
        }

        if (counts[i].Count > maxCount)
        {
            maxes.Clear();
            maxCount = counts[i].Count;
        }

        maxes.Add(i);
    }

    return maxes.Select(i => counts[i].Key);
}

EDIT 2


EDIT



If you want an efficient generic solution, that accounts for the fact that multiple items could have the same frequency, start with this extension,

IOrderedEnumerable<KeyValuePair<int, IEnumerable<T>>>Frequency<T>(
    this IEnumerable<T> source,
    IComparer<T> comparer = null)
{
    return source.GroupBy(t => t, comparer)
        .GroupBy(
            g => g.Count(),
            (k, s) => new KeyValuePair<int, IEnumerable<T>>(
                k,
                s.Select(g => g.First())))
        .OrderByDescending(f => f.Key);
}

This extension works in all of the following scenarios

var mostFrequent = string.Empty.Frequency().FirstOrDefault();

var mostFrequent = "abbbbccd".Frequency().First();

or,

var mostFrequent = "aaacbbbcdddceee".Frequency().First();

Note that mostFrequent is a KeyValuePair<int, IEnumerable<char>>.


If so minded you could simplify this to another extension,

public static IEnumerable<T> Mode<T>(
    this IEnumerable<T> source,
    IEqualityComparer<T> comparer = null)
{
    var mode = source.GroupBy(
            t => t,
            (t, s) => new { Value = t, Count = s.Count() }, comparer)
        .GroupBy(f => f.Count)
        .OrderbyDescending(g => g.Key).FirstOrDefault();

    return mode == null ? Enumerable.Empty<T>() : mode.Select(g => g.Value);
}

which obviously could be used thus,

var mostFrequent = string.Empty.Mode();

var mostFrequent = "abbbbccd".Mode();

var mostFrequent = "aaacbbbcdddceee".Mode();

here, mostFrequent is an IEnumerable<char>.