"Series objects are mutable and cannot be hashed" error
Shortly: gene_name[x]
is a mutable object so it cannot be hashed. To use an object as a key in a dictionary, python needs to use its hash value, and that's why you get an error.
Further explanation:
Mutable objects are objects which value can be changed.
For example, list
is a mutable object, since you can append to it. int
is an immutable object, because you can't change it. When you do:
a = 5;
a = 3;
You don't change the value of a
, you create a new object and make a
point to its value.
Mutable objects cannot be hashed. See this answer.
To solve your problem, you should use immutable objects as keys in your dictionary. For example: tuple
, string
, int
.
gene_name = no_headers.iloc[1:,[1]]
This creates a DataFrame because you passed a list of columns (single, but still a list). When you later do this:
gene_name[x]
you now have a Series object with a single value. You can't hash the Series.
The solution is to create Series from the start.
gene_type = no_headers.iloc[1:,0]
gene_name = no_headers.iloc[1:,1]
disease_name = no_headers.iloc[1:,2]
Also, where you have orph_dict[gene_name[x]] =+ 1
, I'm guessing that's a typo and you really mean orph_dict[gene_name[x]] += 1
to increment the counter.