How to access/change properties of individual points on matplotlib scatter plot
Is there a way to modify the properties of individual points on a matplotlib scatter plot, for example to make certain points invisible or to change their size or shape?
Let's consider an example data set built with pandas.DataFrame():
import pandas as pd
import matplotlib.pyplot as plt
import random
df = pd.DataFrame()
df['name'] = ['cat', 'dog', 'bird', 'fish', 'frog']
df['id'] = [1, 1, 1, 2, 2]
df['x'] = [random.randint(-10, 10) for n in range(5)]
df['y'] = [random.randint(-10, 10) for n in range(5)]
Let's plot it as a scatter plot:
sc = plt.scatter(df['x'].tolist(), df['y'].tolist())
plt.show()
#easy-peasy
Plot was generated.
Let's say I want all data points that have id=1 in df removed from the existing plot (for example on a button click). By removed I don't necessarily mean deleted; setting them invisible or something similar would be fine. In general, I'm interested in a way to iterate over each point that exists on the plot and do something with it.
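For what it's worth, here is a minimal sketch of one way to style points individually through the public collection setters rather than by touching each point as a separate object. It assumes fixed example coordinates instead of the random ones above, the Agg backend so it runs without a display, and arbitrary sizes and colour chosen only for illustration. The object returned by scatter is a single PathCollection, so per-point properties are arrays with one entry per row of the dataframe.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba
import numpy as np
import pandas as pd

# Fixed example data (assumption: replaces the random values for repeatability)
df = pd.DataFrame({
    "name": ["cat", "dog", "bird", "fish", "frog"],
    "id": [1, 1, 1, 2, 2],
    "x": [3, -4, 7, 0, -9],
    "y": [5, 2, -6, 8, -1],
})

fig, ax = plt.subplots()
sc = ax.scatter(df["x"], df["y"], s=40, color="tab:blue")

# Boolean selector aligned with the rows of df (and so with the plotted points)
hide = (df["id"] == 1).to_numpy()

# One size per point: enlarge the points that stay visible
sizes = np.where(hide, 40.0, 120.0)
sc.set_sizes(sizes)

# One RGBA colour per point: alpha 0 makes the selected points invisible
colors = np.tile(to_rgba("tab:blue"), (len(df), 1))
colors[hide, 3] = 0.0
sc.set_facecolor(colors)
sc.set_edgecolor(colors)

fig.canvas.draw()
```

Setting the alpha column back to 1.0 and redrawing should make the points reappear, since nothing is removed from the collection.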
EDIT #1
Using the inspect module, I noticed that the sc plot object holds a property named sc._offsets. It seems to be a 2D numpy array holding the coordinates of the data points on the scatter plot (for a 2D plot).
This _offsets property consists of three components: "data" (a 2D array of coordinates), "mask" (a 2D array of bool values, all False in this case) and "fill value", which seems to be of no concern to me.
I've managed to remove chosen points from the scatter plot by deleting the _offsets elements at certain indexes, like this:
import numpy
sc._offsets = numpy.delete(sc._offsets, [0, 1, 3], axis=0)
and then re-drawing the plot:
sc.figure.canvas.draw()
Since the values in the 'id' column of the dataframe and the coordinates in sc._offsets are aligned, I can remove coordinates by the index where the 'id' value was (for example) 1. This does what I wanted, because the original dataframe with the data set remains intact, so I can re-create points on the scatter plot on demand.
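The same index-based removal can probably be done without the private attribute, using the public get_offsets() and set_offsets() methods. The sketch below is only an illustration: it assumes fixed example coordinates and the Agg backend, and the variable names keep and n_shown are mine. One caveat is that any per-point sizes or colours are not trimmed along with the offsets, so they may no longer line up with the remaining points.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Fixed example data (assumption: replaces the random values for repeatability)
df = pd.DataFrame({
    "name": ["cat", "dog", "bird", "fish", "frog"],
    "id": [1, 1, 1, 2, 2],
    "x": [3, -4, 7, 0, -9],
    "y": [5, 2, -6, 8, -1],
})

fig, ax = plt.subplots()
sc = ax.scatter(df["x"], df["y"])

# Rows of df and rows of the offsets array are aligned, so a boolean
# selector built from the 'id' column picks the points to keep.
keep = (df["id"] != 1).to_numpy()
sc.set_offsets(sc.get_offsets()[keep])
fig.canvas.draw_idle()
n_shown = len(sc.get_offsets())  # number of points left on the plot

# The dataframe is untouched, so the full set can be restored from it.
sc.set_offsets(df[["x", "y"]].to_numpy())
fig.canvas.draw_idle()
```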
I think I could use the "mask" to somehow hide or show chosen points on the scatter plot, but I don't yet know how. I'm investigating it.
SOLVED
The answer is to set the mask of the numpy.ma.MaskedArray that sits behind the sc._offsets property of the matplotlib scatter plot, that is, sc._offsets.mask.
This can be done in the following way, both during plot generation and after the plot has been generated, in interactive mode:
#before change:
#sc._offsets.mask = [[False, False], [False, False], [False, False], [False, False], [False, False]]
sc._offsets.mask = [[1, 1], [1, 1], [1, 1], [0, 0], [0, 0]]
#after change:
#sc._offsets.mask = [[True, True], [True, True], [True, True], [False, False], [False, False]]
#then re-draw plot
sc.figure.canvas.draw() #docs say that it's better to use draw_idle() but I don't see difference
Setting the mask value that corresponds to the index of a point to True excludes that particular point from the plot. It does not delete it: points can be restored by setting the bool values back to False. Note that the mask is a 2D array, so passing a simple [1, 1, 1, 0, 0] will not do; you need to cover both the x and y coordinates of each point.
Consult numpy docs for details: https://numpy.org/doc/stable/reference/maskedarray.generic.html#accessing-the-mask
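Rather than typing the 2D mask by hand, it can be derived from the 'id' column. This is a rough sketch under a few assumptions: fixed example coordinates, the Agg backend, and a reasonably recent matplotlib in which the public set_offsets() keeps the mask of a masked array (on versions where it does not, assigning to sc._offsets.mask as shown above is the fallback). The names hide, mask and masked_xy are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Fixed example data (assumption: replaces the random values for repeatability)
df = pd.DataFrame({
    "name": ["cat", "dog", "bird", "fish", "frog"],
    "id": [1, 1, 1, 2, 2],
    "x": [3, -4, 7, 0, -9],
    "y": [5, 2, -6, 8, -1],
})

fig, ax = plt.subplots()
sc = ax.scatter(df["x"], df["y"])

# One flag per point, repeated for the x and y columns to get the 2D mask
hide = (df["id"] == 1).to_numpy()
mask = np.column_stack([hide, hide])

# Masked coordinates: True in the mask means "do not draw this point"
xy = df[["x", "y"]].to_numpy(dtype=float)
masked_xy = np.ma.masked_array(xy, mask=mask)

sc.set_offsets(masked_xy)
fig.canvas.draw_idle()
```

Passing the same coordinates with an all-False mask (or the plain xy array) should bring the hidden points back.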
I'll edit if something comes up. Thank you all for help.