Two or more quantities on the same scatter plot in seaborn
I am trying to make a scatter plot of the following data, with all columns in one plot.
I imported this data from a CSV file into a dataframe df_inv
and then assigned it to the variable tips:
tips = df_inv
sns.scatterplot(data=tips, x=df_inv.index, y = "a")
plt.show()
I want to add columns b, c, and d to the same plot, but I can't find the right code. I tried y = ["a", "b", "c", "d", "e"]
but it didn't work. Ideally, the result would look like the following, using x, *, and other marker shapes rather than all circles.
Please help me achieve this.
Another solution, if you don't want to reshape the dataframe, is to call sns.scatterplot several times, each time passing a different column as the y parameter. Every call after the first draws on the axes that the first call created, so all columns end up on the same plot. You can keep lists of colors and markers to use for each call, and build the legend manually. Here is example code that plots all columns of the df_inv dataframe (filled with randomly generated data).
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import random
import matplotlib.lines as mlines
df_inv = pd.DataFrame({'a': random.sample(range(300, 400), 5),
                       'b': random.sample(range(100, 200), 5),
                       'c': random.sample(range(40, 90), 5)},
                      index=range(1, 6))
markers = ['o', 'x', 'd']
colors = ['purple', 'cyan', 'green']
legend_handles = []
for i, col_name in enumerate(df_inv.columns):
    # each call draws one column on the current axes
    sns.scatterplot(data=df_inv, x=df_inv.index, y=col_name,
                    marker=markers[i], color=colors[i], s=100)  # s = marker size
    # matching legend entry, built by hand
    legend_handles.append(mlines.Line2D([], [], color=colors[i], marker=markers[i],
                                        linestyle='None', markersize=8, label=col_name))
plt.ylabel('Value')
plt.xlabel('Index')
plt.grid()
plt.legend(handles=legend_handles, bbox_to_anchor=(1.02, 1), title='Column')
plt.tight_layout()
plt.show()
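Code result: [figure: scatter plot of columns a, b, and c against the index, each with its own marker and color]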
You could reshape your data into a different dataframe with pandas.melt:
df_inv = df_inv.reset_index()
columns = ['index', 'a', 'b', 'c', 'd']
df_to_plot = df_inv[columns]
df_to_plot = pd.melt(frame = df_to_plot,
                     id_vars = 'index',
                     var_name = 'column_name',
                     value_name = 'value')
This way, you will get something like:
    index column_name  value
0       0           a    315
1       1           a    175
2       2           a     65
3       3           a    370
4       4           a    419
5       0           b    173
6       1           b    206
7       2           b    271
8       3           b    463
9       4           b    419
10      0           c     58
...
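The reshape above can be checked on a small frame before plotting. This is only a minimal sketch: the fixed values and the name long_df are assumptions made for illustration, so that the output is reproducible.

```python
import pandas as pd

# Hypothetical fixed values (not random) so the result is reproducible
df_inv = pd.DataFrame({'a': [315, 175, 65],
                       'b': [173, 206, 271]})

# reset_index() turns the unnamed row index into a regular column called 'index';
# melt then stacks every remaining column into (column_name, value) pairs
long_df = pd.melt(frame=df_inv.reset_index(),
                  id_vars='index',
                  var_name='column_name',
                  value_name='value')

# one row per (index, column) pair: 3 rows x 2 columns -> 6 rows
print(long_df)
```

Each original column contributes one block of rows, which is what lets seaborn tell the series apart through column_name.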
Now you can plot everything with a single call (ax is the Axes object created with plt.subplots() in the complete code below):
sns.scatterplot(ax = ax, data = df_to_plot, x = 'index', y = 'value', style = 'column_name', hue = 'column_name')
Complete code:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
N = 5
df_inv = pd.DataFrame()
df_inv['a'] = np.random.randint(low = 50, high = 500, size = N)
df_inv['b'] = np.random.randint(low = 50, high = 500, size = N)
df_inv['c'] = np.random.randint(low = 50, high = 500, size = N)
df_inv['d'] = np.random.randint(low = 50, high = 500, size = N)
df_inv = df_inv.reset_index()
columns = ['index', 'a', 'b', 'c', 'd']
df_to_plot = df_inv[columns]
df_to_plot = pd.melt(frame = df_to_plot,
                     id_vars = 'index',
                     var_name = 'column_name',
                     value_name = 'value')
fig, ax = plt.subplots()
sns.scatterplot(ax = ax, data = df_to_plot, x = 'index', y = 'value', style = 'column_name', hue = 'column_name')
plt.show()
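Since the question asks for specific shapes such as x and *, the melt approach can probably be combined with the markers parameter of sns.scatterplot, which accepts a dict mapping each style level to a matplotlib marker. The following is a sketch under some assumptions: the helper name plot_long_form, the fixed data values, and the marker choices are all made up for illustration. It uses 'X' (the filled x) instead of 'x', because as far as I know seaborn does not allow filled and line-art markers to be mixed in one call.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


def plot_long_form(df_wide, markers):
    """Melt a wide dataframe and draw all columns on one scatter plot.

    plot_long_form is a hypothetical helper, not part of seaborn.
    markers maps each column name to a matplotlib marker code.
    """
    df_long = pd.melt(frame=df_wide.reset_index(),
                      id_vars='index',
                      var_name='column_name',
                      value_name='value')
    fig, ax = plt.subplots()
    sns.scatterplot(ax=ax, data=df_long, x='index', y='value',
                    hue='column_name', style='column_name',
                    markers=markers, s=100)  # s = marker size
    return ax


# Hypothetical fixed values, for illustration only
df_inv = pd.DataFrame({'a': [315, 175, 65, 370, 419],
                       'b': [173, 206, 271, 463, 419],
                       'c': [58, 120, 330, 210, 95],
                       'd': [400, 260, 150, 80, 310]})

# All filled markers: circle, filled x, star, diamond
markers = {'a': 'o', 'b': 'X', 'c': '*', 'd': 'D'}

if __name__ == "__main__":
    plot_long_form(df_inv, markers)
    plt.show()
```

If the thin 'x' marker is required, the loop-based answer with one sns.scatterplot call per column is the simpler route, since each call then uses a single marker.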