Have you ever considered scatter plots to be a tad boring?

Do you love the patterning of cathedral stained glass?

Well, besides aesthetics, there are cases in which a scatter plot might not be the best option to represent data.

What if you are more interested in the spatial distribution of some variable? Some points might be clustered together, while others sit far apart, and that uneven spacing makes it harder to read how the variable changes across the plot. Our brains are very good at seeing patterns, but they can easily overlook lonely points that sit far from the rest.

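A Voronoi plot is a neat way of drawing borders based on points. Each point gets its own cell, which covers the part of the canvas that is closer to that point than to any other, so every border sits equidistant from the two points it separates.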
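To make the equidistance idea concrete, here is a minimal sketch using `scipy.spatial.Voronoi` on five made-up points (four corners of a square plus its center). The point coordinates and the variable names are assumptions chosen for illustration only. It walks over every border segment and checks that its finite endpoints are equally far from the two points the border separates.

```python
import numpy as np
from scipy.spatial import Voronoi

# Made-up illustrative data: four corners of a square and its center.
points = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0], [1.0, 1.0]])
vor = Voronoi(points)

# Each ridge is a border separating two input points (p and q).
# Every finite vertex on that border should be equally far from both.
checked = 0
for (p, q), ridge in zip(vor.ridge_points, vor.ridge_vertices):
    for v in ridge:
        if v == -1:  # -1 marks a border that runs off to infinity
            continue
        d_p = np.linalg.norm(vor.vertices[v] - points[p])
        d_q = np.linalg.norm(vor.vertices[v] - points[q])
        assert np.isclose(d_p, d_q)
        checked += 1

# The center point is surrounded on all sides, so its cell is closed.
center_region = vor.regions[vor.point_region[4]]
print("vertices checked:", checked)
print("center cell is bounded:", -1 not in center_region)
```

Note how the corner points end up with open cells (their region lists contain `-1`), which is exactly the problem the plotting function has to work around.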

See it for yourself! Which plot makes it easier to see the distribution of color? The Voronoi plot uses all of the available canvas, while the normal scatter plot is left mostly blank.

Here is the lovely lovely code:

import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import Voronoi
import seaborn as sns  # only used for the iris dataset


def scatter_vor(data, x, y, c, cmap='jet', ax=None):
    """Create a Voronoi plot from a dataframe, given the x, y, and color columns.

    Args:
        data (pandas DataFrame): The data to plot.
        x: The column name for the x factor.
        y: The column name for the y factor.
        c: The column name for the color factor.
        cmap (str): matplotlib color map name, default is jet.
        ax (axes): axes to draw on. If none is given, one is created.

    Returns:
        ax: axis object with Voronoi plot
    """
    cmap = plt.get_cmap(cmap)
    points = data[[x, y]].values
    color = data[c].values
    # rescale the color values to the 0-1 range the color map expects
    color = color - np.nanmin(color)
    color = color / np.nanmax(color)

    # data limits, plus a padding distance of 10x the data range
    xLIM = (data[x].min(), data[x].max())
    xDist = xLIM[1] - xLIM[0]
    yLIM = (data[y].min(), data[y].max())
    yDist = yLIM[1] - yLIM[0]
    xDist, yDist = [i * 10 for i in [xDist, yDist]]

    # this is a small hack to force the Voronoi diagram to be bounded:
    # surround the data with a frame of far-away dummy points
    added_points = []
    steps = 20
    added_points += [(xLIM[0] - xDist, i) for i in np.linspace(yLIM[0] - yDist, yLIM[1] + yDist, num=steps)]
    added_points += [(xLIM[1] + xDist, i) for i in np.linspace(yLIM[0] - yDist, yLIM[1] + yDist, num=steps)]
    added_points += [(i, yLIM[0] - yDist) for i in np.linspace(xLIM[0] - xDist, xLIM[1] + xDist, num=steps)]
    added_points += [(i, yLIM[1] + yDist) for i in np.linspace(xLIM[0] - xDist, xLIM[1] + xDist, num=steps)]
    # the dummy points get the mean color; their cells lie outside the axes limits
    added_colors = np.array([sum(color) / len(color)] * len(added_points))
    added_points = np.array(added_points)

    points = np.concatenate([points, added_points], axis=0)
    color = np.concatenate([color, added_colors], axis=0)

    # compute Voronoi tessellation
    vor = Voronoi(points, qhull_options='Qbb Qc Qx')

    # plot
    if ax is None:
        fig, ax = plt.subplots(1)
    ax.set_ylim(yLIM)
    ax.set_xlim(xLIM)
    ax.set_ylabel(y)
    ax.set_xlabel(x)

    for n, region in enumerate(vor.regions):
        # skip empty regions and open regions (marked with -1)
        if not region or -1 in region:
            continue
        owners = np.where(vor.point_region == n)[0]
        if owners.size == 0:
            continue
        face_color = cmap(color[owners[0]])
        polygon = [vor.vertices[i] for i in region]
        ax.fill(*zip(*polygon), color=face_color)

    return ax
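If you are wondering why the bounding hack is needed, here is a small sketch of the idea on random made-up data (the unit-square points, the `far` distance, and the helper `n_unbounded` are all assumptions for illustration, not part of the function above). Points on the outer edge of the data get open cells that cannot be filled as polygons; once the data is surrounded by a frame of far-away dummy points, every real point gets a closed cell.

```python
import numpy as np
from scipy.spatial import Voronoi

rng = np.random.default_rng(0)
pts = rng.random((30, 2))  # made-up sample data in the unit square


def n_unbounded(vor, n_points):
    """Count how many of the first n_points own an open (unbounded) cell."""
    return sum(-1 in vor.regions[vor.point_region[i]] for i in range(n_points))


# Plain tessellation: every point on the convex hull gets an open cell.
open_before = n_unbounded(Voronoi(pts), len(pts))

# Same idea as the hack: surround the data with far-away dummy points.
far = 10.0
ring = np.linspace(-far, 1 + far, 20)
frame = np.concatenate([
    np.column_stack([np.full_like(ring, -far), ring]),     # left edge
    np.column_stack([np.full_like(ring, 1 + far), ring]),  # right edge
    np.column_stack([ring, np.full_like(ring, -far)]),     # bottom edge
    np.column_stack([ring, np.full_like(ring, 1 + far)]),  # top edge
])
open_after = n_unbounded(Voronoi(np.concatenate([pts, frame])), len(pts))

print("open cells without the frame:", open_before)
print("open cells with the frame:", open_after)
```

The dummy points themselves still have open cells, but they sit far outside the axes limits, so they never show up in the plot.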

Now that we have the function, let's hit it with some data:

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 5))
iris = sns.load_dataset('iris')

# left: Voronoi plot, right: regular scatter plot of the same columns
scatter_vor(iris, x='sepal_length', y='sepal_width', c='petal_length', ax=ax1)
ax1.axis('off')
iris.plot.scatter(x='sepal_length', y='sepal_width', c='petal_length', cmap='jet', ax=ax2)
plt.show()