Voronoi time

Have you ever considered scatter plots to be a tad boring?
Do you love the patterning of cathedral stained glass?
sourced from pixabay.com

Well, besides aesthetics, there are cases in which a scatter plot might not be the best option to represent data.
What if you are more interested in the spatial distribution of some variable? Some points cluster together while others sit far apart, which makes the distribution harder to read. Our brains are very good at seeing patterns, but they easily overlook lonely points that lie far from the rest.
A Voronoi plot is a neat way to divide the canvas into cells, one per point. The basic premise is to draw each border halfway between two neighbouring points, so every location inside a cell is closer to that cell's point than to any other.
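
If you want to convince yourself of the "equidistant" part, here is a minimal sketch. The random seed points and the variable names are made up for illustration; only `scipy.spatial.Voronoi` and its `ridge_points`, `ridge_vertices`, and `vertices` attributes come from SciPy. It checks that every finite border vertex is equally far from the two points it separates.

```python
import numpy as np
from scipy.spatial import Voronoi

# Hypothetical seed points in the unit square (not the iris data).
rng = np.random.default_rng(0)
seeds = rng.random((8, 2))
vor = Voronoi(seeds)

# Each ridge (border segment) separates exactly two seed points.
# Every finite vertex on that ridge should be equally far from both.
gaps = []
for (p, q), verts in zip(vor.ridge_points, vor.ridge_vertices):
    for v in verts:
        if v == -1:  # -1 marks a ridge end that runs off to infinity
            continue
        d_p = np.linalg.norm(vor.vertices[v] - seeds[p])
        d_q = np.linalg.norm(vor.vertices[v] - seeds[q])
        gaps.append(abs(d_p - d_q))

print("largest distance mismatch:", max(gaps))
```

The mismatch should come out at floating-point noise level, which is what "equidistant borders" means in practice.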

See it for yourself! Which plot makes it easier to see the distribution of color? The Voronoi plot uses the whole canvas, while the ordinary scatter plot leaves most of it blank.

Here is the lovely lovely code:

import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import Voronoi, voronoi_plot_2d
import seaborn as sns  # only used for the iris dataset (seaborn.apionly no longer exists in current releases)


def scatter_vor(data,x,y,c,cmap='jet',ax=None):
    """Create a Voronoi plot from a dataframe and the names of the x, y, and color columns.

    Args:
        data (pandas DataFrame): The data to plot.
        x: The column name for the x factor.
        y: The column name for the y factor.
        c: The column name for the color factor (must be numeric).
        cmap (str): matplotlib color map name, default is jet.
        ax (axes): axes to draw on. If none is given, a new one is created.

    Returns:
        ax: axis object with Voronoi plot


    """
    cmap = plt.get_cmap(cmap)
    points= data[[x,y]].values
    
    color = data[c].values
    color = color - np.nanmin(color)
    color = color / np.nanmax(color)
    # data limits and extents, used for the axis range and the bounding frame
    xLIM = (data[x].min(),data[x].max())
    xDist= xLIM[1]-xLIM[0]
    yLIM = (data[y].min(),data[y].max())
    yDist= yLIM[1]-yLIM[0]
    xDist,yDist=[i*10 for i in [xDist,yDist]]

    #this is a small hack to force the Voronoi diagram to be bounded
    added_points=[]
    steps = 20
    added_points += [(xLIM[0]-xDist,i)for i in np.linspace(yLIM[0]-yDist,yLIM[1]+yDist,num = steps)]
    added_points += [(xLIM[1]+xDist,i)for i in np.linspace(yLIM[0]-yDist,yLIM[1]+yDist,num = steps)]
    added_points += [(i,yLIM[0]-yDist)for i in np.linspace(xLIM[0]-xDist,xLIM[1]+xDist,num = steps)]
    added_points += [(i,yLIM[1]+yDist)for i in np.linspace(xLIM[0]-xDist,xLIM[1]+xDist,num = steps)]
    added_colors  = np.array([sum(color)/len(color)]*len(added_points))
    added_points  = np.array(added_points)

    points = np.concatenate([points,added_points],axis=0)
    color  = np.concatenate([color ,added_colors],axis=0)
    vor = Voronoi(points, qhull_options='Qbb Qc Qx')
    # plot
    if ax is None:
        fig,ax=plt.subplots(1)

    ax.set_ylim(yLIM)
    ax.set_xlim(xLIM)
    ax.set_ylabel(y)
    ax.set_xlabel(x)

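    for n,region in enumerate(vor.regions):
        # skip empty regions and unbounded ones (marked by a -1 vertex index)
        if not region or -1 in region:
            continue
        # find the input point(s) that own this region
        owners = np.where(vor.point_region == n)[0]
        if owners.size == 0:
            continue
        face = cmap(color[owners[0]])
        polygon = [vor.vertices[i] for i in region]
        ax.fill(*zip(*polygon),color=face)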

    return ax
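
A quick aside on the bounding hack: a point on the outer edge of the cloud gets a cell that runs off to infinity, and SciPy marks such a cell with a `-1` vertex index, so it cannot be filled. Surrounding the data with far-away helper points closes those cells. The sketch below is a simplified illustration with assumptions of its own: it uses random points, only four hypothetical corner points instead of the 20-per-side frame above, and a made-up helper named `count_bounded`.

```python
import numpy as np
from scipy.spatial import Voronoi


def count_bounded(vor, n_points):
    """Count how many of the first n_points own a closed (finite) region."""
    bounded = 0
    for region_idx in vor.point_region[:n_points]:
        region = vor.regions[region_idx]
        if region and -1 not in region:
            bounded += 1
    return bounded


# Hypothetical data: 30 random points in the unit square.
rng = np.random.default_rng(1)
pts = rng.random((30, 2))

# Without helpers, the points on the outer edge own open regions.
plain = Voronoi(pts)

# Four far-away corners (an assumption; the function above uses a denser frame).
frame = np.array([[-10.0, -10.0], [-10.0, 11.0], [11.0, -10.0], [11.0, 11.0]])
framed = Voronoi(np.concatenate([pts, frame], axis=0))

print("closed cells without frame:", count_bounded(plain, len(pts)))
print("closed cells with frame:   ", count_bounded(framed, len(pts)))
```

With the frame in place every original point should own a closed cell. The denser frame in `scatter_vor` does the same job; only the helper points themselves keep open cells, and those sit far outside the axis limits.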

Now that we have the function, let's hit it with some data:

fig,(ax1,ax2)= plt.subplots(1,2,figsize=(16,5))
iris = sns.load_dataset('iris')
scatter_vor(iris,x='sepal_length',y='sepal_width',c='petal_length',ax=ax1)
ax1.axis('off')
iris.plot.scatter(x='sepal_length',y='sepal_width',c='petal_length',cmap='jet',ax=ax2)
plt.show()

Feel free to plot to your heart’s content.
