
What's new in ggplot-0.4?

by Yhat

ggplot is a graphics package for Python that aims to approximate R's ggplot2 package in both usage and aesthetics.

This is a post summarising the latest fixes and enhancements in the ggplot-0.4 release.

Tidying up our mess

The positive reaction to ggplot from the outset was, candidly, a bit overwhelming! We'll be the first to admit that the first cut was a little... err... rough. But we've been hard at work incorporating new features and fixes, and we're extremely enthusiastic about the progress of and interest in the project.

A big thank you to everyone submitting pull requests! We are deeply appreciative, and we're doing our best to keep up!

Facets

One of the most obvious deficiencies of the initial version of ggplot was the faceting implementation. In the original blog post, a few of the facet_wrap and facet_grid plots were flat out missing certain variables (sorry about that!).

ggplot-0.4 makes facets a little prettier, fixes issues with assigning the wrong colors, shapes, and other aesthetics, and provides the ability to do some basic scaling when using facet_grid.

from ggplot import *
p = ggplot(aes(x='price'), data=diamonds)
p + geom_histogram() + facet_wrap("cut")

p = ggplot(aes(x='wt', y='mpg'), data=mtcars)
p + geom_point() + facet_grid("cyl", "gear", scales="free_y")

ggplot(aes(x='carat', y='price', colour='cut'), data=diamonds) + \
    geom_point() + facet_wrap("clarity")

We're still working on legends for facets, but that's coming soon!
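
To make the scales="free_y" option used above a little more concrete, here is a minimal sketch of the idea behind faceting: split the rows into one panel per facet value, then decide whether every panel shares one y range or computes its own. This is not ggplot's actual implementation. The helper names (split_into_panels, y_limits) and the toy rows are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Made-up toy rows standing in for a few car records (not the real mtcars data).
rows = [
    {"cyl": 4, "wt": 2.3, "mpg": 24.4},
    {"cyl": 4, "wt": 1.8, "mpg": 32.4},
    {"cyl": 6, "wt": 2.9, "mpg": 21.0},
    {"cyl": 6, "wt": 3.4, "mpg": 18.1},
    {"cyl": 8, "wt": 3.6, "mpg": 15.5},
]


def split_into_panels(records, facet_key):
    """Group records by the facet variable, one group per panel."""
    panels = defaultdict(list)
    for record in records:
        panels[record[facet_key]].append(record)
    return dict(panels)


def y_limits(panels, y_key, free_y=False):
    """Return (min, max) y limits per panel; shared unless free_y is set."""
    if free_y:
        # Each panel gets limits computed from its own rows only.
        return {
            name: (min(r[y_key] for r in recs), max(r[y_key] for r in recs))
            for name, recs in panels.items()
        }
    # Otherwise every panel uses the limits of the full dataset.
    all_values = [r[y_key] for recs in panels.values() for r in recs]
    shared = (min(all_values), max(all_values))
    return {name: shared for name in panels}


panels = split_into_panels(rows, "cyl")
shared_limits = y_limits(panels, "mpg")
free_limits = y_limits(panels, "mpg", free_y=True)
```

With shared limits, sparse panels can end up squeezed into a corner of the axis. Free limits fix that, at the cost of making panels harder to compare at a glance.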

Themes

Another big improvement is the overall look and feel of the graphics. Originally we were using a .matplotlibrc file, which didn't give users the ability to customize the style of their plots.

With version 0.4, however, ggplot supports proper themes! Hats off to Jan Schulz for tackling this one and building out the key plumbing involved.

ggplot(aes(x='date', y='beef'), data=meat) + \
    geom_line() + \
    theme_bw()

And of course no theme implementation would be complete without an xkcd style!

ggplot(aes(x='date', y='beef'), data=meat) + \
    geom_line() + \
    theme_xkcd()

New geoms

We've been able to port over geoms pretty quickly.

Thank you Eric Chiang & Justin Haynes for adding:

  • geom_step
  • geom_text
  • geom_tile

import numpy as np
import pandas as pd

random_walk = pd.DataFrame({
    "x": np.arange(100),
    "y": np.cumsum(np.random.choice([-1, 1], 100))
})
ggplot(aes(x='x', y='y'), data=random_walk) + \
    geom_step()

ggplot(aes(x='wt', y='mpg', label='name'), data=mtcars) + \
    geom_text()

df = pd.DataFrame({
    'x': ['a', 'b', 'c', 'a'],
    'y': [3, 2, 1, 2],
    'fill': np.random.random(4)
})
# Wrapping the expression in print(...) works in both Python 2 and Python 3
print(ggplot(aes(x='x', y='y', fill='fill'), data=df) +
    geom_tile() +
    xlab('X Label') +
    ylab('Y Label') +
    ggtitle('This is geom_tile!\n'))

Multiple ggplot objects in 1 plot

For more advanced plots, you inevitably need to be able to work with multiple data frames at the same time. Just add in a geom that contains a reference to your data and customize the aesthetics! We're still working out the legends so we'll thank you for your patience ahead of time.

random_walk1 = pd.DataFrame({
  "x": np.arange(100),
  "y": np.cumsum(np.random.choice([-1, 1], 100))
})
random_walk2 = pd.DataFrame({
  "x": np.arange(100),
  "y": np.cumsum(np.random.choice([-1, 1], 100))
})
ggplot(aes(x='x', y='y'), data=random_walk1) + \
    geom_step() + \
    geom_step(aes(x='x', y='y'), data=random_walk2)
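
One way to picture what happens in the example above: each layer either carries its own data or falls back to the data passed to ggplot(). The sketch below illustrates that fallback rule in plain Python. The Plot and Layer classes are hypothetical stand-ins, not ggplot's real classes, and the real library may resolve data differently.

```python
class Layer(object):
    """Hypothetical stand-in for a geom that may carry its own data."""

    def __init__(self, name, data=None):
        self.name = name
        self.data = data


class Plot(object):
    """Hypothetical stand-in for a plot with a default dataset."""

    def __init__(self, data):
        self.data = data
        self.layers = []

    def __add__(self, layer):
        # Return a new plot so the original is left untouched.
        new_plot = Plot(self.data)
        new_plot.layers = self.layers + [layer]
        return new_plot

    def resolved(self):
        """Pair each layer with the data it would draw."""
        return [
            (layer.name, layer.data if layer.data is not None else self.data)
            for layer in self.layers
        ]


walk1 = [0, 1, 2, 1]
walk2 = [0, -1, 0, 1]
plot = Plot(walk1) + Layer("step") + Layer("step", data=walk2)
resolved = plot.resolved()
```

Here the first layer draws walk1 (the plot default) and the second draws walk2, mirroring the two geom_step() calls.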

Color Scales

One of my favorite parts of ggplot2 for R is the color scaling. We've added a basic implementation of scale_color_gradient and scale_color_manual.

ggplot(aes(x='wt', y='mpg', color='mpg'), data=mtcars) + \
    geom_point() + \
    scale_colour_gradient2(low="coral", high="steelblue")

ggplot(aes(x='drat', y='mpg', color='wt'), data=mtcars) + \
    geom_point() + \
    scale_colour_gradient(low="white", mid="blue", high="black")
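
For anyone curious what a two-color gradient scale does with each data value, here is a rough sketch: rescale the value to the range 0 to 1, then interpolate linearly between the low and high colors. This assumes plain RGB interpolation, which may not match what ggplot does internally. The function names are hypothetical, and the hex codes are the standard values for the named colors coral and steelblue.

```python
def hex_to_rgb(color):
    """Convert '#rrggbb' to an (r, g, b) tuple of ints."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Convert an (r, g, b) tuple of ints to '#rrggbb'."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def gradient_color(value, vmin, vmax, low, high):
    """Linearly interpolate between two hex colors in RGB space."""
    if vmax == vmin:
        t = 0.0  # Degenerate range: fall back to the low color.
    else:
        t = (value - vmin) / float(vmax - vmin)
    t = min(1.0, max(0.0, t))  # Clamp values outside the range.
    low_rgb, high_rgb = hex_to_rgb(low), hex_to_rgb(high)
    mixed = tuple(
        int(round(a + (b - a) * t)) for a, b in zip(low_rgb, high_rgb)
    )
    return rgb_to_hex(mixed)


CORAL = "#ff7f50"      # named color "coral"
STEELBLUE = "#4682b4"  # named color "steelblue"

# Made-up mpg values, colored from coral (lowest) to steelblue (highest).
mpg_values = [10.4, 21.0, 33.9]
colors = [
    gradient_color(v, min(mpg_values), max(mpg_values), CORAL, STEELBLUE)
    for v in mpg_values
]
```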

We're working on scale_color_brewer. Look for it in the next release.

color_list = [
    '#FFAAAA', 
    '#ff5b00',
    '#c760ff', 
    '#f43605', 
    '#00FF00',
    '#0000FF', 
    '#4c9085'
]
lng = pd.melt(meat, ['date'])
ggplot(aes(x='date', y='value', color='variable'), data=lng) + \
    geom_point(size=3) + \
    scale_colour_manual(values=color_list)
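
The example above does two things: pd.melt reshapes the wide table into long (date, variable, value) rows, and the manual scale assigns one color from the list to each distinct variable. The pure-Python sketch below mimics both steps on a tiny table. The melt and manual_scale helpers and the sample numbers are hypothetical. Assigning colors in order of first appearance is an assumption, not documented ggplot behavior.

```python
# Made-up wide-format rows; the real meat dataset has more columns and rows.
wide = [
    {"date": "1944-01", "beef": 751, "pork": 1280},
    {"date": "1944-02", "beef": 713, "pork": 1169},
]


def melt(records, id_key, value_keys):
    """Reshape wide records into long (id, variable, value) records."""
    return [
        {id_key: rec[id_key], "variable": key, "value": rec[key]}
        for key in value_keys
        for rec in records
    ]


def manual_scale(categories, values):
    """Map each distinct category to a color, in order of first appearance."""
    distinct = []
    for category in categories:
        if category not in distinct:
            distinct.append(category)
    if len(distinct) > len(values):
        raise ValueError("not enough colors for the categories")
    return dict(zip(distinct, values))


long_rows = melt(wide, "date", ["beef", "pork"])
color_map = manual_scale(
    [row["variable"] for row in long_rows],
    ["#FFAAAA", "#ff5b00", "#c760ff"],
)
```

Note that the color list can be longer than the number of categories, as it is here. The sketch raises an error only when there are too few colors.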

ggsave

Another really helpful utility we've added is the ggsave command. It lets you save a plot to a .png file. We're going to be adding PDF support soon as well!

p = ggplot(aes(x='price'), data=diamonds) + geom_histogram() + ggtitle('My Diamond Histogram')
ggsave(p, "my_diamond_histogram.png")

And now we can open the .png file we just created in our current working directory:

In [5]: ! open ./my_diamond_histogram.png

What's in the pipeline

  • scale_color_brewer
  • additional methods for stat_smooth
  • facet legends
  • full IPython Notebook support (currently legends get chopped off :( )
  • saving multiple plots to a PDF
  • geom_errorbar
