Creating Animated NBA Shot Charts in Matplotlib

In [1]:
import goldsberry
import pandas as pd

import requests
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline

try:
    from urllib.request import urlretrieve  # Python 3
except ImportError:
    from urllib import urlretrieve  # Python 2
from matplotlib import animation
from matplotlib.offsetbox import  OffsetImage
from matplotlib.patches import Circle, Rectangle, Arc

Here is a function I'm borrowing from a blog post by Savvas Tjortjoglou. Matplotlib's Circle, Rectangle, and Arc patches are enough to draw an entire basketball court, and that is exactly what this function does. Savvas's function maps the real-life dimensions of all of these shapes to the proper coordinates and sizes, and it has saved me a ton of time!

In [2]:
def draw_court(ax=None, color='black', lw=2, outer_lines=False):
    # If an axes object isn't provided to plot onto, just get current one
    if ax is None:
        ax = plt.gca()

    # Create the various parts of an NBA basketball court

    # Create the basketball hoop
    # Diameter of a hoop is 18" so it has a radius of 9", which is a value of
    # 7.5 in our coordinate system
    hoop = Circle((0, 0), radius=7.5, linewidth=lw, color=color, fill=False)

    # Create backboard
    backboard = Rectangle((-30, -7.5), 60, -1, linewidth=lw, color=color)

    # The paint
    # Create the outer box of the paint, width=16ft, height=19ft
    outer_box = Rectangle((-80, -47.5), 160, 190, linewidth=lw, color=color,
                          fill=False)
    # Create the inner box of the paint, width=12ft, height=19ft
    inner_box = Rectangle((-60, -47.5), 120, 190, linewidth=lw, color=color,
                          fill=False)

    # Create free throw top arc
    top_free_throw = Arc((0, 142.5), 120, 120, theta1=0, theta2=180,
                         linewidth=lw, color=color, fill=False)
    # Create free throw bottom arc
    bottom_free_throw = Arc((0, 142.5), 120, 120, theta1=180, theta2=0,
                            linewidth=lw, color=color, linestyle='dashed')
    # Restricted Zone, it is an arc with 4ft radius from center of the hoop
    restricted = Arc((0, 0), 80, 80, theta1=0, theta2=180, linewidth=lw,
                     color=color)

    # Three point line
    # Create the side 3pt lines, they are 14ft long before they begin to arc
    corner_three_a = Rectangle((-220, -47.5), 0, 140, linewidth=lw,
                               color=color)
    corner_three_b = Rectangle((220, -47.5), 0, 140, linewidth=lw, color=color)
    # 3pt arc - center of arc will be the hoop, arc is 23'9" away from hoop
    # I just played around with the theta values until they lined up with the 
    # threes
    three_arc = Arc((0, 0), 475, 475, theta1=22, theta2=158, linewidth=lw,
                    color=color)

    # Center Court
    center_outer_arc = Arc((0, 422.5), 120, 120, theta1=180, theta2=0,
                           linewidth=lw, color=color)
    center_inner_arc = Arc((0, 422.5), 40, 40, theta1=180, theta2=0,
                           linewidth=lw, color=color)

    # List of the court elements to be plotted onto the axes
    court_elements = [hoop, backboard, outer_box, inner_box, top_free_throw,
                      bottom_free_throw, restricted, corner_three_a,
                      corner_three_b, three_arc, center_outer_arc,
                      center_inner_arc]

    if outer_lines:
        # Draw the half court line, baseline and side out bound lines
        outer_lines = Rectangle((-250, -47.5), 500, 470, linewidth=lw,
                                color=color, fill=False)
        court_elements.append(outer_lines)

    # Add the court elements onto the axes
    for element in court_elements:
        ax.add_patch(element)

    return ax
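
The numbers in draw_court make more sense once you know the scale. Judging by the comments in the function (a 9" radius becomes 7.5, a 16ft width becomes 160), one unit appears to be a tenth of a foot, with the hoop at the origin. Here is a small sketch under that assumption. The helper name to_court_units is hypothetical and is not part of Savvas's code.

```python
UNITS_PER_FOOT = 10  # assumed scale, inferred from the comments in draw_court


def to_court_units(feet=0, inches=0):
    """Convert a real-world length to chart units (assumed 10 units per foot)."""
    return (feet + inches / 12.0) * UNITS_PER_FOOT


print(to_court_units(inches=9))           # hoop radius -> 7.5
print(to_court_units(feet=16))            # outer paint width -> 160.0
print(to_court_units(feet=23, inches=9))  # three point arc radius -> 237.5
```

Note that Arc takes the full width and height rather than a radius, which is why the three point arc in draw_court uses 475 (twice 237.5).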

In my quest for NBA knowledge I stumbled across the py-goldsberry module, which you can find at https://github.com/bradleyfay/py-Goldsberry. This module is useful because it provides an easy way to look up a player's PERSON_ID, the unique identifier for every player listed on nba.com/stats.

In [3]:
playersCurrent = pd.DataFrame(goldsberry.PlayerList(2015))
playersCurrent.head()
Out[3]:
|   | DISPLAY_LAST_COMMA_FIRST | FROM_YEAR | GAMES_PLAYED_FLAG | PERSON_ID | PLAYERCODE | ROSTERSTATUS | TEAM_ABBREVIATION | TEAM_CITY | TEAM_CODE | TEAM_ID | TEAM_NAME | TO_YEAR |
| 0 | Acy, Quincy | 2012 | Y | 203112 | quincy_acy | 1 | SAC | Sacramento | kings | 1610612758 | Kings | 2015 |
| 1 | Adams, Jordan | 2014 | Y | 203919 | jordan_adams | 1 | MEM | Memphis | grizzlies | 1610612763 | Grizzlies | 2015 |
| 2 | Adams, Steven | 2013 | Y | 203500 | steven_adams | 1 | OKC | Oklahoma City | thunder | 1610612760 | Thunder | 2015 |
| 3 | Afflalo, Arron | 2007 | Y | 201167 | arron_afflalo | 1 | NYK | New York | knicks | 1610612752 | Knicks | 2015 |
| 4 | Ajinca, Alexis | 2008 | Y | 201582 | alexis_ajinca | 1 | NOP | New Orleans | pelicans | 1610612740 | Pelicans | 2015 |

Indexing our dataframe on our target player's name returns his ID. As you'll see below, our player of interest for this post is TJ Warren of the Phoenix Suns. I've decided to use TJ to generate our animation because both his scoring output and his usage vary quite a bit from game to game. His limited usage is partly due to the fact that he is a newer player in the league, and it is one of the reasons he makes a good player to analyze. In our animation I plan to display an entire game's shot chart in each frame, which makes it easy to compare his shooting from one game to the next. Naturally, his usage fluctuation should be very clear in the finished animation, but we will see for sure shortly.

In [4]:
tjid = playersCurrent[playersCurrent.DISPLAY_LAST_COMMA_FIRST=='Warren, TJ']['PERSON_ID']
print(tjid)
431    203933
Name: PERSON_ID, dtype: int64
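
Notice that the lookup above returns a pandas Series (index 431 plus the value), not a bare number, which is why the URL code later reaches for tjid.values[0]. If you would rather get a plain int and fail loudly on a misspelled name, something like the sketch below could work. The helper lookup_person_id is hypothetical, and the small dataframe is a hand-built stand-in for playersCurrent using values shown in the output above.

```python
import pandas as pd

# Hand-built stand-in for playersCurrent, using values from the output above
players = pd.DataFrame({
    'DISPLAY_LAST_COMMA_FIRST': ['Acy, Quincy', 'Adams, Jordan', 'Warren, TJ'],
    'PERSON_ID': [203112, 203919, 203933],
})


def lookup_person_id(df, name):
    """Return the PERSON_ID for an exact 'Last, First' name as a plain int."""
    matches = df.loc[df['DISPLAY_LAST_COMMA_FIRST'] == name, 'PERSON_ID']
    if len(matches) != 1:
        raise ValueError('expected exactly one match for %r, found %d'
                         % (name, len(matches)))
    return int(matches.iloc[0])


print(lookup_person_id(players, 'Warren, TJ'))
```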
In [5]:
season = '2015-16'

Below is the string that we need to pass to requests to get back our desired data. I've taken the liberty of using the .format string method to insert our variables ({0} is the player ID and {1} is the season). This will help us out later if we want to look up different players or seasons and don't feel like scanning the string for the place to do so.

In [6]:
shot_chart_url = 'http://stats.nba.com/stats/shotchartdetail?CFID=33&CFPAR'\
                'AMS={1}&ContextFilter=&ContextMeasure=FGA&DateFrom=&D'\
                'ateTo=&GameID=&GameSegment=&LastNGames=0&LeagueID=00&Loca'\
                'tion=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&'\
                'PaceAdjust=N&PerMode=PerGame&Period=0&PlayerID={0}&Plu'\
                'sMinus=N&Position=&Rank=N&RookieYear=&Season={1}&Seas'\
                'onSegment=&SeasonType=Regular+Season&TeamID=0&VsConferenc'\
                'e=&VsDivision=&mode=Advanced&showDetails=0&showShots=1&sh'\
                'owZones=0'.format(str(tjid.values[0]), season)
In [7]:
print(shot_chart_url)
http://stats.nba.com/stats/shotchartdetail?CFID=33&CFPARAMS=2015-16&ContextFilter=&ContextMeasure=FGA&DateFrom=&DateTo=&GameID=&GameSegment=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PaceAdjust=N&PerMode=PerGame&Period=0&PlayerID=203933&PlusMinus=N&Position=&Rank=N&RookieYear=&Season=2015-16&SeasonSegment=&SeasonType=Regular+Season&TeamID=0&VsConference=&VsDivision=&mode=Advanced&showDetails=0&showShots=1&showZones=0
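
If the long hand-split string feels fragile, one alternative is to keep the query parameters in a list and let the standard library assemble the URL. This is only a sketch: build_shot_chart_url and BASE_URL are names I made up, and the parameter list is copied from the URL printed above.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2

BASE_URL = 'http://stats.nba.com/stats/shotchartdetail'


def build_shot_chart_url(player_id, season):
    """Assemble the shot chart URL from (name, value) pairs, in a fixed order."""
    params = [
        ('CFID', 33), ('CFPARAMS', season), ('ContextFilter', ''),
        ('ContextMeasure', 'FGA'), ('DateFrom', ''), ('DateTo', ''),
        ('GameID', ''), ('GameSegment', ''), ('LastNGames', 0),
        ('LeagueID', '00'), ('Location', ''), ('MeasureType', 'Base'),
        ('Month', 0), ('OpponentTeamID', 0), ('Outcome', ''),
        ('PaceAdjust', 'N'), ('PerMode', 'PerGame'), ('Period', 0),
        ('PlayerID', player_id), ('PlusMinus', 'N'), ('Position', ''),
        ('Rank', 'N'), ('RookieYear', ''), ('Season', season),
        ('SeasonSegment', ''), ('SeasonType', 'Regular Season'),
        ('TeamID', 0), ('VsConference', ''), ('VsDivision', ''),
        ('mode', 'Advanced'), ('showDetails', 0), ('showShots', 1),
        ('showZones', 0),
    ]
    # urlencode turns the space in 'Regular Season' into '+'
    return BASE_URL + '?' + urlencode(params)


print(build_shot_chart_url(203933, '2015-16'))
```

requests.get also accepts a params argument, so another option would be to pass the base URL and the parameters separately and skip the string building entirely.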
In [8]:
response = requests.get(shot_chart_url)
# Parse the JSON once; the first result set holds the shot chart detail
result = response.json()['resultSets'][0]
headers = result['headers']
shots = result['rowSet']
In [9]:
shot_df = pd.DataFrame(shots, columns=headers)
shot_df.head()
Out[9]:
|   | GRID_TYPE | GAME_ID | GAME_EVENT_ID | PLAYER_ID | PLAYER_NAME | TEAM_ID | TEAM_NAME | PERIOD | MINUTES_REMAINING | SECONDS_REMAINING | ... | ACTION_TYPE | SHOT_TYPE | SHOT_ZONE_BASIC | SHOT_ZONE_AREA | SHOT_ZONE_RANGE | SHOT_DISTANCE | LOC_X | LOC_Y | SHOT_ATTEMPTED_FLAG | SHOT_MADE_FLAG |
| 0 | Shot Chart Detail | 0021500014 | 74 | 203933 | TJ Warren | 1610612756 | Phoenix Suns | 1 | 5 | 45 | ... | Jump Shot | 2PT Field Goal | Mid-Range | Left Side Center(LC) | 16-24 ft. | 17 | -138 | 101 | 1 | 0 |
| 1 | Shot Chart Detail | 0021500014 | 103 | 203933 | TJ Warren | 1610612756 | Phoenix Suns | 1 | 4 | 0 | ... | Jump Shot | 3PT Field Goal | Above the Break 3 | Left Side Center(LC) | 24+ ft. | 25 | -237 | 92 | 1 | 1 |
| 2 | Shot Chart Detail | 0021500014 | 126 | 203933 | TJ Warren | 1610612756 | Phoenix Suns | 1 | 2 | 31 | ... | Layup Shot | 2PT Field Goal | Restricted Area | Center(C) | Less Than 8 ft. | 1 | -19 | -1 | 1 | 0 |
| 3 | Shot Chart Detail | 0021500014 | 140 | 203933 | TJ Warren | 1610612756 | Phoenix Suns | 1 | 1 | 21 | ... | Jump Shot | 2PT Field Goal | Mid-Range | Left Side(L) | 16-24 ft. | 18 | -165 | 85 | 1 | 1 |
| 4 | Shot Chart Detail | 0021500014 | 150 | 203933 | TJ Warren | 1610612756 | Phoenix Suns | 1 | 0 | 28 | ... | Jump Shot | 2PT Field Goal | Mid-Range | Right Side(R) | 16-24 ft. | 19 | 168 | 90 | 1 | 0 |

5 rows × 21 columns
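
To make the shape of the response a little more concrete: the JSON pairs a list of column names (headers) with a list of rows (rowSet), and pandas can zip the two together directly. The payload below is a trimmed, hand-written stand-in for response.json() with only four of the 21 columns, using values from the rows above, so it runs without hitting the NBA's servers.

```python
import pandas as pd

# Trimmed, hand-written stand-in for response.json(); the real payload
# has 21 columns and one row per shot attempt
payload = {
    'resultSets': [{
        'headers': ['GAME_ID', 'LOC_X', 'LOC_Y', 'SHOT_MADE_FLAG'],
        'rowSet': [
            ['0021500014', -138, 101, 0],
            ['0021500014', -237, 92, 1],
            ['0021500014', -19, -1, 0],
        ],
    }]
}

result = payload['resultSets'][0]
sample_df = pd.DataFrame(result['rowSet'], columns=result['headers'])
print(sample_df)
```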

In [10]:
games = pd.Series(shot_df['GAME_ID'].unique())

Using urlretrieve from urllib we can download a picture of TJ Warren from the NBA website and later overlay it onto our chart using Matplotlib's OffsetImage.

In [11]:
# Save the headshot locally under TJ's own player ID
picture = urlretrieve("http://stats.nba.com/media/players/230x185/203933.png",
                      "203933.png")

tj_pic = plt.imread(picture[0])

plt.imshow(tj_pic)
plt.grid(False)
plt.tick_params(labelbottom=False, labelleft=False)
plt.show()

Below is what a shot chart looks like with all of TJ's shots for the entire season. While cool, it isn't what we want: we can't tell a make from a miss, and we can't see how his shooting changes from game to game.

In [12]:
sns.set_style('white')

plt.figure(figsize=(12,11))
plt.scatter(shot_df.LOC_X, shot_df.LOC_Y, c='r')
plt.title('TJ Warren 2015-2016 All Shots')
plt.grid(False)
draw_court()

plt.xlim(-250,250)
plt.ylim(422.5, -47.5)

plt.tick_params(labelbottom=False, labelleft=False)
plt.show()
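
The information the season chart hides is already sitting in the dataframe: SHOT_MADE_FLAG marks each attempt as a make (1) or a miss (0), and GAME_ID tells us which game it came from. As a rough sketch, here is how the two could be combined into a per-game summary. The sample dataframe is just the five shots from shot_df.head() above, which all come from a single game.

```python
import pandas as pd

# The five shots shown in shot_df.head() above (all from one game)
sample = pd.DataFrame({
    'GAME_ID': ['0021500014'] * 5,
    'LOC_X': [-138, -237, -19, -165, 168],
    'LOC_Y': [101, 92, -1, 85, 90],
    'SHOT_MADE_FLAG': [0, 1, 0, 1, 0],
})

# Summing the flag counts makes; counting rows gives attempts
per_game = sample.groupby('GAME_ID')['SHOT_MADE_FLAG'].agg(['sum', 'count'])
per_game.columns = ['makes', 'attempts']
per_game['fg_pct'] = per_game['makes'] / per_game['attempts'].astype(float)
print(per_game)
```

The same boolean filtering on GAME_ID and SHOT_MADE_FLAG is what lets us draw makes and misses differently for a single game.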

Here is where it gets interesting. The best resource I've found for animating plots is the blog post at https://jakevdp.github.io/blog/2012/08/18/matplotlib-animation-tutorial/. Since we want makes and misses to be plotted differently, we need to create two separate empty line objects on our axes, one for each. If all goes according to plan, they will be redrawn each frame with the makes and the misses respectively.

The function below draws the starting frame, so we set every element to empty values. If you want to add any other elements, like the game_text element I have added below, they need to be reset in the init function and returned from it in order to be drawn properly.

def init():
    makes.set_data([], [])
    misses.set_data([], [])
    game_text.set_text('')
    return makes, misses, game_text

After the init function we need to define our animate function. It generates each frame, and its argument i is the frame number of the animation. Since we want one game per frame, I've created a pandas Series containing each game ID and use i to index into it. We also update our makes and misses line objects with the corresponding shot coordinates for each game, using the SHOT_MADE_FLAG column in our shot dataframe to differentiate between makes and misses. We can also be fancy and update the game text each frame with the corresponding GameID, so that we can see which game is being plotted onscreen during its frame. Unfortunately, GameIDs aren't mapped for the current season in the py-goldsberry module, which would've allowed us to display the date and other game information on the screen. Until I figure out a way to do that, the GameID will have to do. At the end of the function we return everything that we have changed.

def animate(i):
    # shots from the i-th game, split into makes and misses
    make = shot_df[(shot_df.GAME_ID==games[i]) & (shot_df.SHOT_MADE_FLAG==1)][['LOC_X', 'LOC_Y']]
    miss = shot_df[(shot_df.GAME_ID==games[i]) & (shot_df.SHOT_MADE_FLAG==0)][['LOC_X', 'LOC_Y']]
    # update markers for makes
    makes.set_data(make.LOC_X.values, make.LOC_Y.values)
    makes.set_markerfacecolor('white')
    makes.set_markeredgecolor('green')
    makes.set_markeredgewidth(3)
    # update markers for misses
    misses.set_data(miss.LOC_X.values, miss.LOC_Y.values)
    misses.set_markeredgecolor('r')
    misses.set_markeredgewidth(4)
    game_text.set_text('Player: ' + shot_df.loc[1]['PLAYER_NAME'] + ', GameID: ' + games[i])

    return makes, misses, game_text
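
If the moving parts are hard to see with all of the NBA data around them, here is a stripped-down sketch of the same init/animate pattern that runs on its own. The shot locations are made up, and the names ending in _demo are hypothetical and only there to avoid clobbering the real functions. The Agg backend line is an assumption so the sketch runs without a display; drop it in a notebook.

```python
import matplotlib
matplotlib.use('Agg')  # off-screen backend; remove this line in a notebook
import matplotlib.pyplot as plt
from matplotlib import animation

# Made-up shot locations for two hypothetical games: (x values, y values)
frames_data = [
    ([-100, 50], [120, 200]),
    ([0, 150, -200], [10, 90, 60]),
]

fig_demo, ax_demo = plt.subplots()
ax_demo.set_xlim(-250, 250)
ax_demo.set_ylim(422.5, -47.5)  # same inverted y-axis as the season chart
points, = ax_demo.plot([], [], 'o')
label = ax_demo.text(80, 420, '')


def init_demo():
    # Starting frame: everything empty
    points.set_data([], [])
    label.set_text('')
    return points, label


def animate_demo(i):
    # Frame i shows the i-th entry of frames_data
    xs, ys = frames_data[i]
    points.set_data(xs, ys)
    label.set_text('frame %d' % i)
    return points, label


demo_anim = animation.FuncAnimation(fig_demo, animate_demo,
                                    init_func=init_demo,
                                    frames=len(frames_data), interval=1000,
                                    blit=True)
```

With blit=True only the artists returned from the two functions are redrawn, which is why both functions return every element they touch.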


All that is left now is to pass both our init and animate functions to animation.FuncAnimation, specify the number of frames and the frame interval that we need, call plt.show(), and voilà! We have our animation! In my case I am going to call anim.save and save the video in an HTML5-compatible format for embedding below.

anim = animation.FuncAnimation(fig, animate, init_func=init,
                               frames=len(games), interval=1000, blit=True)
In [21]:
fig = plt.figure(figsize=(12,11))
ax = fig.add_subplot(111, xlim=(-250,250), ylim=(422.5, -47.5))

plt.tick_params(labelbottom=False, labelleft=False)

#after creating ax we can draw the court
draw_court()
makes, = ax.plot([], [], 'o', markersize=20)
misses, = ax.plot([], [], 'x', markersize=15)
game_text = ax.text(80, 420, '')

##Adding TJ's Face to the bottom left corner
img = OffsetImage(tj_pic, zoom=0.6)
img.set_offset((130,98))
ax.add_artist(img)


def init():
    makes.set_data([], [])
    misses.set_data([], [])
    game_text.set_text('')
    return makes, misses, game_text

def animate(i):
    make = shot_df[(shot_df.GAME_ID==games[i]) & (shot_df.SHOT_MADE_FLAG==1)][['LOC_X', 'LOC_Y']]
    miss = shot_df[(shot_df.GAME_ID==games[i]) & (shot_df.SHOT_MADE_FLAG==0)][['LOC_X', 'LOC_Y']]
    #update markers for makes
    makes.set_data(make.LOC_X.values, make.LOC_Y.values)
    makes.set_markerfacecolor('green')
    makes.set_markeredgewidth(1)
    makes.set_markeredgecolor('green')
    makes.set_alpha(0.6)
    #update markers for misses
    misses.set_data(miss.LOC_X.values, miss.LOC_Y.values)
    misses.set_markeredgecolor('r')
    misses.set_markeredgewidth(3)
    misses.set_alpha(0.8)
    game_text.set_text('Player: ' + shot_df.loc[1]['PLAYER_NAME'] + ', GameID: ' + games[i])
    
    return makes, misses, game_text

anim = animation.FuncAnimation(fig, animate, init_func=init,
                               frames=len(games), interval=500, blit=True)

There are more elegant ways to display a video inline in a notebook, but this was the easiest hack-ish way I found to get it working. Once you have FFmpeg or MEncoder installed, you can pass either writer as an argument to the anim.save function.
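
Before calling anim.save it can be handy to check which writers Matplotlib can actually find on your machine. A minimal sketch, assuming Matplotlib's writer registry (animation.writers) is available in your version; pick_writer is a hypothetical helper name.

```python
from matplotlib import animation


def pick_writer(preferred=('ffmpeg', 'mencoder')):
    """Return the first writer name Matplotlib reports as available, else None."""
    for name in preferred:
        if animation.writers.is_available(name):
            return name
    return None


print(pick_writer())
```

If this prints None, neither encoder was found and anim.save with writer='ffmpeg' is likely to fail until one is installed.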

In [22]:
anim.save('tjchart.mp4', fps=1, writer='ffmpeg', extra_args=['-vcodec','libx264','-pix_fmt', 'yuv420p'])

from IPython.display import HTML
from base64 import b64encode
# Read the saved video and embed it in the page as a base64 data URI
with open("tjchart.mp4", "rb") as f:
    video = f.read()
video_encoded = b64encode(video).decode('ascii')
video_tag = '<video controls src="data:video/mp4;base64,{0}"></video>'.format(video_encoded)
HTML(data=video_tag)
Animation.save using <class 'matplotlib.animation.FFMpegWriter'>
MovieWriter.run: running command: ffmpeg -f rawvideo -vcodec rawvideo -s 864x792 -pix_fmt rgba -r 1 -loglevel quiet -i pipe: -vcodec mpeg4 -vcodec libx264 -pix_fmt yuv420p -y tjchart.mp4
Out[22]:
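
The embedding trick boils down to base64-encoding the file's bytes and dropping them into a data URI. Here is that step pulled out into a small function so it can be tried without a real video. The helper video_tag_from_bytes is a hypothetical name, and the video/mp4 MIME type is my assumption for an H.264 .mp4 file.

```python
from base64 import b64encode


def video_tag_from_bytes(data, mime='video/mp4'):
    """Return an HTML5 <video> tag with the given bytes embedded as base64."""
    encoded = b64encode(data).decode('ascii')
    return ('<video controls src="data:{0};base64,{1}"></video>'
            .format(mime, encoded))


# Placeholder bytes stand in for the contents of tjchart.mp4
print(video_tag_from_bytes(b'not a real video'))
```

Keep in mind that this inlines the whole file into the notebook, so it is best suited to short clips like this one.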

It works beautifully! Each frame lasts for one second, which I thought was a good compromise between animation speed and readability. You can use this type of animation technique to animate player movements on the court entirely in matplotlib, much like the NBA already does on their stats page. Maybe I'll explore this type of animation in my next post. I'm also thinking about switching to the liquid tags plugin for Pelican to display my IPython notebooks. This would allow me to render the video outside of the notebook instead of tricking IPython into displaying it, and would also allow me to edit the text of my blog easily through markdown files. As always, thanks for reading!

