# Sunday, December 03, 2006

In a previous post, I attempted to create a very simple artificial neural network (ANN) to predict the outcome of NFL games.  I found that the approach I took was really no better than random guesses, as I had an average accuracy of 50%.

Consequently, I decided upon a different approach.  I used the same network architecture, but I added data which allowed the network to make more accurate predictions.  This time, I used the following data:

Week Number (1-17)
Away team Yards per Game differential
Away team Yards per Play differential
Away team First Downs per Game differential
Away team Time of Possession differential
Home team Yards per Game differential
Home team Yards per Play differential
Home team First Downs per Game differential
Home team Time of Possession differential
Bias (binary value of 1)

Note: since originally writing this, I expanded the differentials.  Essentially, I'm grabbing everything from the NFL stats page (which requires less work on my end) and creating differentials between the defense and offense.

A typical row pattern looks like this ...

01031033044044027029025049B

... where the first two characters represent the week number, the next three the away team yards per game differential, and so on.  This value is then converted into a binary array (of 105 binary values) and added to a collection of binary arrays.  This collection represents the training data.  Each "pattern" is pushed through the network and classified as either true (meaning the away team won) or false (meaning the away team lost).  The error gradient is calculated, and then backpropagation (a subject for another post) refines the weights, repeating until the error converges.  I decided it was practical to train to an RMSE of 0.15 on the training data before making predictions.
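To make the conversion concrete, here is a minimal Python sketch of one encoding that produces exactly 105 values from a 27-character row.  It is an assumption, not the code I actually ran: it treats each of the 26 digits as 4 bits and the trailing "B" as a single bias bit (26 x 4 + 1 = 105).  The name `encode_pattern` is made up for this example.

```python
from __future__ import annotations


def encode_pattern(pattern: str) -> list[int]:
    """Turn a row such as '01031033044044027029025049B' into 105 bits.

    Assumption: every decimal digit becomes 4 bits (high bit first) and
    the trailing 'B' becomes one bias bit that is always 1.
    """
    digits, bias = pattern[:-1], pattern[-1]
    if bias != "B" or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"unexpected pattern: {pattern!r}")
    bits = []
    for ch in digits:
        value = int(ch)
        # Emit the four bits of this digit, most significant bit first.
        bits.extend((value >> shift) & 1 for shift in (3, 2, 1, 0))
    bits.append(1)  # bias input
    return bits


row = "01031033044044027029025049B"
bits = encode_pattern(row)
print(len(bits))  # 105
print(bits[:8])   # week "01" -> [0, 0, 0, 0, 0, 0, 0, 1]
```

Other encodings could also reach 105 values; this one is simply the most direct fit for the row length shown above.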

With this approach, I made predictions for week 11 (excluding week 11 data from training, of course) ten times, and averaged the values.  I then compared them to the actual results of the games.  I found the accuracy to be 81.25%, which means it made an accurate prediction 13 of 16 times!
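The averaging and scoring step can be sketched roughly as follows.  This is an illustrative Python version, not my original code: the function names and the sample outputs are made up, and the 0.5 cutoff for calling a game an away win is an assumption.

```python
def average_runs(runs):
    """Average each game's network output across several training runs."""
    n_games = len(runs[0])
    return [sum(run[g] for run in runs) / len(runs) for g in range(n_games)]


def accuracy(avg_outputs, away_won, threshold=0.5):
    """Fraction of games where the thresholded average matches the result.

    Assumption: an averaged output at or above `threshold` means the
    network picks the away team.
    """
    picks = [out >= threshold for out in avg_outputs]
    hits = sum(pick == actual for pick, actual in zip(picks, away_won))
    return hits / len(away_won)


# Made-up outputs for three games from two runs, purely for illustration.
runs = [[0.9, 0.2, 0.6], [0.7, 0.4, 0.3]]
avg = average_runs(runs)
print(accuracy(avg, [True, False, True]))  # 2 of 3 picks match

# The week 11 figure from the post: 13 correct out of 16 games.
print(13 / 16)  # 0.8125
```

Averaging several runs smooths out the effect of the random starting weights, since each run can settle on slightly different values.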

So, I decided to train on the data through week 12 and make predictions for week 13.  Here are my predicted winners:

St. Louis
* Atlanta
* Dallas
* New England
Indianapolis
* Jacksonville
* Cleveland
Minnesota
* N.Y. Jets
* San Diego
* New Orleans
* Pittsburgh
* Houston
* Seattle
Carolina

* Accurately predicted!

So, for week 13, I accurately predicted 11 of the 15 games (73% accuracy)!  This approach has MUCH more promise than the previous approach, yet isn't as good as I'd like it to be.  I have some ideas on how to make this even MORE accurate!  Stay tuned!

posted on Sunday, December 03, 2006 6:29:18 PM (Central Standard Time, UTC-06:00)

Note: an updated, more accurate method can be found here.

Over the past few months, I have been playing around with artificial neural networks (ANNs) in C#.  For those of you unfamiliar with ANNs, an ANN is an attempt to programmatically reproduce the way the brain processes data.

Below is a simple ANN architecture:

On the left we have inputs (i.e. data), and on the right we have an output (i.e. the outcome).  The middle is a layer of neurons that essentially map the inputs to the output via weights (i.e. synapses).
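For readers who prefer code to diagrams, the flow from inputs through the middle layer to the output can be sketched like this.  It is a minimal Python illustration rather than my C# implementation; the sigmoid activation, the function names, and the weight values are all assumptions chosen for the example.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, hidden_weights, output_weights):
    """One hidden layer: inputs -> hidden neurons -> single output."""
    # Each hidden neuron takes a weighted sum of all inputs.
    hidden = [
        sigmoid(sum(w * x for w, x in zip(neuron, inputs)))
        for neuron in hidden_weights
    ]
    # The output neuron takes a weighted sum of the hidden activations.
    output = sigmoid(sum(w * h for w, h in zip(output_weights, hidden)))
    return hidden, output


# Two inputs, two hidden neurons, one output; the weights are arbitrary.
hidden_weights = [[0.5, -0.5], [1.0, 1.0]]
output_weights = [1.0, -1.0]
_, out = forward([1.0, 0.0], hidden_weights, output_weights)
print(out)  # a value between 0 and 1
```

Training is then a matter of adjusting `hidden_weights` and `output_weights` until the output matches the known outcomes.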

The idea is that, given enough data to "train" the neural network, you can create a network that is capable of predicting an outcome.

There are many uses for ANNs; I am attempting to see if I can build a network to accurately predict football games.

I've created a VERY simple ANN, and come up with the following predictions for week 13 of the NFL:

Arizona at St. Louis : Arizona
Atlanta at Washington : Atlanta
Dallas at N.Y. Giants : Dallas
Detroit at New England : New England
Indianapolis at Tennessee : Indianapolis
Jacksonville at Miami : Miami
Kansas City at Cleveland : Kansas City
Minnesota at Chicago : Chicago
N.Y. Jets at Green Bay : Green Bay
San Diego at Buffalo : Buffalo
San Francisco at New Orleans : New Orleans
Tampa Bay at Pittsburgh : Pittsburgh
Houston at Oakland : Oakland
Seattle at Denver : Seattle
Carolina at Philadelphia (Monday night) : Carolina

Now, I don't have a lot of faith in these predictions, because my inputs only consist of the following for the first 12 weeks:

Away Team, Home Team, Did Away Team Win?

For example, the first line of my training data looks like this:

17,25,0

Where 17 equals Miami, 25 equals Pittsburgh, and 0 means that the away team lost.

The idea is that the network is trained on 12 weeks of data and, given the inputs for a new game, can predict whether the away team wins or loses.
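As a rough illustration of that idea, here is a small Python sketch that trains a one-hidden-layer network on rows in the "away,home,result" format.  Several things here are assumptions and not my actual C# code: the one-hot encoding of team numbers (taken to run from 1 to 32), the network size, the learning rate, and the stopping rule, which borrows the 0.15 RMSE target mentioned in the follow-up post.  Only the row "17,25,0" comes from my data; the other rows are invented.

```python
import math
import random


def parse_row(line):
    """'17,25,0' -> (away_id, home_id, away_won)."""
    away, home, won = (int(v) for v in line.split(","))
    return away, home, won


def one_hot_inputs(away, home, n_teams=32):
    """Assumed encoding: one-hot away team, one-hot home team, bias of 1."""
    x = [0.0] * (2 * n_teams + 1)
    x[away - 1] = 1.0
    x[n_teams + home - 1] = 1.0
    x[-1] = 1.0
    return x


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, n_hidden=4, rate=0.5, target_rmse=0.15, max_epochs=2000, seed=0):
    """Backpropagation until the epoch RMSE reaches the target or epochs run out."""
    rng = random.Random(seed)
    data = [(one_hot_inputs(a, h), float(w)) for a, h, w in rows]
    n_in = len(data[0][0])
    w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    rmse = float("inf")
    for epoch in range(1, max_epochs + 1):
        sq_err = 0.0
        for x, target in data:
            # Forward pass.
            hid = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in w_hid]
            out = sigmoid(sum(w * h for w, h in zip(w_out, hid)))
            err = target - out
            sq_err += err * err
            # Backward pass: output delta first, then hidden deltas.
            d_out = err * out * (1.0 - out)
            d_hid = [d_out * w_out[j] * hid[j] * (1.0 - hid[j]) for j in range(n_hidden)]
            for j in range(n_hidden):
                w_out[j] += rate * d_out * hid[j]
                for i in range(n_in):
                    w_hid[j][i] += rate * d_hid[j] * x[i]
        # RMSE over the errors seen during this epoch.
        rmse = math.sqrt(sq_err / len(data))
        if rmse <= target_rmse:
            break
    return w_hid, w_out, rmse, epoch


# First row is from the post; the other two are invented for the example.
rows = [parse_row(s) for s in ["17,25,0", "2,32,1", "9,21,1"]]
w_hid, w_out, rmse, epochs = train(rows)
print(rmse, epochs)
```

With only team identities as inputs, a network like this can fit the games it has seen, but it has little to go on for a matchup it has not seen.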

Once all the results are in, I'll post and share how well my predictions did!  I'd be surprised if it was more than 50% accurate.  In the future, I plan to try to find additional pieces of data to include in the training, so that the predictions become more accurate.

Note: I just tested this method against week 12 results (meaning I used weeks 1 through 11, excluding week 12), and found that I was only able to get it 50% accurate ... no better than randomly choosing.

posted on Sunday, December 03, 2006 2:09:16 PM (Central Standard Time, UTC-06:00)