NBA Archives - Michael's Programming Bytes

And Now For Michael’s Programming Bytes 2025-26 NBA Season Predictions

Hello everybody,

Michael here, and in today’s post we will discuss what I think is the fun part about this NBA post prediction series-the predictions themselves.

That’s right, now that we have our model, let’s make some predictions for the season!

Now where did we leave off?

Before we get into the juicy NBA season predictions, let’s first revisit where we left off on the previous post Another Crack At Linear Regression NBA Machine Learning Predictions (2025-26 edition):

Towards the end of the previous post, we generated this equation to assist us in generating our linear regression NBA season predictions for this year. To recap what the equation means:

64.8 * (field goal %)
PLUS 113 * (3-point %)
PLUS 15.4 * (2-point %)
MINUS 1.94 * (seeding at end of season)
PLUS 0.011 * (total rebounds)
MINUS 0.00346 * (total assists)
PLUS 0.0215 * (total steals)
PLUS 0.00663 * (total blocks)
MINUS 0.0097 * (total turnovers)
MINUS 60.38 (the intercept)

That’s quite a mouthful, but I’ll show you the Python calculations we’ll be doing in order to generate those juicy predictions!

I’ll admit that even I’m not perfect with my blogs here, as I made a small mistake on the previous post that showed part of the equations as 215 * (total steals) rather than 0.0215 * (total steals). As it turns out, even experienced coders like me make oversights, so apologies for that!

A little disclaimer here

Before we dive in to our predictions, I want to clarify that these are simply win total/conference seeding predictions based off of a simple linear regression model configured by me. I personally wouldn’t use these predictions for any bets or parlays because first and foremost, I am your friendly neighborhood coding blogger, not your friendly neighborhood sportsbook. You can count on me for juicy, way-too-early predictions, but certainly not for any juicy over/unders.

If you do bet on NBA games this season, please do so responsibly! Thank you!

The way of the weighted averages

You may recall that for my post on last NBA season’s predictions, we used weighted averages to help generate the predictions. Since I personally liked that method, I’ll do so again.

Here’s the file with the weighted averages, which we’ll be using to calculate the predictions:

We’ll use the same methodology as we did last year for calculating the weighted averages, which went like this:

2022-23 to 2024-25 (last 3 seasons)-0.2 weight (higher weight for the three most recent seasons)
2019-20 to 2021-22 (three seasons prior to that)-0.1 (less weight for seasons further in the past, plus this timespan does include the two COVID shortened seasons)
2015-16 to 2018-19 (four seasons further back)-0.025 (even less weight for these seasons further in the past)

Now here’s the weighted averages file for all 30 teams:

weighted averages 2025-26 Download

So, without further ado, let’s predict some win totals!

import pandas as pd

NBAAVG = pd.read_csv(r'C:\Users\mof39\OneDrive\Documents\weighted averages 2025-26.csv')

for n in NBAAVG['Team']:
    print(64.8*NBAAVG['FG%'] + 113*NBAAVG['3P%'] + 15.4*NBAAVG['2P%'] - 1.94*NBAAVG['Finish'] + 0.011*NBAAVG['TRB'] - 0.00346*NBAAVG['AST'] + 0.0215*NBAAVG['STL'] + 0.00663*NBAAVG['BLK'] - 0.0097*NBAAVG['TOV'] - 60.38)
    break

0     40.21900 (Atlanta Hawks)
1     52.98400 (Boston Celtics)
2     37.83554 (Brooklyn Nets)
3     34.54740 (Charlotte Hornets)
4     40.36300 (Chicago Bulls)
5     50.23470 (Cleveland Cavaliers)
6     43.38590 (Dallas Mavericks)
7     50.17750 (Denver Nuggets)
8     33.35420 (Detroit Pistons)
9     45.65995 (Golden State Warriors)
10    41.07520 (Houston Rockets)
11    45.93936 (Indiana Pacers)
12    51.35416 (LA Clippers)
13    45.88495 (LA Lakers)
14    47.79176 (Memphis Grizzlies)
15    41.26986 (Miami Heat)
16    48.55712 (Milwaukee Bucks)
17    47.12266 (Minnesota Timberwolves)
18    40.65833 (New Orleans Pelicans)
19    47.90818 (NY Knicks)
20    58.57943 (Oklahoma City Thunder)
21    41.25042 (Orlando Magic)
22    43.73122 (Philadelphia 76ers)
23    45.39194 (Phoenix Suns)
24    39.99757 (Portland Trail Blazers)
25    43.89877 (Sacramento Kings)
26    39.38897 (San Antonio Spurs)
27    39.13594 (Toronto Raptors)
28    40.82412 (Utah Jazz)
29    36.85122 (Washington Wizards)

Once I read the weighted averages CSV and ran the equation for all 30 teams, I get the predicted win totals for all 30 teams, which I will use for my way-too-early East/West seeding chart. Note that since the team names aren’t shown in the output, I took the liberty of manually adding each team name by each predicted win total so you know your favorite team’s projected win total (according to my model, of course).

One interesting difference between this year’s projected win totals and last year’s is the narrower range of possible win totals in this year’s model. See, the range of possible win totals in last year’s model was 24-54 wins, while the range of possible win totals in this year’s model is just 33-59 wins. Could the narrower possible win total range be due to the different features I used in this year’s model? It’ll be interesting to see how the season plays out.

Another interesting thing to note is that even though there is a narrower range of potential wins in this year’s model, the majority of teams’ win counts last season fell into this range-20 teams won between 33 and 59 games last season (Knicks, Pacers, Bucks, Pistons, Magic, Hawks, Bulls, Heat, Rockets, Lakers, Nuggets, Clippers, Timberwolves, Warriors, Grizzlies, Kings, Mavericks, Suns, TrailBlazers and Spurs).

How will the win counts look this time around? We’ll see as the season unfolds!

Michael’s Way-Too-Early Conference Seeding:

And now, for the stuff I really wanted to share with you all in this post: Michael’s Way-Too-Early Conference Seeding. Now that we’ve got our projected win totals for each team, it’s time to seed them in their projected spots! But that’s not all I’m going to do!

In addition to the model’s projected seedings, I’ll also give you my own personal seedings for all 30 teams. That’s right-this year, I want to see which set of predictions comes out more accurate-my predications or my model’s predictions. This will be fun to revisit next July once the season wraps up!

Eastern Conference predictions

To begin, let’s start with the model’s Eastern Conference predictions:

Play-Offs	Play-Ins	Maybe Next Year
1. Boston Celtics	7. Miami Heat	11. Toronto Raptors
2. Cleveland Cavaliers	8. Orlando Magic	12. Brooklyn Nets
3. Milwaukee Bucks	9. Chicago Bulls	13. Washington Wizards
4. New York Knicks	10. Atlanta Hawks	14. Charlotte Hornets
5. Indiana Pacers		15. Detroit Pistons
6. Philadelphia 76ers

And now, let’s see my personal Eastern Conference predictions:

Play-Offs	Play-Ins	Maybe Next Year
1. New York Knicks	7. Orlando Magic	11. Toronto Raptors
2. Cleveland Cavaliers	8. Milwaukee Bucks	12. Philadelphia 76ers
3. Boston Celtics	9. Atlanta Hawks	13. Brooklyn Nets
4. Detroit Pistons	10. Chicago Bulls	14. Charlotte Hornets
5. Miami Heat		15. Washington Wizards
6. Indiana Pacers

Here are some interesting observations about both the model’s predictions and my own personal predictions:

The Eastern Conference teams that made last season’s play-in (Heat, Hawks, Bulls, Magic) are the same ones projected to make another go at play-ins this year. In other words, could we see the same teams stuck in another year of play-ins?
Personally, I think the Hawks, Bulls and Magic will make another trip to the play-in. On the other hand, I think the Heat will eke out a 5 (maybe 6) seed in the East because of some great new acquisitions like small forward Simone Fontecchio and shooting guard Norman Powell.
I honestly don’t know why the model hates the Detroit Pistons, as it placed them at the bottom of the East once more. I ranked them as a possible 4-seed because after their improvement last year (44-38 from a dismal 14-68 in 2023-24), I feel they could be quite the playoff contender-and it was certainly nice to see 2021 1st Overall Pick Cade Cunningham finally develop into a star-quality player. The acquisition of the former Heat small forward Duncan Robinson should be exciting to see.
This might sound like a hot take here, but I don’t think the Sixers will even qualify for play-in, let alone playoffs given the plethora of issues they had last season. Least of all, Paul George and Joel Embiid-two of the biggest Sixers names-weren’t at the top of their game last season when they were healthy (and both of them missed significant time due to injuries).
Unlike my model, I think the Knicks could really take the top spot in the East this season. Despite falling just short of the 2025 NBA Finals, the Knicks showed they can certainly make a deep playoff run with talent such as Jalen Brunson (winner of the Clutch Player of the Year award), OG Anunoby and their acquisition of Karl-Anthony Towns from the Timberwolves during the 2024 offseason.
With two of the biggest names in the East-Jayson Tatum and Tyrese Haliburton-out for most if not all of this season due to Achilles injuries they got during last season’s playoffs, I think the East is wide open. Granted, I still think the Pacers and Celtics have a good chance at making the playoffs this year, but I don’t think either of them is a shoo-in for the top spot in the East, which in my opinion leaves the East playoff race wide open for another team to take the top spot (which as I said earlier, I think it could be the Knicks’ year to do just that). Also, I still think the Celtics could realistically clinch the 3-seed in the East despite the offseason departures of Jrue Holiday, Kristaps Porzingis, Al Horford and Luke Kornet, who were all key players in the Celtics 2024 Championship run.

Western Conference predictions:

First, let’s start with how the model think the Western Conference standings will play out this season:

Play-Offs	Play-Ins	Maybe Next Year
1. Oklahoma City Thunder	7. Golden State Warriors	11. Houston Rockets
2. LA Clippers	8. Phoenix Suns	12. Utah Jazz
3. Denver Nuggets	9. Sacramento Kings	13. New Orleans Pelicans
4. Memphis Grizzlies	10. Dallas Mavericks	14. Portland Trail Blazers
5. Minnesota Timberwolves		15. San Antonio Spurs
6. LA Lakers

Just as with the model’s Eastern Conference predictions, I certainly have disagreements with the Western Conference predictions. Here’s how I think the Western Conference standings will play out this season:

Play-Offs	Play-Ins	Maybe Next Year
1. Oklahoma City Thunder	7. Golden State Warriors	11. Dallas Mavericks
2. Houston Rockets	8. LA Clippers	12. Memphis Grizzlies
3. Minnesota Timberwolves	9. Sacramento Kings	13. Utah Jazz
4. Denver Nuggets	10. San Antonio Spurs	14. Portland Trail Blazers
5. Houston Rockets		15. New Orleans Pelicans
6. LA Lakers

As I did with my Eastern Conference predictions, here are some interesting observations between the model’s projected conference standings and my personal projected conference standings:

I’m sure the question on every NBA fan’s mind-including mine-is “Can the Oklahoma City Thunder pull off another championship?”. My guess-I think of all the champions we’ve seen in the 2020s alone, I think they’ve got the best shot at a repeat title. Why might that be? One big reason that could happen-the Thunder kept their core Big 3 (SGA, Chet Holmgren, and Jaylin Williams) around along with several other key players from the championship run such as Isaiah Hartenstein, Lu Dort, among others. Personally, I think that NBA teams would be wise not to go full rebuild-mode after winning their first championship, and it seems the Thunder have done just that (they only traded second-year small forward Dillon Jones, who played limited minutes in OKC’s championship run). Even if the Thunder don’t end up repeating as champions, I think, at the very least, the 1-seed in the West could be theirs for the taking once more.
Another interesting Western Conference storyline to watch would be whether Cooper Flagg (the 2025 #1 overall pick) becomes the next Luka Doncic for the Mavericks. After Doncic got traded for Anthony Davis during last year’s midseason trades, it’s safe to say the Mavericks’ season went south. A controversial trade and injuries to many key players-Anthony Davis (after the trade) and Kyrie Irving being the two most notable examples-didn’t help matters. Then again, having such an injury-struck roster to the point where the Mavericks nearly (but thankfully didn’t) have to forfeit games only added to their problems last season after the infamous Doncic-Davis trade. The drafting of 6’9″, 18-year-old forward Cooper Flagg could bring a spark to the struggling Mavericks (and from watching some of his highlights, I think Flagg has potential), but I think Flagg will need at least a year to gel with the Mavericks before they once again become Western Conference contenders.
Just as I was surprised that my model placed the Detroit Pistons at the bottom of the Eastern Conference given their improvements last season, I can say I’m just as surprised that the San Antonio Spurs were placed at the bottom of the Western Conference. Granted, they haven’t made the playoffs since 2019 and just went through a coaching change (Popovich stepped down and Mitch Johnson was named as head coach after serving as interim last season), but they did also improve their record from 22-60 in ’23-’24 to 34-48 last season. The Spurs also have their own solid Big 3 in De’Aaron Fox, Stephon Castle, and of course 2023 #1 overall pick Victor Wembanyama. Even though Wemby’s season was cut short last year due to deep vein thrombosis (a type of blood clot), his improved shooting and double-doubles could certainly help the Spurs once he’s fully recovered.
How might the Golden State Warriors do with their 35-and-over Big 3 (Jimmy Butler is 36, Draymond Green is 35, and Steph Curry is 37)? Given that they earned their playoff spot last season through play-ins, I’ve got a hunch that the Warriors might be seeing the play-ins once more-but will likely get a playoff spot in this manner. Yes, they had quite the herky-jerky trajectory last season, but the midseason acquisition of Jimmy Butler certainly gave them an extra spark down the regular season stretch-Butler’s basketball skills certainly paired well with guys like Steph and Draymond. Upsetting the 2-seeded Houston Rockets in the Western Conference quarterfinals last season certainly helps the Warriors’ momentum heading into this season, but I do wonder how the loss of their championship-winning forward Kevon Looney would affect the Warriors dynamic.
I know I said that I think the Thunder have a great chance to repeat as champions, but I also wonder if the Timberwolves would be a team to look out for in the 2026 postseason. After all, despite losing franchise mainstay Karl-Anthony Towns to the Knicks in the 2024 offseason, the Timberwolves adapted quite well as stars like Anthony Edwards and Naz Reid rose to the challenge by helping the team get to the Western Conference finals for the second year in a row (even though they got knocked out at the Western Conference finals for the second year in a row too). All in all, in terms of every NBA trade ever made, I think the Karl-Anthony towns trade-along with the players the Timberwolves got in exchange (Julius Randle and Donte DiVincenzo)-was one of the most even trades for both teams involved, as both the Knicks and Timberwolves made it to their respective conference finals.
Just as with my play-in predictions for the Eastern conference, at least three of the four projected play-in teams (according to the model) for the Western Conference made the play-ins last season-the Mavericks, Warriors, and Kings. I think the Warriors have the best shot at cracking the actual playoffs while the Mavericks could use another year for Cooper Flagg to develop (plus buy some time to get stars like Kyrie Irving back). It will be interesting to see how the Sacramento Kings fare because even though Domantis Sabonis, Zack LaVine and DeMar DeRozan fared well despite the disappointing finish, the talent around them could use some improvement. Perhaps the addition of Russell Westbrook (who’s in his 18th year in the NBA) could spice up the Kings’ offense, as he certainly showed he still had the athleticism and speed needed for basketball last season with the Denver Nuggets.

And now for something a little scandalous…

Boy oh boy this is certainly going to be the most interesting (or at least the most interestingly-timed) post I’ve written during this blog’s run. Why might that be?

Well, last Thursday (October 23, 2025) news broke that the FBI (US Federal Bureau of Investigation) had arrested 34 people for a pair of scandals that certainly rocked pro basketball-one involving colluding with Italian Mafia families (specifically the Gambino, Bonnano and Genovese crime families) to conduct a series of rigged poker games and another involved colluding to rig sports betting.

Here’s the wildest part though-among the 34 arrested were the current head coach of the Portland Trail Blazers (Chauncey Billups), a current Miami Heat star (Terry Rozier), and a former Cavaliers player (Damon Jones). Billups and Rozier were placed on leave by their respective teams.

Want to know some other juicy, scandalous details? Here are a few takeaways from the indictments:

Chauncey Billups was allegedly used by these Mafia families to lure in victims to the rigged poker games in order to make the poker games appear legitimate.
How the poker games were rigged is possibly the wildest part, with everything that was alleged to have happened sounding like it could’ve come from a James Bond movie. Among the methods used to rig these poker games were X-Ray tables that allowed these Mafia families to see opponents’ hands and rigged shuffling machines that could be used to predict what opponents’ hands would look like.
As for Rozier, the game that led to him being investigated was a March 23, 2023 game while Rozier was still with the Charlotte Hornets. In this game, Rozier left the game early due to a “foot injury”-which wasn’t true as Rozier conspired with a longtime friend of his that he planned to fake the “foot injury” in order to net this friend over $200,000 on his “under” statistics (that Rozier would underperform in the game in other words).
As for Damon Jones, he sold insider information to his co-conspirators during the 2022-23 season while working for the Lakers. The information concerned insider tips on lineup decisions and injury reports on star Lakers players; the co-conspirators were able to place significant wagers on their bets with this information. It was later revealed that one of the players whose injury report was leaked was LeBron James, who hasn’t been implicated in any wrongdoing.

All in all, it will be interesting to see how this scandal plays out-especially to see if anyone else get busted as part of this massive gambling ring. Here’s an October 23, 2025 release from the US DOJ (Department of Justice) describing the basics of the gambling ring (keep in mind that anyone involved is presumed innocent until proven guilty)-https://www.justice.gov/usao-edny/pr/current-and-former-national-basketball-association-players-and-four-other-individuals.

Here’s a snippet of a conference from FBI Director Kash Patel on October 23, 2025 regarding the charges-https://www.youtube.com/shorts/4F4_JMGVJXw.

All I will say is that it will be very very interesting to see not only how the rest of the NBA season plays out but also to see how commissioner Adam Silver will change league gambling policy-especially when it comes to players and coaching staff. Assuming other players and/or coaching staff get busted in the gambling ring (which could happen) the trials will be interesting-mostly because we’ll get to see who will snitch on who to get a sweet plea deal. Maybe there will be some RICO charges in the mix-which given what occurred, isn’t a stretch to think.

Anyway, thanks for reading as always, and enjoy the juicy action of the 2025-26 NBA season! The season is still young, so it’s anyone’s game!

Michael

Another Crack At Linear Regression NBA Machine Learning Predictions (2025-26 edition)

Hi everybody,

Michael here, and in today’s post, I thought I’d try something a little familiar. You may recall that last October, I released a pair of posts (Python, Linear Regression & An NBA Season Opening Day Special Post and Python, Linear Regression & the 2024-25 NBA season) attempting to predict each NBA team’s win total and conference seeding based off of their performance from the previous 10 seasons.

All in all, after seeing how the season played out-I managed to get only 3/30 teams in the correct seeding. So what would I do here?

I’ll give my ML NBA machine learning predictions another go, also using data from the previous 10 seasons (2015-16 to 2024-25). You may be wondering why I’m trying to predict the outcomes of the upcoming NBA season once more given how off last year’s predictions were-the reason I’m giving the whole “Michael’s NBA crystal ball” thing another go is because I’m not only interested in how my predictions change from one season to the next but also because I plan to use a slightly different model than I did last year (it’ll still be good old linear regression, however) so I can analyze how different factors might play a role in a team’s record and ultimately their conference seeding.

So, without further ado, let’s jump right in to Michael’s Linear Regression NBA Season Predictions II!

Reading the data

Before we dive in to our juicy predictions, the first thing we need to do is read in the data to the IDE. Here’s the file:

NBA analysis 2025-26 Download

Now let’s import the necessary packages and read in the data!

import pandas as pd
from sklearn.model_selection import train_test_split
from pandas.core.common import random_state
from sklearn.linear_model import LinearRegression

from google.colab import files
uploaded = files.upload()

import io

NBA = pd.read_excel(io.BytesIO(uploaded['NBA analysis 2025-26.xlsx']))

NBA.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 300 entries, 0 to 299
Data columns (total 31 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   Season  300 non-null    object 
 1   Team    300 non-null    object 
 2   W       300 non-null    int64  
 3   L       300 non-null    int64  
 4   Finish  300 non-null    int64  
 5   Age     300 non-null    float64
 6   Ht.     300 non-null    object 
 7   Wt.     300 non-null    int64  
 8   G       300 non-null    int64  
 9   MP      300 non-null    int64  
 10  FG      300 non-null    int64  
 11  FGA     300 non-null    int64  
 12  FG%     300 non-null    float64
 13  3P      300 non-null    int64  
 14  3PA     300 non-null    int64  
 15  3P%     300 non-null    float64
 16  2P      300 non-null    int64  
 17  2PA     300 non-null    int64  
 18  2P%     300 non-null    float64
 19  FT      300 non-null    int64  
 20  FTA     300 non-null    int64  
 21  FT%     300 non-null    float64
 22  ORB     300 non-null    int64  
 23  DRB     300 non-null    int64  
 24  TRB     300 non-null    int64  
 25  AST     300 non-null    int64  
 26  STL     300 non-null    int64  
 27  BLK     300 non-null    int64  
 28  TOV     300 non-null    int64  
 29  PF      300 non-null    int64  
 30  PTS     300 non-null    int64  
dtypes: float64(5), int64(23), object(3)
memory usage: 72.8+ KB

As you can see, we’ve still got all 31 features that we had in last year’s dataset-the only difference between this dataset and last year’s is the timeframe covered (this dataset starts with the 2015-16 and ends with the 2024-25 season).

Just like last year, this year’s edition of the predictions comes from http://basketball-reference.com, where you can search up plenty of juicy statistics from both the NBA and WNBA. Also, just like last year, the only thing I changed in the data from Basketball Reference is the Finish variable, which represents a team’s conference finish (seeding-wise) as opposed to divisional finish (since divisional finishes are largely irrelevant for a team’s playoff standings).
If you want a better explanation of these terms, please feel free to refer to last year’s edition of my predictions post-Python, Linear Regression & An NBA Season Opening Day Special Post.

Now that we’ve read our file into the IDE, let’s create our model!

Creating the model

You may recall that last year, before we created the model, we used the Select-K-Best algorithm to help us pick the optimal model features. For a refresher, here’s what Select-K-Best chose for us:

['L', 'Finish', 'Age', 'FG%', '3P%']

After seeking the five best features for our model from the Select-K-Best algorithm, this is what we got. However, we’re not going to use the Select-K-Best suggestions this year as there are other factors I’d like to analyze when it comes to making upcoming season NBA predictions.

Granted, I’ll keep the Finish, FG%, and 3P% as I feel they provide some value to the model’s predictions, but I’ll also add a few more features of my own choosing:

X = NBA[['FG%', '3P%', '2P%', 'Finish', 'TRB', 'AST', 'STL', 'BLK', 'TOV']]
y = NBA['W']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)

Along with the features I chose from last year’s model, I’ll also add the following other scoring categories:

2P%-The percentage of a team’s successful 2-pointers in a given season
TRB-A team’s total rebounds in a season
AST-A team’s total assists in a season
STL-A team’s total steals in a season
BLK-A team’s total blocks in a season
TOV-A team’s total turnovers in a season

The y-variable will still be W, as we’re still trying to predict an NBA team’s win total for the upcoming season based off of all our x-variables.

Now, let’s create a linear regression model object and run our predictions through that model object:

NBAMODEL = LinearRegression()
NBAMODEL.fit(X_train, y_train)

yPredictions = NBAMODEL.predict(X_test)

yPredictions

array([43.2515066 , 36.4291265 , 55.14626364, 46.01164579, 24.18679591,
       35.59131124, 35.59836527, 49.98114132, 48.57869061, 50.65733101,
       21.296126  , 49.94020238, 31.98306604, 41.89217714, 45.65373458,
       50.57831266, 32.76923727, 45.6898562 , 20.4393901 , 55.28944034,
       52.79027154, 21.81113366, 50.79142468, 50.95798684, 53.23802534,
       50.00199063, 48.4639119 , 49.1671417 , 51.12760913, 31.20606334,
       45.3090483 , 25.02488097, 43.67955061, 48.47484838, 33.74041157,
       41.7463038 , 36.10796911, 40.5399278 , 35.30656175, 16.92677689,
       49.77947698, 39.2160337 , 22.08871355, 31.83549487, 15.2675987 ,
       18.24486804, 21.71657476, 42.21505537, 22.84745758, 25.56862333,
       43.6212702 , 20.28339646, 44.60289296, 49.20316062, 53.69182149,
       29.48304908, 44.60789347, 42.44466633, 55.93637972, 54.89728291])

Just as with last year’s model, the predictions are run on the test dataset, which consists of the last 60 of the dataset’s 300 total records.

And now for the equation…

Now that we’ve generated predictions for our test dataset, let’s find out all of the coefficients and the intercept for the equation I will use to make this year’s NBA predictions:

NBAMODEL.coef_

array([ 6.48260593e+01,  1.13945178e+02,  1.54195451e+01, -1.94822281e+00,
        1.10428617e-02, -3.46015457e-03,  2.15326621e-02,  6.63810730e-03,
       -9.70593407e-03])

NBAMODEL.intercept_

np.float64(-60.37720744829896)

Now that we know what our coefficients are, let’s see what this year’s equation looks like:

Although it’s much more of a mouthful than last year’s equation, it follows the same logic in that it uses the features of this year’s model in the order that I listed them:

['FG%', '3P%', '2P%', 'Finish', 'TRB', 'AST', 'STL', 'BLK', 'TOV']

A is FG%, B is 3P%, and so on until you get to I (which represents TOV).

Since all the coefficients are listed in scientific notation, I rounded them to two decimal places before converting them for this equation. Same thing for the intercept.
In case you’re wondering, no you can’t add all the coefficients together for this equation as each coefficient plays a part in the overall equation. Just like last year, we’re going to do the weighted-averages thing to generate projected win totals. Keep your eyes peeled for the next post, which covers the juicy predictions.

…and the accuracy test!

So now that we’ve got our 2025-26 NBA predictions model, let’s see how accurate it is:

from sklearn.metrics import mean_absolute_percentage_error

mean_absolute_percentage_error(y_test,yPredictions)

0.09573425883736708

Using the MASE (mean absolute percentage error) from sklearn like we did in last year’s analysis, we see that the model’s margin of error is roughly 9.57%. I’ll round that up to 10%, which means that despite not choosing the model’s features from a prebuilt algorithm, the overall accuracy of the model is still 90%.

Now, whether the model’s accuracy and my predictions hold up is something I’ll certainly revisit in 8 months time for another end-of-season reflection. After all, last season I only got 3 of the 30 teams in the correct seeding, though I did do better with predicting which teams didn’t make playoffs though.

Recall that to find the accuracy of the model using the MASE, subtract 100 from the (MASE * 100). Since the MASE rounds out to 10 as the nearest whole number (rounded to 2 decimal places), 100-10 gives us an accuracy of 90%

Last but not least, it’s prediction visualization time!

Before we go, the last thing I want to cover is how to visualize this year’s model’s predictions. Just like last year, we’re going to use the PYPLOT module from MATPLOTLIB:

import matplotlib.pyplot as plt

plt.scatter(y_test, yPredictions, color="red")
plt.xlabel('Actual values', size=15)
plt.ylabel('Predicted values', size=15)
plt.title('Actual vs Predicted values', size=15)
plt.show()

As you can see, the plot forms a sort of diagonal-line shape, which reinforces the model’s 90% prediction accuracy rate.

Also, just for comparison’s sake, here’s what my predictions looked like on last year’s model (the one where I used Select-K-Best to choose the model features):

This also looks like a diagonal-line shape, and last year’s model had a 91% accuracy rate.

Here’s the link to the Colab notebook in my GitHub-https://github.com/mfletcher2021/blogcode/blob/main/NBA_25_26_predictions.ipynb

Thanks for reading, and keep an eye out for my 2025-26 season predictions,

Michael

2024-25 NBA Season Predictions Revisited & OKC’s First Championship

Hello readers,

As some of you may recall, I did a pair of NBA 2024-25 series prediction posts around the start of the 2024-25 season back in October (Python, Linear Regression & the 2024-25 NBA season and Python, Linear Regression & An NBA Season Opening Day Special Post).

Well, now that the season has just concluded, I thought it would be a good time to revisit those predictions and reflect on the season that just passed.

Yes, I know this is primarily a coding blog, but all the storylines of this past NBA season (the Finals, the midseason trades, etc.) are certainly worth revisiting. After all, this post will tie into my NBA season opener two-parter (juicy Python AI predictions and all).

But first, a few words on the NBA finals…

Wow wow wow what a Finals that was! I mean, first of all, there were two solid teams in the Indiana Pacers (50-32, 4th in East) and Oklahoma City Thunder (68-14, 1st in West). Second of all, it went the distance as it was the first NBA Finals to go a full 7 games since the 2016 Finals (that of the Warriors’ blown 3-1 lead and the Lebron-led Cavs’ comeback-like-no-other).

The momentum was back-and-forth, and it felt like the series could’ve gone in favor of either the Pacers or Thunder. Tyrese Haliburton’s Game 1 2-point clutch shot to give the Pacers the Game 1 win. OKC rallying back with their defense in Game 2 to keep the Pacers at bay. The Pacers’ blowout Game 6 win at home to even force a Game 7. OKC rallying back in the second half of Game 7, holding the Pacers to just 42 second-half points to ultimately win their first championship (and congrats to Shai Gilgeous-Alexander on winning 2025 NBA Finals & 2025 regular season MVP). I’ve got to hand it to the Pacers’ resilience in Game 7, because even though they didn’t win, they-especially their bench players like TJ McConnell-gave it their all after Tyrese Haliburton went out in the first quarter with a torn Achilles (well wishes to him on recovery).

The mid-season trades (especially THAT trade)

It wouldn’t be an NBA season-in-review/predictions-revisited post without reviewing all of the crazy stuff that went down in the midseason trades.

Personally, as wild as off-season free agency can be (anyone remember LeBron’s “I’m taking my talents to South Beach”?) I think the mid-season trades can be even wilder, especially this season. Let’s go over the most notable examples, shall we?

Jimmy Butler (Miami Heat –> Golden State Warriors)
De’Aaron Fox (Sacramento Kings –> San Antonio Spurs)
Khris Middleton (Milwaukee Bucks –> Washington Wizards)
Andrew Wiggins (Golden State Warriors –> Miami Heat-part of the Jimmy Butler trade package)

And for the most notable 2025 midseason trade (yes THAT trade):

Anthony Davis from the LA Lakers to the Dallas Mavericks/Luka Doncic from the Dallas Mavericks to the LA Lakers

Personally, I thought this takes the cake for most shocking midseason NBA trade in quite some time! I mean, no one expected (least of all Doncic himself) that Dallas would trade away a franchise icon in his prime-he’s 26-for a 32-year-old Anthony Davis. It’s especially wild considering A) it was largely done in secrecy and B) Luka took the Mavs to the Finals just the previous year. Time will tell how this plays out, but personally, I think the Lakers got the better deal (getting Markeiff Morris and Maxi Klieber in that package didn’t hurt either). After all, the Lakers got the 3-seed in the West this year, while the Mavericks fell to the 10-seed and didn’t make it past play-in (though the Mavs’ especially bad luck with injuries didn’t help). Luka could certainly bring the Lakers much success, especially in the post-LeBron years to come.

Side note: it will be interesting to see how Dallas’s #1 overall 2025 draft pick of Cooper Flagg will pan out. Perhaps he’ll be the new Doncic in Dallas and fit right in with Klay, Kyrie and the rest of the Mavs?

And now, we reflect…

So, how accurate were my opening day predications for the season? Let’s review those predictions shall we?

Eastern Conference Edition

So, here were my opening day predictions for how things in the Eastern Conference would pan out:

INTO THE PLAYOFFS	INTO THE PLAY-IN	OUT OF PLAYOFF RUNNING
1. Milwaukee Bucks	7. Cleveland Cavaliers	11. Chicago Bulls
2. Boston Celtics	8. Indiana Pacers	12. Washington Wizards
3. Philadelphia 76ers	9. Atlanta Hawks	13. Charlotte Hornets
4. Miami Heat	10. New York Knicks	14. Orlando Magic
5. Toronto Raptors		15. Detroit Pistons
6. Brooklyn Nets

Here’s how things actually played out:

INTO THE PLAYOFFS	INTO THE PLAY-IN	OUT OF PLAYOFF RUNNING
1. Cleveland Cavaliers	7. Orlando Magic (made playoffs)	11. Toronto Raptors
2. Boston Celtics	8. Atlanta Hawks	12. Brooklyn Nets
3. New York Knicks	9. Chicago Bulls	13. Philadelphia 76ers
4. Indiana Pacers	10. Miami Heat (made playoffs)	14. Charlotte Hornets
5. Milwaukee Bucks		15. Washington Wizards
6. Detroit Pistons

In total, here’s a rundown of how right/off I was:

Only 2/6 teams I picked to get a top-6 spot in the East (and get a direct playoff spot) were correct (Bucks and Celtics)
3/6 teams I picked to get a top-6 spot in the East didn’t even make play-in (Nets, Sixers, and Raptors)
Only 1 correct play-in pick for the East (Hawks)-the other three play-in teams actually got top-6 spots
3/5 teams that I thought weren’t going to make playoffs at least made play-in (Magic, Pistons, and Bulls). None of these teams got further than the Eastern Conference quarterfinals, however.
Of the 15 teams in the East, 7 teams finished better than I expected, 7 teams finished worse than I expected, and 1 team finished in the exact spot I thought they would.
Teams that finished better than I thought they would: Cavaliers, Knicks, Pacers, Pistons, Hawks, Bulls, and Magic
Teams that finished worse than I thought they would: Bucks, Heat, Nets, 76ers, Raptors, Wizards, Hornets
Team that finished exactly as I thought they would: Celtics (2-seed)

Takeaways from the East…

The Miami Heat, despite finishing 10th in the East, earning a play-in spot for the third straight season, and all the Jimmy Butler drama leading up to the midseason trade, still managed to surprise me in that they became the first 10-seed playoff team to make the playoffs. Then they promptly got swept by the Cleveland Cavaliers in the first round, including two straight 30+ point losses at home. Maybe the lottery pick might’ve been better for them?
I know I’ve talked about earlier in this post, but I’ve got to give the Indiana Pacers credit where credit is due. Despite a slow 10-15 start to the season, they eventually found their rhythm, both offensively and defensively. The return of key players like Aaron Neismith and Tyrese Haliburton’s 2nd-half-of-the-season resurgence certainly helped propel the Pacers to the NBA Finals (and help the series go the distance against the OKC Thunder)
On the other side of the coin, teams like the Philadelphia 76ers fell far below expectations this season, finishing with a 13-seed in the East, no playoffs or play-in, and a 24-58 record. The highly-anticipated acquisition of Paul George from the LA Clippers and the formation of a new Sixers’ Big-3 with George, Joel Embiid and Tyrese Maxsey didn’t quite pan out, especially due to a knee injury for George, a torn meniscus for Embiid, and a sprained finger for Maxsey, which ended up becoming season-ending injuries for all three players.
The Milwaukee Bucks, despite finishing with a 5-seed in the East and a solid regular season, certainly didn’t look like the championship Bucks squad of 2020-21. Injuries to Giannis Antetokounmpo and Damian Lilliard didn’t help matters in the playoffs, nor did their blown late-game playoff leads-especially their infamous 118-111 lead in OT of game 5 against the Pacers…that the Pacers still won by 1, thus eliminating the Bucks.

Western Conference Edition

Now let’s see how things panned out in the Western Conference? Here’s how I thought things would go:

INTO THE PLAYOFFS	INTO THE PLAY-IN	OUT OF PLAYOFF RUNNING
1. Denver Nuggets	7. LA Lakers	11. Sacramento Kings
2. LA Clippers	8. Memphis Grizzlies	12. New Orleans Pelicans
3. Golden State Warriors	9. Oklahoma City Thunder	13. San Antonio Spurs
4. Phoenix Suns	10. Minnesota Timberwolves	14. Portland Trailblazers
5. Dallas Mavericks		15. Houston Rockets
6. Utah Jazz

And here’s how things actually went for the teams in the West:

INTO THE PLAYOFFS	INTO THE PLAY-IN	OUT OF PLAYOFF RUNNING
1. Oklahoma City Thunder	7. Golden State Warriors (made playoffs)	11. Phoenix Suns
2. Houston Rockets	8. Memphis Grizzlies (made playoffs)	12. Portland Trail Blazers
3. LA Lakers	9. Sacramento Kings	13. San Antonio Spurs
4. Denver Nuggets	10. Dallas Mavericks	14. New Orleans Pelicans
5. LA Clippers		15. Utah Jazz
6. Minnesota Timberwolves

In total, here’s a rundown of how right/off I was (Western Conference edition).

It’s worth noting that my Western Conference predictions have several similarities with my Eastern Conference predictions:
- 2/6 teams that I predicted would get a top-6 spot did just that (Clippers and Nuggets)
- 3/6 teams that I predicted would get a top-6 spot didn’t even make the play-in (Suns, Mavericks and Jazz)
- Only 1 correct play-in pick for the west (Grizzlies)-the other three teams were straight shots into the playoffs (with “9-seed” OKC finishing as the champions)
- For both the East and West, the teams I had finishing as the “bottom seed” both made it directly to playoffs (with Pistons as 6-seed and Rockets as 2-seed)
- There were some differences between my East predictions and West predictions:
- Only 2/6 teams that I thought weren’t going to playoffs at least made play-ins (Kings and Rockets). Just like the East though, neither of these teams made it past the Western Conference quarterfinals.
- In total, 6 teams in the West finished better than I expected, 7 teams finished worse than I expected, and 2 teams finished in the exact spot I expected.
- Teams that finished better than I thought they would: Lakers, Thunder, Timberwolves, Kings, Trailblazers, Rockets
- Teams that finished worse than I thought they would: Nuggets, Clippers, Warriors, Mavericks, Jazz, Suns, Pelicans
- Teams that finished in the exact spot I thought they would: Spurs (13-seed) and Grizzlies (8-seed)

Takeaways from the west…

One team I certainly underestimated was the Houston Rockets. To think my Michael-made ML model put them as the 15-seed in the West-boy did they outperform my model’s expectations (finishing as the 2-seed in the West). Sure, they did get upset in the Western Conference first round by the play-in Warriors, but the combination of good veterans with Fred VanVleet and Jeff Green, solid offensive contributors like Alpern Sengun, solid defensive contributors with guys like Amen Thompson, and coach Ime Udoka, I think the Rockets will go far in the years to come. Now to see how adding KD in free agency will elevate the Rockets.
And now for the Warriors…I mean, at least they’ve got a solid Big-3 in Draymond Green, Jimmy Butler and of course, Stephen Curry. However, they did have an up-and-down season, finishing with a 48-34 record and a 7-seed play-in spot. The mid-season acquisition of Jimmy Butler from the Heat certainly gave the Warriors more defensive spark (and in my opinion, I think Miami got a solid asset in Andrew Wiggins). I think it would do the Warriors a lot of good to think about their future-after all, they can only roll with guys like Jimmy Butler and Draymond Green for so long-both of whom will turn 36 next season. Plus I think it would be satisfying to make one more championship run with Steph Curry and Draymond Green, who have been with the Warriors for 15+ years and four championships.
On the other end of the playoff coin, what happened to the bottom-of-the-West (and bottom-of-the-NBA) Utah Jazz? Granted, I know they haven’t been great since the departures of Rudy Gobert and Donovan Mitchell, but losing one of their most promising assets in Taylor Hendricks to a fractured fibula and dislocated ankle just three games into the season certainly didn’t help matters for the Jazz. Hopefully the Jazz’ 2025 draft class-led by small forward Ace Bailey-will reverse their fortunes and bring them back into playoff contention. Maybe adding Kevin Love in free agency will give them a boost of veteran player-coach leadership?
Last but not least, let’s talk about the San Antonio Spurs-one of only three teams whose seeding I got exactly right (13-seed). Granted, going from 14-seed last season to 13-seed this season might not sound like a huge improvement, but I think Victor Wembanyama definitely has potential to be a franchise icon, as he showed his defensive potential this season despite his season-ending blood clot after All-Star Break. The mid-season acquisition of De’Aaron Fox and the coaching transition from Coach “Pop” Popovich to Mitch Johnson could help rejuvenate the Spurs and, who knows, maybe even bring them back to the playoffs for the first time since 2019.

Thanks for reading. What a great NBA season it was! I’ll certainly admit I had fun writing this 3-post series to not only try and predict the upcoming NBA season, but also to reflect on how my predictions fared after the season has concluded. After all, an important skill of data analysis is learning how to gather insights from the data!

Michael

Also, congrats OKC on winning the 2025 NBA championship!

Python, Linear Regression & the 2024-25 NBA season

Hello everybody!

Michael here, and in today’s post, we’ll continue where we left off from the previous post Python, Linear Regression & An NBA Season Opening Day Special Post. As I mentioned in that post, we’ll use the linear regression equation we obtained to see if we can obtain predictions for the current 2024-25 NBA season?

Disclaimer

Yes, I know I’m trying to predict the various juicy outcomes of the 2024-25 NBA season, but these predictions are purely meant for educational purposes to display the methodology of the predictions, not for game-day parlays and/or your fantasy NBA team. After all, I am your friendly neighborhood coding blogger, but I am not your friendly neighborhood sportsbook. If you do decide to bet on anything during the NBA season, please bet responsibly :-).

Previously on Michael’s Programming Bytes…

In the previous post, we used data from the last 10 NBA seasons for each of the 30 teams to predict season record results, which in turn gave us this linear regression equation that I will use to predict team-by-team results and standings for the 2024-25 NBA season:

Just to recap, here’s what in this equation:

-0.47x (represents team’s losses in a given season)
-1.31x (represents team’s conference finish from 1-15 in a given season)
0.4x (represents average age of team’s roster)
34.13x (represents % of field goals made)
-22.12x (represents % of 3-pointers made)
50.95 (linear regression model intercept)

Our predictions generated in the previous post came back with 91% accuracy/9% mean absolute percentage error, so I can tell we’re gonna get some good predictions here.

And now, for the predictions…

Yes, here comes the fun part, the predictions. For the predictions, I gathered the weighted averages of the five features we used in our model (losses, conference finish, average roster age, % of field goals made and % of 3-pointers made) and placed them into this spreadsheet:

NBA weighted averages Download

Now, how did I calculate the weighted averages of these five features for each team? Well, I simply assigned different weights for different seasons like so:

2021-22 to 2023-24 seasons-0.2 weight (higher weight for the three most recent seasons)
2018-19 to 2020-21 seasons-0.1 weight (they’re a little further back, plus I factored in COVID impacts to the 2019-20 and 2020-21 seasons)
2014-15 to 2017-18 season-0.025 weight (smaller weight since these are the furthest in the past, plus many players in the league during this time have since retired)

After assigning these weights, I calculated averages using the standard procedure for average calculation.

Here’s the basic Python code I used to calculate projected wins for all 30 NBA teams:



import pandas as pd

NBAAVG = pd.read_csv(r'C:\Users\mof39\OneDrive\Documents\NBA weighted averages.csv')

for n in NBAAVG['Team']:
    print(str(-0.47*NBAAVG['L']-1.31*NBAAVG['Finish']
                                      +0.4*NBAAVG['Age']+34.13*NBAAVG['FG%']
                                      -22.15*NBAAVG['3P%']+50.95))
    break

Here’s the GitHub link to that script-https://github.com/mfletcher2021/blogcode/blob/main/NBA%20averages.py

And here are the projected win totals for each team using this equation:

0     37.911934
1     52.863761
2     40.819851
3     31.742252
4     37.524958
5     40.441851
6     42.540851
7     51.103223
8     24.263654
9     45.852691
10    33.197160
11    38.736829
12    47.364055
13    41.338946
14    41.297291
15    45.202762
16    53.722600
17    39.185063
18    37.443009
19    38.462010
20    39.284500
21    32.296571
22    47.795819
23    45.063567
24    33.312626
25    37.493793
26    32.519145
27    42.072515
28    41.920285
29    31.773436

Granted, you don’t actually see the team names in this output, but since the team names are organized alphabetically in the dataset you can tell which team corresponds to which projected win total. However, just for clarity, I’ll elaborate on those totals below:

Atlanta Hawks: 37.911934 wins (38-44)
Boston Celtics: 52.863761 wins (53-29)
Brooklyn Nets: 40.819851 wins (41-41)
Charlotte Hornets: 31.742252 wins (32-50)
Chicago Bulls: 37.524958 wins (38-44)
Cleveland Cavaliers: 40.441851 wins (40-42)
Dallas Mavericks: 42.540851 wins (43-39)
Denver Nuggets: 51.103223 wins (51-31)
Detroit Pistons: 24.263654 wins (24-58)
Golden State Warriors: 45.852691 wins (46-36)
Houston Rockets: 33.197160 wins (33-49)
Indiana Pacers: 38.736829 wins (39-43)
LA Clippers: 47.364055 wins (47-35)
LA Lakers: 41.338946 wins (41-41)
Memphis Grizzlies: 41.297291 wins (41-41)
Miami Heat: 45.202762 wins (45-37)
Milwaukee Bucks: 53.722600 wins (54-28)
Minnesota Timberwolves: 39.185063 wins (39-43)
New Orleans Pelicans: 37.443009 wins (37-45)
New York Knicks: 38.462010 wins (38-44)
Oklahoma City Thunder: 39.284500 wins (39-43)
Orlando Magic: 32.296571 wins (32-50)
Philadelphia 76ers: 47.795819 wins (48-34)
Phoenix Suns: 45.063567 wins (45-37)
Portland Trailblazers: 33.312626 wins (33-49)
Sacramento Kings: 37.493793 wins (37-45)
San Antonio Spurs: 32.519145 wins (33-49)
Toronto Raptors: 42.072515 wins (42-40)
Utah Jazz: 41.920285 wins (42-40)
Washington Wizards: 31.773436 wins (32-50)

As you can see above, I have managed to predict the records for each team for the 2024-25 NBA season. A few things to note about my predictions:

Since NBA records are only counted in whole numbers, I rounded each team’s projected win total up or down to the nearest whole number. For instance, for the Milwaukee Bucks, since their projected win total was 53.722600, I rounded that up to 54 wins (and a 54-28 record).
According to my model, all team’s projected win totals fall between 24 and 54 wins. This make sense since in a given NBA season, a majority of teams’ win totals fall in the 24-54 win range. In the last NBA season (2023-24), 21 teams fell within the 24-54 win range.
Four teams obtained over 54 wins (Celtics with 64, Thunder and Nuggets with 57, and Timberwolves with 56) while five teams obtained less than 24 wins (Spurs with 22, Hornets and Trailblazers with 21, Wizards with 15 and Pistons with 14).
One thing to note about my predictions is that while I rounded up or down to the nearest whole number to get a projected record total, I’ll still factor in the entire decimal (e.g. 45.202762 for the Heat) when deciding how to seed teams, as teams with a higher decimal will be seeded higher in their respective conference.

Michael’s Magnificently Way-To-Early Playoff Picture

Yes, now that we have projected record totals for each of the 30 teams, the next thing we’ll do is predict each team’s seeding.

How will we seed the teams? Well, for one, I’ll rank the teams with the higher projected records higher in their respective conference. For instance, since the Bucks have a higher projected record than the Celtics, I’ll rank the Bucks higher than the Celtics.

However, what if two teams have a really, really close margin between them? For instance, the Minnesota Timberwolves and Oklahoma City Thunder’s projected records of 39.185063 wins and 39.284500 wins respectively are very close to each other. However, since OKC has a slightly higher projected win total, I’ll rank them higher than the Timberwolves.

So without further ado, here’s Michael’s Magnificently Way-Too-Early Playoff Picture!

Eastern Conference

INTO THE PLAYOFFS	INTO THE PLAY-IN	OUT OF PLAYOFF RUNNING
1. Milwaukee Bucks	7. Cleveland Cavaliers	11. Chicago Bulls
2. Boston Celtics	8. Indiana Pacers	12. Washington Wizards
3. Philadelphia 76ers	9. Atlanta Hawks	13. Charlotte Hornets
4. Miami Heat	10. New York Knicks	14. Orlando Magic
5. Toronto Raptors		15. Detroit Pistons
6. Brooklyn Nets

Western Conference

INTO THE PLAYOFFS	INTO THE PLAY-IN	OUT OF PLAYOFF RUNNING
1. Denver Nuggets	7. LA Lakers	11. Sacramento Kings
2. LA Clippers	8. Memphis Grizzlies	12. New Orleans Pelicans
3. Golden State Warriors	9. Oklahoma City Thunder	13. San Antonio Spurs
4. Phoenix Suns	10. Minnesota Timberwolves	14. Portland Trailblazers
5. Dallas Mavericks		15. Houston Rockets
6. Utah Jazz

And now, for some insights

Now that we have our predictions for both team’s projected win totals and projected conference seeding, let’s see if we can gather some insights into what the 2024-25 NBA season might bring for all 30 teams. Without further ado, here are insights across the NBA that I think will be interesting to see play out over the course of the season:

Will the Celtics repeat as champs?

For those who don’t know, the Boston Celtics came out on top as the champions of the 2023-24 NBA season, beating the Dallas Mavericks in 5 games in the 2024 NBA Finals.

Question is, can they do it again? There’s a good chance that can happen, even with the projected 2-seed in the Eastern Conference. After all, the Celtics have kept many of their key playmakers from their championship squad such as Al Horford, Derrick White, Jaylen Brown and of course, Jayson Tatum.

Interestingly, we’ve had SIX different teams win the NBA championship in the last six seasons, such as:

2019-Raptors
2020-Lakers
2021-Bucks
2022-Warriors
2023-Nuggets
2024-Celtics

Could we have a repeat champ for the first time since those seemingly endless Warriors-Cavs finals (remember those)? I’ll reiterate that it’s certainly possible, especially with Tatum in his prime.

Warriors for a deep playoff run?

Yes, I know they’ve had their ups and downs over the last 10 years, but after all, the Golden State Warriors have won 4 championships over the last 10 years, so I have reason to believe they’ll go on another deep playoff run.

Will the loss of Klay Thompson hurt? Yes. Stephen Curry is also on the back-nine of his career (he turns 37 in March), but he did put up the most points per game of anyone on the Warriors’ roster last season (26.4). Curry also had the highest 3-pointer percentage of anyone on the Warriors’ roster last season (40.8%)-recall that successful 3-pointer percentage was one of the five features I used in the linear regression model. Plus, Draymond Green will be returning to the Warriors this eason; he proved to be one of the Warriors’ strongest 3-point shooters and rebounders last season (though he is also in his later career as he will be 35 in March).

Interestingly, this model has the Warriors going 46-36 as the 3-seed in the Western Conference. Funny enough, the Warriors finished 46-36 last season but ended up as the 10-seed in the Western Conference and failed to make it past play-in.

This brings me to my next point…

Will the West be close again?

Last season, the Western Conference was incredibly close when it came to win totals and playoff seeding. After all, the 6-seed in the West last year (Phoenix Suns) still finished with a 49-33 record…and were promptly swept in the Western Conference first round (though that’s neither here nor there).

Another thing to put the closeness of last year’s Western Conference playoff race into perspective-the Warriors finished 46-36 yet only notched a 10-seed and the Houston Rockets finished with an even 41-41 record but missed the postseason entirely (they got the 11-seed).

Which brings be to my next point…

Will the East be far apart?

While last year’s Western Conference was quite competitive, the Eastern Conference was, well, another story:

Image from Wikipedia: https://en.wikipedia.org/wiki/2023%E2%80%9324_NBA_season.

Yes, the Celtics not only got the 1-seed in the East but also finished FOURTEEN games ahead of the 2-seed New York Knicks (yes, the Knicks finished 50-32 and still got the 2-seed). Two teams that had very up-and-down seasons-the Bulls and Hawks-both finished with under 40 wins yet still qualified for the play-in as the 9- and 10-seeds in the East, respectively.

Miami Heat to the play…offs?

Throughout the last 10 years, the Miami Heat have had a great deal of success, making it to the Finals twice in that span (’20 and ’23) and making it to the playoffs 7 of the last 10 seasons (exceptions being ’15, ’17 and ’19).

However, while they did make the playoffs the last two seasons, they had to do so through first making it through the play-ins-both times they made it as the 8-seed in the play-in (meaning they had to play two play-in games to even get a playoff slot).

In this model however, the Miami Heat will earn the 4-seed and make the actual playoffs, not the play-in. What could possibly work to their advantage? Here are a few factors:

While their successful field goal percentage was in the bottom half of the league last season, they came in 12th amongst all teams in successful 3-pointer percentage, which should help their case.
After losing Jimmy Butler and Terry Rozier before play-offs last season, both are now (as of this writing) healthy and ready to play.
Those 42.3 rebounds (both offensive and defensive) last year look pretty good.
Tyler Herro, Bam Adebayo and Jimmy Butler were the top-3 scorers on the Heat in both points per game and field goals last year…those stats certainly matter for big games. Plus Herro is 24 and Adebayo is 27, so both are still in the primes of their careers (though Jimmy Butler at 35 still plays like he’s in his prime in my opinion)

Will the Heat win an NBA championship or make another Finals appearance? TBD. However, it looks like (according to this model I made) that they will at least make it to the play-offs without needing to go through play-ins first (though their 8-seed to-the-Finals run in 2023 was certainly memorable).

And now, for the bottom of the conference

Most of my insights discussed more successful teams and (potential) deep playoff runs. However, I wanted to offer one more insight concerning the two teams at the (projected) bottom of their conferences-the Pistons in the East and Rockets in the West.

First off-the Detroit Pistons, who, according to my model, are projected to be the 15-seed again (they were the 15-seed last season); will they manage to improve this season? My guess is yes-at least in terms of having more wins this season (only 14 wins last year)-but I don’t think they’ll make a strong playoff run, and I know a 28 game losing streak last season to drop the Pistons to 2-30 at one point didn’t help make a case for their postseason hopes. However, give the Pistons credit for changing their coach (now JB Bickerstaff) and GM (now Trajan Langdon) and adding some solid free agents like Tobias Harris (48.7% of successful field goals last season-not too shabby). Again, I doubt they’ll make a strong playoff run, but they could very well finish higher than the 15-seed.

As for the Houston Rockets (projected 15-seed in the West), they finished in the 11-seed last year in a competitive Western Conference with an even 41-41 record. Judging from last years stats-coming in 9th on defense but 20th on offense-they do have some work to do to make a deep playoff run. However, with a good mix of young players like Tari Eason and veterans like Fred VanVleet (who was on the championship 2019 Toronto Raptors), the Rockets could make it past play-in.

Just for fun…Michael’s Play-In Predictions

Now for an added bonus for my loyal readers, here are my educated guess, just-for-fun play-in predictions for both the Eastern and Western conferences. Granted, while the model I made did help predict regular-season seeding in each conference, it didn’t predict who would make it past play-in to grab the 7- and 8-seeds in the conference. So without further ado, here are my play-in predictions based on what I saw in the teams last season:

Eastern Conference

Predictions: Pacers 7-seed, Cavaliers 8-seed

Western Conference

Predictions: Lakers 7-seed, Timberwolves 8-seed

Thanks for reading and I hope you learned something new from this post! Enjoy the NBA season and I will follow up with a Part 3 post on this topic sometime in April, or at least some time after the conclusion of the regular season. It will be interested to see how accurate or off my predictions were.

Michael

Python, Linear Regression & An NBA Season Opening Day Special Post

Hello readers,

Michael here, and in today’s lesson, we’re gonna try something special! For one, we’re going back to this blog’s statistical roots with a linear regression post; I covered linear regression with R in the way, way back of 2018 (R Lesson 6: Linear Regression) on this blog, so I thought I’d show you how to work the linear regression process in Python. Two, I’m going to try something I don’t normally do, which is predict the future. In this case, the future being the results of the just-beginning 2024-25 NBA season. Why try to predict NBA results you might ask? Well, for one, I wanted to try something new on this blog (hey, gotta keep things fresh six years in), and for two, I enjoy following along with the NBA season. Plus, I enjoyed writing my post on the 2020 NBA playoffs-R Analysis 10: Linear Regression, K-Means Clustering, & the 2020 NBA Playoffs.

Let’s load our data and import our packages!

Before we get started on the analysis, let’s first load our data into our IDE and import all necessary packages:

import pandas as pd
from sklearn.model_selection import train_test_split
from pandas.core.common import random_state
from sklearn.linear_model import LinearRegression

You’re likely quite familiar with pandas but for those of you that don’t know, sklearn is an open-source Python library commonly used for machine learning projects (like the linear regression we’re about to do)!

A note about uploading files via Google Colab

Once we import our necessary packages, the next thing we should do is upload the data-frame we’ll be using for this analysis.

This is the file we’ll be using; it contains team statistics such as turnovers (team total) and wins for all 30 NBA teams for the last 10 seasons (2014-15 to 2023-24). The data was retrieved from basketball-reference.com, which is a great place to go if you’re looking for juicy basketball data to analyze. This site comes from https://www.sports-reference.com/, which contains statistics on various sports from NBA to NFL to the other football (soccer for Americans), among other sports.

NBA analysis Download

Now, since I used Google Colab for this analysis, I’ll show you how to upload Excel files into Colab (a different process from uploading Excel files into other IDEs):

To import local files into Google Colab, you’ll need to include the lines from google.colab import files and uploaded = files.upload() in the notebook since, for some odd reason, Google Colab won’t let you upload local files directly into your notebook. Once you run these two lines of code, you’ll need to select a file from the browser tool that you want to upload to Colab.

Next (and ideally in a separate cell), you’ll need to add the lines import io and dataframe = pd.read_csv(io.BytesIO(uploaded['dataframe name'])) to the notebook and run the code. This will officially upload your data-frame to your Colab notebook.

Yes, I know it’s annoying, but that’s just how Colab works. If you’re not using Colab to follow along with me, feel free to skip this section as a simple pd.read_csv() will do the trick to upload your data-frame onto the IDE.

Let’s learn about our data-frame!

Now that we’ve uploaded our data-frame into the IDE, let’s learn more about it!

NBA.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 300 entries, 0 to 299
Data columns (total 31 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   Season  300 non-null    object 
 1   Team    300 non-null    object 
 2   W       300 non-null    int64  
 3   L       300 non-null    int64  
 4   Finish  300 non-null    int64  
 5   Age     300 non-null    float64
 6   Ht.     300 non-null    object 
 7   Wt.     300 non-null    int64  
 8   G       300 non-null    int64  
 9   MP      300 non-null    int64  
 10  FG      300 non-null    int64  
 11  FGA     300 non-null    int64  
 12  FG%     300 non-null    float64
 13  3P      300 non-null    int64  
 14  3PA     300 non-null    int64  
 15  3P%     300 non-null    float64
 16  2P      300 non-null    int64  
 17  2PA     300 non-null    int64  
 18  2P%     300 non-null    float64
 19  FT      300 non-null    int64  
 20  FTA     300 non-null    int64  
 21  FT%     300 non-null    float64
 22  ORB     300 non-null    int64  
 23  DRB     300 non-null    int64  
 24  TRB     300 non-null    int64  
 25  AST     300 non-null    int64  
 26  STL     300 non-null    int64  
 27  BLK     300 non-null    int64  
 28  TOV     300 non-null    int64  
 29  PF      300 non-null    int64  
 30  PTS     300 non-null    int64  
dtypes: float64(5), int64(23), object(3)
memory usage: 72.8+ KB

Running the NBA.info() command will allow us to see basic information about all 31 columns in our data-frame (such as column names, amount of records in dataset, and object type).

In case you’re wondering about all the abbreviations, here’s an explanation for each abbreviation:

Season-The specific season represented by the data (e.g. 2014-15)
Team-The team name
W-A team’s wins in a given season
L-A team’s losses in a given season
Finish-The seed a team finished in during a given season in their conference (e.g. Detroit Pistons finishing 15th seed in the East last season)
Age-The average age of a team’s roster as of February 1 of a given season (e.g. February 1, 2024 for the 2023-24 season)
Ht.-The average height of the team’s roster in a given season (e.g. 6’6)
Wt.-The average weight (in lbs.) of the team’s roster in a given season
G-Total amount of games played by the team in a given season
MP-Total minutes played as a team in a given season
FG-Field goals scored by the team in a given season
FGA-Field goal attempts made by the team in a given season
FG%-Percent of successful field goals made by team in a given season
3P-3-point field goals scored by the team in a given season
3PA-3-point field goal attempts made by the team in a given season
3P%-Percent of successful 3-point field goals made by the team in a given season
2P-2-point field goals scored by the team in a given season
2PA-2-point field goal attempts made by the team in a given season
2P%-Percent of successful 2-point field goals made by the team in a given season
FT-Free throws scored by the team in a given season
FTA-Free throw attempts made by the team in a given season
FT%-Percent of successful free throw attempts made by the team in a given season
ORB-Team’s total offensive rebounds in a given season
DRB-Team’s total defensive rebounds in a given season
TRB-Team’s total rebounds (both offensive and defensive) in a given season
AST-Team’s total assists in a given season
STL-Team’s total steals in a given season
BLK-Team’s total blocks in a given season
TOV-Team’s total turnovers in a given season
PF-Team’s total personal fouls in a given season
PTS-Team’s total points scored in a given season

Wow, that’s a lot of variables! Now that understand know the data we’re working with better, let’s see how we can make a simple linear regression model!

If you’re not familiar with basketball jargon, the NBA has a great glossary of basic terms on their website: https://www.nba.com/stats/help/glossary

The K-Best Way To Set Up Your Model

Before we start the juicy analysis, let’s first pick the features we will use for the model. In this post, we’ll explore the Select K-Best algorithm, which is an algorithm commonly used in linear regression to help select the best features for a particular model:

X = NBA.drop(['Season', 'Team', 'W', 'Ht.'], axis=1)
y = NBA['W']

from sklearn.feature_selection import SelectKBest, f_regression
features = SelectKBest(score_func=f_regression, k=5)
features.fit(X, y)

selectedFeatures = X.columns[features.get_support()]
print(selectedFeatures)

Index(['L', 'Finish', 'Age', 'FG%', '3P%'], dtype='object')

According to the Select K-Best algorithm, the five best features to use in the linear regression are L, Finish, Age, FG% and 3P%. In other words, a team’s end-of-season seeding, total losses, average roster age, and percentage of successful field goals and 3-pointers are the five most important features to predict a team’s win total.

How did the model arrive to these conclusions? First of all, I set the X and y variables-this is important as the Select K-Best algorithm needs to know what is the dependent variable and what are possible independent variable selections that can be used in the model. In this example, the dependent (or y) variable is W (for team wins) while the X variable includes all other dataset columns except for W, Team, Season, and Ht. because W is the y variable and the other three variables are categorial (or non-numerical) variables, so they really won’t work in our analysis.

Next we import the SelectKBest and f_regression packages from the sklearn.feature_selection module. Why do we need these two packages? Well, SelectKBest will allow us to use the Select K-Best algorithm while f_regression is like a back-end feature selection method that allows the Select K-Best algorithm to select the best x-amount of features for the model (I used five features for this model).

After setting up the Select K-Best algorithm, we then fit both the X and y variables to the algorithm and then print out our top five selectedFeatures.

Train, test…split!

Once we have our top five features for model, it’s time for the train, test, splitting of the model! What is train, test, split you ask? Well, our linear regression model will be split into two types of data-training data (the data we use for training the model) and testing data (the data we use to test our model). Here’s how we can utilize the train, test, split for this model:

X = NBA[['L', 'Finish', 'Age', 'FG%', '3P%']]
y = NBA['W']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)

How does the train, test, split work? Using sklearn’s train_test_split method, we pass in four parameters-our independent variables (X), our dependent variable (y), the size of the test data (a decimal between 0 and 1), and the random state (this can be kept at 0, but it doesn’t matter what number you use-42 is another common number). In this model, I will utilize an 80/20 train, test, split, which indicates that 80% of the data will be for training while the other 20% will be used for testing.

Other common train, test, splits are 70/30, 85/15, and 67/33, but I opted for 80/20 because our dataset is only 300 rows long. I would utilize these other train, test, splits for larger datasets.

Something worth noting: What we’re doing here is called multiple linear regression since we’re using five X variables to predict a Y variable. Simple linear regression would only use one X variable to predict a Y variable. Just thought I’d throw in this quick factoid!

And now, for the model-making

Now that we’ve done all the steps to set up our model, the next thing we’ll need to do is actually create the model!

Here’s how we can get started:

NBAMODEL = LinearRegression()
NBAMODEL.fit(X_train, y_train)

LinearRegression()

In this example, we create a LinearRegression() object (NBAMODEL) and fit it to both the X_train and y_train data.

Predictions, predictions

Once we’ve created our model, next comes the fun part-generating the predictions!

yPredictions = NBAMODEL.predict(X_test)

yPredictions

array([53.20097648, 28.89541793, 52.26551381, 53.22220829, 35.90676716,
       32.15874993, 47.72090936, 48.32896277, 39.4193884 , 40.1548429 ,
       19.62678175, 48.3263792 , 32.13473281, 43.50887634, 43.85260484,
       52.79795145, 27.35822648, 40.23392095, 18.85423981, 61.69624816,
       51.59650403, 23.86311747, 56.18087097, 54.15867678, 49.75211403,
       46.90177259, 31.80109001, 46.82531833, 37.50563942, 32.19863141,
       52.41205133, 25.09011881, 48.94542256, 38.80244997, 24.80146638,
       42.50107728, 43.27320835, 37.45199938, 46.7795962 , 28.11289951,
       57.64388881, 29.35812466, 18.3222965 , 36.26677012, 20.56912227,
       22.15266241, 19.9955299 , 44.84930613, 45.14740453, 23.19471644,
       53.940611  , 26.0780373 , 27.88093669, 61.23347337, 52.99948229,
       34.66653881, 30.04421016, 27.21669768, 48.55215233, 47.11060905])

The yPredictions are obtained through using the predict method on the model’s X_train data, which in this case consists of 60 of the 300 records..

Evaluating the model’s accuracy

Once we’ve created the model and made our predictions on the training data, it’s time to evaluate the model’s accuracy. Here’s how to do so:

from sklearn.metrics import mean_absolute_percentage_error

mean_absolute_percentage_error(y_test,yPredictions)

0.09147159762376074

There are several ways you can evaluate the accuracy of a linear regression model. One good method as shown here is the mean_absolute_percentage_error (imported from the sklearn.metrics package). The mean absolute percentage error evaluates the model’s accuracy by indicating how off the model’s predictions are. In this model, the mean absolute percentage error is 0.09147159762376074, indicating that the model’s predictions are off by roughly 9%-which also indicates that overall, the model’s predictions are roughly 91% accurate. Not too shabby for this model!

Interestingly, the two COVID impacted NBA seasons in the dataset (2019-20 and 2020-21) didn’t throw off the model’s accuracy much.

Don’t forget about the equation!

Evaluating the model’s accuracy isn’t the only thing you should do when analyzing the model. You should also grab the model’s coefficients and intercept-they will be important in the next post!

NBAMODEL.coef_

array([ -0.4663858 ,  -1.30716212,   0.39700734,  34.1325687 ,
       -22.12258585])

NBAMODEL.intercept_

50.945769772855854

All linear regression models will have a coefficient and an intercept, which form the linear regression equation. Since our model had five X variables, there are five coefficients.

Now, what would our equation look like?

Here is the equation in all it’s messy glory. We’re going to be using this equation in the next post.

Linear regression plotting

For the visual learners among my readers, I thought it would be nice to include a simple scatterplot to visualize the accuracy of our linear regression model. Here’s how to create that plot:

import matplotlib.pyplot as plt
plt.scatter(y_test, yPredictions, color="red")
plt.xlabel('Actual values', size=15)
plt.ylabel('Predicted values', size=15)
plt.title('Actual vs Predicted values', size=15)
plt.show()

First, I imported the matplotlib.pyplot module. Then, I ran the plt.scatter() method to create a scatterplot. I used three parameters for this method: the y_test values, the yPredictions values, and the color="red" parameter (this just indicated that I wanted red scatterplot dots). I then used the plt.xlabel(), plt.ylabel(), and plt.title() methods to give the scatterplot an x-label title, y-label title, and title, respectively. Lastly, I used the plt.show() method to display the scatterplot in all of its red-dotted glory.

As you can see from this plot, the predicted values match the actual values fairly closely, hence the 91% accuracy/9% error.

Thanks for reading, enjoy the upcoming NBA season action, and stay tuned for my next post where I reveal my predicted records and standings for each team, East and West! It will be interesting to see how my predictions pan out over the course of the season-after all, it’s certainly something different I’m trying on this blog!

And yes, perfect timing for this blog to come out on NBA season opening day! Serendipity am I right?

Also, here’s a link to the notebook in GitHub-https://github.com/mfletcher2021/DevopsBasics/blob/master/NBA_24_25_predictions.ipynb.