Python, Linear Regression & the 2024-25 NBA season

Hello everybody!

Michael here, and in today’s post, we’ll continue where we left off from the previous post Python, Linear Regression & An NBA Season Opening Day Special Post. As I mentioned in that post, we’ll use the linear regression equation we obtained to see if we can generate predictions for the current 2024-25 NBA season.

Disclaimer

Yes, I know I’m trying to predict the various juicy outcomes of the 2024-25 NBA season, but these predictions are purely meant for educational purposes to display the methodology of the predictions, not for game-day parlays and/or your fantasy NBA team. After all, I am your friendly neighborhood coding blogger, but I am not your friendly neighborhood sportsbook. If you do decide to bet on anything during the NBA season, please bet responsibly :-).

Previously on Michael’s Programming Bytes…

In the previous post, we used data from the last 10 NBA seasons for each of the 30 teams to predict season record results, which in turn gave us this linear regression equation that I will use to predict team-by-team results and standings for the 2024-25 NBA season:

W = -0.47(L) - 1.31(Finish) + 0.4(Age) + 34.13(FG%) - 22.12(3P%) + 50.95

Just to recap, here’s what’s in this equation:

  • -0.47x (represents team’s losses in a given season)
  • -1.31x (represents team’s conference finish from 1-15 in a given season)
  • 0.4x (represents average age of team’s roster)
  • 34.13x (represents % of field goals made)
  • -22.12x (represents % of 3-pointers made)
  • 50.95 (linear regression model intercept)
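As a minimal sketch, the recap above can be written as a small Python function. The function name predicted_wins and the sample inputs are hypothetical (they are not from my spreadsheet), and the sketch assumes FG% and 3P% are stored as fractions such as 0.47 rather than 47.

```python
def predicted_wins(losses, finish, age, fg_pct, three_pct):
    """Apply the rounded regression equation to one team's features."""
    return (-0.47 * losses        # losses in a season
            - 1.31 * finish       # conference finish (1-15)
            + 0.4 * age           # average roster age
            + 34.13 * fg_pct      # field goal percentage, as a fraction
            - 22.12 * three_pct   # 3-point percentage, as a fraction
            + 50.95)              # intercept

# Hypothetical inputs, purely for illustration
example = predicted_wins(losses=41, finish=8, age=26.0, fg_pct=0.47, three_pct=0.36)
print(round(example, 2))
```

Writing the equation as a function keeps every coefficient in one place, which makes it easier to spot a mistyped number later on.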

Our predictions generated in the previous post came back with 91% accuracy/9% mean absolute percentage error, so I can tell we’re gonna get some good predictions here.

And now, for the predictions…

Yes, here comes the fun part, the predictions. For the predictions, I gathered the weighted averages of the five features we used in our model (losses, conference finish, average roster age, % of field goals made and % of 3-pointers made) and placed them into this spreadsheet:

Now, how did I calculate the weighted averages of these five features for each team? Well, I simply assigned different weights for different seasons like so:

  • 2021-22 to 2023-24 seasons: 0.2 weight each (higher weight for the three most recent seasons)
  • 2018-19 to 2020-21 seasons: 0.1 weight each (they’re a little further back, plus I factored in COVID impacts to the 2019-20 and 2020-21 seasons)
  • 2014-15 to 2017-18 seasons: 0.025 weight each (smaller weight since these are the furthest in the past, plus many players in the league during this time have since retired)

After assigning these weights (which add up to 1 across the ten seasons), I calculated each weighted average the standard way: multiply each season’s value by its weight, then add up the results.
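Here is a rough sketch of that weighted-average calculation in plain Python. The weights come from the list above; the loss totals are made-up numbers for a hypothetical team, and the names WEIGHTS and weighted_average are just for illustration.

```python
# Season weights described above: 3 x 0.2 + 3 x 0.1 + 4 x 0.025 = 1.0
WEIGHTS = {
    "2014-15": 0.025, "2015-16": 0.025, "2016-17": 0.025, "2017-18": 0.025,
    "2018-19": 0.1, "2019-20": 0.1, "2020-21": 0.1,
    "2021-22": 0.2, "2022-23": 0.2, "2023-24": 0.2,
}

def weighted_average(values_by_season):
    """Weighted average of one feature; dividing by the weight total keeps it
    correct even if the weights did not add up to exactly 1."""
    total = sum(WEIGHTS[season] * value for season, value in values_by_season.items())
    return total / sum(WEIGHTS[season] for season in values_by_season)

# Hypothetical loss totals for one team (made-up numbers)
losses = {
    "2014-15": 50, "2015-16": 48, "2016-17": 45, "2017-18": 40,
    "2018-19": 38, "2019-20": 35, "2020-21": 30,
    "2021-22": 31, "2022-23": 29, "2023-24": 25,
}
print(round(weighted_average(losses), 3))
```

The same function would be run once per feature (losses, finish, age, FG% and 3P%) for each team.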

Here’s the basic Python code I used to calculate projected wins for all 30 NBA teams:



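import pandas as pd

# Load the weighted averages for all 30 teams (one row per team)
NBAAVG = pd.read_csv(r'C:\Users\mof39\OneDrive\Documents\NBA weighted averages.csv')

# Apply the equation to every row at once; pandas works column-wise,
# so no loop is needed.
# Note: the fitted 3P% coefficient rounds to -22.12; -22.15 is kept here
# to match the original run.
projected_wins = (-0.47*NBAAVG['L'] - 1.31*NBAAVG['Finish']
                  + 0.4*NBAAVG['Age'] + 34.13*NBAAVG['FG%']
                  - 22.15*NBAAVG['3P%'] + 50.95)
print(projected_wins)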

And here are the projected win totals for each team using this equation:

0     37.911934
1     52.863761
2     40.819851
3     31.742252
4     37.524958
5     40.441851
6     42.540851
7     51.103223
8     24.263654
9     45.852691
10    33.197160
11    38.736829
12    47.364055
13    41.338946
14    41.297291
15    45.202762
16    53.722600
17    39.185063
18    37.443009
19    38.462010
20    39.284500
21    32.296571
22    47.795819
23    45.063567
24    33.312626
25    37.493793
26    32.519145
27    42.072515
28    41.920285
29    31.773436

Granted, you don’t actually see the team names in this output, but since the team names are organized alphabetically in the dataset you can tell which team corresponds to which projected win total. However, just for clarity, I’ll elaborate on those totals below:

Atlanta Hawks: 37.911934 wins (38-44)
Boston Celtics: 52.863761 wins (53-29)
Brooklyn Nets: 40.819851 wins (41-41)
Charlotte Hornets: 31.742252 wins (32-50)
Chicago Bulls: 37.524958 wins (38-44)
Cleveland Cavaliers: 40.441851 wins (40-42)
Dallas Mavericks: 42.540851 wins (43-39)
Denver Nuggets: 51.103223 wins (51-31)
Detroit Pistons: 24.263654 wins (24-58)
Golden State Warriors: 45.852691 wins (46-36)
Houston Rockets: 33.197160 wins (33-49)
Indiana Pacers: 38.736829 wins (39-43)
LA Clippers: 47.364055 wins (47-35)
LA Lakers: 41.338946 wins (41-41)
Memphis Grizzlies: 41.297291 wins (41-41)
Miami Heat: 45.202762 wins (45-37)
Milwaukee Bucks: 53.722600 wins (54-28)
Minnesota Timberwolves: 39.185063 wins (39-43)
New Orleans Pelicans: 37.443009 wins (37-45)
New York Knicks: 38.462010 wins (38-44)
Oklahoma City Thunder: 39.284500 wins (39-43)
Orlando Magic: 32.296571 wins (32-50)
Philadelphia 76ers: 47.795819 wins (48-34)
Phoenix Suns: 45.063567 wins (45-37)
Portland Trail Blazers: 33.312626 wins (33-49)
Sacramento Kings: 37.493793 wins (37-45)
San Antonio Spurs: 32.519145 wins (33-49)
Toronto Raptors: 42.072515 wins (42-40)
Utah Jazz: 41.920285 wins (42-40)
Washington Wizards: 31.773436 wins (32-50)
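If you’d rather have pandas print the team names for you, one option is to store the projections in a new column and display it next to Team. This is only a sketch, not the code from my notebook: the two-row DataFrame is a made-up stand-in for the CSV, and ProjW is a hypothetical column name.

```python
import pandas as pd

# Stand-in for the weighted-averages CSV; these values are made up
NBAAVG = pd.DataFrame({
    "Team": ["Team A", "Team B"],
    "L": [40.0, 30.0],
    "Finish": [8.0, 3.0],
    "Age": [26.0, 27.5],
    "FG%": [0.47, 0.48],
    "3P%": [0.36, 0.37],
})

# Rounded regression equation, stored as a column instead of printed directly
NBAAVG["ProjW"] = (-0.47 * NBAAVG["L"] - 1.31 * NBAAVG["Finish"]
                   + 0.4 * NBAAVG["Age"] + 34.13 * NBAAVG["FG%"]
                   - 22.12 * NBAAVG["3P%"] + 50.95)

# Show each team name beside its projected win total
print(NBAAVG[["Team", "ProjW"]].to_string(index=False))
```

Keeping the projections in the DataFrame also means you don’t have to rely on the alphabetical order of the rows to match totals to teams.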

As you can see above, I have managed to predict the records for each team for the 2024-25 NBA season. A few things to note about my predictions:

  • Since NBA records are only counted in whole numbers, I rounded each team’s projected win total up or down to the nearest whole number. For instance, for the Milwaukee Bucks, since their projected win total was 53.722600, I rounded that up to 54 wins (and a 54-28 record).
  • According to my model, all teams’ projected win totals fall between 24 and 54 wins. This makes sense since in a given NBA season, a majority of teams’ win totals fall in the 24-54 win range. In the last NBA season (2023-24), 21 teams fell within the 24-54 win range.
  • Four teams obtained over 54 wins (Celtics with 64, Thunder and Nuggets with 57, and Timberwolves with 56) while five teams obtained fewer than 24 wins (Spurs with 22, Hornets and Trail Blazers with 21, Wizards with 15 and Pistons with 14).
  • One thing to note about my predictions is that while I rounded up or down to the nearest whole number to get a projected record total, I’ll still factor in the entire decimal (e.g. 45.202762 for the Heat) when deciding how to seed teams, as teams with a higher decimal will be seeded higher in their respective conference.
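The rounding step described above can be sketched in a few lines of Python. The helper name projected_record is hypothetical, and the sketch assumes a standard 82-game regular season.

```python
GAMES = 82  # regular-season games per team

def projected_record(projected_wins):
    """Round projected wins to the nearest whole number and derive losses."""
    wins = int(projected_wins + 0.5)  # round half up; avoids round()'s half-to-even rule
    return f"{wins}-{GAMES - wins}"

# Two totals taken from the list above
for team, wins in [("Milwaukee Bucks", 53.722600), ("Cleveland Cavaliers", 40.441851)]:
    print(team, projected_record(wins))
```

The unrounded value stays available for seeding, since the function only formats the record and never overwrites the projection.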

Michael’s Magnificently Way-Too-Early Playoff Picture

Yes, now that we have projected record totals for each of the 30 teams, the next thing we’ll do is predict each team’s seeding.

How will we seed the teams? Well, for one, I’ll rank the teams with the higher projected records higher in their respective conference. For instance, since the Bucks have a higher projected record than the Celtics, I’ll rank the Bucks higher than the Celtics.

However, what if two teams have a really, really close margin between them? For instance, the Minnesota Timberwolves and Oklahoma City Thunder’s projected records of 39.185063 wins and 39.284500 wins respectively are very close to each other. However, since OKC has a slightly higher projected win total, I’ll rank them higher than the Timberwolves.
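To show how a tie-break on the full decimal works, here is a small sketch that sorts a handful of teams by their unrounded projections. It only covers four Western Conference teams, so the printed positions are ranks within this subset and not actual conference seeds.

```python
# Unrounded projections for a few Western Conference teams, taken from the list above
west = {
    "Minnesota Timberwolves": 39.185063,
    "Oklahoma City Thunder": 39.284500,
    "Denver Nuggets": 51.103223,
    "LA Clippers": 47.364055,
}

# Sort from most to fewest projected wins, then number the teams within this subset
ranked = sorted(west.items(), key=lambda item: item[1], reverse=True)
for position, (team, wins) in enumerate(ranked, start=1):
    print(position, team, wins)
```

Because the sort uses the full decimal, OKC lands ahead of Minnesota even though both round to 39 wins.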

So without further ado, here’s Michael’s Magnificently Way-Too-Early Playoff Picture!

Eastern Conference

INTO THE PLAYOFFS | INTO THE PLAY-IN | OUT OF PLAYOFF RUNNING
1. Milwaukee Bucks | 7. Cleveland Cavaliers | 11. Chicago Bulls
2. Boston Celtics | 8. Indiana Pacers | 12. Washington Wizards
3. Philadelphia 76ers | 9. Atlanta Hawks | 13. Charlotte Hornets
4. Miami Heat | 10. New York Knicks | 14. Orlando Magic
5. Toronto Raptors | | 15. Detroit Pistons
6. Brooklyn Nets | |

Western Conference

INTO THE PLAYOFFS | INTO THE PLAY-IN | OUT OF PLAYOFF RUNNING
1. Denver Nuggets | 7. LA Lakers | 11. Sacramento Kings
2. LA Clippers | 8. Memphis Grizzlies | 12. New Orleans Pelicans
3. Golden State Warriors | 9. Oklahoma City Thunder | 13. San Antonio Spurs
4. Phoenix Suns | 10. Minnesota Timberwolves | 14. Portland Trail Blazers
5. Dallas Mavericks | | 15. Houston Rockets
6. Utah Jazz | |

And now, for some insights

Now that we have our predictions for both teams’ projected win totals and projected conference seeding, let’s see if we can gather some insights into what the 2024-25 NBA season might bring for all 30 teams. Without further ado, here are insights across the NBA that I think will be interesting to see play out over the course of the season:

Will the Celtics repeat as champs?

For those who don’t know, the Boston Celtics came out on top as the champions of the 2023-24 NBA season, beating the Dallas Mavericks in 5 games in the 2024 NBA Finals.

Question is, can they do it again? There’s a good chance that can happen, even with the projected 2-seed in the Eastern Conference. After all, the Celtics have kept many of their key playmakers from their championship squad such as Al Horford, Derrick White, Jaylen Brown and of course, Jayson Tatum.

Interestingly, we’ve had SIX different teams win the NBA championship in the last six seasons:

  • 2019-Raptors
  • 2020-Lakers
  • 2021-Bucks
  • 2022-Warriors
  • 2023-Nuggets
  • 2024-Celtics

Could we have a repeat champ for the first time since those seemingly endless Warriors-Cavs finals (remember those)? I’ll reiterate that it’s certainly possible, especially with Tatum in his prime.

Warriors for a deep playoff run?

Yes, I know they’ve had their ups and downs over the last 10 years, but after all, the Golden State Warriors have won 4 championships over the last 10 years, so I have reason to believe they’ll go on another deep playoff run.

Will the loss of Klay Thompson hurt? Yes. Stephen Curry is also on the back-nine of his career (he turns 37 in March), but he did put up the most points per game of anyone on the Warriors’ roster last season (26.4). Curry also had the highest 3-pointer percentage of anyone on the Warriors’ roster last season (40.8%)-recall that successful 3-pointer percentage was one of the five features I used in the linear regression model. Plus, Draymond Green will be returning to the Warriors this season; he proved to be one of the Warriors’ strongest 3-point shooters and rebounders last season (though he is also in the later part of his career as he will be 35 in March).

Interestingly, this model has the Warriors going 46-36 as the 3-seed in the Western Conference. Funny enough, the Warriors finished 46-36 last season but ended up as the 10-seed in the Western Conference and failed to make it past play-in.

This brings me to my next point…

Will the West be close again?

Last season, the Western Conference was incredibly close when it came to win totals and playoff seeding. After all, the 6-seed in the West last year (Phoenix Suns) still finished with a 49-33 record…and were promptly swept in the Western Conference first round (though that’s neither here nor there).

Another thing to put the closeness of last year’s Western Conference playoff race into perspective-the Warriors finished 46-36 yet only notched a 10-seed and the Houston Rockets finished with an even 41-41 record but missed the postseason entirely (they got the 11-seed).

Which brings me to my next point…

Will the East be far apart?

While last year’s Western Conference was quite competitive, the Eastern Conference was, well, another story:

Image from Wikipedia: https://en.wikipedia.org/wiki/2023%E2%80%9324_NBA_season.

Yes, the Celtics not only got the 1-seed in the East but also finished FOURTEEN games ahead of the 2-seed New York Knicks (yes, the Knicks finished 50-32 and still got the 2-seed). Two teams that had very up-and-down seasons-the Bulls and Hawks-both finished with under 40 wins yet still qualified for the play-in as the 9- and 10-seeds in the East, respectively.

Miami Heat to the play…offs?

Throughout the last 10 years, the Miami Heat have had a great deal of success, making it to the Finals twice in that span (’20 and ’23) and making it to the playoffs 7 of the last 10 seasons (exceptions being ’15, ’17 and ’19).

However, while they did make the playoffs the last two seasons, they had to do so through first making it through the play-ins-both times they made it as the 8-seed in the play-in (meaning they had to play two play-in games to even get a playoff slot).

In this model however, the Miami Heat will earn the 4-seed and make the actual playoffs, not the play-in. What could possibly work to their advantage? Here are a few factors:

  • While their successful field goal percentage was in the bottom half of the league last season, they came in 12th amongst all teams in successful 3-pointer percentage, which should help their case.
  • After losing Jimmy Butler and Terry Rozier before play-offs last season, both are now (as of this writing) healthy and ready to play.
  • Those 42.3 rebounds (both offensive and defensive) last year look pretty good.
  • Tyler Herro, Bam Adebayo and Jimmy Butler were the top-3 scorers on the Heat in both points per game and field goals last year…those stats certainly matter for big games. Plus Herro is 24 and Adebayo is 27, so both are still in the primes of their careers (though Jimmy Butler at 35 still plays like he’s in his prime in my opinion)

Will the Heat win an NBA championship or make another Finals appearance? TBD. However, according to this model I made, it looks like they will at least make it to the play-offs without needing to go through play-ins first (though their 8-seed to-the-Finals run in 2023 was certainly memorable).

And now, for the bottom of the conference

Most of my insights discussed more successful teams and (potential) deep playoff runs. However, I wanted to offer one more insight concerning the two teams at the (projected) bottom of their conferences-the Pistons in the East and Rockets in the West.

First off-the Detroit Pistons, who, according to my model, are projected to be the 15-seed again (they were the 15-seed last season); will they manage to improve this season? My guess is yes-at least in terms of having more wins this season (only 14 wins last year)-but I don’t think they’ll make a strong playoff run, and I know a 28 game losing streak last season to drop the Pistons to 2-30 at one point didn’t help make a case for their postseason hopes. However, give the Pistons credit for changing their coach (now JB Bickerstaff) and GM (now Trajan Langdon) and adding some solid free agents like Tobias Harris (48.7% of successful field goals last season-not too shabby). Again, I doubt they’ll make a strong playoff run, but they could very well finish higher than the 15-seed.

As for the Houston Rockets (projected 15-seed in the West), they finished in the 11-seed last year in a competitive Western Conference with an even 41-41 record. Judging from last year’s stats-coming in 9th on defense but 20th on offense-they do have some work to do to make a deep playoff run. However, with a good mix of young players like Tari Eason and veterans like Fred VanVleet (who was on the championship 2019 Toronto Raptors), the Rockets could make it past play-in.

Just for fun…Michael’s Play-In Predictions

Now for an added bonus for my loyal readers, here are my educated-guess, just-for-fun play-in predictions for both the Eastern and Western conferences. Granted, while the model I made did help predict regular-season seeding in each conference, it didn’t predict who would make it past play-in to grab the 7- and 8-seeds in the conference. So without further ado, here are my play-in predictions based on what I saw in the teams last season:

Eastern Conference

Predictions: Pacers 7-seed, Cavaliers 8-seed

Western Conference

Predictions: Lakers 7-seed, Timberwolves 8-seed

Thanks for reading and I hope you learned something new from this post! Enjoy the NBA season and I will follow up with a Part 3 post on this topic sometime in April, or at least some time after the conclusion of the regular season. It will be interesting to see how accurate or off my predictions were.

Michael

Python, Linear Regression & An NBA Season Opening Day Special Post

Hello readers,

Michael here, and in today’s lesson, we’re gonna try something special! For one, we’re going back to this blog’s statistical roots with a linear regression post; I covered linear regression with R in the way, way back of 2018 (R Lesson 6: Linear Regression) on this blog, so I thought I’d show you how to work the linear regression process in Python. Two, I’m going to try something I don’t normally do, which is predict the future. In this case, the future being the results of the just-beginning 2024-25 NBA season. Why try to predict NBA results you might ask? Well, for one, I wanted to try something new on this blog (hey, gotta keep things fresh six years in), and for two, I enjoy following along with the NBA season. Plus, I enjoyed writing my post on the 2020 NBA playoffs-R Analysis 10: Linear Regression, K-Means Clustering, & the 2020 NBA Playoffs.

Let’s load our data and import our packages!

Before we get started on the analysis, let’s first load our data into our IDE and import all necessary packages:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

You’re likely quite familiar with pandas but for those of you that don’t know, sklearn is an open-source Python library commonly used for machine learning projects (like the linear regression we’re about to do)!

A note about uploading files via Google Colab

Once we import our necessary packages, the next thing we should do is upload the data-frame we’ll be using for this analysis.

This is the file we’ll be using; it contains team statistics such as turnovers (team total) and wins for all 30 NBA teams for the last 10 seasons (2014-15 to 2023-24). The data was retrieved from basketball-reference.com, which is a great place to go if you’re looking for juicy basketball data to analyze. This site comes from https://www.sports-reference.com/, which contains statistics on various sports from NBA to NFL to the other football (soccer for Americans), among other sports.

Now, since I used Google Colab for this analysis, I’ll show you how to upload CSV files into Colab (a different process from loading files in other IDEs):

To import local files into Google Colab, you’ll need to include the lines from google.colab import files and uploaded = files.upload() in the notebook since, for some odd reason, Google Colab won’t let you upload local files directly into your notebook. Once you run these two lines of code, you’ll need to select a file from the browser tool that you want to upload to Colab.

Next (and ideally in a separate cell), you’ll need to add the lines import io and dataframe = pd.read_csv(io.BytesIO(uploaded['dataframe name'])) to the notebook and run the code. This will officially upload your data-frame to your Colab notebook.

  • Yes, I know it’s annoying, but that’s just how Colab works. If you’re not using Colab to follow along with me, feel free to skip this section as a simple pd.read_csv() will do the trick to upload your data-frame onto the IDE.
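Put together, the two Colab steps above might look something like the sketch below. The helper name read_uploaded_csv is my own invention for illustration, and the try/except is only there so the snippet doesn’t crash outside of Colab, where the google.colab module isn’t available.

```python
import io
import pandas as pd

def read_uploaded_csv(uploaded, filename):
    """Turn one entry of the dict returned by files.upload() into a DataFrame."""
    return pd.read_csv(io.BytesIO(uploaded[filename]))

try:
    from google.colab import files  # only importable inside Colab
except ImportError:
    files = None

if files is not None:
    uploaded = files.upload()            # opens the browser file picker
    filename = next(iter(uploaded))      # name of the first uploaded file
    NBA = read_uploaded_csv(uploaded, filename)
```

files.upload() returns a dictionary that maps each uploaded file name to its raw bytes, which is why io.BytesIO is needed before pd.read_csv() can read it.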

Let’s learn about our data-frame!

Now that we’ve uploaded our data-frame into the IDE, let’s learn more about it!

NBA.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 300 entries, 0 to 299
Data columns (total 31 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   Season  300 non-null    object 
 1   Team    300 non-null    object 
 2   W       300 non-null    int64  
 3   L       300 non-null    int64  
 4   Finish  300 non-null    int64  
 5   Age     300 non-null    float64
 6   Ht.     300 non-null    object 
 7   Wt.     300 non-null    int64  
 8   G       300 non-null    int64  
 9   MP      300 non-null    int64  
 10  FG      300 non-null    int64  
 11  FGA     300 non-null    int64  
 12  FG%     300 non-null    float64
 13  3P      300 non-null    int64  
 14  3PA     300 non-null    int64  
 15  3P%     300 non-null    float64
 16  2P      300 non-null    int64  
 17  2PA     300 non-null    int64  
 18  2P%     300 non-null    float64
 19  FT      300 non-null    int64  
 20  FTA     300 non-null    int64  
 21  FT%     300 non-null    float64
 22  ORB     300 non-null    int64  
 23  DRB     300 non-null    int64  
 24  TRB     300 non-null    int64  
 25  AST     300 non-null    int64  
 26  STL     300 non-null    int64  
 27  BLK     300 non-null    int64  
 28  TOV     300 non-null    int64  
 29  PF      300 non-null    int64  
 30  PTS     300 non-null    int64  
dtypes: float64(5), int64(23), object(3)
memory usage: 72.8+ KB

Running the NBA.info() command will allow us to see basic information about all 31 columns in our data-frame (such as column names, number of records in the dataset, and data type).

In case you’re wondering about all the abbreviations, here’s an explanation for each abbreviation:

  • Season-The specific season represented by the data (e.g. 2014-15)
  • Team-The team name
  • W-A team’s wins in a given season
  • L-A team’s losses in a given season
  • Finish-The seed a team finished in during a given season in their conference (e.g. Detroit Pistons finishing 15th seed in the East last season)
  • Age-The average age of a team’s roster as of February 1 of a given season (e.g. February 1, 2024 for the 2023-24 season)
  • Ht.-The average height of the team’s roster in a given season (e.g. 6’6)
  • Wt.-The average weight (in lbs.) of the team’s roster in a given season
  • G-Total amount of games played by the team in a given season
  • MP-Total minutes played as a team in a given season
  • FG-Field goals scored by the team in a given season
  • FGA-Field goal attempts made by the team in a given season
  • FG%-Percent of successful field goals made by team in a given season
  • 3P-3-point field goals scored by the team in a given season
  • 3PA-3-point field goal attempts made by the team in a given season
  • 3P%-Percent of successful 3-point field goals made by the team in a given season
  • 2P-2-point field goals scored by the team in a given season
  • 2PA-2-point field goal attempts made by the team in a given season
  • 2P%-Percent of successful 2-point field goals made by the team in a given season
  • FT-Free throws scored by the team in a given season
  • FTA-Free throw attempts made by the team in a given season
  • FT%-Percent of successful free throw attempts made by the team in a given season
  • ORB-Team’s total offensive rebounds in a given season
  • DRB-Team’s total defensive rebounds in a given season
  • TRB-Team’s total rebounds (both offensive and defensive) in a given season
  • AST-Team’s total assists in a given season
  • STL-Team’s total steals in a given season
  • BLK-Team’s total blocks in a given season
  • TOV-Team’s total turnovers in a given season
  • PF-Team’s total personal fouls in a given season
  • PTS-Team’s total points scored in a given season

Wow, that’s a lot of variables! Now that we understand the data we’re working with better, let’s see how we can make a simple linear regression model!

The K-Best Way To Set Up Your Model

Before we start the juicy analysis, let’s first pick the features we will use for the model. In this post, we’ll explore the Select K-Best algorithm, which is an algorithm commonly used in linear regression to help select the best features for a particular model:

X = NBA.drop(['Season', 'Team', 'W', 'Ht.'], axis=1)
y = NBA['W']

from sklearn.feature_selection import SelectKBest, f_regression
features = SelectKBest(score_func=f_regression, k=5)
features.fit(X, y)

selectedFeatures = X.columns[features.get_support()]
print(selectedFeatures)

Index(['L', 'Finish', 'Age', 'FG%', '3P%'], dtype='object')

According to the Select K-Best algorithm, the five best features to use in the linear regression are L, Finish, Age, FG% and 3P%. In other words, a team’s end-of-season seeding, total losses, average roster age, and percentage of successful field goals and 3-pointers are the five most important features to predict a team’s win total.

How did the model arrive at these conclusions? First of all, I set the X and y variables-this is important as the Select K-Best algorithm needs to know which variable is the dependent variable and which variables are candidate independent variables for the model. In this example, the dependent (or y) variable is W (for team wins) while the X variable includes all other dataset columns except for W, Team, Season, and Ht. W is excluded because it is the y variable, and the other three are categorical (or non-numerical) variables, so they really won’t work in our analysis.

Next we import SelectKBest and f_regression from the sklearn.feature_selection module. Why do we need these two? Well, SelectKBest is the class that runs the Select K-Best algorithm, while f_regression is the scoring function it uses behind the scenes: it scores how strongly each feature is linearly related to y, and SelectKBest keeps the k highest-scoring features (I used five features for this model).

After setting up the Select K-Best algorithm, we then fit both the X and y variables to the algorithm and then print out our top five selectedFeatures.
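If you’re curious about why certain features get picked, a fitted SelectKBest object stores the score of every candidate feature in its scores_ attribute. The sketch below uses synthetic data (not the NBA dataset) with made-up column names a, b and noise, so the only thing it demonstrates is the mechanics.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_regression

# Synthetic stand-in data: y depends on 'a' and 'b' but not on 'noise'
rng = np.random.default_rng(0)
X_demo = pd.DataFrame({
    "a": rng.normal(size=200),
    "b": rng.normal(size=200),
    "noise": rng.normal(size=200),
})
y_demo = 3 * X_demo["a"] - 2 * X_demo["b"] + rng.normal(scale=0.1, size=200)

selector = SelectKBest(score_func=f_regression, k=2)
selector.fit(X_demo, y_demo)

# scores_ holds one F-statistic per column; higher means a stronger linear relationship
scores = pd.Series(selector.scores_, index=X_demo.columns).sort_values(ascending=False)
print(scores)
print(list(X_demo.columns[selector.get_support()]))
```

Running the same two print statements on the NBA model’s features object would show how far apart the 27 candidate columns are in score.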

Train, test…split!

Once we have our top five features for the model, it’s time for the train, test, splitting of the data! What is train, test, split you ask? Well, our dataset will be split into two types of data-training data (the data we use for training the model) and testing data (the data we use to test our model). Here’s how we can utilize the train, test, split for this model:

X = NBA[['L', 'Finish', 'Age', 'FG%', '3P%']]
y = NBA['W']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)

How does the train, test, split work? Using sklearn’s train_test_split function, we pass in four arguments-our independent variables (X), our dependent variable (y), the size of the test data (a decimal between 0 and 1), and the random state (a seed that makes the split reproducible; I used 0, but any fixed integer works-42 is another common choice). In this model, I will utilize an 80/20 train, test, split, which indicates that 80% of the data will be for training while the other 20% will be used for testing.

Other common train, test, splits are 70/30, 85/15, and 67/33, but I opted for 80/20 because our dataset is only 300 rows long. I would utilize these other train, test, splits for larger datasets.

  • Something worth noting: What we’re doing here is called multiple linear regression since we’re using five X variables to predict a Y variable. Simple linear regression would only use one X variable to predict a Y variable. Just thought I’d throw in this quick factoid!
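To make the 80/20 split and the role of random_state a bit more concrete, here is a tiny sketch that splits a plain list of 300 row numbers (a stand-in for the 300 team-season rows, not the real data).

```python
from sklearn.model_selection import train_test_split

rows = list(range(300))  # stand-in for the 300 team-season rows
train_rows, test_rows = train_test_split(rows, test_size=0.2, random_state=0)
print(len(train_rows), len(test_rows))

# Re-running with the same random_state reproduces the exact same split
train_again, test_again = train_test_split(rows, test_size=0.2, random_state=0)
print(test_rows == test_again)
```

With 300 rows, an 80/20 split leaves 240 rows for training and 60 for testing, and fixing random_state means anyone re-running the notebook gets the same 60 test rows.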

And now, for the model-making

Now that we’ve done all the steps to set up our model, the next thing we’ll need to do is actually create the model!

Here’s how we can get started:

NBAMODEL = LinearRegression()
NBAMODEL.fit(X_train, y_train)

LinearRegression()

In this example, we create a LinearRegression() object (NBAMODEL) and fit it to both the X_train and y_train data.

Predictions, predictions

Once we’ve created our model, next comes the fun part-generating the predictions!

yPredictions = NBAMODEL.predict(X_test)

yPredictions

array([53.20097648, 28.89541793, 52.26551381, 53.22220829, 35.90676716,
       32.15874993, 47.72090936, 48.32896277, 39.4193884 , 40.1548429 ,
       19.62678175, 48.3263792 , 32.13473281, 43.50887634, 43.85260484,
       52.79795145, 27.35822648, 40.23392095, 18.85423981, 61.69624816,
       51.59650403, 23.86311747, 56.18087097, 54.15867678, 49.75211403,
       46.90177259, 31.80109001, 46.82531833, 37.50563942, 32.19863141,
       52.41205133, 25.09011881, 48.94542256, 38.80244997, 24.80146638,
       42.50107728, 43.27320835, 37.45199938, 46.7795962 , 28.11289951,
       57.64388881, 29.35812466, 18.3222965 , 36.26677012, 20.56912227,
       22.15266241, 19.9955299 , 44.84930613, 45.14740453, 23.19471644,
       53.940611  , 26.0780373 , 27.88093669, 61.23347337, 52.99948229,
       34.66653881, 30.04421016, 27.21669768, 48.55215233, 47.11060905])

The yPredictions are obtained by using the predict method on the X_test data, which in this case consists of 60 of the 300 records.

Evaluating the model’s accuracy

Once we’ve created the model and made our predictions on the test data, it’s time to evaluate the model’s accuracy. Here’s how to do so:

from sklearn.metrics import mean_absolute_percentage_error

mean_absolute_percentage_error(y_test,yPredictions)

0.09147159762376074

There are several ways you can evaluate the accuracy of a linear regression model. One good method as shown here is the mean_absolute_percentage_error (imported from the sklearn.metrics package). The mean absolute percentage error evaluates the model’s accuracy by indicating how off the model’s predictions are. In this model, the mean absolute percentage error is 0.09147159762376074, indicating that the model’s predictions are off by roughly 9%-which also indicates that overall, the model’s predictions are roughly 91% accurate. Not too shabby for this model!

  • Interestingly, the two COVID impacted NBA seasons in the dataset (2019-20 and 2020-21) didn’t throw off the model’s accuracy much.
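For anyone wondering what the mean absolute percentage error is doing under the hood, here is a hand-rolled sketch of the calculation. The win totals are made up, and the function name mape is just for illustration; in practice sklearn’s version is the one to use.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (0.09 means 9%)."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)

# Made-up win totals, purely for illustration
actual_wins = [50, 40, 25]
predicted_totals = [45, 44, 25]
print(mape(actual_wins, predicted_totals))
```

Each prediction’s miss is measured relative to the actual value, so being off by 5 wins on a 50-win team counts the same as being off by 4 wins on a 40-win team (10% in both cases).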

Don’t forget about the equation!

Evaluating the model’s accuracy isn’t the only thing you should do when analyzing the model. You should also grab the model’s coefficients and intercept-they will be important in the next post!

NBAMODEL.coef_

array([ -0.4663858 ,  -1.30716212,   0.39700734,  34.1325687 ,
       -22.12258585])
NBAMODEL.intercept_

50.945769772855854

Every linear regression model has one coefficient per X variable plus an intercept, and together they form the linear regression equation. Since our model has five X variables, there are five coefficients.

Now, what would our equation look like?

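Here is the equation in all its messy glory:

W = -0.4663858(L) - 1.30716212(Finish) + 0.39700734(Age) + 34.1325687(FG%) - 22.12258585(3P%) + 50.945769772855854

We’re going to be using this equation in the next post.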
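If you’d like Python to assemble a rounded version of the equation for you, here is one way to do it. This is a sketch that hard-codes the values printed by NBAMODEL.coef_ and NBAMODEL.intercept_ above; in a live notebook you could pass the model’s attributes in directly.

```python
features = ["L", "Finish", "Age", "FG%", "3P%"]
# Values copied from the NBAMODEL.coef_ and NBAMODEL.intercept_ output above
coefficients = [-0.4663858, -1.30716212, 0.39700734, 34.1325687, -22.12258585]
intercept = 50.945769772855854

# Pair each coefficient with its feature name, keeping the sign and two decimals
terms = [f"{coef:+.2f}*{name}" for coef, name in zip(coefficients, features)]
equation = "W = " + " ".join(terms) + f" {intercept:+.2f}"
print(equation)
```

The order of the coefficients matches the order of the columns in X, which is why zipping them with the feature names lines everything up.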

Linear regression plotting

For the visual learners among my readers, I thought it would be nice to include a simple scatterplot to visualize the accuracy of our linear regression model. Here’s how to create that plot:

import matplotlib.pyplot as plt
plt.scatter(y_test, yPredictions, color="red")
plt.xlabel('Actual values', size=15)
plt.ylabel('Predicted values', size=15)
plt.title('Actual vs Predicted values', size=15)
plt.show()

First, I imported the matplotlib.pyplot module. Then, I ran the plt.scatter() method to create a scatterplot. I used three parameters for this method: the y_test values, the yPredictions values, and the color="red" parameter (this just indicated that I wanted red scatterplot dots). I then used the plt.xlabel(), plt.ylabel(), and plt.title() methods to give the scatterplot an x-label title, y-label title, and title, respectively. Lastly, I used the plt.show() method to display the scatterplot in all of its red-dotted glory.

As you can see from this plot, the predicted values match the actual values fairly closely, hence the 91% accuracy/9% error.

Thanks for reading, enjoy the upcoming NBA season action, and stay tuned for my next post where I reveal my predicted records and standings for each team, East and West! It will be interesting to see how my predictions pan out over the course of the season-after all, it’s certainly something different I’m trying on this blog!

And yes, perfect timing for this blog to come out on NBA season opening day! Serendipity am I right?

Also, here’s a link to the notebook in GitHub-https://github.com/mfletcher2021/DevopsBasics/blob/master/NBA_24_25_predictions.ipynb.

A Quick Dive Into Google Colab

Hello everybody,

Michael here, and in today’s post, we’ll do something special-a quick dive (or you could call it a Programming Byte) into another IDE you can use for all sorts of fun Python coding adventures-Google Colab! Think of this post as a long-awaited follow-up to Python Program Demo 1: Using the Jupyter Notebook (written in December 2019).

What is Google Colab?

Google Colab (short for Colaboratory) is similar to Jupyter Notebook since both can be used as Python IDEs. What more should you know about Google Colab, and how does it compare to Jupyter Notebook? Here’s a side-by-side look:

Google Colab | Jupyter Notebook
Requires an internet connection and a Gmail account to use | Can be utilized offline and doesn’t require a special account
It’s easy to save your code to GitHub with one click of a button | It’s a lot harder to save your code to GitHub; version control via Git is also more challenging since Jupyter notebooks are saved as JSON files
It’s free to use, but if you want better computing power from Colab, better to upgrade to a paid plan | It’s always free to use, with no upgrade-to-paid-plan options
Certain commonly-used packages (e.g. pandas) come pre-installed with Google Colab | Jupyter will require you to install any package you wish to use (though if the package is already on your device, no need to install it again)
Only works with Python and HTML markdown | Works with Python, HTML markdown, R and Julia (which is a dynamic programming language)
There could be security risks as the code you work with in Colab is stored on Google Cloud servers | Most code is stored on your local drive, not on cloud servers
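If you ever want to check which packages are already available in whatever environment you’re using, here’s a small sketch that works in both Colab and Jupyter. The function names are my own, and checking for google.colab in sys.modules is a commonly used trick rather than an official API, so treat it as an assumption:

```python
import importlib.util
import sys

def running_in_colab():
    """True when the google.colab module has been loaded (assumed to mean we're in Colab)."""
    return "google.colab" in sys.modules

def is_installed(package_name):
    """True if the package can be found in the current environment."""
    return importlib.util.find_spec(package_name) is not None

print("Running in Colab:", running_in_colab())
for name in ["pandas", "numpy", "matplotlib"]:
    print(name, "installed:", is_installed(name))
```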

Now that we’ve explored the gist of Google Colab, let’s see how we can use it!

Let’s start Colab-orating!

To start using Google Colab, click this link-https://colab.research.google.com/. It should take you to the Google Colab homepage, which looks something like this (as of September 2024):

  • If you’re not signed in to your Gmail account, you would need to do so before signing in to Colab.

As you can see, we have landed on the Google Colab homepage-and in case you’re wondering, we’re not going to explore the Gemini API today (though feel free to do so on your own).

Now, how do we start developing? Click on File–>New Notebook in Drive to create a new Colab notebook file; doing so will open up a new tab on your browser with a blank notebook file that looks like this:

Granted, the new file will come with a boring default title like UntitledNotebook4.ipynb or something like that, but you can change the notebook’s name by clicking on the textbox to the right of the multicolored triangle icon. Note that all Colab notebooks, like Jupyter notebooks, will use the IPYNB extension. Personally, I think Exploring Google Colab works for this lesson.

Colab Coding

Once we have our Colab notebook set up, it’s time to start coding!

Here’s how we’d write and execute some code in Colab!

In this example, we executed two simple lines of code in Colab by first writing said code into a blank Colab cell before clicking the Play icon in the cell to run the code. Since the code in this screenshot had already been run, we can see the output in the white cell below. Simple, yet effective!

  • If you’d like to edit the code in a certain cell, click on the code cell once to be able to edit that cell. Then click the Play button in that cell to run that code.

HTML Markdown

As I mentioned earlier in this post, Google Colab can also render HTML markdown text in addition to Python code. How can we generate some HTML text?

To generate HTML text, click on the Text button on the top of the screen to add a new text cell to your Colab notebook. Once the text cell is added, you can add all the HTML text you want-the best part about Colab is that, unlike in Jupyter Notebook, you can see how the text will appear as you are typing and formatting it (Jupyter Notebook only allows you to see the text once you’re done formatting/typing it).

Once you’re done using the editor, click on the pencil-like icon with a slash through it to close the editor. After you close the editor, you’ll see a textbox that looks like this:

As you can see, Google Colab managed to neatly format the text according to the HTML formatting we specified earlier.

  • Also, if you wish to further format the HTML text in a given text box, simply double click the text box to reopen the text editor.

A little CSS, perhaps?

Now, you might be wondering whether Colab can implement cool CSS styling (particularly inline CSS styling). The short answer is no, although I did try to do so here:

In this text cell, I tried coloring the text From the mind of Michael blue using inline CSS styling but lo and behold, that didn’t work. Apparently Google Colab doesn’t work with fancy CSS stylings:

As you can see, despite my best efforts to give this text some color, the output is a standard <h3> tagged-text.

Off to GitHub

And now, let’s see how we can send our Colab creation to GitHub!

But first, to save the notebook, click File–>Save (or use CTRL+S) to save the latest version of your notebook.

Now, how do we get our notebook to GitHub? First click File–>Save a copy in GitHub, which will take you to this screen:

Click Sign in to login to your GitHub account and continue along the sign-in process until you see this screen:

Click the Authorize googlecolab button to authorize Google Colab to connect to your GitHub account; this button will allow Google Colab to have read and write access to your GitHub public gists (which are Git repositories).

Once you click the green button, you should be taken back to your Google Colab notebook where a screen like this should appear:

In order to copy your Colab notebook to GitHub, select the Repository and its respective branch to indicate where to place the notebook copy in GitHub. Of course, feel free to change the commit message and include a link to the Colab notebook during the copying process.

Once you click OK, you should see the Colab notebook in the GitHub repository’s respective branch that you selected earlier:

As you can see, we have a copy of our Colab notebook in GitHub in the main branch of my blogcode repository (which is where I will store all the scripts from my posts). Here’s the link to the Colab notebook copy in GitHub-https://github.com/mfletcher2021/blogcode/blob/main/Exploring_Google_Colab.ipynb.

Interestingly, even though the line From the mind of Michael didn’t display blue in the Colab notebook, the blue came through here in the GitHub copy.

Thanks for reading,

Michael

Encryption, SHA Style (Python Lesson 54)

Hello everyone,

Michael here, and in today’s lesson, we’ll explore Python encryption, SHA style.

What is SHA encryption? SHA stands for Secure Hash Algorithm, and it is what’s known as a cryptographic hash function. The SHA algorithms were developed by the US National Security Agency (NSA) and published by the National Institute of Standards and Technology*. SHA is also not a symmetric-key or an asymmetric-key algorithm, as the main purpose of SHA is data security and integrity, not encrypting/decrypting communications.

Now, let’s dive into some SHA coding!

  • *The National Institute of Standards and Technology is an agency under the US Department of Commerce.

Types of SHA hashing functions

There are five different types of SHA hashing functions. Let’s explore each of them:

  • SHA-256-This hashing function will produce a 256-bit (or 32-byte) hash
  • SHA-384-This hashing function will produce a 384-bit (or 48-byte) hash
  • SHA-224-This hashing function will produce a 224-bit (or 28-byte) hash
  • SHA-512-This hashing function will produce a 512-bit (or 64-byte) hash; because of the long hash output, this is seen as one of the more secure hashing algorithms
  • SHA-1-This was the first SHA hashing function developed by the NSA and produced a 160-bit (or 20-byte) hash; however, this algorithm was found to have many security weaknesses and was thus phased out after 2010

Encrypting text, SHA style

Now that we know about the five different types of hashing functions, let’s see how they work on a text string in Python!

First, let’s import any necessary modules!

import hashlib

The hashlib module will be the only one we’ll need for this post. No need to pip install it since it comes built-in as long as your Python version is 2.5 or higher (and Python 2 has long since been sunset).

Next, let’s set the text we’ll use along with the SHA-256 hash of the text:

text = 'Michaels Programming Bytes: Byte sized programming classes for all coding learners'

SHA256 = hashlib.sha256(text.encode())
print(SHA256.hexdigest())

9854a9d55046c36afafbe47cbaa21ab85c8137d81955c3e344d76156a636a66b

As you see here, we used the name and tagline of this blog for our SHA demonstration. For our SHA encryption, keep these two methods in mind-.encode() and .hexdigest()-as we will use them throughout our demonstrations of the SHA hashing functions.

The .encode() method converts the text string into bytes (which is what the hashing functions expect as input), while the .hexdigest() method returns the resulting hash as a string of hexadecimal digits.

As you can see here, we have generated a SHA-256 hash from our text with just two lines of code!

Now, if you’re curious what the decimal number of this hash equals, let’s check it out for fun:

68901140284153216712479841246928981776103555384517524587226404491787878442603

It’s a large, 77-digit number that’s roughly 68 quattuorvigintillion (this is a number followed by 75 zeroes). Quite a big number to come out of that hash!
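In case you’re wondering how to get from the hexadecimal hash to a decimal number, here’s one way it could be done (a minimal sketch; the variable names are my own). Python’s built-in int() function accepts a base as its second argument, so base 16 does the trick:

```python
import hashlib

text = 'Michaels Programming Bytes: Byte sized programming classes for all coding learners'

text_bytes = text.encode()             # str -> bytes (UTF-8 by default)
sha256_hash = hashlib.sha256(text_bytes)
hex_digest = sha256_hash.hexdigest()   # 64 hex characters = 256 bits

decimal_value = int(hex_digest, 16)    # read the hex string as a base-16 integer
print(decimal_value)
print(len(str(decimal_value)), "digits")
```

The same int(..., 16) call works for the hashes from any of the other SHA functions.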

Next, let’s try the SHA-384 encryption method:

SHA384 = hashlib.sha384(text.encode())
print(SHA384.hexdigest())

2777dfcb92166aa95f0796e3ed4edc12d7da9db3226c8831b40aca5576e8aafa106bb26b3eafe07fc9124238e5e7a6be

This time, we appear to have gotten a larger hash than we did from the SHA-256 function. What could be the decimal number of this hash?

6074720975275609937364706669124284930019665411073401970740713484318034757229843193804707918050212936621238163842750

Now, this number is even larger-it’s approximately 6 septentrigintillion (this is a number followed by 114 zeroes). I’ve honestly never even seen such massive numbers, but as a numbers guy, massive numbers are fun to learn about!

Next, let’s try the SHA-224 method:

SHA224 = hashlib.sha224(text.encode())
print(SHA224.hexdigest())

8ed6b6c67f8f543d41472f2b0da146516cc6e28a20637d4fc3018518

Since SHA-224 uses fewer bits/bytes than the SHA-384 and SHA-256 functions, it makes sense that the generated hash would be shorter than the hashes generated from those functions.

I imagine that the decimal number would be shorter too:

15042673619469762912744046603652644639808686980332500102671983805720

Not surprisingly, the decimal number from this hash is also shorter, but still fairly massive-it’s approximately 15 unvigintillion (a number followed by 66 zeroes).

Next, let’s try the SHA-512 method:

SHA512 = hashlib.sha512(text.encode())
print(SHA512.hexdigest())

8d42047543b1bea09199838101c6ba0b5fb5d2631c1bb9b772333f2faa51b7cd0c53946e79758475929244d33e02b8e8cd994d79c63747ffdeeec3d6e9a66719

Since SHA-512 generates the largest hashes of all the SHA hashing functions, it’s not surprising that the generated hash is so large.

With a hash so large, I can only imagine what the decimal number would look like:

7398275510411850389223316484636795314726283670587045768118856918282820596862752633935810532260756989519458279298581715671072351034597240609479775136147225

The number above is approximately 7 quinquagintillion, which sounds like a word Dr. Seuss would make up, but it’s a number followed by 153 zeroes.

Last but not least, let’s try the retired, and original, SHA hashing function on our text-SHA-1:

SHA1 = hashlib.sha1(text.encode())
print(SHA1.hexdigest())

5ed4ecb61ff2323823781c514cbe2181e034a186

Since the hashes generated from the SHA-1 function use the fewest bytes (20) of the five functions we explored, the generated hash is also the shortest of the five hashes we’ve generated.

With that said, it won’t be surprising that the decimal number of this hash is also fairly short:

541393510912863704690262334257797793251671187846

Even though this number is shorter than the other four numbers we got from the other four hashes, it’s still fairly massive-541 quattuordecillion (a number followed by 45 zeroes).

All in all, we got to see the power of SHA hashing functions through some simple text encryption. The massive decimal numbers we got from each of the generated hashes put the power of the SHA algorithm into perspective (plus, I had fun exploring massive numbers).
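To compare all five functions side by side, here’s a short sketch that loops through them using hashlib.new(), which looks up a hashing function by name. The sizes dictionary is just my own addition for collecting the results:

```python
import hashlib

text = 'Michaels Programming Bytes: Byte sized programming classes for all coding learners'

sizes = {}
for name in ["sha1", "sha224", "sha256", "sha384", "sha512"]:
    sha_hash = hashlib.new(name, text.encode())
    sizes[name] = sha_hash.digest_size   # hash length in bytes
    print(f"{name}: {sha_hash.digest_size} bytes, "
          f"{sha_hash.digest_size * 8} bits, "
          f"{len(sha_hash.hexdigest())} hex characters")
```

Each byte takes two hexadecimal characters to write out, which is why the hex strings are always twice as long as the byte counts.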

An interesting application of SHA hashing algorithms

Before we go, I wanted to explain an interesting application of the SHA hashing algorithm-Bitcoin mining. Yes, Bitcoin mining utilizes the SHA-256 algorithm during the mining process-this is because SHA-256 is both computationally efficient and offers a good deal of security throughout the process. Granted, SHA-512 could also offer a great deal of security, but the hashes also take up a lot of memory.

So, how would one mine Bitcoin? Check out this illustration below for an explanation:

For a very simplistic explanation of Bitcoin mining, a miner would need to utilize a very powerful computer (with processing power much greater than even your standard gaming laptop) to continually generate SHA-256 hashes until one falls below a target value set by the network. Once that happens, the miner is entitled to some Bitcoin as a reward. How much Bitcoin they get is set by the Bitcoin protocol, which cuts the reward in half roughly every four years-for instance, the reward as of August 2024 is 3.125 Bitcoin (or 3 1/8 Bitcoin).
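To get a feel for the "keep hashing until you find the right one" idea, here’s a toy sketch. It is heavily simplified and is not how real Bitcoin mining is implemented (real mining hashes a binary block header with SHA-256 twice and compares against a numeric target); the mine() function, the block data, and the "starts with three zeroes" rule are all my own stand-ins:

```python
import hashlib

def mine(block_data, difficulty):
    """Try nonces 0, 1, 2, ... until the SHA-256 hash starts with `difficulty` zeroes."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        candidate = f"{block_data}{nonce}".encode()
        digest = hashlib.sha256(candidate).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1

# Hypothetical block data; difficulty of 3 keeps the search fast on a laptop
nonce, digest = mine("Michaels Programming Bytes", difficulty=3)
print("Winning nonce:", nonce)
print("Hash:", digest)
```

Each extra zero you require makes the search take roughly 16 times longer, which gives a taste of why real mining needs so much computing power.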

Here’s the GitHub link to the script from this post (SHA is the script name)-https://github.com/mfletcher2021/blogcode/blob/main/SHA.py.

Thanks for reading!

Michael

Image Encryption (Python Lesson 53)

Hello everybody,

Michael here, and in today’s post (unofficially the summer of encryption series), we’ll learn how to encrypt images with Python!

During our encryption lessons, we’ve learned how to encrypt text and files with Python. We also learned some interesting mathematics behind RSA encryption too.

Anyway, let’s get started with our image encryption! Here’s the image we’ll be using:

Yes, it’s yours truly in front of the Deadpool & Wolverine movie poster, giving a thumbs up to indicate my approval (and I highly recommend seeing it). Anyway, onto the image encryption.

Encrypting the image

Before we encrypt the image, let’s first create a Fernet key. If you recall from my previous post, Fernet key encryption is a simple symmetric-key encryption method that utilizes the same key to encrypt and decrypt something (in this case, the image).

Encryption key generation!

Before we encrypt and decrypt the image, let’s generate a file for our image key! We’d generate the image encryption/decryption key in a similar manner to the way we generated the file encryption/decryption key (take a look at the code below to see how we generated the file for the image encryption/decryption key):

from cryptography.fernet import Fernet

imageKey = Fernet.generate_key()

with open(r'C:/Users/mof39/OneDrive/Documents/imageKey.key', 'wb') as keyFile:
    keyFile.write(imageKey)

After importing the Fernet class from cryptography.fernet (install the cryptography package if you haven’t done so already), we create our imageKey and then write it to the aptly-named imageKey file-with a .key extension-at a specific file path (you can use any location on your local drive that you can access).

What does the key look like? Let’s find out:

j8P7M-h1v6_KxFkNncCYIfRIQwWmxlLToatX3lg5Jcg=

This is the key we will need to both encrypt and decrypt the image.
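If you’re wondering what that string of characters actually is, here’s a small sketch using only the standard library. As far as I understand it, a Fernet key is 32 random bytes written out in URL-safe base64, so the lookalike key below is my own stand-in for illustration and not a replacement for Fernet.generate_key():

```python
import base64
import os

# The key printed above, as bytes
posted_key = b'j8P7M-h1v6_KxFkNncCYIfRIQwWmxlLToatX3lg5Jcg='

raw_bytes = base64.urlsafe_b64decode(posted_key)
print(len(posted_key), "characters ->", len(raw_bytes), "raw bytes")

# Standard-library stand-in that produces a key of the same shape
lookalike_key = base64.urlsafe_b64encode(os.urandom(32))
print(lookalike_key)
```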

It’s image encryption time!

Now, let’s actually encrypt the image. Here’s how to do so:

with open(r'C:/Users/mof39/OneDrive/Documents/imageKey.key', 'rb') as keyFile:
    encryptionKey = keyFile.read()
    
fernetEncryptionKey = Fernet(encryptionKey)

with open(r'C:/Users/mof39/Downloads/20240728_132146.jpg', 'rb') as image:
    originalImage = image.read()
    
encryptedImage = fernetEncryptionKey.encrypt(originalImage)

with open(r'C:/Users/mof39/OneDrive/Documents/encryptedImage.jpg', 'wb') as image:
    image.write(encryptedImage)

What exactly did we do here? Let me explain:

  • First we read the file containing the image encryption/decryption key into the IDE.
  • Next we turned the encryptionKey into a Fernet object.
  • We then read the image we’ll be using into the IDE and encrypted it using the Fernet object we created in the previous step.
  • Finally, we wrote the encrypted image to a specific location on the local drive.
  • Word of advice-I would use different names for the encrypted and decrypted images to avoid confusion.

Now, let’s see what our encrypted image looks like:

As you can see, we can’t view our encrypted image, which is a good thing since the whole point of encrypting the image is to keep it from being seen.

However, let’s see what happens when we open this image on Notepad:

As you can see, we get this beautiful-looking gibberish, which I would love for a would-be hacker to see if they tried to intercept the image transmission to my intended recipient.

Decrypting the image

Now that we have successfully encrypted the image, let’s now decrypt it. Here’s how we’ll accomplish that:

with open(r'C:/Users/mof39/OneDrive/Documents/imageKey.key', 'rb') as keyFile:
    imageKeyFile = keyFile.read()
    
fernetDecryptionKey = Fernet(imageKeyFile)

with open(r'C:/Users/mof39/OneDrive/Documents/encryptedImage.jpg', 'rb') as image:
    encryptedImage = image.read()
    
decryptedImage = fernetDecryptionKey.decrypt(encryptedImage)

with open(r'C:/Users/mof39/OneDrive/Documents/decryptedImage.jpg', 'wb') as image:
    image.write(decryptedImage)

What exactly did we do here to decrypt the image? Let’s explain:

  • First, we read the image key file (the one containing our encryption/decryption key) into the IDE.
  • Next, we created a fernetDecryptionKey object from the imageKeyFile-we’ll use this to decrypt the image.
  • We then read the encryptedImage into the IDE and, using our fernetDecryptionKey, decrypted the image.
  • Last but not least, we wrote the decryptedImage onto the local drive with the name decryptedImage.
  • Yes I know fernetDecryptionKey and fernetEncryptionKey are the same thing. I just thought it would be easier to use different variable names since they are being used for different purposes (decryption and encryption respectively).

And now, the moment of truth…here’s our decrypted image:

Yup, there I am in all my Deadpool & Wolverine-loving decrypted glory!
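If you’d rather not rely on eyeballing the picture, one way to confirm the decrypted image is byte-for-byte identical to the original is to compare SHA-256 hashes of the two files. Here’s a minimal sketch with hashlib; the helper names are mine, and the demo uses small temporary stand-in files rather than my actual image paths:

```python
import hashlib
import os
import tempfile

def file_sha256(path):
    """Return the SHA-256 hex digest of a file's contents."""
    with open(path, 'rb') as f:
        return hashlib.sha256(f.read()).hexdigest()

def files_match(path_a, path_b):
    """True if both files have identical contents."""
    return file_sha256(path_a) == file_sha256(path_b)

# Demo with throwaway files standing in for the real images
with tempfile.TemporaryDirectory() as folder:
    original = os.path.join(folder, 'original.jpg')
    decrypted = os.path.join(folder, 'decrypted.jpg')
    altered = os.path.join(folder, 'altered.jpg')
    for path, data in [(original, b'image bytes'),
                       (decrypted, b'image bytes'),
                       (altered, b'image bytes!')]:
        with open(path, 'wb') as f:
            f.write(data)
    same = files_match(original, decrypted)
    different = files_match(original, altered)

print(same)       # identical contents
print(different)  # one extra byte changes the hash
```

To use this on the real files, pass the original image path and the decryptedImage path to files_match().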

Here’s the link to today’s script in GitHub-https://github.com/mfletcher2021/blogcode/blob/main/imageencryption.py

Thanks for reading,

Michael

File Encryption (Python Lesson 52)

Hello everybody,

Michael here, and in today’s post, we’re going to cover file encryption with Python. The previous two posts simply covered text encryption, but today, we’re going to explore something a little different-encrypting and decrypting files!

This will be the file we’ll work with for this tutorial-

Yes, this is an old file and if you want to read the post where I originally used this dataset, here’s the link-R Analysis 10: Linear Regression, K-Means Clustering, & the 2020 NBA Playoffs (written in November 2020).

And now, to start the encryption!

To start with our encryption, let’s import the Fernet class from the cryptography.fernet module like so:

from cryptography.fernet import Fernet

Next, let’s create our Fernet key and a file that will store the key:

key = Fernet.generate_key()

with open('C:/Users/mof39/OneDrive/Documents/filekey.key', 'wb') as fileKey:
    fileKey.write(key)

Using the with open() method, we store our Fernet key into a file in (in this case) the Documents directory. This method takes two parameters-the file path (where we will store the key in this case) and the mode you wish to open the file in. The mode is a two-character string value with the following options for modes:

First string (denotes method to open the file)
  • r-reads file into the IDE, errors out if the file doesn’t exist or path provided is incorrect
  • a-appends contents to an existing file or creates the file to append content to if the file provided doesn’t exist
  • w-writes content to a file, overwriting anything already in it, or creates the file to write content to if the file provided doesn’t exist
  • x-creates the file in the specified file path, errors out if file already exists
Second string (denotes method to handle the file)
  • t-handles file in text mode
  • b-handles file in binary mode (this is good for handling images)
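To see a few of these mode combinations in action, here’s a small sketch that uses a temporary folder so nothing on your drive gets touched. The file name demo.key is just a placeholder:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'demo.key')

    with open(path, 'wb') as f:       # w + b: create (or overwrite) in binary mode
        f.write(b'first')

    with open(path, 'ab') as f:       # a + b: append to the existing file
        f.write(b' second')

    with open(path, 'rb') as f:       # r + b: read the file back as bytes
        contents = f.read()

    try:
        open(path, 'xb')              # x + b: fails because the file already exists
        x_failed = False
    except FileExistsError:
        x_failed = True

print(contents)
print("x mode errored out:", x_failed)
```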

Now, what does the key look like?

In this example, we wrote our encryption key into a file called filekey.key and stored in the Documents folder.

  • Something to note: The encryption keys should be saved as a .key file, but if you want to view the key file, opening it with a text editor like Notepad will work.

The actual file encryption

Now that we have the encryption key file, let’s encrypt the file! Here’s how to do so:

with open('C:/Users/mof39/OneDrive/Documents/filekey.key', 'rb') as fileKey:
    key = fileKey.read()
    
fernetKey = Fernet(key)

with open('C:/Users/mof39/OneDrive/Documents/2020-nba-playoffs.xlsx', 'rb') as testFile:
    originalFile = testFile.read()
    
encryption = fernetKey.encrypt(originalFile)

with open('C:/Users/mof39/OneDrive/Documents/2020-nba-playoffs-encrypted.xlsx', 'wb') as encryptedFile:
    encryptedFile.write(encryption)

with open('C:/Users/mof39/OneDrive/Documents/2020-nba-playoffs-encrypted.xlsx', 'rb') as encryptedFile:
    encryptedFile.read()

So, what exactly am I doing here? Let me explain:

  • I first read the file key that we generated in the previous section into the IDE.
  • I then created a Fernet key object from the file key we generated.
  • I then read the dataset we’re using into the IDE-note the originalFile variable.
  • I encrypted the originalFile using the Fernet key we created earlier-note the encryption variable.
  • Finally, I wrote the contents of the encryption variable to a new file in my Documents folder and read that encrypted file back into the IDE.

Now what does the encrypted file look like:

In this example, our test Excel file looks like a bunch of gibberish after being encrypted-and that’s the point of the encryption as it’s supposed to make the file unreadable during transmission from point A to point B.

  • Excel files such as this one won’t open in Excel after they are encrypted, since the encrypted contents are no longer valid Excel data. In this case, if you want to see the contents of the encrypted file, opening it with Notepad (as I did here) should do the trick.

It’s decryption time!

Now that we have successfully encrypted our file, assume it has reached its intended recipient and needs to be made readable again. In this case, it’s time to decrypt the file! Here’s how to do so:

decryption = fernetKey.decrypt(encryption)

with open('C:/Users/mof39/OneDrive/Documents/2020-nba-playoffs-decrypted.xlsx', 'wb') as decryptedFile:
    decryptedFile.write(decryption)
    
with open('C:/Users/mof39/OneDrive/Documents/2020-nba-playoffs-decrypted.xlsx', 'rb') as decryptedFile:
    decryptedFile.read()

How did I decrypt the file? Let me explain:

  • I used the Fernet key we generated earlier for file encryption to decrypt the file.
  • I then created a decrypted file (which is the same thing as our original file) and read that file into the IDE.

What does our decrypted file look like? Let’s take a look:

Ta-da! Our decrypted file is the same as our original file, just with -decrypted at the end of the file name.

  • My advice: Although you don’t absolutely need to use different file names for the encrypted and decrypted versions of the file, I like to do so to be able to tell the difference between the encrypted and decrypted files.

Notice a familiar concept?

If you read my 6th anniversary post, you may recall that I discussed the concepts of symmetric and asymmetric-key encryption.

What does this type of encryption/decryption look like to you? If you guessed symmetric-key encryption, you’d be correct! Fernet key encryption-the method we used to encrypt/decrypt this file-is symmetric key encryption because it uses the same key to encrypt and decrypt the file. Granted, I also mentioned that symmetric-key encryption is less secure than asymmetric-key encryption; there are likely many ways to encrypt/decrypt the file using asymmetric-key encryption, but I thought Fernet key encryption would be an easy enough method to utilize to demonstrate basic file encryption/decryption with Python.

Just one more thing…

Six years into this blogging journey, I still strive to find ways to improve how I get my content to you-the readers. With that said, I will now upload scripts I use in my posts to my GitHub so that you can download and play along with the scripts too!

Here’s the link to the repo with the scripts-mfletcher2021/blogcode: Various scripts from Michael’s Programming Bytes (github.com). The script for this lesson is fileencryption.py.

Thanks for reading,

Michael

6

Hello everyone,

Michael here, and you may be wondering what’s up with this blog title? What could I possibly be covering? The joys of the number six perhaps?

First of all, this post marks the 6th anniversary of Michael’s Programming Bytes (known as Michael’s Analytics Blog until June 2022). Yes, dear readers, I have officially been blogging for six years now, with 166 posts to this blog’s name covering things from data analytics and Python coding to web development and GitHub.

Now, how might I style my anniversary post for year #6? Will I use it as an excuse to show you all something cool? Yes!

In fact, today I’ll be showing you how to work with basic text encryption in Python. Let’s begin!

What is encryption?

Now, before we dive into some Python encryption, let’s explain the concept of encryption as it relates to data.

Let’s use the example of sending an Instagram DM (direct message for those unfamiliar) to one of your friends. Instagram has the option to enable end-to-end encryption for any DMs you send, which means that when you send a DM to your friend, encryption encodes the text into something called ciphertext while the message is being sent to your friend’s device. Once the message reaches your friend’s device, it will be decrypted (or decoded) back to plain text.

Why is encryption important? When there’s a transfer of data from one point to another (like one person’s Instagram account to another’s), encryption essentially keeps any hackers from intercepting the contents of that data before it reaches its destination. After all, what good is ciphertext to a hacker?

There are two types of encryption I’ll show you-asymmetric-key encryption and symmetric-key encryption.

Symmetric-key encryption

The first type of encryption we’ll explore is symmetric-key encryption. In symmetric-key encryption, the text is encoded and decoded (or encrypted and decrypted) with the same key.

Although this is an easier method of encryption than asymmetric-key encryption (which we’ll discuss later in this post), it is also a less secure method because the same key is being used to encrypt and decrypt the message. As long as anyone has the encryption/decryption key, they can read the message.

Here’s an illustration of the idea of symmetric-key encryption.

In this picture, “the key” represents the key needed to both encrypt and decrypt the message.

Now, let’s explore symmetric-key encryption in a Pythonic sense. First, please pip install cryptography before beginning.

Next, let’s see some symmetric-key encryption in action:

from cryptography.fernet import Fernet

message = "Thank you for six wonderful years!"

theKey = Fernet.generate_key()

theFernetKey = Fernet(theKey)

encryptedMessage = theFernetKey.encrypt(message.encode())

print("The original message is: " , message)
print("The encrypted message is: " , encryptedMessage)

decryptedMessage = theFernetKey.decrypt(encryptedMessage).decode()

print("The decrypted message is: " , decryptedMessage)

print("The encryption key is: " , theKey)

And here’s our output:

The original message is:  Thank you for six wonderful years!
The encrypted message is:  b'gAAAAABmaQb-Ft-ws6utm0lw8S7Vl-ZHeW0MKyYLdYbGrrV-t04xzjg4ftpQ_0oOegR2MzQ8KWeOsfV2-UMzZdR10CM_UeWmpnlkSOW6kBnYn4KYE9bV-f0mBrco9zWS-OvePTvkGU4F'
The decrypted message is:  Thank you for six wonderful years!
The encryption key is:  b'TA5gu709UF8GEw6zIVvq77sWaOzQHPShZaMlUmA17ls='

So, how did I accomplish all this? Let’s walk through the code, shall we?

  • I imported the Fernet class from the cryptography package. Fernet encryption is a type of symmetric-key encryption that ensures a message can’t be read or manipulated without the encryption/decryption key.
  • I also had a message that I wanted to encrypt and decrypt using Fernet encryption.
  • Before encrypting my message, I utilized the Fernet.generate_key() method to generate an encryption/decryption key (storing it in the theKey variable) and then ensured the key utilized Fernet encryption by instantiating it as an object of the Fernet class (using the line Fernet(theKey) and storing it in the theFernetKey variable).
  • I then encrypted my message by using the Fernet key’s encrypt() method and passing in message.encode() as the method’s parameter. The call to encode() converts the original message from a string into bytes, which is the type of input the encrypt() method expects.
  • After printing out my original message and encrypted message, I then decrypted the message using the Fernet key’s decrypt() method while passing in the encryptedMessage as the parameter. I then followed up the call to the decrypt() method with a call to the decode() method.
  • Finally, I printed out my decrypted message and (non-Fernet) encryption key. Granted, I didn’t build this script with high security in mind, but assuming someone had the encryption key, they could read and mess around with my message.

Now, the fun thing about encryption in Python is that, since Fernet.generate_key() creates a fresh random key every time it’s called, we’ll get a different key each time we run this script-even when encrypting the same message. Here’s the key that’s generated when I run this script again:

b'OkncdE57Dvq42ODTxSMdLbpEIJeZWr5b2_Gbej1LevU='
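To make the "one key does both jobs" idea concrete without any installs, here’s a toy XOR cipher. To be clear, this is not Fernet and it is not secure; the xor_bytes() function and the toy_key value are my own inventions, purely to show the same key scrambling and unscrambling a message:

```python
from itertools import cycle

def xor_bytes(data, key):
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

message = "Thank you for six wonderful years!"
toy_key = b"six"  # hypothetical key, far too short for real use

ciphertext = xor_bytes(message.encode(), toy_key)      # "encrypt"
recovered = xor_bytes(ciphertext, toy_key).decode()    # "decrypt" with the SAME key

print("Ciphertext:", ciphertext)
print("Recovered: ", recovered)
```

Running the same function with the same key a second time undoes the first pass, which is the defining trait of symmetric-key encryption.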

And now for some asymmetric-key encryption

Now that we’ve explored symmetric-key encryption, let’s explore asymmetric-key encryption! Unlike symmetric-key encryption, asymmetric-key encryption uses different keys to encrypt and decrypt data. A public key encrypts the data while a private key decrypts the data-the great thing about this setup is that since only the intended recipient has the private key, no one else can access the data you are trying to transmit. If it helps, think of asymmetric-key encryption like two-factor authentication (you know when you have to use both your password and a second code to login to something), where the different keys add two layers of protection for your data.

Just as I did with symmetric-key encryption, here’s an illustration of asymmetric-key encryption:

And now, let’s code! For this example, please pip install rsa to be able to follow along with this lesson.

Here’s our code for the asymmetric-key encryption:

import rsa

publicKey, privateKey = rsa.newkeys(512)

message = 'Thank you loyal readers for six amazing years'

encryptedMessage = rsa.encrypt(message.encode(), publicKey)

print('Original message: ', message)
print('Encrypted message: ', encryptedMessage)

decryptedMessage = rsa.decrypt(encryptedMessage, privateKey).decode()

print('Decrypted message: ', decryptedMessage)

print('Public key: ', publicKey)
print('Private key: ', privateKey)

And here are the outputs:

Original message:  Thank you loyal readers for six amazing years
Encrypted message:  b'X\xaa\xc5\x98\xe2\xc8\xd1"\xd5\x94\xd0\xc2l\x92\xe3\xc4^\xe9\xef\x83\x18\xab\xdc\xfb\xea\xbb\x1a9\x06\x8e"\xa1\x08\xcc:\xa6n\xc3\xa4\xc2\x14F\xe5i\x96\xd4\x0e\xb6B\x9c-\x85"\xd9\xde\x15\xd8S\xba\xb8\xc8s\x88m'
Decrypted message:  Thank you loyal readers for six amazing years
Public key:  PublicKey(8756745001992373161285778726645083782004419876731866636961799474661459252554364385770004594397922925180145618274212925790191421654715585611349812414582633, 65537)
Private key:  PrivateKey(8756745001992373161285778726645083782004419876731866636961799474661459252554364385770004594397922925180145618274212925790191421654715585611349812414582633, 65537, 2577171637371696805390544273914435753655206228274169456852147462765616764969150310852001220742439294959750117227094790559647443735723327989042277493248225, 7128026561941571600154499219580762398618969604207116659373371614354183291477318639, 1228495001512334734540841979883773428960324113167485626876315020518609447)

So, how did I accomplish all of this? Let’s walk through the code, shall we?

  • The RSA module that we used here utilizes the RSA encryption algorithm, which is a type of encryption algorithm that uses two different but linked keys (one public and one private). The public key encrypts the message while the private key decrypts the message.
  • Quick historical fact: the name RSA comes from the surnames of the creators of this algorithm, computer scientists Ron Rivest, Adi Shamir and Leonard Adleman, who first developed this algorithm in 1977 (all of them are still alive as of June 2024).
  • I used the rsa.newkeys() method to create the publicKey and privateKey and passed in 512 as this method’s parameter. The 512 is the key size in bits (the length of the large number that both keys share). A 512-bit key is fine for a quick demo like this one, but it is far too small for real-world security, where 2048 bits or more is the usual recommendation. Bigger keys take longer to generate, so there is a trade-off between speed and security.
  • I then used the rsa.encrypt() method to encrypt my message and passed in both message.encode() and my publicKey as parameters.
  • After printing out the original and encrypted message, I then used the rsa.decrypt() method to decrypt my message and passed in the encryptedMessage and privateKey as this method’s parameters.
  • I finally printed out the decryptedMessage, publicKey and privateKey.

One interesting thing to note is the similarities between the publicKey and privateKey. Remember how I mentioned that these two keys are different, albeit linked? Notice how both keys start with 8756745001992373161285778726645083782004419876731866636961799474661459252554364385770004594397922925180145618274212925790191421654715585611349812414582633, 65537. The first number is the modulus (usually called n) and the second is the public exponent (e), and both keys share them. The privateKey is considerably longer than the publicKey because it also stores three secret values: the private exponent (d) and the two prime numbers (p and q) that multiply together to give n.

Also, similar to symmetric-key encryption, this script will generate different keys each time it’s run. Here are the keys we get after another run of the script:

PublicKey(7855075279758572094336135232248306022642803736898164846214110092559099915389472776042423258530713061811745645535500011244055637229684973871741305772152203, 65537)

PrivateKey(7855075279758572094336135232248306022642803736898164846214110092559099915389472776042423258530713061811745645535500011244055637229684973871741305772152203, 65537, 999608279798991276176257195736009768967773672364171305025034380150798682275873600573017565727711649304440962386673322170031172276957264982184527359310433, 6478929829895505402335617053533719083904398041145802512468037497716457622395698959, 1212403203305762871531606015835489318412721597481677658992522607891186117)

You’ll also notice that whenever I run this script, the RSA keys I obtain always have the number 65537 in them. Why might that be? It’s what’s known as the public exponent in the RSA algorithm, a crucial part of the public key that is used both to encrypt data and to verify digital signatures. The rsa module uses 65537 by default; it is a popular choice because it is a prime number that keeps the encryption math fast.
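To see how the five numbers inside a private key fit together, here’s a toy sketch that uses only built-in Python. The tiny primes (61 and 53), the exponent 17 and the message 65 are assumed example values chosen so the math is easy to follow; this is not how the rsa module works internally (real keys are vastly larger and real libraries pad the message first).

```python
# Toy RSA with tiny primes, for illustration only
p, q = 61, 53                  # the two secret primes (assumed example values)
n = p * q                      # modulus shared by the public and private key
phi = (p - 1) * (q - 1)        # used only to derive the private exponent
e = 17                         # small public exponent (real keys often use 65537)
d = pow(e, -1, phi)            # private exponent: inverse of e modulo phi (Python 3.8+)

public_key = (n, e)
private_key = (n, e, d, p, q)  # same ordering as the PrivateKey printed earlier

message = 65                       # the message must be a number smaller than n
encrypted = pow(message, e, n)     # encrypt with the public key
decrypted = pow(encrypted, d, n)   # decrypt with the private key

print('Encrypted:', encrypted)
print('Decrypted:', decrypted)
```

Anyone can know n and e, but getting d requires knowing p and q, which is why those stay in the private key only.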

Dear coder, thank you

However you decide to encode this message, I just want to make one thing clear: thank you, thank you, thank you for following along this journey with me for six wonderful years. Thank you for reading everything I’ve done over the past six years (and perhaps learning a trick of the trade along the way). I hope to keep coding along for as long as possible but I’ll admit, I’ve certainly come a long way since my early posts (remember R Lesson 1: Basic R commands as my first-ever tutorial and second overall post?). I’ve also certainly learned quite a bit about running this publication over the last six years, and to be honest, I feel like my programming has come very, very far since that first post in summer 2018.

In short, keep calm and code along fellow devs! I’ll be back with another great year of programming content (and perhaps another cool coding demo for the 7th anniversary).

Also, I would be remiss not to acknowledge the two furry friends that have been around since the early days of this blog:

Orange Boy/Simba and Pretty Girl/Marbles (seen here eagerly awaiting their Christmas presents in 2017):

Michael

Let’s Make Our Own Python Game Part Two: Creating The Bingo Card (Python Lesson 50)

Hello everyone,

Michael here, and in today’s post, I’ll pick up where we left off last time. This time, we’ll create the BINGO board!

In case you forgot where we left off last time, here’s the code from the previous post:

Now let’s continue making our BINGO game. In today’s post, we’ll focus on generating the BINGO card.

Outlining the BINGO board

As you may recall from the previous post, all we really did was create one giant lime green square:

It’s a good start, but quite frankly, it won’t work for our BINGO board. How can we improve the look of our BINGO card? Add some borders!

Here’s the code to add the borders:

for xcoord in x:
    for ycoord in y:
        screen.blit(square1.surf, (xcoord, ycoord))
        pygame.draw.rect(screen, pygame.Color("Red"), (xcoord, ycoord, 75, 75), width=3)

Pay attention to the last line in these nested for loops-the one that contains the pygame.draw.rect() method. What this method does is draw a square border around each square in the BINGO card. This method takes four parameters-the game screen (screen in this case), the color you want for your border (this could be a color name or hex code), a four-integer tuple that contains the x and y-coordinates for the square along with the square’s size, and the thickness of the border in pixels. Let’s see what we get!

The BINGO card already looks much better-now we need to fill it up!

  • Just a tip-for the border generation process to work, make the dimensions of the border the same as the dimensions of the square generated. In this case, we used 75×75 borders since the squares are 75×75 [pixels].

B-I-N-G-O

Now, what does every good BINGO card need? The B-I-N-G-O on the top row, of course!

Here’s how we can implement that:

font = pygame.font.Font('ComicSansMS3.ttf', 32)

if ycoord == 75:
    if xcoord == 40:
        text = font.render('B', True, (0,0,0), (50,205,50))
        textRect = text.get_rect()
        textX = (75 - textRect.width) // 2
        textY = (75 - textRect.height) // 2
        screen.blit(text, (xcoord + textX, ycoord + textY))
    elif xcoord == 115:
        text = font.render('I', True, (0,0,0), (50,205,50))
        textRect = text.get_rect()
        textX = (75 - textRect.width) // 2
        textY = (75 - textRect.height) // 2
        screen.blit(text, (xcoord + textX, ycoord + textY))
    elif xcoord == 180:
        text = font.render('N', True, (0,0,0), (50,205,50))
        textRect = text.get_rect()
        textX = (75 - textRect.width) // 2
        textY = (75 - textRect.height) // 2
        screen.blit(text, (xcoord + textX, ycoord + textY))
    elif xcoord == 255:
        text = font.render('G', True, (0,0,0), (50,205,50))
        textRect = text.get_rect()
        textX = (75 - textRect.width) // 2
        textY = (75 - textRect.height) // 2
        screen.blit(text, (xcoord + textX, ycoord + textY))
    elif xcoord == 330:
        text = font.render('O', True, (0,0,0), (50,205,50))
        textRect = text.get_rect()
        textX = (75 - textRect.width) // 2
        textY = (75 - textRect.height) // 2
        screen.blit(text, (xcoord + textX, ycoord + textY))

And how does all of this code work? Let me explain.

The B-I-N-G-O letters will usually go on the top row of the card. The line if ycoord == 75 will ensure that the letters are only drawn on the top row of the card since point 75 on the y-axis (on our gamescreen) corresponds to the top row of the card.

Since there are five letters in B-I-N-G-O, there are also five conditional statements that account for the five letters we’ll be drawing onto the top five squares. There are also five x-coordinates to account for (40, 115, 180, 255, 330).
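If five near-identical branches feel repetitive, one possible alternative is a lookup table that pairs each x-coordinate with its letter. This is just a sketch of the idea and not the code used in this series; HEADER_LETTERS and header_letter are hypothetical names I made up for illustration, and the sketch leaves out the pygame drawing entirely.

```python
# Hypothetical lookup table: each column's x-coordinate mapped to its letter
x = [40, 115, 180, 255, 330]
HEADER_LETTERS = dict(zip(x, 'BINGO'))

def header_letter(xcoord, ycoord, top_row_y=75):
    """Return the letter for a top-row square, or None for any other square."""
    if ycoord != top_row_y:
        return None
    return HEADER_LETTERS.get(xcoord)

print(header_letter(40, 75))     # B
print(header_letter(330, 75))    # O
print(header_letter(40, 150))    # None (not on the top row)
```

With a helper like this, the rendering and centering lines would only need to be written once instead of five times.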

But before we actually start drawing the text, I want you to take note of this line: font = pygame.font.Font('ComicSansMS3.ttf', 32). This line allows you to set the font you wish to use for text along with its pixel size. In this case, I wanted to use Comic Sans with a 32 pixel size (hey, this isn’t a business project, so I can have some fun with fonts). Unfortunately, if you want custom fonts for your game, you’ll need to download a TTF (TrueType font) file and save it to your local drive.

  • Another tip: if the TTF file is saved in the same directory as your game file, you just need the TTF file name as the first parameter of the pygame.font.Font() method. Otherwise, you’ll need the whole filepath as the first parameter.

As for the five conditional statements, you’ll notice that they each have the same five lines. Let’s explain them one-by-one.

First off we have our text variable, which contains the text we want to write to the square. The value of this variable is stored as the results of the font.render() method, which takes four parameters-the text you wish to display, whether you want to antialias the text (True will antialias the text, which simply results in a smoother text appearance), a 3-integer tuple representing the color of the text in RGB form, and another 3-integer tuple representing the color of the background where you wish to apply the text-also in RGB form.

  • For a good look, be sure to make the background color the same as the square’s color.

Next we have our textRect variable, which represents the invisible rectangle (or square) that contains the text we will render.

Upon initial testing of the text rendering, I noticed that my text wasn’t being centered in the appropriate square. The textX and textY variables are here to fix it by using the simple formula (square size - text rectangle width or height) // 2 (use width for textX and height for textY). What this does is work out how far the text needs to be nudged right and down from the square’s top-left corner so that it sits in the middle of the square. However, these two variables alone won’t center the text correctly, and I’ll explain why shortly.

  • In Python, the // symbol indicates that the result of the division will be rounded down to the nearest whole number, which helps when dealing with coordinates and text centering.

Last but not least, we have our wonderful screen.blit() method. In this context, the screen.blit() method takes two parameters-the text you want to display (text) and a 2-integer tuple denoting the coordinates where you wish to place the text.

Simple enough, right? However, take note of the coordinates I’m using here: (xcoord + textX, ycoord + textY). Since textX and textY are only offsets, adding them to the square’s own xcoord and ycoord is what actually centers the text within the square.
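Here’s the centering arithmetic on its own, without pygame, in case seeing real numbers helps. The function name centered_position and the 20×36 text size are assumptions for illustration; in the game, the width and height come from textRect.

```python
def centered_position(xcoord, ycoord, text_width, text_height, square_size=75):
    """Top-left point at which text of the given size sits centered in a square."""
    text_x = (square_size - text_width) // 2    # leftover horizontal space, halved
    text_y = (square_size - text_height) // 2   # leftover vertical space, halved
    return (xcoord + text_x, ycoord + text_y)

# Assumed example: text 20 pixels wide and 36 pixels tall,
# placed in the first square of the top row (x=40, y=75)
print(centered_position(40, 75, 20, 36))   # (67, 94)
```

The leftover space is 55 pixels across and 39 pixels down, so the text is shifted 27 pixels right and 19 pixels down from the square’s corner.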

After all our text rendering, how does the game look now?

Wow, our BINGO game is starting to come together. And now, let’s generate some BINGO numbers!

B….1 to 15, I….16 to 30 and so on

Now that the B-I-N-G-O letters are visible on the top of our card, the next thing we should do is fill our card up with the appropriate numbers.

For anyone who’s played BINGO, you’ll likely be familiar with which numbers end up on which spots on the card. In any case, here’s a refresher on that:

  • B (1 to 15)
  • I (16 to 30)
  • N (31 to 45)
  • G (46 to 60)
  • O (61 to 75)

With these numbering rules in mind, let’s see how we can implement them into our code and show them on the card! First, inside the game while loop, let’s create five different arrays to hold our BINGO numbers, appropriately titled B, I, N, G, and O.

from random import seed, randint 

[meanwhile, inside the while game loop...]

    B = []
    seed(10)
    for n in range(5):
        value = randint(1, 15)
        B.append(value)

    I = []
    seed(10)
    for n in range(5):
        value = randint(16, 30)
        I.append(value)

    N = []
    seed(10)
    for n in range(5):
        value = randint(31, 45)
        N.append(value)

    G = []
    seed(10)
    for n in range(5):
        value = randint(46, 60)
        G.append(value)

    O = []
    seed(10)
    for n in range(5):
        value = randint(61, 75)
        O.append(value)

Also, don’t forget to include the line from random import seed, randint (I’ll explain why this is important) on the top of the script and out of the while game loop.

As for the array creation, please keep that inside the while game loop! How does the BINGO array creation work? First of all, I created five empty arrays (B, I, N, G, and O) which will soon be filled with five random numbers according to the BINGO numbering system (B can be 1 to 15, I can be 16 to 30, and so on).

Now, you’ll notice that there are five calls to the [random].seed() method, and all of the calls take 10 as the seed. What does the seed do? Well, in Python (and many other programming languages), random number generation isn’t truly “random”; it is pseudorandom. The seed is the starting point for the generator: it can be any integer you like, but your choice of seed determines the exact sequence of numbers that comes out. That is why random number generation (at least in programming) is described as deterministic: the same seed always gives the same sequence.

  • If you call seed() without a value (or never call it at all), Python picks a starting point for you based on the operating system’s randomness source or the current time, so you’ll get a different sequence on each run.
  • Since we imported seed directly with from random import seed, randint, you simply need to write seed() rather than random.seed().
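A quick way to convince yourself that the seed controls the sequence is to draw the same five numbers twice. This is a small standalone sketch; five_numbers is a hypothetical helper name and isn’t part of the game code.

```python
from random import seed, randint

def five_numbers(seed_value, low, high):
    """Seed the generator, then draw five numbers between low and high (inclusive)."""
    seed(seed_value)
    return [randint(low, high) for _ in range(5)]

first_run = five_numbers(10, 1, 15)
second_run = five_numbers(10, 1, 15)

print(first_run)
print(second_run)
print(first_run == second_run)   # True: same seed, same sequence
```

Notice that nothing stops randint() from returning the same number more than once within a run.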

After the random seed generators are set up, there are five different loops that append five random numbers to each array. The line for n in range(5) ensures that each array has a length of 5. Inside each loop I have utilized the [random].randint() function and passed in two integers as parameters to ensure that I only receive random numbers in a specific range (such as 1 to 15 for array B).

Now, let’s display our numbers on the BINGO card! Here’s the code to run (and yes, keep it in the while game loop):

if ycoord != 75:

    if xcoord == 40:
        for num in B:
            text = font.render(str(num), True, (0,0,0), (50,205,50))
            textRect = text.get_rect()
            textX = (75 - textRect.width) // 2
            textY = (75 - textRect.height) // 2
            screen.blit(text, (xcoord + textX, ycoord + textY))

    if xcoord == 115:
        for num in I:
            text = font.render(str(num), True, (0,0,0), (50,205,50))
            textRect = text.get_rect()
            textX = (75 - textRect.width) // 2
            textY = (75 - textRect.height) // 2
            screen.blit(text, (xcoord + textX, ycoord + textY))

    if xcoord == 180:
        for num in N:
            text = font.render(str(num), True, (0,0,0), (50,205,50))
            textRect = text.get_rect()
            textX = (75 - textRect.width) // 2
            textY = (75 - textRect.height) // 2
            screen.blit(text, (xcoord + textX, ycoord + textY))

    if xcoord == 255:
        for num in G:
            text = font.render(str(num), True, (0,0,0), (50,205,50))
            textRect = text.get_rect()
            textX = (75 - textRect.width) // 2
            textY = (75 - textRect.height) // 2
            screen.blit(text, (xcoord + textX, ycoord + textY))

    if xcoord == 330:
        for num in O:
            text = font.render(str(num), True, (0,0,0), (50,205,50))
            textRect = text.get_rect()
            textX = (75 - textRect.width) // 2
            textY = (75 - textRect.height) // 2
            screen.blit(text, (xcoord + textX, ycoord + textY))

Confused about what this code means? The last five lines in each if statement essentially do the same thing we were doing when we were rendering the B-I-N-G-O on the top row of the card (rendering and centering the text on each square). However, since we don’t want the numbers on the top row of the card, we include the main if statement if ycoord != 75 because this represents all squares that aren’t on the top row of the card.

Oh, one thing to note about rendering the numbers on the card-simply cast the number variable (num in this case) as a string/str because pygame won’t render anything other than text of type str.

With all that said, let’s see what our BINGO card looks like:

Well, we did get correct number ranges, but this isn’t the output we want. Time for some debugging!

D-E-B-U-G

Now, how do we fix this board to get distinct numbers on the card? Here’s the code for that!

First, let’s fix the BINGO number array creation process:

    B = []
    # range() stops one short of its second argument, so use 16 to include 15
    value = sample(range(1, 16), 5)
    for v in value:
        B.append(v)

    I = []
    value = sample(range(16, 31), 5)
    for v in value:
        I.append(v)

    N = []
    value = sample(range(31, 46), 5)
    for v in value:
        N.append(v)

    G = []
    value = sample(range(46, 61), 5)
    for v in value:
        G.append(v)

    O = []
    value = sample(range(61, 76), 5)
    for v in value:
        O.append(v)

In this code, I first added the sample method to the from random import ... line as we’ll need this method to ensure we get an array of distinct random numbers.
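To see why sample() is the right tool here, this standalone sketch builds all five columns at once and relies on the fact that sample() picks without replacement. The COLUMN_RANGES dictionary is a hypothetical name for illustration; each range stops one past the largest number allowed, because range() leaves out its stop value.

```python
from random import sample

# Column ranges from the BINGO numbering rules (stop value is excluded by range())
COLUMN_RANGES = {
    'B': range(1, 16),
    'I': range(16, 31),
    'N': range(31, 46),
    'G': range(46, 61),
    'O': range(61, 76),
}

# sample() picks without replacement, so no number repeats within a column
card = {letter: sample(numbers, 5) for letter, numbers in COLUMN_RANGES.items()}

for letter, numbers in card.items():
    print(letter, numbers)
```

Every run should print five distinct numbers per column, each within that column’s range.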

else:

    y2 = [150, 225, 300, 375, 450]
    if xcoord == 40:
        for num, ycoord in zip(B, y2):
            text = font.render(str(num), True, (0,0,0), (50,205,50))
            textRect = text.get_rect()
            textX = (75 - textRect.width) // 2
            textY = (75 - textRect.height) // 2
            screen.blit(text, (xcoord + textX, ycoord + textY))

    if xcoord == 115:
        for num, ycoord in zip(I, y2):
            text = font.render(str(num), True, (0,0,0), (50,205,50))
            textRect = text.get_rect()
            textX = (75 - textRect.width) // 2
            textY = (75 - textRect.height) // 2
            screen.blit(text, (xcoord + textX, ycoord + textY))

    if xcoord == 180:
        for num, ycoord in zip(N, y2):
            text = font.render(str(num), True, (0,0,0), (50,205,50))
            textRect = text.get_rect()
            textX = (75 - textRect.width) // 2
            textY = (75 - textRect.height) // 2
            screen.blit(text, (xcoord + textX, ycoord + textY))

    if xcoord == 255:
        for num, ycoord in zip(G, y2):
            text = font.render(str(num), True, (0,0,0), (50,205,50))
            textRect = text.get_rect()
            textX = (75 - textRect.width) // 2
            textY = (75 - textRect.height) // 2
            screen.blit(text, (xcoord + textX, ycoord + textY))

    if xcoord == 330:
        for num, ycoord in zip(O, y2):
            text = font.render(str(num), True, (0,0,0), (50,205,50))
            textRect = text.get_rect()
            textX = (75 - textRect.width) // 2
            textY = (75 - textRect.height) // 2
            screen.blit(text, (xcoord + textX, ycoord + textY))

Remember the else block where we were rendering the text? Well, I made a few changes to the code. I first added another array of y-coordinates (y2) that is essentially the same as the y array without the number 75. Why did I remove the 75? I simply wanted to ensure that no numbers are drawn on the top row of the card, and 75 represents the y-coordinate that displays the top row of the card.

While iterating through our five BINGO number arrays aptly titled B, I, N, G and O, we’re also iterating through the y2 array to ensure that the correct numbers are rendered in the correct squares.

  • In case you’re wondering about the zip() line in our for loop, the zip() function allows us to iterate through multiple arrays at once in a for loop. Keep in mind that zip() stops as soon as the shortest array runs out, so it works best when the arrays you’re looping through are the same length.
  • If you want to iterate through multiple arrays of unequal lengths without dropping anything, include this import at the beginning of your script: from itertools import zip_longest. The zip_longest() function keeps going until the longest array runs out and fills in the gaps with None. There’s no need to pip install anything, since itertools comes built into Python.
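Here’s a tiny standalone comparison of the two functions. The numbers and y_coords lists are made-up sample values, with y_coords deliberately one element shorter.

```python
from itertools import zip_longest

numbers = [3, 11, 7]
y_coords = [150, 225]          # deliberately one element shorter

# zip() stops at the shortest list, so the 7 is silently dropped
print(list(zip(numbers, y_coords)))
# [(3, 150), (11, 225)]

# zip_longest() keeps going and fills the missing value with None
print(list(zip_longest(numbers, y_coords)))
# [(3, 150), (11, 225), (7, None)]
```

For the BINGO card, each number array and the y2 array both have five elements, so plain zip() pairs everything up.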

Using our revised code, let’s see what our BINGO card looks like now!

Wow, the BINGO card already looks much better! However, if you’re familiar with BINGO, you know the center square in the N column is considered a “free space”. Let’s reflect this with the addition of one simple line of code:

N[2] = 'Free'

This line will replace the middle element in the N array with the word Free, which in turn will display the word Free on the center of the BINGO card:

Nice work!

Testing the card

Now that we’ve generated quite the good-looking BINGO card, the last thing we’ll need to do is test it to ensure we get a new card each time we open the game!

Here’s what our card currently looks like:

And let’s see what happens when we close and restart the game:

It looks like we got the same card. Now, playing with the same card every time would be quite boring, right? How do we fix this bug? Here’s some code to do so (and keep in mind this is just one way to solve the problem):

possibleSeeds = []

value = sample(range(1, 10000), 5)
for v in value:
    possibleSeeds.append(v)

meanwhile, inside the game loop

...

seed(possibleSeeds[0])

To solve the BINGO card generation bug, I used the same array generating trick I used for generating the BINGO arrays: gather a specific number of integers from a specific integer range and use a loop to create a 5-element integer array. This time, I used integers ranging from 1 to 9999 (range() leaves out the 10000).

Inside the game loop, I set the random seed to the first element of the possibleSeeds array. Why did I do this? When I set my seed() to 10, I saw the same BINGO card each time I started the game, because a fixed seed always produces the same sequence of random numbers. The possibleSeeds array, on the other hand, is filled before any seed has been set, so Python chooses its own starting point and the array holds five different integers on each run of the script. Since the first element changes from run to run, the seed changes too, which in turn results in a different BINGO card each time the game is run.

  • Keep the seed() method inside the game loop, but keep the possibleSeeds array outside of the game loop because inserting the array into the game loop will generate random 5-integer sequences non-stop, which isn’t a desirable outcome.
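As mentioned, this is just one way to solve the problem. Another possible approach, sketched below with only built-in Python, is to build the whole card from a dedicated random.Random generator. The make_card function is a hypothetical helper for illustration and is not part of the game code in this series.

```python
from random import Random

def make_card(rng):
    """Build a BINGO card (a dict of five columns) using the given generator."""
    card = {}
    for index, letter in enumerate('BINGO'):
        low = 1 + 15 * index                  # 1, 16, 31, 46, 61
        card[letter] = rng.sample(range(low, low + 15), 5)
    card['N'][2] = 'Free'                     # center square is the free space
    return card

# Same seed, same card: this is why a fixed seed(10) kept repeating the card
print(make_card(Random(10)) == make_card(Random(10)))   # True

# No seed given: Python chooses its own starting point on every run
print(make_card(Random()))
```

Calling make_card(Random()) once before the game loop would give a fresh card per launch without managing a list of seeds.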

Now, let’s see if our little trick worked. Let’s try running the game:

Now let’s close this window and try running the game again!

Awesome-we got a different BINGO card! How about another test run-third time’s the charm after all!

Nice work. Stay tuned for the next part of this game development series where we will create the mechanism to call out different BINGO “balls”.

Also, here’s a Word Doc file with our code so far (WordPress won’t let me upload PY files)-

Thanks for reading,

Michael

Let’s Make Our Own Python Game Part One: Getting Started (Python Lesson 49)

Hello loyal readers,

I hope you all had a wonderful and relaxing holiday with those you love-I know I sure did. I did promise you all new and exciting programming content in 2024, so let’s get started!

My first post of 2024 (and first several posts of 2024 for that matter) will do something I haven’t done on this blog in its nearly 6-year existence-game development! Yup, that’s right, we’ll learn how to make a simple game using Python’s pygame package. And yes, this game will include graphics (so we’re making something way cooler than a simple text-based blackjack game or something like that).

Let’s begin!

Setting ourselves up

Before we even start to design our game, let’s install the pygame package using the following line either on our IDE or command prompt-pip install pygame.

Next, let’s open our IDE. You could technically use Jupyter notebook to start creating the game, but for something like game creation that utilizes graphics (and likely lots of code) I’d suggest an IDE like Spyder.

Now, where do we begin?

To start, here are the first three lines of code we should include in our script:

import pygame
from pygame.locals import *
import sys

What game will I teach you how to program? Well, in this series of posts, we’ll learn to make our own BINGO clone.

Why BINGO? Well, compared to many other games I could possibly teach you to program, BINGO seemed like a relatively easy first game to learn to develop as it doesn’t involve multiple levels, much scoring, health tracking, or final bosses (though we could certainly explore games that involve these concepts later on).

Let’s start coding!

First off, since we are programming a BINGO game, we’ll need to draw squares. 30 of them, to be precise, as simple BINGO games utilize a 5×5 card along with five squares at the top that contain the letters B, I, N, G, and O.

Seems simple enough to understand right? Let’s see how we code it!

class Square(pygame.sprite.Sprite):
    def __init__(self):
        super(Square, self).__init__()
        self.surf = pygame.Surface((75, 75))
        self.surf.fill((50, 205, 50))
        self.rect = self.surf.get_rect()

pygame.init()

screen = pygame.display.set_mode((800, 600))

square1 = Square()

First of all, to draw the BINGO squares, we’ll need to create a Square class that inherits from pygame’s Sprite class, which we do by placing pygame.sprite.Sprite in the parentheses like so: class Square(pygame.sprite.Sprite).

What is the Sprite class in pygame? For those who are familiar with fantasy works (e.g. Shrek, Lord of the Rings), a sprite is a legendary mythical creature such as a pixie, fairy, or elf (among others). In pygame a sprite simply represents a 2D image or animation that is displayed on the game screen, like the squares we’ll need to draw for our BINGO board.

The next line, the one that begins with super, runs the Sprite class’s own setup code so that the Square class properly picks up all of the methods and capabilities of the Sprite class, which is necessary if you want the squares drawn on the game screen.

The following three lines set the drawing surface (and in turn, the size) of the square, set the color of each square on the gameboard using RGB color coding (yes, you can make the squares different colors, but I’m keeping it simple and coloring all the squares lime green), and get the rectangular area of each square, respectively.

The next two lines initialize pygame for our game, using the line pygame.init(), and set the size of the screen (in pixels). In this case, we’ll use an 800×600 pixel screen.

The last line initiates a square object for us to draw. The interesting thing to note here is that even though we’ll ultimately need to draw 30 squares for our BINGO board, we only need one square object since we can draw that same square object 30 different times in 30 different places.

Even with all this code, we’ll still need to actually draw the squares onto our game screen-this code just ensures that we have the ability to do just that (it doesn’t actually take care of the graphics drawing).

Let’s run the game!

Now that we have created the squares for our BINGO game and imported the necessary packages, let’s figure out how to get our game running! Check out this chunk of code that helps us do just that!

gameOn = True

while gameOn:
    for event in pygame.event.get():
        if event.type == KEYDOWN:
            if event.key == K_BACKSPACE:
                gameOn = False

        elif event.type == QUIT:
            gameOn = False

First, we have our boolean variable gameOn, which indicates whether or not our game is currently running (True if it is, False if it isn’t).

The while loop that follows is a great example of event handling (I think this is the first time I mention it on this blog), which is the process of deciding what your program should do in response to various scenarios, or events (a key press or a mouse click, for example). This while loop will keep running as long as gameOn is true (in other words, as long as the game is running).

You’ll notice two event types that will shut the game down, KEYDOWN and QUIT. In the case of KEYDOWN, the game will shut down only if the backspace key is pressed. In the case of the QUIT event, the game will quit if the user presses the X close button on the window. However, something to note about the QUIT event is that pressing X alone doesn’t quit the game-I know because I tried using the X button to quit the game and ran into an unresponsive window that I ended up force-quitting. Don’t worry, I’ll explain how to quit the game properly later in this post.

Drawing the squares

Now that we have a means to keep our game running (or close it if we so choose), let’s now draw the squares onto the gameboard. Here’s the code to do so:

screen.blit(square1.surf, (40, 75))

screen.blit(square1.surf, (115, 75))
screen.blit(square1.surf, (180, 75))
screen.blit(square1.surf, (255, 75))
screen.blit(square1.surf, (330, 75))
screen.blit(square1.surf, (40, 150))
screen.blit(square1.surf, (115, 150))
screen.blit(square1.surf, (180, 150))
screen.blit(square1.surf, (255, 150))
screen.blit(square1.surf, (330, 150))
screen.blit(square1.surf, (40, 225))
screen.blit(square1.surf, (115, 225))
screen.blit(square1.surf, (180, 225))
screen.blit(square1.surf, (255, 225))
screen.blit(square1.surf, (330, 225))
screen.blit(square1.surf, (40, 300))
screen.blit(square1.surf, (115, 300))
screen.blit(square1.surf, (180, 300))
screen.blit(square1.surf, (255, 300))
screen.blit(square1.surf, (330, 300))
screen.blit(square1.surf, (40, 375))
screen.blit(square1.surf, (115, 375))
screen.blit(square1.surf, (180, 375))
screen.blit(square1.surf, (255, 375))
screen.blit(square1.surf, (330, 375))
screen.blit(square1.surf, (40, 450))
screen.blit(square1.surf, (115, 450))
screen.blit(square1.surf, (180, 450))
screen.blit(square1.surf, (255, 450))
screen.blit(square1.surf, (330, 450))

pygame.display.flip()

Even though you’ll only need to create one square object, you’ll need to draw that object 30 different times since the BINGO board will consist of 30 squares drawn in a grid of 6 rows and 5 columns. To draw the squares, you’ll need to use the following method: screen.blit(square1.surf, (x-coordinate, y-coordinate)). The screen.blit(...) method draws the squares onto the screen and it takes two parameters: square1.surf, which is the surface of the square, and a two-integer tuple stating the coordinates where you want the square placed (x-coordinate first, then y-coordinate).

After the 30 instances of the screen.blit() method, the pygame.display.flip() method is called, which simply updates the game screen to display the 30 squares. You might’ve thought the screen.blit() method already accomplishes this, but this method simply draws the squares while the pygame.display.flip() method updates the game screen to ensure the squares are present.

Quitting the game

As I mentioned earlier in this post, I’ll show you how to properly quit the game. Here are the two lines of code needed to do so:

pygame.quit()

sys.exit()

To properly end the pygame session, you’ll need to include these two commands in your code. Why do you need them both? Wouldn’t one command or the other work?

You need both commands because the pygame.quit() method simply shuts down the active pygame modules, while the sys.exit() method ends the Python program itself so that nothing is left running.

And now, let’s see our work!

Now that we’ve got the basic game outline set up, let’s see our work by running our script!

As you see here, we simply have one giant lime-green square. However, that lime green square consists of the 30 squares we drew earlier. The squares are simply drawn right up against each other with no borders or gaps between them, hence why the output looks like one big square. Don’t worry, in the next post we’ll make this square look more like a BINGO board!

A small coding improvement

As you noticed earlier in this post, I was calling the screen.blit() method 30 times while drawing the squares. However, there is a much better way to accomplish this:

x = [40, 115, 180, 255, 330]

y = [75, 150, 225, 300, 375, 450]

for xcoord in x:
    for ycoord in y:
        screen.blit(square1.surf, (xcoord, ycoord))

In this example, I placed all possible x and y coordinates into arrays and drew each square by looping through the values in both arrays. Here’s the output of this improved approach:

As you see, not only did we improve the process for drawing the squares onto the game screen, but we also got the same result we did when we were calling the screen.blit() method 30 times.
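If you’d like to double-check that the two arrays really cover all 30 squares, here’s a small standalone sketch that lists every coordinate pair without drawing anything. It uses itertools.product from Python’s standard library as a stand-in for the nested loops; the coordinates variable is just a name chosen for this example.

```python
from itertools import product

x = [40, 115, 180, 255, 330]
y = [75, 150, 225, 300, 375, 450]

# product() yields every (xcoord, ycoord) pair, in the same order as the nested loops
coordinates = list(product(x, y))

print(len(coordinates))    # 30
print(coordinates[:3])     # [(40, 75), (40, 150), (40, 225)]
```

Five x-coordinates times six y-coordinates gives the 30 positions that were previously typed out by hand.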

For your reference, the code

Just in case you want it, here’s the code we used for our game development in this post (and we will certainly change it throughout this series of posts). I’m copying the code here since WordPress won’t let me upload .PY files:

import pygame
from pygame.locals import *
import sys

class Square(pygame.sprite.Sprite):
    def __init__(self):
        super(Square, self).__init__()

        self.surf = pygame.Surface((75, 75))

        self.surf.fill((50, 205, 50))
        self.rect = self.surf.get_rect()

pygame.init()

screen = pygame.display.set_mode((800, 600))

square1 = Square()

gameOn = True

while gameOn:
    for event in pygame.event.get():

        if event.type == KEYDOWN:

            if event.key == K_BACKSPACE:
                gameOn = False

        # Check for QUIT event
        elif event.type == QUIT:
            gameOn = False

    x = [40, 115, 180, 255, 330]
    y = [75, 150, 225, 300, 375, 450]

    for xcoord in x:
        for ycoord in y:
            screen.blit(square1.surf, (xcoord, ycoord))

    # Update the display using flip
    pygame.display.flip()

pygame.quit()
sys.exit()

Thanks for reading, and I look forward to having you code along with me in 2024!

Michael

Python Lesson 47: Image Rotation (AI pt. 13)

Hello loyal readers,

Michael here, and in this post, we’ll cover another fun OpenCV topic: image rotation!

Let’s rotate an image!

First off, let’s figure out how to rotate images with OpenCV. Here’s the image we’ll be working with in this example:

This is an image of the Jumbotron at First Horizon Park in Nashville, TN, home ballpark of the Nashville Sounds (Minor League Baseball affiliate of the Milwaukee Brewers).

Now, how do we rotate this image? First, let’s read in our image in RGB colorscale:

import cv2
import matplotlib.pyplot as plt

ballpark=cv2.imread(r'C:\Users\mof39\Downloads\20230924_140902.jpg', cv2.IMREAD_COLOR)
ballpark=cv2.cvtColor(ballpark, cv2.COLOR_BGR2RGB)
plt.figure(figsize=(10, 10))
plt.imshow(ballpark)

With the image loaded, let’s start by analyzing a 90-degree clockwise rotation:

clockwiseBallpark = cv2.rotate(ballpark, cv2.ROTATE_90_CLOCKWISE)
plt.figure(figsize=(10, 10))
plt.imshow(clockwiseBallpark)

All it takes to rotate an image in OpenCV is the cv2.rotate() method and two parameters: the image you wish to rotate and one of the following OpenCV rotation codes (more on these soon):

  • cv2.ROTATE_90_CLOCKWISE (rotates image 90 degrees clockwise)
  • cv2.ROTATE_180 (rotates image 180 degrees clockwise)
  • cv2.ROTATE_90_COUNTERCLOCKWISE (rotates image 270 degrees clockwise, or 90 degrees counterclockwise)

Let’s analyze the image rotation with the other two OpenCV rotation codes. First off, the ballpark image rotated 180 degrees clockwise:

clockwiseBallpark = cv2.rotate(ballpark, cv2.ROTATE_180)
plt.figure(figsize=(10, 10))
plt.imshow(clockwiseBallpark)

Alright, pretty impressive. It’s an upside down Jumbotron!

Now to rotate the image 270 degrees clockwise:

clockwiseBallpark = cv2.rotate(ballpark, cv2.ROTATE_90_COUNTERCLOCKWISE)
plt.figure(figsize=(10, 10))
plt.imshow(clockwiseBallpark)

Well well, it’s the amazing rotating Jumbotron!

And yes, in case you’re wondering, the rotation code cv2.ROTATE_90_COUNTERCLOCKWISE is the correct rotation code for a 270 degree clockwise rotation because a 90 degree counterclockwise rotation is the same thing as a 270 degree clockwise rotation.
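To convince yourself of that equivalence without loading an image, here’s a small plain-Python sketch that treats a nested list as a tiny grid of pixels. It doesn’t call OpenCV at all; the helper names rotate_90_clockwise and rotate_90_counterclockwise are made up for this illustration.

```python
def rotate_90_clockwise(grid):
    """Rotate a list-of-rows grid a quarter turn clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def rotate_90_counterclockwise(grid):
    """Rotate a list-of-rows grid a quarter turn counterclockwise."""
    return [list(row) for row in zip(*grid)][::-1]


pixels = [[1, 2, 3],
          [4, 5, 6]]

# Apply three clockwise quarter turns (270 degrees clockwise in total).
three_clockwise_turns = pixels
for _ in range(3):
    three_clockwise_turns = rotate_90_clockwise(three_clockwise_turns)

# Compare against a single counterclockwise quarter turn.
print(three_clockwise_turns == rotate_90_counterclockwise(pixels))
```

Three clockwise quarter turns land the grid in the same place as one counterclockwise quarter turn, which is why a single rotation code covers both descriptions.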

Now, I know I just discussed three possible ways to rotate an image. However, what if you wanted to rotate an image by an angle that’s not 90, 180, or 270 degrees? Well, if you try to do so with the cv2.rotate() method, you’ll get an error:

clockwiseBallpark = cv2.rotate(ballpark, 111)
plt.figure(figsize=(10, 10))
plt.imshow(clockwiseBallpark)

TypeError: Image data of dtype object cannot be converted to float

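When I tried to rotate this image 111 degrees clockwise, I got an error because the cv2.rotate() method will only accept one of the three rotation codes I mentioned above; its second parameter is a rotation code, not an angle in degrees. Notice that the TypeError itself is raised by plt.imshow(), which suggests that cv2.rotate() didn’t hand back a usable image for the unrecognized code 111.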
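One way to catch this mistake earlier is to check the angle before calling cv2.rotate(). Here’s a minimal sketch of that idea; the helper rotation_code_name and the dictionary are hypothetical names I’m using for illustration, and the helper only returns the name of the rotation code rather than calling OpenCV itself.

```python
# Names of the three cv2.rotate() rotation codes, keyed by clockwise angle.
ROTATION_CODE_NAMES = {
    90: "ROTATE_90_CLOCKWISE",
    180: "ROTATE_180",
    270: "ROTATE_90_COUNTERCLOCKWISE",
}


def rotation_code_name(clockwise_degrees):
    """Return the rotation code name for a clockwise angle (hypothetical helper)."""
    try:
        return ROTATION_CODE_NAMES[clockwise_degrees]
    except KeyError:
        # Fail with a clear message instead of a confusing plotting error.
        raise ValueError(
            f"cv2.rotate() has no rotation code for {clockwise_degrees} degrees"
        ) from None


print(rotation_code_name(270))
```

Assuming cv2 is imported, you could then write something like cv2.rotate(ballpark, getattr(cv2, rotation_code_name(270))), while an unsupported angle such as 111 would fail right away with a readable ValueError.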

Let’s rotate an image (at any angle)!

However, if you want more freedom over how you rotate your images in OpenCV, use the cv2.getRotationMatrix2D() method. Here’s an example of how to use it:

height, width = ballpark.shape[:2]
center = (width/2, height/2)
rotationMatrix = cv2.getRotationMatrix2D(center, 55, 1)
# dsize is given as (width, height), the reverse of the order in ballpark.shape
rotatedBallpark = cv2.warpAffine(ballpark, rotationMatrix, (width, height))
plt.figure(figsize=(10, 10))
plt.imshow(rotatedBallpark)

To rotate an image in OpenCV by an angle that’s not a multiple of 90 degrees (90, 180, 270), you’ll need to use both the cv2.getRotationMatrix2D() and the cv2.warpAffine() methods. The former method builds the rotation matrix, a small 2x3 matrix that encodes the center, angle, and scale of the rotation. The latter method applies that matrix to the image, which is the step that actually rotates it.

Since both of these are new methods for us, let’s dive into them a little further! First off, let’s explore the parameters of the cv2.getRotationMatrix2D() method:

  • center-this parameter indicates the point the image rotates around, given as an (x, y) tuple; in this example, that’s the center of the image. To get the center, first retrieve the image’s shape and from there, retrieve the height and width. Once you have the image’s height and width, create a 2-element center tuple where you divide the image’s width and height by 2. Be sure to list the width before the height, since OpenCV expects the x-coordinate first.
  • angle-the angle (in degrees) you wish to use for the image rotation. In OpenCV, positive values mean counterclockwise rotation, so the 55 I used in this example rotates the image 55 degrees counterclockwise. If I wanted to rotate the image 55 degrees clockwise, I would’ve used -55 as the value for this parameter.
  • scale-This is a number (decimals such as 0.5 or 1.5 are allowed) that represents the factor you wish to use to scale the rotated image. In this example, I used 1 as the value for this parameter, indicating that I don’t want to zoom in or out on the rotated image at all. If I’d used a value greater than 1, I’d be zooming in, and if I’d used a value less than 1, I’d be zooming out.
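To see how these three parameters fit together, here’s a plain-Python sketch that rebuilds the 2x3 matrix by hand, following the formula given in the OpenCV documentation for getRotationMatrix2D(). It’s only an illustration and isn’t OpenCV’s actual code; the function names rotation_matrix and apply_matrix are made up for this example.

```python
import math


def rotation_matrix(center, angle_degrees, scale):
    """Build a 2x3 rotation matrix as nested lists (illustrative re-implementation)."""
    cx, cy = center
    alpha = scale * math.cos(math.radians(angle_degrees))
    beta = scale * math.sin(math.radians(angle_degrees))
    return [
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ]


def apply_matrix(matrix, point):
    """Map an (x, y) point through the 2x3 matrix."""
    x, y = point
    return (
        matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
        matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
    )


center = (200.0, 100.0)
matrix = rotation_matrix(center, 90, 1.0)

# The center of rotation stays (approximately) where it is.
print(apply_matrix(matrix, center))
# A point 10 pixels to the right of the center ends up about 10 pixels above it
# (y gets smaller going up in image coordinates), which is a counterclockwise turn.
print(apply_matrix(matrix, (210.0, 100.0)))
```

Because of floating-point rounding, the printed values can be off by a tiny fraction, but they show that the center point doesn’t move and that a positive angle turns the image counterclockwise.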

Next, let’s explore the parameters of the cv2.warpAffine() method!

  • src-The image you wish to rotate (in this example, I used the base ballpark image)
  • M-The rotation matrix you just created for the image using the cv2.getRotationMatrix2D() method (ideally you would’ve stored the rotation matrix in a variable).
  • dsize-A 2-element tuple, in (width, height) order, indicating the size of the rotated image; passing the base image’s width and height keeps the size of the rotated image the same as the original.

Now for some extra notes:

  • Why is the rotation method called warpAffine()? This is because the rotation we’re performing on the image is also known as an affine transformation, which is a transformation (such as rotating, scaling, or shifting) that keeps straight lines straight and parallel lines parallel, so the image isn’t bent or distorted.
  • You’ll notice that after rotating the image using the cv2.warpAffine method, the entire image isn’t visible on the plot. I haven’t figured out how to make the image visible on the plot but when I do, I can certainly share my findings here. Though I guess a good workaround solution would be to play around with the size of the plot.
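A commonly used approach for keeping the whole rotated image in view is to enlarge the output canvas to the bounding box of the rotated image and shift the rotation matrix so the image stays centered. Here’s a minimal sketch of that math in plain Python, assuming a scale of 1; the helper name expanded_canvas is hypothetical, and this hasn’t been tested against the ballpark image from this post.

```python
import math


def expanded_canvas(width, height, angle_degrees):
    """Return (new_width, new_height, shift_x, shift_y) for a full-view rotation.

    Hypothetical helper: the new size is the bounding box of the rotated
    rectangle, and the shifts re-center the image in the larger canvas.
    Assumes the rotation uses a scale of 1.
    """
    cos_a = abs(math.cos(math.radians(angle_degrees)))
    sin_a = abs(math.sin(math.radians(angle_degrees)))
    new_width = round(height * sin_a + width * cos_a)
    new_height = round(height * cos_a + width * sin_a)
    shift_x = new_width / 2 - width / 2
    shift_y = new_height / 2 - height / 2
    return new_width, new_height, shift_x, shift_y


# A 400x200 image turned 90 degrees needs a 200x400 canvas.
print(expanded_canvas(400, 200, 90))

# Assumed usage with the variables from this post (not run here):
#   newW, newH, dx, dy = expanded_canvas(width, height, 55)
#   rotationMatrix[0, 2] += dx
#   rotationMatrix[1, 2] += dy
#   rotatedBallpark = cv2.warpAffine(ballpark, rotationMatrix, (newW, newH))
```

The two += lines adjust the translation column of the rotation matrix, so the rotated image is drawn in the middle of the bigger canvas instead of being cut off at the corners.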

Thanks for reading, and for my readers in the US, have a wonderful Thanksgiving! For my readers elsewhere on the globe, have a wonderful holiday season (and no, this won’t be my last post for 2023)!