About this site
What is this?
This is a website that stores a ranking system for Maine high school basketball. There’s a lot of math. So much math. But, don’t worry. You can use your calculator.
In a state as large as Maine with such a range in school sizes (and driving distances!), you aren’t going to be able to play a balanced schedule within your division. Enter the Heal Points, which are a vast improvement over a simple Won-Loss record.
The thing is, they don’t go nearly far enough.
So I tore them down and rebuilt them on the same framework, only with newer features, like adjustments for where the game was played. For a primer on how Heals work, go here: http://www.bangornorth.com/how-to-calculating-those-mysterious-heal-points/
There’s a very important rule for anything related to the model: We won’t use anything that can’t be equally applied to every team in the state.
The old Preliminary Index system (PI) is the biggest flaw of the Heals. The points are solely based on the class of the team you beat, which is essentially the same thing as basing it on the price of gas in that town. There’s not a single basketball reason for your PI. Yes, you get points for a win, but you get the same number of points for beating the best team in your class as you do for beating the worst. That’s absurd.
Luckily, it doesn’t directly affect your final rating, so that allows some liberties.
The new 5-class system has a weight of 2 points between classes, but we’re keeping the old 5-point difference, with a twist. Each team is placed on a 10-point range, based on its overall ranking (it’s a bit circular, I know, but since it doesn’t directly affect that rating, it more or less works and it’s the best system I’ve come up with). It looks a little like this:
- AA: 40-50
- A: 35-45
- B: 30-40
- C: 25-35
- D: 20-30
So the best team in Class A is worth the same as an average AA school. The worst team in B is worth the same as the best team in class D. And so on.
This is the part that slows down the Excel file the most.
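Here is a rough sketch of the range idea in Python. The exact placement rule isn't spelled out on this page, so the straight-line mapping and the `strength` input below are assumptions for illustration only.

```python
# Class ranges from the list above: (low, high) opponent weight.
CLASS_RANGES = {
    "AA": (40, 50),
    "A": (35, 45),
    "B": (30, 40),
    "C": (25, 35),
    "D": (20, 30),
}


def opponent_weight(school_class, strength):
    """Place a team on its class's 10-point range.

    `strength` is a hypothetical 0.0-1.0 value (0 = worst team in the
    class, 1 = best). The linear mapping is an assumption, not the
    site's actual formula.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    low, high = CLASS_RANGES[school_class]
    return low + strength * (high - low)


print(opponent_weight("A", 1.0))   # best team in Class A -> 45.0
print(opponent_weight("AA", 0.5))  # average AA school    -> 45.0
```

Under that assumption, the overlaps described above fall out directly: the top of one range meets the middle of the range above it.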
To that we then apply a Pythagorean Expectation, which was invented by Bill James and adapted for basketball by Daryl Morey. It’s used by a lot of rating systems in a lot of sports.
Win% = (Points Scored)^x / ((Points Scored)^x + (Points Allowed)^x)
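As a minimal sketch, the formula translates to a few lines of Python. The exponent x isn't given on this page, so it's left as a required parameter, and the values tried in the loop are arbitrary picks meant only to show how much the exponent matters.

```python
def pythagorean_win_pct(points_scored, points_allowed, exponent):
    """Win% = PS^x / (PS^x + PA^x), returned as a fraction between 0 and 1."""
    if points_scored < 0 or points_allowed < 0:
        raise ValueError("scores cannot be negative")
    scored = points_scored ** exponent
    allowed = points_allowed ** exponent
    if scored + allowed == 0:
        raise ValueError("at least one team must have scored")
    return scored / (scored + allowed)


# A 41-38 final score under a few arbitrary exponents.
for x in (2, 8, 14):
    print(x, round(pythagorean_win_pct(41, 38, x), 3))
```

The bigger the exponent, the more a given margin of victory moves the Win% away from 50%.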
So if you play a team in Class B with a weight of 33 (to pick a random number) and you get a Win% of 82%, your PI will be credited 27.06. Beat a better team (say, 37) with the same Win% and you get 30.34. The credits from every game are then added together and divided by the number of games on the schedule.
That’s your mPI.
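The arithmetic can be sketched like this. The function names are made up for the example, and the two-game "schedule" simply reuses the two opponents from the paragraph above.

```python
def game_credit(opponent_weight, win_pct):
    """Credit for one game: the opponent's weight times your Win%."""
    return opponent_weight * win_pct


def mpi(games):
    """Average credit over a list of (opponent_weight, win_pct) games."""
    if not games:
        raise ValueError("need at least one game")
    return sum(game_credit(w, p) for w, p in games) / len(games)


# Hypothetical two-game schedule built from the numbers in the text.
schedule = [(33, 0.82), (37, 0.82)]
for weight, pct in schedule:
    print(round(game_credit(weight, pct), 2))  # 27.06, then 30.34
print(round(mpi(schedule), 2))                 # 28.7
```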
You adjust for Home Court Advantage?
The model also adjusts each game for home court advantage. There are a lot of theories on how much weight HCA has, and it probably varies from location to location. Some gyms are just harder to play in than others and some fan bases travel much better than others. But since the 2014-15 season, home court has been equivalent, on average, to a 5% bump in your Win Expectation state-wide. At the outer ends of the range of Win%, the 5% bump starts to disappear. Simply put, you can’t get a Win% of 105% just because you blew someone out on the road. For games played at neutral locations, there is no home court advantage.
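A simple way to picture the adjustment is to take 5 points off the home team's Win% and hand them to the road team. How the bump actually fades at the extremes isn't described here, so the sketch below just clamps the result to the 0-100% range. Treat that clamp as an assumed stand-in, not the model's real taper.

```python
HOME_COURT_BUMP = 0.05  # the state-wide average quoted above


def adjust_for_site(home_win_pct, neutral_site=False, bump=HOME_COURT_BUMP):
    """Return (home, away) Win% after the home court adjustment.

    The home team's raw Win% is reduced by the bump and the road team
    gets the remainder. Clamping to 0-1 is an assumption standing in
    for the way the bump disappears at the extremes.
    """
    if not 0.0 <= home_win_pct <= 1.0:
        raise ValueError("home_win_pct must be between 0 and 1")
    if neutral_site:
        home = home_win_pct
    else:
        home = min(1.0, max(0.0, home_win_pct - bump))
    return home, 1.0 - home


home, away = adjust_for_site(0.647)
print(round(home, 3), round(away, 3))  # 0.597 0.403
```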
What about travel time?
We’ve started tracking travel time to see if it has an impact on winning, but haven’t come up with anything conclusive yet. We’ve also started tracking the relative rest of each team to see if that has an impact. As soon as we can work that in, we’ll do it.
Let’s look at a sample game.
In the opening game of the 2018-19 season, the Scarborough girls beat the Sanford girls 41-38 at home. First, we calculate the Win% for Scarborough, which on a neutral floor would have equated to 64.7%, meaning that based on only the score of this one game, we would assume that Scarborough would win a re-match 64.7% of the time. But they were at home, so this adjusts down to 59.7%. Naturally, Sanford gets the remaining 40.3%. So as the season progressed, each team continued to get credit for that percentage of the value of that opponent, plus a bonus for Sanford for playing on the road. It ended up looking like this:
- Scarborough earns 59.7% of Sanford’s 14.723 (8.796)
- Sanford earns 40.3% of Scarborough’s 34.508 (13.891)
So despite losing, this game actually is worth more for Sanford, and it should be. Losing by 3 points on the road to the eventual state runner-up is nothing to sneeze at. The line for this game was Scarborough by 17.9 on game day and 16.3 at the end of the year.
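To check the arithmetic on that game, here is a short sketch using the rounded percentages and end-of-season opponent values listed above. The function name is made up for the example.

```python
# Figures copied from the sample game above.
SCARBOROUGH_WIN_PCT = 0.597  # after the home court adjustment
SANFORD_WIN_PCT = 0.403
SANFORD_VALUE = 14.723       # what Sanford was worth as an opponent
SCARBOROUGH_VALUE = 34.508   # what Scarborough was worth as an opponent


def credit(win_pct, opponent_value):
    """Share of the opponent's value earned from one game."""
    return win_pct * opponent_value


scarborough_credit = credit(SCARBOROUGH_WIN_PCT, SANFORD_VALUE)
sanford_credit = credit(SANFORD_WIN_PCT, SCARBOROUGH_VALUE)
print(f"Scarborough: {scarborough_credit:.3f}")
print(f"Sanford:     {sanford_credit:.3f}")
```

The results land within a couple of hundredths of the 8.796 and 13.891 listed above. The small gap presumably comes from the percentages being rounded to one decimal for display.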
So the model also creates a prediction of each game, translating it to a percentage. In the Scarborough/Sanford example, we expect Scarborough to win that game 87.6% of the time. The model then takes that percentage, adjusts for the number of points each team is expected to score, and creates a standard point spread. YOU SHOULD NOT BET ON THIS POINT SPREAD.
You also shouldn’t bet against it. It is very rarely wrong. In 2018-19 girls games where the model had a projection of 80% or higher, the favored team went 453-5. That means it was right 98.9% of the time. That’s been pretty consistent over the life of the model.
Is this really that accurate?
Yes. It’s actually too accurate, but I haven’t figured out how to fix it. Generally speaking, if a model says you should win 80% of the time, 20% of the time you should lose. But when the model says you should win 80% of the time, you win ~99% of the time. It’s too humble, which is certainly better than being wrong all the time. You know, like pundits on TV.
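The "too humble" problem is a calibration check: take every game where the model gave the favorite at least some percentage, then see how often the favorite actually won. Here is a minimal sketch that rebuilds the 453-5 record quoted above. The individual game projections aren't listed on this page, so 0.80 is a placeholder for each one.

```python
def observed_win_rate(results, low, high=1.0):
    """Share of favorites that won among games projected in [low, high].

    `results` is a list of (projected_win_pct, favorite_won) pairs.
    Returns None if no games fall in the range.
    """
    bucket = [won for pct, won in results if low <= pct <= high]
    if not bucket:
        return None
    return sum(bucket) / len(bucket)


# 2018-19 girls games projected at 80% or higher: favorites went 453-5.
# 0.80 is a placeholder projection, not real per-game data.
results = [(0.80, True)] * 453 + [(0.80, False)] * 5
print(round(observed_win_rate(results, 0.80) * 100, 1))  # 98.9
```

A perfectly calibrated model would show an observed rate close to the average projection in the bucket, not well above it.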
I saw something about Tourney Odds?
We then use those projections to predict the final standings. Using a number of Monte Carlo simulations, we simulate the rest of the season hundreds of times and create a range of possible outcomes. As the season progresses, this range tightens until you get locked into a certain final spot in the standings. Once again, this is very, very accurate. In the 2018-19 season, the model started projecting the tournament field with a month left in the season. It got 94.9% of them right.
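For a feel of how the simulation step works, here is a stripped-down sketch for a single team. The real thing simulates every team and sorts out the standings; this version only tallies one team's final win total. The team, its record, and all but one of the win percentages are hypothetical.

```python
import random
from collections import Counter


def simulate_final_wins(current_wins, remaining_win_pcts, runs=500, seed=None):
    """Play out the remaining games `runs` times.

    Returns {final_win_total: share of simulations that ended there}.
    """
    rng = random.Random(seed)
    totals = Counter()
    for _ in range(runs):
        # Each remaining game is a weighted coin flip.
        wins = current_wins + sum(rng.random() < p for p in remaining_win_pcts)
        totals[wins] += 1
    return {wins: count / runs for wins, count in sorted(totals.items())}


# Hypothetical team: 10 wins so far, four games left.
odds = simulate_final_wins(10, [0.876, 0.60, 0.45, 0.30], runs=500, seed=1)
for wins, share in odds.items():
    print(f"{wins} wins: {share:.1%}")
```

As games get played, the list of remaining games shrinks and the spread of possible totals narrows, which is the tightening described above.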
That’s really accurate, right?
It’s the most accurate projection model I’ve ever found, in any sport. At any level. If you find a more accurate one, let me know.
During the COVID bubble, I made a version of this for the NBA and it out-performed Nate Silver's model by a pretty wide margin.
What about Player Stats?
Oh man, would I love to include that. Again, the rule here is that I have to be able to apply something to all teams equally. So I’d need full, reliable stats from every team in the state. Some schools are not super-honest in their stat tracking. Others can’t even be bothered to report scores in an accurate manner. So player stats would probably have to come from a neutral third party and I’m not sure I see that happening any time soon.
But! We do a little bit of player stats for the tournament and it’s fascinating to see how ill-informed the narratives of those games are, even though everyone in the state is sitting and watching the same games. Even something as simple as the MVP of a region is often hilariously wrong.
But what if you could?
Let’s say that happened. We could have advanced stats here for every player in the state, which would certainly help the players who have aspirations of playing at the next level. We could also factor in things like pace into the Power Rankings, which would make them even more accurate. We could actually factor in injuries!
Right now, we can’t include injuries because we don’t know exactly how important a player is. Sure, if the eventual Mr. Basketball misses a game, that’s pretty big, but how big? Is he worth 20 points a game? 22? More? How big is the drop-off from him to his backup? What about the 6th man on the other team who missed the game with the flu? How much is he worth?
It very quickly becomes a rabbit hole. But with a complete box score from every game, we could do it. It’d take some work, but it’d be doable.
This is really a whole thing.
Trust me. I’m aware.
I also go to a lot of games and would like to help provide information
Shoot me an email: email@example.com
What’s your math background?
Well, I took AP Calculus back in high school, so I’m not entirely without education, but I’m self-taught. I’ve never taken a Stats class. I taught myself Excel in Windows 3.1 in order to win a fantasy baseball league. I don’t play video games. I don’t work on cars. I think of problems like, “who’s better, a team from the county or one of the midcoast KVAC teams?” And then I try to figure it out.
Actually, this started back in the 4-class system with me listening to the debates over whether or not a then-undefeated Old Town team was better than Winslow, Medomak Valley, and Gardiner teams that had 1 or 2 losses, but had seemingly played tougher schedules. I figured it would take me an hour. It was a little more complicated than that. And just like those guys who constantly tweak their cars to get that extra little bit of performance out of them, I keep coming up with new ideas, which usually means teaching myself a new trick in Excel.
What’s your angle?
I like data. I want high school basketball to be better. I also think that a night of high school basketball is a better entertainment value than a night at the movies.
I have a firm belief that Maine high school sports are in dire need of information and transparency, things that are sorely lacking from schools and the Maine Principals’ Association. I think the powers-that-be in Maine have retarded the development and progress of high school sports, often with mistruths and lazy narratives. Sometimes with flat-out lies. We owe it to our student athletes to do so much better. That belief drives the vast majority of this site’s editorial content.
This has naturally created some enemies at the Maine Principals’ Association. The Executive Director even blocked me on Twitter, which might be a violation of the First Amendment.
Who are you?
I come from a basketball family. I won a state title in high school and did play-by-play for a college team. I covered a National Championship game. But I’m just a guy who has no problem leaving work early in order to make it in time for the freshman game. I watch a LOT of basketball. Maybe too much basketball. My wife is very understanding.
I’ve written for numerous national publications. My work has been seen by hundreds of millions of people and featured in places like The Washington Post, The Daily Beast, BuzzFeed News, the Portland Press Herald, The Huffington Post, the Bangor Daily News, and a bunch of other places I’ve since forgotten. I write this under a pseudonym.
I definitely root for my hometown team, but I keep that to myself, as it has zero impact on the model. I’ll often root for the model to be wrong, but it usually isn’t.
Can I advertise on your site?
Sure. If I wanted to reach the type of Mainers who love high school basketball for my business, I’d advertise here. The people who read this site are obsessed like you’d never believe. Email me: firstname.lastname@example.org
I work for a newspaper/TV Station/etc and would like to use the numbers I find here and/or on your social media pages without giving you any credit. Is that ok?
If you’d like to use data found here, please credit MaineBasketballRankings.com, unless it is for the Sun Journal, and then still no (they know why).
Didn’t you use to be on that forum?
Yeah. I totally did. And it created tens of thousands of page views for that site. Funny story…
Can I help?
Sure! Tell your friends. If you have something you want to talk about that’s math-related, let me know and we’ll do a guest post thing.
Will you add football/baseball/etc?
I’ve toyed with the idea of football. If there’s enough interest, I might do it. Baseball and softball are trickier. The pitcher is such an important factor in a team’s Win Expectation that we’d probably need that information in order to get reliable numbers, and I’m not sure that’s going to be forthcoming.