KansasSooner
Aces & eights
- Joined: Apr 18, 2010
- Location: Tumbleweed and Sagebrush country
> Probability of what?

The probability of winning based on the last results of winning against similar competition.
> Probability of what?

Probability of rustling jimmies.
> The probability of winning based on the last results of winning against similar competition.

So the B1G winning against basically all of the quality opponents on their OOC schedule means the probability is low of them beating quality opponents?
> So the B1G winning against basically all of the quality opponents on their OOC schedule means the probability is low of them beating quality opponents?

You have to remember that some of that probability is based on last year's results. It will change in a couple of weeks, when last year is no longer a factor.
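The fade-out scheme described above — last year's results carrying weight early, then dropping out entirely after a few weeks — can be sketched as a simple linear blend. The weights and the fade-out week here are illustrative assumptions, not any published system's actual schedule:

```python
def blended_rating(prior_rating, current_rating, week, fade_week=6):
    """Blend a preseason (last-year) rating with a current-season rating.

    The prior's weight decays linearly to zero by `fade_week`; after that,
    only current-season results matter. All weights are hypothetical.
    """
    w_prior = max(0.0, 1.0 - week / fade_week)
    return w_prior * prior_rating + (1.0 - w_prior) * current_rating

# Week 0: all last year. Week 3: half and half. Week 6+: last year is gone.
for week in (0, 3, 6, 9):
    print(week, blended_rating(prior_rating=20.0, current_rating=10.0, week=week))
```

This is why early-season computer rankings can look detached from the current season's games: the prior still carries most of the weight.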
> You have to remember that some of that probability is based on last year's results. It will change in a couple of weeks when last year is no longer a factor.

Which is why I say his rankings are idiotic. What's the point of putting out rankings 3 weeks into the season that aren't based on those 3 weeks of the season?
> Which is why I say his rankings are idiotic. What's the point of putting out rankings 3 weeks into the season that aren't based on the 3 weeks of the season?

You are asking the wrong person. I was never a fan of Bayesian statistics, as they rely too much on past occurrences, which may be totally irrelevant in actual predictive models. Yes, there is a method behind them, but I prefer more data-based methods; the past is not an indicator of the future... or so say most prospectuses that sell you mutual funds...
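The objection above — that a Bayesian model lets last year's data steer this year's ratings — can be illustrated with a toy conjugate-normal update, where last year's rating acts as the prior and each game margin is evidence. All numbers and variances here are invented for illustration, not any published system's:

```python
def normal_update(prior_mean, prior_var, obs, obs_var):
    """Posterior of a normal mean after one normal observation."""
    k = prior_var / (prior_var + obs_var)   # how far the new game moves us
    post_mean = prior_mean + k * (obs - prior_mean)
    post_var = (1.0 - k) * prior_var
    return post_mean, post_var

mean, var = 12.0, 25.0              # prior: last year's rating, fairly uncertain
for margin in (-3.0, 5.0, -7.0):    # this year's observed margins vs. average foes
    mean, var = normal_update(mean, var, margin, obs_var=100.0)
    print(round(mean, 2), round(var, 2))
```

Note how, three games in, the posterior is still pulled well toward the 12.0 prior even though every observed margin this year is far below it — exactly the "past occurrences" dependence being criticized.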
> So basically these ratings are complete trash based off projections from last year's results. The only thing these rankings are good for is the SEC nuthuggers/B1G haters.

Good luck on the first two bolded; the official NCAA website doesn't list those, although it does list about 36 different categories of data for teams, which I have used to model with multiple regression. The model, though unbiased, still doesn't account for home/away/neutral games (someone did do a study of this, though I believe it was for pro sports, and the effect was shown to be slightly insignificant except in baseball), nor injuries (which, given today's climate, are misreported or not given at all). To say the least, I have 3 different models, and though all 3 are close, there are differences beyond average teams that I have yet to account for. Above 20 or below 100, the teams give vastly different results for the three models. That should be expected, though, as regression tries to fit the average the best.
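As a rough illustration of the multiple-regression approach described above, one can fit per-team stats to scoring margin with ordinary least squares and then inspect the residuals for the error analysis mentioned. The stat categories, values, and target below are invented, not actual NCAA data:

```python
import numpy as np

# Hypothetical per-team stats (rows = teams). Columns might be, e.g.,
# yards/play on offense, yards/play allowed, turnover margin — the real
# NCAA site lists ~36 such categories.
X = np.array([
    [6.8, 4.9,  0.8],
    [5.9, 5.5,  0.1],
    [5.1, 6.2, -0.5],
    [6.2, 5.0,  0.4],
    [4.8, 6.5, -0.9],
])
y = np.array([14.0, 3.0, -8.0, 7.0, -12.0])   # average scoring margin (made up)

# Ordinary least squares: minimize ||X1 @ b - y||^2, with an intercept column.
X1 = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(X1, y, rcond=None)

pred = X1 @ coef
residuals = y - pred   # skew or outliers in the data show up here
print(coef)
print(residuals)
```

The residuals are where the "skewed or missing data" problem surfaces: with an intercept, OLS residuals always sum to zero, so it is their spread and asymmetry, not their total, that flags a bad fit.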
Give me rankings based off SOS and winning %, which also take into account injuries, suspensions, home/away/neutral games, and margin of victory. If anyone can find something like this that's been fact-checked, then that would be more realistic than this rubbish.
Which is retarded.
> Sagarin is the very reason the computers went away from the equation a long time ago ...

Actually, Sagarin was the reason computers were ever included in the first place.
> That's because of how much of his ratings are still based on last year's results. It will change a lot once only this year's results are used.

This is correct. Sagarin (as are all of the computers) is still relying heavily on his initial bias. This is a mathematical necessity, as there are still too few connections through the schedule to compare every team to every other team. So what you end up with are several islands or clusters of teams that are connected to one another, but not to the other islands.
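The "islands" point can be made concrete: until enough cross-scheduling happens, the graph of games played splits into disconnected components, and teams can only be compared within their own component. A small union-find sketch over made-up matchups:

```python
# Early-season schedules often form disconnected "islands": groups of teams
# with no common opponents. Union-find over hypothetical games exposes them.
def find(parent, x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]   # path compression
        x = parent[x]
    return x

def components(n_teams, games):
    parent = list(range(n_teams))
    for a, b in games:
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:
            parent[ra] = rb             # merge the two islands
    groups = {}
    for t in range(n_teams):
        groups.setdefault(find(parent, t), []).append(t)
    return list(groups.values())

# Teams 0-5: games so far connect {0,1,2} and {3,4,5}, but no game crosses
# between the two groups, so they cannot be compared to each other yet.
games = [(0, 1), (1, 2), (3, 4), (4, 5)]
print(components(6, games))
```

Until a game bridges two islands, any cross-island comparison in the ratings is coming from the initial bias, not from results.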
> Sagarin (as are all of the computers)

Only those that use strictly Bayesian statistics or otherwise import last year's data into empirical models. My method is strictly dependent on current data, and thus really useless until week 6 or 7, sometimes even 8; otherwise the data is either incomplete or not normally distributed, either of which makes the underlying principle of regression useless. Skewed or missing data is not worth a damn in regression analysis, though when it exists it can be found easily in the error analysis of the model.
> You just stop using the previous year's data after week 5 or 6 and end up with the same thing as waiting for week 6 or 7 to put out results.

Adjust what, though? Nothing from last year is relevant to this year as far as doing a true regression model. Weighted averages are better than using last year's data, in my opinion, and even those are hard to use given the number of cupcakes scheduled the first few weeks. No, for true regression models I'll wait until the data is complete and normal before even attempting to make a rating model.
I stopped using previous years' data when all teams (or at least most) had 5 games.
I had a choice: either don't do anything for multiple weeks, or use previous years' data and get pretty good results that adjust as needed.
They're based on probability.
Probability of what?
> The probability of winning based on the last results of winning against similar competition.

Actually, Sagarin's ratings are not based on probability at all. His ratings are based on a least squares fit of the scores of games that have been played (plus his initial bias, which is weighted smaller and smaller until he finally takes it out).
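A least-squares fit of game scores of the kind described is often sketched like this (a Massey-style toy under my own assumptions, not Sagarin's actual method; teams and scores are invented): each game contributes an equation saying the rating difference should roughly equal the score margin.

```python
import numpy as np

teams = ["A", "B", "C", "D"]
# Each game: (home_index, away_index, home_score, away_score) — all made up.
games = [(0, 1, 31, 17), (1, 2, 24, 20), (2, 3, 28, 10),
         (0, 3, 35, 14), (0, 2, 21, 17)]

n = len(teams)
M = np.zeros((len(games) + 1, n))
d = np.zeros(len(games) + 1)
for row, (i, j, si, sj) in enumerate(games):
    M[row, i], M[row, j] = 1.0, -1.0    # rating_i - rating_j ≈ score margin
    d[row] = si - sj
M[-1, :] = 1.0                          # pin the ratings' sum to 0, since the
d[-1] = 0.0                             # difference equations alone can't fix a level

ratings, *_ = np.linalg.lstsq(M, d, rcond=None)
for name, r in sorted(zip(teams, ratings), key=lambda t: -t[1]):
    print(f"{name}: {r:+.2f}")
```

When the margins are inconsistent (here A beat C by only 4 while the A–B and B–C results imply 18), least squares simply splits the disagreement; no probabilities are involved anywhere.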
> Give me rankings based off SOS and winning %, which also take into account injuries, suspensions, home/away/neutral games, and margin of victory. If anyone can find something like this that's been fact-checked, then that would be more realistic than this rubbish.

Why not create one?
> Actually, Sagarin's ratings are not based on probability at all. His ratings are based on a least squares fit of the scores of games that have been played (plus his initial bias, which is weighted smaller and smaller until he finally takes it out).

A least squares method? That's news to me; I read he used Bayesian methods, which is not least squares at all. And if he is using least squares, then he should know the limitations I mentioned above and should not even consider publishing results until all the data in the current model is this year's data, and normal to boot. Otherwise it's just a skewed, biased result and is worth nothing.
> A least squares method? That's news to me; I read he used Bayesian methods, which is not least squares at all. And if he is using least squares, then he should know the limitations I mentioned above and should not even consider publishing results until all the data in the current model is this year's data, and normal to boot. Otherwise it's just a skewed, biased result and is worth nothing.

In fairness, he may have changed how he does things. My comments were based on his descriptions of his ratings from the early '90s, so all I really know is what he did back then. I just assumed it was still the same.
> In fairness, he may have changed how he does things. My comments were based on his descriptions of his ratings from the early '90s, so all I really know is what he did back then. I just assumed it was still the same.

I think I read he did use a least squares method for the older data, but for current years (after he really started ranking current teams) he went to a Bayesian method. Not sure when he switched to Bayesian models, but I do think you're correct about his least squares for the really old data.
> he should know the limitations ...

Far be it from me to speak for Jeff Sagarin, but I am willing to bet he understands the limitations pretty well.