Correlating the Categories

The other day, Patrick suggested something (in response to the post comparing category distributions) I’ve been thinking about doing for a while too: Generating correlations between all the roto stat categories. This way, you could see which categories “go well” together, for the purposes of fine-tuning your team.

Without any further ado, here it is (based, of course, on the RotoPoll data):

              FG      FT  Points     Reb     Ast  Threes  Steals  Blocks Overall
FG
FT          0.18
Points      0.35   -0.26
Rebounds    0.86    0.15    0.35
Assists     0.34    0.53   -0.34    0.20
Threes      0.49   -0.30    0.25    0.37   -0.29
Steals      0.31   -0.45    0.35    0.33   -0.42    0.27
Blocks      0.50   -0.26    0.15    0.30   -0.13    0.52    0.20
Overall     0.18    0.41   -0.43   -0.01    0.67   -0.33   -0.43   -0.18

For those not familiar with the concept, a correlation is a statistical measure of how closely two sets of data move together, on a scale from -1 to 1. In the case of basketball stats, for example, two categories will have a high correlation if players who do well in one of them also tend to do well in the other. A negative correlation suggests that players doing well in one category tend to do poorly in the other. A correlation around zero suggests there’s little or no (linear) relationship between the two.
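For anyone who wants to see the arithmetic, here is a minimal sketch of the standard Pearson correlation coefficient in Python. I'm assuming that's the flavor of correlation in the table, which the post doesn't spell out. The function name `pearson` and the five players' numbers are made up purely for illustration; they are not RotoPoll data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the mean (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Made-up numbers for five imaginary players (NOT RotoPoll data).
fg_pct = [0.55, 0.50, 0.47, 0.44, 0.41]
rebounds = [11.2, 9.0, 6.5, 4.1, 3.0]
ft_pct = [0.60, 0.68, 0.75, 0.82, 0.90]

print(round(pearson(fg_pct, rebounds), 2))  # positive: the two rise together
print(round(pearson(fg_pct, ft_pct), 2))    # negative: one rises as the other falls
```

Running the same function over every pair of category columns is what would produce a triangular table like the one above.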

OK, so what does this mean? To help demonstrate, I’ve highlighted the best (FG and rebounds, at 0.86) and worst (FT and steals, at -0.45) correlations. If your team needs to improve in both FG% and rebounds, the good news is you’ll be able to find a lot of players who can help you in both. If you need FT% and steals, however, you’re going to have a much harder time finding someone.

This data may also be useful if you’re considering tanking (giving up in) a category. For example, if you intend to tank FG%, it’ll be hard to do it without taking a hit in rebounds as well.
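The lookups described in the last two paragraphs can be sketched in a few lines of Python. The correlation values below are copied from the table in this post; the names `CATEGORIES`, `LOWER_TRIANGLE`, `pairs` and `partners` are just illustrative choices of mine, and this is one possible way to slice the numbers rather than a definitive tool.

```python
# Category-to-category correlations copied from the table (Overall row left out).
CATEGORIES = ["FG", "FT", "Points", "Rebounds", "Assists", "Threes", "Steals", "Blocks"]
LOWER_TRIANGLE = {
    "FT":       [0.18],
    "Points":   [0.35, -0.26],
    "Rebounds": [0.86, 0.15, 0.35],
    "Assists":  [0.34, 0.53, -0.34, 0.20],
    "Threes":   [0.49, -0.30, 0.25, 0.37, -0.29],
    "Steals":   [0.31, -0.45, 0.35, 0.33, -0.42, 0.27],
    "Blocks":   [0.50, -0.26, 0.15, 0.30, -0.13, 0.52, 0.20],
}

# Flatten into {pair: r}; frozenset keys make the lookup order-independent.
pairs = {}
for row, values in LOWER_TRIANGLE.items():
    for col, r in zip(CATEGORIES, values):
        pairs[frozenset((row, col))] = r

best = max(pairs, key=pairs.get)   # easiest pair to improve together
worst = min(pairs, key=pairs.get)  # hardest pair to improve together
print(sorted(best), pairs[best])
print(sorted(worst), pairs[worst])


def partners(category):
    """Other categories, sorted from most to least correlated with `category`."""
    found = []
    for pair, r in pairs.items():
        if category in pair:
            (other,) = pair - {category}
            found.append((other, r))
    return sorted(found, key=lambda item: item[1], reverse=True)


# If you tank FG%, the categories at the top of this list are the ones
# most likely to take a hit along with it.
for other, r in partners("FG"):
    print(f"FG vs {other}: {r:+.2f}")
```

The top of the `partners("FG")` list is rebounds, which is the tanking caveat above in numeric form.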

Another interesting thing to observe is the correlation between categories and overall value. Two categories, FT (0.41) and assists (0.67), stand out as being the most correlated with overall value. I’m not sure, but this seems to suggest that those categories are more valuable in some way. For one thing, it means that it’s harder to find low-value players who can help you in those categories, as opposed to, say, steals or points, where the correlation with overall value is actually negative (-0.43 for both). So hold on to Steve Nash.
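To make that ordering explicit, here is a small Python sketch that sorts the categories by their correlation with overall value. The numbers are the Overall row of the table; the variable names are hypothetical.

```python
# Correlation of each category with overall value (the "Overall" row of the table).
overall_corr = {
    "FG": 0.18,
    "FT": 0.41,
    "Points": -0.43,
    "Rebounds": -0.01,
    "Assists": 0.67,
    "Threes": -0.33,
    "Steals": -0.43,
    "Blocks": -0.18,
}

# Highest correlation first; ties keep their original order.
ranked = sorted(overall_corr.items(), key=lambda item: item[1], reverse=True)
for category, r in ranked:
    print(f"{category:<9}{r:+.2f}")

# Categories where strong players tend to have LOWER overall value.
negative = [category for category, r in ranked if r < 0]
print(negative)
```

Keep in mind that a ranking like this only describes how the categories line up with overall value among the players analyzed; it does not by itself show that any one category is worth more.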

And of course, if anyone knows stats well enough to suggest or criticize something here, by all means, go ahead.

8 Responses to Correlating the Categories

  1. AJ says:

    Surprising that both Reb&3s and Reb&Stl correlate better than Reb&Blk …

  2. Nels says:

    So, next year, I’m drafting a team of the best FT shooters and players with the most assists and the rest will fall into place. Who knew it was so simple?

    Don’t these numbers also indicate that it should be possible to build the elusive “mid-ball” team by focusing on FG%, Pts, Reb, 3s, Blks? 5 out of 8 categories in an H2H league is pretty good…

  3. rotopoll says:

    Well, I’m willing to give that strategy a try. I’d like to break this down by position scarcity too, somehow, so you could see if a PG with FT/assists has a different value than, say, a PF who can give you those categories (which is rare, of course).

    I hadn’t thought of applying it to H2H, but what you say makes sense. Interesting idea.

    And some of the correlations are quite surprising, such as the strong one between threes and blocks. I guess guys like Bargnani, Okur, Dirk, Josh Smith, etc. aren’t as rare as we might have thought. It’s possible the analysis is wrong, but I don’t think so. If anyone wants to double-check my work, please do.

  4. AJ says:

    Positional stats are something I’ve always wanted to look into as well. You would assume a PF would garner more value in the assists category than a PG with the same numbers. It would be really interesting to see the affected values of a rebounding/shot-blocking type guard vs a 3-point shooting/FT% forward. However, I think that analysis should be broken down on an individual cat basis and not in terms of correlation.

  5. Patrick says:

    Wow. The correlation coefficient between rebounds and FG% is huge. You could basically just look at a high-rebounding guy and know his FG% was good. I figured they’d be correlated, but that’s large. The one question I’d have is: what population are you looking at? Is this everybody in the league or some subset of players?

    And AJ makes an interesting point. One method of teasing out positional value would be to run a player rater separately for each individual position, which would probably take some time, but would be cool.

  6. rotopoll says:

    Good idea. Doing a rater for each position would be good. One thing you can do with the current rankings is sort by position, but that’s only slightly helpful.

    The correlations are done with the top 180 players (just like the rankings).

  7. aj says:

    What I’ve always wanted from YAHOO!, or any other fantasy provider, is a team log that would keep track of not only what stats were accumulated from each player but also from each starting position on the team. I think that would give a better glimpse into what to expect from each, as well as a basis for a more accurate judgment of value. While it’s fun to look at what actual players do for you, it just doesn’t represent a real fantasy team, which is always a hodgepodge of hot players and injury fill-ins.

  8. rotopoll says:

    Can I just say, the quality of the suggestions in the comments on this blog is outstanding. It’s somewhat mind-boggling that the big-name fantasy providers haven’t improved their products significantly in many years.
