Okay folks, data crunching time.
First, I spent about 5 hours rearranging Jeff's data into a form Excel can do analysis on. Jeff, I beg of you, please use Google Survey for the follow-up.
Second, the naysayers about having enough data points are correct - not everyone answered each question, and some results are not statistically significant because of the sample size. We should do this again, and get more respondents. Jeff, I beg of you, please use Google Survey for the follow-up...
Now I need to explain some things about data analysis.
- I can't crunch non-numeric data in Excel (Excel just can't do it. MATLAB or Minitab might, but I have Excel). So, I had to create a key for each multiple-choice question. In general, I coded the tightest restriction among the choices as 1, no-change (say, to air-speed) as 3, and the most dev effort/biggest/most liberal change to the game as 5 or 6. Further, I had to carefully place 0 where there was no response from a user, and 1 in some cases where the lack of a response came from the way Jeff created the question (e.g. 'There should be a leaderboard' maps to 2, and 'There should not be a leaderboard' maps to 1).
This means I have to be VERY careful with the output, to make sure I sort the input data on non-0 values, then delete the 0-value rows from the data set.
- Excel 2007 could not do multiple regression on more than 16 columns of input to the output variable (limited to 16 x's to 1 y). I will use Excel 2013 if I ever put my Windows drive back into my other laptop (it's currently Linux).
- Because of the large number of questions, the inversion of the dependent variables, and the two points above, it'll take a few days to answer all the questions (such as 'is the desire to hold-steady/slightly increase ground speed heavily dependent upon/correlated with MMR, and is that statistically significant for the larger population as a whole?' BTW, 'yes').
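For anyone who wants to redo the coding step outside Excel, here is a rough sketch of the idea in Python. The answer labels, the column names and the three sample rows are all made up for illustration; they are not the real survey key or data. It just shows the pattern: map each choice to a number, use 0 for no response, then drop the 0 rows before analysing that question.

```python
# Hypothetical key for one multiple-choice question (labels are invented,
# not the real survey wording): 1 = tightest restriction, 3 = no change,
# 5 = biggest change.
GROUND_SPEED_KEY = {
    "decrease a lot": 1,
    "decrease slightly": 2,
    "no change": 3,
    "increase slightly": 4,
    "increase a lot": 5,
}
NO_RESPONSE = 0  # placeholder code for a blank or unrecognised answer


def encode(answer, key):
    """Return the numeric code for a text answer, or 0 if it is blank/unknown."""
    if answer is None:
        return NO_RESPONSE
    return key.get(answer.strip().lower(), NO_RESPONSE)


def drop_missing(rows, column):
    """Keep only the rows that have a real (non-0) answer in the given column."""
    return [row for row in rows if row[column] != NO_RESPONSE]


# Made-up responses, purely to exercise the functions.
responses = [
    {"mmr": 2100, "ground_speed": "No change"},
    {"mmr": 1500, "ground_speed": ""},
    {"mmr": 2600, "ground_speed": "Increase slightly"},
]

encoded = [
    {"mmr": r["mmr"], "ground_speed": encode(r["ground_speed"], GROUND_SPEED_KEY)}
    for r in responses
]
usable = drop_missing(encoded, "ground_speed")
```

Filtering per question (rather than throwing out a respondent entirely) keeps as many data points as possible, which matters with a sample this small.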
So... first some data key/tables:
Some summary stats on the users/ping/region
and a histogram of the MMR of the responses, representative of the forum respondents:
Next, I did the big-daddy regression (well, as far as I could, given that 16-column limitation). I was looking for multicollinearity or an inverse cause-effect (High MMR is driven by the desire to have shorter TTK, things like that). I didn't find inverse cause/effect, but did find the expected multicollinearity. But...
That p-value for Ping tells me that it COULD be significant with a simpler model. Note that the R-square is high, and the p-value is vanishingly small. This says the result is significant: there is a significant correlation between MMR and ping.
Every 1 ms increase in ping essentially corresponds to a 22.8-point increase in MMR. Not what you'd expect, huh?
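If you want to see how that slope is read, here is a minimal least-squares sketch in plain Python. The ping/MMR numbers below are synthetic: I built them to sit exactly on a line with slope 22.8 so the arithmetic is easy to follow. They are not the survey data, and the intercept of 1000 is just an assumption for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Synthetic data only: ping in ms, MMR generated as 1000 + 22.8 * ping.
ping_ms = [20, 40, 60, 80, 100]
mmr = [1000 + 22.8 * p for p in ping_ms]

slope, intercept, r_squared = fit_line(ping_ms, mmr)
# slope is the change in MMR per 1 ms of ping; r_squared is the share of
# the variation in MMR that the fitted line accounts for.
```

The slope only describes how the two move together in the sample. It does not say that raising your ping will raise your MMR.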
Okay, the last bit (and what will take the longest to do for 23 independent variables/survey questions) is in the form of 'is the median ground speed answer from the community significant, and is the answer controlled by higher-MMR players?'. This is thorny - because Reloaded has to know/decide if they want to cater to us here reading this (we're pretty high MMR, see that histogram above), or the general 'casual player'.
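One quick way to eyeball the 'controlled by higher-MMR players' part, before running a regression, is to split the respondents at some MMR cutoff and compare the median answer in each group. A sketch, with a hypothetical cutoff of 2000 and made-up rows (answers use the 1-5 style coding described earlier):

```python
from statistics import median


def median_by_mmr_group(rows, threshold):
    """Return (median answer below threshold, median answer at/above it).

    Both groups must be non-empty, otherwise statistics.median raises.
    """
    low = [r["ground_speed"] for r in rows if r["mmr"] < threshold]
    high = [r["ground_speed"] for r in rows if r["mmr"] >= threshold]
    return median(low), median(high)


# Made-up rows for illustration; 0 (no response) rows already removed.
rows = [
    {"mmr": 1200, "ground_speed": 3},
    {"mmr": 1500, "ground_speed": 3},
    {"mmr": 1800, "ground_speed": 4},
    {"mmr": 2200, "ground_speed": 4},
    {"mmr": 2500, "ground_speed": 5},
    {"mmr": 2900, "ground_speed": 5},
]

low_median, high_median = median_by_mmr_group(rows, threshold=2000)
```

A gap between the two medians is only a hint; whether it is significant still needs the regression (or another test) on the real data.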
So, tonight's answer:
YES, this is significant, and yes, it's slightly related to MMR. For every 1000 points you go up in MMR, you're more likely to want increased ground speeds. Here's the distribution - what the regression is saying is that it's the higher-MMR players who want to #increasethespeeds:
Enough for tonight. I can do some of the other 23 in the coming days...