Thursday, December 31, 2009

Shin Hatsubai 2009 .org stats, now with pretty graphs

One of the neat things about having a stupid amount of videos on the .org, as a stats nerd, is that you get to have your own statistically significant data set to draw conclusions from without being an admin. Here are some numbers (actually, graphs, because this is MAI data and NO U CANT HAZ, make your own 83 local videos) from 2009 and some probably bogus conclusions about them.

[Graph: SH downloads in 2009, by volume]

This is a graph showing SH downloads for the past year by volume; that is, how many actual times somebody pulled each video from the .org servers. This is a fairly typical heavy-tailed distribution, with a moderately broad shoulder. Does the .org look like this, as a whole? I'm not sure, because 83 videos is really not a big enough sample to talk concretely about a population three orders of magnitude larger, but I'm willing to bet it looks something like this.

[Graph: SH downloads in 2009, by weight (MB)]

This here is a graph of SH downloads over the same period by weight (in MB); that is, the actual impact on the .org server of transferring the downloads graphed above. The shoulder is narrower -- if taller -- here, and the tail falls away at a much faster pace. Does the .org look like this? Since I don't have the whole dataset, it's not a firm conclusion, but I'd hazard a guess that it probably does. The number of high-demand videos is very small, and the decay rate off that peak, the last time I looked over the bracketing of the star scale by week, seemed to mostly match this kind of slope.
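For anyone who wants to build the same two views from their own local videos, here is a minimal sketch. It assumes that weight is simply filesize times download count (every download being a complete transfer), which is my simplification and not necessarily how the .org counts it. The `Video` class and the four sample entries are made up for illustration; they are not the SH numbers.

```python
from dataclasses import dataclass


@dataclass
class Video:
    # Hypothetical record; field names are illustrative only.
    title: str
    size_mb: float
    downloads: int

    @property
    def weight_mb(self) -> float:
        # Assumption: every download transfers the whole file.
        return self.size_mb * self.downloads


def by_volume(videos):
    """Videos ordered by download count, most downloaded first."""
    return sorted(videos, key=lambda v: v.downloads, reverse=True)


def by_weight(videos):
    """Videos ordered by total MB transferred, heaviest first."""
    return sorted(videos, key=lambda v: v.weight_mb, reverse=True)


# Made-up sample data, not the real stats.
sample = [
    Video("a", 80.0, 1200),
    Video("b", 25.0, 1500),
    Video("c", 120.0, 300),
    Video("d", 40.0, 310),
]

if __name__ == "__main__":
    print([v.title for v in by_volume(sample)])
    print([v.title for v in by_weight(sample)])
```

Note that the two orderings disagree even on four videos: the most-downloaded one is not the one costing the most bandwidth.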

Now here's where things get really interesting (well, at least as far as graphs based on statistics about AMVs can get, anyway).

[Graph: SH download volume, ordered by weight]

This shows volume as ordered by weight. The sinusoidal ebb and flow in the tail between shorter bars ("fat" videos that reach a higher transfer total, further leftwards on the graph, on fewer downloads) and taller bars ("thin" videos that need more downloads to get where they are weight-wise) is interesting, and indicates that .org users, at least the ones who download SH videos, are not selecting on filesize.
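The "not selecting on filesize" reading above comes from eyeballing the graph. If you wanted a number to go with it, one option (my suggestion, not what was done for this post) is a Spearman rank correlation between filesize and download count: a value near zero would fit the no-selection story, while something strongly negative would suggest people prefer small files. A stdlib-only sketch, with hypothetical function names:

```python
def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson applied to the ranks."""
    return pearson(ranks(xs), ranks(ys))


if __name__ == "__main__":
    # Made-up filesizes (MB) and download counts, for illustration only.
    sizes = [80.0, 25.0, 120.0, 40.0, 60.0]
    downloads = [1200, 1500, 300, 310, 900]
    print(spearman(sizes, downloads))
```

With only 83 videos, treat any result as a hint rather than proof, for the same sample-size reasons given earlier.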

[Graph: SH download weight, ordered by volume]

This one here is weight ordered by volume. This is the first time I've seen a sawtooth wave in an AMV-related graph, but all this is doing is pulling the information in the vol-by-weight graph out a little more dramatically. The sawteeth in the tail show transitions between roughly equivalent levels of transfer; the spike that starts each tooth is the fattest video with ~x downloads, and the last bar before the next spike is the thinnest.
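To see where a sawtooth like this can come from, here is a toy reconstruction. It assumes that videos in the tail share download counts and that ties get ordered fattest first; both are my guesses about how the graph ended up looking this way, and the numbers are invented.

```python
def weights_ordered_by_volume(videos):
    """videos: list of (size_mb, downloads) pairs.

    Orders by downloads (highest first), breaking ties by filesize
    (largest first), and returns each video's weight in MB.
    Assumption: weight = size_mb * downloads.
    """
    ordered = sorted(videos, key=lambda v: (-v[1], -v[0]))
    return [size_mb * downloads for size_mb, downloads in ordered]


def spike_positions(weights):
    """Indices where a bar is taller than the one just before it."""
    return [i for i in range(1, len(weights)) if weights[i] > weights[i - 1]]


# Invented tail: three videos at 40 downloads, three at 25.
tail = [(90.0, 40), (30.0, 40), (60.0, 40), (100.0, 25), (20.0, 25), (50.0, 25)]

if __name__ == "__main__":
    w = weights_ordered_by_volume(tail)
    print(w)
    print(spike_positions(w))
```

Within each download level the weights step down from fattest to thinnest, then jump back up when the next level starts with its own fattest video, which is the tooth.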

What does all this mean? Basically, that the motivation to compress your videos correctly is social rather than egotistical. A-m-v.org users are not selecting for either heavy or light videos, but, dun dun dun, presumably on some function of content and popularity. You won't get more people to download your video just because it has the smallest filesize in their list of options; the only motivation to compress properly is to minimize your bandwidth footprint. This is a social motivation, not one for the individual -- if you need a kick, though, donate, and the usage of the local server then does become more directly your own issue.