Our neighborhood Goon numbers guy is at it again.
ANALYSIS OF SMALL TRACKER MOVES
In view of the Crytek lawsuit and the possible imminent demise of the funding tracker, I'm presenting an analysis that I put off for quite a while. This post is the third in the (very infrequently updated) Theoretical Cetology series. Previous entries are here and here. I know many people believe the tracker is faked, but irrespective of that issue, one can still try to figure out what the tracker is actually telling us.
My initial motivation in looking at the funding tracker was to try to tie together the three F's (Fleet, Fans, and Funds) to get a better sense of who was buying what. The hourly scraped data maintained by Nehkara on Google Docs is not well suited to this, because so many transactions get lumped together per hour that it's difficult or impossible to tease out the individual contributions. Therefore I used scraped data with a 5-minute update rate, which strikes a balance between a high update rate and not being rude. As it turned out, the three F's are updated on different schedules and possibly with differing time lags, so it remains impractical to do the dreamed-of joint analysis.
However, it turns out that the Funds data (i.e. the money counter) is updated in real time, or close enough to real time from the perspective of a 5-minute scrape. This tells us how much cash is going into the tracker in each 5-minute interval.
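Turning the scrape into tracker moves is just a matter of differencing consecutive totals. A minimal Python sketch, assuming the scrape is available as time-ordered (timestamp, total) pairs; the function name, the integer-cents representation, and the sample values are illustrative and not taken from the actual data:

```python
from datetime import datetime

def tracker_moves(samples):
    """Difference consecutive scraped totals into per-interval moves.

    samples: time-ordered list of (timestamp, cumulative_total_cents).
    Integer cents are an assumption made here to avoid float rounding.
    Returns (timestamp, move_cents) for every interval whose total changed.
    """
    moves = []
    for (_, prev_total), (stamp, total) in zip(samples, samples[1:]):
        delta = total - prev_total
        if delta != 0:
            moves.append((stamp, delta))
    return moves

# Made-up 5-minute samples, purely for illustration.
samples = [
    (datetime(2017, 12, 1, 0, 0), 1_000_000_00),
    (datetime(2017, 12, 1, 0, 5), 1_000_045_00),   # +$45
    (datetime(2017, 12, 1, 0, 10), 1_000_045_00),  # no change
    (datetime(2017, 12, 1, 0, 15), 1_000_150_00),  # +$105
]
for stamp, cents in tracker_moves(samples):
    print(stamp.isoformat(), f"${cents / 100:.2f}")
```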
An excerpt from a typical day is shown below. The table shows the size of each tracker move in dollars, and the number of times a move of that size was observed.
A few interesting facts immediately jump out:
- Many moves are even multiples of $5.
- Some moves are not whole numbers. I am not sure how this happens; more on this later.
- There are a decent number of very small moves, like $5 or $10.
- $45 and $60 are by far the most common move sizes, probably because they are starter package prices. We also see spikes at combinations of $45 and $60, such as $90 and $105.
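A table like the one described above, and the three kinds of move in the list, can be produced with a simple tally. This is a sketch under the assumption that moves are stored as integer cents; the helper names and the sample moves are hypothetical:

```python
from collections import Counter

def tally_moves(moves_cents):
    """Return (move_cents, count) pairs, most frequent first."""
    counts = Counter(moves_cents)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

def classify(move_cents):
    """Label a move by how 'round' it is."""
    if move_cents % 500 == 0:
        return "multiple of $5"
    if move_cents % 100 == 0:
        return "whole dollars"
    return "fractional"

# Made-up moves for illustration only.
example = [4500, 6000, 4500, 500, 10500, 6000, 4500, 1234]
for cents, count in tally_moves(example):
    print(f"${cents / 100:>7.2f}  x{count}  {classify(cents)}")
```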
As the example of $105 illustrates, a tracker move may be composed of multiple smaller transactions. As long as the typical 5-minute interval does not contain "too many" transactions, we may be able to infer the individual transaction sizes, at least in a probabilistic sense.
The frequent $5 moves are particularly interesting because they are not likely to be composed of smaller transactions and because there isn't anything exciting on the store that costs $5. I believe that they are probably mostly CCU activity, possibly related to the grey market, but I welcome better explanations.
My assumption is that the tracker is honest in the sense that applied store credit is not shown as additional revenue. This would allow us to see transactions of all sizes (due to varying amounts of store credit being applied) even if the store has no item at a particular price.
SELECTING THE DATA SET
To keep the number of transactions per interval small enough to analyze, I used a crude proxy for non-sale days: all days with a daily funding total below $60K. This gives a total of 116,111 data points from "quiet" or "typical" days.
The method I will apply below relies heavily on the assumption that tracker moves are round numbers, i.e., multiples of $5. Thus, data points that do not fit this assumption must be excluded, leaving 108,621 data points (a loss of 6.5% of the data). Interestingly, non-round tracker moves tend to arrive bunched together. The plot below shows the percentage of non-round moves in a rolling temporal window, restricted to data points from quiet days.
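The two selection steps (quiet days, then round moves) can be sketched as follows. The $60K threshold and the $5 step come from the text; the data layout, names, and sample values are assumptions made for illustration:

```python
from collections import defaultdict

QUIET_DAY_LIMIT_CENTS = 60_000_00   # daily total below $60K counts as "quiet"
ROUND_STEP_CENTS = 5_00             # round moves are multiples of $5

def select_round_moves_on_quiet_days(moves):
    """moves: list of (day, move_cents). Returns (quiet, round_moves)."""
    daily_total = defaultdict(int)
    for day, cents in moves:
        daily_total[day] += cents
    quiet = [(d, c) for d, c in moves if daily_total[d] < QUIET_DAY_LIMIT_CENTS]
    round_moves = [(d, c) for d, c in quiet if c % ROUND_STEP_CENTS == 0]
    return quiet, round_moves

# Made-up example: day "a" is quiet, day "b" exceeds $60K and is dropped.
moves = [("a", 4500), ("a", 1234), ("b", 5_900_000), ("b", 200_000)]
quiet, round_moves = select_round_moves_on_quiet_days(moves)
print(len(quiet), len(round_moves))

# Share of quiet-day points lost to the round-number filter, using the
# counts quoted in the text.
dropped = (116_111 - 108_621) / 116_111
print(f"{dropped:.1%}")
```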
Part of the cause may be the temporally limited availability of items, such as the Squadron 42 Military Cap, whose prices aren't multiples of $5. As for the fractional dollar amounts, the only hypothesis I can come up with is that an amount of store credit is somehow acquired untaxed but then has VAT taken out of it later. Partially defraying the cost of an item with the resulting store credit could give rise to strange transaction sizes.
ESTIMATING TRANSACTION RATES
To account for the effect of multiple transactions, I formulated a probability model for the data as a price-weighted sum of independent Poisson random variables. Going to the store and clicking on "Extras" shows CCUs valued at every multiple of $5 up to about $300, so I set the maximum allowable transaction size to $300. I then estimated the parameters of the model using maximum likelihood. The round transactions assumption is required to make the fitting process tractable; we can apply a crude correction for the exclusion of the non-round transactions afterward.
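The post does not say how the likelihood was evaluated. One standard way to get the distribution of a price-weighted sum of independent Poissons (a compound Poisson) is Panjer's recursion, sketched below with all amounts in $5 units. The function names and the demo rates are assumptions; only the model itself and the $300 cap come from the text:

```python
import math

def move_pmf(rates, max_units):
    """P(move == m units) for m = 0..max_units, one unit = $5.

    rates[j] is the expected number of transactions of size j units per
    scrape interval (rates[0] is unused). Uses Panjer's recursion for a
    compound Poisson sum: p[m] = (1/m) * sum_j j * rates[j] * p[m - j].
    """
    total_rate = sum(rates[1:])
    pmf = [math.exp(-total_rate)]
    largest = len(rates) - 1
    for m in range(1, max_units + 1):
        s = sum(j * rates[j] * pmf[m - j] for j in range(1, min(m, largest) + 1))
        pmf.append(s / m)
    return pmf

def log_likelihood(rates, observed_units):
    """Log-likelihood of observed moves (in $5 units) under the model.

    Raises ValueError if an observed move has zero probability, which can
    happen when some rates are exactly zero.
    """
    pmf = move_pmf(rates, max(observed_units))
    return sum(math.log(pmf[m]) for m in observed_units)

# Hypothetical per-interval rates: only $45 and $60 transactions,
# at 4.2 and 5.0 per hour spread over twelve 5-minute intervals.
rates = [0.0] * 61            # indices 1..60 cover $5..$300
rates[9] = 4.2 / 12
rates[12] = 5.0 / 12
pmf = move_pmf(rates, 200)
print(f"P(no move)   = {pmf[0]:.3f}")
print(f"P($105 move) = {pmf[21]:.3f}")
```

A maximum-likelihood fit would then search over the 60 rates to maximize `log_likelihood` on the observed moves; the optimizer is left out here.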
Below we show the raw histogram of tracker moves.
The result of the fitting process is an average transaction rate for each transaction size, i.e., the average number of transactions of each size, per hour.
The dominant effect is the spikes at $45 and $60, reaching 4.2 and 5.0 transactions/hour, respectively. If we assume that every one of these sales is a starter game package, and that every game package is sold to a new customer, this means 9.2 commandos are buying into Star Citizen per hour outside of special events, or about 80,000 commandos per year. By contrast, the "fans" number has increased by 237,325 so far in 2017, with no major recruiting events to speak of.
There is about 1 transaction per hour for every "small" transaction size below $45. Unless this is CCU activity, I'm not sure what it could represent. Are people buying $5 skins and UEC chits?
The estimated average daily revenue from starter packages is $11,700, versus an average daily revenue (for the quiet days used in this analysis) of roughly $39,500. Similarly, the average daily revenue from small transactions of $40 or less is $4,000. The majority of funding, about $20K, comes from transactions that are $65 or larger.
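The starter-package figures follow directly from the two fitted rates. A quick check of the arithmetic, using only numbers quoted above (the variable names are illustrative):

```python
RATE_45_PER_HOUR = 4.2      # fitted rate of $45 transactions
RATE_60_PER_HOUR = 5.0      # fitted rate of $60 transactions
QUIET_DAY_REVENUE = 39_500  # average daily revenue on quiet days

# New buyers, if every $45/$60 sale is a starter package for a new customer.
buyers_per_hour = RATE_45_PER_HOUR + RATE_60_PER_HOUR
buyers_per_year = buyers_per_hour * 24 * 365

# Daily revenue from the two starter-package price points.
starter_revenue_per_day = 24 * (RATE_45_PER_HOUR * 45 + RATE_60_PER_HOUR * 60)
starter_share = starter_revenue_per_day / QUIET_DAY_REVENUE

print(f"{buyers_per_hour:.1f} buyers/hour, {buyers_per_year:,.0f} per year")
print(f"${starter_revenue_per_day:,.0f} per day ({starter_share:.0%} of quiet-day revenue)")
```

This gives roughly 80,600 buyers per year and about $11,700 per day, in line with the rounded figures in the text.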
The average daily total implied by our model is about $37K. Using a crude 7% correction for the excluded data points gives an average daily total of $39,800, which is fairly close to the true average of $39,500.
How literally should we interpret the fitted parameters? I think the inferred rate of starter packages, as well as the small transactions, is roughly accurate. As price increases from there, we should expect a general decline in the frequency of transactions, but not as steep a decline as the model implies. You can see an artifact of this where the model has boosted the rates of transactions near $300 to try to match the heavier tails of the actual data.
IS THE TRACKER TOO STABLE?
There are reasons to expect this model to underestimate the frequency of very large transactions, meaning that it will underestimate daily variability. That being said, the daily standard deviation of funding implied by the fitted model is $2,040, which implies that successive daily totals should be within about $5,800 of each other 95% of the time.
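Under the model, each transaction size contributes an independent Poisson count, so the variance of a day's total is the sum over sizes of (daily rate) x (size squared). The sketch below shows that formula and how a $2,040 daily standard deviation translates into a band for the difference between two days. Treating successive days as independent and using the two-sigma rule for "95%" are assumptions here, chosen because they reproduce a figure close to the one quoted; the function names and the two-rate example are illustrative, not the full fitted model:

```python
import math

def daily_std(rates_per_hour):
    """Std. dev. of the daily total for a price-weighted Poisson sum.

    rates_per_hour: dict mapping transaction size in dollars to its rate
    per hour. Variance of the daily total = sum(24 * rate * size**2).
    """
    variance = sum(24 * rate * size ** 2 for size, rate in rates_per_hour.items())
    return math.sqrt(variance)

def successive_day_band(daily_sigma, z=2.0):
    """Half-width of the band for the difference of two independent days."""
    return z * math.sqrt(2.0) * daily_sigma

# Contribution of the two starter-package sizes alone (not the full fit).
print(f"${daily_std({45: 4.2, 60: 5.0}):,.0f}")

# Band implied by the $2,040 daily standard deviation quoted above.
print(f"${successive_day_band(2040):,.0f}")
```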
A quick eyeballing suggests that the real data is indeed more variable than the model estimate. This is unsurprising both because there are dynamic influences on the store activity that we do not model, and because we would expect overdispersion even in the absence of such effects.
A better answer would involve looking at autocorrelation, but :effort: