The global investment community is utterly enamored with alternative data these days, and it seems that every day new data vendors armed with some exotic data arsenal join the gold rush. As the world becomes inundated with all different types of data, and despite the confidence data vendors project in marketing their jewels, the number one question in the back of their minds is: does this new data really work? Does it provide accurate predictions? Can it generate alpha? This takes me down memory lane, to when ChinaScope first used alternative data to make predictions. It worked, but the data just didn't sell.

In 2012, ChinaScope approached Jiangsu Gimis, a company in China that had installed GPS systems on over 22,000 heavy construction machines across the country to track their operations. We hypothesized that if we tracked the operating hours of this heavy construction machinery, we should be able to reasonably estimate the monthly Fixed Asset Investment (FAI) figure released by the government. Indeed, after testing the data we saw a strong correlation between construction activity and FAI. With this data, we produced our proprietary Construction Activity Index (CAI), which was released on the 3rd of every month, 6 days before the National Bureau of Statistics (NBS) released its official FAI figures. The predictive power of CAI was high, with the R-squared between CAI and FAI consistently above 85%.
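For readers who want to see what an R-squared check of this kind looks like, here is a minimal sketch. It assumes a simple one-variable linear fit, in which case R-squared equals the squared Pearson correlation. The function name and the numbers are made up for illustration; they are not actual CAI or FAI data, and this is not necessarily how the original test was run.

```python
from statistics import mean


def r_squared(x, y):
    """R-squared of a one-variable least-squares fit of y on x.

    For a single predictor this equals the squared Pearson correlation.
    """
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical monthly values, NOT real CAI or FAI figures.
cai = [100.0, 104.0, 103.0, 108.0, 112.0, 111.0]
fai_growth = [20.1, 20.9, 20.6, 21.4, 22.3, 22.0]

print(round(r_squared(cai, fai_growth), 3))
```

A value close to 1 means the index tracks the official figure closely. On its own, though, it says nothing about whether other forecasts track it just as well.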


We offered this data to the investment community for over 8 months and made no sales. Why? As it turned out, macro analysts on the Street were making predictions that were also consistently in the ballpark of the official NBS data. The information premium that this alternative data offered over traditional data was simply limited.

This was our first foray into alternative data, and the initial excitement about this data set ultimately ended with an anticlimactic whimper. But it taught us an important lesson. Predictability is only part of the equation; the margin by which a data set's predictive power exceeds that of existing data is where the cheddar is. It seems like a pretty intuitive notion in hindsight, but it buttressed the foundation for how we thought about extracting value from alternative data going forward.
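One rough way to put a number on that margin is to compare forecast errors side by side. The sketch below assumes mean absolute error as the yardstick and treats the "information premium" as the error reduction relative to the consensus forecast. The function names and all figures are hypothetical, chosen only to illustrate the idea.

```python
def mean_abs_error(forecasts, actuals):
    """Average absolute gap between forecasts and realized values."""
    return sum(abs(f - a) for f, a in zip(forecasts, actuals)) / len(actuals)


def information_premium(alt_forecasts, consensus_forecasts, actuals):
    """Error reduction from using the alt-data forecast instead of consensus.

    Positive: the alternative data adds accuracy over what is already
    available. Near zero: it predicts well but adds little.
    """
    return (mean_abs_error(consensus_forecasts, actuals)
            - mean_abs_error(alt_forecasts, actuals))


# Hypothetical numbers, NOT real FAI, consensus, or CAI-based forecasts.
actual = [20.1, 20.9, 20.6, 21.4]
consensus = [20.3, 20.7, 20.8, 21.2]
alt_data = [20.2, 20.8, 20.8, 21.3]

print(information_premium(alt_data, consensus, actual))
```

A data set can score a high R-squared against the official figure and still show a premium near zero here, which is the situation we found ourselves in.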

By Tom Liu, CEO of ChinaScope