Roberto Rigobon of MIT calls for hybrid approach to data mining and analysis
Renowned economist Roberto Rigobon of MIT has called for a hybrid approach to data mining and data analysis, saying that neither big data nor small data alone is the way forward to arrive at actual insights.
"Since not a single type of data is perfect, the methodology for future data collection entails a hybrid approach. We need to take designed data and combine it with organic data so the big data can be improved by the small data," Rigobon said, delivering the second Suresh Tendulkar memorial lecture at the RBI headquarters this evening.
Stated differently, said the Society of Sloan Fellows Professor of Applied Economics at the Sloan School of Management at MIT, Massachusetts, "use the small data to correct for the biases generated by the big data, because it is much better to imperfectly measure something relevant than to continue to perfectly measure the irrelevant."
He said that putting these new measurement principles into practice requires new data sources and new procedures.
According to him, there are two general data sources: designed data, which come from surveys and administrative records, and organic data, or what is called big data. Organic data is generated by individuals without them noticing they are being surveyed.
It is the data in the GPS of one's phone, one's web searches, the friends in one's network, the things one purchases, etc., he added. "The biggest advantages of organic data are that they are non-intrusive and the individual tends to be truthful in the data generation. We do not lie to our GPS, or to Google, or try to manipulate Netflix," he said.
He further said data characteristics like volume, velocity, and variety are irrelevant by themselves; they are meaningful only if they can be used to answer pertinent questions. The size of the data reduces only the estimation error, not the bias, leaving the researcher estimating the wrong thing with higher precision.
"That means working with big data or organic data only solves the problem of precision, and biases arising from misspecification, model uncertainty, and model instability are exacerbated, and thus hybrid data is the way forward," Rigobon said.
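The point about precision versus bias can be illustrated with a small simulation. The sketch below is not Rigobon's method or anything from the Billion Prices Project; it is a minimal toy example under invented assumptions: a hypothetical population split into "online" and "offline" groups, an organic data set in which online users are far more likely to appear, and a small designed random sample. The function name `simulate`, the group shares, and all numeric values are made up for illustration. The hybrid estimate uses the small sample only to learn the true group shares and the big data to measure each group's average, a simple form of reweighting.

```python
import random
import statistics


def simulate(seed=0):
    """Toy comparison of big, small, and hybrid estimates of a population mean.

    All numbers here are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)

    # Hypothetical population: 30% "online" people whose values average 120,
    # 70% "offline" people whose values average 100.
    population = []
    for _ in range(200_000):
        online = rng.random() < 0.3
        value = rng.gauss(120.0 if online else 100.0, 15.0)
        population.append((online, value))
    true_mean = statistics.fmean(v for _, v in population)

    # Organic "big" data: very large, but online people are nine times
    # more likely to show up, so the sample is not representative.
    big = [(o, v) for o, v in population
           if rng.random() < (0.9 if o else 0.1)]

    # Designed "small" data: a simple random sample of 2,000 people.
    small = rng.sample(population, 2000)

    big_mean = statistics.fmean(v for _, v in big)
    small_mean = statistics.fmean(v for _, v in small)

    # Hybrid: group shares come from the small unbiased sample,
    # group averages come from the large organic data set.
    share_online = sum(1 for o, _ in small if o) / len(small)
    mean_online = statistics.fmean(v for o, v in big if o)
    mean_offline = statistics.fmean(v for o, v in big if not o)
    hybrid_mean = share_online * mean_online + (1 - share_online) * mean_offline

    return {
        "true": true_mean,
        "big": big_mean,
        "small": small_mean,
        "hybrid": hybrid_mean,
        "n_big": len(big),
        "n_small": len(small),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name:>8}: {value:,.2f}")
```

In this toy setup the organic estimate stays far from the true mean no matter how many records it contains, because more records only shrink the sampling noise around a biased target. The correction works here only because the selection into the organic data depends on a characteristic that the designed sample also observes; that is an assumption of the sketch, not a general guarantee.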
Tendulkar is regarded as among the most eminent economists and his seminal work on the measurement and analysis of living standards remains his enduring legacy to public policy formulation.
Rigobon is one of the two founding members of the Billion Prices Project (BPP) and a co-founder of PriceStats. The BPP, an initiative of MIT Sloan and Harvard Business School, is considered the most ambitious data collection project at MIT.
It uses prices collected from hundreds of online retailers around the world on a daily basis to conduct research in macro and international economics and to compute real-time inflation metrics.