Massive data analysis fraught with challenges, says National Research Council
Federal agencies with missions related to science and technology are funding research that aims to build capabilities for the analysis of massive data, says a new book published by the National Research Council. While authors did not recommend where agencies should increase grant money, they did outline emerging challenges and opportunities agencies should be aware of as massive data analysis becomes a more popular federal-funding area.
New tools, skills and approaches are necessary for massive data because small-data strategies do not translate to larger datasets, according to Frontiers in Massive Data Analysis.
"A major part of the challenge of massive data analysis is that of developing statistically well-founded procedures that provide control over errors in the setting of massive data, recognizing that these procedures are themselves computational procedures that consume resources," write authors.
The challenge of gleaning meaningful information from massive data is not confined to one field, either. Authors note that massive data analysis is an interdisciplinary enterprise.
Just as the challenge faces many fields, the solutions to big data problems may come from a variety of fields. Privacy issues, for example, are common with big data, and establishing norms and best practices for data privacy will likely require input from legal scholars, economists and other social scientists, says the book.
The public and private sectors also face a shortage of big data experts, due in part to the fact that training requires hands-on practice with massive data analysis and with computational infrastructure that reveals the real problems associated with massive data.
"The availability of benchmarks, repositories (of data and software), and computational infrastructure will be a necessity in training the next generation of data scientists. The same point, of course, can be made for academic research: significant new ideas will only emerge if academics are exposed to real-world massive data problems," say authors.
- Download the book, Frontiers in Massive Data Analysis, from National Academies Press