Big Data has implications for national security, says DARPA
The era of big data has frightening implications for national security as well as personal privacy, the Defense Advanced Research Projects Agency acknowledges obliquely in a small business solicitation. The solicitation seeks tools for the rapid anonymization and de-anonymization of data, along with a framework to measure the national security impact of vast troves of publicly available data.
Large-scale data aggregators, the solicitation notes, sell information gleaned from commerce and social media, often voluntarily disclosed by individuals, and some of that information is also openly available online. Combined with advanced marketing techniques and low-cost computational capabilities, that data could give a modestly funded group analytic powers that were previously the purview of nation-states alone, the solicitation says.
"To what extent could a non-state actor collect, process, and analyze a portfolio of purchased and open source data to reconstruct an organizational profile, fiscal vulnerabilities, location of physical assets, work force pattern-of-life, and other information, in order to construct a deliberate attack on a specific capability?" it asks.
As an example of the unintended consequences of voluntary data disclosure, DARPA cites a 2009 lawsuit against Netflix filed by four individuals, including an anonymous closeted lesbian woman, after University of Texas at Austin researchers were able to de-anonymize a Netflix subscriber movie rating dataset. The company had released the movie ratings of 500,000 subscribers, with identifying information removed, as part of a contest to improve its recommendation algorithm. It settled the lawsuit and canceled a follow-up contest.
Defense Department entities "including Army, Navy, and Air Force are interested in operational security and not having their plans and operations compromised through vulnerabilities in public data," the solicitation states.
It calls for a proof-of-concept system that can automatically sample and characterize data from many sources, "and provide automatic feedback on the measurable risk inherent with various collections of data."
It also says the Pentagon wants "defensive countermeasures" against vulnerabilities created by public data accumulation, but it doesn't specify what those countermeasures might be beyond saying they would depend on near real-time monitoring of open source data.
- go to the DARPA Small Business Innovation Research Program solicitation