Where I mix career information and career decision making in a test tube and see what happens

Wednesday, May 25, 2011

Big Data: The Next Career Field?

Yesterday I completed the manuscript for the next edition of Best Jobs for the 21st Century. The book actually focuses on the next ten years, but I often wonder about longer-term prospects for career growth in the United States. Where will tomorrow’s jobs come from? Here’s one possibility.

Early in the 20th century, geologists discovered a huge pool of oil beneath the ground near Beaumont, Texas. Other petroleum deposits soon were identified in California and Oklahoma, and these natural resources led to thousands of jobs and billions of dollars of revenue. Our economy has exploited many other natural resources, such as timber, fish, and fresh water, sometimes creating shortages when demand exceeds supply. But perhaps the next huge resource that will be exploited is not a natural resource, not even something tangible, but rather massive quantities of data. This is what a recent report (PDF) from McKinsey Global Institute argues, and the report makes a good case.

It’s estimated that the volume of business data doubles roughly every 1.2 years. Every time you order something online, you’re generating data about your purchasing and payment behavior. The logistics process of getting the product to you generates additional data. Postings on Facebook, geotagged photos on Flickr, items on eBay or Craigslist, media that stream on YouTube or Internet radio: all these and countless other kinds of data are generated every minute of the day, and the way (including the location from which) people respond to each of these items creates more data.
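To get a feel for what that doubling estimate implies, here is a rough back-of-the-envelope sketch. It assumes the 1.2-year doubling time holds steady, which is my simplification rather than anything the report promises, and the function name `growth_factor` is just something I made up for illustration.

```python
def growth_factor(years: float, doubling_time: float = 1.2) -> float:
    """Return how many times larger a quantity becomes after `years`,
    assuming it doubles every `doubling_time` years (an assumed constant rate)."""
    return 2 ** (years / doubling_time)


# Growth over a single year and over a ten-year span.
annual = growth_factor(1)
decade = growth_factor(10)

print(f"Growth per year: about {annual - 1:.0%}")
print(f"Growth over ten years: about {decade:.0f}-fold")
```

Under that assumption, the volume grows by roughly 78 percent a year and ends up more than 300 times larger after a decade, which is why the size of the analytical task keeps changing.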

This pool of data, like a pool of oil, can be exploited for economic value, but it has the additional benefit of never running dry. Not only is new data constantly being generated, but consumption of data doesn’t use it up. It can be analyzed and reanalyzed. Any number of users can exploit it simultaneously.

The most successful current users of big data are Google, Bing, Yahoo, and other search providers. They use it two ways: (1) They create an index of Web content that is ranked according to how many links exist to the content, and (2) when you use this index, they sell information about your clicking behavior to advertisers. But these search providers only scratch the surface of all the data that’s out there.

The authors of the report (try as you may, you just can’t avoid calling a bulletin from McKinsey a “McKinsey report”) emphasize that what makes big data a new resource, different from past uses of business data, is its size: We already have business tools that exploit various databases, but the potential for innovative work lies in finding ways to analyze data at larger scales than have ever been attempted before. In fact, the authors make a point of defining “big data” as a kind of moving target rather than as a fixed number of terabytes, because as the volume of data doubles and redoubles, the scale of the analytical task will continuously create new challenges. Success in this field is not a matter of being able to analyze data, but rather being able to analyze bigger data sets than ever before.

Another dimension of the analytical challenge that defines this new resource (and the occupations that it will spawn) is the speed with which the big-data analysis can be done. We’ve already gotten used to being able to track the delivery route of a package within a few hours of real time or the actual arrival time of an airline flight within a few minutes. Wall Street has developed ways to react within milliseconds to fluctuations in the value of securities. Some of the innovations that will be developed for use of big data will be methods of accomplishing instant analysis and response--and doing so in ways that have controls to prevent snowballing events such as the “flash crash” of 2010, in which the Dow Jones plunged about 9 percent within minutes, only to rebound just as quickly.

It’s obvious that marketers will have uses for the outputs of this work. So will governments, which can improve services by better segmenting the population. Law enforcement and defense are already reaping the benefits of using large-scale, real-time monitoring of events to trigger and coordinate a rapid response.

This kind of work will be done by teams that are highly creative and have outstanding technical and communications skills. In other words, it is the kind of work that America has always been good at. The major hurdle that needs to be overcome is the projected shortage of skilled workers. As the report notes, we face “a shortage of 140,000 to 190,000 people with analytical expertise and 1.5 million managers and analysts with the skills to understand and make decisions based on the analysis of big data.” So here is another reason why we need to improve STEM education and career development, make higher education more meritocratic, and (if Americans fail to step up to the plate) facilitate immigration of skilled foreigners.


  1. One thing I didn't touch on in this post is the privacy issues raised by big-data analysis. These technologies are capable of Big Brother uses, and repressive regimes such as China's are already pioneering techniques for applying big-data analysis to these purposes. Nor is government the only potential abuser of these technologies. For an interesting commentary on these issues, I recommend a blog post (http://www.technologyreview.com/business/37548/?p1=BI) on MIT's Technology Review site. The blogger mentions "calls for corporations to create positions such as chief privacy officer, chief safety officer, and chief data officer." In other words, the privacy issues raised by these technologies can also create jobs.

  2. Big data development is based on a number of principles, among them security, performance, and data quality management.