Microsoft – Book of Ryu's past and future in another universe, Chapter 4.

[Economist] Flash in the pan

As Apple flexes its mobile muscles, it is changing the appearance of video on the web

Apr 16th 2010 |
From The Economist online

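Give Steve Jobs his due. Apple’s charismatic boss is, without question, the most strategic thinker in the industry. He understands better than anyone that computing is in transition. As it evolves from a mostly stationary activity into an increasingly (exclusively?) mobile one, the roles of the industry’s leading players are changing fast.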

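When Microsoft ruled the personal-computer market, Apple was little more than a niche player. In mobile phones, however, Microsoft is the one left scrambling for a share. And although Google may hold 65% of desktop search, the 85m wireless devices Apple has sold (iPhones, iPods and now iPads) account for 64% of America’s mobile browsing, Mr Jobs said this month.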

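Apple’s success in mobile devices gives it the chance to capture a large share of the emerging mobile-advertising market. Indeed, that is why Apple recently acquired Quattro Wireless, a mobile advertising agency. Becoming an advertising powerhouse is certainly attractive, but Mr Jobs has far bigger goals. The biggest is to turn Apple into the Microsoft of mobility. First, though, comes the matter of locking as many software developers as possible into the Apple ecosystem. If the applications are there, the argument goes, users will follow in droves.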

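It has been done before. Microsoft won the keys to the kingdom partly by embracing an open platform built on the Intel processor plus slots into which other manufacturers’ components could plug. Even more important was the vast number of applications, written by independent programmers, that worked only with Microsoft’s operating systems.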

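Mr Jobs has no intention of ever opening Apple’s hardware for others to tinker with. Software that meets a minimum standard is another matter. At the last count the App Store (Apple’s online outlet for iPhone software) listed 185,000 applications for users to choose from. So far some 4 billion software utilities, games, maps and music tracks have been downloaded by owners of iPhones, iPods and, lately, iPads, all of which share the same operating system and can therefore run many of the same applications. The App Store gives Mr Jobs his best chance yet of building a global franchise on a par with Microsoft’s Windows. From Apple’s point of view, then, the last thing it should do is let that unique source of customer satisfaction be threatened in any way.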

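No surprise, then, that Mr Jobs has barred programmers from writing iPhone apps with cross-platform tools such as Adobe’s Flash and Microsoft’s .NET, which make it easy to write an app once and run it on many devices and operating systems. Flash, which runs as a web-browser plug-in, is available on Macintosh computers but on none of Apple’s mobile devices.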

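Were Flash ever to find a way in through the back door of the iPhone operating system, Apple’s hold on its customers would be severely weakened. If most apps ran on Android and BlackBerry phones as well as on iPhones, Apple would lose the advantage of offering the widest choice of apps. With all smartphones able to do similar tricks these days, there would be less reason to buy an iPhone in the first place.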

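But banning Flash brings a big problem: without it, people cannot play most of the videos, animations and games that websites encode with the industry’s most popular tool. Adobe’s Flash software powers the vast majority of multimedia clips seen on the web, from YouTube videos to the simplest animated chart or advertisement. Apple’s devices include software that can play YouTube videos when needed. Apart from that, they are incompatible with content built in Flash. (Bad luck, Farmville fans.)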

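Still, Mr Jobs remains adamant. In his view Flash is a rat’s nest of buggy software that hogs processor cycles, drains batteries and causes needless crashes. That is why he has just blocked an end-run that Adobe was planning around his ban on mobile Flash. From now on, developers creating software for the iPhone and its siblings must sign a revised agreement that forbids them from using any programming tools other than Apple’s approved set.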

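The move was prompted by the release of Adobe’s latest programming tool, Flash Pro CS 5. This threatened to turn Flash applications of the kind seen on the web into stand-alone iPhone apps able to slip onto the App Store undetected. Adobe even boasted, rather rashly as it turned out, that more than 100 such programs had already done so.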

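Does Apple’s latest clampdown on Flash mean that people who have bought iPhones, iPods and iPads are stuck with a crippled version of the web? For the time being, yes, though partial workarounds may yet help. Eventually a technology known as HTML5, in development for the past six years, promises to make Flash largely irrelevant. Among other things, HTML5 is attractive because it is designed to handle audio and video internally, without browser plug-ins such as Adobe’s Flash (or others such as Microsoft’s Silverlight and Oracle’s JavaFX).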

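Unfortunately, HTML5 remains a work in progress. Whereas Flash today seamlessly handles a variety of “codecs” for compressing and decompressing the video stream between web server and viewer, HTML5 is experimenting with two quite different codecs for video playback. One, called H.264, is used in Apple’s Safari and Microsoft’s forthcoming IE9; the other, known as Ogg Theora, has been adopted by Firefox and Opera. Google’s Chrome supports both.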

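Experts agree that the H.264 algorithm produces the better picture, but it is a proprietary technology, albeit one that is free to license for the time being. For internet purists, Ogg Theora’s appeal is that it is open source. A religious war has broken out between the two camps over which codec should become the standard.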

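The good news is that a solution may be in sight. By all accounts Google is poised to open-source its highly regarded VP8 video codec. The search giant has hinted as much ever since it acquired the codec’s maker, On2 Technologies, earlier this year. Insiders reckon VP8 uses only half the bandwidth of H.264 while delivering an even better picture. Mozilla, the organisation behind Firefox, would welcome VP8.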

But would Apple, having backed H.264 so enthusiastically? If it promised a quick and certain death for Flash, Mr Jobs would doubtless be happy to go along. For deprived iPhone users, the crippled web might then become a thing of the past.

[#M_ more.. | less.. |

GIVE Steve Jobs his due. Apple’s charismatic boss is, without question, the most strategic thinker in the business. He appreciates better than anyone that computing is in transition. As it evolves from being predominantly a stationary activity to becoming increasingly (exclusively?) a mobile one, the roles of the industry’s leading participants are changing fast.

When Microsoft ruled the realm of personal computers, Apple was little more than a niche player. But in mobile phones, Microsoft is the one left scrambling for a piece of the action. And although Google may own 65% of the search business on the desktop, the 85m wireless devices Apple has sold (iPhones, iPods and now iPads) account for 64% of America’s mobile browsing, Mr Jobs said this month.

The success of Apple’s mobile devices gives the firm an opportunity to capture a goodly chunk of the emerging mobile-advertising market. Indeed, that is the reason why Apple recently acquired Quattro Wireless, a mobile advertising agency. Becoming an advertising powerhouse is certainly attractive. But Mr Jobs has far bigger fish to fry. The biggest of them all is turning Apple into the Microsoft of mobility. But first there is a little matter of locking as many software developers as possible into the Apple ecosystem. If the applications are there, so the argument goes, users will follow in droves.

It has been done before. What gave Microsoft the keys to the kingdom was partly the way it embraced an open platform based on the Intel processor plus slots for other manufacturers’ components to plug into. Even more important, though, was the vast number of applications written by independent programmers that worked exclusively with Microsoft’s operating systems.

Mr Jobs has no intention of ever opening Apple’s hardware for others to mess with. But software that meets a minimum standard is a different matter. At the last count, the App Store (Apple’s online outlet for iPhone software) listed 185,000 applications for users to choose from. So far, some 4 billion software utilities, games, maps and music tracks have been downloaded by owners of iPhones, iPods and lately iPads—all of which share the same operating system and can therefore use many of the same applications. The App Store offers Mr Jobs his best chance yet of creating a global franchise on a par with Microsoft’s Windows. From Apple’s perspective, the last thing it should therefore do is allow that unique source of customer satisfaction to be threatened in any way.

No surprise, then, that Mr Jobs has banned programmers from writing iPhone apps using cross-platform programming tools like Adobe’s Flash and Microsoft’s .NET, which make it easy to write an app for many different devices and operating systems at once. Flash plug-ins, running inside web browsers, can be found in Macintosh computers, but in none of Apple’s mobile toys.

Were Flash ever to find its way in through the back door to the iPhone operating system, Apple’s armlock on its customers would be severely weakened. If most apps are built to run on Android and BlackBerry phones, as well as iPhones, then Apple would lose the advantage of being able to offer the widest choice of apps. With all smart phones able to do similar tricks these days, there would be less compulsion to buy an iPhone in the first place.

But there is a big problem with banning Flash: without it, people cannot play most of the videos, animation and games encoded on websites using the industry’s most popular tool. Adobe’s Flash software powers the vast majority of multimedia clips seen on the web—from YouTube videos to the simplest animated chart or advertisement. Apple’s devices include software that can play YouTube videos when needed. But apart from that they are incompatible with content built in Flash. (Bad luck, Farmville fans.)

Still, Mr Jobs remains adamant. In his view, Flash is a rat’s nest of buggy software that hogs processor cycles, drains battery life and causes needless crashes. That is why he has just blocked an end-run Adobe was planning around his ban on mobile Flash. Henceforth, developers creating applications for the iPhone and its ilk will have to sign a revised agreement that forbids them from using any programming tools other than Apple’s approved set.

The move was prompted by the arrival of Adobe’s latest programming aid, Flash Pro CS 5. This threatened to turn Flash applications of the kind seen on the web into stand-alone iPhone apps capable of slipping onto the App Store undetected. Adobe even boasted—rather rashly, as it turned out—that over 100 such programs had already done just that.

Does Apple’s latest clamp down on Flash mean that people who have bought iPhones, iPods and iPads are now stuck with a crippled version of the web? For the time being, yes—though there are partial workarounds that might yet help. Eventually, though, a technology known as HTML5, which has been in the works for the past six years, promises to render Flash largely irrelevant. Among other things, the attraction of HTML5 is that it is designed to handle audio and video internally, without the need for browser plug-ins such as Adobe’s Flash (or others like Microsoft’s Silverlight and Oracle’s JavaFX).

Unfortunately, HTML5 remains a work in progress. Where, today, Flash can seamlessly handle a variety of “codecs” for compressing and decompressing the video’s data stream between the web server and the viewer, HTML5 is experimenting with two distinctly different codecs for video playback: one, called H.264, is used in Apple’s Safari and Microsoft’s forthcoming IE9 browsers, while the other, known as Ogg Theora, has been adopted by the Firefox and Opera browsers; Google’s Chrome has embraced both.

Experts agree that the H.264 algorithm produces a superior picture, but it is a proprietary technology—though free to license, at least for the time being. For internet purists, Ogg Theora’s attraction is that it is open source. A religious war has broken out between the two camps over which codec to standardise on.

The good news is that a solution may yet be in sight. By all accounts, Google is poised to open-source its highly regarded VP8 video codec. The search giant has hinted as much ever since acquiring the codec’s maker, On2 Technologies, earlier this year. Insiders reckon VP8 uses only half the bandwidth of H.264 while delivering an even better picture. Mozilla, the open-source organisation behind Firefox, would welcome VP8 into the fold.

But would Apple, after having backed H.264 so enthusiastically? If it promised a quick and certain death for Flash, Mr Jobs would doubtless be delighted to go along. For deprived iPhone users, the crippled web might then be a thing of the past.

_M#]

[Economist] Data, data everywhere

Feb 25th 2010 |
From The Economist print edition

Information has gone from scarce to superabundant. That brings huge new benefits, says Kenneth Cukier, but also big headaches.

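When the Sloan Digital Sky Survey began work in 2000, its telescope in New Mexico collected more data in its first few weeks than had been gathered in the entire history of astronomy. Now, a decade later, its archive holds a whopping 140 terabytes of information. Its successor, the Large Synoptic Survey Telescope, due to come on stream in Chile in 2016, will acquire that much data every five days.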
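To put the two telescopes on the same scale, here is a minimal Python sketch of the arithmetic. It assumes decimal terabytes and a ten-year span of exactly 3,650 days; the names `bytes_per_day`, `sdss_rate` and `lsst_rate` are illustrative and do not come from the article.

```python
# Assumption: 1 TB = 10**12 bytes (the article does not say which convention it uses).
TB = 10**12


def bytes_per_day(total_bytes: float, days: float) -> float:
    """Average collection rate in bytes per day."""
    return total_bytes / days


# Sloan Digital Sky Survey: 140 TB accumulated over roughly ten years.
sdss_rate = bytes_per_day(140 * TB, 10 * 365)

# Large Synoptic Survey Telescope: the same 140 TB every five days.
lsst_rate = bytes_per_day(140 * TB, 5)

speedup = lsst_rate / sdss_rate

print(f"SDSS: about {sdss_rate / 10**9:.1f} GB per day")
print(f"LSST: about {lsst_rate / TB:.0f} TB per day")
print(f"LSST collects roughly {speedup:.0f} times faster")
```

Under these assumptions the successor gathers data several hundred times faster than its predecessor did on average, which is the point the paragraph is making.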

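Such astronomical amounts of information can be found closer to Earth too. Wal-Mart, a retail giant, handles more than 1m customer transactions every hour and feeds them into databases estimated at more than 2.5 petabytes, the equivalent of 167 times the books in America’s Library of Congress. Facebook, a social-networking website, is home to 40 billion photos. And decoding the human genome means analysing 3 billion base pairs, which took ten years the first time it was done, in 2003, but can now be achieved in a week.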

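All these examples tell the same story: the world contains an unimaginably vast amount of digital information that is growing ever vaster, ever more rapidly. This makes possible many things that could not be done before, such as spotting business trends, preventing diseases and combating crime. Managed well, the data can unlock new sources of economic value, provide fresh insights into science and hold governments to account.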

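But they are also creating a host of new problems. Despite the abundance of tools for capturing, processing and sharing all this information (sensors, computers, mobile phones and the like), it already exceeds the available storage space. Moreover, keeping data secure and protecting privacy is becoming harder as the information multiplies and is shared ever more widely around the world.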

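Alex Szalay, an astrophysicist at Johns Hopkins University, notes that the proliferation of data is making them increasingly inaccessible. “How to make sense of all these data? People should be worried about how we train the next generation, not just of scientists, but people in government and industry,” he says.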

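“We are at a different period because of so much information,” says James Cortada of IBM, who has written a couple of dozen books on the history of information in society. Joe Hellerstein, a computer scientist at the University of California in Berkeley, calls it “the industrial revolution of data”. The effect is being felt everywhere, from business to science and from government to the arts. Scientists and computer engineers have coined a term for the phenomenon: “big data”.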

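Epistemologically speaking, information is made up of a collection of data, and knowledge is made up of different strands of information. But this special report uses “data” and “information” interchangeably because, as it will argue, the two are becoming ever harder to tell apart. Given enough raw data, today’s algorithms and powerful computers can reveal insights that would previously have stayed hidden.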

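The business of information management, which helps organisations make sense of their proliferating data, is growing by leaps and bounds. In recent years Oracle, IBM, Microsoft and SAP have between them spent more than $15 billion buying software firms that specialise in data management and analytics. The industry is estimated to be worth more than $100 billion and to be growing at almost 10% a year, roughly twice as fast as the software business as a whole.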

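Chief information officers (CIOs) have become somewhat more prominent in the executive suite, and a new kind of professional has emerged: the data scientist, who combines the skills of software programmer, statistician and storyteller/artist to extract the nuggets of gold hidden under mountains of data. Hal Varian, Google’s chief economist, predicts that the job of statistician will become the “sexiest” around. Data, he explains, are widely available; what is scarce is the ability to extract wisdom from them.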

More of everything

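There are many reasons for the information explosion. The most obvious is technology. As the capabilities of digital devices soar and their prices plummet, sensors and gadgets are digitising vast amounts of information that was previously unavailable. And many more people have access to far more powerful tools. For example, there are 4.6 billion mobile-phone subscriptions worldwide (though many people have more than one, so the world’s 6.8 billion people are not quite as well supplied as the figure suggests), and 1 billion to 2 billion people use the internet.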

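Moreover, many more people now interact with information. Between 1990 and 2005 more than 1 billion people worldwide entered the middle class. As they get richer they become more literate, which fuels information growth, notes Mr Cortada. The results are showing up in politics, economics and the law as well. “Revolutions in science have often been preceded by revolutions in measurement,” says Sinan Aral, a business professor at New York University. Just as the microscope transformed biology by exposing germs, and the electron microscope changed physics, all these data are turning the social sciences upside down, he explains. Researchers are now able to understand human behaviour at the population level rather than the individual level.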

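The amount of digital information increases tenfold every five years. Moore’s law, which the computer industry now takes for granted, says that the processing power and storage capacity of computer chips double, or their prices halve, roughly every 18 months. Software is getting better too. Edward Felten, a computer scientist at Princeton University, reckons that improvements in the algorithms driving computer applications have played as important a part as Moore’s law for decades.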
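The two growth rates quoted above can be compared directly. The following Python sketch assumes smooth compound growth, which is a simplification; the helper names `annual_growth_factor` and `doubling_time_years` are illustrative and do not come from the article.

```python
import math


def annual_growth_factor(multiple: float, years: float) -> float:
    """Constant yearly factor that compounds to `multiple` over `years`."""
    return multiple ** (1.0 / years)


def doubling_time_years(multiple: float, years: float) -> float:
    """Years needed to double, given growth of `multiple` over `years`."""
    return years * math.log(2) / math.log(multiple)


# Digital information: tenfold every five years.
data_yearly = annual_growth_factor(10, 5)
data_doubling_months = doubling_time_years(10, 5) * 12

# Moore's law as stated in the text: doubling every 18 months (1.5 years).
moore_five_year_multiple = 2 ** (5 / 1.5)

print(f"Data grows by a factor of about {data_yearly:.2f} per year")
print(f"Data doubles about every {data_doubling_months:.1f} months")
print(f"Moore's law gives about {moore_five_year_multiple:.1f}x in five years")
```

Under this assumption, tenfold growth every five years works out to a doubling roughly every 18 months, so the volume of data is expanding at about the same pace that Moore’s law describes for chips.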

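A vast amount of that information is shared. By 2013 the traffic flowing over the internet each year will reach 667 exabytes, according to Cisco, a maker of communications gear. And the quantity of data continues to grow faster than the network’s ability to carry it all.

People have long complained that they were swamped by information. Back in 1917 the manager of a Connecticut manufacturing firm grumbled about the effects of the telephone: “Time is lost, confusion results and money is spent.” Yet what is happening now goes well beyond incremental growth. The quantitative change has begun to make a qualitative difference.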

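This shift from information scarcity to surfeit has broad effects. “What we are seeing is the ability to have economies form around the data, and that to me is the big change at a societal and even macroeconomic level,” says Craig Mundie, head of research and strategy at Microsoft. Data are becoming the new raw material of business: an economic input almost on a par with capital and labour. “Every day I wake up and ask, ‘How can I flow data better, manage data better, analyse data better?’” says Rollin Ford, the CIO of Wal-Mart.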

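Sophisticated quantitative analysis is being applied to many aspects of life, not just to missile trajectories or financial hedging strategies as in the past. For example, Farecast, part of Microsoft’s search engine Bing, can advise customers whether to buy an airline ticket now or wait for the price to fall by examining 225 billion flight and price records. The same idea is being extended to hotel rooms, cars and similar items. Personal-finance websites and banks are aggregating their customer data to reveal macroeconomic trends, which may develop into ancillary businesses in their own right. Number-crunchers have even uncovered match-fixing in Japanese sumo wrestling.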

Dross into gold

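“Data exhaust”, the trail of clicks that internet users leave behind and from which value can be extracted, is becoming a mainstay of the internet economy. One example is Google’s search engine, which is partly guided by the number of clicks on an item when determining its relevance to a search query. If the eighth listing for a search term is the one most people go to, the algorithm puts it higher up.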
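The idea of letting clicks nudge a ranking can be illustrated with a toy Python sketch. This is not Google’s algorithm, whose details the article does not give; the function `rerank`, the `weight` parameter and the click counts are all hypothetical, chosen only to show a click signal being blended with the original order.

```python
def rerank(results, clicks, weight=0.5):
    """Blend each result's original position with its share of clicks.

    results: identifiers in their original ranked order.
    clicks:  mapping from identifier to observed click count (hypothetical data).
    weight:  how much the click share counts relative to original position.
    """
    total_clicks = sum(clicks.get(r, 0) for r in results) or 1
    n = len(results)

    def score(indexed):
        index, result = indexed
        position_score = (n - index) / n           # 1.0 for the top slot
        click_share = clicks.get(result, 0) / total_clicks
        return (1 - weight) * position_score + weight * click_share

    ranked = sorted(enumerate(results), key=score, reverse=True)
    return [result for _, result in ranked]


# Eight listings; most users click the eighth one.
listings = [f"r{i}" for i in range(1, 9)]
observed = {"r8": 900, "r1": 50, "r2": 50}
reordered = rerank(listings, observed)
print(reordered)
```

With these made-up numbers the eighth listing climbs to second place rather than straight to the top, which matches the article’s wording that a heavily clicked result is moved “higher up” because clicks are only part of the signal.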

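As the world becomes increasingly digital, aggregating and analysing data is likely to bring huge benefits in other fields as well. For example, Mr Mundie of Microsoft and Eric Schmidt, the boss of Google, sit on a presidential task force to reform American health care. “Early on in this process Eric and I both said: ‘Look, if you really want to transform health care, you basically build a sort of health-care economy around the data that relate to people’,” Mr Mundie explains. “You would not just think of data as the ‘exhaust’ of providing health services, but rather they become a central asset in trying to figure out how you would improve every aspect of health care. It’s a bit of an inversion.”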

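To be sure, digital records should make life easier for doctors, bring down costs for providers and patients, and improve the quality of care. But in aggregate the data can also be mined to spot unwanted drug interactions, identify the most effective treatments and predict the onset of disease before symptoms emerge. Computers already attempt these things, but need to be explicitly programmed for them. In a world of big data the correlations surface almost by themselves.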

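Sometimes those data reveal more than was intended. For example, the city of Oakland, California, releases information on where and when arrests were made, which is published on a private website, Oakland Crimespotting. At one point a few clicks revealed that police swept the whole of a busy street for prostitution every evening except Wednesdays, a tactic they probably meant to keep to themselves.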

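But big data can have far more serious consequences than that. During the recent financial crisis it became clear that banks and rating agencies had been relying on models which, although they required vast amounts of information, failed to reflect financial risk in the real world. This was the first crisis to be sparked by big data, and there will be more.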

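The way information is managed touches all areas of life. At the turn of the 20th century new flows of information through channels such as the telegraph and telephone supported mass production. Today the availability of abundant data lets companies cater to small niche markets anywhere in the world. Economic production used to be based in the factory, where managers pored over every machine and process to make it more efficient. Now statisticians mine the information output of the business for new ideas.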

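“The data-centred economy is just nascent,” admits Mr Mundie of Microsoft. “You can see the outlines of it, but the technical, infrastructural and even business-model implications are not well understood right now.” This special report will point to where it is beginning to surface.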

English original

[#M_ more.. | less.. |

A special report on managing information

Feb 25th 2010 |
From The Economist print edition

Data, data everywhere

Information has gone from scarce to superabundant. That brings huge new benefits, says Kenneth Cukier, but also big headaches

WHEN the Sloan Digital Sky Survey started work in 2000, its telescope in New Mexico collected more data in its first few weeks than had been amassed in the entire history of astronomy. Now, a decade later, its archive contains a whopping 140 terabytes of information. A successor, the Large Synoptic Survey Telescope, due to come on stream in Chile in 2016, will acquire that quantity of data every five days.

Such astronomical amounts of information can be found closer to Earth too. Wal-Mart, a retail giant, handles more than 1m customer transactions every hour, feeding databases estimated at more than 2.5 petabytes—the equivalent of 167 times the books in America’s Library of Congress (see article for an explanation of how data are quantified). Facebook, a social-networking website, is home to 40 billion photos. And decoding the human genome involves analysing 3 billion base pairs—which took ten years the first time it was done, in 2003, but can now be achieved in one week.

All these examples tell the same story: that the world contains an unimaginably vast amount of digital information which is getting ever vaster ever more rapidly. This makes it possible to do many things that previously could not be done: spot business trends, prevent diseases, combat crime and so on. Managed well, the data can be used to unlock new sources of economic value, provide fresh insights into science and hold governments to account.

But they are also creating a host of new problems. Despite the abundance of tools to capture, process and share all this information (sensors, computers, mobile phones and the like), it already exceeds the available storage space. Moreover, ensuring data security and protecting privacy is becoming harder as the information multiplies and is shared ever more widely around the world.

Alex Szalay, an astrophysicist at Johns Hopkins University, notes that the proliferation of data is making them increasingly inaccessible. “How to make sense of all these data? People should be worried about how we train the next generation, not just of scientists, but people in government and industry,” he says.

“We are at a different period because of so much information,” says James Cortada of IBM, who has written a couple of dozen books on the history of information in society. Joe Hellerstein, a computer scientist at the University of California in Berkeley, calls it “the industrial revolution of data”. The effect is being felt everywhere, from business to science, from government to the arts. Scientists and computer engineers have coined a new term for the phenomenon: “big data”.

Epistemologically speaking, information is made up of a collection of data and knowledge is made up of different strands of information. But this special report uses “data” and “information” interchangeably because, as it will argue, the two are increasingly difficult to tell apart. Given enough raw data, today’s algorithms and powerful computers can reveal new insights that would previously have remained hidden.

The business of information management—helping organisations to make sense of their proliferating data—is growing by leaps and bounds. In recent years Oracle, IBM, Microsoft and SAP between them have spent more than $15 billion on buying software firms specialising in data management and analytics. This industry is estimated to be worth more than $100 billion and growing at almost 10% a year, roughly twice as fast as the software business as a whole.

Chief information officers (CIOs) have become somewhat more prominent in the executive suite, and a new kind of professional has emerged, the data scientist, who combines the skills of software programmer, statistician and storyteller/artist to extract the nuggets of gold hidden under mountains of data. Hal Varian, Google’s chief economist, predicts that the job of statistician will become the “sexiest” around. Data, he explains, are widely available; what is scarce is the ability to extract wisdom from them.

More of everything

There are many reasons for the information explosion. The most obvious one is technology. As the capabilities of digital devices soar and prices plummet, sensors and gadgets are digitising lots of information that was previously unavailable. And many more people have access to far more powerful tools. For example, there are 4.6 billion mobile-phone subscriptions worldwide (though many people have more than one, so the world’s 6.8 billion people are not quite as well supplied as these figures suggest), and 1 billion-2 billion people use the internet.

Moreover, there are now many more people who interact with information. Between 1990 and 2005 more than 1 billion people worldwide entered the middle class. As they get richer they become more literate, which fuels information growth, notes Mr Cortada. The results are showing up in politics, economics and the law as well. “Revolutions in science have often been preceded by revolutions in measurement,” says Sinan Aral, a business professor at New York University. Just as the microscope transformed biology by exposing germs, and the electron microscope changed physics, all these data are turning the social sciences upside down, he explains. Researchers are now able to understand human behaviour at the population level rather than the individual level.

The amount of digital information increases tenfold every five years. Moore’s law, which the computer industry now takes for granted, says that the processing power and storage capacity of computer chips double or their prices halve roughly every 18 months. The software programs are getting better too. Edward Felten, a computer scientist at Princeton University, reckons that the improvements in the algorithms driving computer applications have played as important a part as Moore’s law for decades.

A vast amount of that information is shared. By 2013 the amount of traffic flowing over the internet annually will reach 667 exabytes, according to Cisco, a maker of communications gear. And the quantity of data continues to grow faster than the ability of the network to carry it all.

People have long groused that they were swamped by information. Back in 1917 the manager of a Connecticut manufacturing firm complained about the effects of the telephone: “Time is lost, confusion results and money is spent.” Yet what is happening now goes way beyond incremental growth. The quantitative change has begun to make a qualitative difference.

This shift from information scarcity to surfeit has broad effects. “What we are seeing is the ability to have economies form around the data, and that to me is the big change at a societal and even macroeconomic level,” says Craig Mundie, head of research and strategy at Microsoft. Data are becoming the new raw material of business: an economic input almost on a par with capital and labour. “Every day I wake up and ask, ‘How can I flow data better, manage data better, analyse data better?’” says Rollin Ford, the CIO of Wal-Mart.

Sophisticated quantitative analysis is being applied to many aspects of life, not just missile trajectories or financial hedging strategies, as in the past. For example, Farecast, a part of Microsoft’s search engine Bing, can advise customers whether to buy an airline ticket now or wait for the price to come down by examining 225 billion flight and price records. The same idea is being extended to hotel rooms, cars and similar items. Personal-finance websites and banks are aggregating their customer data to show up macroeconomic trends, which may develop into ancillary businesses in their own right. Number-crunchers have even uncovered match-fixing in Japanese sumo wrestling.

Dross into gold

“Data exhaust”—the trail of clicks that internet users leave behind from which value can be extracted—is becoming a mainstay of the internet economy. One example is Google’s search engine, which is partly guided by the number of clicks on an item to help determine its relevance to a search query. If the eighth listing for a search term is the one most people go to, the algorithm puts it higher up.

As the world is becoming increasingly digital, aggregating and analysing data is likely to bring huge benefits in other fields as well. For example, Mr Mundie of Microsoft and Eric Schmidt, the boss of Google, sit on a presidential task force to reform American health care. “Early on in this process Eric and I both said: ‘Look, if you really want to transform health care, you basically build a sort of health-care economy around the data that relate to people’,” Mr Mundie explains. “You would not just think of data as the ‘exhaust’ of providing health services, but rather they become a central asset in trying to figure out how you would improve every aspect of health care. It’s a bit of an inversion.”

To be sure, digital records should make life easier for doctors, bring down costs for providers and patients and improve the quality of care. But in aggregate the data can also be mined to spot unwanted drug interactions, identify the most effective treatments and predict the onset of disease before symptoms emerge. Computers already attempt to do these things, but need to be explicitly programmed for them. In a world of big data the correlations surface almost by themselves.

Sometimes those data reveal more than was intended. For example, the city of Oakland, California, releases information on where and when arrests were made, which is put out on a private website, Oakland Crimespotting. At one point a few clicks revealed that police swept the whole of a busy street for prostitution every evening except on Wednesdays, a tactic they probably meant to keep to themselves.

But big data can have far more serious consequences than that. During the recent financial crisis it became clear that banks and rating agencies had been relying on models which, although they required a vast amount of information to be fed in, failed to reflect financial risk in the real world. This was the first crisis to be sparked by big data—and there will be more.

The way that information is managed touches all areas of life. At the turn of the 20th century new flows of information through channels such as the telegraph and telephone supported mass production. Today the availability of abundant data enables companies to cater to small niche markets anywhere in the world. Economic production used to be based in the factory, where managers pored over every machine and process to make it more efficient. Now statisticians mine the information output of the business for new ideas.

“The data-centred economy is just nascent,” admits Mr Mundie of Microsoft. “You can see the outlines of it, but the technical, infrastructural and even business-model implications are not well understood right now.” This special report will point to where it is beginning to surface.

_M#]