TI’s Big Data Journey
Vishal Mehra
Chief IT Architect, Distinguished Member Technical Staff
[email protected] TI Information – Selective Disclosure
General Disclaimer
Please be aware that the concepts and opinions expressed in the following presentation are those of the presenter (Vishal Mehra) and may not reflect the operational philosophies or strategic directions of Texas Instruments.
Audience Demographics
• How many attendees are Big Data Customers versus “Vendors / Service Providers”?
• How many of you have ongoing, active Big Data Projects?
• How many of you have dedicated Big Data Teams? Data Scientists?
• How many of you have asked your Operations teams to manage the Big Data Platform?
• How many of you have the DBA team manage the Big Data Environment as well?
• How many of you have built your own Big Data Cluster?
• How many of you are using Big Data Technologies in production today?
Agenda
• TI @ a glance
• What is “Big Data”?
• TI Challenges over time
• Big Data efforts
TI @ a glance
What is Big Data?
• Oxford English Dictionary - "data of a very large size, typically to the extent that its manipulation and management present significant logistical challenges"
• Wikipedia - "an all-encompassing term for any collection of data sets so large and complex that it becomes difficult to process using on-hand data management tools or traditional data processing applications"
• 2011 Big Data Study by McKinsey - "Datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze"
• Big Data@Work by Tom Davenport - "The broad range of new and massive data types that have appeared over the last decade or so."
• "The new tools helping us find relevant data and analyze its implications"
• "The shift (for enterprises) from processing internal data to mining external data."
• "The belief that the more data you have the more insights and answers will rise automatically from the pool of ones and zeros."
• "A new attitude by businesses, non-profits, government agencies, and individuals that combining data from multiple sources could lead to better decisions."
Source: "Big Data Definitions What's Yours?" in Forbes by Gil Press
Scale, Velocity and Variety
[Diagram. Labels: Reporting, Visualization, Analytics; ERP; JDA; Replica; Semantic Layer; Ent. Data Warehouse; Web Acceleration; Hadoop Data Lake; Databases @ Factories]
Manufacturing Data Use-Case
Manufacturing Data Lake
Data Model Development | Early Pattern Detection | Repeatable Machine Learning
Organization – People, Culture & Technology
• Focus on the business problem at hand, not the technology
• Have the team use the “platform” instead of having to understand the plumbing behind it
• Partner with the key “Citizen Data Scientists” to help with the acceleration; IT cannot have all the SMEs around the data
• Help with the data wrangling, since 80% of the design and implementation time is spent dealing with data (cleansing, morphing, etc.)
• Hadoop technology is NOT an RDBMS environment, though it feels like one
Potential Use Cases
• Machine learning for the Factories
• Customer 360°
• Security Analytics
• Data Archival
• ERP Data Analytics offload
• IT Operations Intelligence
QUESTIONS