Interview with James Kobielus, Big Data Evangelist, IBM
About James Kobielus
James Kobielus is IBM’s Big Data Evangelist, a title that speaks to his expertise in big data and extensive experience writing and speaking about it as a representative of IBM’s “deep global brain trust” in big data analytics. He is also a Senior Program Director for Product Marketing in Big Data Analytics where he serves as a content and social marketing strategist helping IBM portfolio teams with their campaigns, launches, and other critical initiatives. Finally, Mr. Kobielus serves as Editor-in-Chief of IBM Big Data Mag and the Technical Marketing Sector for IBM Big Data and Marketing Hub thought-leadership channels.
Mr. Kobielus maintains a very active freelance technical writing business. He has authored or contributed to numerous business technology books and written for many publications of note, including Network World magazine, to which he contributed for more than 20 years. Prior to joining IBM in 2012, Mr. Kobielus worked for consulting firms, analyst firms, technology vendors, and government contractors. His topics of professional interest include everything under the broad scope of data and analytics, from the technology platforms and tools to its myriad applications and impacts on work and life in the 21st century.
Mr. Kobielus holds a bachelor’s degree in economics from the University of Michigan and a master’s degree in journalism from the University of Wisconsin.
[OnlineEducation.com] You are frequently referred to as IBM’s “Big Data Evangelist.” The phrase “big data” has become a cultural touchpoint, but people may not always understand what it is in a practical sense. Can you describe what data analysts and scientists do, how they might use data, and the point at which “data” becomes “big data”?
[Mr. Kobielus] In my case, big data evangelist refers to my work as a subject matter expert who spearheads thought-leadership marketing for IBM on this technology and its applications.
Big data doesn’t refer to a specific type or even scale of data. Rather, it refers to the ability to achieve differentiated value from advanced analytics on trustworthy data at any scale. In other words, it’s a best practice for leveraging this resource for business advantage and/or operational efficiency. There is no specific threshold at which you can say that data suddenly goes from “small” to “big.” But generally, people reserve the term “big data” to allude to such volumes as petabytes of stored data, such velocities as sub-second end-to-end latencies, and such varieties as multi-structured.
By “advanced analytics,” I’m referring to various approaches—such as predictive modeling and machine learning—that use statistical analysis and mathematics to find trends, correlations, segmentations, outliers, and other patterns in data that might not be discovered at all or as efficiently without these tools. The term “data science” refers to the practice of using advanced analytics to engage in systematic testing of data-driven hypotheses, which may be for pure science or for practical concerns in business, government, and other spheres of activity. “Data analytics” is an all-encompassing term that refers to data science and advanced analytics as I’ve described, but also to reporting, dashboarding, and other analytic applications that focus on illuminating insights to be found in historical data.
In terms of the branches of advanced analytics and data science, they include multivariate statistical analysis, predictive modeling, data mining, text analytics, machine learning, deep learning, natural language processing, and streaming analytics.
[OnlineEducation.com] Reports suggest that the number of data experts more than doubled in the last five years, and they remain in high demand. What basic skills do you think employers expect from prospective data professionals? Are there certain skills that are highly valued, but harder to come by? What about personal characteristics, habits, and aptitudes?
[Mr. Kobielus] Data science skills are in high demand. To a great degree, one needs a degree, or something substantially like it, to prove that they’re committed to and qualified for this career. You would need to submit yourself to a structured curriculum to certify you’ve spent the time, money and midnight oil necessary for mastering this demanding discipline.
To some extent, it matters whether you get that old data-science sheepskin from a traditional university vs. an online school vs. a vendor-sponsored learning program. And it matters whether you only logged a year in the classroom vs. sacrificed a considerable portion of your life reaching for the golden ring of a Ph.D. And it certainly matters whether you simply skimmed the surface of old-school data science vs. pursued a deep specialization in a leading-edge advanced analytic discipline.
But what matters most to modern business isn’t that every data scientist has a doctorate. What matters most is that a substantial body of personnel has a common grounding in the core curriculum of skills, tools, and approaches. Ideally, you want to build a team where diverse specialists with a shared foundation can collaborate productively.
No specific credential is perfectly suited to all data scientist scenarios. And scholastic credentials might themselves be unnecessary if candidates have the core aptitudes–curiosity, intellectual agility, statistical fluency, research stamina, scientific rigor, skeptical nature–that distinguish the best data scientists. Some brilliant data scientists may be largely self-taught.
[OnlineEducation.com] One data scientist told OnlineEducation.com in a recent email that analytics is an increasingly technical field. Graduate-level data science programs often include coursework in areas like computer programming and machine learning. What technical expertise do you apply to your own work? Are there certain programming languages, software suites, or other tools you would encourage prospective data professionals to learn?
[Mr. Kobielus] I’m not a data scientist or developer. Rather, I’m a thought-leadership marketing professional in big data analytics and data science. What I’ll present here are the key curricula that make for a well-rounded, highly marketable data science professional:
Paradigms and practices: Every data scientist should acquire a grounding in core concepts of data science, analytics, and data management. They should gain a common understanding of the data science lifecycle, as well as the typical roles and responsibilities of data scientists in every phase. They should be instructed on the various role(s) of data scientists and how they work in teams and in conjunction with business domain experts and stakeholders. And they should learn a standard approach for establishing, managing and operationalizing data science projects in the business.
Algorithms and modeling: Every data scientist should obtain a core understanding of linear algebra, basic statistics, linear and logistic regression, data mining, predictive modeling, cluster analysis, association rules, market basket analysis, decision trees, time-series analysis, forecasting, machine learning, Bayesian and Monte Carlo statistics, matrix operations, sampling, text analytics, summarization, classification, principal component analysis, experimental design, unsupervised learning, and constrained optimization.
Tools and platforms: Every data scientist should master a core group of modeling, development and visualization tools used on your data science projects, as well as the platforms used for storage, execution, integration and governance of big data in your organization. Depending on your environment, and the extent to which data scientists work with both structured and unstructured data, this may involve some combination of Spark, Hadoop, stream computing, data warehousing, NoSQL and other platforms. It will probably also entail providing instruction in MapReduce, R, and other new open-source development languages, in addition to SPSS, SAS and any other established tools.
Applications and outcomes: Every data scientist should learn the chief business applications of data science in your organization, as well as how to work best with subject-domain experts. In many companies, data science focuses on marketing, customer service, next best offer, and other customer-centric applications. Often, these applications require that data scientists understand how to leverage customer data acquired from structured survey tools, sentiment analysis software, social media monitoring tools and other sources. It is also essential that every data scientist gains an understanding of the key business outcomes, such as maximizing customer lifetime value, that should focus their modeling initiatives.
[OnlineEducation.com] You began researching and writing about data long before its recent popularity. What changes have you observed in the field over the course of your career, and where does it stand today? Have you noted recent shifts or emerging trends that could impact the field going forward?
[Mr. Kobielus] Believe it or not, I’ve only been focusing on data analytics for 10 years, though I’ve been in the IT industry in many capacities for more than 30 years.
Nevertheless, I’ve seen major shifts over this past decade in terms of the sorts of tools, techniques, and approaches for which data analytics is being employed in business. Here is a quick summary of the “inflection points” in the data analytics market that I’ve observed (and in which I’ve participated) over that period:
2006: This year marked the start of a wave of business intelligence (BI) industry consolidations and saw an upsurge of startups offering BI tools for decision support, performance management, dashboarding, data visualization, and other specialties.
2008: This was the year data warehousing (DW) got sexy as the back-end for BI, and in which the appliance gained legitimacy as the dominant DW implementation platform.
2011: In this year, the big data mania started in earnest as businesses began to realize the analytic richness of very large unstructured data sets. It was the year in which Hadoop began its swift ascent to dominance as the platform for storing and processing all this data.
2013: This was the year organizations began to realize that their payoff from big data depended on cultivating an important new profession, the data scientist, blending skills such as statistical analysis, data mining, and predictive modeling.
2015: Over the past year, it became clear that the Internet of Things (IoT) is becoming the bedrock of 21st century connected society, and the IoT’s promise depends on continued advances in data science and big data. In keeping with that trend, I’ve published my thoughts in the past year on IoT relevance to the insight economy, trust infrastructure, network security, drones, remote sensing, distributed databases, schema standardization, and big data generally. And here’s the ample coverage (my own included) that we’ve given to IoT this past year in IBM Big Data & Analytics Hub.
[OnlineEducation.com] You have achieved tremendous career success studying the field of analytics. What advice might you offer future data professionals who would like to follow in your footsteps? Is there anything they can do now that might help them enter and succeed in the workforce? Are there any mistakes you would encourage them to avoid?
[Mr. Kobielus] Really? Have I? I actually don’t know what success is. In the larger perspective, I’m just some smart person who’s made a living from expressing his research and thinking in a professional context to a growing, appreciative audience. I definitely have fans and followers. But in the larger picture of how one measures material success, I’m fairly ordinary. I have no power. I don’t manage a team. I don’t control budgets. I have no profit and loss responsibilities. I don’t hire and fire. And I’ve never received an award or a promotion in my entire career.
But I appreciate you saying that. I’m not sure that I have any specific footsteps that others might follow. I didn’t plan or anticipate any of this. It all just happened as I moved through my career and tried my hand at various things. In all practical career matters, ever since I emerged from grad school, I’m entirely self-taught. I’ve just worked hard, stayed flexible, and always tried to keep my mind a bit ahead of the curve. My only advice is to never stop re-educating yourself. You have to do that in order to stay employable in a fast-moving world.
One thing you should avoid is the trap of thinking you’re not “technical” enough to be a success in analytics generally or data science specifically. If you have a structured mind, are good with numbers and data, and are adept at critical thinking, you have all the aptitudes to succeed in this field.
[OnlineEducation.com] Do you have any parting words of wisdom you would like to share with readers interested in data science and analytics?
[Mr. Kobielus] The next-generation application developer is, at heart, a data scientist. Machine learning algorithms, which are developed and refined by data scientists, are increasingly the core of all applications and infrastructure in the 21st century. This could not be a more exciting and potentially lucrative career path for somebody just starting out.