Analytics represents an evolution in the amount of data one can collect, the statistical models used to interpret it, and the various ways its insights can be applied. Research from the International Data Corporation suggests the digital universe—the amount of data produced and copied globally—more than doubles in size every two years. By 2020, it will have grown by a factor of 10 over where it stood in 2013. Organizations that know how to collect and analyze large datasets can use this knowledge to identify and solve problems, improve business strategies, and minimize risk. This is the current state of the art in the field of analytics. Over time, however, experts predict that the amount of data available to organizations may actually become a potential barrier to progress, as the ability to collect it outpaces the means to sort and process it. Along with organizations’ continued drive to extract ever more useful information from new and old datasets, such projections have put data expertise in high demand.
Global information research and advisory firm Gartner suggests that at its core, analytics is the discipline used to collect and process large amounts of data; extract and translate its insights; communicate findings; and apply new knowledge effectively. It is a distinctly interdisciplinary field: one could visualize analytics as a confluence of business, statistics, and computer programming. Dr. Eran Raviv, a quantitative financial analyst and econometrician, suggested to OnlineEducation.com that advances in analytics and its capabilities have essentially created a new discipline that exists apart from the traditional fields of applied mathematics, quantitative studies, and computational science.
“Over past decades, we have experienced enormous technological advancements. Capitalizing on this increased computational power, [we developed] many new techniques. So different that there was a need to create a distinction between ‘Statistics’ and ‘Data Science,’” said Dr. Raviv. “The former bears connotation to classical hypothesis testing and regression fitting while the latter carries the current buzz and glory of ‘big data’ and ‘machine learning.’ This newly created jargon has yet to stabilize… What everyone is after is to extract information and knowledge from data, whether it helps us to understand how something works or helps us in prediction and forecasting. For me, the distinction is simply artificial.”
Jill Dyché, Vice President of Best Practices at the SAS Institute, who has worked in, presented on, and written about business and data intelligence for more than two decades, described analytics in similar terms.
“Analytics is the use of data and technology to improve decisions. It can be as simple as accessing customer contact information to send a customer a new product offering, or as complex as watching financial transaction patterns and detecting money laundering activities in real time,” Ms. Dyché said during an interview with OnlineEducation.com. “The point of analytics is to help businesses—and business people—refine the decisions they make in order to save their companies money, protect them from fraud, comply with regulations, and generate revenues based on fact, not gut-feel. It’s all about being what they call ‘data driven.’”
While there is no universally agreed-upon definition of analytics, its practitioners often analyze data from sensors, devices, the web, social media, medical records, applications, and several additional sources, and to varying ends. Among them:
Many experts say the well-known term “big data” is a bit of a misnomer, as there is no agreed-upon threshold or standard definition for when data becomes “big.” According to the International Data Corporation, the “big” datasets of today may be “medium” next week and “small” next year. Moreover, big data refers more to what is being accomplished through data analytics than to the size of the datasets. IBM Big Data Evangelist James Kobielus suggested the same during an interview with OnlineEducation.com.
“Big data doesn’t refer to a specific type or even scale of data. Rather, it refers to the ability to achieve differentiated value from advanced analytics on trustworthy data at any scale,” Mr. Kobielus said. “There is no specific threshold at which you can say that data suddenly goes from ‘small’ to ‘big.’ But generally, people reserve the term ‘big data’ to allude to such volumes as petabytes of stored data, such velocities as sub-second end-to-end latencies, and such varieties as multi-structured.”
Analytics is often divided into primary types that help define how data will be used. IBM and other organizations list these key analytics categories as follows. Note that types of data analyses may also be described using other terms such as qualitative and quantitative data.
Descriptive analytics simply uses data to try to identify why something happened as it did. Analytics professionals conducting descriptive analyses often use key performance indicators—such as website views or purchases—to identify what went right and what did not. Data analysts may combine indicators from various sources to achieve a comprehensive view of past and current events and, in many cases, feed information to business reports or interactive dashboards. According to IBM, descriptive analytics is the most common type of analysis performed today.
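As a rough illustration of combining indicators from several sources into one view, the sketch below totals views and purchases per channel and derives a conversion rate. The records, the channel names, and the `summarize` helper are all hypothetical and invented for this example; a real report would draw on an organization's own data.

```python
from collections import defaultdict

# Hypothetical event records; channels and counts are made up for illustration.
events = [
    {"channel": "web", "views": 1200, "purchases": 36},
    {"channel": "email", "views": 400, "purchases": 20},
    {"channel": "web", "views": 800, "purchases": 24},
]


def summarize(records):
    """Total views and purchases per channel and compute a conversion rate."""
    totals = defaultdict(lambda: {"views": 0, "purchases": 0})
    for row in records:
        totals[row["channel"]]["views"] += row["views"]
        totals[row["channel"]]["purchases"] += row["purchases"]

    report = {}
    for channel, t in totals.items():
        # Guard against division by zero for channels with no views
        rate = t["purchases"] / t["views"] if t["views"] else 0.0
        report[channel] = {
            "views": t["views"],
            "purchases": t["purchases"],
            "conversion_rate": rate,
        }
    return report


report = summarize(events)
for channel, stats in sorted(report.items()):
    print(f"{channel}: {stats['views']} views, "
          f"{stats['purchases']} purchases, "
          f"{stats['conversion_rate']:.1%} conversion")
```

A summary like this is the kind of figure that might feed a business report or dashboard.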
As its name suggests, predictive analytics aims to identify what could happen in the future. Predictive analysts use quantitative modeling, numerical scoring, computational forecasting, and other statistical tools to predict potential trends and movements, such as upticks and downturns in the economy, or to forecast events, like the weather or next week’s retail stocking needs. Experts emphasize that analyses of current and/or historical data are tools that provide information to help professionals like financial analysts or meteorologists make more informed projections.
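One of the simplest forms of computational forecasting is fitting a straight line to historical values and extending it forward. The sketch below does this with ordinary least squares. The weekly sales figures are hypothetical and deliberately perfectly linear so the arithmetic is easy to follow; real data would be noisier and would usually call for a richer model.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical weekly unit sales (assumed data, not from any real retailer)
weeks = [1, 2, 3, 4, 5]
units = [100, 110, 120, 130, 140]

intercept, slope = fit_line(weeks, units)

# Extend the fitted trend one week past the observed data
forecast_week_6 = intercept + slope * 6
print(f"trend: {slope:.1f} units/week, week 6 forecast: {forecast_week_6:.0f}")
```

The forecast is only as good as the assumption that the past trend continues, which is why such output informs a projection and does not replace the analyst's judgment.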
Prescriptive analytics combines descriptive and predictive analytics to make decisions about what actions an organization should take. For example, analytics professionals working in medical research may review historic data to identify important cause-and-effect relationships, such as the correlation between poor sanitation and health. They could then use current information in a predictive model to identify potential problems and move to prevent or minimize them. That might mean identifying communities in which sanitation is poor and acting—or encouraging others to act—to improve conditions to prevent future illness.
Analytics is often categorized into sub-disciplines like data analytics, data science, and business intelligence. Analytics degree programs and employers may use similar delineations. How these categories are defined, however, is still a matter of professional debate. Burtch Works, an executive recruiting firm that conducts extensive career studies in the field of analytics, defines a data analyst as one who uses sophisticated quantitative and statistical analyses to derive insights from large amounts of data. In this sense, data scientists are a subset of data analysts, who possess “atypical” computer science skills that allow them to work with unstructured data sets like social media web scrapes, audio recordings, blocks of human language, and sensor readings. The firm categorizes business intelligence professionals separately from the other major disciplines because they tend to work with smaller datasets than data analysts and data scientists.
According to the SAS Institute’s Jill Dyché, however, business intelligence is “the general rubric” under which data analytics and data science fall, and analytics is a “general category name for everything having to do with using data for decision making.” She suggested that the variation in terms could complicate matters for prospective analysts trying to navigate the field.
“The business intelligence vernacular is tricky, fluid, and often confusing to newcomers. Often terms that mean different things—like ‘business intelligence’ and ‘analytics’—are used interchangeably,” said Ms. Dyché. “Data analysis is any activity involving looking at data to determine what to do next. The data can come to you via spreadsheets, statistical models, streaming video, or smoke signals for that matter. It’s about interpreting the meaning of data.”
Quantitative Analyst Steve Miller, co-founder and president of the Chicago-based business intelligence and analytics consulting firm Inquidia, also discussed his views on the matter during an interview with OnlineEducation.com. He suggests that data and decision-making are common threads linking related disciplines that have evolved over time.
“[Business Intelligence (BI)], analytics, and data science all obsess on data-driven decision-making,” Mr. Miller said. “BI is the oldest and now the most mature, governed, and IT-centric of the three, focusing on the data warehouse, reporting and online analytical processing (OLAP)… I’ve always seen analytics as applied statistics/machine learning in the work world: more data-focused and computational than statistics, but less so than data science… When challenged to define the ‘point’ that separates analytics from DS, however … there’s a continuum from analytics to data science on a data/computation axis with endpoints ‘not so much’ and ‘lots.’”
The division between data science and data analytics is also difficult to discern, as the terms are often used interchangeably, conflating one with the other. IBM’s Mr. Kobielus offered a more technical distinction between data science and what he calls “advanced analytics.”
“By ‘advanced analytics,’ I’m referring to various approaches—such as predictive modeling and machine learning—that use statistical analysis and mathematics to find trends, correlations, segmentations, outliers, and other patterns in data that might not be discovered at all or as efficiently without these tools,” Mr. Kobielus told OnlineEducation.com. “The term ‘data science’ refers to the practice of using advanced analytics to engage in systematic testing of data-driven hypotheses, which may be for pure science or for practical concerns in business, government, and other spheres of activity.”
The field of data analytics is evolving, and so are the various ways it is defined. IBM describes data analytics as the use of advanced analytical techniques to capture, manage, and process large and diverse datasets. The goal: to identify insights that can help researchers, analysts, and businesses make faster, better, and more informed decisions. Data analytics can also be used to optimize information and make it more accessible, as in the case of internet search engines. According to IBM’s James Kobielus, however, data analytics is a general term for many types of data analysis.
“‘Data analytics’ is an all-encompassing term that refers to data science and advanced analytics as I’ve described, but also to reporting, dashboarding, and other analytic applications that focus on illuminating insights to be found in historical data,” Mr. Kobielus told OnlineEducation.com. He added that the field could be divided into areas like “multivariate statistical analysis, predictive modeling, data mining, text analytics, machine learning, deep learning, natural language processing, and streaming analytics.”
Data Analytics Skills and Responsibilities
Burtch Works, SAS, and other industry organizations define data analytics professionals as those who can do the following. Note that tools and other technical skills are defined more clearly elsewhere in this guide.
Data Analytics Careers
Data analytics is a growing field, one in which new career offshoots are emerging every year. While many job titles hint at a clear link to data analytics, some are less apparent. The same is true for job function: one may have to research positions carefully to understand what skills and duties are associated with each. Research from the McKinsey Global Institute suggests the following job titles are among the most prevalent for data analysts.
The definition of data science remains a point of dispute among experts. In a column he wrote for Forbes.com, researcher Gil Press suggested data science is little more than an inflated term for statistics. According to research conducted by Gartner, however, a text analysis of data scientist job descriptions suggests these professionals are generally expected to work more collaboratively, analyze larger datasets, and communicate even more effectively than statisticians. Data scientists are also more likely to have advanced experience in machine learning and computer science. They are, for example, frequently required to know technologies and programming languages like Hadoop, Python, and Java. As a result, Burtch Works reports, employers are twice as likely to require PhDs of data scientists as they are of statisticians. Aaron Gowins, data scientist and research fellow with the National Institutes of Health (NIH), discussed some of these trends with OnlineEducation.com.
“Data scientists have an unusually large breadth of knowledge,” said Mr. Gowins. “They are database administrators, mathematicians, and programmers. They are skilled at explaining complicated topics to non-experts. They have keen insight when it comes to recognizing patterns and applying abstract concepts to real-world problems and specific questions.”
Data Science Skills and Responsibilities
Data science job and skills descriptions can vary distinctly from one expert or employer to the next. According to Mr. Gowins, data scientists generally apply “[traditional] scientific methodologies and mathematical modeling techniques to understand and draw inferences from data outside hard science.”
When Burtch Works defines data scientists for research purposes, it looks for professionals who can:
Data Science Careers
Data science careers and job titles tend to sound more technical than those associated with data analytics and business intelligence. Data scientists are more likely to work in research and the sciences, for instance. The online data science community Data Science Central reports that the following job titles are examples of those one might find in the field.
The boundaries of business intelligence are not well established. Some experts consider business intelligence to be a branch of data analytics in which professionals work with smaller datasets and are more likely to use out-of-the-box data solutions. Microsoft describes business intelligence as a field in which professionals discover and analyze information to inform decision makers. A business intelligence researcher or specialist might, for example, seek to improve individual work performance and/or unit efficiency using data analytics.
Like its related disciplines, the field of business intelligence is changing as datasets become larger and analytics technologies more sophisticated. While many business intelligence programs remain primarily business-focused, a few now include some computer programming and other technical coursework. Inquidia’s Steve Miller reports that the field is also branching into sub-fields like “enterprise analytics.”
“After 25 successful years, BI is ceding to Enterprise Analytics (EA), which gives up some IT control/centralization for more flexible data integration and analytics,” said Mr. Miller. “In essence, IT still governs the core intelligence data, but allows the business to extend the data and analytical assets in ways not initially envisioned. EA is a more agile approach that leverages both the governed and ungoverned data assets as inputs for a more technically-savvy analytics team.”
Business Intelligence Careers
The business intelligence job market indicates just how diverse the field really is. Professional duties and job titles can vary widely from one position to the next. Data Science Central reports that the following are among just some of the job titles associated with business intelligence professionals.
The skills required of analytics professionals depend on the specific focus of their work, as well as on the targeted goals of individual companies and the imperatives of different industries. An analytics professional might, for instance, deploy various techniques to mine data, use advanced statistical models to analyze that data, and then employ computer programming skills to design predictive or interpretive algorithms. SafeAuto Insurance Company Data and Decision Science Analyst Chris Wetherill suggested future analytics professionals should be comfortable completing several tasks across the data analysis process.
“The most important takeaway I’ve gotten from this role is that it increasingly isn’t enough to just know SQL [programming], or to just know statistics, or to just know software engineering,” Mr. Wetherill said in an interview with OnlineEducation.com. “You need to be comfortable shepherding the data from start to finish: you need to be able to query it, analyze it, prepare it, and present it, developing out a reproducible and automated data extraction/analysis/reporting pipeline as you go, while still keeping in mind the business as a whole and how your data will be consumed and utilized within that context.”
When describing key data analytics skills, experts often cite the following:
Applied statistics is a baseline skill in data analysis and business intelligence. Professionals commonly conduct trend analyses, A/B testing, correlation analyses, profiling, and analysis of maximum likelihood estimators. Interpreting large data sets often calls for advanced statistical models and techniques, like time series predictive analyses, regression and decision trees, text analytics, and more. The online analytics publication Datanami reports that quantitative analytics professionals were once recruited primarily into the finance sector, but companies from many different industries have since followed suit.
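To make one of these techniques concrete, here is a minimal sketch of how an A/B test on two conversion rates might be evaluated with a two-proportion z-test. The visitor and conversion counts are hypothetical, the function name is chosen for this example, and the test assumes samples large enough for the normal approximation to hold.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference between two rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Standard normal CDF expressed through the error function
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Hypothetical experiment: variant A converts 200 of 5,000 visitors,
# variant B converts 250 of 5,000.
z, p_value = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=250, n_b=5000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference between the variants is unlikely to be chance alone; what counts as small enough is a decision the analyst should fix before running the test.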
Analytics is grounded in mathematical methods and concepts. While statistics is perhaps the primary discipline used in data analysis, industry education provider Udacity reports that linear algebra and multivariable calculus are the basis for a lot of the machine learning techniques data scientists and analysts are now using to mine and analyze very large and complex data sets.
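A small example may show why calculus sits underneath machine learning. The sketch below fits a line by gradient descent: it repeatedly takes the partial derivatives of the mean squared error and nudges the parameters in the opposite direction. The data, learning rate, and step count are assumptions chosen so this toy problem converges; they are not recommended settings.

```python
def gradient_descent_fit(xs, ys, learning_rate=0.05, steps=5000):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of mean((w*x + b - y)^2) with respect to w and b
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the error
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (assumed for illustration)
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = gradient_descent_fit(xs, ys)
print(f"w = {w:.4f}, b = {b:.4f}")
```

With many input variables, the same update is written in terms of vectors and matrices, which is where linear algebra comes in.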
Analytics professionals can scrape data from a myriad of sources, but it is not always useful or structured in a usable way. Data munging is the process of sorting and optimizing large datasets before they are analyzed, while data mining involves examining that data to generate new information. According to Datanami, analysts who mine data using machine learning can build and train predictive analytics applications for classification, recommendation, and system personalization.
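As a small illustration of data munging, the sketch below normalizes inconsistent text fields, drops incomplete rows, and removes duplicates before any analysis happens. The records, the field names, and the `clean_records` helper are hypothetical; real pipelines typically involve many more rules.

```python
def clean_records(raw_rows):
    """Normalize fields, drop incomplete rows, and remove duplicates."""
    seen_emails = set()
    cleaned = []
    for row in raw_rows:
        # Treat missing values (None) as empty strings, then normalize
        name = (row.get("name") or "").strip().title()
        email = (row.get("email") or "").strip().lower()
        if not name or not email:
            continue  # skip rows missing a required field
        if email in seen_emails:
            continue  # keep only the first row for each email address
        seen_emails.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned


# Hypothetical scraped records with stray whitespace, mixed case, and gaps
raw = [
    {"name": "  ada lovelace ", "email": "ADA@example.com"},
    {"name": "Ada Lovelace", "email": "ada@example.com "},
    {"name": "", "email": "nobody@example.com"},
    {"name": "Grace Hopper", "email": None},
    {"name": "alan turing", "email": "alan@example.com"},
]

cleaned = clean_records(raw)
for row in cleaned:
    print(row)
```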
Growing data sets and increasingly complex analytics processes have significantly increased demand for data analytics professionals who are trained in machine learning. According to SAS, machine learning is an analytical method that automates data analysis and model-building. Data experts use machine learning algorithms to conduct mathematical calculations on big data sets, automatically, and with increasing speed. Machine learning also helps data professionals identify hidden patterns that might otherwise be overlooked.
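To give a sense of how an algorithm can surface a pattern without being told what to look for, here is a minimal sketch of k-means clustering on one-dimensional data. It is offered as one common example of machine learning, not as the method SAS describes. The purchase amounts and starting centroids are assumed values, and production work would normally rely on an established library.

```python
def k_means_1d(values, centroids, iterations=10):
    """Cluster numbers around the given starting centroids (plain k-means)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster,
        # leaving it in place if the cluster is empty
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical purchase amounts containing two natural groups
amounts = [12, 15, 14, 13, 98, 102, 95, 101]

centroids, clusters = k_means_1d(amounts, centroids=[0, 50])
print("centroids:", centroids)
print("clusters:", clusters)
```

The algorithm separates small purchases from large ones on its own, which is the kind of grouping an analyst might then investigate further.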
Data-to-decisions (DTD) frameworks like the one established by the analytics training agency Aryng help data analysts move from collecting and analyzing data using tools like Oracle solutions or R programming, to identifying and applying important insights that improve organizational decision-making. DTDs do this by establishing a step-by-step method for working through the decision-making process, similar to how a scientist might use the scientific method. Aryng, for instance, prompts data professionals to identify the business problem, create an analysis plan with a hypothesis, collect data, gather insights, and, only then, offer recommendations.
It is important for data analysts to be able to communicate data findings to colleagues and managers so they can make more informed decisions. Data visualizations built with libraries like d3.js and ggplot can convey technical information in an accessible way. Programs such as Tableau and Qlikview allow data analysts and scientists to explore data visually by creating 2- and 3-dimensional graphics.
The large datasets that analytics professionals work with today often exceed the limits of out-of-the-box spreadsheets and statistics software solutions, leaving the analysts to code their own tools. Data scientists have used common programming languages to model, access, and visualize data for some time, but demand for the more advanced programming skills needed to leverage data architecture platforms like Hadoop and Algorithms.io has grown significantly in recent years. According to a study conducted by the analytics recruiting firm Wanted Analytics, and published by Forbes, the number of new job postings calling for applicants with both computer programming and data analysis expertise grew by more than 300% between 2013 and 2014 alone.
Analytics professionals may use a wide assortment of computer programming languages to conduct their analyses. The following are among those employers prefer or require most from data applicants.
R, Python, and Julia
Aaron Gowins told OnlineEducation.com that R and Python have “rightfully” emerged as frontrunners in the growing cache of programming languages data analysts and scientists use in the field. They often perform similar functions, but each offers its own advantages and poses different challenges. According to Mr. Gowins, choosing between them is “somewhat a matter of preference, and largely a matter of the nature of the analysis.”
R is an open-source programming language that allows data scientists and analysts to mine, sort, manipulate, and visualize large and complex datasets. Fast Company reports that R is an alternative to pricier commercial software solutions like SAS and Matlab, and it is applicable across many different fields. According to Mr. Gowins, R was written with statistics in mind and has vast libraries of built-in functionalities. He noted that though R makes it easy to build an app using its algorithms, it performs best with smaller datasets.
“For smaller datasets and more traditional statistical analyses, R provides the analyst with a breadth of options. R has tremendous documentation, mainly meaning online resources for learning. However, to some extent that is because R requires more documentation since it can be opaque and counterintuitive,” said Mr. Gowins. “Without bulky workarounds R runs entirely in memory, and, for this reason, Python dominates when it comes to out-of-memory analysis of large datasets and seamlessly integrates with modern databases.”
Python is a common and easy-to-use dynamic programming language data scientists use to store, manipulate, analyze, and visualize data. Like R, Python is an open-source language with a large community of users that can support and mentor beginners. InfoWorld and others suggest that Python is supplanting R in overall popularity.
“Python is gaining traction as the go-to language for big data science,” Mr. Gowins told OnlineEducation.com. “I find Python to be easier to learn, more flexible, and faster than R. Currently it provides better packages for natural language processing and text mining tasks. Python for application backend is highly scalable, meaning it can handle many users and large datasets. The recent advances in Python interactive publishing are the state of the art.”
Julia is a comparatively new dynamic programming language that, according to O’Reilly Media, is designed to offer a flexible alternative to the more dominant R and Python, and can be used to achieve the same ends. Because Julia is still in development, a data scientist must currently combine it with R and Python to complete certain tasks and computations. When used to conduct custom work or build advanced tools, however, it outperforms both. Mr. Wetherill told OnlineEducation.com that he believes Julia will be “the next big hit.”
Java and Scala
Java is a foundational object-oriented programming language used to create data engineering infrastructures. It is also quite popular: Fast Company reports that major organizations like Facebook, Twitter, and LinkedIn rely on Java. Java is not widely used for statistical modeling and data visualization, but it allows data analytics professionals to build large data systems fairly quickly. Hadoop is written in Java.
Scala is an open-source language that runs on the Java platform: it is similar to Java, but specialized for certain tasks. According to the Scala resource and community Scala-Lang.org, the name Scala is derived from “Scalable Language” because it grows with projects. Data scientists and analysts often use Scala to build high-level algorithms and to enable large-scale machine learning. The language also supports currying, pattern matching, and other features useful for working with data. Scala is commonly used with Apache Spark (see below for more information).
SQL, short for Structured Query Language, is a special-purpose programming language that data professionals use to communicate with relational database management systems (RDBMS). SQL is so common that the American National Standards Institute deems it the standard language for RDBMS. Analysts can use SQL to write scripts or commands that let them select, insert, and delete data, and create and drop data tables.
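The basic operations mentioned above can be sketched using Python's built-in `sqlite3` module, which runs SQL against a lightweight embedded database. SQLite stands in here for a full RDBMS server, and the `customers` table and its rows are hypothetical; exact syntax can vary slightly between database systems.

```python
import sqlite3

# Throwaway in-memory database; nothing is written to disk
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# CREATE: define a table
cur.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)

# INSERT: add rows, using placeholders instead of string formatting
cur.executemany(
    "INSERT INTO customers (name, city) VALUES (?, ?)",
    [("Ada", "London"), ("Grace", "New York"), ("Alan", "London")],
)

# SELECT: query rows that match a condition
cur.execute(
    "SELECT name FROM customers WHERE city = ? ORDER BY name", ("London",)
)
london_names = [row[0] for row in cur.fetchall()]

# DELETE: remove matching rows
cur.execute("DELETE FROM customers WHERE name = ?", ("Alan",))
cur.execute("SELECT COUNT(*) FROM customers")
remaining = cur.fetchone()[0]

# DROP: remove the table itself
cur.execute("DROP TABLE customers")
conn.close()

print(london_names, remaining)
```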
Matlab and Octave
Matlab is a proprietary computing environment and programming language used for numerical computations. Data scientists and analysts use Matlab to perform a variety of tasks, such as plotting data functions, manipulating matrices, and executing algorithms. According to the University of Wisconsin’s (UW) Cooperative Institute for Meteorological Satellite Studies, analysts can also use Matlab for data exploration, modeling, and simulations.
Octave is a platform built around a high-level programming language of the same name. It is commonly used for solving linear and non-linear numerical problems. It is an open-source solution created by the GNU project, a collaborative effort to establish an open-source operating system, and a popular alternative to proprietary platforms and languages such as SAS and Matlab.
A computing platform is an application, program, operating system, server, or other software or hardware environment that data analysts use to code and execute algorithms. Analytics professionals may conduct different types of analyses to achieve very different goals, but there is significant overlap in the platforms used to achieve them. Mr. Kobielus advises future analysts and data scientists to become well acquainted with some of the most prevalent tools.
“Every data scientist should master a core group of modeling, development, and visualization tools used on your data science projects, as well as the platforms used for storage, execution, integration and governance of big data in your organization,” Mr. Kobielus told OnlineEducation.com. “Depending on your environment, and the extent to which data scientists work with both structured and unstructured data, this may involve some combination of Spark, Hadoop, stream computing, data warehousing, NoSQL, and other platforms.”
Analytics technologies include both proprietary and open-source platforms. The following are among some of the most common in today’s market.
Hadoop is an open-source platform that supports processes involving data that is mixed, complex, and computationally intensive, like clustering and targeting analyses. The first Hadoop-like platform was created by Google more than a decade ago. Google’s platform was eventually incorporated into an open-source project named Nutch. Hadoop is a spin-off from that project. Hadoop’s popularity has grown recently across a number of different markets.
Apache Spark is an open-source big data framework developed by UC Berkeley’s AMPLab in 2009. It was designed to be a fast, easy-to-use alternative to similar technologies like Hadoop, although Spark can also be used within a Hadoop framework. When the two are combined, Spark can help applications on Hadoop clusters run up to 100 times faster. Analytics professionals can use Spark to write applications in the Python, Java, or Scala programming languages.
MySQL and NoSQL
MySQL is an open-source RDBMS that organizations commonly use to manage data stored in tables on community servers or commercial enterprise servers. Data analysts access data stored on MySQL databases. While MySQL databases have dominated the marketplace for decades, Tech Republic reports that an alternative called NoSQL is increasingly popular for big data and real-time web applications.
NoSQL provides an alternative to the MySQL RDBMS databases that have traditionally dominated the market. It is accessible online and through mobile applications, and data is not stored in the tabular structures used with MySQL. Developers of the NoSQL MongoDB database describe it as more scalable and agile, allowing analytics professionals to minimize downtime when handling large datasets. The name is derived from “no SQL,” or “not only SQL,” because NoSQL databases support multiple query languages.
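To make the contrast with tabular storage concrete, the sketch below shows the same order represented as flat rows and as a single nested document of the kind a document database such as MongoDB stores. Plain Python structures stand in for both; no database is involved, and the field names and the `rows_to_document` helper are hypothetical.

```python
import json

# Relational style: one flat row per line item, with order fields repeated
order_rows = [
    {"order_id": 1, "customer": "Ada", "item": "keyboard", "qty": 1},
    {"order_id": 1, "customer": "Ada", "item": "mouse", "qty": 2},
]


def rows_to_document(rows):
    """Fold the flat rows for one order into a single nested document."""
    first = rows[0]
    return {
        "order_id": first["order_id"],
        "customer": first["customer"],
        # Line items become a nested list instead of separate table rows
        "items": [{"item": r["item"], "qty": r["qty"]} for r in rows],
    }


order_document = rows_to_document(order_rows)
print(json.dumps(order_document, indent=2))
```

Keeping related data together in one document is one reason such stores can be convenient for applications whose records do not fit neatly into fixed columns.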
Qlikview and Tableau
Qlikview and Tableau are designed to help data analytics professionals translate numerical sets into 2- or 3-dimensional charts, graphs, and images. The ability to translate data into visually accessible graphical formats often helps to facilitate the identification of patterns within that data. In addition, these visualization tools may help analysts communicate their findings to non-technical colleagues.
Matlab, SAS, and Octave
Matlab is a proprietary computing environment data analysts use to perform numerical computations using common mathematical notations. According to UW, data scientists can use the computer programming language also called Matlab to develop algorithms; create models and simulations; and explore and visualize data.
Statistical Analysis System, more commonly known by its acronym, SAS, is a software suite for advanced analytics developed at North Carolina State University and, later, by the SAS Institute. Data analysts and business intelligence professionals use SAS to mine, update, access, and manage data, and to conduct statistical analyses. While SAS software uses a graphical interface designed for non-technical users, data analysts can perform more advanced tasks by using the SAS programming language.
Octave is a high-level programming language and computing environment data professionals use to perform linear and non-linear numerical computations, and to manipulate and visualize data. It is also an open-source platform, making it a popular alternative to proprietary solutions like SAS and Matlab. Octave was designed to be quite similar to Matlab, which makes it easier to move programs between the two environments.
Algorithms.io is a cloud-based platform that allows analytics professionals to use machine learning algorithms to analyze data collected in real time from devices, sensors, and machines. For example, the healthcare industry can use Algorithms.io to collect patient data from various sites to make real-time predictions regarding infection risks.
The educational requirements for data analytics professionals can vary significantly from one employer to the next. This variability can be at least partly attributed to the relative newness of formal analytics degree programs; previous generations of data analysts and scientists often earned degrees in fields like applied statistics or computer science. Some were even self-taught or held degrees in unrelated disciplines. As data analysis becomes more complex, however, this trend is waning. Burtch Works reports that the vast majority of analytics professionals hold advanced degrees. Colleges have responded. According to the Institute for Advanced Analytics, the number of master’s degrees in data analytics, data science, and business intelligence grew by more than 500 percent between 2011 and 2013. That includes both campus-based and online analytics degree options. For the same reasons to which Ms. Dyché alluded previously, however, differentiating among specific disciplines can be difficult.
OnlineEducation.com carefully researched the field of online master’s degrees in analytics and other largely equivalent programs, and created a classification methodology based on what skills are taught rather than what the programs are called. Some of these programs may not be in disciplines like computer science, mathematics, or business, and yet they teach the core analytics skills one would need to enter the workforce. One can visit the following program pages to browse and learn more about online master’s degrees in analytics:
While research suggests that demand for analytics professionals more than doubled over the course of four years, new college graduates have not been the only source of qualified candidates for employment. A significant number of data scientists and analysts moved into the field from other jobs and industries. According to Market Watch, more than 1 in 5 adults wanted to change careers in 2014 (the latest dataset available), which was a notable increase compared to the previous five years. Those who choose to pursue analytics careers have a number of options available to them even if their work and educational backgrounds are in other fields. Mr. Kobielus suggested having the right skills—however acquired—is key.
“No specific credential is perfectly suited to all data scientist scenarios,” Mr. Kobielus told OnlineEducation.com. “[Scholastic] credentials might themselves be unnecessary if candidates have the core aptitudes–curiosity, intellectual agility, statistical fluency, research stamina, scientific rigor, skeptical nature–that distinguish the best data scientists. Some brilliant data scientists may be largely self-taught.”
Master’s programs in data analytics, data science, and business intelligence can provide the formal training one might need to qualify for some analyst positions. Many colleges offer online analytics degrees, which may be helpful for professionals who want to continue in their current jobs full-time while attending school. Some employers will also accept a professional certification from vendors like IBM or SAS in lieu of a formal degree. Candidates earn certificates by passing an exam, although they can typically learn the material however they choose. According to Mr. Gowins, there are a number of online resources that can help prospective analysts learn and practice working with data.
“There is no reason that someone who is curious about data science should ever be bored. The internet is full of data, projects, and tutorials on any topic you can imagine,” Mr. Gowins told OnlineEducation.com. “If you’re willing to put in the effort, online learning and data ‘boot camps’ can provide all the skills you need to succeed.”
Readers who would like to learn more about online master’s degrees in data analytics, data science, and business intelligence can follow the links to Online Education’s analytics program pages, listed in the previous section, for more information.
As the tremendous applications and potential benefits of data analysis become better known, new opportunities have emerged for aspiring analysts. Research indicates that demand for analytics expertise quadrupled over the course of just two years. The pool of organizations that put a premium on data analytics hasn’t just grown; it has diversified. Analysts can now find positions in companies of all sizes and sectors.
“Our customers aren’t confined to a single industry; rather, our work revolves on companies that are committed to evidence-based decisions,” Mr. Miller told OnlineEducation.com. “That said, we do have a wealth of marketing, healthcare, and government customers, and often work with new companies whose products are data/analytics.”
While Burtch Works reports that the financial services and marketing industries that traditionally hired the largest share of analytics professionals continue to do so, the proportions are shifting as data talent expands into new fields.
The amount of data organizations can now capture has grown significantly, and the new and innovative ways that data can be analyzed, manipulated, and interpreted continue to expand. The following list offers insights into some of the ways major industries use data analytics.
There is no universal skill set employers look for, but advancements in data collection and analysis have spurred demand for more varied and, often, more technical expertise. Analytics professionals are becoming more independent, as knowledge gaps between data professionals and their business colleagues grow, leaving some analysts to manage more of the data-to-decision process than they might have just a few short years ago. Mr. Wetherill described how the need to bridge the gap between data analytics and business teams shaped his own work.
“We’ve found that we’re taking in far more data than we have the capacity to digest, and so a big part of what I do is develop mechanisms to comb through those data, curate them in some way that our different business units are able to easily work with, and create dashboards, reports, and models to allow end users to interact with those data in ways that they haven’t previously been able,” Mr. Wetherill told OnlineEducation.com.
Another challenge posed by growing datasets and advancing analytics technologies is the tendency for many analyses to exceed the limits of what existing software and other out-of-the-box products can accommodate. Analytics professionals may, in turn, need to program their own solutions. Today’s data analytics, data science, and, to a lesser extent, business intelligence degree programs frequently include coursework in areas like data mining, machine learning, and computer programming. Employers increasingly look for these skills as well.
“Computation skills are no doubt critical for Inquidia’s new college hires, along with a solid general quantitative background,” Mr. Miller told OnlineEducation.com. “It’s easier to teach statistics and machine learning to programmers than it is to teach stats majors how to compute.”
Over time, however, languages can change to facilitate new functionalities and best practices. Analytics professionals may also work with platforms and computing environments that require different programming knowledge. According to SafeAuto’s Chris Wetherill, no one language is necessarily most important for prospective analysts to learn.
“In my experience, every language has its time in the spotlight sooner or later: for instance, in the data world, SAS seems to be on the decline; R is the cool kid on the block, followed by Python; Julia will probably be the next hit,” said Mr. Wetherill. “Knowing how to write code is no longer an option in the field; however, programming languages come and go, and the specific ones that you learn are generally far less important than the skills and mindset that accompany them. Anyone entering the field of data science will benefit from the ability to decompose complex, thorny problems and to approach them in an almost modular, piecemeal way.”
Which analytics computing technologies will reign in tomorrow’s market remains an open question. Like coding languages, platforms come, go, and change: today’s standard tools of the field may be supplanted by new environments and software solutions that simplify data analysis processes or provide ever more advanced capabilities. SAS Institute’s Jill Dyché suggested that learning specific platforms is less important than mastering key analytics concepts.
“The advice I give to people who are getting involved with analytics is this: Learn the data. It’s much easier to learn to use the latest software tool than it is to truly understand how to access, format, correct, annotate, provision, and explain important business information,” said Ms. Dyché. “Put another way: Software tools come and go, but data is forever.”
Data analytics’ changing technical landscape can make it difficult for prospective data analysts and scientists to know what skills they will need in the future job market. Mr. Miller, who actively hires new analysts for his consulting firm, shared his thoughts on how new professionals can develop—and maintain—the knowledge necessary to succeed.
“My advice to aspiring Steve Millers emerging from college is to take their evidence-based obsession to a ‘data’ company or a consultancy like Inquidia to hone their data/computational/analytical skills,” said Mr. Miller. “They should commit to life-long learning, investing heavily in [continuing] education.”
As Mr. Wetherill and Ms. Dyché alluded to previously, data analysts’ and scientists’ personal attitudes and habits can influence their success just as much as, if not more than, their core analytics skills. It can be more difficult to change how one analyzes and solves problems than it is to learn a new programming language or computing platform. The following characteristics are among those that experts, including those who spoke with OnlineEducation.com, associate with successful analysts.
The ability to solve, and enjoy solving, problems is perhaps one of the most prevalent qualities experts and employers look for in data analysts and scientists. This is true regardless of their specialties and job titles. Mr. Wetherill suggested problems are an inherent part of the field. Deciding to work through them is what analytics professionals generally do.
“So much of the work that we do is a little fuzzy around the edges: we have a general sense of where we want to end up and what business problems we want to address, but no clear path to get there,” said Mr. Wetherill. He said analysts must be able to identify the right data for the problem, decide how to capture and store it, choose how to process and manipulate it, and so forth. “Individually, none of these is a particularly difficult question to answer, but without decomposing the problem and stepping through each of its component parts, it suddenly becomes a much thornier thing to tackle. Almost without exception, the people who have most thrived in this position and the applicants who have been the most successful have been the ones with this mindset, even if they had never written a line of code before starting with us.”
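The stepwise approach Mr. Wetherill describes can be illustrated with a short Python sketch. It is only one possible way to break a question into small, separately testable steps; the records, field names, and function names below are hypothetical and do not come from SafeAuto or any interviewee.

```python
from statistics import mean

# Hypothetical raw records; the field names are invented for illustration.
RAW_RECORDS = [
    {"region": "north", "claims": "12", "notes": "call back"},
    {"region": "south", "claims": "7", "notes": ""},
    {"region": "north", "claims": "", "notes": "missing count"},
    {"region": "south", "claims": "9", "notes": ""},
]


def select_fields(records):
    """Step 1: keep only the fields relevant to the question being asked."""
    return [{"region": r["region"], "claims": r["claims"]} for r in records]


def clean(records):
    """Step 2: drop rows with a missing count and convert text to integers."""
    return [
        {"region": r["region"], "claims": int(r["claims"])}
        for r in records
        if r["claims"].strip()
    ]


def summarize(records):
    """Step 3: compute the average number of claims per region."""
    by_region = {}
    for r in records:
        by_region.setdefault(r["region"], []).append(r["claims"])
    return {region: mean(values) for region, values in by_region.items()}


def run_pipeline(records):
    """Chain the steps; each one can be inspected or replaced on its own."""
    return summarize(clean(select_fields(records)))


summary = run_pipeline(RAW_RECORDS)
print(summary)
```

Because each step is its own function, a problem in the final summary can be traced back to the stage that introduced it rather than debugged as one large block.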
Programming is an increasingly necessary skill for data scientists and analysts, who may be required to create their own programs when existing tools are insufficient. Just as important, however, is one’s ability to understand how to structure and create models and algorithms. According to Mr. Kobielus, these skills are separate from one’s technical background.
“One thing you should avoid is the trap of thinking you’re not ‘technical’ enough to be a success in analytics generally or data science specifically,” said Mr. Kobielus. “If you have a structured mind, are good with numbers and data, and are adept at critical thinking, you have all the aptitudes to succeed in this field.”
While structured thinking and problem-solving skills help data analytics professionals overcome challenges, doing so can take time and patience. Whether one is pinpointing data needs, finding and scraping data, optimizing data, creating an ideal algorithm, or interpreting analytical outcomes, each step brings its own problems to solve. Mr. Gowins suggests good data scientists not only endure such difficulties but are inspired by them.
“Research inherently takes place at the limit of current knowledge, and this is true for the traditional sciences as well as data science. For each problem solved, a new one awaits,” Mr. Gowins told OnlineEducation.com. “A successful data scientist must combine a love of learning with a high tolerance for enduring frustrating and time-consuming problems… Good scientists specialize in solving problems, and the description above appeals to them.”
Research can be a large and critical part of the data lifecycle, particularly among data scientists and analysts posed with new, complex problems and exceptionally large datasets. Mr. Gowins told OnlineEducation.com that the difficult problems one faces seldom have a clear method to solve them; data scientists must continually research and explore new approaches.
“I enjoy challenges, and there’s nothing quite like the challenge of learning something new,” Mr. Gowins said. “I’ve learned that bringing an optimistic and adventurous outlook is a strength. People count on me to extract meaning from seemingly lifeless data, and develop insights that can answer questions, solve problems, and make useful predictions. To me, that’s about as good as it gets.”
One of the ways organizations use data analytics is to inform decisions that might improve company management, efficiency, productivity, and profits. As datasets have become larger—and analytical methods more sophisticated—traditional business intelligence “dashboarding” is often not sufficient. While demand for data analysts and scientists with a more technical skill set has grown, their colleagues may not have the same expertise. Analytics professionals must be able to translate and communicate complicated quantitative data in a way that is accessible to non-technical workers. Mr. Wetherill discussed the importance of having excellent communication skills.
“We might have some truly great predictive modeling going on behind the scenes, but that isn’t relevant to expose to our users, nor would it be helpful trying to explain the model or its underlying assumptions and limitations to them,” said Mr. Wetherill. “Instead, we need to do our best to control, non-intrusively to the end user, for cases where those assumptions are violated and leave them with just a nice, clean summary of how our systems are performing.”
Data analytics professionals may need to learn how to communicate complicated findings to business-oriented co-workers, but doing so requires a certain degree of business sense as well. Analysts who possess business acumen can use it to understand clearly what the goal is, what types of data they need, and how that data should be analyzed. It can also make team discussions more effective. Mr. Wetherill advises future data scientists and analysts to explore the theories and techniques associated with business management in general, and with the particular industry they are entering.
“I think the best thing any data analyst or scientist can do is to take the time to learn the business,” said Mr. Wetherill. “Nothing ever ends with the data: they will always be a jumping off point for, and driver of, at the end of the day, business decisions. And approaching these problems with an understanding of how the data are actually used and required in the context of the business will only ever make the applications that you develop more powerful.”
Being able to demonstrate key analytics knowledge and skills may be a significant factor in employers’ hiring decisions, but according to experts interviewed by OnlineEducation.com, there are other steps one can take to make oneself more marketable. According to Mr. Miller, for example, Inquidia looks for candidates who have sought internships, or those who demonstrate a keen interest in data that he believes can be developed.
“An obsession with data and a data/computation-intensive internship are differentiators,” said Mr. Miller. “Inquidia’s a professional services firm, so strong interpersonal skills and the ability to function in a collegial environment are critical. We also look for those we think can develop as consultants, first building tech, analytic and business skills, then progressing in project/client management.”
Networking and mentorships can also improve one’s chances of being hired as a data analyst. Candidates can collaborate with experienced analytics professionals to learn more about the field, refine or build upon data skills, and get a better sense for which types of positions and employers would appeal to them most. Some prospective analysts may also connect with mentors who might be in a position to provide employment recommendations. Mr. Gowins advises new and rising analytics professionals to look to online communities and blogs for support.
“Getting involved in the data science community should be a priority,” said Mr. Gowins. “Anyone even casually interested in data science absolutely must visit datasciencecentral.com. It’s a terrific site packed with valuable information and resources.”
Mr. Kobielus also discussed ways future professionals can find careers and success in analytics, including those who may not have originally intended to enter the field. He told OnlineEducation.com that his relationship with data analytics emerged from an interest in trying new things; he was entirely self-taught. He credited his success to a willingness to learn and keep his skills current.
“I didn’t plan or anticipate any of this. I’ve just worked hard, stayed flexible, and always tried to keep my mind a bit ahead of the curve,” said Mr. Kobielus. “My only advice is to never stop re-educating yourself. You have to do that in order to stay employable in a fast-moving world.”
About the Author: Aimee Hosler is a long-time education journalist and founder of a website for K-6 families and educators committed to experiential and maker education. Aimee also serves as Director of Communications for a non-profit community laboratory and makerspace.
“The 9 Best Languages For Crunching Data,” Anna Nicolaou, Fast Company
“400 Categorized Job Titles for Data Scientists,” Vincent Granville, Data Science Central
“8 Skills You Need to be a Data Scientist,” Dave Holtz, Udacity
“9 Must-Have Skills to Land Top Big Data Jobs in 2015,” Alex Woodie, Datanami
“Analytics,” Gartner IT Glossary, Gartner
“Big Data: A Game Changer In The Retail Sector,” Bernard Marr, Forbes
“Big data: The next frontier for innovation, competition, and productivity,” McKinsey Global Institute
“Degree Programs in Analytics and Data Science,” Institute for Advanced Analytics, North Carolina State University
“Demystifying Engineering Analytics,” Cognizant
“Distinguishing Analytics, Business Intelligence, Data Science,” Shannon Kempe, Dataversity
“Emerging Role of the Data Scientist and the Art of Data Science,” Doug Laney, Lisa Kart, Gartner
“GNU Octave,” The GNU Project
“Hadoop: What it is, how it works, and what it can do,” James Turner, O’Reilly Media
“How do I get my first job in data science?” Will Stanton, Will Stanton’s Data Science Blog
“How To Explain Hadoop To Non-Geeks,” Jeff Bertolucci, InformationWeek
“In data science, the R language is swallowing Python,” Matt Asay, InfoWorld
“Information technology, database languages, SQL multimedia and applications packages,” American National Standards Institute
“Julia’s Role in Data Science,” John Myles White, O’Reilly Media
“Machine Learning,” SAS Institute
“MSA Infographic, 2007-2015,” Institute for Advanced Analytics, North Carolina State University
“NoSQL databases eat into the relational database market,” Matt Asay, TechRepublic
“The Power of Analytics to Transform Government,” Hugo Moreno, Forbes
“Programmers See Biggest Growth in Big Data Hiring,” Wanted Analytics
“The Burtch Works Study,” Burtch Works
“Three Steps To Identify The Analytics Training You Need,” Piyanka Jain, Forbes
“Typical U.S. worker now lasts 4.6 years on the job,” Quentin Fottrell, MarketWatch
“What is business intelligence?” TechNet, Microsoft
“What is Data Science?” Data Science at NYU, New York University
“What Is MATLAB?” Cooperative Institute of Meteorological Satellite Studies, University of Wisconsin
“WHAT IS SCALA: A Scalable Language,” Martin Odersky, Scala-Lang.org
“Where Big Data Jobs Are in 2015 – Midyear Update,” Louis Columbus, Forbes
“Why ‘data scientist’ is this year’s hottest job,” Katherine Noyes, PCWorld