Answer: Computer science and data science are distinct yet related technical fields that involve the study and use of computers and computer programming. Computer science is the broader of the two fields. It encompasses theoretical and applied research into the functional components of computers, including hardware, software, operating systems, networking protocols, and other elements of information technology (IT) systems architecture. Data science is a newer and more narrowly focused interdisciplinary field that combines programming skills with statistical modeling and data analytics methodologies. While the knowledge associated with computer science and data science overlaps, computer scientists and data scientists apply their technical knowledge and programming skills in different ways. Computer scientists are primarily concerned with the inner workings of computers, and they use their skills and knowledge to write and debug code, develop and modify software applications, and/or design and optimize computer systems and networks. Data scientists study database systems and probability modeling in order to solve complex problems that involve collecting, sorting, and analyzing large datasets.
The relationship between computer science and data science is foundational: data science emerged from advances in computer science, specifically in data collection, storage, and processing power. As the data capabilities of computer systems have increased, mathematicians and statisticians have been able to tackle and solve increasingly complex data analytics problems. These problems require large amounts of raw information (i.e., “big data”) to be collected, cleaned, sorted, merged, processed, and otherwise manipulated, yielding descriptive and predictive models that provide insights into trends and patterns in economics, politics, consumer behavior, and other fields of inquiry. This process of solving problems using statistical reasoning, algorithmic programming, and computer modeling is the essence of data science. The research and development that led to the emergence of data science, and that continues to expand our understanding of IT systems and increase the capabilities of computer systems, forms the basis of computer science.
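To make the workflow described above more concrete, here is a minimal Python sketch of raw data being cleaned, sorted, merged, and turned into a descriptive summary and a simple predictive model. It is an illustration only: the sales records, the population lookup table, and all function names are hypothetical, and a real project would more likely rely on dedicated libraries and far larger datasets.

```python
from statistics import mean

# Hypothetical raw records: (region, month, sales). None marks a missing value.
raw_sales = [
    ("north", 1, 100.0),
    ("north", 2, 120.0),
    ("north", 3, None),   # incomplete record, dropped during cleaning
    ("north", 4, 160.0),
    ("south", 1, 80.0),
    ("south", 2, 90.0),
]

# Hypothetical lookup table to merge in: region -> population (thousands).
population = {"north": 50, "south": 40}


def clean(rows):
    """Cleaning step: drop records whose sales value is missing."""
    return [row for row in rows if row[2] is not None]


def merge(rows, lookup):
    """Merging step: join each record with the lookup table (sales per capita)."""
    return [
        (region, month, sales, sales / lookup[region])
        for region, month, sales in rows
    ]


def describe(rows, region):
    """Descriptive model: average sales for one region."""
    return mean(row[2] for row in rows if row[0] == region)


def fit_line(xs, ys):
    """Predictive model: least-squares line y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    slope = numerator / denominator
    return slope, y_bar - slope * x_bar


cleaned = sorted(clean(raw_sales))   # sorted by region, then month
merged = merge(cleaned, population)

north = [row for row in merged if row[0] == "north"]
slope, intercept = fit_line([row[1] for row in north], [row[2] for row in north])

north_mean = describe(merged, "north")
forecast_month_5 = slope * 5 + intercept  # extrapolated sales for month 5

print(f"North average sales: {north_mean:.1f}")
print(f"North forecast for month 5: {forecast_month_5:.1f}")
```

In this toy dataset the north region's remaining records lie on a straight line, so the fitted trend extrapolates to roughly 180 for month 5. Handling the missing record before fitting is the kind of data preparation step the paragraph above refers to.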
One way to contextualize the difference between computer science and data science is to consider the knowledge and skills required to work in these fields, and to examine the training and instruction typically offered by computer science and data science programs at the baccalaureate level and in graduate programs. While data science is not exclusively a subfield within computer science, training in data science incorporates some of the same technical knowledge and programming proficiencies that are integral to computer science. This is reflected in undergraduate curricula for computer science and data science majors, which often include some of the same courses. For example, courses in calculus, statistics, algorithms, coding, software engineering, and operating systems are commonly required for computer science and data science majors. In fact, earning a bachelor’s degree in computer science in preparation for a master’s in data science degree program, a private data science training program, or a data science certification program is one pathway to becoming a data scientist.
However, there are also many ways in which computer science and data science curricula diverge from one another. For example, a computer science curriculum may include advanced coursework in application development, operating systems, and network security, while a data science curriculum would more likely feature courses in data analysis, data mining, and probability modeling.
These differences become more evident at the master’s level. Although some master’s in computer science programs offer specializations, concentrations, or tracks in data science and analytics, the curriculum in a data science program is narrower and more specialized than a general computer science curriculum. For example, while computer scientists and data scientists commonly learn to use several programming languages (such as Python, Java, and C), statistical programming languages such as R and SAS are central to data science and may not be part of a computer science curriculum. Similarly, while computer scientists study in depth the various components of computer systems (hardware and software), data scientists focus more narrowly on database and data warehousing systems, as well as on the use of processing and visualization tools (such as Hadoop, Apache Spark, and Tableau) to refine and display data.
Computer science is a broad field that encompasses many areas of specialization, including data science. The Viterbi School of Engineering at the University of Southern California (USC) offers a Master of Science in Computer Science (MSCS) program that illustrates this point. The program has a general track that features three required core courses followed by four electives and a research project, but students in the program can also opt for one of eight concentrations. These concentrations include: Data Science; Game Development; Computer Security; Computer Networks; Software Engineering; Intelligent Robotics; Multimedia & Creative Technologies; and High Performance Computing & Simulation. Each of the concentrations in USC’s MSCS program has its own set of required courses, some of which overlap with other concentrations and some of which do not. For example, students in USC’s MSCS in Data Science program are required to complete the same three courses that are part of the general MSCS core curriculum.
The table below offers a side-by-side comparison of typical courses in a master’s in computer science general curriculum and courses commonly offered as part of a master’s in data science curriculum. It is important to note that some courses listed under master’s in computer science may also be part of a master’s in data science curriculum, and that some courses listed under master’s in data science may be offered as electives in a computer science program.
| Master’s in Computer Science | Master’s in Data Science |
|---|---|
| Design & Analysis of Algorithms | Foundations of Algorithms |
| Software Engineering | Statistical Methods & Probability Modeling |
| Database System Design | Principles of Database Systems |
| Computer Architecture | Data Warehousing for Data Science |
| Operating Systems Design | Deep Learning |
| Java & Python Programming | Big Data Processing Using Hadoop |
| Computer Networking | Information Visualization |
| Information Security & Cryptology | Optimizing-Based Data Analysis |
| Machine Learning | Machine Learning & Artificial Intelligence |
| Compiler Design & Construction | Natural Language Processing |
In addition to the courses listed above, most master’s in computer science programs require or encourage students to enroll in electives that cover one or more of the numerous areas of specialization in the field.
Data science is often treated as a computer science specialization, and many master’s in computer science programs offer one or more data science or data analytics classes. Master’s in data science programs may also feature specializations in areas like artificial intelligence, biomedical data science, business analytics, data engineering, data management, and/or data modeling. However, data science is a more narrowly defined field with fewer discrete areas of specialization than computer science.
For more detailed information on data science degree programs, refer to our Online Bachelor’s in Data Analytics/Data Science Programs and Online Master’s in Data Science Programs pages.