Question: What Is a Data Architect?

Answer: A data architect is an information technology (IT) specialist who designs and manages data systems, sets policies for how data is stored and accessed, coordinates various data sources within an organization, and integrates new data technologies into existing IT infrastructures. Data architects may act as liaisons between the IT side of an organization and other departments, aligning data collection and distribution policies with the organization’s operational and strategic objectives. They also typically work with members of a data team, which may include data engineers, data miners, data scientists, and data analysts, in areas related to data collection, data storage, data security, and data systems access.

Data Architecture

Data architecture refers both to the IT systems that facilitate the collection, storage, distribution, and consumption of data within an organization, and to the policies that govern how data is collected, stored, distributed, and accessed within an organization. From an IT standpoint, an organization’s data architecture typically includes data storage and warehousing systems (e.g., databases), computer networks that serve as data pipelines and provide access to stored data, and software platforms and analytics applications that process data in order to further an organization’s goals. In terms of organizational structure, data architecture may encompass personnel who have access to relevant and potentially sensitive data, policies governing data access, and the protocols for the secure distribution of data to relevant parties, including analytics specialists, operations managers, marketing departments, and others, depending on the size and type of the organization.

The Open Group, a consortium of IT industry leaders committed to “the development of open, vendor-neutral technology standards and certifications” in the realm of data technologies, maintains a framework for understanding data architecture in the context of other aspects of an organization’s IT infrastructures. The Open Group Architecture Framework (TOGAF) describes four types of architecture:

  • Business architecture, which defines the business strategy, governance, organization, and key business processes of the organization.
  • Data architecture, which describes the structure of an organization’s logical and physical data assets and the associated data management resources.
  • Applications architecture, which provides a blueprint for the individual systems to be deployed, the interactions between the application systems, and their relationships to the core business processes of the organization.
  • Technical architecture, or technology architecture, which describes the hardware, software, and network infrastructure needed to support the deployment of core, mission-critical applications.

In practice, it can be difficult to draw a clear line between data architecture and business, applications, and technical architectures, depending on the nature of an organization, its goals, and its size. As a consequence, it can be useful to conceptualize data architecture more broadly as a logical outgrowth of a business strategy or business architecture, which itself may follow from a larger strategic plan, sometimes referred to as enterprise architecture.

The Role of the Data Architect

In its TOGAF guidelines, the Open Group delineates three primary areas of concern for data architects: data management, data migration, and data governance. These are broad concerns that may require the attention of a data architecture group or department in larger organizations. In order to address these concerns effectively, data architects rely on a deep knowledge of contemporary data technology systems, including common operating and database systems, networking protocols, data analytics tools and methods, business intelligence software, and other data-centric elements of IT infrastructures.

Data engineers are typically tasked with the technical challenge of constructing and maintaining an organization’s data storage and distribution system, while data analysts and data scientists generally handle functions related to data modeling, data interpretation, and data reporting, though in practice these roles may overlap. The primary responsibility of data architects, however, involves constructing and maintaining the technical infrastructures and policy frameworks for the secure and efficient collection, storage, and distribution of data within an organization.

The Data Management Association (DAMA International) publishes an International Guide to Data Management Body of Knowledge (DMBOK) and administers a Certified Data Management Professional credentialing program. DAMA defines data architecture as “the overall structure of data and data-related resources as an integral part of the enterprise architecture,” and delineates several knowledge areas, in addition to data storage systems, that are central for data architects:

  • Cloud-based data
  • Linked data architecture
  • Mobile data
  • Open data
  • Semantic data architecture
  • Web-based data

In the “Skills Framework” articulated by the Open Group in its TOGAF guidelines, there are seven general skill areas that are considered central for data and/or IT architects:

  1. Generic Skills – typically comprising leadership, teamwork skills, inter-personal skills, etc.
  2. Business Skills and Methods – typically comprising business cases, business process, strategic planning, etc.
  3. Enterprise Architecture Skills – typically comprising modeling, building block design, applications and role design, systems integration, etc.
  4. Program or Project Management Skills – typically comprising the management of change within a business, as well as project management methods and tools, etc.
  5. IT General Knowledge Skills – typically comprising brokering applications, asset management, migration planning, SLAs, etc.
  6. Technical IT Skills – typically comprising software engineering, security, data interchange, data management, etc.
  7. Legal Environment – typically comprising data protection laws, contract law, procurement law, fraud, etc.

TOGAF acknowledges that “‘IT Architecture’ and ‘IT Architect’ are widely used but poorly defined terms in the IT industry today.” As a result, professional responsibilities, skills, and knowledge areas for data architects may vary depending on the employer, the job, and the nature of an organization’s data systems and IT infrastructures.

How to Become a Data Architect

Just as there is no clear-cut, industry-wide definition of data architecture, there is no clearly defined pathway to becoming a data architect. As a general rule, data architects are professionals who have formal training and/or professional experience in IT management, computer programming, and data systems engineering, as well as in the processes by which data is mined, sorted, stored, and analyzed. Data architects typically interact with and respond to the needs of non-technical managers and business administrators within an organization, which may require some training in professional communication. It may also be helpful for data architects to have familiarity and/or experience with business intelligence systems, data mining tools, and data analytics operations.

Training to become a data architect might begin with an undergraduate degree in computer science, computer engineering, or a related field. A graduate from a bachelor’s or associate degree program who possesses a strong background in computer programming and IT systems may find entry-level employment in a data-intensive field, such as IT administration, computer programming, data mining, or data analytics. These types of work experience can be a pathway to a career in data architecture.

While master’s programs in data architecture are rare, there are graduate programs that provide training and instruction in computer engineering, IT systems management, and business data systems (i.e., business intelligence and analytics), many of which may include advanced coursework in data architecture, data warehousing, and data engineering. Master’s programs in these and similar technical fields can provide data professionals with the academic credentials and practical skills required to advance in the field of data architecture. There are also several professional credentials and certification programs that may be advantageous for data architects, including:

  • DAMA International’s Certified Data Management Professional (CDMP) certification
  • The Hortonworks Data Flow Certified NiFi Architect (HDFCNA) certification
  • The IBM Certified Data Architect – Big Data certification
  • The Salesforce Certified Data Architecture and Management Designer credential
  • The Open Group Certified Architect (Open CA) credential

Steps to Becoming a Data Architect

As noted, there is no formalized pathway for individuals who want to become professional data architects. However, there are several common pathways, each made up of a number of steps, which may include one or more of the steps listed below.

  • Earn a bachelor’s degree in computer science or a related field
  • Gain experience in IT administration and/or computer programming
  • Complete a master’s program in computer science, computer engineering, business intelligence, or IT management
  • Work in the field of business data management, learning about database systems, data analytics processes, and the data architecture of businesses and other types of organizations
  • Apply for certification in data architecture from a respected vendor or industry group
