Data management

The data lifecycle

Data management comprises all disciplines related to handling data as a valuable resource. It is the practice of managing an organization's data so that it can be analyzed for decision-making.[1]

Concept

The concept of data management emerged alongside the evolution of computing technology. In the 1950s, as computers became more prevalent, organizations began to grapple with the challenge of organizing and storing data efficiently. Early methods relied on punch cards and manual sorting, which were labor-intensive and prone to errors. The introduction of database management systems in the 1970s marked a significant milestone, enabling structured storage and retrieval of data.[2]

By the 1980s, relational database models revolutionized data management, emphasizing the importance of data as an asset and fostering a data-centric mindset in business. This era also saw the rise of data governance practices, which prioritized the organization and regulation of data to ensure quality and compliance. Over time, advancements in technology, such as cloud computing and big data analytics, have further refined data management, making it a cornerstone of modern business operations.[3][4]

As of 2025, data management encompasses a wide range of practices, from data storage and security to analytics and decision-making, reflecting its critical role in driving innovation and efficiency across industries.[5][6]

Topics in Data Management

The Data Management Body of Knowledge (DMBoK), developed by the Data Management Association (DAMA), outlines key knowledge areas that serve as the foundation for modern data management practices and offers a framework for organizations to manage data as a strategic asset.

Data Governance

Data governance refers to the policies, procedures, and standards that ensure data is managed consistently and responsibly across an organization. In enterprise contexts, governance involves aligning stakeholders across business units, defining data ownership, and quantifying the benefits of improved data quality. Effective governance frameworks often include data stewardship roles, escalation protocols, and cross-functional oversight committees to maintain trust and accountability in data use.

Data Architecture

Data architecture consists of models, policies, rules, and standards that govern which data is collected and how it is stored, arranged, integrated, and put to use in data systems and in organizations.[7] Data is usually one of several architecture domains that form the pillars of an enterprise architecture or solution architecture.[8]

Data Architecture focuses on designing the overall structure of data systems. It ensures that data flows are efficient and that systems are scalable, adaptable, and aligned with business needs.

Data Modeling and Design

This area centers on creating models that logically represent data relationships. It’s essential for both designing databases and ensuring that data is structured in a way that facilitates analysis and reporting.

Data Storage and Operations

This area deals with the physical storage of data and its day-to-day management. It covers everything from traditional data centers to cloud-based storage solutions, as well as ensuring efficient data processing.

Data Integration and Interoperability

This area ensures that data from various sources can be shared and combined across multiple systems, which is critical for comprehensive analytics and decision-making.
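
As a minimal sketch of combining data from two sources, the following Python example merges records from two hypothetical systems on a shared key. The names crm_rows, billing_rows, customer_id, and integrate are assumptions made for illustration; real integration work typically also involves schema mapping and conflict resolution, which are not shown here.

```python
# Hypothetical records from two systems that share a customer_id key.
crm_rows = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Grace"},
]
billing_rows = [
    {"customer_id": 1, "balance": 120.0},
    {"customer_id": 3, "balance": 45.5},
]


def integrate(left, right, key):
    """Combine two lists of dicts on a shared key (a full outer join)."""
    merged = {}
    for row in left + right:
        # Rows with the same key value are folded into one combined record.
        merged.setdefault(row[key], {}).update(row)
    return [merged[k] for k in sorted(merged)]


combined = integrate(crm_rows, billing_rows, "customer_id")
for row in combined:
    print(row)
```

Customers that appear in only one system are kept with the fields that system provides, so gaps between sources remain visible rather than being silently dropped.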

Document and Content Management

This area focuses on managing unstructured data such as documents, multimedia, and other content, ensuring that it is stored, categorized, and easily retrievable.

Data Warehousing, Business Intelligence and Data Analytics

This area involves consolidating data into repositories that support analytics, reporting, and business insights.

Data warehousing

In computing, a data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis and is a core component of business intelligence.[9] Data warehouses are central repositories of data integrated from disparate sources. They store current and historical data organized in a way that is optimized for data analysis, generation of reports, and developing insights across the integrated data.[10] They are intended to be used by analysts and managers to help make organizational decisions.[11]

Business intelligence

Business intelligence (BI) consists of strategies, methodologies, and technologies used by enterprises for data analysis and management of business information to inform business strategies and business operations.[12][13] Common functions of BI technologies include reporting, online analytical processing, analytics, dashboard development, data mining, process mining, complex event processing, business performance management, benchmarking, text mining, predictive analytics, and prescriptive analytics.

Data mart

A data mart is a structure/access pattern specific to data warehouse environments. The data mart is a subset of the data warehouse that focuses on a specific business line, department, subject area, or team.[14] Whereas data warehouses have an enterprise-wide depth, the information in data marts pertains to a single department. In some deployments, each department or business unit is considered the owner of its data mart, including all the hardware, software, and data.[15] This enables each department to isolate the use, manipulation, and development of their data. In other deployments where conformed dimensions are used, this business unit ownership will not hold true for shared dimensions like customer, product, etc.
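
The idea of a data mart as a department-specific subset of a warehouse can be sketched with Python's built-in sqlite3 module. The table warehouse_sales, the view retail_mart, and the sample rows are hypothetical. A database view is only one possible way to expose such a subset; many deployments use physically separate stores instead.

```python
import sqlite3

# In-memory database standing in for an enterprise data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_sales (department TEXT, product TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?)",
    [
        ("retail", "widget", 10.0),
        ("retail", "gadget", 25.0),
        ("wholesale", "widget", 400.0),
    ],
)

# The "data mart": a view restricted to a single department's rows.
conn.execute(
    "CREATE VIEW retail_mart AS "
    "SELECT product, amount FROM warehouse_sales WHERE department = 'retail'"
)

retail_rows = conn.execute(
    "SELECT product, amount FROM retail_mart ORDER BY product"
).fetchall()
print(retail_rows)
```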

Data analytics

Data analysis is the process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.[16] Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, and is used in different business, science, and social science domains.[17] In today's business world, data analysis plays a role in making decisions more scientific and helping businesses operate more effectively.[18]

Data mining

Data mining is the process of extracting and finding patterns in massive data sets involving methods at the intersection of machine learning, statistics, and database systems.[19] Data mining is an interdisciplinary subfield of computer science and statistics with an overall goal of extracting information (with intelligent methods) from a data set and transforming the information into a comprehensible structure for further use.[19][20][21][22] Data mining is the analysis step of the "knowledge discovery in databases" process, or KDD.[23] Aside from the raw analysis step, it also involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.[19]

Data science

Data science is an interdisciplinary academic field[24] that uses statistics, scientific computing, scientific methods, processing, scientific visualization, algorithms and systems to extract or extrapolate knowledge from potentially noisy, structured, or unstructured data.[25]

Metadata Management

Metadata management handles data about data, including definitions, origin, and usage, to enhance the understanding and usability of the organization’s data assets.
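
A metadata record covering definition, origin, and usage could look like the following Python sketch. The class DatasetMetadata, the catalog dictionary, and the helper functions are hypothetical names chosen for illustration and do not correspond to any particular metadata tool.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetMetadata:
    """Data about a dataset: what it means, where it came from, who uses it."""
    name: str
    definition: str
    origin: str
    used_by: list[str] = field(default_factory=list)


# A very small in-memory metadata catalog keyed by dataset name.
catalog: dict[str, DatasetMetadata] = {}


def register(entry: DatasetMetadata) -> None:
    catalog[entry.name] = entry


def record_usage(name: str, consumer: str) -> None:
    # Track which reports or systems depend on the dataset.
    catalog[name].used_by.append(consumer)


register(
    DatasetMetadata(
        name="orders",
        definition="One row per customer order",
        origin="billing system export",
    )
)
record_usage("orders", "monthly revenue report")
print(catalog["orders"])
```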

Data Quality Management

Data quality is not only a technical concern but a strategic enabler of trust, compliance, and decision-making. High-quality data supports consistent reporting, regulatory adherence, and customer confidence. Enterprise data management programs often define quality metrics such as precision, granularity, and timeliness, and link these to business outcomes.
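
One way to make a quality metric such as timeliness measurable is sketched below in Python. The function timeliness, the field name updated_at, and the two-day threshold are assumptions for illustration; actual programs define their own metrics and thresholds.

```python
from datetime import datetime, timedelta


def timeliness(records, as_of, max_age):
    """Return the share of records updated within max_age of as_of."""
    if not records:
        return 0.0
    fresh = sum(1 for r in records if as_of - r["updated_at"] <= max_age)
    return fresh / len(records)


# Hypothetical records with their last update timestamps.
records = [
    {"id": 1, "updated_at": datetime(2025, 1, 9)},
    {"id": 2, "updated_at": datetime(2025, 1, 1)},
    {"id": 3, "updated_at": datetime(2025, 1, 10)},
]

score = timeliness(records, as_of=datetime(2025, 1, 10), max_age=timedelta(days=2))
print(f"timeliness: {score:.2f}")
```

A score like this can then be tracked over time and tied to a business target, for example a minimum share of fresh records required before a report is published.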

Reference and master data management

Reference data comprises standardized codes and values for consistent interpretation across systems. Master data management (MDM) governs and centralizes an organization’s critical data, ensuring a unified, reliable information source that supports effective decision-making and operational efficiency.
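
The interplay of reference data and master data can be sketched as follows. The code table COUNTRY_CODES, the functions standardize_country and build_golden_record, and the "first non-empty value wins" rule are all assumptions made for this example; real MDM systems apply configurable matching and survivorship rules.

```python
# Reference data: a standardized code list shared across systems.
COUNTRY_CODES = {"DE": "Germany", "FR": "France"}


def standardize_country(raw):
    """Map a free-text country value to its reference code, or None."""
    value = raw.strip()
    if value.upper() in COUNTRY_CODES:
        return value.upper()
    for code, name in COUNTRY_CODES.items():
        if name.lower() == value.lower():
            return code
    return None


def build_golden_record(duplicates):
    """Merge duplicate records; the first non-empty value per field wins."""
    golden = {}
    for record in duplicates:
        for key, value in record.items():
            if value not in (None, "") and key not in golden:
                golden[key] = value
    if "country" in golden:
        golden["country"] = standardize_country(golden["country"])
    return golden


# Two hypothetical versions of the same customer from different systems.
duplicates = [
    {"id": 7, "name": "Ada Lovelace", "country": ""},
    {"id": 7, "name": "A. Lovelace", "country": "germany"},
]
golden = build_golden_record(duplicates)
print(golden)
```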

Data security

Data security refers to a comprehensive set of practices and technologies designed to protect digital information and systems from unauthorized access, use, disclosure, modification, or destruction. It encompasses encryption, access controls, monitoring, and risk assessments to maintain data integrity, confidentiality, and availability.

Data privacy

Data privacy involves safeguarding individuals’ personal information by ensuring its collection, storage, and use comply with consent, legal standards, and confidentiality principles. It emphasizes protecting sensitive data from misuse or unauthorized access while respecting users' rights.

Data management as a foundation of information management

The distinction between data and derived value is illustrated by the "information ladder" or the DIKAR model.

The DIKAR model - Data, Information, Knowledge, Action, Result. A model showing the relationship between data, information and knowledge.

The "DIKAR" model stands for Data, Information, Knowledge, Action, and Result. It is a framework used to bridge the gap between raw data and actionable outcomes. The model emphasizes the transformation of data into information, which is then interpreted to create knowledge. This knowledge guides actions that lead to measurable results. DIKAR is widely applied in organizational strategies, helping businesses align their data management processes with decision-making and performance goals. By focusing on each stage, the model ensures that data is effectively utilized to drive informed decisions and achieve desired outcomes. It is particularly valuable in technology-driven environments.[26]
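
The five DIKAR stages can be illustrated with a toy Python pipeline. The temperature readings, the 25-degree limit, and all function names are hypothetical, and the model itself does not prescribe any particular implementation.

```python
LIMIT = 25.0  # hypothetical threshold in degrees Celsius


def to_information(data):
    """Data -> Information: summarize raw readings."""
    return {"mean": sum(data) / len(data), "max": max(data)}


def to_knowledge(info, limit):
    """Information -> Knowledge: interpret the summary against a limit."""
    return "overheating" if info["max"] > limit else "normal"


def to_action(knowledge):
    """Knowledge -> Action: decide what to do."""
    return "start_cooling" if knowledge == "overheating" else "no_action"


def to_result(action, readings_after, limit):
    """Action -> Result: check the measurable outcome after acting."""
    return {"action": action, "within_limit": max(readings_after) <= limit}


readings = [18.5, 19.0, 26.5, 27.0]        # Data: raw facts
info = to_information(readings)
knowledge = to_knowledge(info, LIMIT)
action = to_action(knowledge)
readings_after = [18.0, 18.5, 21.0, 22.0]  # hypothetical follow-up readings
result = to_result(action, readings_after, LIMIT)
print(info, knowledge, action, result)
```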

The "information ladder" illustrates the progression from data (raw facts) to information (processed data), knowledge (interpreted information), and ultimately wisdom (applied knowledge). Each step adds value and context, enabling better decision-making. It emphasizes the transformation of unstructured inputs into meaningful insights for practical use.[27]

Data management in research

In research, data management refers to the systematic process of handling data throughout its lifecycle. This includes activities such as collecting, organizing, storing, analyzing, and sharing data to ensure its accuracy, accessibility, and security.

Effective data management also involves creating a data management plan (DMP) that addresses issues such as ethical considerations, compliance with regulatory standards, and long-term preservation. Proper management enhances research transparency, reproducibility, and the efficient use of resources, ultimately contributing to the credibility and impact of research findings. It is a critical practice across disciplines to ensure data integrity and usability both during and after a research project.[28]

Big Data

Big data refers to the collection and analysis of massive sets of data. While big data is a recent phenomenon, the requirement for data to aid decision-making traces back to the early 1970s with the emergence of decision support systems (DSS). These systems can be considered the initial iteration of data management for decision support.[29]

Financial and economic outcomes

Studies indicate that customer transactions account for a 40% increase in the data collected annually, which means that financial data has a considerable impact on business decisions. Modern organizations therefore use big data analytics to identify 5 to 10 new data sources that can help them collect and analyze data for improved decision-making. Jonsen (2013) explains that organizations using average analytics technologies are 20% more likely to gain higher returns than competitors that have not introduced any analytics capabilities into their operations. IRI also reported that the retail industry could gain more than $10 billion each year from the implementation of modern analytics technologies. On this basis, the following hypothesis can be proposed: economic and financial outcomes can influence how organizations use data analytics tools.

References

  1. ^ "What Is Data Management? Importance & Challenges | Tableau". www.tableau.com. Retrieved 2023-12-04.
  2. ^ Foote, Keith D. (19 February 2022). "A Brief History of Data Management". DATAVERSITY. Retrieved 21 September 2025.
  3. ^ Redman, Thomas C. (2008). Data Driven: Profiting from Your Most Important Business Asset. Harvard Business Press. ISBN 9781422119129.
  4. ^ Pearce, Guy (30 August 2023). "Three Lessons from 100 Years of Data Management". ISACA Journal. 4. Retrieved 21 September 2025.
  5. ^ Kramer, Robert (20 Mar 2025). "The State Of Enterprise Data Management In Early 2025". Forbes. Retrieved 8 Apr 2025.
  6. ^ DAMA International (2024). DAMA-DMBOK: Data Management Body of Knowledge (2nd Revised Edition). Technics Publications. ISBN 9781634622349.
  7. ^ Business Dictionary - Data Architecture Archived 2013-03-30 at the Wayback Machine; TOGAF 9.1 - Phase C: Information Systems Architectures - Data Architecture
  8. ^ What is data architecture GeekInterview, 2008-01-28, accessed 2011-04-28
  9. ^ Dedić, Nedim; Stanier, Clare (2016). Hammoudi, Slimane; Maciaszek, Leszek; Missikoff, Michele M.; Camp, Olivier; Cordeiro, José (eds.). An Evaluation of the Challenges of Multilingualism in Data Warehouse Development. International Conference on Enterprise Information Systems, 25–28 April 2016, Rome, Italy (PDF). Proceedings of the 18th International Conference on Enterprise Information Systems (ICEIS 2016). Vol. 1. SciTePress. pp. 196–206. doi:10.5220/0005858401960206. ISBN 978-989-758-187-8. Archived (PDF) from the original on 2018-05-22.
  10. ^ "What is a Data Warehouse? | Key Concepts | Amazon Web Services". Amazon Web Services, Inc. Retrieved 2023-02-13.
  11. ^ Rainer, R. Kelly; Cegielski, Casey G. (2012-05-01). Introduction to Information Systems: Enabling and Transforming Business, 4th Edition (Kindle ed.). Wiley. pp. 127, 128, 130, 131, 133. ISBN 978-1118129401.
  12. ^ Dedić N. & Stanier C. (2016). "Measuring the Success of Changes to Existing Business Intelligence Solutions to Improve Business Intelligence Reporting" (PDF). Lecture Notes in Business Information Processing. Vol. 268. Springer International Publishing. pp. 225–236. doi:10.1007/978-3-319-49944-4_17. ISBN 978-3-319-49943-7. S2CID 30910248.
  13. ^ "What Is Business Intelligence (BI)? | IBM". IBM.
  14. ^ "What Is a Data Mart?". IBM. 2021-10-21. Retrieved 2024-12-16.
  15. ^ Inmon, William (July 18, 2000). "Data Mart Does Not Equal Data Warehouse". DMReview.com. Archived from the original on April 20, 2011.
  16. ^ "Transforming Unstructured Data into Useful Information", Big Data, Mining, and Analytics, Auerbach Publications, pp. 227–246, 2014-03-12, doi:10.1201/b16666-14, ISBN 978-0-429-09529-0, retrieved 2021-05-29
  17. ^ "The Multiple Facets of Correlation Functions", Data Analysis Techniques for Physical Scientists, Cambridge University Press, pp. 526–576, 2017, doi:10.1017/9781108241922.013, ISBN 978-1-108-41678-8, retrieved 2021-05-29
  18. ^ Xia, B. S., & Gong, P. (2015). Review of business intelligence through data analysis. Benchmarking, 21(2), 300-311. doi:10.1108/BIJ-08-2012-0050
  19. ^ a b c "Data Mining Curriculum". ACM SIGKDD. 2006-04-30. Archived from the original on 2013-10-14. Retrieved 2014-01-27.
  20. ^ Clifton, Christopher (2010). "Encyclopædia Britannica: Definition of Data Mining". Archived from the original on 2011-02-05. Retrieved 2010-12-09.
  21. ^ Hastie, Trevor; Tibshirani, Robert; Friedman, Jerome (2009). "The Elements of Statistical Learning: Data Mining, Inference, and Prediction". Archived from the original on 2009-11-10. Retrieved 2012-08-07.
  22. ^ Han, Jiawei; Kamber, Micheline; Pei, Jian (2011). Data Mining: Concepts and Techniques (3rd ed.). Morgan Kaufmann. ISBN 978-0-12-381479-1.
  23. ^ Fayyad, Usama; Piatetsky-Shapiro, Gregory; Smyth, Padhraic (1996). "From Data Mining to Knowledge Discovery in Databases" (PDF). Archived (PDF) from the original on 2022-10-09. Retrieved 17 December 2008.
  24. ^ Donoho, David (2017). "50 Years of Data Science". Journal of Computational and Graphical Statistics. 26 (4): 745–766. doi:10.1080/10618600.2017.1384734. S2CID 114558008.
  25. ^ Dhar, V. (2013). "Data science and prediction". Communications of the ACM. 56 (12): 64–73. doi:10.1145/2500499. S2CID 6107147. Archived from the original on 9 November 2014. Retrieved 2 September 2015.
  26. ^ Ologbosere, Oluwatosin; Akeem, Amodu (2021). "Critical Overview of Information Management, DIKAR Model and Technology in The 21st Century". International Journal of Business Management and Technology. 5 (1): 35–39. doi:10.1108/eum0000000007150.
  27. ^ Savoie, Michael J (2016). "2 The Information Ladder". Building Successful Information Systems : Five Best Practices to Ensure Organizational Effectiveness and Profitability (2 ed.). New York: Business Expert Press. ISBN 9781631574658. OCLC 960738491.
  28. ^ Coates, Heather (2014). Coble, Zach; Ho, Adrian (eds.). "Ensuring research integrity: The role of data management in current crises". College and Research Libraries News. 75 (11). Association of College and Research Libraries: 598–601. doi:10.5860/crln.75.11.9224. hdl:1805/5521.
  29. ^ Watson, Hugh J.; Marjanovic, Olivera (2013). "Big Data: The Fourth Data Management Generation". Business Intelligence Journal; Seattle. 18 (3): 4–8.

Further reading

  • Sebastian-Coleman, Laura (2018). Navigating the Labyrinth: An Executive Guide to Data Management. New York: Morgan Kaufmann.
  • The DAMA Guide to the Data Management Body of Knowledge (DMBoK): Data Management for Practitioners and Professionals (2 ed.). DAMA International, Technics Publications. 2017.