Data profiling, also called data discovery or data auditing, is about discovering what data is available in the client organization and the characteristics of that data. It is a critical diagnostic phase that arms the client with information about the quality of the available data. This information helps determine not only what data exists in the client organization, but also how valid and usable that data is.
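As an illustration of what a basic profile might report, the following minimal sketch counts rows, missing values, and distinct values per column. The sample records, the column names, and the `profile_column` helper are hypothetical and chosen only for this example; real profiling tools report far more than this.

```python
from collections import Counter


def profile_column(rows, column):
    """Summarize one column: row count, missing values, distinct values, most common value."""
    values = [row.get(column) for row in rows]
    # Treat None and the empty string as missing (an assumption for this sketch).
    present = [v for v in values if v not in (None, "")]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0][0] if counts else None,
    }


# Hypothetical sample records standing in for a client table.
customers = [
    {"id": 1, "country": "US", "email": "a@example.com"},
    {"id": 2, "country": "US", "email": ""},
    {"id": 3, "country": "DE", "email": None},
    {"id": 4, "country": "", "email": "d@example.com"},
]

profile = {col: profile_column(customers, col) for col in ("id", "country", "email")}
for col, stats in profile.items():
    print(col, stats)
```

A profile like this answers both questions raised above: which columns exist, and how complete and varied their contents are.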
Data analysis examines enterprise data from a business perspective in order to identify patterns and establish relationships. Similar to “data mining,” data analysis techniques help virtually any business gain greater insight into trends within its operations, its industry, and its customer base.
Data modeling is the analysis of data objects and their relationships to other data objects. It is often the first step in database design and object-oriented programming, because the designers first create a conceptual model of how data items relate to each other. Data modeling involves a progression from conceptual model to logical model to physical schema.
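The progression from conceptual model to logical model to physical schema can be sketched in a few lines. The Customer and Order entities, their attributes, and the table names below are hypothetical, and SQLite is used only because it ships with Python; this is one possible rendering, not a prescribed method.

```python
import sqlite3
from dataclasses import dataclass

# Conceptual model (stated in words): a Customer places zero or more Orders.


# Logical model: entities, attributes, and keys, independent of any database product.
@dataclass
class Customer:
    customer_id: int
    name: str


@dataclass
class Order:
    order_id: int
    customer_id: int  # refers to Customer.customer_id
    total: float


# Physical schema: concrete tables and column types for one engine (SQLite here).
DDL = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    total       REAL NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)

alice = Customer(1, "Alice")
order = Order(10, alice.customer_id, 25.0)
conn.execute("INSERT INTO customer VALUES (?, ?)", (alice.customer_id, alice.name))
conn.execute(
    "INSERT INTO customer_order VALUES (?, ?, ?)",
    (order.order_id, order.customer_id, order.total),
)

# Join the two tables on their shared key to recover the relationship.
rows = conn.execute(
    "SELECT c.name, o.total FROM customer c JOIN customer_order o USING (customer_id)"
).fetchall()
print(rows)
```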
A framework for organizing the interrelationships of data (based on an organization’s missions, functions, goals, objectives, and strategies) that provides the basis for incremental, ordered design and development of systems based on successively more detailed levels of data modeling.
Data cleansing, also referred to as data scrubbing, is the act of detecting and removing or correcting a database’s dirty data, that is, data that is incorrect, out-of-date, redundant, incomplete, or incorrectly formatted. The goal of data cleansing is not just to clean up the data in a database but also to bring consistency to different sets of data that have been merged from separate databases.
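A minimal sketch of these ideas, assuming records merged from two sources that format names and email addresses differently. The field names, the sample records, and the `cleanse` helper are hypothetical, and dropping incomplete records is just one possible policy; correcting them is another.

```python
def cleanse(records):
    """Normalize formatting, drop incomplete records, and remove duplicates."""
    seen_emails = set()
    cleaned = []
    for rec in records:
        # Fix formatting: collapse whitespace and use consistent casing.
        name = " ".join(rec.get("name", "").split()).title()
        email = rec.get("email", "").strip().lower()
        if not name or not email:
            continue  # incomplete record
        if email in seen_emails:
            continue  # redundant record (same email after normalization)
        seen_emails.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned


# Hypothetical records merged from two separate databases.
merged = [
    {"name": "  alice   smith ", "email": "Alice@Example.com"},
    {"name": "Alice Smith", "email": "alice@example.com "},
    {"name": "Bob Jones", "email": ""},
    {"name": "carol WHITE", "email": "carol@example.com"},
]

result = cleanse(merged)
for rec in result:
    print(rec)
```

Normalizing before comparing is what lets the two differently formatted Alice Smith records be recognized as the same entry.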
Data augmentation is the application of methodologies and techniques for adding new data to source data where required data is either partially represented or completely missing. It is commonly achieved by correlating industry-specific key data or by employing computational algorithms that derive relationships through data composition and matching. The approaches for matching data elements have a basis in statistics and probability. Augmentation typically draws on data sources outside the immediate scope of the departmental or divisional data sources being operated on for a given data initiative.
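The matching step can be sketched as follows. Note the substitution: this example uses the string similarity ratio from Python's `difflib` as a simple stand-in for the statistical and probabilistic matching described above. The `augment` helper, the field names, the sample data, and the 0.8 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity ratio between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def augment(source, reference, match_on, field, threshold=0.8):
    """Fill a missing `field` in source rows from the best-matching reference row."""
    result = []
    for row in source:
        enriched = dict(row)  # copy so the source rows are left untouched
        if not enriched.get(field):
            best = max(reference, key=lambda ref: similarity(row[match_on], ref[match_on]))
            # Only accept the match when it is close enough (threshold is an assumption).
            if similarity(row[match_on], best[match_on]) >= threshold:
                enriched[field] = best[field]
        result.append(enriched)
    return result


# Hypothetical departmental data with gaps, and an external reference source.
source = [
    {"company": "Acme Corp.", "industry": ""},
    {"company": "Globex", "industry": "Energy"},
    {"company": "Zzyzx Ltd", "industry": None},
]
reference = [
    {"company": "ACME Corp", "industry": "Manufacturing"},
    {"company": "Initech", "industry": "Software"},
]

result = augment(source, reference, match_on="company", field="industry")
for row in result:
    print(row)
```

Records with no sufficiently close match are left unfilled rather than guessed, which keeps low-confidence matches out of the augmented data.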
Metadata involves the capture and presentation of the meaning and context behind the data in an organization. Metadata can be descriptions and/or definitions about the data and is key to transforming it into information that is useful to your business.
Many technology-based business initiatives within an organization fail because the data behind the initiative is not well understood. Metadata can be critical to the successful delivery of a variety of initiatives, including data warehouses, data integration, service-oriented architecture, data migration, and customer relationship management (CRM).
The problems are generally not technological in nature but semantic: the meaning or context of the information is not well understood. Metadata provides the context that allows the business to better interpret its data.
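One common way to capture this kind of context is a data dictionary. The sketch below is a minimal, hypothetical example: the field names (`cust_stat`, `rev_ytd`, `seg`), their descriptions, and the `describe` helper are invented for illustration and do not come from any particular system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldMetadata:
    description: str
    unit: str = ""
    allowed_values: tuple = ()


# Hypothetical data dictionary: cryptic column names mapped to business meaning.
DATA_DICTIONARY = {
    "cust_stat": FieldMetadata("Customer account status", allowed_values=("A", "I")),
    "rev_ytd": FieldMetadata("Revenue, year to date", unit="USD"),
}


def describe(record):
    """Render a raw record in business terms using the data dictionary."""
    lines = []
    for field, value in record.items():
        meta = DATA_DICTIONARY.get(field)
        if meta is None:
            lines.append(f"{field}: {value} (no metadata recorded)")
            continue
        text = f"{meta.description}: {value}"
        if meta.unit:
            text += f" {meta.unit}"
        if meta.allowed_values and value not in meta.allowed_values:
            text += " (unexpected value)"
        lines.append(text)
    return lines


report = describe({"cust_stat": "X", "rev_ytd": 1200, "seg": 3})
for line in report:
    print(line)
```

The raw record is identical in both views; only the metadata makes the unexpected status code and the undocumented `seg` field visible as semantic problems.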