σ (sigma or s)

Lowercase Greek letter that stands for standard deviation. The symbol "σ" refers to the standard deviation of an entire population of items. The symbol "s" refers to the standard deviation of a sample of items. (Larry English)
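The distinction can be sketched with Python's statistics module, using illustrative data: pstdev divides by N (population σ), while stdev divides by N − 1 (sample s), so s is slightly larger than σ for the same data.

```python
import statistics

data = [4.0, 8.0, 6.0, 5.0, 3.0, 7.0]  # illustrative measurements

sigma = statistics.pstdev(data)  # population standard deviation (divide by N)
s = statistics.stdev(data)       # sample standard deviation (divide by N - 1)

# s > sigma for the same data, because dividing by N - 1 inflates the
# estimate to compensate for estimating the mean from the sample itself.
```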


Σ (Sigma)

Uppercase Greek letter that stands for the summation of a group of numbers. (Larry English)


λ (lambda)

The Greek letter “lambda”, used to represent the mean of a Poisson distribution.


μ (mu)

The Greek letter “mu”, used to represent the mean of a population.


6 Sigma (6σ or 6s)

See Six Sigma.



Abstract

a concise and systematic summary of the key ideas of a book, article, speech, or any other kind of relevant information. See also: knowledge compression. (Martin Eppler)


Abstraction

The process of moving from the specific to the general by neglecting minor differences or stressing common elements. Also used as a synonym for summarisation. (Martin Eppler)

Accessibility

  1. Capable of being reached, capable of being used or seen. (Martin Eppler)
  2. The characteristic / dimension of being able to access data when it is required. (Larry English)

Accuracy

Has two definitions:

  1. more commonly, it is a description of systematic errors, a measure of statistical bias;
  2. alternatively, ISO defines accuracy as describing both types of observational error above (preferring the term trueness for the common definition of accuracy).

(Wikipedia Accuracy and precision article)

Accuracy to reality

A characteristic / dimension of information quality measuring the degree to which a data value (or set of data values) correctly represents the attributes of the real-world object or event. (Larry English)

Accuracy to surrogate source

A measure of the degree to which data agrees with an original, acknowledged authoritative source of data about a real-world object or event, such as a form, document, or unaltered electronic data received from outside the organisation. See also Accuracy. (Larry English)


Activation

A term that designates activities that make information more applicable and current, and its delivery and use more interactive and faster; a process that increases the usefulness of information by making it more vivid and organising it in a way that it can be used directly without further repackaging. (Martin Eppler)


Aggregation

The process of associating objects of different types together in a meaningful whole. Also called composition. (Larry English)


Algorithm

A set of statements or a formula to calculate a result or solve a problem in a defined set of steps. (Larry English)


Alias

A secondary and non-standard name or alternate name of an enterprise-standard business term, entity type or attribute name, used only for cross reference of an official name to a legacy or software package data name, e.g., Vendor is an alias for Supplier. (Larry English)


ANSI

Acronym for American National Standards Institute, the U.S. body that sets standards.


Applicability

the characteristic of information to be directly useful for a given context, information that is organised for action. (Martin Eppler)


Application

A collection of computer hardware, computer programs, databases, procedures, and knowledge workers that work together to perform a related group of services or business processes. (Larry English)

Application architecture

A graphic representation of a system showing the process, data, hardware, software, and communications components of the system across a business value chain. (Larry English)

Archival database

A copy of a database saved in its exact state for historical purposes, recovery, or restoration. (Larry English)

Artificial Intelligence (AI)

The capability of a system to perform functions normally associated with human intelligence, such as reasoning, learning, and self-improvement. (Larry English)


Association

See Relationship.

Associative entity type

An entity type that describes the relationship of a pair of entity types that have a many-to-many relationship or cardinality. For example, COURSE COMPLETION DATE has meaning only in the context of the relationship of a STUDENT and COURSE OFFERING entity types. (Larry English)

Asynchronous replication

Replication in which a primary data copy is considered complete once the update transaction completes, and secondary replicated data copies are queued to be updated as soon as possible or on a predefined schedule. (Larry English)

Atomic value

An individual data value representing the lowest level of meaningful fact. (Larry English)


Attribute

An inherent property, characteristic, or fact that describes an entity or object. A fact that has the same format, interpretation, and domain for all occurrences of an entity type. An attribute is a conceptual representation of a type of fact that is implemented as a field in a record or data element in a database file. (Larry English)

Attributive entity type

An entity type that cannot exist on its own and contains attributes describing another entity. An attributive entity type resolves a one-to-many relationship between an entity type and a descriptive attribute that may contain multiple values. Also called characteristic or dependent entity type. (Larry English)

Audit trail

Data that can be used to trace activity such as database transactions. (Larry English)


Authorisation

The process of verifying that a person requesting a resource, such as data or a transaction, has authority or permission to access that resource. (Larry English)


Availability

A percentage measure of the reliability of a system, indicating the percentage of time the system or data is accessible or usable, compared to the amount of time the system or data should be accessible or usable. (Larry English)



Backup and recovery

To restore a database to its state at a previous point in time. Restoration is achieved:

  1. from an archived or a snapshot copy of the database at a specified time; or
  2. from an archived copy of a database, applying the logged update activity of changes made since that archived copy was taken.

(Larry English)

Believability

the quality of information and its source to evoke credibility based on the information itself or the history or reputation of the source. (Martin Eppler)


Benchmarking

The process of analysing and comparing an organisation’s processes to those of other organisations to identify Best practices. (Larry English)

Best practice

A process, standard or component that is generally recognised to produce superior results when compared with similar processes, standards or components. (Larry English)

Bias

  1. Statistical error resulting in the distortion of measurement data caused by conscious or unconscious prejudice or faulty measurement technique, such as an incorrect calibration of measurement equipment. (Larry English)
  2. A vested interest, or strongly held paradigm or condition that may skew the results of sampling, measuring, or reporting the findings of a quality assessment. For example, if information producers audit their own information quality, they will have a bias to overstate its quality. If data is sampled in such a way that it does not reflect the entire population sampled, the sample result will be biased. (Larry English)
  3. In this context, an unconscious distortion in the interpretation of information. (Martin Eppler)
Biased sampling

Sampling procedures that result in a sample that is not truly representative of the population sampled. (Larry English)


See Confidence interval.

Boyce/Codd Normal Form (BCNF)
  1. A relation R is in Boyce/Codd normal form (BCNF) if and only if every determinant is a candidate key. (Larry English)
  2. A table is in BCNF if every attribute (or set of attributes) that uniquely identifies other attributes describing an entity is a candidate key of that entity. (Larry English)
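As an illustration with hypothetical data, consider a relation (student, course, instructor) in which each instructor teaches exactly one course. The dependency {instructor} → {course} then holds, but {instructor} is not a candidate key, so the relation is not in BCNF; the helper below simply checks functional dependencies by inspecting the rows.

```python
rows = [
    ("Ann", "Maths",   "Dr Lee"),
    ("Ben", "Maths",   "Dr Lee"),
    ("Ann", "Physics", "Dr Ray"),
]
COLS = ("student", "course", "instructor")

def determines(rows, lhs, rhs):
    """True if the attributes in lhs functionally determine those in rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[COLS.index(a)] for a in lhs)
        val = tuple(row[COLS.index(a)] for a in rhs)
        if seen.setdefault(key, val) != val:  # same key, different value?
            return False
    return True

# {instructor} -> {course} holds, yet {instructor} does not determine
# student, so it is a determinant but not a candidate key: BCNF is violated.
violates_bcnf = (determines(rows, ("instructor",), ("course",))
                 and not determines(rows, ("instructor",), ("student",)))
```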
Business application model

A graphic illustration of the conceptual application systems, both manual and automated, including their dependencies, required to perform the processes of an organisation. (Larry English)

Business information resource data

The set of information resource data that must be known to information producers and knowledge workers in order to understand the meaning of information, the business rules that govern its quality, and the stakeholders who create or require it. (Larry English)

Business information steward

A business subject-matter expert designated and accountable for overseeing some parts of data definition for a collection of data for the enterprise, such as data definition integrity, legal restriction compliance standards, information quality standards, and authorisation security. (Larry English)

Business intelligence (BI)

The ability of an enterprise to act intelligently through the exploitation of its information resources. (Larry English)

Business intelligence (BI) environment

Quality information in stable, flexible databases, coupled with business-friendly software tools that provide knowledge workers timely access to, effective analysis of, and intuitive presentation of the right information, enabling them to take the right actions or make the right decisions. (Larry English)

Business process

A synonym for value chain, the term is used to differentiate a value chain of activities from a functional process or functional set of activities. (Larry English)

Business process model

A graphic and descriptive representation of business processes or value chains that cut across functions and organisations. The model may be expressed in different levels of detail, including decomposition into successive lower levels of activities. (Larry English)

Business process reengineering

The process of analysing, redefining, and redesigning business activities to eliminate or minimise activities that add cost and to maximise activities that add value. (Larry English)

Business resource category

A business classification of data about a resource the enterprise must manage across business functions and organisations, used as a basis for high-level information modelling. The internal resource categories are human resource, financial, materials and products, facilities and tangible assets, and information. External resources include business partners, such as suppliers and distributors; customers; and external environment, such as regulation and economic factors. Also called subject area. (Larry English)

Business rule

A statement expressing a policy or condition that governs business actions and establishes data integrity guidelines. (Larry English)

Business rule conformance

See Validity.

Business term

A word, phrase, or expression that has a particular meaning to the enterprise. (Larry English)

Business value chain

See Value chain.


Candidate key

A key that can serve to uniquely identify occurrences of an entity type. A candidate key must have two properties:

  1. Each occurrence or record must have a different value of the key, so that a key value identifies only one occurrence;
  2. No attribute in the key can be eliminated without nullifying the first property.

(Larry English)


Cardinality

The number of occurrences that may exist between occurrences of two related entity types. The cardinalities between a pair of related entity types are: one to one, one to many, or many to many. See Relationship. (Larry English)


CASE

Acronym for Computer-Aided Systems Engineering: the application of automated technologies to business and information modelling and software engineering. (Larry English)

Case study

An empirical inquiry that investigates a contemporary phenomenon within its real-life context; careful and systematic observation and recording of the experience of a single organisation. (Martin Eppler)

CASS (Coding Accuracy Support System)

A system for verifying the integrity of United States addresses against a USPS-maintained database containing every mailing address in the United States. The system is concerned with just the addresses, not the people or organisations residing at those addresses. (Larry English)


Catalogue

The component of a Database Management System (DBMS) where physical characteristics about the database are stored, such as its physical design schema, table or file names, primary keys, foreign key relationships, and other data required for the DBMS to manage the data. (Larry English)


Categorisation

Here, the conscious effort to group information items together based on common features, family resemblances, rules, membership gradience, or certain group prototypes (best examples of a category). (Martin Eppler)

Cause-and-effect diagram

A chart in the shape of a “fishbone” used to analyse the relationship between error causes and error effects. The diagram, invented by Kaoru Ishikawa, shows a specific effect and its possible causes of error. The causes are drawn in six categories, each a bone on the fish. The categories are:

  1. Human (or Manpower),
  2. Methods,
  3. Machines,
  4. Materials,
  5. Measurement and
  6. Environment.

Also called a Fishbone diagram. (Q) (Larry English)

Central tendency

The phenomenon that data measured from a process generally aggregates around a value somewhere between the high and low values. (Larry English)


Champion

In Six Sigma, the executive or manager who “owns” a process to be improved, and whose role is to advocate for the improvement project, with oversight and management of critical elements, reporting project success to up-line management, and removing barriers to enable project improvement success. (Larry English)


Checklist

A technique for quality improvement to identify steps to perform or items to check before work is complete. (Larry English)


Clarity

Void of obscure language or expression, ease of understanding, interpretability. (Martin Eppler)

Class word

See Domain type.


Cleansing

See Data cleansing.

Cluster

  1. A way of storing records or rows from one or more tables together physically, based on a common key or partial key value (ER).
  2. Groups of objects that have similar characteristics or behaviours significantly different from those of other objects, discovered through data analysis or mining (Stat).

(Larry English)

Cluster sampling

Sampling a population by taking samples from a smaller number of subgroups (such as geographic areas) of the population. The subsamples from each cluster are combined to make up the final sample. For example, in sampling sales data for a chain of stores, one may choose to take a subsample of a representative subset of stores (each a cluster) into a cluster sample rather than randomly select sales data from every store. (Larry English)
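The store example can be sketched in Python with hypothetical data: a handful of stores (clusters) is chosen, a subsample is drawn within each, and the subsamples are combined into the final sample.

```python
import random

random.seed(7)  # reproducible illustration

# Hypothetical sales amounts keyed by store; each store is one cluster.
sales_by_store = {
    f"store_{i}": [round(random.uniform(10, 500), 2) for _ in range(100)]
    for i in range(20)
}

chosen_stores = random.sample(sorted(sales_by_store), k=5)  # pick 5 clusters
cluster_sample = [
    sale
    for store in chosen_stores
    for sale in random.sample(sales_by_store[store], k=20)  # subsample each
]
# The final sample combines 5 clusters x 20 sales = 100 observations.
```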

Code

  1. To represent data in a form that can be accepted by an application program.
  2. A shorthand representation or abbreviation of a specific value of an attribute.

(Larry English)


Commit

A DML command that signals the successful end of a transaction and confirms that the record(s) inserted, updated, or deleted in the database are complete. (Larry English)

Common cause

An inherent source of variation in the output of a process due to natural variation in the process. See also Special cause. (Larry English)


Communication

Here, the interchange of messages resulting in the transferral or creation of knowledge; the creation of shared understanding through interaction among two or more agents. (Martin Eppler)


Completeness

A characteristic / dimension of information quality measuring the degree to which all required data is known.

  1. Fact completeness is a measure of data definition quality expressed as the percentage of the attributes about an entity type that need to be known that are actually defined in the model and implemented in a database. For example, “80 percent of the attributes required to be known about customers have fields in a database to store the attribute values.”
  2. Value completeness is a measure of data content quality expressed as the percentage of the columns or fields of a table or file that should have values in them that in fact do. For example, “95 percent of the columns for the customer table have a value in them.” Also referred to as Coverage.
  3. Occurrence completeness is a measure of the percentage of records an information collection has, relative to the complete set it should have to represent all occurrences of the real-world objects it should know about. For example, does a Department of Corrections have a record for each Offender it is responsible to know about? (IQ)

(Larry English)
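Value completeness, for instance, reduces to a simple percentage; a sketch over hypothetical customer records, where None marks a missing value:

```python
# Hypothetical customer records; None marks a missing field value.
customers = [
    {"id": 1, "name": "Ann", "phone": "555-0101", "email": None},
    {"id": 2, "name": "Ben", "phone": None,       "email": "ben@example.test"},
    {"id": 3, "name": None,  "phone": "555-0103", "email": "cy@example.test"},
]
fields = ["id", "name", "phone", "email"]

filled = sum(1 for c in customers for f in fields if c[f] is not None)
total = len(customers) * len(fields)
value_completeness = 100.0 * filled / total  # 9 of 12 fields -> 75.0 percent
```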


Comprehensiveness

The quality of information to cover a topic to a degree or scope that is satisfactory to the information user. (Martin Eppler)

Conceptual data model

See Data model.


Conciseness

Marked by brevity of expression or statement, free from all elaboration and superfluous detail. (Martin Eppler)


Concurrency

A characteristic or dimension of information quality measuring the timing of equivalence of data stored in redundant or distributed database files. The measure of data concurrency may describe the minimum, maximum, and average information float time from when data is available in one data source to when it becomes available in another data source; or it may consist of the relative percentage of data from a data source that is propagated to the target within a specified time frame.

(Larry English)

Concurrency assessment

An audit of the timing of equivalence of data stored in redundant or distributed database files. See Equivalence. (Larry English)

Concurrency control

A DBMS mechanism of locking records used to manage multiple transactions access to shared data. (Larry English)

Conditional relationship

An association that is optional depending on the nature of the related entities or on the rules of the business environment. (Larry English)

Confidence interval

The upper and lower limits or values, or bounds on either side of a sample mean for which a confidence level is valid. (Larry English)

Confidence level

The degree of certainty, expressed as a percentage, of being sure that the value for the mean of a population is within a specific range of values around the mean of a sample. For example, a 95 percent confidence level indicates that one is 95 percent sure that the estimate of the mean is within a desired precision or range of values called a confidence interval. Stated another way, a 95 percent confidence level means that out of 100 samples from the same population, the mean of the population is expected to be contained within the confidence interval in 95 of the 100 samples. (Larry English)
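A minimal sketch, using illustrative data and the normal-approximation value z = 1.96 for a 95 percent confidence level (a t value would be more appropriate for small samples):

```python
import statistics

sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7, 12.0, 12.1]

n = len(sample)
mean = statistics.mean(sample)
sem = statistics.stdev(sample) / n ** 0.5      # standard error of the mean
z = 1.96                                       # ~95 percent confidence level
lower, upper = mean - z * sem, mean + z * sem  # the confidence interval
```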

Confidence limits

See Confidence interval.

Configuration management

The process of identifying and defining configurable items in an environment by controlling their release and any subsequent changes throughout the development life cycle; recording and reporting the status of those items and change requests; and verifying the completeness and correctness of configurable items. (Larry English)


Consensus

The agreement of a group with a judgment, decision, or data definition in which the stakeholders have participated and can say, “I can live with it.” (Larry English)

Consistency

  1. A measure of information quality expressed as the degree to which a set of data is equivalent in redundant or distributed databases. (Larry English)
  2. The condition of adhering together, the ability to be asserted together without contradiction. (Martin Eppler)

Constraint

A business rule that places a restriction on business actions and therefore restricts the resulting data. For example, “only wholesale customers may place wholesale orders.” (Larry English)


Contamination

See Information quality contamination.

Context

  1. Here, the sum of associations, ideas, assumptions, and preconceptions that influence the interpretation of information; the situation of origination or application of a piece of information. (Martin Eppler)
  2. A specific situation that defines the environment in which a piece of information originates or is interpreted. (Martin Eppler)

Contextualisation

  1. The act of adding situational meta-information to information in order to make it more comprehensible and clear and easier to judge. (Martin Eppler)
  2. A term that designates activities that make information clearer, allow to see whether it is correct for a new situation, and enable a user to trace it back to its origin (in spite of system changes); a process that adds background to information about its origin and relevance. (Martin Eppler)

A mechanism that can be used to add context to a piece of information and thus increase its interpretability. (Martin Eppler)


Control

The mechanisms used to manage processes to maintain acceptable performance. (Larry English)

Control chart

A graphical device for reporting process performance over time for monitoring process quality performance. (Larry English)
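A minimal sketch of a Shewhart-style chart, assuming (as is usual) that the centre line and three-sigma control limits are computed from a baseline period believed to be in control, and that new readings are judged against them:

```python
import statistics

# Baseline period believed to be in control: sets centre line and limits.
baseline = [10.2, 9.9, 10.1, 10.0, 10.3, 9.8, 10.1, 10.0, 9.9, 10.2]
centre = statistics.mean(baseline)
sigma = statistics.pstdev(baseline)
ucl, lcl = centre + 3 * sigma, centre - 3 * sigma  # control limits

# New process measurements are flagged when they fall outside the limits.
new_readings = [10.1, 9.9, 12.8, 10.0]
out_of_control = [x for x in new_readings if not (lcl <= x <= ucl)]
```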

Control group

A selected set of people, objects, or processes to be observed to record behaviour or performance characteristics. Used to compare behaviour and performance to another group in which changes or improvements have been made. (Larry English)


Convenience

Here, the ease-of-use or seamlessness by which information is acquired. (Martin Eppler)


Conversion

The process of preparing, reengineering, cleansing and transforming data, and loading it into a new target data architecture. (Larry English)


Acronym for Cost of Poor Data Quality

Corporate data

See Enterprise data.


Correctness

Conforming to an approved or conventional standard, conforming to or agreeing with fact, logic, or known truth. (Martin Eppler)


Correlation

A predictive relationship that exists between two factors, such that when one of the factors changes, you can predict the nature of change in the other factor. For example, if information quality goes up, the costs of information scrap and rework go down. (Larry English)

Cost of acquisition
  1. The cost of acquiring a new customer, including identifying, marketing and presales activities to get the first sale.
  2. The costs of acquiring products, such as software packages, and services. This should be weighed against the cost of ownership.

(Larry English)

Cost of information quality assessment

The costs associated with measurement and quality conformance assurance as a component of the cost of quality information. (Larry English)

Cost of nonquality information

The total costs associated with failure or nonquality information and information services, including, but not limited to, reruns, rework, downstream data verification, data correction, data transformation to nonstandard definition or format, and workarounds. (Larry English)

Cost of ownership

The total costs of ownership of products, such as software packages, and services, including planning, acquiring, process redesign, implementation, and support required for the successful use of the product or service. (Larry English)

Cost of quality information

The total costs associated with providing quality information and information services. The costs consist of the costs of failure or nonquality information, plus the costs of assessment and conformance, plus the costs of information process improvement and data defect prevention. (Larry English)

Cost of retention

The cost of managing customer relationships that result in subsequent sales to existing customers. (Larry English)


Coverage

See Completeness.


Criteria

Standards by which alternatives are judged. Attributes that describe certain (information) characteristics. (Martin Eppler)

Criteria of information quality

The characteristics that make information valuable to authors, administrators, or information users. (Martin Eppler)

Critical information

Information that if missing or wrong can cause enterprise-threatening loss of money, life, or liability, such as failure to properly calculate pension withholding, not setting the aeroplane flaps correctly for take-off, or prescribing the wrong drug. (Larry English)


Cross-functional

The characteristic / dimension of data or a process that is of interest to more than one business or functional area. (Larry English)

Currency

  1. A characteristic / dimension of information quality measuring the degree to which data represents reality from the required point in time. For example, one information view may require data currency to be the most up-to-date point, such as stock prices for stock trades, while another may require data to be the last stock price of the day, for a stock price running average. (Larry English)
  2. The quality or state of information of being up-to-date or not outdated. (Martin Eppler)

Customer

The person or organisation whose needs a product or service provider must meet, and whose satisfaction with the product and service, including information, is the focus of quality management. A customer may be a direct, immediate Customer or the End-consumer of the product or service. (Larry English)

Customer life cycle

The states of existence and relative time periods of a typical customer from being a prospect to becoming an active customer, to becoming nonactive and a “former” customer. (Larry English)

Customer lifetime revenue

The net present value of the average customer revenue over the life of the relationship with the enterprise. (Larry English)

Customer lifetime value (LTV)

The net present value of the average profit of a typical customer over the life of the relationship with the enterprise. (Larry English)

Customer segment

A meaningful aggregation of customers for the purpose of marketing or determining customer lifetime value. (Larry English)

Customer-supplier relationship

See Information customer-supplier relationship.


CUSUM

Abbreviation for Cumulative Summation, a more sensitive method for detecting out-of-control measurements than a simple control chart. The CUSUM indicates when a process has been off aim for too long a period of time. (Larry English)
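A one-sided (high-side) CUSUM sketch with hypothetical target, allowance (k) and decision threshold (h) values; a sustained small upward shift accumulates until it crosses the threshold, even though no single reading is extreme:

```python
target = 10.0
readings = [10.1, 9.9, 10.2, 10.3, 10.2, 10.4, 10.3, 10.2]

k = 0.05  # allowance: slack subtracted per observation
h = 0.5   # decision threshold: signal once the sum exceeds it

high_side = 0.0
signals = []  # indices of readings where the CUSUM signals "off aim"
for i, x in enumerate(readings):
    # Accumulate deviations above target, never letting the sum go negative.
    high_side = max(0.0, high_side + (x - target) - k)
    if high_side > h:
        signals.append(i)
```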

Cycle time

The time required for a process (or subprocess) to execute from start to completion. (Larry English)



d

A symbol representing the set of deviations of a set of items from the mean of the set of items, expressed as d = x – x̄ for each value of x. (Larry English)
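A small illustration in Python; note that the deviations from the mean always sum to zero, which is why squared deviations are used to build σ and s.

```python
import statistics

x = [2.0, 4.0, 9.0]         # illustrative values
x_bar = statistics.mean(x)  # 5.0
d = [v - x_bar for v in x]  # deviations: [-3.0, -1.0, 4.0]

assert sum(d) == 0.0        # deviations about the mean cancel out
```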


Data

Re-interpretable representation of information in a formalized manner suitable for communication, interpretation, or processing. [NOTE: Data can be processed by humans or by automatic means.] (ISO/IEC 2382-1:1993, 01.01.02)

  1. Symbols, numbers or other representation of facts;
  2. The raw material from which information is produced when it is put in a context that gives it meaning.
  3. See also Information. (Larry English)

Raw, unrelated numbers or entries, e.g., in a database; raw forms of transactional representations. (Martin Eppler)

Data administration

See Data management. (Larry English)

Data administrator

One who manages or provides data administration functions. (Larry English)

Data analyst

One who identifies data requirements, defines data, and synthesises it into data models. (Larry English)

Data architect

One who is responsible for the development of data models. (Larry English)

Data audit

See Information quality assessment. (Larry English)

Data cleansing

An information scrap-and-rework process to correct data errors in a collection of data in order to bring the level of quality to an acceptable level to meet the information customers’ needs. (Larry English)

Data cleanup

See Data cleansing. (Larry English)

Data consistency assessment

The process of measuring data equivalence and information float or timeliness in an interface-based information value chain. (Larry English)

Data content quality

The subset of information quality referring to the quality of data values. (Larry English)

Data defect prevention

The process of information process improvement to eliminate or minimise the possibility of data errors from getting into an information product or database. (Larry English)

Data deficiency

An inconformity between the view of the real-world system that can be inferred from a representing information system and the view that can be obtained by directly observing the real-world system. (Martin Eppler)

Data definition

The specification of the meaning, valid values or ranges (domain), and business integrity rules for an entity type or attribute. Data definition includes name, definition, and relationships, as well as domain value definition and business rules that govern business actions that are reflected in data. These components represent the “information product specification” components of Information Resource Data or meta data. (Larry English)

Data Definition Language (DDL)

The language used to describe database schemas or designs. (Larry English)

Data definition quality

A component of information quality measuring the degree to which data definition accurately, completely, and understandably defines what the information producers and knowledge workers should know in order to perform their job processes effectively. Data definition quality is a measure of the quality of the information product specification. (Larry English)

Data dictionary

A repository of information (meta data) defining and describing the data resource. A repository containing meta data. An active data dictionary, such as a catalogue, is one that is capable of interacting with and controlling the environment about which it stores information or meta data. An integrated data dictionary is one that is capable of controlling the data and process environments. A passive data dictionary is one that is capable of storing meta data or data about the data resource, but is not capable of interacting with or controlling the computerised environment external to the data dictionary. See also Repository. (Larry English)

Data dissemination

The distribution of a copy or extract of information in any form, from electronic to paper, from a database or data source to other parties. This is NOT to be confused with data or information sharing. (Q) (Larry English)

Data element

The smallest unit of named data that has meaning to a knowledge worker. A data element is the implementation of an attribute. Synonymous with data item and field. (Larry English)

Data flow diagram

A graphic representation of the “flow” of data through business functions or processes. It illustrates the processes, data stores, external entities, data flows, and their relationships. (Larry English)

Data governance

The organisation and implementation of policies, procedures, structure, roles, and responsibilities that outline and enforce rules of engagement, decision rights, and accountabilities for the effective management of information assets. (John Ladley, Danette McGilvray, Anne-Marie Smith and Gwen Thomas)

Data independence

The property of being able to change the overall logical or physical structure of the data without changing the application program’s view of the data. (Larry English)

Data intermediary

See Data scribe. (Larry English)

Data intermediation

The design and performance of processes in which the actual creator or originator of knowledge does not capture that knowledge electronically, but gives it in paper or other form to be entered into a database by someone else. (Larry English)

Data management

The management and control of data as an enterprise asset. It includes strategic information planning, establishing data-related standards, policies, and procedures, and data modelling and information architecture. Also called data administration. (Larry English)

Data Manipulation Language (DML)

The language used to access data in one or more databases. (Larry English)

Data mart

A subset of enterprise data along with software to extract data from a data warehouse or operational data store, summarise and store it, and to analyse and present information to support trend analysis and tactical decisions and processes. The scope can be that of a complete data subject such as Customer or Product Sales, or of a particular business area or line of business, such as Retail Sales. A data mart architecture, whether subject or business area, must be an enterprise-consistent architecture. (Larry English)

Data mining

The process of analysing large volumes of data using pattern recognition or knowledge discovery techniques to identify meaningful trends, relationships, and clusters represented in data in large databases. (Larry English)
Data model
  1. A logical map or representation of real-world objects and events that represents the inherent properties of the data independently of software, hardware, or machine performance considerations. The model shows data attributes grouped into third normal form entities, and the relationships among those entities. (DM)
  2. In data mining, an expression in symbolic terms of the relationships in data, such that the model represents how changes in one attribute or set of attributes causes changes in another attribute or set of attributes, revealing useful information about the reliability of the relationships. (DW) (Larry English)
Data presentation quality

A component / dimension of information quality measuring the degree to which information-bearing mechanisms, such as screens, reports, and other communication media, are easy to understand, efficient to use, and minimise the possibility of mistakes in its use. (Larry English)

Data quality

See Information quality. (Larry English)

Data quality assessment

See Information quality assessment.

Data reengineering

The process of analysing, standardising, and transforming data from un-architected or non-standardised files or databases into an enterprise standardised information architecture. (Larry English)

Data replication

The controlled process of propagating equivalent data values from a source database to one or more duplicate copies in other databases. (Larry English)

Data resource management (DRM)

See Information resource management.

Data scribe

A role in which individuals transcribe data in one form, such as a paper document, to another form, such as into a computer database; for example, a data entry clerk entering data from a paper order form into a database. (Larry English)

Data standards

The collection of standards, rules and guidelines that govern how to name data, how to define it, how to establish valid values, and how to specify business rules. (IRM) (Larry English)

Data store

Any place in a system where data is stored. This includes manual files, machine-readable files, data tables, and databases. A data store on a logical data flow diagram is related to one or more entities in the data model. (Larry English)

Data transformation

The process of defining and applying algorithms to change data from one form or domain value set to another form or domain value set in a target data architecture to improve its value and usability for the information stakeholders. (Larry English)

Data type

An attribute of a data element or field that specifies the DBMS type of physical values, such as numeric, alphanumeric, packed decimal, floating point, or datetime. (Larry English)

Data value

A specific representation of a fact for an attribute at a point in time. (Larry English)

Data visualisation

Graphical presentation of patterns and trends represented by data relationships. (Larry English)

Data warehouse

A collection of software and data organised to collect, cleanse, transform, and store data from a variety of sources, and analyse and present information to support decision-making, tactical and strategic business processes. (Larry English)

Data warehouse audits and controls

A collection of checks and balances to assure the extract, cleansing, transformation, summarisation, and load processes are in control and operate properly. The controls must assure the right data is extracted from the right sources, transformed, cleansed, summarised correctly, and loaded to the right target files. (Larry English)

Data-driven development

See Value-centric development. (Larry English)

Database administration

The function of managing the physical aspects of the data resource, including physical database design to implement the conceptual data model; and database integrity, performance, and security. (Larry English)

Database integrity

The characteristic of data in a database in which the data conforms to the physical integrity constraints, such as referential integrity and primary key uniqueness, and is able to be secured and recovered in the event of an application, software, or hardware failure. Database integrity does not imply data accuracy or other information quality characteristics not able to be provided by the DBMS functions. (Larry English)

Database marketing

The use of collected and managed information about one’s customers and prospects to provide better service and establish long-term relationships with them. Database marketing involves analysing and designing pertinent customer information needs, collecting, maintaining, and analysing that data to support mass customisation of marketing campaigns to decrease costs, improve response, and to build customer loyalty, reduce attrition, and increase customer satisfaction. (Larry English)

Database server

The distributed implementation of a set of database management functions in which one dedicated collection of database management functions, accessing one or more databases on that node, serves multiple knowledge workers or clients that provide a human-machine interface for the requesting or creation of data. (Larry English)


DDL

Acronym for Data Definition Language. (Larry English)

Decision Support System (DSS)

Applications that use data in a free-form fashion to support managerial decisions by applying ad hoc query, summarisation, trend analysis, exception identification, and “what-if” questions. (Larry English)

Defect

  1. In IQ, a quality characteristic of a data element, such as completeness or accuracy that does not meet its quality standard or meet customer expectation. A record may have as many defects for a quality characteristic as it has data elements. Compare to Defective;
  2. A quality characteristic of an item or a component that does not conform to its quality standard or meet customer expectation.

(Larry English)

Defect rate

A measure of the frequency that defects occur in a process. Also called failure rate (in manufactured products), or error rate. (Larry English)

Defective

  1. In IQ, a record or logical business unit of information, such as an insurance application or an order, that has at least one Defect causing it to not conform to its quality standard or meet customer expectation. The record is counted as one Defective regardless of the number of defects;
  2. A unit of product or service containing at least one Defect.

(Larry English)
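
The distinction between defects (counted per data element) and defectives (counted per record) can be sketched in Python; the records and the completeness rule below are hypothetical:

```python
# A field is treated as defective here if it is empty (hypothetical rule).
records = [
    {"name": "J. Jones", "city": "", "zip": ""},          # 2 defects -> 1 defective
    {"name": "A. Smith", "city": "Oslo", "zip": "0150"},  # clean record
]

# Defects: every non-conforming data element counts.
defects = sum(1 for r in records for v in r.values() if v == "")

# Defectives: a record counts once, no matter how many defects it has.
defectives = sum(1 for r in records if any(v == "" for v in r.values()))

print(defects, defectives)  # 2 1
```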

Definition conformance

The characteristic of data, such that the data values represent a fact consistent with the agreed-upon definition of the attribute. For example, a value of “6/7/1997” actually represents the “Order Date : the date an order is placed by the customer,” and not the system date created when the order is entered into the system. (Larry English)

Delphi approach

An approach used to achieve consensus, that involves individual judgments made independently, group discussion of the rationales for disparate judgments, and a consensus judgment being agreed upon by the participants. (Larry English)


Demographics

The study of human populations, especially with reference to size, density, distribution and other vital statistics. (Larry English)

Derived data

Data that is created or calculated from other data within the database or system. (Larry English)


Design

The rendering of content in a communication medium; design is concerned with how things ought to be in order to attain goals and to function. (Martin Eppler)

Deviation (d)

The difference in value between an item in a set of items and the mean (x̄) of the set, as expressed in the formula

d = x − x̄

where:

d = deviation,
x = the value of an item in a set, and
x̄ = the mean or average of all items in the set. (Larry English)

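
The deviation d = x − x̄ can be illustrated numerically (the values are hypothetical); note that deviations about the mean always sum to zero:

```python
# d = x - x-bar for each item; deviations about the mean sum to zero.
values = [2, 4, 9]
mean = sum(values) / len(values)          # x-bar = 5.0
deviations = [x - mean for x in values]   # [-3.0, -1.0, 4.0]

print(deviations, sum(deviations))  # [-3.0, -1.0, 4.0] 0.0
```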
Devil’s advocate

A technique used in decision making in which someone plays the role of challenging the predominant position in order to expose potential flaws, influence critical thinking and prevent biased and potentially harmful decisions. (Larry English)


DFD

Acronym for Data Flow Diagram.


DIF

Acronym for Data Interchange Format.

Dimension

  1. An aspect or property of information or information service that an information customer deems important in order to be considered “quality information.” Characteristics include completeness, accuracy, timeliness, understandability, objectivity and presentation clarity, among others. Also called (Information) Quality characteristic.
  2. A category for summarising or viewing data (e.g., time period, product, product line, geographic area, organisation).

Directory

A table, block, index, or folder containing addresses and locations or relationships of data or files, and used as a way of organising files. (Larry English)

Discount rate

The market rate of interest representing the cost to borrow money. This rate may be applied to future income to calculate its net present value. (Larry English)
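
For example, discounting a future amount to its present value at rate r over n periods (the figures below are hypothetical):

```python
# Present value of a future amount at discount rate r over n periods:
#   PV = FV / (1 + r) ** n
def present_value(future_value, rate, periods):
    return future_value / (1 + rate) ** periods

# 110.25 received in two years, discounted at 5% per year:
pv = present_value(110.25, 0.05, 2)
print(round(pv, 2))  # 100.0
```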


Disinformation

See Misinformation. (Martin Eppler)


DMAIC

Acronym for Define-Measure-Analyse-Improve-Control, the Six Sigma method for process improvement.


DML

Acronym for Data Manipulation Language.

Document identification keys

Concise alphanumeric labels that are attributed to documents according to a set of rules in order to facilitate their storage and retrieval. (Martin Eppler)

Domain

  1. Set or range of valid values for a given attribute or field, or the specification of business rules for determining the valid values.
  2. The area or field of reference of an application or problem set.

(Larry English)

Domain chaos

A dysfunctional characteristic of an attribute or field in which multiple types of facts are represented within a single field’s domain of values. For example, unit of measure code for one product has a domain value of “doz,” to represent a unit of measure of “one dozen,” while for another product unit of measure code has a value of “150,” to represent the reorder point quantity. (Larry English)

Domain type

A general classification that characterises the kind of values that may be values of a specific attribute, such as a number, date, currency amount, or percent. The domain type name may be used as a component of an attribute name. Also called a class word. (Larry English)

Domain value redundancy

A dysfunctional characteristic of an attribute or field in which the same fact of information is represented by more than one value. For example, unit of measure code having domain values of “doz,” “dz,” and “12” may all represent the fact that the unit of measure is “one dozen.” (Larry English)
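
Resolving domain value redundancy typically means mapping every variant to one canonical value. A sketch using the entry's "one dozen" example (the mapping table and canonical codes are hypothetical):

```python
# Map every redundant domain value to one canonical value (hypothetical table).
CANONICAL_UOM = {"doz": "DOZEN", "dz": "DOZEN", "12": "DOZEN", "ea": "EACH"}

def standardise_uom(code):
    """Return the canonical unit-of-measure value for a raw code."""
    return CANONICAL_UOM.get(code.strip().lower(), "UNKNOWN")

print(standardise_uom("DZ"))   # DOZEN
print(standardise_uom("doz"))  # DOZEN
```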

Drill down

The process of accessing more detailed data from summary data to identify exceptions and trends. May be multitier. (Larry English)

Drill through

The process of accessing the original source data from a replicated or transformed copy to verify equivalence to the record-of-origin data. (Larry English)


DSS

Acronym for Decision Support Systems. (Larry English)



Acronym for electronic commerce, the conducting of business transactions over the Internet.


The quality of an information environment to facilitate the access and manipulation of information in a way that is intuitive. (Martin Eppler)


EDI

Acronym for Electronic Data Interchange.

Edit and validation

The process of assuring data being created conforms to the governing business rules and is correct to the extent possible. Database integrity controls and software routines can edit and validate conformance to business rules. Information producers must validate correctness of data. (Larry English)
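
A minimal edit-and-validation routine might look like the following; the field names and business rules are hypothetical:

```python
import datetime

def validate_order(order):
    """Return a list of business-rule violations (an empty list means the order passes edits)."""
    errors = []
    if not order.get("customer_id"):
        errors.append("customer_id is required")
    if order.get("quantity", 0) <= 0:
        errors.append("quantity must be positive")
    try:
        datetime.date.fromisoformat(order.get("order_date", ""))
    except ValueError:
        errors.append("order_date must be a valid ISO date")
    return errors

print(validate_order({"customer_id": "C1", "quantity": 5, "order_date": "1997-06-07"}))  # []
```

Note that such routines can only check conformance to rules (validity); information producers must still verify accuracy against the real world.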


EIS

Acronym for Executive Information System.

Empty value

A data element for which no value has been captured, and for which the real-world object represented has no corresponding value. For example, there is no date value for the data element, “Last date of service” for an active Employee. Contrast with Missing value. (Stat, Q) (Larry English)


The persons or organisations whose needs a product or service provider must meet, and whose satisfaction with its products and services, including information, determines enterprise success or failure. A customer may be a direct, immediate Customer or the End-consumer of the product or service. (Larry English)

Enterprise data

The data of an organisation or corporation that is owned by the enterprise and managed by a business area. Characteristics of corporate data are that it is essential to run the business and/or it is shared by more than one organisational unit within the enterprise. (Larry English)

Entity integrity

The assurance that a primary key value will identify no more than one occurrence of an entity type, and that no attribute of the primary key may contain a null value. Based on this premise, the real-world entities are uniquely distinguishable from all other entities. (Larry English)
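
SQLite, for instance, enforces entity integrity through the primary key; the table and values below are hypothetical:

```python
import sqlite3

# Entity integrity: the primary key must be unique and non-null.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY NOT NULL, name TEXT)")
conn.execute("INSERT INTO employee VALUES (1, 'A. Smith')")

try:
    conn.execute("INSERT INTO employee VALUES (1, 'B. Jones')")  # duplicate key
except sqlite3.IntegrityError as e:
    print("rejected:", e)  # the DBMS refuses the second occurrence
```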

Entity life cycle

The phases, or distinct states, through which an occurrence of an object moves over a definable period of time. The subtypes of an entity that are mutually exclusive over a given time frame. Also referred to as entity life history and state transition diagram. (Larry English)

Entity Relationship Diagram (ERD)

See Entity relationship model.

Entity relationship model

A graphical representation illustrating the entity types and the relationships of those entity types of interest to the enterprise. (Larry English)

Entity subtype

A specialised subset of occurrences of a more general entity type, having one or more different attributes or relationships not inherent in the other occurrences of the generalised entity type. For example, an hourly employee will have different attributes from a salaried employee, such as hourly pay rate and monthly salary. (Larry English)

Entity supertype

A generalised entity in which some occurrences belong to a distinct, more specialised subtype. (Larry English)

Entity type

A classification of the types of real-world objects (such as person, place, thing, concept, or events of interest to the enterprise) that have common characteristics. Sometimes the term entity is used as a short name. (Larry English)

Entity/process matrix

A matrix that shows the relationships of the processes, identified in the business process model, with the entity types identified in the information model. The model illustrates which processes create, update, or reference the entity types. (Larry English)


Equivalence

A characteristic of information quality that measures the degree to which data stored in multiple places is conceptually equal. Equivalence indicates the data has equal values or is in essence the same. For example, a value of “F” for Gender Code for J. J. Jones in database A and a value of “1” for Sex Code for J. J. Jones in database B mean the same thing : J. J. Jones is female. The measure of equivalence is the percent of fields in records within one data collection that are semantically equivalent to their corresponding fields within another data collection or database. Also called semantic equivalence. (Larry English)
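
Semantic equivalence can be checked by first mapping each database's local codes onto one neutral domain, as in this sketch built on the Gender Code / Sex Code example (the mapping tables are hypothetical):

```python
# Map each database's local codes to a neutral domain before comparing.
TO_NEUTRAL_A = {"F": "female", "M": "male"}   # Gender Code in database A
TO_NEUTRAL_B = {"1": "female", "2": "male"}   # Sex Code in database B

def equivalent(code_a, code_b):
    """True if the two local codes represent the same real-world fact."""
    return TO_NEUTRAL_A.get(code_a) == TO_NEUTRAL_B.get(code_b)

print(equivalent("F", "1"))  # True
```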


ERD

Acronym for Entity Relationship Diagram. (Larry English)

Error cause removal

Elimination of cause(s) of error in a way that prevents recurrence of the error. (Larry English)

Error event

An incident in which an error or defect occurs. (Larry English)

Error proofing

Building edit and validation routines in application programs and designing procedures to reduce inadvertent human error. Also called foolproofing. (Larry English)

Error rate

See Defect rate.


Evaluation

The activity of assessing the quality of a system or the information it contains. (Martin Eppler)

Event

  1. An occurrence of something that happens that is of interest to the enterprise.
  2. See also Error event.

(Larry English)

Executive Information System (EIS)

A graphical application that supports executive processes, decisions, and information requirements. Presents highly summarised data with drill-down capability, and access to key external data. (Larry English)


Expert

A knowledge worker who has a high degree of domain-specific knowledge and a high heuristic competence in the field of expertise. (Martin Eppler)

Expert system
  1. A specific class of knowledge base system in which the knowledge, or rules, are based on the skills and experience of a specific expert or group of experts in a given field. 
  2. A branch of artificial intelligence. An expert system attempts to represent and use knowledge in the same way a human expert does. Expert systems simulate the human trait of thinking.

(Larry English)


Export

The function of extracting information from a repository or database and packaging it into an export/import file. (Larry English)


Extensibility

The ability to dynamically augment a database (or data dictionary) schema with knowledge worker-defined data types. This includes addition of new data types and class definitions for representation and manipulation of unconventional data such as text data, audio data, image data, and data associated with artificial intelligence applications. (Larry English)


Extranet

Semi-public TCP/IP network used by several collaborating partners. (Martin Eppler)

Fact


  1. Something that is known or needs to be known. (Larry English)
  2. In data warehousing, a specific numerical sum that represents a key business performance measure. (Larry English)
  3. A statement that accurately reflects a state or characteristic of reality. (Martin Eppler)
Fact Table

The primary table in dimensional modelling that contains key business measurements. The facts are viewed by various Dimensions. See also Enterprise fact. (Larry English)

Failure costs

See Costs of nonquality information.

Failure mode
  1. The precipitating defect or mechanism that causes a failure.
  2. The result or consequence of a failure or the manifestation of a failure.
  3. The way in which a failure occurs and its impact on the normal process.

(Larry English)

Failure mode analysis (FMA)

A procedure to determine the precipitating cause or symptoms that occur just before or after a process failure. The procedure analyses failure mode data from current and previous process designs with a goal to define improvements to prevent recurrence of failure. See also Information process improvement. (Larry English)

Failure rate

A measure of the frequency that defective items are produced by a process, hence the frequency with which the process fails. See also Defect rate. (Q) (Larry English)

False Negative
  1. In quality measurement, the condition of measuring a value for accuracy (or validity) and finding it to be not accurate (or not valid) when it is accurate (or valid). 
  2. In record matching, the condition of failing to identify that two records represent the same real world object.

(Larry English)

False Positive
  1. In quality measurement, the condition of measuring a value for accuracy (or validity) and finding it to be accurate (or valid) when it is not.
  2. In record matching, the condition of incorrectly identifying that two records represent the same real world object, when they actually represent two unique real world objects.

(Larry English)


The characteristic of information not to correspond to facts, logic, or a given standard. (Martin Eppler)

Feedback loop

A formal mechanism for communicating information about process performance and information quality to the process owner and information producers. (Larry English)


Field

A data element or data item in a data structure or record. (Larry English)

Fifth Normal Form (5NF)


  1. A relation R is in fifth normal form (5NF) (also called Projection Join Normal Form (PJ/NF)) if and only if every join dependency in R is a consequence of the candidate keys of R.
  2. A table is in 5NF if it is a relation or record in which all elements within a concatenated key are independent of each other and cannot be derived from the remainder of the key.

(Larry English)

File integrity

The degree to which documents in a file retain their original form and utility (i.e., no misfiled or torn documents). (Larry English)


See Information quality measure. (Larry English)

First Normal Form (1NF)
  1. A relation R is in first normal form (1NF) if and only if all underlying domains contain atomic values only. 
  2. A table is in 1NF if it can be represented as a two-dimensional table, and for every attribute there exists one single meaningful and atomic value, never a repeating group of values.

(Larry English)
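
A small sketch of the 1NF rule (the data is hypothetical): a repeating group of values is flattened so that every row holds exactly one atomic value per attribute:

```python
# One order row holding a repeating group of items violates 1NF.
unnormalised = [("ORD-1", ["widget", "gadget"])]  # order -> list of items

# Flatten to one atomic value per row to restore 1NF.
first_normal_form = [
    (order_id, item)
    for order_id, items in unnormalised
    for item in items
]

print(first_normal_form)  # [('ORD-1', 'widget'), ('ORD-1', 'gadget')]
```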

Fishbone diagram

See Cause-and-effect diagram. (Larry English)


Flexibility

A characteristic of information quality measuring the degree to which the information architecture or database is able to support organisational or process reengineering changes with minimal modification of the existing objects and relationships, only adding new objects and relationships. (Larry English)


FMA

See Failure mode analysis.

Focus group
  1. A facilitated group of customers that evaluates a product or service against those of competitors, in order to clearly define customer preferences and quality expectations. (Larry English)
  2. A market research technique where five to nine people discuss a topic with the help of a moderator in order to elicit common themes, problems, or opinions. (Martin Eppler)

Foolproofing

Building edit and validation routines in application programs or procedures to reduce inadvertent human error. (Larry English)

Foreign key

In the context of relational databases, a foreign key is a field (or collection of fields) in one table that uniquely identifies a row of another table or the same table. (Wikipedia Foreign key article)
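
A sketch in SQLite (the tables are hypothetical; note that SQLite enforces foreign keys only when the pragma is enabled):

```python
import sqlite3

# Foreign key: order.customer_id must identify a row of customer.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY)")
conn.execute("""CREATE TABLE "order" (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(id))""")

conn.execute("INSERT INTO customer (id) VALUES (1)")
conn.execute('INSERT INTO "order" VALUES (10, 1)')       # valid reference
try:
    conn.execute('INSERT INTO "order" VALUES (11, 99)')  # no such customer
except sqlite3.IntegrityError as e:
    print("rejected:", e)  # FOREIGN KEY constraint failed
```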

Fourth Normal Form (4NF)


  1. A relation R is in fourth normal form (4NF) if and only if, whenever there exists an MVD in R, say A ->-> B, then all attributes of R are also functionally dependent upon A. In other words, the only dependencies (FDs or MVDs) in R are of the form K -> X (i.e., a functional dependency from a candidate K to some other attribute X). Equivalently, R is in 4NF if it is in BCNF and all MVDs in R are in fact FDs.
  2. A table is in 4NF if no row of the table contains two or more independent multivalued facts about an entity.

(Larry English)

Frameworks of information quality

Frameworks that group information quality criteria into meaningful categories. (Martin Eppler)

Frequency distribution

The relative number of occurrences of values of an attribute, including a graphic representation of that “distribution” of values. (Larry English)
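
For illustration, Python's collections.Counter produces a frequency distribution directly (the attribute values below are hypothetical):

```python
from collections import Counter

# Frequency distribution of an attribute's values.
gender_codes = ["F", "M", "F", "F", "U", "M"]
distribution = Counter(gender_codes)

print(distribution.most_common())  # [('F', 3), ('M', 2), ('U', 1)]
```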

Functional dependence

The degree to which an attribute is an inherent characteristic of an entity type. If an attribute is an inherent characteristic of an entity type, that attribute is fully functionally dependent on any candidate key of that entity type. See Normal form. (Larry English)



Generalisation

The process of aggregating similar types of objects together in a less-specialised type based upon common attributes and behaviours. The identification of a common supertype of two or more specialised (sub)types. See also Specialisation. (Larry English)

Generic information quality criteria

Attributes of information that are relevant regardless of the specific context, such as accessibility or clarity. (Martin Eppler)


Grapevine

Unsubstantiated, low-quality information that is passed on by word of mouth. (Martin Eppler)


Groupthink

A phenomenon (coined by Irving Janis) that occurs when the members of a highly cohesive group lose their willingness and ability to evaluate one another’s ideas and inputs critically; it describes the negative effects that can take place in team dynamics, such as excluding information sources, when members seek consensus too strongly (see Northcraft & Neale, 1990, p. 377). (Martin Eppler)

GUI

Graphical user interface; the visual component of a (typically operating system) software application (e.g., Macintosh’s System X or Microsoft Windows). (Martin Eppler)



Heuristic

A method or rule of thumb for obtaining a solution through inference or trial-and-error using approximate methods while evaluating progress toward a goal. (Larry English)

Hidden complaint

An unhappy customer who has a complaint about a product or service, but who does NOT tell the provider organisation. (Larry English)

Hidden information factory

In IQ, all of the areas of the business where information scrap and rework takes place, including redundant databases and applications that move or re-enter data, as well as private, proprietary data files and spreadsheets people maintain to keep their information current, because they cannot access information in the way they need it, they do not trust it, or their “production” reports or queries do not meet their needs. In manufacturing, the hidden factory is all of the areas of the factory in which scrap and rework goes on, including replacement products, retesting or re-inspection of rejected items. (Larry English)


Highlighting

Stressing the most essential elements of a document by emphasising sentences or items visually through colours, larger or different fonts or through flagging icons. (Martin Eppler)

Highly summarised

Data that is summarised to more than two hierarchies of summarisation from the base detail data. Highly summarised data may have lightly summarised data as its source. (Larry English)

Holding the gain

Putting in place controls in a process that has been improved to maintain the quality level achieved by the improvement. (Larry English)


Homonym

A word, phrase or data value that has the same spelling, value or sound, but has a different meaning. (Larry English)

Hoshin planning (Hoshin Kanri)

Also known as Policy Management or Policy Deployment, is a management technique developed in Japan by combining Management by Objectives and the Plan-Do-Check-Act (PDCA) improvement cycle. Hoshin planning provides a planning, implementation and review process to align business strategy and daily operations through total employee participation to achieve business objectives and breakthrough improvements. (Larry English)

House of quality
  1. A mapping of customer quality expectations in product or service to the quality measures of the product or service to summarise all expectations and the work to meet them. See also Quality Function Deployment. (Larry English)
  2. A standard quality management tool that consists of a matrix which relates customer requirements to technical specifications or functionalities. (Martin Eppler)
Human error

An action performed by a person that is wholly expected to have a positive or satisfactory outcome, but that does not. (Ben Marguglio). Human error is NOT a root cause of defects; rather, human error is predictable, manageable, and preventable. (Larry English)

Human factors

Static constraints related to human ergonomic and cognitive limitations. (Larry English)


Hypermedia

The convergence of hypertext and multimedia. (Larry English)

Hypertext

  1. The ability to organise text data in logical chunks or documents that can be accessed randomly via links as well as sequentially. (Larry English)
  2. This term refers to the computer-based organisation of information by way of linking related (associated) information. (Martin Eppler)
Hypothetical reasoning

Hypothetical reasoning is a problem-solving approach that explores several different alternative solutions in parallel to determine which approach or series of steps best solves a particular problem. It is useful in business planning or optimisation problems, where solutions vary according to cost or where numerous solutions may be feasible. (Larry English)



Identifier

One or more attributes that uniquely locate an occurrence of an entity type. Conceptually synonymous with primary key. (Larry English)

In control

The state of a process characterised by the absence of special causes of variation. Processes in control produce consistent results within acceptable limits of variation. See also Out of control. (Larry English)

Inadvertent error

Error introduced unconsciously; for example, when a data intermediary unwittingly transposes values or skips a line in data entry. See also Intentional error. (Larry English)

Incremental load

The propagation of changed data to a target database or data warehouse in which only the data that has been changed since the last load is loaded or updated in the target. (Larry English)
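
A minimal sketch of the idea (the row layout and timestamps are hypothetical): only rows changed after the last load time are propagated to the target:

```python
# Incremental load: propagate only rows changed since the last load.
source = [
    {"id": 1, "updated": "2024-01-05"},
    {"id": 2, "updated": "2024-02-10"},
    {"id": 3, "updated": "2024-03-01"},
]
last_load = "2024-02-01"  # high-water mark from the previous load

# ISO-format dates compare correctly as strings.
delta = [row for row in source if row["updated"] > last_load]

print([row["id"] for row in delta])  # [2, 3]
```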


Informate

A term coined by Shoshana Zuboff in The Age of the Smart Machine to describe the benefit of information technology when used to capture knowledge about business events so that the knowledge can “informate” other knowledge workers to more intelligently perform their jobs. (Larry English)

Information

  1. Knowledge concerning objects, such as facts, events, things, processes, or ideas, including concepts, that within a certain context has a particular meaning. (ISO/IEC 2382-1:1993, 01.01.01)
  2. Data in context, i.e., the meaning given to data or the interpretation of data based on its context; the finished product as a result of processing, presentation and interpretation of data. (Larry English)
  3. Information can be defined as all inputs that people process to gain understanding. It is a difference (a distinction) that makes a difference, an answer to a question. A set of related data that form a message. (Martin Eppler)
Information administrator

Person who is responsible for maintaining (see also maintainability) information or keeping an information system running. (Martin Eppler)

Information architecture

A “blueprint” of an enterprise expressed in terms of a business process model, showing what the enterprise does; an enterprise information model, showing what information resources are required; and a business information model, showing the relationships of the processes and information. (Larry English)

Information architecture quality

A component of information quality measuring the degree to which data models and database design are stable, flexible, and reusable, and implement principles of data structure integrity. (Larry English)

Information assessment

See Information quality assessment.

Information chaos

A state of the dysfunctional learning organisation in which there are unmanaged, inconsistent, and redundant databases that contain data about a single type of thing or fact. The information chaos quotient is the number of unmanaged, inconsistent, and redundant databases containing data about a single type of thing or fact. (Larry English)

Information chaos quotient

The count of the number of unmanaged, inconsistent, and redundant databases containing data about a single type of thing or fact. (Larry English)

Information consumer

Person who is accessing, interpreting and using information products, see also : knowledge worker. (Martin Eppler)

Information customer-supplier relationship

The information stakeholder partnerships between the information producers who create information and the knowledge workers who depend on it. (Larry English)

Information directory

A repository or dictionary of the information stored in a data warehouse, including technical and business meta data, that supports all warehouse customers. The technical meta data describes the transformation rules and replication schedules for source data. The business meta data supports the definition and domain specification of the data. (Larry English)

Information float

The length of the delay in the time a fact becomes known in an organisation to the time in which an interested knowledge worker is able to know that fact. Information float has two components : Manual float is the length of the delay in the time a fact becomes known to when it is first captured electronically in a potentially sharable database. Electronic float is the length in time from when a fact is captured in its electronic form in a potentially sharable database, to the time it is “moved” to a database that makes it accessible to an interested knowledge worker. (Larry English)

Information group

A relatively small and cohesive collection of information, consisting of 20-50 attributes and entity types, grouped around a single subject or subset of a major subject. An information group will generally have one or more subject matter experts and several business roles that use the information. (Larry English)

Information life cycle

See Information value/cost chain.

Information Management (IM)

The function of managing information as an enterprise resource, including planning, organising and staffing, leading and directing, and controlling information. Information management includes managing data as the enterprise knowledge infrastructure and information technology as the enterprise technical infrastructure, and managing applications across business value chains. (Larry English)

Information model

A high-level graphical representation of the information resource requirements of an organisation showing the information classes and their relationships. (Larry English)

Information myopia

A disease that occurs when knowledge workers can see only part of the information they need, caused by not defining data relationships correctly or not having access to data that is logically related because it exists in multiple nonintegrated databases. (Larry English)

Information Overload

A state in which information can no longer be internalised productively by the individual due to time constraints or the large volume of received information. (Martin Eppler)

Information policy

A statement of important principles and guidelines required to effectively manage and exploit the enterprise information resources. (Larry English)

Information presentation quality

The manner in which information is presented, whether in a report or document, on a screen, in forms, orally or visually, such that it communicates clearly to the recipient knowledge worker, facilitating understanding and enabling the right action or decision. (Larry English)

Information preventive maintenance

Establishing processes to control the creation and maintenance of volatile and critical data so that it is maintained at the highest level feasible, possibly including validating volatile data on an appropriate schedule and assessing that data before critical processes use it. (Larry English)

Information process improvement

The process of improving processes to eliminate data errors and defects. This is one component of data defect prevention. Information process improvement is proactive information quality. (Larry English)

Information producer
  1. An author who is creating or assembling an information product or its elements. (Martin Eppler)
  2. The role of individuals in which they originate, capture, create, or update data or knowledge as a part of their job function or as part of the process they perform. Information producers create the actual information content and are accountable for its accuracy and completeness to meet all information stakeholders’ needs. See also Data intermediary. (Larry English)
Information product improvement

The process of data cleansing, reengineering, and transformation required to improve existing defective data up to an acceptable level of quality. This is one component of information scrap and rework. See also Data cleansing, Data reengineering, and Data transformation. Information product improvement is reactive information quality. (Larry English)

Information product specifications

The set of information resource data (meta data) characteristics that defines all characteristics required so that a process and its creating/updating applications can produce quality information. Information product specification characteristics include: data name, definition, domain or data value set (code values or ranges), and the business rules that identify policies and constraints on the potential values. These specifications must be understandable to the information producers who create and maintain the data and the knowledge workers who apply the data in their work. (Larry English)

Information quality


  1. Consistently meeting all knowledge worker and end-customer expectations in all quality characteristics of the information products and services required to accomplish the enterprise mission (internal knowledge worker) or personal objectives (end customer). (Larry English)
  2. The degree to which information consistently meets the requirements and expectations of all knowledge workers who require it to perform their processes. (Larry English)
  3. The fitness for use of information; information that meets the requirements of its authors, users, and administrators. (Martin Eppler)
Information quality assessment

The random sampling of a data collection and measuring it against various quality characteristics, such as accuracy, completeness, validity, nonduplication or timeliness to determine its level of quality or reliability. Also called data quality assessment or data audit. (Larry English)

Information quality characteristic

See Dimension.

Information quality contamination

The creation of inaccurate derived data by combining accurate data with inaccurate data. (Larry English)

Information quality decay

The characteristic of data by which formerly accurate data becomes inaccurate over time, because a characteristic of the real-world object changes without a corresponding update being applied to the data. For example, John Doe’s marital status value of “single” in a database is subject to information quality decay and will become inaccurate the moment he marries. (Larry English)

Information quality decay rate

The rate, usually expressed as a percent per year, at which the accuracy of a data collection will deteriorate over time if no data updates are applied. For example: (1) person age decay rate is 100% within one year, decaying at a rate of approximately 1.9% per week; (2) if 17% of a population moves annually, the annual decay rate of address is 17%. (Larry English)
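
As a rough illustration (not from the source), an annual decay rate can be compounded to project how a data collection's accuracy erodes while no updates are applied; the function name below is an assumption for the sketch.

```python
def decayed_accuracy(initial_accuracy, annual_decay_rate, years):
    """Project the fraction of records still accurate after `years`
    with no updates, compounding the annual decay rate."""
    return initial_accuracy * (1 - annual_decay_rate) ** years

# If 17% of a population moves annually, an address file that starts
# fully accurate retains about 83% accuracy after one year and about
# 57% after three years with no maintenance.
one_year = decayed_accuracy(1.0, 0.17, 1)
three_years = decayed_accuracy(1.0, 0.17, 3)
```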

Information quality management

The function that leads the organisation to improve business performance and process effectiveness by implementing processes to measure information quality, assess its costs, and improve the processes that control it, and by defining processes, guidelines, and policies and leading culture transformation and education for information quality improvement. The IQ management function does not “do” the information quality work for the enterprise; it defines processes and facilitates the enterprise in implementing the values, principles, and habit of continuous process improvement so that everyone in the enterprise takes responsibility for their information quality and meets their information customers’ quality expectations. (Larry English)

Information quality metric or measure

A specific quality measure or test (set of measures or tests) to assess information quality. For example, Product Id will be tested for uniqueness, Customer records will be tested for duplicate occurrences, Customer address will be tested to assure it is the correct address, Product Unit of Measure will be tested to be a valid Unit of Measure domain code, and Order Total Price Amount will be tested to assure it has been calculated correctly. Quality measures will be assessed using business rule tests in automated quality analysis software, coded routines in internally developed quality assessment programs, or in physical quality assessment procedures. Some call information quality metrics filters. (Larry English)
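
The kinds of business rule tests listed above can be sketched as simple measures over a record set; the field names, domain codes, and sample data here are hypothetical, not from the source.

```python
# Hypothetical order records; field names and values are illustrative only.
orders = [
    {"product_id": "P1", "uom": "EA", "qty": 2, "unit_price": 5.0, "total": 10.0},
    {"product_id": "P2", "uom": "XX", "qty": 1, "unit_price": 3.0, "total": 4.0},
]

VALID_UOM = {"EA", "KG", "LB"}  # assumed Unit of Measure domain codes

def uniqueness(records, key):
    """Share of values that are distinct; 1.0 means no duplicates."""
    values = [r[key] for r in records]
    return len(set(values)) / len(values)

def domain_validity(records, key, domain):
    """Share of records whose value falls in the valid domain."""
    return sum(r[key] in domain for r in records) / len(records)

def calculation_accuracy(records):
    """Share of records whose total equals qty * unit_price."""
    return sum(abs(r["qty"] * r["unit_price"] - r["total"]) < 0.005
               for r in records) / len(records)
```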

Information Resource Management (IRM)


  1. The application of generally accepted management principles to data as a strategic business asset.
  2. The function of managing data as an enterprise resource. This generally includes operational data management or data administration, strategic information management, repository management, and database administration. See also Information management.
  3. The organisation unit responsible for providing principles and processes for managing the information assets of the enterprise.

(Larry English)

Information scrap and rework

The activities and costs required to cleanse or correct nonquality information, to recover from process failure caused by nonquality information, or to rework or work around problems caused by missing or nonquality information. Analogous to manufacturing scrap and rework. (Larry English)

Information stakeholder

Any individual who has an interest in and dependence on a set of data or information. Stakeholders may include information producers, knowledge workers, external customers, and regulatory bodies, as well as various information systems roles such as database designers, application developers, and maintenance personnel. (Larry English)

Information steward

A role in which an individual has accountability for the quality of some part of the information resource. See Information stewardship. (Larry English)

Information stewardship

Accountability for the quality of some part of the information resource for the well-being of the larger organisation. Every individual within an organisation holds one or more information stewardship roles, based on the nature of their job and its relationship to information, such as creating information, applying it, defining it, modelling it, developing a computer screen to display it or moving it from one database or file to another. See Strategic information steward, Managerial information steward, and Operational information steward. (Larry English)

Information stewardship agreement

A formal agreement among business managers specifying the quality standard and target date for information produced in one business area and used in one or more other business areas. (Larry English)

Information value
  1. Information quality (or alternatively: benefit) in relation to the acquisition and processing costs of information; potential of information to improve decisions by reducing uncertainty. (Martin Eppler)
  2. The measure of importance of information expressed in tangible metrics. Information has potential and realised value. Potential value is the future value of information that could be realised if applied to business processes where the information is not currently used. Realised value is the actual value derived from information applied by knowledge workers in the accomplishment of the business processes. (Larry English)
Information value/cost chain

The end-to-end set of processes and data stores, electronic and otherwise, involved in creating, updating, interfacing, and propagating data of a specific type from its origination to its ultimate data store, including independent data entry processes, if any. (Larry English)

Information view

A knowledge worker’s perceived relationship of the data elements needed to perform a process, showing the structure and data elements required. A process activity has one and only one information view. (Larry English)

Information view model

A local data model derived from an enterprise model to reflect the specific information required for one business area or function, one organisation unit, one application or system, or one business process. (Larry English)


Integration

A set of activities that makes information more comprehensive, concise, convenient, and accessible; combining information sources and aggregating content to ease the cognitive load on the information consumer. (Martin Eppler)

Intentional error

Error introduced consciously. For example, an information producer required to enter an unknown fact like birth date, enters his or her own or some “coded” birth date used to mean “unknown.” See also Inadvertent error. (Larry English)

Interactivity
  1. Being a two-way electronic communication system that involves a user’s orders or responses. (Martin Eppler)
  2. The capacity of an information system to react to the inputs of information consumers, to generate instant, tailored responses to a user’s actions or inquiries. (Martin Eppler)

Interpretation

The process of assigning meaning to a constructed representation of an object or event. (Martin Eppler)
Interface program

An application that extracts data from one database, transforms it, and loads it into a non-controlled redundant database. Interface programs represent one cost of information scrap and rework in that the information in the first database is not “able” to be used from that source and must be “reworked” for another process or knowledge worker to use. (Larry English)


The technique of supposedly “integrating” application systems by developing “interface programs” or middleware to extract data in one format from a data source and transform it to another format for a data target rather than by standardising the data definition and format. (Larry English)

Internal view

The physical database design or schema in the ANSI 3-schema architecture. (Larry English)


Intranet

Internal company networks designed to provide a secure forum for sharing information, often in a web-browser type interface. (Martin Eppler)


IRM

Acronym for Information Resource Management.

Ishikawa diagram

A chart that can be used to systematically gather the problem causes of quality defects. Sometimes referred to as the 5M- or 6M-chart because most causes can be related to man (e.g., human factors), machine, method, material, milieu (i.e., the work environment), or the medium (the IT-platform). (Martin Eppler)


ISO

ISO, the International Organization for Standardization, is an independent, non-governmental organization, the members of which are the standards organizations of the 162 member countries. It is the world’s largest developer of voluntary international standards and facilitates world trade by providing common standards between nations. Over twenty thousand standards have been set covering everything from manufactured products and technology to food safety, agriculture and healthcare. (Wikipedia International Organization for Standardization article)


ISO 8000

ISO 8000, the global standard for Data Quality and Enterprise Master Data, describes the features and defines the requirements for the Data Quality and Portability of Enterprise Master Data. Master Data is typically “internal” business information about clients, products and operations. The standard is currently under development, but is quickly being adopted by many Fortune 500 corporations and certain public agencies involved in the regulation and supervision of financial markets around the world. ISO 8000 is one of the emerging technology standards that large and complex organizations are turning to in order to improve business processes and control operational costs. The standard will be published as a number of separate documents, which ISO calls “parts”. (Wikipedia ISO 8000 article)

ISO 9000

The ISO 9000 family of quality management systems standards is designed to help organizations ensure that they meet the needs of customers and other stakeholders while meeting statutory and regulatory requirements related to a product or program. (Wikipedia ISO 9000 article)



Judgmental quality criteria

Criteria based on personal (subjective) judgment rather than on objective measures (e.g., relevance, appeal). (Martin Eppler)

Just-in-time information

Current information that is delivered in a timely manner (at the time of need), for example through a (profile-based) push mechanism. (Martin Eppler)


Knowledge
  1. Information context; understanding of the significance of information. (Larry English)
  2. Justified true belief; the know-what/-how/-who/-why that individuals use to solve problems, make predictions or decisions, or take actions. (Martin Eppler)
Knowledge base
  1. That part of a knowledge base system in which the rules and definitions used to build the application are stored. The knowledge base may also include a fact or object storage facility.
  2. A database where the codification of knowledge is kept; usually a set of rules specified in an if . . . then format.

(Larry English)

Knowledge base system

A software system whose application-specific information is programmed in the form of rules and stored in a specific facility, known as the knowledge base. The system uses Artificial Intelligence (AI) procedures to mimic human problem-solving techniques, applying the rules stored in the knowledge base and facts supplied to the system to solve a particular business problem. (Larry English)

Knowledge compression

The skilful activity of extracting the main ideas and concepts from a piece of reasoned information and summarising them in a consistent and concise manner. (Martin Eppler)

Knowledge error

Information quality error introduced as a result of lack of training or expertise. (Larry English)

Knowledge management

The conscious and systematic facilitation of knowledge creation or development, diffusion or transfer, safeguarding, and use at the individual, team- and organisational level. (Martin Eppler)

Knowledge work

Knowledge work is human mental work performed to generate useful information. It involves analysing and applying specialised expertise to solve problems, to generate ideas, or to create new products and services. (Martin Eppler)

Knowledge worker
  1. Highly skilled professionals who are involved in the non-routine production, interpretation, and application of complex information. (Martin Eppler)
  2. The role of individuals in which they use information in any form as part of their job function or in the course of performing a process, whether operational or strategic. Also referred to as an information consumer or customer. Accountable for work results created as a result of the use of information and for adhering to any policies governing the security, privacy, and confidentiality of the information used.

(Larry English)

Knowledge-intensive process

A knowledge-intensive process is a productive series of activities that involves information transformation and requires specialised professional knowledge. Such processes are characterised by their often non-routine nature (unclear problem space, many decision options), high requirements for continuous learning and innovation, and the crucial importance of both interpersonal communication and the documentation of information. (Martin Eppler)



Labelling

Adding informative and concise titles to information items so that they can be more easily scanned, browsed, or checked for relevance. Labels should indicate the type of information (e.g., definition, example, rule) and its content (e.g., safety issues). (Martin Eppler)


Learnability

The quality of information to be easily transformed into knowledge. (Martin Eppler)

Legacy data

Data that comes from files and/or databases developed without using an enterprise data architecture approach. (Larry English)

Legacy system

A system developed without using an enterprise data architecture approach. (Larry English)

Lifetime value (LTV)

See Customer lifetime value.

Lightly summarised

Data that is summarised only one or two hierarchy levels above the base detailed data. (Larry English)


Load

To sequentially add a set of records into a database or data warehouse. See also Incremental load. (Larry English)


Lock

A means of serialising events or preventing access to data while an application or information producer may be updating that data. (Larry English)


Log

A collection of records that describe the events that occur during DBMS execution and their sequence. The information thus recorded is used for recovery in the event of a failure during DBMS execution. (Larry English)

Lower control limit

The lowest acceptable value or characteristic in a set of items deemed to be of acceptable quality. Together with the upper control limit, it specifies the boundaries of acceptable variability in an item to meet quality specifications. (Larry English)
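
One common way to compute such limits (an assumption here; the definition above does not fix a formula) is the Shewhart convention of the mean plus or minus three standard deviations of the sampled values.

```python
from statistics import mean, stdev

def control_limits(samples, k=3.0):
    """Lower and upper control limits as mean +/- k sample standard
    deviations (the common Shewhart 3-sigma convention)."""
    m, s = mean(samples), stdev(samples)
    return m - k * s, m + k * s

# Six sampled measurements; values inside (lcl, ucl) are treated as
# acceptable common-cause variation.
lcl, ucl = control_limits([10, 12, 11, 13, 9, 11])
```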


LTV

Acronym for Customer Lifetime Value.



Maintainability

The characteristic of an information environment to be manageable at reasonable costs in terms of content volume, frequency, quality, and infrastructure. If a system is maintainable, information can be added, deleted, or changed efficiently. (Martin Eppler)

Management Principle

A general, instructive, concise, and memorable statement that suggests a way of reasoning or acting that is effective and proven to reach a certain goal within an organisational context. (Martin Eppler)

Managerial information steward

The role of accountability a business manager or process owner has for the quality of data produced by his or her processes. (Larry English)

Managerial information stewardship

The fact that a business manager or process owner who has accountability for one or more business processes also has accountability for the integrity of the data produced by those processes. (Larry English)

Mean

The average of a set of values.


MDDB

Acronym for Multidimensional Database. (Larry English)

Measurement curve bundle

The collection of measurement points of a real-world attribute that represents the variation of values of that attribute in the real world. (Larry English)

Measurement system

A collection of processes, procedures, software, and databases used to assess and report information quality. (Larry English)


Median

The median is the value separating the higher half of a data sample, a population, or a probability distribution, from the lower half. In simple terms, it may be thought of as the “middle” value of a data set. For example, in the data set {1, 3, 3, 6, 7, 8, 9}, the median is 6, the fourth number in the sample. The median is a commonly used measure of the properties of a data set in statistics and probability theory.

The basic advantage of the median in describing data compared to the mean (often simply described as the “average”) is that it is not skewed so much by extremely large or small values, and so it may give a better idea of a ‘typical’ value. For example, in understanding statistics like household income or assets which vary greatly, a mean may be skewed by a small number of extremely high or low values. Median income, for example, may be a better way to suggest what a ‘typical’ income is.

Because of this, the median is of central importance in robust statistics, as it is the most resistant statistic, having a breakdown point of 50%: so long as no more than half the data are contaminated, the median will not give an arbitrarily large or small result. (Wikipedia Median article)
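
The example above can be reproduced with Python's standard library, which also illustrates the robustness against extreme values described in the entry (the income figures are illustrative only):

```python
from statistics import mean, median

sample = [1, 3, 3, 6, 7, 8, 9]   # the data set from the definition
mid = median(sample)             # the "middle" value, 6

# One extreme income skews the mean badly but barely moves the median.
incomes = [30_000, 32_000, 35_000, 38_000, 10_000_000]
skewed_mean = mean(incomes)      # pulled above 2 million
typical = median(incomes)        # stays at 35_000
```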


Metadata

Metadata is “data [information] that provides information about other data”. Three distinct types of metadata exist: descriptive metadata, structural metadata, and administrative metadata. (Wikipedia Metadata article)


Methodology

A formalised collection of tools, procedures, and techniques to solve a specific problem or perform a given function. (Larry English)

Metric
  1. See Information quality metric.
  2. A fact type in data warehousing, generally numeric (such as sales, budget, and inventory), that is analysed in different ways or dimensions in decision support analysis. (Larry English)

Mining

A detailed method used by large firms to sort and analyse information to better understand their customers, products, markets, or any other phase of their business for which data has been captured. Mining data relies on statistical analyses, such as analysis of variance or trend analysis. (Martin Eppler)


Misinformation

Information that is uninformative and impedes effective and adequate action because it is incorrect, distorted, buried, confusing because it lacks context, manipulated or otherwise difficult to use. (Martin Eppler)


Human error resulting from poor information presentation quality. (Larry English)

Missing value

A data element for which no value has been captured, but for which the real-world object represented has a value. For example, there is no date value for the data element “last date of service” for a retired Employee whose last day of official employment was June 15, 2002. Contrast with Empty value. (Larry English)

Modal interval

The range interval used to group continuous data values in order to determine a mode. (Larry English)


Mode

The mode is the value that appears most often in a set of data. The mode of a discrete probability distribution is the value x at which its probability mass function takes its maximum value. In other words, it is the value that is most likely to be sampled. The mode of a continuous probability distribution is the value x at which its probability density function has its maximum value, so the mode is at the peak. (Wikipedia Mode article)
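
For discrete data the mode can be computed directly with Python's standard library; `multimode` covers the case where several values tie for most frequent:

```python
from statistics import mode, multimode

data = [1, 3, 3, 6, 7, 8, 9]
most_common = mode(data)            # 3 appears twice, all others once

# When several values tie for most frequent, multimode returns them all.
ties = multimode([1, 1, 2, 2, 3])
```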

Moment of Truth

A term coined by Jan Carlzon, former head of Scandinavian Airlines, meaning any instance in which a customer can form an opinion, whether positive or negative, about the organisation. (Larry English)

Monte Carlo

A problem-solving technique that uses statistical methods and random sampling to solve mathematical or physical problems. (Larry English)
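
A classic illustration (not tied to the source, which gives no example) estimates pi by random sampling: the fraction of random points in the unit square that land inside the quarter circle approaches pi/4.

```python
import random

def estimate_pi(samples=200_000, seed=42):
    """Monte Carlo estimate of pi from uniform random points."""
    rng = random.Random(seed)  # fixed seed makes the sketch repeatable
    inside = sum(rng.random() ** 2 + rng.random() ** 2 <= 1.0
                 for _ in range(samples))
    return 4 * inside / samples
```

More samples shrink the statistical error at a rate proportional to 1/sqrt(n), which is the characteristic trade-off of Monte Carlo methods.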

Multidimensional Database (MDDB)

A database designed around arrays of data that support many dimensions or views of data (such as product sales by time period, geographic location, and organisation) to support decision analysis. (Larry English)



n

Algebraic symbol representing the number of items in a set. (Larry English)

Net Present Value (NPV)

The value of a sum of future money expressed in terms of its worth in today’s currency. NPV is calculated by discounting the amount by the discount rate compounded by the number of years between the present and the future date the money is anticipated. (Larry English)
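
The discounting described above reduces to dividing the future amount by (1 + rate) raised to the number of years; the function name below is an assumption for the sketch.

```python
def present_value(future_amount, discount_rate, years):
    """Discount a future sum back to its worth in today's currency."""
    return future_amount / (1 + discount_rate) ** years

# 1,000 receivable in 5 years, discounted at 8% per year, is worth
# roughly 681 today.
pv = present_value(1000, 0.08, 5)
```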


A market trend that describes the strategy of incumbents or existing market players to consciously focus on specialised business models or business areas in order to distinguish themselves from competitors. (Martin Eppler)


NIST

The National Institute of Standards and Technology (NIST) is a measurement standards laboratory, and a non-regulatory agency of the United States Department of Commerce. Its mission is to promote innovation and industrial competitiveness. (Wikipedia NIST article)

Noise
  1. An uncontrollable common cause factor that causes variability in product quality. (Q)
  2. A term used in data mining to refer to data with missing values (where one does exist in the real world object or event), empty values (where no value exists for the real world object or event), inaccurate values or measurement bias, or data that may be inconsequential or misleading in data analysis or data mining.

(Larry English)

Non-quality costs

The costs that arise due to insufficient levels of information quality or data quality defects. Examples are rework or re-entry costs. (Martin Eppler)


Nonduplication

A characteristic of information quality measuring the degree to which there are no redundant occurrences of data, in other words, a real world object or event is represented by only one record in a database. (Q) (Larry English)

Nonquality data

Data that is incorrect, incomplete, or does not conform to the data definition specification or meet knowledge workers’ expectations. (Larry English)


Nonrepudiation

The ability to provide proof of transmission and receipt of electronic communication. (Larry English)

Normal form

A level of normalisation that characterises a group of attributes or data elements. (Larry English)


Normalisation

The process of associating attributes with the entity types for which they are inherent characteristics. The decomposition of data structures according to a set of dependency rules, designed to give simpler, more stable structures in which certain forms of redundancy are eliminated. A step-by-step process to remove anomalies in data integrity caused by add, delete, and update actions. Also called non-loss decomposition. (Larry English)


NPV

Acronym for Net Present Value.


Null value

The absence of a data value in a data field or data element. The value may exist for the characteristic of the real world object or event and is missing or unknown, or there may be no value (called “empty”) because the characteristic does not exist in the real world object or event. (Larry English)


Objectivity
  1. A characteristic of information quality that measures how well information is presented to the information consumer free from bias that can cause the information consumer to take the wrong action or make the wrong decision (Q). (Larry English)
  2. Expressing or dealing with facts or conditions as perceived without distortion by personal feelings, prejudices, or interpretations. (Martin Eppler)

Occurrence

A specific instance of an entity type. For example, “customer” is an entity type. “John Doe” is an occurrence of the customer entity type. (Larry English)

Occurrence of record

A specific record selected from a group of duplicate records as the authoritative record, and into which data from the other records may be consolidated. Related records from the other duplicate records are re-related to this occurrence of record. (Larry English)


OCR

Acronym for Optical Character Recognition.


ODS

Acronym for Operational Data Store.


OLAP

Acronym for Online analytical processing. (Wikipedia Online analytical processing article)

Operational data

Data at a detailed level used to support daily activities of an enterprise. (Larry English)

Operational Data Store (ODS)

A collection of operational or base data that is extracted from operational databases and standardised, cleansed, consolidated, transformed, and loaded into an enterprise data architecture. An ODS is used to support data mining of operational data, or as the store for base data that is summarised for a data warehouse. The ODS may also be used to audit the data warehouse to assure the summarised and derived data is calculated properly. The ODS may further become the enterprise shared operational database, allowing operational systems that are being reengineered to use the ODS as their operational databases. (Larry English)

Operational information steward

An information producer accountable for the data created or updated as a result of the processes he or she performs. (Larry English)

Optical Character Recognition (OCR)

Optical character recognition (also optical character reader, OCR) is the mechanical or electronic conversion of images of typed, handwritten or printed text into machine-encoded text. (Wikipedia Optical character recognition article)


Optimum

As applied to a quality goal, that which meets the needs of both customer and supplier at the same time, minimising their combined costs. (Larry English)


Origination

The source or author of a piece of information (may include additional origination parameters, such as date, institution, contributors etc.). (Martin Eppler)


Outlier

In statistics, an outlier is an observation point that is distant from other observations. An outlier may be due to variability in the measurement or it may indicate experimental error; the latter are sometimes excluded from the data set. (Wikipedia Outlier article)
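
One widely used rule of thumb (an assumption here; the definition above does not prescribe a specific test) flags points lying beyond 1.5 interquartile ranges outside the quartiles:

```python
from statistics import quantiles

def iqr_outliers(values):
    """Return values outside the 1.5 * IQR fences (Tukey's rule)."""
    q1, _, q3 = quantiles(values, n=4)  # first and third quartiles
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

flagged = iqr_outliers([1, 2, 3, 4, 5, 6, 100])
```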

Overloaded data element

A data element that contains more than one type of fact, usually the result of the need for new types of facts growing faster than the ability to extend the data structures. This causes process failure when downstream processes find unexpected data values. (Larry English)



An example or pattern that represents an acquired way of thinking about something that shapes thought and action in ways that are both conscious and unconscious. Paradigms are essential because they provide a culturally shared model for how to think and act, but they can present major obstacles to adopting newer, better approaches. (Larry English)

Paralysis by analysis

When timely decision-making fails to occur because too much low quality information (irrelevant, detailed, obsolete, or poorly organised) is readily available. (Martin Eppler)

Pareto chart

A Pareto chart, named after Vilfredo Pareto, is a type of chart that contains both bars and a line graph, where individual values are represented in descending order by bars, and the cumulative total is represented by the line. (Wikipedia Pareto Chart article)

Pareto principle

The Pareto principle (also known as the 80/20 rule, the law of the vital few, or the principle of factor sparsity) states that, for many events, roughly 80% of the effects come from 20% of the causes. (Wikipedia Pareto principle article)
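
A Pareto analysis sorts causes in descending order and accumulates their share of the total, exposing the "vital few"; the defect categories and counts below are hypothetical.

```python
# Hypothetical data defect counts by cause (illustrative numbers only).
defects = {"missing value": 120, "invalid code": 60,
           "duplicate": 15, "bad format": 5}

total = sum(defects.values())
cumulative, running = {}, 0
for cause, count in sorted(defects.items(), key=lambda kv: -kv[1]):
    running += count
    cumulative[cause] = round(100 * running / total, 1)

# Here the top two causes already account for 90% of all defects,
# which is where improvement effort should be focused first.
```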


Parsing

The electronic analysis of data to break it into meaningful patterns or attributes for the purpose of data correction, record matching, de-duplication, and consolidation. (Larry English)


The relationship of business personnel and information systems personnel in the planning, requirements analysis, design, and development of applications and databases. (Larry English)


Acronym for Plan-Do-Check-Act (Wikipedia PDCA article)

PDCA cycle

PDCA (plan–do–check–act or plan–do–check–adjust) is an iterative four-step management method used in business for the control and continual improvement of processes and products. It is also known as the Deming circle/cycle/wheel, Shewhart cycle, control circle/cycle, or plan–do–study–act (PDSA). (Wikipedia PDCA article)

Perceived needs

The requirements that motivate customer action based upon their perceptions. For example, a perceived need of a car purchaser is that a convertible will enhance his or her attractiveness. See also Real needs and Stated needs. (Larry English)

Personal data

Data that is of interest to only one organisation component of an enterprise, (e.g., task schedule for a department project). Contrasted with Enterprise data. (Larry English)


The act of modifying content or an information system to customise it to the needs and preferences of an information consumer. (Martin Eppler)

Physical database design

Mapping of the conceptual or logical database design data groupings into the physical database areas, files, records, elements, fields, and keys while adhering to the physical constraints of the hardware, DBMS software, and communications network to provide physical data integrity while meeting the performance and security constraints of the services to be performed against the database. (Larry English)

Poisson distribution

In probability theory and statistics, the Poisson distribution (French pronunciation [pwasɔ̃]; in English usually /ˈpwɑːsɒn/), named after French mathematician Siméon Denis Poisson, is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time and/or space if these events occur with a known average rate and independently of the time since the last event. (Wikipedia Poisson distribution article)
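A minimal sketch of the definition above (the rate of 2 events per hour is an assumed example): the Poisson probability mass function P(X = k) = e^(−λ) λ^k / k! can be computed directly from the mean λ.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson distribution with mean (lambda) lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Example: errors arrive at an average rate of 2 per hour (lambda = 2).
# Probability of observing exactly 0, 1, or 2 errors in a given hour:
p = [poisson_pmf(k, 2.0) for k in range(3)]
```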

Poka Yoke

Poka-yoke [poka joke] is a Japanese term that means “mistake-proofing” or “inadvertent error prevention”. The key word in the second translation, often omitted, is “inadvertent”. There is no Poka Yoke solution that protects against an operator’s sabotage, but sabotage is a rare behavior among people. A poka-yoke is any mechanism in a lean manufacturing process that helps an equipment operator avoid (yokeru) mistakes (poka). Its purpose is to eliminate product defects by preventing, correcting, or drawing attention to human errors as they occur. (Wikipedia Poka Yoke article)

Policy Deployment

See Hoshin planning.


An entire group of objects, items or data from which to sample for measurement or study. Also called a Universe. Contrast with Sample. (Larry English)

Post condition

A data integrity mechanism in object orientation that specifies an assertion, condition, business rule or guaranteed result that will be true upon completion of an operation or method; else, the operation or method fails. (Larry English)

Pragmatic (information quality dimension)

The characteristics that make information useful or usable. (Martin Eppler)


The closeness of agreement among a set of results. (Wikipedia Accuracy and Precision article)


A data integrity mechanism in object orientation that specifies an assertion, condition or business rule that must be true before invoking an operation or method; else, the operation or method cannot be performed. (Larry English)

Presentation format

The specification of how an attribute value or collection of data is to be displayed. (Larry English)

Primary key

The attribute(s) that are used to uniquely identify a specific occurrence of an entity, relation, or file. A primary key that consists of more than one attribute is called a composite (or concatenated) primary key. (Larry English)

Prime word

A component of an attribute name that identifies the entity type the attribute describes. (Larry English)

Principles of information quality

They describe how the quality of information can be increased by focusing on crucial criteria and crucial improvement measures. (Martin Eppler)

Procedural error

Error introduced as a result of failure to follow the defined process. (Larry English)

  1. A defined set of activities to achieve a goal or end result. An activity that computes, manipulates, transforms, and/or presents data. A process has identifiable begin and end points. See Business process. (Larry English)
  2. A group of sequenced tasks, which eventually lead to a value for (internal or external) customers. (Martin Eppler)
Process control

The systematic evaluation of performance of a process, taking corrective action if performance is not acceptable according to defined standards. (Larry English)

Process management

The process of ensuring that a process is defined, controlled to consistently produce products that meet defined quality standards, improved as required to meet or exceed all customer expectations, and optimised to eliminate waste and non-value-adding activities. (Larry English)

Process management cycle

A set of repeatable tasks for understanding customer needs, defining a process, establishing control, and improving the process. (Larry English)

Process management team

A team, including a process owner and staff, to carry out process ownership obligations. (Larry English)

Process owner

The person responsible for the process definition and/or process execution. The process owner is the managerial information steward for the data created or updated by the process, and is accountable for process performance integrity and the quality of information produced. (Larry English)


The output or result of a process. (Larry English)

Product satisfaction

The measure of customer happiness with a product. (Larry English)


Measures of a population based on social, personality and lifestyle behaviours. (Larry English)



Acronym for Quality Function Deployment.


See Taguchi quality loss function.

  1. degree to which a set of inherent characteristics (3.5.1) fulfils requirements (3.1.2) (ISO 9000:2005)
  2. Consistently meeting or exceeding customers’ expectations. (Larry English)
  3. the totality of features of a product or service that fulfil stated or implied needs (ISO 8402). The correspondence to specifications, expectations or usage requirements. The absence of errors. (Martin Eppler)
Quality assessment

An independent measurement of a product’s or service’s quality. (Larry English)

Quality characteristic
  1. An identifiable aspect or feature of a product, process or service that a customer deems important in order to be considered a "quality" product or service.
  2. A distinct attribute or property of a product, process or service that can be measured for conformance to a specific requirement. See Information quality characteristic.

(Larry English)

Quality circle

An ad hoc group formed to correct problems in or to improve a shared process. The goal is an improved work environment and productivity and quality. (Larry English)

Quality Function Deployment (QFD)

The involvement of customers in the design of products and services for the purpose of better understanding customer requirements, and the subsequent design of products and services that better meet their needs on initial product delivery. (Larry English)

Quality goal

See Quality standard

Quality improvement

A measurable and noticeable improvement in the level of quality of a process and its resulting product. (Larry English)

Quality loss function (QLF)

See Taguchi quality loss function.

Quality Management

See Total Quality Management. Generally speaking, the systematic on-going effort of a company to assure that its products and service consistently meet or exceed customer expectations. (Martin Eppler)

Quality measure / metric

A metric or characteristic of information quality, such as percent of accuracy or average information float, to be assessed. (Larry English)

Quality standard

A mandated or required quality goal, reliability level, or quality model to be met and maintained. (Larry English)



Pearson correlation coefficient, also referred to as Pearson's r or the Pearson product-moment correlation coefficient, is a measure of the linear correlation between two variables X and Y. (Wikipedia Pearson correlation coefficient article)


Acronym for Rapid Application Development. The set of tools, techniques and methods that results in at least one-order-of-magnitude acceleration in the time to develop an application with no loss in quality when using QFD techniques compared to using conventional techniques. (Larry English)


Acronym for Rapid Data Development. An intensive group process to rapidly develop and define sharable subject area data models involving a facilitator, knowledge workers, and information resource management personnel, using compression planning and QFD techniques. (Larry English)

Random number generator

A software routine that selects a number from a range of values in such a way that any number within the range has an equal likelihood of being selected. This may be used to identify which records from a database to select for assessment. (Larry English)
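A minimal sketch of that use, assuming hypothetical integer record identifiers: Python's `random.sample` draws without replacement, giving every record in the range an equal chance of being selected for assessment.

```python
import random

# Hypothetical record identifiers in a database table.
record_ids = list(range(1, 10_001))

# Draw a 100-record assessment sample; random.sample gives each record
# an equal likelihood of selection, without replacement.
rng = random.Random(42)  # seeded here only so the draw is repeatable
assessment_sample = rng.sample(record_ids, k=100)
```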

Random sampling

The sampling of a population in which every item in the population is likely to be selected with equal probability. This is also called statistical sampling. See also Cluster sampling, Systematic sampling, and Stratified sampling. (Larry English)

Rating (of information or of a source)

The (standardised) evaluation of a piece of information or its source according to a given scale by one or several reviewers or readers. (Martin Eppler)

Real needs

The fundamental requirements that motivate customer decisions. For example, a real need of a car customer is the kind of transportation it provides. See also Stated needs and Perceived needs. (Larry English)

Realised information value

See Information value.

Reasonability tests

Edit and validation rules applied to assure that a data value is within an expected range of values or is a realistic value. (Larry English)
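A minimal sketch of such a rule (the field name, cutoff date, and 120-year bound are illustrative assumptions): a customer birth date is reasonable only if it is neither in the future nor implies an implausible age.

```python
from datetime import date

def reasonable_birth_date(d: date, today: date = date(2024, 1, 1)) -> bool:
    """Edit/validation rule: a birth date should fall in a realistic
    range -- not in the future, and not implying an age over 120 years."""
    earliest = date(today.year - 120, today.month, today.day)
    return earliest <= d <= today

checks = [
    reasonable_birth_date(date(1985, 6, 15)),  # plausible value: passes
    reasonable_birth_date(date(2090, 1, 1)),   # future date: fails
    reasonable_birth_date(date(1800, 1, 1)),   # implies age > 120: fails
]
```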


A collection of related fields representing an occurrence of an entity type. (Larry English)

Record linkage

The process of matching data records within a database or across multiple databases to match data that represents one real world object or event. Used to identify potential duplicates for "de-duping" (eliminating duplicate occurrences) or consolidation of attributes about a single real world object. (Larry English)
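A deliberately simplified sketch of the matching step (real record linkage uses fuzzy and probabilistic matching; the field names and records here are invented): records are grouped on a normalised match key so that trivially different representations of one real-world entity collide.

```python
# Hypothetical records, two of which describe the same real-world company.
records = [
    {"id": 1, "name": "ACME Corp.", "city": "Dayton"},
    {"id": 2, "name": "acme corp",  "city": "Dayton"},
    {"id": 3, "name": "Zenith Ltd", "city": "Boston"},
]

def match_key(rec):
    # Normalise case and strip punctuation so variant spellings of the
    # same entity produce the same key.
    name = "".join(ch for ch in rec["name"].lower() if ch.isalnum() or ch == " ")
    return (name.strip(), rec["city"].lower())

clusters = {}
for rec in records:
    clusters.setdefault(match_key(rec), []).append(rec["id"])

# Clusters with more than one id are candidate duplicates for "de-duping".
duplicates = [ids for ids in clusters.values() if len(ids) > 1]
```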

Record of origin

The first electronic file in which an occurrence of an entity type is created. (Larry English)

Record of reference

The single, authoritative database file for a collection of fields for occurrences of an entity type. This file represents the most reliable source of operational data for these attributes or fields. In a fragmented data environment, a single occurrence may have different collections of fields whose record of reference is in different files. (Larry English)


Restoring a database to some previous condition or state after system, or device, or program failure. See also Commit. (Larry English)

Recovery log

A collection of records that describe the events that occur during DBMS execution and their sequence. The information thus recorded is used for recovery in the event of a failure during DBMS execution. (Larry English)

Recursive relationship

A relationship or association that exists between entity occurrences of the same type. For example, an organisation can be related to another organisation as a Department manages a Unit. (Larry English)


The provision of information beyond necessity. (Martin Eppler)


A method for radical transformation of business processes to achieve breakthrough improvements in performance. (Larry English)

Reference data

A term used to classify data that is, or should be, standardised, common to and shared by multiple application systems, such as Customer, Supplier, Product, Country, or Postal Code. Reference data tends to be data about permanent entity types and domain value sets to be stored in tables or files, as opposed to business event entity types. (Larry English)

Referential integrity

Integrity constraints that govern the relationship of an occurrence of one entity type or file to one or more occurrences of another entity type or file, such as the relationship of a customer to the orders that customer may place. Referential integrity defines constraints for creating, updating, or deleting occurrences of either or both files. (Larry English)
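The customer-to-order example can be sketched with SQLite (table and column names are illustrative; note that SQLite enforces foreign keys only when the pragma is enabled): an order may not reference a customer that does not exist.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
con.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
con.execute("""CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id))""")

con.execute("INSERT INTO customer VALUES (1, 'ACME')")
con.execute("INSERT INTO orders VALUES (10, 1)")       # valid: customer 1 exists

try:
    con.execute("INSERT INTO orders VALUES (11, 99)")  # no customer 99
    violated = False
except sqlite3.IntegrityError:
    violated = True                                    # constraint rejected it
```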


The manner in which two entity or object types are associated with each other. Relationships may be one to one, one to many, or many to many, as determined by the meaning of the participating entities and by business rules. Synonymous with association. Relationships can express cardinality (the number of occurrences of one entity related to an occurrence of the second entity) and/or optionality (whether an occurrence of one entity is a requirement given an occurrence of the second entity). (Larry English)

Reliability (of an infrastructure)

The characteristic of an information infrastructure to store and retrieve information in an accessible, secure, maintainable, and fast manner. (Martin Eppler)


See Data replication.


A database for storing information about objects of interest to the enterprise, especially those required in all phases of database and application development. A repository can contain all objects related to the building of systems including code, objects, pictures, definitions, etc. Acts as a basis for documentation and code generation specifications that will be used further in the systems development life cycle. Also referred to as design dictionary, encyclopedia, object-oriented dictionary, and knowledge base. (Larry English)


The characteristic of a source to be consistently associated with the provision of high quality information. (Martin Eppler)


Customer expectations of a product or service. May be formal or informal, or they may be stated, required or perceived needs. (Larry English)

Response time

The delay between an initial information request and the provision of that information by the information system. (Martin Eppler)

Return on Investment (ROI)

A statement of the relative profitability generated as a result of a given investment. (Larry English)

Reverse engineering

The process of taking a complete system or database and decomposing it to its source definitions, for the purpose of redesign. (Larry English)


The systematic evaluation of information such as articles, papers, project summaries, etc. by at least one independent qualified person according to specified criteria (such as relevance to target group, methodological rigour, readability, etc.). (Martin Eppler)


An acronym for Return on Investment.

Role type

A classification of the different roles occurrences of an entity type may play, such as an organisation may play the role of a customer, supplier, and/or competitor. (Larry English)


The process of restoring data in a database to the state at its last commit point. (Larry English)

Root cause

The underlying cause of a problem or factor resulting in a problem, as opposed to its precipitating or immediate cause. (Larry English)


A knowledge representation formalism containing knowledge about how to address a particular business problem. Simple rules are often stated in the form "If <antecedent> then <consequent>", where <antecedent> is a condition (a test or comparison) and <consequent> is an action (a conclusion or invocation of another rule). An example of a rule would be "If the temperature of any closed valve is greater than or equal to 100°F, then open the valve." (Larry English)


Salience (of information)

The quality of information to be interesting or intriguing. (Martin Eppler)


An item or subset of items, or data about an item or a subset of items that comes from a sampling frame or a population. A sample is used for the purpose of acquiring knowledge about the entire population. (Larry English)


The technique of extracting a small number of items or data about those items from a larger population of items in order to analyse and draw conclusions about the whole population. See Random sample, Cluster sampling, Stratified sampling,and Systematic sampling. (Larry English)

Sampling frame

A subset of items, or data about a subset of items of a population from which a sample is to be taken. (Larry English)


Acronym for ISO/IEC JTCI Sub-Committee for OSI data management and distributed transaction processing.


The complete description of a physical database design in terms of its tables or files, columns or fields, primary keys, relationships or structure, and integrity constraints. (Larry English)

Scrap and rework

The activities and costs required to correct or dispose of defective manufactured products. See Information scrap and rework. (Larry English)


Acronym for Systems Development Life Cycle.

Seamless integration

True seamless integration is integration of applications through commonly defined and shared information, with managed replication of any redundant data. False "seamless" integration is the use of interface programs to transform data from one application's databases to another's. See "Interfaceation." (Larry English)

Second Normal Form (2NF)
  1. A relation R is in second normal form (2NF) if and only if it is in 1NF and every nonkey attribute is fully functionally dependent on the primary key.
  2. A table is in 2NF if each nonkey attribute provides a fact that describes the entity identified by the entire primary key, not by only part of it. See Functional dependence.

(Larry English)


The prevention of unauthorised access to a database and/or its contents for updating, retrieving, or deleting the database; or the prevention of unauthorised access to applications that have authorised access to databases. (Larry English)

Security (of information)

Measures taken to guard information against unauthorised access, espionage or sabotage, crime, attack, unauthorised modification, or deletion. (Martin Eppler)

Semantic equivalence

See Equivalence.

Sensitivity analysis

A procedure to determine the sensitivity of the outcomes of an alternative to changes in its parameters; it is used to ascertain how a given model output depends upon the input parameters. (Martin Eppler)


An instrument that can measure, capture information about or receive input directly from external objects or events. (Larry English)

Shewhart cycle

See Plan-Do-Check-Act cycle.

Side effect

The state that occurs when a change to a process causes unanticipated conditions or results beyond the planned result, such as when an improvement to a process creates a new problem. (Larry English)

Six Sigma (6σ )
  1. Six standard deviations, used to describe a level of quality in which six standard deviations of the population fall within the upper and lower control limits of quality and in which the defect rate approaches zero, allowing no more than 3.4 defects per million parts.
  2. A methodology of quality management originally developed by Motorola.

(Q) (Larry English)
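The often-quoted 3.4 defects per million can be reproduced numerically. It corresponds to the normal-distribution tail beyond 4.5 standard deviations, i.e. six sigma minus the conventional 1.5-sigma allowance for long-term process drift (this shift convention comes from the Six Sigma literature, not from the definition above).

```python
import math

def normal_tail(z: float) -> float:
    """P(Z > z) for a standard normal variable, via the complementary
    error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Tail beyond 4.5 sigma (6 sigma minus the 1.5-sigma long-term shift),
# scaled to defects per million opportunities.
dpm = normal_tail(4.5) * 1_000_000   # approximately 3.4
```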


Acronym for Subject Matter Expert.

Source information producer

The point of origination or creation of data or knowledge within the organisation. (Larry English)


Acronym for Statistical Process Control.

Special cause

A source of unacceptable variation or defect that comes from outside the process or system. (Larry English)


The process of aggregating subsets of objects of a type, based upon differing attributes and behaviours. The resulting subtypes inherit characteristics from the more generalised type. (Larry English)


Describes how much variation there is in a set of items. (Larry English)


Acronym for Statistical Quality Control.


A characteristic of information quality measuring the degree to which information architecture or a database is able to have new applications developed to use it with minimal modification of the existing objects and relationships, only adding new objects and relationships. (Larry English)

Stability (of information)

The quality of information or its infrastructure to remain unaltered for an extended period of time. (Martin Eppler)

Standard deviation (σ or s)

A widely used measure of variability that expresses the measure of spread in a set of items. The standard deviation is a value such that approximately 68 percent of the items in a set fall within a range of the mean plus or minus the standard deviation. For data from a large sample of a population of items, the standard deviation σ (standard deviation of a population) or s (standard deviation of a sample) is expressed as :
s (σ) = √( Σd² / n )

where:
s (σ) = standard deviation of a sample (population)
d = the deviation of any item from the mean or average
n = the number of items in the sample
Σ = "the sum of".
(Larry English)
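A minimal sketch of the computation (the data values are an invented example). Note that while the glossary's formula divides the summed squared deviations by n, the sample statistic s is commonly computed with the n − 1 correction; both variants are shown.

```python
import math

def std_dev(items, population=True):
    """sigma (population, divide by n) or s (sample, divide by n - 1)."""
    n = len(items)
    mean = sum(items) / n
    sum_sq_dev = sum((x - mean) ** 2 for x in items)  # "the sum of d squared"
    return math.sqrt(sum_sq_dev / (n if population else n - 1))

values = [2, 4, 4, 4, 5, 5, 7, 9]
sigma = std_dev(values, population=True)    # population standard deviation
s = std_dev(values, population=False)       # sample standard deviation
```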


A stage in a life cycle that a real-world-object may exist in at a point in time and which is reflected in a state of existence that an entity occurrence or object may be in at a point in time. A real-world object comes into a specific state of existence through some event. The state of an object is represented in a database by the values of its attributes at a point in time. (Larry English)

State transition diagram

A representation of the various states of an entity or object along with the triggering events. See also Entity life cycle. (Larry English)

Stated needs

Requirements as seen from the customers’ viewpoint, and as stated in their language. These needs may or may not be the real requirements. See also Real needs and Perceived needs. (Larry English)

Statistical control chart

See Control chart.

Statistical process control (SPC)

The application of statistical methods to control processes to provide acceptable quality. One component of statistical quality control. (Larry English)

Statistical quality control (SQC)

The application of statistics and statistical methods to assure quality. Processes and methods for measuring process performance, identifying unacceptable variance, and applying corrective actions to maintain acceptable process control. SQC consists of statistical process control and acceptance sampling. (Larry English)

Stored procedure

A precompiled routine of code stored as part of a database and callable by name. (Larry English)

Strategic information steward

The role a senior manager holds as being accountable for a major information resource or subject area; the strategic information steward authorises business information stewards and resolves business rule issues. (Larry English)

Stratified sampling

Sampling a population that has two or more distinct groupings, or strata, in which random samples are taken from each stratum to assure the strata are proportionately represented in the final sample. (Larry English)
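A minimal sketch of proportional stratified sampling (the two regional strata and sizes are assumed examples): each stratum contributes to the sample in proportion to its share of the population.

```python
import random

# Hypothetical population of 1,000 records split into two strata.
population = {"east": list(range(800)), "west": list(range(800, 1000))}

def stratified_sample(strata, total_size, seed=0):
    """Draw a random sample from each stratum, sized in proportion to
    that stratum's share of the whole population."""
    rng = random.Random(seed)
    n = sum(len(items) for items in strata.values())
    sample = {}
    for name, items in strata.items():
        k = round(total_size * len(items) / n)
        sample[name] = rng.sample(items, k)
    return sample

sample = stratified_sample(population, total_size=100)
```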

Subject area

See Business resource category.

Subject database

A physical database built around a subject area. (Larry English)

Subject Matter Expert (SME)

A business person who has significant experience and knowledge of a given business subject or function. (Larry English)


The phenomenon in which the accomplishment of departmental goals undermines the ability to accomplish the enterprise goals. (Larry English)


See Entity subtype.


See Entity supertype.


The process of making data equivalent in two or more redundant databases. (Larry English)

Synchronous replication

Replication in which all copies of data must be updated before the update transaction is considered complete. This requires two-phase commit. (Larry English)


A word, phrase, or data value that has the same or nearly the same meaning as another word, phrase or data value. (Larry English)

System log

Audit trail of events occurring within a system (e.g., transactions requested, started, ended, accessed, inspected, and updated). (Larry English)

System of record

See Record of reference. The term system of record is meaningless when defining the authoritative record in an integrated, shared data environment where data may be updated by many different application systems within a single database. (Larry English)

Systematic sampling

Sampling of a population using a technique such as selecting every eleventh item, to ensure an even spread of representation in the sample. (Larry English)
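A minimal sketch of the every-eleventh-item technique (the population of 110 records is an assumed example): choose a random starting offset within the first interval, then take every k-th item from there.

```python
import random

# Systematic sampling: random start, then every k-th item.
population = list(range(1, 111))   # 110 hypothetical records
k = 11

rng = random.Random(7)
start = rng.randrange(k)           # random offset within the first interval
sample = population[start::k]      # every 11th item from that offset
```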

Systems approach

The philosophy of developing applications as vertical functional projects independent of how they fit within the larger business value chain. This approach carves out an application development project into a standalone project and does not attempt to define data to be shared across the business value chain or to meet all information stakeholder needs. (Larry English)

Systems Development Life Cycle (SDLC)

The methodology of processes for developing new application systems. The phases change from methodology to methodology, but generically break down into the phases of requirements definition, analysis, design, testing, implementation, and maintenance. If data definition quality is lacking, this process requires improvement. (Larry English)

Systems thinking

The fifth discipline of the learning organisation, this sees problems in the context of the whole. Applications developed with systems thinking see the application scope within the context of its value chain and the enterprise as a whole, defining data as a sharable and reusable resource. (Larry English)


Tacit knowledge

Know-how that is difficult to articulate and share; intuition or skills that cannot easily be put into words. The term was coined in the 1950s by Michael Polanyi (Martin Eppler)

Taguchi Quality Loss Function (QLF)

The principle, developed by Dr. Genichi Taguchi, who won the Japanese Deming Prize in 1960, that deviations from the ideal cause differing degrees of quality and economic loss. Small deviations in some critical characteristics can cause significantly more economic loss than even large deviations in other characteristics. The loss can be roughly expressed as the formula

L = D² × C

where:
L = overall economic "Loss" caused by deviation from the target quality
D = the "Deviation" from the target quality expressed in standard deviations
C = the "Cost" of the improvement to produce it to the target quality

Some information quality problems are likewise critical and cause significantly more economic loss than others, and become the higher priority for process improvement. (Q) (Taguchi, "Robust Quality," Harvard Business Review, Jan-Feb 1990, p. 68.) (Larry English)
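A minimal numeric sketch of the quadratic loss implied by the legend above, L = D² × C (the cost figure is an assumed example): because loss grows with the square of the deviation, a 3-sigma deviation costs nine times a 1-sigma deviation.

```python
def taguchi_loss(deviation_sd: float, cost: float) -> float:
    """Quadratic loss L = D^2 * C: loss grows with the square of the
    deviation from the target quality."""
    return deviation_sd ** 2 * cost

small = taguchi_loss(1.0, cost=100.0)   # 1-sigma deviation
large = taguchi_loss(3.0, cost=100.0)   # 3-sigma deviation: 9x the loss
```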


The cooperation of many within different processes or business areas to increase the quality or output of the whole. (Larry English)

Technical information resource data

The set of information resource data that must be known to information systems and information resource management personnel in order to develop applications and databases. (Larry English)

Third Normal Form (3NF)
  1. A relation R is in third normal form (3NF) if and only if it is in 2NF and every nonkey attribute is nontransitively dependent upon the primary key.
  2. A table is in 3NF if each nonkey column provides a fact that is dependent only on the entire key of the table.

(Larry English)

  1. A characteristic/dimension of information quality measuring the degree to which data is available when knowledge workers or processes require it. (Larry English)
  2. coming early or at the right time; appropriate or adapted to the times or the occasion. (Martin Eppler)
Total Data Quality Management (TDQM) cycle

The TDQM cycle encompasses four components. The definition component of the TDQM cycle identifies IQ dimensions, the measurement component produces IQ metrics, the analysis component identifies root causes for IQ problems and calculates the impacts of poor-quality information, and finally, the improvement component provides techniques for improving IQ. See Huang et al. (1999). (Martin Eppler)

Total Quality Management
  1. Techniques, methods, and management principles that provide for continuous improvement to the processes of an enterprise. (Larry English)
  2. a management concept (and associated tools) that involves the entire workforce in focusing on customer satisfaction and continuous improvement. (Martin Eppler)

Acronym for Two Phase Commit Protocol.


Acronym for Total Quality Management.


The quality of information to be linked to its background or sources. (Martin Eppler)


A conflict among two qualities of information that tend to be mutually exclusive. (Martin Eppler)

Transaction consistency

The highest isolation level that allows an application to read only committed data and guarantees that the transaction has a consistent view of the database, as though no other transactions were active. All read locks are kept until the transaction ends. Also known as serialisable. (Larry English)


See Data transformation.


A software device that monitors the values of one or more data elements to detect critical events. A trigger consists of three components: a procedure to check data whenever it changes, a set or range of criterion values or code to determine data integrity or whether a response is called for, and one or more procedures that produce the appropriate response. (Larry English)

Trusted database

Data that has been secured and protected from unauthorised access. (Larry English)

Two-phase commit

In multithreaded processing systems it is necessary to prevent more than one transaction from updating the same record at the same time. Where each transaction may need to update more than one record or file, the two-phase commit protocol is often used. Each transaction first checks that all the necessary records are available and contain the required data, simultaneously locking each one. Once it is confirmed that all records are ready and locked, the updates are applied and the locks freed. If any record is not available, the whole transaction is aborted and all other records are unlocked and left in their original state. (Larry English)
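The steps above can be sketched in a deliberately simplified, single-process form (the class and function names are illustrative; a real implementation would involve a transaction coordinator and durable logging): phase one locks and verifies every record, phase two applies all updates, and any unavailable record aborts the whole transaction with locks released and values untouched.

```python
class Record:
    """A toy database record with a value and an exclusive lock flag."""
    def __init__(self, value):
        self.value = value
        self.locked = False

def two_phase_commit(updates):
    """updates: list of (record, new_value) pairs. Returns True on commit."""
    locked = []
    # Phase 1: prepare -- lock every record; abort if any is unavailable.
    for record, _ in updates:
        if record.locked:
            for r in locked:          # roll back: free locks, values untouched
                r.locked = False
            return False
        record.locked = True
        locked.append(record)
    # Phase 2: commit -- apply all updates, then free the locks.
    for record, new_value in updates:
        record.value = new_value
        record.locked = False
    return True

a, b = Record(1), Record(2)
ok = two_phase_commit([(a, 10), (b, 20)])       # both ready: commits
b.locked = True                                  # simulate contention on b
failed = two_phase_commit([(a, 99), (b, 99)])   # aborts, a is left unchanged
```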

Two-stage sampling

Sampling a population in two steps. The first step extracts sample items from a lot of common groupings of items such as sales orders by order taker. The second stage takes a second sample from the items in the primary or first stage samples. (Larry English)


Uncommitted read

The lowest isolation level that allows an application to read both committed and uncommitted data. Should be used only when one does not need an exact answer, or if one is highly assured the data is not being updated by someone else. (Also known as read uncommitted, read through, or dirty read). (Larry English)


A state of a unit of recovery that indicates that the unit of recovery’s changes to recoverable database resources must be backed out. (Larry English)

Unit of recovery

A sequence of operations within a unit of work between points of consistency. (Larry English)

Unit of work

A self-contained set of instructions performing a logical outcome in which all changes are performed successfully or none of them is performed. (Larry English)


See Population.


Causing to change values in one or more selected occurrences, groups, or data elements stored in a database. May include the notion of adding or deleting data occurrences. (Larry English)

Upper control limit

The highest acceptable value or characteristic in a set of items deemed to be of acceptable quality. Together with the lower control limit, it specifies the boundaries of acceptable variability in an item to meet quality specifications. (Larry English)
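A minimal sketch using the common Shewhart-style construction (the measurement values are invented, and real control charts are built from subgroup statistics rather than raw points): the limits are placed three standard deviations either side of the process mean, and points outside them signal unacceptable variation.

```python
import math

# Hypothetical process measurements.
measurements = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.95, 10.05]

n = len(measurements)
mean = sum(measurements) / n
sigma = math.sqrt(sum((x - mean) ** 2 for x in measurements) / n)

ucl = mean + 3 * sigma   # upper control limit
lcl = mean - 3 * sigma   # lower control limit

# Points outside the limits indicate the process may be out of control.
out_of_control = [x for x in measurements if not (lcl <= x <= ucl)]
```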


Usability

The characteristic/dimension of an information environment to be user-friendly in all its aspects (easy to learn, use, and remember). (Martin Eppler)


Usefulness

The quality of having utility and especially practical worth or applicability. (Martin Eppler)


User

An unfortunate term used to refer to the relationship of people to information technology, computer systems, or data. The term implies dependence on something, or one who has no choice, or one who is not actively involved in the use of something. The term is inappropriate to describe the role of information producers and knowledge workers who perform the work of the enterprise, employing information technology, applications, and data in the process. The role of business personnel with respect to information technology, applications, and data is one of information producer and knowledge worker. The relationship of business personnel to information systems personnel is not as users, but as partners. If Industrial-Age personnel were [machine] "operators" or "workers," then Information-Age personnel are "knowledge workers." (Larry English)


Utility

The usefulness of information to its intended consumers, including the public. (OMB 515) (Larry English)



Validation

Evaluating and checking the accuracy, consistency, timeliness and security of information, for example by evaluating the believability or reputation of its source. (Martin Eppler)


Validity

A characteristic/dimension of information quality measuring the degree to which the data conforms to defined business rules. Validity is not synonymous with accuracy, which means the values are the correct values. A value may be a valid value, but still be incorrect. For example, a customer date of first service can be a valid date (within the correct range) and yet not be an accurate date. (Larry English)
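The date-of-first-service example above can be made concrete. A validity rule only checks conformance to a defined business rule; it cannot tell us whether the value is the correct one. The rule below (a first-service date must fall between a founding date and today) is invented for illustration.

```python
from datetime import date

FOUNDED = date(1990, 1, 1)  # assumed business rule: no service before founding

def is_valid_first_service_date(d, today=date(2024, 1, 1)):
    """Validity check: is the date within the allowed range?"""
    return FOUNDED <= d <= today

valid = is_valid_first_service_date(date(2001, 5, 3))
```

`date(2001, 5, 3)` passes the rule, yet it may still be inaccurate if the customer's real first service happened on a different day; validity is necessary but not sufficient for accuracy.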

Value

  1. Relative worth, utility, or importance.
  2. An abstraction with a single attribute or characteristic that can be compared with other values, and may be represented by an encoding of the value.

(Larry English)

Value chain

An end-to-end set of activities that begins with a request from a customer and ends with specific benefits for a customer, either internal or external. Also called a business process or value stream. See Information value chain and Business value chain. (Larry English)

Value completeness

See Completeness.

Value stream

See Value chain.

Value-centric development

A method of application development that focuses on data as an enterprise resource and automates activities as a part of an integrated business value chain. Value-centric development incorporates "systems thinking," which sees an application as a component within its larger value chain, as opposed to a "systems approach," which isolates the application as a part of a functional or departmental view of activity and data. (Larry English)

Variance (s² or σ²)

The mean of the squared deviations of a set of values from their mean, expressed as:

σ² = Σ(xᵢ − x̄)² / n
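Computing the mean of the squared deviations directly, and checking it against Python's standard `statistics` module:

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(values)
x_bar = sum(values) / n  # the mean, x̄

# Population variance: the mean of the squared deviations from x̄.
variance = sum((x - x_bar) ** 2 for x in values) / n

assert variance == statistics.pvariance(values)
```

For this set, x̄ = 5 and the variance is 4; the standard deviation σ is its square root, 2. (For a sample rather than a whole population, s² conventionally divides by n − 1 instead of n, as in `statistics.variance`.)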


View

A presentation of data from one or more tables. A view can include all or some of the columns contained in the table or tables on which it is defined. See also Information view. (Larry English)

Visual management

The quality management technique of providing instruction and information about a task in a clear and visible way so that personnel can maximise their productivity. (Larry English)


Visualisation

The use of graphic means to represent information. (Martin Eppler)

Voice of the customer

Documentation of the wants and needs for a product or service, including customer verbatims (the actual words used) and data reworded into specific implications for the product or service. (Larry English)

Voice of the engineer

Documentation of the specification required to meet a quality requirement for a product or service, as specified by the engineer of a product or the designer of a service. (Larry English)

Voice of the process

Statistical data from a process that indicates the stability or capability of the process and provides feedback to process performers as a tool for continual improvement. (Larry English)



Wisdom

Knowledge in context. Knowledge applied in the course of actions. (Larry English)

World class

The level of process performance that is as good as, or better than, the best competitors in the performance of a process type or in the quality of a product type. (Larry English)



x

The algebraic symbol representing a set of values. (Larry English)

x̄ (x bar)

The algebraic symbol x̄ representing the mean, or average, of a set of values. (Larry English)

xσn (x sigma n)

Formula to find the standard deviation (s) of the x values. Sometimes written as σn. (Larry English)


XML (eXtensible Markup Language)

A generic mark-up language that can be used to structure on-line documents meaningfully. (Martin Eppler)
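A tiny example of such meaningful structure, parsed with Python's standard `xml.etree.ElementTree` module. The document fragment and its tags are invented for illustration: the tags describe what the data means, independently of how it is displayed.

```python
import xml.etree.ElementTree as ET

doc = """<customer>
  <name>Acme Corp</name>
  <firstServiceDate>2001-05-03</firstServiceDate>
</customer>"""

root = ET.fromstring(doc)
name = root.find("name").text
first_service = root.find("firstServiceDate").text
```

Because the structure is explicit, any program can retrieve the customer name or first-service date without knowing anything about the document's presentation.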


Zero defects

A state of quality characterised by defect-free products or Six-Sigma level quality. See Six Sigma. (Larry English)

Zero-faults or zero errors

The quality of information to be one hundred percent correct. (Martin Eppler)