Data Analytics

The term "data analytics" generally relates to processes used to inspect data — usually large quantities of data — and to transform that data into useful information. In the case of fraud, data analytics may be used to identify and isolate patterns of fraudulent activity. Data analytics includes a number of techniques, approaches and practices, such as data mining, predictive analytics and business intelligence.

In the fight against fraud, certain types of digital analysis, particularly the application of Benford's Law, described below, can be helpful in identifying fraudulent activities. While a number of commercial tools exist specifically to support data analytics, for smaller data sets, widely available database and spreadsheet packages can be employed to support data analytics. For example, data analytics may be used to identify fraudulent prescription and money laundering schemes.

Data mining is a powerful tool in the fight against fraud, and advances in information technology have extended its reach. It is, in essence, the process of identifying patterns in large data sets, and it incorporates aspects of statistics, database management and artificial intelligence. Given the power of current technology, entire data sets, not merely samples, can be evaluated. Data mining can be said to involve five types of activities: gathering data and establishing relationships between variables in the data; amalgamating data sets by common characteristics; isolating patterns within the data sets; deriving functions that govern the patterns and that can be used predictively; and validating the derived functions.

In the fraud arena, data can be analyzed to see whether it fits one or more known patterns of fraudulent activity. An example is the application of Benford's Law to large data sets. Benford's Law is a form of digital analysis that, among other things, predicts that the initial digits of a naturally occurring set of numbers will not be uniformly distributed, with smaller digits appearing in the leading position more often than larger ones. Otherwise unexplained variances from the predicted distribution of initial digits may be an indication of fraudulent activity. Spatial data mining, as another example, has been used to identify patterns of fraud geographically, such as the pattern of physicians, pharmacies and patients relating to the specious prescriptions of drugs.

While data analytics has existed as long as data, only recent advances in technology, particularly the development of expansive data warehouses, allow vast amounts of data to be mined efficiently and effectively. A number of tools (an inventory that changes almost daily) exist to help management and auditors mine their data in the effort to combat fraud.
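To make the Benford's Law comparison described above concrete, here is a minimal Python sketch. It assumes the standard first-digit form of the law, in which digit d is expected to lead with probability log10(1 + 1/d). The function names (benford_expected, leading_digit, first_digit_deviation) are hypothetical, chosen only for this illustration, and a real review would add a proper statistical test rather than read raw differences.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each leading digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(value):
    """Return the first nonzero digit of a number, or None for zero."""
    # The first character in 1-9 of the printed number is its leading
    # significant digit (the mantissa is printed before any exponent).
    for ch in str(abs(value)):
        if ch in "123456789":
            return int(ch)
    return None


def first_digit_deviation(amounts):
    """Observed minus expected share for each leading digit 1-9."""
    digits = [d for d in map(leading_digit, amounts) if d is not None]
    if not digits:
        raise ValueError("no nonzero amounts to analyze")
    counts = Counter(digits)
    expected = benford_expected()
    return {d: counts.get(d, 0) / len(digits) - expected[d]
            for d in range(1, 10)}
```

Large positive or negative values in the returned dictionary point to digits that appear more or less often than predicted. As the text notes, such variances are only an indication, and small data sets will deviate by chance.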

Activity Indicative of Potential Fraud

Even dollar transactions. Common with P-Cards and travel expenses.

Incorrect totals. Purchase orders/invoices where totals are not based on stated unit prices and quantities ordered.

High standard deviation. Accounts receivable/payable, P-Cards, travel reimbursements that have unusually high standard deviation values.

Duplicate transactions. Invoices/payments with same vendor invoice number, date and amount. Payroll payments to same bank account, same date and amount.

Unusual time lags. Invoice date is prior to purchase order date. Invoice date is too soon compared to payment date or invoice due date.

For duplicate payments, missing check or invoice numbers.
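Several of the indicators above reduce to simple record-level tests. The following Python sketch shows four of them on a small invoice list. The record layout, field names and sample values are invented for illustration, and "even dollar" is read here as "no cents component", which is an assumption; an organization might instead look for multiples of 100 or 1,000.

```python
from collections import defaultdict
from datetime import date
from decimal import Decimal

# Hypothetical invoice records; every field name and value is illustrative.
invoices = [
    {"id": 1, "vendor": "V100", "invoice_no": "A-17",
     "invoice_date": date(2023, 3, 1), "po_date": date(2023, 2, 20),
     "qty": 10, "unit_price": Decimal("12.50"), "total": Decimal("125.00")},
    {"id": 2, "vendor": "V100", "invoice_no": "A-17",
     "invoice_date": date(2023, 3, 1), "po_date": date(2023, 2, 20),
     "qty": 10, "unit_price": Decimal("12.50"), "total": Decimal("125.00")},
    {"id": 3, "vendor": "V200", "invoice_no": "B-02",
     "invoice_date": date(2023, 3, 5), "po_date": date(2023, 3, 10),
     "qty": 3, "unit_price": Decimal("19.99"), "total": Decimal("59.97")},
    {"id": 4, "vendor": "V300", "invoice_no": "C-44",
     "invoice_date": date(2023, 3, 7), "po_date": date(2023, 3, 1),
     "qty": 4, "unit_price": Decimal("25.10"), "total": Decimal("110.40")},
]


def even_dollar(records):
    """IDs of records whose total has no cents component."""
    return [r["id"] for r in records if r["total"] % 1 == 0]


def incorrect_totals(records):
    """IDs of records where quantity times unit price differs from the total."""
    return [r["id"] for r in records
            if r["qty"] * r["unit_price"] != r["total"]]


def duplicate_invoices(records):
    """Groups of IDs sharing vendor, invoice number, date and amount."""
    groups = defaultdict(list)
    for r in records:
        key = (r["vendor"], r["invoice_no"], r["invoice_date"], r["total"])
        groups[key].append(r["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


def invoice_before_po(records):
    """IDs of records whose invoice date precedes the purchase order date."""
    return [r["id"] for r in records if r["invoice_date"] < r["po_date"]]
```

Amounts are held as Decimal rather than float so that equality tests on totals are exact. Each function returns candidates for review, not findings of fraud.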

The Effective Use of Benford's Law to Assist in Detecting Fraud in Accounting Data, by Cindy Durtschi, William Hillison and Carl Pacini

Bypassing Transaction Authorization Limits

Split transactions. Look for multiple purchase orders, requisitions and/or invoices where dates are the same or 1-2 days apart.
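The split-transaction check can be sketched in Python as follows. This is one possible reading of the guidance above, not a prescribed method: orders are grouped by vendor, chained together when consecutive dates fall within two days, and flagged when every order in the chain is at or under the authorization limit but the chain as a whole exceeds it. The field names, the limit and the sample orders are all hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def _looks_split(cluster, limit):
    """True if each order is within the limit but the combined amount is not."""
    amounts = [o["amount"] for o in cluster]
    return len(cluster) > 1 and max(amounts) <= limit < sum(amounts)


def possible_splits(orders, limit, window_days=2):
    """Return lists of order IDs that may be one purchase split into several."""
    by_vendor = defaultdict(list)
    for order in orders:
        by_vendor[order["vendor"]].append(order)

    flagged = []
    for vendor_orders in by_vendor.values():
        cluster = []
        for order in sorted(vendor_orders, key=lambda o: o["date"]):
            # Start a new cluster when the gap to the previous order is too long.
            if cluster and order["date"] - cluster[-1]["date"] > timedelta(days=window_days):
                if _looks_split(cluster, limit):
                    flagged.append([o["id"] for o in cluster])
                cluster = []
            cluster.append(order)
        if _looks_split(cluster, limit):
            flagged.append([o["id"] for o in cluster])
    return flagged


# Hypothetical purchase orders and a hypothetical 5,000 authorization limit.
orders = [
    {"id": "PO1", "vendor": "V1", "date": date(2023, 5, 1), "amount": 4800},
    {"id": "PO2", "vendor": "V1", "date": date(2023, 5, 2), "amount": 4900},
    {"id": "PO3", "vendor": "V1", "date": date(2023, 5, 20), "amount": 1000},
    {"id": "PO4", "vendor": "V2", "date": date(2023, 5, 1), "amount": 3000},
    {"id": "PO5", "vendor": "V2", "date": date(2023, 5, 3), "amount": 1500},
]
suspect = possible_splits(orders, limit=5000)
```

Because orders are chained on consecutive gaps, a cluster can span more than two days in total. Grouping by requester or cost center instead of vendor is an equally plausible design, depending on how authorization limits are applied.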

Conflict of Interest

Employees who are also vendors.

Managing Risks in Vendor Relationships
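One common way to look for employees who are also vendors is to compare the employee master file with the vendor master file on shared attributes. The Python sketch below matches on address and bank account after light normalization. The field names, the choice of matching fields and the sample records are assumptions made for this example, and real matching usually needs fuzzier comparison than exact equality.

```python
import re


def normalize(text):
    """Lowercase and collapse punctuation and spacing so near-identical values compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def employee_vendor_matches(employees, vendors, fields=("address", "bank_account")):
    """Return (employee_id, vendor_id, field) for each shared field value."""
    matches = []
    for field in fields:
        # Index employees by the normalized value of this field.
        index = {}
        for emp in employees:
            key = normalize(emp[field])
            if key:
                index.setdefault(key, []).append(emp["id"])
        for ven in vendors:
            for emp_id in index.get(normalize(ven[field]), []):
                matches.append((emp_id, ven["id"], field))
    return matches


# Hypothetical master-file extracts.
employees = [
    {"id": "E1", "address": "12 Oak St., Apt 4", "bank_account": "111-222"},
    {"id": "E2", "address": "9 Pine Rd", "bank_account": "333-444"},
]
vendors = [
    {"id": "V1", "address": "12 OAK ST APT 4", "bank_account": "999-000"},
    {"id": "V2", "address": "77 Elm Ave", "bank_account": "333 444"},
]
overlaps = employee_vendor_matches(employees, vendors)
```

A match is a lead for follow-up rather than proof of a conflict of interest, since family members or shared buildings can produce legitimate overlaps.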

General Guidance

Data analytics is not a fraud type, a business process, per se, or a program. Rather, it is a methodology that can be employed to combat fraud. It does this by using information technology to spot red flags and to examine more data than could reasonably be examined by other means. Some of the methods used by governments (and the results those methods produced) are listed in the column to the right. Some types of patterns that can be identified using data analytics are discussed below.

AGA's Research Series: Leveraging Data Analytics in Federal Organizations

Fraud Deterrence Strategies and Tools for Government Auditors

Data Mining 101: Tools and Techniques, by The Institute of Internal Auditors (IIA)

DOJ and HHS Announce Efforts to Obtain Proactive Data Mining Tools to Supplement Anti-Fraud Efforts, by Ellyn L. Sternfield; Mintz, Levin, Cohn, Ferris, Glovsky and Popeo, P.C.

Hi-end Data Mining Tool Comparison, by Elder Research

Using Data Mining to Detect Insurance Fraud, from

Fraud Detection, by StatSoft

Data Mining and the Auditor's Responsibility, by Bob Denker, CISA, CIA, CFE

Distributed Data Mining in Credit Card Fraud Detection, by Philip K. Chan, Florida Institute of Technology and Wei Fan, Andreas L. Prodromidis, and Salvatore J. Stolfo, Columbia University

Fighting Identity Fraud with Data Mining, from Safran

Examples of Data Analytics