Benford’s Law states that, for many real-life data sources, the leading digits of the numbers follow this distribution:
P(d) = log10(1 + 1/d), for d = 1, 2, ..., 9
where P(d) is the probability that the leading digit is d.
In other words, the first digit of an arbitrary number is the digit “1” around 30.1% of the time, with successive digits appearing less frequently, culminating in the digit “9” around 4.6% of the time.
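The percentages quoted here can be reproduced directly from the logarithmic formula for Benford’s Law, P(d) = log10(1 + 1/d). The following is a minimal sketch; the function name benford_probability is made up for this illustration.

```python
import math

def benford_probability(d: int) -> float:
    """Expected share of numbers whose leading digit is d (1 through 9)."""
    if not 1 <= d <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)

# Print the expected share of each leading digit as a percentage.
for d in range(1, 10):
    print(f"{d}: {benford_probability(d):.1%}")
```

The nine probabilities sum to 1, since the terms log10((d + 1) / d) telescope to log10(10).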
As a simple exercise, consider the numbers from 1 to 20. The leading digit “1” appears 11 times (1, 10..19), the digit “2” appears twice (2, 20), and each of the other digits appears only once. A small, evenly spaced range like this overstates the share of “1”; as a data set spreads across more orders of magnitude, the distribution of its leading digits tends toward the Benford’s Law distribution.
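The tally for the numbers 1 to 20 is easy to check by machine. This short sketch counts leading digits with a helper, leading_digit, that is introduced here purely for illustration.

```python
from collections import Counter

def leading_digit(n: int) -> int:
    """Return the first decimal digit of a positive integer."""
    return int(str(n)[0])

# Tally the leading digits of 1, 2, ..., 20.
counts = Counter(leading_digit(n) for n in range(1, 21))

for d in range(1, 10):
    print(d, counts[d])
```

Changing the upper bound of the range is a quick way to see how strongly the tally depends on where the range happens to stop.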
Not all real-life sources follow this distribution. For example, the number of eyes per person or of children per family does not follow it, because the range of possible values is quite narrow. But numbers that can vary over many orders of magnitude, such as leaves on a tree, ants in an ant nest, or income figures for a large corporation, should all follow Benford’s Law very closely.
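To see the contrast between wide and narrow ranges, here is a rough sketch that compares two synthetic data sets against the Benford shares. Both data sets are assumptions chosen for illustration, not real measurements: the first 1,000 powers of 2 stand in for a quantity spanning many orders of magnitude, and the integers 1 to 4 stand in for a narrow range.

```python
import math
from collections import Counter

def leading_digit(n: int) -> int:
    """Return the first decimal digit of a positive integer."""
    return int(str(n)[0])

def digit_shares(values):
    """Map each leading digit 1..9 to its share of the given values."""
    counts = Counter(leading_digit(v) for v in values)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}

# Synthetic stand-ins (assumptions, not real data):
wide = digit_shares(2 ** k for k in range(1000))  # spans about 300 orders of magnitude
narrow = digit_shares(range(1, 5))                # values confined to 1..4

print("digit  benford  wide    narrow")
for d in range(1, 10):
    expected = math.log10(1 + 1 / d)
    print(f"{d}      {expected:.3f}    {wide[d]:.3f}   {narrow[d]:.3f}")
```

The wide column should track the Benford column closely, while the narrow column is flat across the digits 1 to 4 and zero elsewhere.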
In “Benford’s Law and the Decreasing Reliability of Accounting Data for US Firms”, Jialan Wang examined data from the SEC filings of 20,000 firms, with some interesting results:
Deviations from Benford’s law have increased substantially over time, such that today the empirical distribution of each digit is about 3 percentage points off from what Benford’s law would predict. While these time series don’t prove anything decisively, deviations from Benford’s law are compellingly correlated with known financial crises, bubbles, and fraud waves. And overall, the picture looks grim. Accounting data seem to be less and less related to the natural data-generating process that governs everything from rivers to molecules to cities. … And it’s just one more reason for investors to beware.
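One plausible way to express a deviation “in percentage points” is the mean absolute difference between the observed and expected share of each leading digit. The sketch below takes that approach; it is an assumption for illustration and not necessarily the metric used in the quoted study. The function names and the sample figures are made up.

```python
import math
from collections import Counter

# Expected Benford share for each leading digit 1..9.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(x: float) -> int:
    """First significant digit of a nonzero number (value is treated as a float)."""
    # Scientific notation puts the first significant digit first, e.g. '4.527...e+02'.
    return int(f"{abs(x):.15e}"[0])

def mean_abs_deviation_pp(values) -> float:
    """Mean absolute deviation from Benford's Law, in percentage points."""
    digits = [first_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("need at least one nonzero value")
    counts = Counter(digits)
    total = len(digits)
    gaps = (abs(counts[d] / total - BENFORD[d]) for d in range(1, 10))
    return 100 * sum(gaps) / 9

# Made-up figures, purely to show the call; real filings would go here.
sample = [1234.5, 187.0, 2301.0, 150.2, 9120.0, 310.4]
print(f"{mean_abs_deviation_pp(sample):.2f} percentage points")
```

With only a handful of values the result is dominated by sampling noise, so a measure like this is only meaningful on large data sets.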