

Indiana University Bloomington

PwC Faculty Fellow

Bog Index Data for 10-K Filings

This dataset contains Bog Index scores for 10-K filings filed from January 1, 1994 through March 31, 2018. The Bog Index is described and validated in Bonsall, Leone, Miller and Rennekamp (2017).

Please note that, consistent with Bonsall, Leone, Miller and Rennekamp (2017), the Bog Index is calculated using only the text of the 10-K filing itself (i.e., all exhibits except Exhibit 13 are excluded). We provide the Bog Index only when at least 3,000 words remain in the parsed document; when fewer than 3,000 words remain, the Bog Index is marked as missing. Detailed parsing procedures are discussed in the internet appendix to the paper. The code used to parse the 10-K filings is available on Samuel Bonsall's website.
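
As a rough illustration of the missing-value rule described above, the following Python sketch applies the 3,000-word threshold to a single filing. The function name `reported_bog_index` and its parameters are hypothetical and are not taken from the authors' parsing code; only the threshold itself comes from the text.

```python
from typing import Optional

# Minimum number of words that must remain in the parsed 10-K
# for a Bog Index to be reported (threshold stated in the text).
MIN_PARSED_WORDS = 3000


def reported_bog_index(bog_index: float, parsed_word_count: int) -> Optional[float]:
    """Return the Bog Index, or None when the parsed filing is too short.

    Hypothetical helper for illustration only: it mirrors the stated rule
    that filings with fewer than 3,000 parsed words are marked as missing.
    """
    if parsed_word_count < MIN_PARSED_WORDS:
        return None  # treated as missing
    return bog_index
```

Under this sketch, a filing with exactly 3,000 parsed words keeps its score, while one with 2,999 words is treated as missing. Users of the data may want to check how missing values are encoded in the CSV file before filtering on them.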

This dataset is freely available. We request only that, if you use the data, you cite our paper and acknowledge the data source.


References:
Bonsall, S., A. Leone, B. Miller, and K. Rennekamp. (2017). A Plain English Measure of Financial Reporting Readability. Journal of Accounting and Economics 63 (2-3): 329-357.

Bog Scores from January 1, 1994 to March 31, 2018
Bog Index Data for 10-K Filings Dataset 1994-2018 CSV Format