Tuesday 2 May 2017

What do filers do with all the XBRL data items at their disposal?

This is the latest in a series of ponderings on whether the XBRL dream has come true.

In our XBRL to XL database, we have 20,000 data items that originate from the US-GAAP Taxonomies. That's a heck of a lot of data items. Admittedly that includes a lot of bumf that is just there for presentation or connecting stuff (abstract items) or axes (column headings), as well as items that have subsequently been deprecated. But compare that to S&P Compustat's 500-odd or the 1600 data items we use for our standardisation routines.

I guess the FASB was worried about companies using extension tags, although perhaps there are more direct routes to limiting them, i.e. prohibiting extensions.

Of course the more there are, the greater the risk of mistakes. Choosing the wrong item over another in the same section is not easy to detect. Usual validation routines won't pick it up. And none of this stuff is audited. Ultimately it is up to the company, or whoever they've outsourced their XBRL tagging to, to pick it up.

The kinda advice stressed by Ernst & Young on the release of new taxonomies is typical and not surprising.

"As they move to a new taxonomy, companies should review their practices for selecting XBRL tags and thoroughly search the tags in the new taxonomy".

So what value do we as users get out of all this wonderful detail?

Well, let's have a look at disclosure levels for our probably over-the-top standard data set of 1632 items (let's refer to this set as the standiZd 1600). So a couple of weeks ago I did a chop of our SEC XBRL to XL database (it powers XBRL sheet) to check on item usage. I thought it would be interesting to look at usage for our reference set of industrial companies, which we established when looking at filing histories (seeing as our standiZd 1600 was designed explicitly for non-financial companies). Keeping everything consistent with all our other data sampling in this series, I only included 10-Ks, but all the 10-Ks they've ever filed in XBRL. I should stress that this is before we've applied our standardisation routines.


So what is this table telling us? That 263 of our standiZd 1600 have NEVER been used by any of the top US industrial companies. And a whopping 65% of what we regard as the most comparable individual data items have been used by 5% or less of them. So if we were trying to compare any one of these 1000-odd items, we would have at best only 11 other companies (and roughly on average probably only six) to compare them with. Let's hope a couple of them happen to be in the same sector! And we can probably forget about creating any meaningful aggregates for these items.
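For anyone wanting to repeat this kind of usage count on their own data, here is a minimal sketch in Python. It is not the routine we ran against our database; the function names and the input shape (a list of company and tag pairs pulled from 10-K filings) are assumptions made purely for illustration.

```python
def usage_share(facts, reference_tags, companies):
    """Return {tag: fraction of companies that used the tag at least once}.

    facts          -- iterable of (company, tag) pairs, one per tagged fact
                      (hypothetical input shape, repeats are allowed)
    reference_tags -- the set of tags we care about (e.g. a standard data set)
    companies      -- the reference set of companies
    """
    users = {}
    for company, tag in facts:
        # Ignore tags outside the reference set and companies outside the sample.
        if tag in reference_tags and company in companies:
            users.setdefault(tag, set()).add(company)
    total = len(companies)
    return {tag: len(users.get(tag, ())) / total for tag in reference_tags}


def summarise(shares, threshold=0.05):
    """Count tags never used and tags used by `threshold` or less of companies.

    Note: in this sketch the threshold count also includes the never-used tags.
    """
    never = sum(1 for share in shares.values() if share == 0)
    rare = sum(1 for share in shares.values() if share <= threshold)
    return {"never_used": never, "at_or_below_threshold": rare, "total": len(shares)}


if __name__ == "__main__":
    # Toy data only: four made-up companies and three tags.
    companies = {"A", "B", "C", "D"}
    tags = {"Revenues", "Goodwill", "Unused"}
    facts = [("A", "Revenues"), ("B", "Revenues"), ("A", "Revenues"), ("C", "Goodwill")]
    shares = usage_share(facts, tags, companies)
    print(shares)
    print(summarise(shares, threshold=0.25))
```

Counting distinct companies rather than raw facts matters here: a company that reports the same item in every 10-K it has ever filed should still only count once towards that item's usage.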

What happens if we widen it right out to the total population of US-GAAP tags and the total population of companies? Not surprisingly, it gets uglier. 10,000 US-GAAP tags appear in 5% or less of the total number of 10-K filings made to date (approx. 50,000). And only roughly 1,000 items appear in more than 5% of these filings.

This isn't a perfect piece of analysis, as some items have come and some items have gone (but that has been happening by and large around the margins of the taxonomy).

So it's probably not too unreasonable to conclude that 20,000 data items is drastic overkill for comparative analysis and that our 1600 standiZd data items IS over the top! We will certainly take a close look at those 263 items to see if we need to rename (!) this set downwards.

So what use does this disparate tagging have? Well, it does allow us to easily find the rare occasions where particular tags have been used, but because use is so rare, it probably has no real advantage over them just being tagged using extensions. A simple text search, although not as neat, would be just as effective, if not more so, as it would also pick up all the occasions (right or wrong) when an extension has been used instead.

All this prompts the question: will there ever be fewer data items?
