Wednesday 18 February 2015

The SEC Structured Data Sets - uh?

This SEC Announcment may have slipped your notice, as it was slid out at the back end of last year when certainly my thoughts were more about parties than databases. The SEC announced that some of that mountain of XBRL data that's been piling up on their computers is now available as a series of flat files. The Data Transparency Coalition certainly thought it was significant.

So what are these Structured Data Sets and what does this all mean for accessing Comparative XBRL data for financial analysis? By bulking it up and stripping out the markup they've made part of the XBRL filing data more accessible but with a number of caveats. Note I said "more accessible" and not "accessible". You can't open these files in Excel - you might think you can because they are tab separated flat files but actually you can't; they are too big. They are designed to be read into a database (from whence they come). So if you quickly just want to get hold of some tagged data for a few companies from the SEC, you're out of luck. Note you can also load the XBRL instance document into Excel as XML but that doesn't make it any more readable.

But it is now much easier to read them into an database. Yes you still need to build an intermediary database. But you don't have to worry about context references and dimensions and all that messy XML tagging. e.g. In an existing XBRL filing I might find 43 values for "Revenues" but which is actually the one I want? In the flat file, num.txt, there will probably be just 3 values - one for each of the primary financial periods.

Of course you don't have to build your own database - because we built one earlier! So if you quickly just want to get hold of some tagged data for a few companies then you can! We thought it would be an interesting exercise in evaluating the worth of this pilot program. We found it relatively easy and we like what we see. We chose to add a little pre-processing to the files, to make the resulting database run more efficiently and coalesce better with our existing ones.

Our copy can be accessed in exactly the same way as all our data - through Excel. A simple web query brings the data into a sheet according to the parameters you supply. It works just like the existing XBRL Sheet. The query is available here and the example XBRL Sheet here. You can also find a video here that shows you how to get started with the example sheet. Probably worth pointing out that because access is simply through the rendering of a customisable web page - the web query, access isn't confined to Excel; many other applications and programming languages can interface with this.

So what are the caveats? It is only data from the Financial Statements themselves that are in these files. Nothing from the Notes to Financial Statements for now and new files only appear once a quarter. And it is just as it was filed - there are no corrections in these data sets.

In the next couple of posts I'll talk more about the technical aspects, our implementation and the current limitations of this initiative.

1 comment: