So continuing on from my previous post, what else is there to say about Arelle?
Through the various windows and tabs, you can see just about every piece of information that's contained in a filing's XBRL files, minus the confounded XML. For example, the "Presentation" and "Calculation" tabs (two along from the "Fact Table" tab) essentially show you what's in a filing's "_pre.xml" and "_cal.xml" files, i.e. the relationships between the various data items.
The app is well supported, with documentation available with the install and on the website. The development team seems active, replying to a query I raised within 24 hours.
The command line tools are of particular interest, as they open up the potential to automate the extraction of data to Excel. And as an analyst, what you want is an automated "one button" solution. You don't want to be wasting time messing around with the data you wish to analyse.
The command line executable lets you set arguments to extract the XBRL (or, more precisely and more usefully, the data minus the XBRL) to a "csv" file that can be opened directly in Excel or loaded into (or referenced from) an existing spreadsheet. There are seven options, but only three are of any interest. "--csvFacts" extracts the "Fact List" (what's in the Fact List tab). Note this is not the Fact Table (darn!); it is essentially all the data items in the instance document. The "--csvPre" and "--csvCal" options unsurprisingly serve up the "Presentation" and "Calculation" tabs respectively.
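As a rough sketch of how these fit together (the install path, instance filename and output paths are illustrative, and the executable name and exact flags may differ between Arelle versions — check your own install), a single extraction might look like this:

```bat
rem Hypothetical example: extract the Fact List plus the Presentation
rem and Calculation relationships from one instance document to CSV.
rem All paths here are made up for illustration.
"C:\Program Files\Arelle\arelleCmdLine.exe" ^
  --file "C:\filings\acme-20111231.xml" ^
  --csvFacts "C:\output\facts.csv" ^
  --csvPre "C:\output\presentation.csv" ^
  --csvCal "C:\output\calculation.csv"
```

The resulting "csv" files can then be opened directly in Excel, or linked into an existing workbook so they refresh on open.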
Using the command line is best done via a batch file (.bat), which can be scheduled to run automatically by the operating system. An example batch file for use with the "csv" options is provided in the "scripts" directory of the Arelle install, called "exportCsvFromXbrlInstance.bat". You can edit this to create the "csv" file you require. Don't double click on it to edit it (it will run!); open it from a text editor instead.
As you can see from the batch file, you can customise the columnar output of "--csvFacts" using the "--csvFactCols" argument. Note that, contrary to the documentation on the website, you need two dashes in front of the arguments, not one. The "Fact List" delivers a data value for each period on a separate line (as per the instance document), so it requires a great deal of manipulation in Excel to get it looking sensible. It could provide a useful feed into an intermediary database such as Access, but my interest here is Excel. So despite the degree of automation afforded by the command line, to get something more immediately useful we need to head back to the "Fact Table" generated by the Arelle GUI.
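For instance (the column names below are my guesses at the sort of fields on offer, not taken from the documentation, and the paths are again illustrative), a line in your edited batch file restricting the Fact List to a handful of columns might look something like:

```bat
rem Hypothetical sketch: trim the Fact List output to a few columns.
rem Check the Arelle docs for the column names your version supports.
"C:\Program Files\Arelle\arelleCmdLine.exe" ^
  --file "C:\filings\acme-20111231.xml" ^
  --csvFactCols "Name contextRef unitRef Value" ^
  --csvFacts "C:\output\facts.csv"
```

Schedule something like this with the Windows Task Scheduler and you are most of the way to the "one button" ideal: the "csv" refreshes itself and your spreadsheet just points at it.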
Thursday, 15 December 2011
XBRL from the Arelle Command Line
Friday, 12 August 2011
XBRL and Me
Right from the early days of collecting data, I could imagine a time when data would be extracted automatically from annual reports for analysis. Back then I assumed this would be achieved by clever machines using OCR and AI techniques (some of the software I've built recently has attempted to use similar methods). In those days, we didn't sell structured data, just standardised financial statements and ratios on bits of paper. The widespread adoption of computers allowed for the migration of this standardised data into structured databases. But often you were putting square pegs into round holes. Our solution was to expand the data set, so whatever was reported could have its own "hole", to create the world's first "as reported" database. In practice, it only had 1,600 data items (compared to nearly 16,000 tags in the XBRL US-GAAP Taxonomy!), so the description "as reported" was always stretching it a bit, but we were applying that old 80:20 rule, and so more often than not a data item was as described. We even had an equivalent of company defined taxonomy extensions, where if a data item didn't fit, it would be stuffed into a tagged "other" item (ensuring it all still added up) and broken out in a custom labelled table.
To cut a long and rather dull story short, I love the thinking behind XBRL. It's a great idea. I have, though, become more than a little curious about its implementation.