Excel files and Scripting Bridge: Three conditions

Working with Excel files in Excel via Scripting Bridge is non-obvious, even when the bulk of the work done is extracting information from Excel like I do. Exporting content from Excel into CSV or a tab-delimited file isn’t an option since oftentimes I need to capture styles applied to text that is simply lost upon export. I have to work in Excel, and over time I have learned that in order to work with an Excel file with Scripting Bridge, the file must meet these conditions:

  • Saved in .XSLX format and not .XLS.
  • Have more than one worksheet (tab).
  • Not have any data connections.

The first condition was simple to troubleshoot and check because the file will not open. There is no visible error, but the returned value is nil. Bam. Done.

The second condition was not as obvious. I discovered the condition through the getItemCountInColumn:removeHeaderRow:inSheet: function in my OCScriptingBridgeExcelController class. The problem was that the function would never stop counting. Columns with 1,000 items would be in the tens of thousands before I figured out what was going on and stopped the application. I don’t know what other issues may arise by having only one tab—finding those would be all but impossible to test given the sheer volume functions available in Excel—but at least I know the issue is there and the condition is easy to test. As to why the number of worksheets would interfere with such a seemingly simple function is buried somewhere in Microsoft.

The third test—no data connections—is proving to be much more difficult and had the same problems as the second. The problem testing for this arises when trying to decipher what Excel means by a data connection. Is this a connection in a pivot table or is this a named item, and what can be done about the connection if one is found? Connections, in general, are pretty well scattered throughout the API. Connections are managed (so far as I have found) in two places: the pivotCache class, accessed through the workbook class, and the namedItem class, accessed through the worksheet class. Documentation on both is just as scattered (and non-existent on the Mac). In my case, it appears the namedItems array is where my found connections are held. In Applescript, there is no way to remove one once it has been created, and I haven’t gotten around to figuring out how to handle the existence of a named item. I am left with the same question as I do for the second condition: What does the existence of a named item have to do with getting information from an Excel spreadsheet? This is just weird.