Tag Archives: excel

What do you care, Excel?

While doing some automation work, I unintentionally tried to open two files with the same name but from different directories in Excel. Then, this happened:


I don’t think this is something I’ve ever done in the past, but I also don’t understand why this would be a problem. If the complete paths were the same, I would understand because Mac OS treats all network volumes as /Volumes/path/to/file.ext. Statistically, small but not to the point of being pragmatically impossible, two files on two external volumes could have the same path. But going by just the name seems…short-sighted…or something. Weird.

Wither Applescript

John Gruber linked to an article that explains the dropping of Applescript from the latest release of iWork.

The writing has been on the wall about this for a long, long time. How long has it been since Apple allowed their Applescript pages be allowed to go offsite? Instead is simply lamenting about the impending, far-too-long-drawn-out demise of Applescript, I wanted to substantiate a claim made by the aforementioned article:

What I suspect Apple doesn’t realize is how much small business and small shops workflow depends upon Applescript. Casual use is fine. But a lot of people do more.
iWork 13 — A Huge Regression

iWork is irrelavent in my industry; the lingua franca for developing book content is Microsoft Office, but the red flag being raised is just as relevant. Also, I am not a small company. I work for a large company that is almost a third Macs. But, I think there is the point to be made that Apple might not be aware of the complexity of workflows built around Applescript, and how casually and quietly deprecating support sucker punches any developer regardless of size.


At my day job in a publishing company, I manage a pre- and post-press department that is concerned with the archiving, retrieval, and most importantly re-use of content. (I suppose this could be summarized as “content lifecycle management.”) Content comes in many forms but for me the focus is on art and text.

For art, we have a number of digital products that are derived from the main texts that we publish, all of them an image bank in one form or another. An image bank is essentially a catalog of images that is easily searchable and useable by instructors and students ultimately showing an image on its own page. The term “image” in this context is actually comprised of several elements:

  • The picture itself
  • Callout (Figure 1-1, Figure 1-2, etc.)
  • Caption
  • Credit
  • Relavent metadata, like page number

Image banks come in three forms, two web-based versions—one of which you can see an example of here—and PowerPoint. The PowerPoint-based image bank is literally each slide showing the above components for a given image.

An image bank can have anywhere between 200 to 3,000 images, and not all images have all of those elements. A book will have any or all the different types of image banks, so content must be displayed in each of those consistently. My team receives hundreds of varying types of content requests each month, but the image banks are the biggest project in terms of scale and scope by a huge margin. We do fewer image banks than we do of other requests, but image banks take the longest and are the most time consuming.

A long time ago, we used to make image banks by hand. The process took months and was riddled with errors since someone manually proofed and placed at least 1,000 content and data points. Then, a former employee wrote an Applescript and everything changed.

Enter Applescript

Applescript has helped us make the process much more reliable and faster. Essentially, the way the creation process works is the following:

  • My team collects all content into an Excel spreadsheet templates (this is still done manually, though XML is going to change that really soon)
  • Editorial edits the content to meet common and lowest-common-denominator image bank specs
  • My team takes the edited spreadsheets and runs a script to create the product.
  • The script extract the content from Excel, massage as required, and then transform into the required format
  • The resulting files are returned for proofing and any changes made are done in the spreadsheet (they are almost always typos) and the process starts at the beginning again

The “resulting files” can be made in any number of ways, but Adobe Photoshop is a common tool because all art must be converted to meet web specs or be usable in PowerPoint. Adobe InDesign is another tool in the process because one of the image banks uses some fairly complex layouts.

The Applescripts do a lot for us in the process:

  • Captions and credits are made “web ready” by converting special characters to HTML entities and removing any odd formatting that may have been carried over from the page layouts.
  • Common text styling is captured from Excel and applied in the output files via a simple spec.
  • Layouts are created and optimized dynamically via simple rules with no manual intervention.
  • Files are appropriately named, errors logged, and so on.

Applescript makes inter-application automation a breeze. Most scripts go between the Finder, Excel, and then InDesign or Powerpoint at the click on icon, and I have developed scripts and processes that I can hand off to people with no programming knowledge of any kind. These scripts are anywhere between 1,200-3,000 lines of code; they are major applications unto themselves. My team doesn’t even open the script in Script Editor or Script Debugger to run it. I always felt the real power of Applescript was the ability to create complex workflows so easily, but that’s provided application developers are willing to support Applescript. In this case, Applescript saved the day and continued to for a long time.

Enter JSX, Wither Applescript

Adobe wants to own their entire automation stack. Since their introduction of the ExtendScript ToolKit using their version of Javascript as the basis (the file extension is “jsx” hence what I am calling it here), Adobe’s Applescript support has deteriorated to the point where it is now useless to us.

The first nail in the coffin was when Photoshop could no longer open a file via Applescript. But with the release of Creative Cloud and the associated updates to Creative Suite 6, now their dictionaries are stripped of the needed command to create a hybrid Applescript-JSX workflow. The final nail that came with Creative Cloud is that I have since discovered that the JSX framework is no longer accepting arguments as part of script calls from external commands. Everything just passes through as undefined. This may be a big bug, but given the (very) niche nature of my environment, I don’t hold any hope this will be fixed.

I have been porting most of my Adobe automation code to JSX—rebuilding much of what I have done in Applescript in JSX—an effort that has taken better part of this year to get ahead of these changes. I had originally banked on Scripting Bridge to help with pushing content from Excel to the Creative Suite. My goal was to consolidate all of my significant workflows into a Cocoa application for ease of use and better long-term project management. But since that relies on the same frameworks as Applescript—see Apple’s Open Scripting Architecture that Adobe is clearly looking to drop—I now have to look into alternatives.


Again, from the article:

It wouldn’t be so bad were there an alternative.
iWork 13 — A Huge Regression

Alternatives to Applescript depends entirely on one’s own technology stack and a willingness to accept change. To put it another way, alternatives to Applescript simply do not exist only when one decides to stick with Mac OS.

I am exploring options in still maintaining consistent, reliable workflows to create image banks with Adobe Creative Cloud on the Mac. Right now, I am focused on the idea that since I am working with as much text as I am art, I can leverage tab-delimited files (a perennial favorite around here) saved from Excel and JSX’s baked-in XML parsing to create an content transfer spec by creating an in-house content transfer file specification. This means that one-button-click processes turns into n-button-clicks processes, but I still get to keep my investments in Mac OS. I’m not thrilled with this and neither will my team.

Another option is to remove Creative Cloud from the process completely, especially since we are moving everything to web anyway and layouts can be managed with CSS. The PowerPoints are a sticky wicket since that is one of the most-requested formats, and we are beholden to our customers. If not, I would jump to PDF in a second, making the slides using LaTeX. Convincing Sales and Editorial to migrate to PDF, however, would be a huge leap, but one that I am also considering for a variety of reasons, not just this.

Luckily for me, Adobe’s applications are cross-platform as is their JSX platform, and Adobe still maintains a Visual Basic API. The implication is that when Microsoft decides to drop Applescript support from Office as well—given that Adobe and Apple are moving in this direction, why should Microsoft continue?—I can make the move to Windows for process automation and use Visual Basic to automate Excel and then, with some relatively minor changes, carry over my JSX code as writ.

There are some serious caveats to significant of a migration, training and monetary investment notwithstanding. Primarily, if I can reliably call a JSX script with arguments via Visual Basic, then I have effectively re-established where I was before Creative Cloud was released. If Adobe has not treated Visual Basic with the same disdain as it has Applescript, then I could eventually be back where I was before JSX was released. That will take time, but in talking with our preferred vendors, the fact that we are using Macs for such large-scale production makes us the odd one out anyway. I will always have a MacBook Pro by my side for everything else, but then I lose my ability to effectively replicate my environment across both computers.

Switching platforms is something that I don’t want to do but the enormity of my Applescript code cannot be overlooked. At the same time, I have made significant investments in Mac-only applications and technologies—BBEdit, OmniGraffle, Cocoa framework—and now Unix now that I have been in school these past couple of years. There is still a lot of Applescript code running around here (and elsewhere) that is useful and viable.

But, when I do take a step back and survey the current environment as it pertains to publishing automation, there is little being done these days that is specific to Mac OS, and with the limited resources available to me I need to consider where I will receive the most support from the platforms upon which I am supporting my career. I’ll try a couple things first and then weigh my options.

Excel files and Scripting Bridge: Three conditions

Working with Excel files in Excel via Scripting Bridge is non-obvious, even when the bulk of the work done is extracting information from Excel like I do. Exporting content from Excel into CSV or a tab-delimited file isn’t an option since oftentimes I need to capture styles applied to text that is simply lost upon export. I have to work in Excel, and over time I have learned that in order to work with an Excel file with Scripting Bridge, the file must meet these conditions:

  • Saved in .XSLX format and not .XLS.
  • Have more than one worksheet (tab).
  • Not have any data connections.

The first condition was simple to troubleshoot and check because the file will not open. There is no visible error, but the returned value is nil. Bam. Done.

The second condition was not as obvious. I discovered the condition through the getItemCountInColumn:removeHeaderRow:inSheet: function in my OCScriptingBridgeExcelController class. The problem was that the function would never stop counting. Columns with 1,000 items would be in the tens of thousands before I figured out what was going on and stopped the application. I don’t know what other issues may arise by having only one tab—finding those would be all but impossible to test given the sheer volume functions available in Excel—but at least I know the issue is there and the condition is easy to test. As to why the number of worksheets would interfere with such a seemingly simple function is buried somewhere in Microsoft.

The third test—no data connections—is proving to be much more difficult and had the same problems as the second. The problem testing for this arises when trying to decipher what Excel means by a data connection. Is this a connection in a pivot table or is this a named item, and what can be done about the connection if one is found? Connections, in general, are pretty well scattered throughout the API. Connections are managed (so far as I have found) in two places: the pivotCache class, accessed through the workbook class, and the namedItem class, accessed through the worksheet class. Documentation on both is just as scattered (and non-existent on the Mac). In my case, it appears the namedItems array is where my found connections are held. In Applescript, there is no way to remove one once it has been created, and I haven’t gotten around to figuring out how to handle the existence of a named item. I am left with the same question as I do for the second condition: What does the existence of a named item have to do with getting information from an Excel spreadsheet? This is just weird.