XML and Search

Back in 1999–2000, I was working for a small, local compositor and doing some freelance graphic design.[1] Working at the compositor was good, if a bit of a grind. I liked the work, but I was trying to figure out how to get onto the publisher side of the equation, because that had more of a long-term future and a variety of career paths.

My mother, who has since retired, was working at Pearson at the time, and one day she said to me, “You need to go learn XML. If you learn XML, you will never be out of a job.” What’s interesting about this moment is that my mother disputes two key facts. First, she doesn’t recall ever having that conversation with me. Second, she says that if we did have that conversation, she would have said “SGML,” not “XML.”

I swear she said “XML,” but she must be right, because I promptly went out and bought Practical SGML by Eric van Herwijnen.[2] Besides, we definitely had the conversation, because at that time I never would have come up with SGML on my own. I had never even heard of it before then. I was still stuck in Quark and managing the company’s server backups.

Either way, I practically inhaled that book. Pretty much everything made sense, both because of how clearly the book explains SGML and because the concepts came naturally to me. Soon after reading it, I tried out some of those concepts myself, with demonstrable success. In 2001, I landed a job at a publisher based on my familiarity with XML through SGML.

There, I was able to demonstrate XML’s power in InDesign by building the company’s international catalog, which saved a ton of time and cut down on errors. That work led to more technical projects, which in turn led to others, each requiring new skills, and so on. I leveraged all of that to eventually pivot my career from graphic design into computer science and toward opportunities I found a lot more interesting.[3] Parallel to those efforts, Indian composition firms exploded in size, and a lot of domestic firms ended up being acquired or simply going out of business, including the one I had worked for previously. Which of course means my mother was right: if I learned XML, I would never be out of a job.

After working with XML for as long as I have, I understand the derision towards XML by the developer community at large. It’s just not the most exciting technology. It’s verbose, it’s hardly human-readable beyond a low threshold of complexity, and the development tools are usually esoteric and at times outright cryptic. Then there is the whole issue of working with schemas and DTDs, which have their own variant syntaxes. XML itself is not hard to master, but the surrounding environments allow you to get into all sorts of poorly-documented trouble.

But beyond all of those technical thickets lies immense power and value in having all of that content semantically identified. XML is an exceedingly small portion of what I do today, but it is still foundational to what I do. These days, I am focused on building robust content search and re-use capabilities to meet a wide variety of business needs. XML lies at the core of those efforts because so much of that content is stored in XML or soon will be. Those semantics, knowing what kind of content exists where, are what is going to drive so much of search going forward, which makes XML well worth the effort and reinforces its utility to me today.
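As a rough illustration of why semantically identified content matters for search, here is a minimal Python sketch. The catalog markup, its element names (catalog, book, title, author, subject), and both helper functions are invented for this example and are not taken from any real schema or system. It contrasts a search that uses the markup with a naive full-text match.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog markup; element and attribute names are invented.
CATALOG_XML = """
<catalog>
  <book isbn="0000000001">
    <title>Intro to Markup</title>
    <author>A. Writer</author>
    <subject>publishing</subject>
  </book>
  <book isbn="0000000002">
    <title>Publishing Workflows</title>
    <author>B. Editor</author>
    <subject>production</subject>
  </book>
</catalog>
"""


def find_titles_by_subject(xml_text, subject):
    """Semantic search: match only on the <subject> element."""
    root = ET.fromstring(xml_text)
    return [
        book.findtext("title")
        for book in root.findall("book")
        if book.findtext("subject") == subject
    ]


def find_titles_by_plain_text(xml_text, word):
    """Naive search: match the word anywhere inside a <book>."""
    root = ET.fromstring(xml_text)
    word = word.lower()
    return [
        book.findtext("title")
        for book in root.findall("book")
        if any(word in (el.text or "").lower() for el in book.iter())
    ]


if __name__ == "__main__":
    print(find_titles_by_subject(CATALOG_XML, "publishing"))
    print(find_titles_by_plain_text(CATALOG_XML, "publishing"))
```

The subject-based search returns only the book tagged with that subject, while the plain-text search also picks up the book that merely has the word in its title. That difference is the kind of precision the markup buys.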

If I had to tell someone what I thought the next big career skill would be, I would say learn how to search. Really understand how search tools like Google’s advanced search work, and how to use modifiers like “AND”, “OR”, and “site:”. I’ve been thinking about a quote I read by Chris Bolin, who wrote [4]…

“Make time. I bet the thing that makes you valuable is not your ability to Google something, but your ability to synthesize information. Do your research online, but create offline.”

I like that quote because it’s definitely true of my career these days, but the ability to search in any system, even Google, has been an immensely beneficial skill to have. Searching Google is not as frustrating when you know that the answer you are looking for may just as well be a few pages into the search results, and that a bit of research may reveal new search terms.

But in addition to Google, esoteric wikis and content management systems open up troves of information once I know their search modifiers. It may not be a publicly demonstrable skill like putting together that killer presentation, but the ability to search almost any system for what you need is vital to arriving at a solution.
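For anyone who wants to practice with modifiers, here is a small Python sketch that assembles a query string from required terms, alternatives joined with “OR”, and an optional “site:” restriction. The function names build_query and search_url are made up for this example, and it assumes the target engine treats a space between terms as an implicit “AND”, which is common but not universal, so check the help pages of whatever system you are searching.

```python
from urllib.parse import urlencode


def _quote_phrase(term):
    """Wrap multi-word terms in double quotes so they match as a phrase."""
    return f'"{term}"' if " " in term else term


def build_query(all_terms=(), any_terms=(), site=None):
    """Combine terms into one query string.

    all_terms: every term must match (assumes a space means AND).
    any_terms: at least one must match (joined with OR).
    site:      optional domain for a site: restriction.
    """
    parts = [_quote_phrase(t) for t in all_terms]
    if any_terms:
        parts.append("(" + " OR ".join(_quote_phrase(t) for t in any_terms) + ")")
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    """URL-encode the query as the q parameter of a search URL."""
    return base + "?" + urlencode({"q": query})


if __name__ == "__main__":
    query = build_query(["xml", "schema"], ["dtd", "relaxng"], site="example.com")
    print(query)
    print(search_url(query))
```

The same pattern carries over to a wiki or content management system: swap the base URL and adjust the modifier syntax to whatever that system documents.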


[1] For those uninitiated in publishing, a compositor is a company that essentially builds books, taking a manuscript, applying a design to it, and combining components to build a cohesive product. Most of this work is done in Adobe InDesign, but there are many other applications that do something similar, like LaTeX.

[2] https://www.springer.com/us/book/9780792306351

[3] Not that I didn’t like graphic design, but rather I found I was a lot better at programming than I was at graphic design. 

[4] Sorry, can’t find the original link.