XML and Search

Back in 1999–2000, I was working for a small, local compositor and doing some freelance graphic design.[1] Working at the compositor was good, if a bit of a grind. I liked the work, but I was trying to figure out how to get onto the publisher side of the equation, because that had more of a long-term future and a variety of career paths.

My mother was working at Pearson (now retired), and one day said to me, “You need to go learn XML. If you learn XML, you will never be out of a job.” What’s interesting about this moment is that my mother disputes two key facts.  First, she doesn’t recall ever having that conversation with me.  Second, she said that if we did have that conversation, she would have said “SGML,” not “XML.”

I swear she said “XML” but she must be right because I promptly went out and bought Practical SGML by Eric van Herwijnen.[2] Besides, we definitely had the conversation because at that time, I never would have come up with SGML on my own. I had never even heard of SGML before then. I was still stuck in Quark and managing the company’s server backups.

Either way, I practically inhaled that book. Pretty much everything made sense, both because of how clearly it explains SGML and because the concepts came naturally to me. Soon after reading it, I tried out some of its concepts with demonstrable success. In 2001, I landed a job at a publisher based on my familiarity with XML through SGML.

There, I was able to demonstrate XML’s power in InDesign by building the company’s international catalog, saving a ton of time and avoiding a lot of errors. That work led to more technical projects, which in turn led to others, each requiring me to learn new skills, and so on. I leveraged all of it to eventually pivot my career from graphic design into computer science, and into opportunities that I found a lot more interesting.[3] Parallel to those efforts, Indian composition firms exploded in size, and a lot of domestic firms ended up being acquired or simply going out of business, including the one I had worked for previously. Which of course means my mother was right that if I learned XML I would never be out of a job.

After working with XML for as long as I have, I understand the developer community’s derision towards it. It’s just not the most exciting technology. It’s verbose and hardly human-readable beyond a low threshold of complexity, and the development tools are usually esoteric and at times outright cryptic. Then there is the whole issue of working with schemas and DTDs, which have their own variant syntaxes. XML itself is not hard to master, but the surrounding environments allow you to get into all sorts of poorly-documented trouble.

But beyond all of those technical thickets lies immense power and value in having all of that content semantically identified. XML is an exceedingly small portion of what I do today, but it is still foundational to it. These days, I am focused on building robust content search and re-use capabilities to meet a wide variety of business needs. XML lies at the core of those efforts because so much of that content is stored in XML or soon will be. Those semantics, knowing what kind of content exists where, are what is going to drive so much of search going forward, which makes XML well worth the effort and reinforces its utility to me today.
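
To make the point about semantics concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The catalog markup, the element names (book, subject, chapter), and the function name are all hypothetical, invented for illustration rather than taken from any real project. The idea is that a plain-text search for "markup" would hit both books, while a search that knows what a subject element means finds only the book that is actually about markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog markup; every element and attribute name is an assumption.
CATALOG_XML = """
<catalog>
  <book isbn="0000000001">
    <title>Practical Markup</title>
    <subject>markup languages</subject>
    <chapter><title>Why Structure Matters</title></chapter>
  </book>
  <book isbn="0000000002">
    <title>City Planning Basics</title>
    <subject>urban design</subject>
    <chapter><title>Markup for Maps</title></chapter>
  </book>
</catalog>
"""


def find_book_titles_by_subject(xml_text, keyword):
    """Return titles of books whose <subject> element mentions the keyword."""
    root = ET.fromstring(xml_text.strip())
    matches = []
    for book in root.findall("book"):
        # Only the <subject> element is inspected, not chapter titles or body text.
        subject = book.findtext("subject", default="")
        if keyword.lower() in subject.lower():
            # "title" here is the direct child of <book>, not a chapter title.
            matches.append(book.findtext("title"))
    return matches


if __name__ == "__main__":
    print(find_book_titles_by_subject(CATALOG_XML, "markup"))
```

The second book mentions "Markup" only in a chapter title, so it is skipped. That is the kind of distinction a search tool can only make when the content has been tagged by meaning.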

If I had to tell someone what I thought the next big career skill would be, I would say learn how to search. Really understand how search tools like Google’s advanced search work, and how to use modifiers like “AND”, “OR”, and “site:”. I’ve been thinking about a quote I read by Chris Bolin, where he wrote [4]…

“Make time. I bet the thing that makes you valuable is not your ability to Google something, but your ability to synthesize information. Do your research online, but create offline.”

I like that quote because it’s definitely true of my career these days, but the ability to search in any system, even Google, has been an immensely beneficial skill to have. Searching Google is not as frustrating when you know the answer you are looking for is just as likely to be a few pages into the search results, and that new search terms may be revealed with a bit of research.

But, in addition to Google, esoteric wikis and content management systems just open up troves of information once I know the search modifiers. It may not be a publicly demonstrable skill like putting together that killer presentation, but the ability to research most any system for what you need is vital to coming to that solution.
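
As a rough illustration of how those modifiers combine, here is a small Python sketch that assembles a query string and a Google search URL. The helper names (build_query, google_url) and the example terms are hypothetical. It assumes only the widely used "OR" and "site:" operators and the standard q parameter, and it leaves "AND" implicit, since Google treats space-separated terms that way.

```python
from urllib.parse import urlencode


def build_query(all_of=(), any_of=(), site=None):
    """Combine required terms, alternative terms, and an optional site filter."""
    parts = list(all_of)  # space-separated terms are treated as an implicit AND
    if any_of:
        parts.append("(" + " OR ".join(any_of) + ")")
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)


def google_url(query):
    """URL-encode the query into a Google search link."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    query = build_query(all_of=["xml", "schema"],
                        any_of=["dtd", "relaxng"],
                        site="example.com")
    print(query)
    print(google_url(query))
```

Other systems spell their modifiers differently, so the real skill is learning each system's equivalents.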


[1] For those uninitiated in publishing, a compositor is a company that essentially builds books, taking manuscript, applying a design to it, and combining components to build a cohesive product. Most of this work is done in Adobe InDesign, but there are many other applications that do something similar, like LaTeX.

[2] https://www.springer.com/us/book/9780792306351

[3] Not that I didn’t like graphic design, but rather I found I was a lot better at programming than I was at graphic design. 

[4] Sorry, can’t find the original link.

Gödel, Escher, Bach

When I was in high school, a bunch of the honors English students had copies of Gödel, Escher, Bach: An Eternal Golden Braid by Douglas Hofstadter. They swore it was an awesome book. I was not an honors English student (I did terrible in high school) but I still hung around them because I was on the Speech and Debate team (one of the things I was actually really good at doing, and the only part of high school I genuinely enjoyed). I bought a copy because I figured they were on to something if they had all read it. Personally, I think some of them bought it because it looked cool in a nerdy sort of way and only claimed to have read it. But I can’t criticize them for that because I bought it for the very same reason. Just thumbing through it, you can feel its appealing pretentiousness. I found a copy for cheap at a used book store, and cracked it open.

I must have tried reading this book several times over the years, but I clearly was not equipped to read something like that (which spurred and fueled my doubts about most of the honors English students’ claims). I knew nothing of computer science and high-level mathematics, much less any formal knowledge of philosophy. I still very much held onto the incorrect notion that much of philosophy was so much navel gazing about the universe. The book read like a foreign language to me. These were English sentences, but the words had little meaning to me. So, I put it away, and did not open it again for another 30 years. Still, I lugged that thing around with a ton of other books I had, from apartment to apartment and even across the country, because it still has that nerdy cachet that can be used to impress friends the way it was used to impress me (albeit falsely, but I think we all do that one way or another).

Just recently, I was looking for a book for a friend in my personal library, and came across my copy of Gödel, Escher, Bach. I soon realized I was now equipped with at least the rudimentary tools to approach this book, because I graduated from Harvard Extension, with honors, with a Computer Science concentration and a Government minor, the latter being grounded in political philosophy, with some moral philosophy and logic thrown in for good measure. So, if I could tackle all of that, I could tackle this book.

Last night, I cracked it open one more time, read the overview, and all of it made sense. I decided right then I’m going to read it finally. It is clearly as pretentious as it appears, but I will have a good time with it all the same, just as I did hanging out with the honors English students while on the Speech and Debate team.

The Wisdom of Insecurity

Music is a delight because of its rhythm and flow. Yet the moment you arrest the flow and prolong a note or chord beyond its time, the rhythm is destroyed. Because life is likewise a flowing process, change and death are its necessary parts. To work for their exclusion is to work against life.

The Wisdom of Insecurity, Alan Watts, 1951

Pocket City

I don’t have a lot of time to play games, as much as I love playing them. So, when I decide on a new game to play, I choose carefully so my time isn’t wasted. For iOS, my “go to” is Minecraft if only because I like to play with Lego, and the analogy between the two is obvious. However, given the increasingly rare time I have to play, it’s been all too easy to get into a rut. A pleasant distraction still, but a rut all the same.

I don’t know how I landed on Kotaku’s review of Pocket City but it reminded me of how much I loved playing SimCity way back in the days of monochrome Macs. I looked at SimCity, itself, but it has in-game currency to purchase, which is a business model I won’t support. I prefer to pay a fair price up front, and get the whole game, even if I have to unlock content through well-designed achievements. A sandbox or creative mode is even better. Minecraft has in-game purchases, I know, but the difference is that I don’t feel pressured to buy anything. I can take it or leave it, and the entirety of Minecraft is available solely through gameplay.

The point here being that Kotaku’s review of Pocket City is spot on. If you miss SimCity’s creativity and depth, and don’t want to deal with the chicanery of in-game currency, then Pocket City is the way to go. At $4.99, it’s practically a steal.

P.S.: In case anyone asks, I’m not associated with Kotaku or Pocket City. I’m just an intensely fickle customer who has been delighted.


Vox.com: “Epistocracy: a political theorist’s case for letting only the informed vote”

Georgetown University political philosopher Jason Brennan, author of Against Democracy:

We know that an unfortunate side effect of democracy is that it incentivizes citizens to be ignorant, irrational, tribalistic, and to not use their votes in very serious ways. So this is an attempt to correct for that pathology while keeping what’s good about a democratic system.

We have to ask ourselves what we think government is actually for. Some people think it has the value a painting has, which is to say that it’s symbolic. In that view, you might think, “We should have democracy because it’s a way of civilizing and expressing the idea that all of us have equal value.”

There’s another way of looking at government, which is that it’s a tool, like a hammer, and the purpose of politics is to generate just and good outcomes, to generate efficiency and stability, and to avoid mistreating people. So if you think government is for that purpose, and I do, then you have to wonder if we should pick the form of government that best delivers the goods, whatever that might be.

I read this book when it first came out. Even if you do not agree with his solution of replacing democracy with an epistocracy, his critique of modern democracy is witheringly on point and worth a read by anyone who is interested in government, regardless of their political leanings.

Those Three Five-Letter Words

Every year at tax time I am reminded of my high school history teacher who had a sign posted above the chalkboard in the front of the classroom that read “Life can be summed up in a series of five-letter words: TOUGH, TAXES, DEATH.” He was a total hardass. And sometimes I actually wonder why I am a pessimist, as if somehow there is no answer to that question. Then I do taxes, and then I’m all “Oh, yeah…”

“The Prince,” Machiavelli

At a recent leadership training held by my company, there was a team dinner where we had to present to another participant a book on leadership that meant something to us, and give a 3-minute presentation on why. I’ve read very few business advice books in general, so I was kind of stuck. But I did have to read Machiavelli’s The Prince while in school, and very much enjoyed it. I re-read it, revisited the lecture and my notes, and wrote this presentation (3 minutes spoken):

Fortune and luck, and their intertwined relationship, are the two forces Machiavelli is primarily concerned with in “The Prince.” Fortune is about being in control, and luck is about not being in control. To acquire fortune and manage luck, one must play the political game which has three goals: gaining power, maintaining power, and using power. Machiavelli understands fundamentally that leaders do not have an entitlement to power. Leaders have to obtain power through a variety of means, all by leading well through the political game, and are capable of losing power when they lead poorly. Power, according to Machiavelli, is subject to change, and change that is not always in a leader’s control.

To claim Machiavelli is cruel or amoral, that he is using the ends to justify the means simply to maintain power, is dismissive of his actual lessons. Machiavelli argues that while leadership and morality are to be separate considerations from each other at the state level, and one could argue they ought to be tightly coupled, that does not change the fact that leadership and its politics are a messy yet necessary game that everyone plays at some level. A good leader will know how best to play that game to earn as much fortune and power as possible, for that is what it means to be an excellent leader in his view of morality and ethics. He states no position on whether this is right or wrong, but simply observes that everyone is playing the same game with each other, and offers advice on how to be the best leader.

We have to rely on other people to achieve our goals (direct reports, colleagues, executives, vendors, and customers), each of whom has their own goals and politics. While Machiavelli warns against relying on others too much, he clearly understands there is no long-term success as a “lone wolf.” Power and leadership mean nothing without people and events to influence. To that end, Machiavelli exemplifies core virtues of what he considers to be excellent leadership, particularly an understanding of foresight and context.

A good leader must have the foresight to predict opportunity (good luck) and problems (bad luck) before they occur. To maintain power and fortune, a good leader must have a grasp of luck if only because bad luck can cause a leader to lose their fortune and power. The better a leader can foresee problems, the better they can manage luck. Whether a leader faces good or bad luck does not define good leadership in and of itself; what defines it is how they react to that luck to achieve the goals of the political game. If a leader leads well in the face of either, they maintain their power. If they lead poorly, they lose power and deservedly so.

A good leader must also have a good understanding of when the rules of the political game have changed. Leadership skills that work in one context may not (and likely will not) work in another. A good leader needs to understand that different contexts exist, most of which are outside of a leader’s control, and quickly recognize when the context has changed, if not anticipate it beforehand as they should with good and bad luck.

The world is a different place than it was 500 years ago, and the stakes in a feudal state are obviously different than they are for a 21st-century corporation. But people have not changed all that much since 1532. Much of what Machiavelli wrote still rings true today, whether we like it or not. Fortune and luck are fickle, and leading through them both requires playing the political game, for which there are mutually understood rules. Take them or leave them at your peril.

Through all that, it is still up to you to decide what kind of leader you want to be. Will you be judicious? Will you be punitive? Will you be compassionate? Will you be focused? Will you be affiliative? Machiavelli would argue that you should be any of those whenever appropriate to be the best leader you can be. To that end, I wish you good fortune, and way more than just luck.

The Bar Room Polka JSON

Dummy content for a POC project in JSON based on “The Bar Room Polka” by Frankie Yankovic. If you haven’t heard this song, you should. The content provides a nice mix of data structures and missing values. Besides, “plush line club” is just an awesome vintage phrase. My only question is: where the hell is “Tweedunk”?


{
  "places": ["tavern", "salon", "cabaret"],
  "nowhere": ["dive", "joint"],
  "everywhere": "plain old fashioned drunk",
  "Pennsylvania": {
    "itsCalledA": "taproom",
    "theyServe": "n/a"
  },
  "Manhattan": {
    "itsCalledA": "plush line club",
    "theyServe": "n/a"
  },
  "Chicago": {
    "itsCalledA": "gin mill",
    "theyServe": "n/a"
  },
  "London": {
    "itsCalledA": "blooming pub",
    "theyServe": "n/a"
  },
  "Newport": {
    "itsCalledA": "n/a",
    "theyServe": "cocktails"
  },
  "Tweedunk": {
    "itsCalledA": "n/a",
    "theyServe": "boiler maker"
  },
  "Milwaukee": {
    "itsCalledA": "bar room",
    "theyServe": "n/a"
  },
  "Wilksbury": {
    "itsCalledA": "meeting hall",
    "theyServe": "n/a"
  },
  "Greenland": {
    "itsCalledA": "beer joint",
    "theyServe": "n/a"
  },
  "Tennessee": {
    "itsCalledA": "cider mill",
    "theyServe": "n/a"
  },
  "Altoona": {
    "itsCalledA": "n/a",
    "theyServe": "spooners"
  },
  "Bridgeport": {
    "itsCalledA": "honky talk",
    "theyServe": "n/a"
  }
}
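
For anyone who wants to poke at this kind of dummy content, here is a minimal Python sketch of one way to handle the mix of structures and missing values. It embeds a short excerpt of the data as strict JSON (double-quoted keys and strings) so that it runs on its own. The function names normalize and drinks_by_city are hypothetical, and treating "n/a" as a missing value is an assumption about how the placeholder is meant to be read.

```python
import json

# A short excerpt of the dummy content, written as strict JSON.
EXCERPT = """
{
  "places": ["tavern", "salon", "cabaret"],
  "everywhere": "plain old fashioned drunk",
  "Pennsylvania": {"itsCalledA": "taproom", "theyServe": "n/a"},
  "Newport": {"itsCalledA": "n/a", "theyServe": "cocktails"},
  "Tweedunk": {"itsCalledA": "n/a", "theyServe": "boiler maker"}
}
"""


def normalize(value):
    """Recursively replace the 'n/a' placeholder with None."""
    if isinstance(value, dict):
        return {key: normalize(item) for key, item in value.items()}
    if isinstance(value, list):
        return [normalize(item) for item in value]
    return None if value == "n/a" else value


def drinks_by_city(data):
    """Map each city-style entry (a nested object) to what it serves, if known."""
    return {
        city: entry["theyServe"]
        for city, entry in data.items()
        if isinstance(entry, dict) and entry.get("theyServe") is not None
    }


data = normalize(json.loads(EXCERPT))

if __name__ == "__main__":
    print(drinks_by_city(data))
```

The top-level keys mix lists, a bare string, and nested objects, which is why drinks_by_city checks the type of each entry before reading from it.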




“Abstraction is the elimination of the irrelevant and the amplification of the essential”

—Robert C. Martin, Agile Principles, Patterns, and Practices in C#

Rob Pike’s 5 Rules of Programming

  • Rule 1. You can’t tell where a program is going to spend its time. Bottlenecks occur in surprising places, so don’t try to second guess and put in a speed hack until you’ve proven that’s where the bottleneck is.

  • Rule 2. Measure. Don’t tune for speed until you’ve measured, and even then don’t unless one part of the code overwhelms the rest.

  • Rule 3. Fancy algorithms are slow when n is small, and n is usually small. Fancy algorithms have big constants. Until you know that n is frequently going to be big, don’t get fancy. (Even if n does get big, use Rule 2 first.)

  • Rule 4. Fancy algorithms are buggier than simple ones, and they’re much harder to implement. Use simple algorithms as well as simple data structures.

  • Rule 5. Data dominates. If you’ve chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming.

A bit more can be found at the source here: http://users.ece.utexas.edu/~adnan/pike.html

In my experience, I’ve come across Rule 1 enough times to know how to avoid the less obvious bottlenecks (which for me generally involve data merging). I’ve definitely come across Rule 4. Rule 5 trumps them all, as it were. A deep understanding of which data structures are appropriate is core to good code. Rules 2 and 3 don’t generally impact my day-to-day programming.
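
Here is a minimal sketch of how Rules 2 and 5 can meet in a data-merging task. The record shapes (orders and customers) and all of the function names are hypothetical, made up for illustration. It joins two lists two ways, once by scanning and once through a dict index, checks that both give the same result, and then measures them with the standard library's timeit rather than guessing.

```python
import timeit


def merge_with_scan(orders, customers):
    """Join by scanning the customer list for every order: O(n * m)."""
    merged = []
    for order in orders:
        for customer in customers:
            if customer["id"] == order["customer_id"]:
                merged.append({**order, "name": customer["name"]})
                break
    return merged


def merge_with_index(orders, customers):
    """Join through a dict keyed on customer id: O(n + m)."""
    by_id = {customer["id"]: customer for customer in customers}
    return [
        {**order, "name": by_id[order["customer_id"]]["name"]}
        for order in orders
        if order["customer_id"] in by_id
    ]


def make_data(n):
    """Build n hypothetical customers and n orders that reference them."""
    customers = [{"id": i, "name": f"customer-{i}"} for i in range(n)]
    orders = [{"order": i, "customer_id": (i * 7) % n} for i in range(n)]
    return orders, customers


if __name__ == "__main__":
    for n in (10, 1000):
        orders, customers = make_data(n)
        # Both approaches must agree before their speed is worth comparing.
        assert merge_with_scan(orders, customers) == merge_with_index(orders, customers)
        for fn in (merge_with_scan, merge_with_index):
            seconds = timeit.timeit(lambda: fn(orders, customers), number=5)
            print(f"n={n:5d}  {fn.__name__:17s} {seconds:.4f}s")
```

Actual timings will vary by machine, so none are quoted here. The thing to watch is how the gap between the two changes as n grows, which is Rule 3 showing up in the measurements.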

Via Hacker News, which I’m sure will lead to a killer comments thread.