On Software Quality

One perspective that has profoundly changed since my days as a junior engineer is how I see software quality. Through decades of economic expansion and the Internet boom, our industry has settled into a rather sloppy norm on quality. In this post, I share my takes on how you should approach software quality.

Your User is Your Customer

Only software engineers and drug dealers call their customers “users”.

— a random joke on Internet

The joke is meant to be a joke, but there is some truth in it. The term “user” fails to remind listeners that software, like everything engineered in the world, is experienced, explored, and felt by human beings, people who make intentional efforts to tinker with your creation.

Some people find the word strange because “customer” implies these people are paying for the software. I would argue that this is always true, no matter your business model. Traditionally, your customer paid for a software download, or for the CD-ROM containing the software. Nowadays they may pay through subscriptions. Often people pay indirectly through advertisers that put ads on your UI. Even the free-tier customers of your freemium service pay with their time.

Surely there are times when you need to distinguish between “end-users” and “external developers” when your software interfaces with both. Or your business software may be paid for by businesses on behalf of their employees. Yet at the end of the day, all parties are your customers.

Software engineering is one of the highest-paying industries. Like any other economic activity, it interacts with the rest of the economy, creates value, and transfers wealth. Referring to your customers as “customers” acknowledges this fact with gratitude.

Software Errors are Interruptive

Try to reload the page if you encounter an issue. Clear cookies and open the browser again.

— literally every troubleshooting article

No, your website shouldn’t stop working because you fail to bust the cache. Your app shouldn’t crash because you fail to handle an edge case. Period.

Sure, it is hard on the web to deliver assets atomically. Formal methods are not realistically deployable yet. But those are problems our industry needs to solve collectively, not hand-wave away and leave to the “users.”

During the boom, such an attitude was tolerated, because software created much more value than the economy that predated it. Software engineering is the economy now. It occupies a much more critical role in people’s lives than that gimmicky website for ordering books.

Assume people’s lives depend, at least indirectly, on your software doing the right thing, even if the use cases don’t suggest that. We are not all software engineers working on avionics software, but you never know how your software will be repurposed.

We are, ultimately, all software customers. You wouldn’t want to spend 5 minutes resetting the light bulbs every other day, would you?

Respect your Quality Assurance Engineers

The next inspector is the customer.

— a banner that should be in your office

For some reason, QA engineers are generally paid less than software engineers. That translates to an unhealthy disrespect toward QA engineers: pushing back on what they find, questioning their value to the project, and so on.

It is true that QA engineering is the easiest entry point into software engineering (the next being, err, web front-end engineering). The best QA engineers are, however, generalists. They know a little bit of everything so they can effectively ensure software quality; they self-motivate and constantly reinvent themselves in search of the next best way to identify customer-facing issues. They do much, much more than run manual test cases.

Sadly, I do not see this notion changing any time soon. Life finds a way: many QA engineers end up taking the title of “automation engineer” or “build and release manager,” in an attempt to shed the stigma of the plain title with whatever supporting responsibility the team shoves on them.

Assume the QA engineer in your team does more than manual testing, and stop treating them with whatever feeling arises when you hear the title “QA engineer.” We are all responsible for quality. Given how critical software errors may be, identifying them before shipping could very well save lives.

Quality Metrics can be Automated, but Release Cannot

A computer can never be held accountable. Therefore a computer can never make a management decision.

— a memorable quote, allegedly from IBM in 1979

Tests can definitely be automated and they should be automated. Quality metrics and observability are software engineering problems with software solutions. However, eventually, someone needs to make the judgment call on quality and releases.

Software release/sign-off quality is, and forever will be, the managerial decision that can’t be automated away. This is a responsibility that falls under QA engineering, which makes their job indispensable.

Even if your project has no dedicated people responsible for QA or testing, someone is making a go/no-go decision based on quality. That someone, by definition, is assuming the role.

Developer Experience is a Means, not an End

Especially in Web Engineering, people lose the big picture when engaging in developer experience (DX) discussions. Unidirectional or bidirectional data flow. Statically typed or dynamically typed. Test case paradigm on continuous integration pipelines. Git flow or GitHub flow. The list goes on.

There is no point in arguing any of these if people can’t agree that the end goal is to elevate software quality. Developers’ quality of life is important, but it is secondary to that goal. For me, the best developer experience is what allows me to make fewer mistakes, in the same development timespan.

There will always be “rockstar developers” showing up and telling you their next project is the best thing ever, or their methodology is perfect. Give them a hard look through the lens of software quality — many of them fall apart pretty quickly.

One persistent DX argument is “fast.” A faster tool does contribute to software quality by tightening the feedback loop. Regretfully, the adoption and maintenance cost of a newer, immature tool often offsets the gain.

Compared to “fast,” I am much more interested in “correct,” like a programming language that can eliminate use-after-free, or merely prevent TypeErrors in JavaScript. Only you are in a position to judge the adoption and maintenance cost though, no one else.


Software provides value like any other machinery driving the world. Its excellence is measured by quality, and quality is measured by the experiences of those who interact with the software.

As software engineers, we are mortal, code-typing, intelligent monkeys, sitting in front of a keyboard trying to ask silicon to do the right thing. We are undoubtedly empowered to figure out “how” we can be better at doing it, but we should do so without losing sight of “why,” and while treating others better.

A new take on Asian IME for English audiences

East Asian Input Methods are not hard to understand, but for English speakers, we can do better than a general explanation on Wikipedia with a concrete example. This post is me attempting to do that for English audiences.

It is written for my own amusement, but I hope you like it. My recommendation is to read it once without tapping on the footnotes, then skim through it again with them.

Spoken languages are humans conveying ideas by making sounds. A human can only make a limited number of sounds, due to anatomy. Not every sound humans can make is used for a given spoken language.

There are around 44 sounds in spoken English. Linguists call them phonemes. English is usually written in the Latin alphabet, which has 26 letters. To represent 44 sounds, combinations of letters are used. Linguists call these letter combinations graphemes. Each English phoneme can be represented by many graphemes (too many for some, absent a spelling reform). Humans are taught to pick the right grapheme combinations to write down the exact words they intend to speak. It’s called spelling.

Now, imagine there is a language[1] written using a different script. Instead of the Latin alphabet, it would traditionally be written with graphemes composed of distinct shapes of drawings[2]. Linguists call these shapes monograms. This particular imaginary script comes with tens of thousands of these monograms, which relate to the phonemes of the imaginary language the same way English graphemes relate to English phonemes: just as humans are taught to spell English words correctly, they are taught to pick the right monogram combinations to write down the exact words they intend to speak.

Most humans are born with ten fingers. Modern computer keyboards come with around 78 keys, designed for the ten fingers to operate. This is enough for the Latin alphabet given that there are only 26 letters, but far from enough for the monograms of the imaginary language. Something would have to be done about that.

Thankfully, because it is a human language, the number of sounds in the imaginary language would still be within a manageable magnitude. Before the introduction of modern computers, local linguists would have already identified these phonemes. They would have gone further and invented a set of symbols for these sounds. These symbols, called phonetic symbols, would “spell” a phoneme with just one to four symbols[3]. Unlike English, since the symbols were constructed and not naturally developed, each symbol combination would only represent one phoneme, and each phoneme would only be written by one symbol combination, systematically.

Aside: Other linguists disagreed and used the Latin alphabet to “spell” the phonemes of the same language[4]. The principle is the same, though.

Aside: Local linguists of another imaginary language picked a different route and decided to invent symbols to directly represent each sound. It’s “less” systematic, but it gets the job done too[5].

Since the number of symbols is limited, they could then be arranged on a modern computer keyboard. Computers would then be loaded with a program[6] that allows humans to pick the right monogram for each symbol combination, as they type.

Aside: For spelling the language with the Latin alphabet, it is even easier: you don’t even have to arrange a different set of symbols on the keyboard.

When computers were dumb with limited capacity, these programs would only be implemented with a simple mapping table, mapping symbol combinations to monograms. This would be quite cumbersome, because words that people type are often the same monogram combinations, and humans really hate to repeat themselves.
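
As a rough illustration of such a mapping table, here is a minimal Python sketch. It borrows the Bopomofo symbols named in the footnotes as keys. The table contents, the `CHAR_TABLE` name, and both helper functions are hypothetical, chosen only for this example, and do not describe how any particular program was implemented.

```python
# Toy monogram-level table: one typed symbol combination -> candidate monograms.
# The entries are a tiny hand-picked sample, not a real dictionary.
CHAR_TABLE: dict[str, list[str]] = {
    "ㄊㄞˊ": ["台", "抬"],
    "ㄨㄢ": ["灣", "彎"],
    "ㄅㄟˇ": ["北"],
}


def candidates(combination: str) -> list[str]:
    """Return the monograms offered for one typed symbol combination."""
    return CHAR_TABLE.get(combination, [])


def type_text(combinations: list[str], picks: list[int]) -> str:
    """Simulate typing: the user picks one candidate (by index) per combination."""
    return "".join(candidates(c)[i] for c, i in zip(combinations, picks))


if __name__ == "__main__":
    print(candidates("ㄊㄞˊ"))
    print(type_text(["ㄊㄞˊ", "ㄨㄢ"], [0, 0]))
```

Note that the person typing has to make one pick per monogram, every time, even for a word typed a hundred times a day. That is the cumbersome part.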

Aside: A different school of programs for the same purpose would instead map monograms to visual symbols by decomposing their shapes, instead of the sounds they represent. Their mapping table would map visual symbol combinations to monograms[7]. This is very helpful for typing a monogram without knowing its sound and/or disregarding the spoken language being written. Some argue it is easier to type too, given that arbitrarily more symbol combinations can be designed for the program, compared to a fixed number of phonemes.

As computers became more powerful, a new class of programs would have been developed. Instead of mapping symbol combinations to monograms (the shapes that make up the words), these programs would map multiple symbol combinations to words[8]. It would need a bigger mapping table for sure, and the table would also require constant curation, because of the endless evolution of human thoughts and their new words (in comparison, new monograms are rarely added).

Thankfully, computers are also powerful enough to manage these tasks. Maintaining and developing these bigger mapping tables is also helped by the fact that computers have since been connected across the planet[9] (and its lower orbits[10], to be exact), so it would not be hard for computers to find a large body of text written in the imaginary language (a “text corpus,” as linguists and computer scientists call it) waiting to be extracted and processed[11].

Thus, through the ingenuity of these humble programs built upon linguistic knowledge, our imaginary language would have been allowed to thrive in the Information Age, expressed in monograms the same way it would have been written down for thousands of years, and perhaps for thousands of years to come.

If you like this post, you would like my not-to-be-updated JSZhuyin and its interactive tutorial. I would imagine you already frequent many YouTube videos on linguistics and are a fan of the movie Arrival, like me.
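
A word-level table can be sketched in Python like this. The entries, the names, and the greedy longest-match strategy are all assumptions made for illustration. Real programs of this class typically rank candidates with statistics drawn from a large body of text rather than a fixed greedy rule.

```python
# Toy word-level table: a sequence of symbol combinations -> a whole word.
# Keys are tuples so that multi-combination words can be looked up directly.
WORD_TABLE: dict[tuple[str, ...], str] = {
    ("ㄊㄞˊ", "ㄨㄢ"): "台灣",
    ("ㄊㄞˊ", "ㄅㄟˇ"): "台北",
    ("ㄊㄞˊ",): "台",
    ("ㄨㄢ",): "灣",
    ("ㄅㄟˇ",): "北",
}
MAX_WORD_LEN = max(len(key) for key in WORD_TABLE)


def convert(combinations: list[str]) -> str:
    """Greedily replace the longest known run of combinations with its word."""
    out: list[str] = []
    i = 0
    while i < len(combinations):
        # Try the longest possible word first, then shorter ones.
        for n in range(min(MAX_WORD_LEN, len(combinations) - i), 0, -1):
            key = tuple(combinations[i:i + n])
            if key in WORD_TABLE:
                out.append(WORD_TABLE[key])
                i += n
                break
        else:
            # Unknown combination: pass it through unchanged.
            out.append(combinations[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(convert(["ㄊㄞˊ", "ㄅㄟˇ", "ㄊㄞˊ", "ㄨㄢ"]))
```

With a table like this, a common word comes out in one go, without a pick per monogram. The cost is exactly the one described above: the table grows with the vocabulary and has to be kept current.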

  1. Mandarin Chinese is the imaginary language in question.
  2. Mandarin Chinese is traditionally written in Chinese characters, a set of monograms shared among East Asian languages. Among these languages, the usage of Chinese characters has only survived in Chinese, Japanese, and Korean, abbreviated as CJK in the information processing field.
  3. This is how the Bopomofo phonetic symbol system spells Mandarin sounds. It was invented in the 1900s. Hangul, invented in 1443 and thus predating modern linguistics, works more or less the same way for spelling Korean.
  4. The Pinyin system is designed to spell Mandarin with the Latin alphabet.
  5. Japanese Kana is one such system.
  6. The programs are Input Method programs, or IMEs, the subject of our discussion here.
  7. An example of this kind of IME is the Cangjie input method, which codes Chinese characters with 24 invented “radical” symbols.
  8. These newer IMEs are often dubbed “smart” or “intelligent” IMEs. As mentioned in a later paragraph, all IMEs are “smart” nowadays.
  9. The Internet and the World Wide Web, if you haven’t heard about them.
  10. There is Internet access on the International Space Station, usable by astronauts. One got sued for it (and vindicated).
  11. This study of distilling human language using computers is called Natural Language Processing.

Status of IDN ccTLDs

For some reason, work has taken me to investigate the current usage of internationalized country code top-level domains (IDN ccTLDs), something I first came across almost two decades ago.

I remember it was a big thing being promoted by NICs. As a web engineer, I also found it to be an interesting technical endeavor (with Punycode, etc.) and spent my own effort to make sure the <IDN>.tw site I managed at the time also resolved on <IDN>.台灣, given that per NIC rule they auto-registered you with the IDN ccTLD when you registered a second-level domain under the ccTLD. Edit: I misremembered this.

Fast forward to today: I struggled to find a live website that resolves on an IDN ccTLD hostname. I no longer handle that <IDN>.台灣 website and my successor broke it (probably because I failed to document my work). The university websites that I knew of at the time have all stopped resolving on their IDN ccTLD hostnames. Hell, even the TWNIC website doesn’t resolve on twnic.台灣!
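
For readers who have not met Punycode: DNS itself only carries ASCII, so each non-ASCII label of a hostname is converted to an ASCII form prefixed with `xn--`. A minimal Python sketch of the round trip follows, using the standard library’s built-in `idna` codec (which implements the older IDNA 2003 rules). The function names are made up for this example, and `example.台灣` is a hypothetical hostname, not a site known to exist.

```python
def to_ascii_hostname(hostname: str) -> str:
    """Convert a Unicode hostname to its ASCII (Punycode) form, label by label."""
    return hostname.encode("idna").decode("ascii")


def to_unicode_hostname(hostname: str) -> str:
    """Convert an ASCII (Punycode) hostname back to its Unicode form."""
    return hostname.encode("ascii").decode("idna")


if __name__ == "__main__":
    ascii_name = to_ascii_hostname("example.台灣")  # hypothetical hostname
    print(ascii_name)
    print(to_unicode_hostname(ascii_name))
```

The ASCII form is what actually goes into the DNS query; browsers convert back and forth so that people only see the Unicode form.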

Eventually through the wonder of Wikipedia, I found the one website that resolves: уміц.укр, Ukrainian Network Informational Centre. It is good enough for me even though it won’t connect over HTTPS.

Ukrainians never disappoint.