我也寫了輸入法,而且用 Javascript!

聽完 jserv 在 COSCUP 2012 講的新酷音輸入法專案,決定來把這篇欠很久的文章寫完,順便換個標題。

在 Mozilla 工作,加入 Boot to Gecko 專案的第一個工作是處理 Gaia 的輸入法,弄了大約一兩個月,最後做出了可以在瀏覽器獨立運作(不需要從伺服器選字)的自動選字注音輸入法:

Gaia 注音輸入法「JS 注音」瀏覽器示範網頁

詞庫

詞庫要匯進程式勢必要轉成 JSON 格式。原本是從新酷音的詞庫轉出來的,後來發現 LGPL 跟專案不相符(我們用 Apache),找了很久,最後找到 mjhsieh 維護的小麥注音詞庫。至於轉檔,本來一度寫了 phantomJS 的 script 來處理,後來改用自家的 Javascript shell。比較麻煩的就是 jsshell 要用 -U 啟動不然字串不會被當成 UTF-8。

資料庫

要把 3MB 的 JSON 整包放在記憶體也不是不行,只是這樣放手機是跑不動的。另外的後果就是這樣每次網頁載入就要每次重新下載,還不能保證會不會碰到 HTTP cache。

Gecko 提供的 database 解決方案是 IndexedDB。也說不上來好用或是不好用,反正我沒得選(笑)。比起直接用 key 從 JS Object 拿資料,IndexedDB 的 async 流程當然是比較複雜,但也不會說難用(而且 Disk I/O 本來就應該要 off main-thread 才對)。比較複雜的是一些手機輸入法的進階找字功能(例如輸入「ㄊㄅ」就要查到「台北」)就需要用到 IndexedDB 的 range search 等等。

演算法

我沒有實作像是 tree search 之類的搜尋方式,而是傻傻的用窮舉法,把 N 個音可能的斷詞通通切開然後送進資料庫查詞還有問積分。和文字雲一樣,用了很簡單的演算法但是卻很有效。

剩下的部份與接下來

剩下就是一些 code organization 的問題。牽涉到這麼多 function 的程式不太容易整理好,我自己改寫了兩次,後來被北京的同事 port 成拼音輸入法的時候他又改成 prototype 的寫法,看起來又更整齊。Boot to Gecko(產品名 Firefox OS)最初的上市目標是南美洲,所以中文支援的功能就先放著,先去處理系統的部份。

我有很刻意的把注音的 JS 檔保持著可以讓其他環境與架構順利載入與 fall back 的狀況(還偷偷加了 CommonJS 的 define())。DEMO 網頁連 iPhone 都可以用,沒有 IndexedDB 的瀏覽器也會每次載入詞庫 JSON。有一些 fancy 的 idea 是像是可以用 node-gtk 或是其他方法執行,反攻桌面或是 Android 之類的。不過這記憶體消耗比別人大,演算法也沒有比別人強,要做只是證明因為做的到吧 … XD

有什麼意見還請不吝賜教。

Common misconceptions about Boot-to-Gecko, and the Web

中文版刊登於謀智台客部落格。

Being asked to introduce the Boot to Gecko project to various audiences, I think it’s appropriate for me to write some FAQ here on common misconception of the project, and the Web in general.

Misconception #1: Boot to Gecko is yet another mobile platform for apps.

To run you apps on Boot to Gecko (and any other devices with a browser), you would just have to deliver your apps on the plain old-school Web on Internet.

Boot to Gecko is aim to (re)boot to the web. That is, to bring the Open Web to a level that it could compete with proprietary mobile platforms. To application developers, the Web is yet another platform to develop (you would need to port your apps from Obj-C/Java to HTML5), but the good news is your app would now becomes the cross-platform web app, not B2G Apps, PhoneGap Apps, nor Metro-style Windows 8 Apps, and it’s always available to everyone on every platform, one web address away. It would also make your app free from platform vendor control, since no one owns the Web.

Misconception #2: Phone need to be always on-line to use Boot to Gecko and web apps.

Even though every apps on B2G phone is from a web address, even the home screen itself, offline capability can be easily achieved with HTML5 Offline AppCache technology. It’s something that can be done today, available to many browsers available on both desktop and mobile. The only thing you should take care of is how network dependent your app really is. Even on iOS, Twitter app or Facebook app is useless without network connection. You should define a clear boundary between program assets and online information in your HTML5 app. After all, it’s not just an website anymore.

Misconception #3: Web app is crappier than “native apps” in turns of UX.

This is a notion I strongly disagree. The only reason mobile browsers on devices perform worse than the native app is because the venders of the device did not invest enough effort on it. There is a conflict of interest here. Limiting the capability of APIs accessible, besides permission management issue, is the same thing. At Mozilla, we can show that with proper engineering, mobile browser, or web runtime, can run smoothly as silk on devices and deliver the so-called “native” experiences. Other venders investing the web is also doing the same thing. For devices running B2G or Chrome OS, the native app is web apps, there is no point to make it slow, intentionally or unintentionally.

Misconception #4: Web apps running on B2G is dependent on B2G or Mozilla Web APIs.

Yes and no. If you need specific access like SMS message database or phone dialing, initially B2G would be the only platform that make these features available to you. However, we design these Web APIs with standardization in mind, and work closely with standardizing bodies to made these APIs ready for other venders to implement. Evidently, our Web APIs does not live under PhoneGap.*, Windows.*, nor Ti.*; they are design to be at where it should be, like navigator.* (Sure, as experimental features, they came with moz prefix right now, and usual feature detection is advised).

If you choose other packaged web app solution, then, your web app is depend on it. Forever. Mozilla would like you to develop and distribute apps directly to the Web, and you should.

Misconception #5: Apps on Web are free of charge, developers will never make money out of it.

Yeah, like no one had ever become billionaire because of applications or services they put on the web.

There are tons of way to provide services on the web in exchange for money, surely there are ways to make money with web apps, other than bagging for US$0.99 from users. Sure, closed platform with single distribution “app store” seems effective in terms of delivering revenue to developers (besides having 30% of your money being taken along the away), but Mozilla believes choices matters, and an open platform deliver choices to both users and developers.

Mozilla is also exploring effective distribution and monetize channel for the Open Web, in our Open Web App Project and the Mozilla Marketplace website. The goal of offering choices and freedom is deeply embedded in the feature design, both in browser App API and in storefront, and with user authentication and authorization (which is the scope of the Mozilla Persona project, previously known as BrowserID).

Conclusion

Just like Mozilla did with Firefox years ago, we intend to deliver great product that could influence the market, and let the market influence other venders. No one owns the Web, and no one should. Mozilla is here not to make money or become monopoly, but to bring the goodness to the Web that could drive innovation and opportunities centuries to come.

The CTO of Opera Software, Håkon Wium Lie, once said the Web will last for at least 500 years, and made impact to the society just like Gutenberg’s printing press. I totally agree with him, even though none of us will be around and told me I was wrong, as he puts it. Let’s make it great.

Disclaimer: This post express the opinion of my own and does not necessarily represents the point of view of Mozilla nor Mozilla Corp.

American Censorship Day

Note: If you are reading this on the site, you will see the banner censored.

The US Congress is currently discussing a bill that will give the power of government to shut down websites, prosecute social network users (i.e. you and me) for making minor non-commercial copyright violation (e.g. singing a pop song on Facebook), etc. This is the end of free speech on Internet as we know it, and is an attempt to build national firewall and censorship system much like the one in China.

Please join us for the fight to preserve freedom on Internet. More information can be found at americancensorship.org.