The Gap Between Geniuses and Ordinary People


Image Source: Wikipedia

In his book The Master Algorithm [1], Pedro Domingos writes:

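In my senior year of college, I spent a summer playing Tetris, a video game about stacking blocks. Pieces made up of squares fall from the top, and you have to pile them up as tightly as you can. If the pile reaches the top of the screen, the game is over. At the time I had no idea that this was my introduction to NP-complete problems, the most important problem in theoretical computer science.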

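Is this the gap between geniuses and ordinary people? Pedro plays a video game and ends up thinking about NP-completeness. When the rest of us play Tetris, what do we think of?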


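  1. The Simplified Chinese edition of this book is titled 《終極算法:機器學習和人工智能如何重塑世界》 (The Master Algorithm: How Machine Learning and Artificial Intelligence Reshape the World), published by CITIC Press.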

Let's Doodle!

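Google's AI research team built a site called A.I. Experiments to show what AI technology can do. One of the experiments is a game called QuickDraw: following the on-screen prompt, you get 20 seconds to draw a given object, such as a duck, a rhinoceros, a house, or a car, with your mouse or finger, and a neural network built by Google researchers judges whether your drawing looks like it. The game is easier to play with a finger; drawing with a mouse is awkward.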

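If the player consents, the drawings are passed to the researchers and added to their dataset. As everyone has heard by now, neural networks work because of huge training sets: the more people draw, the more data there is, and the more accurate QuickDraw becomes.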

If you are interested, come and give it a try.

Machine Learning vs. Statistics

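As machine learning draws more and more public attention, claims about how it differs from statistics keep multiplying. I have seen and heard all sorts of opinions and prejudices myself, both in what I read and in various discussions (well, flame wars). Somewhat to my surprise, KDnuggets, a big name in data mining circles, asked Aatash Shah, who comes from investment banking, to explain how machine learning and statistics differ.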

Source: SAS Institute – A Venn diagram that shows how machine learning and statistics are related

Going by definitions copied from textbooks and by the general public's impression, the summary comes out like this:

Machine learning is all about predictions, supervised learning, unsupervised learning, etc.

Statistics is about sample, population, hypothesis, etc.

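Aatash Shah then says that statistics is a branch of mathematics, while the theory and techniques of machine learning come from artificial intelligence.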

Machine learning is a subfield of computer science and artificial intelligence. It deals with building systems that can learn from data, instead of explicitly programmed instructions.

statistical model, on the other hand, is a subfield of mathematics.

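Since the two trace their inner kung fu to different schools and have trained along different meridians, how would a machine learning expert and a statistician each describe the same result? Honestly, the example below makes it hard to tell them apart.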

ML professional: “The model is 85% accurate in predicting Y, given a, b and c.”

Statistician: “The model is 85% accurate in predicting Y, given a, b and c; and I am 90% certain that you will obtain the same result.”
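The statistician's extra clause can be read, loosely, as attaching an interval estimate to the accuracy figure. The following is a minimal sketch of that idea and is not taken from the KDnuggets article: it assumes a hypothetical test set of 200 predictions with 170 correct, and uses the Wilson score interval, which is only one of several ways to put a 90% confidence interval around an observed accuracy. The function name `accuracy_with_interval` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def accuracy_with_interval(correct, total, confidence=0.90):
    """Return (accuracy, low, high) using the Wilson score interval.

    Hypothetical helper for illustration; not from the quoted article.
    """
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need total > 0 and 0 <= correct <= total")
    p = correct / total
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return p, center - half, center + half


# Assumed numbers: 170 of 200 test predictions were correct.
acc, low, high = accuracy_with_interval(170, 200)
print(f"ML professional: the model is {acc:.0%} accurate.")
print(f"Statistician: ... with a 90% interval of [{low:.3f}, {high:.3f}].")
```

The point estimate is the same in both sentences. The interval narrows as the test set grows and widens as the requested confidence level rises, and that is the part the second speaker adds.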

The final conclusion:

The difference between the two has reduced significantly over the past decade. Both the branches have learned from each other a lot and will continue to come closer together in the future.

But, understanding the association and knowing their differences enables machine learners and statisticians to expand their knowledge and even apply methods outside their domain of expertise. This is the notion of “data science” itself, which aims to bridge the gap.

So it seems there really is no difference!? Trust someone from the investment world to pull that off.

Source: Machine Learning vs Statistics – KDnuggets

Using Machine Learning to Help People with Disabilities

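Seeing big tech companies and individuals use machine learning to help people with physical, mental, or learning disabilities live better in this world warms my heart a good deal more than news of AlphaGo sweeping all before it or of how much money Libratus won.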

For example:

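  • YouTube developed new algorithms that make captions more human and easier to follow, a boon for people who are deaf or hard of hearing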

(Youtube) rolled out algorithms that indicate applause, laughter, and music in captions. More sounds could follow, since the underlying software can also identify noises like sighs, barks, and knocks.

  • MIT and IBM are jointly developing language-processing software to help people with reading or cognitive disabilities understand text

Researchers at IBM are using language-processing software developed under the company’s Watson project to make a tool called Content Clarifier to help people with cognitive or intellectual disabilities such as autism or dementia. It can replace figures of speech such as “raining cats and dogs” with plainer terms, and trim or break up lengthy sentences with multiple clauses and indirect language.

  • A person with autism spectrum disorder is building software to help others with the same condition learn to be more independent

Austin Lubetkin, who has autism spectrum disorder, has worked with Florida nonprofit Artists with Autism to help others on the spectrum become more independent.

All right, maybe I am being sentimental, but I sincerely hope machine learning brings humanity more good.


Software that can understand images, sounds, and language is being used to help people with disabilities such as deafness and autism in new ways.

Source: Machine Learning Opens Up New Ways to Help Disabled People – MIT Technology Review

Should We Be Worried?

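A few days ago the MIT Technology Review site ran an article in which deep learning heavyweight Yann LeCun argues that machines can use computer vision to extract “common sense” level knowledge from large amounts of video. Another article discussed how machine learning could help judges decide cases.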

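Just reading the titles of these two articles sent a chill through me. With AI advancing so rapidly, the balance between IA (intelligence augmentation) and AI (artificial intelligence) that John Markoff describes in his book Machines of Loving Grace seems to be swinging ever faster and ever wider.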

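Having read both articles, I cannot help wondering: as the balance swings between the IA and AI approaches, will there be a winner and a loser? And whichever side wins, how would the eventual impact on humanity differ?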

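After AlphaGo first appeared, Kai-Fu Lee wrote an essay titled 《人工智慧對人類真正的威脅是什麼?》 (What Is the Real Threat of AI to Humanity?), and I think his view of AI leans slightly toward the IA side. But if machines can observe the characteristics and constraints of things from large amounts of video (are we truly heading toward common sense?), that would be a big step along the road of “learning”: not augmentation or amplification, but intelligence.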

One of the things we really want to do is get machines to acquire the very large number of facts that represent the constraints of the real world just by observing it through video or other channels. That’s what would allow them to acquire common sense, in the end. These are things that animals and babies learn in the first few months of life—you learn a ridiculously large amount about the world just by observation.

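Last year someone explained why Facebook's advertising algorithm differs from a recommendation algorithm (alas, I have forgotten the source): it does not need to profile what kind of person you are, because it already knows exactly who you are. Yann LeCun now heads Facebook's AI research division. If Facebook's research takes such a big step forward, how could I not feel a chill?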

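Seriously, whether AI is a “threat” to humanity depends on whom you ask, and it is genuinely hard to say. No answer is possible right now, but worrying about the sky falling at least makes for good conversation. Vox had its writer Sean Illing ask a dozen or so experts “How worried should we be about artificial intelligence?” The answers are poles apart, or you could say each answer is determined by where the person sits and which industry they work in. The question of AI taking jobs is bound to come up again and again, and opinions on how fast the technology is advancing are bound to differ.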

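Naturally, some want everyone to take the threat of AI seriously. The first to open fire is Oxford philosopher Nick Bostrom: AI might cause a catastrophe any day, so how can we afford not to be careful?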

The transition to machine superintelligence is a very grave matter, and we should take seriously the possibility that things could go radically wrong. This should motivate having some top talent in mathematics and computer science research the problems of AI safety and AI control.

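Andrew Ng, who in recent years has hogged the spotlight whenever data mining, big data, or AI comes up, says bluntly that our descendants may need to worry about this, but that worrying about it now is like worrying about a corruption case on Mars. It is an answer with a strong Chinese flavor. Perhaps, having joined Baidu and read a lot of Chinese material, he has absorbed China's anti-corruption slogans and let one slip into the conversation.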

Worrying about evil-killer AI today is like worrying about overpopulation on the planet Mars. Perhaps it’ll be a problem someday, but we haven’t even landed on the planet yet. This hype has been unnecessarily distracting everyone from the much bigger problem AI creates, which is job displacement.

I think the best answer, and also the most chicken-soup-for-the-soul one, is the statement from MIT's Daniela Rus!

It’s understandable that people have fears and anxieties about AI, and, as researchers, we have a duty to recognize those fears and provide different perspectives and solutions. I am optimistic about the future of AI in enabling people and machines to work together to make our lives better.

Pedro Domingos on “The Master Algorithm”

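In 2012 CACM published an article that drew a lot of attention in machine learning circles, “A few useful things to know about machine learning” [1] (known in Chinese tech circles as 《機器學習那些事》). According to Google, the article has been cited 587 times so far. Nearly six hundred citations in just a few years is a lot for an introductory article.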


The author of that article, Professor Pedro Domingos, published The Master Algorithm [2] in 2015, subtitled How the Quest for the Ultimate Learning Machine Will Remake Our World. His own tagline for the book is “Everything you always wanted to know about Machine Learning”, and in an interview at Code Conference 2016 Bill Gates named it one of two must-read books for understanding AI.

And, in Gatesian fashion, he suggested a pair of books that people should read, including Nick Bostrom’s book on superintelligence and Pedro Domingos’ “The Master Algorithm.”

Melinda Gates noted that you can tell a lot about where her husband’s interest is by the books he has been reading. “There have been a lot of AI books,” she said.

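With a reputation like that, one wonders whether the book lives up to it. Before reading it, watch the talk Pedro Domingos was invited to give at Google on the master algorithm: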

 


  1. P. Domingos, “A few useful things to know about machine learning,” Commun. ACM, vol. 55, no. 10, pp. 78-87, Oct. 2012. [Online]. Available: http://doi.acm.org/10.1145/2347736.2347755  
  2. The Traditional Chinese edition is titled 《大演算:機器學習的終極演算法將如何改變我們的未來,創造新紀元的文明?》.