Machine Learning vs. Statistics

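As machine learning has drawn more and more public attention, takes on how it differs from statistics have multiplied. I have seen and heard all sorts of biases and opinions myself, both in reading material and in discussions (read: trash-talking sessions). One small surprise: KDnuggets, a big name in data mining circles, asked Aatash Shah, who comes from investment banking, to explain how machine learning and statistics differ.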

Source: SAS Institute – A Venn diagram that shows how machine learning and statistics are related

Boiled down from textbook definitions and the general public's impressions, the summary goes like this:

Machine learning is all about predictions, supervised learning, unsupervised learning, etc.

Statistics is about sample, population, hypothesis, etc.

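Aatash Shah then says that statistics is a branch of mathematics, whereas the theory and techniques of machine learning come from artificial intelligence.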

Machine learning is a subfield of computer science and artificial intelligence. It deals with building systems that can learn from data, instead of explicitly programmed instructions.

A statistical model, on the other hand, is a subfield of mathematics.

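Since the two draw their inner strength from different lineages and have trained along different meridians, how would a machine learning expert and a statistician each describe the same thing? Honestly, the example below does not make it easy to tell them apart.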

ML professional: “The model is 85% accurate in predicting Y, given a, b and c.”

Statistician: “The model is 85% accurate in predicting Y, given a, b and c; and I am 90% certain that you will obtain the same result.”
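To make the contrast concrete, here is a minimal sketch of what the two reports might look like in code. It rests on my own assumptions, not on anything in the article: I read the statistician's "90% certain" as a 90% confidence interval around the accuracy, I use the simple normal (Wald) approximation, and the counts (170 correct out of 200) are made up for illustration. The function name `accuracy_with_interval` is hypothetical too.

```python
import math
from statistics import NormalDist


def accuracy_with_interval(correct, total, confidence=0.90):
    """Return (accuracy, lower, upper) using the Wald normal approximation.

    Illustrative only: assumes independent test cases and a test set large
    enough for the normal approximation to be reasonable.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")

    accuracy = correct / total
    # Two-sided critical value, e.g. about 1.645 for 90% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * math.sqrt(accuracy * (1 - accuracy) / total)
    lower = max(0.0, accuracy - half_width)
    upper = min(1.0, accuracy + half_width)
    return accuracy, lower, upper


# Hypothetical test-set result: 170 correct predictions out of 200.
acc, low, high = accuracy_with_interval(correct=170, total=200)

# The "ML professional" style: a point estimate only.
print(f"Accuracy: {acc:.0%}")

# The "statistician" style: the same estimate plus its uncertainty.
print(f"Accuracy: {acc:.0%}, 90% CI: [{low:.3f}, {high:.3f}]")
```

Both lines report the same model and the same accuracy. The only difference is that the second also says how much that number could move on another sample of the same size.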

The final conclusion:

The difference between the two has reduced significantly over the past decade. Both the branches have learned from each other a lot and will continue to come closer together in the future.

But, understanding the association and knowing their differences enables machine learners and statisticians to expand their knowledge and even apply methods outside their domain of expertise. This is the notion of “data science” itself, which aims to bridge the gap.

So it seems there really isn't much of a difference!? Leave it to an investment person to pull that off.

Source: Machine Learning vs Statistics – KDnuggets
