http://rawgraphs.io/
https://plotdb.com/
https://d3js.org/
Posted by MR. MINING on PIXNET | Comments (0) | Views ()
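Statistical reports often use wording such as: "At a 95% confidence level, the mean falls within such-and-such a margin of error."
Here, "a 95% confidence level" means that the method yields a result this close to the true value 95% of the time. In other words, in 5% of cases the sample differs from the true value by more than the margin of error.
There are a few key points you need to know:
- We cannot tell whether our particular sample belongs to the 95% that hit or the 5% that missed. We can only say that we are 95% confident.
- Raising the confidence level to 99% means accepting a larger margin of error than at 95%.
- Raising the confidence level to 100% means widening the margin of error to span everything from 0 to 1, at which point the conclusion is useless.
- To get a smaller margin of error at the same confidence level, you need a larger sample.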
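The trade-offs in the points above can be sketched numerically. This is a minimal illustration, not part of the original post: it assumes the quantity being estimated is a sample proportion and uses the usual normal approximation. The function name `margin_of_error` and the sample sizes are hypothetical choices.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence):
    """Normal-approximation margin of error for a sample proportion.

    p_hat: observed proportion, n: sample size,
    confidence: confidence level strictly between 0 and 1.
    """
    # Two-sided critical value (roughly 1.96 for a 95% confidence level).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(p_hat * (1 - p_hat) / n)


# Compare two confidence levels and two (assumed) sample sizes.
for confidence in (0.95, 0.99):
    for n in (400, 1600):
        m = margin_of_error(0.5, n, confidence)
        print(f"confidence={confidence:.0%}, n={n}: +/- {m:.3f}")
```

Under these assumptions, the margin grows when the confidence level goes from 95% to 99%, and it halves when the sample is four times as large.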
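Here are 30 Harvard Business Review (HBR) articles on analytics, big data, and data science that offer insight into the latest techniques and developments in the world of data.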
Source : http://www.kdnuggets.com/2015/09/30-hbr-articles-analytics-big-data-science.html
On Data Science
- Data Scientist: the sexiest job of the 21st century by Thomas H. Davenport and D.J. Patil (Oct 2012)
How the idea of LinkedIn's People You May Know feature really clicked! The key player involved was a "Data Scientist", a title coined by the two authors.
- The Sexiest Job of the 21st Century is Tedious, and that Needs to Change by Sean Kandel (Apr 2014)
Which phase does a data scientist spend more time on? Data Discovery, data structuring and creating context. Should they shift their focus?
With the right mix of technical skill & human judgment, machine learning could be a new tool for decision makers. Learn what mistakes to avoid.
We are at a new phase of big data. Is Data capture and storage now less relevant than making it more useful & impactful?
What makes an exceptional data scientist? Data by itself is meaningless. The skill & curiosity is what makes the difference.
How to derive insights & intuitions from data? We “humanize” the data by turning raw numbers into a story about our performance.
Better than the Best! Great data scientists bring four mutually reinforcing traits to bear that even the good ones can’t.
Data scientist jobs are very much in demand as companies grapple with the challenge of making valuable discoveries from Big Data. Is a huge crowd just joining the bandwagon?
- 10 Kinds of Stories to Tell with Data by Tom Davenport (Nov 2013)
Narrative is—along with visual analytics—an important way to communicate analytical results to non-analytical people. Explore the 10 types.
- How to Start Thinking Like a Data Scientist by Thomas C. Redman (Nov 2013)
You don’t have to be a data scientist or a Bayesian statistician to tease useful insights from data. The author demonstrates how to think with a small exercise.
- Stop Searching for That Elusive Data Scientist by Michael Schrage (Sep 2014)
Stop hunting for that data science unicorn and/or silver bullet. What to do instead?
- How to Explore Cause and Effect Like a Data Scientist by Thomas C. Redman (Feb 2014)
While we can use data to understand correlation, the more fundamental understanding of cause and effect requires more.
- You May Not Need Big Data After All by Jeanne W. Ross, Cynthia M. Beath and Anne Quaadgras (Dec 2013)
Companies are investing like crazy in data scientists, data warehouses, and data analytics software. Should they redirect their efforts?
- Big Data Hype (and Reality) by Gregory Piatetsky-Shapiro (Oct 2012)
Does your big data have big impact? The potential of “big data” has been receiving tremendous attention lately. The author analyzes using practical scenarios.
- With Big Data Comes Big Responsibility by Harvard Business Review Staff (Nov 2014)
An interview with Alex “Sandy” Pentland, the Toshiba Professor of Media Arts and Sciences at MIT, who talks about the principles of "A New Deal on Data".
- Inventory Management in the Age of Big Data by Morris A. Cohen (Jun 2015)
Managers will need to redesign their supply-chain processes to make effective use of new data to stay competitive.
- Why Health Care May Finally Be Ready for Big Data by Nilay D. Shah and Jyotishman Pathak (Dec 2014)
Explore the key elements that are crucial for health care to truly capture the value of big data.
- What the Companies Winning at Big Data Do Differently by Satya Ramaswamy (Jun 2013)
A brief analysis of Netflix's success using consumer behavior data, and of how big data can change the structure of an industry by fundamentally shifting the power.
- Stop Worrying About Whether Machines Are “Intelligent”. by JC Spender (Aug 2015)
Are we right to be afraid that the machines may take over? An interesting read about Turing's test and machine intelligence.
- Are You Data Driven? Take a Hard Look in the Mirror. by Andrew McAfee and Erik Brynjolfsson (Oct 2012)
The term “data driven” is penetrating the lexicon ever more deeply these days. What are the traits?
- Marketers Flunk the Big Data Test by Mick Collins (Apr 2015)
Marketing in particular is feeling the pressure to embrace new data-driven customer intelligence capabilities. Learn more about the key findings.
- Simplify Your Analytics Strategy by Narendra Mulani
Companies can get stuck trying to analyze all that’s possible and all that they could do through analytics. How to strategize to avoid this?
- Making Advanced Analytics Work for You by Dominic Barton and David Court
Big data could transform the way companies do business, delivering performance gains. How to get the strategy suited to your needs?
- A Predictive Analytics Primer by Tom Davenport (Sep 2014)
A brief read on predictive analytics with a focus on customers.
- The Persuasiveness of a Chart Depends on the Reader, Not Just the Chart by Scott Berinato (May 2015)
Is there a better way to persuade people than visual information? An interesting read on how the effectiveness of a chart depends on the audience's understanding of it and their cognitive state.
- Analytics 3.0 by Thomas H. Davenport (Dec 2013)
A new resolve to apply powerful data-gathering and analysis methods not just to a company’s operations but also to its offerings—to embed data smartness into the products and services customers buy.
- What People Analytics Can’t Capture by Daniel Goleman (July 2015)
The latest fad in human resources is using big data analytics and personality test scores to predict who is best for a given job, the so-called “XQ.” Do the scores accurately capture all the required skills?
- Gamification Can Help People Actually Use Analytics Tools by Lori Sherer (Feb 2015)
You have to identify the right data and develop useful tools, such as predictive algorithms. But then comes an even tougher task: getting people to actually use the new tools.
- What Popular Baby Names Teach Us About Data Analytics by Kaiser Fung (Apr 2015)
Find out how what FiveThirtyEight’s Nate Silver and Allison McCann did with the baby names dataset sets an example for all data analysts. Their article represents the best of data journalism.
- A Better Way to Tackle All That Data by Chris Taylor (Aug 2013)
Hampered by a shortage of qualified data scientists to perform the work of analysis, big data’s rise is outstripping our ability to perform analysis and reach conclusions fast enough.
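In Open Data Now (Chinese edition: 開放資料大商機), Joel Gurin defines open data as "accessible public data that people, companies, and organizations can use to launch new ventures, analyze patterns and trends, make data-driven decisions, and solve complex problems."
Harvey Lewis of Deloitte divides companies in the open data field into five categories:
- Suppliers: publish the data, free of charge
- Aggregators: analyze the data and provide insight, for a fee
- Developers: design and build applications
- Enrichers: use open data to improve existing products and services
- Enablers: help other companies make better use of open data, for a fee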
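Source: Business Next (數位時代), July 2015
According to a 2015 Business Next article, Taiwan's big data industry is missing the data analytics and services pieces of the puzzle.
Networking equipment
Servers
Storage equipment
System integration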
- 精誠
- 聚碩
- 凌群
- 零壹
- 敦陽
- 資通
- 華碩
- 新鼎
- 台達電
- Etu
Data analytics
- 雲深 (http://www.cloudeep.com.tw/)
- 關貿 (http://www.yodass.com/)
- 科智 (http://www.servtech.com.tw/about.php)
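- Most members of 科智企業股份有限公司 come from a professional technical services team spun off from the Institute for Information Industry (財團法人資訊工業策進會). Its core product, Servolution, is an application service solution for key manufacturing process data. It uses information and communication technology (ICT) to collect plant-wide information and develops analysis techniques for manufacturing optimization, helping machining plants raise equipment utilization, improve the flexibility of supply chain management, and drive innovation in service-oriented manufacturing.
- Its approach integrates signals from front-end sensing devices so that front-end machinery and peripheral equipment can report back through various signal sources. The signals travel over an integrated, standardized communication protocol to a back-end platform for data analysis and root cause analysis to uncover production bottlenecks, and different application services then build on the results to strengthen competitiveness.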
- LOCATION
- 威朋 (http://www.vpon.com/zh-tw/)
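- Founded in 2008, Vpon (威朋) focuses on mobile advertising for mobile devices. Drawing on strong R&D, large-scale data processing and analysis, and business development with brand advertisers, Vpon has served more than 1,000 well-known brands, including McDonald's, Coca-Cola, American Express, and Citibank. It reaches more than 450 million unique users, and its advertising business covers more than 750 cities, including Tokyo, Shanghai, Guangzhou, Hong Kong, and Taipei. Vpon has offices in Shanghai, Tokyo, Taipei, and Hong Kong and is described as the fastest-growing big data advertising company in Asia. Among its many awards, in 2015 it was ranked third among China's non-listed companies with the most potential in the Forbes China Top 100.
- Technical advantage: the first in Asia to offer an innovative mobile advertising service that combines LBS technology with an in-app advertising model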
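Do the British really love doing odd research, or do Taiwan's media especially love to "mis"-use British research results to make a big story? Setting aside how reliable the research is, the headlines alone are eye-catching enough to make you click the link almost involuntarily.
From a statistical point of view, regardless of how the research question is framed, most of these studies make common statistical mistakes.
Regression analysis is a powerful statistical tool, but if you apply it without understanding what it means, you too can produce results like those of the "British studies": bold hypotheses with no careful verification.
There are seven common abuses:
- Using regression analysis on a nonlinear relationship. (It is like insisting on drawing one straight line through three widely scattered points and hoping the line represents all three.)
- Correlation between data does not equal causation. (Over a certain period the number of autism cases in the United States rose, and so did China's GDP. That does not mean there is a direct causal link between them.)
- Reverse causality. (If A and B are correlated, that alone cannot tell you whether A causes B or B causes A. The usual "British study" approach is to pick whichever inference is more sensational.)
- Omitted variable bias. (People under heavy life stress are mostly young, and young people have sex more often than older people anyway, so the analysis has left out the variable "age".)
- Highly correlated explanatory variables (multicollinearity). (Just walk three times and ...? But people who walk often may well do other exercise too. Putting too many highly correlated variables into one statistical model blurs the focus of the analysis.)
- Extrapolating beyond the range of the data. (Using height to estimate intelligence? You could easily end up with a ridiculous result such as a height of minus 25 cm.)
- Data mining (too many variables):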
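Two of the abuses in the list above, fitting a line to a nonlinear relationship and extrapolating beyond the data, can be reproduced in a few lines. The sketch below is only an illustration under stated assumptions: the "true" relationship `true_curve` (y = x squared) and the helper `fit_line` are hypothetical and do not come from any of the studies mentioned.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def true_curve(x):
    # Hypothetical nonlinear relationship, chosen only for illustration.
    return x * x


# Nonlinear relationship: y depends entirely on x, yet the fitted line is flat.
xs_sym = [-3, -2, -1, 0, 1, 2, 3]
intercept_sym, slope_sym = fit_line(xs_sym, [true_curve(x) for x in xs_sym])
print(f"fitted slope on U-shaped data: {slope_sym:.3f}")

# Extrapolation: fit on x in 0..5, then predict far outside that range.
xs_obs = [0, 1, 2, 3, 4, 5]
intercept_obs, slope_obs = fit_line(xs_obs, [true_curve(x) for x in xs_obs])
predicted = intercept_obs + slope_obs * 10
print(f"predicted at x=10: {predicted:.1f}, actual: {true_curve(10)}")
```

In the first case the straight line reports no relationship at all even though y is fully determined by x. In the second, the line fits the observed range tolerably but falls far short of the actual value once x leaves that range.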
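Ken Robinson's bestseller 讓天賦自由 (The Element: How Finding Your Passion Changes Everything) was first published in 2009, reached the 30th printing of its first edition in 2010, and had 100,000 copies in print in 2011. In 2013 a new book in the same series followed: 發現天賦之旅 (Finding Your Element).
These sales figures suggest that people are eager to find their inner selves. When life's journey goes badly, people reflect on the meaning of life and try to tell themselves that they need not care so much about the norms society sets.
I cannot pass judgment on the views in these books, for example by answering what the purpose of life is. What I can observe is that these social phenomena appear only when people are confused and want some "explanation" that gives life a way forward. Leaving aside whether such an explanation is correct or can stand the test of time, it at least offers people a temporary way out.
Going a step further, studying the categories, sales figures, and trends of bestsellers can reveal what people feel is missing deep inside, which is useful information for research on social phenomena and for marketing research.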
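聰明學統計的13又½堂課:每個數據背後都有戲,搞懂才能做出正確判斷 (the Chinese edition of Naked Statistics)
- Author: Charles Wheelan (查爾斯.惠倫)
- Publisher: 先覺
- Publication date: 2013/11/28
- Language: Traditional Chinese
"Data" is already an extreme simplification of the event and process it describes, and it tries to convince people, with reasons that sound plausible and explainable, that the data these behaviors produce is representative. Unfortunately, that is often not the case. Most data still fails to fully explain a behavior, and before it has been properly analyzed it is useless and can even mislead people. (The usual reason is that it is full of outliers.)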
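The appeal of statistics is that its analytical techniques make such plausible-but-misleading data easier to interpret, so that better decisions can be made. Success, however, depends on the analyst's professional competence and integrity. Through their own ignorance or the reader's, statisticians can, intentionally or not, produce results that lead people to wrong conclusions. I therefore strongly recommend that everyone learn to read data and statistical reports; some simple logical reasoning goes a long way toward making the right decision. I would rather not dwell here on those who, for political reasons, use statistics as a tool to fool the public, which is deplorable.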
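The remark about outliers earlier in this post is easy to check by hand. The following is a small sketch with made-up numbers (they are not from the book): a single extreme value moves the mean a long way while the median barely moves.

```python
from statistics import mean, median

# Made-up values for illustration only; the last one is an outlier.
values = [30, 32, 35, 38, 40, 1000]

print("mean with outlier:   ", mean(values))
print("median with outlier: ", median(values))
print("mean without outlier:", mean(values[:-1]))
```

Reporting only the mean of such data would describe none of the six observations well, which is the kind of misreading a basic ability to interpret statistics helps avoid.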
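At the end of the book, the author mentions several statistical software packages. Some are readily available and can handle most of the statistical analyses you can think of, Microsoft Excel for example. If you have some data on hand and want to use it to reach your own analysis-based decisions, they are worth a try.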
- Microsoft Excel
- Stata (www.stata.com)
- SAS (www.sas.com/technologies/analytics/statistics)
- R
- IBM SPSS
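About the author
Charles Wheelan (查爾斯.惠倫)
He holds a PhD in public policy from the University of Chicago. He did his undergraduate studies at Dartmouth College, a long-established Ivy League school, and his master's at Princeton University. He has taught at Dartmouth College since June 2012. From 2004 to 2012 he was a senior lecturer at the University of Chicago's Harris School of Public Policy, where in his first year of teaching students voted him "Professor of the Year" for non-core courses. His first book, 《聰明學經濟的12堂課》 (published by 先覺), made the 博客來 annual list of the top 100 trend books. Following 《非典型人生建言》 (published by 先覺), this book reached the New York Times nonfiction bestseller list when it was published in 2013.
The author dislikes calculus but loves statistics, because he found that statistics can help you answer five important questions:
How are opinion polls calculated, and how can they be manipulated?
Does attending an elite school such as Harvard really change your life?
Why shouldn't we pay to extend a product's warranty?
How can a credit card company tell from your spending records that you are likely to pay late?
How can you make better decisions without wasting your effort?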
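For any of your friends who have a Chunghwa Telecom number: before 2011/12/31, go to a Chunghwa Telecom service center (the center that handles landline applications, not an ordinary franchise store) with your two forms of ID. Each person can register one designated Chunghwa Telecom number, the one they call most often, and calls to that number are then completely free. A second perk: each person can also register four more frequently called mobile numbers as designated hotline numbers on a monthly plan, at an extra NT$50 per number, and calls to those four numbers are free as well. Please forward this right away, and don't let your rights go unused: Chunghwa Telecom has not promoted this much, so few people know about it! Forward it, forward it. I just called Chunghwa Telecom to confirm!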
A data warehouse stores large volumes of historical data from diverse sources, so you can analyze the data comprehensively through it. However, if the data volume is too large or there are too many users, you need to consider system load and security. One option is to build smaller warehouses for individual departments and their specific analysis purposes; we call these data marts.
From a data mart, you can precompute results and store them as a multidimensional data cube.
So the data warehouse, data mart, and data cube all store data, but at different scales. The smaller the store, the more efficient the analysis.
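To make the three levels concrete, here is a toy sketch in plain Python. It is only an illustration: the rows, the column layout (department, year, region, amount), and the helper names `build_mart` and `build_cube` are all hypothetical, and a real system would use a database or OLAP engine instead of lists and dictionaries.

```python
from collections import defaultdict

# Hypothetical warehouse rows: (department, year, region, amount).
warehouse = [
    ("sales", 2014, "north", 120),
    ("sales", 2014, "south", 80),
    ("sales", 2015, "north", 150),
    ("finance", 2015, "north", 40),
]


def build_mart(rows, department):
    """Data mart: the slice of the warehouse that one department needs."""
    return [row for row in rows if row[0] == department]


def build_cube(mart_rows):
    """Data cube: totals precomputed for every (year, region) combination,
    including "all" roll-ups, so later queries are simple lookups."""
    cube = defaultdict(int)
    for _, year, region, amount in mart_rows:
        for y in (year, "all"):
            for r in (region, "all"):
                cube[(y, r)] += amount
    return dict(cube)


sales_mart = build_mart(warehouse, "sales")
sales_cube = build_cube(sales_mart)

print(sales_cube[(2014, "all")])     # total sales for 2014
print(sales_cube[("all", "north")])  # total sales in the north region
```

The cube trades storage for speed: every total is computed once when the cube is built, and each later question is answered with a dictionary lookup instead of a scan over the mart.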
Developing an analytical system is not an easy thing. Most of us are IT engineers with a limited idea of how companies in each industry operate, and we often lack management knowledge and experience as well. So we will not be able to develop an analytical system that meets decision makers' expectations unless we look beyond the details of system operation and also take in management theory and a broader vision.
An analytical system is not a standard product package like an ERP or accounting system that can simply be outsourced. It needs people who have domain knowledge, or who can at least stand at the same level as the managers.