A place to provide a few bits of news to folks who study the uncertainty behind everything...
Expressly, a platform for the statistical community... giving news from India as well as abroad about statisticians, novel methods & techniques, useful resources... anything that someone from our clan is interested in!!!
Sunday, May 4, 2014
Sunday, April 20, 2014
How Fast the Fastest Human Would Run 100m?
People have used extreme value theory to predict records in various sports. Here is an article that provides code to visualize the same. One can update the dataset to take the latest records into account. It's interesting to see how this update affects the estimates:
http://www.r-bloggers.com/how-fast-the-fastest-human-would-run-100m/
Where nobody lives
Despite having a population of more than 310 million people, 47 percent of the USA remains unoccupied. Here is a map showing places where nobody lives:
Vectorization in R: Why?
Beginning R users are often told to “vectorize” their code. Here is an attempt to explain why vectorization can be advantageous in R by showing how R works under the hood:
Checking (G)LM assumptions in R
(Generalized) Linear models make some strong assumptions concerning the data structure. Here is how to verify those assumptions in R:
Wednesday, April 16, 2014
Mapping a century of earthquakes
Did you know that the United States Geological Survey maintains an ever-growing archive of earthquakes detected around the world, and that they make it easy to query and download?
Here is how you can map that data using R:
Benefits of using Open Source Software
Why should public universities use open source software? Read the reasons at:
http://www.r-bloggers.com/public-universities-should-use-open-source-software/
Monday, April 14, 2014
The Median Isn't the Message
The Median Isn't the Message is the wisest, most humane thing ever written about cancer and statistics. It is the antidote both to those who say that, "the statistics don't matter," and to those who have the unfortunate habit of pronouncing death sentences on patients who face a difficult prognosis. Anyone who researches the medical literature will confront the statistics for their disease. Anyone who reads this will be armed with reason and with hope:
Thursday, April 10, 2014
Beeps and progress alerts to your phone
Would you like your R program to alert you with a beep or ping as soon as the execution is over? Then here is the way out:
R Help about Symbols
How to open R help about a symbol or punctuation mark like ( parenthesis or [ bracket:
Saturday, April 5, 2014
God is a Gambler
"All the evidence shows that God was actually quite a gambler, and the universe is a great casino, where dice are thrown, and roulette wheels spin on every occasion."
- Stephen Hawking
Friday, April 4, 2014
Some R Resources for GLMs
It is relatively easy to figure out how to code a GLM in R. Even a total newcomer to R is likely to figure out that the glm() function is part of the core R language within a minute or so of searching. Thereafter though, it gets more difficult to find other GLM related stuff that R has to offer. Here is a far from complete, but hopefully helpful, list of resources.
Thursday, April 3, 2014
Does R have too many packages?
Most of us agree that the availability of thousands of packages on CRAN is often life saving. But there are a few who feel that there are rather too many packages for R. Read this post to know what makes them think the other way.
Tuesday, April 1, 2014
Probability of Extreme Events like 9/11
An interesting article on estimating the probability of large terrorist events like 9/11:
http://sourish-das.blogspot.in/2013/11/probability-of-extreme-events-like-911.html
Global Flow of People
A very cool representation of global migration flows between all countries using R:
Don't miss this link and check out the migration pattern of your country:
http://www.global-migration.info/
Friday, March 28, 2014
Why use R?
Why should R be preferred over other statistical software? Read in the words of an extensive user of both a proprietary statistical programming language as well as the open source alternative.
http://www.r-bloggers.com/why-use-r-five-reasons/
Add new colors to your R-charts
If you are a big fan of Wes Anderson's movies and if you love the quirky characters and stories, the distinctive cinematography, and the unique visual style, then you can bring some of that style to your own R charts, by making use of these Wes Anderson inspired palettes.
Wednesday, March 26, 2014
Statistics reveal a prescription drug epidemic
After the tragically early death of actor Philip Seymour Hoffman last month, Carlos Grajales finds that the statistics reveal a prescription drug epidemic in the US. Can you believe that in 2010 drug overdoses caused more deaths than motor vehicle traffic crashes?
Tuesday, March 25, 2014
Overlapping Clusters
Aren't all of us used to seeing the well-separated clusters displayed in textbooks and papers? But that doesn't happen in reality. So, what should one do in such cases? Read about how to deal with such situations at:
http://www.r-bloggers.com/warning-clusters-may-appear-more-separated-in-textbooks-than-in-practice/
Handling Character data in R
In today's data-centric world, a statistician can't escape from text data. It's not a very difficult task, if we start in time. So, let's learn about handling character data in R with this free e-book:
http://www.r-bloggers.com/learn-about-handling-character-data-in-r-with-this-free-e-book/
Saturday, March 22, 2014
About Normality and Testing for Normality
It is often said that with small sample sizes, everything looks normal, as the normality tests are, indeed, very sensitive to what goes on in the extreme tails. In other words, if we have enough data to fail a normality test, we always will because our real-world data won’t be clean enough. If we don’t have enough data to reliably fail a normality test, then there’s no point in performing the test, and we have to rely on the fat pencil test or our own understanding of the underlying processes. Read the detailed reasoning at:
Why one shouldn't use Bivariate Correlations for Variable Selection?
In applied statistics, what typically happens is a researcher sits down with their statistical software of choice and they compute a correlation between their response variable and their collection of possible predictors. From here, they toss out potential predictors that either have low correlation or for which the correlation is not significant. The concern here is that it is possible for the correlation between the marginal distributions of the response and a predictor to be almost zero or non-significant and for that predictor to be an important element in the data generating pathway. Read more about why we shouldn't be using bivariate correlations for variable selection..
Friday, March 21, 2014
Teaching for Modern Generation
Dr. Rajeeva Karandikar speaks about how teaching should be transformed for the modern generation, which is an instant generation: the Facebook/WhatsApp/Twitter generation, the generation for which sending email is too slow. All those who are teachers, or who aspire to become one, should not miss it ...
Thursday, March 20, 2014
80/20 Rule of Statistical Methods Development
Developing statistical methods is hard and often frustrating work. One of the underappreciated rules in statistical methods development is the 80/20 rule. The basic idea is that the first reasonable thing you can do to a set of data often gets you 80% of the way to the optimal solution. Everything after that is working on getting the last 20%. The hard decision in whether to create a new method is whether that 20% is worth it. This is obviously application specific. Here is an interesting discussion about the 80/20 rule of statistical methods development.
The Improbability Principle
The video and slides from David Hand's lecture on the subject of his new book 'The Improbability Principle'.
It is about extraordinarily improbable events. It’s about events which are so unlikely that we wouldn’t expect to see them during our entire lifetimes - or even the lifetime of the human race or the universe itself. And it’s about why, despite all that, we do see such events; and more, it’s about why we see them again and again.
Secrets of Teaching R: An Interview with Bob Muenchen
It is of interest to see what makes R so popular, yet ‘quirky’ to learn. To get some insight from a real pro here is an interview with Bob Muenchen. Bob is the author of 'R for SAS and SPSS Users'. He is also the creator of r4stats.com, a popular web site devoted to analyzing trends in analytics software and helping people learn the R language.
http://www.r-bloggers.com/secrets-of-teaching-r/
Google Drive in R
Want to retrieve all direct links to your Google Documents? R can help you out. Check out the details at:
Bayesian First Aid
Bayesian First Aid is an attempt at implementing reasonable Bayesian alternatives to the classical hypothesis tests in R. Here are a few of them:
- Binomial Test - http://www.r-bloggers.com/bayesian-first-aid-binomial-test/
- t-test - http://www.r-bloggers.com/bayesian-first-aid-one-sample-and-paired-samples-t-test/
- Pearson Correlation Test - http://www.r-bloggers.com/bayesian-first-aid-pearson-correlation-test/
Here are a few more introductory articles:
Thursday, March 13, 2014
Magical Wolfram Language
Examples of what can be done with the knowledge-based Wolfram Language..
Right from Blurring Faces in an Image to Hiding Secret Messages in Images, Make a You-Centric world map.. Do check out the complete list!!
http://www.wolfram.com/language/gallery/
Mathematical Character Curves
See how various shapes are represented through mathematical equations and inequalities..
We're glad to see that people have been enjoying our mathematical character curves!
http://blog.wolframalpha.com/2013/05/17/making-formulas-for-everything-from-pi-to-the-pink-panther-to-sir-isaac-newton/
Check out how you can play with your favorite cartoon characters using Wolfram Mathematica http://blog.wolframalpha.com/2014/03/11/they-choose-you-pikachu/
Wednesday, March 12, 2014
A Hack to Create Matrices in R, Matlab style!!
The Matlab syntax for creating matrices is pretty and convenient. Its R-counterpart is functional but not as pretty, plus the default is to specify the values column wise. Using meta-programming we can hack together a function that allows us to create matrices in a similar way as in Matlab. Read more at:
Thursday, March 6, 2014
The Magical Mind of Persi Diaconis
When Diaconis first came to Stanford, he planned to keep his magic background a secret from his academic colleagues.. fearing they wouldn't take seriously a man of hocus-pocus who did research on card shuffling.
Then he stumbled upon a book that described an experiment by the French mathematician Paul Lévy, analyzing the phenomenon known as perfect shuffling, in which a standard deck of cards is carefully shuffled eight times and ends up returning precisely to its starting arrangement. Diaconis says, "I thought, if Paul Lévy can study perfect shuffling, I can say I study perfect shuffling. So I wrote up my work on perfect shuffling, and it got on the front page of The New York Times."
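The eight-shuffle claim is easy to check by simulation. The sketch below is in Python and assumes the "out-shuffle" form of the perfect shuffle (the deck is cut exactly in half and interleaved so that the original top card stays on top); the function names are made up for this illustration.

```python
def out_shuffle(deck):
    """One perfect out-shuffle: cut in half, interleave, top card stays on top."""
    half = len(deck) // 2
    top, bottom = deck[:half], deck[half:]
    shuffled = []
    for a, b in zip(top, bottom):
        shuffled.extend([a, b])
    return shuffled


def shuffles_to_restore(n_cards=52):
    """Count the out-shuffles needed to return an even-sized deck to its start."""
    start = list(range(n_cards))
    deck = out_shuffle(start)
    count = 1
    while deck != start:
        deck = out_shuffle(deck)
        count += 1
    return count


n_shuffles = shuffles_to_restore(52)
print(n_shuffles)
```

With an in-shuffle (the original top card moves to second place) the count for 52 cards is different, so the choice of shuffle matters.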
Forecasting weekly data
What would you do if the seasonal period is rather long and non-integer? For example, if you have weekly data, ARIMA models do not tend to give good results. The simplest approach in such a situation is a regression with ARIMA errors. Here is an example using weekly data on US finished motor gasoline products supplied (in thousands of barrels per day) from February 1991 to May 2005.
http://www.r-bloggers.com/forecasting-weekly-data/
Wednesday, March 5, 2014
Beauty is the First Test
"Beauty is the first test; there is no permanent place in the world for ugly mathematics."
- G. H. Hardy
Why Mathematics Is Beautiful and Why It Matters: here is a Huffington Post article.
http://www.huffingtonpost.com/david-h-bailey/why-mathematics-matters_b_4794617.html
No need for SPSS – Now beautiful output in R as well
Many social scientists don't want to move to R as it doesn't give a simple table view like the SPSS output window. The articles below discuss ways to put the results of certain statistics in HTML tables in R. These tables can be saved to disk or, even better for quick inspection, shown in a web browser or viewer pane... and then R output will be at least as beautiful as the SPSS output.
Tuesday, March 4, 2014
Photoshop via Clustering
"Do not believe anything: what artists really do is to hang around all day."
-Paco de Lucia
It seems clustering is the new way to Photoshop.. one gets different variations with different no. of clusters..
PS: Don't miss the video link in the end.
Oldies but Goldies: Some Classical Books on Statistical Graphics
The article below highlights some interesting things about three classical books on statistical graphics. The books are old but still relevant and together they give a sense of the development of exploratory graphics in general and the graphics system in R specifically as all three books were written at Bell Labs where the S-language was developed.
Monday, March 3, 2014
Movies and Statistics
It’s Oscars season again, so why shouldn't statisticians enjoy this movie fever...
Here is some number crunching with IMDb data, using R..
http://www.r-bloggers.com/predicting-movie-ratings-with-imdb-data-and-r/
Some tools on predicting Academy Awards..
http://onlinelibrary.wiley.com/doi/10.1111/j.1467-985X.2007.00518.x/abstract
Thursday, February 27, 2014
Why You Should Always Get The Bigger Pizza
When one looks at thousands of pizza prices, one can see that you almost always get a much, much better deal when you buy a bigger pizza. The math of why bigger pizzas are such a good deal is simple: A pizza is a circle, and the area of a circle increases with the square of the radius.
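The arithmetic can be checked in a few lines. The sketch below is in Python; the diameters and prices are made-up illustrative values, not figures from the article.

```python
import math


def pizza_area(diameter_in):
    """Area of a circular pizza in square inches, given its diameter in inches."""
    radius = diameter_in / 2
    return math.pi * radius ** 2


# Hypothetical menu: diameter in inches -> price in dollars (illustrative only).
menu = {8: 8.00, 12: 12.00, 16: 16.00}

price_per_sq_in = {d: price / pizza_area(d) for d, price in menu.items()}
for d, unit_price in price_per_sq_in.items():
    print(f"{d} in: {pizza_area(d):6.1f} sq in, ${unit_price:.3f} per sq in")

# Doubling the diameter quadruples the area.
area_ratio = pizza_area(16) / pizza_area(8)
```

Even when the price grows in proportion to the diameter, as in this invented menu, the price per square inch falls as the pizza gets bigger.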
Job Trends in the Analytics Market
This article presents various ways of measuring the popularity or market share of software for analytics including R, SAS, SPSS etc. It's interesting to note that analytics jobs for SPSS have not changed much over the years, while those for R have been steadily increasing. The jobs for R finally crossed over and exceeded those for SPSS toward the middle of 2012.
Monday, February 24, 2014
Modelers' Hippocratic Oath
For those who make a living by building some kind of Analytical models... Here comes a modelers' Hippocratic Oath, from the book "The Quants"
1. I will remember that I didn't make the world, and it doesn't satisfy my equations
2. Though I will use models boldly to estimate value, I will not be overly impressed by Mathematics
3. I will never sacrifice reality for elegance without explaining why I have done so.
4. Nor will I give the people who use my model false comfort about its accuracy. Instead, I will make explicit its assumptions and oversights.
5. I understand that my work may have enormous effects on society and the economy, many of them beyond my comprehension
Thursday, February 20, 2014
A Delicious Analysis !!
A topic model is a type of statistical model for discovering the abstract "topics" that occur in a collection of documents. The article below discusses the use of topic modelling to find the relevance of various ingredients, using data on recipes..
Coloured Noise
Have you ever wondered whether there could be other colors to our all-time favorite White Noise... like Red, Pink or Green? Read more about these coloured noises at:
http://www.ee.columbia.edu/~dpwe/noise/
Wednesday, February 19, 2014
Princeton vs Facebook
Whoa!!! It seems two biggies here had a tussle. When Princeton claimed a rapid decline in Facebook, Facebook retorted by debunking Princeton. Enjoy reading, and don't forget to take away the message of how cautious one should be while doing data analysis..
http://flowingdata.com/2014/01/24/facebook-debunks-princeton-study/
Here are some third party views on this debate:
http://www.r-bloggers.com/princeton-vs-facebook-modeling-contagion/
http://www.independent.co.uk/life-style/gadgets-and-tech/facebook-is-10-whats-next-for-the-social-network-9104592.html
Monday, February 17, 2014
Communication Skills in Academics and Research
"The first rule of writing is not to omit the thing you meant to say.” - Ralph Waldo
Here is Terry Speed, discussing the importance of communication skills in academics and research..
Thursday, February 13, 2014
Wednesday, February 12, 2014
Handling Dates and Times in R
Here is a small tutorial on the various ways to handle dates and times in R:
http://www.r-bloggers.com/using-dates-and-times-in-r/
Does commuting affect our well-being?
Does commuting affect our well-being? Definitely according to the Office for National Statistics!
Data analysis from the Annual Population Survey revealed that commuting has a negative impact on personal well-being with the worst effects on happiness and anxiety. Read more at:
Monday, February 10, 2014
Banknotes featuring Scientists and Mathematicians
One can find a collection of currency featuring scientists and mathematicians from all over the world at the link below:
http://www-personal.umich.edu/~jbourj/money.htm
Saturday, February 8, 2014
Getting to The Heart of it With Monte Carlo
You only need two functions to draw a heart mathematically. Once you draw a heart using these two functions, you can easily compute its area using Monte Carlo techniques. Details can be found at:
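As a separate illustration of the idea (not the code from the linked post), here is a minimal Python sketch. The two functions used are an assumption: one common pair for a heart on -2 <= x <= 2, with upper edge sqrt(1 - (|x| - 1)^2) and lower edge arccos(1 - |x|) - pi. For this particular pair the exact area works out to 3*pi, which gives something to compare the estimate against.

```python
import math
import random


def upper(x):
    """Upper edge of the heart: two unit semicircles centred at x = -1 and x = 1."""
    return math.sqrt(max(0.0, 1.0 - (abs(x) - 1.0) ** 2))


def lower(x):
    """Lower edge of the heart: runs from 0 at |x| = 2 down to -pi at x = 0."""
    return math.acos(1.0 - abs(x)) - math.pi


def heart_area_mc(n_points=200_000, seed=42):
    """Estimate the area by sampling uniformly in a bounding box and counting hits."""
    rng = random.Random(seed)
    x_min, x_max = -2.0, 2.0
    y_min, y_max = -math.pi, 1.0
    box_area = (x_max - x_min) * (y_max - y_min)
    hits = 0
    for _ in range(n_points):
        x = rng.uniform(x_min, x_max)
        y = rng.uniform(y_min, y_max)
        if lower(x) <= y <= upper(x):
            hits += 1
    return box_area * hits / n_points


estimate = heart_area_mc()
print(f"Monte Carlo estimate: {estimate:.3f}, exact for this shape: {3 * math.pi:.3f}")
```

The estimate is the bounding-box area times the fraction of random points that fall between the two curves, so its accuracy improves roughly with the square root of the number of points.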
Friday, February 7, 2014
Linear Modeling and Logistic Regression with R
If you're new to the R language but keen to get started with linear modeling or logistic regression in the language, take a look at the link below. It works through a series of examples to teach by demonstration. All of the datasets used in the guide are available online, so it's easy to follow along from home.
http://www.r-bloggers.com/princetons-guide-to-linear-modeling-and-logistic-regression-with-r/
Thursday, February 6, 2014
Wolfram Personal Analytics for Facebook
Recently you might have been enjoying personal #FacebookIs10 videos.. Now it is time to take a look at the stats behind your Facebook profile with Wolfram Personal Analytics..
Right from Statistics of your posts, their weekly distribution, post lengths to word cloud, top commenters and sharers..
http://www.wolframalpha.com/facebook/
Interview with Kanti Mardia
Statistics provides a challenge somewhat akin to Sherlock Holmes’ task: how to find hidden truth in any data, from small to big.
- Kanti Mardia
Read the complete interview with Samuel S. Wilks Award winner Kanti Mardia, at:
Saturday, February 1, 2014
Amount of snow to cancel school
The following link gives an interesting visualization showing the estimated amount of snow required to close school for the day, by county. It's not simply directly proportional to the amount of snowfall, as school cancellation is the result of more snow than an area is used to handling.
http://flowingdata.com/2014/01/31/amount-of-snow-to-cancel-school/
Thursday, January 30, 2014
Comparing multiple (g)lm in one graph
It is already possible to compare multiple models as table output. Here the author has built a function that plots several (g)lm-objects in a single ggplot-graph:
Recurrent events analysis, not so straightforward!
Heart failure hospitalizations are associated with an increased risk of cardiovascular death, so if an individual dies during follow-up, this isn't necessarily independent of the event process of interest. Dependent censoring needs to be accounted for in any analyses that are carried out and this renders standard methods as unsuitable. Here is some discussion about the alternative approaches:
History through the president’s words
Studying presidents' choice of words over time provides glimpses of change in American politics. Check out the different tabs.. E.g. Foreign policy gives a very clear picture of how relations evolved with various countries..
http://www.washingtonpost.com/wp-srv/special/politics/2014-state-of-the-union/language-of-sotu/
Free books on statistical learning
Here are some books related to statistical learning, freely available online:
http://www.r-bloggers.com/free-books-on-statistical-learning/
Story Competition
People first started talking about the Normal Distribution nearly 300 years ago. The scientific community used their understanding of the Normal Curve to model and give meaning to the results of their experiments. Today, we owe much of our modern technology and modern world to the discoveries made possible by the Normal Curve. So, what would the world be like if the Normal Curve had never been discovered? Submit your story for a chance at $3500 in cash prizes!
Wednesday, January 29, 2014
Charts That Don’t Start at Zero
A statistician throws light on how an improper usage of statistical tools can lead to misleading conclusions:
http://www.r-bloggers.com/lies-damn-lies-data-journalism-and-charts-that-dont-start-at-0/
Tuesday, January 28, 2014
Interview with Inventor of S and R
John Chambers (creator of S programming language & core member of R programming language project) recounts the history of S and R in the following interview:
John Chambers talks about his involvement in the birth of the S language in 1976, and how it evolved over the years to become the inspiration for the R language.
Monday, January 27, 2014
Public transit times in major cities
Here is an interesting visualization..
You can select the time of the day and day of the week, and get a realistic estimate of how long it takes to get from point A to point B. There is also an interesting comparison option, which lets you choose two locations to see which area will get you somewhere else faster.
Musings on Random Walk
"A drunk man will find his way home, but a drunk bird may get lost forever."
- Shizuo Kakutani
Want to know why? Read at:
http://mahalanobis.twoday.net/stories/228354/
http://www.math.cornell.edu/~mec/Winter2009/Thompson/randomwalks.html
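The quote refers to Pólya's result that a simple random walk on a two-dimensional grid returns to its starting point with probability 1, while in three dimensions the return probability is only about 0.34. A finite simulation cannot prove recurrence, but it can show the gap. The Python sketch below is an illustration with arbitrarily chosen walk counts and step limits, not code from the linked pages.

```python
import random


def returns_home(dim, max_steps, rng):
    """Walk on the integer lattice in `dim` dimensions; True if the origin is revisited."""
    pos = [0] * dim
    for _ in range(max_steps):
        axis = rng.randrange(dim)          # pick a coordinate direction
        pos[axis] += rng.choice((-1, 1))   # step one unit forwards or backwards
        if not any(pos):
            return True
    return False


def return_fraction(dim, n_walks=500, max_steps=500, seed=1):
    """Fraction of simulated walks that return to the origin within max_steps."""
    rng = random.Random(seed)
    hits = sum(returns_home(dim, max_steps, rng) for _ in range(n_walks))
    return hits / n_walks


frac_2d = return_fraction(2)
frac_3d = return_fraction(3)
print(f"returned within 500 steps: 2D {frac_2d:.2f}, 3D {frac_3d:.2f}")
```

Raising the step limit pushes the 2D fraction slowly towards 1, whereas the 3D fraction levels off well below it.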
R Tricks for Kids
Here is an article from 'Teaching Statistics', which describes real-world phenomena simulation models, which can be used to engage middle-school students with probability. Links to R instructional material and easy-to-use code are provided to facilitate implementation in the classroom.
http://onlinelibrary.wiley.com/doi/10.1111/test.12016/full
Friday, January 24, 2014
An interview with Sir David Cox
Sir David Cox is arguably one of the world’s leading living statisticians. He has made pioneering and important contributions to numerous areas of statistics and applied probability over the years, of which perhaps the best known is the proportional hazards model, which is widely used in the analysis of survival data. In this interview, he says, “I would like to think of myself as a scientist, who happens largely to specialise in the use of statistics.”
Read the complete interview at:
http://www.statisticsviews.com/details/feature/5770651/I-would-like-to-think-of-myself-as-a-scientist-who-happens-largely-to-specialise.html
Wednesday, January 22, 2014
Does 1+2+3… really equal -1/12?
A recent Numberphile video claims that the sum of all the positive integers is -1/12. Bothered by that, Evelyn Lamb talks about what it means to assign a value to an infinite series and explains different ways of doing this.
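One standard way to attach the value -1/12 to this series is zeta function regularisation, sketched below. This is offered as background and is not necessarily the route taken in the video or in the article. The partial sums themselves grow without bound; the value -1/12 comes from the analytic continuation of the Riemann zeta function to s = -1, where the defining series no longer converges.

```latex
\sum_{n=1}^{N} n = \frac{N(N+1)}{2} \longrightarrow \infty \quad (N \to \infty),
\qquad
\zeta(s) = \sum_{n=1}^{\infty} n^{-s} \quad (\operatorname{Re} s > 1),
\qquad
\zeta(-1) = -\frac{1}{12}.
```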
A century of passenger air travel
Kiln and the Guardian explored the 100-year history of passenger air travel, and to kick off the interactive is an interactive map that uses live flight data from FlightStats. The map shows all current flights in the air right now. Be sure to click through all the tabs. They're worth the watch and listen, with a combination of narration, interactive charts, and old photos.
Tuesday, January 21, 2014
Solving water resource problems using Statistics
In an exclusive interview, Dr. Upmanu Lall, Director of Columbia Water Center discusses how he uses Statistics and an understanding of climate, agriculture, commerce, engineering, technology, and politics to solve some of the world’s most pressing water problems:
Sunday, January 19, 2014
Not Missing at Random
Not Missing at Random (NMAR) is data that is missing for a specific reason..
Here is an interesting example of NMAR data.. with the message that one shouldn't be sad and low after reading on Facebook about the abnormally flattering lives of one's friends..
The Music Timeline
The Music Timeline shows genres of music waxing and waning using a stacked area chart. Each stripe on the graph represents a genre; the thickness of the stripe tells you roughly the popularity of music released in a given year in that genre.
An Interview with Larry Wasserman
Professor Larry Wasserman is currently Professor in the Department of Statistics and Machine Learning at Carnegie Mellon University. His research interests include nonparametric inference, machine learning, statistical topology and astrostatistics. Here is a link to his interview where he talks about statistics and his career in statistics.
R is the most-used tool
O'Reilly has just published the results of the Data Scientist Salary Survey, based on data collected from attendees of the O'Reilly Strata conferences in 2012 and 2013. Each respondent listed multiple tools that they used both in data roles and non-data roles. R topped the list of Statistical Software beating SAS, SPSS, Excel etc.
Thursday, January 16, 2014
Competitions to celebrate 175th Anniversary of ASA
The American Statistical Association is celebrating its 175th anniversary. You may celebrate with them by doing any of the following:
- Entering ASA's Got Talent, the ASA's unique talent competition
- Looking for clues in Amstat News and playing ASA's Trivia Challenge
- Sending in your design for the ASA's official 175th anniversary T-shirt
Submit your entries before 30th April 2014. More details can be found at:
Wednesday, January 15, 2014
Timeline of Statistics
Check out this precise yet detailed "timeline of statistics" published by Significance magazine to celebrate its 10th anniversary..
Regression with Gradient Descent
Here is an overview of the gradient descent algorithm, which offers some intuition on why the algorithm works and where it comes from, and provides examples of implementing it for ordinary least squares and logistic regression in R:
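As a separate illustration (in Python rather than R, and not the code from the linked post), here is a minimal sketch of gradient descent for simple least-squares regression. The learning rate, iteration count and toy data are assumptions chosen so that the example converges.

```python
def fit_ols_gradient_descent(xs, ys, learning_rate=0.1, n_iter=5000):
    """Fit y = b0 + b1 * x by gradient descent on half the mean squared error."""
    n = len(xs)
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        residuals = [b0 + b1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of (1 / 2n) * sum(residual^2) w.r.t. b0 and b1.
        grad_b0 = sum(residuals) / n
        grad_b1 = sum(r * x for r, x in zip(residuals, xs)) / n
        # Step downhill, against the gradient.
        b0 -= learning_rate * grad_b0
        b1 -= learning_rate * grad_b1
    return b0, b1


# Noise-free toy data generated from y = 1 + 2x, so the true answer is known.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x for x in xs]

intercept, slope = fit_ols_gradient_descent(xs, ys)
print(f"intercept = {intercept:.4f}, slope = {slope:.4f}")
```

A learning rate that is too large makes the updates diverge, and one that is too small makes convergence slow; for logistic regression the same loop applies with the residual replaced by the predicted probability minus the observed label.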
Lexical distance between European languages
So why is English still considered a Germanic language and not a Latinate one? How do you measure the proximity in linguistic families? Read more at:
Tuesday, January 14, 2014
n vs n-1
People keep on wondering “Why is the denominator in the sample mean n, but the denominator for the sample variance is n−1?” All of us have had to answer this question at some time in our careers, either for our students or for ourselves. How do you answer it, and how helpful is your answer? Do you feel obliged to introduce distinctions such as populations vs samples, description vs inference, parameters vs statistics, Greek vs Roman letters? Or more advanced concepts, such as degrees of freedom, dimensions of subspaces, unbiasedness or maximum likelihood? Read more at:
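One concrete way to see the unbiasedness argument is to enumerate every possible sample from a tiny population and average the two estimators. The Python sketch below is an illustration under stated assumptions: a made-up population of six values (a fair die) and samples of size 3 drawn with replacement.

```python
from itertools import product

population = [1, 2, 3, 4, 5, 6]  # illustrative population: faces of a fair die
N = len(population)
mu = sum(population) / N
sigma2 = sum((x - mu) ** 2 for x in population) / N  # true population variance


def average_variance_estimates(n):
    """Average the /n and /(n-1) variance estimates over all samples of size n."""
    samples = list(product(population, repeat=n))  # sampling with replacement
    total_div_n = 0.0
    total_div_n1 = 0.0
    for sample in samples:
        m = sum(sample) / n
        ss = sum((x - m) ** 2 for x in sample)
        total_div_n += ss / n
        total_div_n1 += ss / (n - 1)
    return total_div_n / len(samples), total_div_n1 / len(samples)


mean_div_n, mean_div_n1 = average_variance_estimates(3)
print(f"true variance        : {sigma2:.4f}")
print(f"average with n       : {mean_div_n:.4f}")
print(f"average with n - 1   : {mean_div_n1:.4f}")
```

Dividing by n underestimates the variance by the factor (n - 1)/n on average, because deviations are measured from the sample mean rather than the true mean; dividing by n - 1 removes that bias.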
How Much Time to Conceive?
One of the most important questions that people ask when they make the decision to have a child is: how long is it going to take us to get pregnant? The probabilities mentioned by doctors provide an answer to this question. But these probabilities are estimates at best (albeit, no doubt, educated estimates!) and are associated with some not insignificant uncertainties. Here is an approach to judge how important the value of the monthly probability is in determining the time to conception, using basic probability distributions and R visualisations:
http://www.r-bloggers.com/how-much-time-to-conceive/
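A minimal sketch of the kind of calculation involved, in Python rather than R and not taken from the linked post. It assumes the simplest possible model: the same monthly probability p every month, independently of earlier months, so the waiting time follows a geometric distribution. The values of p are illustrative, not medical estimates.

```python
def prob_within(p, months):
    """P(conception within `months` months) under a constant monthly probability p."""
    return 1 - (1 - p) ** months


def expected_months(p):
    """Mean of the geometric waiting time, in months."""
    return 1 / p


# Illustrative monthly probabilities (assumptions, not clinical figures).
for p in (0.10, 0.20, 0.30):
    print(
        f"p = {p:.2f}: within 6 months {prob_within(p, 6):.2f}, "
        f"within 12 months {prob_within(p, 12):.2f}, "
        f"expected wait {expected_months(p):.1f} months"
    )
```

Under this model a modest change in the monthly probability shifts the expected waiting time considerably, which is why the uncertainty in that probability matters.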
Monday, January 13, 2014
Hidden History
A modern statistician needs to appreciate the historical roots of the profession, argues Terry Speed:
So look to your statistical roots!
Friday, January 10, 2014
Are you saving too much?
The only hard-and-fast rule for how much retirement income you will need is that there is no hard-and-fast rule. New research shows that many retirees can live well on less than the amount suggested by financial industry but others rack up higher expenses through travel, expensive hobbies or medical costs that can't be avoided. Read more at:
Thursday, January 9, 2014
From spreadsheet thinking to R thinking
One may have inertia in switching from spreadsheets to R. Here is a post to help overcome the same:
http://www.r-bloggers.com/from-spreadsheet-thinking-to-r-thinking/
Statistics and The War
We all agree that wars are terrible and to be avoided to the greatest extent possible, yet it is hard not to concede that wars can bring scientific, technological, industrial, cultural, political, even economic benefits. This is one of the many paradoxes of war. Statistics is no exception. Not only was there extremely rapid development of some areas of statistics, especially industrial statistics, but also a large proportion of the leaders in our subject in the 40 years following World War II met it for the first time during the War. Most of them would not have become statisticians but for the War.
Wednesday, January 8, 2014
Paul Erdös, The Maverick Genius
One of the finest minds in the history of mathematics, Erdös chose as his epitaph the self-deprecating Hungarian phrase “Finally I am becoming stupider no more.” Read more about him at:
Friday, January 3, 2014
Bodily maps of emotions
Emotions are often felt in the body, and somatosensory feedback has been proposed to trigger conscious emotional experiences. As one would expect, the body looks like it shuts down with depression, and it lights up with happiness, but it's the subtle differences that are most interesting. Read how statistical classifiers were used to distinguish emotion-specific activation maps accurately:
Thursday, January 2, 2014
Generalized linear models for predicting rates
We often need to build a predictive model that estimates rates. A simple example is estimating default rates of mortgages or credit cards. One could try linear regression, but specialized tools often do much better. Here is a discussion of how to do such things in R:
Experience the thrill of touching real data
The story of one man's efforts to re-analyse the stats behind a BBC report on bowel cancer is a heartwarmingly nerdy one:
Wednesday, January 1, 2014
Parallelisation may not always be better than sequential processing
Parallelisation incurs some overhead: information needs to be distributed over the nodes, and the result from each node needs to be collected and aggregated into the resulting object. This overhead is one of the main reasons why in certain cases parallel processing takes longer than sequential processing. Read more at:
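The trade-off can be illustrated with a toy cost model, sketched here in Python. Everything in it is an assumption made for illustration (a fixed start-up cost, a per-task communication cost, perfectly divisible work); it is not a benchmark and not the analysis from the linked post.

```python
def sequential_time(n_tasks, task_time):
    """Total time when tasks run one after another."""
    return n_tasks * task_time


def parallel_time(n_tasks, task_time, workers, startup=0.5, comm_per_task=0.001):
    """Toy model: start-up + per-task distribute/collect cost + evenly shared work."""
    overhead = startup + n_tasks * comm_per_task
    return overhead + n_tasks * task_time / workers


# Hypothetical workloads: 1000 tasks on 4 workers, times in seconds.
for task_time in (0.0005, 0.05):
    seq = sequential_time(1000, task_time)
    par = parallel_time(1000, task_time, workers=4)
    print(f"task {task_time:.4f}s: sequential {seq:.2f}s, parallel {par:.2f}s")
```

When each task is tiny, the overhead dominates and the parallel version is slower; once tasks are expensive enough, dividing the work among the workers outweighs the overhead.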
Animation of the Construction of a Confidence Interval
The confidence interval is one of the more tricky statistical concepts. A way of explaining confidence intervals is as the region of possible null hypotheses resulting in corresponding significance tests that are not rejected. Turns out it is not easy to make a corresponding nice explanatory animation either, but that’s what has been tried here:
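The "region of non-rejected null hypotheses" view can be made concrete by inverting a test over a grid of candidate values. The Python sketch below is an illustration under simplifying assumptions (a two-sided z-test with a known standard deviation, made-up data); it is not the animation code from the linked post.

```python
from statistics import NormalDist, mean

data = [4.9, 5.3, 5.1, 4.7, 5.6, 5.0, 5.2, 4.8]  # made-up measurements
sigma = 0.3      # standard deviation, assumed known for this sketch
alpha = 0.05     # significance level, giving a 95% interval

n = len(data)
xbar = mean(data)
se = sigma / n ** 0.5
z_crit = NormalDist().inv_cdf(1 - alpha / 2)


def not_rejected(mu0):
    """Two-sided z-test of H0: mean == mu0; True if H0 is NOT rejected."""
    return abs(xbar - mu0) / se <= z_crit


# Test every candidate null value on a fine grid and keep those not rejected.
step = 0.001
grid = [4.0 + i * step for i in range(2001)]  # candidate means from 4.0 to 6.0
accepted = [mu0 for mu0 in grid if not_rejected(mu0)]

ci_by_inversion = (min(accepted), max(accepted))
ci_closed_form = (xbar - z_crit * se, xbar + z_crit * se)
print("by test inversion:", ci_by_inversion)
print("closed form      :", ci_closed_form)
```

The two intervals agree up to the grid spacing, which is the sense in which a confidence interval collects the null values that the data do not reject.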