
August 3, 2016

Heuristica: An R package for testing models of binary choice

Filed in Encyclopedia, Ideas, Programs, R, Tools
Subscribe to Decision Science News by Email (one email per week, easy unsubscribe)

CLASSIC HEURISTICS AND DATA SETS ALL IN ONE TIDY PACKAGE


It just got a lot easier to simulate the performance of simple heuristics.

Jean Czerlinski Whitmore, a software engineer at Google with a long history in modeling cognition, and Daniel Barkoczi, a postdoctoral fellow at the Max Planck Institute for Human Development, have created heuristica: an R package to model the performance of simple heuristics. It implements the heuristics covered in the first chapters of Simple Heuristics That Make Us Smart, such as Take The Best, the unit-weighted linear model, and more. The package also includes data, such as the original German cities data set, which has become a benchmark for testing heuristic models of choice, cited in hundreds of papers.

As with most R packages, a good place to start is the README vignette:

Here’s the heuristica package’s home on CRAN and here’s a description of the package in the authors’ own words:

The heuristica R package implements heuristic decision models, such as Take The Best (TTB) and a unit-weighted linear model. The models are designed for two-alternative choice tasks, such as which of two schools has a higher drop-out rate. The package also wraps more well-known models like regression and logistic regression into the two-alternative choice framework so all these models can be assessed side-by-side. It provides functions to measure accuracy, such as an overall percentCorrect and, for advanced users, some confusion matrix functions. These measures can be applied in-sample or out-of-sample.

The goal is to make it easy to explore the range of conditions in which simple heuristics are better than more complex models. Optimizing is not always better!
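
Getting started is quick. Below is a minimal sketch in the spirit of the package README, fitting Take The Best and regression to the German cities data and comparing their fitted accuracy. The column indices are assumptions, so check str(city_population) against the package documentation before relying on them.

library(heuristica)

data_set <- city_population
criterion_col <- 3               # the population column (assumed position)
cols_to_fit <- 4:ncol(data_set)  # the binary cue columns (assumed positions)

# Fit Take The Best and ordinary regression to the same data
ttb <- ttbModel(data_set, criterion_col, cols_to_fit)
reg <- regModel(data_set, criterion_col, cols_to_fit)

# In-sample accuracy: percent of all city pairs each model gets right
percentCorrectList(data_set, list(ttb, reg))

For out-of-sample comparisons, the same functions can be pointed at held-out rows, which is where the simple-heuristics-versus-regression story gets interesting.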

July 25, 2016

We’ve bet that Hillary Clinton will win

Filed in Gossip, Ideas
Subscribe to Decision Science News by Email (one email per week, easy unsubscribe)

PUTTING OUR MONEY WHERE OUR MOUTH IS

[Figure: our PredictIt bets on the 2016 US presidential election. Click to enlarge]

Some experts seem pretty sure Donald Trump will be the next US President. Michael Moore wrote an article entitled 5 Reasons Why Trump Will Win.


Prediction maven Nate Silver warned a few days ago: “Don’t think people are really grasping how plausible it is that Trump could become president. It’s a close election right now.”


Despite this, we think that Hillary Clinton is going to win.

And we’ve put our money where our mouth is. There’s a prediction market called PredictIt in which US citizens in most states can legally bet on events happening or not. There’s an $850 limit on any contract, but you can get around it in the following way.

As the figure up top shows, we’ve placed two bets:

  • We bet $799.50 that the next President will not be a Republican. That is, we bought 1,230 shares of “no” on that contract at 65 cents each. If the next President is indeed not a Republican, we’ll be able to sell those shares for a dollar each, or $1,230. Otherwise we lose our money.
  • We bet $849.87 that Hillary Clinton will be the next President. That is, we bought 1,349 shares of “yes” on that contract at 63 cents each. If Hillary wins, we’ll be able to sell our shares for $1,349. Otherwise we lose our money.

So, we’ve bet $1,649.37. If Hillary wins, we’ll have $2,579 (minus the market’s 10% fee on profits). If Trump or some other Republican wins, we’ll have bupkis.
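
For the skeptical reader, here is a quick R check of the arithmetic, using the share counts and prices above:

# Sanity-check the bet arithmetic described above
cost   <- 1230 * 0.65 + 1349 * 0.63  # 799.50 + 849.87 = 1649.37
payout <- 1230 + 1349                # 2579; each winning share pays $1
payout - cost                        # 929.63 profit, before the 10% fee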

July 22, 2016

Behavioral Science and Policy Association PolicyShop Blog

Filed in Ideas, Research News
Subscribe to Decision Science News by Email (one email per week, easy unsubscribe)

A BLOG OF NOTE


Decision Science News readers presumably like to read about judgment and decision making, behavioral economics, and policy research. They may therefore be interested in the Behavioral Science and Policy Association’s PolicyShop blog, which covers all these topics.

July 11, 2016

Second Consumer Financial Protection Bureau (CFPB) Research Conference on Consumer Finance (December 15-16, 2016)

Filed in Conferences
Subscribe to Decision Science News by Email (one email per week, easy unsubscribe)

CALL FOR PAPERS. SUBMISSION DEADLINE AUG 26, 2016

This winter, the Consumer Financial Protection Bureau (CFPB) will host its second research conference on consumer finance.

We encourage the submission of a variety of research. This includes, but is not limited to, work on: the ways consumers and households make decisions about borrowing, saving, and financial risk-taking; how various forms of credit (mortgages, student loans, credit cards, installment loans, etc.) affect household well-being; the structure and functioning of consumer financial markets; distinct and underserved populations; and relevant innovations in modeling or data. A particular area of interest for the CFPB is the dynamics of households’ balance sheets.

A deliberate aim of the conference is to connect the core community of consumer finance researchers and policymakers with the best research being conducted across the wide range of disciplines and approaches that can inform the topic. Disciplines from which we hope to receive submissions include, but are not limited to, economics, the behavioral sciences, cognitive science, and psychology.

The conference’s scientific committee includes:

  • Adair Morse (University of California Berkeley, Haas School of Business)
  • Annette Vissing-Jorgensen (University of California Berkeley, Haas School of Business)
  • Colin Camerer (California Institute of Technology)
  • Eric Johnson (Columbia University, Columbia Business School)
  • Jonathan Levin (Stanford University)
  • Jonathan Parker (Massachusetts Institute of Technology, Sloan School of Management)
  • José-Victor Rios-Rull (University of Pennsylvania)
  • Judy Chevalier (Yale School of Management)
  • Matthew Rabin (Harvard University)
  • Susan Dynarski (University of Michigan)

Authors may submit complete papers or detailed abstracts that include preliminary results. All submissions should be made in electronic PDF format to CFPB ResearchConference at cfpb.gov by Friday, August 26th, 2016.

Please remember to include contact information on the cover page for the corresponding author. Please submit questions or concerns to Worthy.Cho at cfpb.gov.

July 6, 2016

How many calories should you eat per day?

Filed in Ideas, R
Subscribe to Decision Science News by Email (one email per week, easy unsubscribe)

US GOVERNMENT GUIDELINES BY AGE, SEX, ACTIVITY LEVEL

[Figure: US government daily calorie guidelines by age, sex, and activity level. Click to enlarge]

At Decision Science News, we are always on the lookout for rules of thumb.

Our colleague Justin Rao was thinking it would be useful to express a food’s calories as a percentage of daily calories. So instead of a Coke being 150 calories, you could think of it as 7.5% of your daily calories. Or whatever. The whatever is key.

This is an example of putting unfamiliar numbers in perspective.

So, we were then interested to see if there would be an easy rule of thumb for people to calculate how many calories per day they should be eating, so that they could re-express foods as a percentage of that.

We found some calorie guidelines on the Web published by the US government. With the help of Jake Hofman, we used Hadley Wickham‘s rvest package to scrape them and his other tools to process and plot them.

The result is above. If you have any ideas on how to fit it elegantly, let us know.

We tried a number of fits. Lines are good for heuristics, so we made a bi-linear fit to the raw data (in points). We’re all grownups reading this blog, so let’s focus on the lines to the right of the peak.

[Figure: two-part linear fit to the calorie guidelines. Click to enlarge]

Time to make the heuristics. For women, you need about 65 fewer calories per day for every decade after age 20. For men, you need about 105 fewer calories per day for every decade after age 20. Or let’s just say 70 and 100 to keep it simple.

So, if you have an opposite-sex life partner (OSLP?), keep in mind that you may need to cut back by more or fewer calories than the person across the table as you age together. Same-sex life partner (SSLP?), cut back by the same amount. Just don’t go beyond the range of the chart. The guidelines suggest even sedentary men shouldn’t eat fewer than 2,000 calories a day at any age. For women, that number is 1,650.
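
As a minimal sketch, the rounded rule of thumb can be wrapped in a small function. The function name and interface are ours, not from the guidelines:

# Rounded rule of thumb from above: roughly 70 (women) or 100 (men)
# fewer calories per day for every decade after age 20
calorie_drop_per_day <- function(age, sex = c("female", "male")) {
  sex <- match.arg(sex)
  decades <- max(0, (age - 20) / 10)
  per_decade <- if (sex == "female") 70 else 100
  per_decade * decades
}

calorie_drop_per_day(50, "male")    # ~300 fewer calories per day than at 20
calorie_drop_per_day(40, "female")  # ~140 fewer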

REFERENCES

Barrio, Pablo J., Daniel G. Goldstein, & Jake M. Hofman. (2016). Improving comprehension of numbers in the news. ACM Conference on Human Factors in Computing Systems (CHI ’16). [Download]

The R code, below, has some other attempts at plots in it. You may be most interested in it as a way to see rvest in action. Or just to get the data.
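
The full code isn’t reproduced here, but a minimal sketch of the scraping step might look like the following. The URL and the table’s position on the page are assumptions; point them at the actual guidelines page:

library(rvest)

# Assumed location of the government calorie guidelines (check before use)
url  <- "https://health.gov/dietaryguidelines/2015/guidelines/appendix-2/"
page <- read_html(url)

# Pull the first HTML table on the page into a data frame
calories <- page %>%
  html_node("table") %>%
  html_table(fill = TRUE)

head(calories)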

June 27, 2016

Prediction markets have to occasionally “get it wrong” to be calibrated

Filed in Encyclopedia, Jobs, Research News
Subscribe to Decision Science News by Email (one email per week, easy unsubscribe)

PREDICTION MARKETS NOT AS BAD AS THEY APPEAR

[Figure: PredictIt price for the Brexit contract in the hours before the outcome was certain]

Two recent events in the UK made it look like prediction markets’ predictions aren’t worth much.

The soccer team Leicester City won the Premier League title despite the markets putting the odds of their doing so at 5,000 to 1 (0.02%).

Last week, people in the UK voted to leave the European Union. A few hours before it was certain they would exit, a prediction market put their probability of leaving at 10%. See the figure above from PredictIt. The x-axis is roughly time before the outcome was certain; the y-axis can be interpreted as the probability of exit (70 cents = 70%). The price jumped from 10% to 90% in just five hours.

Analysts like to “explain” market results, coming up with reasons why an event was a failure of the prediction market. For instance, for the two events above, the Wall Street Journal, perhaps correctly, claims the bets were unduly influenced by London bettors: through big London bets, the odds moved to reflect what Londoners believed instead of the sentiment of the wider crowd. In predicting a Brexit, the sentiment of the crowd is exactly what you want.

Whenever the prediction market is far on the wrong side of 50%, explanations will arise as to why the prediction market was wrong. Let’s take a step back here.

A desirable property of a prediction market is that it is calibrated. To be calibrated, events that it predicts to be 90% likely should occur 90% of the time. Events that it predicts to be 10% likely should occur 10% of the time.

If events that it predicts to be 10% likely (e.g. Brexit) occur 0% of the time, the prediction market has a problem. It is over-estimating the chances.
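
To make calibration concrete, here is a toy R sketch of a calibration table. All the numbers are simulated for illustration, not real market data:

# Simulate a perfectly calibrated forecaster: `pred` plays the role of
# market prices read as probabilities; `outcome` is 1 if the event happened
set.seed(1)
pred    <- runif(500)
outcome <- rbinom(500, 1, pred)

# Bin the predictions and compare predicted vs. observed rates per bin
bins  <- cut(pred, breaks = seq(0, 1, by = 0.1), include.lowest = TRUE)
calib <- aggregate(data.frame(predicted = pred, observed = outcome),
                   by = list(bin = bins), FUN = mean)
calib  # observed frequencies should track predicted probabilities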

Looking at the calibration of prediction markets across many events, one sees that they are typically very well calibrated. Take, for instance, the Hypermind prediction market. The figure below shows close to 500 events that it predicted. If the market were perfectly calibrated, the points would fall along the diagonal line (i.e., 10%-likely events would happen 10% of the time, 20%-likely events would happen 20% of the time, and so on).

It’s very well calibrated. Here’s a similar chart for U.S. primaries at PredictWise.

[Figure: Hypermind calibration chart, predicted probability vs. observed frequency]

So the next time a market “misses” a 10% prediction, let us keep in mind that it needs to miss 10% of those predictions to stay calibrated. As Émile Servan-Schreiber mentions in this post, “It is perhaps a bit ironic to note that the data from the Brexit question slightly improved [the prediction market’s] overall calibration. It is as if the occurrence of an unlikely event was long overdue in order to better match predicted probabilities to observed outcomes!”

You may wish that prediction markets had perfect foresight and predicted only 0% or 100%. We do, too. But the world is an imperfect place.

June 20, 2016

R is now the number two statistical computing program in scholarly use

Filed in Encyclopedia, Ideas, R, Research News
Subscribe to Decision Science News by Email (one email per week, easy unsubscribe)

WE THOUGHT IT WAS NUMBER ONE

[Figure: trends in scholarly use of R and SPSS]

Robert Muenchen has done a rather in-depth analysis of the popularity of various packages for statistical computation. More detail here.

We were surprised to see that R has passed SAS in scholarly use. We were surprised because we assumed this would have happened years ago. Muenchen, too, had predicted it would happen in 2014.

In addition, we were surprised to see that SPSS holds the number one spot, and by a fair margin. We can only think of one person who uses SPSS.

Perhaps we’re guilty of false consensus reasoning.

A chart further down the post shows that Python is rapidly growing in use. That meshes with what we observe. Many of our PhD student interns come in using Python for data analysis. One whom we worked with, Chris Riederer, is even writing a version of dplyr in Python and calling it, of course, dplython.

June 17, 2016

Can’t compute the standard deviation in your head? Divide the range by four.

Filed in Encyclopedia, Ideas, R
Subscribe to Decision Science News by Email (one email per week, easy unsubscribe)

TESTING A HEURISTIC TO ESTIMATE STANDARD DEVIATION

[Figure: the four beta distributions, from top to bottom: floor, left of center, normalish, and uniform. Click to enlarge]

Say you’ve got 30 numbers and a strong urge to estimate their standard deviation. But you’ve left your computer at home. Unless you’re really good at mentally squaring and summing, it’s pretty hard to compute a standard deviation in your head. But there’s a heuristic you can use:

Subtract the smallest number from the largest number and divide by four

Let’s call it the “range over four” heuristic. You could, and probably should, be skeptical. You could want to see how accurate the heuristic is. And you could want to see how the heuristic’s accuracy depends on the distribution of numbers you are dealing with.

Fine.

We generated random numbers from four distributions, pictured above. We nickname them (from top to bottom): floor, left of center, normalish, and uniform. They’re all beta distributions. If you want more detail, they’re the same beta distributions studied in Goldstein and Rothschild (2014). See the code below for parameters.

We vary two things in our simulation:
1) The number of observations on which we’re estimating the standard deviation.
2) The distributions from which the observations are drawn.

With each sample, we compute the actual standard deviation and compare it to the heuristic’s estimate of the standard deviation. We do this many times and take the average. Because we like the way mape sounds, we used mean absolute percent error (MAPE) as our error metric. Enough messing around. Let’s show the result.

[Figure: mean absolute percent error of the range-over-four heuristic, by number of observations and distribution. Click to enlarge]

There you have it. With about 30 to 40 observations, we could get an average absolute error of less than 10 percent for three of our distributions, even the skewed ones. With more observations, the error grew for those distributions.

With the uniform distribution, error was over 15 percent in the 30-40 observation range. We’re fine with that. We don’t tend to measure too many things that are uniformly distributed.

Another thing that sets the uniform distribution apart is that its error continued to go down as more observations were added. Why is this? The standard deviation of a uniform distribution between 0 and 1 is 1/sqrt(12), or 0.289. The heuristic, if it were lucky enough to draw a 1 and a 0 as its sample range, would estimate the standard deviation as 1/4, or 0.25. So, as the sample size increases, the error for the uniform distribution should drop toward a MAPE of 13.4% (that is, |0.25 − 0.289| / 0.289) and flatten out. The graph shows it is well on its way towards doing so.

REFERENCES

Browne, R. H. (2001). Using the sample range as a basis for calculating sample size in power calculations. The American Statistician, 55(4), 293-298.

Hozo, S., Djulbegovic, B., & Hozo, I. (2005). Estimating the mean and variance from the median, range, and the size of a sample. BMC Medical Research Methodology, 5(1), 1.

Ramírez, A., & Cox, C. (2012). Improving on the range rule of thumb. Rose-Hulman Undergraduate Mathematics Journal, 13(2).

Wan, X., Wang, W., Liu, J., & Tong, T. (2014). Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range. BMC Medical Research Methodology, 14(1), 135.

Want to play with it yourself? R code below. Thanks to Hadley Wickham for creating tools like dplyr and ggplot2, which take R to the next level.
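
The full code isn’t reproduced here, but a minimal sketch of the simulation might look like this. The beta shape parameters are placeholders, not the ones from Goldstein and Rothschild (2014), so substitute those before comparing results:

library(dplyr)
library(ggplot2)

# The heuristic: range divided by four
range_over_four <- function(x) (max(x) - min(x)) / 4

# Monte Carlo MAPE of the heuristic for one sample size and beta shape
mape_range_rule <- function(n, shape1, shape2, reps = 2000) {
  ape <- replicate(reps, {
    x <- rbeta(n, shape1, shape2)
    abs(range_over_four(x) - sd(x)) / sd(x)
  })
  mean(ape)
}

# Placeholder beta parameters for the four nicknamed distributions
shapes <- list(`floor`          = c(1, 10),
               `left of center` = c(2, 5),
               `normalish`      = c(5, 5),
               `uniform`        = c(1, 1))

grid <- expand.grid(dist = names(shapes), n = seq(10, 100, by = 10),
                    stringsAsFactors = FALSE) %>%
  rowwise() %>%
  mutate(mape = mape_range_rule(n, shapes[[dist]][1], shapes[[dist]][2])) %>%
  ungroup()

ggplot(grid, aes(n, mape, color = dist)) +
  geom_line() +
  labs(x = "Number of observations", y = "Mean absolute percent error")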

June 7, 2016

Hemingway app forces you to write more simply

Filed in Ideas, Tools
Subscribe to Decision Science News by Email (one email per week, easy unsubscribe)

OR AT LEAST WITH SHORTER SENTENCES


If you are an academic, and you probably are if you read this blog, you may be interested in tools that improve writing. Recently, we stumbled upon the Hemingway Editor, a tool that encourages simple and forceful writing. It flags hard-to-read (i.e., long) sentences. It also flags adverbs, passive voice, and unnecessarily complex phrases. We think that following all the editor’s advice would be a bad idea. Yet using it for a few paragraphs of a paper reminds you to keep things simple. And it does, at least superficially, make you write more like Hemingway.

For fun, we applied it to one of our own abstracts below. It went from 145 words with grade 18 readability to 119 words with grade 14 readability.

You should beware, however, that readers may infer that simpler academic texts are of lower quality (Galak and Nelson, 2011).

REFERENCES

Galak, J., & Nelson, L. D. (2011). The virtues of opaque prose: How lay beliefs about fluency influence perceptions of quality. Journal of Experimental Social Psychology, 47(1), 250–253.

Goldstein, Daniel G., & David Rothschild. (2014). Lay understanding of probability distributions. Judgment and Decision Making, 9(1), 1-14.

Hemingway Editor: http://www.hemingwayapp.com/

BEFORE (145 Words. Grade 18 readability)

How accurate are laypeople’s intuitions about probability distributions of events? The economic and psychological literatures provide opposing answers. A classical economic view assumes that ordinary decision makers consult perfect expectations, while recent psychological research has emphasized biases in perceptions. In this work, we test laypeople’s intuitions about probability distributions. To establish a ground truth against which accuracy can be assessed, we control the information seen by each subject to establish unambiguous normative answers. We find that laypeople’s statistical intuitions can be highly accurate, and depend strongly upon the elicitation method used. In particular, we find that eliciting an entire distribution from a respondent using a graphical interface, and then computing simple statistics (such as means, fractiles, and confidence intervals) on this distribution, leads to greater accuracy, on both the individual and aggregate level, than the standard method of asking about the same statistics directly.

AFTER (119 Words. Grade 14 readability)

How accurate are laypeople’s intuitions about probability distributions of events? The economic and psychological literatures provide opposing answers. A classical economic view assumes that ordinary decision makers consult perfect expectations. Recent psychological research has emphasized biases in perceptions. In this work, we test laypeople’s intuitions about probability distributions. We control the information seen by each subject to establish unambiguous ground truth answers. We find that laypeople’s statistical intuitions are accurate but depend upon the elicitation method. We computed simple statistics from distributions elicited using a graphical interface. The statistics included means, fractiles, and confidence intervals. All statistics derived from distributions were more accurate than those obtained by direct asking. This was true on both the individual and group level.

June 1, 2016

The SJDM Newsletter is ready for download

Filed in SJDM
Subscribe to Decision Science News by Email (one email per week, easy unsubscribe)

SOCIETY FOR JUDGMENT AND DECISION MAKING NEWSLETTER

 

The quarterly Society for Judgment and Decision Making newsletter can be downloaded from the SJDM site:

http://sjdm.org/newsletters/

best,
Dan Goldstein
SJDM President & Newsletter Editor