The Lupyan Lab approach to doing science:

We are a question-oriented rather than a methods-oriented lab. Our method of doing science goes something like this: (1) pick a question worth studying, (2) pick an experimental paradigm suited to the question, and (3) use a range of statistical techniques to get a feel for the data (ANOVAs, hierarchical mixed-effects models, Monte Carlo techniques). A phenomenon that is robust to statistical fickleness is more likely to be real. In the end, replication is always the best statistic (Luck, 2005).
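One Monte Carlo technique that makes few distributional assumptions is a permutation test: shuffle condition labels many times and ask how often a shuffled difference is as extreme as the observed one. Below is a minimal sketch; the reaction-time numbers and variable names are hypothetical, made up purely for illustration.

```python
import random
import statistics

def permutation_test(group_a, group_b, n_iter=10_000, seed=0):
    """Two-sample permutation test on the difference of means.

    Returns the proportion of label shufflings whose mean difference
    is at least as extreme as the observed one (a two-tailed p-value
    estimate).
    """
    rng = random.Random(seed)
    observed = statistics.mean(group_a) - statistics.mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)  # re-deal the labels at random
        diff = statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            hits += 1
    return hits / n_iter

# Hypothetical reaction times (ms) from two conditions:
rts_label = [412, 398, 441, 405, 389, 420]
rts_no_label = [455, 470, 438, 462, 449, 471]
p = permutation_test(rts_label, rts_no_label)
```

If an effect survives this kind of resampling as well as the parametric tests, that convergence is itself a small vote for its reality.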

But how does one know whether a question is worth studying? Here are some hints that it might be. You read a paper and come across a sentence such as “Unfortunately, these provocative speculations by (long-dead person) have never been tested empirically” or “Of course there is no way we can find out X because of confound Y”. Or, after reading a bunch of papers on a topic you realize that the logic of the authors makes an assumption that (to you) seems clearly false.

Studies worth doing are more often than not elegant. They are easy to explain to others, and they have easily interpretable outcomes. You may think you’re being comprehensive by including lots of manipulations in a single study because, after all, if you find something you can always run a follow-up. But what often ends up happening is that you find high-level interactions that aren’t reproducible because the very inclusion of the factors changes how people do the task. So instead of a logical sequence of studies in which you test for A, then B, then C, you end up with apparently fleeting main effects and interactions. In other words, a mess.

The importance of prototyping:

A better approach is to start simple, replicate, and once you are sure you have a real effect, try to kill it, or try to find mediating factors that will help you understand the mechanism that gives rise to it. It is much more exciting to get an effect and try to make it go away through follow-up experiments than to keep fiddling until something works. An additional advantage of the quick-prototype approach is that analyzing the data sometimes reveals a fundamental flaw in the design and encourages a redesign of the study before large amounts of data are collected.

Subject sampling and individual differences:

Just about all our subjects are WEIRD. We are aware that US college undergrads do not represent the human species at large. Even within this homogeneous sample, however, much can be learned by attending to individual differences. We always look at individual data, at minimum to check whether it is normally distributed. Bimodality, when present, can be highly revealing.
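A quick numeric screen for bimodality is Sarle's bimodality coefficient, which combines skewness and kurtosis; values above roughly 0.555 (the value for a uniform distribution) hint that the data may have two modes. A minimal sketch follows; the per-subject reaction times are hypothetical, and no such statistic replaces actually plotting the data.

```python
import random

def bimodality_coefficient(xs):
    """Sarle's bimodality coefficient: (skew^2 + 1) divided by the
    excess kurtosis plus a small-sample correction. Values above
    ~0.555 suggest bimodality. A rough screen, not a proof."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return (skew ** 2 + 1) / (excess_kurt + correction)

# Hypothetical per-subject mean RTs (ms): one unimodal sample, and one
# in which subjects split into two distinct strategy groups.
rng = random.Random(1)
unimodal = [rng.gauss(500, 60) for _ in range(400)]
bimodal = ([rng.gauss(400, 20) for _ in range(200)]
           + [rng.gauss(700, 20) for _ in range(200)])
```

A high coefficient is exactly the kind of signal worth chasing: two clusters of subjects often means two strategies, not one noisy mechanism.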

Over the past few years we have been making greater use of crowdsourcing platforms, most notably Amazon Mechanical Turk. We’ve learned a lot in the process (the do’s and don’ts of Mechanical Turk will be the subject of a separate post).

Collecting data online was initially stigmatized: “How do you know who your subjects are? How do you know they are paying attention? How can you tolerate so little control over their testing environment?” The stigma is dissipating, but its presence reveals a rarely examined assumption in experimental psychology—that running multi-trial, tightly controlled lab studies on relatively few participants is the most effective route to a greater truth. I am not so sure. Consider all the ways in which we limit individual differences in a typical lab study: we recruit college students of roughly similar SES, living in the same city, taking many of the same classes, and being tested in the very same room(s). And then we try to make claims about normative performance and build theories of “human” memory, “human” categorization, and “human” language based on these stable results. But perhaps this stability is illusory. Perhaps by setting up a common context and testing very similar people, we are led to believe that cognition and perception are far more stable than they actually are.

Compared to this traditional lab-testing approach, collecting data over Mechanical Turk typically constrains the researcher to run shorter experiments on many more participants who come from a much more diverse sample, and are tested under more diverse conditions. This approach often reveals (replicable) individual differences that are masked in lab studies, and arguably leads to more generalizable results.

Programming and automation:

An ability to move quickly from hypothesis to data is essential for understanding what story the data are trying to tell. To this end, the lab emphasizes programming skills (Python, in particular), automation, and efficiency.
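In practice, much of that automation is mundane: collapsing trial-level logs into per-subject condition means without touching a spreadsheet. The sketch below illustrates the idea; the column names (`subject`, `condition`, `rt`) and the inline CSV are hypothetical stand-ins for whatever your experiment software writes out.

```python
import csv
import io
from collections import defaultdict

def summarize_by_subject(rows):
    """Collapse trial-level rows into per-(subject, condition) mean RTs.

    rows: an iterable of dicts with 'subject', 'condition', and 'rt'
    keys (hypothetical column names; adapt to your own logs).
    """
    sums = defaultdict(lambda: [0.0, 0])  # (subject, condition) -> [total, count]
    for row in rows:
        key = (row["subject"], row["condition"])
        sums[key][0] += float(row["rt"])
        sums[key][1] += 1
    return {key: total / n for key, (total, n) in sums.items()}

# The same function works directly on a raw CSV export:
raw = "subject,condition,rt\ns01,label,412\ns01,label,398\ns01,nolabel,455\n"
means = summarize_by_subject(csv.DictReader(io.StringIO(raw)))
```

Once a step like this is a function rather than a manual procedure, rerunning the whole analysis after a redesign costs seconds, which is what makes the quick-prototype cycle described above feasible.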