Coffee or Tea?

Deliverables: part 1 full report here, part 2 full report here, CHI 2021 submission here

(This is a two-part project, where part 1 is about self-experimentation and part 2 is about thematic analysis.)

Part 1 - Self Experiment on coffee vs tea

In this part, I conducted a self experiment on a topic of my choice. I used my own experimental setup & design, as well as Self-E, an app developed by my school's HCI Research Group specifically for conducting customized self experiments. The goal was to compare and contrast my own analysis with the app's automated results.


Caffeine runs in my blood. I always start my day with coffee, tea (or even both) in order to "function".
However, my productivity varies drastically from day to day, and I am not sure what is causing these changes: is it the drinks themselves, or the varying amounts of caffeine, that affect my productivity? In an attempt to find out, I decided to investigate whether drinking solely tea or solely coffee, with the same amount of caffeine, makes a difference in my productivity levels.


Personally, I often experience cases where if I start being productive, I can sustain this productivity for a long time. If I am not productive, I often continue to stay lazy and get nothing done that same day. Therefore, I hypothesized that:

If I drink coffee and tea with the same amount of caffeine, coffee will increase my productivity level more than tea does, because coffee contains more caffeine per cup, which may kickstart my productivity and result in an increased productivity level overall.

Methods & Procedures
I performed two iterations of the experiment with the goal of improving over iterations (which I will describe in detail in the next section). I performed both iterations for 6 days and did the following:

Bias & Validity

In order to avoid bias and ensure construct validity, I established several guidelines:

  • Rated my productivity levels 3 times a day (at 10am, 1pm, and 4pm) and took the average value each day.
  • Kept my sleep schedule consistent by going to sleep and waking up around the same time daily, and started working at the same time every day (early morning).
  • Drank all my coffee and tea at the same time (in the mornings with breakfast) and made sure to not consume anything else containing caffeine.
  • Pre-planned my tasks for the 6 days, making sure the tasks every day roughly added up to the same amount of time.
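The averaging step in the first guideline can be sketched in a few lines of Python. This is only an illustration: the ratings below are hypothetical placeholders rather than the recorded data, and the helper name `daily_average` is an assumption made for this example.

```python
from statistics import mean

# Hypothetical ratings taken at 10am, 1pm, and 4pm.
# These are placeholder values, not the data from the experiment.
ratings_by_day = {
    "day 1": [3, 4, 2],
    "day 2": [1, 2, 3],
}

def daily_average(ratings):
    """Return the mean of one day's three productivity ratings."""
    if len(ratings) != 3:
        raise ValueError("expected exactly three ratings per day")
    return mean(ratings)

# One averaged productivity score per day, used as the unit of analysis.
daily_scores = {day: daily_average(r) for day, r in ratings_by_day.items()}
```

Collapsing the three ratings into one score per day gives a single observation for each day, which is what a later comparison between coffee days and tea days would operate on.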


In total, I conducted three iterations of the experiment (the two manual iterations plus one using Self-E), with the intentions below:

  • Iteration 1 = a "test run" where I can gather ideas on what worked well/what can be changed
  • Iteration 2 = the improved experiment.
  • Iteration 3 = using Self-E.

Data Analysis

Since I conducted the experiment both manually and using Self-E, I received two sets of results:
1. Via my own analysis (hypothesis testing). Result = The type of drink (coffee vs tea) did not make a significant difference in my productivity level when I consumed them with the same amount of caffeine.
2. Via Self-E. Result output by the app = "It is most likely (~76% likelihood) that your productivity is 0.3 levels higher when you Drink coffee."
My own statistical analysis is explained below:

As shown above, the difference in productivity level between the two drinks was larger in Self-E's result than in my own data analysis. However, there is no perfect analysis method in this case due to the experiment's inherent variability, and I feel that my analysis and the app's analysis are each more accurate than the other in different situations.
In terms of personal "feeling", however, neither the Self-E result nor my own analysis indicated a difference in productivity levels large enough, or significant enough, to conclude that one drink increased my productivity. I personally do not consider a difference of 0.3 levels and a likelihood of 76% large enough to indicate that one drink resulted in higher productivity than the other.
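The write-up does not say which hypothesis test was used. One option that suits very small samples is an exact permutation test on the difference in mean daily productivity, sketched below. The function name `permutation_test` and the sample values are assumptions made for illustration, not the recorded data or the method actually used.

```python
from itertools import combinations
from statistics import mean

def permutation_test(group_a, group_b):
    """Exact two-sided permutation test on the difference in means.

    Returns (observed_difference, p_value), where p_value is the share of
    all relabelings of the pooled data whose absolute mean difference is
    at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = mean(group_a) - mean(group_b)
    extreme = 0
    total = 0
    # Enumerate every way of assigning n_a of the pooled days to group A.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        diff = mean(a) - mean(b)
        # Small tolerance guards against floating-point rounding.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
        total += 1
    return observed, extreme / total

# Hypothetical daily average productivity scores (placeholders only).
coffee_days = [3.0, 2.0, 4.0]
tea_days = [2.0, 3.0, 3.0]
diff, p_value = permutation_test(coffee_days, tea_days)
```

With only a handful of days per drink, the number of possible relabelings is small, so even a clear-looking difference cannot reach a very small p-value. That is one reason a longer experiment would give more conclusive results.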


There are several validity issues arising from this experiment that could have affected my results.
1. Rating my own productivity levels might have been rather subjective and arbitrary. There was also no set guideline on what counts as a productivity level of 0, 1, and so on. I might have rated two days of similar productivity with different values, and someone else might have rated my productivity differently than I did.
2. Rating my productivity level three times a day at 10am, 1pm, and 4pm might have caused missed information as well. There could have been times outside these time blocks where I was productive but unable to record it. Also, it was possible that the drinks impacted me the most right after I consumed them, so rating productivity throughout the day might not be as informative.
3. Coronavirus Chaos - This entire experiment coincided with the pre-spring break period, when I had to suddenly move out and go home. This significantly affected the consistency of my productivity. Iteration 2 also happened in the middle of spring break, when there were no hard deadlines for the tasks I had to do. This might also have been a factor in why it went better than the first iteration in terms of reduced carryover effects.

Part 2 - Thematic Analysis

I wasn't the only person to conduct a self experiment. In fact, 15 other people in my seminar did the same! There had to be some interesting patterns and conclusions to draw regarding self-experimentation. In this part, I worked with another student to look at the entire class's self experiments and conduct thematic analysis on various aspects of self-experimentation & the use of Self-E.
Our process is illustrated below:

Findings (Full report here)

Based on thematic analysis of these journals, we identified and summarized the following three aspects:
1. Changes between iterations: Having the opportunity to iterate let participants change their experiments, which naturally brought new challenges, including changing variables and randomization & bias minimization.
2. Experimental challenges: The first iteration of the self-experiment was intended to give participants practice with self-experimentation and to help them identify issues and challenges to fix in the second iteration. However, some issues remained, primarily around bias and validity, and new issues arose with the introduction of Self-E. These include self-reporting on constructed measurements, app bugs/usability issues, and doubt in the app's results.
3. Results and success interpretation: In the end, nine participants determined their interventions to be effective, and ten received Self-E results that aligned with their own calculations and/or perceptions. Factors that influenced these conclusions include discrepancies between their own and Self-E's statistical methods and perceptions. Moreover, all but one participant devised plans for future iterations to improve their own experiment.

A revised version of our findings was also incorporated into the "Customized Study Findings" section of this paper submitted to CHI 2021, which details the Self-E app & the practice of self experimentation.


This was an invaluable exercise for me to test something on myself with a special experiment of N=1, practice hypothesis testing, and get my first taste of thematic analysis. The self experiment was quite different from all previous experiments I have done, and the two iterations allowed me to learn some good practices in self experimental design, such as randomization and controlling for confounding factors. For future iterations, I wish to continue this experiment for a longer period of time to reach more conclusive results, conduct it during regular school time so that my schedules and tasks are more consistent with each other, and find other, more objective ways to assess productivity. Learning about and conducting thematic analysis on my classmates' experiences also yielded some useful insights. I was, in fact, a bit relieved to discover that many of us faced similar challenges, such as mitigating bias as much as possible and deciding between our own results & the Self-E app's results.