Skin Deep: Racial Bias In Wearable Tech

Technology influences the way we eat, sleep, exercise, and perform our daily routines. But what to do when we discover the technology we rely on is built on faulty methodology and legacy effects of racial bias? Is it simply a case of buyer beware? Or does addressing it become a moral issue? An ethical consideration? A scientific mandate? According to Peter Colvonen (né Colvin) ’01, it’s all of the above.

In May 2020, the United States was deep in the throes of the COVID-19 health crisis, with record deaths and global quarantine measures forcing mass populations to stay home and stay isolated to slow the burgeoning pandemic. On May 25, the country found itself further embroiled in a different kind of crisis, as the murder of George Floyd sparked a heightened focus on the Black Lives Matter movement and a call for greater anti-racism efforts took hold nationwide.

That same week, Peter Colvonen (né Colvin) ’01, a professor at UC San Diego and clinician at the VA hospital, was working on a consumer wearable sleep study and began noticing discrepancies in the data being collected: higher error rates and gaps in readings from certain subjects, all of whom were African American. Colvonen and his team decided to look into the phenomenon further and found that the common thread was the consumer wearable tech being used in the study. Despite copious evidence of accuracy issues for darker skin tones, hardly any research was being done, or action taken, to address the problem.

“I was looking at my data, noticing this gap, and at the same time seeing these big companies like Apple posting long essays in the wake of George Floyd on how diversity is important to them and how they back up individuals of color,” said Colvonen. “There was a mismatch between what they were putting out there publicly and what was actually happening behind the scenes in terms of their science.”

The Science Behind the Bias

Most fitness trackers on today’s market use green-light photoplethysmography (PPG) to monitor heart rate. These green light sensors (which can be seen on the back of many wearable fitness devices) continuously emit light into the wearer’s skin and track how much of it is absorbed or reflected back, charting the rise and fall of blood volume in the arteries to estimate heart rate and energy expenditure.
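
For readers who want to see the idea in miniature, here is a rough sketch of how a heart rate could be read off a pulse-shaped light signal by timing the gaps between peaks. It is an illustration only, not any manufacturer's algorithm: the function names, the 50-samples-per-second rate, and the clean sine wave standing in for sensor data are all assumptions made for the example.

```python
import math

def simulate_ppg(heart_rate_bpm, sample_rate_hz=50, duration_s=10):
    """Return a synthetic, noise-free pulse waveform (a hypothetical stand-in for sensor data)."""
    beat_hz = heart_rate_bpm / 60.0
    n = int(sample_rate_hz * duration_s)
    return [math.sin(2 * math.pi * beat_hz * i / sample_rate_hz) for i in range(n)]

def estimate_heart_rate(signal, sample_rate_hz=50):
    """Estimate beats per minute from the average spacing between pulse peaks."""
    mean = sum(signal) / len(signal)
    # A peak is a sample above the mean that is higher than its left
    # neighbor and at least as high as its right neighbor.
    peaks = [
        i for i in range(1, len(signal) - 1)
        if signal[i] > mean and signal[i - 1] < signal[i] >= signal[i + 1]
    ]
    if len(peaks) < 2:
        return None  # too few beats detected to estimate a rate
    intervals = [(b - a) / sample_rate_hz for a, b in zip(peaks, peaks[1:])]
    return 60.0 / (sum(intervals) / len(intervals))

estimated = estimate_heart_rate(simulate_ppg(72))
```

Real sensor data is far noisier than this clean waveform. When too little light makes it back to the sensor, the peaks become hard to find, and a routine like this one returns a wrong rate or none at all.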

Green light PPG sensors are common in consumer devices because they are cheaper and less prone to motion errors than the infrared light used in hospital-grade trackers. But green light has a shorter wavelength, meaning it is more readily absorbed by melanin, making it harder to get an accurate reading (or any reading) in individuals with darker skin tones (i.e., more melanin).

So how did green light technology become the most prevalent in the consumer wearables industry if the science used in development could have revealed the discrepancies? Colvonen sees a threefold answer:

“Number one is the consumer model. Big companies are taking the quickest and easiest technology to market. There are no studies, no examination other than ‘Is it readable? Is it cheap?’ Then get it to market,” he explains. “Number two is when they do start running data on it, they’re using the sample they have access to, which for tech is going to often be in Silicon Valley. So you’re getting a heavily white, Asian-influenced sample; you’re not getting the full diversity of American culture. And lastly, it’s a question of how you classify skin tone. Historically, scientists have been using the Fitzpatrick Skin Type Scale, which has just six categories of skin tone and which a colleague once compared to the old paper Brown Bag scale from back in the day.”

Originally developed in 1975 to measure skin’s reactivity to the sun (e.g., tanning), the Fitzpatrick Skin Type Scale (FSTS) is often used in scientific research for an entirely different purpose: as a measure of skin color. This can be particularly problematic because it is a subjective scale with individuals classified into only six different skin tone categories based on perceived skin color, rather than an objective measure of how light actually bounces off of the skin.

Photo by WESTEND61

While it is hard to imagine how a subjective scale developed nearly 50 years ago could still be used today to validate the efficacy of current consumer technology, Colvonen sees it as less a function of willful ignorance and more just a matter of following a flawed status quo.

“You’re not really allowed to make revolutionary shifts in research,” Colvonen explained. “I have to do what the people before me did and make a small change. The researchers who use the FSTS or other subjective validation practices are not racist. They aren’t out to keep inequality locked in. They’re using the best tools that the previous generation used ahead of them without thinking about it. It’s unconscious biases continuing to advance or getting locked into algorithms without awareness. So, we have to shine some light and say, Hey, as scientists we need to shift this.”

Shining a Light

Colvonen and his teammates set to work researching studies on green light technology in consumer wearable trackers, looking for steps to change the prevailing science behind it. They put out an editorial in the September 2020 issue of Sleep, challenging the health care community to reexamine the methodology in their scientific processes and to “unveil and dismantle systemic bias . . . to ensure that digital health solutions do not reinforce existing disparities in care and access.”

While the limitations of wearable technology are most visible in the consumer market, its applications beyond personal fitness can have unintended impacts. As Ruth Hailu points out in her July 2019 article in STAT, many companies use fitness trackers to provide financial incentives and benefits for employees; if the trackers aren’t recording data equally for all involved, those incentives become less accessible to certain members of the population. Even more troubling is the transition toward using remote and wearable devices in the actual health care industry. The rise of COVID has put greater emphasis on remote care options such as telehealth and virtual doctor’s visits, with wearable devices aiding individuals in managing and reporting their own health status. And as these wellness trackers transition from consumer goods to medical-grade devices, their algorithms could get locked in as FDA-approved, a move that has dire implications, including problems of access to proper health care and the perpetuation of an existing system based on historical biases.

“Let’s say you have a heart arrhythmia or a heart murmur, the fitness trackers are way more likely to accurately capture your heart murmur if you’re white than if you’re Black. You’re way more likely to know whether you have hypoxia (low oxygen in the blood) if you’re white, than if you’re Black,” says Colvonen. “These are real medical concerns that people are using to then go to their doctor to get help. If you’re trusting your device to capture that for you, you may be falling victim to and continuing the inequalities that are already baked into the science from a history of biased practices.”

A Leap of Faith

Colvonen recognizes that as a white man, this particular issue doesn’t affect him directly, but in examining his role in effecting change, he hearkens back to his studies at Wesleyan:

“When I was in the College of Social Studies, we talked about what it meant to be part of a nation state. What it means to be a part of something that we’re all in together, and what’s my responsibility as an individual? For me, it’s about equality and about bettering myself. It’s about sitting uncomfortably with what role I may have played in keeping inequality going. We see it all over the place with BLM, police brutality, with access to care. I’m just seeing it in this one area, and if I can shine a light on it, I have a moral responsibility to do so.”

It will take time to determine whether the work of Colvonen and of other researchers and journalists trying to raise the alarm about racial bias in consumer tech development will actually result in meaningful change. As Tony Hatch, chair and associate professor of Science in Society at Wesleyan, asserts, this is just one area of a very complex and expansive system of bias that must be examined.

“These technologies are good examples of what Ruha Benjamin calls ‘discriminatory design,’” Hatch says. “Though technoscientific approaches like data collection are presented as a method of ending inequality, they often work to expand it by shifting the blame of racial disparities away from human beings to algorithms themselves. As Benjamin argues, by imagining the data collected by wearable technology as objective and free from bias so long as it can accurately detect darker skin tones, we risk reducing anti-racism to making a simple technology upgrade and leave intact the social and economic structures at the root of racism.

“We need to go beyond the data to address the fundamental and structural causes of racial health disparities.”

With Assistant Professor Mitali Thakor, Hatch plans to examine and further pursue this type of work in the new Science in Society Black Box Labs undergraduate research program, to be launched this fall at Wesleyan.

In the meantime, Colvonen has revised his own research practices and continues to urge his colleagues and community to take up the charge and move the work forward in whatever way speaks to them.

“It’s hard for me to see the ‘arc of the moral universe’ bending in the right direction. I don’t see these companies spontaneously integrating diversity and running the validation studies necessary to close the health care gap,” he admits. “However, I see we can influence our circle and our community, and that is powerful. I work hard to pass on values to my kids so future generations are morally driven to continue calling out disparities. I can write articles calling out what I see and hope that trickles up to the right ears and eyes. I can run my research and labs integrating diversity and equality and hope my students can continue that in their careers. To hope, I suppose, is an act of faith. Take action in the present and have faith it grows and influences in ways you couldn’t envision.”

Photo at top by John Fedele

* * *

Sidebar: Making Science More Objective

Consumer wearables (Apple Watches, Fitbits, Samsung Galaxy Watches, Garmin trackers, WHOOP) are a staple for the health-conscious and health-challenged, with individuals using these devices to meticulously count, track, measure, analyze, curate, and report back myriad nuggets of real-time data in colorful and visually appealing ways. While it’s a phenomenon that purports to put control of health and wellness almost literally in the hands of the wearer, for a large portion of the estimated 40 million people in the United States who use fitness trackers, the “data” they’re using to monitor their health may not be accurate.

In a September 2020 editorial published with colleagues in the medical journal Sleep and in a solo article published in February 2021 in the journal npj Digital Medicine, Peter Colvonen ’01 outlines simple steps for making the science behind the technology more equitable.

For the consumer wearable tech market:

  • Use multiple light streams (green, red, etc.) to get a better sense of activity across different spectrums of wavelengths.
  • Validate the research against multiple skin tones (using a diverse sample).
  • Be honest. Implement a disclaimer model about where the product does and does not work, similar to those found on pharmaceutical packages.
  • Fully integrate a diversity team into the research, marketing, and validation to highlight blind spots in company technology advancements.
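
To make the validation step concrete, here is one way a team might compare a wearable's readings against a reference monitor and report the results separately for each skin tone group, rather than as a single overall average. This is a hedged sketch, not a method taken from Colvonen's papers: the function name, the record layout, and the group labels are hypothetical, and the numbers are invented solely to show the shape of the report.

```python
from collections import defaultdict

def error_by_group(records):
    """Summarize device accuracy per group.

    Each record is (group, reference_bpm, device_bpm); device_bpm is None
    when the wearable returned no reading at all.
    """
    grouped = defaultdict(list)
    for group, reference, device in records:
        grouped[group].append((reference, device))

    summary = {}
    for group, pairs in grouped.items():
        errors = [abs(dev - ref) for ref, dev in pairs if dev is not None]
        missing = sum(1 for _, dev in pairs if dev is None)
        summary[group] = {
            "n": len(pairs),
            "missing_rate": missing / len(pairs),
            "mean_abs_error": sum(errors) / len(errors) if errors else None,
        }
    return summary

# Made-up numbers, purely illustrative; this is not study data.
records = [
    ("group_a", 70, 71), ("group_a", 82, 80), ("group_a", 65, 65),
    ("group_b", 70, 76), ("group_b", 82, None), ("group_b", 65, 60),
]
summary = error_by_group(records)
```

Reporting missed readings alongside the error matters here, because a device that simply fails to return a value for some wearers can otherwise look accurate on the readings it does produce.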

For the scientific community, Colvonen and his cohort’s call to action is much more prescriptive and wide-reaching:

  • Work directly with wearable companies to improve effectiveness and consumer reach to support people of color.
  • Urge them to improve their technology.
  • Decrease use of the subjective Fitzpatrick Skin Type Scale and increase use of objective standards of skin tone such as colorimeters.
  • Hold the research community accountable for addressing and reporting bias.
  • Make sure people of varying skin tones are included in validation and effectiveness research.
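
As an example of what an objective standard can look like, a colorimeter reports skin color as numbers in the CIELAB color space, and one measure commonly derived from those numbers is the Individual Typology Angle (ITA). The article's sources name colorimeters but not this particular measure, so its use here is an assumption made for illustration; the function name is likewise hypothetical.

```python
import math

def individual_typology_angle(l_star, b_star):
    """Return the ITA in degrees from CIELAB lightness (L*) and yellow-blue (b*) values.

    The usual definition is arctan((L* - 50) / b*) expressed in degrees.
    atan2 gives the same result for the positive b* values typical of skin
    and avoids dividing by zero when b* is 0.
    """
    return math.degrees(math.atan2(l_star - 50.0, b_star))

# Higher angles correspond to lighter skin, lower angles to darker skin.
lighter = individual_typology_angle(65.0, 15.0)
darker = individual_typology_angle(35.0, 15.0)
```

Unlike a six-category scale assigned by eye, the result is a continuous number computed from a measurement, so two researchers with the same instrument reading will arrive at the same value.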

Together, these shifts in methodology can help to address some of the racial biases in the science behind wearable tech and some of the disparities present in existing access to accurate and equitable health care for certain segments of the population. As stated in their paper: “Technological advancements are muted if their inherent biases continue historical structural health disparities.”

Managing Editor, Wesleyan University Magazine