Both studies show that it’s possible to decode brain signals into speech at a rate of about 60 to 70 words per minute. Image: Stanford
The Stanford study involved implanting electrodes in two speech-related regions of the brain of a patient with amyotrophic lateral sclerosis (ALS). The BCI recorded neural activity as the patient attempted to speak, and an algorithm learned to match specific patterns of that activity to phonemes, the building blocks of speech. To train the algorithm, the patient spoke or silently mouthed sample sentences over 25 sessions of roughly four hours each.
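As a rough illustration of that decoding step, here is a minimal sketch in Python. It is not the decoder used in the study (work of this kind typically relies on recurrent neural networks plus a language model); the phoneme set, channel count, and data below are entirely synthetic, and the sketch only shows the general idea of classifying windows of neural features into phonemes.

```python
# Toy illustration of phoneme decoding from neural features.
# NOT the actual decoder from either study; the labels, channel count,
# and data are synthetic, and the classifier is deliberately simple.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

PHONEMES = ["AH", "B", "K", "S", "T"]   # tiny hypothetical label set
N_CHANNELS = 128                        # hypothetical per-window feature count

# Synthetic "training sessions": each phoneme gets a characteristic
# activity pattern plus trial-to-trial noise.
prototypes = rng.normal(size=(len(PHONEMES), N_CHANNELS))
X_train = np.vstack([prototypes[i] + 0.5 * rng.normal(size=(200, N_CHANNELS))
                     for i in range(len(PHONEMES))])
y_train = np.repeat(np.arange(len(PHONEMES)), 200)

decoder = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Decode a new window of neural activity into a phoneme guess.
new_window = prototypes[2] + 0.5 * rng.normal(size=N_CHANNELS)
predicted = PHONEMES[decoder.predict(new_window.reshape(1, -1))[0]]
print("decoded phoneme:", predicted)
```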
Meanwhile, the UC San Francisco and UC Berkeley collaboration surgically placed an ultrathin sheet of 253 electrodes onto the brain of a patient left severely paralyzed by a brainstem stroke. Following a methodology similar to Stanford's, the patient trained the algorithm by attempting to speak, allowing it to recognize the brain signals corresponding to distinct phonemes. Those signals were then rendered as virtual facial expressions and synthesized speech on a digital avatar.
Despite differences in the two protocols, the results were strikingly similar in accuracy and speed. The Stanford study reported an error rate of 9.1 percent for a 50-word vocabulary, rising to 23.8 percent for a 125,000-word vocabulary. After roughly four months, the algorithm was converting brain signals to words at about 68 words per minute. The UC San Francisco and Berkeley team achieved a median decoding rate of 78 wpm, with an 8.2 percent error rate for a 119-word vocabulary and a 25 percent error rate for a 1,024-word vocabulary.
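For context, figures like these are word error rates. The sketch below shows the conventional way such a rate is computed: the minimum number of word substitutions, insertions, and deletions needed to turn the decoded sentence into the intended one, divided by the length of the intended sentence. The exact scoring procedures in each study may differ, and the example sentences here are made up.

```python
# Standard word-error-rate calculation via edit distance over words.
def word_error_rate(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming table of edit distances between prefixes.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution / match
    return d[len(ref)][len(hyp)] / len(ref)

intended = "i would like a glass of water please"
decoded  = "i would like glass of water place"
print(f"{word_error_rate(intended, decoded):.1%}")  # 25.0% (2 errors / 8 words)
```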
An error rate of 23 to 25 percent is not yet good enough for everyday use, but it is a significant advance over current technologies. Edward Chang, chair of neurological surgery at UCSF and co-author of the UCSF study, put the progress in context: effective communication with today's assistive technology is a laborious five to 15 wpm, compared with the 150 to 250 wpm of natural speech.
Chang underlined the significance: "Reaching 60 to 70
wpm is an important milestone, emerging from two distinct centers and
approaches."
The UCSF study involved a BCI translating brain signals into facial expressions and modulated speech on a digital avatar. Image: UCSF
Still, these studies are primarily a proof of concept rather than a technology ready for mainstream use. One challenge is the lengthy training sessions required to calibrate the algorithms, though both research teams expressed optimism that training could be made less demanding in the future.
Frank Willett, a research scientist at the Howard Hughes Medical Institute and co-author of the Stanford study, emphasized that the work is still at an early stage and will need much more data: "As we amass more recordings and data, we should be able to apply the algorithmic insights from one individual to others." Willett cautioned, however, that this still needs to be investigated and confirmed.
Usability is another concern. The technology must be simple enough for at-home use without requiring caregivers to undergo elaborate training. The brain implants are invasive, and in both studies the BCI had to be wired to a device affixed to the skull and then connected to a computer. There are also open questions about electrode degradation and the long-term durability of these systems. Reaching consumer readiness will require rigorous testing, a process that takes time and resources.
More research is needed to see whether a wireless version of this technology is feasible. Image: Noah Berger, UCSF
Moreover, both studies involved patients who retain some mobility. In advanced neurological conditions such as late-stage ALS, a person can end up in "locked-in syndrome," retaining cognition, sight, and hearing but able to communicate only through minimal movements such as blinking. Whether the technology can serve people in that state requires further exploration.
Looking ahead, Chang emphasized both the achievement and the work that remains: "We've reached a performance threshold that excites us, particularly because it transcends usability constraints. We are giving serious consideration to what lies ahead, pondering the immense potential if this technology can be securely and widely adopted."