Will Google’s Controversial LaMDA Help or Hinder Internet Discovery?

In his online Masterclass on the art of writing, renowned journalist Malcolm Gladwell explains the shortcomings of Google when it comes to research and discovery. “The very thing that makes you love Google is why Google is not that useful”, he chirps. To Gladwell, a Google search is but a dead-end when a true researcher wants to be led “somewhere new and unexpected”.

In juxtaposition to Google’s search engine stands ye olde library, which Gladwell calls the “physical version of the internet” (sans some of the more sophisticated smut…). In a library, guidance is on hand in the form of a librarian should it be required, and unlike the internet, there is a delightful order to things that the writer likens to a good conversation. Discovery can be as simple as finding what books surround the book that inspired you…and following the trail. Gladwell elucidates: “The book that’s right next to the book is the book that’s most like it, and then the book that’s right next to that one is a little bit different, and by the time you get ten books away you’re getting into a book that’s in the same general area but even more different.”

There is something altogether more natural and relational about uncovering the new, and the forgotten, in the context of a library or a conversation. Hidden gems lie undisturbed, unlike popularity-ranked internet search results that spew out the obvious and the familiar.

Enter LaMDA AI.

LaMDA is a new experimental language model from Google that the company says will “engage in a free-flowing way about a seemingly endless number of topics”, much like natural dialogue. Internet research needn’t be about endless lists of links; it could be more akin to the library experience of semi-incidental discovery.

Unlike most language models (e.g. GPT-3), LaMDA, short for “Language Model for Dialogue Applications”, is trained on real dialogue, which has apparently allowed it to pick up on “several of the nuances that distinguish open-ended conversation from other forms of language”. One of these is called “sensibleness” (feels like it should be called appropriateness…), and just means that the system gives responses that make sense in the context, as opposed to some of the complete garbage we’ve become accustomed to hearing out of Alexa and Siri.

Apparently Google is also looking into dimensions like “interestingness” — whether something is insightful, unexpected or witty — but as you can tell from the demo, they’ve clearly not gotten there just yet…

LaMDA AI still sounds a little basic, but the spinners say that this could make search and discovery more efficient, interesting and ultimately more enlightening. MIT Technology Review report that Google plans to “integrate LaMDA into its main search portal, its voice assistant, and Workplace, its collection of cloud-based work software that includes Gmail, Docs, and Drive. But the eventual goal, said Pichai, is to create a conversational interface that allows people to retrieve any kind of information—text, visual, audio—across all Google’s products just by asking.”

So, this could very well be the future of search. But before we all commence book-burning, we should address the extremely large elephant in the room. The excitement around this product has been very much tempered by the backlash led by Google’s own former researchers, some of whom were ruthlessly dismissed for trying to flag the dangers of models like LaMDA.

Large language models (LLMs) are trained on an enormous corpus of human-generated data and are known to spout racist and abusive language, regularly confuse fact with fiction, and encode and perpetuate the harmful stereotypes buried within the insane amount of information required to train them (the entirety of English-language Wikipedia made up around 0.6% of the data used to train the GPT-3 LLM).

Google are adamant that they’re busy “vetting” the language used to train LaMDA, but one of the co-authors of the controversial paper that got the researchers fired, Emily Bender, told The Verge that there are no details available, adding: “I’d very much like to know about the vetting process (or lack thereof).”

Of course, the horrors that have emanated from LLMs also come about because these systems do not actually understand a single word of the information they imbibe. They just lock onto patterns in real dialogue and learn what the building blocks are à la Searle’s Chinese Room. Sometimes this means regurgitating some pretty unpleasant stuff by accident, so Google will need to be clear and specific about how they’re mitigating the problematic scenarios these technologies bring along with them — and there will be pressure to go beyond the rather wet promises about “responsibility first” and “AI principles”.

But let’s say a perfect version of LaMDA could exist, without all of the nasty “niggles” we’ve learned of. Could this be the kind of information discovery tool that would please researchers and other library-loving Gladwell-esque types?

Sadly, the answer is probably “no”. Though the conversation with LaMDA may feel free-flowing, it will still adhere to pre-determined principles, which means it’s unlikely to throw out any lesser-known factoids or references to barely-read texts. Google knows that people asking about “Paris” tend to want to know basic facts or access popular travel tips. They generally don’t want to be confronted with obscure references to previous residents or historical weather phenomena. LaMDA will still have to filter.

Interestingly, Google bods do seem to be insisting that the kind of truly aimless meandering that comes with natural conversation will be possible and desirable. Product manager Eli Collins tells us that LaMDA gets that a dialogue’s “open-ended nature means they can start in one place and end up somewhere completely different. A chat with a friend about a TV show could evolve into a discussion about the country where the show was filmed before settling on a debate about that country’s best regional cuisine.”

But even if this is right, it just leads us to ask: why? Not to be facetious, but why do I need to have a meandering conversation with Google? We’re accustomed to asking a search engine for specific material and receiving a number of related links. Often they’re perfect, but occasionally they’re wrong or a bad fit or just unhelpful. Wouldn’t an improvement involve LaMDA acting as a librarian and helping us to find what we do want? Acting as a co-navigator of the internet library in our quest to uncover some gems? We certainly don’t need a librarian that walks us through random departments that are unrelated to our original prompt…

Yes, humans might have meandering social conversations with friends, but we approach Google with a purpose and not looking for a companion to shoot the breeze with.

This raises the question of whether this system, as it has been described, is superior to the current one, at least when it comes to research. Again, the instinctive (and admittedly very early) answer has to be “no”. Google results may be ranked in a certain way but, at least theoretically, we can click right through the many, many pages of results to find the new and the interesting. On the other hand, a conversational agent necessarily filters its responses based on unknown factors. We are served Google’s selected responses, and even if these are unfiltered, meandering and unique to our conversation, it still means our choices are necessarily curtailed by the choices of the system.

The above reaction has to be caveated with an acknowledgement that this is very early days for LaMDA and Google staffers are no doubt hard at work thinking through potential uses, as well as perfecting the technology itself. However, conversational agents can sometimes feel like a solution in search of a problem. If systems like LaMDA are being billed as “next generation” then it’s still unclear what the value add actually is. Perhaps we can find out more about Pluto or paper planes in a hands-free fashion but, acknowledging that search queries are often chasing much more than pat answers to general knowledge questions, it’s difficult to imagine this clunky, unknowing system becoming a valuable new tool for research.*

*The author reserves the right to change her mind.
