Finnish news media outlets have been experimenting with bots that convert ice hockey scores into news copy, but have struggled with the results. One key issue is that when the bot is given creative freedom, the players invariably end up dead. The bots have been trained on Finnish news texts, and evidently the training material has a deterministic bias to finish off its subjects, even in the context of division 1 hockey. This leaves the media outlets with the option of scrapping their AI/bot ventures, which some have already done, or reducing the operation to a more template-based approach, which effectively removes both the artificial and the intelligence from the equation. The crux of the issue lies in the limited amount of Finnish training material and in the NLP algorithm itself.
ChatGPT was launched to the public at the end of November, and has demonstrably been able to keep its hockey players alive and sweating. It is among the most sophisticated Large Language Models (LLMs) available to the public to date, and its training material covers a large share of the public internet. What does this mean for us?
Well, first of all, as humankind, we’re terrible at writing. An infinitesimally small fraction of the written text created every day is worth the digital ink it is proverbially printed on. We have a mix of professions whose only raison d’être is to put words into a correct sequence to pass on a message, elicit a response, impact the reader. Where would we be without authors, editors, journalists, and copywriters? Writing in an engaging manner and flexing your vocabulary is also a sign of status and pedigree, especially in the Anglo-Saxon world. It is a method of signaling to the reader that you have had the opportunity to better yourself in the craft, and the luxury of time to compose the long-winded, albeit entertaining, laundry detergent sales letter that landed in your client’s inbox. Written words also convey meaning about the writer.
Damn you autocorrect
We’re no strangers to having machines assist us in our literary tasks. The majority of email programs offer both autocorrect and autofill assistants that help us in composing the 300+ billion messages that are sent every day. They finish our sentences, correct typos and gently remind us to actually attach the document that our message is referring to. Sometimes they also misunderstand the context, suggesting (and correcting, faster than we can react) words that are entirely inappropriate.
Is GPT3(.5) different?
Yes. Kind of.
Like our helpful email assistant, GPT3 is good at guessing the correct order of words in relation to the intention of its user. The fundamental shift for us, the general public, in ChatGPT was that the chosen interface was chat. We now have the possibility to interact with an LLM using a method that is familiar to practically anyone who uses the internet. Mimicking a real conversation, we can ask GPT3 to create different literary outputs for our use and our amusement. Unlike many earlier chatbots, ChatGPT is not stateless. As long as the chat session is open, we can refer to earlier inputs and outputs (within reason) and ask it to make changes according to new prompts.
GPT3 is also uncannily good at what it does. When prompted, it conjures up poems, stories, business plans, hypothetical dialogues between two historical figures, and so on. Its ability to produce quality creative content with such ease is unparalleled. A prompt to GPT3 can be a few hasty bullet points that turn into Shakespearean sonnets, or even a finished piece of text that it will comment on and iterate on when asked to do so. GPT3 has also proven its abilities in generating quite passable code in several programming languages, bridging the divide between no-code/low-code and ‘real’ programming, most likely for good.
We’ve established that GPT3 can produce different forms of creative content effortlessly and, at least at this point, at zero cost to the end user. Is this an issue? And to whom?
I’ll approach it from a few different angles.
Is this the real life
GPT3’s literary prowess masks the unsettling truth that it actually knows nothing. Up to a point, one can argue that e.g. Google’s search algorithms actually know what is true and what is not, and also that expressing these truths is in the algorithm’s interests. GPT3, on the other hand, is absolutely and unequivocally unconcerned with the whole question. Its training material does give it a significant tilt towards producing text that embeds logic, accuracy and well-established truths, but in essence it remains apathetic to the subject. Where previous news bots awarded every goal scorer a hasty death on the ice rink, GPT3 will give a lush set of hair to a bald player and make a 5′6″ forward tower over a defender twice his size, to the thunderous applause of 50 people in an empty stadium. If you are interested in truth as it equates to an empirically perceived world around us, GPT3 is not the tool of choice.
It is also noteworthy that, as GPT3 is the sum of its training material, it regurgitates and amplifies the biases on gender, race and sexuality that the internet is rife with. GPT3 produces English text that most likely resembles something a white heterosexual cis male might conjure up. Most likely the model cannot correct itself in this regard, so the onus falls on the user to be aware of these blatant limitations.
Is this just fantasy
As GPT3 is capable of producing excellent creative text, it is very difficult for a reader to distinguish GPT3’s outputs from something a human would write. Is this a problem? I argue it is not a universal problem that we should pay too much heed to. The possible problems are highly contextual. If the text you produce is used, for example, to measure and grade something you should have learned and/or your skills in synthesizing several themes, using GPT3 is very problematic. The 1,500-word college essay has now been dead for a couple of weeks.
In journalistic contexts, the news media in Finland has decided, in accordance with the self-governing authority JSN (the Council for Mass Media in Finland), to clearly mark text created by a bot as such. No clear harm has been identified in publishing text without this label, but it is quite human, pun intended, to take a careful approach to the subject at this point in time.
That being said, we are constantly bombarded with copy attributed to a single writer that we know very well has gone through careful polishing and nuancing by half a dozen people prior to publishing. We accept this practice as industry standard, for example in corporate communications. Texts, regardless of genre, are seldom an individual effort and always, at minimum, draw inspiration from the world around us, whether we notice it or not. The question of authorship and provenance of a text is important as such, but in terms of using GPT3, it is once again not a generally applicable problem.
Caught in a landslide
So, is there a line to be drawn between the human-AI interface and its use cases? We’ve established that GPT3 is apathetic to the question of truth and simultaneously excellent at creating infinite amounts of content that come off as believable at first glance. GPT3 and its later iterations will impact any profession where putting words in a meaningful sequence is part of the job. This is inevitable, and it’s probably best we just accept it.
But this does not mean that GPT3 and its kind should not be the focus of scrutiny, debate and regulation where it is needed. We’ve identified that LLMs’ value and possible hazards both relate to the context of their use and to the models’ training material. These are good starting points for further inspection.
GPT3 can produce contextually relevant output effortlessly. Using GPT3, for example, to produce site-specific propaganda at scale, where the model is programmed to take part in e.g. Twitter discussions with an ever so slight tilt towards a political leaning, is a future that is most likely already here. The question of us trying to identify and label LLM-created text is very relevant when it is used to deceive and/or subversively affect us.
Contextually relevant output will also create new efficiency drivers in a lot of professions, where at minimum GPT3 and its successors will become good assistants. In terms of the efficiency it will create in the workplace, the need to identify LLM outputs is once again a very context-driven issue. Regarding LLMs’ impact on our daily grind, I foresee that we’ll most likely overestimate the short-term outcomes and underestimate the long-term consequences.
No escape from reality
All models are wrong, but some are useful (George Box, 1976). GPT3 offers us a glimpse into collective humankind as it is represented on the internet. The picture is distorted by manifold biases and gaps in the training material, and by the model itself, which is, at least at this point, opaque to a layman. GPT3 is also a continuation of a variety of Silicon Valley technological innovations that will most likely impact our social fabric in ways that are at this point still difficult to identify. Should we just accept our new robot overlords, or is there still agency at play?
Sure there is. The fact that LLMs produce outputs that are difficult to discern from something a human can create is not inherently a negative or a positive thing. As we’ve already established, it is how and where they are used that anchors the technology into academic, political, sociological and economic realms, to name a few. If previously content was king, now it’s context. Context is something that we have tools to impact and at best control, through a variety of policies and decisions made by people, for the people.
Open your eyes
Context is also personal. I decided to write this ‘au naturel’ because I enjoy writing. I’m not a professional writer and I’m fully aware that using an LLM would most likely have helped me tremendously in creating a wall of text to work on. Instead, I participated in a great in-house workshop on this theme, had calls with old acquaintances and several discussions by our ever-important workplace coffee machine. Our focus on LLMs’ impact is results-oriented and serves a purpose, but let’s not lose sight of the human process of creating something new. This process is also something we invite our clients and peers to participate in.
Topi Ahava is an Account Director and a consultant at Solita. His work focuses on helping Solita’s clients make better, more informed, strategic decisions with the help of meaningful data in all its forms. When writing about complex subjects, Topi listens to Queen’s Bohemian Rhapsody on repeat.