Allen Institute’s Grover and the funeral pyre of text generation

by Volodymyr Bilyk


If you have been following technology news for some time, especially news about artificial intelligence and machine learning, you may have spotted a strange tendency that underlies every new project or breakthrough in the field.


It is not about doing something that goes beyond human comprehension, and it is not about making something inspiring, astounding, or simply distinct. No, it is mostly about doing the opposite – the generic, blend-in, nothing-special, middle-of-the-road, template-based kind of stuff that can pass as human-made because it is so unassuming, or simply because it looks like something an average human would do. It is lazy. And we have not even started talking about the whole DeepFakeNews shebang.


This peculiar tendency leaves a strange aftertaste, one that can be described with the word “meh”. And it seems like that really is the whole point of it all, because otherwise it must be a prolonged episode of mass delusion.


Case in point – Grover by the Allen Institute.

The Allen Institute is one of the big guns of artificial intelligence and natural language processing. Its AllenNLP is one of the most efficient tools for developing NLP models of all kinds. It is a Swiss Army knife of NLP that brings the technology to every corner of business or scientific operation.

So – a couple of days ago the Allen Institute presented Grover. It is a natural language generation tool able to create texts that are very much like the ones written by humans. Or, to be exact, by human copywriters. Just like regular SEO-enlightened copywriters, Grover creates articles that seem legit. Legit in the sense that you can actually read them and think you’ve just wasted five minutes of your life. And that’s about it. Yay!

Grover is a showcase of how far NLP technology has come over the last couple of years.

With all the text analysis tools, pattern recognition, and recurrent neural networks, you can dig deep into the thick of a text, understand its structure and how its elements connect with each other, and then do your own thing based on that template. The result will be mostly comprehensible, to the point that the output can be indistinguishable from human-made text.

Grover shows how easy it is to make this kind of article. It is fascinating how good it is at producing these bland, generic, middle-of-the-road texts that don’t really say anything.

I tried the topic “Why Chael Sonnen is so good at talking and so bad at fighting?” and got exactly what I would write if I didn’t really care about the topic and just phoned it in with a thermonuclear impact. It’s impressive and concerning.

The availability of tools like Grover poses a serious threat to the mass media and its consumers through the systematic spread of fake news on a huge scale from all possible directions (social media, comment sections, faux news resources, reposts and rewrites, content aggregation). Just think about manipulating public opinion on climate change or abortion with an avalanche of misleading content. But that’s political stuff.

There is also another problem – information noise, AKA useless stuff.

The name of the perpetrator is SEO. It’s not a big secret that search engine optimization downright sodomized the very concept of writing. To the point where it was boiled down to wonky guidelines that turned writing into tossing Lego blocks around to fit the requirements. SEO copywriting made writing a boring and generic chore. It is like socialist realist literature with a new coat of paint. That, too, depended wholly on interchangeable templates and was mind-numbingly unreadable garbage that just occupied space.
But it works. Search engines dig it. It is cheaper than actually paying to promote your content.
This spawned an innumerable amount of trash content with little to no value, and also birthed the clickbait mindset driven by trends rather than actual necessity. Do we really need that many “All you need to know about Cyberpunk 2077” articles?

Add some AI to the mix and you get search engines flooded with actual spam content that perfectly fits the criteria, is not all that blatantly spammy, and wholly overtakes the narrative, leaving no place for anything else. Yay! Kinda like what we have today, except instead of going ten pages down the search results to find anything worthwhile, you will go twenty-five or fifty.

But I’m used to digging through garbage, so it doesn’t really scare me that much. What makes me sad is the waste of talent on this kind of stuff. It seems actual creativity and NLP don’t like each other these days.

One of the fun things about natural language generation of the past was its problematic relationship with the concept of “sense”.
The majority of old-time text generators just could not pull it off clean. There was always something off about their output. Sometimes it was slight, stupid and kinda cute, other times their propensity towards hallowed nonsense was legitimately impressive. There was always something truly unexpected.
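
A word-level Markov chain is about the simplest generator of this kind, and it is easy to see where the nonsense comes from. Below is a minimal sketch, not the code of any particular generator on the web: the function names `build_chain` and `generate`, the `order` parameter, and the toy corpus are all made up for this illustration. The generator only remembers which words followed which, so every local step looks plausible while the whole sentence wanders off.

```python
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map each run of `order` words to the list of words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)


def generate(chain, length=20, seed=None):
    """Walk the chain at random, starting from a randomly chosen state."""
    rng = random.Random(seed)
    state = rng.choice(sorted(chain))
    output = list(state)
    while len(output) < length:
        followers = chain.get(state)
        if not followers:
            # Dead end: this state was never followed by anything in the corpus.
            break
        output.append(rng.choice(followers))
        # Slide the window: the new state is the last `order` words produced.
        state = tuple(output[-len(state):])
    return " ".join(output)


# Toy corpus, purely illustrative.
corpus = (
    "the cat sat on the mat and the dog sat on the cat "
    "and the mat sat on the dog"
)
chain = build_chain(corpus)
print(generate(chain, length=12, seed=1))
```

Feed it a larger and stranger corpus, or mix two unrelated ones, and the output gets correspondingly stranger. Raising `order` makes the text more coherent and less surprising, which is the same trade-off this article is complaining about.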

Those Markov chain text generators you can find on the web are capable of running nuclear weapon testing in your head if used “properly”. Modern text generators like Grover and their sophisticated algorithms don’t do that. They are too smart to have this kind of fun. It is not part of their design – they are built to do business, to produce boring things that blend seamlessly into the background. We can do better than that.




