I was approached today by a Science Writer in the US, writing an article on computer generated poetry for Motherboard. She asked me for comments on the FIGURE8 approach by Sarah Harman, presented at this year’s International Conference on Computational Creativity. This is great work, as evidenced by it being nominated for the best paper award. I have pretty strong views about computer generated poetry, which come out in my response…
It’s great that you’re writing about Computational Creativity for Motherboard. The FIGURE8 approach by Sarah is solid work, and I was pleased to see it presented at ICCC 2015 this year (I was general chair for that conference). It fully deserved being nominated for the best paper prize.
However, to my mind, it isn’t particularly groundbreaking in the Computational Creativity sense. We already know that software can produce interesting linguistic artefacts with decent regularity. I say “interesting” here in the sense that if a person produced them, then they would be assessed favourably. And therein lies the problem that almost everyone in computer generated poetry is missing…
I have a particularly hardcore view about computer generated poetry. I have some practical experience of this, as described in this paper:
and I’m now leading a natural language generation project called WHIM, where we’re building The WhatIf Machine to perform fictional ideation:
The first paragraph of the above paper is very important indeed in computer generated poetry, I believe. To summarise, poetry is a way by which people connect – it’s a very human thing, indeed it can be seen as a celebration of humanity, to a certain extent. In that context, we have to ask ourselves: what is the point of computer generated poetry? I’m very much against silly Turing-style tests where people are put in the situation of getting confused about the origins (computational or human) of artefacts, and then the origins are revealed. I gave this example (which I got from Alison Pease – a close colleague) in a talk in London a couple of weeks ago: Imagine you have just read a poem about childbirth, and you made a real connection with the poet. You assume it’s written by a woman, but then you’re told it was written by a man. Your perception immediately changes. If you’re then told that the man was a convicted mass murderer, your perception of the poem changes again, probably irrevocably. But the words in the poem haven’t changed one bit! There is so much more to a poem than just the words on the paper – researchers tend to completely miss this. A computer generated poem is very distinct from a human-authored one, even if it looks very much like a normal poem. We could go on for the rest of our lives hiding the originator of poems, but every time the reveal is made, and readers realise that they’ve not made the human connection they thought they had, a small bit of me dies (and we look very naive in science).
So, there is a humanity gap in computational creativity, and I believe it is most noticeable in computer generated poetry. If poetry exists for people to make connections with other people, how do we fill that humanity gap? I would recommend two things:
1. Stop calling texts produced by software “poems” and start calling them “c-poems”. They might look like human-authored poems, but they are distinctly not. If you’re bought an e-book for Christmas, you don’t expect to unwrap a leather bound volume! Similarly with c-poems, you don’t expect to make a human connection with the author. That doesn’t mean that the poem can’t move you, for instance you might connect with the characters in the poem, or it might invoke personal memories, etc.
2. Get the generative software to provide commentaries (or better, engage in dialogue) about the poem it has produced. This at least offers the opportunity for the software to show that it is acting intelligently (there is a placebo effect being exploited by the writers of Twitter Bots, as highlighted recently by Tony Veale in this paper, also at ICCC 2015:
). It also enables the software to add background information to the poem and its processes. I went to a poetry reading a few years ago: four poets nominated for a prize, they each had 30 minutes on stage. Each one of them spent 15 minutes explaining the context of the poems, what they meant to them, with one even going into a tirade about Thatcherite politics (that was a crowd pleaser!). Only after 15 minutes of background did they get round to reading their poems aloud.
Most importantly, however, getting software to provide commentaries highlights the fact that they were written by a computer program, not a person. Last year, the BBC (Radio 4) interviewed me about computer generated poetry, and I agreed that they could pass on the poems to some professors (of poetry) in Manchester for assessment. I tried to make it absolutely clear that they should read the poems in full knowledge that they were produced by software, not by a person. The professors took this very seriously, even in the context of a particularly biased (against software) radio show. However, when people are asked to assess the quality of poems (or other linguistic artefacts), they generally assess them as if they were written by a poet. “No!” I say, “they were written by a computer program”… “OK,” they reply, “I’ll read them as if a novice wrote them”. “No!” “OK, I’ll read them as if a child wrote them”. “No!!” Software isn’t human. Computer generated poems are not the same as human-authored ones, however much we might like that to be the case!
(I told you I had a hardcore view on this!)
In the paper at the top of the email, we got the software to produce commentaries about the poems produced, which I believe worked very well – for instance, it helped the Manchester professors to take the poems seriously.
In the above light, I tend to view work on computer generated poetry (and linguistic generation in general – but there are exceptions) as being a bit dull if it doesn’t address the humanity gap, and doesn’t try to get evaluators to assess the artefacts as being from a computer rather than a person. In the abstract for Sarah’s ICCC paper, it says the paper presents a “possible architecture for generating and ranking figurative comparisons on par with humans”. I’m afraid that I don’t subscribe to the hidden assumption here that software generated artefacts can be compared with those produced by people purely by looking at the output. Sarah’s work is indeed a step forward to automatically producing texts which look like a person might have produced them, and there may be utilitarian value in this. However, it doesn’t address the humanity gap in any way, and I wish researchers would take this issue seriously – it’s about time in Computational Creativity research that we faced up to the fact that we are programming a machine, not a person! I should point out that Sarah’s work is no exception here: most people are still working on the hidden assumption that assessment of computer generated artefacts by output alone (and not process) is valid. I don’t believe it’s valid, although it does help to get papers published.
Feel free to pass on my comments above to Sarah, if you think that’s appropriate.
If you would like to hear more rantings on the philosophy of Computational Creativity, where we expand on the above arguments (and others) at length, please see these papers:
I’ve CCd Stephen McGregor, who is organising a computer poetry and computational creativity event this week in London, as you might like to cover this event in your article. The event details are here:
(Sorry I can’t be there Stephen – good luck with it!!)
Also CCd are Tony Veale and Pablo Gervas, who are computational language experts and colleagues on the WHIM project, with much experience in automated poetry generation. Finally, I’ve CCd Rafael Perez y Perez, who is a Computational Creativity expert (currently chair of the Association for Computational Creativity) and has studied language generation extensively.
Good luck with the article. If you would like a chat on the phone/skype, I would be happy to talk this week. You’ll notice that I’ve not addressed the question of whether software systems are truly “creative”, as you allude to in your article. I have a lot to say on why we shouldn’t be attempting to define creativity (as explained in the two philosophy papers above). But I’ll leave that for now 🙂