Nltk: text.generate() does not exist, but is still referenced

Created on 20 Aug 2014  ·  27 Comments  ·  Source: nltk/nltk

I've been trying to follow along with the Natural Language Processing book, but right in the first chapter I'm coming across some issues. After importing everything from nltk.book, my first thought was to try text3.generate() as was demonstrated in one of the examples. Of course, I got a lovely AttributeError because the Text class apparently doesn't have that method in the NLTK that I installed.
Furthermore, even running nltk.text.demo() tries to do generated text - and returns the same error.
Of course, I couldn't find any documentation for the generate() method, so I'm assuming it was removed; if that's the case, you should remove references to it from nltk.text.demo() and from the textbook.

I'm using Python 2.7.8 with NLTK 3.0.0b1 (which was the version available via Windows installer package from PyPI at the time of this writing). The text3.generate() example is in both the old and current versions of the textbook.

language-model

All 27 comments

Sorry for the confusion. There were problems with NLTK's language modelling
class and so we removed it until the problems are resolved. This broke the
text generation functionality. I've updated the online version of the book
with a note about this, and updated the text demo (to be included in
release 3.0.0b2 soon).

-Steven Bird

Reply to this email directly or view it on GitHub
https://github.com/nltk/nltk/issues/736.

Alright! Totally understandable. I guess I'll have to write my own chatterbots with applied Markov chains, the old crude way, until it gets fixed.

Thanks for your prompt response.

I ran into the same issue, which puzzled me a bit. Thanks for this thread, which a Google search led me to.

Hmm, does this help (https://github.com/alvations/nltk/blob/develop/nltk/translate/decoder.py#L33)? The language model seems to work fine when decoding. I'll be pushing updated and properly documented code for the translate library soon.

Sorry, I was out for a hackathon, workshops and conferences back to back. I'll try to push it by the end of this week, I hope.

Ah, but my code loads a precalculated model rather than building one from scratch. Is anyone building a language model module?

I see two issues with the current code (besides the fact that it does not work), user-experience-wise:

  1. In the book, the method does not take any parameters (at least when it first appears in the book), but the current signature requires a words parameter. So when you follow the book, what you get is the following:

    >>> text1.generate()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: generate() missing 1 required positional argument: 'words'
    

    This is not very user-friendly. I suggest giving words a default value in the signature (e.g. words=None) to avoid this issue.

  2. Using the regular Python console following the book instructions I do not see the DeprecationWarning that the code logs. This is what I did from the beginning:

    [adrian@chakra temporal]$ mkdir nltk
    [adrian@chakra temporal]$ cd nltk/
    [adrian@chakra nltk]$ python3 -m venv venv
    [adrian@chakra nltk]$ . venv/bin/activate
    (venv) [adrian@chakra nltk]$ pip install nltk
    Collecting nltk
    Using cached nltk-3.2.2.tar.gz
    Collecting six (from nltk)
    Using cached six-1.10.0-py2.py3-none-any.whl
    Installing collected packages: six, nltk
    Running setup.py install for nltk ... done
    Successfully installed nltk-3.2.2 six-1.10.0
    You are using pip version 8.1.1, however version 9.0.1 is available.
    You should consider upgrading via the 'pip install --upgrade pip' command.
    (venv) [adrian@chakra nltk]$ python
    Python 3.5.2 (default, Jan 18 2017, 23:05:33) 
    [GCC 5.2.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import nltk
    >>> # Download the book resources, which requires GUI interaction.
    ... 
    >>> nltk.download()
    showing info https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml
    True
    >>> from nltk.book import *
    *** Introductory Examples for the NLTK Book ***
    Loading text1, ..., text9 and sent1, ..., sent9
    Type the name of the text or sentence to view it.
    Type: 'texts()' or 'sents()' to list the materials.
    text1: Moby Dick by Herman Melville 1851
    text2: Sense and Sensibility by Jane Austen 1811
    text3: The Book of Genesis
    text4: Inaugural Address Corpus
    text5: Chat Corpus
    text6: Monty Python and the Holy Grail
    text7: Wall Street Journal
    text8: Personals Corpus
    text9: The Man Who Was Thursday by G . K . Chesterton 1908
    >>> text1.generate(words=None)
    >>>
    

    Since generate() is not merely deprecated as in “it will be removed”, but also as in “it does not work anymore”, I suggest raising a NotImplementedError instead of logging a warning. Doing so would both make existing code fail (which is what I, as a developer, would want, rather than having it apparently succeed without doing what it used to do) and show up in the console for book readers.
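
A minimal sketch of the two suggestions above. It uses a hypothetical stand-in class, not NLTK's actual Text implementation, and the error message is made up for illustration:

```python
class Text:
    """Hypothetical stand-in for nltk.text.Text, reduced to the method under discussion."""

    def __init__(self, tokens):
        self.tokens = list(tokens)

    def generate(self, words=None):
        # `words` has a default, so the book's bare call `text1.generate()`
        # reaches the explicit error below instead of a confusing TypeError.
        raise NotImplementedError(
            "generate() is currently unavailable: "
            "the language model it relied on was removed."
        )


try:
    Text(["Call", "me", "Ishmael"]).generate()
except NotImplementedError as exc:
    message = str(exc)
    print("NotImplementedError:", message)
```

With this shape, both book readers and existing scripts get a visible failure that names the real cause.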

If you agree with these changes but do not have the time or motivation to implement them, just let me know and I will send a merge request.

I also wonder if we could remove the references to generate() from the book. At least in chapter 1, they do not seem to be required by other sections of the chapter.

I totally agree with you: the generate() references should be removed.

This is still referred to in Chapter one of the book: http://www.nltk.org/book/ch01.html

It's no biggy, but I'm probably typical in spending a few minutes googling this as an issue. Those few minutes times however many people are working their way through the book... :)

Bump, generate() references are still there.

It's still there now.

It's still there now.

It's still there now (on the Safari Online version of the NLTK book)

Oh I see there was a note inserted after the example, that generate() was removed from NLTK 3 -- sorry

It's still throwing an error.

The generate method should be present as of NLTK 3.4. Check out the nltk.lm package!

Hi, what are the current alternatives for generating text automatically, as a replacement for this function? I am reading the generated text in Chapter 1 of the book, and I too would like to produce marvelous sentences such as "In the beginning of his brother is a hairy man" or this fundamental question: "so shall thy wages be?".

Sorry @eric-burel I don't quite understand your comment. I'm going to try phrasing it in my words and you tell me if that's correct, ok?

Do you want some other way to generate text other than training a language model and using its generate method?

Hi, sorry I was unclear. I was just having a good laugh reading the NLP with Python book when I reached the text generation part, as the sentences are very realistic while still being weird, and I find this subject interesting in general.
I just wonder what the currently recommended alternative to generate() is for generating text automatically. Should I update something or load another library?

Well, starting with NLTK version 3.4 you actually have access to generate(). Take a look at the documentation for the lm module:

>>> from nltk import lm
>>> help(lm)
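
For readers landing here, a rough sketch of what training and sampling with nltk.lm can look like, assuming NLTK 3.4 or later is installed. The toy sentences and the bigram order are illustrative assumptions, not taken from the book; a real corpus would first need to be split into tokenized sentences.

```python
from nltk.lm import MLE
from nltk.lm.preprocessing import padded_everygram_pipeline

# Toy corpus: a list of already-tokenized sentences (illustrative only).
sentences = [
    ["in", "the", "beginning", "god", "created", "the", "heaven"],
    ["and", "the", "earth", "was", "without", "form"],
]

order = 2  # bigram model; an arbitrary choice for this sketch

# Build padded n-grams (up to `order`) and the flat vocabulary stream.
train_ngrams, vocab = padded_everygram_pipeline(order, sentences)

model = MLE(order)
model.fit(train_ngrams, vocab)

# Sample 10 tokens, starting from the context "the".
# random_seed makes repeated runs produce the same sample.
generated = model.generate(10, text_seed=["the"], random_seed=42)
print(" ".join(generated))
```

The output can include the padding symbols `<s>` and `</s>`, so you may want to filter those out before printing.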

Could also build your own generate, if ya want: http://www.cyber-omelette.com/2017/01/markov.html
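
A minimal sketch of that do-it-yourself route, using only the standard library. This is a plain first-order Markov chain, not what NLTK does internally; the function names and the toy sentence are made up for illustration.

```python
import random
from collections import defaultdict


def build_chain(tokens):
    """Map each token to the list of tokens that follow it in the input."""
    chain = defaultdict(list)
    for current, following in zip(tokens, tokens[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start, length=10, seed=None):
    """Walk the chain from `start`, picking each next token at random."""
    rng = random.Random(seed)
    output = [start]
    while len(output) < length:
        followers = chain.get(output[-1])
        if not followers:  # dead end: this token never had a successor
            break
        output.append(rng.choice(followers))
    return output


tokens = "in the beginning god created the heaven and the earth".split()
chain = build_chain(tokens)
sample = generate(chain, "the", length=8, seed=0)
print(" ".join(sample))
```

Storing repeated followers in a list means frequent continuations are picked more often, which is the crude equivalent of a bigram probability.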

I am attempting to follow the examples in the O'Reilly Natural Language Processing with Python book, which is how I got to this discussion. I think people are talking about the nltk.text.generate function and not the nltk.parse.generate function, which seems to be causing some confusion.

When I type in text1.generate(), I'm supposed to get a list of phrases in the style of Moby Dick. Instead, all I get is the error message TypeError: generate() missing 1 required positional argument: 'words'.

This appears to come from site-packages/nltk/text.py, where the method is defined as def generate(self, words). Its only purpose appears to be to print a warning that the generate function is no longer available, which it fails to do even if you do pass in a value for the words argument.

I'm using nltk version 3.5.2.

This can be closed now since generate was added.

Thanks @Copper-Head

Was generate removed again? I'm following the book, and got the error generate() missing 1 required positional argument: 'words'. Any help, please?

Still having troubles with this function :/

@Tserewara @AlbertSawZ what version of NLTK are you using? Could you post a minimal example that we can try running to reproduce this problem?

It's a bit difficult to help you based on the information you've sent so far :(

I am also following the book, but the issue still persists.
Is there any alternative to this generate function?

TypeError                                 Traceback (most recent call last)
<ipython-input-36-463eb7c367ab> in <module>()
----> 1 text3.generate()

TypeError: generate() missing 1 required positional argument: 'words'