I was going through chapter 1 of the book and the collocations function raises an error. It seems like line 440 in text.py is redundant, since the collocation_list function has been introduced. I fixed the issue by rewriting the current lines 440 and 441 in text.py.
old code:
collocation_strings = [w1 + ' ' + w2 for w1, w2 in self.collocation_list(num, window_size)]
print(tokenwrap(collocation_strings, separator="; "))
new code:
print(tokenwrap(self.collocation_list(), separator="; "))
Thanks @martinevanschouwenburg for raising the bug!
Yes it looks like the collocation list is needed. To replicate the bug:
$ python3
Python 3.6.4rc1 (v3.6.4rc1:3398dcb14f, Dec 5 2017, 00:58:30)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from nltk.book import *
*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
text1: Moby Dick by Herman Melville 1851
text2: Sense and Sensibility by Jane Austen 1811
text3: The Book of Genesis
text4: Inaugural Address Corpus
text5: Chat Corpus
text6: Monty Python and the Holy Grail
text7: Wall Street Journal
text8: Personals Corpus
text9: The Man Who Was Thursday by G . K . Chesterton 1908
>>> text4.collocations()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/nltk/text.py", line 440, in collocations
collocation_strings = [w1 + ' ' + w2 for w1, w2 in self.collocation_list(num, window_size)]
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/nltk/text.py", line 440, in <listcomp>
collocation_strings = [w1 + ' ' + w2 for w1, w2 in self.collocation_list(num, window_size)]
ValueError: too many values to unpack (expected 2)
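The traceback suggests that collocation_list in this release already returns joined strings rather than (w1, w2) pairs, so the comprehension ends up unpacking each string character by character. A minimal sketch of that failure in plain Python, without NLTK; the sample values and the join_pairs helper are made up for illustration:

```python
# Hypothetical sample of what collocation_list() appears to return in the
# affected release: already-joined strings, not (w1, w2) tuples.
collocations = ["United States", "fellow citizens"]


def join_pairs(items):
    """Mimic the failing comprehension from text.py."""
    return [w1 + " " + w2 for w1, w2 in items]


try:
    join_pairs(collocations)
except ValueError as err:
    # Unpacking a 13-character string into two names fails.
    error_message = str(err)
    print(error_message)

# The same comprehension works when the items really are pairs.
print(join_pairs([("United", "States"), ("fellow", "citizens")]))
```

If this reading is right, either collocations() should stop unpacking, or collocation_list() should return tuples.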
I am still seeing this error as well when going through chapter 1 of the book.
*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
text1: Moby Dick by Herman Melville 1851
text2: Sense and Sensibility by Jane Austen 1811
text3: The Book of Genesis
text4: Inaugural Address Corpus
text5: Chat Corpus
text6: Monty Python and the Holy Grail
text7: Wall Street Journal
text8: Personals Corpus
text9: The Man Who Was Thursday by G . K . Chesterton 1908
Traceback (most recent call last):
File "c:\Users\Adam.vscode\extensions\ms-python.python-2019.6.24221\pythonFiles\ptvsd_launcher.py", line 43, in <module>
main(ptvsdArgs)
File "c:\Users\Adam.vscode\extensions\ms-python.python-2019.6.24221\pythonFiles\lib\python\ptvsd__main__.py", line 434, in main
run()
File "c:\Users\Adam.vscode\extensions\ms-python.python-2019.6.24221\pythonFiles\lib\python\ptvsd__main__.py", line 312, in run_file
runpy.run_path(target, run_name='__main__')
File "c:\users\adam\appdata\local\programs\python\python37-32\Lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "c:\users\adam\appdata\local\programs\python\python37-32\Lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "c:\users\adam\appdata\local\programs\python\python37-32\Lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "c:\Users\Adam\Documents\code\python\natlang\natlang.py", line 4, in <module>
text4.collocations()
File "C:\Users\Adam.virtualenvs\natlang-9ek-vNym\lib\site-packages\nltk\text.py", line 444, in collocations
w1 + " " + w2 for w1, w2 in self.collocation_list(num, window_size)
File "C:\Users\Adam.virtualenvs\natlang-9ek-vNym\lib\site-packages\nltk\text.py", line 444, in <listcomp>
w1 + " " + w2 for w1, w2 in self.collocation_list(num, window_size)
ValueError: too many values to unpack (expected 2)
@networkjr I can confirm that too. Maybe the fix in #2227 hasn't been pushed to PyPI yet?
@networkjr it's the same with the Anaconda package
I'm working through the NLTK book, am completely new to NLTK and fairly new to Python - and I'm getting this same error.
$ python
Python 3.7.2 (default, Feb 14 2019, 11:13:53)
[Clang 8.0.0 (clang-800.0.42.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> from nltk.book import *
*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
text1: Moby Dick by Herman Melville 1851
text2: Sense and Sensibility by Jane Austen 1811
text3: The Book of Genesis
text4: Inaugural Address Corpus
text5: Chat Corpus
text6: Monty Python and the Holy Grail
text7: Wall Street Journal
text8: Personals Corpus
text9: The Man Who Was Thursday by G . K . Chesterton 1908
>>> text4.collocations()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/george/code/nltk/py3env/lib/python3.7/site-packages/nltk/text.py", line 444, in collocations
w1 + " " + w2 for w1, w2 in self.collocation_list(num, window_size)
File "/Users/george/code/nltk/py3env/lib/python3.7/site-packages/nltk/text.py", line 444, in <listcomp>
w1 + " " + w2 for w1, w2 in self.collocation_list(num, window_size)
ValueError: too many values to unpack (expected 2)
According to my Pipfile.lock I'm using NLTK 3.4.5, which I believe is the most recent release.
Is there a fix for this issue?
This has been fixed in #2377 and should be included in the next NLTK release soon.
Otherwise, if you can't wait, install the development version =)
pip install -U https://github.com/nltk/nltk/archive/develop.zip
I still have the same error after updating NLTK with
pip install -U https://github.com/nltk/nltk/archive/develop.zip
Current NLTK version: '3.4.5'
How can I fix it?
Many thanks.
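One thing worth ruling out here is that Python is still importing an older copy of the package than the one pip just installed. A rough sketch for checking this, assuming Python 3.8+ for importlib.metadata; describe_install is a hypothetical helper, not part of NLTK:

```python
import importlib.metadata
import importlib.util


def describe_install(package):
    """Return version and file location of an importable package, or None."""
    spec = importlib.util.find_spec(package)
    if spec is None:
        return None  # package is not importable from this interpreter
    try:
        version = importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        version = None  # e.g. standard-library modules have no distribution
    return {"version": version, "location": spec.origin}


# Compare the printed location with the text.py path shown in the traceback.
print(describe_install("nltk"))
```

The version string alone may not be conclusive, since a development snapshot can report the same number as the last release; the location is the more useful part.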
Also still having issues with .collocations(), but .collocation_list() works.
Replace line 444 in /nltk/text.py:
collocation_strings = [w1 + " " + w2 for w1, w2 in self.collocation_list(num, window_size)]
with the following:
collocation_strings = [w for w in self.collocation_list(num, window_size)]
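For anyone who would rather not edit the installed text.py, a small wrapper around collocation_list() can serve as a stopgap. This is only a sketch: format_collocations is a hypothetical helper, and it accepts both joined strings and word tuples because the return type seems to differ between releases.

```python
def format_collocations(items, separator="; "):
    """Join collocations into one string, accepting strings or word tuples."""
    parts = []
    for item in items:
        if isinstance(item, str):
            parts.append(item)            # already "w1 w2"
        else:
            parts.append(" ".join(item))  # ("w1", "w2") -> "w1 w2"
    return separator.join(parts)


# With NLTK and its book data loaded (not run here):
#   from nltk.book import text4
#   print(format_collocations(text4.collocation_list()))
print(format_collocations(["United States", ("fellow", "citizens")]))
```

Unlike collocations(), this does not wrap long output across lines; it only joins the items.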
Same here. Working through the NLTK book gives an error for collocations(), whereas collocation_list() works.