Nltk: "IndexError: string index out of range" when trying to stem the word "oing"

Created on Feb 8, 2017  ·  5 comments  ·  Source: nltk/nltk

Easy to reproduce:

>>> from nltk import PorterStemmer
>>> stemmer = PorterStemmer()
>>> stemmer.stem('oing')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/peterbe/virtualenvs/songsearch/lib/python3.5/site-packages/nltk/stem/porter.py", line 665, in stem
    stem = self._step1b(stem)
  File "/Users/peterbe/virtualenvs/songsearch/lib/python3.5/site-packages/nltk/stem/porter.py", line 376, in _step1b
    lambda stem: (self._measure(stem) == 1 and
  File "/Users/peterbe/virtualenvs/songsearch/lib/python3.5/site-packages/nltk/stem/porter.py", line 258, in _apply_rule_list
    if suffix == '*d' and self._ends_double_consonant(word):
  File "/Users/peterbe/virtualenvs/songsearch/lib/python3.5/site-packages/nltk/stem/porter.py", line 214, in _ends_double_consonant
    word[-1] == word[-2] and
IndexError: string index out of range
>>> import nltk
>>> nltk.__version__
'3.2.2'
bug pleaseverify stelemma

All 5 comments

I got this error for the word "aed".

from nltk.stem.porter import PorterStemmer
from nltk.corpus import stopwords
stemmer = PorterStemmer()
stemmer.stem('aed')

Error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/peter.hurford/.virtualenvs/rex/lib/python2.7/site-packages/nltk/stem/porter.py", line 665, in stem
    stem = self._step1b(stem)
  File "/Users/peter.hurford/.virtualenvs/rex/lib/python2.7/site-packages/nltk/stem/porter.py", line 376, in _step1b
    lambda stem: (self._measure(stem) == 1 and
  File "/Users/peter.hurford/.virtualenvs/rex/lib/python2.7/site-packages/nltk/stem/porter.py", line 258, in _apply_rule_list
    if suffix == '*d' and self._ends_double_consonant(word):
  File "/Users/peter.hurford/.virtualenvs/rex/lib/python2.7/site-packages/nltk/stem/porter.py", line 214, in _ends_double_consonant
    word[-1] == word[-2] and
IndexError: string index out of range
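
Both tracebacks stop at `word[-1] == word[-2]` inside `_ends_double_consonant`. A plausible reading is that an earlier step strips the suffix ("ing" from "oing", "ed" from "aed") and leaves a one-letter stem, for which `word[-2]` does not exist. The sketch below reproduces that failure mode in isolation. The function names and the simplified vowel test are assumptions for illustration, not NLTK's actual code.

```python
# Simplification (assumption): the real Porter rules also treat 'y' contextually.
VOWELS = frozenset("aeiou")


def is_consonant(ch):
    return ch not in VOWELS


def ends_double_consonant_unguarded(word):
    # Mirrors the failing expression: word[-2] raises IndexError on a 1-char string.
    return word[-1] == word[-2] and is_consonant(word[-1])


def ends_double_consonant_guarded(word):
    # Checking the length first short-circuits before any indexing happens.
    return len(word) >= 2 and word[-1] == word[-2] and is_consonant(word[-1])


for stem in ("o", "a", "hopp"):
    try:
        print(stem, "unguarded:", ends_double_consonant_unguarded(stem))
    except IndexError as exc:
        print(stem, "unguarded: IndexError:", exc)
    print(stem, "guarded:", ends_double_consonant_guarded(stem))
```

The guarded variant simply reports that a one-letter stem does not end in a double consonant, which lets the remaining rules run instead of crashing.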

Installed with:

pip install nltk
python -m nltk.downloader -d

Version:

import nltk
nltk.__version__ # '3.2.2'

Duplicate of https://github.com/nltk/nltk/issues/1581

This bug was introduced in version 3.2.2 and has been fixed in master. You can avoid it by using develop or version 3.2.1.

Can this be closed as fixed?

So has the issue been resolved?

This should have been resolved by #1582 😉

>>> import nltk
>>> nltk.__version__
'3.2.5'

>>> from nltk import PorterStemmer
>>> porter = PorterStemmer()
>>> porter.stem('oing')
u'o'
