Nltk: collocations function returns an error

에 λ§Œλ“  2019λ…„ 05μ›” 15일  Β·  10μ½”λ©˜νŠΈ  Β·  좜처: nltk/nltk

I was going through chapter 1 of the book and the collocations function returns an error. Since the collocation_list function has been introduced, line 440 of text.py appears to be redundant. I fixed the problem by rewriting the current lines 440 and 441 in text.py.

이전 μ½”λ“œ :
collocation_strings = [w1 + ''+ w2 for w1, w2 in self.collocation_list (num, window_size)] *
print (tokenwrap (collocation_strings, separator = ";"))

New code:
print(tokenwrap(self.collocation_list(), separator="; "))
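
A possible reading of why the old line fails, sketched without NLTK: assume collocation_list() in the affected release already returns each collocation as one joined string (this is an assumption inferred from the fix above, and the sample strings are made up). Unpacking such a string into w1, w2 then iterates over its characters, which would raise the error reported here.

```python
# Assumption: stand-in for the joined strings collocation_list() appears to return.
joined = ["United States", "fellow citizens"]

error_message = None
try:
    # Same pattern as the failing line: unpack each item into two names.
    pairs = [w1 + " " + w2 for w1, w2 in joined]
except ValueError as err:
    # Each item is a string, so unpacking iterates its characters, not two words.
    error_message = str(err)

print(error_message)
```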

bug goodfirstbug resolved text

All 10 comments

Thanks @martinevanschouwenburg for raising the bug!

Yes, it looks like the collocation list is what is needed here. To reproduce the bug:

$ python3
Python 3.6.4rc1 (v3.6.4rc1:3398dcb14f, Dec  5 2017, 00:58:30) 
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from nltk.book import *
*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
text1: Moby Dick by Herman Melville 1851
text2: Sense and Sensibility by Jane Austen 1811
text3: The Book of Genesis
text4: Inaugural Address Corpus
text5: Chat Corpus
text6: Monty Python and the Holy Grail
text7: Wall Street Journal
text8: Personals Corpus
text9: The Man Who Was Thursday by G . K . Chesterton 1908
>>> text4.collocations()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/nltk/text.py", line 440, in collocations
    collocation_strings = [w1 + ' ' + w2 for w1, w2 in self.collocation_list(num, window_size)]
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/nltk/text.py", line 440, in <listcomp>
    collocation_strings = [w1 + ' ' + w2 for w1, w2 in self.collocation_list(num, window_size)]
ValueError: too many values to unpack (expected 2)

I also keep getting this error when working through chapter 1 of the book.

*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
text1: Moby Dick by Herman Melville 1851
text2: Sense and Sensibility by Jane Austen 1811
text3: The Book of Genesis
text4: Inaugural Address Corpus
text5: Chat Corpus
text6: Monty Python and the Holy Grail
text7: Wall Street Journal
text8: Personals Corpus
text9: The Man Who Was Thursday by G . K . Chesterton 1908
Traceback (most recent call last):
  File "c:\Users\Adam\.vscode\extensions\ms-python.python-2019.6.24221\pythonFiles\ptvsd_launcher.py", line 43, in <module>
    main(ptvsdArgs)
  File "c:\Users\Adam\.vscode\extensions\ms-python.python-2019.6.24221\pythonFiles\lib\python\ptvsd\__main__.py", line 434, in main
    run()
  File "c:\Users\Adam\.vscode\extensions\ms-python.python-2019.6.24221\pythonFiles\lib\python\ptvsd\__main__.py", line 312, in run_file
    runpy.run_path(target, run_name='__main__')
  File "c:\users\adam\appdata\local\programs\python\python37-32\Lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "c:\users\adam\appdata\local\programs\python\python37-32\Lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "c:\users\adam\appdata\local\programs\python\python37-32\Lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "c:\Users\Adam\Documents\code\python\natlang\natlang.py", line 4, in <module>
    text4.collocations()
  File "C:\Users\Adam\.virtualenvs\natlang-9ek-vNym\lib\site-packages\nltk\text.py", line 444, in collocations
    w1 + " " + w2 for w1, w2 in self.collocation_list(num, window_size)
  File "C:\Users\Adam\.virtualenvs\natlang-9ek-vNym\lib\site-packages\nltk\text.py", line 444, in <listcomp>
    w1 + " " + w2 for w1, w2 in self.collocation_list(num, window_size)
ValueError: too many values to unpack (expected 2)

@networkjr I can confirm this as well. Perhaps the fix from #2227 has not been pushed to PyPI yet?

@networkjr Same with the Anaconda package.

I am working through the NLTK book; I am completely new to NLTK and fairly familiar with Python. I get this same error.

$ python
Python 3.7.2 (default, Feb 14 2019, 11:13:53) 
[Clang 8.0.0 (clang-800.0.42.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> from nltk.book import *
*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
text1: Moby Dick by Herman Melville 1851
text2: Sense and Sensibility by Jane Austen 1811
text3: The Book of Genesis
text4: Inaugural Address Corpus
text5: Chat Corpus
text6: Monty Python and the Holy Grail
text7: Wall Street Journal
text8: Personals Corpus
text9: The Man Who Was Thursday by G . K . Chesterton 1908
>>> text4.collocations()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/george/code/nltk/py3env/lib/python3.7/site-packages/nltk/text.py", line 444, in collocations
    w1 + " " + w2 for w1, w2 in self.collocation_list(num, window_size)
  File "/Users/george/code/nltk/py3env/lib/python3.7/site-packages/nltk/text.py", line 444, in <listcomp>
    w1 + " " + w2 for w1, w2 in self.collocation_list(num, window_size)
ValueError: too many values to unpack (expected 2)

According to my Pipfile.lock, I am using NLTK 3.4.5, which I believe is the most recent release.

Is there a fix for this problem?

This was fixed in #2377 and the fix will ship in the next NLTK release soon.

Otherwise, if you can't wait =)

pip install -U https://github.com/nltk/nltk/archive/develop.zip

I still get the same error even after updating nltk.
pip install -U https://github.com/nltk/nltk/archive/develop.zip

Current nltk version: '3.4.5'

How can I fix this?

Thank you.
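
One thing that may be worth ruling out here, offered as an assumption rather than a diagnosis: pip may have upgraded a different environment than the one the interpreter imports from, and a development snapshot may not change the version string. The sketch below uses only the standard library (Python 3.8+) to report the installed version and the file that would be imported. `describe_package` is a hypothetical helper name, not part of NLTK or pip.

```python
import importlib.util
from importlib import metadata


def describe_package(name):
    """Return (version, location) for an importable package, or None if absent."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        # Nothing by this name is importable in the current environment.
        return None
    try:
        version = metadata.version(name)
    except metadata.PackageNotFoundError:
        # Importable, but no installed distribution metadata was found.
        version = None
    return version, spec.origin


# Prints None if nltk is not installed in this interpreter's environment.
print(describe_package("nltk"))
```

If the reported location is not inside the environment where pip ran, the upgrade likely went elsewhere.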

There is still a problem with .collocations(), but .collocation_list() works.
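
Building on that workaround, a minimal sketch of a helper that prints the result of .collocation_list() without touching the installed text.py. `format_collocations` is a hypothetical name, and the sample inputs are illustrative, not real NLTK output. It accepts either joined strings or (w1, w2) pairs, on the assumption that the return type may differ between releases.

```python
def format_collocations(items, separator="; "):
    """Join collocations into one line, accepting strings or (w1, w2) pairs."""
    parts = []
    for item in items:
        if isinstance(item, str):
            # Already joined, e.g. "United States".
            parts.append(item)
        else:
            # A pair of words, e.g. ("United", "States").
            w1, w2 = item
            parts.append(w1 + " " + w2)
    return separator.join(parts)


# Illustrative inputs only, not real NLTK output.
print(format_collocations(["United States", "fellow citizens"]))
print(format_collocations([("United", "States"), ("fellow", "citizens")]))
```

With the book's texts loaded, the call would presumably be print(format_collocations(text4.collocation_list())).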

In /nltk/text.py, at line 444, replace:
collocation_strings = [w1 + " " + w2 for w1, w2 in self.collocation_list(num, window_size)]

with:
collocation_strings = [w for w in self.collocation_list(num, window_size)]

여기도 λ§ˆμ°¬κ°€μ§€μž…λ‹ˆλ‹€. nltk 책을 톡해 μž‘μ—…ν•˜λ©΄ collocations ()에 였λ₯˜κ°€ λ°œμƒν•˜λŠ” 반면 collocation_list ()λŠ” μž‘λ™ν•©λ‹ˆλ‹€.

이 νŽ˜μ΄μ§€κ°€ 도움이 λ˜μ—ˆλ‚˜μš”?
0 / 5 - 0 λ“±κΈ‰