Nltk: text.generate() doesn't exist but is still referenced

Created on Aug 20, 2014  ·  27 comments  ·  Source: nltk/nltk

I've been trying to follow along with the Natural Language Processing book, but I ran into problems right in the first chapter. After importing everything from nltk.book, my first thought was to try text3.generate() as shown in one of the examples. Naturally, I got a nice AttributeError, because the Text class in my installation of NLTK has no such method.
Running nltk.text.demo() also tries to generate text and fails with the same error.
I couldn't find any documentation for a generate() method either, so I assume it was removed. If so, the references to it should be removed from nltk.text.demo() and from the textbook.

I am using Python 2.7.8 with NLTK 3.0.0b1 (the version available through the Windows installer package on PyPI at the time of writing). The text3.generate() example appears in both the old and the current version of the textbook.

language-model

All 27 comments

Sorry for the confusion. There were problems with language modelling in NLTK, so we removed it until the problems are fixed. This broke the text generation functionality. I've updated the online version of the book with a note about this, and updated the text demo (for the upcoming release, 3.0.0b2).

-Steven Bird

Fine! Totally understandable. I guess I'll have to write my own chatterbot the old, clunky way, with applied Markov chains, until it's fixed.

Thanks for the quick response.

I ran into this same problem too, and it left me a bit perplexed. Thanks for this thread, which a Google search led me to.

Hmm. Does this help (https://github.com/alvations/nltk/blob/develop/nltk/translate/decoder.py#L33)? The language model seems to work properly when decoding. I'll put up updated and properly documented code for the translate library soon.

Sorry, I've had back-to-back hackathons, workshops and conferences. I'll try to get to it by this weekend.

Ah, but my model loads a precomputed model instead of building one from scratch. Is anyone working on the language model module?

From a user experience perspective, the current code has two problems (apart from the fact that it doesn't work):

  1. In the book the method takes no parameters (at least when it first appears), but the current signature requires a words parameter. So if you follow the book, you get this:

    >>> text1.generate()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: generate() missing 1 required positional argument: 'words'
    

    This is not user-friendly. To avoid it, words should be given a default value in the signature (e.g. words=None).

  2. If you use the plain Python console, as the book instructs, the DeprecationWarning that the code logs is never shown. This is what I did, starting from scratch:

    [adrian@chakra temporal]$ mkdir nltk
    [adrian@chakra temporal]$ cd nltk/
    [adrian@chakra nltk]$ python3 -m venv venv
    [adrian@chakra nltk]$ . venv/bin/activate
    (venv) [adrian@chakra nltk]$ pip install nltk
    Collecting nltk
    Using cached nltk-3.2.2.tar.gz
    Collecting six (from nltk)
    Using cached six-1.10.0-py2.py3-none-any.whl
    Installing collected packages: six, nltk
    Running setup.py install for nltk ... done
    Successfully installed nltk-3.2.2 six-1.10.0
    You are using pip version 8.1.1, however version 9.0.1 is available.
    You should consider upgrading via the 'pip install --upgrade pip' command.
    (venv) [adrian@chakra nltk]$ python
    Python 3.5.2 (default, Jan 18 2017, 23:05:33) 
    [GCC 5.2.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import nltk
    >>> # Download the book resources, which requires GUI interaction.
    ... 
    >>> nltk.download()
    showing info https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml
    True
    >>> from nltk.book import *
    *** Introductory Examples for the NLTK Book ***
    Loading text1, ..., text9 and sent1, ..., sent9
    Type the name of the text or sentence to view it.
    Type: 'texts()' or 'sents()' to list the materials.
    text1: Moby Dick by Herman Melville 1851
    text2: Sense and Sensibility by Jane Austen 1811
    text3: The Book of Genesis
    text4: Inaugural Address Corpus
    text5: Chat Corpus
    text6: Monty Python and the Holy Grail
    text7: Wall Street Journal
    text8: Personals Corpus
    text9: The Man Who Was Thursday by G . K . Chesterton 1908
    >>> text1.generate(words=None)
    >>>
    

    generate() is not merely deprecated as in "will be removed"; it is deprecated as in "no longer works", so instead of logging a warning it would be better to raise NotImplementedError. That way existing code fails (which, as a developer, is what you want, rather than having it apparently succeed without doing what it used to do), and the problem shows up in the console for readers of the book.
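
The two suggestions above could be sketched roughly as follows. LegacyText is a hypothetical stand-in, not NLTK's actual Text class, and the message strings are made up; the sketch only contrasts a stub that logs a DeprecationWarning with one that raises NotImplementedError and gives words a default value.

```python
import warnings


class LegacyText:
    """Hypothetical stand-in for a class with a removed method (not NLTK's Text)."""

    def generate_warn(self, words=None):
        # Behaviour criticised above: emit a warning, then return None.
        warnings.warn("generate() is no longer available", DeprecationWarning)

    def generate_raise(self, words=None):
        # Proposed behaviour: fail loudly, even when called without arguments.
        raise NotImplementedError("generate() is no longer available")


text = LegacyText()

# When DeprecationWarning is filtered out (it is ignored by default in many
# Python setups), the call looks successful and silently returns None.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    silent_result = text.generate_warn()

# The warning only becomes visible once the filter is relaxed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    text.generate_warn()
warning_categories = [w.category for w in caught]

# The proposed version cannot be overlooked.
try:
    text.generate_raise()
    raised = False
except NotImplementedError:
    raised = True
```

Starting the interpreter with python -W default is another way to make such warnings visible without changing any code.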

If you agree with these changes but don't have the time or motivation to implement them, let me know and I'll send a merge request.

I also wonder whether the references to generate() could be removed from the book. In chapter 1, at least, no other section of the chapter seems to need it.

I fully agree with the removal (of the generate() references).

This is still mentioned in chapter 1 of the book: http://www.nltk.org/book/ch01.html

Not a big deal, but I'd guess it's typical to spend a few minutes searching the internet for this problem. It's only a few minutes, but a lot of people are working their way through the book... :)

Bump, the generate() reference is still there.

It's still there now.

It's still there now.

It's still there (in the Safari Online version of the NLTK book).

Ah, I see that a note has been inserted after the example saying that generate() was removed in NLTK 3. Sorry.

I'm still getting the error.

The generate method should be available as of NLTK 3.4. Check out the nltk.lm package!

Hi, what is the currently available alternative for generating text automatically in place of this function? I'm reading about the generated text in chapter 1 of the book, and I'd love to create amazing sentences like "In the beginning of his brother is a hairy man" or fundamental questions like "So shall thy wages be ?".

Sorry @eric-burel, I don't quite understand your comment. I'll try to put it in my own words and you tell me whether I got it right, ok?

Would you like to generate text in some way other than training a language model and using its generate method?

Hi, sorry for not being clear. I had a good laugh reading the NLP with Python book when I got to the text generation part. The sentences are weird and yet very realistic, which made me find the subject interesting in general.
I'm wondering what the currently recommended alternative to generate() is for generating text automatically. Do I need to update something or load another library?

Well, as of NLTK version 3.4 you can actually access generate(). Have a look at the documentation for the lm module:

>>> from nltk import lm
>>> help(lm)
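
As a minimal sketch of what that module offers (assuming NLTK 3.4 or later is installed; the toy sentences and the bigram order are illustrative choices, not taken from the book), a maximum-likelihood model can be trained and sampled roughly like this:

```python
from nltk.lm import MLE
from nltk.lm.preprocessing import padded_everygram_pipeline

# Tiny hand-tokenized corpus (an illustrative assumption, not the book's data).
sentences = [
    ["in", "the", "beginning", "god", "created", "the", "heaven"],
    ["and", "the", "earth", "was", "without", "form"],
    ["and", "god", "said", "let", "there", "be", "light"],
]

order = 2  # bigram model

# Build padded unigrams and bigrams for training, plus the vocabulary stream.
train_ngrams, vocab = padded_everygram_pipeline(order, sentences)

model = MLE(order)
model.fit(train_ngrams, vocab)

# Sample ten tokens; random_seed makes the sample repeatable.
generated = model.generate(10, random_seed=42)
print(" ".join(generated))
```

The output can contain the padding symbols <s> and </s>, which you may want to filter out before printing.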

If you like, you can also build your own generator: http://www.cyber-omelette.com/2017/01/markov.html
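
A home-made generator along those lines might look roughly like the following first-order Markov chain. This is only a sketch: build_chain, generate_text and the toy corpus are made-up names for illustration, not part of NLTK or of the linked article.

```python
import random
from collections import defaultdict


def build_chain(tokens):
    """Map each token to the list of tokens observed right after it."""
    chain = defaultdict(list)
    for current, following in zip(tokens, tokens[1:]):
        chain[current].append(following)
    return chain


def generate_text(chain, start, length, seed=None):
    """Random-walk the chain from `start`, producing at most `length` tokens."""
    rng = random.Random(seed)
    output = [start]
    while len(output) < length:
        followers = chain.get(output[-1])
        if not followers:  # dead end: this token was never seen with a successor
            break
        output.append(rng.choice(followers))
    return output


corpus = "the cat sat on the mat and the dog sat on the rug".split()
chain = build_chain(corpus)
sample = generate_text(chain, start="the", length=8, seed=0)
print(" ".join(sample))
```

Because followers are stored with repetition, frequent continuations are picked proportionally more often, which is the same idea a bigram language model uses.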

I'm trying to follow the examples in the O'Reilly book Natural Language Processing with Python, which is how I ended up in this discussion. There seems to be some confusion, since people appear to be talking about the nltk.text.generate function rather than the nltk.parse.generate function.

Typing text1.generate() is supposed to print a list of Moby Dick-style phrases. Instead I get the error message TypeError: generate() missing 1 required positional argument: 'words'.

This appears to come from site-packages/nltk/text.py, where the method is defined as def generate(self, words) and seems to exist only to print a warning that the generate functionality is no longer available, even when a value is passed for the words argument.

I'm using nltk version 3.5.2.

This can be closed now, since generate was added.

Thanks, @Copper-Head.

Has generate been removed again? I'm following the book and got the error generate() missing 1 required positional argument: 'words'. Any help?

I'm still having problems with this function :/

@Tserewara @AlbertSawZ which version of NLTK are you using? Could you post a minimal example that I can run to reproduce the problem?

With the information you've given so far, it's a bit hard for me to help :(

I'm also following the book, and the problem still persists.
Is there any alternative to this generate function?

TypeError                                 Traceback (most recent call last)
<ipython-input-36-463eb7c367ab> in <module>()
----> 1 text3.generate()

TypeError: generate() missing 1 required positional argument: 'words'