This may not be as serious as it sounds, but I've discovered that once NLTK has been imported, any Python child process exits prematurely on a network call. Example code:
from multiprocessing import Process
import nltk
import time

def child_fn():
    print "Fetch URL"
    import urllib2
    print urllib2.urlopen("https://www.google.com").read()[:100]
    print "Done"

while True:
    child_process = Process(target=child_fn)
    child_process.start()
    child_process.join()
    print "Child process returned"
    time.sleep(1)
Run it with nltk imported and you can see that the urlopen() call never executes. Comment out the import nltk line and it runs fine.
Why?
* Edit: this is on Python 2. I haven't tested on 3 yet.
Is there an exception?
No. I wrapped the import urllib2; print... lines in a try..except: but got nothing.
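One way to see that the child is dying from a signal rather than raising an exception (a Python 3 sketch, not from this thread) is to inspect Process.exitcode after join(); a negative value is the number of the signal that killed the child:

```python
import os
import signal
from multiprocessing import Process

def crashing_child():
    # Stand-in for the silent crash: kill ourselves with SIGSEGV,
    # just as a segfault in native code would.
    os.kill(os.getpid(), signal.SIGSEGV)

if __name__ == "__main__":
    p = Process(target=crashing_child)
    p.start()
    p.join()
    # exitcode is -<signal number> when the child was killed by a signal.
    print(p.exitcode)  # -11 on Unix (SIGSEGV)
```

A try..except in the parent can never catch this, because the child never gets a chance to raise: the crash happens below the interpreter.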
I'm facing the same problem. I just opened an SO question that may be useful to link here: http://stackoverflow.com/questions/30766419/python-child-process-silently-crashes-when-issuing-an-http-request
The child process indeed crashes silently, without further notice.
@oxymor0n is right:
The child process is indeed crashing silently without further notice.
This problem bites me in the combination of nltk, gunicorn (which loads nltk via prefork) and flask.
Remove the nltk import and everything works. Except nltk, that is.
/cc @escherba
@ninowalker, @oxymor0n that's odd. My process runs your code just fine:
Fetch URL
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="de"><head><meta content
Done
Child process returned
Fetch URL
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="de"><head><meta content
Done
Child process returned
Fetch URL
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="de"><head><meta content
Done
Child process returned
Fetch URL
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="de"><head><meta content
Done
Child process returned
Fetch URL
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="de"><head><meta content
Done
Child process returned
Fetch URL
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="de"><head><meta content
Done
Child process returned
Fetch URL
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="de"><head><meta content
Done
Child process returned
That's the expected result, right?
It doesn't break my requests either:
alvas@ubi:~$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from multiprocessing import Process
>>> import requests
>>> from pprint import pprint
>>> Process(target=lambda: pprint(
... requests.get('https://api.github.com'))).start()
>>> <Response [200]>
>>> import nltk
>>> Process(target=lambda: pprint(
... requests.get('https://api.github.com'))).start()
>>> <Response [200]>
I'm using:
@Hiestaa I have the same problem. A helper file string_util.py imports nltk, but it isn't used by the main Python file, which uses the multiprocessing module to start a multi-process crawler. The symptom is that the child process dies immediately with no error message at all (not even an exception).
Commenting out every nltk-related import and function solved the problem.
Details:
OS: Yosemite 10.10.5
Python: 2.7.10
The request: fetching page content, first with urllib2, then switching to requests later.
This is a pretty serious bug, and I hope someone can dig in and fix it. Thanks!
I think this is a serious problem for anyone doing production-level NLP. We run several NLP pipelines with Rq (http://python-rq.org/) workers, and they die silently whenever they make a network call. Hoping for a fix soon. Thanks!
@sasinda: you might want to reach out on the nltk-dev mailing list and see whether you can get some attention on this issue there.
@sasinda I don't know exactly how Rq works, but in a production-level NLP project I worked around this problem with a launcher script that starts each process separately, in its own isolated Python interpreter. That way Python never has to fork, and nltk caused no silent crashes. Maybe that helps.
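For what it's worth, the launcher approach can be approximated with the subprocess module: each job runs in a brand-new interpreter, so nothing the parent imported (nltk included) is inherited via fork. A minimal sketch; in a real setup the -c snippet would be a worker script:

```python
import subprocess
import sys

# Run the "worker" in a fresh interpreter instead of forking.
# sys.executable keeps the child on the same Python as the parent.
result = subprocess.run(
    [sys.executable, "-c", "print('worker done')"],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())  # worker done
```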
It looks like the problem doesn't occur if you perform the import at function level.
In other words, this works:
def split(words):
    import nltk
    return nltk.word_tokenize(words)
while this doesn't:
import nltk

def split(words):
    return nltk.word_tokenize(words)
Thanks @mpenkov. Does that mean the problem is resolved?
@stevenbird I don't think so. It's a workaround, not a fix.
IMHO, if importing our library corrupts a component of the Python standard library, then something inappropriate is happening somewhere, and it needs to be fixed.
@mpenkov I'm not entirely sure why this works, but here is another workaround I found: building an opener in the parent process appears to fix the problem. Using @oxymor0n's example code:
from multiprocessing import Process
import nltk
import time
import urllib2

# HACK
urllib2.build_opener(urllib2.HTTPHandler())

def child_fn():
    print "Fetch URL"
    import urllib2
    print urllib2.urlopen("https://www.google.com").read()[:100]
    print "Done"

while True:
    child_process = Process(target=child_fn)
    child_process.start()
    child_process.join()
    print "Child process returned"
    time.sleep(1)
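A guess at why this hack helps: on macOS, urllib's proxy autodetection goes through the system configuration framework (the _scproxy module), which is not safe to call in a forked child; build_opener() in the parent makes that lookup happen once, before any fork. A related workaround I've seen for the macOS fork problem (a sketch in Python 3, not verified in this thread) is to disable proxy autodetection entirely via the no_proxy environment variable:

```python
import os
from multiprocessing import Process

# Opt out of proxy autodetection before any child process starts,
# so a forked child never calls into macOS's _scproxy.
os.environ["no_proxy"] = "*"

def child_fn():
    from urllib.request import getproxies
    getproxies()  # resolved from the environment, no system lookup
    print("Done")

if __name__ == "__main__":
    p = Process(target=child_fn)
    p.start()
    p.join()
    print("exit code:", p.exitcode)  # exit code: 0
```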
@mpenkov @ninowalker @oxymor0n @sasinda @wenbowang are you all still facing the same problem?
I can't replicate the problem on my machine.
from multiprocessing import Process
import nltk
import time

def child_fn():
    print "Fetch URL"
    import urllib2
    print urllib2.urlopen("https://www.google.com").read()[:100]
    print "Done"

while True:
    child_process = Process(target=child_fn)
    child_process.start()
    child_process.join()
    print "Child process returned"
    time.sleep(1)
gives me:
Fetch URL
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en-SG"><head><meta cont
Done
Child process returned
Fetch URL
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en-SG"><head><meta cont
Done
Child process returned
Fetch URL
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" lang="en-SG"><head><meta cont
Done
(and so on)
@alvations it's been a long time since I found this problem.
I've since moved off that project's codebase, so I can't tell you whether the problem still exists.
Sorry!
@alvations I don't remember which project was bitten by this particular problem.
I ran your code on my machine but could not replicate the issue.
Python 2.7.12
nltk 3.2.1
macOS 10.12.6
@alvations I'm no longer working on that project. I did use one of these workarounds, though.
I tried your code, but the child process still exits with a segmentation fault (exit code 11) at urllib2.urlopen("https://www.google.com").read()[:100].
It did work with urllib3 (https://urllib3.readthedocs.io/en/latest/), though.
As far as I can tell, this problem seems to affect macOS. Using Python 3.6, here is the OP's script modified for python3:
from multiprocessing import Process
import nltk
import time

def child_fn():
    from urllib.request import urlopen
    print("Fetch URL")
    print(urlopen("https://www.google.com").read()[:100])
    print("Done")

child_process = Process(target=child_fn)
child_process.start()
child_process.join()
print("Child process returned")
time.sleep(1)
Output:
Fetch URL
Child process returned
The child process dies unexpectedly, giving output similar to what's shown in the Stack Overflow post.
I find this rather unsettling. It may be related to threading on macOS.
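One thing that might be worth trying here (I haven't verified it against this exact crash): Python 3.4+ can avoid fork() altogether with the spawn start method, which starts each child as a fresh interpreter and therefore sidesteps fork-unsafety in anything the parent imported:

```python
import multiprocessing as mp

def child_fn():
    # A spawned child starts from a clean interpreter: the parent's
    # imports (nltk included) are not inherited through fork().
    print("child ok")

if __name__ == "__main__":
    mp.set_start_method("spawn")
    p = mp.Process(target=child_fn)
    p.start()
    p.join()
    print("exit code:", p.exitcode)  # exit code: 0
```

Note that spawn requires the target to be importable from the main module, hence the __main__ guard.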
I'm not familiar enough with nltk to say anything with confidence, but I hacked around a bit to figure out what makes the test pass/fail. The changes I had to make to the package's __init__.py to get the test to pass are as follows.
Details:
###########################################################
# TOP-LEVEL MODULES
###########################################################
# Import top-level functionality into top-level namespace
from nltk.collocations import *
from nltk.decorators import decorator, memoize
# from nltk.featstruct import *
# from nltk.grammar import *
from nltk.probability import *
from nltk.text import *
# from nltk.tree import *
from nltk.util import *
from nltk.jsontags import *
# ###########################################################
# # PACKAGES
# ###########################################################
# from nltk.chunk import *
# from nltk.classify import *
# from nltk.inference import *
from nltk.metrics import *
# from nltk.parse import *
# from nltk.tag import *
from nltk.tokenize import *
from nltk.translate import *
# from nltk.sem import *
# from nltk.stem import *
# Packages which can be lazily imported
# (a) we don't import *
# (b) they're slow to import or have run-time dependencies
# that can safely fail at run time
from nltk import lazyimport
app = lazyimport.LazyModule('nltk.app', locals(), globals())
chat = lazyimport.LazyModule('nltk.chat', locals(), globals())
corpus = lazyimport.LazyModule('nltk.corpus', locals(), globals())
draw = lazyimport.LazyModule('nltk.draw', locals(), globals())
toolbox = lazyimport.LazyModule('nltk.toolbox', locals(), globals())
# Optional loading
try:
import numpy
except ImportError:
pass
else:
from nltk import cluster
# from nltk.downloader import download, download_shell
# try:
# from six.moves import tkinter
# except ImportError:
# pass
# else:
# try:
# from nltk.downloader import download_gui
# except RuntimeError as e:
# import warnings
# warnings.warn("Corpus downloader GUI not loaded "
# "(RuntimeError during import: %s)" % str(e))
# explicitly import all top-level modules (ensuring
# they override the same names inadvertently imported
# from a subpackage)
# from nltk import ccg, chunk, classify, collocations
# from nltk import data, featstruct, grammar, help, inference, metrics
# from nltk import misc, parse, probability, sem, stem, wsd
# from nltk import tag, tbl, text, tokenize, translate, tree, treetransforms, util
Interestingly, all of the disabled imports ultimately trace back to tkinter, which suggests it's the root cause: swap import nltk for import tkinter and you get a very similar crash report referencing tkinter. As far as I can tell, these packages import tkinter directly:
nltk.app
nltk.draw
nltk.sem
From the base-package __init__ changes above, here are the problematic imports and how each traces to a tkinter import:

nltk.featstruct (sem)
nltk.grammar (featstruct)
nltk.tree (grammar)
nltk.chunk (chunk.named_entity > tree)
nltk.parse (parse.bllip > tree)
nltk.tag (tag.stanford > parse)
nltk.classify (classify.senna > tag)
nltk.inference (inference.discourse > sem, tag)
nltk.stem (stem.snowball > corpus > corpus.reader.timit > tree)

Thanks @rpkilby, that's very helpful!
It looks like this issue: https://stackoverflow.com/questions/16745507/tkinter-how-to-use-threads-to-preventing-main-event-loop-from-freezing
I think tkinter has been a pain point for us for quite a while. It would be great if we could find a workaround.
I agree. A short-term solution is to bury the tkinter imports inside the classes and methods that need them, and to avoid importing them from programs that don't need them. We already made a similar fix for numpy.
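That short-term fix would look roughly like this (a sketch; plot_frequencies is a hypothetical function, not real nltk code):

```python
import sys

def plot_frequencies(freqs):
    # Deferred import: tkinter is pulled in only when a GUI is
    # actually requested, so merely importing this module stays
    # safe in forked workers.
    import tkinter
    root = tkinter.Tk()
    # ... draw freqs here ...
    root.mainloop()

# Importing the module / defining the function costs nothing:
print("tkinter" in sys.modules)  # False
```

Callers that never touch the GUI path never pay for (or crash on) tkinter.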