Nltk: Unclosed file in the stopwords corpus

Created on January 3, 2018  ·  11 comments  ·  Source: nltk/nltk

/Users/kiddo/anaconda/lib/python3.6/site-packages/nltk/corpus/reader/wordlist.py:28: ResourceWarning: unclosed file <_io.BufferedReader i = "4">
  return concat([self.open(f).read() for f in fileids])

This is a warning I found while running in debug mode. I thought you would want to fix it before the next release.

bug corpus enhancement goodfirstbug pythonic

All 11 comments

Hi, may I take this issue?

@iliaschalkidis @alvations How can I reproduce the warning on Linux?

@sks4903440 Using version 3.2.5, you can run the following script from the command line.

test.py

import warnings
import nltk
warnings.filterwarnings('error', category=ResourceWarning)
stop_words = nltk.corpus.stopwords.words('english')

$ python test.py

You should get the following:

ResourceWarning: unclosed file <_io.BufferedReader name='/Users/kiddo/nltk_data/corpora/stopwords/english'>
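
For anyone who wants to see the same kind of warning without nltk installed, here is a minimal sketch using only the standard library. The helper names (`count_resource_warnings`, `leaky_read`) and the temporary file are made up for illustration and are not part of nltk. It records warnings instead of turning them into errors, and it assumes CPython-style finalization (the `gc.collect()` call is there to cover interpreters that finalize lazily).

```python
import gc
import os
import tempfile
import warnings


def count_resource_warnings(action):
    """Run `action` and return how many ResourceWarnings were recorded."""
    with warnings.catch_warnings(record=True) as caught:
        # ResourceWarning is ignored by default, so enable everything here.
        warnings.simplefilter("always")
        action()
        gc.collect()  # make sure dropped file objects are finalized
    return sum(issubclass(w.category, ResourceWarning) for w in caught)


def leaky_read(path):
    """Hypothetical stand-in for a reader that never closes its handle."""
    return open(path, "rb").read()


# A throwaway file stands in for a corpus file (hypothetical data).
fd, sample_path = tempfile.mkstemp()
os.write(fd, b"i\nme\nmy\n")
os.close(fd)

leaky_count = count_resource_warnings(lambda: leaky_read(sample_path))
os.remove(sample_path)
```

Running a script with `python -W error::ResourceWarning` or `python -X dev` are other ways to surface these warnings from the command line.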

Fixed in #1945

Hmmm... Having StreamCorpusReader inherit from io.BufferedReader is an interesting solution, but properly closing the files inside a `with` context manager scope might be the better one.
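
As a rough sketch of the `with` approach (not the actual nltk code), a read-everything helper could close each handle as soon as it has been read. `read_all` and the temporary files below are hypothetical stand-ins for the corpus reader and its files.

```python
import os
import tempfile


def read_all(paths):
    """Concatenate file contents, closing each handle as soon as it is read."""
    chunks = []
    for path in paths:
        # The `with` block closes the file even if read() raises.
        with open(path, "rb") as stream:
            chunks.append(stream.read())
    return b"".join(chunks)


# Two throwaway files stand in for corpus files (hypothetical data).
paths = []
for payload in (b"i\nme\n", b"my\nmyself\n"):
    fd, path = tempfile.mkstemp()
    os.write(fd, payload)
    os.close(fd)
    paths.append(path)

combined = read_all(paths)

for path in paths:
    os.remove(path)
```

This only works when the whole file is consumed in one call. It does not help readers that hand an open stream back to the caller.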

I also think Python 3.6 has some special requirements for files that differ from earlier versions. We should read the CPython changelog to make sure that what we are doing is not just a band-aid =)

@alvations Using `with` is definitely a good idea, and I will try to incorporate it. I did not use it because in CPython a file is closed automatically once its reference count drops to zero. Also, for the `with` statement to work, the reader has to either inherit from io.BufferedReader or implement the __enter__ and __exit__ methods. Which do you think is better?
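
To make the second option concrete, here is a minimal sketch of a wrapper that implements `__enter__` and `__exit__` itself instead of inheriting from io.BufferedReader. `ClosingStreamReader` is a hypothetical name, not an nltk class, and an in-memory `io.BytesIO` stands in for a real file.

```python
import io


class ClosingStreamReader:
    """Hypothetical wrapper that adds context-manager support to a stream."""

    def __init__(self, stream):
        self.stream = stream

    def read(self, size=-1):
        return self.stream.read(size)

    def close(self):
        self.stream.close()

    @property
    def closed(self):
        return self.stream.closed

    def __enter__(self):
        # Return the wrapper itself so `with ... as reader` yields it.
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Always close the underlying stream; do not swallow exceptions.
        self.close()
        return False


with ClosingStreamReader(io.BytesIO(b"i\nme\nmy\n")) as reader:
    data = reader.read()
```

Composition keeps the wrapper's own seek and tell logic separate from the underlying stream's, which may matter for the inheritance concern raised in this thread.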

~ I don't think we need to implement the enter / exit methods, because we would not inherit from BufferedReader. We would open and close through the context and then let the io module handle gc (garbage collection). ~

This is tricky. io.BufferedReader already has seek()-like functions, and I am not sure exactly what SeekableUnicodeStreamReader would pick up from BufferedReader when it inherits without calling the super __init__().

And actually we can't really wrap read() inside a `with`, because the seek and tell functions would stop working unless we hack the buffer inside the `with` context. Hmm...
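
The problem can be shown with a small standard-library sketch: once a `with` block has closed the stream, later seek() calls fail. `seek_after_with` is a hypothetical helper, and an `io.BufferedReader` over `io.BytesIO` stands in for a corpus file.

```python
import io


def seek_after_with(data):
    """Read inside a `with` block, then try to seek after it has closed."""
    with io.BufferedReader(io.BytesIO(data)) as stream:
        first = stream.read(3)
        position = stream.tell()  # works while the stream is still open
    try:
        stream.seek(0)  # the stream was closed when the block exited
    except ValueError as error:
        return first, position, str(error)
    return first, position, None


first, position, error_message = seek_after_with(b"abcdef")
```

So a reader that must support seek() and tell() across calls cannot simply open and close the file inside each read(). It would need to reopen the file and restore its position, or keep the handle open and close it explicitly.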

Any news on this? Python 3.6 still complains about NLTK 3.3 for almost every resource.

/home/user/py36/lib/python3.6/site-packages/nltk/corpus/reader/wordnet.py:1107: ResourceWarning: unclosed file <_io.BufferedReader name='/home/user/nltk_data/corpora/wordnet/lexnames'>
  for i, line in enumerate(self.open('lexnames')):
/home/user/py36/lib/python3.6/site-packages/nltk/corpus/reader/wordnet.py:1159: ResourceWarning: unclosed file <_io.BufferedReader name='/home/user/nltk_data/corpora/wordnet/index.adj'>
  for i, line in enumerate(self.open('index.%s' % suffix)):
/home/user/py36/lib/python3.6/site-packages/nltk/corpus/reader/wordnet.py:1159: ResourceWarning: unclosed file <_io.BufferedReader name='/home/user/nltk_data/corpora/wordnet/index.adv'>
  for i, line in enumerate(self.open('index.%s' % suffix)):
/home/user/py36/lib/python3.6/site-packages/nltk/corpus/reader/wordnet.py:1159: ResourceWarning: unclosed file <_io.BufferedReader name='/home/user/nltk_data/corpora/wordnet/index.noun'>
  for i, line in enumerate(self.open('index.%s' % suffix)):
/home/user/py36/lib/python3.6/site-packages/nltk/corpus/reader/wordnet.py:1159: ResourceWarning: unclosed file <_io.BufferedReader name='/home/user/nltk_data/corpora/wordnet/index.verb'>
  for i, line in enumerate(self.open('index.%s' % suffix)):
/home/user/py36/lib/python3.6/site-packages/nltk/corpus/reader/wordnet.py:1209: ResourceWarning: unclosed file <_io.BufferedReader name='/home/user/nltk_data/corpora/wordnet/adj.exc'>
  for line in self.open('%s.exc' % suffix):
/home/user/py36/lib/python3.6/site-packages/nltk/corpus/reader/wordnet.py:1209: ResourceWarning: unclosed file <_io.BufferedReader name='/home/user/nltk_data/corpora/wordnet/adv.exc'>
  for line in self.open('%s.exc' % suffix):
/home/user/py36/lib/python3.6/site-packages/nltk/corpus/reader/wordnet.py:1209: ResourceWarning: unclosed file <_io.BufferedReader name='/home/user/nltk_data/corpora/wordnet/noun.exc'>
  for line in self.open('%s.exc' % suffix):
/home/user/py36/lib/python3.6/site-packages/nltk/corpus/reader/wordnet.py:1209: ResourceWarning: unclosed file <_io.BufferedReader name='/home/user/nltk_data/corpora/wordnet/verb.exc'>

Proposed fix: https://github.com/nltk/nltk/pull/2165

This is another issue that appears to be done, so it would be good to close it.

Thanks to everyone who raised the issue, and for the fix
