Nltk: find_concordance() returns an empty list for left_context

Created on 20 Aug 2018 · 4 comments · Source: nltk/nltk

if offsets:
    for i in offsets:
        query_word = self._tokens[i]
        # Find the context of query word.
        left_context = self._tokens[i-context:i]

When the first occurrence of the search word is near the beginning of the text (for example, at offset 7) and the width parameter is set to 20, [i-context:i] is evaluated as [-13:7].
In that case, if the text contains more than 20 words, the variable left_context will be an empty list instead of a list containing the first 7 words of the text.
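
To illustrate why the slice comes back empty, here is a minimal sketch in plain Python. The `tokens` list is a hypothetical stand-in for `self._tokens`, and the values of `i` and `context` are taken from the example above; nothing here calls NLTK itself.

```python
# Hypothetical stand-in for self._tokens: 30 placeholder tokens.
tokens = ["w{}".format(n) for n in range(30)]
i, context = 7, 20

# i - context is -13. A negative start index counts from the end of the
# list, so the slice starts at index 17 and stops at index 7: it is empty.
buggy_left = tokens[i - context:i]

# What the caller actually wants: every token before position i.
expected_left = tokens[:i]

print(buggy_left)     # []
print(expected_left)  # ['w0', 'w1', 'w2', 'w3', 'w4', 'w5', 'w6']
```

The slice raises no error, which is probably why the problem only shows up as missing left context in the output.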

A simple fix would be:

if offsets:
    for i in offsets:
        query_word = self._tokens[i]
        # Find the context of query word.
        if i - context < 0:
            left_context = self._tokens[:i]
        else:
            left_context = self._tokens[i-context:i]

bug corpus goodfirstbug

BLKSerene

All 4 comments

Could you provide a sample input and the desired output so that we can add them to the regression tests?

alvations 20 Aug 2018

Input:

import nltk

jane_eyre = 'Chapter 1\nTHERE was no possibility of taking a walk that day. We had been wandering, indeed, in the leafless shrubbery an hour in the morning; but since dinner (Mrs. Reed, when there was no company, dined early) the cold winter wind had brought with it clouds so sombre, and a rain so penetrating, that further outdoor exercise was now out of the question.'
text = nltk.Text(nltk.word_tokenize(jane_eyre))
text.concordance('taking')
text.concordance_list('taking')[0]

Output (NLTK 3.3):

Displaying 1 of 1 matches:
    taking a walk that day . We had been wander
ConcordanceLine(left=[],
                query='taking',
                right=['a', 'walk', 'that', 'day', '.', 'We', 'had', 'been', 'wandering', ',', 'indeed', ',', 'in', 'the', 'leafless', 'shrubbery', 'an', 'hour'],
                offset=7,
                left_print='',
                right_print='a walk that day . We had been wande',
                line=' taking a walk that day . We had been wande')

Desired output:

Displaying 1 of 1 matches:
    Chapter 1 THERE was no possibility of taking a walk that day . We had been wander
ConcordanceLine(left=['Chapter', '1', 'THERE', 'was', 'no', 'possibility', 'of'],
                query='taking',
                right=['a', 'walk', 'that', 'day', '.', 'We', 'had', 'been', 'wandering', ',', 'indeed', ',', 'in', 'the', 'leafless', 'shrubbery', 'an', 'hour'],
                offset=7,
                left_print='Chapter 1 THERE was no possibility of',
                right_print='a walk that day . We had been wande',
                line='Chapter 1 THERE was no possibility of taking a walk that day . We had been wande')

BLKSerene 21 Aug 2018

Thanks @BLKSerene for reporting the bug!

Ah, there's a nicer fix here. Instead of the if-else, we can clip the lower bound with max(), e.g.

left_context = self._tokens[max(0, i-context):i]
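
As a quick sanity check that the max() version behaves the same as the if-else version, here is a small sketch. The two helper functions and the `tokens` list are hypothetical names introduced only for this comparison; they mirror the two snippets in this thread and are not part of NLTK.

```python
def left_context_ifelse(tokens, i, context):
    """Left context using the explicit if-else guard (hypothetical helper)."""
    if i - context < 0:
        return tokens[:i]
    return tokens[i - context:i]


def left_context_clipped(tokens, i, context):
    """Left context with the lower bound clipped to 0 (hypothetical helper)."""
    return tokens[max(0, i - context):i]


# Hypothetical token list; integers keep the comparison easy to read.
tokens = list(range(50))

# Compare both versions for every offset and a range of context sizes,
# including contexts larger than the whole text.
agree = all(
    left_context_ifelse(tokens, i, c) == left_context_clipped(tokens, i, c)
    for i in range(len(tokens))
    for c in range(1, 60)
)

print(agree)                                # True
print(left_context_clipped(tokens, 7, 20))  # [0, 1, 2, 3, 4, 5, 6]
```

Both forms return the same slices; the max() version just avoids the branch.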

It would be very helpful to add a doctest to https://github.com/nltk/nltk/blob/develop/nltk/test/concordance.doctest for continuous integration/regression testing =)

Patching https://github.com/nltk/nltk/issues/2088
The left slice of the left context should be clipped to 0 if `i-context` < 0.

>>> from nltk import Text, word_tokenize
>>> jane_eyre = 'Chapter 1\nTHERE was no possibility of taking a walk that day. We had been wandering, indeed, in the leafless shrubbery an hour in the morning; but since dinner (Mrs. Reed, when there was no company, dined early) the cold winter wind had brought with it clouds so sombre, and a rain so penetrating, that further outdoor exercise was now out of the question.'
>>> text = Text(word_tokenize(jane_eyre))
>>> text.concordance_list('taking')[0].left
['Chapter', '1', 'THERE', 'was', 'no', 'possibility', 'of']

alvations 23 Aug 2018

Patched in #2103. Thanks @BLKSerene and @dnc1994!

alvations 19 Sep 2018
