nltk - Update various regex escape sequences

If no one is working on this, I'd like to take it. Could you tell me the steps to reproduce the problem?

PabloDino on 31 Aug 2019

@PabloDino Install Python 3.6.8 or later and try importing each module. Fix the regexes either by using raw strings or by using proper escapes, so that they work on both Python 2 and 3.
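
A minimal sketch of the two fixes, taking `\d+$` as the example pattern (the sample input `x12` is made up for illustration). The raw string and the doubled backslash yield the same pattern text, so either spelling should avoid the warning:

```python
import re

# Two warning-free spellings of the regex \d+$ (the sample input is made up).
RAW = r"\d+$"        # raw string: the backslash is kept literally
ESCAPED = "\\d+$"    # regular string: the backslash is escaped explicitly

# Both literals contain exactly the same four characters.
assert RAW == ESCAPED

# Strip trailing digits from a hypothetical variable name.
stripped = re.sub(RAW, "", "x12")
print(stripped)
```

Raw strings are usually the more readable choice for regexes; the doubled backslash is mainly useful when the literal also needs real escapes such as `\n`.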

pombredanne on 31 Aug 2019

I've been working through some exercises but don't see any warnings. Could you post a code snippet that manifests the warning, please?

PabloDino on 3 Sep 2019

@PabloDino :

$ python --version
Python 3.6.8
$ git clone git://github.com/nltk/nltk.git
$ pip install pytest
$ pytest -vvs nltk/ --collect-only

========================================= warnings summary =========================================
nltk/nltk/featstruct.py:1295
  /home/pombreda/tmp/nl/nltk/nltk/featstruct.py:1295: DeprecationWarning: invalid escape sequence \d
    name, n = re.sub("\d+$", "", var.name), 2

nltk/nltk/featstruct.py:2091
  /home/pombreda/tmp/nl/nltk/nltk/featstruct.py:2091: DeprecationWarning: invalid escape sequence \d
    RANGE_RE = re.compile("(-?\d+):(-?\d+)")

nltk/nltk/sem/evaluate.py:307
  /home/pombreda/tmp/nl/nltk/nltk/sem/evaluate.py:307: DeprecationWarning: invalid escape sequence \ 
    """

nltk/nltk/sem/relextract.py:128
  /home/pombreda/tmp/nl/nltk/nltk/sem/relextract.py:128: DeprecationWarning: invalid escape sequence \w
    ENT = re.compile("&(\w+?);")

nltk/nltk/sem/relextract.py:407
  /home/pombreda/tmp/nl/nltk/nltk/sem/relextract.py:407: DeprecationWarning: invalid escape sequence \s
    """

nltk/nltk/sem/boxer.py:776
  /home/pombreda/tmp/nl/nltk/nltk/sem/boxer.py:776: DeprecationWarning: invalid escape sequence \d
    assert re.match("^[exps]\d+$", var), var

nltk/nltk/sem/drt.py:716
  /home/pombreda/tmp/nl/nltk/nltk/sem/drt.py:716: DeprecationWarning: invalid escape sequence \ 
    + [" \  " + blank + line for line in term_lines[1:2]]

nltk/nltk/sem/drt.py:717
  /home/pombreda/tmp/nl/nltk/nltk/sem/drt.py:717: DeprecationWarning: invalid escape sequence \ 
    + [" /\ " + var_string + line for line in term_lines[2:3]]

nltk/nltk/grammar.py:1291
  /home/pombreda/tmp/nl/nltk/nltk/grammar.py:1291: DeprecationWarning: invalid escape sequence \*
    """

nltk/nltk/grammar.py:1463
  /home/pombreda/tmp/nl/nltk/nltk/grammar.py:1463: DeprecationWarning: invalid escape sequence \w
    _STANDARD_NONTERM_RE = re.compile("( [\w/][\w/^<>-]* ) \s*", re.VERBOSE)

nltk/nltk/text.py:650
  /home/pombreda/tmp/nl/nltk/nltk/text.py:650: DeprecationWarning: invalid escape sequence \w
    _CONTEXT_RE = re.compile("\w+|[\.\!\?]")

nltk/nltk/tokenize/punkt.py:1462
  /home/pombreda/tmp/nl/nltk/nltk/tokenize/punkt.py:1462: DeprecationWarning: invalid escape sequence \s
    pat = "\s*".join(re.escape(c) for c in tok)

nltk/nltk/tokenize/regexp.py:100
  /home/pombreda/tmp/nl/nltk/nltk/tokenize/regexp.py:100: DeprecationWarning: invalid escape sequence \w
    """

nltk/nltk/tokenize/regexp.py:193
  /home/pombreda/tmp/nl/nltk/nltk/tokenize/regexp.py:193: DeprecationWarning: invalid escape sequence \w
    """

nltk/nltk/tokenize/repp.py:133
  /home/pombreda/tmp/nl/nltk/nltk/tokenize/repp.py:133: DeprecationWarning: invalid escape sequence \(
    line_regex = re.compile("^\((\d+), (\d+), (.+)\)$", re.MULTILINE)

nltk/nltk/tokenize/texttiling.py:96
  /home/pombreda/tmp/nl/nltk/nltk/tokenize/texttiling.py:96: DeprecationWarning: invalid escape sequence \-
    c for c in lowercase_text if re.match("[a-z\-' \n\t]", c)

nltk/nltk/tokenize/texttiling.py:229
  /home/pombreda/tmp/nl/nltk/nltk/tokenize/texttiling.py:229: DeprecationWarning: invalid escape sequence \w
    matches = re.finditer("\w+", text)

nltk/nltk/tokenize/toktok.py:53
  /home/pombreda/tmp/nl/nltk/nltk/tokenize/toktok.py:53: DeprecationWarning: invalid escape sequence \]
    FUNKY_PUNCT_1 = re.compile(u'([،;؛¿!"\])}»›”؟¡%٪°±©®।॥…])'), r" \1 "

nltk/nltk/tokenize/toktok.py:55
  /home/pombreda/tmp/nl/nltk/nltk/tokenize/toktok.py:55: DeprecationWarning: invalid escape sequence \[
    FUNKY_PUNCT_2 = re.compile(u"([({\[“‘„‚«‹「『])"), r" \1 "

nltk/nltk/tokenize/toktok.py:62
  /home/pombreda/tmp/nl/nltk/nltk/tokenize/toktok.py:62: DeprecationWarning: invalid escape sequence \|
    PIPE = re.compile("\|"), " &#124; "

nltk/nltk/tokenize/treebank.py:269
  /home/pombreda/tmp/nl/nltk/nltk/tokenize/treebank.py:269: DeprecationWarning: invalid escape sequence \]
    """

nltk/nltk/tokenize/treebank.py:273
  /home/pombreda/tmp/nl/nltk/nltk/tokenize/treebank.py:273: DeprecationWarning: invalid escape sequence \s
    re.compile(pattern.replace("(?#X)", "\s"))

nltk/nltk/tokenize/treebank.py:277
  /home/pombreda/tmp/nl/nltk/nltk/tokenize/treebank.py:277: DeprecationWarning: invalid escape sequence \s
    re.compile(pattern.replace("(?#X)", "\s"))

nltk/nltk/tree.py:99
  /home/pombreda/tmp/nl/nltk/nltk/tree.py:99: DeprecationWarning: invalid escape sequence \ 
    """

nltk/nltk/tree.py:652
  /home/pombreda/tmp/nl/nltk/nltk/tree.py:652: DeprecationWarning: invalid escape sequence \s
    if re.search("\s", brackets):

nltk/nltk/tree.py:658
  /home/pombreda/tmp/nl/nltk/nltk/tree.py:658: DeprecationWarning: invalid escape sequence \s
    node_pattern = "[^\s%s%s]+" % (open_pattern, close_pattern)

nltk/nltk/tree.py:660
  /home/pombreda/tmp/nl/nltk/nltk/tree.py:660: DeprecationWarning: invalid escape sequence \s
    leaf_pattern = "[^\s%s%s]+" % (open_pattern, close_pattern)

nltk/nltk/tree.py:662
  /home/pombreda/tmp/nl/nltk/nltk/tree.py:662: DeprecationWarning: invalid escape sequence \s
    "%s\s*(%s)?|%s|(%s)"

nltk/nltk/tree.py:900
  /home/pombreda/tmp/nl/nltk/nltk/tree.py:900: DeprecationWarning: invalid escape sequence \$
    reserved_chars = re.compile("([#\$%&~_\{\}])")

nltk/nltk/parse/chart.py:1034
  /home/pombreda/tmp/nl/nltk/nltk/parse/chart.py:1034: DeprecationWarning: invalid escape sequence \*
    """

nltk/nltk/parse/chart.py:1073
  /home/pombreda/tmp/nl/nltk/nltk/parse/chart.py:1073: DeprecationWarning: invalid escape sequence \*
    """

nltk/nltk/parse/chart.py:1128
  /home/pombreda/tmp/nl/nltk/nltk/parse/chart.py:1128: DeprecationWarning: invalid escape sequence \*
    """

nltk/nltk/parse/chart.py:1148
  /home/pombreda/tmp/nl/nltk/nltk/parse/chart.py:1148: DeprecationWarning: invalid escape sequence \*
    """

nltk/nltk/parse/chart.py:1218
  /home/pombreda/tmp/nl/nltk/nltk/parse/chart.py:1218: DeprecationWarning: invalid escape sequence \*
    """

nltk/nltk/parse/chart.py:1241
  /home/pombreda/tmp/nl/nltk/nltk/parse/chart.py:1241: DeprecationWarning: invalid escape sequence \*
    """

nltk/nltk/parse/featurechart.py:270
  /home/pombreda/tmp/nl/nltk/nltk/parse/featurechart.py:270: DeprecationWarning: invalid escape sequence \*
    """

nltk/nltk/parse/featurechart.py:369
  /home/pombreda/tmp/nl/nltk/nltk/parse/featurechart.py:369: DeprecationWarning: invalid escape sequence \*
    """

nltk/nltk/tag/sequential.py:730
  /home/pombreda/tmp/nl/nltk/nltk/tag/sequential.py:730: DeprecationWarning: invalid escape sequence \w
    elif re.match("\w+$", word):

nltk/nltk/tag/sequential.py:724
  /home/pombreda/tmp/nl/nltk/nltk/tag/sequential.py:724: DeprecationWarning: invalid escape sequence \W
    elif re.match("\W+$", word):

nltk/nltk/tag/sequential.py:722
  /home/pombreda/tmp/nl/nltk/nltk/tag/sequential.py:722: DeprecationWarning: invalid escape sequence \.
    if re.match("[0-9]+(\.[0-9]*)?|[0-9]*\.[0-9]+$", word):

nltk/nltk/classify/rte_classify.py:61
  /home/pombreda/tmp/nl/nltk/nltk/classify/rte_classify.py:61: DeprecationWarning: invalid escape sequence \w
    tokenizer = RegexpTokenizer("[\w.@:/]+|\w+|\$[\d.]+")

nltk/nltk/classify/maxent.py:1351
  /home/pombreda/tmp/nl/nltk/nltk/classify/maxent.py:1351: DeprecationWarning: invalid escape sequence \ 
    """

nltk/nltk/chunk/util.py:371
  /home/pombreda/tmp/nl/nltk/nltk/chunk/util.py:371: DeprecationWarning: invalid escape sequence \S
    _LINE_RE = re.compile("(\S+)\s+(\S+)\s+([IOB])-?(\S+)?")

nltk/nltk/chunk/util.py:517
  /home/pombreda/tmp/nl/nltk/nltk/chunk/util.py:517: DeprecationWarning: invalid escape sequence \w
    _IEER_TYPE_RE = re.compile('<b_\w+\s+[^>]*?type="(?P<type>\w+)"')

nltk/nltk/chunk/util.py:526
  /home/pombreda/tmp/nl/nltk/nltk/chunk/util.py:526: DeprecationWarning: invalid escape sequence \s
    for piece_m in re.finditer("<[^>]+>|[^\s<]+", s):

nltk/nltk/chunk/regexp.py:70
  /home/pombreda/tmp/nl/nltk/nltk/chunk/regexp.py:70: DeprecationWarning: invalid escape sequence \{
    _BRACKETS = re.compile("[^\{\}]+")

nltk/nltk/chunk/regexp.py:215
  /home/pombreda/tmp/nl/nltk/nltk/chunk/regexp.py:215: DeprecationWarning: invalid escape sequence \{
    s = re.sub("\{\}", "", s)

nltk/nltk/chunk/regexp.py:426
  /home/pombreda/tmp/nl/nltk/nltk/chunk/regexp.py:426: DeprecationWarning: invalid escape sequence \g
    RegexpChunkRule.__init__(self, regexp, "{\g<chunk>}", descr)

nltk/nltk/chunk/regexp.py:471
  /home/pombreda/tmp/nl/nltk/nltk/chunk/regexp.py:471: DeprecationWarning: invalid escape sequence \g
    RegexpChunkRule.__init__(self, regexp, "}\g<chink>{", descr)

nltk/nltk/chunk/regexp.py:510
  /home/pombreda/tmp/nl/nltk/nltk/chunk/regexp.py:510: DeprecationWarning: invalid escape sequence \{
    regexp = re.compile("\{(?P<chunk>%s)\}" % tag_pattern2re_pattern(tag_pattern))

nltk/nltk/chunk/regexp.py:511
  /home/pombreda/tmp/nl/nltk/nltk/chunk/regexp.py:511: DeprecationWarning: invalid escape sequence \g
    RegexpChunkRule.__init__(self, regexp, "\g<chunk>", descr)

nltk/nltk/chunk/regexp.py:575
  /home/pombreda/tmp/nl/nltk/nltk/chunk/regexp.py:575: DeprecationWarning: invalid escape sequence \g
    RegexpChunkRule.__init__(self, regexp, "\g<left>", descr)

nltk/nltk/chunk/regexp.py:708
  /home/pombreda/tmp/nl/nltk/nltk/chunk/regexp.py:708: DeprecationWarning: invalid escape sequence \{
    "(?P<left>%s)\{(?P<right>%s)"

nltk/nltk/chunk/regexp.py:714
  /home/pombreda/tmp/nl/nltk/nltk/chunk/regexp.py:714: DeprecationWarning: invalid escape sequence \g
    RegexpChunkRule.__init__(self, regexp, "{\g<left>\g<right>", descr)

nltk/nltk/chunk/regexp.py:778
  /home/pombreda/tmp/nl/nltk/nltk/chunk/regexp.py:778: DeprecationWarning: invalid escape sequence \}
    "(?P<left>%s)\}(?P<right>%s)"

nltk/nltk/chunk/regexp.py:784
  /home/pombreda/tmp/nl/nltk/nltk/chunk/regexp.py:784: DeprecationWarning: invalid escape sequence \g
    RegexpChunkRule.__init__(self, regexp, "\g<left>\g<right>}", descr)

nltk/nltk/chunk/regexp.py:896
nltk/nltk/chunk/regexp.py:896
  /home/pombreda/tmp/nl/nltk/nltk/chunk/regexp.py:896: DeprecationWarning: invalid escape sequence \{
    r"^((%s|<%s>)*)$" % ("([^\{\}<>]|\{\d+,?\}|\{\d*,\d+\})+", "[^\{\}<>]+")

nltk/nltk/chunk/regexp.py:1175
  /home/pombreda/tmp/nl/nltk/nltk/chunk/regexp.py:1175: DeprecationWarning: invalid escape sequence \.
    """

nltk/nltk/inference/discourse.py:44
  /home/pombreda/tmp/nl/nltk/nltk/inference/discourse.py:44: DeprecationWarning: invalid escape sequence \ 
    """

nltk/nltk/stem/lancaster.py:192
  /home/pombreda/tmp/nl/nltk/nltk/stem/lancaster.py:192: DeprecationWarning: invalid escape sequence \*
    valid_rule = re.compile("^[a-z]+\*?\d[a-z]*[>\.]?$")

nltk/nltk/stem/lancaster.py:225
  /home/pombreda/tmp/nl/nltk/nltk/stem/lancaster.py:225: DeprecationWarning: invalid escape sequence \*
    valid_rule = re.compile("^([a-z]+)(\*?)(\d)([a-z]*)([>\.]?)$")

nltk/nltk/stem/porter.py:177
  /home/pombreda/tmp/nl/nltk/nltk/stem/porter.py:177: DeprecationWarning: invalid escape sequence \m
    """

nltk/nltk/corpus/__init__.py:116
  /home/pombreda/tmp/nl/nltk/nltk/corpus/__init__.py:116: DeprecationWarning: invalid escape sequence \.
    ".*\.(test|train).*",

nltk/nltk/corpus/__init__.py:123
  /home/pombreda/tmp/nl/nltk/nltk/corpus/__init__.py:123: DeprecationWarning: invalid escape sequence \.
    ".*\.(test|train).*",

nltk/nltk/corpus/__init__.py:126
  /home/pombreda/tmp/nl/nltk/nltk/corpus/__init__.py:126: DeprecationWarning: invalid escape sequence \.
    crubadan = LazyCorpusLoader("crubadan", CrubadanCorpusReader, ".*\.txt")

nltk/nltk/corpus/__init__.py:128
  /home/pombreda/tmp/nl/nltk/nltk/corpus/__init__.py:128: DeprecationWarning: invalid escape sequence \.
    "dependency_treebank", DependencyCorpusReader, ".*\.dp", encoding="ascii"

nltk/nltk/corpus/__init__.py:311
  /home/pombreda/tmp/nl/nltk/nltk/corpus/__init__.py:311: DeprecationWarning: invalid escape sequence \.
    "timit", TimitTaggedCorpusReader, ".+\.tags", tagset="wsj", encoding="ascii"

nltk/nltk/corpus/__init__.py:335
  /home/pombreda/tmp/nl/nltk/nltk/corpus/__init__.py:335: DeprecationWarning: invalid escape sequence \.
    twitter_samples = LazyCorpusLoader("twitter_samples", TwitterCorpusReader, ".*\.json")

nltk/nltk/corpus/__init__.py:364
  /home/pombreda/tmp/nl/nltk/nltk/corpus/__init__.py:364: DeprecationWarning: invalid escape sequence \.
    wordnet_ic = LazyCorpusLoader("wordnet_ic", WordNetICCorpusReader, ".*\.dat")

nltk/nltk/corpus/__init__.py:374
  /home/pombreda/tmp/nl/nltk/nltk/corpus/__init__.py:374: DeprecationWarning: invalid escape sequence \.
    "frames/.*\.xml",

nltk/nltk/corpus/__init__.py:383
  /home/pombreda/tmp/nl/nltk/nltk/corpus/__init__.py:383: DeprecationWarning: invalid escape sequence \.
    "frames/.*\.xml",

nltk/nltk/corpus/__init__.py:392
  /home/pombreda/tmp/nl/nltk/nltk/corpus/__init__.py:392: DeprecationWarning: invalid escape sequence \.
    "frames/.*\.xml",

nltk/nltk/corpus/__init__.py:401
  /home/pombreda/tmp/nl/nltk/nltk/corpus/__init__.py:401: DeprecationWarning: invalid escape sequence \.
    "frames/.*\.xml",

nltk/nltk/corpus/reader/plaintext.py:62
  /home/pombreda/tmp/nl/nltk/nltk/corpus/reader/plaintext.py:62: DeprecationWarning: invalid escape sequence \.
    """

nltk/nltk/corpus/reader/util.py:635
  /home/pombreda/tmp/nl/nltk/nltk/corpus/reader/util.py:635: DeprecationWarning: invalid escape sequence \d
    if re.match("^\d+-\d+", line) is not None:

nltk/nltk/corpus/reader/util.py:859
  /home/pombreda/tmp/nl/nltk/nltk/corpus/reader/util.py:859: DeprecationWarning: invalid escape sequence \s
    if re.match("======+\s*$", line):

nltk/nltk/corpus/reader/api.py:77
  /home/pombreda/tmp/nl/nltk/nltk/corpus/reader/api.py:77: DeprecationWarning: invalid escape sequence \.
    m = re.match("(.*\.zip)/?(.*)$|", root)

nltk/nltk/corpus/reader/timit.py:165
  /home/pombreda/tmp/nl/nltk/nltk/corpus/reader/timit.py:165: DeprecationWarning: invalid escape sequence \.
    encoding = [(".*\.wav", None), (".*", encoding)]

nltk/nltk/corpus/reader/bracket_parse.py:214
  /home/pombreda/tmp/nl/nltk/nltk/corpus/reader/bracket_parse.py:214: DeprecationWarning: invalid escape sequence \.
    "alpino\.xml",

nltk/nltk/corpus/reader/xmldocs.py:232
  /home/pombreda/tmp/nl/nltk/nltk/corpus/reader/xmldocs.py:232: DeprecationWarning: invalid escape sequence \s
    _XML_TAG_NAME = re.compile("<\s*/?\s*([^\s>]+)")

nltk/nltk/toolbox.py:209
  /home/pombreda/tmp/nl/nltk/nltk/toolbox.py:209: DeprecationWarning: invalid escape sequence \_
    """

nltk/nltk/corpus/reader/bnc.py:29
  /home/pombreda/tmp/nl/nltk/nltk/corpus/reader/bnc.py:29: DeprecationWarning: invalid escape sequence \w
    """

nltk/nltk/corpus/reader/switchboard.py:113
  /home/pombreda/tmp/nl/nltk/nltk/corpus/reader/switchboard.py:113: DeprecationWarning: invalid escape sequence \w
    _UTTERANCE_RE = re.compile("(\w+)\.(\d+)\:\s*(.*)")

nltk/nltk/corpus/reader/childes.py:281
  /home/pombreda/tmp/nl/nltk/nltk/corpus/reader/childes.py:281: DeprecationWarning: invalid escape sequence \d
    m = re.match("P(\d+)Y(\d+)M?(\d?\d?)D?", age_year)

nltk/nltk/corpus/reader/framenet.py:2753
  /home/pombreda/tmp/nl/nltk/nltk/corpus/reader/framenet.py:2753: DeprecationWarning: invalid escape sequence \w
    """

nltk/nltk/corpus/reader/udhr.py:30
  /home/pombreda/tmp/nl/nltk/nltk/corpus/reader/udhr.py:30: DeprecationWarning: invalid escape sequence \-
    ("Abkhaz\-Cyrillic\+Abkh", "cp1251"),

nltk/nltk/corpus/reader/twitter.py:54
  /home/pombreda/tmp/nl/nltk/nltk/corpus/reader/twitter.py:54: DeprecationWarning: invalid escape sequence \.
    """

nltk/nltk/ccg/combinator.py:225
  /home/pombreda/tmp/nl/nltk/nltk/ccg/combinator.py:225: DeprecationWarning: invalid escape sequence \Y
    """

nltk/nltk/treetransforms.py:108
  /home/pombreda/tmp/nl/nltk/nltk/treetransforms.py:108: DeprecationWarning: invalid escape sequence \ 
    """
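
Not every entry above is a regex: several point at docstrings (the bare `"""` lines), at ASCII-art strings such as the ones in `drt.py`, or at replacement templates such as `"{\g<chunk>}"`. A hedged sketch of how each kind might be fixed follows; the function `draw` and the sample inputs are hypothetical and not taken from NLTK:

```python
import re


def draw():
    r"""Return a tiny tree drawing.

    The r prefix keeps backslashes such as \ and \* literal in the docstring.
    """
    # Plain (non-regex) string with a literal backslash: double it.
    return " /\\ "


# Replacement template with a named group reference: use a raw string.
wrapped = re.sub(r"(?P<chunk><NN>)", r"{\g<chunk>}", "<DT><NN>")

print(draw())
print(wrapped)
```

For docstrings, adding the `r` prefix leaves the rendered documentation unchanged, which is why it tends to be preferred over doubling every backslash.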

pombredanne on 7 Sep 2019

And FWIW: https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals

Unlike Standard C, all unrecognized escape sequences are left in the string unchanged, i.e., the backslash is left in the result. (This behavior is useful when debugging: if an escape sequence is mistyped, the resulting output is more easily recognized as broken.) It is also important to note that the escape sequences only recognized in string literals fall into the category of unrecognized escapes for bytes literals.

Changed in version 3.6: Unrecognized escape sequences produce a DeprecationWarning. In some future version of Python they will be a SyntaxError.
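
The behavior quoted above can be observed directly. This is a small sketch, assuming a Python 3 interpreter in which an unrecognized escape is still a warning rather than an error; the helper name `escape_warnings` is made up for this example. It compiles two one-line sources and records any warnings the compiler emits:

```python
import warnings

# Source text with backslash + d inside a normal (non-raw) string literal,
# and the same line again with a raw string literal.
bad_source = 'pattern = "\\d+"\n'
good_source = 'pattern = r"\\d+"\n'


def escape_warnings(source):
    """Compile source and return the messages of any warnings raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "exec")
    return [str(w.message) for w in caught]


print(escape_warnings(bad_source))
print(escape_warnings(good_source))
```

The warning is raised when the source is compiled, not when the regex is used, which is why cached bytecode can hide it on later runs.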

pombredanne on 7 Sep 2019

$ python --version
Python 3.6.7
$ pytest --version
This is pytest version 5.1.2, imported from /pytest.py
$ pytest -vvs nltk/ --collect-only
============================= test session starts ==============================
platform linux -- Python 3.6.7, pytest-5.1.2, py-1.8.0, pluggy-0.12.0 -- */python3
cachedir: .pytest_cache
rootdir: **/nltk
collected 381 items

Unit tests for nltk.compat.
See also nltk/test/compat.doctest.

Unit tests for nltk.metrics.aline

Test the Aline algorithm for aligning phonetic sequences

Test aline for computing the difference between two segments

Tests for the Brill tagger.

Test for bug https://github.com/nltk/nltk/issues/1597

    Ensures that curly bracket quantifiers can be used inside a chunk rule.
    This type of quantifier has been used for the supplementary example
    in http://www.nltk.org/book/ch07.html#exploring-text-corpora.

Unit tests for nltk.classify. See also: nltk/test/classify.doctest

Text constructed using: http://www.nltk.org/book/ch01.html

Mock test for the Stanford CoreNLP wrappers.

Corpus View Regression Tests

Class containing unit tests for nltk.metrics.agreement.Disagreement.

More advanced test, based on
http://www.agreestat.com/research_papers/onkrippendorffalpha.pdf

Same more advanced example, but with 1 rating removed.
Again, removal of that 1 rating should not matter.

Simple test, based on
https://github.com/foolswood/krippendorffs_alpha/raw/master/krippendorff.pdf.

Same simple test with 1 rating removed.
Removal of that rating should not matter: K-Apha ignores items with
only 1 rating.

Regression tests for json2csv() and json2csv_entities() in the Twitter
package.

Sanity check that file comparison is not giving false positives.

Unit tests for nltk.corpus.nombank

Tests for nltk.pos_tag

The following test performs a series of reads, seeks, and
tells, and checks that the results are consistent.

Unit tests for Senna

Unittest for nltk.classify.senna

Senna pipeline interface

Unittest for nltk.tag.senna

This unit test exercises the Snowball Arabic light stemmer.
This stemmer deals with prefixes and suffixes.

Test for bug https://github.com/nltk/nltk/issues/1581

    Ensures that 'oed' can be stemmed without throwing an error.
  <TestCaseFunction test_vocabulary_martin_mode>
    Tests all words from the test vocabulary provided by M Porter

    The sample vocabulary and output were sourced from:
    http://tartarus.org/martin/PorterStemmer/voc.txt
    http://tartarus.org/martin/PorterStemmer/output.txt
    and are linked to from the Porter Stemmer algorithm's homepage
    at
    http://tartarus.org/martin/PorterStemmer/
  <TestCaseFunction test_vocabulary_nltk_mode>
  <TestCaseFunction test_vocabulary_original_mode>

Unit tests for nltk.tgrep.

Class containing unit tests for nltk.tgrep.

Test error handling of undefined tgrep operators.

Test that comments are correctly filtered out of tgrep search
strings.

Test the Basic Examples from the TGrep2 manual.

Test labeled nodes.

    Test case from Emily M. Bender.
  <TestCaseFunction test_multiple_conjs>
    Test that multiple (3 or more) conjunctions of node relations are
    handled properly.
  <TestCaseFunction test_node_encoding>
    Test that tgrep search strings handles bytes and strs the same
    way.
  <TestCaseFunction test_node_nocase>
    Test selecting nodes using case insensitive node names.
  <TestCaseFunction test_node_noleaves>
    Test node name matching with the search_leaves flag set to False.
  <TestCaseFunction test_node_printing>
    Test that the tgrep print operator ' is properly ignored.
  <TestCaseFunction test_node_quoted>
    Test selecting nodes using quoted node names.
  <TestCaseFunction test_node_regex>
    Test regex matching on nodes.
  <TestCaseFunction test_node_regex_2>
    Test regex matching on nodes.
  <TestCaseFunction test_node_simple>
    Test a simple use of tgrep for finding nodes matching a given
    pattern.
  <TestCaseFunction test_node_tree_position>
    Test matching on nodes based on NLTK tree position.
  <TestCaseFunction test_rel_precedence>
    Test matching nodes based on precedence relations.
  <TestCaseFunction test_rel_sister_nodes>
    Test matching sister nodes in a tree.
  <TestCaseFunction test_tokenize_encoding>
    Test that tokenization handles bytes and strs the same way.
  <TestCaseFunction test_tokenize_examples>
    Test tokenization of the TGrep2 manual example patterns.
  <TestCaseFunction test_tokenize_link_types>
    Test tokenization of basic link types.
  <TestCaseFunction test_tokenize_macros>
    Test tokenization of macro definitions.
  <TestCaseFunction test_tokenize_node_labels>
    Test tokenization of labeled nodes.
  <TestCaseFunction test_tokenize_nodenames>
    Test tokenization of node names.
  <TestCaseFunction test_tokenize_quoting>
    Test tokenization of quoting.
  <TestCaseFunction test_tokenize_segmented_patterns>
    Test tokenization of segmented patterns.
  <TestCaseFunction test_tokenize_simple>
    Simple test of tokenization.
  <TestCaseFunction test_trailing_semicolon>
    Test that semicolons at the end of a tgrep2 search string won't
    cause a parse failure.
  <TestCaseFunction test_use_macros>
    Test defining and using tgrep2 macros.
  <TestCaseFunction tests_rel_dominance>
    Test matching nodes based on dominance relations.
  <TestCaseFunction tests_rel_indexed_children>
    Test matching nodes based on their index in their parent node.

Unit tests for nltk.tokenize.
See also nltk/test/tokenize.doctest

Test padding of asterisk for word tokenization.

Test padding of dotdot* for word tokenization.

Test a string that resembles a phone number but contains a newline

Test remove_handle() from casual.py with specially crafted edge cases

Test SyllableTokenizer tokenizer.

Test the Stanford Word Segmenter for Arabic (default config)

Test the Stanford Word Segmenter for Chinese (default config)

Test TreebankWordTokenizer.span_tokenize function

Test TweetTokenizer using words with special and accented characters.

Test word_tokenize function

Tests for static parts of the Twitter package

Tests that Twitter credentials information from file is handled correctly.

Default credentials file is identified

Default credentials file has been read correctly

Path to default credentials file is well-formed, given specified
subdir.

Setting subdir to empty path should raise an error.

Setting subdir to None should raise an error.

Test that environment variable has been read correctly.

Credentials file 'bad_oauth1-1.txt' is incomplete

First key in credentials file 'bad_oauth1-2.txt' is ill-formed

First key in credentials file 'bad_oauth1-2.txt' is ill-formed

Setting subdir to nonexistent directory should raise an error.

Defaults for authentication will fail since 'credentials.txt' is not
present in the default subdir, as read from os.environ['TWITTER'].

Credentials file 'foobar' cannot be found in default subdir.

Unit tests for nltk.corpus.wordnet
See also nltk/test/wordnet.doctest

Tests for NgramCounter that only involve lookup, no modification.

Unit tests for MLE ngram model.

MLE trigram model tests

Unit tests for Lidstone class

Unit tests for Laplace class

Using MLE model, generate some text.

tests Vocabulary Class

Tests for BLEU translation evaluation metric

Examples from the original BLEU paper
http://www.aclweb.org/anthology/P02-1040.pdf

Tests GDFA alignments

Testing GDFA with first 10 eflomal outputs from issue #1829
https://github.com/nltk/nltk/issues/1829

Tests for IBM Model 1 training methods

Tests for IBM Model 2 training methods

Tests for IBM Model 3 training methods

Tests for IBM Model 4 training methods

Tests for IBM Model 5 training methods

============================ no tests ran in 2.13s =============================
$

PabloDino on 9 Sep 2019

I see the same results as @pombredanne.

stevenbird on 12 Sep 2019

Hi, is @PabloDino still planning to work on this issue?

I was able to replicate @pombredanne's output and would like to work on fixing this issue.

ab-10 on 30 Sep 2019

Go ahead, I have not replicated it yet.

PabloDino on 12 Oct 2019

@ab-10 Were you able to fix those deprecation warnings?

gertjanwytynck on 16 Jan 2020

Updated list with Python 3.8, produced by running the command below:

find . -iname '*.py' | xargs -P 4 -I{} python3.8 -Wall -m py_compile {}

./nltk/chat/iesha.py:52: DeprecationWarning: invalid escape sequence \<
  "u think I can%2??! really?? kekeke \<_\<",
./nltk/tag/sequential.py:730: DeprecationWarning: invalid escape sequence \w
  elif re.match("\w+$", word):
./nltk/tag/sequential.py:724: DeprecationWarning: invalid escape sequence \W
  elif re.match("\W+$", word):
./nltk/tag/sequential.py:722: DeprecationWarning: invalid escape sequence \.
  if re.match("[0-9]+(\.[0-9]*)?|[0-9]*\.[0-9]+$", word):
./nltk/app/chunkparser_app.py:206: DeprecationWarning: invalid escape sequence \#
  "\t<regexp><\#><CD> # This is a comment...</regexp>\n"
./nltk/app/chunkparser_app.py:315: DeprecationWarning: invalid escape sequence \s
  grammar = re.sub("\n\s+", "\n", grammar)
./nltk/app/chunkparser_app.py:1061: DeprecationWarning: invalid escape sequence \w
  key=lambda t_w: re.match("\w+", t_w[0])
./nltk/app/chunkparser_app.py:1422: DeprecationWarning: invalid escape sequence \#
  "^\# Regexp Chunk Parsing Grammar[\s\S]*" "F-score:.*\n", "", grammar
./nltk/sem/cooper_storage.py:48: DeprecationWarning: invalid escape sequence \P
  """
./nltk/sem/relextract.py:128: DeprecationWarning: invalid escape sequence \w
  ENT = re.compile("&(\w+?);")
./nltk/sem/relextract.py:382: DeprecationWarning: invalid escape sequence \s
  roles = """
./nltk/sem/boxer.py:776: DeprecationWarning: invalid escape sequence \d
  assert re.match("^[exps]\d+$", var), var
./nltk/sem/drt.py:716: DeprecationWarning: invalid escape sequence \ 
  + [" \  " + blank + line for line in term_lines[1:2]]
./nltk/sem/drt.py:717: DeprecationWarning: invalid escape sequence \ 
  + [" /\ " + var_string + line for line in term_lines[2:3]]
./nltk/sem/chat80.py:9: DeprecationWarning: invalid escape sequence \P
  """
./nltk/sem/chat80.py:705: DeprecationWarning: invalid escape sequence \P
  template = "PropN[num=sg, sem=<\P.(P %s)>] -> '%s'\n"
./nltk/sem/evaluate.py:257: DeprecationWarning: invalid escape sequence \ 
  """
./nltk/corpus/reader/util.py:635: DeprecationWarning: invalid escape sequence \d
  if re.match("^\d+-\d+", line) is not None:
./nltk/corpus/reader/util.py:859: DeprecationWarning: invalid escape sequence \s
  if re.match("======+\s*$", line):
./nltk/corpus/reader/framenet.py:2748: DeprecationWarning: invalid escape sequence \w
  """
./nltk/corpus/reader/bracket_parse.py:215: DeprecationWarning: invalid escape sequence \.
  "alpino\.xml",
./nltk/corpus/reader/twitter.py:25: DeprecationWarning: invalid escape sequence \.
  """
./nltk/corpus/reader/xmldocs.py:232: DeprecationWarning: invalid escape sequence \s
  _XML_TAG_NAME = re.compile("<\s*/?\s*([^\s>]+)")
./nltk/corpus/reader/bnc.py:15: DeprecationWarning: invalid escape sequence \w
  """Corpus reader for the XML version of the British National Corpus.
./nltk/corpus/reader/udhr.py:30: DeprecationWarning: invalid escape sequence \-
  ("Abkhaz\-Cyrillic\+Abkh", "cp1251"),
./nltk/corpus/reader/timit.py:165: DeprecationWarning: invalid escape sequence \.
  encoding = [(".*\.wav", None), (".*", encoding)]
./nltk/corpus/reader/childes.py:281: DeprecationWarning: invalid escape sequence \d
  m = re.match("P(\d+)Y(\d+)M?(\d?\d?)D?", age_year)
./nltk/corpus/reader/plaintext.py:47: DeprecationWarning: invalid escape sequence \.
  """
./nltk/corpus/reader/switchboard.py:113: DeprecationWarning: invalid escape sequence \w
  _UTTERANCE_RE = re.compile("(\w+)\.(\d+)\:\s*(.*)")
./nltk/corpus/reader/api.py:77: DeprecationWarning: invalid escape sequence \.
  m = re.match("(.*\.zip)/?(.*)$|", root)
./nltk/corpus/__init__.py:116: DeprecationWarning: invalid escape sequence \.
  ".*\.(test|train).*",
./nltk/corpus/__init__.py:123: DeprecationWarning: invalid escape sequence \.
  ".*\.(test|train).*",
./nltk/corpus/__init__.py:126: DeprecationWarning: invalid escape sequence \.
  crubadan = LazyCorpusLoader("crubadan", CrubadanCorpusReader, ".*\.txt")
./nltk/corpus/__init__.py:128: DeprecationWarning: invalid escape sequence \.
  "dependency_treebank", DependencyCorpusReader, ".*\.dp", encoding="ascii"
./nltk/corpus/__init__.py:311: DeprecationWarning: invalid escape sequence \.
  "timit", TimitTaggedCorpusReader, ".+\.tags", tagset="wsj", encoding="ascii"
./nltk/corpus/__init__.py:335: DeprecationWarning: invalid escape sequence \.
  twitter_samples = LazyCorpusLoader("twitter_samples", TwitterCorpusReader, ".*\.json")
./nltk/corpus/__init__.py:364: DeprecationWarning: invalid escape sequence \.
  wordnet_ic = LazyCorpusLoader("wordnet_ic", WordNetICCorpusReader, ".*\.dat")
./nltk/corpus/__init__.py:374: DeprecationWarning: invalid escape sequence \.
  "frames/.*\.xml",
./nltk/corpus/__init__.py:383: DeprecationWarning: invalid escape sequence \.
  "frames/.*\.xml",
./nltk/corpus/__init__.py:392: DeprecationWarning: invalid escape sequence \.
  "frames/.*\.xml",
./nltk/corpus/__init__.py:401: DeprecationWarning: invalid escape sequence \.
  "frames/.*\.xml",
./nltk/text.py:650: DeprecationWarning: invalid escape sequence \w
  _CONTEXT_RE = re.compile("\w+|[\.\!\?]")
./nltk/inference/discourse.py:9: DeprecationWarning: invalid escape sequence \ 
  """
./nltk/tree.py:38: DeprecationWarning: invalid escape sequence \ 
  """
./nltk/tree.py:652: DeprecationWarning: invalid escape sequence \s
  if re.search("\s", brackets):
./nltk/tree.py:658: DeprecationWarning: invalid escape sequence \s
  node_pattern = "[^\s%s%s]+" % (open_pattern, close_pattern)
./nltk/tree.py:660: DeprecationWarning: invalid escape sequence \s
  leaf_pattern = "[^\s%s%s]+" % (open_pattern, close_pattern)
./nltk/tree.py:662: DeprecationWarning: invalid escape sequence \s
  "%s\s*(%s)?|%s|(%s)"
./nltk/tree.py:900: DeprecationWarning: invalid escape sequence \$
  reserved_chars = re.compile("([#\$%&~_\{\}])")
./nltk/ccg/combinator.py:220: DeprecationWarning: invalid escape sequence \Y
  """
./nltk/tokenize/toktok.py:53: DeprecationWarning: invalid escape sequence \]
  FUNKY_PUNCT_1 = re.compile(u'([،;؛¿!"\])}»›”؟¡%٪°±©®।॥…])'), r" \1 "
./nltk/tokenize/toktok.py:55: DeprecationWarning: invalid escape sequence \[
  FUNKY_PUNCT_2 = re.compile(u"([({\[“‘„‚«‹「『])"), r" \1 "
./nltk/tokenize/toktok.py:62: DeprecationWarning: invalid escape sequence \|
  PIPE = re.compile("\|"), " &#124; "
./nltk/tokenize/punkt.py:1462: DeprecationWarning: invalid escape sequence \s
  pat = "\s*".join(re.escape(c) for c in tok)
./nltk/tokenize/repp.py:133: DeprecationWarning: invalid escape sequence \(
  line_regex = re.compile("^\((\d+), (\d+), (.+)\)$", re.MULTILINE)
./nltk/tokenize/nist.py:81: DeprecationWarning: invalid escape sequence \{
  PUNCT = re.compile("([\{-\~\[-\` -\&\(-\+\:-\@\/])"), " \\1 "
./nltk/tokenize/nist.py:83: DeprecationWarning: invalid escape sequence \.
  PERIOD_COMMA_PRECEED = re.compile("([^0-9])([\.,])"), "\\1 \\2 "
./nltk/tokenize/nist.py:85: DeprecationWarning: invalid escape sequence \.
  PERIOD_COMMA_FOLLOW = re.compile("([\.,])([^0-9])"), " \\1 \\2"
./nltk/tokenize/treebank.py:194: DeprecationWarning: invalid escape sequence \]
  """
./nltk/tokenize/treebank.py:255: DeprecationWarning: invalid escape sequence \s
  re.compile(pattern.replace("(?#X)", "\s"))
./nltk/tokenize/treebank.py:259: DeprecationWarning: invalid escape sequence \s
  re.compile(pattern.replace("(?#X)", "\s"))
./nltk/tokenize/texttiling.py:96: DeprecationWarning: invalid escape sequence \-
  c for c in lowercase_text if re.match("[a-z\-' \n\t]", c)
./nltk/tokenize/texttiling.py:229: DeprecationWarning: invalid escape sequence \w
  matches = re.finditer("\w+", text)
./nltk/tokenize/regexp.py:76: DeprecationWarning: invalid escape sequence \w
  """
./nltk/tokenize/regexp.py:184: DeprecationWarning: invalid escape sequence \w
  """
./nltk/classify/maxent.py:1292: DeprecationWarning: invalid escape sequence \ 
  """
./nltk/classify/rte_classify.py:61: DeprecationWarning: invalid escape sequence \w
  tokenizer = RegexpTokenizer("[\w.@:/]+|\w+|\$[\d.]+")
./nltk/parse/chart.py:1024: DeprecationWarning: invalid escape sequence \*
  """
./nltk/parse/chart.py:1057: DeprecationWarning: invalid escape sequence \*
  """
./nltk/parse/chart.py:1123: DeprecationWarning: invalid escape sequence \*
  """
./nltk/parse/chart.py:1140: DeprecationWarning: invalid escape sequence \*
  """
./nltk/parse/chart.py:1213: DeprecationWarning: invalid escape sequence \*
  """
./nltk/parse/chart.py:1232: DeprecationWarning: invalid escape sequence \*
  """
./nltk/parse/featurechart.py:251: DeprecationWarning: invalid escape sequence \*
  """
./nltk/parse/featurechart.py:353: DeprecationWarning: invalid escape sequence \*
  """
./nltk/chunk/util.py:371: DeprecationWarning: invalid escape sequence \S
  _LINE_RE = re.compile("(\S+)\s+(\S+)\s+([IOB])-?(\S+)?")
./nltk/chunk/util.py:517: DeprecationWarning: invalid escape sequence \w
  _IEER_TYPE_RE = re.compile('<b_\w+\s+[^>]*?type="(?P<type>\w+)"')
./nltk/chunk/util.py:526: DeprecationWarning: invalid escape sequence \s
  for piece_m in re.finditer("<[^>]+>|[^\s<]+", s):
./nltk/chunk/named_entity.py:178: DeprecationWarning: invalid escape sequence \w
  elif re.match("\w+$", word, re.UNICODE):
./nltk/chunk/named_entity.py:176: DeprecationWarning: invalid escape sequence \W
  elif re.match("\W+$", word, re.UNICODE):
./nltk/chunk/named_entity.py:174: DeprecationWarning: invalid escape sequence \.
  if re.match("[0-9]+(\.[0-9]*)?|[0-9]*\.[0-9]+$", word, re.UNICODE):
./nltk/chunk/named_entity.py:250: DeprecationWarning: invalid escape sequence \s
  text = re.sub("[\s\S]*<TEXT>", subfunc, text)
./nltk/chunk/named_entity.py:251: DeprecationWarning: invalid escape sequence \s
  text = re.sub("</TEXT>[\s\S]*", "", text)
./nltk/chunk/regexp.py:70: DeprecationWarning: invalid escape sequence \{
  _BRACKETS = re.compile("[^\{\}]+")
./nltk/chunk/regexp.py:215: DeprecationWarning: invalid escape sequence \{
  s = re.sub("\{\}", "", s)
./nltk/chunk/regexp.py:426: DeprecationWarning: invalid escape sequence \g
  RegexpChunkRule.__init__(self, regexp, "{\g<chunk>}", descr)
./nltk/chunk/regexp.py:471: DeprecationWarning: invalid escape sequence \g
  RegexpChunkRule.__init__(self, regexp, "}\g<chink>{", descr)
./nltk/chunk/regexp.py:510: DeprecationWarning: invalid escape sequence \{
  regexp = re.compile("\{(?P<chunk>%s)\}" % tag_pattern2re_pattern(tag_pattern))
./nltk/chunk/regexp.py:511: DeprecationWarning: invalid escape sequence \g
  RegexpChunkRule.__init__(self, regexp, "\g<chunk>", descr)
./nltk/chunk/regexp.py:575: DeprecationWarning: invalid escape sequence \g
  RegexpChunkRule.__init__(self, regexp, "\g<left>", descr)
./nltk/chunk/regexp.py:708: DeprecationWarning: invalid escape sequence \{
  "(?P<left>%s)\{(?P<right>%s)"
./nltk/chunk/regexp.py:714: DeprecationWarning: invalid escape sequence \g
  RegexpChunkRule.__init__(self, regexp, "{\g<left>\g<right>", descr)
./nltk/chunk/regexp.py:778: DeprecationWarning: invalid escape sequence \}
  "(?P<left>%s)\}(?P<right>%s)"
./nltk/chunk/regexp.py:784: DeprecationWarning: invalid escape sequence \g
  RegexpChunkRule.__init__(self, regexp, "\g<left>\g<right>}", descr)
./nltk/chunk/regexp.py:896: DeprecationWarning: invalid escape sequence \{
  r"^((%s|<%s>)*)$" % ("([^\{\}<>]|\{\d+,?\}|\{\d*,\d+\})+", "[^\{\}<>]+")
./nltk/chunk/regexp.py:896: DeprecationWarning: invalid escape sequence \{
  r"^((%s|<%s>)*)$" % ("([^\{\}<>]|\{\d+,?\}|\{\d*,\d+\})+", "[^\{\}<>]+")
./nltk/chunk/regexp.py:1136: DeprecationWarning: invalid escape sequence \.
  """
./nltk/featstruct.py:1295: DeprecationWarning: invalid escape sequence \d
  name, n = re.sub("\d+$", "", var.name), 2
./nltk/featstruct.py:2091: DeprecationWarning: invalid escape sequence \d
  RANGE_RE = re.compile("(-?\d+):(-?\d+)")
./nltk/draw/cfg.py:166: DeprecationWarning: invalid escape sequence \s
  _ARROW_RE = re.compile("\s*(->|(" + ARROW + "))\s*")
./nltk/draw/cfg.py:166: DeprecationWarning: invalid escape sequence \s
  _ARROW_RE = re.compile("\s*(->|(" + ARROW + "))\s*")
./nltk/draw/cfg.py:171: DeprecationWarning: invalid escape sequence \s
  + "))\s*"
./nltk/toolbox.py:159: DeprecationWarning: invalid escape sequence \_
  """
./nltk/grammar.py:1278: DeprecationWarning: invalid escape sequence \*
  """
./nltk/grammar.py:1463: DeprecationWarning: invalid escape sequence \w
  _STANDARD_NONTERM_RE = re.compile("( [\w/][\w/^<>-]* ) \s*", re.VERBOSE)
./nltk/stem/porter.py:145: DeprecationWarning: invalid escape sequence \m
  """Returns the 'measure' of stem, per definition in the paper
./nltk/stem/lancaster.py:192: DeprecationWarning: invalid escape sequence \*
  valid_rule = re.compile("^[a-z]+\*?\d[a-z]*[>\.]?$")
./nltk/stem/lancaster.py:225: DeprecationWarning: invalid escape sequence \*
  valid_rule = re.compile("^([a-z]+)(\*?)(\d)([a-z]*)([>\.]?)$")
./nltk/treetransforms.py:8: DeprecationWarning: invalid escape sequence \ 
  """
./tools/nltk_term_index.py:52: DeprecationWarning: invalid escape sequence \s
  SCAN_RE1 = "<programlisting>[\s\S]*?</programlisting>"
./tools/nltk_term_index.py:53: DeprecationWarning: invalid escape sequence \s
  SCAN_RE2 = "<literal>[\s\S]*?</literal>"
./tools/nltk_term_index.py:56: DeprecationWarning: invalid escape sequence \w
  TOKEN_RE = re.compile('[\w\.]+')
./tools/find_deprecated.py:43: DeprecationWarning: invalid escape sequence \s
  '"""[\s\S]*?"""|'
./tools/find_deprecated.py:45: DeprecationWarning: invalid escape sequence \s
  "'''[\s\S]*?'''|"
./tools/find_deprecated.py:47: DeprecationWarning: invalid escape sequence \s
  ")\s*"
./tools/find_deprecated.py:64: DeprecationWarning: invalid escape sequence \.
  '({})\.read\('.format('|'.join(re.escape(n) for n in dir(nltk.corpus)))
./tools/find_deprecated.py:67: DeprecationWarning: invalid escape sequence \s
  CLASS_DEF_RE = re.compile('^\s*class\s+(\w+)\s*[:\(]')

tirkarthi on 19 Jan 2020

@gertjanwytynck I'm currently fixing these one by one; it should be done by the end of the week.

ab-10 on 21 Jan 2020

🚀2
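
For readers following along, here is a minimal sketch of what fixing one of these warnings looks like, using the `RANGE_RE` pattern from the `featstruct.py` warning above. It only illustrates the two options mentioned earlier in the thread (a raw string or doubled backslashes) and is not the actual patch applied to NLTK; the sample input `"3:-7"` is made up for the demonstration.

```python
import re

# Two warning-free spellings of the pattern flagged in featstruct.py.
raw_form = r"(-?\d+):(-?\d+)"        # raw string: backslashes are kept literally
escaped_form = "(-?\\d+):(-?\\d+)"   # doubled backslashes: same resulting string

# Both spellings produce the identical pattern text, so behavior is unchanged.
assert raw_form == escaped_form

RANGE_RE = re.compile(raw_form)
groups = RANGE_RE.match("3:-7").groups()
print(groups)  # ('3', '-7')
```

The same applies to replacement templates such as `"{\g<chunk>}"` in `chunk/regexp.py`: prefixing the literal with `r` keeps the value identical while silencing the warning.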

Is this done yet?

morrme on 19 Oct 2020

It looks like a few are still left. I wonder whether adding a unit test would help.

./nltk/tools/nltk_term_index.py
./nltk/tools/find_deprecated.py
./nltk/nltk/tokenize/punkt.py

... and although deprecated syntax in the tools has little impact, there is some irony in the find_deprecated.py script itself using deprecated syntax :)

$ git clone https://github.com/nltk/nltk.git
$ find . -iname '*.py' | xargs -P 4 -I{} python3.8 -Wall -m py_compile {}
./nltk/tools/nltk_term_index.py:51: DeprecationWarning: invalid escape sequence \s
  SCAN_RE1 = "<programlisting>[\s\S]*?</programlisting>"
./nltk/tools/nltk_term_index.py:52: DeprecationWarning: invalid escape sequence \s
  SCAN_RE2 = "<literal>[\s\S]*?</literal>"
./nltk/tools/nltk_term_index.py:55: DeprecationWarning: invalid escape sequence \w
  TOKEN_RE = re.compile('[\w\.]+')
./nltk/tools/find_deprecated.py:42: DeprecationWarning: invalid escape sequence \s
  '"""[\s\S]*?"""|'
./nltk/tools/find_deprecated.py:44: DeprecationWarning: invalid escape sequence \s
  "'''[\s\S]*?'''|"
./nltk/tools/find_deprecated.py:46: DeprecationWarning: invalid escape sequence \s
  ")\s*"
./nltk/tools/find_deprecated.py:63: DeprecationWarning: invalid escape sequence \.
  '({})\.read\('.format('|'.join(re.escape(n) for n in dir(nltk.corpus)))
./nltk/tools/find_deprecated.py:66: DeprecationWarning: invalid escape sequence \s
  CLASS_DEF_RE = re.compile('^\s*class\s+(\w+)\s*[:\(]')
./nltk/nltk/tokenize/punkt.py:223: DeprecationWarning: invalid escape sequence \]
  return "(?:[)\";}\]\*:@\'\({\[%s])" % re.escape("".join(set(self.sent_end_chars) - {"."}))

pombredanne on 19 Oct 2020
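
A unit test along the lines suggested above could be sketched as follows. This is an assumption about how such a check might look, not something taken from the NLTK test suite: the helper names `find_invalid_escapes` and `scan_tree` are hypothetical, and the approach (compiling each file's source while recording warnings) mirrors what the `py_compile` command above does from the shell.

```python
import warnings
from pathlib import Path


def find_invalid_escapes(source, filename="<string>"):
    """Return the 'invalid escape sequence' warnings raised when compiling source."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, whatever its category (DeprecationWarning on
        # older Python 3 releases, SyntaxWarning on newer ones).
        warnings.simplefilter("always")
        compile(source, filename, "exec")
    return [
        str(w.message) for w in caught
        if "invalid escape sequence" in str(w.message)
    ]


def scan_tree(root):
    """Map each offending .py file under root to its warning messages."""
    offenders = {}
    for path in sorted(Path(root).rglob("*.py")):
        messages = find_invalid_escapes(path.read_text(encoding="utf-8"), str(path))
        if messages:
            offenders[str(path)] = messages
    return offenders


# Self-check on two in-memory snippets. The backslash is built with chr(92)
# so that this example does not itself contain an invalid escape.
bad = 'import re\nP = re.compile("' + chr(92) + 'd+")\n'
good = 'import re\nP = re.compile(r"' + chr(92) + 'd+")\n'
print(bool(find_invalid_escapes(bad)), bool(find_invalid_escapes(good)))
```

A test could then assert that `scan_tree(...)` returns an empty dict for the package and tools directories. Note that `compile` raises `SyntaxError` on files that do not parse at all, so a real test would need to decide how to treat those.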
