Xapian-haystack: Term too long issue

Created on 22 May 2011  ·  3 comments  ·  Source: notanumber/xapian-haystack

When I run manage.py rebuild_index, I get the following error:

```
Traceback (most recent call last):
  File "./manage.py", line 11, in <module>
    execute_manager(settings)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/django/core/management/__init__.py", line 438, in execute_manager
    utility.execute()
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/django/core/management/__init__.py", line 379, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/django/core/management/base.py", line 191, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/django/core/management/base.py", line 220, in execute
    output = self.handle(*args, **options)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/haystack/management/commands/rebuild_index.py", line 14, in handle
    call_command('update_index', **options)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/django/core/management/__init__.py", line 166, in call_command
    return klass.execute(*args, **defaults)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/django/core/management/base.py", line 220, in execute
    output = self.handle(*args, **options)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/haystack/management/commands/update_index.py", line 184, in handle
    return super(Command, self).handle(*apps, **options)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/django/core/management/base.py", line 286, in handle
    app_output = self.handle_app(app, **options)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/haystack/management/commands/update_index.py", line 218, in handle_app
    do_update(index, qs, start, end, total, self.verbosity)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/haystack/management/commands/update_index.py", line 100, in do_update
    index.backend.update(index, current_qs)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/xapian_backend.py", line 257, in update
    database.replace_document(document_id, document)
xapian.InvalidArgumentError: Term too long (> 245): 4f6d6e6961206d6561206d6563756d20706f72746f202d20e2f1e520f1e2eee520edeef8f320f120f1eee1eefe0d0a566974612073696e65206c69626572746174652c206e6968696c202d20e6e8e7edfc20e1e5e720f1e2eee1eee4fb202d20ede8f7f2ee0d0a417273206c6f6e67612c207669746120627265766973
```

Do you have any idea how to handle this problem?

Thanks.

P.S.
python 2.6
python-xapian (debian) 1.2.4-1
libxapian (debian) 1.2.5
recent versions of django, haystack-xapian, django-xapian
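
For anyone hitting the same error: the term in the message looks like hex-encoded bytes, so decoding it may show which field value ended up as a single term. The sketch below is only a diagnostic aid written for a current Python 3, not part of xapian-haystack. The choice of the Windows-1251 codec for the non-ASCII bytes is an assumption, and `TERM_HEX` is just a local name for the string copied from the traceback.

```python
import binascii

# Term copied verbatim from the "Term too long" error message.
TERM_HEX = (
    "4f6d6e6961206d6561206d6563756d20706f72746f202d20"
    "e2f1e520f1e2eee520edeef8f320f120f1eee1eefe0d0a"
    "566974612073696e65206c69626572746174652c206e6968696c202d20"
    "e6e8e7edfc20e1e5e720f1e2eee1eee4fb202d20ede8f7f2ee0d0a"
    "417273206c6f6e67612c207669746120627265766973"
)

# Turn the hex digits back into the raw bytes they represent.
raw = binascii.unhexlify(TERM_HEX)

# Assumption: the non-ASCII bytes are Windows-1251 (Cyrillic).
# errors="replace" keeps the script from failing if that guess is wrong.
text = raw.decode("cp1251", errors="replace")

print("hex length:", len(TERM_HEX), "raw bytes:", len(raw))
print(text)
```

If the decoded text turns out to be a whole multi-line field value, that suggests the field is being indexed as one exact term rather than being split into words.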

All 3 comments

I've created a quick fix. It just ignores this error, so I think documents with invalid terms are simply not added to the index.

```
lorien@big:/tmp/xapian-haystack$ git diff
diff --git a/xapian_backend.py b/xapian_backend.py
index fbbe221..1884613 100755
--- a/xapian_backend.py
+++ b/xapian_backend.py
@@ -259,7 +259,10 @@ class SearchBackend(BaseSearchBackend):
                     DOCUMENT_CT_TERM_PREFIX + u'%s.%s' %
                     (obj._meta.app_label, obj._meta.module_name)
                 )
-                database.replace_document(document_id, document)
+                try:
+                    database.replace_document(document_id, document)
+                except xapian.InvalidArgumentError, ex:
+                    sys.stderr.write('xapian.InvalidArgumentError\n')
 
             except UnicodeDecodeError:
                 sys.stderr.write('Chunk failed.\n')
```



This is actually raised by Xapian itself. I've left the exception to bubble up so that a developer can see the issue. Unfortunately, the solution is to ensure your terms are no longer than 245 characters.
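
One way to keep terms under the limit is to shorten them before they reach the backend. The following is a minimal sketch for a current Python 3, not code from xapian-haystack. `clamp_term` and `MAX_TERM_BYTES` are hypothetical names, and the sketch assumes the limit is measured on the UTF-8 encoded bytes of the term rather than on characters.

```python
MAX_TERM_BYTES = 245  # limit quoted in the "Term too long (> 245)" error


def clamp_term(term: str, max_bytes: int = MAX_TERM_BYTES) -> str:
    """Return `term` shortened so its UTF-8 encoding fits in `max_bytes`."""
    encoded = term.encode("utf-8")
    if len(encoded) <= max_bytes:
        return term
    # Cut at the byte limit; errors="ignore" drops a multi-byte
    # character that the cut happened to split in half.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


if __name__ == "__main__":
    long_ascii = "a" * 300
    long_cyrillic = "я" * 200  # two bytes per character in UTF-8
    print(len(clamp_term(long_ascii).encode("utf-8")))
    print(len(clamp_term(long_cyrillic).encode("utf-8")))
```

Truncation changes what can be matched exactly, so whether it is acceptable depends on how the field is searched. Skipping over-long values is the other obvious option.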


@notanumber @jorgecarleitao What do you think about this commit: https://github.com/Alir3z4/xapian-haystack/commit/a249b46c48957f4d8a776ef41b0ce12490ad52dd ?

A text document is usually longer than 245. I know the error is raised by Xapian itself, but it shouldn't break [re]building the index.
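
The idea that one bad document should not abort a whole rebuild can be sketched as a skip-and-report loop. This is an illustration only, written for a current Python 3, and is not the content of the linked commit. `TermTooLongError`, `index_all` and `fake_add_document` are hypothetical stand-ins so the example runs without Xapian installed. In the real backend the exception would be `xapian.InvalidArgumentError`, as shown in the traceback.

```python
import logging

logger = logging.getLogger(__name__)


class TermTooLongError(ValueError):
    """Hypothetical stand-in for xapian.InvalidArgumentError."""


def index_all(documents, add_document):
    """Index every (doc_id, text) pair and return the ids that were skipped."""
    skipped = []
    for doc_id, text in documents:
        try:
            add_document(doc_id, text)
        except TermTooLongError as exc:
            # Report the failure and carry on with the next document.
            logger.warning("Skipping document %s: %s", doc_id, exc)
            skipped.append(doc_id)
    return skipped


def fake_add_document(doc_id, text):
    """Hypothetical writer: rejects any token longer than 245 bytes."""
    for token in text.split():
        if len(token.encode("utf-8")) > 245:
            raise TermTooLongError("Term too long (> 245)")


if __name__ == "__main__":
    docs = [(1, "short text"), (2, "x" * 300), (3, "also fine")]
    print(index_all(docs, fake_add_document))
```

Returning the skipped ids keeps the failure visible to the developer, which is the concern raised earlier in the thread about silently swallowing the exception.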
