Xapian-haystack: "Term too long" error

Created on 2011-05-22  ·  3 comments  ·  Source: notanumber/xapian-haystack

When I run `manage.py rebuild_index` I get the following error:

```
Traceback (most recent call last):
  File "./manage.py", line 11, in <module>
    execute_manager(settings)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/django/core/management/__init__.py", line 438, in execute_manager
    utility.execute()
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/django/core/management/__init__.py", line 379, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/django/core/management/base.py", line 191, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/django/core/management/base.py", line 220, in execute
    output = self.handle(*args, **options)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/haystack/management/commands/rebuild_index.py", line 14, in handle
    call_command('update_index', **options)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/django/core/management/__init__.py", line 166, in call_command
    return klass.execute(*args, **defaults)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/django/core/management/base.py", line 220, in execute
    output = self.handle(*args, **options)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/haystack/management/commands/update_index.py", line 184, in handle
    return super(Command, self).handle(*apps, **options)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/django/core/management/base.py", line 286, in handle
    app_output = self.handle_app(app, **options)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/haystack/management/commands/update_index.py", line 218, in handle_app
    do_update(index, qs, start, end, total, self.verbosity)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/haystack/management/commands/update_index.py", line 100, in do_update
    index.backend.update(index, current_qs)
  File "/home/lorien/web/dumpz/.env/lib/python2.6/site-packages/xapian_backend.py", line 257, in update
    database.replace_document(document_id, document)
xapian.InvalidArgumentError: Term too long (> 245): 4f6d6e6961206d6561206d6563756d20706f72746f202d20e2f1e520f1e2eee520edeef8f320f120f1eee1eefe0d0a566974612073696e65206c69626572746174652c206e6968696c202d20e6e8e7edfc20e1e5e720f1e2eee1eee4fb202d20ede8f7f2ee0d0a417273206c6f6e67612c207669746120627265766973
```

Do you have any idea how to handle this problem?

Thanks.

P.S.
python 2.6
python-xapian (debian) 1.2.4-1
libxapian (debian) 1.2.5
recent versions of django, haystack-xapian, django-xapian

All 3 comments

I've created a quick fix. It simply ignores this error, so I think documents with invalid terms are not added to the index at all.

```
lorien@big:/tmp/xapian-haystack$ git diff
diff --git a/xapian_backend.py b/xapian_backend.py
index fbbe221..1884613 100755
--- a/xapian_backend.py
+++ b/xapian_backend.py
@@ -259,7 +259,10 @@ class SearchBackend(BaseSearchBackend):
                         DOCUMENT_CT_TERM_PREFIX + u'%s.%s' %
                         (obj._meta.app_label, obj._meta.module_name)
                     )
-                    database.replace_document(document_id, document)
+                    try:
+                        database.replace_document(document_id, document)
+                    except xapian.InvalidArgumentError, ex:
+                        sys.stderr.write('xapian.InvalidArgumentError\n')
 
             except UnicodeDecodeError:
                 sys.stderr.write('Chunk failed.\n')
```



This is actually raised by Xapian itself. I've left the exception to bubble up so that a developer can see the issue. Unfortunately, the solution is to ensure your terms are no longer than 245 characters.
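As a minimal sketch of what "ensure your terms are no longer than 245" could look like before the data reaches the backend, the helpers below check and shorten a term. They are not part of xapian-haystack: the names `MAX_TERM_BYTES`, `term_fits`, and `truncate_term` are hypothetical, and the sketch assumes the limit is measured in UTF-8 bytes rather than characters, which matters for non-ASCII text such as the Cyrillic in the report above.

```python
# Hypothetical helpers, not part of xapian-haystack or Xapian.
# Assumption: the 245 limit from the error message is counted in UTF-8 bytes.
MAX_TERM_BYTES = 245


def term_fits(term, limit=MAX_TERM_BYTES):
    """Return True if the UTF-8 encoding of `term` is at most `limit` bytes."""
    return len(term.encode("utf-8")) <= limit


def truncate_term(term, limit=MAX_TERM_BYTES):
    """Shorten `term` so that its UTF-8 encoding is at most `limit` bytes.

    A multi-byte character that would be split by the cut is dropped whole.
    """
    encoded = term.encode("utf-8")
    if len(encoded) <= limit:
        return term
    # errors="ignore" discards the trailing partial character, if any.
    return encoded[:limit].decode("utf-8", errors="ignore")
```

Whether to truncate or to skip an oversized value entirely is a design choice: a truncated term can still be matched by its prefix, but it no longer matches the full original value.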


@notanumber @jorgecarleitao What do you think about this commit? https://github.com/Alir3z4/xapian-haystack/commit/a249b46c48957f4d8a776ef41b0ce12490ad52dd

The indexed text is usually longer than 245 characters. I know the error is raised by Xapian itself, but it shouldn't break [re]building the index.
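One way to read this request is as a skip-and-continue loop: a document that fails is logged and skipped, and the rest of the rebuild carries on. The sketch below illustrates only that pattern and makes no claim about what the linked commit does. `index_all` and `index_one` are hypothetical names, and `ValueError` stands in for `xapian.InvalidArgumentError` so that the example runs without Xapian installed.

```python
import logging

logger = logging.getLogger(__name__)


def index_all(documents, index_one, skip_on=(ValueError,)):
    """Index every document, logging and skipping the ones that fail.

    `index_one` is a hypothetical callable that indexes a single document.
    `skip_on` is the tuple of exception types to tolerate; with Xapian this
    would presumably be (xapian.InvalidArgumentError,).
    Returns the list of documents that were skipped.
    """
    skipped = []
    for doc in documents:
        try:
            index_one(doc)
        except skip_on as exc:
            # Report the failure but keep going with the remaining documents.
            logger.warning("Skipping %r: %s", doc, exc)
            skipped.append(doc)
    return skipped
```

Returning the skipped documents, rather than only logging them, keeps the failures visible to the caller, which addresses the maintainer's concern that silently swallowing the exception hides the problem.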
