Revision c2b9023...

Go back to digest for 23rd June 2013

Features in KDE Base

Vishesh Handa committed changes in [nepomuk-core] /fileindexer/indexer:

Indexer: Make the plugins only extract a part of the full text

Introduce maxPlainTextSize(), which tells each plugin how much text
it should extract.

This is useful in two ways:

1. Sometimes the caller doesn't want any of the plain text, so it can
set the limit to 0. The FileMetadataWidget does this when it directly
displays the indexed data. Since we do not show the plain text, we do
not need to extract it.

2. Virtuoso cannot handle queries above a certain number of bytes.
2500243 seems to be the magic number. If you go above this limit,
just a '0' is stored. Therefore it doesn't make sense to extract all
of the plain text when Virtuoso clearly cannot handle all of it.

Virtuoso does not support streaming in text.
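
The two cases above can be illustrated with a minimal sketch of how a plugin might honour such a limit. This is not the nepomuk-core code: the class SketchExtractor and its appendText() helper are hypothetical, and the sketch assumes the limit is counted in bytes of UTF-8 text, which the commit message does not state. A limit of 0 means nothing is collected, and a non-zero limit cuts the text at a character boundary so that a multi-byte sequence is never split.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for an extractor plugin; not the nepomuk-core API.
class SketchExtractor {
public:
    // Assumption: the limit is a byte count of UTF-8 encoded text.
    explicit SketchExtractor(std::size_t maxPlainTextSize)
        : m_maxPlainTextSize(maxPlainTextSize) {}

    std::size_t maxPlainTextSize() const { return m_maxPlainTextSize; }

    // Append a chunk of extracted text without exceeding the limit.
    // Returns false once the caller should stop extracting, either
    // because the limit was reached or because this chunk had to be cut.
    bool appendText(const std::string& chunk) {
        if (m_text.size() >= m_maxPlainTextSize) {
            return false; // also covers a limit of 0: extract nothing
        }
        const std::size_t room = m_maxPlainTextSize - m_text.size();
        std::size_t take = chunk.size() < room ? chunk.size() : room;

        // If the first dropped byte is a UTF-8 continuation byte
        // (10xxxxxx), the cut falls inside a character: back up to
        // the start of that character.
        while (take > 0 && take < chunk.size() &&
               (static_cast<unsigned char>(chunk[take]) & 0xC0) == 0x80) {
            --take;
        }

        m_text.append(chunk, 0, take);
        return take == chunk.size() && m_text.size() < m_maxPlainTextSize;
    }

    const std::string& text() const { return m_text; }

private:
    std::size_t m_maxPlainTextSize;
    std::string m_text;
};
```

Returning false lets a parser stop reading the document early instead of decoding text that would be thrown away, which is the point of both cases described in the commit message.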

File Changes

Modified 9 files
  • /fileindexer/indexer
  •   services/epubextractor.cpp
  •   services/extractorplugin.cpp
  •   services/extractorplugin.h
  •   services/indexer.cpp
  •   services/main.cpp
  •   services/odfextractor.cpp
  •   services/office2007extractor.cpp
  •   services/popplerextractor.cpp
  •   services/mobipocket/mobiextractor.cpp