Group-Office

File search

The file search module lets you search inside the contents of files. Combined with custom fields and the quick edit pane, it can serve as an E-Discovery solution.

Group-Office can index the following file types:

  • Microsoft Office Documents
  • Open Document format (Open Office, LibreOffice)
  • Saved e-mails, including attachments (mail on an IMAP server is not searched)
  • PDF
  • Plain text
  • Scanned images using OCR

In addition to the regular search, you can build complex queries with the advanced search.

Figure: File search module

Indexing

Uploaded files are not indexed straight away. A scheduled task indexes them every night at 1:00 am. To run it more often, adjust the “Filesearch index” task under Start menu -> Manage system tasks.

If you want files to be indexed directly after upload, add this to config.php:

$config["filesearch_direct_index"] = true;

Warning

This may cause long delays on uploads. We don’t recommend using this setting.

Note

If indexing is not working you might need to install some additional tools. See the installation instructions.
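
As a quick first check when indexing fails, the sketch below reports which command-line tools are missing from the PATH. The function name missing_tools is hypothetical, and the tool names in the usage comment are an assumption for illustration only; the installation instructions remain the authoritative list for your version.

```shell
#!/bin/sh
# Print the name of every given tool that cannot be found on the PATH,
# one per line. Prints nothing when all tools are available.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Example call. The tool names are assumptions, not taken from this page:
#   missing_tools tesseract pdftotext
```

Run the check as the user that executes the indexing task, since that user's PATH may differ from your login shell's.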

OCR

By default OCR is enabled for TIFF files only. You can enable JPEG, PNG and PDF too with this config option:

$config['filesearch_ocr_extensions'] = ['tiff', 'tif', 'png', 'jpg', 'jpeg', 'pdf'];

Note that enabling OCR for more file types puts considerably more load on the indexing process, and that OCR results from JPEG files tend to be poor.

You can also set the language for Tesseract with this option (here 'nld', Dutch):

$config['filesearch_language'] = 'nld';

Make sure the Tesseract data for that language is installed, for example on Debian or Ubuntu:

apt install tesseract-ocr-nld
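
To confirm that the language is actually available to Tesseract, you can inspect the output of tesseract --list-langs. The following is a minimal sketch; the helper names lang_in_list and tesseract_has_lang are hypothetical and not part of Group-Office or Tesseract.

```shell
#!/bin/sh
# Succeed if the language code in $1 appears as a whole line on stdin.
lang_in_list() {
    grep -qxF -- "$1"
}

# Succeed if Tesseract lists the language code in $1 as installed.
# stderr is merged in because some Tesseract versions print the list there.
tesseract_has_lang() {
    tesseract --list-langs 2>&1 | lang_in_list "$1"
}

# Example call:
#   tesseract_has_lang nld || echo "Tesseract language 'nld' is not installed"
```

The code passed to the check should match the value of $config['filesearch_language'].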

© Copyright 2021, Intermesh BV Revision 56e662f2.
