Search Indexing tab
Overview
- Use this tab to configure indexing for specific MIME types so that the content of files uploaded to the Tiki File Gallery (such as PDF files) can be searched.
To Access
- From the File Gallery Admin page, click the Search Indexing tab.
Note
- To search within uploaded files, your server may need additional applications, such as strings or pdftotext.
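Before enabling content indexing, it can help to confirm that the extraction tools are actually installed. The following is a minimal sketch, not part of Tiki: it only uses the two tool names mentioned in the note, and the helper name `find_tools` is hypothetical.

```python
import shutil

# Tool names taken from the note above; extend the list for your own setup.
TOOLS = ["strings", "pdftotext"]


def find_tools(names):
    """Map each tool name to its absolute path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools(TOOLS).items():
        print(f"{name}: {path or 'not found on PATH'}")
```

Run it as the same system user that runs the web server, since that user's PATH may differ from your login shell's.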
Option | Description | Default |
---|---|---|
Automatic indexing of file content | Uses command line tools to extract the information from the files based on their MIME types. | Disabled |
Automatic indexing of emails stored as files | Parses files of type message/rfc822 (.eml files) and stores the individual email headers and content in the search index. | Disabled |
Asynchronous indexing | | Enabled |
OCR Files | Extract and index text from supported file types. | Disabled |
OCR Every File | Attempt to OCR every supported file. | Disabled |
Allow file level OCR languages | Allow users to change the default languages that will be used to OCR a file. | Enabled |
OCR limit languages | Limit the number of languages one can select from this list: Auto detect languages; Afrikaans (Afrikaans); Albanian (Shqip); Amharic (አማርኛ); Arabic; Arabic (العربية); Armenian; Armenian (Հայերեն); Assamese (অসমীয়া); Azerbaijani (azərbaycan dili); Azerbaijani (azərbaycan dili) (cyrl); Basque (euskara, euskera); Belarusian (беларуская мова); Bengali; Bengali (বাংলা); Bosnian (bos... | None |
tesseract path | Path to the tesseract binary. If blank, the binary is looked up via $PATH, which will likely fail when run by the scheduler. | /usr/bin/tesseract |
pdfimages path | Path to the pdfimages binary. If blank, the binary is looked up via $PATH, which will likely fail when run by the scheduler. | Pdfimages |
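The "Automatic indexing of file content" option is described as picking a command line tool according to the file's MIME type. The sketch below illustrates that general idea only; it is not Tiki's implementation. The `EXTRACTORS` mapping, the `{file}` placeholder, and both function names are assumptions made for illustration.

```python
import mimetypes
import subprocess

# Hypothetical mapping from MIME type to an extraction command.
# "{file}" is a placeholder that is replaced by the path of the uploaded file.
EXTRACTORS = {
    "application/pdf": ["pdftotext", "{file}", "-"],  # "-" writes text to stdout
    "application/octet-stream": ["strings", "{file}"],
}


def build_command(path, mime_type=None):
    """Return the extraction command for a file, or None if no tool is mapped."""
    mime_type = mime_type or mimetypes.guess_type(path)[0]
    template = EXTRACTORS.get(mime_type)
    if template is None:
        return None
    return [path if part == "{file}" else part for part in template]


def extract_text(path, mime_type=None):
    """Run the mapped tool and return its output, or "" when extraction fails."""
    command = build_command(path, mime_type)
    if command is None:
        return ""
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, errors="replace", check=False
        )
    except FileNotFoundError:
        # The tool is not installed on this server.
        return ""
    return result.stdout if result.returncode == 0 else ""
```

Passing the command as a list rather than a shell string avoids quoting problems with file names that contain spaces.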
Option | Description | Default |
---|---|---|
Automatic indexing of file content | Uses command line tools to extract the information from the files based on their MIME types. | Disabled |
Automatic indexing of emails stored as files | Parses files of type message/rfc822 (.eml files) and stores the individual email headers and content in the search index. | Disabled |
Asynchronous indexing | | Enabled |
OCR Files | Extract and index text from supported file types. | Disabled |
OCR Every File | Attempt to OCR every supported file. | Disabled |
Allow file level OCR languages | Allow users to change the default languages that will be used to OCR a file. | Enabled |
OCR limit languages | Limit the number of languages one can select from this list: Auto detect languages | None |
tesseract path | Path to the tesseract binary. If blank, the binary is looked up via $PATH, which will likely fail when run by the scheduler. | Tesseract |
pdfimages path | Path to the pdfimages binary. If blank, the binary is looked up via $PATH, which will likely fail when run by the scheduler. | /usr/bin/pdfimages |
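To illustrate what indexing "individual email headers and content" of a message/rfc822 file can look like, here is a small sketch using Python's standard `email` package. It is not Tiki's parser; the sample message, the function name, and the field names in the returned dictionary are all hypothetical.

```python
from email import message_from_string
from email.policy import default

# A made-up .eml file used only for this illustration.
SAMPLE_EML = """\
From: alice@example.com
To: bob@example.com
Subject: Quarterly report

The report is attached.
"""


def email_to_index_fields(raw):
    """Split a raw RFC 822 message into separate header and content fields."""
    message = message_from_string(raw, policy=default)
    body = message.get_body(preferencelist=("plain",))
    return {
        "from": str(message.get("From", "")),
        "to": str(message.get("To", "")),
        "subject": str(message.get("Subject", "")),
        "content": body.get_content().strip() if body is not None else "",
    }


if __name__ == "__main__":
    for field, value in email_to_index_fields(SAMPLE_EML).items():
        print(f"{field}: {value}")
```

Storing each header as its own field is what makes searches such as "subject contains X" possible, as opposed to treating the whole file as one block of text.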
Option | Description | Default |
---|---|---|
Automatic indexing of file content | Uses command line tools to extract the information from the files based on their MIME types. | Disabled |
Asynchronous indexing | | Enabled |
OCR Files | Extract and index text from supported file types. | Disabled |
OCR Every File | Attempt to OCR every supported file. | Disabled |
Allow file level OCR languages | Allow users to change the default languages that will be used to OCR a file. | Enabled |
OCR limit languages | Limit the number of languages one can select from this list: Auto detect languages | None |
tesseract path | Path to the tesseract binary. If blank, the binary is looked up via $PATH, which will likely fail when run by the scheduler. | sh: 1: where: not found |
pdfimages path | Path to the pdfimages binary. If blank, the binary is looked up via $PATH, which will likely fail when run by the scheduler. | sh: 1: where: not found |
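The `sh: 1: where: not found` values in the last table look like the output of a failed `where` lookup in a POSIX shell rather than real default paths, so treat them as "no default detected". Because a $PATH lookup is also described as unreliable under the scheduler, it is safer to enter absolute paths. The sketch below, with a hypothetical helper name `resolve_binary`, shows one way to find the value to enter; it is an illustration and not Tiki code.

```python
import os
import shutil


def resolve_binary(configured_path, name):
    """Return an absolute path to the binary, or None if it cannot be found."""
    if configured_path:
        # An explicit setting wins, but only if it points at an executable file.
        if os.path.isfile(configured_path) and os.access(configured_path, os.X_OK):
            return os.path.abspath(configured_path)
        return None
    # Blank setting: fall back to a PATH lookup, as the option text describes.
    found = shutil.which(name)
    return os.path.abspath(found) if found else None


if __name__ == "__main__":
    for name in ("tesseract", "pdfimages"):
        print(f"{name} path: {resolve_binary('', name) or 'not found'}")
```

Copy the printed paths into the tesseract path and pdfimages path fields so that scheduled jobs do not depend on the environment's $PATH.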