Search and List from Unified Index
Unified Index
The search capabilities in Tiki, such as those used by PluginList or the search at tiki-searchindex.php, rely on a search index.
Tiki can support multiple search engines internally. Each of those will have different capabilities and limitations. The default engine should provide capabilities good enough for small and medium sites. Larger sites may need additional infrastructure to get the most performance. Please see: Unified Index Comparison
Fields
The matrix below shows which fields are indexed for each object type.
Legend:
- Tokenized, that is, decomposed into words for full-text search
- Sortable
Field | Type | Tokenized | Sortable | wiki page | forum post | blog post | article | file | trackeritem | sheet | comment | user | Available in Tiki version |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
object_type | Generic | X | X | X | X | X | X | X | X | X | X | 7 | |
object_id | Generic | X | X | X | X | X | X | X | X | X | X | 7 | |
title | Generic | X | X | X | X | X | X | X | X | X | ? | X | 7 |
language | Generic | X | X | X | ? | 7 | |||||||
creation_date | Generic | X | X | X | X | X | X | X | X | X | 7/15 | ||
modification_date | Generic | X | X | X | X | X | X | X | X | X | 7 | ||
contributors | Generic | X | X | X | X | X | X | X | X | 7 | |||
description | Generic | X | X | X | X | X | 7 | ||||||
contents | Generic | X | X | X | X | X | X | X | X | X | X | 7 | |
wiki_content | Specific | X | X | 7 | |||||||||
wiki_uptodateness | Specific | X | X | 7 | |||||||||
wiki_approval_state | Specific | X | 11 | ||||||||||
post_content | Specific | X | 7 | ||||||||||
post_snippet | Specific | X | 14 | ||||||||||
parent_thread_id (not to be confused with parent_object_id) | Specific | X | 8 | ||||||||||
root_thread_id | Specific | X | 14 | ||||||||||
parent_contributors | Specific | X | 14 | ||||||||||
blog_id | Specific | X | X | 7 | |||||||||
blog_excerpt | Specific | X | 7 | ||||||||||
blog_content | Specific | X | 7 | ||||||||||
topic_id | Specific | X | X | 7 | |||||||||
article_content | Specific | X | 7 | ||||||||||
article_topline | Specific | X | 7 | ||||||||||
article_subtitle | Specific | X | 7 | ||||||||||
article_author | Specific | X | 9 | ||||||||||
article_type | Specific | X | 9 | ||||||||||
article_heading (available as description) | Specific | X | 9 | ||||||||||
published | Specific | X | 13 | ||||||||||
sitetitle | Specific | X | 13 | ||||||||||
siteurl | Specific | X | 13 | ||||||||||
gallery_id | Specific | X | X | 7 | |||||||||
filename_id | Specific | X | X | 7 | |||||||||
filetype | Specific | X | X | X | 7 | ||||||||
filesize | Specific | X | X | X | 15 | ||||||||
file_comment | Specific | X | 7 | ||||||||||
file_content | Specific | X | 7 | ||||||||||
tracker_id | Specific | X | X | 7 | |||||||||
tracker_status | Specific | X | X | 7 | |||||||||
tracker_field_PERMNAME/ID (see below for more details) | Specific | X | ? | X | 7 | ||||||||
sheet_content | Specific | X | 7 | ||||||||||
comment_content | Specific | X | X | 7 | |||||||||
user_country | Specific | X | X | X | 10 | ||||||||
groups | Specific | X | X | X | ? | ||||||||
hits | Specific | X | 15 | ||||||||||
lastpost_title | Specific | X | 15 | ||||||||||
lastpost_modification_date | Specific | X | 15 | ||||||||||
lastpost_contributors | Specific | X | 15 | ||||||||||
lastpost_post_content | Specific | X | 15 | ||||||||||
lastpost_post_snippet | Specific | X | 15 | ||||||||||
lastpost_hits | Specific | X | 15 | ||||||||||
lastpost_thread_id | Specific | X | 15 | ||||||||||
view_permission | Internal | X | X | X | 7 | ||||||||
parent_object_type | Internal | X | X | X | X | X | X | X | 7 | ||||
parent_object_id | Internal | X | X | X | X | X | X | X | 7 | ||||
parent_view_permission | Internal | X | X | X | X | X | X | 7 | |||||
global_view_permission | Internal | X | 7 | ||||||||||
hash | Internal | X | 7 | ||||||||||
url | Internal | X | X | 7 | |||||||||
categories | Global | X | X | X | X | X | X | X | X | 7 | |||
deep_categories | Global | X | X | X | X | X | X | X | X | 7 | |||
allowed_groups | Global | X | X | X | X | X | X | X | X | 7 | |||
freetags | Global | X | X | X | X | X | X | X | X | 7 | |||
freetags_text | Global | X | X | X | X | X | X | X | X | 7 | |||
adv_rating_ID | Global | X | X | X | X | X | X | X | X | X | X | 7 | |
comment_count | Global | X | X | X | X | X | X | X | X | X | 8 | ||
relations | Global | X | X | X | X | X | X | X | X | X | 8 | ||
attachments | Global | X | X | X | X | X | X | X | X | 7 | |||
attachment_contents | Global | X | X | X | X | X | X | X | X | X | 7 | ||
geo_located | Global | X | X | X | X | X | X | X | X | X | 9 | ||
geo_location | Global | X | X | X | X | X | X | X | X | X | 9 | ||
visits | Global | X | X | X | 9.2 |
/ Static value
? Depends on the data
Tracker Fields
The indexing for tracker fields will vary depending on the field type. As a general rule, tracker_field_PERMNAME/ID will be used as the field and will be sortable. However, there are a few exceptions:
- Image and File fields are not indexed
- TextArea is not sortable
- Multilingual fields are indexed as multiple fields
  - The main one (tracker_field_PERMNAME/ID) contains all languages
  - tracker_field_PERMNAME/ID_lang contains one language only (tracker_field_12_fr for example)
- Rating and related fields are stored as multiple fields
  - tracker_field_PERMNAME/ID contains the average
  - tracker_field_PERMNAME/ID_sum contains the vote totals
  - tracker_field_PERMNAME/ID_count contains the number of votes
- Items List and Item Link fields
  - tracker_field_PERMNAME/ID_text contains the text instead of the IDs of the linked/listed items
- Language of the tracker item
  - If a language field is set for the tracker item, its value is indexed as the item's language (the language field).
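The naming rules above can be sketched as a small shell function that prints the index field names one tracker field would produce. This is only an illustration: the function name and the kind labels (image, file, multilingual, rating, itemlink, itemslist) are hypothetical and are not Tiki identifiers, and the sketch assumes that Item Link and Items List fields keep the main field alongside the _text variant.

```shell
# Sketch only: print the index field names derived from one tracker field.
# Usage: tracker_index_fields PERMNAME_OR_ID KIND [LANG...]
# The KIND labels are hypothetical names chosen for this example.
tracker_index_fields() {
  [ "$#" -ge 2 ] || return 1
  _base="tracker_field_$1"        # general rule: tracker_field_PERMNAME/ID
  _kind="$2"
  shift 2
  case "$_kind" in
    image|file)
      ;;                          # not indexed: print nothing
    multilingual)
      echo "$_base"               # main field, all languages
      for _lang in "$@"; do       # one extra field per language code
        echo "${_base}_${_lang}"
      done
      ;;
    rating)
      printf '%s\n' "$_base" "${_base}_sum" "${_base}_count"
      ;;
    itemlink|itemslist)
      # assumption: the main field is kept next to the _text variant
      printf '%s\n' "$_base" "${_base}_text"
      ;;
    *)
      echo "$_base"
      ;;
  esac
}
```

For example, `tracker_index_fields 12 multilingual fr` prints tracker_field_12 followed by tracker_field_12_fr.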
Some parameters used in buildQuery/tiki-searchindex.php (it still needs to be explained whether these are real fields or just helpers):
- type: refers to object_type
- deep: if set, categories are treated as deep_categories
- autocomplete: searches for items whose title starts with the given value
Rebuild search index
The index is stored in temp/unified-index/. While the rebuild is occurring, a directory temp/unified-index-new/ will appear (this permits the existing index to be used until the new one is ready). If temp/unified-index-new/ doesn't disappear after the indexing, something must have gone wrong. You can delete it and try the re-indexing again. You may want to run sh setup.sh to make sure the permissions are OK.
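The manual cleanup described above might look like the following sketch. The function name is hypothetical, and it should only be used when you are sure no rebuild is currently running, since the directory also exists during a healthy rebuild.

```shell
# Sketch only: remove a leftover temp/unified-index-new directory.
# Usage: clean_stale_index [TIKI_ROOT]   (defaults to the current directory)
clean_stale_index() {
  _root="${1:-.}"
  _stale="$_root/temp/unified-index-new"
  if [ -d "$_stale" ]; then
    echo "Removing leftover $_stale"
    rm -rf -- "$_stale"
  else
    echo "No leftover directory found"
  fi
}
```

After the cleanup, the rebuild can be started again.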
From the Tiki interface
On a small site, visiting tiki-admin.php?page=search&rebuild=now rebuilds the search index. For medium to high load sites, rebuild from the command line instead.
From the command line
The search index can be rebuilt from the command line. Since Tiki9, the rebuild can also be run as a cron job, so that the server runs the command automatically; see Cron Job to Rebuild Search Index.
Below are the commands that may be used to rebuild the index.
All commands below assume you are already in the Tiki root directory.
From Tiki11
From Tiki11, you can also rebuild it using the unified console.php command, with the appropriate parameters. For example:
Basic command
php console.php index:rebuild
or, in abbreviated form:
php console.php i:r
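Because the commands assume the Tiki root as the working directory, a scheduled job usually needs to change into it first. A minimal wrapper sketch follows; the function name and the PHP_BIN override are hypothetical conveniences for this example, not part of Tiki.

```shell
# Sketch only: run the basic rebuild from a given Tiki root directory.
# PHP_BIN is an assumed override for the php binary (defaults to "php").
rebuild_index() {
  _root="$1"                      # path to the Tiki root directory
  (
    # the subshell keeps the caller's working directory unchanged
    cd "$_root" || exit 1
    "${PHP_BIN:-php}" console.php index:rebuild
  )
}
```

A cron job could then call a script containing this function with the site's path as the argument.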
Multitiki sites
For multitiki sites, you can rebuild with commands like:
php console.php index:rebuild --site=site1.example.com
php console.php index:rebuild --site=site2.example.com
...
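When there are many sites, the per-site commands can be wrapped in a loop. The sketch below assumes it is run from the Tiki root; the function name and the PHP_BIN override are hypothetical.

```shell
# Sketch only: rebuild the index for each site name given as an argument.
# Returns non-zero if any site failed, but still tries the remaining sites.
rebuild_sites() {
  _failed=0
  for _site in "$@"; do
    echo "Rebuilding $_site"
    "${PHP_BIN:-php}" console.php index:rebuild --site="$_site" || _failed=1
  done
  return "$_failed"
}
```

For example: `rebuild_sites site1.example.com site2.example.com`.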
Successful rebuild
If the rebuild is successful a message like the following will be produced (for cron jobs, this can usually be sent to you via email as part of automatically running the command):
Started rebuilding index...
Indexation
wiki page: 150
forum post: 67
blog post: 412
article: 61
file: 1294
trackeritem: 196
comment: 0
Rebuilding index done
X-Powered-By: PHP/5.5.8
Content-type: text/html; charset=utf-8
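In an automated job, the output can be checked for the closing line of the message above. This is a sketch that assumes the text "Rebuilding index done" appears on success, as in the sample; the wording may differ between Tiki versions, and the function name is hypothetical.

```shell
# Sketch only: succeed if the rebuild output on stdin contains the
# closing line shown in the sample message.
rebuild_succeeded() {
  grep -q 'Rebuilding index done'
}

# Possible use from the Tiki root (commented out so nothing runs here):
# php console.php index:rebuild 2>&1 | rebuild_succeeded \
#   || echo "Index rebuild failed" >&2
```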
Troubleshooting
If the rebuild is unsuccessful, instead of the above message you may get a message that indicates there has been an internal server error, or it may say "Rebuild in progress." This may be because the rebuild process uses more memory or takes more time than allowed by the server's php settings. Such settings can be changed on the fly as part of the rebuild command - examples of how to do this are shown below.
Increase memory limit
One way to increase memory is to change the memory_limit php setting as follows (this example changes the memory limit to 4 gigabytes while the rebuild process is running):
php -dmemory_limit=4G console.php i:r --log
You could also direct php to use a specific php.ini file, where there may be a higher memory limit setting or no limit. In this case you would use the -c parameter followed by the path to the php.ini file, as in the example below:
php -c /etc/php5/cli/php.ini console.php i:r --log
Increase maximum execution time
Getting an internal server error may indicate the rebuild process takes longer than the max_execution_time php setting. That can be increased as part of the command as shown below where the max execution time is set to 300 seconds, or 5 minutes. (This command is also increasing the memory limit as described above):
php -dmemory_limit=4G -dmax_execution_time=300 console.php i:r --log
Force rebuild
When the rebuild is unsuccessful with a "Rebuild in progress" message, this usually means that the rebuild failed previously in the middle of the process, leaving a temporary folder called temp/unified-index-new on the server. When a new rebuild is started and the program sees this folder, it thinks there is a rebuild already in progress and will stop. You can either delete this folder before rebuilding again or include the --force parameter in the rebuild command as follows:
php -dmemory_limit=4G -dmax_execution_time=300 console.php i:r --force --log
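The troubleshooting options above can be combined in one helper that adds --force only when a leftover folder is present. This is a sketch assuming it is run from the Tiki root and that no other rebuild is running at the time; the function name and the PHP_BIN override are hypothetical.

```shell
# Sketch only: rebuild with raised limits, forcing when a leftover
# temp/unified-index-new folder exists in the current directory.
# Usage: rebuild_with_limits [MEMORY_LIMIT] [MAX_SECONDS]
rebuild_with_limits() {
  _mem="${1:-4G}"                 # value for memory_limit
  _secs="${2:-300}"               # value for max_execution_time, in seconds
  set -- "-dmemory_limit=$_mem" "-dmax_execution_time=$_secs" console.php i:r
  if [ -d temp/unified-index-new ]; then
    set -- "$@" --force           # leftover from an earlier failed rebuild
  fi
  "${PHP_BIN:-php}" "$@" --log
}
```

Called without arguments, it uses the 4G and 300 second values from the examples above.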
Developer Notes
Content sources
These can be found in lib/core/Search/ContentSource
What determines if a field is full-text searchable in the default contents field
For each indexed source, there is a getGlobalFields() function. If a field is listed as one of the keys of the returned array, its content is included in the default contents field and can be found by full-text search there. In addition, if the value is set to true, the field itself is preserved after it is included in the contents field; otherwise it is discarded. For example:
function getGlobalFields()
{
    return array(
        'title' => true, // included in contents field and the field itself is preserved
        'description' => true, // included in contents field and the field itself is preserved
        'article_content' => false, // included in contents field and the field itself is discarded thereafter
        'article_topline' => false, // included in contents field and the field itself is discarded thereafter
        'article_subtitle' => false, // included in contents field and the field itself is discarded thereafter
        'article_author' => false, // included in contents field and the field itself is discarded thereafter
    );
}
Adding new fields to be indexed
- Remember to add the new field to the relevant getProvidedFields() function. That function is used to get a list of indexed fields to iterate through for various purposes within Tiki.