Parsed
extends SearchApi
in package
implements
SearchApiInterface
Search index API.
Table of Contents
Interfaces
- SearchApiInterface
- Interface SearchApiInterface
Constants
- BATCH_SIZE = 250
- MAX_LENGTH = 100
- MIN_WORD_LENGTH = 2
Properties
- $admin_subactions : array<string|int, mixed>
- $blacklisted_words : array<string|int, mixed>
- $compressed_params : string
- $default_params : array<string|int, mixed>
- $errors : array<string|int, mixed>
- $excludedIndexWords : array<string|int, mixed>
- $excludedPhrases : array<string|int, mixed>
- $excludedSubjectWords : array<string|int, mixed>
- $excludedWords : array<string|int, mixed>
- $humungousTopicPosts : int
- $ignored : array<string|int, mixed>
- $is_supported : bool
- $loadedApi : object
- $marked : array<string|int, mixed>
- $maxMembersToSearch : int
- $maxMessageResults : int
- $min_smf_version : string
- $params : array<string|int, mixed>
- $participants : array<string|int, mixed>
- $recentPercentage : float
- $results : array<string|int, mixed>
- $searchArray : array<string|int, mixed>
- $searchWords : array<string|int, mixed>
- $sort_columns : array<string|int, mixed>
- $status : string
- $version_compatible : string
- $weight_factors : array<string|int, mixed>
- $wildcard_words : array<string|int, mixed>
- $ageMinMsg : int
- $ageRecentMsg : int
- $boardQuery : string
- $maxMsgID : int
- $memberlist : array<string|int, mixed>
- $minMsgID : int
- $query_match_type : string
- $userQuery : string
- $weight : array<string|int, mixed>
- $weight_total : int
- $backcompat : array<string|int, mixed>
- $size : int
Methods
- __construct() : mixed
- Constructor.
- build() : void
- Builds the parsed index from the content of the messages table, and saves it in the log_search_dictionary and log_search_parsed tables.
- compressParams() : string
- Compresses $this->params to a string for use as a URL parameter.
- detect() : array<string|int, mixed>
- Get the installed Search API implementations.
- exportStatic() : void
- Provides a way to export a class's public static properties and methods to global namespace.
- formContext() : void
- Lets APIs interact with Utils::$context when setting up the search form.
- getAdminSubactions() : array<string|int, mixed>
- Gets info about sub-actions to support in the admin panel for this API.
- getDescription() : string
- Returns the expected Lang::$txt key for this API's localized description.
- getLabel() : string
- Returns the expected Lang::$txt key for this API's localized label.
- getLangStopWords() : array<string|int, mixed>
- Gets a list of all the words in Lang::$txt['search_stopwords'] for all installed language packs.
- getQueryParams() : array<string|int, mixed>
- Returns a copy of $this->params with a few extra pieces of data added in.
- getSize() : int
- Gets the size, in bytes, of this API's search index.
- getStatus() : string
- Gets whether the index for this API exists.
- indexedWordQuery() : mixed
- Search for indexed words.
- initializeSearch() : void
- Sets whatever properties are necessary in order to perform the search.
- isValid() : bool
- Whether this method is valid for implementation or not.
- load() : SearchApiInterface
- Creates a search API and returns the object.
- postCreated() : void
- Callback when a post is created.
- postModified() : void
- Callback when a post is modified.
- postRemoved() : void
- Callback when a post is removed.
- prepareIndexes() : void
- Callback while preparing indexes for searching.
- remove() : void
- Deletes the log_search_dictionary and log_search_parsed tables and resets to standard search method.
- resultsContext() : void
- Lets APIs interact with Utils::$context when setting up the results page.
- searchQuery() : mixed
- Callback for actually performing the search query.
- searchSort() : int
- Callback function for usort used to sort the fulltext results.
- setParticipants() : void
- Figures out which search result topics the user participated in.
- supportsMethod() : bool
- Check whether the specific search operation can be performed by this API.
- topicMerge() : void
- Callback when a topic is merged.
- topicsMoved() : void
- Callback when a topic is moved.
- topicSplit() : void
- Callback when a topic is split.
- topicsRemoved() : void
- Callback when a topic is removed.
- updateStopwordsSetting() : void
- Updates Config::$modSettings['search_stopwords_parsed'].
- accentSensitive() : bool
- Gets whether the current collation is accent sensitive.
- calculateWeight() : void
- Calculates the weight values to use when organizing results by relevance.
- createTables() : void
- Creates the log_search_parsed and log_search_dictionary tables.
- escapeAccents() : string
- Escapes accents in the string as HTML entities, but only if the database collation ignores accents.
- escapeSqlRegex() : string
- Uses regex to escape SQL in the given string
- getWords() : array<string|int, mixed>
- Extracts words from a string and returns them along with info about where each extracted word occurred in the list of extracted words.
- prepareString() : string
- Gets rid of all the BBCode and HTML in a string, folds the case of all characters, and applies compatibility composition Unicode normalization.
- removeAccents() : string
- Removes accents in the string.
- save() : void
- Saves word data to log_search_dictionary and log_search_parsed tables.
- searchSubjectAndMessage() : mixed
- Handles searching both in the subject and message text
- searchSubjectOnly() : void
- Handles searching for posts in the subject only
- setBlacklistedWords() : void
- Like SearchApi::setBlacklistedWords(), except that this doesn't blacklist BBCode tags.
- setBoardQuery() : void
- Sets $this->boardQuery based on the given params
- setMsgBounds() : void
- Finds the lowest and highest message ID based on the given min and/or max age
- setParams() : void
- Figures out the values for $this->params and related properties.
- setSearchTerms() : void
- Populates $this->searchArray, $this->excludedWords, etc.
- setSort() : void
- Get the sorting parameters right. Default to sort by relevance descending.
- setUserQuery() : void
- Sets $this->userQuery based on given params
- wordBoundaryWrapper() : string
- Wraps the given string in regex to set a word boundary
Constants
BATCH_SIZE
public
int
BATCH_SIZE
= 250
How many messages to process at once in self::build().
MAX_LENGTH
public
mixed
MAX_LENGTH
= 100
MIN_WORD_LENGTH
public
int
MIN_WORD_LENGTH
= 2
The minimum word length. Words shorter than this will not be indexed.
Note that this restriction only applies to words made out of letters, diacritical marks, and numbers. It does not apply to emojis, etc.
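As a rough illustration of this rule, the following sketch filters a list of words. The function name `isIndexable()`, its `$min_length` parameter, and the regular expression are assumptions made for this example; they are not taken from SMF's implementation.

```php
<?php

/**
 * Returns false only for words that consist entirely of letters,
 * diacritical marks, and numbers AND are shorter than $min_length.
 * Anything else (an emoji, for instance) is accepted regardless of length.
 *
 * $min_length is a hypothetical stand-in for MIN_WORD_LENGTH and must be >= 2.
 */
function isIndexable(string $word, int $min_length = 2): bool
{
    if ($word === '') {
        return false;
    }

    // Matches words of fewer than $min_length code points made up solely
    // of letters (L), marks (M), and numbers (N).
    $too_short = '/^[\p{L}\p{M}\p{N}]{1,' . ($min_length - 1) . '}$/u';

    return preg_match($too_short, $word) !== 1;
}

$words = ['a', 'of', '7', '🙂'];
$kept = array_values(array_filter($words, 'isIndexable'));
// $kept is ['of', '🙂']
```

The single letter and the single digit are dropped, while the emoji survives even though it is only one character long.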
Properties
$admin_subactions
public
static array<string|int, mixed>
$admin_subactions
= ['build' => ['sa' => 'createparsed', 'func' => __CLASS__ . '::build'], 'remove' => ['sa' => 'removeparsed', 'func' => __CLASS__ . '::remove']]
Sub-actions to add for SMF\Actions\Admin\Search::$subactions.
$blacklisted_words
public
array<string|int, mixed>
$blacklisted_words
= []
Words to ignore when searching.
Populated with the contents of:
- Lang::$txt['search_stopwords'] for all installed languages.
- Config::$modSettings['search_stopwords']
- Config::$modSettings['search_stopwords_custom']
- All known BBCode tags
$compressed_params
public
string
$compressed_params
URL-safe variant of a Base64 string representation of $this->params. The encoded string only includes values where $this->params differs from the defaults.
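The idea of encoding only the non-default values can be sketched as follows. This is a minimal illustration, not SMF's actual encoding: the function names are hypothetical, and JSON is assumed as the intermediate format purely for the sake of the example.

```php
<?php

/**
 * Keeps only the values that differ from the defaults, serializes them,
 * and encodes the result as URL-safe Base64.
 * Assumption: JSON is the intermediate format (illustration only).
 */
function compressParamsSketch(array $params, array $defaults): string
{
    $changed = [];

    foreach ($params as $key => $value) {
        if (!array_key_exists($key, $defaults) || $defaults[$key] !== $value) {
            $changed[$key] = $value;
        }
    }

    // Swap the characters that are unsafe in URLs and drop the padding.
    return rtrim(strtr(base64_encode(json_encode($changed)), '+/', '-_'), '=');
}

/** Reverses compressParamsSketch() and fills in the defaults again. */
function expandParamsSketch(string $encoded, array $defaults): array
{
    $json = base64_decode(strtr($encoded, '-_', '+/'));

    return array_merge($defaults, json_decode($json, true) ?? []);
}

$defaults = ['search' => '', 'maxage' => 9999, 'minage' => 0, 'subject_only' => false];
$params = ['search' => 'hello world', 'maxage' => 9999, 'minage' => 0, 'subject_only' => true];

$encoded = compressParamsSketch($params, $defaults);
$decoded = expandParamsSketch($encoded, $defaults);
// $decoded equals $params, although only two values were encoded.
```

Leaving out the defaults keeps the URL parameter short, since most searches change only a handful of settings.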
$default_params
public
static array<string|int, mixed>
$default_params
= ['advanced' => false, 'brd' => [], 'maxage' => 9999, 'minage' => 0, 'search' => '', 'searchtype' => 1, 'show_complete' => false, 'sort' => null, 'sort_dir' => null, 'subject_only' => false, 'topic' => '', 'userspec' => '']
Default values for $this->params.
$errors
public
array<string|int, mixed>
$errors
= []
Records errors encountered while preparing to search.
$excludedIndexWords
public
array<string|int, mixed>
$excludedIndexWords
= []
Terms to exclude when building a search index.
$excludedPhrases
public
array<string|int, mixed>
$excludedPhrases
= []
Phrases to exclude from the search.
$excludedSubjectWords
public
array<string|int, mixed>
$excludedSubjectWords
= []
Terms to exclude from a subject search.
$excludedWords
public
array<string|int, mixed>
$excludedWords
= []
Terms that the user wants to exclude from the search.
$humungousTopicPosts
public
int
$humungousTopicPosts
= 200
Used to calculate relevance. Specifically, caps the weight assigned to huge topics so that they do not completely overwhelm the search results.
$ignored
public
array<string|int, mixed>
$ignored
= []
User-supplied search terms that we have chosen to ignore.
$is_supported
public
bool
$is_supported
= true
Whether or not this API is supported.
$loadedApi
public
static object
$loadedApi
The loaded search API.
For backward compatibility, also referenced as global $searchAPI.
$marked
public
array<string|int, mixed>
$marked
= []
Array of replacements for highlighting.
$maxMembersToSearch
public
int
$maxMembersToSearch
= 500
If more than this many users match the 'userspec' param, don't bother searching by name at all.
$maxMessageResults
public
int
$maxMessageResults
= 0
Upper limit when performing an indexedWordQuery(). Zero for no limit.
$min_smf_version
public
string
$min_smf_version
= '3.0 Alpha 1'
The minimum SMF version that this will work with.
$params
public
array<string|int, mixed>
$params
= []
The supplied search parameters. Any unsupplied values will be set to the values in self::$default_params.
$participants
public
array<string|int, mixed>
$participants
= []
Info about who participated in the search result's topic. Keys are topic IDs, values are booleans about whether the current user has posted anything in that topic.
$recentPercentage
public
float
$recentPercentage
= 0.3
Used to calculate relevance. Specifically, controls the weight assigned for how recent the post is.
$results
public
array<string|int, mixed>
$results
= []
The results of the search. Keys are message IDs, values are arrays of relevance data.
$searchArray
public
array<string|int, mixed>
$searchArray
= []
The list of terms to search for.
$searchWords
public
array<string|int, mixed>
$searchWords
= []
Structured list of search term data.
$sort_columns
public
array<string|int, mixed>
$sort_columns
= ['relevance', 'num_replies', 'id_msg']
Names of columns that results can be sorted by.
$status
public
string
$status
The status of this API's index.
Either 'exists', 'partial', or 'none'.
$version_compatible
public
string
$version_compatible
= '3.0.999'
The maximum SMF version that this will work with.
$weight_factors
public
static array<string|int, mixed>
$weight_factors
= ['frequency' => ['search' => 'COUNT(*) / (MAX(t.num_replies) + 1)', 'results' => '(t.num_replies + 1)'], 'age' => ['search' => 'CASE WHEN MAX(m.id_msg) < {int:min_msg} THEN 0 ELSE (MAX(m.id_msg) - {int:min_msg}) / {int:recent_message} END', 'results' => 'CASE WHEN t.id_first_msg < {int:min_msg} THEN 0 ELSE (t.id_first_msg - {int:min_msg}) / {int:recent_message} END'], 'length' => ['search' => 'CASE WHEN MAX(t.num_replies) < {int:huge_topic_posts} THEN MAX(t.num_replies) / {int:huge_topic_posts} ELSE 1 END', 'results' => 'CASE WHEN t.num_replies < {int:huge_topic_posts} THEN t.num_replies / {int:huge_topic_posts} ELSE 1 END'], 'subject' => ['search' => 0, 'results' => 0], 'first_message' => ['search' => 'CASE WHEN MIN(m.id_msg) = MAX(t.id_first_msg) THEN 1 ELSE 0 END'], 'sticky' => ['search' => 'MAX(t.is_sticky)', 'results' => 't.is_sticky']]
Info about how to weigh different factors when searching for relevant results.
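To make the arithmetic in the 'age' and 'length' entries easier to follow, here is a sketch that mirrors the two 'results' CASE expressions in PHP. The function names are hypothetical, the parameters simply reuse the placeholder names from the SQL fragments, and the sketch assumes a non-zero divisor. The database's own division rules apply to the real queries, so treat this as an illustration of the shape of the formulas only.

```php
<?php

/** Mirrors the 'age' expression: 0 for old messages, otherwise a ratio. */
function ageFactor(int $id_first_msg, int $min_msg, int $recent_message): float
{
    if ($id_first_msg < $min_msg) {
        return 0.0;
    }

    // Assumes $recent_message is not zero.
    return ($id_first_msg - $min_msg) / $recent_message;
}

/** Mirrors the 'length' expression: grows with the reply count, capped at 1. */
function lengthFactor(int $num_replies, int $huge_topic_posts): float
{
    return $num_replies < $huge_topic_posts
        ? $num_replies / $huge_topic_posts
        : 1.0;
}

$age_old = ageFactor(50, 100, 200);    // below min_msg, so 0.0
$age_new = ageFactor(150, 100, 200);   // (150 - 100) / 200 = 0.25
$len_small = lengthFactor(50, 200);    // 50 / 200 = 0.25
$len_huge = lengthFactor(500, 200);    // capped at 1.0
```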
$wildcard_words
public
array<string|int, mixed>
$wildcard_words
= []
Search terms that had a * wildcard in them.
$ageMinMsg
protected
int
$ageMinMsg
= 0
Messages with IDs less than this will get a 0 for the age weight factor.
$ageRecentMsg
protected
int
$ageRecentMsg
= 0
ID of the most recent message considered for the age weight factor.
$boardQuery
protected
string
$boardQuery
= ''
SQL query string to filter results by board.
$maxMsgID
protected
int
$maxMsgID
= 0
Messages with IDs greater than this will be ignored in the search.
$memberlist
protected
array<string|int, mixed>
$memberlist
= []
IDs of members to filter our results by.
$minMsgID
protected
int
$minMsgID
= 0
Messages with IDs less than this will be ignored in the search.
$query_match_type
protected
string
$query_match_type
= 'LIKE'
The SQL match function to use. If 'RLIKE', search will be performed using regular expressions. If 'LIKE', search will be performed using simple string matching.
$userQuery
protected
string
$userQuery
= ''
SQL query string to filter results by author.
$weight
protected
array<string|int, mixed>
$weight
= []
Calculated weight factors.
$weight_total
protected
int
$weight_total
= 0
Weight factor total. Used to ensure that calculated factors are given the correct percentage.
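One plausible reading of "the correct percentage" is that each factor's weight is divided by the total. The sketch below shows that normalization under this assumption; the function name and the weight values are invented for the example.

```php
<?php

/**
 * Converts raw weights into fractions of the total so that they sum to 1.
 * Factor names and values are made up for this example.
 */
function normalizeWeights(array $weight): array
{
    $weight_total = array_sum($weight);

    // Avoid dividing by zero when no factor carries any weight.
    if ($weight_total <= 0) {
        return array_fill_keys(array_keys($weight), 0.0);
    }

    return array_map(fn ($w) => $w / $weight_total, $weight);
}

$shares = normalizeWeights([
    'frequency' => 30,
    'age' => 25,
    'length' => 20,
    'subject' => 15,
    'first_message' => 10,
]);
// $shares['frequency'] is 0.3, and all shares together add up to 1.
```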
$backcompat
private
static array<string|int, mixed>
$backcompat
= ['prop_names' => ['loadedApi' => 'searchAPI']]
BackwardCompatibility settings for this class.
$size
private
int
$size
The size of the index.
Methods
__construct()
Constructor.
public
__construct() : mixed
build()
Builds the parsed index from the content of the messages table, and saves it in the log_search_dictionary and log_search_parsed tables.
public
static build([int $start_id = 1 ]) : void
Operates in batches to avoid running out of time.
Parameters
- $start_id : int = 1
-
The ID of the message we should start with.
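The batching pattern can be sketched like this. It is not the real build() code: `buildBatch()`, its parameters, and the callback are hypothetical, and no database is involved. The point is only to show how a start ID and a batch size let the work be resumed in pieces.

```php
<?php

/**
 * Processes one batch of messages and reports where the next batch
 * should start, or null when there is nothing left to do.
 *
 * @param int      $start_id   First message ID to consider.
 * @param int      $max_id     Highest message ID (hypothetical).
 * @param int      $batch_size Number of IDs to cover per call.
 * @param callable $index      Called once per message ID in the batch.
 */
function buildBatch(int $start_id, int $max_id, int $batch_size, callable $index): ?int
{
    $end_id = min($start_id + $batch_size - 1, $max_id);

    for ($id = $start_id; $id <= $end_id; $id++) {
        $index($id);
    }

    return $end_id < $max_id ? $end_id + 1 : null;
}

$indexed = [];
$next = 1;
$calls = 0;

// A real request would stop after one batch and resume later;
// here the batches simply run back to back.
while ($next !== null) {
    $next = buildBatch($next, 600, 250, function (int $id) use (&$indexed) {
        $indexed[] = $id;
    });
    $calls++;
}
// 600 messages at 250 per batch take 3 calls.
```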
compressParams()
Compresses $this->params to a string for use as a URL parameter.
public
compressParams() : string
Return values
string —URL-safe variant of a Base64 string.
detect()
Get the installed Search API implementations.
public
final static detect() : array<string|int, mixed>
Return values
array<string|int, mixed> —Info about the detected search APIs.
exportStatic()
Provides a way to export a class's public static properties and methods to global namespace.
public
static exportStatic() : void
To do so:
- Use this trait in the class.
- At the END of the class's file, call its exportStatic() method.
Although it might not seem that way at first glance, this approach conforms to section 2.3 of PSR-1, since executing this method is simply a dynamic means of declaring functions when the file is included; it has no other side effects.
Regarding the $backcompat items:
A class's static properties are not exported to global variables unless explicitly included in $backcompat['prop_names'].
$backcompat['prop_names'] is a simple array where the keys are the names of one or more of a class's static properties, and the values are the names of global variables. In each case, the global variable will be set to a reference to the static property. Static properties that are not named in this array will not be exported.
Adding non-static properties to the $backcompat arrays will produce runtime errors. It is the responsibility of the developer to make sure not to do this.
formContext()
Lets APIs interact with Utils::$context when setting up the search form.
public
formContext() : void
getAdminSubactions()
Gets info about sub-actions to support in the admin panel for this API.
public
getAdminSubactions([string|null $type = null ]) : array<string|int, mixed>
Parameters
- $type : string|null = null
Return values
array<string|int, mixed> —Info about sub-actions.
getDescription()
Returns the expected Lang::$txt key for this API's localized description.
public
getDescription() : string
Return values
string —Localized description for this API.
getLabel()
Returns the expected Lang::$txt key for this API's localized label.
public
getLabel() : string
Return values
string —Localized label for this API.
getLangStopWords()
Gets a list of all the words in Lang::$txt['search_stopwords'] for all installed language packs.
public
final static getLangStopWords() : array<string|int, mixed>
Return values
array<string|int, mixed>
getQueryParams()
Returns a copy of $this->params with a few extra pieces of data added in.
public
getQueryParams() : array<string|int, mixed>
This exists only for the sake of backward compatibility; mods extending this class can already access the included data directly.
Return values
array<string|int, mixed> —Data about this search query.
getSize()
Gets the size, in bytes, of this API's search index.
public
getSize() : int
Return values
int —Size of the index.
getStatus()
Gets whether the index for this API exists.
public
getStatus() : string
Return values
string —Either 'exists', 'partial', 'none', or null for APIs that don't use an index.
indexedWordQuery()
Search for indexed words.
public
indexedWordQuery(array<string|int, mixed> $words, array<string|int, mixed> $search_data) : mixed
Parameters
- $words : array<string|int, mixed>
-
An array of words
- $search_data : array<string|int, mixed>
-
An array of search data
initializeSearch()
Sets whatever properties are necessary in order to perform the search.
public
initializeSearch() : void
isValid()
Whether this method is valid for implementation or not.
public
isValid() : bool
Return values
bool —Whether or not this method is valid
load()
Creates a search API and returns the object.
public
final static load() : SearchApiInterface
Return values
SearchApiInterface —An instance of the search API interface.
postCreated()
Callback when a post is created.
public
postCreated(array<string|int, mixed> &$msgOptions, array<string|int, mixed> &$topicOptions, array<string|int, mixed> &$posterOptions) : void
Parameters
- $msgOptions : array<string|int, mixed>
-
An array of post data
- $topicOptions : array<string|int, mixed>
-
An array of topic data
- $posterOptions : array<string|int, mixed>
-
An array of info about the person who made this post
postModified()
Callback when a post is modified.
public
postModified(array<string|int, mixed> &$msgOptions, array<string|int, mixed> &$topicOptions, array<string|int, mixed> &$posterOptions) : void
Parameters
- $msgOptions : array<string|int, mixed>
-
An array of post data
- $topicOptions : array<string|int, mixed>
-
An array of topic data
- $posterOptions : array<string|int, mixed>
-
An array of info about the person who made this post
postRemoved()
Callback when a post is removed.
public
postRemoved(int $id_msg) : void
Parameters
- $id_msg : int
-
The ID of the post that was removed
prepareIndexes()
Callback while preparing indexes for searching.
public
prepareIndexes(string $word, array<string|int, mixed> &$wordsSearch, array<string|int, mixed> &$wordsExclude, bool $isExcluded) : void
Parameters
- $word : string
-
A word to index
- $wordsSearch : array<string|int, mixed>
-
Search words
- $wordsExclude : array<string|int, mixed>
-
Words to exclude
- $isExcluded : bool
-
Whether the specified word should be excluded
remove()
Deletes the log_search_dictionary and log_search_parsed tables and resets to standard search method.
public
static remove() : void
resultsContext()
Lets APIs interact with Utils::$context when setting up the results page.
public
resultsContext() : void
searchQuery()
Callback for actually performing the search query.
public
searchQuery(array<string|int, mixed> $query_params, array<string|int, mixed> $searchWords, array<string|int, mixed> $excludedIndexWords, array<string|int, mixed> &$participants, array<string|int, mixed> &$searchArray) : mixed
Parameters
- $query_params : array<string|int, mixed>
-
An array of parameters for the query
- $searchWords : array<string|int, mixed>
-
The words that were searched for
- $excludedIndexWords : array<string|int, mixed>
-
Indexed words that should be excluded
- $participants : array<string|int, mixed>
- $searchArray : array<string|int, mixed>
searchSort()
Callback function for usort used to sort the fulltext results.
public
searchSort(string $a, string $b) : int
Parameters
- $a : string
-
Word A
- $b : string
-
Word B
Return values
int —An integer indicating how the words should be sorted
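A usort() callback returns a negative number, zero, or a positive number depending on whether the first argument should sort before, equal to, or after the second. The documentation above does not state which ordering this API uses, so the sketch below assumes "longer words first" purely for illustration; the function name is hypothetical.

```php
<?php

/**
 * Assumed ordering for this example only: longer words sort first.
 * strlen() counts bytes, which is adequate for the ASCII words used here.
 */
function searchSortSketch(string $a, string $b): int
{
    return strlen($b) <=> strlen($a);
}

$words = ['cat', 'elephant', 'horse'];
usort($words, 'searchSortSketch');
// $words is now ['elephant', 'horse', 'cat']
```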
setParticipants()
Figures out which search result topics the user participated in.
public
setParticipants() : void
supportsMethod()
Check whether the specific search operation can be performed by this API.
public
supportsMethod(string $methodName[, array<string|int, mixed> $query_params = [] ]) : bool
Parameters
- $methodName : string
-
The method
- $query_params : array<string|int, mixed> = []
-
Any parameters for the query
Return values
bool —Whether or not the specified method is supported
topicMerge()
Callback when a topic is merged.
public
topicMerge(int $id_topic, array<string|int, mixed> $topics, array<string|int, mixed> $affected_msgs, string|null $subject) : void
Parameters
- $id_topic : int
-
The ID of the topic that messages were merged into
- $topics : array<string|int, mixed>
-
The ID(s) of the merged topic(s)
- $affected_msgs : array<string|int, mixed>
- $subject : string|null
topicsMoved()
Callback when a topic is moved.
public
topicsMoved(array<string|int, mixed> $topics, int $board_to) : void
Parameters
- $topics : array<string|int, mixed>
-
The ID(s) of the moved topic(s)
- $board_to : int
-
The board that the topics were moved to
topicSplit()
Callback when a topic is split.
public
topicSplit(int $id_topic, array<string|int, mixed> $affected_msgs) : void
Parameters
- $id_topic : int
-
The ID of the topic that messages were split into
- $affected_msgs : array<string|int, mixed>
topicsRemoved()
Callback when a topic is removed.
public
topicsRemoved(array<string|int, mixed> $topics) : void
Parameters
- $topics : array<string|int, mixed>
-
The ID(s) of the removed topic(s)
updateStopwordsSetting()
Updates Config::$modSettings['search_stopwords_parsed'].
public
static updateStopwordsSetting() : void
accentSensitive()
Gets whether the current collation is accent sensitive.
protected
accentSensitive() : bool
Return values
bool
calculateWeight()
Calculates the weight values to use when organizing results by relevance.
protected
calculateWeight() : void
createTables()
Creates the log_search_parsed and log_search_dictionary tables.
protected
static createTables() : void
escapeAccents()
Escapes accents in the string as HTML entities, but only if the database collation ignores accents.
protected
escapeAccents(string $string) : string
Parameters
- $string : string
-
The string.
Return values
string —A version of $string with (possibly) escaped accents.
escapeSqlRegex()
Uses regex to escape SQL in the given string
protected
escapeSqlRegex(string $str) : string
Parameters
- $str : string
-
The string to escape
Return values
string —The escaped string
getWords()
Extracts words from a string and returns them along with info about where each extracted word occurred in the list of extracted words.
protected
getWords(string $string[, bool $filter_blacklist = true ]) : array<string|int, mixed>
For example, if $string were 'foo bar foo', the returned value would be ['foo' => [0, 2], 'bar' => [1]]
Parameters
- $string : string
-
A string.
- $filter_blacklist : bool = true
-
Whether to filter out blacklisted words. Default: true.
Return values
array<string|int, mixed> —Each word and its positions in the list of extracted words.
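The 'foo bar foo' example can be reproduced with a small sketch. This is a simplified stand-in, not the real method: it splits on whitespace only, takes the blacklist as a hypothetical second parameter, and assumes that filtered words do not consume a position.

```php
<?php

/**
 * Splits on whitespace and records, for each word, the positions at which
 * it occurs in the list of extracted words.
 */
function getWordsSketch(string $string, array $blacklist = []): array
{
    $words = [];
    $position = 0;
    $tokens = preg_split('/\s+/u', $string, -1, PREG_SPLIT_NO_EMPTY) ?: [];

    foreach ($tokens as $word) {
        // Assumption: blacklisted words are skipped before positions
        // are assigned.
        if (in_array($word, $blacklist, true)) {
            continue;
        }

        $words[$word][] = $position++;
    }

    return $words;
}

$result = getWordsSketch('foo bar foo');
// $result is ['foo' => [0, 2], 'bar' => [1]]
```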
prepareString()
Gets rid of all the BBCode and HTML in a string, folds the case of all characters, and applies compatibility composition Unicode normalization.
protected
prepareString(string $string) : string
Parameters
- $string : string
-
A string.
Return values
string —Version of the string prepared for indexing.
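The steps named in the description (strip markup, fold case, normalize) might look roughly like the sketch below. The function name and the BBCode pattern are assumptions, the final whitespace clean-up is an addition for tidiness, and the real method may order or implement these steps differently. Case folding uses the mbstring extension and NFKC normalization uses the intl extension when they are available.

```php
<?php

function prepareStringSketch(string $string): string
{
    // Remove HTML tags, then anything that looks like a BBCode tag.
    $string = strip_tags($string);
    $string = preg_replace('~\[/?[a-z][a-z0-9_]*(?:[= ][^\]]*)?\]~i', ' ', $string) ?? $string;

    // Fold case. MB_CASE_FOLD needs the mbstring extension.
    $string = function_exists('mb_convert_case')
        ? mb_convert_case($string, MB_CASE_FOLD, 'UTF-8')
        : strtolower($string);

    // Compatibility composition (NFKC) needs the intl extension.
    if (class_exists('Normalizer')) {
        $normalized = Normalizer::normalize($string, Normalizer::FORM_KC);
        $string = $normalized === false ? $string : $normalized;
    }

    // Not part of the documented behavior: collapse the whitespace
    // left behind by the removed markup.
    return trim(preg_replace('/\s+/u', ' ', $string) ?? $string);
}

$prepared = prepareStringSketch('<p>[b]Hello[/b] <em>World</em></p>');
// $prepared is 'hello world'
```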
removeAccents()
Removes accents in the string.
protected
removeAccents(string $string) : string
Parameters
- $string : string
-
The string.
Return values
string —A version of $string without accents.
save()
Saves word data to log_search_dictionary and log_search_parsed tables.
protected
save(array<string|int, mixed> $word_data) : void
Parameters
- $word_data : array<string|int, mixed>
-
Data about the words.
searchSubjectAndMessage()
Handles searching both in the subject and message text
protected
searchSubjectAndMessage() : mixed
searchSubjectOnly()
Handles searching for posts in the subject only
protected
searchSubjectOnly() : void
setBlacklistedWords()
Like SearchApi::setBlacklistedWords(), except that this doesn't blacklist BBCode tags.
protected
setBlacklistedWords() : void
Why? Because this API builds its index from the parsed version of the messages rather than the unparsed version, which means that raw BBCodes are never part of the indexed strings and therefore don't need to be filtered out of the search terms.
setBoardQuery()
Sets $this->boardQuery based on the given params
protected
setBoardQuery() : void
setMsgBounds()
Finds the lowest and highest message ID based on the given min and/or max age
protected
setMsgBounds() : void
setParams()
Figures out the values for $this->params and related properties.
protected
setParams() : void
setSearchTerms()
Populates $this->searchArray, $this->excludedWords, etc.
protected
setSearchTerms() : void
setSort()
Get the sorting parameters right. Default to sort by relevance descending.
protected
setSort() : void
setUserQuery()
Sets $this->userQuery based on given params
protected
setUserQuery() : void
wordBoundaryWrapper()
Wraps the given string in regex to set a word boundary
protected
wordBoundaryWrapper(string $str) : string
Parameters
- $str : string
-
The string