Indexer
Indexing class for TYPO3 frontend
Table of Contents
Properties
- $conf : array<string|int, mixed>
- Configuration set internally (see init functions for required keys and their meaning)
- $content_md5h : int
- $contentParts : array<string|int, mixed>
- Content of TYPO3 page
- $defaultContentArray : array<string|int, mixed>
- $defaultGrList : string
- Fe-group list (pages might be indexed separately for each usergroup combination to support search in access-limited pages)
- $enableMetaphoneSearch : bool
- $excludeSections : string
- HTML code blocks to exclude from indexing
- $external_parsers : array<string|int, mixed>
- External parser objects for the supported external file extensions. Keys are file extension names; values are objects with certain methods.
- $externalFileCounter : int
- $file_phash_arr : array<string|int, mixed>
- Hash array for files
- $flagBitMask : int
- $forceIndexing : bool
- If set, indexing is forced despite content hashes, mtime etc.
- $freqMax : float
- $freqRange : int
- $hash : array<string|int, mixed>
- Hash array, contains phash and phash_grouping
- $indexerConfig : array<string|int, mixed>
- Indexer configuration, coming from TYPO3's system configuration for EXT:indexed_search
- $indexExternalUrl_content : string
- $internal_log : array<string|int, mixed>
- Internal log
- $lexerObj : Lexer
- Lexer object for word splitting
- $maxExternalFiles : int
- Max number of external files to index.
- $metaphoneContent : string
- $metaphoneObj : DoubleMetaPhoneUtility
- Metaphone object, if any
- $reasons : array<string|int, mixed>
- $storeMetaphoneInfoAsWords : bool
- $tstamp_maxAge : int
- If set, the maximum age in seconds of an indexed document. Regardless of mtime, the document is re-indexed if this limit is exceeded.
- $tstamp_minAge : int
- If set, the minimum age before a document can be indexed again, regardless of mtime.
- $wordcount : int
- $timeTracker : TimeTracker
Methods
- __construct() : mixed
- Indexer constructor.
- analyzeBody() : mixed
- Calculates relevant information for bodycontent
- analyzeHeaderinfo() : mixed
- Calculates relevant information for headercontent
- bodyDescription() : string
- Extracts the sample description text from the content array.
- charsetEntity2utf8() : mixed
- Convert character set and HTML entities in the value of input content array keys
- checkContentHash() : mixed
- Check content hash in phash table
- checkExternalDocContentHash() : bool
- Check content hash for external documents. Returns TRUE if the document needs to be indexed (that is, there was no result).
- checkMtimeTstamp() : int
- Check the mtime / tstamp of the currently indexed page/file (based on phash). Returns a positive integer if the page needs to be indexed.
- checkWordList() : mixed
- Adds new words to db
- convertHTMLToUtf8() : string
- Converts a HTML document to utf-8
- embracingTags() : bool
- Finds the first occurrence of embracing tags and returns the embraced content and the original string with the tag removed in the two passed variables. Returns FALSE if no match is found. Useful, e.g., for finding the <title> of a document or removing <script> sections.
- extractBaseHref() : string
- Extracts the "base href" from content string.
- extractHyperLinks() : array<string|int, mixed>
- Extracts all links to external documents from the HTML content string
- extractLinks() : mixed
- Extract links (hrefs) from HTML content and if indexable media is found, it is indexed.
- fileContentParts() : array<string|int, mixed>
- Creates an array with pointers to divisions of document.
- freqMap() : int
- Maps frequency from a real number in [0;1] to an integer in [0;$this->freqRange], with anything above $this->freqMax treated as 1, and back.
- getHTMLcharset() : string
- Extract the charset value from HTML meta tag.
- getRootLineFields() : mixed
- Adding values for root-line fields.
- getUrlHeaders() : mixed
- Getting HTTP request headers of URL
- indexAnalyze() : array<string|int, mixed>
- Analyzes content to use for indexing.
- indexExternalUrl() : mixed
- Index External URLs HTML content
- indexRegularDocument() : mixed
- Indexing a regular document given as $file (relative to public web path, local file)
- indexTypo3PageContent() : mixed
- Start indexing of the TYPO3 page
- init() : mixed
- Initializes the object.
- initializeExternalParsers() : mixed
- Initialize external parsers
- is_grlist_set() : bool
- Checks if a grlist record has been set for the phash value input (looking at the "real" phash of the current content, not the linked-to phash of the common search result page)
- log_pull() : mixed
- Pull function wrapper for TT logging
- log_push() : mixed
- Push function wrapper for TT logging
- log_setTSlogMessage() : mixed
- Set log message function wrapper for TT logging
- metaphone() : mixed
- Creating metaphone based hash from input word
- processWordsInArrays() : array<string|int, mixed>
- Processing words in the array from split*Content -functions
- readFileContent() : array<string|int, mixed>
- Reads the content of an external file being indexed.
- removeOldIndexedFiles() : mixed
- Removes records for the indexed file, $phash
- removeOldIndexedPages() : mixed
- Removes records for the indexed page, $phash
- setExtHashes() : array<string|int, mixed>
- Get search hash, external files
- setT3Hashes() : mixed
- Get search hash, T3 pages
- splitHTMLContent() : array<string|int, mixed>
- Splits HTML content and returns an associative array, with title, a list of metatags, and a list of words in the body.
- splitRegularContent() : array<string|int, mixed>
- Splits non-HTML content (from external files for instance)
- submit_grlist() : mixed
- Stores gr_list in the database.
- submit_section() : mixed
- Stores section. $hash and $hash_t3 are the same for TYPO3 pages, but different for external files.
- submitFile_grlist() : mixed
- Stores file gr_list for a file IF it does not exist already
- submitFile_section() : mixed
- Stores file section for a file IF it does not exist
- submitFilePage() : mixed
- Updates db with information about the file
- submitPage() : mixed
- Updates db with information about the page (TYPO3 page, not external media)
- submitWords() : mixed
- Submits RELATIONS between words and phash
- typoSearchTags() : bool
- Removes content that shouldn't be indexed according to TYPO3SEARCH-tags.
- update_grlist() : mixed
- Checks if a grlist entry for this hash exists and, if not, writes one.
- updateParsetime() : mixed
- Update parsetime for phash row.
- updateRootline() : mixed
- Update section rootline for the page
- updateSetId() : mixed
- Update SetID of the index_phash record.
- updateTstamp() : mixed
- Update tstamp for a phash row.
- addSpacesToKeywordList() : string
- Makes sure that keywords are space-separated. This is important for their proper displaying as a part of fulltext index.
- createLocalPath() : string
- Checks if the file is local
- createLocalPathFromAbsoluteURL() : string
- Attempts to create a local file path from the absolute URL without schema.
- createLocalPathFromRelativeURL() : string
- Attempts to create a local file path from the relative URL.
- createLocalPathUsingAbsRefPrefix() : string
- Attempts to create a local file path by matching absRefPrefix. This requires TSFE. If TSFE is missing, this function does nothing.
- createLocalPathUsingDomainURL() : string
- Attempts to create a local file path by matching a current request URL.
- isAllowedLocalFile() : bool
- Checks if the path points to the file inside the web site
- isRelativeURL() : bool
- Checks if URL is relative.
Properties
$conf
Configuration set internally (see init functions for required keys and their meaning)
public
array<string|int, mixed>
$conf
= []
$content_md5h
public
int
$content_md5h
$contentParts
Content of TYPO3 page
public
array<string|int, mixed>
$contentParts
= []
$defaultContentArray
public
array<string|int, mixed>
$defaultContentArray
= ['title' => '', 'description' => '', 'keywords' => '', 'body' => '']
$defaultGrList
Fe-group list (pages might be indexed separately for each usergroup combination to support search in access-limited pages)
public
string
$defaultGrList
= '0,-1'
$enableMetaphoneSearch
public
bool
$enableMetaphoneSearch
= false
$excludeSections
HTML code blocks to exclude from indexing
public
string
$excludeSections
= 'script,style'
$external_parsers
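External parser objects for the supported external file extensions. Keys are file extension names; values are objects with certain methods.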
public
array<string|int, mixed>
$external_parsers
= []
$externalFileCounter
public
int
$externalFileCounter
= 0
$file_phash_arr
Hash array for files
public
array<string|int, mixed>
$file_phash_arr
= []
$flagBitMask
public
int
$flagBitMask
$forceIndexing
If set, indexing is forced despite content hashes, mtime etc.
public
bool
$forceIndexing
= false
$freqMax
public
float
$freqMax
= 0.1
$freqRange
public
int
$freqRange
= 32000
$hash
Hash array, contains phash and phash_grouping
public
array<string|int, mixed>
$hash
= []
$indexerConfig
Indexer configuration, coming from TYPO3's system configuration for EXT:indexed_search
public
array<string|int, mixed>
$indexerConfig
= []
$indexExternalUrl_content
public
string
$indexExternalUrl_content
= ''
$internal_log
Internal log
public
array<string|int, mixed>
$internal_log
= []
$lexerObj
Lexer object for word splitting
public
Lexer
$lexerObj
$maxExternalFiles
Max number of external files to index.
public
int
$maxExternalFiles
= 0
$metaphoneContent
public
string
$metaphoneContent
= ''
$metaphoneObj
Metaphone object, if any
public
DoubleMetaPhoneUtility
$metaphoneObj
$reasons
public
array<string|int, mixed>
$reasons
= [-1 => 'mtime matched the document, so no changes detected and no content updated', -2 => 'The minimum age was not exceeded', 1 => 'The configured max-age was exceeded for the document and thus it\'s indexed.', 2 => 'The minimum age was exceed and mtime was set and the mtime was different, so the page was indexed.', 3 => 'The minimum age was exceed, but mtime was not set, so the page was indexed.', 4 => 'Page has never been indexed (is not represented in the index_phash table).']
$storeMetaphoneInfoAsWords
public
bool
$storeMetaphoneInfoAsWords
$tstamp_maxAge
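If set, the maximum age in seconds of an indexed document. Regardless of mtime, the document is re-indexed if this limit is exceeded.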
public
int
$tstamp_maxAge
= 0
$tstamp_minAge
If set, the minimum age before a document can be indexed again, regardless of mtime.
public
int
$tstamp_minAge
= 0
$wordcount
public
int
$wordcount
= 0
$timeTracker
protected
TimeTracker
$timeTracker
Methods
__construct()
Indexer constructor.
public
__construct() : mixed
analyzeBody()
Calculates relevant information for bodycontent
public
analyzeBody(array<string|int, mixed> &$retArr, array<string|int, mixed> $content) : mixed
Parameters
- $retArr : array<string|int, mixed>
-
Index array, passed by reference
- $content : array<string|int, mixed>
-
Standard content array
analyzeHeaderinfo()
Calculates relevant information for headercontent
public
analyzeHeaderinfo(array<string|int, mixed> &$retArr, array<string|int, mixed> $content, string $key, int $offset) : mixed
Parameters
- $retArr : array<string|int, mixed>
-
Index array, passed by reference
- $content : array<string|int, mixed>
-
Standard content array
- $key : string
-
Key from standard content array
- $offset : int
-
Bit-wise priority to type
bodyDescription()
Extracts the sample description text from the content array.
public
bodyDescription(array<string|int, mixed> $contentArr) : string
Parameters
- $contentArr : array<string|int, mixed>
-
Content array
Return values
string —Description string
charsetEntity2utf8()
Convert character set and HTML entities in the value of input content array keys
public
charsetEntity2utf8(array<string|int, mixed> &$contentArr, string $charset) : mixed
Parameters
- $contentArr : array<string|int, mixed>
-
Standard content array
- $charset : string
-
Charset of the input content (converted to utf-8)
checkContentHash()
Check content hash in phash table
public
checkContentHash() : mixed
Return values
mixed —Returns TRUE if the page needs to be indexed (that is, there was no result), otherwise the phash value (in an array) of the phash record to which the grlist_record should be related!
checkExternalDocContentHash()
Check content hash for external documents. Returns TRUE if the document needs to be indexed (that is, there was no result).
public
checkExternalDocContentHash(int $hashGr, int $content_md5h) : bool
Parameters
- $hashGr : int
-
phash value to check (phash_grouping)
- $content_md5h : int
-
Content hash to check
Return values
bool —Returns TRUE if the document needs to be indexed (that is, there was no result)
checkMtimeTstamp()
Check the mtime / tstamp of the currently indexed page/file (based on phash). Returns a positive integer if the page needs to be indexed.
public
checkMtimeTstamp(int $mtime, int $phash) : int
Parameters
- $mtime : int
-
mtime value to test against limits and indexed page (usually this is the mtime of the cached document)
- $phash : int
-
"phash" used to select any already indexed page to see what its mtime is.
Return values
int —Result integer. Generally: <0 = No indexing, >0 = Do indexing (see $this->reasons):
- -2: Min age was NOT exceeded, so indexing cannot occur.
- -1: mtime matched, so no need to reindex page.
- 0: N/A
- 1: Max age exceeded, page must be indexed again.
- 2: mtime of indexed page doesn't match mtime given for current content, so the page must be indexed.
- 3: No mtime was set, so the page will be indexed.
- 4: No indexed page found, so the page will be indexed.
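The result codes above can be read as a small decision procedure. The following is a minimal, database-free sketch of that reading, not the class's actual implementation. The function name, the `$row` array with its `tstamp` and `item_mtime` keys, the `$now` parameter, and the order in which the max-age and min-age checks are applied are all assumptions made for illustration.

```php
<?php
/**
 * Hypothetical sketch of the decision described for checkMtimeTstamp().
 *
 * @param int        $mtime  mtime of the current content (0 if unknown)
 * @param array|null $row    already indexed record with assumed keys
 *                           'tstamp' and 'item_mtime', or null if none exists
 * @param int        $now    current time as a Unix timestamp
 * @param int        $minAge stands in for $tstamp_minAge (0 = not set)
 * @param int        $maxAge stands in for $tstamp_maxAge (0 = not set)
 */
function checkMtimeTstampSketch(int $mtime, ?array $row, int $now, int $minAge = 0, int $maxAge = 0): int
{
    if ($row === null) {
        return 4; // never indexed before
    }
    if ($maxAge > 0 && $now > $row['tstamp'] + $maxAge) {
        return 1; // max age exceeded, index again
    }
    if ($minAge > 0 && $now <= $row['tstamp'] + $minAge) {
        return -2; // min age not exceeded yet, do not index
    }
    if ($mtime === 0) {
        return 3; // no mtime given, index to be safe
    }
    // Compare the stored mtime with the mtime of the current content.
    return $row['item_mtime'] === $mtime ? -1 : 2;
}
```

A caller would index the document only when the returned value is greater than zero and could look up the explanation for any code in `$reasons`.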
checkWordList()
Adds new words to db
public
checkWordList(array<string|int, mixed> $wordListArray) : mixed
Parameters
- $wordListArray : array<string|int, mixed>
-
Word List array (where each word has information about position etc).
convertHTMLToUtf8()
Converts a HTML document to utf-8
public
convertHTMLToUtf8(string $content[, string $charset = '' ]) : string
Parameters
- $content : string
-
HTML content, any charset
- $charset : string = ''
-
Optional charset (otherwise extracted from HTML)
Return values
string —Converted HTML
embracingTags()
Finds the first occurrence of embracing tags and returns the embraced content and the original string with the tag removed in the two passed variables. Returns FALSE if no match is found. Useful, e.g., for finding the <title> of a document or removing <script> sections.
public
embracingTags(string $string, string $tagName, string &$tagContent, string &$stringAfter, string &$paramList) : bool
Parameters
- $string : string
-
String to search in
- $tagName : string
-
Tag name, e.g. "script"
- $tagContent : string
-
Passed by reference: Content inside found tag
- $stringAfter : string
-
Passed by reference: Content after found tag
- $paramList : string
-
Passed by reference: Attributes of the found tag.
Return values
bool —Returns FALSE if tag was not found, otherwise TRUE.
extractBaseHref()
Extracts the "base href" from content string.
public
extractBaseHref(string $html) : string
Parameters
- $html : string
-
Content to analyze
Return values
string —The base href or an empty string if not found
extractHyperLinks()
Extracts all links to external documents from the HTML content string
public
extractHyperLinks(string $html) : array<string|int, mixed>
Parameters
- $html : string
Return values
array<string|int, mixed> —Array of hyperlinks (keys: tag, href, localPath (empty if not local))
extractLinks()
Extract links (hrefs) from HTML content and if indexable media is found, it is indexed.
public
extractLinks(string $content) : mixed
Parameters
- $content : string
-
HTML content
fileContentParts()
Creates an array with pointers to divisions of document.
public
fileContentParts(string $ext, string $absFile) : array<string|int, mixed>
Parameters
- $ext : string
-
File extension
- $absFile : string
-
Absolute filename (must exist and be validated OK before calling function)
Return values
array<string|int, mixed> —Array of pointers to sections that the document should be divided into
freqMap()
Maps frequency from a real number in [0;1] to an integer in [0;$this->freqRange], with anything above $this->freqMax treated as 1, and back.
public
freqMap(float $freq) : int
Parameters
- $freq : float
-
Frequency
Return values
int —Frequency in range.
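As an illustration of the forward mapping, here is a minimal sketch under one possible interpretation of the description: frequencies are scaled linearly so that `$freqMax` reaches the top of the range, and anything higher is capped there. The function name is hypothetical, the defaults mirror the documented `$freqRange` (32000) and `$freqMax` (0.1), and the real method may compute the value differently. The "and back" direction is not covered.

```php
<?php
/**
 * Hypothetical sketch: map a frequency in [0;1] to an integer in [0;$freqRange].
 */
function freqMapSketch(float $freq, int $freqRange = 32000, float $freqMax = 0.1): int
{
    // Anything above $freqMax is treated like a frequency of 1,
    // which corresponds to the top of the integer range.
    $normalized = min($freq / $freqMax, 1.0);

    return (int) round($normalized * $freqRange);
}
```

With these defaults, a word making up 5% of the text lands in the middle of the range, while any word at or above 10% is stored as the maximum value.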
getHTMLcharset()
Extract the charset value from HTML meta tag.
public
getHTMLcharset(string $content) : string
Parameters
- $content : string
-
HTML content
Return values
string —The charset value if found.
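A minimal sketch of how a charset could be pulled out of an HTML meta tag follows. The function name and the regular expression are assumptions for illustration, and returning an empty string when nothing is found is also an assumption, since the description above only covers the "found" case.

```php
<?php
/**
 * Hypothetical sketch: extract the charset from an HTML meta tag.
 */
function getHtmlCharsetSketch(string $content): string
{
    // Covers <meta charset="utf-8"> as well as
    // <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">.
    $pattern = '/<meta\b[^>]*\bcharset\s*=\s*["\']?\s*([a-z0-9._:-]+)/i';
    if (preg_match($pattern, $content, $match)) {
        return $match[1];
    }

    return '';
}
```

The extracted value could then serve as the `$charset` argument when converting the document to utf-8.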
getRootLineFields()
Adding values for root-line fields.
public
getRootLineFields(array<string|int, mixed> &$fieldArray) : mixed
rl0, rl1 and rl2 are standard. A hook might add more.
Parameters
- $fieldArray : array<string|int, mixed>
-
Field array, passed by reference
getUrlHeaders()
Getting HTTP request headers of URL
public
getUrlHeaders(string $url) : mixed
Parameters
- $url : string
-
The URL
Return values
mixed —If no answer, returns FALSE. Otherwise an array where HTTP headers are keys
indexAnalyze()
Analyzes content to use for indexing.
public
indexAnalyze(array<string|int, mixed> $content) : array<string|int, mixed>
Parameters
- $content : array<string|int, mixed>
-
Standard content array: an array with the keys title, keywords, description and body, which all contain an array of words.
Return values
array<string|int, mixed> —Index Array (whatever that is...)
indexExternalUrl()
Index External URLs HTML content
public
indexExternalUrl(string $externalUrl) : mixed
Parameters
- $externalUrl : string
-
URL, e.g. "https://typo3.org/"
indexRegularDocument()
Indexing a regular document given as $file (relative to public web path, local file)
public
indexRegularDocument(string $file[, bool $force = false ][, string $contentTmpFile = '' ][, string $altExtension = '' ]) : mixed
Parameters
- $file : string
-
Relative Filename, relative to public web path. It can also be an absolute path as long as it is inside the lockRootPath (validated with \TYPO3\CMS\Core\Utility\GeneralUtility::isAbsPath()). Finally, if $contentTmpFile is set, this value can be anything, most likely a URL
- $force : bool = false
-
If set, indexing is forced (despite content hashes, mtime etc).
- $contentTmpFile : string = ''
-
Temporary file with the content to read it from (instead of $file). Used when the $file is a URL.
- $altExtension : string = ''
-
File extension for temporary file.
indexTypo3PageContent()
Start indexing of the TYPO3 page
public
indexTypo3PageContent() : mixed
init()
Initializes the object.
public
init([array<string|int, mixed>|null $configuration = null ]) : mixed
Parameters
- $configuration : array<string|int, mixed>|null = null
-
will be used to set $this->conf, otherwise $this->conf MUST be set with proper values prior to this call
initializeExternalParsers()
Initialize external parsers
public
initializeExternalParsers() : mixed
is_grlist_set()
Checks if a grlist record has been set for the phash value input (looking at the "real" phash of the current content, not the linked-to phash of the common search result page)
public
is_grlist_set(int $phash_x) : bool
Parameters
- $phash_x : int
-
Phash integer to test.
Return values
bool
log_pull()
Pull function wrapper for TT logging
public
log_pull() : mixed
log_push()
Push function wrapper for TT logging
public
log_push(string $msg, string $key) : mixed
Parameters
- $msg : string
-
Title to set
- $key : string
-
Key (?)
log_setTSlogMessage()
Set log message function wrapper for TT logging
public
log_setTSlogMessage(string $msg[, int|string $logLevel = LogLevel::INFO ]) : mixed
Parameters
- $msg : string
-
Message to set
- $logLevel : int|string = LogLevel::INFO
metaphone()
Creating metaphone based hash from input word
public
metaphone(string $word[, bool $returnRawMetaphoneValue = false ]) : mixed
Parameters
- $word : string
-
Word to convert
- $returnRawMetaphoneValue : bool = false
-
If set, returns the raw metaphone value (not hashed)
Return values
mixed —Metaphone hash integer (or raw value, string)
processWordsInArrays()
Processing words in the array from split*Content -functions
public
processWordsInArrays(array<string|int, mixed> $contentArr) : array<string|int, mixed>
Parameters
- $contentArr : array<string|int, mixed>
-
Array of content to index, see splitHTMLContent() and splitRegularContent()
Return values
array<string|int, mixed> —Content input array modified so each key is now a unique array of words
readFileContent()
Reads the content of an external file being indexed.
public
readFileContent(string $fileExtension, string $absoluteFileName, string $sectionPointer) : array<string|int, mixed>
The content from the external parser MUST be returned in utf-8!
Parameters
- $fileExtension : string
-
File extension, eg. "pdf", "doc" etc.
- $absoluteFileName : string
-
Absolute filename of file (must exist and be validated OK before calling function)
- $sectionPointer : string
-
Pointer to section (zero for all file types other than PDF, which will have an indication of the pages into which the document should be split).
Return values
array<string|int, mixed> —Standard content array (title, description, keywords, body keys)
removeOldIndexedFiles()
Removes records for the indexed file, $phash
public
removeOldIndexedFiles(int $phash) : mixed
Parameters
- $phash : int
-
phash value to flush
removeOldIndexedPages()
Removes records for the indexed page, $phash
public
removeOldIndexedPages(int $phash) : mixed
Parameters
- $phash : int
-
phash value to flush
setExtHashes()
Get search hash, external files
public
setExtHashes(string $file[, array<string|int, mixed> $subinfo = [] ]) : array<string|int, mixed>
Parameters
- $file : string
-
File name / path which identifies it on the server
- $subinfo : array<string|int, mixed> = []
-
Additional content identifying the (subpart of) content. For instance; PDF files are divided into groups of pages for indexing.
Return values
array<string|int, mixed> —Array with "phash_grouping" and "phash" inside.
setT3Hashes()
Get search hash, T3 pages
public
setT3Hashes() : mixed
splitHTMLContent()
Splits HTML content and returns an associative array, with title, a list of metatags, and a list of words in the body.
public
splitHTMLContent(string $content) : array<string|int, mixed>
Parameters
- $content : string
-
HTML content to index. To some degree expected to be made by TYPO3 (i.e. splitting the header by ":")
Return values
array<string|int, mixed> —Array of content, having keys "title", "body", "keywords" and "description" set.
splitRegularContent()
Splits non-HTML content (from external files for instance)
public
splitRegularContent(string $content) : array<string|int, mixed>
Parameters
- $content : string
-
Input content (non-HTML) to index.
Return values
array<string|int, mixed> —Array of content, having the key "body" set (plus "title", "description" and "keywords", but empty)
submit_grlist()
Stores gr_list in the database.
public
submit_grlist(int $hash, int $phash_x) : mixed
Parameters
- $hash : int
-
Search result record phash
- $phash_x : int
-
Actual phash of current content
submit_section()
Stores section. $hash and $hash_t3 are the same for TYPO3 pages, but different for external files.
public
submit_section(int $hash, int $hash_t3) : mixed
Parameters
- $hash : int
-
phash of TYPO3 parent search result record
- $hash_t3 : int
-
phash of the file indexation search record
submitFile_grlist()
Stores file gr_list for a file IF it does not exist already
public
submitFile_grlist(int $hash) : mixed
Parameters
- $hash : int
-
phash value of file
submitFile_section()
Stores file section for a file IF it does not exist
public
submitFile_section(int $hash) : mixed
Parameters
- $hash : int
-
phash value of file
submitFilePage()
Updates db with information about the file
public
submitFilePage(array<string|int, mixed> $hash, string $file, array<string|int, mixed> $subinfo, string $ext, int $mtime, int $ctime, int $size, int $content_md5h, array<string|int, mixed> $contentParts) : mixed
Parameters
- $hash : array<string|int, mixed>
-
Array with phash and phash_grouping keys for file
- $file : string
-
File name
- $subinfo : array<string|int, mixed>
-
Array of "static_page_arguments" for files: this is for instance the page index for a PDF file (for other document types it will be zero)
- $ext : string
-
File extension determining the type of media.
- $mtime : int
-
Modification time of file.
- $ctime : int
-
Creation time of file.
- $size : int
-
Size of file in bytes
- $content_md5h : int
-
Content HASH value.
- $contentParts : array<string|int, mixed>
-
Standard content array (using only title and body for a file)
submitPage()
Updates db with information about the page (TYPO3 page, not external media)
public
submitPage() : mixed
submitWords()
Submits RELATIONS between words and phash
public
submitWords(array<string|int, mixed> $wordList, int $phash) : mixed
Parameters
- $wordList : array<string|int, mixed>
-
Word list array
- $phash : int
-
phash value
typoSearchTags()
Removes content that shouldn't be indexed according to TYPO3SEARCH-tags.
public
typoSearchTags(string &$body) : bool
Parameters
- $body : string
-
HTML Content, passed by reference
Return values
bool —Returns TRUE if a TYPO3SEARCH_ tag was found, otherwise FALSE.
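To show the idea of restricting the indexed body to marked regions, here is a minimal sketch. It assumes the markers are the HTML comments `<!--TYPO3SEARCH_begin-->` and `<!--TYPO3SEARCH_end-->` and handles only well-formed begin/end pairs. The function name is hypothetical, and the real method may treat unpaired or leading markers differently.

```php
<?php
/**
 * Hypothetical sketch: keep only the content between begin/end marker pairs.
 *
 * @param string $body HTML content, passed by reference
 */
function typoSearchTagsSketch(string &$body): bool
{
    $pattern = '/<!--\s?TYPO3SEARCH_begin\s?-->(.*?)<!--\s?TYPO3SEARCH_end\s?-->/s';
    // preg_match_all() returns the number of matched pairs (0 if none).
    if (!preg_match_all($pattern, $body, $matches)) {
        return false; // no complete marker pair: leave $body untouched
    }
    // Concatenate the marked regions in document order.
    $body = implode('', $matches[1]);

    return true;
}
```

In this sketch, navigation or footer markup outside the markers is dropped from `$body` before it is split into words.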
update_grlist()
Checks if a grlist entry for this hash exists and, if not, writes one.
public
update_grlist(int $phash, int $phash_x) : mixed
Parameters
- $phash : int
-
phash of the search result that should be found
- $phash_x : int
-
The real phash of the current content. The two values are different when a page with userlogin turns out to contain the exact same content as another already indexed version of the page; This is the whole reason for the grlist table in fact...
updateParsetime()
Update parsetime for phash row.
public
updateParsetime(int $phash, int $parsetime) : mixed
Parameters
- $phash : int
-
phash value.
- $parsetime : int
-
Parsetime value to set.
updateRootline()
Update section rootline for the page
public
updateRootline() : mixed
updateSetId()
Update SetID of the index_phash record.
public
updateSetId(int $phash) : mixed
Parameters
- $phash : int
-
phash value
updateTstamp()
Update tstamp for a phash row.
public
updateTstamp(int $phash[, int $mtime = 0 ]) : mixed
Parameters
- $phash : int
-
phash value
- $mtime : int = 0
-
If set, update the mtime field to this value.
addSpacesToKeywordList()
Makes sure that keywords are space-separated. This is important for their proper displaying as a part of fulltext index.
protected
addSpacesToKeywordList(string $keywordList) : string
Parameters
- $keywordList : string
Return values
string
createLocalPath()
Checks if the file is local
protected
createLocalPath(string $sourcePath) : string
Parameters
- $sourcePath : string
Return values
string —Absolute path to file if file is local, else empty string
createLocalPathFromAbsoluteURL()
Attempts to create a local file path from the absolute URL without schema.
protected
createLocalPathFromAbsoluteURL(string $sourcePath) : string
Parameters
- $sourcePath : string
Return values
string
createLocalPathFromRelativeURL()
Attempts to create a local file path from the relative URL.
protected
createLocalPathFromRelativeURL(string $sourcePath) : string
Parameters
- $sourcePath : string
Return values
string
createLocalPathUsingAbsRefPrefix()
Attempts to create a local file path by matching absRefPrefix. This requires TSFE. If TSFE is missing, this function does nothing.
protected
createLocalPathUsingAbsRefPrefix(string $sourcePath) : string
Parameters
- $sourcePath : string
Return values
string
createLocalPathUsingDomainURL()
Attempts to create a local file path by matching a current request URL.
protected
createLocalPathUsingDomainURL(string $sourcePath) : string
Parameters
- $sourcePath : string
Return values
string
isAllowedLocalFile()
Checks if the path points to the file inside the web site
protected
static isAllowedLocalFile(string $filePath) : bool
Parameters
- $filePath : string
Return values
bool
isRelativeURL()
Checks if URL is relative.
protected
static isRelativeURL(string $url) : bool
Parameters
- $url : string
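One plausible way to decide whether a URL is relative is sketched below. It treats a URL as relative when it has no scheme, no host, and a path that does not start with a slash. That definition, like the function name, is an assumption for illustration; the actual method may use different criteria.

```php
<?php
/**
 * Hypothetical sketch: decide whether a URL is relative.
 */
function isRelativeUrlSketch(string $url): bool
{
    $parts = parse_url($url);
    if ($parts === false) {
        return false; // seriously malformed URL
    }
    $hasScheme = ($parts['scheme'] ?? '') !== '';
    $hasHost = ($parts['host'] ?? '') !== '';
    $path = $parts['path'] ?? '';

    // Relative here means: no scheme, no host, and a path without a leading slash.
    return !$hasScheme && !$hasHost && $path !== '' && $path[0] !== '/';
}
```

Under this definition, `fileadmin/doc.pdf` counts as relative, while `/fileadmin/doc.pdf` and `https://typo3.org/` do not.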