TYPO3 CMS  TYPO3_7-6
TYPO3\CMS\IndexedSearch\FileContentParser Class Reference

Public Member Functions

 __construct ()
 
 initParser ($extension)
 
 softInit ($extension)
 
 searchTypeMediaTitle ($extension)
 
 isMultiplePageExtension ($extension)
 
 readFileContent ($ext, $absFile, $cPKey)
 
 fileContentParts ($ext, $absFile)
 
 splitPdfInfo ($pdfInfoArray)
 
 removeEndJunk ($string)
 
 getIcon ($extension)
 

Public Attributes

 $pdf_mode = -20
 
 $app = []
 
 $ext2itemtype_map = []
 
 $supportedExtensions = []
 
 $pObj
 

Protected Member Functions

 sL ($reference, $useHtmlSpecialChar=false)
 
 setLocaleForServerFileSystem ($resetLocale=false)
 

Protected Attributes

 $langObject
 

Detailed Description

External standard parsers for indexed_search MUST RETURN utf-8 content!

Definition at line 25 of file FileContentParser.php.

Constructor & Destructor Documentation

◆ __construct()

TYPO3\CMS\IndexedSearch\FileContentParser::__construct ( )

Constructs this external parser object.

Definition at line 64 of file FileContentParser.php.

References $GLOBALS.

Member Function Documentation

◆ fileContentParts()

TYPO3\CMS\IndexedSearch\FileContentParser::fileContentParts (   $ext,
  $absFile 
)

Creates an array with pointers to divisions of document.

Only PDF files are divided at this point. For all other file types the returned array has a single element with the value "0" (zero).

Parameters
string $ext File extension
string $absFile Absolute filename (must exist and be validated OK before calling function)
Returns
array Array of pointers to sections that the document should be divided into

Definition at line 746 of file FileContentParser.php.

References $a, TYPO3\CMS\Core\Utility\CommandUtility\exec(), TYPO3\CMS\Core\Utility\MathUtility\forceIntegerInRange(), TYPO3\CMS\IndexedSearch\FileContentParser\setLocaleForServerFileSystem(), and TYPO3\CMS\IndexedSearch\FileContentParser\splitPdfInfo().

◆ getIcon()

TYPO3\CMS\IndexedSearch\FileContentParser::getIcon (   $extension)

Return icon for file extension

Parameters
string $extension File extension, lowercase.
Returns
string Relative file reference, resolvable by GeneralUtility::getFileAbsFileName()

Definition at line 823 of file FileContentParser.php.

◆ initParser()

TYPO3\CMS\IndexedSearch\FileContentParser::initParser (   $extension)

Initialize external parser for parsing content.

Parameters
string $extension File extension
Returns
bool Returns TRUE if extension is supported/enabled, otherwise FALSE.

Definition at line 76 of file FileContentParser.php.

References $GLOBALS, TYPO3\CMS\Core\Utility\MathUtility\forceIntegerInRange(), TYPO3\CMS\IndexedSearch\FileContentParser\sL(), and TYPO3\CMS\Core\Utility\GeneralUtility\trimExplode().

◆ isMultiplePageExtension()

TYPO3\CMS\IndexedSearch\FileContentParser::isMultiplePageExtension (   $extension)

Returns TRUE if the input extension (item_type) is potentially a multi-page extension.

Parameters
string $extension Extension / item_type string
Returns
bool Return TRUE if multi-page

Definition at line 423 of file FileContentParser.php.

◆ readFileContent()

TYPO3\CMS\IndexedSearch\FileContentParser::readFileContent (   $ext,
  $absFile,
  $cPKey 
)

Reads the content of an external file being indexed.

Parameters
string $ext File extension, e.g. "pdf", "doc" etc.
string $absFile Absolute filename of file (must exist and be validated OK before calling function)
string $cPKey Pointer to section (zero for all types other than PDF; for PDF it indicates the pages into which the document should be split)
Returns
array Standard content array (title, description, keywords, body keys)

Definition at line 459 of file FileContentParser.php.

References TYPO3\CMS\Core\Utility\CommandUtility\exec(), TYPO3\CMS\Core\Utility\GeneralUtility\getUrl(), TYPO3\CMS\IndexedSearch\FileContentParser\removeEndJunk(), TYPO3\CMS\IndexedSearch\FileContentParser\setLocaleForServerFileSystem(), TYPO3\CMS\IndexedSearch\FileContentParser\sL(), TYPO3\CMS\IndexedSearch\FileContentParser\splitPdfInfo(), TYPO3\CMS\Core\Utility\GeneralUtility\tempnam(), and TYPO3\CMS\Core\Utility\GeneralUtility\xml2tree().
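
How initParser(), fileContentParts() and readFileContent() fit together can be sketched as follows. This is a minimal sketch, not the indexer's actual code: the function indexExternalFile() and the class StubParser are hypothetical, and the "1-10"-style section pointers returned by the stub are an assumption made purely for illustration. Only the three method names and their parameters are taken from this reference.

```php
<?php
/**
 * Hypothetical stand-in exposing the three documented methods,
 * so the sketch runs without a TYPO3 installation.
 */
class StubParser
{
    public function initParser($extension)
    {
        // Pretend that only these two extensions are supported and enabled.
        return $extension === 'pdf' || $extension === 'txt';
    }

    public function fileContentParts($ext, $absFile)
    {
        // Assumed pointer format for PDF; other types get the single pointer "0".
        return $ext === 'pdf' ? ['1-10', '11-20'] : ['0'];
    }

    public function readFileContent($ext, $absFile, $cPKey)
    {
        // Standard content array keys as listed in this reference.
        return [
            'title' => basename($absFile),
            'description' => '',
            'keywords' => '',
            'body' => 'section ' . $cPKey,
        ];
    }
}

/**
 * Hypothetical helper: reads every section of a file and
 * returns the content arrays keyed by section pointer.
 */
function indexExternalFile($parser, $ext, $absFile)
{
    // Stop early if the extension is not supported or not enabled.
    if (!$parser->initParser($ext)) {
        return [];
    }
    $sections = [];
    foreach ($parser->fileContentParts($ext, $absFile) as $cPKey) {
        $sections[$cPKey] = $parser->readFileContent($ext, $absFile, $cPKey);
    }
    return $sections;
}
```

With a real FileContentParser instance in place of the stub, the same loop would yield one content array per section pointer. $absFile must already exist and be validated, as the parameter description requires.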

◆ removeEndJunk()

TYPO3\CMS\IndexedSearch\FileContentParser::removeEndJunk (   $string)

Removes some strange char(12) (form feed) characters and line breaks that tend to occur at the end of the string from external files.

Parameters
string $string String to clean up
Returns
string The cleaned-up string

Definition at line 807 of file FileContentParser.php.

Referenced by TYPO3\CMS\IndexedSearch\FileContentParser\readFileContent().
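
The kind of clean-up described for removeEndJunk() can be illustrated with a standalone function. This is a minimal sketch under the assumption that only trailing form feeds (char 12) and line breaks are stripped. The name removeEndJunkSketch() is hypothetical and the class's own implementation may differ.

```php
<?php
/**
 * Hypothetical sketch: strips form feeds (char 12) and line breaks
 * from the end of a string, leaving the rest untouched.
 */
function removeEndJunkSketch($string)
{
    // rtrim() with a character list removes any trailing run of
    // line feeds, carriage returns and form feeds.
    return rtrim($string, "\n\r\x0C");
}
```

Line breaks inside the text are preserved. Only the trailing run is removed.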

◆ searchTypeMediaTitle()

TYPO3\CMS\IndexedSearch\FileContentParser::searchTypeMediaTitle (   $extension)

Return title of entry in media type selector box.

Parameters
string $extension File extension
Returns
string String with label value of entry in media type search selector box (frontend plugin).

Definition at line 288 of file FileContentParser.php.

References $GLOBALS, TYPO3\CMS\IndexedSearch\FileContentParser\sL(), and TYPO3\CMS\Core\Utility\GeneralUtility\trimExplode().

◆ setLocaleForServerFileSystem()

TYPO3\CMS\IndexedSearch\FileContentParser::setLocaleForServerFileSystem (   $resetLocale = false)
protected

Sets the locale for LC_CTYPE to $TYPO3_CONF_VARS['SYS']['systemLocale'] if $TYPO3_CONF_VARS['SYS']['UTF8filesystem'] is set.

Calls must alternate between $resetLocale = FALSE (set the locale) and $resetLocale = TRUE (restore it).

string $lastLocale Stores the locale used before it is overridden by this method.

Parameters
bool $resetLocale TRUE resets the locale to $lastLocale.
Returns
void
Exceptions

Definition at line 714 of file FileContentParser.php.

References $GLOBALS.

Referenced by TYPO3\CMS\IndexedSearch\FileContentParser\fileContentParts(), and TYPO3\CMS\IndexedSearch\FileContentParser\readFileContent().
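
The alternating set/reset pattern described for setLocaleForServerFileSystem() can be sketched in isolation. This is a minimal sketch, not the method's actual body: the class LocaleSwitchSketch, the $targetLocale parameter and the exception messages are hypothetical, the check of $TYPO3_CONF_VARS['SYS']['UTF8filesystem'] is omitted, and throwing on out-of-order calls is an assumption.

```php
<?php
/**
 * Hypothetical sketch of a locale switch that must be called
 * alternately with $resetLocale = false and $resetLocale = true.
 */
class LocaleSwitchSketch
{
    /** @var string|null Locale in effect before the switch */
    private $lastLocale = null;

    public function switchLocale($resetLocale = false, $targetLocale = 'C')
    {
        if ($resetLocale) {
            if ($this->lastLocale === null) {
                throw new RuntimeException('No saved locale to reset to.');
            }
            // Restore the locale saved by the preceding call.
            setlocale(LC_CTYPE, $this->lastLocale);
            $this->lastLocale = null;
        } else {
            if ($this->lastLocale !== null) {
                throw new RuntimeException('Locale already changed; reset it first.');
            }
            // Passing "0" returns the current locale without changing it.
            $this->lastLocale = setlocale(LC_CTYPE, '0');
            setlocale(LC_CTYPE, $targetLocale);
        }
    }
}
```

A caller would wrap the file-system operation between the two calls: switchLocale(false) before running the external tool and switchLocale(true) afterwards.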

◆ sL()

TYPO3\CMS\IndexedSearch\FileContentParser::sL (   $reference,
  $useHtmlSpecialChar = false 
)
protected

Wraps the "splitLabel function" of the language object.

Parameters
string $reference Reference/key of the label
bool $useHtmlSpecialChar Convert special chars to HTML entities (default: FALSE)
Returns
string The label of the reference/key to be fetched

Definition at line 441 of file FileContentParser.php.

Referenced by TYPO3\CMS\IndexedSearch\FileContentParser\initParser(), TYPO3\CMS\IndexedSearch\FileContentParser\readFileContent(), and TYPO3\CMS\IndexedSearch\FileContentParser\searchTypeMediaTitle().

◆ softInit()

TYPO3\CMS\IndexedSearch\FileContentParser::softInit (   $extension)

Initialize external parser for backend modules. Doesn't evaluate whether the parser is configured correctly; it only reports POSSIBLE supported extensions (for showing icons etc.) in the backend and frontend plugin.

Parameters
string $extension File extension to initialize for.
Returns
bool Returns TRUE if the extension is supported and enabled, otherwise FALSE.

Definition at line 245 of file FileContentParser.php.

◆ splitPdfInfo()

TYPO3\CMS\IndexedSearch\FileContentParser::splitPdfInfo (   $pdfInfoArray)

Parses PDF info into a usable format.

Parameters
array $pdfInfoArray Array of PDF content, coming from the pdfinfo tool
Returns
array Result array (private)
See also
fileContentParts()

Definition at line 787 of file FileContentParser.php.

Referenced by TYPO3\CMS\IndexedSearch\FileContentParser\fileContentParts(), and TYPO3\CMS\IndexedSearch\FileContentParser\readFileContent().
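
The transformation described for splitPdfInfo() can be sketched as a standalone function. This is a minimal sketch assuming the input is the pdfinfo output as an array of "Key: value" lines. The name splitPdfInfoSketch() is hypothetical, and lowercasing the keys is an assumption rather than documented behaviour.

```php
<?php
/**
 * Hypothetical sketch: turns "Key: value" lines (as printed by the
 * pdfinfo tool) into an associative array keyed by the lowercased key.
 */
function splitPdfInfoSketch(array $pdfInfoArray)
{
    $result = [];
    foreach ($pdfInfoArray as $line) {
        // Split on the first colon only, so values may contain colons.
        $parts = explode(':', $line, 2);
        if (count($parts) === 2 && trim($parts[0]) !== '') {
            $result[strtolower(trim($parts[0]))] = trim($parts[1]);
        }
    }
    return $result;
}
```

Lines without a colon are skipped, and values such as timestamps keep their internal colons because the split is limited to two parts.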

Member Data Documentation

◆ $app

TYPO3\CMS\IndexedSearch\FileContentParser::$app = []

Definition at line 39 of file FileContentParser.php.

◆ $ext2itemtype_map

TYPO3\CMS\IndexedSearch\FileContentParser::$ext2itemtype_map = []

Definition at line 44 of file FileContentParser.php.

◆ $langObject

TYPO3\CMS\IndexedSearch\FileContentParser::$langObject
protected

Definition at line 59 of file FileContentParser.php.

◆ $pdf_mode

TYPO3\CMS\IndexedSearch\FileContentParser::$pdf_mode = -20

Definition at line 34 of file FileContentParser.php.

◆ $pObj

TYPO3\CMS\IndexedSearch\FileContentParser::$pObj

Definition at line 54 of file FileContentParser.php.

◆ $supportedExtensions

TYPO3\CMS\IndexedSearch\FileContentParser::$supportedExtensions = []

Definition at line 49 of file FileContentParser.php.