TYPO3 CMS  TYPO3_6-2
TYPO3\CMS\IndexedSearch\FileContentParser Class Reference
Inheritance diagram for TYPO3\CMS\IndexedSearch\FileContentParser:
tx_indexed_search_extparse

Public Member Functions

 __construct ()
 
 initParser ($extension)
 
 softInit ($extension)
 
 searchTypeMediaTitle ($extension)
 
 isMultiplePageExtension ($extension)
 
 readFileContent ($ext, $absFile, $cPKey)
 
 fileContentParts ($ext, $absFile)
 
 splitPdfInfo ($pdfInfoArray)
 
 removeEndJunk ($string)
 
 getIcon ($extension)
 

Public Attributes

 $pdf_mode = -20
 
 $app = array()
 
 $ext2itemtype_map = array()
 
 $supportedExtensions = array()
 
 $pObj
 

Protected Member Functions

 sL ($reference, $useHtmlSpecialChar=FALSE)
 
 setLocaleForServerFileSystem ($resetLocale=FALSE)
 

Protected Attributes

 $langObject
 

Detailed Description

This file is part of the TYPO3 CMS project.

It is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License, either version 2 of the License, or any later version.

For the full copyright and license information, please read the LICENSE.txt file that was distributed with this source code.

The TYPO3 project - inspiring people to share!

External standard parsers for indexed_search.

Author
Kasper Skårhøj kasperYYYY@typo3.com
Olivier Simah noname_paris@yahoo.fr

External standard parsers for indexed_search MUST RETURN UTF-8 content!

Definition at line 28 of file FileContentParser.php.

Constructor & Destructor Documentation

◆ __construct()

TYPO3\CMS\IndexedSearch\FileContentParser::__construct ( )

Constructs this external parser object.

Definition at line 65 of file FileContentParser.php.

References $GLOBALS, and TYPO3_MODE.

Member Function Documentation

◆ fileContentParts()

TYPO3\CMS\IndexedSearch\FileContentParser::fileContentParts (   $ext,
  $absFile 
)

Creates an array with pointers to divisions of document.

ONLY for PDF files at this point. All other types will have an array with a single element with the value "0" (zero) coming back.

Parameters
string $ext File extension
string $absFile Absolute filename (must exist and be validated OK before calling function)
Returns
array Array of pointers to sections that the document should be divided into
Todo:
Define visibility

Definition at line 686 of file FileContentParser.php.

References TYPO3\CMS\Core\Utility\CommandUtility\exec(), TYPO3\CMS\Core\Utility\MathUtility\forceIntegerInRange(), TYPO3\CMS\IndexedSearch\FileContentParser\setLocaleForServerFileSystem(), and TYPO3\CMS\IndexedSearch\FileContentParser\splitPdfInfo().

◆ getIcon()

TYPO3\CMS\IndexedSearch\FileContentParser::getIcon (   $extension)

Returns the icon for a file extension.

Parameters
string $extension File extension, lowercase.
Returns
string Relative file reference, resolvable by ::getFileAbsFileName()
Todo:
Define visibility

Definition at line 763 of file FileContentParser.php.

◆ initParser()

TYPO3\CMS\IndexedSearch\FileContentParser::initParser (   $extension)

Initialize external parser for parsing content.

Parameters
string $extension File extension
Returns
boolean Returns TRUE if extension is supported/enabled, otherwise FALSE.
Todo:
Define visibility

Definition at line 77 of file FileContentParser.php.

References $GLOBALS, TYPO3\CMS\Core\Utility\MathUtility\forceIntegerInRange(), TYPO3\CMS\IndexedSearch\FileContentParser\sL(), and TYPO3\CMS\Core\Utility\GeneralUtility\trimExplode().

◆ isMultiplePageExtension()

TYPO3\CMS\IndexedSearch\FileContentParser::isMultiplePageExtension (   $extension)

Returns TRUE if the input extension (item_type) is potentially a multi-page extension.

Parameters
string $extension Extension / item_type string
Returns
boolean Returns TRUE if multi-page
Todo:
Define visibility

Definition at line 409 of file FileContentParser.php.

◆ readFileContent()

TYPO3\CMS\IndexedSearch\FileContentParser::readFileContent (   $ext,
  $absFile,
  $cPKey 
)

Reads the content of an external file being indexed.

Parameters
string $ext File extension, e.g. "pdf", "doc" etc.
string $absFile Absolute filename of file (must exist and be validated OK before calling function)
string $cPKey Pointer to section. Zero for all file types other than PDF; for PDF it indicates the pages into which the document should be split.
Returns
array Standard content array (title, description, keywords, body keys)
Todo:
Define visibility

Definition at line 443 of file FileContentParser.php.

References TYPO3\CMS\Core\Utility\CommandUtility\exec(), TYPO3\CMS\Core\Utility\GeneralUtility\getUrl(), TYPO3\CMS\IndexedSearch\FileContentParser\removeEndJunk(), TYPO3\CMS\IndexedSearch\FileContentParser\setLocaleForServerFileSystem(), TYPO3\CMS\IndexedSearch\FileContentParser\sL(), TYPO3\CMS\IndexedSearch\FileContentParser\splitPdfInfo(), TYPO3\CMS\Core\Utility\GeneralUtility\tempnam(), and TYPO3\CMS\Core\Utility\GeneralUtility\xml2tree().
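
The parameter descriptions suggest that the pointers returned by fileContentParts() are the values passed as $cPKey to readFileContent(). The sketch below illustrates that calling pattern only. StubParser and indexFileSketch() are hypothetical stand-ins, and the "1-10" style page-range pointers are an assumption made for illustration; the stub merely mimics the two documented signatures and return shapes so the example runs without TYPO3.

```php
<?php
// Hypothetical stand-in for FileContentParser; it fakes a PDF split into three parts.
class StubParser
{
    public function fileContentParts($ext, $absFile)
    {
        // Assumed pointer format for PDFs; every other type yields a single zero.
        return $ext === 'pdf' ? array('1-10', '11-20', '21-30') : array(0);
    }

    public function readFileContent($ext, $absFile, $cPKey)
    {
        // Standard content array with title, description, keywords and body keys.
        return array(
            'title' => basename($absFile),
            'description' => '',
            'keywords' => '',
            'body' => 'content of section ' . $cPKey,
        );
    }
}

// Hypothetical driver: read one content array per section pointer.
function indexFileSketch($parser, $ext, $absFile)
{
    $sections = array();
    foreach ($parser->fileContentParts($ext, $absFile) as $cPKey) {
        $sections[] = $parser->readFileContent($ext, $absFile, $cPKey);
    }
    return $sections;
}

$pdfSections = indexFileSketch(new StubParser(), 'pdf', '/tmp/report.pdf');
$docSections = indexFileSketch(new StubParser(), 'doc', '/tmp/letter.doc');
```

With the real class, the file would first have to exist and be validated, as both methods require.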

◆ removeEndJunk()

TYPO3\CMS\IndexedSearch\FileContentParser::removeEndJunk (   $string)

Removes some strange char(12) characters and line breaks that tend to occur at the end of strings read from external files.

Parameters
string $string String to clean up
Returns
string The cleaned-up string
Todo:
Define visibility

Definition at line 747 of file FileContentParser.php.

Referenced by TYPO3\CMS\IndexedSearch\FileContentParser\readFileContent().
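
As an illustration of the cleanup described above, here is a minimal standalone sketch. removeEndJunkSketch() is a hypothetical helper, not the TYPO3 method, and the exact set of stripped characters (line feed, carriage return, form feed) is an assumption based on the description.

```php
<?php
// Hypothetical helper: strips trailing line breaks and char(12) (form feed).
function removeEndJunkSketch($string)
{
    // The character class matches LF, CR and form feed; "+$" anchors the run to the end.
    return preg_replace('/[\n\r\x0C]+$/', '', $string);
}

$raw = "Last paragraph of the file.\n\x0C\n\x0C";
$clean = removeEndJunkSketch($raw);
```

Line breaks inside the text are left alone; only the trailing run is removed.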

◆ searchTypeMediaTitle()

TYPO3\CMS\IndexedSearch\FileContentParser::searchTypeMediaTitle (   $extension)

Returns the title of the entry in the media type selector box.

Parameters
string $extension File extension
Returns
string String with label value of entry in media type search selector box (frontend plugin).
Todo:
Define visibility

Definition at line 291 of file FileContentParser.php.

References $GLOBALS, TYPO3\CMS\IndexedSearch\FileContentParser\sL(), and TYPO3\CMS\Core\Utility\GeneralUtility\trimExplode().

◆ setLocaleForServerFileSystem()

TYPO3\CMS\IndexedSearch\FileContentParser::setLocaleForServerFileSystem (   $resetLocale = FALSE)
protected

Sets the locale for LC_CTYPE to $TYPO3_CONF_VARS['SYS']['systemLocale'] if $TYPO3_CONF_VARS['SYS']['UTF8filesystem'] is set.

Calls must alternate: the first call passes $resetLocale = FALSE to set the locale, the next passes TRUE to restore it, and so on.

Internally, $lastLocale (string) stores the locale that was in use before this method overrode it.

Parameters
boolean $resetLocale TRUE resets the locale to $lastLocale.
Returns
void
Exceptions

Definition at line 654 of file FileContentParser.php.

References $GLOBALS.

Referenced by TYPO3\CMS\IndexedSearch\FileContentParser\fileContentParts(), and TYPO3\CMS\IndexedSearch\FileContentParser\readFileContent().
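
The set/reset alternation can be sketched in plain PHP as follows. switchCtypeLocaleSketch() is a hypothetical helper, not the TYPO3 method: it takes the target locale as an argument instead of reading $TYPO3_CONF_VARS, and throwing a RuntimeException on out-of-order calls is an assumption about how misuse might be handled.

```php
<?php
// Hypothetical helper mirroring the documented FALSE/TRUE alternation.
function switchCtypeLocaleSketch($targetLocale, $resetLocale = false)
{
    // Remembers the previous locale between the "set" call and the "reset" call.
    static $lastLocale = null;

    if ($resetLocale) {
        if ($lastLocale === null) {
            throw new RuntimeException('Reset requested before the locale was set.');
        }
        setlocale(LC_CTYPE, $lastLocale);
        $lastLocale = null;
    } else {
        if ($lastLocale !== null) {
            throw new RuntimeException('Locale was already set; reset it first.');
        }
        // Passing "0" queries the current locale without changing it.
        $lastLocale = setlocale(LC_CTYPE, '0');
        setlocale(LC_CTYPE, $targetLocale);
    }
}

$before = setlocale(LC_CTYPE, '0');
switchCtypeLocaleSketch('C');        // set the locale before calling an external tool
// ... run the external tool here ...
switchCtypeLocaleSketch('C', true);  // restore the previous locale afterwards
$after = setlocale(LC_CTYPE, '0');
```

Pairing every "set" call with a "reset" call keeps the locale change scoped to the external command.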

◆ sL()

TYPO3\CMS\IndexedSearch\FileContentParser::sL (   $reference,
  $useHtmlSpecialChar = FALSE 
)
protected

Wraps the "splitLabel function" of the language object.

Parameters
string $reference Reference/key of the label
boolean $useHtmlSpecialChar Convert special chars to HTML entities (default: FALSE)
Returns
string The label of the reference/key to be fetched

Definition at line 425 of file FileContentParser.php.

Referenced by TYPO3\CMS\IndexedSearch\FileContentParser\initParser(), TYPO3\CMS\IndexedSearch\FileContentParser\readFileContent(), and TYPO3\CMS\IndexedSearch\FileContentParser\searchTypeMediaTitle().

◆ softInit()

TYPO3\CMS\IndexedSearch\FileContentParser::softInit (   $extension)

Initializes the external parser for backend modules. It does not check whether the parser is configured correctly; it only reports the POSSIBLY supported extensions (for showing icons etc.) in the backend and the frontend plugin.

Parameters
string $extension File extension to initialize for.
Returns
boolean Returns TRUE if the extension is supported and enabled, otherwise FALSE.
Todo:
Define visibility

Definition at line 237 of file FileContentParser.php.

◆ splitPdfInfo()

TYPO3\CMS\IndexedSearch\FileContentParser::splitPdfInfo (   $pdfInfoArray)

Parses PDF info into a usable format.

Parameters
array $pdfInfoArray Array of PDF content, coming from the pdfinfo tool
Returns
array Result array (private)
See also
fileContentParts()
Todo:
Define visibility

Definition at line 727 of file FileContentParser.php.

Referenced by TYPO3\CMS\IndexedSearch\FileContentParser\fileContentParts(), and TYPO3\CMS\IndexedSearch\FileContentParser\readFileContent().
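
The pdfinfo tool prints one "Key: value" pair per line. The sketch below shows one plausible way to turn such lines into an associative array. splitPdfInfoSketch() is a hypothetical helper, not the TYPO3 method, and lowercasing the keys is an assumption made for convenient lookups.

```php
<?php
// Hypothetical helper: converts "Key: value" lines into an associative array.
function splitPdfInfoSketch($pdfInfoArray)
{
    $result = array();
    if (!is_array($pdfInfoArray)) {
        return $result;
    }
    foreach ($pdfInfoArray as $line) {
        // Split on the first colon only, so values such as timestamps keep their colons.
        $parts = explode(':', $line, 2);
        if (count($parts) === 2 && trim($parts[0]) !== '') {
            $result[strtolower(trim($parts[0]))] = trim($parts[1]);
        }
    }
    return $result;
}

// Sample lines in the shape pdfinfo produces; the values are made up.
$lines = array(
    'Title:          Annual Report',
    'Pages:          42',
    'CreationDate:   Mon Jan  6 10:15:00 2014',
);
$info = splitPdfInfoSketch($lines);
```

A caller such as fileContentParts() could then read the page count from the "pages" key to decide how to divide the document.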

Member Data Documentation

◆ $app

TYPO3\CMS\IndexedSearch\FileContentParser::$app = array()
Todo:
Define visibility

Definition at line 41 of file FileContentParser.php.

◆ $ext2itemtype_map

TYPO3\CMS\IndexedSearch\FileContentParser::$ext2itemtype_map = array()
Todo:
Define visibility

Definition at line 46 of file FileContentParser.php.

◆ $langObject

TYPO3\CMS\IndexedSearch\FileContentParser::$langObject
protected

Definition at line 59 of file FileContentParser.php.

◆ $pdf_mode

TYPO3\CMS\IndexedSearch\FileContentParser::$pdf_mode = -20
Todo:
Define visibility

Definition at line 34 of file FileContentParser.php.

◆ $pObj

TYPO3\CMS\IndexedSearch\FileContentParser::$pObj
Todo:
Define visibility

Definition at line 56 of file FileContentParser.php.

◆ $supportedExtensions

TYPO3\CMS\IndexedSearch\FileContentParser::$supportedExtensions = array()
Todo:
Define visibility

Definition at line 51 of file FileContentParser.php.