‪TYPO3CMS  10.4
TYPO3\CMS\Core\Html\RteHtmlParser Class Reference
Inheritance diagram for TYPO3\CMS\Core\Html\RteHtmlParser:
TYPO3\CMS\Core\Html\HtmlParser

Public Member Functions

 __construct (EventDispatcherInterface $eventDispatcher)
 
 init ($elRef='', $recPid=0)
 
string transformTextForRichTextEditor (string $value, array $processingConfiguration)
 
string transformTextForPersistence (string $value, array $processingConfiguration)
 
string RTE_transform ($value, $_=null, $direction='rte', $thisConfig=[])
 
- ‪Public Member Functions inherited from ‪TYPO3\CMS\Core\Html\HtmlParser
array splitIntoBlock ($tag, $content, $eliminateExtraEndTags=false)
 
string splitIntoBlockRecursiveProc ($tag, $content, &$procObj, $callBackContent, $callBackTags, $level=0)
 
array splitTags ($tag, $content)
 
string removeFirstAndLastTag ($str)
 
string getFirstTag ($str)
 
string getFirstTagName ($str, $preserveCase=false)
 
array get_tag_attributes ($tag, $deHSC=false)
 
array split_tag_attributes ($tag)
 
string HTMLcleaner ($content, $tags=[], $keepAll=0, $hSC=0, $addConfig=[])
 
string bidir_htmlspecialchars ($value, $dir)
 
string prefixResourcePath ($main_prefix, $content, $alternatives=[], $suffix='')
 
string prefixRelPath ($prefix, $srcVal, $suffix='')
 
array|string caseShift ($str, $caseSensitiveComparison, $cacheKey='')
 
string compileTagAttribs ($tagAttrib, $meta=[])
 
array HTMLparserConfig ($TSconfig, $keepTags=[])
 
string stripEmptyTags ($content, $tagList='', $treatNonBreakingSpaceAsEmpty=false, $keepTags=false)
 

Protected Member Functions

 setProcessingConfiguration (array $processingConfiguration)
 
array resolveAppliedTransformationModes (string $direction)
 
string runHtmlParserIfConfigured ($content, $configurationDirective)
 
string TS_links_db ($value)
 
string TS_transform_db ($value)
 
string TS_transform_rte ($value)
 
string HTMLcleaner_db ($content)
 
array getKeepTags ($direction='rte')
 
string|array divideIntoLines ($value, $count=5, $returnArray=false)
 
string setDivTags ($value)
 
string processContentWithinParagraph (string $content, string $fullContentWithTag)
 
string sanitizeLineBreaksForContentOnly (string $content)
 
string streamlineLineBreaksForProcessing (string $content)
 
string streamlineLineBreaksAfterProcessing (string $content)
 
string markBrokenLinks (string $content)
 
string removeBrokenLinkMarkers (string $content)
 
 htmlSanitize (string $content, array $configuration)
 
- ‪Protected Member Functions inherited from ‪TYPO3\CMS\Core\Html\HtmlParser
string stripEmptyTagsIfConfigured ($value, $configuration)
 

Protected Attributes

string $blockElementList = 'DIV,TABLE,BLOCKQUOTE,PRE,UL,OL,H1,H2,H3,H4,H5,H6,ADDRESS,DL,DD,HEADER,SECTION,FOOTER,NAV,ARTICLE,ASIDE,FIGURE'
 
string $defaultAllowedTagsList = 'b,i,u,a,img,br,div,center,pre,figure,figcaption,font,hr,sub,sup,p,strong,em,li,ul,ol,blockquote,strike,span,abbr,acronym,dfn'
 
array $procOptions = array( )
 
int $TS_transform_db_safecounter = 100
 
array $getKeepTags_cache = array( )
 
array $allowedClasses = array( )
 
array $allowedAttributesForParagraphTags
 
array $allowedTagsOutsideOfParagraphs
 
EventDispatcherInterface $eventDispatcher
 
- ‪Protected Attributes inherited from ‪TYPO3\CMS\Core\Html\HtmlParser
array $caseShift_cache = array( )
 

Additional Inherited Members

- ‪Public Attributes inherited from ‪TYPO3\CMS\Core\Html\HtmlParser
const VOID_ELEMENTS = 'area|base|br|col|command|embed|hr|img|input|keygen|meta|param|source|track|wbr'
 

Detailed Description

Class for parsing HTML for the Rich Text Editor (also called transformations).

Concerning line breaks: regardless of whether LF (Unix-style) or CRLF (Windows) line breaks were put in, the HtmlParser converts all line breaks to LFs and works with LFs internally. Once all transformations are done, all LFs are converted to CRLFs. This means RteHtmlParser always returns CRLFs, for maximum compatibility with all formats.
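The line-break handling described above can be illustrated with a minimal sketch. The function names below are hypothetical and not part of the class. Only the behaviour (LF internally, CRLF on return) is taken from the description.

```php
<?php
declare(strict_types=1);

// Hypothetical helper names; only the LF/CRLF behaviour comes from the description.

/** Before processing: drop every CR so that only LF remains. */
function toLineFeeds(string $content): string
{
    return str_replace("\r", '', $content);
}

/** After processing: normalise to LF first, then turn every LF into CRLF. */
function toCarriageReturnLineFeeds(string $content): string
{
    return str_replace("\n", "\r\n", toLineFeeds($content));
}

$mixed = "one\r\ntwo\nthree";
var_dump(toLineFeeds($mixed) === "one\ntwo\nthree");
var_dump(toCarriageReturnLineFeeds($mixed) === "one\r\ntwo\r\nthree");
```

Normalising to LF before expanding to CRLF keeps input that already contains CRLF from ending up with doubled CRs.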

Definition at line 37 of file RteHtmlParser.php.

Constructor & Destructor Documentation

◆ __construct()

TYPO3\CMS\Core\Html\RteHtmlParser::__construct ( EventDispatcherInterface  $eventDispatcher)

Member Function Documentation

◆ divideIntoLines()

string|array TYPO3\CMS\Core\Html\RteHtmlParser::divideIntoLines (   $value,
  $count = 5,
  $returnArray = false 
)
protected

Resolves $value into parts based on <p> sections, which are returned as lines separated by LF. The point is to turn the HTML code returned from the RTE into ordinary, human-readable lines. setDivTags() does the opposite. This function processes content going into the database.

Parameters
string $value: Value to process.
int $count: Recursion brake. Decremented on each recursion down to zero. Default is 5 (which equals the allowed nesting levels of p tags).
bool $returnArray: If TRUE, an array with the lines is returned, otherwise a string of the processed input value.
Returns
string|array Processed input value.
See also
setDivTags()

Definition at line 626 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Html\RteHtmlParser\processContentWithinParagraph(), TYPO3\CMS\Core\Html\HtmlParser\removeFirstAndLastTag(), TYPO3\CMS\Core\Html\RteHtmlParser\sanitizeLineBreaksForContentOnly(), and TYPO3\CMS\Core\Html\HtmlParser\splitIntoBlock().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\TS_transform_db().
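As a rough illustration of resolving content into lines by paragraph section, here is a simplified sketch. It is not the TYPO3 implementation: the function name is hypothetical, it uses a plain regular expression rather than splitIntoBlock(), and it does not handle nested tags or the recursion brake.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of TYPO3: returns each <p> section as its own line.
 * Content outside <p> sections becomes a line of its own; nesting is not handled.
 *
 * @return string[]
 */
function paragraphSectionsToLines(string $html): array
{
    // Split on complete <p>...</p> sections and keep the sections themselves.
    $parts = preg_split(
        '/(<p\b[^>]*>.*?<\/p>)/is',
        $html,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    ) ?: [];

    $lines = [];
    foreach ($parts as $part) {
        $part = trim($part);
        if ($part !== '') {
            $lines[] = $part;
        }
    }
    return $lines;
}

$lines = paragraphSectionsToLines('<p>First</p><p class="intro">Second</p>');
echo implode("\n", $lines), "\n";
```

Returning an array and joining it with LF at the call site mirrors the choice the $returnArray parameter offers.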

◆ getKeepTags()

array TYPO3\CMS\Core\Html\RteHtmlParser::getKeepTags (   $direction = 'rte')
protected

Creates a configuration array for the HTMLcleaner function, based on whether content goes to or from the Rich Text Editor ($direction).

Parameters
string $direction: The direction of the content being processed by the output configuration: "db" (content going into the database from the RTE) or "rte" (content going into the form).
Returns
array Configuration array
See also
HTMLcleaner_db()

Definition at line 555 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Html\RteHtmlParser\$allowedClasses, TYPO3\CMS\Core\Html\HtmlParser\HTMLparserConfig(), and TYPO3\CMS\Core\Utility\GeneralUtility\trimExplode().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\HTMLcleaner_db(), TYPO3\CMS\Core\Html\RteHtmlParser\setDivTags(), and TYPO3\CMS\Core\Html\RteHtmlParser\TS_transform_db().

◆ HTMLcleaner_db()

string TYPO3\CMS\Core\Html\RteHtmlParser::HTMLcleaner_db (   $content)
protected

Cleans content going into the database, e.g. by removing disallowed HTML and ds-HSC content. It basically calls HTMLcleaner from the parent class with a preset configuration set up specifically for cleaning content going from the RTE into the database.

Parameters
string $content: Content to clean up
Returns
string Clean content
See also
getKeepTags()

Definition at line 541 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Html\RteHtmlParser\getKeepTags(), and TYPO3\CMS\Core\Html\HtmlParser\HTMLcleaner().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\processContentWithinParagraph().

◆ htmlSanitize()

TYPO3\CMS\Core\Html\RteHtmlParser::htmlSanitize ( string  $content,
array  $configuration 
)
protected

◆ init()

TYPO3\CMS\Core\Html\RteHtmlParser::init (   $elRef = '',
  $recPid = 0 
)

Initializes the parser, setting the element reference and record PID.

Parameters
string $elRef: Element reference, e.g. "tt_content:bodytext"
int $recPid: PID of the record (page ID)
Deprecated:
Will be removed in TYPO3 v11.0, as it serves no purpose anymore.

Definition at line 130 of file RteHtmlParser.php.

◆ markBrokenLinks()

string TYPO3\CMS\Core\Html\RteHtmlParser::markBrokenLinks ( string  $content)
protected

Content transformation from DB to RTE. Checks all <a> tags that reference a t3://page link and verifies that the page is available. If it is not, some offensive styling is added.

Parameters
string $content
Returns
string the modified content

Definition at line 811 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Html\HtmlParser\get_tag_attributes(), TYPO3\CMS\Core\Html\HtmlParser\getFirstTag(), TYPO3\CMS\Core\Html\HtmlParser\removeFirstAndLastTag(), and TYPO3\CMS\Core\Html\HtmlParser\splitIntoBlock().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForRichTextEditor().
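The idea of flagging links to missing pages can be sketched as follows. This is only an illustration under stated assumptions: the function name, the list of existing page IDs, the t3://page?uid=N link shape, and the data-broken-link attribute are all hypothetical and are not taken from the class.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: adds a marker attribute to <a> tags whose t3://page target is unknown.
 *
 * @param int[] $existingPageIds Page IDs assumed to exist (stand-in for a real lookup).
 */
function flagMissingPageLinks(string $content, array $existingPageIds): string
{
    return preg_replace_callback(
        '/<a\b([^>]*)href="t3:\/\/page\?uid=(\d+)"([^>]*)>/i',
        static function (array $match) use ($existingPageIds): string {
            if (in_array((int)$match[2], $existingPageIds, true)) {
                // Target exists: leave the opening tag untouched.
                return $match[0];
            }
            // Target missing: rebuild the opening tag with the hypothetical marker attribute.
            return '<a' . $match[1] . 'href="t3://page?uid=' . $match[2] . '"' . $match[3]
                . ' data-broken-link="1">';
        },
        $content
    ) ?? $content;
}

echo flagMissingPageLinks(
    '<a href="t3://page?uid=1">ok</a> <a href="t3://page?uid=99">gone</a>',
    [1]
), "\n";
```

A matching clean-up step for the opposite direction would strip the marker attribute again before the content is stored.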

◆ processContentWithinParagraph()

string TYPO3\CMS\Core\Html\RteHtmlParser::processContentWithinParagraph ( string  $content,
string  $fullContentWithTag 
)
protected

Used for transformation from RTE to DB.

Works on a single line within a <p> tag when storing into the database. This always adds <p> tags and validates the arguments; additionally, the content is cleaned up via the HTMLcleaner.

Parameters
string $content: the content within the <p> tag
string $fullContentWithTag: the whole <p> tag surrounded as well
Returns
string the full <p> tag with cleaned content

Definition at line 724 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Html\HtmlParser\compileTagAttribs(), TYPO3\CMS\Core\Html\HtmlParser\get_tag_attributes(), TYPO3\CMS\Core\Html\HtmlParser\getFirstTag(), TYPO3\CMS\Core\Html\RteHtmlParser\HTMLcleaner_db(), and TYPO3\CMS\Core\Utility\GeneralUtility\trimExplode().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\divideIntoLines().

◆ removeBrokenLinkMarkers()

string TYPO3\CMS\Core\Html\RteHtmlParser::removeBrokenLinkMarkers ( string  $content)
protected

Content transformation from RTE to DB. Removes the link error attributes that were added to the <a> tags of broken links.

Parameters
string $content: the content to process
Returns
string the modified content

Definition at line 854 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Html\HtmlParser\get_tag_attributes(), TYPO3\CMS\Core\Html\HtmlParser\getFirstTag(), TYPO3\CMS\Core\Html\HtmlParser\removeFirstAndLastTag(), and TYPO3\CMS\Core\Html\HtmlParser\splitIntoBlock().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForPersistence().

◆ resolveAppliedTransformationModes()

array TYPO3\CMS\Core\Html\RteHtmlParser::resolveAppliedTransformationModes ( string  $direction)
protected

Determines which transformation modes should be executed and ensures that each is executed only once.

Parameters
string $direction
Returns
array the resolved transformation modes

Definition at line 303 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Utility\GeneralUtility\trimExplode().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForPersistence(), and TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForRichTextEditor().
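A minimal sketch of resolving a mode list so that each mode runs only once might look like this. It is a stand-alone illustration, not the class's code: the function name and the comma-separated input format are assumptions, and so is the reversal of the order for the "rte" direction.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-alone sketch, not the TYPO3 implementation.
 *
 * @param string $modeList  Comma-separated mode names (assumed input format).
 * @param string $direction "db" or "rte".
 * @return string[] Unique mode names in the order they should run.
 */
function resolveModes(string $modeList, string $direction): array
{
    // Trim each entry and drop empty ones.
    $modes = array_filter(
        array_map('trim', explode(',', $modeList)),
        static function (string $mode): bool {
            return $mode !== '';
        }
    );
    // Keep only the first occurrence of each mode so nothing runs twice.
    $modes = array_values(array_unique($modes));
    // Assumption: the "rte" direction applies the modes in reverse order.
    return $direction === 'rte' ? array_reverse($modes) : $modes;
}

print_r(resolveModes('css_transform, ts_links,css_transform', 'db'));
```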

◆ RTE_transform()

string TYPO3\CMS\Core\Html\RteHtmlParser::RTE_transform (   $value,
  $_ = null,
  $direction = 'rte',
  $thisConfig = [] 
)

Transforms a value for the RTE based on specConf, in the direction specified by $direction (rte/db). This was the main function called from DataHandler and the transfer data classes, but it has been superseded by the transformText* methods.

Parameters
string $value: Input value
null $_: unused
string $direction: Direction of the transformation. Two keywords are allowed: "db" or "rte". "db" means the transformation cleans up content coming from the Rich Text Editor and going into the database. "rte" is the other direction: content coming from the database is transformed to fit the RTE.
array $thisConfig: Parsed TypoScript content configuring the RTE, probably coming from Page TSconfig.
Returns
string Output value
Deprecated:
Will be removed in TYPO3 v11.0; use the transformText* methods instead.

Definition at line 285 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForPersistence(), and TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForRichTextEditor().
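For code that still passes a direction keyword, the mapping onto the two newer methods could be sketched as below. The interface and the adapter function are hypothetical stand-ins so the example runs without TYPO3. Reading the processing options from a "proc." key of the old configuration array is an assumption, as is returning the value unchanged for an unknown direction.

```php
<?php
declare(strict_types=1);

/** Hypothetical stand-in exposing the two public method signatures listed in this reference. */
interface TextTransformer
{
    public function transformTextForRichTextEditor(string $value, array $processingConfiguration): string;

    public function transformTextForPersistence(string $value, array $processingConfiguration): string;
}

/**
 * Hypothetical adapter: maps the old $direction keyword onto the newer methods.
 * Assumption: the processing options live under the "proc." key of $thisConfig.
 */
function transformByDirection(TextTransformer $parser, string $value, string $direction, array $thisConfig = []): string
{
    $processingConfiguration = $thisConfig['proc.'] ?? [];
    if ($direction === 'rte') {
        return $parser->transformTextForRichTextEditor($value, $processingConfiguration);
    }
    if ($direction === 'db') {
        return $parser->transformTextForPersistence($value, $processingConfiguration);
    }
    // Unknown direction: return the value untouched (a choice made for this sketch).
    return $value;
}

// Dummy implementation so the sketch runs without TYPO3; it only tags the value.
$dummy = new class implements TextTransformer {
    public function transformTextForRichTextEditor(string $value, array $processingConfiguration): string
    {
        return 'rte:' . $value;
    }

    public function transformTextForPersistence(string $value, array $processingConfiguration): string
    {
        return 'db:' . $value;
    }
};

echo transformByDirection($dummy, '<p>Hi</p>', 'db'), "\n";
```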

◆ runHtmlParserIfConfigured()

string TYPO3\CMS\Core\Html\RteHtmlParser::runHtmlParserIfConfigured (   $content,
  $configurationDirective 
)
protected

Runs the HTML parser if it is configured, fetching additional HTML cleaner configuration. These settings are applied either before or after the main transformation, so they are processing options you can set up completely independently.

This is currently only possible via TSconfig (procOptions).

Parameters
string $content
string $configurationDirective: used to look up in the procOptions if enabled, and then fetch the
Returns
string the processed content

Definition at line 338 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Html\HtmlParser\HTMLcleaner(), and TYPO3\CMS\Core\Html\HtmlParser\HTMLparserConfig().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForPersistence(), and TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForRichTextEditor().

◆ sanitizeLineBreaksForContentOnly()

string TYPO3\CMS\Core\Html\RteHtmlParser::sanitizeLineBreaksForContentOnly ( string  $content)
protected

Wraps <hr> tags with LFs and removes double LFs. Used when transforming from RTE to DB.

Parameters
string $content
Returns
string the modified content

Definition at line 762 of file RteHtmlParser.php.

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\divideIntoLines().
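The wrap-and-collapse step can be sketched as follows, assuming the tag in question is <hr>. The function name is hypothetical and the regular expressions are a simplification, not the class's own code.

```php
<?php
declare(strict_types=1);

/** Hypothetical sketch: surround <hr> tags with LFs, then remove doubled LFs. */
function wrapRulesWithLineFeeds(string $content): string
{
    // Put an LF before and after every <hr> tag (with or without attributes or a closing slash).
    $content = preg_replace('/<hr\b[^>]*>/i', "\n" . '$0' . "\n", $content) ?? $content;
    // Collapse any run of LFs into a single LF (this sketch collapses whole runs in one pass).
    $content = preg_replace('/\n{2,}/', "\n", $content) ?? $content;
    // Drop LFs at the very start and end.
    return trim($content, "\n");
}

echo wrapRulesWithLineFeeds("above<hr />\n\nbelow"), "\n";
```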

◆ setDivTags()

string TYPO3\CMS\Core\Html\RteHtmlParser::setDivTags (   $value)
protected

Converts all lines into <p> sections (unless the line already has a p tag). For processing content going from the database to the RTE.

Parameters
string $value: Value to convert
Returns
string Processed value.
See also
divideIntoLines()

Definition at line 679 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Html\RteHtmlParser\getKeepTags(), and TYPO3\CMS\Core\Html\HtmlParser\HTMLcleaner().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\TS_transform_rte().
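Wrapping plain lines into paragraph sections can be illustrated with a short sketch. The function name is hypothetical, and the real method additionally runs the content through HTMLcleaner, which is left out here.

```php
<?php
declare(strict_types=1);

/** Hypothetical sketch: wrap each LF-separated line in <p>...</p> unless it already starts with a p tag. */
function linesToParagraphSections(string $value): string
{
    $sections = [];
    foreach (explode("\n", $value) as $line) {
        if (preg_match('/^\s*<p\b/i', $line) === 1) {
            // The line already carries a p tag: keep it as it is.
            $sections[] = $line;
        } else {
            // In this sketch an empty line simply becomes an empty paragraph.
            $sections[] = '<p>' . $line . '</p>';
        }
    }
    return implode("\n", $sections);
}

echo linesToParagraphSections("Plain line\n<p class=\"x\">Kept</p>"), "\n";
```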

◆ setProcessingConfiguration()

TYPO3\CMS\Core\Html\RteHtmlParser::setProcessingConfiguration ( array  $processingConfiguration)
protected

Sanitizes and streamlines the given options (usually the "proc." section of the RichTextConfiguration results) and sets them on the respective properties.

Parameters
array $processingConfiguration

Definition at line 141 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Utility\GeneralUtility\trimExplode().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForPersistence(), and TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForRichTextEditor().

◆ streamlineLineBreaksAfterProcessing()

string TYPO3\CMS\Core\Html\RteHtmlParser::streamlineLineBreaksAfterProcessing ( string  $content)
protected

Called after all processing and transformation is done. Just before the content is returned by the RTE parser, all line breaks are unified to CRLFs again.

Historical note: Previously it was possible to disable this functionality via disableUnifyLineBreaks.

Parameters
string $content: the content to process
Returns
string the modified content

Definition at line 795 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Html\RteHtmlParser\streamlineLineBreaksForProcessing().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForPersistence(), and TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForRichTextEditor().

◆ streamlineLineBreaksForProcessing()

string TYPO3\CMS\Core\Html\RteHtmlParser::streamlineLineBreaksForProcessing ( string  $content)
protected

Called before any processing or transformation is done. Removes all CRs (char 13) so that only LFs (char 10) are dealt with internally. CRs have a very disturbing effect, so they are all removed and only LFs are relied upon.

Historical note: Previously it was possible to disable this functionality via disableUnifyLineBreaks.

Parameters
string $content: the content to process
Returns
string the modified content

Definition at line 780 of file RteHtmlParser.php.

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\streamlineLineBreaksAfterProcessing(), TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForPersistence(), and TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForRichTextEditor().

◆ transformTextForPersistence()

◆ transformTextForRichTextEditor()

string TYPO3\CMS\Core\Html\RteHtmlParser::transformTextForRichTextEditor ( string  $value,
array  $processingConfiguration 
)

◆ TS_links_db()

string TYPO3\CMS\Core\Html\RteHtmlParser::TS_links_db (   $value)
protected

Transformation handler: 'ts_links' / direction: "db". Processes anchor tags and resolves them correctly again via the LinkService syntax.

Splits content into tag blocks, processes each tag, and allows hooks to actually render the result.

Parameters
string $value: Content input
Returns
string Content output

Definition at line 363 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Html\HtmlParser\get_tag_attributes(), TYPO3\CMS\Core\Html\HtmlParser\getFirstTag(), TYPO3\CMS\Core\Html\HtmlParser\removeFirstAndLastTag(), and TYPO3\CMS\Core\Html\HtmlParser\splitIntoBlock().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForPersistence().

◆ TS_transform_db()

string TYPO3\CMS\Core\Html\RteHtmlParser::TS_transform_db (   $value)
protected

◆ TS_transform_rte()

string TYPO3\CMS\Core\Html\RteHtmlParser::TS_transform_rte (   $value)
protected

Transformation handler: css_transform / direction: "rte". Set (->rte) for standard content elements (ts).

Parameters
string $value: Content input
Returns
string Content output
See also
TS_transform_db()

Definition at line 475 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Html\HtmlParser\getFirstTag(), TYPO3\CMS\Core\Html\HtmlParser\getFirstTagName(), TYPO3\CMS\Core\Html\HtmlParser\removeFirstAndLastTag(), TYPO3\CMS\Core\Html\RteHtmlParser\setDivTags(), and TYPO3\CMS\Core\Html\HtmlParser\splitIntoBlock().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForRichTextEditor().

Member Data Documentation

◆ $allowedAttributesForParagraphTags

array TYPO3\CMS\Core\Html\RteHtmlParser::$allowedAttributesForParagraphTags
protected
Initial value:
= array(
'class',
'align',
'id',
'title',
'dir',
'lang',
'xml:lang',
'itemscope',
'itemtype',
'itemprop'
)

A list of HTML attributes for <p> tags. Because <p> tags are currently wrapped in special handling, they have a special place for configuration via 'proc.keepPDIVattribs'.

Definition at line 80 of file RteHtmlParser.php.

◆ $allowedClasses

array TYPO3\CMS\Core\Html\RteHtmlParser::$allowedClasses = array( )
protected

Storage of the allowed CSS class names in the RTE

Definition at line 73 of file RteHtmlParser.php.

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\getKeepTags().

◆ $allowedTagsOutsideOfParagraphs

array TYPO3\CMS\Core\Html\RteHtmlParser::$allowedTagsOutsideOfParagraphs
protected
Initial value:
= array(
'address',
'article',
'aside',
'blockquote',
'div',
'footer',
'figure',
'figcaption',
'header',
'hr',
'nav',
'section'
)

Any tags that are allowed outside of <p> sections, usually similar to the block elements plus some special tags like <hr> and <img> (if images are allowed). Can be completely overridden via 'proc.allowTagsOutside'.

Definition at line 99 of file RteHtmlParser.php.

◆ $blockElementList

string TYPO3\CMS\Core\Html\RteHtmlParser::$blockElementList = 'DIV,TABLE,BLOCKQUOTE,PRE,UL,OL,H1,H2,H3,H4,H5,H6,ADDRESS,DL,DD,HEADER,SECTION,FOOTER,NAV,ARTICLE,ASIDE,FIGURE'
protected

List of elements that are not wrapped into a "p" tag while doing the transformation.

Definition at line 44 of file RteHtmlParser.php.

◆ $defaultAllowedTagsList

string TYPO3\CMS\Core\Html\RteHtmlParser::$defaultAllowedTagsList = 'b,i,u,a,img,br,div,center,pre,figure,figcaption,font,hr,sub,sup,p,strong,em,li,ul,ol,blockquote,strike,span,abbr,acronym,dfn'
protected

List of all tags that are allowed by default

Definition at line 49 of file RteHtmlParser.php.

◆ $eventDispatcher

EventDispatcherInterface TYPO3\CMS\Core\Html\RteHtmlParser::$eventDispatcher
protected

Definition at line 116 of file RteHtmlParser.php.

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\__construct().

◆ $getKeepTags_cache

array TYPO3\CMS\Core\Html\RteHtmlParser::$getKeepTags_cache = array( )
protected

Data caching for processing function

Definition at line 67 of file RteHtmlParser.php.

◆ $procOptions

array TYPO3\CMS\Core\Html\RteHtmlParser::$procOptions = array( )
protected

Set to the TSconfig options coming from Page TSconfig

Definition at line 55 of file RteHtmlParser.php.

◆ $TS_transform_db_safecounter

int TYPO3\CMS\Core\Html\RteHtmlParser::$TS_transform_db_safecounter = 100
protected

Run-away brake for recursive calls.

Definition at line 61 of file RteHtmlParser.php.