‪TYPO3CMS  11.5
TYPO3\CMS\Core\Html\RteHtmlParser Class Reference
Inheritance diagram for TYPO3\CMS\Core\Html\RteHtmlParser:
TYPO3\CMS\Core\Html\HtmlParser

Public Member Functions

 __construct (EventDispatcherInterface $eventDispatcher)
 
string transformTextForRichTextEditor (string $value, array $processingConfiguration)
 
string transformTextForPersistence (string $value, array $processingConfiguration)
 
- Public Member Functions inherited from TYPO3\CMS\Core\Html\HtmlParser
array splitIntoBlock ($tag, $content, $eliminateExtraEndTags=false)
 
string splitIntoBlockRecursiveProc ($tag, $content, &$procObj, $callBackContent, $callBackTags, $level=0)
 
array splitTags ($tag, $content)
 
string removeFirstAndLastTag ($str)
 
string getFirstTag ($str)
 
string getFirstTagName ($str, $preserveCase=false)
 
array get_tag_attributes ($tag, $deHSC=false)
 
array split_tag_attributes ($tag)
 
string HTMLcleaner ($content, $tags=[], $keepAll=0, $hSC=0, $addConfig=[])
 
string bidir_htmlspecialchars ($value, $dir)
 
string prefixResourcePath ($main_prefix, $content, $alternatives=[], $suffix='')
 
string prefixRelPath ($prefix, $srcVal, $suffix='')
 
array|string caseShift ($str, $caseSensitiveComparison, $cacheKey='')
 
string compileTagAttribs ($tagAttrib, $meta=[])
 
array HTMLparserConfig ($TSconfig, $keepTags=[])
 
string stripEmptyTags ($content, $tagList='', $treatNonBreakingSpaceAsEmpty=false, $keepTags=false)
 

Protected Member Functions

 setProcessingConfiguration (array $processingConfiguration)
 
array resolveAppliedTransformationModes (string $direction)
 
string runHtmlParserIfConfigured ($content, $configurationDirective)
 
string TS_links_db ($value)
 
string TS_transform_db ($value)
 
string TS_transform_rte ($value)
 
string HTMLcleaner_db ($content)
 
array getKeepTags ($direction='rte')
 
string|array divideIntoLines ($value, $count=5, $returnArray=false)
 
string setDivTags ($value)
 
string processContentWithinParagraph (string $content, string $fullContentWithTag)
 
string sanitizeLineBreaksForContentOnly (string $content)
 
string streamlineLineBreaksForProcessing (string $content)
 
string streamlineLineBreaksAfterProcessing (string $content)
 
string markBrokenLinks (string $content)
 
string removeBrokenLinkMarkers (string $content)
 
 htmlSanitize (string $content, array $configuration)
 
- Protected Member Functions inherited from TYPO3\CMS\Core\Html\HtmlParser
string stripEmptyTagsIfConfigured ($value, $configuration)
 

Protected Attributes

string $blockElementList = 'DIV,TABLE,BLOCKQUOTE,PRE,UL,OL,H1,H2,H3,H4,H5,H6,ADDRESS,DL,DD,HEADER,SECTION,FOOTER,NAV,ARTICLE,ASIDE,FIGURE'
 
string $defaultAllowedTagsList = 'b,i,u,a,img,br,div,center,pre,figure,figcaption,font,hr,sub,sup,p,strong,em,li,ul,ol,blockquote,strike,span,abbr,acronym,dfn'
 
array $procOptions = array( )
 
int $TS_transform_db_safecounter = 100
 
array $getKeepTags_cache = array( )
 
array $allowedClasses = array( )
 
array $allowedAttributesForParagraphTags
 
array $allowedTagsOutsideOfParagraphs
 
EventDispatcherInterface $eventDispatcher
 
- Protected Attributes inherited from TYPO3\CMS\Core\Html\HtmlParser
array $caseShift_cache = array( )
 

Additional Inherited Members

- Public Attributes inherited from TYPO3\CMS\Core\Html\HtmlParser
const VOID_ELEMENTS = 'area|base|br|col|command|embed|hr|img|input|keygen|meta|param|source|track|wbr'
 

Detailed Description

Class for parsing HTML for the Rich Text Editor (also called transformations).

Concerning line breaks: Regardless of whether LF (Unix-style) or CRLF (Windows-style) line breaks are passed in, the HtmlParser converts all line breaks to LFs and works with LFs internally. Once all transformations are done, all LFs are converted to CRLFs. This means that RteHtmlParser always returns CRLFs for maximum compatibility with all formats.

Definition at line 37 of file RteHtmlParser.php.
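
The line-break handling described above can be illustrated with a minimal sketch. The function names normalizeToLf() and normalizeToCrLf() are hypothetical stand-ins and not part of the class; the sketch only assumes the documented behaviour (CRs removed before processing, LFs turned into CRLFs afterwards).

```php
<?php
declare(strict_types=1);

// Hypothetical helpers illustrating the documented behaviour;
// they are not the actual method bodies of RteHtmlParser.

/** Remove every CR (char 13) so that only LFs (char 10) remain. */
function normalizeToLf(string $content): string
{
    return str_replace("\r", '', $content);
}

/** Convert every line break to CRLF before the content is returned. */
function normalizeToCrLf(string $content): string
{
    // Strip CRs first so an existing CRLF does not become CR CR LF.
    return str_replace("\n", "\r\n", normalizeToLf($content));
}
```

Stripping CRs before re-adding them keeps the conversion idempotent: running normalizeToCrLf() on content that already uses CRLFs returns it unchanged.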

Constructor & Destructor Documentation

◆ __construct()

TYPO3\CMS\Core\Html\RteHtmlParser::__construct ( EventDispatcherInterface  $eventDispatcher)

Member Function Documentation

◆ divideIntoLines()

string|array TYPO3\CMS\Core\Html\RteHtmlParser::divideIntoLines (   $value,
  $count = 5,
  $returnArray = false 
)
protected

Splits $value into parts based on <p> sections and returns them as lines separated by LF. The point is to resolve the HTML code returned from the RTE into ordinary lines so that it is human-readable. The function setDivTags() does the opposite. This function processes content going into the database.

Parameters
string $value: Value to process.
int $count: Recursion brake. Decremented on each recursion down to zero. Default is 5 (which equals the allowed nesting levels of p tags).
bool $returnArray: If TRUE, an array with the lines is returned, otherwise a string of the processed input value.
Returns
string|array Processed input value.
See also
setDivTags()

Definition at line 597 of file RteHtmlParser.php.
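
As a rough illustration of the idea only, the following sketch puts each paragraph section on its own line. It assumes flat, non-nested <p> sections and skips the recursion, attribute validation and cleaning that the description mentions; paragraphsToLines() is a hypothetical name.

```php
<?php
declare(strict_types=1);

/**
 * Simplified illustration (hypothetical helper, not the real method):
 * handles flat, non-nested <p> sections only.
 *
 * @return string|array<int, string>
 */
function paragraphsToLines(string $html, bool $returnArray = false)
{
    // Collect every complete <p ...>...</p> section in document order.
    preg_match_all('/<p\b[^>]*>.*?<\/p>/is', $html, $matches);
    $lines = $matches[0];

    // Mirror the documented switch: array of lines or one LF-joined string.
    return $returnArray ? $lines : implode("\n", $lines);
}
```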

References TYPO3\CMS\Core\Html\RteHtmlParser\processContentWithinParagraph(), TYPO3\CMS\Core\Html\HtmlParser\removeFirstAndLastTag(), TYPO3\CMS\Core\Html\RteHtmlParser\sanitizeLineBreaksForContentOnly(), and TYPO3\CMS\Core\Html\HtmlParser\splitIntoBlock().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\TS_transform_db().

◆ getKeepTags()

array TYPO3\CMS\Core\Html\RteHtmlParser::getKeepTags (   $direction = 'rte')
protected

Creates a configuration array for the HTMLcleaner function based on whether content goes TO or FROM the Rich Text Editor ($direction).

Parameters
string $direction: The direction of the content being processed by the output configuration: "db" (content going into the database FROM the RTE) or "rte" (content going into the form).
Returns
array Configuration array
See also
HTMLcleaner_db()

Definition at line 526 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Html\RteHtmlParser\$allowedClasses, TYPO3\CMS\Core\Html\HtmlParser\HTMLparserConfig(), and TYPO3\CMS\Core\Utility\GeneralUtility\trimExplode().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\HTMLcleaner_db(), TYPO3\CMS\Core\Html\RteHtmlParser\setDivTags(), and TYPO3\CMS\Core\Html\RteHtmlParser\TS_transform_db().

◆ HTMLcleaner_db()

string TYPO3\CMS\Core\Html\RteHtmlParser::HTMLcleaner_db (   $content)
protected

Cleans content going into the database, e.g. by removing disallowed HTML and ds-HSC content. It basically calls HTMLcleaner from the parent class with a preset configuration set up specifically for cleaning content going from the RTE into the database.

Parameters
string $content: Content to clean up
Returns
string Clean content
See also
getKeepTags()

Definition at line 512 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Html\RteHtmlParser\getKeepTags(), and TYPO3\CMS\Core\Html\HtmlParser\HTMLcleaner().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\processContentWithinParagraph().

◆ htmlSanitize()

TYPO3\CMS\Core\Html\RteHtmlParser::htmlSanitize ( string  $content,
array  $configuration 
)
protected

◆ markBrokenLinks()

string TYPO3\CMS\Core\Html\RteHtmlParser::markBrokenLinks ( string  $content)
protected

Content transformation from DB to RTE. Checks all <a> tags that reference a t3://page link and checks whether the page is available. If not, some offensive styling is added.

Parameters
string $content
Returns
string the modified content

Definition at line 783 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Html\HtmlParser\get_tag_attributes(), TYPO3\CMS\Core\Html\HtmlParser\getFirstTag(), TYPO3\CMS\Core\Html\HtmlParser\removeFirstAndLastTag(), and TYPO3\CMS\Core\Html\HtmlParser\splitIntoBlock().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForRichTextEditor().

◆ processContentWithinParagraph()

string TYPO3\CMS\Core\Html\RteHtmlParser::processContentWithinParagraph ( string  $content,
string  $fullContentWithTag 
)
protected

Used for transformation from RTE to DB.

Works on a single line within a <p> tag when storing into the database. This always adds <p> tags and validates the arguments; additionally, the content is cleaned up via the HTMLcleaner.

Parameters
string $content: the content within the <p> tag
string $fullContentWithTag: the whole <p> tag surrounded as well
Returns
string the full <p> tag with cleaned content

Definition at line 696 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Html\HtmlParser\compileTagAttribs(), TYPO3\CMS\Core\Html\HtmlParser\get_tag_attributes(), TYPO3\CMS\Core\Html\HtmlParser\getFirstTag(), TYPO3\CMS\Core\Html\RteHtmlParser\HTMLcleaner_db(), and TYPO3\CMS\Core\Utility\GeneralUtility\trimExplode().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\divideIntoLines().

◆ removeBrokenLinkMarkers()

string TYPO3\CMS\Core\Html\RteHtmlParser::removeBrokenLinkMarkers ( string  $content)
protected

Content transformation from RTE to DB. Removes the link error attributes that are added to the <a> tags of broken links.

Parameters
string $content: the content to process
Returns
string the modified content

Definition at line 826 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Html\HtmlParser\get_tag_attributes(), TYPO3\CMS\Core\Html\HtmlParser\getFirstTag(), TYPO3\CMS\Core\Html\HtmlParser\removeFirstAndLastTag(), and TYPO3\CMS\Core\Html\HtmlParser\splitIntoBlock().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForPersistence().

◆ resolveAppliedTransformationModes()

array TYPO3\CMS\Core\Html\RteHtmlParser::resolveAppliedTransformationModes ( string  $direction)
protected

Determines which transformation modes should be executed and ensures that each of them is executed only once.

Parameters
string $direction
Returns
array the resolved transformation modes

Definition at line 270 of file RteHtmlParser.php.
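
The "only once" guarantee amounts to de-duplicating a comma-separated mode list. The sketch below is a stand-alone approximation: resolveModes() is a hypothetical function that takes the list directly, whereas the real method reads its input from the processing configuration and also takes $direction into account.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-alone sketch of mode resolution.
 *
 * @return array<int, string>
 */
function resolveModes(string $modeList): array
{
    // Split on commas, trim whitespace and drop empty entries
    // (similar in spirit to GeneralUtility::trimExplode(',', $list, true)).
    $modes = array_filter(
        array_map('trim', explode(',', $modeList)),
        static fn (string $mode): bool => $mode !== ''
    );

    // Keep each mode once, preserving first-seen order, and reindex.
    return array_values(array_unique($modes));
}
```

For example, resolveModes('css_transform, ts_links,css_transform') yields the two modes css_transform and ts_links, each listed once.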

References TYPO3\CMS\Core\Utility\GeneralUtility\trimExplode().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForPersistence(), and TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForRichTextEditor().

◆ runHtmlParserIfConfigured()

string TYPO3\CMS\Core\Html\RteHtmlParser::runHtmlParserIfConfigured (   $content,
  $configurationDirective 
)
protected

Runs the HTML parser if it is configured, fetching additional HTML cleaner configuration. This configuration is applied either before or after the main transformation and is thus a totally independent set of processing options you can set up.

This is currently only possible via TSconfig (procOptions).

Parameters
string $content
string $configurationDirective: used to look up in the procOptions if enabled, and then fetch the
Returns
string the processed content

Definition at line 305 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Html\HtmlParser\HTMLcleaner(), and TYPO3\CMS\Core\Html\HtmlParser\HTMLparserConfig().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForPersistence(), and TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForRichTextEditor().

◆ sanitizeLineBreaksForContentOnly()

string TYPO3\CMS\Core\Html\RteHtmlParser::sanitizeLineBreaksForContentOnly ( string  $content)
protected

Wraps <hr> tags with LFs and also removes double LFs. Used when transforming from RTE to DB.

Parameters
string $content
Returns
string the modified content

Definition at line 734 of file RteHtmlParser.php.
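
A minimal sketch of this kind of clean-up, assuming the tag in question is <hr> and that "double LFs" means any run of consecutive LFs. The helper name wrapHrWithLineFeeds() is hypothetical, and the regular expressions are illustrative rather than the ones used in RteHtmlParser.php.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: isolate <hr> tags on their own lines. */
function wrapHrWithLineFeeds(string $content): string
{
    // Put an LF before and after every <hr>, <hr/> or <hr ...> tag.
    $content = preg_replace('/<hr\b[^>]*>/i', "\n" . '$0' . "\n", $content) ?? $content;

    // Collapse runs of consecutive LFs into a single LF.
    return preg_replace('/\n{2,}/', "\n", $content) ?? $content;
}
```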

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\divideIntoLines().

◆ setDivTags()

string TYPO3\CMS\Core\Html\RteHtmlParser::setDivTags (   $value)
protected

Converts all lines into <p> sections (unless the line already has a p tag). Used for processing content going FROM the database TO the RTE.

Parameters
string $value: Value to convert
Returns
string Processed value.
See also
divideIntoLines()

Definition at line 650 of file RteHtmlParser.php.
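
The reverse direction can be sketched in the same simplified way: every LF-separated line that does not already start with a p tag gets wrapped in one. linesToParagraphs() is a hypothetical name, and the sketch ignores the tag cleaning and block-element handling that the real method performs via getKeepTags() and HTMLcleaner().

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: wrap each LF-separated line in <p>...</p>. */
function linesToParagraphs(string $value): string
{
    $lines = array_map(
        static function (string $line): string {
            $line = trim($line);
            // Leave lines alone that already start with a <p> tag.
            if (preg_match('/^<p\b/i', $line) === 1) {
                return $line;
            }
            return '<p>' . $line . '</p>';
        },
        explode("\n", $value)
    );

    return implode("\n", $lines);
}
```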

References TYPO3\CMS\Core\Html\HtmlParser\getFirstTagName(), TYPO3\CMS\Core\Html\RteHtmlParser\getKeepTags(), and TYPO3\CMS\Core\Html\HtmlParser\HTMLcleaner().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\TS_transform_rte().

◆ setProcessingConfiguration()

TYPO3\CMS\Core\Html\RteHtmlParser::setProcessingConfiguration ( array  $processingConfiguration)
protected

Sanitizes and streamlines the given options (usually the "proc." part of the RichTextConfiguration results) and sets them to the respective properties.

Parameters
array $processingConfiguration

Definition at line 129 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Utility\GeneralUtility\trimExplode().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForPersistence(), and TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForRichTextEditor().

◆ streamlineLineBreaksAfterProcessing()

string TYPO3\CMS\Core\Html\RteHtmlParser::streamlineLineBreaksAfterProcessing ( string  $content)
protected

Called after any processing / transformation has been made. Just before the content is returned by the RTE parser, all line breaks are unified to CRLFs again.

Historical note: Previously it was possible to disable this functionality via disableUnifyLineBreaks.

Parameters
string $content: the content to process
Returns
string the modified content

Definition at line 767 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Html\RteHtmlParser\streamlineLineBreaksForProcessing().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForPersistence(), and TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForRichTextEditor().

◆ streamlineLineBreaksForProcessing()

string TYPO3\CMS\Core\Html\RteHtmlParser::streamlineLineBreaksForProcessing ( string  $content)
protected

Called before any processing / transformation is made. Removes all CRs (char 13) so that only LFs (char 10) are dealt with internally. CRs have a very disturbing effect, so they are all removed and only LFs are relied on.

Historical note: Previously it was possible to disable this functionality via disableUnifyLineBreaks.

Parameters
string $content: the content to process
Returns
string the modified content

Definition at line 752 of file RteHtmlParser.php.

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\streamlineLineBreaksAfterProcessing(), TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForPersistence(), and TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForRichTextEditor().

◆ transformTextForPersistence()

◆ transformTextForRichTextEditor()

string TYPO3\CMS\Core\Html\RteHtmlParser::transformTextForRichTextEditor ( string  $value,
array  $processingConfiguration 
)

◆ TS_links_db()

string TYPO3\CMS\Core\Html\RteHtmlParser::TS_links_db (   $value)
protected

Transformation handler: 'ts_links' / direction: "db". Processes anchor tags and resolves them correctly again via the LinkService syntax.

Splits content into tag blocks and processes each tag, and allows hooks to actually render the result.

Parameters
string $value: Content input
Returns
string Content output

Definition at line 330 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Html\HtmlParser\get_tag_attributes(), TYPO3\CMS\Core\Html\HtmlParser\getFirstTag(), TYPO3\CMS\Core\Html\HtmlParser\removeFirstAndLastTag(), and TYPO3\CMS\Core\Html\HtmlParser\splitIntoBlock().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForPersistence().

◆ TS_transform_db()

string TYPO3\CMS\Core\Html\RteHtmlParser::TS_transform_db (   $value)
protected

◆ TS_transform_rte()

string TYPO3\CMS\Core\Html\RteHtmlParser::TS_transform_rte (   $value)
protected

Transformation handler: css_transform / direction: "rte". Set (->rte) for standard content elements (ts).

Parameters
string $value: Content input
Returns
string Content output
See also
TS_transform_db()

Definition at line 443 of file RteHtmlParser.php.

References TYPO3\CMS\Core\Html\HtmlParser\getFirstTag(), TYPO3\CMS\Core\Html\HtmlParser\getFirstTagName(), TYPO3\CMS\Core\Html\HtmlParser\removeFirstAndLastTag(), TYPO3\CMS\Core\Html\RteHtmlParser\setDivTags(), and TYPO3\CMS\Core\Html\HtmlParser\splitIntoBlock().

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\transformTextForRichTextEditor().

Member Data Documentation

◆ $allowedAttributesForParagraphTags

array TYPO3\CMS\Core\Html\RteHtmlParser::$allowedAttributesForParagraphTags
protected
Initial value:
= array(
'class',
'align',
'id',
'title',
'dir',
'lang',
'xml:lang',
'itemscope',
'itemtype',
'itemprop',
)

A list of HTML attributes for <p> tags. Because <p> tags currently get special handling when they are wrapped, they have their own place for configuration via 'proc.keepPDIVattribs'.

Definition at line 80 of file RteHtmlParser.php.
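
An allow-list like this is typically applied by dropping every attribute whose name is not listed. The following sketch shows that filtering step in isolation; filterParagraphAttributes() is a hypothetical helper, and the attribute array shape (name => value) is an assumption for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: keep only attributes whose names are allowed.
 *
 * @param array<string, string> $attributes attribute name => value
 * @param array<int, string>    $allowed    allowed attribute names
 * @return array<string, string>
 */
function filterParagraphAttributes(array $attributes, array $allowed): array
{
    // array_flip() turns the name list into keys so the intersection
    // compares attribute names, not values.
    return array_intersect_key($attributes, array_flip($allowed));
}
```

With the allow-list ['class', 'align', 'id'], an input of class, onclick and id attributes keeps class and id and drops onclick.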

◆ $allowedClasses

array TYPO3\CMS\Core\Html\RteHtmlParser::$allowedClasses = array( )
protected

Storage of the allowed CSS class names in the RTE

Definition at line 73 of file RteHtmlParser.php.

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\getKeepTags().

◆ $allowedTagsOutsideOfParagraphs

array TYPO3\CMS\Core\Html\RteHtmlParser::$allowedTagsOutsideOfParagraphs
protected
Initial value:
= array(
'address',
'article',
'aside',
'blockquote',
'div',
'footer',
'figure',
'figcaption',
'header',
'hr',
'nav',
'section',
)

Any tags that are allowed outside of <p> sections, usually similar to the block elements plus some special tags like <hr> and <img> (if images are allowed). Completely overridable via 'proc.allowTagsOutside'.

Definition at line 99 of file RteHtmlParser.php.

◆ $blockElementList

string TYPO3\CMS\Core\Html\RteHtmlParser::$blockElementList = 'DIV,TABLE,BLOCKQUOTE,PRE,UL,OL,H1,H2,H3,H4,H5,H6,ADDRESS,DL,DD,HEADER,SECTION,FOOTER,NAV,ARTICLE,ASIDE,FIGURE'
protected

List of elements that are not wrapped into a "p" tag while doing the transformation.

Definition at line 44 of file RteHtmlParser.php.

◆ $defaultAllowedTagsList

string TYPO3\CMS\Core\Html\RteHtmlParser::$defaultAllowedTagsList = 'b,i,u,a,img,br,div,center,pre,figure,figcaption,font,hr,sub,sup,p,strong,em,li,ul,ol,blockquote,strike,span,abbr,acronym,dfn'
protected

List of all tags that are allowed by default

Definition at line 49 of file RteHtmlParser.php.

◆ $eventDispatcher

EventDispatcherInterface TYPO3\CMS\Core\Html\RteHtmlParser::$eventDispatcher
protected

Definition at line 116 of file RteHtmlParser.php.

Referenced by TYPO3\CMS\Core\Html\RteHtmlParser\__construct().

◆ $getKeepTags_cache

array TYPO3\CMS\Core\Html\RteHtmlParser::$getKeepTags_cache = array( )
protected

Data caching for processing function

Definition at line 67 of file RteHtmlParser.php.

◆ $procOptions

array TYPO3\CMS\Core\Html\RteHtmlParser::$procOptions = array( )
protected

Set to the TSconfig options coming from Page TSconfig

Definition at line 55 of file RteHtmlParser.php.

◆ $TS_transform_db_safecounter

int TYPO3\CMS\Core\Html\RteHtmlParser::$TS_transform_db_safecounter = 100
protected

Run-away brake for recursive calls.

Definition at line 61 of file RteHtmlParser.php.