TokenInterface

A read-only token. The tokenizers split TypoScript source into a list of lines, and each line consists of tokens.

For example, the line "foo.bar = baz" creates a LineIdentifierAssignment line. Its LineIdentifierAssignment->getIdentifierTokenStream() holds the two TokenType::T_IDENTIFIER tokens 'foo' and 'bar', and its LineIdentifierAssignment->getValueTokenStream() holds a single TokenType::T_VALUE token 'baz'.
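The decomposition described above can be sketched with a few lines of PHP. This is only an illustration under stated assumptions: sketchTokenizeAssignment() is a hypothetical helper, not the TYPO3 tokenizer, it represents tokens as simple [type, value] pairs instead of token objects, and it handles neither escaped dots nor lines without "=".

```php
<?php
declare(strict_types=1);

// Hypothetical helper for illustration only, not part of TYPO3.
// Splits one simple "identifier = value" line into two token lists.
function sketchTokenizeAssignment(string $line): array
{
    // Everything left of the first "=" is the identifier path,
    // everything right of it is the value.
    [$left, $right] = explode('=', $line, 2);

    $identifierTokens = [];
    foreach (explode('.', trim($left)) as $part) {
        $identifierTokens[] = ['T_IDENTIFIER', $part];
    }

    return [
        'identifierTokenStream' => $identifierTokens,
        'valueTokenStream' => [['T_VALUE', trim($right)]],
    ];
}

$line = sketchTokenizeAssignment('foo.bar = baz');
print_r($line);
```

The identifier side yields one token per path segment, while the value side yields a single value token.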

There are two Token implementations: the generic "Token" class for everything, plus the "TokenIdentifier" class for identifier tokens. Identifier tokens are those on the left side of, for instance, an assignment like "foo.bar = baz" ("foo" and "bar" are TokenIdentifier instances), and also those on the right side of expressions that use the "<" and "=<" operators. Example "foo.bar < baz": "baz" is a TokenIdentifier instance, as are "foo" and "bar".

The reason for having two implementations is that TokenIdentifier needs to be handled slightly differently when cast to string: For identifiers, all "." (dots) within a single identifier token need to be quoted with "\" (backslash), so the parser is not confused. The classic use case is having dots in FlexForm identifiers for PageTS: "foo.bar\.baz.foobar = value" consists of three identifier tokens (not four!): "foo", "bar.baz" and "foobar". So the only difference between "TokenIdentifier" and "Token" is that "TokenIdentifier" quotes dots in its value when stringified, while Token does not: __toString() on Token simply returns ->getValue().
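A minimal sketch of that difference, assuming two stand-in classes: PlainTokenSketch and IdentifierTokenSketch are hypothetical names, not the TYPO3 classes, and the inheritance between them is a simplification chosen for brevity. Whether the real implementation also escapes other characters is not covered here.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins for illustration; not the TYPO3 classes.
class PlainTokenSketch
{
    public function __construct(protected string $value) {}

    public function getValue(): string
    {
        return $this->value;
    }

    // A plain token stringifies to its raw value.
    public function __toString(): string
    {
        return $this->getValue();
    }
}

class IdentifierTokenSketch extends PlainTokenSketch
{
    // An identifier token escapes each dot with a backslash when cast to string.
    public function __toString(): string
    {
        return str_replace('.', '\\.', $this->value);
    }
}

$identifier = new IdentifierTokenSketch('bar.baz');
$value = new PlainTokenSketch('bar.baz');

echo $identifier->getValue(), "\n"; // raw value, dots not escaped
echo (string)$identifier, "\n";     // string cast, dots escaped
echo (string)$value, "\n";          // plain token, value unchanged
```

Note that getValue() returns the unescaped value in both cases; only the string cast differs.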

Multiple tokens are encapsulated in TokenStreamInterface. TokenStreamInterface has a __toString() method as well, which calls __toString() on all assigned tokens. This way, a TokenIdentifier applies its dot quoting, and plain Token instances return their value.
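The delegation described above might look roughly like the following sketch. All class names are hypothetical stand-ins rather than TYPO3 code, and joining the token strings without any separator is an assumption made for the illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins for illustration; not the TYPO3 classes.
final class StreamPlainToken
{
    public function __construct(private string $value) {}

    public function __toString(): string
    {
        return $this->value;
    }
}

final class StreamIdentifierToken
{
    public function __construct(private string $value) {}

    public function __toString(): string
    {
        // Escape dots inside a single identifier token.
        return str_replace('.', '\\.', $this->value);
    }
}

final class TokenStreamSketch
{
    /** @var list<\Stringable> */
    private array $tokens = [];

    public function append(\Stringable $token): self
    {
        $this->tokens[] = $token;
        return $this;
    }

    // The stream does no quoting itself; each token decides how it is stringified.
    public function __toString(): string
    {
        $result = '';
        foreach ($this->tokens as $token) {
            $result .= (string)$token;
        }
        return $result;
    }
}

$valueStream = (new TokenStreamSketch())->append(new StreamPlainToken('bar'));
echo (string)$valueStream, "\n";
```

Because the stream only concatenates, adding a new token class with its own __toString() requires no change to the stream.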

The idea here is that TokenStreams are cast to string quite often. For instance, an assignment line like "foo = bar" creates a token stream of one token for the right side (everything after "="): a T_VALUE Token instance with value "bar". The AST builder at some point needs to resolve this TokenStream to a string. This directly calls __toString() on token "bar" and does not deal with quoting, since it is not a TokenIdentifier but just a Token.

Note on getLine() and getColumn(): These two represent the position of a token in the source file. Counting starts at 0 (zero): The first token on the first line is line 0, column 0. Only the LosslessTokenizer sets these values. Tracking them is too expensive and of no relevance for the LossyTokenizer, which is used for instance in FE TS tokenizing. That's why these two properties are optional and 0 (zero) by default.
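The optional position could be modeled as in the sketch below. PositionedTokenSketch and its constructor signature are assumptions for illustration, not the TYPO3 implementation.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in; the constructor shape is an assumption.
final class PositionedTokenSketch
{
    public function __construct(
        private string $value,
        private int $line = 0,   // zero-based line, 0 when not tracked
        private int $column = 0, // zero-based column, 0 when not tracked
    ) {}

    public function getValue(): string
    {
        return $this->value;
    }

    public function getLine(): int
    {
        return $this->line;
    }

    public function getColumn(): int
    {
        return $this->column;
    }
}

// A position-tracking tokenizer would pass line and column explicitly.
$withPosition = new PositionedTokenSketch('bar', 2, 6);

// A tokenizer that skips position tracking relies on the defaults.
$withoutPosition = new PositionedTokenSketch('bar');

echo $withPosition->getLine(), ':', $withPosition->getColumn(), "\n";
echo $withoutPosition->getLine(), ':', $withoutPosition->getColumn(), "\n";
```

One consequence of defaulting to 0 is that a token without position data cannot be told apart, by position alone, from a token that really sits at line 0, column 0.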

Tags
internal:

Internal tokenizer structure.

Table of Contents

Methods

__toString()  : string
getColumn()  : int
getLine()  : int
getType()  : TokenType
getValue()  : string

Methods

__toString()

public __toString() : string
Return values
string

getColumn()

public getColumn() : int
Return values
int

getLine()

public getLine() : int
Return values
int

getType()

public getType() : TokenType
Return values
TokenType

getValue()

public getValue() : string
Return values
string

        