
Simple Parser Example
=====================

Extend the ``Doctrine\Common\Lexer\AbstractLexer`` class and implement
the ``getCatchablePatterns``, ``getNonCatchablePatterns``, and ``getType``
methods. The following minimal lexer, ``CharacterTypeLexer``, tokenizes a
string into ``T_UPPER``, ``T_LOWER`` and ``T_NUMBER`` tokens:

.. code-block:: php

    <?php

    use Doctrine\Common\Lexer\AbstractLexer;

    /**
     * @extends AbstractLexer<CharacterTypeLexer::T_*, string>
     */
    class CharacterTypeLexer extends AbstractLexer
    {
        const T_UPPER =  1;
        const T_LOWER =  2;
        const T_NUMBER = 3;

        protected function getCatchablePatterns(): array
        {
            return [
                '[a-zA-Z0-9]',
            ];
        }

        protected function getNonCatchablePatterns(): array
        {
            return [];
        }

        protected function getType(&$value): int
        {
            if (is_numeric($value)) {
                return self::T_NUMBER;
            }

            if (strtoupper($value) === $value) {
                return self::T_UPPER;
            }

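            if (strtolower($value) === $value) {
                return self::T_LOWER;
            }

            // Not expected to be reached; guards the declared int return type.
            throw new \LogicException(sprintf('Unexpected value "%s".', $value));
        }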
    }
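
The three checks in ``getType`` can be tried on single characters without
the library. The following is a minimal sketch, not part of Doctrine Lexer:
the function name ``classifyCharacter`` and the ``TYPE_*`` constants are
hypothetical stand-ins that mirror the class above, and the input is assumed
to be a single ASCII letter or digit.

.. code-block:: php

    <?php

    // Hypothetical stand-ins for CharacterTypeLexer::T_UPPER, T_LOWER and T_NUMBER.
    const TYPE_UPPER  = 1;
    const TYPE_LOWER  = 2;
    const TYPE_NUMBER = 3;

    /**
     * Applies the same checks as getType() to one character.
     * Assumes $char is a single ASCII letter or digit.
     */
    function classifyCharacter(string $char): int
    {
        // Digits are tested first: strtoupper('1') === '1' also holds,
        // so a later position would misreport digits as uppercase.
        if (is_numeric($char)) {
            return TYPE_NUMBER;
        }

        if (strtoupper($char) === $char) {
            return TYPE_UPPER;
        }

        return TYPE_LOWER;
    }

    $types = array_map('classifyCharacter', str_split('1aB'));
    // $types is [TYPE_NUMBER, TYPE_LOWER, TYPE_UPPER], i.e. [3, 2, 1]

The order of the checks matters: moving the ``is_numeric`` test below the
``strtoupper`` test would classify digits as uppercase.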

Use ``CharacterTypeLexer`` to extract an array of uppercase characters:

.. code-block:: php

    <?php

    class UpperCaseCharacterExtracter
    {
        public function __construct(private CharacterTypeLexer $lexer)
        {
        }

        /** @return list<string> */
        public function getUpperCaseCharacters(string $string): array
        {
            $this->lexer->setInput($string);
            $this->lexer->moveNext();

            $upperCaseChars = [];
            while (true) {
                if (!$this->lexer->lookahead) {
                    break;
                }

                $this->lexer->moveNext();

                if ($this->lexer->token->isA(CharacterTypeLexer::T_UPPER)) {
                    $upperCaseChars[] = $this->lexer->token->value;
                }
            }

            return $upperCaseChars;
        }
    }

    $upperCaseCharacterExtractor = new UpperCaseCharacterExtracter(new CharacterTypeLexer());
    $upperCaseCharacters = $upperCaseCharacterExtractor->getUpperCaseCharacters('1aBcdEfgHiJ12');

    print_r($upperCaseCharacters);

The variable ``$upperCaseCharacters`` contains all of the uppercase
characters in the input string:

.. code-block:: text

    Array
    (
        [0] => B
        [1] => E
        [2] => H
        [3] => J
    )
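
The extra ``moveNext()`` call before the loop is needed because
``moveNext()`` shifts the current ``lookahead`` into ``token`` and then loads
the next token into ``lookahead``; right after ``setInput()`` both are empty.
The sketch below imitates that cursor over a plain array. ``ArrayCursor`` is
a hypothetical stand-in written for this illustration, not a class from the
library, and it holds raw strings rather than token objects.

.. code-block:: php

    <?php

    // Hypothetical stand-in that imitates the token/lookahead cursor.
    // It is not part of Doctrine Lexer and stores strings, not Token objects.
    final class ArrayCursor
    {
        public ?string $token = null;
        public ?string $lookahead = null;
        private int $position = 0;

        /** @param list<string> $items */
        public function __construct(private array $items)
        {
        }

        public function moveNext(): bool
        {
            // The previous lookahead becomes the current token...
            $this->token = $this->lookahead;
            // ...and the next unread item, if any, becomes the new lookahead.
            $this->lookahead = $this->items[$this->position++] ?? null;

            return $this->lookahead !== null;
        }
    }

    $cursor = new ArrayCursor(['a', 'B']);
    $cursor->moveNext();               // token: null, lookahead: 'a'

    $seen = [];
    while ($cursor->lookahead !== null) {
        $cursor->moveNext();           // promotes the lookahead to token
        $seen[] = $cursor->token;
    }
    // $seen is ['a', 'B']

Without the first call, ``lookahead`` would still be ``null`` and the loop
would exit before reading anything.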

This example is simple, but it demonstrates the low-level API on which
more complex parsers can be built.