PHP Cross Reference of WordPress Trunk (Updated Daily)

/wp-includes/html-api/ -> class-wp-html-processor.php (source)

   1  <?php
   2  /**
   3   * HTML API: WP_HTML_Processor class
   4   *
   5   * @package WordPress
   6   * @subpackage HTML-API
   7   * @since 6.4.0
   8   */
   9  
  10  /**
  11   * Core class used to safely parse and modify an HTML document.
  12   *
  13   * The HTML Processor class properly parses and modifies HTML5 documents.
  14   *
  15   * It supports a subset of the HTML5 specification, and when it encounters
  16   * unsupported markup, it aborts early to avoid unintentionally breaking
  17   * the document. The HTML Processor should never break an HTML document.
  18   *
  19   * While the `WP_HTML_Tag_Processor` is a valuable tool for modifying
  20   * attributes on individual HTML tags, the HTML Processor is more capable
  21   * and useful for the following operations:
  22   *
  23   *  - Querying based on nested HTML structure.
  24   *
  25   * Eventually the HTML Processor will also support:
  26   *  - Wrapping a tag in surrounding HTML.
  27   *  - Unwrapping a tag by removing its parent.
  28   *  - Inserting and removing nodes.
  29   *  - Reading and changing inner content.
  30   *  - Navigating up or around HTML structure.
  31   *
  32   * ## Usage
  33   *
  34   * Use of this class requires three steps:
  35   *
  36   *   1. Call a static creator method with your input HTML document.
  37   *   2. Find the location in the document you are looking for.
  38   *   3. Request changes to the document at that location.
  39   *
  40   * Example:
  41   *
  42   *     $processor = WP_HTML_Processor::create_fragment( $html );
  43   *     if ( $processor->next_tag( array( 'breadcrumbs' => array( 'DIV', 'FIGURE', 'IMG' ) ) ) ) {
  44   *         $processor->add_class( 'responsive-image' );
  45   *     }
  46   *
  47   * #### Breadcrumbs
  48   *
  49   * Breadcrumbs represent the stack of open elements from the root
  50   * of the document or fragment down to the currently-matched node,
  51   * if one is currently selected. Call WP_HTML_Processor::get_breadcrumbs()
  52   * to inspect the breadcrumbs for a matched tag.
  53   *
  54   * Breadcrumbs can specify nested HTML structure and are equivalent
  55   * to a CSS selector comprising tag names separated by the child
  56   * combinator, such as "DIV > FIGURE > IMG".
  57   *
  58   * Since all elements find themselves inside a full HTML document
  59   * when parsed, the return value from `get_breadcrumbs()` will always
  60   * contain any implicit outermost elements. For example, when parsing
  61   * with `create_fragment()` in the `BODY` context (the default), any
  62   * tag in the given HTML document will contain `array( 'HTML', 'BODY', … )`
  63   * in its breadcrumbs.
  64   *
  65   * Despite containing the implied outermost elements in their breadcrumbs,
  66   * tags may be found with the shortest-matching breadcrumb query. That is,
  67   * `array( 'IMG' )` matches all IMG elements and `array( 'P', 'IMG' )`
  68   * matches all IMG elements directly inside a P element. To ensure that no
  69   * partial matches erroneously match, it's possible to specify in a query
  70   * the full breadcrumb match all the way down from the root HTML element.
  71   *
  72   * Example:
  73   *
  74   *     $html = '<figure><img><figcaption>A <em>lovely</em> day outside</figcaption></figure>';
  75   *     //               ----- Matches here.
  76   *     $processor->next_tag( array( 'breadcrumbs' => array( 'FIGURE', 'IMG' ) ) );
  77   *
  78   *     $html = '<figure><img><figcaption>A <em>lovely</em> day outside</figcaption></figure>';
  79   *     //                                  ---- Matches here.
  80   *     $processor->next_tag( array( 'breadcrumbs' => array( 'FIGURE', 'FIGCAPTION', 'EM' ) ) );
  81   *
  82   *     $html = '<div><img></div><img>';
  83   *     //                       ----- Matches here, because IMG must be a direct child of the implicit BODY.
  84   *     $processor->next_tag( array( 'breadcrumbs' => array( 'BODY', 'IMG' ) ) );
  85   *
  86   * ## HTML Support
  87   *
  88   * This class implements a small part of the HTML5 specification.
  89   * It's designed to operate within its support and abort early whenever
  90   * encountering circumstances it can't properly handle. This is
  91   * the principal way in which this class remains as simple as possible
  92   * without cutting corners and breaking compliance.
  93   *
  94   * ### Supported elements
  95   *
  96   * If any unsupported element appears in the HTML input the HTML Processor
  97   * will abort early and stop all processing. This draconian measure ensures
  98   * that the HTML Processor won't break any HTML it doesn't fully understand.
  99   *
 100   * The HTML Processor supports all elements other than a specific set:
 101   *
 102   *  - Any element inside a TABLE.
 103   *  - Any element inside foreign content, including SVG and MATH.
 104   *  - Any element outside the IN BODY insertion mode, e.g. doctype declarations, meta, links.
 105   *
 106   * ### Supported markup
 107   *
 108   * Some kinds of non-normative HTML involve reconstruction of formatting elements and
 109   * re-parenting of mis-nested elements. For example, a DIV tag found inside a TABLE
 110   * may in fact belong _before_ the table in the DOM. If the HTML Processor encounters
 111   * such a case it will stop processing.
 112   *
 113   * The following list illustrates some common examples of unexpected HTML inputs that
 114   * the HTML Processor properly parses and represents:
 115   *
 116   *  - HTML with optional tags omitted, e.g. `<p>one<p>two`.
 117   *  - HTML with unexpected tag closers, e.g. `<p>one </span> more</p>`.
 118   *  - Non-void tags with self-closing flag, e.g. `<div/>the DIV is still open.</div>`.
 119   *  - Heading elements which close open heading elements of another level, e.g. `<h1>Closed by </h2>`.
 120   *  - Elements containing text that looks like other tags but isn't, e.g. `<title>The <img> is plaintext</title>`.
 121   *  - SCRIPT and STYLE tags containing text that looks like HTML but isn't, e.g. `<script>document.write('<p>Hi</p>');</script>`.
 122   *  - SCRIPT content which has been escaped, e.g. `<script><!-- document.write('<script>console.log("hi")</script>') --></script>`.
 123   *
 124   * ### Unsupported Features
 125   *
 126   * This parser does not report parse errors.
 127   *
 128   * Normally, when additional HTML or BODY tags are encountered in a document, if there
 129   * are any additional attributes on them that aren't found on the previous elements,
 130   * the existing HTML and BODY elements adopt those missing attribute values. This
 131   * parser does not add those additional attributes.
 132   *
 133   * In certain situations, elements are moved to a different part of the document
 134   * through processes called "adoption" and "fostering." Because the nodes move to a location
 135   * in the document that the parser had already processed, this parser does not support
 136   * these situations and will bail.
 137   *
 138   * @since 6.4.0
 139   *
 140   * @see WP_HTML_Tag_Processor
 141   * @see https://html.spec.whatwg.org/
 142   */
 143  class WP_HTML_Processor extends WP_HTML_Tag_Processor {
 144      /**
 145       * The maximum number of bookmarks allowed to exist at any given time.
 146       *
 147       * HTML processing requires more bookmarks than basic tag processing,
 148       * so this class constant from the Tag Processor is overwritten.
 149       *
 150       * @since 6.4.0
 151       *
 152       * @var int
 153       */
 154      const MAX_BOOKMARKS = 100;
 155  
 156      /**
 157       * Holds the working state of the parser, including the stack of
 158       * open elements and the stack of active formatting elements.
 159       *
 160       * Initialized in the constructor.
 161       *
 162       * @since 6.4.0
 163       *
 164       * @var WP_HTML_Processor_State
 165       */
 166      private $state;
 167  
 168      /**
 169       * Used to create unique bookmark names.
 170       *
 171       * This class sets a bookmark for every tag in the HTML document that it encounters.
 172       * The bookmark name is auto-generated and increments, starting with `1`. These are
 173       * internal bookmarks and are automatically released when the referring WP_HTML_Token
 174       * goes out of scope and is garbage-collected.
 175       *
 176       * @since 6.4.0
 177       *
 178       * @see WP_HTML_Processor::$release_internal_bookmark_on_destruct
 179       *
 180       * @var int
 181       */
 182      private $bookmark_counter = 0;
 183  
 184      /**
 185       * Stores an explanation for why something failed, if it did.
 186       *
 187       * @see self::get_last_error
 188       *
 189       * @since 6.4.0
 190       *
 191       * @var string|null
 192       */
 193      private $last_error = null;
 194  
 195      /**
 196       * Stores context for why the parser bailed on unsupported HTML, if it did.
 197       *
 198       * @see self::get_unsupported_exception
 199       *
 200       * @since 6.7.0
 201       *
 202       * @var WP_HTML_Unsupported_Exception|null
 203       */
 204      private $unsupported_exception = null;
 205  
 206      /**
 207       * Releases a bookmark when PHP garbage-collects its wrapping WP_HTML_Token instance.
 208       *
 209       * This function is created inside the class constructor so that it can be passed to
 210       * the stack of open elements and the stack of active formatting elements without
 211       * exposing it as a public method on the class.
 212       *
 213       * @since 6.4.0
 214       *
 215       * @var Closure|null
 216       */
 217      private $release_internal_bookmark_on_destruct = null;
 218  
 219      /**
 220       * Stores stack events which arise during parsing of the
 221       * HTML document, which will then supply the "match" events.
 222       *
 223       * @since 6.6.0
 224       *
 225       * @var WP_HTML_Stack_Event[]
 226       */
 227      private $element_queue = array();
 228  
 229      /**
 230       * Stores the current breadcrumbs.
 231       *
 232       * @since 6.7.0
 233       *
 234       * @var string[]
 235       */
 236      private $breadcrumbs = array();
 237  
 238      /**
 239       * Current stack event, if set, representing a matched token.
 240       *
 241       * Because the parser may internally point to a place further along in a document
 242       * than the nodes which have already been processed (some "virtual" nodes may have
 243       * appeared while scanning the HTML document), this will point at the "current" node
 244       * being processed. It comes from the front of the element queue.
 245       *
 246       * @since 6.6.0
 247       *
 248       * @var WP_HTML_Stack_Event|null
 249       */
 250      private $current_element = null;
 251  
 252      /**
 253       * Context node if created as a fragment parser.
 254       *
 255       * @var WP_HTML_Token|null
 256       */
 257      private $context_node = null;
 258  
 259      /*
 260       * Public Interface Functions
 261       */
 262  
 263      /**
 264       * Creates an HTML processor in the fragment parsing mode.
 265       *
 266       * Use this for cases where you are processing chunks of HTML that
 267       * will be found within a bigger HTML document, such as rendered
 268       * block output that exists within a post, or `the_content` inside a
 269       * rendered site layout.
 270       *
 271       * Fragment parsing occurs within a context, which is an HTML element
 272       * that the document will eventually be placed in. It becomes important
 273       * when special elements have different rules than others, such as inside
 274       * a TEXTAREA or a TITLE tag where things that look like tags are text,
 275       * or inside a SCRIPT tag where things that look like HTML syntax are JS.
 276       *
 277       * The context value should be a representation of the tag in which the
 278       * HTML is found. For most cases this will be the body element. The HTML
 279       * form is provided because a context element may have attributes that
 280       * impact the parse, such as with a SCRIPT tag and its `type` attribute.
 281       *
 282       * ## Current HTML Support
 283       *
 284       *  - The only supported context is `<body>`, which is the default value.
 285       *  - The only supported document encoding is `UTF-8`, which is the default value.
 286       *
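           * Example (a minimal sketch; `$block_html` is a placeholder for the
           * caller's markup, not something this class provides):
           *
           *     $processor = WP_HTML_Processor::create_fragment( $block_html );
           *     if ( null === $processor ) {
           *         // Unsupported context or encoding: leave the markup as it is.
           *         return $block_html;
           *     }
           *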
 287       * @since 6.4.0
 288       * @since 6.6.0 Returns `static` instead of `self` so it can create subclass instances.
 289       *
 290       * @param string $html     Input HTML fragment to process.
 291       * @param string $context  Context element for the fragment; must be the default of `<body>`.
 292       * @param string $encoding Text encoding of the document; must be the default of 'UTF-8'.
 293       * @return static|null The created processor if successful, otherwise null.
 294       */
 295  	public static function create_fragment( $html, $context = '<body>', $encoding = 'UTF-8' ) {
 296          if ( '<body>' !== $context || 'UTF-8' !== $encoding ) {
 297              return null;
 298          }
 299  
 300          $processor                             = new static( $html, self::CONSTRUCTOR_UNLOCK_CODE );
 301          $processor->state->context_node        = array( 'BODY', array() );
 302          $processor->state->insertion_mode      = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
 303          $processor->state->encoding            = $encoding;
 304          $processor->state->encoding_confidence = 'certain';
 305  
 306          // @todo Create "fake" bookmarks for non-existent but implied nodes.
 307          $processor->bookmarks['root-node']    = new WP_HTML_Span( 0, 0 );
 308          $processor->bookmarks['context-node'] = new WP_HTML_Span( 0, 0 );
 309  
 310          $root_node = new WP_HTML_Token(
 311              'root-node',
 312              'HTML',
 313              false
 314          );
 315  
 316          $processor->state->stack_of_open_elements->push( $root_node );
 317  
 318          $context_node = new WP_HTML_Token(
 319              'context-node',
 320              $processor->state->context_node[0],
 321              false
 322          );
 323  
 324          $processor->context_node = $context_node;
 325          $processor->breadcrumbs  = array( 'HTML', $context_node->node_name );
 326  
 327          return $processor;
 328      }
 329  
 330      /**
 331       * Creates an HTML processor in the full parsing mode.
 332       *
 333       * It's likely that a fragment parser is more appropriate, unless sending an
 334       * entire HTML document from start to finish. Consider a fragment parser with
 335       * a context node of `<body>`.
 336       *
 337       * Since UTF-8 is the only currently-accepted charset, if working with a
 338       * document that isn't UTF-8, it's important to convert the document before
 339       * creating the processor: pass in the converted HTML.
 340       *
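           * Example (a minimal sketch; assumes the input is known to be ISO-8859-1,
           * that the mbstring extension is available, and that `$latin1_html` is a
           * placeholder for the caller's document):
           *
           *     $utf8_html = mb_convert_encoding( $latin1_html, 'UTF-8', 'ISO-8859-1' );
           *     $processor = WP_HTML_Processor::create_full_parser( $utf8_html );
           *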
 341       * @param string      $html                    Input HTML document to process.
 342       * @param string|null $known_definite_encoding Optional. If provided, specifies the charset used
 343       *                                             in the input byte stream. Currently must be UTF-8.
 344       * @return static|null The created processor if successful, otherwise null.
 345       */
 346  	public static function create_full_parser( $html, $known_definite_encoding = 'UTF-8' ) {
 347          if ( 'UTF-8' !== $known_definite_encoding ) {
 348              return null;
 349          }
 350  
 351          $processor                             = new static( $html, self::CONSTRUCTOR_UNLOCK_CODE );
 352          $processor->state->encoding            = $known_definite_encoding;
 353          $processor->state->encoding_confidence = 'certain';
 354  
 355          return $processor;
 356      }
 357  
 358      /**
 359       * Constructor.
 360       *
 361       * Do not use this method. Use the static creator methods instead.
 362       *
 363       * @access private
 364       *
 365       * @since 6.4.0
 366       *
 367       * @see WP_HTML_Processor::create_fragment()
 368       *
 369       * @param string      $html                                  HTML to process.
 370       * @param string|null $use_the_static_create_methods_instead This constructor should not be called manually.
 371       */
 372  	public function __construct( $html, $use_the_static_create_methods_instead = null ) {
 373          parent::__construct( $html );
 374  
 375          if ( self::CONSTRUCTOR_UNLOCK_CODE !== $use_the_static_create_methods_instead ) {
 376              _doing_it_wrong(
 377                  __METHOD__,
 378                  sprintf(
 379                      /* translators: %s: WP_HTML_Processor::create_fragment(). */
 380                      __( 'Call %s to create an HTML Processor instead of calling the constructor directly.' ),
 381                      '<code>WP_HTML_Processor::create_fragment()</code>'
 382                  ),
 383                  '6.4.0'
 384              );
 385          }
 386  
 387          $this->state = new WP_HTML_Processor_State();
 388  
 389          $this->state->stack_of_open_elements->set_push_handler(
 390              function ( WP_HTML_Token $token ): void {
 391                  $is_virtual            = ! isset( $this->state->current_token ) || $this->is_tag_closer();
 392                  $same_node             = isset( $this->state->current_token ) && $token->node_name === $this->state->current_token->node_name;
 393                  $provenance            = ( ! $same_node || $is_virtual ) ? 'virtual' : 'real';
 394                  $this->element_queue[] = new WP_HTML_Stack_Event( $token, WP_HTML_Stack_Event::PUSH, $provenance );
 395  
 396                  $this->change_parsing_namespace( $token->integration_node_type ? 'html' : $token->namespace );
 397              }
 398          );
 399  
 400          $this->state->stack_of_open_elements->set_pop_handler(
 401              function ( WP_HTML_Token $token ): void {
 402                  $is_virtual            = ! isset( $this->state->current_token ) || ! $this->is_tag_closer();
 403                  $same_node             = isset( $this->state->current_token ) && $token->node_name === $this->state->current_token->node_name;
 404                  $provenance            = ( ! $same_node || $is_virtual ) ? 'virtual' : 'real';
 405                  $this->element_queue[] = new WP_HTML_Stack_Event( $token, WP_HTML_Stack_Event::POP, $provenance );
 406  
 407                  $adjusted_current_node = $this->get_adjusted_current_node();
 408  
 409                  if ( $adjusted_current_node ) {
 410                      $this->change_parsing_namespace( $adjusted_current_node->integration_node_type ? 'html' : $adjusted_current_node->namespace );
 411                  } else {
 412                      $this->change_parsing_namespace( 'html' );
 413                  }
 414              }
 415          );
 416  
 417          /*
 418           * Create this wrapper so that it's possible to pass
 419           * a private method into WP_HTML_Token classes without
 420           * exposing it to any public API.
 421           */
 422          $this->release_internal_bookmark_on_destruct = function ( string $name ): void {
 423              parent::release_bookmark( $name );
 424          };
 425      }
 426  
 427      /**
 428       * Creates a fragment processor at the current node.
 429       *
 430       * HTML Fragment parsing always happens with a context node. HTML Fragment Processors can be
 431       * instantiated with a `BODY` context node via `WP_HTML_Processor::create_fragment( $html )`.
 432       *
 433       * The context node may impact how a fragment of HTML is parsed. For example, consider the HTML
 434       * fragment `<td />Inside TD?</td>`.
 435       *
 436       * A BODY context node will produce the following tree:
 437       *
 438       *     └─#text Inside TD?
 439       *
 440       * Notice that the `<td>` tags are completely ignored.
 441       *
 442       * Compare that with an SVG context node that produces the following tree:
 443       *
 444       *     ├─svg:td
 445       *     └─#text Inside TD?
 446       *
 447       * Here, a `td` node in the `svg` namespace is created, and its self-closing flag is respected.
 448       * This is a peculiarity of parsing HTML in foreign content like SVG.
 449       *
 450       * Finally, consider the tree produced with a TABLE context node:
 451       *
 452       *     └─TBODY
 453       *       └─TR
 454       *         └─TD
 455       *           └─#text Inside TD?
 456       *
 457       * These examples demonstrate how important the context node may be when processing an HTML
 458       * fragment. Special care must be taken when processing fragments that are expected to appear
 459       * in specific contexts. SVG and TABLE are good examples, but there are others.
 460       *
 461       * @see https://html.spec.whatwg.org/multipage/parsing.html#html-fragment-parsing-algorithm
 462       *
 463       * @param string $html Input HTML fragment to process.
 464       * @return static|null The created processor if successful, otherwise null.
 465       */
 466  	public function create_fragment_at_current_node( string $html ) {
 467          if ( $this->get_token_type() !== '#tag' || $this->is_tag_closer() ) {
 468              return null;
 469          }
 470  
 471          $namespace = $this->current_element->token->namespace;
 472  
 473          /*
 474           * Prevent creating fragments at nodes that require a special tokenizer state.
 475           * This is unsupported by the HTML Processor.
 476           */
 477          if (
 478              'html' === $namespace &&
 479              in_array( $this->current_element->token->node_name, array( 'IFRAME', 'NOEMBED', 'NOFRAMES', 'SCRIPT', 'STYLE', 'TEXTAREA', 'TITLE', 'XMP', 'PLAINTEXT' ), true )
 480          ) {
 481              return null;
 482          }
 483  
 484          $fragment_processor = static::create_fragment( $html );
 485          if ( null === $fragment_processor ) {
 486              return null;
 487          }
 488  
 489          $fragment_processor->compat_mode = $this->compat_mode;
 490  
 491          $fragment_processor->context_node                = clone $this->state->current_token;
 492          $fragment_processor->context_node->bookmark_name = 'context-node';
 493          $fragment_processor->context_node->on_destroy    = null;
 494  
 495          $fragment_processor->state->context_node = array( $fragment_processor->context_node->node_name, array() );
 496  
 497          $attribute_names = $this->get_attribute_names_with_prefix( '' );
 498          if ( null !== $attribute_names ) {
 499              foreach ( $attribute_names as $name ) {
 500                  $fragment_processor->state->context_node[1][ $name ] = $this->get_attribute( $name );
 501              }
 502          }
 503  
 504          $fragment_processor->breadcrumbs = array( 'HTML', $fragment_processor->context_node->node_name );
 505  
 506          if ( 'TEMPLATE' === $fragment_processor->context_node->node_name ) {
 507              $fragment_processor->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE;
 508          }
 509  
 510          $fragment_processor->reset_insertion_mode_appropriately();
 511  
 512          /*
 513           * > Set the parser's form element pointer to the nearest node to the context element that
 514           * > is a form element (going straight up the ancestor chain, and including the element
 515           * > itself, if it is a form element), if any. (If there is no such form element, the
 516           * > form element pointer keeps its initial value, null.)
 517           */
 518          foreach ( $this->state->stack_of_open_elements->walk_up() as $element ) {
 519              if ( 'FORM' === $element->node_name && 'html' === $element->namespace ) {
 520                  $fragment_processor->state->form_element                = clone $element;
 521                  $fragment_processor->state->form_element->bookmark_name = null;
 522                  $fragment_processor->state->form_element->on_destroy    = null;
 523                  break;
 524              }
 525          }
 526  
 527          $fragment_processor->state->encoding_confidence = 'irrelevant';
 528  
 529          /*
 530           * Update the parsing namespace near the end of the process.
 531           * This is important so that any push/pop from the stack of open
 532           * elements does not change the parsing namespace.
 533           */
 534          $fragment_processor->change_parsing_namespace(
 535              $this->current_element->token->integration_node_type ? 'html' : $namespace
 536          );
 537  
 538          return $fragment_processor;
 539      }
 540  
 541      /**
 542       * Stops the parser and terminates its execution when encountering unsupported markup.
 543       *
 544       * @throws WP_HTML_Unsupported_Exception Halts execution of the parser.
 545       *
 546       * @since 6.7.0
 547       *
 548       * @param string $message Explains what support is missing in order to parse the current node.
 549       */
 550  	private function bail( string $message ) {
 551          $here  = $this->bookmarks[ $this->state->current_token->bookmark_name ];
 552          $token = substr( $this->html, $here->start, $here->length );
 553  
 554          $open_elements = array();
 555          foreach ( $this->state->stack_of_open_elements->stack as $item ) {
 556              $open_elements[] = $item->node_name;
 557          }
 558  
 559          $active_formats = array();
 560          foreach ( $this->state->active_formatting_elements->walk_down() as $item ) {
 561              $active_formats[] = $item->node_name;
 562          }
 563  
 564          $this->last_error = self::ERROR_UNSUPPORTED;
 565  
 566          $this->unsupported_exception = new WP_HTML_Unsupported_Exception(
 567              $message,
 568              $this->state->current_token->node_name,
 569              $here->start,
 570              $token,
 571              $open_elements,
 572              $active_formats
 573          );
 574  
 575          throw $this->unsupported_exception;
 576      }
 577  
 578      /**
 579       * Returns the last error, if any.
 580       *
 581       * Various situations lead to parsing failure but this class will
 582       * return `false` in all those cases. To determine why something
 583       * failed it's possible to request the last error. This helps
 584       * distinguish whether a given tag couldn't be found or whether
 585       * content in the document caused the processor to give up and
 586       * abort processing.
 587       *
 588       * Example:
 589       *
 590       *     $processor = WP_HTML_Processor::create_fragment( '<template><strong><button><em><p><em>' );
 591       *     false === $processor->next_tag();
 592       *     WP_HTML_Processor::ERROR_UNSUPPORTED === $processor->get_last_error();
 593       *
 594       * @since 6.4.0
 595       *
 596       * @see self::ERROR_UNSUPPORTED
 597       * @see self::ERROR_EXCEEDED_MAX_BOOKMARKS
 598       *
 599       * @return string|null The last error, if one exists, otherwise null.
 600       */
 601  	public function get_last_error(): ?string {
 602          return $this->last_error;
 603      }
 604  
 605      /**
 606       * Returns context for why the parser aborted due to unsupported HTML, if it did.
 607       *
 608       * This is meant for debugging purposes, not for production use.
 609       *
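           * Example (a debugging sketch; it relies only on the standard `getMessage()`
           * method that every PHP exception provides):
           *
           *     if ( ! $processor->next_tag() && null !== $processor->get_unsupported_exception() ) {
           *         error_log( $processor->get_unsupported_exception()->getMessage() );
           *     }
           *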
 610       * @since 6.7.0
 611       *
 612       * @see self::$unsupported_exception
 613       *
 614       * @return WP_HTML_Unsupported_Exception|null
 615       */
 616  	public function get_unsupported_exception() {
 617          return $this->unsupported_exception;
 618      }
 619  
 620      /**
 621       * Finds the next tag matching the $query.
 622       *
 623       * @todo Support matching the class name and tag name.
 624       *
 625       * @since 6.4.0
 626       * @since 6.6.0 Visits all tokens, including virtual ones.
 627       *
 628       * @throws Exception When unable to allocate a bookmark for the next token in the input HTML document.
 629       *
 630       * @param array|string|null $query {
 631       *     Optional. Which tag name to find, having which class, etc. Default is to find any tag.
 632       *
 633       *     @type string|null $tag_name     Which tag to find, or `null` for "any tag."
 634       *     @type string      $tag_closers  'visit' to pause at tag closers, 'skip' or unset to only visit openers.
 635       *     @type int|null    $match_offset Find the Nth tag matching all search criteria.
 636       *                                     1 for "first" tag, 3 for "third," etc.
 637       *                                     Defaults to first tag.
 638       *     @type string|null $class_name   Tag must contain this whole class name to match.
 639       *     @type string[]    $breadcrumbs  DOM sub-path at which element is found, e.g. `array( 'FIGURE', 'IMG' )`.
 640       *                                     May also contain the wildcard `*` which matches a single element, e.g. `array( 'SECTION', '*' )`.
 641       * }
 642       * @return bool Whether a tag was matched.
 643       */
 644  	public function next_tag( $query = null ): bool {
 645          $visit_closers = isset( $query['tag_closers'] ) && 'visit' === $query['tag_closers'];
 646  
 647          if ( null === $query ) {
 648              while ( $this->next_token() ) {
 649                  if ( '#tag' !== $this->get_token_type() ) {
 650                      continue;
 651                  }
 652  
 653                  if ( ! $this->is_tag_closer() || $visit_closers ) {
 654                      return true;
 655                  }
 656              }
 657  
 658              return false;
 659          }
 660  
 661          if ( is_string( $query ) ) {
 662              $query = array( 'breadcrumbs' => array( $query ) );
 663          }
 664  
 665          if ( ! is_array( $query ) ) {
 666              _doing_it_wrong(
 667                  __METHOD__,
 668                  __( 'Please pass a query array to this function.' ),
 669                  '6.4.0'
 670              );
 671              return false;
 672          }
 673  
 674          if ( isset( $query['tag_name'] ) ) {
 675              $query['tag_name'] = strtoupper( $query['tag_name'] );
 676          }
 677  
 678          $needs_class = ( isset( $query['class_name'] ) && is_string( $query['class_name'] ) )
 679              ? $query['class_name']
 680              : null;
 681  
 682          if ( ! ( array_key_exists( 'breadcrumbs', $query ) && is_array( $query['breadcrumbs'] ) ) ) {
 683              while ( $this->next_token() ) {
 684                  if ( '#tag' !== $this->get_token_type() ) {
 685                      continue;
 686                  }
 687  
 688                  if ( isset( $query['tag_name'] ) && $query['tag_name'] !== $this->get_token_name() ) {
 689                      continue;
 690                  }
 691  
 692                  if ( isset( $needs_class ) && ! $this->has_class( $needs_class ) ) {
 693                      continue;
 694                  }
 695  
 696                  if ( ! $this->is_tag_closer() || $visit_closers ) {
 697                      return true;
 698                  }
 699              }
 700  
 701              return false;
 702          }
 703  
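              /*
               * `match_offset` picks the nth element matching the breadcrumbs, counting
               * from one; only tag openers that match the breadcrumbs decrement it.
               *
               * Illustrative sketch (the markup below is an assumption chosen for the example):
               *
               *     $processor = WP_HTML_Processor::create_fragment( '<ul><li>A</li><li>B</li></ul>' );
               *     $processor->next_tag( array( 'breadcrumbs' => array( 'UL', 'LI' ), 'match_offset' => 2 ) );
               *     // The processor is now matched on the second LI opener.
               */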
 704          $breadcrumbs  = $query['breadcrumbs'];
 705          $match_offset = isset( $query['match_offset'] ) ? (int) $query['match_offset'] : 1;
 706  
 707          while ( $match_offset > 0 && $this->next_token() ) {
 708              if ( '#tag' !== $this->get_token_type() || $this->is_tag_closer() ) {
 709                  continue;
 710              }
 711  
 712              if ( isset( $needs_class ) && ! $this->has_class( $needs_class ) ) {
 713                  continue;
 714              }
 715  
 716              if ( $this->matches_breadcrumbs( $breadcrumbs ) && 0 === --$match_offset ) {
 717                  return true;
 718              }
 719          }
 720  
 721          return false;
 722      }
 723  
 724      /**
 725       * Finds the next token in the HTML document.
 726       *
 727       * This doesn't currently have a way to represent non-tags and doesn't process
 728       * semantic rules for text nodes. For access to the raw tokens consider using
 729       * WP_HTML_Tag_Processor instead.
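           *
           * Example (a minimal sketch of walking every visitable token; the input
           * markup is illustrative and not taken from the test suite):
           *
           *     $processor = WP_HTML_Processor::create_fragment( '<p>One<br>two</p>' );
           *     while ( $processor->next_token() ) {
           *         $depth = $processor->get_current_depth();
           *         $name  = $processor->get_token_name();
           *         echo str_repeat( '  ', $depth ), $name, "\n";
           *     }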
 730       *
 731       * @since 6.5.0 Added for internal support; do not use.
 732       * @since 6.7.1 Refactored so subclasses may extend.
 733       *
 734       * @return bool Whether a token was parsed.
 735       */
 736  	public function next_token(): bool {
 737          return $this->next_visitable_token();
 738      }
 739  
 740      /**
 741       * Ensures internal accounting is maintained for HTML semantic rules while
 742       * the underlying Tag Processor class is seeking to a bookmark.
 743       *
 744       * This doesn't currently have a way to represent non-tags and doesn't process
 745       * semantic rules for text nodes. For access to the raw tokens consider using
 746       * WP_HTML_Tag_Processor instead.
 747       *
 748       * Note that this method may call itself recursively. This is why it is not
 749       * implemented as {@see WP_HTML_Processor::next_token()}, which instead calls
 750       * this method similarly to how {@see WP_HTML_Tag_Processor::next_token()}
 751       * calls the {@see WP_HTML_Tag_Processor::base_class_next_token()} method.
 752       *
 753       * @since 6.7.1 Added for internal support.
 754       *
 755       * @access private
 756       *
 757       * @return bool
 758       */
 759  	private function next_visitable_token(): bool {
 760          $this->current_element = null;
 761  
 762          if ( isset( $this->last_error ) ) {
 763              return false;
 764          }
 765  
 766          /*
 767           * Prime the events if there are none.
 768           *
 769           * @todo In some cases, probably related to the adoption agency
 770           *       algorithm, this call to step() doesn't create any new
 771           *       events. Calling it again creates them. Figure out why
 772           *       this is and if it's inherent or if it's a bug. Looping
 773           *       until there are events or until there are no more
 774           *       tokens works in the meantime and isn't obviously wrong.
 775           */
 776          if ( empty( $this->element_queue ) && $this->step() ) {
 777              return $this->next_visitable_token();
 778          }
 779  
 780          // Process the next event on the queue.
 781          $this->current_element = array_shift( $this->element_queue );
 782          if ( ! isset( $this->current_element ) ) {
 783              // There are no tokens left, so close all remaining open elements.
 784              while ( $this->state->stack_of_open_elements->pop() ) {
 785                  continue;
 786              }
 787  
 788              return empty( $this->element_queue ) ? false : $this->next_visitable_token();
 789          }
 790  
 791          $is_pop = WP_HTML_Stack_Event::POP === $this->current_element->operation;
 792  
 793          /*
 794           * The root node only exists in the fragment parser, and closing it
 795           * indicates that the parse is complete. Stop before popping it from
 796           * the breadcrumbs.
 797           */
 798          if ( 'root-node' === $this->current_element->token->bookmark_name ) {
 799              return $this->next_visitable_token();
 800          }
 801  
 802          // Adjust the breadcrumbs for this event.
 803          if ( $is_pop ) {
 804              array_pop( $this->breadcrumbs );
 805          } else {
 806              $this->breadcrumbs[] = $this->current_element->token->node_name;
 807          }
 808  
 809          // Avoid sending close events for elements which don't expect a closer.
 810          if ( $is_pop && ! $this->expects_closer( $this->current_element->token ) ) {
 811              return $this->next_visitable_token();
 812          }
 813  
 814          return true;
 815      }
 816  
 817      /**
 818       * Indicates if the current tag token is a tag closer.
 819       *
 820       * Example:
 821       *
 822       *     $p = WP_HTML_Processor::create_fragment( '<div></div>' );
 823       *     $p->next_tag( array( 'tag_name' => 'div', 'tag_closers' => 'visit' ) );
 824       *     $p->is_tag_closer() === false;
 825       *
 826       *     $p->next_tag( array( 'tag_name' => 'div', 'tag_closers' => 'visit' ) );
 827       *     $p->is_tag_closer() === true;
 828       *
 829       * @since 6.6.0 Subclassed for HTML Processor.
 830       *
 831       * @return bool Whether the current tag is a tag closer.
 832       */
 833  	public function is_tag_closer(): bool {
 834          return $this->is_virtual()
 835              ? ( WP_HTML_Stack_Event::POP === $this->current_element->operation && '#tag' === $this->get_token_type() )
 836              : parent::is_tag_closer();
 837      }
 838  
 839      /**
 840       * Indicates if the currently-matched token is virtual, created by a stack operation
 841       * while processing HTML, rather than a token found in the HTML text itself.
 842       *
 843       * @since 6.6.0
 844       *
 845       * @return bool Whether the current token is virtual.
 846       */
 847  	private function is_virtual(): bool {
 848          return (
 849              isset( $this->current_element->provenance ) &&
 850              'virtual' === $this->current_element->provenance
 851          );
 852      }
 853  
 854      /**
 855       * Indicates if the currently-matched tag matches the given breadcrumbs.
 856       *
 857       * A "*" is a wildcard for exactly one tag: any tag matches, but an absent tag does not.
 858       *
 859       * At some point this function _may_ support a `**` syntax for matching any number
 860       * of unspecified tags in the breadcrumb stack. This has been intentionally left
 861       * out, however, to keep this function simple and to avoid introducing backtracking,
 862       * which could open up surprising performance breakdowns.
 863       *
 864       * Example:
 865       *
 866       *     $processor = WP_HTML_Processor::create_fragment( '<div><span><figure><img></figure></span></div>' );
 867       *     $processor->next_tag( 'img' );
 868       *     true  === $processor->matches_breadcrumbs( array( 'figure', 'img' ) );
 869       *     true  === $processor->matches_breadcrumbs( array( 'span', 'figure', 'img' ) );
 870       *     false === $processor->matches_breadcrumbs( array( 'span', 'img' ) );
 871       *     true  === $processor->matches_breadcrumbs( array( 'span', '*', 'img' ) );
 872       *
 873       * @since 6.4.0
 874       *
 875       * @param string[] $breadcrumbs DOM sub-path at which element is found, e.g. `array( 'FIGURE', 'IMG' )`.
 876       *                              May also contain the wildcard `*` which matches a single element, e.g. `array( 'SECTION', '*' )`.
 877       * @return bool Whether the currently-matched tag is found at the given nested structure.
 878       */
 879  	public function matches_breadcrumbs( $breadcrumbs ): bool {
 880          // Everything matches when there are zero constraints.
 881          if ( 0 === count( $breadcrumbs ) ) {
 882              return true;
 883          }
 884  
 885          // Start at the last crumb.
 886          $crumb = end( $breadcrumbs );
 887  
 888          if ( '*' !== $crumb && $this->get_tag() !== strtoupper( $crumb ) ) {
 889              return false;
 890          }
 891  
 892          for ( $i = count( $this->breadcrumbs ) - 1; $i >= 0; $i-- ) {
 893              $node  = $this->breadcrumbs[ $i ];
 894              $crumb = strtoupper( current( $breadcrumbs ) );
 895  
 896              if ( '*' !== $crumb && $node !== $crumb ) {
 897                  return false;
 898              }
 899  
 900              if ( false === prev( $breadcrumbs ) ) {
 901                  return true;
 902              }
 903          }
 904  
 905          return false;
 906      }
 907  
 908      /**
 909       * Indicates if the currently-matched node expects a closing
 910       * token, or if it will self-close on the next step.
 911       *
 912       * Most HTML elements expect a closer, such as a P element or
 913       * a DIV element. Others, like an IMG element, are void and don't
 914       * have a closing tag. Special elements, such as SCRIPT and STYLE,
 915       * are treated just like void tags. Text nodes and self-closing
 916       * foreign content will also act just like a void tag, immediately
 917       * closing as soon as the processor advances to the next token.
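           *
           * Example (an illustrative sketch, assuming the fragment parser in its
           * default BODY context and no `$node` argument):
           *
           *     $processor = WP_HTML_Processor::create_fragment( '<div><img></div>' );
           *     $processor->next_tag( 'DIV' );
           *     true === $processor->expects_closer();
           *
           *     $processor->next_tag( 'IMG' );
           *     false === $processor->expects_closer();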
 918       *
 919       * @since 6.6.0
 920       *
 921       * @param WP_HTML_Token|null $node Optional. Node to examine, if provided.
 922       *                                 Default is to examine current node.
 923       * @return bool|null Whether to expect a closer for the currently-matched node,
 924       *                   or `null` if not matched on any token.
 925       */
 926  	public function expects_closer( ?WP_HTML_Token $node = null ): ?bool {
 927          $token_name = $node->node_name ?? $this->get_token_name();
 928  
 929          if ( ! isset( $token_name ) ) {
 930              return null;
 931          }
 932  
 933          $token_namespace        = $node->namespace ?? $this->get_namespace();
 934          $token_has_self_closing = $node->has_self_closing_flag ?? $this->has_self_closing_flag();
 935  
 936          return ! (
 937              // Comments, text nodes, and other atomic tokens.
 938              '#' === $token_name[0] ||
 939              // Doctype declarations.
 940              'html' === $token_name ||
 941              // Void elements.
 942              ( 'html' === $token_namespace && self::is_void( $token_name ) ) ||
 943              // Special atomic elements.
 944              ( 'html' === $token_namespace && in_array( $token_name, array( 'IFRAME', 'NOEMBED', 'NOFRAMES', 'SCRIPT', 'STYLE', 'TEXTAREA', 'TITLE', 'XMP' ), true ) ) ||
 945              // Self-closing elements in foreign content.
 946              ( 'html' !== $token_namespace && $token_has_self_closing )
 947          );
 948      }
 949  
 950      /**
 951       * Steps through the HTML document and stops at the next tag, if any.
 952       *
 953       * @since 6.4.0
 954       *
 955       * @throws Exception When unable to allocate a bookmark for the next token in the input HTML document.
 956       *
 957       * @see self::PROCESS_NEXT_NODE
 958       * @see self::REPROCESS_CURRENT_NODE
 959       *
 960       * @param string $node_to_process Whether to parse the next node or reprocess the current node.
 961       * @return bool Whether a tag was matched.
 962       */
 963  	public function step( $node_to_process = self::PROCESS_NEXT_NODE ): bool {
 964          // Refuse to proceed if there was a previous error.
 965          if ( null !== $this->last_error ) {
 966              return false;
 967          }
 968  
 969          if ( self::REPROCESS_CURRENT_NODE !== $node_to_process ) {
 970              /*
 971               * Void elements still hop onto the stack of open elements even though
 972               * there's no corresponding closing tag. This is important for managing
 973               * stack-based operations such as "navigate to parent node" or checking
 974               * on an element's breadcrumbs.
 975               *
 976               * When moving on to the next node, therefore, if the bottom-most element
 977               * on the stack is a void element, it must be closed.
 978               */
 979              $top_node = $this->state->stack_of_open_elements->current_node();
 980              if ( isset( $top_node ) && ! $this->expects_closer( $top_node ) ) {
 981                  $this->state->stack_of_open_elements->pop();
 982              }
 983          }
 984  
 985          if ( self::PROCESS_NEXT_NODE === $node_to_process ) {
 986              parent::next_token();
 987              if ( WP_HTML_Tag_Processor::STATE_TEXT_NODE === $this->parser_state ) {
 988                  parent::subdivide_text_appropriately();
 989              }
 990          }
 991  
 992          // Finish stepping when there are no more tokens in the document.
 993          if (
 994              WP_HTML_Tag_Processor::STATE_INCOMPLETE_INPUT === $this->parser_state ||
 995              WP_HTML_Tag_Processor::STATE_COMPLETE === $this->parser_state
 996          ) {
 997              return false;
 998          }
 999  
1000          $adjusted_current_node = $this->get_adjusted_current_node();
1001          $is_closer             = $this->is_tag_closer();
1002          $is_start_tag          = WP_HTML_Tag_Processor::STATE_MATCHED_TAG === $this->parser_state && ! $is_closer;
1003          $token_name            = $this->get_token_name();
1004  
1005          if ( self::REPROCESS_CURRENT_NODE !== $node_to_process ) {
1006              $this->state->current_token = new WP_HTML_Token(
1007                  $this->bookmark_token(),
1008                  $token_name,
1009                  $this->has_self_closing_flag(),
1010                  $this->release_internal_bookmark_on_destruct
1011              );
1012          }
1013  
1014          $parse_in_current_insertion_mode = (
1015              0 === $this->state->stack_of_open_elements->count() ||
1016              'html' === $adjusted_current_node->namespace ||
1017              (
1018                  'math' === $adjusted_current_node->integration_node_type &&
1019                  (
1020                      ( $is_start_tag && ! in_array( $token_name, array( 'MGLYPH', 'MALIGNMARK' ), true ) ) ||
1021                      '#text' === $token_name
1022                  )
1023              ) ||
1024              (
1025                  'math' === $adjusted_current_node->namespace &&
1026                  'ANNOTATION-XML' === $adjusted_current_node->node_name &&
1027                  $is_start_tag && 'SVG' === $token_name
1028              ) ||
1029              (
1030                  'html' === $adjusted_current_node->integration_node_type &&
1031                  ( $is_start_tag || '#text' === $token_name )
1032              )
1033          );
1034  
1035          try {
1036              if ( ! $parse_in_current_insertion_mode ) {
1037                  return $this->step_in_foreign_content();
1038              }
1039  
1040              switch ( $this->state->insertion_mode ) {
1041                  case WP_HTML_Processor_State::INSERTION_MODE_INITIAL:
1042                      return $this->step_initial();
1043  
1044                  case WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HTML:
1045                      return $this->step_before_html();
1046  
1047                  case WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HEAD:
1048                      return $this->step_before_head();
1049  
1050                  case WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD:
1051                      return $this->step_in_head();
1052  
1053                  case WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD_NOSCRIPT:
1054                      return $this->step_in_head_noscript();
1055  
1056                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_HEAD:
1057                      return $this->step_after_head();
1058  
1059                  case WP_HTML_Processor_State::INSERTION_MODE_IN_BODY:
1060                      return $this->step_in_body();
1061  
1062                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE:
1063                      return $this->step_in_table();
1064  
1065                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_TEXT:
1066                      return $this->step_in_table_text();
1067  
1068                  case WP_HTML_Processor_State::INSERTION_MODE_IN_CAPTION:
1069                      return $this->step_in_caption();
1070  
1071                  case WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP:
1072                      return $this->step_in_column_group();
1073  
1074                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY:
1075                      return $this->step_in_table_body();
1076  
1077                  case WP_HTML_Processor_State::INSERTION_MODE_IN_ROW:
1078                      return $this->step_in_row();
1079  
1080                  case WP_HTML_Processor_State::INSERTION_MODE_IN_CELL:
1081                      return $this->step_in_cell();
1082  
1083                  case WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT:
1084                      return $this->step_in_select();
1085  
1086                  case WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT_IN_TABLE:
1087                      return $this->step_in_select_in_table();
1088  
1089                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE:
1090                      return $this->step_in_template();
1091  
1092                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_BODY:
1093                      return $this->step_after_body();
1094  
1095                  case WP_HTML_Processor_State::INSERTION_MODE_IN_FRAMESET:
1096                      return $this->step_in_frameset();
1097  
1098                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_FRAMESET:
1099                      return $this->step_after_frameset();
1100  
1101                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_BODY:
1102                      return $this->step_after_after_body();
1103  
1104                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_FRAMESET:
1105                      return $this->step_after_after_frameset();
1106  
1107                  // This should be unreachable but PHP doesn't have total type checking on switch.
1108                  default:
1109                      $this->bail( "Unaware of the requested parsing mode: '{$this->state->insertion_mode}'." );
1110              }
1111          } catch ( WP_HTML_Unsupported_Exception $e ) {
1112              /*
1113               * Exceptions are used in this class to escape deep call stacks that
1114               * otherwise might involve messier calling and return conventions.
1115               */
1116              return false;
1117          }
1118      }
1119  
1120      /**
1121       * Computes the HTML breadcrumbs for the currently-matched node, if matched.
1122       *
1123       * Breadcrumbs start at the outermost parent and descend toward the matched element.
1124       * They always include the entire path from the root HTML node to the matched element.
1125       *
1126       * @todo It could be more efficient to expose a generator-based version of this function
1127       *       to avoid creating the array copy on tag iteration. If this is done, it would likely
1128       *       be more useful to walk up the stack when yielding instead of starting at the top.
1129       *
1130       * Example:
1131       *
1132       *     $processor = WP_HTML_Processor::create_fragment( '<p><strong><em><img></em></strong></p>' );
1133       *     $processor->next_tag( 'IMG' );
1134       *     $processor->get_breadcrumbs() === array( 'HTML', 'BODY', 'P', 'STRONG', 'EM', 'IMG' );
1135       *
1136       * @since 6.4.0
1137       *
1138       * @return string[]|null Array of tag names representing path to matched node, if matched, otherwise NULL.
1139       */
1140  	public function get_breadcrumbs(): ?array {
1141          return $this->breadcrumbs;
1142      }
1143  
1144      /**
1145       * Returns the nesting depth of the current location in the document.
1146       *
1147       * Example:
1148       *
1149       *     $processor = WP_HTML_Processor::create_fragment( '<div><p></p></div>' );
1150       *     // The processor starts in the BODY context, meaning it has depth from the start: HTML > BODY.
1151       *     2 === $processor->get_current_depth();
1152       *
1153       *     // Opening the DIV element increases the depth.
1154       *     $processor->next_token();
1155       *     3 === $processor->get_current_depth();
1156       *
1157       *     // Opening the P element increases the depth.
1158       *     $processor->next_token();
1159       *     4 === $processor->get_current_depth();
1160       *
1161       *     // The P element is closed during `next_token()` so the depth is decreased to reflect that.
1162       *     $processor->next_token();
1163       *     3 === $processor->get_current_depth();
1164       *
1165       * @since 6.6.0
1166       *
1167       * @return int Nesting-depth of current location in the document.
1168       */
1169  	public function get_current_depth(): int {
1170          return count( $this->breadcrumbs );
1171      }
1172  
1173      /**
1174       * Normalizes an HTML fragment by serializing it.
1175       *
1176       * This method assumes that the given HTML snippet is found in BODY context.
1177       * For normalizing full documents or fragments found in other contexts, create
1178       * a new processor using {@see WP_HTML_Processor::create_fragment} or
1179       * {@see WP_HTML_Processor::create_full_parser} and call {@see WP_HTML_Processor::serialize}
1180       * on the created instances.
1181       *
1182       * Many aspects of an input HTML fragment may be changed during normalization.
1183       *
1184       *  - Attribute values will be double-quoted.
1185       *  - Duplicate attributes will be removed.
1186       *  - Omitted tags will be added.
1187       *  - Tag and attribute name casing will be lower-cased,
1188       *    except for specific SVG and MathML tags or attributes.
1189       *  - Text will be re-encoded, null bytes handled,
1190       *    and invalid UTF-8 replaced with U+FFFD.
1191       *  - Any incomplete syntax trailing at the end will be omitted,
1192       *    for example, an unclosed comment opener will be removed.
1193       *
1194       * Example:
1195       *
1196       *     echo WP_HTML_Processor::normalize( '<a href=#anchor v=5 href="/" enabled>One</a another v=5><!--' );
1197       *     // <a href="#anchor" v="5" enabled>One</a>
1198       *
1199       *     echo WP_HTML_Processor::normalize( '<div></p>fun<table><td>cell</div>' );
1200       *     // <div><p></p>fun<table><tbody><tr><td>cell</td></tr></tbody></table></div>
1201       *
1202       *     echo WP_HTML_Processor::normalize( '<![CDATA[invalid comment]]> syntax < <> "oddities"' );
1203       *     // <!--[CDATA[invalid comment]]--> syntax &lt; &lt;&gt; &quot;oddities&quot;
1204       *
1205       * @since 6.7.0
1206       *
1207       * @param string $html Input HTML to normalize.
1208       *
1209       * @return string|null Normalized output, or `null` if unable to normalize.
1210       */
1211  	public static function normalize( string $html ): ?string {
1212          return static::create_fragment( $html )->serialize();
1213      }
1214  
1215      /**
1216       * Returns normalized HTML for a fragment by serializing it.
1217       *
1218       * This differs from {@see WP_HTML_Processor::normalize} in that it starts with
1219       * a specific HTML Processor, which _must_ not have already started scanning;
1220       * it must be in the initial ready state and will be in the completed state once
1221       * serialization is complete.
1222       *
1223       * Many aspects of an input HTML fragment may be changed during normalization.
1224       *
1225       *  - Attribute values will be double-quoted.
1226       *  - Duplicate attributes will be removed.
1227       *  - Omitted tags will be added.
1228       *  - Tag and attribute name casing will be lower-cased,
1229       *    except for specific SVG and MathML tags or attributes.
1230       *  - Text will be re-encoded, null bytes handled,
1231       *    and invalid UTF-8 replaced with U+FFFD.
1232       *  - Any incomplete syntax trailing at the end will be omitted,
1233       *    for example, an unclosed comment opener will be removed.
1234       *
1235       * Example:
1236       *
1237       *     $processor = WP_HTML_Processor::create_fragment( '<a href=#anchor v=5 href="/" enabled>One</a another v=5><!--' );
1238       *     echo $processor->serialize();
1239       *     // <a href="#anchor" v="5" enabled>One</a>
1240       *
1241       *     $processor = WP_HTML_Processor::create_fragment( '<div></p>fun<table><td>cell</div>' );
1242       *     echo $processor->serialize();
1243       *     // <div><p></p>fun<table><tbody><tr><td>cell</td></tr></tbody></table></div>
1244       *
1245       *     $processor = WP_HTML_Processor::create_fragment( '<![CDATA[invalid comment]]> syntax < <> "oddities"' );
1246       *     echo $processor->serialize();
1247       *     // <!--[CDATA[invalid comment]]--> syntax &lt; &lt;&gt; &quot;oddities&quot;
1248       *
1249       * @since 6.7.0
1250       *
1251       * @return string|null Normalized HTML markup represented by processor,
1252       *                     or `null` if unable to generate serialization.
1253       */
1254  	public function serialize(): ?string {
1255          if ( WP_HTML_Tag_Processor::STATE_READY !== $this->parser_state ) {
1256              wp_trigger_error(
1257                  __METHOD__,
1258                  'An HTML Processor which has already started processing cannot serialize its contents. Serialize immediately after creating the instance.',
1259                  E_USER_WARNING
1260              );
1261              return null;
1262          }
1263  
1264          $html = '';
1265          while ( $this->next_token() ) {
1266              $html .= $this->serialize_token();
1267          }
1268  
1269          if ( null !== $this->get_last_error() ) {
1270              wp_trigger_error(
1271                  __METHOD__,
1272                  "Cannot serialize HTML Processor with parsing error: {$this->get_last_error()}.",
1273                  E_USER_WARNING
1274              );
1275              return null;
1276          }
1277  
1278          return $html;
1279      }
1280  
1281      /**
1282       * Serializes the currently-matched token.
1283       *
1284       * This method produces a fully-normative HTML string for the currently-matched token,
1285       * if able. If not matched at any token or if the token doesn't correspond to any HTML,
1286       * it will return an empty string (for example, presumptuous end tags are ignored).
1287       *
1288       * @see static::serialize()
1289       *
1290       * @since 6.7.0
1291       *
1292       * @return string Serialization of token, or empty string if no serialization exists.
1293       */
1294  	protected function serialize_token(): string {
1295          $html       = '';
1296          $token_type = $this->get_token_type();
1297  
1298          switch ( $token_type ) {
1299              case '#doctype':
1300                  $doctype = $this->get_doctype_info();
1301                  if ( null === $doctype ) {
1302                      break;
1303                  }
1304  
1305                  $html .= '<!DOCTYPE';
1306  
1307                  if ( $doctype->name ) {
1308                      $html .= " {$doctype->name}";
1309                  }
1310  
1311                  if ( null !== $doctype->public_identifier ) {
1312                      $quote = str_contains( $doctype->public_identifier, '"' ) ? "'" : '"';
1313                      $html .= " PUBLIC {$quote}{$doctype->public_identifier}{$quote}";
1314                  }
1315                  if ( null !== $doctype->system_identifier ) {
1316                      if ( null === $doctype->public_identifier ) {
1317                          $html .= ' SYSTEM';
1318                      }
1319                      $quote = str_contains( $doctype->system_identifier, '"' ) ? "'" : '"';
1320                      $html .= " {$quote}{$doctype->system_identifier}{$quote}";
1321                  }
1322  
1323                  $html .= '>';
1324                  break;
1325  
1326              case '#text':
1327                  $html .= htmlspecialchars( $this->get_modifiable_text(), ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8' );
1328                  break;
1329  
1330              // Unlike the `<>` which is interpreted as plaintext, this is ignored entirely.
1331              case '#presumptuous-tag':
1332                  break;
1333  
1334              case '#funky-comment':
1335              case '#comment':
1336                  $html .= "<!--{$this->get_full_comment_text()}-->";
1337                  break;
1338  
1339              case '#cdata-section':
1340                  $html .= "<![CDATA[{$this->get_modifiable_text()}]]>";
1341                  break;
1342          }
1343  
1344          if ( '#tag' !== $token_type ) {
1345              return $html;
1346          }
1347  
1348          $tag_name       = str_replace( "\x00", "\u{FFFD}", $this->get_tag() );
1349          $in_html        = 'html' === $this->get_namespace();
1350          $qualified_name = $in_html ? strtolower( $tag_name ) : $this->get_qualified_tag_name();
1351  
1352          if ( $this->is_tag_closer() ) {
1353              $html .= "</{$qualified_name}>";
1354              return $html;
1355          }
1356  
1357          $attribute_names = $this->get_attribute_names_with_prefix( '' );
1358          if ( ! isset( $attribute_names ) ) {
1359              $html .= "<{$qualified_name}>";
1360              return $html;
1361          }
1362  
1363          $html .= "<{$qualified_name}";
1364          foreach ( $attribute_names as $attribute_name ) {
1365              $html .= " {$this->get_qualified_attribute_name( $attribute_name )}";
1366              $value = $this->get_attribute( $attribute_name );
1367  
1368              if ( is_string( $value ) ) {
1369                  $html .= '="' . htmlspecialchars( $value, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5 ) . '"';
1370              }
1371  
1372              $html = str_replace( "\x00", "\u{FFFD}", $html );
1373          }
1374  
1375          if ( ! $in_html && $this->has_self_closing_flag() ) {
1376              $html .= ' /';
1377          }
1378  
1379          $html .= '>';
1380  
1381          // Flush out self-contained elements.
1382          if ( $in_html && in_array( $tag_name, array( 'IFRAME', 'NOEMBED', 'NOFRAMES', 'SCRIPT', 'STYLE', 'TEXTAREA', 'TITLE', 'XMP' ), true ) ) {
1383              $text = $this->get_modifiable_text();
1384  
1385              switch ( $tag_name ) {
1386                  case 'IFRAME':
1387                  case 'NOEMBED':
1388                  case 'NOFRAMES':
1389                      $text = '';
1390                      break;
1391  
1392                  case 'SCRIPT':
1393                  case 'STYLE':
1394                      break;
1395  
1396                  default:
1397                      $text = htmlspecialchars( $text, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8' );
1398              }
1399  
1400              $html .= "{$text}</{$qualified_name}>";
1401          }
1402  
1403          return $html;
1404      }
1405  
1406      /**
1407       * Parses next element in the 'initial' insertion mode.
1408       *
1409       * This internal function performs the 'initial' insertion mode
1410       * logic for the generalized WP_HTML_Processor::step() function.
1411       *
1412       * @since 6.7.0
1413       *
1414       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
1415       *
1416       * @see https://html.spec.whatwg.org/#the-initial-insertion-mode
1417       * @see WP_HTML_Processor::step
1418       *
1419       * @return bool Whether an element was found.
1420       */
1421  	private function step_initial(): bool {
1422          $token_name = $this->get_token_name();
1423          $token_type = $this->get_token_type();
1424          $op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
1425          $op         = "{$op_sigil}{$token_name}";
1426  
1427          switch ( $op ) {
1428              /*
1429               * > A character token that is one of U+0009 CHARACTER TABULATION,
1430               * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
1431               * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
1432               *
1433               * Ignore the token.
1434               */
1435              case '#text':
1436                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
1437                      return $this->step();
1438                  }
1439                  goto initial_anything_else;
1440                  break;
1441  
1442              /*
1443               * > A comment token
1444               */
1445              case '#comment':
1446              case '#funky-comment':
1447              case '#presumptuous-tag':
1448                  $this->insert_html_element( $this->state->current_token );
1449                  return true;
1450  
1451              /*
1452               * > A DOCTYPE token
1453               */
1454              case 'html':
1455                  $doctype = $this->get_doctype_info();
1456                  if ( null !== $doctype && 'quirks' === $doctype->indicated_compatability_mode ) {
1457                      $this->compat_mode = WP_HTML_Tag_Processor::QUIRKS_MODE;
1458                  }
1459  
1460                  /*
1461                   * > Then, switch the insertion mode to "before html".
1462                   */
1463                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HTML;
1464                  $this->insert_html_element( $this->state->current_token );
1465                  return true;
1466          }
1467  
1468          /*
1469           * > Anything else
1470           */
1471          initial_anything_else:
1472          $this->compat_mode           = WP_HTML_Tag_Processor::QUIRKS_MODE;
1473          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HTML;
1474          return $this->step( self::REPROCESS_CURRENT_NODE );
1475      }
1476  
1477      /**
1478       * Parses next element in the 'before html' insertion mode.
1479       *
1480       * This internal function performs the 'before html' insertion mode
1481       * logic for the generalized WP_HTML_Processor::step() function.
1482       *
1483       * @since 6.7.0
1484       *
1485       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
1486       *
1487       * @see https://html.spec.whatwg.org/#the-before-html-insertion-mode
1488       * @see WP_HTML_Processor::step
1489       *
1490       * @return bool Whether an element was found.
1491       */
1492  	private function step_before_html(): bool {
1493          $token_name = $this->get_token_name();
1494          $token_type = $this->get_token_type();
1495          $is_closer  = parent::is_tag_closer();
1496          $op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
1497          $op         = "{$op_sigil}{$token_name}";
1498  
1499          switch ( $op ) {
1500              /*
1501               * > A DOCTYPE token
1502               */
1503              case 'html':
1504                  // Parse error: ignore the token.
1505                  return $this->step();
1506  
1507              /*
1508               * > A comment token
1509               */
1510              case '#comment':
1511              case '#funky-comment':
1512              case '#presumptuous-tag':
1513                  $this->insert_html_element( $this->state->current_token );
1514                  return true;
1515  
1516              /*
1517               * > A character token that is one of U+0009 CHARACTER TABULATION,
1518               * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
1519               * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
1520               *
1521               * Ignore the token.
1522               */
1523              case '#text':
1524                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
1525                      return $this->step();
1526                  }
1527                  goto before_html_anything_else;
1528                  break;
1529  
1530              /*
1531               * > A start tag whose tag name is "html"
1532               */
1533              case '+HTML':
1534                  $this->insert_html_element( $this->state->current_token );
1535                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HEAD;
1536                  return true;
1537  
1538              /*
1539               * > An end tag whose tag name is one of: "head", "body", "html", "br"
1540               *
1541               * Closing BR tags are always reported by the Tag Processor as opening tags.
1542               */
1543              case '-HEAD':
1544              case '-BODY':
1545              case '-HTML':
1546                  /*
1547                   * > Act as described in the "anything else" entry below.
1548                   */
1549                  goto before_html_anything_else;
1550                  break;
1551          }
1552  
1553          /*
1554           * > Any other end tag
1555           */
1556          if ( $is_closer ) {
1557              // Parse error: ignore the token.
1558              return $this->step();
1559          }
1560  
1561          /*
1562           * > Anything else
1563           *
1564           * > Create an html element whose node document is the Document object.
1565           * > Append it to the Document object. Put this element in the stack of open elements.
1566           * > Switch the insertion mode to "before head", then reprocess the token.
1567           */
1568          before_html_anything_else:
1569          $this->insert_virtual_node( 'HTML' );
1570          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HEAD;
1571          return $this->step( self::REPROCESS_CURRENT_NODE );
1572      }
1573  
1574      /**
1575       * Parses next element in the 'before head' insertion mode.
1576       *
1577       * This internal function performs the 'before head' insertion mode
1578       * logic for the generalized WP_HTML_Processor::step() function.
1579       *
1580       * @since 6.7.0
1581       *
1582       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
1583       *
1584       * @see https://html.spec.whatwg.org/#the-before-head-insertion-mode
1585       * @see WP_HTML_Processor::step
1586       *
1587       * @return bool Whether an element was found.
1588       */
1589  	private function step_before_head(): bool {
1590          $token_name = $this->get_token_name();
1591          $token_type = $this->get_token_type();
1592          $is_closer  = parent::is_tag_closer();
1593          $op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
1594          $op         = "{$op_sigil}{$token_name}";
1595  
1596          switch ( $op ) {
1597              /*
1598               * > A character token that is one of U+0009 CHARACTER TABULATION,
1599               * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
1600               * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
1601               *
1602               * Ignore the token.
1603               */
1604              case '#text':
1605                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
1606                      return $this->step();
1607                  }
1608                  goto before_head_anything_else;
1609                  break;
1610  
1611              /*
1612               * > A comment token
1613               */
1614              case '#comment':
1615              case '#funky-comment':
1616              case '#presumptuous-tag':
1617                  $this->insert_html_element( $this->state->current_token );
1618                  return true;
1619  
1620              /*
1621               * > A DOCTYPE token
1622               */
1623              case 'html':
1624                  // Parse error: ignore the token.
1625                  return $this->step();
1626  
1627              /*
1628               * > A start tag whose tag name is "html"
1629               */
1630              case '+HTML':
1631                  return $this->step_in_body();
1632  
1633              /*
1634               * > A start tag whose tag name is "head"
1635               */
1636              case '+HEAD':
1637                  $this->insert_html_element( $this->state->current_token );
1638                  $this->state->head_element   = $this->state->current_token;
1639                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
1640                  return true;
1641  
1642              /*
1643               * > An end tag whose tag name is one of: "head", "body", "html", "br"
1644               * > Act as described in the "anything else" entry below.
1645               *
1646               * Closing BR tags are always reported by the Tag Processor as opening tags.
1647               */
1648              case '-HEAD':
1649              case '-BODY':
1650              case '-HTML':
1651                  goto before_head_anything_else;
1652                  break;
1653          }
1654  
1655          if ( $is_closer ) {
1656              // Parse error: ignore the token.
1657              return $this->step();
1658          }
1659  
1660          /*
1661           * > Anything else
1662           *
1663           * > Insert an HTML element for a "head" start tag token with no attributes.
1664           */
1665          before_head_anything_else:
1666          $this->state->head_element   = $this->insert_virtual_node( 'HEAD' );
1667          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
1668          return $this->step( self::REPROCESS_CURRENT_NODE );
1669      }
1670  
1671      /**
1672       * Parses next element in the 'in head' insertion mode.
1673       *
1674       * This internal function performs the 'in head' insertion mode
1675       * logic for the generalized WP_HTML_Processor::step() function.
1676       *
1677       * @since 6.7.0
1678       *
1679       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
1680       *
1681       * @see https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inhead
1682       * @see WP_HTML_Processor::step
1683       *
1684       * @return bool Whether an element was found.
1685       */
1686  	private function step_in_head(): bool {
1687          $token_name = $this->get_token_name();
1688          $token_type = $this->get_token_type();
1689          $is_closer  = parent::is_tag_closer();
1690          $op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
1691          $op         = "{$op_sigil}{$token_name}";
1692  
1693          switch ( $op ) {
1694              case '#text':
1695                  /*
1696                   * > A character token that is one of U+0009 CHARACTER TABULATION,
1697                   * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
1698                   * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
1699                   */
1700                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
1701                      // Insert the character.
1702                      $this->insert_html_element( $this->state->current_token );
1703                      return true;
1704                  }
1705  
1706                  goto in_head_anything_else;
1707                  break;
1708  
1709              /*
1710               * > A comment token
1711               */
1712              case '#comment':
1713              case '#funky-comment':
1714              case '#presumptuous-tag':
1715                  $this->insert_html_element( $this->state->current_token );
1716                  return true;
1717  
1718              /*
1719               * > A DOCTYPE token
1720               */
1721              case 'html':
1722                  // Parse error: ignore the token.
1723                  return $this->step();
1724  
1725              /*
1726               * > A start tag whose tag name is "html"
1727               */
1728              case '+HTML':
1729                  return $this->step_in_body();
1730  
1731              /*
1732               * > A start tag whose tag name is one of: "base", "basefont", "bgsound", "link"
1733               */
1734              case '+BASE':
1735              case '+BASEFONT':
1736              case '+BGSOUND':
1737              case '+LINK':
1738                  $this->insert_html_element( $this->state->current_token );
1739                  return true;
1740  
1741              /*
1742               * > A start tag whose tag name is "meta"
1743               */
1744              case '+META':
1745                  $this->insert_html_element( $this->state->current_token );
1746  
1747                  /*
1748                   * > If the active speculative HTML parser is null, then:
1749                   * >   - If the element has a charset attribute, and getting an encoding from
1750                   * >     its value results in an encoding, and the confidence is currently
1751                   * >     tentative, then change the encoding to the resulting encoding.
1752                   */
1753                  $charset = $this->get_attribute( 'charset' );
1754                  if ( is_string( $charset ) && 'tentative' === $this->state->encoding_confidence ) {
1755                      $this->bail( 'Cannot yet process META tags with charset to determine encoding.' );
1756                  }
1757  
1758                  /*
1759                   * >   - Otherwise, if the element has an http-equiv attribute whose value is
1760                   * >     an ASCII case-insensitive match for the string "Content-Type", and
1761                   * >     the element has a content attribute, and applying the algorithm for
1762                   * >     extracting a character encoding from a meta element to that attribute's
1763                   * >     value returns an encoding, and the confidence is currently tentative,
1764                   * >     then change the encoding to the extracted encoding.
1765                   */
1766                  $http_equiv = $this->get_attribute( 'http-equiv' );
1767                  $content    = $this->get_attribute( 'content' );
1768                  if (
1769                      is_string( $http_equiv ) &&
1770                      is_string( $content ) &&
1771                      0 === strcasecmp( $http_equiv, 'Content-Type' ) &&
1772                      'tentative' === $this->state->encoding_confidence
1773                  ) {
1774                      $this->bail( 'Cannot yet process META tags with http-equiv Content-Type to determine encoding.' );
1775                  }
1776  
1777                  return true;
1778  
1779              /*
1780               * > A start tag whose tag name is "title"
1781               */
1782              case '+TITLE':
1783                  $this->insert_html_element( $this->state->current_token );
1784                  return true;
1785  
1786              /*
1787               * > A start tag whose tag name is "noscript", if the scripting flag is enabled
1788               * > A start tag whose tag name is one of: "noframes", "style"
1789               *
1790               * The scripting flag is never enabled in this parser.
1791               */
1792              case '+NOFRAMES':
1793              case '+STYLE':
1794                  $this->insert_html_element( $this->state->current_token );
1795                  return true;
1796  
1797              /*
1798               * > A start tag whose tag name is "noscript", if the scripting flag is disabled
1799               */
1800              case '+NOSCRIPT':
1801                  $this->insert_html_element( $this->state->current_token );
1802                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD_NOSCRIPT;
1803                  return true;
1804  
1805              /*
1806               * > A start tag whose tag name is "script"
1807               *
1808               * @todo Could the adjusted insertion location be anything other than the current location?
1809               */
1810              case '+SCRIPT':
1811                  $this->insert_html_element( $this->state->current_token );
1812                  return true;
1813  
1814              /*
1815               * > An end tag whose tag name is "head"
1816               */
1817              case '-HEAD':
1818                  $this->state->stack_of_open_elements->pop();
1819                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_HEAD;
1820                  return true;
1821  
1822              /*
1823               * > An end tag whose tag name is one of: "body", "html", "br"
1824               *
1825               * Closing BR tags are always reported by the Tag Processor as opening tags.
1826               */
1827              case '-BODY':
1828              case '-HTML':
1829                  /*
1830                   * > Act as described in the "anything else" entry below.
1831                   */
1832                  goto in_head_anything_else;
1833                  break;
1834  
1835              /*
1836               * > A start tag whose tag name is "template"
1837               *
1838               * @todo Could the adjusted insertion location be anything other than the current location?
1839               */
1840              case '+TEMPLATE':
1841                  $this->state->active_formatting_elements->insert_marker();
1842                  $this->state->frameset_ok = false;
1843  
1844                  $this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE;
1845                  $this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE;
1846  
1847                  $this->insert_html_element( $this->state->current_token );
1848                  return true;
1849  
1850              /*
1851               * > An end tag whose tag name is "template"
1852               */
1853              case '-TEMPLATE':
1854                  if ( ! $this->state->stack_of_open_elements->contains( 'TEMPLATE' ) ) {
1855                      // @todo Indicate a parse error once it's possible.
1856                      return $this->step();
1857                  }
1858  
1859                  $this->generate_implied_end_tags_thoroughly();
1860                  if ( ! $this->state->stack_of_open_elements->current_node_is( 'TEMPLATE' ) ) {
1861                      // @todo Indicate a parse error once it's possible.
1862                  }
1863  
1864                  $this->state->stack_of_open_elements->pop_until( 'TEMPLATE' );
1865                  $this->state->active_formatting_elements->clear_up_to_last_marker();
1866                  array_pop( $this->state->stack_of_template_insertion_modes );
1867                  $this->reset_insertion_mode_appropriately();
1868                  return true;
1869          }
1870  
1871          /*
1872           * > A start tag whose tag name is "head"
1873           * > Any other end tag
1874           */
1875          if ( '+HEAD' === $op || $is_closer ) {
1876              // Parse error: ignore the token.
1877              return $this->step();
1878          }
1879  
1880          /*
1881           * > Anything else
1882           */
1883          in_head_anything_else:
1884          $this->state->stack_of_open_elements->pop();
1885          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_HEAD;
1886          return $this->step( self::REPROCESS_CURRENT_NODE );
1887      }
1888  
1889      /**
1890       * Parses next element in the 'in head noscript' insertion mode.
1891       *
1892       * This internal function performs the 'in head noscript' insertion mode
1893       * logic for the generalized WP_HTML_Processor::step() function.
1894       *
1895       * @since 6.7.0
1896       *
1897       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
1898       *
1899       * @see https://html.spec.whatwg.org/#parsing-main-inheadnoscript
1900       * @see WP_HTML_Processor::step
1901       *
1902       * @return bool Whether an element was found.
1903       */
1904  	private function step_in_head_noscript(): bool {
1905          $token_name = $this->get_token_name();
1906          $token_type = $this->get_token_type();
1907          $is_closer  = parent::is_tag_closer();
1908          $op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
1909          $op         = "{$op_sigil}{$token_name}";
1910  
1911          switch ( $op ) {
1912              /*
1913               * > A character token that is one of U+0009 CHARACTER TABULATION,
1914               * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
1915               * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
1916               *
1917               * Process the token using the rules for the "in head" insertion mode.
1918               */
1919              case '#text':
1920                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
1921                      return $this->step_in_head();
1922                  }
1923  
1924                  goto in_head_noscript_anything_else;
1925                  break;
1926  
1927              /*
1928               * > A DOCTYPE token
1929               */
1930              case 'html':
1931                  // Parse error: ignore the token.
1932                  return $this->step();
1933  
1934              /*
1935               * > A start tag whose tag name is "html"
1936               */
1937              case '+HTML':
1938                  return $this->step_in_body();
1939  
1940              /*
1941               * > An end tag whose tag name is "noscript"
1942               */
1943              case '-NOSCRIPT':
1944                  $this->state->stack_of_open_elements->pop();
1945                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
1946                  return true;
1947  
1948              /*
1949               * > A comment token
1950               * >
1951               * > A start tag whose tag name is one of: "basefont", "bgsound",
1952               * > "link", "meta", "noframes", "style"
1953               */
1954              case '#comment':
1955              case '#funky-comment':
1956              case '#presumptuous-tag':
1957              case '+BASEFONT':
1958              case '+BGSOUND':
1959              case '+LINK':
1960              case '+META':
1961              case '+NOFRAMES':
1962              case '+STYLE':
1963                  return $this->step_in_head();
1964  
1965              /*
1966               * > An end tag whose tag name is "br"
1967               *
1968               * This should never happen, as the Tag Processor prevents showing a BR closing tag.
1969               */
1970          }
1971  
1972          /*
1973           * > A start tag whose tag name is one of: "head", "noscript"
1974           * > Any other end tag
1975           */
1976          if ( '+HEAD' === $op || '+NOSCRIPT' === $op || $is_closer ) {
1977              // Parse error: ignore the token.
1978              return $this->step();
1979          }
1980  
1981          /*
1982           * > Anything else
1983           *
1984           * Anything here is a parse error.
1985           */
1986          in_head_noscript_anything_else:
1987          $this->state->stack_of_open_elements->pop();
1988          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
1989          return $this->step( self::REPROCESS_CURRENT_NODE );
1990      }
1991  
1992      /**
1993       * Parses next element in the 'after head' insertion mode.
1994       *
1995       * This internal function performs the 'after head' insertion mode
1996       * logic for the generalized WP_HTML_Processor::step() function.
1997       *
1998       * @since 6.7.0
1999       *
2000       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
2001       *
2002       * @see https://html.spec.whatwg.org/#the-after-head-insertion-mode
2003       * @see WP_HTML_Processor::step
2004       *
2005       * @return bool Whether an element was found.
2006       */
2007  	private function step_after_head(): bool {
2008          $token_name = $this->get_token_name();
2009          $token_type = $this->get_token_type();
2010          $is_closer  = parent::is_tag_closer();
2011          $op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
2012          $op         = "{$op_sigil}{$token_name}";
2013  
2014          switch ( $op ) {
2015              /*
2016               * > A character token that is one of U+0009 CHARACTER TABULATION,
2017               * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
2018               * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
2019               */
2020              case '#text':
2021                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
2022                      // Insert the character.
2023                      $this->insert_html_element( $this->state->current_token );
2024                      return true;
2025                  }
2026                  goto after_head_anything_else;
2027                  break;
2028  
2029              /*
2030               * > A comment token
2031               */
2032              case '#comment':
2033              case '#funky-comment':
2034              case '#presumptuous-tag':
2035                  $this->insert_html_element( $this->state->current_token );
2036                  return true;
2037  
2038              /*
2039               * > A DOCTYPE token
2040               */
2041              case 'html':
2042                  // Parse error: ignore the token.
2043                  return $this->step();
2044  
2045              /*
2046               * > A start tag whose tag name is "html"
2047               */
2048              case '+HTML':
2049                  return $this->step_in_body();
2050  
2051              /*
2052               * > A start tag whose tag name is "body"
2053               */
2054              case '+BODY':
2055                  $this->insert_html_element( $this->state->current_token );
2056                  $this->state->frameset_ok    = false;
2057                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
2058                  return true;
2059  
2060              /*
2061               * > A start tag whose tag name is "frameset"
2062               */
2063              case '+FRAMESET':
2064                  $this->insert_html_element( $this->state->current_token );
2065                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_FRAMESET;
2066                  return true;
2067  
2068              /*
2069               * > A start tag whose tag name is one of: "base", "basefont", "bgsound",
2070               * > "link", "meta", "noframes", "script", "style", "template", "title"
2071               *
2072               * Anything here is a parse error.
2073               */
2074              case '+BASE':
2075              case '+BASEFONT':
2076              case '+BGSOUND':
2077              case '+LINK':
2078              case '+META':
2079              case '+NOFRAMES':
2080              case '+SCRIPT':
2081              case '+STYLE':
2082              case '+TEMPLATE':
2083              case '+TITLE':
2084                  /*
2085                   * > Push the node pointed to by the head element pointer onto the stack of open elements.
2086                   * > Process the token using the rules for the "in head" insertion mode.
2087                   * > Remove the node pointed to by the head element pointer from the stack of open elements. (It might not be the current node at this point.)
2088                   */
2089                  $this->bail( 'Cannot process elements after HEAD which reopen the HEAD element.' );
2090                  /*
2091                   * Do not leave this break in when adding support; it's here to prevent
2092                   * WPCS from getting confused at the switch structure without a return,
2093                   * because it doesn't know that `bail()` always throws.
2094                   */
2095                  break;
2096  
2097              /*
2098               * > An end tag whose tag name is "template"
2099               */
2100              case '-TEMPLATE':
2101                  return $this->step_in_head();
2102  
2103              /*
2104               * > An end tag whose tag name is one of: "body", "html", "br"
2105               *
2106               * Closing BR tags are always reported by the Tag Processor as opening tags.
2107               */
2108              case '-BODY':
2109              case '-HTML':
2110                  /*
2111                   * > Act as described in the "anything else" entry below.
2112                   */
2113                  goto after_head_anything_else;
2114                  break;
2115          }
2116  
2117          /*
2118           * > A start tag whose tag name is "head"
2119           * > Any other end tag
2120           */
2121          if ( '+HEAD' === $op || $is_closer ) {
2122              // Parse error: ignore the token.
2123              return $this->step();
2124          }
2125  
2126          /*
2127           * > Anything else
2128           * > Insert an HTML element for a "body" start tag token with no attributes.
2129           */
2130          after_head_anything_else:
2131          $this->insert_virtual_node( 'BODY' );
2132          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
2133          return $this->step( self::REPROCESS_CURRENT_NODE );
2134      }
2135  
2136      /**
2137       * Parses next element in the 'in body' insertion mode.
2138       *
2139       * This internal function performs the 'in body' insertion mode
2140       * logic for the generalized WP_HTML_Processor::step() function.
2141       *
2142       * @since 6.4.0
2143       *
2144       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
2145       *
2146       * @see https://html.spec.whatwg.org/#parsing-main-inbody
2147       * @see WP_HTML_Processor::step
2148       *
2149       * @return bool Whether an element was found.
2150       */
2151  	private function step_in_body(): bool {
2152          $token_name = $this->get_token_name();
2153          $token_type = $this->get_token_type();
2154          $op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
2155          $op         = "{$op_sigil}{$token_name}";
2156  
2157          switch ( $op ) {
2158              case '#text':
2159                  /*
2160                   * > A character token that is U+0000 NULL
2161                   *
2162                   * Any successive sequence of NULL bytes is ignored and won't
2163                   * trigger active format reconstruction. Therefore, if the text
2164                   * only comprises NULL bytes then the token should be ignored
2165                   * here, but if there are any other characters in the stream
2166                   * the active formats should be reconstructed.
2167                   */
2168                  if ( parent::TEXT_IS_NULL_SEQUENCE === $this->text_node_classification ) {
2169                      // Parse error: ignore the token.
2170                      return $this->step();
2171                  }
2172  
2173                  $this->reconstruct_active_formatting_elements();
2174  
2175                  /*
2176                   * Whitespace-only text does not affect the frameset-ok flag.
2177                   * It is probably inter-element whitespace, but it may also
2178                   * contain character references which decode only to whitespace.
2179                   */
2180                  if ( parent::TEXT_IS_GENERIC === $this->text_node_classification ) {
2181                      $this->state->frameset_ok = false;
2182                  }
2183  
2184                  $this->insert_html_element( $this->state->current_token );
2185                  return true;
2186  
2187              case '#comment':
2188              case '#funky-comment':
2189              case '#presumptuous-tag':
2190                  $this->insert_html_element( $this->state->current_token );
2191                  return true;
2192  
2193              /*
2194               * > A DOCTYPE token
2195               * > Parse error. Ignore the token.
                   *
                   * DOCTYPE tokens are reported under the token name `html`.
2196               */
2197              case 'html':
2198                  return $this->step();
2199  
2200              /*
2201               * > A start tag whose tag name is "html"
2202               */
2203              case '+HTML':
2204                  if ( ! $this->state->stack_of_open_elements->contains( 'TEMPLATE' ) ) {
2205                      /*
2206                       * > Otherwise, for each attribute on the token, check to see if the attribute
2207                       * > is already present on the top element of the stack of open elements. If
2208                       * > it is not, add the attribute and its corresponding value to that element.
2209                       *
2210                       * This parser does not currently support this behavior: ignore the token.
2211                       */
2212                  }
2213  
2214                  // Ignore the token.
2215                  return $this->step();
2216  
2217              /*
2218               * > A start tag whose tag name is one of: "base", "basefont", "bgsound", "link",
2219               * > "meta", "noframes", "script", "style", "template", "title"
2220               * >
2221               * > An end tag whose tag name is "template"
2222               */
2223              case '+BASE':
2224              case '+BASEFONT':
2225              case '+BGSOUND':
2226              case '+LINK':
2227              case '+META':
2228              case '+NOFRAMES':
2229              case '+SCRIPT':
2230              case '+STYLE':
2231              case '+TEMPLATE':
2232              case '+TITLE':
2233              case '-TEMPLATE':
2234                  return $this->step_in_head();
2235  
2236              /*
2237               * > A start tag whose tag name is "body"
2238               *
2239               * This tag in the IN BODY insertion mode is a parse error.
2240               */
2241              case '+BODY':
2242                  if (
2243                      1 === $this->state->stack_of_open_elements->count() ||
2244                      'BODY' !== ( $this->state->stack_of_open_elements->at( 2 )->node_name ?? null ) ||
2245                      $this->state->stack_of_open_elements->contains( 'TEMPLATE' )
2246                  ) {
2247                      // Ignore the token.
2248                      return $this->step();
2249                  }
2250  
2251                  /*
2252                   * > Otherwise, set the frameset-ok flag to "not ok"; then, for each attribute
2253                   * > on the token, check to see if the attribute is already present on the body
2254                   * > element (the second element) on the stack of open elements, and if it is
2255                   * > not, add the attribute and its corresponding value to that element.
2256                   *
2257                   * This parser does not currently support this behavior: ignore the token.
2258                   */
2259                  $this->state->frameset_ok = false;
2260                  return $this->step();
2261  
2262              /*
2263               * > A start tag whose tag name is "frameset"
2264               *
2265               * This tag in the IN BODY insertion mode is a parse error.
2266               */
2267              case '+FRAMESET':
2268                  if (
2269                      1 === $this->state->stack_of_open_elements->count() ||
2270                      'BODY' !== ( $this->state->stack_of_open_elements->at( 2 )->node_name ?? null ) ||
2271                      false === $this->state->frameset_ok
2272                  ) {
2273                      // Ignore the token.
2274                      return $this->step();
2275                  }
2276  
2277                  /*
2278                   * > Otherwise, run the following steps:
2279                   */
2280                  $this->bail( 'Cannot process non-ignored FRAMESET tags.' );
2281                  break;
2282  
2283              /*
2284               * > An end tag whose tag name is "body"
2285               */
2286              case '-BODY':
2287                  if ( ! $this->state->stack_of_open_elements->has_element_in_scope( 'BODY' ) ) {
2288                      // Parse error: ignore the token.
2289                      return $this->step();
2290                  }
2291  
2292                  /*
2293                   * > Otherwise, if there is a node in the stack of open elements that is not either a
2294                   * > dd element, a dt element, an li element, an optgroup element, an option element,
2295                   * > a p element, an rb element, an rp element, an rt element, an rtc element, a tbody
2296                   * > element, a td element, a tfoot element, a th element, a thead element, a tr
2297                   * > element, the body element, or the html element, then this is a parse error.
2298                   *
2299                   * There is nothing to do for this parse error, so don't check for it.
2300                   */
2301  
2302                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_BODY;
2303                  return true;
2304  
2305              /*
2306               * > An end tag whose tag name is "html"
2307               */
2308              case '-HTML':
2309                  if ( ! $this->state->stack_of_open_elements->has_element_in_scope( 'BODY' ) ) {
2310                      // Parse error: ignore the token.
2311                      return $this->step();
2312                  }
2313  
2314                  /*
2315                   * > Otherwise, if there is a node in the stack of open elements that is not either a
2316                   * > dd element, a dt element, an li element, an optgroup element, an option element,
2317                   * > a p element, an rb element, an rp element, an rt element, an rtc element, a tbody
2318                   * > element, a td element, a tfoot element, a th element, a thead element, a tr
2319                   * > element, the body element, or the html element, then this is a parse error.
2320                   *
2321                   * There is nothing to do for this parse error, so don't check for it.
2322                   */
2323  
2324                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_BODY;
2325                  return $this->step( self::REPROCESS_CURRENT_NODE );
2326  
2327              /*
2328               * > A start tag whose tag name is one of: "address", "article", "aside",
2329               * > "blockquote", "center", "details", "dialog", "dir", "div", "dl",
2330               * > "fieldset", "figcaption", "figure", "footer", "header", "hgroup",
2331               * > "main", "menu", "nav", "ol", "p", "search", "section", "summary", "ul"
2332               */
2333              case '+ADDRESS':
2334              case '+ARTICLE':
2335              case '+ASIDE':
2336              case '+BLOCKQUOTE':
2337              case '+CENTER':
2338              case '+DETAILS':
2339              case '+DIALOG':
2340              case '+DIR':
2341              case '+DIV':
2342              case '+DL':
2343              case '+FIELDSET':
2344              case '+FIGCAPTION':
2345              case '+FIGURE':
2346              case '+FOOTER':
2347              case '+HEADER':
2348              case '+HGROUP':
2349              case '+MAIN':
2350              case '+MENU':
2351              case '+NAV':
2352              case '+OL':
2353              case '+P':
2354              case '+SEARCH':
2355              case '+SECTION':
2356              case '+SUMMARY':
2357              case '+UL':
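                      /*
                       * For illustration, assuming a fragment parsed with `create_fragment()`
                       * in its default BODY context: the DIV below closes the open P, so the
                       * DIV becomes a sibling of the P rather than its child.
                       *
                       *     $processor = WP_HTML_Processor::create_fragment( '<p>One<div>Two</div>' );
                       *     $processor->next_tag( array( 'breadcrumbs' => array( 'DIV' ) ) );
                       *     $processor->get_breadcrumbs(); // array( 'HTML', 'BODY', 'DIV' )
                       */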
2358                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2359                      $this->close_a_p_element();
2360                  }
2361  
2362                  $this->insert_html_element( $this->state->current_token );
2363                  return true;
2364  
2365              /*
2366               * > A start tag whose tag name is one of: "h1", "h2", "h3", "h4", "h5", "h6"
2367               */
2368              case '+H1':
2369              case '+H2':
2370              case '+H3':
2371              case '+H4':
2372              case '+H5':
2373              case '+H6':
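                      /*
                       * For illustration, assuming a fragment parsed with `create_fragment()`
                       * in its default BODY context: headings do not nest directly, so the
                       * H2 below pops the still-open H1 before being inserted.
                       *
                       *     $processor = WP_HTML_Processor::create_fragment( '<h1>One<h2>Two' );
                       *     $processor->next_tag( array( 'breadcrumbs' => array( 'H2' ) ) );
                       *     $processor->get_breadcrumbs(); // array( 'HTML', 'BODY', 'H2' )
                       */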
2374                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2375                      $this->close_a_p_element();
2376                  }
2377  
2378                  if (
2379                      in_array(
2380                          $this->state->stack_of_open_elements->current_node()->node_name,
2381                          array( 'H1', 'H2', 'H3', 'H4', 'H5', 'H6' ),
2382                          true
2383                      )
2384                  ) {
2385                      // @todo Indicate a parse error once it's possible.
2386                      $this->state->stack_of_open_elements->pop();
2387                  }
2388  
2389                  $this->insert_html_element( $this->state->current_token );
2390                  return true;
2391  
2392              /*
2393               * > A start tag whose tag name is one of: "pre", "listing"
2394               */
2395              case '+PRE':
2396              case '+LISTING':
2397                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2398                      $this->close_a_p_element();
2399                  }
2400  
2401                  /*
2402                   * > If the next token is a U+000A LINE FEED (LF) character token,
2403                   * > then ignore that token and move on to the next one. (Newlines
2404                   * > at the start of pre blocks are ignored as an authoring convenience.)
2405                   *
2406                   * This is handled in `get_modifiable_text()`.
2407                   */
2408  
2409                  $this->insert_html_element( $this->state->current_token );
2410                  $this->state->frameset_ok = false;
2411                  return true;
2412  
2413              /*
2414               * > A start tag whose tag name is "form"
2415               */
2416              case '+FORM':
2417                  $stack_contains_template = $this->state->stack_of_open_elements->contains( 'TEMPLATE' );
2418  
2419                  if ( isset( $this->state->form_element ) && ! $stack_contains_template ) {
2420                      // Parse error: ignore the token.
2421                      return $this->step();
2422                  }
2423  
2424                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2425                      $this->close_a_p_element();
2426                  }
2427  
2428                  $this->insert_html_element( $this->state->current_token );
2429                  if ( ! $stack_contains_template ) {
2430                      $this->state->form_element = $this->state->current_token;
2431                  }
2432  
2433                  return true;
2434  
2435              /*
2436               * > A start tag whose tag name is "li"
2437               * > A start tag whose tag name is one of: "dd", "dt"
2438               */
2439              case '+DD':
2440              case '+DT':
2441              case '+LI':
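                      /*
                       * For illustration, assuming a fragment parsed with `create_fragment()`
                       * in its default BODY context: the second LI below closes the first,
                       * so both are children of the UL instead of being nested.
                       *
                       *     $processor = WP_HTML_Processor::create_fragment( '<ul><li>One<li>Two' );
                       *     $processor->next_tag( array( 'breadcrumbs' => array( 'LI' ) ) ); // First LI.
                       *     $processor->next_tag( array( 'breadcrumbs' => array( 'LI' ) ) ); // Second LI.
                       *     $processor->get_breadcrumbs(); // array( 'HTML', 'BODY', 'UL', 'LI' )
                       */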
2442                  $this->state->frameset_ok = false;
2443                  $node                     = $this->state->stack_of_open_elements->current_node();
2444                  $is_li                    = 'LI' === $token_name;
2445  
2446                  in_body_list_loop:
2447                  /*
2448                   * The logic for LI and DT/DD is the same except for one point: LI elements _only_
2449                   * close other LI elements, but a DT or DD element closes _any_ open DT or DD element.
2450                   */
2451                  if ( $is_li ? 'LI' === $node->node_name : ( 'DD' === $node->node_name || 'DT' === $node->node_name ) ) {
2452                      $node_name = $is_li ? 'LI' : $node->node_name;
2453                      $this->generate_implied_end_tags( $node_name );
2454                      if ( ! $this->state->stack_of_open_elements->current_node_is( $node_name ) ) {
2455                          // @todo Indicate a parse error once it's possible. This error does not impact the logic here.
2456                      }
2457  
2458                      $this->state->stack_of_open_elements->pop_until( $node_name );
2459                      goto in_body_list_done;
2460                  }
2461  
2462                  if (
2463                      'ADDRESS' !== $node->node_name &&
2464                      'DIV' !== $node->node_name &&
2465                      'P' !== $node->node_name &&
2466                      self::is_special( $node )
2467                  ) {
2468                      /*
2469                       * > If node is in the special category, but is not an address, div,
2470                       * > or p element, then jump to the step labeled done below.
2471                       */
2472                      goto in_body_list_done;
2473                  } else {
2474                      /*
2475                       * > Otherwise, set node to the previous entry in the stack of open elements
2476                       * > and return to the step labeled loop.
2477                       */
2478                      foreach ( $this->state->stack_of_open_elements->walk_up( $node ) as $item ) {
2479                          $node = $item;
2480                          break;
2481                      }
2482                      goto in_body_list_loop;
2483                  }
2484  
2485                  in_body_list_done:
2486                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2487                      $this->close_a_p_element();
2488                  }
2489  
2490                  $this->insert_html_element( $this->state->current_token );
2491                  return true;
2492  
                  /*
                   * > A start tag whose tag name is "plaintext"
                   */
2493              case '+PLAINTEXT':
2494                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2495                      $this->close_a_p_element();
2496                  }
2497  
2498                  /*
2499                   * @todo This may need to be handled in the Tag Processor and turn into
2500                   *       a single self-contained tag like TEXTAREA, whose modifiable text
2501                   *       is the rest of the input document as plaintext.
2502                   */
2503                  $this->bail( 'Cannot process PLAINTEXT elements.' );
2504                  break;
2505  
2506              /*
2507               * > A start tag whose tag name is "button"
2508               */
2509              case '+BUTTON':
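                      /*
                       * For illustration, assuming a fragment parsed with `create_fragment()`
                       * in its default BODY context: a BUTTON cannot contain another BUTTON,
                       * so the second one below closes the first before being inserted.
                       *
                       *     $processor = WP_HTML_Processor::create_fragment( '<button>One<button>Two' );
                       *     $processor->next_tag( array( 'breadcrumbs' => array( 'BUTTON' ) ) ); // First BUTTON.
                       *     $processor->next_tag( array( 'breadcrumbs' => array( 'BUTTON' ) ) ); // Second BUTTON.
                       *     $processor->get_breadcrumbs(); // array( 'HTML', 'BODY', 'BUTTON' )
                       */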
2510                  if ( $this->state->stack_of_open_elements->has_element_in_scope( 'BUTTON' ) ) {
2511                      // @todo Indicate a parse error once it's possible. This error does not impact the logic here.
2512                      $this->generate_implied_end_tags();
2513                      $this->state->stack_of_open_elements->pop_until( 'BUTTON' );
2514                  }
2515  
2516                  $this->reconstruct_active_formatting_elements();
2517                  $this->insert_html_element( $this->state->current_token );
2518                  $this->state->frameset_ok = false;
2519  
2520                  return true;
2521  
2522              /*
2523               * > An end tag whose tag name is one of: "address", "article", "aside", "blockquote",
2524               * > "button", "center", "details", "dialog", "dir", "div", "dl", "fieldset",
2525               * > "figcaption", "figure", "footer", "header", "hgroup", "listing", "main",
2526               * > "menu", "nav", "ol", "pre", "search", "section", "summary", "ul"
2527               */
2528              case '-ADDRESS':
2529              case '-ARTICLE':
2530              case '-ASIDE':
2531              case '-BLOCKQUOTE':
2532              case '-BUTTON':
2533              case '-CENTER':
2534              case '-DETAILS':
2535              case '-DIALOG':
2536              case '-DIR':
2537              case '-DIV':
2538              case '-DL':
2539              case '-FIELDSET':
2540              case '-FIGCAPTION':
2541              case '-FIGURE':
2542              case '-FOOTER':
2543              case '-HEADER':
2544              case '-HGROUP':
2545              case '-LISTING':
2546              case '-MAIN':
2547              case '-MENU':
2548              case '-NAV':
2549              case '-OL':
2550              case '-PRE':
2551              case '-SEARCH':
2552              case '-SECTION':
2553              case '-SUMMARY':
2554              case '-UL':
2555                  if ( ! $this->state->stack_of_open_elements->has_element_in_scope( $token_name ) ) {
2556                      // @todo Report parse error.
2557                      // Ignore the token.
2558                      return $this->step();
2559                  }
2560  
2561                  $this->generate_implied_end_tags();
2562                  if ( ! $this->state->stack_of_open_elements->current_node_is( $token_name ) ) {
2563                      // @todo Record parse error: this error doesn't impact parsing.
2564                  }
2565                  $this->state->stack_of_open_elements->pop_until( $token_name );
2566                  return true;
2567  
2568              /*
2569               * > An end tag whose tag name is "form"
2570               */
2571              case '-FORM':
2572                  if ( ! $this->state->stack_of_open_elements->contains( 'TEMPLATE' ) ) {
2573                      $node                      = $this->state->form_element;
2574                      $this->state->form_element = null;
2575  
2576                      /*
2577                       * > If node is null or if the stack of open elements does not have node
2578                       * > in scope, then this is a parse error; return and ignore the token.
2579                       *
2580                       * @todo It's necessary to check if the form token itself is in scope, not
2581                       *       simply whether any FORM is in scope.
2582                       */
2583                      if (
2584                          null === $node ||
2585                          ! $this->state->stack_of_open_elements->has_element_in_scope( 'FORM' )
2586                      ) {
2587                          // Parse error: ignore the token.
2588                          return $this->step();
2589                      }
2590  
2591                      $this->generate_implied_end_tags();
2592                      if ( $node !== $this->state->stack_of_open_elements->current_node() ) {
2593                          // @todo Indicate a parse error once it's possible. This error does not impact the logic here.
2594                          $this->bail( 'Cannot close a FORM when other elements remain open as this would throw off the breadcrumbs for the following tokens.' );
2595                      }
2596  
2597                      $this->state->stack_of_open_elements->remove_node( $node );
2598                      return true;
2599                  } else {
2600                      /*
2601                       * > If the stack of open elements does not have a form element in scope,
2602                       * > then this is a parse error; return and ignore the token.
2603                       *
2604                       * Note that unlike in the clause above, this is checking for any FORM in scope.
2605                       */
2606                      if ( ! $this->state->stack_of_open_elements->has_element_in_scope( 'FORM' ) ) {
2607                          // Parse error: ignore the token.
2608                          return $this->step();
2609                      }
2610  
2611                      $this->generate_implied_end_tags();
2612  
2613                      if ( ! $this->state->stack_of_open_elements->current_node_is( 'FORM' ) ) {
2614                          // @todo Indicate a parse error once it's possible. This error does not impact the logic here.
2615                      }
2616  
2617                      $this->state->stack_of_open_elements->pop_until( 'FORM' );
2618                      return true;
2619                  }
2620                  break;
2621  
2622              /*
2623               * > An end tag whose tag name is "p"
2624               */
2625              case '-P':
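2626                  if ( ! $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
                          /*
                           * > If the stack of open elements does not have a p element in button
                           * > scope, then this is a parse error; insert an HTML element for a "p"
                           * > start tag token with no attributes.
                           */
2627                      $this->insert_html_element( $this->state->current_token );
2628                  }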
2629  
2630                  $this->close_a_p_element();
2631                  return true;
2632  
2633              /*
2634               * > An end tag whose tag name is "li"
2635               * > An end tag whose tag name is one of: "dd", "dt"
2636               */
2637              case '-DD':
2638              case '-DT':
2639              case '-LI':
2640                  if (
2641                      /*
2642                       * An end tag whose tag name is "li":
2643                       * If the stack of open elements does not have an li element in list item scope,
2644                       * then this is a parse error; ignore the token.
2645                       */
2646                      (
2647                          'LI' === $token_name &&
2648                          ! $this->state->stack_of_open_elements->has_element_in_list_item_scope( 'LI' )
2649                      ) ||
2650                      /*
2651                       * An end tag whose tag name is one of: "dd", "dt":
2652                       * If the stack of open elements does not have an element in scope that is an
2653                       * HTML element with the same tag name as that of the token, then this is a
2654                       * parse error; ignore the token.
2655                       */
2656                      (
2657                          'LI' !== $token_name &&
2658                          ! $this->state->stack_of_open_elements->has_element_in_scope( $token_name )
2659                      )
2660                  ) {
2661                      /*
2662                       * This is a parse error, ignore the token.
2663                       *
2664                       * @todo Indicate a parse error once it's possible.
2665                       */
2666                      return $this->step();
2667                  }
2668  
2669                  $this->generate_implied_end_tags( $token_name );
2670  
2671                  if ( ! $this->state->stack_of_open_elements->current_node_is( $token_name ) ) {
2672                      // @todo Indicate a parse error once it's possible. This error does not impact the logic here.
2673                  }
2674  
2675                  $this->state->stack_of_open_elements->pop_until( $token_name );
2676                  return true;
2677  
2678              /*
2679               * > An end tag whose tag name is one of: "h1", "h2", "h3", "h4", "h5", "h6"
2680               */
2681              case '-H1':
2682              case '-H2':
2683              case '-H3':
2684              case '-H4':
2685              case '-H5':
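2686              case '-H6':
                      /*
                       * The special name passed below is not a tag name: it appears to be an
                       * internal sentinel that matches any element from H1 through H6, since
                       * a heading end tag may close a heading of a different level.
                       */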
2687                  if ( ! $this->state->stack_of_open_elements->has_element_in_scope( '(internal: H1 through H6 - do not use)' ) ) {
2688                      /*
2689                       * This is a parse error; ignore the token.
2690                       *
2691                       * @todo Indicate a parse error once it's possible.
2692                       */
2693                      return $this->step();
2694                  }
2695  
2696                  $this->generate_implied_end_tags();
2697  
2698                  if ( ! $this->state->stack_of_open_elements->current_node_is( $token_name ) ) {
2699                      // @todo Record parse error: this error doesn't impact parsing.
2700                  }
2701  
2702                  $this->state->stack_of_open_elements->pop_until( '(internal: H1 through H6 - do not use)' );
2703                  return true;
2704  
2705              /*
2706               * > A start tag whose tag name is "a"
2707               */
2708              case '+A':
2709                  foreach ( $this->state->active_formatting_elements->walk_up() as $item ) {
2710                      switch ( $item->node_name ) {
2711                          case 'marker':
2712                              break 2;
2713  
2714                          case 'A':
2715                              $this->run_adoption_agency_algorithm();
2716                              $this->state->active_formatting_elements->remove_node( $item );
2717                              $this->state->stack_of_open_elements->remove_node( $item );
2718                              break 2;
2719                      }
2720                  }
2721  
2722                  $this->reconstruct_active_formatting_elements();
2723                  $this->insert_html_element( $this->state->current_token );
2724                  $this->state->active_formatting_elements->push( $this->state->current_token );
2725                  return true;
2726  
2727              /*
2728               * > A start tag whose tag name is one of: "b", "big", "code", "em", "font", "i",
2729               * > "s", "small", "strike", "strong", "tt", "u"
2730               */
2731              case '+B':
2732              case '+BIG':
2733              case '+CODE':
2734              case '+EM':
2735              case '+FONT':
2736              case '+I':
2737              case '+S':
2738              case '+SMALL':
2739              case '+STRIKE':
2740              case '+STRONG':
2741              case '+TT':
2742              case '+U':
2743                  $this->reconstruct_active_formatting_elements();
2744                  $this->insert_html_element( $this->state->current_token );
2745                  $this->state->active_formatting_elements->push( $this->state->current_token );
2746                  return true;
2747  
2748              /*
2749               * > A start tag whose tag name is "nobr"
2750               */
2751              case '+NOBR':
2752                  $this->reconstruct_active_formatting_elements();
2753  
2754                  if ( $this->state->stack_of_open_elements->has_element_in_scope( 'NOBR' ) ) {
2755                      // Parse error.
2756                      $this->run_adoption_agency_algorithm();
2757                      $this->reconstruct_active_formatting_elements();
2758                  }
2759  
2760                  $this->insert_html_element( $this->state->current_token );
2761                  $this->state->active_formatting_elements->push( $this->state->current_token );
2762                  return true;
2763  
2764              /*
2765               * > An end tag whose tag name is one of: "a", "b", "big", "code", "em", "font", "i",
2766               * > "nobr", "s", "small", "strike", "strong", "tt", "u"
2767               */
2768              case '-A':
2769              case '-B':
2770              case '-BIG':
2771              case '-CODE':
2772              case '-EM':
2773              case '-FONT':
2774              case '-I':
2775              case '-NOBR':
2776              case '-S':
2777              case '-SMALL':
2778              case '-STRIKE':
2779              case '-STRONG':
2780              case '-TT':
2781              case '-U':
2782                  $this->run_adoption_agency_algorithm();
2783                  return true;
2784  
2785              /*
2786               * > A start tag whose tag name is one of: "applet", "marquee", "object"
2787               */
2788              case '+APPLET':
2789              case '+MARQUEE':
2790              case '+OBJECT':
2791                  $this->reconstruct_active_formatting_elements();
2792                  $this->insert_html_element( $this->state->current_token );
2793                  $this->state->active_formatting_elements->insert_marker();
2794                  $this->state->frameset_ok = false;
2795                  return true;
2796  
2797              /*
2798               * > An end tag token whose tag name is one of: "applet", "marquee", "object"
2799               */
2800              case '-APPLET':
2801              case '-MARQUEE':
2802              case '-OBJECT':
2803                  if ( ! $this->state->stack_of_open_elements->has_element_in_scope( $token_name ) ) {
2804                      // Parse error: ignore the token.
2805                      return $this->step();
2806                  }
2807  
2808                  $this->generate_implied_end_tags();
2809                  if ( ! $this->state->stack_of_open_elements->current_node_is( $token_name ) ) {
2810                      // This is a parse error.
2811                  }
2812  
2813                  $this->state->stack_of_open_elements->pop_until( $token_name );
2814                  $this->state->active_formatting_elements->clear_up_to_last_marker();
2815                  return true;
2816  
2817              /*
2818               * > A start tag whose tag name is "table"
2819               */
2820              case '+TABLE':
2821                  /*
2822                   * > If the Document is not set to quirks mode, and the stack of open elements
2823                   * > has a p element in button scope, then close a p element.
2824                   */
2825                  if (
2826                      WP_HTML_Tag_Processor::QUIRKS_MODE !== $this->compat_mode &&
2827                      $this->state->stack_of_open_elements->has_p_in_button_scope()
2828                  ) {
2829                      $this->close_a_p_element();
2830                  }
2831  
2832                  $this->insert_html_element( $this->state->current_token );
2833                  $this->state->frameset_ok    = false;
2834                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
2835                  return true;
2836  
2837              /*
2838               * > An end tag whose tag name is "br"
2839               *
2840               * This is prevented from happening because the Tag Processor
2841               * reports all closing BR tags as if they were opening tags.
2842               */
2843  
2844              /*
2845               * > A start tag whose tag name is one of: "area", "br", "embed", "img", "keygen", "wbr"
2846               */
2847              case '+AREA':
2848              case '+BR':
2849              case '+EMBED':
2850              case '+IMG':
2851              case '+KEYGEN':
2852              case '+WBR':
2853                  $this->reconstruct_active_formatting_elements();
2854                  $this->insert_html_element( $this->state->current_token );
2855                  $this->state->frameset_ok = false;
2856                  return true;
2857  
2858              /*
2859               * > A start tag whose tag name is "input"
2860               */
2861              case '+INPUT':
2862                  $this->reconstruct_active_formatting_elements();
2863                  $this->insert_html_element( $this->state->current_token );
2864  
2865                  /*
2866                   * > If the token does not have an attribute with the name "type", or if it does,
2867                   * > but that attribute's value is not an ASCII case-insensitive match for the
2868                   * > string "hidden", then: set the frameset-ok flag to "not ok".
2869                   */
2870                  $type_attribute = $this->get_attribute( 'type' );
2871                  if ( ! is_string( $type_attribute ) || 'hidden' !== strtolower( $type_attribute ) ) {
2872                      $this->state->frameset_ok = false;
2873                  }
2874  
2875                  return true;
2876  
2877              /*
2878               * > A start tag whose tag name is one of: "param", "source", "track"
2879               */
2880              case '+PARAM':
2881              case '+SOURCE':
2882              case '+TRACK':
2883                  $this->insert_html_element( $this->state->current_token );
2884                  return true;
2885  
2886              /*
2887               * > A start tag whose tag name is "hr"
2888               */
2889              case '+HR':
2890                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2891                      $this->close_a_p_element();
2892                  }
2893                  $this->insert_html_element( $this->state->current_token );
2894                  $this->state->frameset_ok = false;
2895                  return true;
2896  
2897              /*
2898               * > A start tag whose tag name is "image"
2899               */
2900              case '+IMAGE':
2901                  /*
2902                   * > Parse error. Change the token's tag name to "img" and reprocess it. (Don't ask.)
2903                   *
2904                   * Note that this is handled elsewhere, so it should not be possible to reach this code.
2905                   */
2906                  $this->bail( "Cannot process an IMAGE tag. (Don't ask.)" );
2907                  break;
2908  
2909              /*
2910               * > A start tag whose tag name is "textarea"
2911               */
2912              case '+TEXTAREA':
2913                  $this->insert_html_element( $this->state->current_token );
2914  
2915                  /*
2916                   * > If the next token is a U+000A LINE FEED (LF) character token, then ignore
2917                   * > that token and move on to the next one. (Newlines at the start of
2918                   * > textarea elements are ignored as an authoring convenience.)
2919                   *
2920                   * This is handled in `get_modifiable_text()`.
2921                   */
2922  
2923                  $this->state->frameset_ok = false;
2924  
2925                  /*
2926                   * > Switch the insertion mode to "text".
2927                   *
2928                   * As a self-contained node, this behavior is handled in the Tag Processor.
2929                   */
2930                  return true;
2931  
2932              /*
2933               * > A start tag whose tag name is "xmp"
2934               */
2935              case '+XMP':
2936                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2937                      $this->close_a_p_element();
2938                  }
2939  
2940                  $this->reconstruct_active_formatting_elements();
2941                  $this->state->frameset_ok = false;
2942  
2943                  /*
2944                   * > Follow the generic raw text element parsing algorithm.
2945                   *
2946                   * As a self-contained node, this behavior is handled in the Tag Processor.
2947                   */
2948                  $this->insert_html_element( $this->state->current_token );
2949                  return true;
2950  
2951              /*
2952               * > A start tag whose tag name is "iframe"
2953               */
2954              case '+IFRAME':
2955                  $this->state->frameset_ok = false;
2956  
2957                  /*
2958                   * > Follow the generic raw text element parsing algorithm.
2959                   *
2960                   * As a self-contained node, this behavior is handled in the Tag Processor.
2961                   */
2962                  $this->insert_html_element( $this->state->current_token );
2963                  return true;
2964  
2965              /*
2966               * > A start tag whose tag name is "noembed"
2967               * > A start tag whose tag name is "noscript", if the scripting flag is enabled
2968               *
2969               * The scripting flag is never enabled in this parser.
2970               */
2971              case '+NOEMBED':
2972                  $this->insert_html_element( $this->state->current_token );
2973                  return true;
2974  
2975              /*
2976               * > A start tag whose tag name is "select"
2977               */
2978              case '+SELECT':
2979                  $this->reconstruct_active_formatting_elements();
2980                  $this->insert_html_element( $this->state->current_token );
2981                  $this->state->frameset_ok = false;
2982  
2983                  switch ( $this->state->insertion_mode ) {
2984                      /*
2985                       * > If the insertion mode is one of "in table", "in caption", "in table body", "in row",
2986                       * > or "in cell", then switch the insertion mode to "in select in table".
2987                       */
2988                      case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE:
2989                      case WP_HTML_Processor_State::INSERTION_MODE_IN_CAPTION:
2990                      case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY:
2991                      case WP_HTML_Processor_State::INSERTION_MODE_IN_ROW:
2992                      case WP_HTML_Processor_State::INSERTION_MODE_IN_CELL:
2993                          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT_IN_TABLE;
2994                          break;
2995  
2996                      /*
2997                       * > Otherwise, switch the insertion mode to "in select".
2998                       */
2999                      default:
3000                          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT;
3001                          break;
3002                  }
3003                  return true;
3004  
3005              /*
3006               * > A start tag whose tag name is one of: "optgroup", "option"
3007               */
3008              case '+OPTGROUP':
3009              case '+OPTION':
3010                  if ( $this->state->stack_of_open_elements->current_node_is( 'OPTION' ) ) {
3011                      $this->state->stack_of_open_elements->pop();
3012                  }
3013                  $this->reconstruct_active_formatting_elements();
3014                  $this->insert_html_element( $this->state->current_token );
3015                  return true;
3016  
3017              /*
3018               * > A start tag whose tag name is one of: "rb", "rtc"
3019               */
3020              case '+RB':
3021              case '+RTC':
3022                  if ( $this->state->stack_of_open_elements->has_element_in_scope( 'RUBY' ) ) {
3023                      $this->generate_implied_end_tags();
3024  
3025                      if ( ! $this->state->stack_of_open_elements->current_node_is( 'RUBY' ) ) {
3026                          // @todo Indicate a parse error once it's possible.
3027                      }
3028                  }
3029  
3030                  $this->insert_html_element( $this->state->current_token );
3031                  return true;
3032  
3033              /*
3034               * > A start tag whose tag name is one of: "rp", "rt"
3035               */
3036              case '+RP':
3037              case '+RT':
3038                  if ( $this->state->stack_of_open_elements->has_element_in_scope( 'RUBY' ) ) {
3039                      $this->generate_implied_end_tags( 'RTC' );
3040  
3041                      $current_node_name = $this->state->stack_of_open_elements->current_node()->node_name;
3042                      if ( 'RTC' !== $current_node_name && 'RUBY' !== $current_node_name ) {
3043                          // @todo Indicate a parse error once it's possible.
3044                      }
3045                  }
3046  
3047                  $this->insert_html_element( $this->state->current_token );
3048                  return true;
3049  
3050              /*
3051               * > A start tag whose tag name is "math"
3052               */
3053              case '+MATH':
3054                  $this->reconstruct_active_formatting_elements();
3055  
3056                  /*
3057                   * @todo Adjust MathML attributes for the token. (This fixes the case of MathML attributes that are not all lowercase.)
3058                   * @todo Adjust foreign attributes for the token. (This fixes the use of namespaced attributes, in particular XLink.)
3059                   *
3060                   * These ought to be handled in the attribute methods.
3061                   */
3062                  $this->state->current_token->namespace = 'math';
3063                  $this->insert_html_element( $this->state->current_token );
3064                  if ( $this->state->current_token->has_self_closing_flag ) {
3065                      $this->state->stack_of_open_elements->pop();
3066                  }
3067                  return true;
3068  
3069              /*
3070               * > A start tag whose tag name is "svg"
3071               */
3072              case '+SVG':
3073                  $this->reconstruct_active_formatting_elements();
3074  
3075                  /*
3076                   * @todo Adjust SVG attributes for the token. (This fixes the case of SVG attributes that are not all lowercase.)
3077                   * @todo Adjust foreign attributes for the token. (This fixes the use of namespaced attributes, in particular XLink in SVG.)
3078                   *
3079                   * These ought to be handled in the attribute methods.
3080                   */
3081                  $this->state->current_token->namespace = 'svg';
3082                  $this->insert_html_element( $this->state->current_token );
3083                  if ( $this->state->current_token->has_self_closing_flag ) {
3084                      $this->state->stack_of_open_elements->pop();
3085                  }
3086                  return true;
3087  
3088              /*
3089               * > A start tag whose tag name is one of: "caption", "col", "colgroup",
3090               * > "frame", "head", "tbody", "td", "tfoot", "th", "thead", "tr"
3091               */
3092              case '+CAPTION':
3093              case '+COL':
3094              case '+COLGROUP':
3095              case '+FRAME':
3096              case '+HEAD':
3097              case '+TBODY':
3098              case '+TD':
3099              case '+TFOOT':
3100              case '+TH':
3101              case '+THEAD':
3102              case '+TR':
3103                  // Parse error. Ignore the token.
3104                  return $this->step();
3105          }
3106  
3107          if ( ! parent::is_tag_closer() ) {
3108              /*
3109               * > Any other start tag
3110               */
3111              $this->reconstruct_active_formatting_elements();
3112              $this->insert_html_element( $this->state->current_token );
3113              return true;
3114          } else {
3115              /*
3116               * > Any other end tag
3117               */
3118  
3119              /*
3120               * Find the corresponding tag opener in the stack of open elements, if
3121               * it exists before reaching a special element, which provides a kind
3122               * of boundary in the stack. For example, a `</custom-tag>` should not
3123               * close anything beyond its containing `P` or `DIV` element.
3124               */
3125              foreach ( $this->state->stack_of_open_elements->walk_up() as $node ) {
3126                  if ( 'html' === $node->namespace && $token_name === $node->node_name ) {
3127                      break;
3128                  }
3129  
3130                  if ( self::is_special( $node ) ) {
3131                      // Parse error: ignore the token.
3132                      return $this->step();
3133                  }
3134              }
3135  
3136              $this->generate_implied_end_tags( $token_name );
3137              if ( $node !== $this->state->stack_of_open_elements->current_node() ) {
3138                  // @todo Record parse error: this error doesn't impact parsing.
3139              }
3140  
3141              foreach ( $this->state->stack_of_open_elements->walk_up() as $item ) {
3142                  $this->state->stack_of_open_elements->pop();
3143                  if ( $node === $item ) {
3144                      return true;
3145                  }
3146              }
3147          }
3148  
3149          $this->bail( 'Should not have been able to reach end of IN BODY processing. Check HTML API code.' );
3150          // This unnecessary return prevents tools from inaccurately reporting type errors.
3151          return false;
3152      }
3153  
3154      /**
3155       * Parses next element in the 'in table' insertion mode.
3156       *
3157       * This internal function performs the 'in table' insertion mode
3158       * logic for the generalized WP_HTML_Processor::step() function.
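           *
           * Example, as a hedged sketch using only the public API (the input HTML
           * is illustrative): a TD placed directly inside a TABLE is expected to
           * gain implied TBODY and TR ancestors, which appear in its breadcrumbs.
           *
           *     $processor = WP_HTML_Processor::create_fragment( '<table><td>1</td></table>' );
           *     if ( $processor->next_tag( 'TD' ) ) {
           *         // Expected: array( 'HTML', 'BODY', 'TABLE', 'TBODY', 'TR', 'TD' ).
           *         $breadcrumbs = $processor->get_breadcrumbs();
           *     }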
3159       *
3160       * @since 6.7.0
3161       *
3162       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3163       *
3164       * @see https://html.spec.whatwg.org/#parsing-main-intable
3165       * @see WP_HTML_Processor::step
3166       *
3167       * @return bool Whether an element was found.
3168       */
3169      private function step_in_table(): bool {
3170          $token_name = $this->get_token_name();
3171          $token_type = $this->get_token_type();
3172          $op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
3173          $op         = "{$op_sigil}{$token_name}";
3174  
3175          switch ( $op ) {
3176              /*
3177               * > A character token, if the current node is table,
3178               * > tbody, template, tfoot, thead, or tr element
3179               */
3180              case '#text':
3181                  $current_node      = $this->state->stack_of_open_elements->current_node();
3182                  $current_node_name = $current_node ? $current_node->node_name : null;
3183                  if (
3184                      $current_node_name && (
3185                          'TABLE' === $current_node_name ||
3186                          'TBODY' === $current_node_name ||
3187                          'TEMPLATE' === $current_node_name ||
3188                          'TFOOT' === $current_node_name ||
3189                          'THEAD' === $current_node_name ||
3190                          'TR' === $current_node_name
3191                      )
3192                  ) {
3193                      /*
3194                       * If the text is empty after processing HTML entities and stripping
3195                       * U+0000 NULL bytes then ignore the token.
3196                       */
3197                      if ( parent::TEXT_IS_NULL_SEQUENCE === $this->text_node_classification ) {
3198                          return $this->step();
3199                      }
3200  
3201                      /*
3202                       * This follows the rules for "in table text" insertion mode.
3203                       *
3204                       * Whitespace-only text nodes are inserted in-place. Otherwise
3205                       * foster parenting is enabled and the nodes would be
3206                       * inserted out-of-place.
3207                       *
3208                       * > If any of the tokens in the pending table character tokens
3209                       * > list are character tokens that are not ASCII whitespace,
3210                       * > then this is a parse error: reprocess the character tokens
3211                       * > in the pending table character tokens list using the rules
3212                       * > given in the "anything else" entry in the "in table"
3213                       * > insertion mode.
3214                       * >
3215                       * > Otherwise, insert the characters given by the pending table
3216                       * > character tokens list.
3217                       *
3218                       * @see https://html.spec.whatwg.org/#parsing-main-intabletext
3219                       */
3220                      if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
3221                          $this->insert_html_element( $this->state->current_token );
3222                          return true;
3223                      }
3224  
3225                      // Non-whitespace would trigger fostering, unsupported at this time.
3226                      $this->bail( 'Foster parenting is not supported.' );
3227                      break;
3228                  }
3229                  break;
3230  
3231              /*
3232               * > A comment token
3233               */
3234              case '#comment':
3235              case '#funky-comment':
3236              case '#presumptuous-tag':
3237                  $this->insert_html_element( $this->state->current_token );
3238                  return true;
3239  
3240              /*
3241               * > A DOCTYPE token
3242               */
3243              case 'html':
3244                  // Parse error: ignore the token.
3245                  return $this->step();
3246  
3247              /*
3248               * > A start tag whose tag name is "caption"
3249               */
3250              case '+CAPTION':
3251                  $this->state->stack_of_open_elements->clear_to_table_context();
3252                  $this->state->active_formatting_elements->insert_marker();
3253                  $this->insert_html_element( $this->state->current_token );
3254                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_CAPTION;
3255                  return true;
3256  
3257              /*
3258               * > A start tag whose tag name is "colgroup"
3259               */
3260              case '+COLGROUP':
3261                  $this->state->stack_of_open_elements->clear_to_table_context();
3262                  $this->insert_html_element( $this->state->current_token );
3263                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP;
3264                  return true;
3265  
3266              /*
3267               * > A start tag whose tag name is "col"
3268               */
3269              case '+COL':
3270                  $this->state->stack_of_open_elements->clear_to_table_context();
3271  
3272                  /*
3273                   * > Insert an HTML element for a "colgroup" start tag token with no attributes,
3274                   * > then switch the insertion mode to "in column group".
3275                   */
3276                  $this->insert_virtual_node( 'COLGROUP' );
3277                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP;
3278                  return $this->step( self::REPROCESS_CURRENT_NODE );
3279  
3280              /*
3281               * > A start tag whose tag name is one of: "tbody", "tfoot", "thead"
3282               */
3283              case '+TBODY':
3284              case '+TFOOT':
3285              case '+THEAD':
3286                  $this->state->stack_of_open_elements->clear_to_table_context();
3287                  $this->insert_html_element( $this->state->current_token );
3288                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
3289                  return true;
3290  
3291              /*
3292               * > A start tag whose tag name is one of: "td", "th", "tr"
3293               */
3294              case '+TD':
3295              case '+TH':
3296              case '+TR':
3297                  $this->state->stack_of_open_elements->clear_to_table_context();
3298                  /*
3299                   * > Insert an HTML element for a "tbody" start tag token with no attributes,
3300                   * > then switch the insertion mode to "in table body".
3301                   */
3302                  $this->insert_virtual_node( 'TBODY' );
3303                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
3304                  return $this->step( self::REPROCESS_CURRENT_NODE );
3305  
3306              /*
3307               * > A start tag whose tag name is "table"
3308               *
3309               * This tag in the IN TABLE insertion mode is a parse error.
3310               */
3311              case '+TABLE':
3312                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TABLE' ) ) {
3313                      return $this->step();
3314                  }
3315  
3316                  $this->state->stack_of_open_elements->pop_until( 'TABLE' );
3317                  $this->reset_insertion_mode_appropriately();
3318                  return $this->step( self::REPROCESS_CURRENT_NODE );
3319  
3320              /*
3321               * > An end tag whose tag name is "table"
3322               */
3323              case '-TABLE':
3324                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TABLE' ) ) {
3325                      // @todo Indicate a parse error once it's possible.
3326                      return $this->step();
3327                  }
3328  
3329                  $this->state->stack_of_open_elements->pop_until( 'TABLE' );
3330                  $this->reset_insertion_mode_appropriately();
3331                  return true;
3332  
3333              /*
3334               * > An end tag whose tag name is one of: "body", "caption", "col", "colgroup", "html", "tbody", "td", "tfoot", "th", "thead", "tr"
3335               */
3336              case '-BODY':
3337              case '-CAPTION':
3338              case '-COL':
3339              case '-COLGROUP':
3340              case '-HTML':
3341              case '-TBODY':
3342              case '-TD':
3343              case '-TFOOT':
3344              case '-TH':
3345              case '-THEAD':
3346              case '-TR':
3347                  // Parse error: ignore the token.
3348                  return $this->step();
3349  
3350              /*
3351               * > A start tag whose tag name is one of: "style", "script", "template"
3352               * > An end tag whose tag name is "template"
3353               */
3354              case '+STYLE':
3355              case '+SCRIPT':
3356              case '+TEMPLATE':
3357              case '-TEMPLATE':
3358                  /*
3359                   * > Process the token using the rules for the "in head" insertion mode.
3360                   */
3361                  return $this->step_in_head();
3362  
3363              /*
3364               * > A start tag whose tag name is "input"
3365               *
3366               * > If the token does not have an attribute with the name "type", or if it does, but
3367               * > that attribute's value is not an ASCII case-insensitive match for the string
3368               * > "hidden", then: act as described in the "anything else" entry below.
3369               */
3370              case '+INPUT':
3371                  $type_attribute = $this->get_attribute( 'type' );
3372                  if ( ! is_string( $type_attribute ) || 'hidden' !== strtolower( $type_attribute ) ) {
3373                      goto anything_else;
3374                  }
3375                  // @todo Indicate a parse error once it's possible.
3376                  $this->insert_html_element( $this->state->current_token );
3377                  return true;
3378  
3379              /*
3380               * > A start tag whose tag name is "form"
3381               *
3382               * This tag in the IN TABLE insertion mode is a parse error.
3383               */
3384              case '+FORM':
3385                  if (
3386                      $this->state->stack_of_open_elements->has_element_in_scope( 'TEMPLATE' ) ||
3387                      isset( $this->state->form_element )
3388                  ) {
3389                      return $this->step();
3390                  }
3391  
3392                  // This FORM is special because it immediately closes and cannot have other children.
3393                  $this->insert_html_element( $this->state->current_token );
3394                  $this->state->form_element = $this->state->current_token;
3395                  $this->state->stack_of_open_elements->pop();
3396                  return true;
3397          }
3398  
3399          /*
3400           * > Anything else
3401           * > Parse error. Enable foster parenting, process the token using the rules for the
3402           * > "in body" insertion mode, and then disable foster parenting.
3403           *
3404           * @todo Indicate a parse error once it's possible.
3405           */
3406          anything_else:
3407          $this->bail( 'Foster parenting is not supported.' );
3408      }
3409  
3410      /**
3411       * Parses next element in the 'in table text' insertion mode.
3412       *
3413       * This internal function performs the 'in table text' insertion mode
3414       * logic for the generalized WP_HTML_Processor::step() function.
3415       *
3416       * @since 6.7.0 Stub implementation.
3417       *
3418       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3419       *
3420       * @see https://html.spec.whatwg.org/#parsing-main-intabletext
3421       * @see WP_HTML_Processor::step
3422       *
3423       * @return bool Whether an element was found.
3424       */
3425      private function step_in_table_text(): bool {
3426          $this->bail( 'No support for parsing in the ' . WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_TEXT . ' state.' );
3427      }
3428  
3429      /**
3430       * Parses next element in the 'in caption' insertion mode.
3431       *
3432       * This internal function performs the 'in caption' insertion mode
3433       * logic for the generalized WP_HTML_Processor::step() function.
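           *
           * Example, as a hedged sketch using only the public API (the input HTML
           * is illustrative): a table-structure start tag found while a CAPTION is
           * open is expected to close the caption before it is reprocessed, so the
           * CAPTION does not appear among the ancestors of the following cell.
           *
           *     $processor = WP_HTML_Processor::create_fragment( '<table><caption>Title<tr><td>1' );
           *     if ( $processor->next_tag( 'TD' ) ) {
           *         // Expected: array( 'HTML', 'BODY', 'TABLE', 'TBODY', 'TR', 'TD' ).
           *         $breadcrumbs = $processor->get_breadcrumbs();
           *     }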
3434       *
3435       * @since 6.7.0
3436       *
3437       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3438       *
3439       * @see https://html.spec.whatwg.org/#parsing-main-incaption
3440       * @see WP_HTML_Processor::step
3441       *
3442       * @return bool Whether an element was found.
3443       */
3444      private function step_in_caption(): bool {
3445          $tag_name = $this->get_tag();
3446          $op_sigil = $this->is_tag_closer() ? '-' : '+';
3447          $op       = "{$op_sigil}{$tag_name}";
3448  
3449          switch ( $op ) {
3450              /*
3451               * > An end tag whose tag name is "caption"
3452               * > A start tag whose tag name is one of: "caption", "col", "colgroup", "tbody", "td", "tfoot", "th", "thead", "tr"
3453               * > An end tag whose tag name is "table"
3454               *
3455               * These tag handling rules are identical except for the final instruction.
3456               * Handle them in a single block.
3457               */
3458              case '-CAPTION':
3459              case '+CAPTION':
3460              case '+COL':
3461              case '+COLGROUP':
3462              case '+TBODY':
3463              case '+TD':
3464              case '+TFOOT':
3465              case '+TH':
3466              case '+THEAD':
3467              case '+TR':
3468              case '-TABLE':
3469                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'CAPTION' ) ) {
3470                      // Parse error: ignore the token.
3471                      return $this->step();
3472                  }
3473  
3474                  $this->generate_implied_end_tags();
3475                  if ( ! $this->state->stack_of_open_elements->current_node_is( 'CAPTION' ) ) {
3476                      // @todo Indicate a parse error once it's possible.
3477                  }
3478  
3479                  $this->state->stack_of_open_elements->pop_until( 'CAPTION' );
3480                  $this->state->active_formatting_elements->clear_up_to_last_marker();
3481                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
3482  
3483                  // If this is not a CAPTION end tag, the token should be reprocessed.
3484                  if ( '-CAPTION' === $op ) {
3485                      return true;
3486                  }
3487                  return $this->step( self::REPROCESS_CURRENT_NODE );
3488  
3489              /*
3490               * > An end tag whose tag name is one of: "body", "col", "colgroup", "html", "tbody", "td", "tfoot", "th", "thead", "tr"
3491               */
3492              case '-BODY':
3493              case '-COL':
3494              case '-COLGROUP':
3495              case '-HTML':
3496              case '-TBODY':
3497              case '-TD':
3498              case '-TFOOT':
3499              case '-TH':
3500              case '-THEAD':
3501              case '-TR':
3502                  // Parse error: ignore the token.
3503                  return $this->step();
3504          }
3505  
3506          /*
3507           * > Anything else
3508           * >   Process the token using the rules for the "in body" insertion mode.
3509           */
3510          return $this->step_in_body();
3511      }
3512  
3513      /**
3514       * Parses next element in the 'in column group' insertion mode.
3515       *
3516       * This internal function performs the 'in column group' insertion mode
3517       * logic for the generalized WP_HTML_Processor::step() function.
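           *
           * Example (an illustrative sketch, not taken from Core's tests; the markup is
           * hypothetical): a COL start tag directly inside TABLE is expected to gain an
           * implied COLGROUP parent before it is inserted by this insertion mode.
           *
           *     $processor = WP_HTML_Processor::create_fragment( '<table><col><tr><td>1' );
           *     if ( $processor->next_tag( 'COL' ) ) {
           *         // Expected: array( 'HTML', 'BODY', 'TABLE', 'COLGROUP', 'COL' ).
           *         $breadcrumbs = $processor->get_breadcrumbs();
           *     }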
3518       *
3519       * @since 6.7.0
3520       *
3521       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3522       *
3523       * @see https://html.spec.whatwg.org/#parsing-main-incolgroup
3524       * @see WP_HTML_Processor::step
3525       *
3526       * @return bool Whether an element was found.
3527       */
3528  	private function step_in_column_group(): bool {
3529          $token_name = $this->get_token_name();
3530          $token_type = $this->get_token_type();
3531          $op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
3532          $op         = "{$op_sigil}{$token_name}";
3533  
3534          switch ( $op ) {
3535              /*
3536               * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
3537               * > U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
3538               */
3539              case '#text':
3540                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
3541                      // Insert the character.
3542                      $this->insert_html_element( $this->state->current_token );
3543                      return true;
3544                  }
3545  
3546                  // Non-whitespace text is handled by the "anything else" rules below.
3547                  goto in_column_group_anything_else;
3548  
3549              /*
3550               * > A comment token
3551               */
3552              case '#comment':
3553              case '#funky-comment':
3554              case '#presumptuous-tag':
3555                  $this->insert_html_element( $this->state->current_token );
3556                  return true;
3557  
3558              /*
3559               * > A DOCTYPE token
3560               */
3561              case 'html': // DOCTYPE tokens are named "html" and carry no +/- sigil.
3562                  // @todo Indicate a parse error once it's possible.
3563                  return $this->step();
3564  
3565              /*
3566               * > A start tag whose tag name is "html"
3567               */
3568              case '+HTML':
3569                  return $this->step_in_body();
3570  
3571              /*
3572               * > A start tag whose tag name is "col"
3573               */
3574              case '+COL':
3575                  $this->insert_html_element( $this->state->current_token );
3576                  $this->state->stack_of_open_elements->pop();
3577                  return true;
3578  
3579              /*
3580               * > An end tag whose tag name is "colgroup"
3581               */
3582              case '-COLGROUP':
3583                  if ( ! $this->state->stack_of_open_elements->current_node_is( 'COLGROUP' ) ) {
3584                      // @todo Indicate a parse error once it's possible.
3585                      return $this->step();
3586                  }
3587                  $this->state->stack_of_open_elements->pop();
3588                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
3589                  return true;
3590  
3591              /*
3592               * > An end tag whose tag name is "col"
3593               */
3594              case '-COL':
3595                  // Parse error: ignore the token.
3596                  return $this->step();
3597  
3598              /*
3599               * > A start tag whose tag name is "template"
3600               * > An end tag whose tag name is "template"
3601               */
3602              case '+TEMPLATE':
3603              case '-TEMPLATE':
3604                  return $this->step_in_head();
3605          }
3606  
3607          in_column_group_anything_else:
3608          /*
3609           * > Anything else
3610           */
3611          if ( ! $this->state->stack_of_open_elements->current_node_is( 'COLGROUP' ) ) {
3612              // @todo Indicate a parse error once it's possible.
3613              return $this->step();
3614          }
3615          $this->state->stack_of_open_elements->pop();
3616          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
3617          return $this->step( self::REPROCESS_CURRENT_NODE );
3618      }
3619  
3620      /**
3621       * Parses next element in the 'in table body' insertion mode.
3622       *
3623       * This internal function performs the 'in table body' insertion mode
3624       * logic for the generalized WP_HTML_Processor::step() function.
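           *
           * Example (an illustrative sketch; the markup is hypothetical): a TD start tag
           * met in this mode is a parse error, so an implied TR is inserted and the cell
           * is then reprocessed in the 'in row' insertion mode.
           *
           *     $processor = WP_HTML_Processor::create_fragment( '<table><tbody><td>cell' );
           *     if ( $processor->next_tag( 'TD' ) ) {
           *         // Expected: array( 'HTML', 'BODY', 'TABLE', 'TBODY', 'TR', 'TD' ).
           *         $breadcrumbs = $processor->get_breadcrumbs();
           *     }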
3625       *
3626       * @since 6.7.0
3627       *
3628       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3629       *
3630       * @see https://html.spec.whatwg.org/#parsing-main-intbody
3631       * @see WP_HTML_Processor::step
3632       *
3633       * @return bool Whether an element was found.
3634       */
3635  	private function step_in_table_body(): bool {
3636          $tag_name = $this->get_tag();
3637          $op_sigil = $this->is_tag_closer() ? '-' : '+';
3638          $op       = "{$op_sigil}{$tag_name}";
3639  
3640          switch ( $op ) {
3641              /*
3642               * > A start tag whose tag name is "tr"
3643               */
3644              case '+TR':
3645                  $this->state->stack_of_open_elements->clear_to_table_body_context();
3646                  $this->insert_html_element( $this->state->current_token );
3647                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
3648                  return true;
3649  
3650              /*
3651               * > A start tag whose tag name is one of: "th", "td"
3652               */
3653              case '+TH':
3654              case '+TD':
3655                  // @todo Indicate a parse error once it's possible.
3656                  $this->state->stack_of_open_elements->clear_to_table_body_context();
3657                  $this->insert_virtual_node( 'TR' );
3658                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
3659                  return $this->step( self::REPROCESS_CURRENT_NODE );
3660  
3661              /*
3662               * > An end tag whose tag name is one of: "tbody", "tfoot", "thead"
3663               */
3664              case '-TBODY':
3665              case '-TFOOT':
3666              case '-THEAD':
3667                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $tag_name ) ) {
3668                      // Parse error: ignore the token.
3669                      return $this->step();
3670                  }
3671  
3672                  $this->state->stack_of_open_elements->clear_to_table_body_context();
3673                  $this->state->stack_of_open_elements->pop();
3674                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
3675                  return true;
3676  
3677              /*
3678               * > A start tag whose tag name is one of: "caption", "col", "colgroup", "tbody", "tfoot", "thead"
3679               * > An end tag whose tag name is "table"
3680               */
3681              case '+CAPTION':
3682              case '+COL':
3683              case '+COLGROUP':
3684              case '+TBODY':
3685              case '+TFOOT':
3686              case '+THEAD':
3687              case '-TABLE':
3688                  if (
3689                      ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TBODY' ) &&
3690                      ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'THEAD' ) &&
3691                      ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TFOOT' )
3692                  ) {
3693                      // Parse error: ignore the token.
3694                      return $this->step();
3695                  }
3696                  $this->state->stack_of_open_elements->clear_to_table_body_context();
3697                  $this->state->stack_of_open_elements->pop();
3698                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
3699                  return $this->step( self::REPROCESS_CURRENT_NODE );
3700  
3701              /*
3702               * > An end tag whose tag name is one of: "body", "caption", "col", "colgroup", "html", "td", "th", "tr"
3703               */
3704              case '-BODY':
3705              case '-CAPTION':
3706              case '-COL':
3707              case '-COLGROUP':
3708              case '-HTML':
3709              case '-TD':
3710              case '-TH':
3711              case '-TR':
3712                  // Parse error: ignore the token.
3713                  return $this->step();
3714          }
3715  
3716          /*
3717           * > Anything else
3718           * >   Process the token using the rules for the "in table" insertion mode.
3719           */
3720          return $this->step_in_table();
3721      }
3722  
3723      /**
3724       * Parses next element in the 'in row' insertion mode.
3725       *
3726       * This internal function performs the 'in row' insertion mode
3727       * logic for the generalized WP_HTML_Processor::step() function.
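           *
           * Example (an illustrative sketch; the markup is hypothetical): a TR start tag
           * met while a row is still open closes that row first, so both rows are
           * expected to be siblings under the implied TBODY.
           *
           *     $processor = WP_HTML_Processor::create_fragment( '<table><tr><td>1<tr><td>2' );
           *     while ( $processor->next_tag( 'TR' ) ) {
           *         // Expected for both rows: array( 'HTML', 'BODY', 'TABLE', 'TBODY', 'TR' ).
           *         $breadcrumbs = $processor->get_breadcrumbs();
           *     }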
3728       *
3729       * @since 6.7.0
3730       *
3731       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3732       *
3733       * @see https://html.spec.whatwg.org/#parsing-main-intr
3734       * @see WP_HTML_Processor::step
3735       *
3736       * @return bool Whether an element was found.
3737       */
3738  	private function step_in_row(): bool {
3739          $tag_name = $this->get_tag();
3740          $op_sigil = $this->is_tag_closer() ? '-' : '+';
3741          $op       = "{$op_sigil}{$tag_name}";
3742  
3743          switch ( $op ) {
3744              /*
3745               * > A start tag whose tag name is one of: "th", "td"
3746               */
3747              case '+TH':
3748              case '+TD':
3749                  $this->state->stack_of_open_elements->clear_to_table_row_context();
3750                  $this->insert_html_element( $this->state->current_token );
3751                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_CELL;
3752                  $this->state->active_formatting_elements->insert_marker();
3753                  return true;
3754  
3755              /*
3756               * > An end tag whose tag name is "tr"
3757               */
3758              case '-TR':
3759                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TR' ) ) {
3760                      // Parse error: ignore the token.
3761                      return $this->step();
3762                  }
3763  
3764                  $this->state->stack_of_open_elements->clear_to_table_row_context();
3765                  $this->state->stack_of_open_elements->pop();
3766                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
3767                  return true;
3768  
3769              /*
3770               * > A start tag whose tag name is one of: "caption", "col", "colgroup", "tbody", "tfoot", "thead", "tr"
3771               * > An end tag whose tag name is "table"
3772               */
3773              case '+CAPTION':
3774              case '+COL':
3775              case '+COLGROUP':
3776              case '+TBODY':
3777              case '+TFOOT':
3778              case '+THEAD':
3779              case '+TR':
3780              case '-TABLE':
3781                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TR' ) ) {
3782                      // Parse error: ignore the token.
3783                      return $this->step();
3784                  }
3785  
3786                  $this->state->stack_of_open_elements->clear_to_table_row_context();
3787                  $this->state->stack_of_open_elements->pop();
3788                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
3789                  return $this->step( self::REPROCESS_CURRENT_NODE );
3790  
3791              /*
3792               * > An end tag whose tag name is one of: "tbody", "tfoot", "thead"
3793               */
3794              case '-TBODY':
3795              case '-TFOOT':
3796              case '-THEAD':
3797                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $tag_name ) ) {
3798                      // Parse error: ignore the token.
3799                      return $this->step();
3800                  }
3801  
3802                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TR' ) ) {
3803                      // Ignore the token.
3804                      return $this->step();
3805                  }
3806  
3807                  $this->state->stack_of_open_elements->clear_to_table_row_context();
3808                  $this->state->stack_of_open_elements->pop();
3809                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
3810                  return $this->step( self::REPROCESS_CURRENT_NODE );
3811  
3812              /*
3813               * > An end tag whose tag name is one of: "body", "caption", "col", "colgroup", "html", "td", "th"
3814               */
3815              case '-BODY':
3816              case '-CAPTION':
3817              case '-COL':
3818              case '-COLGROUP':
3819              case '-HTML':
3820              case '-TD':
3821              case '-TH':
3822                  // Parse error: ignore the token.
3823                  return $this->step();
3824          }
3825  
3826          /*
3827           * > Anything else
3828           * >   Process the token using the rules for the "in table" insertion mode.
3829           */
3830          return $this->step_in_table();
3831      }
3832  
3833      /**
3834       * Parses next element in the 'in cell' insertion mode.
3835       *
3836       * This internal function performs the 'in cell' insertion mode
3837       * logic for the generalized WP_HTML_Processor::step() function.
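           *
           * Example (an illustrative sketch; the markup is hypothetical, and it assumes
           * close_cell() follows the specification): a cell start tag met inside an open
           * cell closes that cell, along with formatting elements opened within it,
           * before the new cell is reprocessed in the 'in row' insertion mode.
           *
           *     $processor = WP_HTML_Processor::create_fragment( '<table><tr><td><b>1<td>2' );
           *     $processor->next_tag( 'TD' ); // First cell.
           *     if ( $processor->next_tag( 'TD' ) ) {
           *         // Expected: array( 'HTML', 'BODY', 'TABLE', 'TBODY', 'TR', 'TD' ), with no B.
           *         $breadcrumbs = $processor->get_breadcrumbs();
           *     }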
3838       *
3839       * @since 6.7.0
3840       *
3841       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3842       *
3843       * @see https://html.spec.whatwg.org/#parsing-main-intd
3844       * @see WP_HTML_Processor::step
3845       *
3846       * @return bool Whether an element was found.
3847       */
3848  	private function step_in_cell(): bool {
3849          $tag_name = $this->get_tag();
3850          $op_sigil = $this->is_tag_closer() ? '-' : '+';
3851          $op       = "{$op_sigil}{$tag_name}";
3852  
3853          switch ( $op ) {
3854              /*
3855               * > An end tag whose tag name is one of: "td", "th"
3856               */
3857              case '-TD':
3858              case '-TH':
3859                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $tag_name ) ) {
3860                      // Parse error: ignore the token.
3861                      return $this->step();
3862                  }
3863  
3864                  $this->generate_implied_end_tags();
3865  
3866                  /*
3867                   * @todo This needs to check if the current node is an HTML element, meaning that
3868                   *       when SVG and MathML support is added, this needs to differentiate between an
3869                   *       HTML element of the given name, such as `<center>`, and a foreign element of
3870                   *       the same given name.
3871                   */
3872                  if ( ! $this->state->stack_of_open_elements->current_node_is( $tag_name ) ) {
3873                      // @todo Indicate a parse error once it's possible.
3874                  }
3875  
3876                  $this->state->stack_of_open_elements->pop_until( $tag_name );
3877                  $this->state->active_formatting_elements->clear_up_to_last_marker();
3878                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
3879                  return true;
3880  
3881              /*
3882               * > A start tag whose tag name is one of: "caption", "col", "colgroup", "tbody", "td",
3883               * > "tfoot", "th", "thead", "tr"
3884               */
3885              case '+CAPTION':
3886              case '+COL':
3887              case '+COLGROUP':
3888              case '+TBODY':
3889              case '+TD':
3890              case '+TFOOT':
3891              case '+TH':
3892              case '+THEAD':
3893              case '+TR':
3894                  /*
3895                   * > Assert: The stack of open elements has a td or th element in table scope.
3896                   *
3897                   * Nothing to do here, except to verify in tests that this never appears.
3898                   */
3899  
3900                  $this->close_cell();
3901                  return $this->step( self::REPROCESS_CURRENT_NODE );
3902  
3903              /*
3904               * > An end tag whose tag name is one of: "body", "caption", "col", "colgroup", "html"
3905               */
3906              case '-BODY':
3907              case '-CAPTION':
3908              case '-COL':
3909              case '-COLGROUP':
3910              case '-HTML':
3911                  // Parse error: ignore the token.
3912                  return $this->step();
3913  
3914              /*
3915               * > An end tag whose tag name is one of: "table", "tbody", "tfoot", "thead", "tr"
3916               */
3917              case '-TABLE':
3918              case '-TBODY':
3919              case '-TFOOT':
3920              case '-THEAD':
3921              case '-TR':
3922                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $tag_name ) ) {
3923                      // Parse error: ignore the token.
3924                      return $this->step();
3925                  }
3926                  $this->close_cell();
3927                  return $this->step( self::REPROCESS_CURRENT_NODE );
3928          }
3929  
3930          /*
3931           * > Anything else
3932           * >   Process the token using the rules for the "in body" insertion mode.
3933           */
3934          return $this->step_in_body();
3935      }
3936  
3937      /**
3938       * Parses next element in the 'in select' insertion mode.
3939       *
3940       * This internal function performs the 'in select' insertion mode
3941       * logic for the generalized WP_HTML_Processor::step() function.
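           *
           * Example (an illustrative sketch; the markup is hypothetical): an OPTION start
           * tag implicitly closes a still-open OPTION, so consecutive options are
           * expected to be siblings rather than nested.
           *
           *     $processor = WP_HTML_Processor::create_fragment( '<select><option>One<option>Two</select>' );
           *     while ( $processor->next_tag( 'OPTION' ) ) {
           *         // Expected for both options: array( 'HTML', 'BODY', 'SELECT', 'OPTION' ).
           *         $breadcrumbs = $processor->get_breadcrumbs();
           *     }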
3942       *
3943       * @since 6.7.0
3944       *
3945       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3946       *
3947       * @see https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inselect
3948       * @see WP_HTML_Processor::step
3949       *
3950       * @return bool Whether an element was found.
3951       */
3952  	private function step_in_select(): bool {
3953          $token_name = $this->get_token_name();
3954          $token_type = $this->get_token_type();
3955          $op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
3956          $op         = "{$op_sigil}{$token_name}";
3957  
3958          switch ( $op ) {
3959              /*
3960               * > Any other character token
3961               */
3962              case '#text':
3963                  /*
3964                   * > A character token that is U+0000 NULL
3965                   *
3966                   * If a text node only comprises null bytes then it should be
3967                   * entirely ignored and should not return to calling code.
3968                   */
3969                  if ( parent::TEXT_IS_NULL_SEQUENCE === $this->text_node_classification ) {
3970                      // Parse error: ignore the token.
3971                      return $this->step();
3972                  }
3973  
3974                  $this->insert_html_element( $this->state->current_token );
3975                  return true;
3976  
3977              /*
3978               * > A comment token
3979               */
3980              case '#comment':
3981              case '#funky-comment':
3982              case '#presumptuous-tag':
3983                  $this->insert_html_element( $this->state->current_token );
3984                  return true;
3985  
3986              /*
3987               * > A DOCTYPE token
3988               */
3989              case 'html': // DOCTYPE tokens are named "html" and carry no +/- sigil.
3990                  // Parse error: ignore the token.
3991                  return $this->step();
3992  
3993              /*
3994               * > A start tag whose tag name is "html"
3995               */
3996              case '+HTML':
3997                  return $this->step_in_body();
3998  
3999              /*
4000               * > A start tag whose tag name is "option"
4001               */
4002              case '+OPTION':
4003                  if ( $this->state->stack_of_open_elements->current_node_is( 'OPTION' ) ) {
4004                      $this->state->stack_of_open_elements->pop();
4005                  }
4006                  $this->insert_html_element( $this->state->current_token );
4007                  return true;
4008  
4009              /*
4010               * > A start tag whose tag name is "optgroup"
4011               * > A start tag whose tag name is "hr"
4012               *
4013               * These rules are identical except for the treatment of the self-closing flag and
4014               * the subsequent pop of the HR void element, all of which is handled elsewhere in the processor.
4015               */
4016              case '+OPTGROUP':
4017              case '+HR':
4018                  if ( $this->state->stack_of_open_elements->current_node_is( 'OPTION' ) ) {
4019                      $this->state->stack_of_open_elements->pop();
4020                  }
4021  
4022                  if ( $this->state->stack_of_open_elements->current_node_is( 'OPTGROUP' ) ) {
4023                      $this->state->stack_of_open_elements->pop();
4024                  }
4025  
4026                  $this->insert_html_element( $this->state->current_token );
4027                  return true;
4028  
4029              /*
4030               * > An end tag whose tag name is "optgroup"
4031               */
4032              case '-OPTGROUP':
4033                  $current_node = $this->state->stack_of_open_elements->current_node();
4034                  if ( $current_node && 'OPTION' === $current_node->node_name ) {
4035                      foreach ( $this->state->stack_of_open_elements->walk_up( $current_node ) as $parent ) {
4036                          break;
4037                      }
4038                      if ( isset( $parent ) && 'OPTGROUP' === $parent->node_name ) {
4039                          $this->state->stack_of_open_elements->pop();
4040                      }
4041                  }
4042  
4043                  if ( $this->state->stack_of_open_elements->current_node_is( 'OPTGROUP' ) ) {
4044                      $this->state->stack_of_open_elements->pop();
4045                      return true;
4046                  }
4047  
4048                  // Parse error: ignore the token.
4049                  return $this->step();
4050  
4051              /*
4052               * > An end tag whose tag name is "option"
4053               */
4054              case '-OPTION':
4055                  if ( $this->state->stack_of_open_elements->current_node_is( 'OPTION' ) ) {
4056                      $this->state->stack_of_open_elements->pop();
4057                      return true;
4058                  }
4059  
4060                  // Parse error: ignore the token.
4061                  return $this->step();
4062  
4063              /*
4064               * > An end tag whose tag name is "select"
4065               * > A start tag whose tag name is "select"
4066               *
4067               * > It just gets treated like an end tag.
4068               */
4069              case '-SELECT':
4070              case '+SELECT':
4071                  if ( ! $this->state->stack_of_open_elements->has_element_in_select_scope( 'SELECT' ) ) {
4072                      // Parse error: ignore the token.
4073                      return $this->step();
4074                  }
4075                  $this->state->stack_of_open_elements->pop_until( 'SELECT' );
4076                  $this->reset_insertion_mode_appropriately();
4077                  return true;
4078  
4079              /*
4080               * > A start tag whose tag name is one of: "input", "keygen", "textarea"
4081               *
4082               * All three of these tags are considered a parse error when found in this insertion mode.
4083               */
4084              case '+INPUT':
4085              case '+KEYGEN':
4086              case '+TEXTAREA':
4087                  if ( ! $this->state->stack_of_open_elements->has_element_in_select_scope( 'SELECT' ) ) {
4088                      // Ignore the token.
4089                      return $this->step();
4090                  }
4091                  $this->state->stack_of_open_elements->pop_until( 'SELECT' );
4092                  $this->reset_insertion_mode_appropriately();
4093                  return $this->step( self::REPROCESS_CURRENT_NODE );
4094  
4095              /*
4096               * > A start tag whose tag name is one of: "script", "template"
4097               * > An end tag whose tag name is "template"
4098               */
4099              case '+SCRIPT':
4100              case '+TEMPLATE':
4101              case '-TEMPLATE':
4102                  return $this->step_in_head();
4103          }
4104  
4105          /*
4106           * > Anything else
4107           * >   Parse error: ignore the token.
4108           */
4109          return $this->step();
4110      }
4111  
4112      /**
4113       * Parses next element in the 'in select in table' insertion mode.
4114       *
4115       * This internal function performs the 'in select in table' insertion mode
4116       * logic for the generalized WP_HTML_Processor::step() function.
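           *
           * Example (an illustrative sketch; the markup is hypothetical): a table
           * start tag found inside a SELECT that sits in a table closes the SELECT,
           * so the second cell is expected to be a sibling of the first.
           *
           *     $processor = WP_HTML_Processor::create_fragment( '<table><tr><td><select><option>A<td>B' );
           *     $processor->next_tag( 'TD' ); // First cell, which contains the SELECT.
           *     if ( $processor->next_tag( 'TD' ) ) {
           *         // Expected: array( 'HTML', 'BODY', 'TABLE', 'TBODY', 'TR', 'TD' ).
           *         $breadcrumbs = $processor->get_breadcrumbs();
           *     }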
4117       *
4118       * @since 6.7.0
4119       *
4120       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4121       *
4122       * @see https://html.spec.whatwg.org/#parsing-main-inselectintable
4123       * @see WP_HTML_Processor::step
4124       *
4125       * @return bool Whether an element was found.
4126       */
4127  	private function step_in_select_in_table(): bool {
4128          $token_name = $this->get_token_name();
4129          $token_type = $this->get_token_type();
4130          $op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
4131          $op         = "{$op_sigil}{$token_name}";
4132  
4133          switch ( $op ) {
4134              /*
4135               * > A start tag whose tag name is one of: "caption", "table", "tbody", "tfoot", "thead", "tr", "td", "th"
4136               */
4137              case '+CAPTION':
4138              case '+TABLE':
4139              case '+TBODY':
4140              case '+TFOOT':
4141              case '+THEAD':
4142              case '+TR':
4143              case '+TD':
4144              case '+TH':
4145                  // @todo Indicate a parse error once it's possible.
4146                  $this->state->stack_of_open_elements->pop_until( 'SELECT' );
4147                  $this->reset_insertion_mode_appropriately();
4148                  return $this->step( self::REPROCESS_CURRENT_NODE );
4149  
4150              /*
4151               * > An end tag whose tag name is one of: "caption", "table", "tbody", "tfoot", "thead", "tr", "td", "th"
4152               */
4153              case '-CAPTION':
4154              case '-TABLE':
4155              case '-TBODY':
4156              case '-TFOOT':
4157              case '-THEAD':
4158              case '-TR':
4159              case '-TD':
4160              case '-TH':
4161                  // @todo Indicate a parse error once it's possible.
4162                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $token_name ) ) {
4163                      return $this->step();
4164                  }
4165                  $this->state->stack_of_open_elements->pop_until( 'SELECT' );
4166                  $this->reset_insertion_mode_appropriately();
4167                  return $this->step( self::REPROCESS_CURRENT_NODE );
4168          }
4169  
4170          /*
4171           * > Anything else
4172           */
4173          return $this->step_in_select();
4174      }
4175  
4176      /**
4177       * Parses next element in the 'in template' insertion mode.
4178       *
4179       * This internal function performs the 'in template' insertion mode
4180       * logic for the generalized WP_HTML_Processor::step() function.
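           *
           * Example (an illustrative sketch; the markup is hypothetical, and it assumes
           * TEMPLATE start tags are supported by the 'in head' rules): a table-related
           * start tag replaces the current template insertion mode, so a TD is expected
           * to be inserted without TABLE ancestors.
           *
           *     $processor = WP_HTML_Processor::create_fragment( '<template><td>cell</td></template>' );
           *     if ( $processor->next_tag( 'TD' ) ) {
           *         // Expected: array( 'HTML', 'BODY', 'TEMPLATE', 'TD' ).
           *         $breadcrumbs = $processor->get_breadcrumbs();
           *     }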
4181       *
4182       * @since 6.7.0
4183       *
4184       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4185       *
4186       * @see https://html.spec.whatwg.org/#parsing-main-intemplate
4187       * @see WP_HTML_Processor::step
4188       *
4189       * @return bool Whether an element was found.
4190       */
4191  	private function step_in_template(): bool {
4192          $token_name = $this->get_token_name();
4193          $token_type = $this->get_token_type();
4194          $is_closer  = $this->is_tag_closer();
4195          $op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
4196          $op         = "{$op_sigil}{$token_name}";
4197  
4198          switch ( $op ) {
4199              /*
4200               * > A character token
4201               * > A comment token
4202               * > A DOCTYPE token
4203               */
4204              case '#text':
4205              case '#comment':
4206              case '#funky-comment':
4207              case '#presumptuous-tag':
4208              case 'html':
4209                  return $this->step_in_body();
4210  
4211              /*
4212               * > A start tag whose tag name is one of: "base", "basefont", "bgsound", "link",
4213               * > "meta", "noframes", "script", "style", "template", "title"
4214               * > An end tag whose tag name is "template"
4215               */
4216              case '+BASE':
4217              case '+BASEFONT':
4218              case '+BGSOUND':
4219              case '+LINK':
4220              case '+META':
4221              case '+NOFRAMES':
4222              case '+SCRIPT':
4223              case '+STYLE':
4224              case '+TEMPLATE':
4225              case '+TITLE':
4226              case '-TEMPLATE':
4227                  return $this->step_in_head();
4228  
4229              /*
4230               * > A start tag whose tag name is one of: "caption", "colgroup", "tbody", "tfoot", "thead"
4231               */
4232              case '+CAPTION':
4233              case '+COLGROUP':
4234              case '+TBODY':
4235              case '+TFOOT':
4236              case '+THEAD':
4237                  array_pop( $this->state->stack_of_template_insertion_modes );
4238                  $this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
4239                  $this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
4240                  return $this->step( self::REPROCESS_CURRENT_NODE );
4241  
4242              /*
4243               * > A start tag whose tag name is "col"
4244               */
4245              case '+COL':
4246                  array_pop( $this->state->stack_of_template_insertion_modes );
4247                  $this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP;
4248                  $this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP;
4249                  return $this->step( self::REPROCESS_CURRENT_NODE );
4250  
4251              /*
4252               * > A start tag whose tag name is "tr"
4253               */
4254              case '+TR':
4255                  array_pop( $this->state->stack_of_template_insertion_modes );
4256                  $this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
4257                  $this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
4258                  return $this->step( self::REPROCESS_CURRENT_NODE );
4259  
4260              /*
4261               * > A start tag whose tag name is one of: "td", "th"
4262               */
4263              case '+TD':
4264              case '+TH':
4265                  array_pop( $this->state->stack_of_template_insertion_modes );
4266                  $this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
4267                  $this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
4268                  return $this->step( self::REPROCESS_CURRENT_NODE );
4269          }
4270  
4271          /*
4272           * > Any other start tag
4273           */
4274          if ( ! $is_closer ) {
4275              array_pop( $this->state->stack_of_template_insertion_modes );
4276              $this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
4277              $this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
4278              return $this->step( self::REPROCESS_CURRENT_NODE );
4279          }
4280  
4281          /*
4282           * > Any other end tag
4283           */
4284          if ( $is_closer ) {
4285              // Parse error: ignore the token.
4286              return $this->step();
4287          }
4288  
4289          /*
4290           * > An end-of-file token
4291           */
4292          if ( ! $this->state->stack_of_open_elements->contains( 'TEMPLATE' ) ) {
4293              // Stop parsing.
4294              return false;
4295          }
4296  
4297          // @todo Indicate a parse error once it's possible.
4298          $this->state->stack_of_open_elements->pop_until( 'TEMPLATE' );
4299          $this->state->active_formatting_elements->clear_up_to_last_marker();
4300          array_pop( $this->state->stack_of_template_insertion_modes );
4301          $this->reset_insertion_mode_appropriately();
4302          return $this->step( self::REPROCESS_CURRENT_NODE );
4303      }
4304  
4305      /**
4306       * Parses next element in the 'after body' insertion mode.
4307       *
4308       * This internal function performs the 'after body' insertion mode
4309       * logic for the generalized WP_HTML_Processor::step() function.
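           *
           * Example (an illustrative sketch, assuming WordPress 6.7 or later; the input
           * document is hypothetical): a comment that follows the closing BODY tag reaches
           * the "A comment token" rule in this method, which this parser does not support.
           *
           *     // Hypothetical input chosen so that a comment appears after BODY closes.
           *     $processor = WP_HTML_Processor::create_full_parser( '<html><body><p>Hi</p></body><!-- late --></html>' );
           *     while ( null !== $processor && $processor->next_token() ) {
           *         // Tokens inside BODY are visited as usual.
           *     }
           *     if ( null !== $processor && WP_HTML_Processor::ERROR_UNSUPPORTED === $processor->get_last_error() ) {
           *         // Parsing stopped at the comment rather than guessing where it belongs.
           *     }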
4310       *
4311       * @since 6.7.0 Stub implementation.
4312       *
4313       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4314       *
4315       * @see https://html.spec.whatwg.org/#parsing-main-afterbody
4316       * @see WP_HTML_Processor::step
4317       *
4318       * @return bool Whether an element was found.
4319       */
4320      private function step_after_body(): bool {
4321          $tag_name   = $this->get_token_name();
4322          $token_type = $this->get_token_type();
4323          $op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4324          $op         = "{$op_sigil}{$tag_name}";
4325  
4326          switch ( $op ) {
4327              /*
4328               * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
4329               * >   U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
4330               *
4331               * > Process the token using the rules for the "in body" insertion mode.
4332               */
4333              case '#text':
4334                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
4335                      return $this->step_in_body();
4336                  }
4337                  goto after_body_anything_else;
4338                  break;
4339  
4340              /*
4341               * > A comment token
4342               */
4343              case '#comment':
4344              case '#funky-comment':
4345              case '#presumptuous-tag':
4346                  $this->bail( 'Content outside of BODY is unsupported.' );
4347                  break;
4348  
4349              /*
4350               * > A DOCTYPE token
4351               */
4352              case 'html':
4353                  // Parse error: ignore the token.
4354                  return $this->step();
4355  
4356              /*
4357               * > A start tag whose tag name is "html"
4358               */
4359              case '+HTML':
4360                  return $this->step_in_body();
4361  
4362              /*
4363               * > An end tag whose tag name is "html"
4364               *
4365               * > If the parser was created as part of the HTML fragment parsing algorithm,
4366               * > this is a parse error; ignore the token. (fragment case)
4367               * >
4368               * > Otherwise, switch the insertion mode to "after after body".
4369               */
4370              case '-HTML':
4371                  if ( isset( $this->context_node ) ) {
4372                      return $this->step();
4373                  }
4374  
4375                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_BODY;
4376                  return true;
4377          }
4378  
4379          /*
4380           * > Parse error. Switch the insertion mode to "in body" and reprocess the token.
4381           */
4382          after_body_anything_else:
4383          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
4384          return $this->step( self::REPROCESS_CURRENT_NODE );
4385      }
4386  
4387      /**
4388       * Parses next element in the 'in frameset' insertion mode.
4389       *
4390       * This internal function performs the 'in frameset' insertion mode
4391       * logic for the generalized WP_HTML_Processor::step() function.
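           *
           * Example (an illustrative sketch, assuming WordPress 6.7 or later; the frameset
           * document is hypothetical): each FRAME is popped from the stack of open elements
           * right after it is inserted, so sibling FRAME tags should report breadcrumbs of
           * equal length instead of nesting inside one another.
           *
           *     // Hypothetical input: two sibling frames inside one frameset.
           *     $html      = '<html><head></head><frameset><frame src="a.html"><frame src="b.html"></frameset></html>';
           *     $processor = WP_HTML_Processor::create_full_parser( $html );
           *     while ( null !== $processor && $processor->next_tag( 'FRAME' ) ) {
           *         // Inspect the path from the root to the matched FRAME.
           *         $breadcrumbs = $processor->get_breadcrumbs();
           *     }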
4392       *
4393       * @since 6.7.0 Stub implementation.
4394       *
4395       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4396       *
4397       * @see https://html.spec.whatwg.org/#parsing-main-inframeset
4398       * @see WP_HTML_Processor::step
4399       *
4400       * @return bool Whether an element was found.
4401       */
4402      private function step_in_frameset(): bool {
4403          $tag_name   = $this->get_token_name();
4404          $token_type = $this->get_token_type();
4405          $op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4406          $op         = "{$op_sigil}{$tag_name}";
4407  
4408          switch ( $op ) {
4409              /*
4410               * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
4411               * >   U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
4412               * >
4413               * > Insert the character.
4414               *
4415               * This rule effectively strips non-whitespace characters from the text and inserts
4416               * only the remaining whitespace. Text containing other characters is not supported at this time.
4417               */
4418              case '#text':
4419                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
4420                      return $this->step_in_body();
4421                  }
4422                  $this->bail( 'Non-whitespace characters cannot be handled in frameset.' );
4423                  break;
4424  
4425              /*
4426               * > A comment token
4427               */
4428              case '#comment':
4429              case '#funky-comment':
4430              case '#presumptuous-tag':
4431                  $this->insert_html_element( $this->state->current_token );
4432                  return true;
4433  
4434              /*
4435               * > A DOCTYPE token
4436               */
4437              case 'html':
4438                  // Parse error: ignore the token.
4439                  return $this->step();
4440  
4441              /*
4442               * > A start tag whose tag name is "html"
4443               */
4444              case '+HTML':
4445                  return $this->step_in_body();
4446  
4447              /*
4448               * > A start tag whose tag name is "frameset"
4449               */
4450              case '+FRAMESET':
4451                  $this->insert_html_element( $this->state->current_token );
4452                  return true;
4453  
4454              /*
4455               * > An end tag whose tag name is "frameset"
4456               */
4457              case '-FRAMESET':
4458                  /*
4459                   * > If the current node is the root html element, then this is a parse error;
4460                   * > ignore the token. (fragment case)
4461                   */
4462                  if ( $this->state->stack_of_open_elements->current_node_is( 'HTML' ) ) {
4463                      return $this->step();
4464                  }
4465  
4466                  /*
4467                   * > Otherwise, pop the current node from the stack of open elements.
4468                   */
4469                  $this->state->stack_of_open_elements->pop();
4470  
4471                  /*
4472                   * > If the parser was not created as part of the HTML fragment parsing algorithm
4473                   * > (fragment case), and the current node is no longer a frameset element, then
4474                   * > switch the insertion mode to "after frameset".
4475                   */
4476                  if ( ! isset( $this->context_node ) && ! $this->state->stack_of_open_elements->current_node_is( 'FRAMESET' ) ) {
4477                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_FRAMESET;
4478                  }
4479  
4480                  return true;
4481  
4482              /*
4483               * > A start tag whose tag name is "frame"
4484               *
4485               * > Insert an HTML element for the token. Immediately pop the
4486               * > current node off the stack of open elements.
4487               * >
4488               * > Acknowledge the token's self-closing flag, if it is set.
4489               */
4490              case '+FRAME':
4491                  $this->insert_html_element( $this->state->current_token );
4492                  $this->state->stack_of_open_elements->pop();
4493                  return true;
4494  
4495              /*
4496               * > A start tag whose tag name is "noframes"
4497               */
4498              case '+NOFRAMES':
4499                  return $this->step_in_head();
4500          }
4501  
4502          // Parse error: ignore the token.
4503          return $this->step();
4504      }
4505  
4506      /**
4507       * Parses next element in the 'after frameset' insertion mode.
4508       *
4509       * This internal function performs the 'after frameset' insertion mode
4510       * logic for the generalized WP_HTML_Processor::step() function.
4511       *
4512       * @since 6.7.0 Stub implementation.
4513       *
4514       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4515       *
4516       * @see https://html.spec.whatwg.org/#parsing-main-afterframeset
4517       * @see WP_HTML_Processor::step
4518       *
4519       * @return bool Whether an element was found.
4520       */
4521      private function step_after_frameset(): bool {
4522          $tag_name   = $this->get_token_name();
4523          $token_type = $this->get_token_type();
4524          $op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4525          $op         = "{$op_sigil}{$tag_name}";
4526  
4527          switch ( $op ) {
4528              /*
4529               * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
4530               * >   U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
4531               * >
4532               * > Insert the character.
4533               *
4534               * This rule effectively strips non-whitespace characters from the text and inserts
4535               * only the remaining whitespace under HTML. Text containing other characters is not supported at this time.
4536               */
4537              case '#text':
4538                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
4539                      return $this->step_in_body();
4540                  }
4541                  $this->bail( 'Non-whitespace characters cannot be handled in after frameset.' );
4542                  break;
4543  
4544              /*
4545               * > A comment token
4546               */
4547              case '#comment':
4548              case '#funky-comment':
4549              case '#presumptuous-tag':
4550                  $this->insert_html_element( $this->state->current_token );
4551                  return true;
4552  
4553              /*
4554               * > A DOCTYPE token
4555               */
4556              case 'html':
4557                  // Parse error: ignore the token.
4558                  return $this->step();
4559  
4560              /*
4561               * > A start tag whose tag name is "html"
4562               */
4563              case '+HTML':
4564                  return $this->step_in_body();
4565  
4566              /*
4567               * > An end tag whose tag name is "html"
4568               */
4569              case '-HTML':
4570                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_FRAMESET;
4571                  return true;
4572  
4573              /*
4574               * > A start tag whose tag name is "noframes"
4575               */
4576              case '+NOFRAMES':
4577                  return $this->step_in_head();
4578          }
4579  
4580          // Parse error: ignore the token.
4581          return $this->step();
4582      }
4583  
4584      /**
4585       * Parses next element in the 'after after body' insertion mode.
4586       *
4587       * This internal function performs the 'after after body' insertion mode
4588       * logic for the generalized WP_HTML_Processor::step() function.
4589       *
4590       * @since 6.7.0 Stub implementation.
4591       *
4592       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4593       *
4594       * @see https://html.spec.whatwg.org/#the-after-after-body-insertion-mode
4595       * @see WP_HTML_Processor::step
4596       *
4597       * @return bool Whether an element was found.
4598       */
4599      private function step_after_after_body(): bool {
4600          $tag_name   = $this->get_token_name();
4601          $token_type = $this->get_token_type();
4602          $op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4603          $op         = "{$op_sigil}{$tag_name}";
4604  
4605          switch ( $op ) {
4606              /*
4607               * > A comment token
4608               */
4609              case '#comment':
4610              case '#funky-comment':
4611              case '#presumptuous-tag':
4612                  $this->bail( 'Content outside of HTML is unsupported.' );
4613                  break;
4614  
4615              /*
4616               * > A DOCTYPE token
4617               * > A start tag whose tag name is "html"
4618               *
4619               * > Process the token using the rules for the "in body" insertion mode.
4620               */
4621              case 'html':
4622              case '+HTML':
4623                  return $this->step_in_body();
4624  
4625              /*
4626               * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
4627               * >   U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
4628               * >
4629               * > Process the token using the rules for the "in body" insertion mode.
4630               */
4631              case '#text':
4632                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
4633                      return $this->step_in_body();
4634                  }
4635                  goto after_after_body_anything_else;
4636                  break;
4637          }
4638  
4639          /*
4640           * > Parse error. Switch the insertion mode to "in body" and reprocess the token.
4641           */
4642          after_after_body_anything_else:
4643          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
4644          return $this->step( self::REPROCESS_CURRENT_NODE );
4645      }
4646  
4647      /**
4648       * Parses next element in the 'after after frameset' insertion mode.
4649       *
4650       * This internal function performs the 'after after frameset' insertion mode
4651       * logic for the generalized WP_HTML_Processor::step() function.
4652       *
4653       * @since 6.7.0 Stub implementation.
4654       *
4655       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4656       *
4657       * @see https://html.spec.whatwg.org/#the-after-after-frameset-insertion-mode
4658       * @see WP_HTML_Processor::step
4659       *
4660       * @return bool Whether an element was found.
4661       */
4662      private function step_after_after_frameset(): bool {
4663          $tag_name   = $this->get_token_name();
4664          $token_type = $this->get_token_type();
4665          $op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4666          $op         = "{$op_sigil}{$tag_name}";
4667  
4668          switch ( $op ) {
4669              /*
4670               * > A comment token
4671               */
4672              case '#comment':
4673              case '#funky-comment':
4674              case '#presumptuous-tag':
4675                  $this->bail( 'Content outside of HTML is unsupported.' );
4676                  break;
4677  
4678              /*
4679               * > A DOCTYPE token
4680               * > A start tag whose tag name is "html"
4681               *
4682               * > Process the token using the rules for the "in body" insertion mode.
4683               */
4684              case 'html':
4685              case '+HTML':
4686                  return $this->step_in_body();
4687  
4688              /*
4689               * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
4690               * >   U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
4691               * >
4692               * > Process the token using the rules for the "in body" insertion mode.
4693               *
4694               * This rule effectively strips non-whitespace characters from the text and inserts
4695               * only the remaining whitespace under HTML. Text containing other characters is not supported at this time.
4696               */
4697              case '#text':
4698                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
4699                      return $this->step_in_body();
4700                  }
4701                  $this->bail( 'Non-whitespace characters cannot be handled in after after frameset.' );
4702                  break;
4703  
4704              /*
4705               * > A start tag whose tag name is "noframes"
4706               */
4707              case '+NOFRAMES':
4708                  return $this->step_in_head();
4709          }
4710  
4711          // Parse error: ignore the token.
4712          return $this->step();
4713      }
4714  
4715      /**
4716       * Parses next element in the 'in foreign content' insertion mode.
4717       *
4718       * This internal function performs the 'in foreign content' insertion mode
4719       * logic for the generalized WP_HTML_Processor::step() function.
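           *
           * Example (an illustrative sketch, assuming WordPress 6.7 or later; both inputs
           * are hypothetical): a FONT start tag breaks out of foreign content only when it
           * carries a `color`, `face`, or `size` attribute, so the two FONT tags below are
           * expected to end up in different namespaces.
           *
           *     // Hypothetical inputs: the same tag with and without a breakout attribute.
           *     foreach ( array( '<svg><font>', '<svg><font color="red">' ) as $html ) {
           *         $processor = WP_HTML_Processor::create_fragment( $html );
           *         if ( null !== $processor && $processor->next_tag( 'FONT' ) ) {
           *             // Compare this value between the two inputs.
           *             $namespace = $processor->get_namespace();
           *         }
           *     }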
4720       *
4721       * @since 6.7.0 Stub implementation.
4722       *
4723       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4724       *
4725       * @see https://html.spec.whatwg.org/#parsing-main-inforeign
4726       * @see WP_HTML_Processor::step
4727       *
4728       * @return bool Whether an element was found.
4729       */
4730      private function step_in_foreign_content(): bool {
4731          $tag_name   = $this->get_token_name();
4732          $token_type = $this->get_token_type();
4733          $op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4734          $op         = "{$op_sigil}{$tag_name}";
4735  
4736          /*
4737           * > A start tag whose name is "font", if the token has any attributes named "color", "face", or "size"
4738           *
4739           * This check is hoisted above the switch so that the attribute-dependent
4740           * rule can be expressed as its own `$op` value and handled as a switch case.
4741           */
4742          if (
4743              '+FONT' === $op &&
4744              (
4745                  null !== $this->get_attribute( 'color' ) ||
4746                  null !== $this->get_attribute( 'face' ) ||
4747                  null !== $this->get_attribute( 'size' )
4748              )
4749          ) {
4750              $op = '+FONT with attributes';
4751          }
4752  
4753          switch ( $op ) {
4754              case '#text':
4755                  /*
4756                   * > A character token that is U+0000 NULL
4757                   *
4758                   * This is handled by `get_modifiable_text()`.
4759                   */
4760  
4761                  /*
4762                   * Whitespace-only text does not affect the frameset-ok flag.
4763                   * It is probably inter-element whitespace, but it may also
4764                   * contain character references which decode only to whitespace.
4765                   */
4766                  if ( parent::TEXT_IS_GENERIC === $this->text_node_classification ) {
4767                      $this->state->frameset_ok = false;
4768                  }
4769  
4770                  $this->insert_foreign_element( $this->state->current_token, false );
4771                  return true;
4772  
4773              /*
4774               * CDATA sections are alternate wrappers for text content and therefore
4775               * ought to follow the same rules as text nodes.
4776               */
4777              case '#cdata-section':
4778                  /*
4779                   * NULL bytes and whitespace do not change the frameset-ok flag.
4780                   */
4781                  $current_token        = $this->bookmarks[ $this->state->current_token->bookmark_name ];
4782                  $cdata_content_start  = $current_token->start + 9;   // Skip the 9-byte `<![CDATA[` opener.
4783                  $cdata_content_length = $current_token->length - 12; // Exclude the opener and the 3-byte `]]>` closer.
4784                  if ( strspn( $this->html, "\0 \t\n\f\r", $cdata_content_start, $cdata_content_length ) !== $cdata_content_length ) {
4785                      $this->state->frameset_ok = false;
4786                  }
4787  
4788                  $this->insert_foreign_element( $this->state->current_token, false );
4789                  return true;
4790  
4791              /*
4792               * > A comment token
4793               */
4794              case '#comment':
4795              case '#funky-comment':
4796              case '#presumptuous-tag':
4797                  $this->insert_foreign_element( $this->state->current_token, false );
4798                  return true;
4799  
4800              /*
4801               * > A DOCTYPE token
4802               */
4803              case 'html':
4804                  // Parse error: ignore the token.
4805                  return $this->step();
4806  
4807              /*
4808               * > A start tag whose tag name is "b", "big", "blockquote", "body", "br", "center",
4809               * > "code", "dd", "div", "dl", "dt", "em", "embed", "h1", "h2", "h3", "h4", "h5",
4810               * > "h6", "head", "hr", "i", "img", "li", "listing", "menu", "meta", "nobr", "ol",
4811               * > "p", "pre", "ruby", "s", "small", "span", "strong", "strike", "sub", "sup",
4812               * > "table", "tt", "u", "ul", "var"
4813               *
4814               * > A start tag whose name is "font", if the token has any attributes named "color", "face", or "size"
4815               *
4816               * > An end tag whose tag name is "br", "p"
4817               *
4818               * Closing BR tags are always reported by the Tag Processor as opening tags.
4819               */
4820              case '+B':
4821              case '+BIG':
4822              case '+BLOCKQUOTE':
4823              case '+BODY':
4824              case '+BR':
4825              case '+CENTER':
4826              case '+CODE':
4827              case '+DD':
4828              case '+DIV':
4829              case '+DL':
4830              case '+DT':
4831              case '+EM':
4832              case '+EMBED':
4833              case '+H1':
4834              case '+H2':
4835              case '+H3':
4836              case '+H4':
4837              case '+H5':
4838              case '+H6':
4839              case '+HEAD':
4840              case '+HR':
4841              case '+I':
4842              case '+IMG':
4843              case '+LI':
4844              case '+LISTING':
4845              case '+MENU':
4846              case '+META':
4847              case '+NOBR':
4848              case '+OL':
4849              case '+P':
4850              case '+PRE':
4851              case '+RUBY':
4852              case '+S':
4853              case '+SMALL':
4854              case '+SPAN':
4855              case '+STRONG':
4856              case '+STRIKE':
4857              case '+SUB':
4858              case '+SUP':
4859              case '+TABLE':
4860              case '+TT':
4861              case '+U':
4862              case '+UL':
4863              case '+VAR':
4864              case '+FONT with attributes':
4865              case '-BR':
4866              case '-P':
4867                  // @todo Indicate a parse error once it's possible.
4868                  foreach ( $this->state->stack_of_open_elements->walk_up() as $current_node ) {
4869                      if (
4870                          'math' === $current_node->integration_node_type ||
4871                          'html' === $current_node->integration_node_type ||
4872                          'html' === $current_node->namespace
4873                      ) {
4874                          break;
4875                      }
4876  
4877                      $this->state->stack_of_open_elements->pop();
4878                  }
4879                  goto in_foreign_content_process_in_current_insertion_mode;
4880          }
4881  
4882          /*
4883           * > Any other start tag
4884           */
4885          if ( ! $this->is_tag_closer() ) {
4886              $this->insert_foreign_element( $this->state->current_token, false );
4887  
4888              /*
4889               * > If the token has its self-closing flag set, then run
4890               * > the appropriate steps from the following list:
4891               * >
4892               * >   ↪ the token's tag name is "script", and the new current node is in the SVG namespace
4893               * >         Acknowledge the token's self-closing flag, and then act as
4894               * >         described in the steps for a "script" end tag below.
4895               * >
4896               * >   ↪ Otherwise
4897               * >         Pop the current node off the stack of open elements and
4898               * >         acknowledge the token's self-closing flag.
4899               *
4900               * Since the rules for SCRIPT below indicate to pop the element off of the stack of
4901               * open elements, which is the same for the Otherwise condition, there's no need to
4902               * separate these checks. The difference comes when a parser operates with the scripting
4903               * flag enabled, and executes the script, which this parser does not support.
4904               */
4905              if ( $this->state->current_token->has_self_closing_flag ) {
4906                  $this->state->stack_of_open_elements->pop();
4907              }
4908              return true;
4909          }
4910  
4911          /*
4912           * > An end tag whose name is "script", if the current node is an SVG script element.
4913           */
4914          if ( $this->is_tag_closer() && 'SCRIPT' === $this->state->current_token->node_name && 'svg' === $this->state->current_token->namespace ) {
4915              $this->state->stack_of_open_elements->pop();
4916              return true;
4917          }
4918  
4919          /*
4920           * > Any other end tag
4921           */
4922          if ( $this->is_tag_closer() ) {
4923              $node = $this->state->stack_of_open_elements->current_node();
4924              if ( $tag_name !== $node->node_name ) {
4925                  // @todo Indicate a parse error once it's possible.
4926              }
4927              in_foreign_content_end_tag_loop:
4928              if ( $node === $this->state->stack_of_open_elements->at( 1 ) ) {
4929                  return true;
4930              }
4931  
4932              /*
4933               * > If node's tag name, converted to ASCII lowercase, is the same as the tag name
4934               * > of the token, pop elements from the stack of open elements until node has
4935               * > been popped from the stack, and then return.
4936               */
4937              if ( 0 === strcasecmp( $node->node_name, $tag_name ) ) {
4938                  foreach ( $this->state->stack_of_open_elements->walk_up() as $item ) {
4939                      $this->state->stack_of_open_elements->pop();
4940                      if ( $node === $item ) {
4941                          return true;
4942                      }
4943                  }
4944              }
4945  
4946              foreach ( $this->state->stack_of_open_elements->walk_up( $node ) as $item ) {
4947                  $node = $item;
4948                  break;
4949              }
4950  
4951              if ( 'html' !== $node->namespace ) {
4952                  goto in_foreign_content_end_tag_loop;
4953              }
4954  
4955              in_foreign_content_process_in_current_insertion_mode:
4956              switch ( $this->state->insertion_mode ) {
4957                  case WP_HTML_Processor_State::INSERTION_MODE_INITIAL:
4958                      return $this->step_initial();
4959  
4960                  case WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HTML:
4961                      return $this->step_before_html();
4962  
4963                  case WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HEAD:
4964                      return $this->step_before_head();
4965  
4966                  case WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD:
4967                      return $this->step_in_head();
4968  
4969                  case WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD_NOSCRIPT:
4970                      return $this->step_in_head_noscript();
4971  
4972                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_HEAD:
4973                      return $this->step_after_head();
4974  
4975                  case WP_HTML_Processor_State::INSERTION_MODE_IN_BODY:
4976                      return $this->step_in_body();
4977  
4978                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE:
4979                      return $this->step_in_table();
4980  
4981                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_TEXT:
4982                      return $this->step_in_table_text();
4983  
4984                  case WP_HTML_Processor_State::INSERTION_MODE_IN_CAPTION:
4985                      return $this->step_in_caption();
4986  
4987                  case WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP:
4988                      return $this->step_in_column_group();
4989  
4990                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY:
4991                      return $this->step_in_table_body();
4992  
4993                  case WP_HTML_Processor_State::INSERTION_MODE_IN_ROW:
4994                      return $this->step_in_row();
4995  
4996                  case WP_HTML_Processor_State::INSERTION_MODE_IN_CELL:
4997                      return $this->step_in_cell();
4998  
4999                  case WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT:
5000                      return $this->step_in_select();
5001  
5002                  case WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT_IN_TABLE:
5003                      return $this->step_in_select_in_table();
5004  
5005                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE:
5006                      return $this->step_in_template();
5007  
5008                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_BODY:
5009                      return $this->step_after_body();
5010  
5011                  case WP_HTML_Processor_State::INSERTION_MODE_IN_FRAMESET:
5012                      return $this->step_in_frameset();
5013  
5014                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_FRAMESET:
5015                      return $this->step_after_frameset();
5016  
5017                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_BODY:
5018                      return $this->step_after_after_body();
5019  
5020                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_FRAMESET:
5021                      return $this->step_after_after_frameset();
5022  
5023                  // This should be unreachable but PHP doesn't have total type checking on switch.
5024                  default:
5025                      $this->bail( "Unaware of the requested parsing mode: '{$this->state->insertion_mode}'." );
5026              }
5027          }
5028  
5029          $this->bail( 'Should not have been able to reach end of IN FOREIGN CONTENT processing. Check HTML API code.' );
5030          // This unnecessary return prevents tools from inaccurately reporting type errors.
5031          return false;
5032      }
5033  
5034      /*
5035       * Internal helpers
5036       */
5037  
5038      /**
5039       * Creates a new bookmark for the currently-matched token and returns the generated name.
5040       *
5041       * @since 6.4.0
5042       * @since 6.5.0 Renamed from bookmark_tag() to bookmark_token().
5043       *
5044       * @throws Exception When unable to allocate requested bookmark.
5045       *
5046       * @return string Name of created bookmark.
5047       */
5048  	private function bookmark_token() {
5049          if ( ! parent::set_bookmark( ++$this->bookmark_counter ) ) {
5050              $this->last_error = self::ERROR_EXCEEDED_MAX_BOOKMARKS;
5051              throw new Exception( 'could not allocate bookmark' );
5052          }
5053  
5054          return "{$this->bookmark_counter}";
5055      }
5056  
5057      /*
5058       * HTML semantic overrides for Tag Processor
5059       */
5060  
5061      /**
5062       * Indicates the namespace of the current token, or "html" if there is none.
5063       *
5064       * @return string One of "html", "math", or "svg".
5065       */
5066  	public function get_namespace(): string {
5067          if ( ! isset( $this->current_element ) ) {
5068              return parent::get_namespace();
5069          }
5070  
5071          return $this->current_element->token->namespace;
5072      }
5073  
5074      /**
5075       * Returns the uppercase name of the matched tag.
5076       *
5077       * The semantic rules for HTML specify that certain tags be reprocessed
5078       * with a different tag name. Because of this, the tag name presented
5079       * by the HTML Processor may differ from the one reported by the HTML
5080       * Tag Processor, which doesn't apply these semantic rules.
5081       *
5082       * Example:
5083       *
5084       *     $processor = new WP_HTML_Tag_Processor( '<div class="test">Test</div>' );
5085       *     $processor->next_tag() === true;
5086       *     $processor->get_tag() === 'DIV';
5087       *
5088       *     $processor->next_tag() === false;
5089       *     $processor->get_tag() === null;
5090       *
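           * A minimal sketch of the renaming described above, assuming a fragment
           * parsed in the default BODY context (the markup is illustrative only):
           *
           *     $processor = WP_HTML_Processor::create_fragment( '<image src="photo.png">' );
           *     $processor->next_tag() === true;
           *     $processor->get_tag() === 'IMG'; // The Tag Processor would report 'IMAGE'.
           *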
5091       * @since 6.4.0
5092       *
5093       * @return string|null Name of currently matched tag in input HTML, or `null` if none found.
5094       */
5095  	public function get_tag(): ?string {
5096          if ( null !== $this->last_error ) {
5097              return null;
5098          }
5099  
5100          if ( $this->is_virtual() ) {
5101              return $this->current_element->token->node_name;
5102          }
5103  
5104          $tag_name = parent::get_tag();
5105  
5106          /*
5107           * > A start tag whose tag name is "image"
5108           * > Change the token's tag name to "img" and reprocess it. (Don't ask.)
5109           */
5110          return ( 'IMAGE' === $tag_name && 'html' === $this->get_namespace() )
5111              ? 'IMG'
5112              : $tag_name;
5113      }
5114  
5115      /**
5116       * Indicates if the currently matched tag contains the self-closing flag.
5117       *
5118       * No HTML elements ought to have the self-closing flag and for those, the self-closing
5119       * flag will be ignored. For void elements this is benign because they "self close"
5120       * automatically. For non-void HTML elements though problems will appear if someone
5121       * intends to use a self-closing element in place of that element with an empty body.
5122       * For HTML foreign elements and custom elements the self-closing flag determines if
5123       * they self-close or not.
5124       *
5125       * This function does not determine if a tag is self-closing,
5126       * but only if the self-closing flag is present in the syntax.
5127       *
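           * A sketch of the distinction, assuming `expects_closer()` behaves as
           * documented elsewhere in this class:
           *
           *     $processor = WP_HTML_Processor::create_fragment( '<div />' );
           *     $processor->next_tag() === true;
           *     $processor->has_self_closing_flag() === true; // The flag appears in the syntax.
           *     $processor->expects_closer() === true;        // DIV is not void, so the flag is ignored.
           *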
5128       * @since 6.6.0 Subclassed for the HTML Processor.
5129       *
5130       * @return bool Whether the currently matched tag contains the self-closing flag.
5131       */
5132  	public function has_self_closing_flag(): bool {
5133          return $this->is_virtual() ? false : parent::has_self_closing_flag();
5134      }
5135  
5136      /**
5137       * Returns the node name represented by the token.
5138       *
5139       * This matches the DOM API value `nodeName`. Some values
5140       * are static, such as `#text` for a text node, while others
5141       * are dynamically generated from the token itself.
5142       *
5143       * Dynamic names:
5144       *  - Uppercase tag name for tag matches.
5145       *  - `html` for DOCTYPE declarations.
5146       *
5147       * Note that if the Tag Processor is not matched on a token
5148       * then this function will return `null`, either because it
5149       * hasn't yet found a token or because it reached the end
5150       * of the document without matching a token.
5151       *
5152       * @since 6.6.0 Subclassed for the HTML Processor.
5153       *
5154       * @return string|null Name of the matched token.
5155       */
5156  	public function get_token_name(): ?string {
5157          return $this->is_virtual()
5158              ? $this->current_element->token->node_name
5159              : parent::get_token_name();
5160      }
5161  
5162      /**
5163       * Indicates the kind of matched token, if any.
5164       *
5165       * This differs from `get_token_name()` in that it always
5166       * returns a static string indicating the type, whereas
5167       * `get_token_name()` may return values derived from the
5168       * token itself, such as a tag name or processing
5169       * instruction tag.
5170       *
5171       * Possible values:
5172       *  - `#tag` when matched on a tag.
5173       *  - `#text` when matched on a text node.
5174       *  - `#cdata-section` when matched on a CDATA node.
5175       *  - `#comment` when matched on a comment.
5176       *  - `#doctype` when matched on a DOCTYPE declaration.
5177       *  - `#presumptuous-tag` when matched on an empty tag closer.
5178       *  - `#funky-comment` when matched on a funky comment.
5179       *
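           * A sketch contrasting this method with `get_token_name()`, assuming a
           * small illustrative fragment:
           *
           *     $processor = WP_HTML_Processor::create_fragment( '<p>Hi<!-- note --></p>' );
           *
           *     $processor->next_token();
           *     $processor->get_token_type() === '#tag';
           *     $processor->get_token_name() === 'P';
           *
           *     $processor->next_token();
           *     $processor->get_token_type() === '#text';
           *     $processor->get_token_name() === '#text';
           *
           *     $processor->next_token();
           *     $processor->get_token_type() === '#comment';
           *     $processor->get_token_name() === '#comment';
           *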
5180       * @since 6.6.0 Subclassed for the HTML Processor.
5181       *
5182       * @return string|null What kind of token is matched, or null.
5183       */
5184  	public function get_token_type(): ?string {
5185          if ( $this->is_virtual() ) {
5186              /*
5187               * This logic comes from the Tag Processor.
5188               *
5189               * @todo It would be ideal not to repeat this here, but it's not clearly
5190               *       better to allow passing a token name to `get_token_type()`.
5191               */
5192              $node_name     = $this->current_element->token->node_name;
5193              $starting_char = $node_name[0];
5194              if ( 'A' <= $starting_char && 'Z' >= $starting_char ) {
5195                  return '#tag';
5196              }
5197  
5198              if ( 'html' === $node_name ) {
5199                  return '#doctype';
5200              }
5201  
5202              return $node_name;
5203          }
5204  
5205          return parent::get_token_type();
5206      }
5207  
5208      /**
5209       * Returns the value of a requested attribute from a matched tag opener if that attribute exists.
5210       *
5211       * Example:
5212       *
5213       *     $p = WP_HTML_Processor::create_fragment( '<div enabled class="test" data-test-id="14">Test</div>' );
5214       *     $p->next_token() === true;
5215       *     $p->get_attribute( 'data-test-id' ) === '14';
5216       *     $p->get_attribute( 'enabled' ) === true;
5217       *     $p->get_attribute( 'aria-label' ) === null;
5218       *
5219       *     $p->next_tag() === false;
5220       *     $p->get_attribute( 'class' ) === null;
5221       *
5222       * @since 6.6.0 Subclassed for HTML Processor.
5223       *
5224       * @param string $name Name of attribute whose value is requested.
5225       * @return string|true|null Value of attribute or `null` if not available. Boolean attributes return `true`.
5226       */
5227  	public function get_attribute( $name ) {
5228          return $this->is_virtual() ? null : parent::get_attribute( $name );
5229      }
5230  
5231      /**
5232       * Updates or creates a new attribute on the currently matched tag with the passed value.
5233       *
5234       * For boolean attributes special handling is provided:
5235       *  - When `true` is passed as the value, then only the attribute name is added to the tag.
5236       *  - When `false` is passed, the attribute gets removed if it existed before.
5237       *
5238       * For string attributes, the value is escaped using the `esc_attr` function.
5239       *
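           * A sketch of the three cases, assuming the processor is matched on a
           * real (non-virtual) tag; the attribute names and values are illustrative only:
           *
           *     $processor = WP_HTML_Processor::create_fragment( '<input disabled>' );
           *     $processor->next_tag();
           *     $processor->set_attribute( 'type', 'checkbox' ); // Adds type="checkbox".
           *     $processor->set_attribute( 'checked', true );    // Adds the bare name `checked`.
           *     $processor->set_attribute( 'disabled', false );  // Removes `disabled`.
           *     $html = $processor->get_updated_html();
           *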
5240       * @since 6.6.0 Subclassed for the HTML Processor.
5241       *
5242       * @param string      $name  The attribute name to target.
5243       * @param string|bool $value The new attribute value.
5244       * @return bool Whether an attribute value was set.
5245       */
5246  	public function set_attribute( $name, $value ): bool {
5247          return $this->is_virtual() ? false : parent::set_attribute( $name, $value );
5248      }
5249  
5250      /**
5251       * Removes an attribute from the currently-matched tag.
5252       *
5253       * @since 6.6.0 Subclassed for HTML Processor.
5254       *
5255       * @param string $name The attribute name to remove.
5256       * @return bool Whether an attribute was removed.
5257       */
5258  	public function remove_attribute( $name ): bool {
5259          return $this->is_virtual() ? false : parent::remove_attribute( $name );
5260      }
5261  
5262      /**
5263       * Gets lowercase names of all attributes matching a given prefix in the current tag.
5264       *
5265       * Note that matching is case-insensitive. This is in accordance with the spec:
5266       *
5267       * > There must never be two or more attributes on
5268       * > the same start tag whose names are an ASCII
5269       * > case-insensitive match for each other.
5270       *     - HTML 5 spec
5271       *
5272       * Example:
5273       *
5274       *     $p = new WP_HTML_Tag_Processor( '<div data-ENABLED class="test" DATA-test-id="14">Test</div>' );
5275       *     $p->next_tag( array( 'class_name' => 'test' ) ) === true;
5276       *     $p->get_attribute_names_with_prefix( 'data-' ) === array( 'data-enabled', 'data-test-id' );
5277       *
5278       *     $p->next_tag() === false;
5279       *     $p->get_attribute_names_with_prefix( 'data-' ) === null;
5280       *
5281       * @since 6.6.0 Subclassed for the HTML Processor.
5282       *
5283       * @see https://html.spec.whatwg.org/multipage/syntax.html#attributes-2:ascii-case-insensitive
5284       *
5285       * @param string $prefix Prefix of requested attribute names.
5286       * @return array|null List of attribute names, or `null` when no tag opener is matched.
5287       */
5288  	public function get_attribute_names_with_prefix( $prefix ): ?array {
5289          return $this->is_virtual() ? null : parent::get_attribute_names_with_prefix( $prefix );
5290      }
5291  
5292      /**
5293       * Adds a new class name to the currently matched tag.
5294       *
5295       * @since 6.6.0 Subclassed for the HTML Processor.
5296       *
5297       * @param string $class_name The class name to add.
5298       * @return bool Whether the class was set to be added.
5299       */
5300  	public function add_class( $class_name ): bool {
5301          return $this->is_virtual() ? false : parent::add_class( $class_name );
5302      }
5303  
5304      /**
5305       * Removes a class name from the currently matched tag.
5306       *
5307       * @since 6.6.0 Subclassed for the HTML Processor.
5308       *
5309       * @param string $class_name The class name to remove.
5310       * @return bool Whether the class was set to be removed.
5311       */
5312  	public function remove_class( $class_name ): bool {
5313          return $this->is_virtual() ? false : parent::remove_class( $class_name );
5314      }
5315  
5316      /**
5317       * Returns whether the matched tag contains the given ASCII case-insensitive class name.
5318       *
5319       * @since 6.6.0 Subclassed for the HTML Processor.
5320       *
5321       * @todo When reconstructing active formatting elements with attributes, find a way
5322       *       to indicate if the virtually-reconstructed formatting elements contain the
5323       *       wanted class name.
5324       *
5325       * @param string $wanted_class Look for this CSS class name, ASCII case-insensitive.
5326       * @return bool|null Whether the matched tag contains the given class name, or null if not matched.
5327       */
5328  	public function has_class( $wanted_class ): ?bool {
5329          return $this->is_virtual() ? null : parent::has_class( $wanted_class );
5330      }
5331  
5332      /**
5333       * Generator for a foreach loop to step through each class name for the matched tag.
5334       *
5335       * This generator function is designed to be used inside a "foreach" loop.
5336       *
5337       * Example:
5338       *
5339       *     $p = WP_HTML_Processor::create_fragment( "<div class='free &lt;egg&gt;\tlang-en'>" );
5340       *     $p->next_tag();
5341       *     foreach ( $p->class_list() as $class_name ) {
5342       *         echo "{$class_name} ";
5343       *     }
5344       *     // Outputs: "free <egg> lang-en "
5345       *
5346       * @since 6.6.0 Subclassed for the HTML Processor.
5347       */
5348  	public function class_list() {
5349          return $this->is_virtual() ? null : parent::class_list();
5350      }
5351  
5352      /**
5353       * Returns the modifiable text for a matched token, or an empty string.
5354       *
5355       * Modifiable text is text content that may be read and changed without
5356       * changing the HTML structure of the document around it. This includes
5357       * the contents of `#text` nodes in the HTML as well as the inner
5358       * contents of HTML comments, Processing Instructions, and others, even
5359       * though these nodes aren't part of a parsed DOM tree. It also includes
5360       * the contents of SCRIPT and STYLE tags, of TEXTAREA tags, and of any
5361       * other section in an HTML document which cannot contain HTML markup (DATA).
5362       *
5363       * If a token has no modifiable text then an empty string is returned to
5364       * avoid needless crashing or type errors. An empty string does not mean
5365       * that a token has modifiable text, and a token with modifiable text may
5366       * have an empty string (e.g. a comment with no contents).
5367       *
5368       * @since 6.6.0 Subclassed for the HTML Processor.
5369       *
5370       * @return string
5371       */
5372  	public function get_modifiable_text(): string {
5373          return $this->is_virtual() ? '' : parent::get_modifiable_text();
5374      }
5375  
5376      /**
5377       * Indicates what kind of comment produced the comment node.
5378       *
5379       * Because there are different kinds of HTML syntax which produce
5380       * comments, the Tag Processor tracks and exposes this as a type
5381       * for the comment. Nominally only regular HTML comments exist as
5382       * they are commonly known, but a number of unrelated syntax errors
5383       * also produce comments.
5384       *
5385       * @see self::COMMENT_AS_ABRUPTLY_CLOSED_COMMENT
5386       * @see self::COMMENT_AS_CDATA_LOOKALIKE
5387       * @see self::COMMENT_AS_INVALID_HTML
5388       * @see self::COMMENT_AS_HTML_COMMENT
5389       * @see self::COMMENT_AS_PI_NODE_LOOKALIKE
5390       *
5391       * @since 6.6.0 Subclassed for the HTML Processor.
5392       *
5393       * @return string|null
5394       */
5395  	public function get_comment_type(): ?string {
5396          return $this->is_virtual() ? null : parent::get_comment_type();
5397      }
5398  
5399      /**
5400       * Removes a bookmark that is no longer needed.
5401       *
5402       * Releasing a bookmark frees up the small
5403       * performance overhead it requires.
5404       *
5405       * @since 6.4.0
5406       *
5407       * @param string $bookmark_name Name of the bookmark to remove.
5408       * @return bool Whether the bookmark already existed before removal.
5409       */
5410  	public function release_bookmark( $bookmark_name ): bool {
5411          return parent::release_bookmark( "_{$bookmark_name}" );
5412      }
5413  
5414      /**
5415       * Moves the internal cursor in the HTML Processor to a given bookmark's location.
5416       *
5417       * Be careful! Seeking backwards to a previous location resets the parser to the
5418       * start of the document and reparses the entire contents up until it finds the
5419       * sought-after bookmarked location.
5420       *
5421       * In order to prevent accidental infinite loops, there's a
5422       * maximum limit on the number of times seek() can be called.
5423       *
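           * A sketch of a backward seek, assuming a bookmark named 'first-li'
           * (an illustrative name) and a small illustrative fragment:
           *
           *     $processor = WP_HTML_Processor::create_fragment( '<ul><li>One</li><li>Two</li></ul>' );
           *     $processor->next_tag( 'LI' );
           *     $processor->set_bookmark( 'first-li' );
           *     $processor->next_tag( 'LI' );
           *
           *     // Seeking backward reparses from the start of the document.
           *     if ( $processor->seek( 'first-li' ) ) {
           *         $processor->add_class( 'first' );
           *     }
           *     $processor->release_bookmark( 'first-li' );
           *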
5424       * @throws Exception When unable to allocate a bookmark for the next token in the input HTML document.
5425       *
5426       * @since 6.4.0
5427       *
5428       * @param string $bookmark_name Jump to the place in the document identified by this bookmark name.
5429       * @return bool Whether the internal cursor was successfully moved to the bookmark's location.
5430       */
5431  	public function seek( $bookmark_name ): bool {
5432          // Flush any pending updates to the document before beginning.
5433          $this->get_updated_html();
5434  
5435          $actual_bookmark_name = "_{$bookmark_name}";
5436          $processor_started_at = $this->state->current_token
5437              ? $this->bookmarks[ $this->state->current_token->bookmark_name ]->start
5438              : 0;
5439          $bookmark_starts_at   = $this->bookmarks[ $actual_bookmark_name ]->start;
5440          $direction            = $bookmark_starts_at > $processor_started_at ? 'forward' : 'backward';
5441  
5442          /*
5443           * If seeking backwards, it's possible that the sought-after bookmark exists within an element
5444           * which has been closed before the current cursor; in other words, it has already been removed
5445           * from the stack of open elements. This means that it's insufficient to simply pop off elements
5446           * from the stack of open elements which appear after the bookmarked location and then jump to
5447           * that location, as the elements which were open before won't be re-opened.
5448           *
5449           * In order to maintain consistency, the HTML Processor rewinds to the start of the document
5450           * and reparses everything until it finds the sought-after bookmark.
5451           *
5452           * There are potentially better ways to do this: cache the parser state for each bookmark and
5453           * restore it when seeking; store an immutable and idempotent register of where elements open
5454           * and close.
5455           *
5456           * If caching the parser state it will be essential to properly maintain the cached stack of
5457           * open elements and active formatting elements when modifying the document. This could be a
5458           * tedious and time-consuming process as well, and so for now will not be performed.
5459           *
5460           * It may be possible to track bookmarks for where elements open and close, and in doing so
5461           * be able to quickly recalculate breadcrumbs for any element in the document. It may even
5462           * be possible to remove the stack of open elements and compute it on the fly this way.
5463           * If doing this, the parser would need to track the opening and closing locations for all
5464           * tokens in the breadcrumb path for any and all bookmarks. By utilizing bookmarks themselves
5465           * this list could be automatically maintained while modifying the document. Finding the
5466           * breadcrumbs would then amount to traversing that list from the start until the token
5467           * being inspected. Once an element closes, if there are no bookmarks pointing to locations
5468           * within that element, then all of these locations may be forgotten to save on memory use
5469           * and computation time.
5470           */
5471          if ( 'backward' === $direction ) {
5472  
5473              /*
5474               * When moving backward, stateful stacks should be cleared.
5475               */
5476              foreach ( $this->state->stack_of_open_elements->walk_up() as $item ) {
5477                  $this->state->stack_of_open_elements->remove_node( $item );
5478              }
5479  
5480              foreach ( $this->state->active_formatting_elements->walk_up() as $item ) {
5481                  $this->state->active_formatting_elements->remove_node( $item );
5482              }
5483  
5484              /*
5485               * **After** clearing stacks, more processor state can be reset.
5486               * This must be done after clearing the stack because those stacks generate events that
5487               * would appear on a subsequent call to `next_token()`.
5488               */
5489              $this->state->frameset_ok                       = true;
5490              $this->state->stack_of_template_insertion_modes = array();
5491              $this->state->head_element                      = null;
5492              $this->state->form_element                      = null;
5493              $this->state->current_token                     = null;
5494              $this->current_element                          = null;
5495              $this->element_queue                            = array();
5496  
5497              /*
5498               * The absence of a context node indicates a full parse.
5499               * The presence of a context node indicates a fragment parser.
5500               */
5501              if ( null === $this->context_node ) {
5502                  $this->change_parsing_namespace( 'html' );
5503                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_INITIAL;
5504                  $this->breadcrumbs           = array();
5505  
5506                  $this->bookmarks['initial'] = new WP_HTML_Span( 0, 0 );
5507                  parent::seek( 'initial' );
5508                  unset( $this->bookmarks['initial'] );
5509              } else {
5510  
5511                  /*
5512                   * Push the root-node (HTML) back onto the stack of open elements.
5513                   *
5514                   * Fragment parsers require this extra bit of setup.
5515                   * It's handled in full parsers by advancing the processor state.
5516                   */
5517                  $this->state->stack_of_open_elements->push(
5518                      new WP_HTML_Token(
5519                          'root-node',
5520                          'HTML',
5521                          false
5522                      )
5523                  );
5524  
5525                  $this->change_parsing_namespace(
5526                      $this->context_node->integration_node_type
5527                          ? 'html'
5528                          : $this->context_node->namespace
5529                  );
5530  
5531                  if ( 'TEMPLATE' === $this->context_node->node_name ) {
5532                      $this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE;
5533                  }
5534  
5535                  $this->reset_insertion_mode_appropriately();
5536                  $this->breadcrumbs = array_slice( $this->breadcrumbs, 0, 2 );
5537                  parent::seek( $this->context_node->bookmark_name );
5538              }
5539          }
5540  
5541          /*
5542           * Here, the processor moves forward through the document until it matches the bookmark.
5543           * do-while is used here because the processor is expected to already be stopped on
5544           * a token that may match the bookmarked location.
5545           */
5546          do {
5547              /*
5548               * The processor will stop on virtual tokens, but bookmarks may not be set on them.
5549               * They should not be matched when seeking a bookmark, so skip them.
5550               */
5551              if ( $this->is_virtual() ) {
5552                  continue;
5553              }
5554              if ( $bookmark_starts_at === $this->bookmarks[ $this->state->current_token->bookmark_name ]->start ) {
5555                  return true;
5556              }
5557          } while ( $this->next_token() );
5558  
5559          return false;
5560      }
5561  
5562      /**
5563       * Sets a bookmark in the HTML document.
5564       *
5565       * Bookmarks represent specific places or tokens in the HTML
5566       * document, such as a tag opener or closer. When applying
5567       * edits to a document, such as setting an attribute, the
5568       * text offsets of that token may shift; the bookmark is
5569       * kept updated with those shifts and remains stable unless
5570       * the entire span of text in which the token sits is removed.
5571       *
5572       * Release bookmarks when they are no longer needed.
5573       *
5574       * Example:
5575       *
5576       *     <main><h2>Surprising fact you may not know!</h2></main>
5577       *           ^  ^
5578       *            \-|-- this `H2` opener bookmark tracks the token
5579       *
5580       *     <main class="clickbait"><h2>Surprising fact you may no…
5581       *                             ^  ^
5582       *                              \-|-- it shifts with edits
5583       *
5584       * Bookmarks provide the ability to seek to a previously-scanned
5585       * place in the HTML document. This avoids the need to re-scan
5586       * the entire document.
5587       *
5588       * Example:
5589       *
5590       *     <ul><li>One</li><li>Two</li><li>Three</li></ul>
5591       *                                 ^^^^
5592       *                                 want to note this last item
5593       *
5594       *     $p = new WP_HTML_Tag_Processor( $html );
5595       *     $in_list = false;
5596       *     while ( $p->next_tag( array( 'tag_closers' => $in_list ? 'visit' : 'skip' ) ) ) {
5597       *         if ( 'UL' === $p->get_tag() ) {
5598       *             if ( $p->is_tag_closer() ) {
5599       *                 $in_list = false;
5600       *                 $p->set_bookmark( 'resume' );
5601       *                 if ( $p->seek( 'last-li' ) ) {
5602       *                     $p->add_class( 'last-li' );
5603       *                 }
5604       *                 $p->seek( 'resume' );
5605       *                 $p->release_bookmark( 'last-li' );
5606       *                 $p->release_bookmark( 'resume' );
5607       *             } else {
5608       *                 $in_list = true;
5609       *             }
5610       *         }
5611       *
5612       *         if ( 'LI' === $p->get_tag() ) {
5613       *             $p->set_bookmark( 'last-li' );
5614       *         }
5615       *     }
5616       *
5617       * Bookmarks intentionally hide the internal string offsets
5618       * to which they refer. They are maintained internally as
5619       * updates are applied to the HTML document and therefore
5620       * retain their "position" - the location to which they
5621       * originally pointed. The inability to use bookmarks with
5622       * functions like `substr` is therefore intentional to guard
5623       * against accidentally breaking the HTML.
5624       *
5625       * Because bookmarks allocate memory and require processing
5626       * for every applied update, they are limited and require
5627       * a name. They should not be created with programmatically-made
5628       * names, such as "li_{$index}" with some loop. As a general
5629       * rule they should only be created with string-literal names
5630       * like "start-of-section" or "last-paragraph".
5631       *
5632       * Bookmarks are a powerful tool to enable complicated behavior.
5633       * Consider double-checking that you need this tool if you are
5634       * reaching for it, as inappropriate use could lead to broken
5635       * HTML structure or unwanted processing overhead.
5636       *
5637       * @since 6.4.0
5638       *
5639       * @param string $bookmark_name Identifies this particular bookmark.
5640       * @return bool Whether the bookmark was successfully created.
5641       */
5642  	public function set_bookmark( $bookmark_name ): bool {
5643          return parent::set_bookmark( "_{$bookmark_name}" );
5644      }
5645  
5646      /**
5647       * Checks whether a bookmark with the given name exists.
5648       *
5649       * @since 6.5.0
5650       *
5651       * @param string $bookmark_name Name to identify a bookmark that potentially exists.
5652       * @return bool Whether that bookmark exists.
5653       */
5654  	public function has_bookmark( $bookmark_name ): bool {
5655          return parent::has_bookmark( "_{$bookmark_name}" );
5656      }
5657  
5658      /*
5659       * HTML Parsing Algorithms
5660       */
5661  
5662      /**
5663       * Closes a P element.
5664       *
5665       * @since 6.4.0
5666       *
5667       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
5668       *
5669       * @see https://html.spec.whatwg.org/#close-a-p-element
5670       */
5671  	private function close_a_p_element(): void {
5672          $this->generate_implied_end_tags( 'P' );
5673          $this->state->stack_of_open_elements->pop_until( 'P' );
5674      }
5675  
5676      /**
5677       * Closes elements that have implied end tags.
5678       *
5679       * @since 6.4.0
5680       * @since 6.7.0 Full spec support.
5681       *
5682       * @see https://html.spec.whatwg.org/#generate-implied-end-tags
5683       *
5684       * @param string|null $except_for_this_element Perform as if this element doesn't exist in the stack of open elements.
5685       */
5686  	private function generate_implied_end_tags( ?string $except_for_this_element = null ): void {
5687          $elements_with_implied_end_tags = array(
5688              'DD',
5689              'DT',
5690              'LI',
5691              'OPTGROUP',
5692              'OPTION',
5693              'P',
5694              'RB',
5695              'RP',
5696              'RT',
5697              'RTC',
5698          );
5699  
5700          $no_exclusions = ! isset( $except_for_this_element );
5701  
5702          while (
5703              ( $no_exclusions || ! $this->state->stack_of_open_elements->current_node_is( $except_for_this_element ) ) &&
5704              in_array( $this->state->stack_of_open_elements->current_node()->node_name, $elements_with_implied_end_tags, true )
5705          ) {
5706              $this->state->stack_of_open_elements->pop();
5707          }
5708      }
5709  
5710      /**
5711       * Closes elements that have implied end tags, thoroughly.
5712       *
5713       * See the HTML specification for an explanation of why this is
5714       * different from generating end tags in the normal sense.
5715       *
5716       * @since 6.4.0
5717       * @since 6.7.0 Full spec support.
5718       *
5719       * @see WP_HTML_Processor::generate_implied_end_tags
5720       * @see https://html.spec.whatwg.org/#generate-all-implied-end-tags-thoroughly
5721       */
5722  	private function generate_implied_end_tags_thoroughly(): void {
5723          $elements_with_implied_end_tags = array(
5724              'CAPTION',
5725              'COLGROUP',
5726              'DD',
5727              'DT',
5728              'LI',
5729              'OPTGROUP',
5730              'OPTION',
5731              'P',
5732              'RB',
5733              'RP',
5734              'RT',
5735              'RTC',
5736              'TBODY',
5737              'TD',
5738              'TFOOT',
5739              'TH',
5740              'THEAD',
5741              'TR',
5742          );
5743  
5744          while ( in_array( $this->state->stack_of_open_elements->current_node()->node_name, $elements_with_implied_end_tags, true ) ) {
5745              $this->state->stack_of_open_elements->pop();
5746          }
5747      }
5748  
5749      /**
5750       * Returns the adjusted current node.
5751       *
5752       * > The adjusted current node is the context element if the parser was created as
5753       * > part of the HTML fragment parsing algorithm and the stack of open elements
5754       * > has only one element in it (fragment case); otherwise, the adjusted current
5755       * > node is the current node.
5756       *
5757       * @see https://html.spec.whatwg.org/#adjusted-current-node
5758       *
5759       * @since 6.7.0
5760       *
5761       * @return WP_HTML_Token|null The adjusted current node.
5762       */
5763  	private function get_adjusted_current_node(): ?WP_HTML_Token {
5764          if ( isset( $this->context_node ) && 1 === $this->state->stack_of_open_elements->count() ) {
5765              return $this->context_node;
5766          }
5767  
5768          return $this->state->stack_of_open_elements->current_node();
5769      }
5770  
5771      /**
5772       * Reconstructs the active formatting elements.
5773       *
5774       * > This has the effect of reopening all the formatting elements that were opened
5775       * > in the current body, cell, or caption (whichever is youngest) that haven't
5776       * > been explicitly closed.
5777       *
5778       * @since 6.4.0
5779       *
5780       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
5781       *
5782       * @see https://html.spec.whatwg.org/#reconstruct-the-active-formatting-elements
5783       *
5784       * @return bool Whether any formatting elements needed to be reconstructed.
5785       */
5786  	private function reconstruct_active_formatting_elements(): bool {
5787          /*
5788           * > If there are no entries in the list of active formatting elements, then there is nothing
5789           * > to reconstruct; stop this algorithm.
5790           */
5791          if ( 0 === $this->state->active_formatting_elements->count() ) {
5792              return false;
5793          }
5794  
5795          $last_entry = $this->state->active_formatting_elements->current_node();
5796          if (
5797  
5798              /*
5799               * > If the last (most recently added) entry in the list of active formatting elements is a marker;
5800               * > stop this algorithm.
5801               */
5802              'marker' === $last_entry->node_name ||
5803  
5804              /*
5805               * > If the last (most recently added) entry in the list of active formatting elements is an
5806               * > element that is in the stack of open elements, then there is nothing to reconstruct;
5807               * > stop this algorithm.
5808               */
5809              $this->state->stack_of_open_elements->contains_node( $last_entry )
5810          ) {
5811              return false;
5812          }
5813  
5814          $this->bail( 'Cannot reconstruct active formatting elements when advancing and rewinding is required.' );
5815      }
5816  
5817      /**
5818       * Runs the "reset the insertion mode appropriately" algorithm.
5819       *
5820       * @since 6.7.0
5821       *
5822       * @see https://html.spec.whatwg.org/multipage/parsing.html#reset-the-insertion-mode-appropriately
5823       */
5824  	private function reset_insertion_mode_appropriately(): void {
5825          // Set the first node.
5826          $first_node = null;
5827          foreach ( $this->state->stack_of_open_elements->walk_down() as $first_node ) {
5828              break;
5829          }
5830  
5831          /*
5832           * > 1. Let _last_ be false.
5833           */
5834          $last = false;
5835          foreach ( $this->state->stack_of_open_elements->walk_up() as $node ) {
5836              /*
5837               * > 2. Let _node_ be the last node in the stack of open elements.
5838               * > 3. _Loop_: If _node_ is the first node in the stack of open elements, then set _last_
5839               * >            to true, and, if the parser was created as part of the HTML fragment parsing
5840               * >            algorithm (fragment case), set node to the context element passed to
5841               * >            that algorithm.
5842               * > …
5843               */
5844              if ( $node === $first_node ) {
5845                  $last = true;
5846                  if ( isset( $this->context_node ) ) {
5847                      $node = $this->context_node;
5848                  }
5849              }
5850  
5851              // All of the following rules are for matching HTML elements.
5852              if ( 'html' !== $node->namespace ) {
5853                  continue;
5854              }
5855  
5856              switch ( $node->node_name ) {
5857                  /*
5858                   * > 4. If node is a `select` element, run these substeps:
5859                   * >   1. If _last_ is true, jump to the step below labeled done.
5860                   * >   2. Let _ancestor_ be _node_.
5861                   * >   3. _Loop_: If _ancestor_ is the first node in the stack of open elements,
5862                   * >      jump to the step below labeled done.
5863                   * >   4. Let ancestor be the node before ancestor in the stack of open elements.
5864                   * >   …
5865                   * >   7. Jump back to the step labeled _loop_.
5866                   * >   8. _Done_: Switch the insertion mode to "in select" and return.
5867                   */
5868                  case 'SELECT':
5869                      if ( ! $last ) {
5870                          foreach ( $this->state->stack_of_open_elements->walk_up( $node ) as $ancestor ) {
5871                              if ( 'html' !== $ancestor->namespace ) {
5872                                  continue;
5873                              }
5874  
5875                              switch ( $ancestor->node_name ) {
5876                                  /*
5877                                   * > 5. If _ancestor_ is a `template` node, jump to the step below
5878                                   * >    labeled _done_.
5879                                   */
5880                                  case 'TEMPLATE':
5881                                      break 2;
5882  
5883                                  /*
5884                                   * > 6. If _ancestor_ is a `table` node, switch the insertion mode to
5885                                   * >    "in select in table" and return.
5886                                   */
5887                                  case 'TABLE':
5888                                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT_IN_TABLE;
5889                                      return;
5890                              }
5891                          }
5892                      }
5893                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT;
5894                      return;
5895  
5896                  /*
5897                   * > 5. If _node_ is a `td` or `th` element and _last_ is false, then switch the
5898                   * >    insertion mode to "in cell" and return.
5899                   */
5900                  case 'TD':
5901                  case 'TH':
5902                      if ( ! $last ) {
5903                          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_CELL;
5904                          return;
5905                      }
5906                      break;
5907  
5908                  /*
5909                   * > 6. If _node_ is a `tr` element, then switch the insertion mode to "in row"
5910                   * >    and return.
5911                   */
5912                  case 'TR':
5913                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
5914                      return;
5915  
5916                  /*
5917                   * > 7. If _node_ is a `tbody`, `thead`, or `tfoot` element, then switch the
5918                   * >    insertion mode to "in table body" and return.
5919                   */
5920                  case 'TBODY':
5921                  case 'THEAD':
5922                  case 'TFOOT':
5923                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
5924                      return;
5925  
5926                  /*
5927                   * > 8. If _node_ is a `caption` element, then switch the insertion mode to
5928                   * >    "in caption" and return.
5929                   */
5930                  case 'CAPTION':
5931                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_CAPTION;
5932                      return;
5933  
5934                  /*
5935                   * > 9. If _node_ is a `colgroup` element, then switch the insertion mode to
5936                   * >    "in column group" and return.
5937                   */
5938                  case 'COLGROUP':
5939                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP;
5940                      return;
5941  
5942                  /*
5943                   * > 10. If _node_ is a `table` element, then switch the insertion mode to
5944                   * >     "in table" and return.
5945                   */
5946                  case 'TABLE':
5947                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
5948                      return;
5949  
5950                  /*
5951                   * > 11. If _node_ is a `template` element, then switch the insertion mode to the
5952                   * >     current template insertion mode and return.
5953                   */
5954                  case 'TEMPLATE':
5955                      $this->state->insertion_mode = end( $this->state->stack_of_template_insertion_modes );
5956                      return;
5957  
5958                  /*
5959                   * > 12. If _node_ is a `head` element and _last_ is false, then switch the
5960                   * >     insertion mode to "in head" and return.
5961                   */
5962                  case 'HEAD':
5963                      if ( ! $last ) {
5964                          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
5965                          return;
5966                      }
5967                      break;
5968  
5969                  /*
5970                   * > 13. If _node_ is a `body` element, then switch the insertion mode to "in body"
5971                   * >     and return.
5972                   */
5973                  case 'BODY':
5974                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
5975                      return;
5976  
5977                  /*
5978                   * > 14. If _node_ is a `frameset` element, then switch the insertion mode to
5979                   * >     "in frameset" and return. (fragment case)
5980                   */
5981                  case 'FRAMESET':
5982                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_FRAMESET;
5983                      return;
5984  
5985                  /*
5986                   * > 15. If _node_ is an `html` element, run these substeps:
5987                   * >     1. If the head element pointer is null, switch the insertion mode to
5988                   * >        "before head" and return. (fragment case)
5989                   * >     2. Otherwise, the head element pointer is not null, switch the insertion
5990                   * >        mode to "after head" and return.
5991                   */
5992                  case 'HTML':
5993                      $this->state->insertion_mode = isset( $this->state->head_element )
5994                          ? WP_HTML_Processor_State::INSERTION_MODE_AFTER_HEAD
5995                          : WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HEAD;
5996                      return;
5997              }
5998          }
5999  
6000          /*
6001           * > 16. If _last_ is true, then switch the insertion mode to "in body"
6002           * >     and return. (fragment case)
6003           *
6004           * This is only reachable if `$last` is true, as per the fragment parsing case.
6005           */
6006          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
6007      }
6008  
6009      /**
6010       * Runs the adoption agency algorithm.
6011       *
6012       * @since 6.4.0
6013       *
6014       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
6015       *
6016       * @see https://html.spec.whatwg.org/#adoption-agency-algorithm
6017       */
6018  	private function run_adoption_agency_algorithm(): void {
6019          $budget       = 1000;
6020          $subject      = $this->get_tag();
6021          $current_node = $this->state->stack_of_open_elements->current_node();
6022  
6023          if (
6024              // > If the current node is an HTML element whose tag name is subject
6025              $current_node && $subject === $current_node->node_name &&
6026              // > the current node is not in the list of active formatting elements
6027              ! $this->state->active_formatting_elements->contains_node( $current_node )
6028          ) {
6029              $this->state->stack_of_open_elements->pop();
6030              return;
6031          }
6032  
6033          $outer_loop_counter = 0;
6034          while ( $budget-- > 0 ) {
6035              if ( $outer_loop_counter++ >= 8 ) {
6036                  return;
6037              }
6038  
6039              /*
6040               * > Let formatting element be the last element in the list of active formatting elements that:
6041               * >   - is between the end of the list and the last marker in the list,
6042               * >     if any, or the start of the list otherwise,
6043               * >   - and has the tag name subject.
6044               */
6045              $formatting_element = null;
6046              foreach ( $this->state->active_formatting_elements->walk_up() as $item ) {
6047                  if ( 'marker' === $item->node_name ) {
6048                      break;
6049                  }
6050  
6051                  if ( $subject === $item->node_name ) {
6052                      $formatting_element = $item;
6053                      break;
6054                  }
6055              }
6056  
6057              // > If there is no such element, then return and instead act as described in the "any other end tag" entry above.
6058              if ( null === $formatting_element ) {
6059                  $this->bail( 'Cannot run adoption agency when "any other end tag" is required.' );
6060              }
6061  
6062              // > If formatting element is not in the stack of open elements, then this is a parse error; remove the element from the list, and return.
6063              if ( ! $this->state->stack_of_open_elements->contains_node( $formatting_element ) ) {
6064                  $this->state->active_formatting_elements->remove_node( $formatting_element );
6065                  return;
6066              }
6067  
6068              // > If formatting element is in the stack of open elements, but the element is not in scope, then this is a parse error; return.
6069              if ( ! $this->state->stack_of_open_elements->has_element_in_scope( $formatting_element->node_name ) ) {
6070                  return;
6071              }
6072  
6073              /*
6074               * > Let furthest block be the topmost node in the stack of open elements that is lower in the stack
6075               * > than formatting element, and is an element in the special category. There might not be one.
6076               */
6077              $is_above_formatting_element = true;
6078              $furthest_block              = null;
6079              foreach ( $this->state->stack_of_open_elements->walk_down() as $item ) {
6080                  if ( $is_above_formatting_element && $formatting_element->bookmark_name !== $item->bookmark_name ) {
6081                      continue;
6082                  }
6083  
6084                  if ( $is_above_formatting_element ) {
6085                      $is_above_formatting_element = false;
6086                      continue;
6087                  }
6088  
6089                  if ( self::is_special( $item ) ) {
6090                      $furthest_block = $item;
6091                      break;
6092                  }
6093              }
6094  
6095              /*
6096               * > If there is no furthest block, then the UA must first pop all the nodes from the bottom of the
6097               * > stack of open elements, from the current node up to and including formatting element, then
6098               * > remove formatting element from the list of active formatting elements, and finally return.
6099               */
6100              if ( null === $furthest_block ) {
6101                  foreach ( $this->state->stack_of_open_elements->walk_up() as $item ) {
6102                      $this->state->stack_of_open_elements->pop();
6103  
6104                      if ( $formatting_element->bookmark_name === $item->bookmark_name ) {
6105                          $this->state->active_formatting_elements->remove_node( $formatting_element );
6106                          return;
6107                      }
6108                  }
6109              }
6110  
6111              $this->bail( 'Cannot extract common ancestor in adoption agency algorithm.' );
6112          }
6113  
6114          $this->bail( 'Cannot run adoption agency when looping required.' );
6115      }
6116  
6117      /**
6118       * Runs the "close the cell" algorithm.
6119       *
6120       * > Where the steps above say to close the cell, they mean to run the following algorithm:
6121       * >   1. Generate implied end tags.
6122       * >   2. If the current node is not now a td element or a th element, then this is a parse error.
6123       * >   3. Pop elements from the stack of open elements until a td element or a th element has been popped from the stack.
6124       * >   4. Clear the list of active formatting elements up to the last marker.
6125       * >   5. Switch the insertion mode to "in row".
6126       *
6127       * @see https://html.spec.whatwg.org/multipage/parsing.html#close-the-cell
6128       *
6129       * @since 6.7.0
6130       */
6131  	private function close_cell(): void {
6132          $this->generate_implied_end_tags();
6133          // @todo Parse error if the current node is not a "td" or "th" element.
6134          foreach ( $this->state->stack_of_open_elements->walk_up() as $element ) {
6135              $this->state->stack_of_open_elements->pop();
6136              if ( 'TD' === $element->node_name || 'TH' === $element->node_name ) {
6137                  break;
6138              }
6139          }
6140          $this->state->active_formatting_elements->clear_up_to_last_marker();
6141          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
6142      }
6143  
6144      /**
6145       * Inserts an HTML element on the stack of open elements.
6146       *
6147       * @since 6.4.0
6148       *
6149       * @see https://html.spec.whatwg.org/#insert-an-html-element
6150       *
6151       * @param WP_HTML_Token $token Token representing the element in the original input HTML.
6152       */
6153  	private function insert_html_element( WP_HTML_Token $token ): void {
6154          $this->state->stack_of_open_elements->push( $token );
6155      }
6156  
6157      /**
6158       * Inserts a foreign element on to the stack of open elements.
6159       *
6160       * @since 6.7.0
6161       *
6162       * @see https://html.spec.whatwg.org/#insert-a-foreign-element
6163       *
6164       * @param WP_HTML_Token $token                     Insert this token. The token's namespace and
6165       *                                                 insertion point will be updated correctly.
6166       * @param bool          $only_add_to_element_stack Whether to skip the "insert an element at the adjusted
6167       *                                                 insertion location" algorithm when adding this element.
6168       */
6169  	private function insert_foreign_element( WP_HTML_Token $token, bool $only_add_to_element_stack ): void {
6170          $adjusted_current_node = $this->get_adjusted_current_node();
6171  
6172          $token->namespace = $adjusted_current_node ? $adjusted_current_node->namespace : 'html';
6173  
6174          if ( $this->is_mathml_integration_point() ) {
6175              $token->integration_node_type = 'math';
6176          } elseif ( $this->is_html_integration_point() ) {
6177              $token->integration_node_type = 'html';
6178          }
6179  
6180          if ( false === $only_add_to_element_stack ) {
6181              /*
6182               * @todo Implement the "appropriate place for inserting a node" and the
6183               *       "insert an element at the adjusted insertion location" algorithms.
6184               *
6185               * These algorithms mostly impact DOM tree construction and not the HTML API.
6186               * Here, there's no DOM node onto which the element will be appended, so the
6187               * parser will skip this step.
6188               *
6189               * @see https://html.spec.whatwg.org/#insert-an-element-at-the-adjusted-insertion-location
6190               */
6191          }
6192  
6193          $this->insert_html_element( $token );
6194      }
6195  
6196      /**
6197       * Inserts a virtual element on the stack of open elements.
6198       *
6199       * @since 6.7.0
6200       *
6201       * @param string      $token_name    Name of token to create and insert into the stack of open elements.
6202       * @param string|null $bookmark_name Optional. Name to give bookmark for created virtual node.
6203       *                                   Defaults to auto-creating a bookmark name.
6204       * @return WP_HTML_Token Newly-created virtual token.
6205       */
6206  	private function insert_virtual_node( $token_name, $bookmark_name = null ): WP_HTML_Token {
6207          $here = $this->bookmarks[ $this->state->current_token->bookmark_name ];
6208          $name = $bookmark_name ?? $this->bookmark_token();
6209  
6210          $this->bookmarks[ $name ] = new WP_HTML_Span( $here->start, 0 );
6211  
6212          $token = new WP_HTML_Token( $name, $token_name, false );
6213          $this->insert_html_element( $token );
6214          return $token;
6215      }
6216  
6217      /*
6218       * HTML Specification Helpers
6219       */
6220  
6221      /**
6222       * Indicates if the current token is a MathML integration point.
6223       *
6224       * @since 6.7.0
6225       *
6226       * @see https://html.spec.whatwg.org/#mathml-text-integration-point
6227       *
6228       * @return bool Whether the current token is a MathML integration point.
6229       */
6230  	private function is_mathml_integration_point(): bool {
6231          $current_token = $this->state->current_token;
6232          if ( ! isset( $current_token ) ) {
6233              return false;
6234          }
6235  
6236          if ( 'math' !== $current_token->namespace || 'M' !== $current_token->node_name[0] ) {
6237              return false;
6238          }
6239  
6240          $tag_name = $current_token->node_name;
6241  
6242          return (
6243              'MI' === $tag_name ||
6244              'MO' === $tag_name ||
6245              'MN' === $tag_name ||
6246              'MS' === $tag_name ||
6247              'MTEXT' === $tag_name
6248          );
6249      }
6250  
6251      /**
6252       * Indicates if the current token is an HTML integration point.
6253       *
6254       * Note that this method must be an instance method with access
6255       * to the current token, since it needs to examine the attributes
6256       * of the currently-matched tag, if it's in the MathML namespace.
6257       * Otherwise it would be required to scan the HTML and ensure that
6258       * no other accounting is overlooked.
6259       *
6260       * @since 6.7.0
6261       *
6262       * @see https://html.spec.whatwg.org/#html-integration-point
6263       *
6264       * @return bool Whether the current token is an HTML integration point.
6265       */
6266  	private function is_html_integration_point(): bool {
6267          $current_token = $this->state->current_token;
6268          if ( ! isset( $current_token ) ) {
6269              return false;
6270          }
6271  
6272          if ( 'html' === $current_token->namespace ) {
6273              return false;
6274          }
6275  
6276          $tag_name = $current_token->node_name;
6277  
6278          if ( 'svg' === $current_token->namespace ) {
6279              return (
6280                  'DESC' === $tag_name ||
6281                  'FOREIGNOBJECT' === $tag_name ||
6282                  'TITLE' === $tag_name
6283              );
6284          }
6285  
6286          if ( 'math' === $current_token->namespace ) {
6287              if ( 'ANNOTATION-XML' !== $tag_name ) {
6288                  return false;
6289              }
6290  
6291              $encoding = $this->get_attribute( 'encoding' );
6292  
6293              return (
6294                  is_string( $encoding ) &&
6295                  (
6296                      0 === strcasecmp( $encoding, 'application/xhtml+xml' ) ||
6297                      0 === strcasecmp( $encoding, 'text/html' )
6298                  )
6299              );
6300          }
6301  
6302          $this->bail( 'Should not have reached end of HTML Integration Point detection: check HTML API code.' );
6303          // This unnecessary return prevents tools from inaccurately reporting type errors.
6304          return false;
6305      }
6306  
6307      /**
6308       * Returns whether an element of a given name is in the HTML special category.
6309       *
     * @since 6.4.0
     *
     * @see https://html.spec.whatwg.org/#special
     *
     * @param WP_HTML_Token|string $tag_name Node to check, or only its name if in the HTML namespace.
     * @return bool Whether the element of the given name is in the special category.
     */
    public static function is_special( $tag_name ): bool {
        if ( is_string( $tag_name ) ) {
            $tag_name = strtoupper( $tag_name );
        } else {
            $tag_name = 'html' === $tag_name->namespace
                ? strtoupper( $tag_name->node_name )
                : "{$tag_name->namespace} {$tag_name->node_name}";
        }

        return (
            'ADDRESS' === $tag_name ||
            'APPLET' === $tag_name ||
            'AREA' === $tag_name ||
            'ARTICLE' === $tag_name ||
            'ASIDE' === $tag_name ||
            'BASE' === $tag_name ||
            'BASEFONT' === $tag_name ||
            'BGSOUND' === $tag_name ||
            'BLOCKQUOTE' === $tag_name ||
            'BODY' === $tag_name ||
            'BR' === $tag_name ||
            'BUTTON' === $tag_name ||
            'CAPTION' === $tag_name ||
            'CENTER' === $tag_name ||
            'COL' === $tag_name ||
            'COLGROUP' === $tag_name ||
            'DD' === $tag_name ||
            'DETAILS' === $tag_name ||
            'DIR' === $tag_name ||
            'DIV' === $tag_name ||
            'DL' === $tag_name ||
            'DT' === $tag_name ||
            'EMBED' === $tag_name ||
            'FIELDSET' === $tag_name ||
            'FIGCAPTION' === $tag_name ||
            'FIGURE' === $tag_name ||
            'FOOTER' === $tag_name ||
            'FORM' === $tag_name ||
            'FRAME' === $tag_name ||
            'FRAMESET' === $tag_name ||
            'H1' === $tag_name ||
            'H2' === $tag_name ||
            'H3' === $tag_name ||
            'H4' === $tag_name ||
            'H5' === $tag_name ||
            'H6' === $tag_name ||
            'HEAD' === $tag_name ||
            'HEADER' === $tag_name ||
            'HGROUP' === $tag_name ||
            'HR' === $tag_name ||
            'HTML' === $tag_name ||
            'IFRAME' === $tag_name ||
            'IMG' === $tag_name ||
            'INPUT' === $tag_name ||
            'KEYGEN' === $tag_name ||
            'LI' === $tag_name ||
            'LINK' === $tag_name ||
            'LISTING' === $tag_name ||
            'MAIN' === $tag_name ||
            'MARQUEE' === $tag_name ||
            'MENU' === $tag_name ||
            'META' === $tag_name ||
            'NAV' === $tag_name ||
            'NOEMBED' === $tag_name ||
            'NOFRAMES' === $tag_name ||
            'NOSCRIPT' === $tag_name ||
            'OBJECT' === $tag_name ||
            'OL' === $tag_name ||
            'P' === $tag_name ||
            'PARAM' === $tag_name ||
            'PLAINTEXT' === $tag_name ||
            'PRE' === $tag_name ||
            'SCRIPT' === $tag_name ||
            'SEARCH' === $tag_name ||
            'SECTION' === $tag_name ||
            'SELECT' === $tag_name ||
            'SOURCE' === $tag_name ||
            'STYLE' === $tag_name ||
            'SUMMARY' === $tag_name ||
            'TABLE' === $tag_name ||
            'TBODY' === $tag_name ||
            'TD' === $tag_name ||
            'TEMPLATE' === $tag_name ||
            'TEXTAREA' === $tag_name ||
            'TFOOT' === $tag_name ||
            'TH' === $tag_name ||
            'THEAD' === $tag_name ||
            'TITLE' === $tag_name ||
            'TR' === $tag_name ||
            'TRACK' === $tag_name ||
            'UL' === $tag_name ||
            'WBR' === $tag_name ||
            'XMP' === $tag_name ||

            // MathML.
            'math MI' === $tag_name ||
            'math MO' === $tag_name ||
            'math MN' === $tag_name ||
            'math MS' === $tag_name ||
            'math MTEXT' === $tag_name ||
            'math ANNOTATION-XML' === $tag_name ||

            // SVG.
            'svg DESC' === $tag_name ||
            'svg FOREIGNOBJECT' === $tag_name ||
            'svg TITLE' === $tag_name
        );
    }

    /**
     * Returns whether a given element is an HTML Void Element.
     *
     * > area, base, br, col, embed, hr, img, input, link, meta, source, track, wbr
     *
     * In addition to the elements listed above, this method treats the obsolete
     * BASEFONT, BGSOUND, FRAME, KEYGEN, and PARAM elements as void.
     *
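     * Example (a brief sketch; the name is matched case-insensitively):
     *
     *     true  === WP_HTML_Processor::is_void( 'img' );
     *     true  === WP_HTML_Processor::is_void( 'BR' );
     *     true  === WP_HTML_Processor::is_void( 'param' ); // Obsolete but still treated as void.
     *     false === WP_HTML_Processor::is_void( 'div' );
     *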
     * @since 6.4.0
     *
     * @see https://html.spec.whatwg.org/#void-elements
     *
     * @param string $tag_name Name of HTML tag to check.
     * @return bool Whether the given tag is an HTML Void Element.
     */
    public static function is_void( $tag_name ): bool {
        $tag_name = strtoupper( $tag_name );

        return (
            'AREA' === $tag_name ||
            'BASE' === $tag_name ||
            'BASEFONT' === $tag_name || // Obsolete but still treated as void.
            'BGSOUND' === $tag_name || // Obsolete but still treated as void.
            'BR' === $tag_name ||
            'COL' === $tag_name ||
            'EMBED' === $tag_name ||
            'FRAME' === $tag_name || // Obsolete but still treated as void.
            'HR' === $tag_name ||
            'IMG' === $tag_name ||
            'INPUT' === $tag_name ||
            'KEYGEN' === $tag_name || // Obsolete but still treated as void.
            'LINK' === $tag_name ||
            'META' === $tag_name ||
            'PARAM' === $tag_name || // Obsolete but still treated as void.
            'SOURCE' === $tag_name ||
            'TRACK' === $tag_name ||
            'WBR' === $tag_name
        );
    }

    /**
     * Gets an encoding from a given string.
     *
     * This is an algorithm defined in the WHATWG Encoding specification.
     *
     * Example:
     *
     *     'UTF-8' === self::get_encoding( 'utf8' );
     *     'UTF-8' === self::get_encoding( "  \tUTF-8 " );
     *     null    === self::get_encoding( 'UTF-7' );
     *     null    === self::get_encoding( 'utf8; charset=' );
     *
     * @see https://encoding.spec.whatwg.org/#concept-encoding-get
     *
     * @todo As this parser only supports UTF-8, only the UTF-8
     *       encodings are detected. Add more as desired, but the
     *       parser will bail on non-UTF-8 encodings.
     *
     * @since 6.7.0
     *
     * @param string $label A string which may specify a known encoding.
     * @return string|null Known encoding if matched, otherwise null.
     */
    protected static function get_encoding( string $label ): ?string {
        /*
         * > Remove any leading and trailing ASCII whitespace from label.
         */
        $label = trim( $label, " \t\f\r\n" );

        /*
         * > If label is an ASCII case-insensitive match for any of the labels listed in the
         * > table below, then return the corresponding encoding; otherwise return failure.
         */
        switch ( strtolower( $label ) ) {
            case 'unicode-1-1-utf-8':
            case 'unicode11utf8':
            case 'unicode20utf8':
            case 'utf-8':
            case 'utf8':
            case 'x-unicode20utf8':
                return 'UTF-8';

            default:
                return null;
        }
    }

    /*
     * Constants that would pollute the top of the class if they were found there.
     */

    /**
     * Indicates that the next HTML token should be parsed and processed.
     *
     * @since 6.4.0
     *
     * @var string
     */
    const PROCESS_NEXT_NODE = 'process-next-node';

    /**
     * Indicates that the current HTML token should be reprocessed in the newly-selected insertion mode.
     *
     * @since 6.4.0
     *
     * @var string
     */
    const REPROCESS_CURRENT_NODE = 'reprocess-current-node';

    /**
     * Indicates that the current HTML token should be processed without advancing the parser.
     *
     * @since 6.5.0
     *
     * @var string
     */
    const PROCESS_CURRENT_NODE = 'process-current-node';

    /**
     * Indicates that the parser encountered unsupported markup and has bailed.
     *
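     * Example (a minimal sketch, assuming `get_last_error()` reports this
     * value once the parser has bailed and that `$html` holds the input):
     *
     *     $processor = WP_HTML_Processor::create_fragment( $html );
     *     while ( null !== $processor && $processor->next_tag() ) {
     *         // Inspect or modify each tag here.
     *     }
     *     if ( null !== $processor && WP_HTML_Processor::ERROR_UNSUPPORTED === $processor->get_last_error() ) {
     *         // Parsing stopped early; the rest of the document was not visited.
     *     }
     *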
     * @since 6.4.0
     *
     * @var string
     */
    const ERROR_UNSUPPORTED = 'unsupported';

    /**
     * Indicates that the parser encountered more HTML tokens than it
     * was able to process and has bailed.
     *
     * @since 6.4.0
     *
     * @var string
     */
    const ERROR_EXCEEDED_MAX_BOOKMARKS = 'exceeded-max-bookmarks';

    /**
     * Unlock code that must be passed into the constructor to create this class.
     *
     * This class extends the WP_HTML_Tag_Processor, which has a public class
     * constructor. Therefore, it's not possible to have a private constructor here.
     *
     * This unlock code is used to ensure that anyone calling the constructor is
     * doing so with a full understanding that it's intended to be a private API.
     *
     * @access private
     */
    const CONSTRUCTOR_UNLOCK_CODE = 'Use WP_HTML_Processor::create_fragment() instead of calling the class constructor directly.';
}


Generated : Sat Nov 23 08:20:01 2024 Cross-referenced by PHPXref