[ Index ]

PHP Cross Reference of WordPress Trunk (Updated Daily)

Search

title

Body

[close]

/wp-includes/html-api/ -> class-wp-html-processor.php (source)

   1  <?php
   2  /**
   3   * HTML API: WP_HTML_Processor class
   4   *
   5   * @package WordPress
   6   * @subpackage HTML-API
   7   * @since 6.4.0
   8   */
   9  
  10  /**
  11   * Core class used to safely parse and modify an HTML document.
  12   *
  13   * The HTML Processor class properly parses and modifies HTML5 documents.
  14   *
  15   * It supports a subset of the HTML5 specification, and when it encounters
  16   * unsupported markup, it aborts early to avoid unintentionally breaking
  17   * the document. The HTML Processor should never break an HTML document.
  18   *
  19   * While the `WP_HTML_Tag_Processor` is a valuable tool for modifying
  20   * attributes on individual HTML tags, the HTML Processor is more capable
  21   * and useful for the following operations:
  22   *
  23   *  - Querying based on nested HTML structure.
  24   *
  25   * Eventually the HTML Processor will also support:
  26   *  - Wrapping a tag in surrounding HTML.
  27   *  - Unwrapping a tag by removing its parent.
  28   *  - Inserting and removing nodes.
  29   *  - Reading and changing inner content.
  30   *  - Navigating up or around HTML structure.
  31   *
  32   * ## Usage
  33   *
  34   * Use of this class requires three steps:
  35   *
  36   *   1. Call a static creator method with your input HTML document.
  37   *   2. Find the location in the document you are looking for.
  38   *   3. Request changes to the document at that location.
  39   *
  40   * Example:
  41   *
  42   *     $processor = WP_HTML_Processor::create_fragment( $html );
  43   *     if ( $processor->next_tag( array( 'breadcrumbs' => array( 'DIV', 'FIGURE', 'IMG' ) ) ) ) {
  44   *         $processor->add_class( 'responsive-image' );
  45   *     }
  46   *
  47   * #### Breadcrumbs
  48   *
  49   * Breadcrumbs represent the stack of open elements from the root
  50   * of the document or fragment down to the currently-matched node,
  51   * if one is currently selected. Call WP_HTML_Processor::get_breadcrumbs()
  52   * to inspect the breadcrumbs for a matched tag.
  53   *
  54   * Breadcrumbs can specify nested HTML structure and are equivalent
  55   * to a CSS selector comprising tag names separated by the child
  56   * combinator, such as "DIV > FIGURE > IMG".
  57   *
  58   * Since all elements find themselves inside a full HTML document
  59   * when parsed, the return value from `get_breadcrumbs()` will always
  60   * contain any implicit outermost elements. For example, when parsing
  61   * with `create_fragment()` in the `BODY` context (the default), any
  62   * tag in the given HTML document will contain `array( 'HTML', 'BODY', … )`
  63   * in its breadcrumbs.
  64   *
  65   * Despite containing the implied outermost elements in their breadcrumbs,
  66   * tags may be found with the shortest-matching breadcrumb query. That is,
  67   * `array( 'IMG' )` matches all IMG elements and `array( 'P', 'IMG' )`
  68   * matches all IMG elements directly inside a P element. To ensure that no
  69   * partial matches erroneously match it's possible to specify in a query
  70   * the full breadcrumb match all the way down from the root HTML element.
  71   *
  72   * Example:
  73   *
  74   *     $html = '<figure><img><figcaption>A <em>lovely</em> day outside</figcaption></figure>';
  75   *     //               ----- Matches here.
  76   *     $processor->next_tag( array( 'breadcrumbs' => array( 'FIGURE', 'IMG' ) ) );
  77   *
  78   *     $html = '<figure><img><figcaption>A <em>lovely</em> day outside</figcaption></figure>';
  79   *     //                                  ---- Matches here.
  80   *     $processor->next_tag( array( 'breadcrumbs' => array( 'FIGURE', 'FIGCAPTION', 'EM' ) ) );
  81   *
  82   *     $html = '<div><img></div><img>';
  83   *     //                       ----- Matches here, because IMG must be a direct child of the implicit BODY.
  84   *     $processor->next_tag( array( 'breadcrumbs' => array( 'BODY', 'IMG' ) ) );
  85   *
  86   * ## HTML Support
  87   *
  88   * This class implements a small part of the HTML5 specification.
  89   * It's designed to operate within its support and abort early whenever
  90   * encountering circumstances it can't properly handle. This is
  91   * the principle way in which this class remains as simple as possible
  92   * without cutting corners and breaking compliance.
  93   *
  94   * ### Supported elements
  95   *
  96   * If any unsupported element appears in the HTML input the HTML Processor
  97   * will abort early and stop all processing. This draconian measure ensures
  98   * that the HTML Processor won't break any HTML it doesn't fully understand.
  99   *
 100   * The HTML Processor supports all elements other than a specific set:
 101   *
 102   *  - Any element inside a TABLE.
 103   *  - Any element inside foreign content, including SVG and MATH.
 104   *  - Any element outside the IN BODY insertion mode, e.g. doctype declarations, meta, links.
 105   *
 106   * ### Supported markup
 107   *
 108   * Some kinds of non-normative HTML involve reconstruction of formatting elements and
 109   * re-parenting of mis-nested elements. For example, a DIV tag found inside a TABLE
 110   * may in fact belong _before_ the table in the DOM. If the HTML Processor encounters
 111   * such a case it will stop processing.
 112   *
 113   * The following list illustrates some common examples of unexpected HTML inputs that
 114   * the HTML Processor properly parses and represents:
 115   *
 116   *  - HTML with optional tags omitted, e.g. `<p>one<p>two`.
 117   *  - HTML with unexpected tag closers, e.g. `<p>one </span> more</p>`.
 118   *  - Non-void tags with self-closing flag, e.g. `<div/>the DIV is still open.</div>`.
 119   *  - Heading elements which close open heading elements of another level, e.g. `<h1>Closed by </h2>`.
 120   *  - Elements containing text that looks like other tags but isn't, e.g. `<title>The <img> is plaintext</title>`.
 121   *  - SCRIPT and STYLE tags containing text that looks like HTML but isn't, e.g. `<script>document.write('<p>Hi</p>');</script>`.
 122   *  - SCRIPT content which has been escaped, e.g. `<script><!-- document.write('<script>console.log("hi")</script>') --></script>`.
 123   *
 124   * ### Unsupported Features
 125   *
 126   * This parser does not report parse errors.
 127   *
 128   * Normally, when additional HTML or BODY tags are encountered in a document, if there
 129   * are any additional attributes on them that aren't found on the previous elements,
 130   * the existing HTML and BODY elements adopt those missing attribute values. This
 131   * parser does not add those additional attributes.
 132   *
 133   * In certain situations, elements are moved to a different part of the document in
 134   * a process called "adoption" and "fostering." Because the nodes move to a location
 135   * in the document that the parser had already processed, this parser does not support
 136   * these situations and will bail.
 137   *
 138   * @since 6.4.0
 139   *
 140   * @see WP_HTML_Tag_Processor
 141   * @see https://html.spec.whatwg.org/
 142   */
 143  class WP_HTML_Processor extends WP_HTML_Tag_Processor {
 144      /**
 145       * The maximum number of bookmarks allowed to exist at any given time.
 146       *
 147       * HTML processing requires more bookmarks than basic tag processing,
 148       * so this class constant from the Tag Processor is overwritten.
 149       *
 150       * @since 6.4.0
 151       *
 152       * @var int
 153       */
 154      const MAX_BOOKMARKS = 100;
 155  
 156      /**
 157       * Holds the working state of the parser, including the stack of
 158       * open elements and the stack of active formatting elements.
 159       *
 160       * Initialized in the constructor.
 161       *
 162       * @since 6.4.0
 163       *
 164       * @var WP_HTML_Processor_State
 165       */
 166      private $state;
 167  
 168      /**
 169       * Used to create unique bookmark names.
 170       *
 171       * This class sets a bookmark for every tag in the HTML document that it encounters.
 172       * The bookmark name is auto-generated and increments, starting with `1`. These are
 173       * internal bookmarks and are automatically released when the referring WP_HTML_Token
 174       * goes out of scope and is garbage-collected.
 175       *
 176       * @since 6.4.0
 177       *
 178       * @see WP_HTML_Processor::$release_internal_bookmark_on_destruct
 179       *
 180       * @var int
 181       */
 182      private $bookmark_counter = 0;
 183  
 184      /**
 185       * Stores an explanation for why something failed, if it did.
 186       *
 187       * @see self::get_last_error
 188       *
 189       * @since 6.4.0
 190       *
 191       * @var string|null
 192       */
 193      private $last_error = null;
 194  
 195      /**
 196       * Stores context for why the parser bailed on unsupported HTML, if it did.
 197       *
 198       * @see self::get_unsupported_exception
 199       *
 200       * @since 6.7.0
 201       *
 202       * @var WP_HTML_Unsupported_Exception|null
 203       */
 204      private $unsupported_exception = null;
 205  
 206      /**
 207       * Releases a bookmark when PHP garbage-collects its wrapping WP_HTML_Token instance.
 208       *
 209       * This function is created inside the class constructor so that it can be passed to
 210       * the stack of open elements and the stack of active formatting elements without
 211       * exposing it as a public method on the class.
 212       *
 213       * @since 6.4.0
 214       *
 215       * @var Closure|null
 216       */
 217      private $release_internal_bookmark_on_destruct = null;
 218  
 219      /**
 220       * Stores stack events which arise during parsing of the
 221       * HTML document, which will then supply the "match" events.
 222       *
 223       * @since 6.6.0
 224       *
 225       * @var WP_HTML_Stack_Event[]
 226       */
 227      private $element_queue = array();
 228  
 229      /**
 230       * Stores the current breadcrumbs.
 231       *
 232       * @since 6.7.0
 233       *
 234       * @var string[]
 235       */
 236      private $breadcrumbs = array();
 237  
 238      /**
 239       * Current stack event, if set, representing a matched token.
 240       *
 241       * Because the parser may internally point to a place further along in a document
 242       * than the nodes which have already been processed (some "virtual" nodes may have
 243       * appeared while scanning the HTML document), this will point at the "current" node
 244       * being processed. It comes from the front of the element queue.
 245       *
 246       * @since 6.6.0
 247       *
 248       * @var WP_HTML_Stack_Event|null
 249       */
 250      private $current_element = null;
 251  
 252      /**
 253       * Context node if created as a fragment parser.
 254       *
 255       * @var WP_HTML_Token|null
 256       */
 257      private $context_node = null;
 258  
 259      /*
 260       * Public Interface Functions
 261       */
 262  
 263      /**
 264       * Creates an HTML processor in the fragment parsing mode.
 265       *
 266       * Use this for cases where you are processing chunks of HTML that
 267       * will be found within a bigger HTML document, such as rendered
 268       * block output that exists within a post, `the_content` inside a
 269       * rendered site layout.
 270       *
 271       * Fragment parsing occurs within a context, which is an HTML element
 272       * that the document will eventually be placed in. It becomes important
 273       * when special elements have different rules than others, such as inside
 274       * a TEXTAREA or a TITLE tag where things that look like tags are text,
 275       * or inside a SCRIPT tag where things that look like HTML syntax are JS.
 276       *
 277       * The context value should be a representation of the tag into which the
 278       * HTML is found. For most cases this will be the body element. The HTML
 279       * form is provided because a context element may have attributes that
 280       * impact the parse, such as with a SCRIPT tag and its `type` attribute.
 281       *
 282       * ## Current HTML Support
 283       *
 284       *  - The only supported context is `<body>`, which is the default value.
 285       *  - The only supported document encoding is `UTF-8`, which is the default value.
 286       *
 287       * @since 6.4.0
 288       * @since 6.6.0 Returns `static` instead of `self` so it can create subclass instances.
 289       *
 290       * @param string $html     Input HTML fragment to process.
 291       * @param string $context  Context element for the fragment, must be default of `<body>`.
 292       * @param string $encoding Text encoding of the document; must be default of 'UTF-8'.
 293       * @return static|null The created processor if successful, otherwise null.
 294       */
 295  	public static function create_fragment( $html, $context = '<body>', $encoding = 'UTF-8' ) {
 296          if ( '<body>' !== $context || 'UTF-8' !== $encoding ) {
 297              return null;
 298          }
 299  
 300          if ( ! is_string( $html ) ) {
 301              _doing_it_wrong(
 302                  __METHOD__,
 303                  __( 'The HTML parameter must be a string.' ),
 304                  '6.9.0'
 305              );
 306              return null;
 307          }
 308  
 309          $context_processor = static::create_full_parser( "<!DOCTYPE html>{$context}", $encoding );
 310          if ( null === $context_processor ) {
 311              return null;
 312          }
 313  
 314          while ( $context_processor->next_tag() ) {
 315              if ( ! $context_processor->is_virtual() ) {
 316                  $context_processor->set_bookmark( 'final_node' );
 317              }
 318          }
 319  
 320          if (
 321              ! $context_processor->has_bookmark( 'final_node' ) ||
 322              ! $context_processor->seek( 'final_node' )
 323          ) {
 324              _doing_it_wrong( __METHOD__, __( 'No valid context element was detected.' ), '6.8.0' );
 325              return null;
 326          }
 327  
 328          return $context_processor->create_fragment_at_current_node( $html );
 329      }
 330  
 331      /**
 332       * Creates an HTML processor in the full parsing mode.
 333       *
 334       * It's likely that a fragment parser is more appropriate, unless sending an
 335       * entire HTML document from start to finish. Consider a fragment parser with
 336       * a context node of `<body>`.
 337       *
 338       * UTF-8 is the only allowed encoding. If working with a document that
 339       * isn't UTF-8, first convert the document to UTF-8, then pass in the
 340       * converted HTML.
 341       *
 342       * @param string      $html                    Input HTML document to process.
 343       * @param string|null $known_definite_encoding Optional. If provided, specifies the charset used
 344       *                                             in the input byte stream. Currently must be UTF-8.
 345       * @return static|null The created processor if successful, otherwise null.
 346       */
 347  	public static function create_full_parser( $html, $known_definite_encoding = 'UTF-8' ) {
 348          if ( 'UTF-8' !== $known_definite_encoding ) {
 349              return null;
 350          }
 351          if ( ! is_string( $html ) ) {
 352              _doing_it_wrong(
 353                  __METHOD__,
 354                  __( 'The HTML parameter must be a string.' ),
 355                  '6.9.0'
 356              );
 357              return null;
 358          }
 359  
 360          $processor                             = new static( $html, self::CONSTRUCTOR_UNLOCK_CODE );
 361          $processor->state->encoding            = $known_definite_encoding;
 362          $processor->state->encoding_confidence = 'certain';
 363  
 364          return $processor;
 365      }
 366  
 367      /**
 368       * Constructor.
 369       *
 370       * Do not use this method. Use the static creator methods instead.
 371       *
 372       * @access private
 373       *
 374       * @since 6.4.0
 375       *
 376       * @see WP_HTML_Processor::create_fragment()
 377       *
 378       * @param string      $html                                  HTML to process.
 379       * @param string|null $use_the_static_create_methods_instead This constructor should not be called manually.
 380       */
 381  	public function __construct( $html, $use_the_static_create_methods_instead = null ) {
 382          parent::__construct( $html );
 383  
 384          if ( self::CONSTRUCTOR_UNLOCK_CODE !== $use_the_static_create_methods_instead ) {
 385              _doing_it_wrong(
 386                  __METHOD__,
 387                  sprintf(
 388                      /* translators: %s: WP_HTML_Processor::create_fragment(). */
 389                      __( 'Call %s to create an HTML Processor instead of calling the constructor directly.' ),
 390                      '<code>WP_HTML_Processor::create_fragment()</code>'
 391                  ),
 392                  '6.4.0'
 393              );
 394          }
 395  
 396          $this->state = new WP_HTML_Processor_State();
 397  
 398          $this->state->stack_of_open_elements->set_push_handler(
 399              function ( WP_HTML_Token $token ): void {
 400                  $is_virtual            = ! isset( $this->state->current_token ) || $this->is_tag_closer();
 401                  $same_node             = isset( $this->state->current_token ) && $token->node_name === $this->state->current_token->node_name;
 402                  $provenance            = ( ! $same_node || $is_virtual ) ? 'virtual' : 'real';
 403                  $this->element_queue[] = new WP_HTML_Stack_Event( $token, WP_HTML_Stack_Event::PUSH, $provenance );
 404  
 405                  $this->change_parsing_namespace( $token->integration_node_type ? 'html' : $token->namespace );
 406              }
 407          );
 408  
 409          $this->state->stack_of_open_elements->set_pop_handler(
 410              function ( WP_HTML_Token $token ): void {
 411                  $is_virtual            = ! isset( $this->state->current_token ) || ! $this->is_tag_closer();
 412                  $same_node             = isset( $this->state->current_token ) && $token->node_name === $this->state->current_token->node_name;
 413                  $provenance            = ( ! $same_node || $is_virtual ) ? 'virtual' : 'real';
 414                  $this->element_queue[] = new WP_HTML_Stack_Event( $token, WP_HTML_Stack_Event::POP, $provenance );
 415  
 416                  $adjusted_current_node = $this->get_adjusted_current_node();
 417  
 418                  if ( $adjusted_current_node ) {
 419                      $this->change_parsing_namespace( $adjusted_current_node->integration_node_type ? 'html' : $adjusted_current_node->namespace );
 420                  } else {
 421                      $this->change_parsing_namespace( 'html' );
 422                  }
 423              }
 424          );
 425  
 426          /*
 427           * Create this wrapper so that it's possible to pass
 428           * a private method into WP_HTML_Token classes without
 429           * exposing it to any public API.
 430           */
 431          $this->release_internal_bookmark_on_destruct = function ( string $name ): void {
 432              parent::release_bookmark( $name );
 433          };
 434      }
 435  
 436      /**
 437       * Creates a fragment processor at the current node.
 438       *
 439       * HTML Fragment parsing always happens with a context node. HTML Fragment Processors can be
 440       * instantiated with a `BODY` context node via `WP_HTML_Processor::create_fragment( $html )`.
 441       *
 442       * The context node may impact how a fragment of HTML is parsed. For example, consider the HTML
 443       * fragment `<td />Inside TD?</td>`.
 444       *
 445       * A BODY context node will produce the following tree:
 446       *
 447       *     └─#text Inside TD?
 448       *
 449       * Notice that the `<td>` tags are completely ignored.
 450       *
 451       * Compare that with an SVG context node that produces the following tree:
 452       *
 453       *     ├─svg:td
 454       *     └─#text Inside TD?
 455       *
 456       * Here, a `td` node in the `svg` namespace is created, and its self-closing flag is respected.
 457       * This is a peculiarity of parsing HTML in foreign content like SVG.
 458       *
 459       * Finally, consider the tree produced with a TABLE context node:
 460       *
 461       *     └─TBODY
 462       *       └─TR
 463       *         └─TD
 464       *           └─#text Inside TD?
 465       *
 466       * These examples demonstrate how important the context node may be when processing an HTML
 467       * fragment. Special care must be taken when processing fragments that are expected to appear
 468       * in specific contexts. SVG and TABLE are good examples, but there are others.
 469       *
 470       * @see https://html.spec.whatwg.org/multipage/parsing.html#html-fragment-parsing-algorithm
 471       *
 472       * @since 6.8.0
 473       *
 474       * @param string $html Input HTML fragment to process.
 475       * @return static|null The created processor if successful, otherwise null.
 476       */
 477  	private function create_fragment_at_current_node( string $html ) {
 478          if ( $this->get_token_type() !== '#tag' || $this->is_tag_closer() ) {
 479              _doing_it_wrong(
 480                  __METHOD__,
 481                  __( 'The context element must be a start tag.' ),
 482                  '6.8.0'
 483              );
 484              return null;
 485          }
 486  
 487          $tag_name  = $this->current_element->token->node_name;
 488          $namespace = $this->current_element->token->namespace;
 489  
 490          if ( 'html' === $namespace && self::is_void( $tag_name ) ) {
 491              _doing_it_wrong(
 492                  __METHOD__,
 493                  sprintf(
 494                      // translators: %s: A tag name like INPUT or BR.
 495                      __( 'The context element cannot be a void element, found "%s".' ),
 496                      $tag_name
 497                  ),
 498                  '6.8.0'
 499              );
 500              return null;
 501          }
 502  
 503          /*
 504           * Prevent creating fragments at nodes that require a special tokenizer state.
 505           * This is unsupported by the HTML Processor.
 506           */
 507          if (
 508              'html' === $namespace &&
 509              in_array( $tag_name, array( 'IFRAME', 'NOEMBED', 'NOFRAMES', 'SCRIPT', 'STYLE', 'TEXTAREA', 'TITLE', 'XMP', 'PLAINTEXT' ), true )
 510          ) {
 511              _doing_it_wrong(
 512                  __METHOD__,
 513                  sprintf(
 514                      // translators: %s: A tag name like IFRAME or TEXTAREA.
 515                      __( 'The context element "%s" is not supported.' ),
 516                      $tag_name
 517                  ),
 518                  '6.8.0'
 519              );
 520              return null;
 521          }
 522  
 523          $fragment_processor = new static( $html, self::CONSTRUCTOR_UNLOCK_CODE );
 524  
 525          $fragment_processor->compat_mode = $this->compat_mode;
 526  
 527          // @todo Create "fake" bookmarks for non-existent but implied nodes.
 528          $fragment_processor->bookmarks['root-node'] = new WP_HTML_Span( 0, 0 );
 529          $root_node                                  = new WP_HTML_Token(
 530              'root-node',
 531              'HTML',
 532              false
 533          );
 534          $fragment_processor->state->stack_of_open_elements->push( $root_node );
 535  
 536          $fragment_processor->bookmarks['context-node']   = new WP_HTML_Span( 0, 0 );
 537          $fragment_processor->context_node                = clone $this->current_element->token;
 538          $fragment_processor->context_node->bookmark_name = 'context-node';
 539          $fragment_processor->context_node->on_destroy    = null;
 540  
 541          $fragment_processor->breadcrumbs = array( 'HTML', $fragment_processor->context_node->node_name );
 542  
 543          if ( 'TEMPLATE' === $fragment_processor->context_node->node_name ) {
 544              $fragment_processor->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE;
 545          }
 546  
 547          $fragment_processor->reset_insertion_mode_appropriately();
 548  
 549          /*
 550           * > Set the parser's form element pointer to the nearest node to the context element that
 551           * > is a form element (going straight up the ancestor chain, and including the element
 552           * > itself, if it is a form element), if any. (If there is no such form element, the
 553           * > form element pointer keeps its initial value, null.)
 554           */
 555          foreach ( $this->state->stack_of_open_elements->walk_up() as $element ) {
 556              if ( 'FORM' === $element->node_name && 'html' === $element->namespace ) {
 557                  $fragment_processor->state->form_element                = clone $element;
 558                  $fragment_processor->state->form_element->bookmark_name = null;
 559                  $fragment_processor->state->form_element->on_destroy    = null;
 560                  break;
 561              }
 562          }
 563  
 564          $fragment_processor->state->encoding_confidence = 'irrelevant';
 565  
 566          /*
 567           * Update the parsing namespace near the end of the process.
 568           * This is important so that any push/pop from the stack of open
 569           * elements does not change the parsing namespace.
 570           */
 571          $fragment_processor->change_parsing_namespace(
 572              $this->current_element->token->integration_node_type ? 'html' : $namespace
 573          );
 574  
 575          return $fragment_processor;
 576      }
 577  
 578      /**
 579       * Stops the parser and terminates its execution when encountering unsupported markup.
 580       *
 581       * @throws WP_HTML_Unsupported_Exception Halts execution of the parser.
 582       *
 583       * @since 6.7.0
 584       *
 585       * @param string $message Explains support is missing in order to parse the current node.
 586       */
 587  	private function bail( string $message ) {
 588          $here  = $this->bookmarks[ $this->state->current_token->bookmark_name ];
 589          $token = substr( $this->html, $here->start, $here->length );
 590  
 591          $open_elements = array();
 592          foreach ( $this->state->stack_of_open_elements->stack as $item ) {
 593              $open_elements[] = $item->node_name;
 594          }
 595  
 596          $active_formats = array();
 597          foreach ( $this->state->active_formatting_elements->walk_down() as $item ) {
 598              $active_formats[] = $item->node_name;
 599          }
 600  
 601          $this->last_error = self::ERROR_UNSUPPORTED;
 602  
 603          $this->unsupported_exception = new WP_HTML_Unsupported_Exception(
 604              $message,
 605              $this->state->current_token->node_name,
 606              $here->start,
 607              $token,
 608              $open_elements,
 609              $active_formats
 610          );
 611  
 612          throw $this->unsupported_exception;
 613      }
 614  
 615      /**
 616       * Returns the last error, if any.
 617       *
 618       * Various situations lead to parsing failure but this class will
 619       * return `false` in all those cases. To determine why something
 620       * failed it's possible to request the last error. This can be
 621       * helpful to know to distinguish whether a given tag couldn't
 622       * be found or if content in the document caused the processor
 623       * to give up and abort processing.
 624       *
 625       * Example
 626       *
 627       *     $processor = WP_HTML_Processor::create_fragment( '<template><strong><button><em><p><em>' );
 628       *     false === $processor->next_tag();
 629       *     WP_HTML_Processor::ERROR_UNSUPPORTED === $processor->get_last_error();
 630       *
 631       * @since 6.4.0
 632       *
 633       * @see self::ERROR_UNSUPPORTED
 634       * @see self::ERROR_EXCEEDED_MAX_BOOKMARKS
 635       *
 636       * @return string|null The last error, if one exists, otherwise null.
 637       */
 638  	public function get_last_error(): ?string {
 639          return $this->last_error;
 640      }
 641  
 642      /**
 643       * Returns context for why the parser aborted due to unsupported HTML, if it did.
 644       *
 645       * This is meant for debugging purposes, not for production use.
 646       *
 647       * @since 6.7.0
 648       *
 649       * @see self::$unsupported_exception
 650       *
 651       * @return WP_HTML_Unsupported_Exception|null
 652       */
 653  	public function get_unsupported_exception() {
 654          return $this->unsupported_exception;
 655      }
 656  
 657      /**
 658       * Finds the next tag matching the $query.
 659       *
 660       * @todo Support matching the class name and tag name.
 661       *
 662       * @since 6.4.0
 663       * @since 6.6.0 Visits all tokens, including virtual ones.
 664       *
 665       * @throws Exception When unable to allocate a bookmark for the next token in the input HTML document.
 666       *
 667       * @param array|string|null $query {
 668       *     Optional. Which tag name to find, having which class, etc. Default is to find any tag.
 669       *
 670       *     @type string|null $tag_name     Which tag to find, or `null` for "any tag."
 671       *     @type string      $tag_closers  'visit' to pause at tag closers, 'skip' or unset to only visit openers.
 672       *     @type int|null    $match_offset Find the Nth tag matching all search criteria.
 673       *                                     1 for "first" tag, 3 for "third," etc.
 674       *                                     Defaults to first tag.
 675       *     @type string|null $class_name   Tag must contain this whole class name to match.
 676       *     @type string[]    $breadcrumbs  DOM sub-path at which element is found, e.g. `array( 'FIGURE', 'IMG' )`.
 677       *                                     May also contain the wildcard `*` which matches a single element, e.g. `array( 'SECTION', '*' )`.
 678       * }
 679       * @return bool Whether a tag was matched.
 680       */
 681  	public function next_tag( $query = null ): bool {
 682          $visit_closers = isset( $query['tag_closers'] ) && 'visit' === $query['tag_closers'];
 683  
 684          if ( null === $query ) {
 685              while ( $this->next_token() ) {
 686                  if ( '#tag' !== $this->get_token_type() ) {
 687                      continue;
 688                  }
 689  
 690                  if ( ! $this->is_tag_closer() || $visit_closers ) {
 691                      return true;
 692                  }
 693              }
 694  
 695              return false;
 696          }
 697  
 698          if ( is_string( $query ) ) {
 699              $query = array( 'breadcrumbs' => array( $query ) );
 700          }
 701  
 702          if ( ! is_array( $query ) ) {
 703              _doing_it_wrong(
 704                  __METHOD__,
 705                  __( 'Please pass a query array to this function.' ),
 706                  '6.4.0'
 707              );
 708              return false;
 709          }
 710  
 711          if ( isset( $query['tag_name'] ) ) {
 712              $query['tag_name'] = strtoupper( $query['tag_name'] );
 713          }
 714  
 715          $needs_class = ( isset( $query['class_name'] ) && is_string( $query['class_name'] ) )
 716              ? $query['class_name']
 717              : null;
 718  
 719          if ( ! ( array_key_exists( 'breadcrumbs', $query ) && is_array( $query['breadcrumbs'] ) ) ) {
 720              while ( $this->next_token() ) {
 721                  if ( '#tag' !== $this->get_token_type() ) {
 722                      continue;
 723                  }
 724  
 725                  if ( isset( $query['tag_name'] ) && $query['tag_name'] !== $this->get_token_name() ) {
 726                      continue;
 727                  }
 728  
 729                  if ( isset( $needs_class ) && ! $this->has_class( $needs_class ) ) {
 730                      continue;
 731                  }
 732  
 733                  if ( ! $this->is_tag_closer() || $visit_closers ) {
 734                      return true;
 735                  }
 736              }
 737  
 738              return false;
 739          }
 740  
 741          $breadcrumbs  = $query['breadcrumbs'];
 742          $match_offset = isset( $query['match_offset'] ) ? (int) $query['match_offset'] : 1;
 743  
 744          while ( $match_offset > 0 && $this->next_token() ) {
 745              if ( '#tag' !== $this->get_token_type() || $this->is_tag_closer() ) {
 746                  continue;
 747              }
 748  
 749              if ( isset( $needs_class ) && ! $this->has_class( $needs_class ) ) {
 750                  continue;
 751              }
 752  
 753              if ( $this->matches_breadcrumbs( $breadcrumbs ) && 0 === --$match_offset ) {
 754                  return true;
 755              }
 756          }
 757  
 758          return false;
 759      }
 760  
 761      /**
 762       * Finds the next token in the HTML document.
 763       *
 764       * This doesn't currently have a way to represent non-tags and doesn't process
 765       * semantic rules for text nodes. For access to the raw tokens consider using
 766       * WP_HTML_Tag_Processor instead.
 767       *
 768       * @since 6.5.0 Added for internal support; do not use.
 769       * @since 6.7.2 Refactored so subclasses may extend.
 770       *
 771       * @return bool Whether a token was parsed.
 772       */
 773  	public function next_token(): bool {
 774          return $this->next_visitable_token();
 775      }
 776  
 777      /**
 778       * Ensures internal accounting is maintained for HTML semantic rules while
 779       * the underlying Tag Processor class is seeking to a bookmark.
 780       *
 781       * This doesn't currently have a way to represent non-tags and doesn't process
 782       * semantic rules for text nodes. For access to the raw tokens consider using
 783       * WP_HTML_Tag_Processor instead.
 784       *
 785       * Note that this method may call itself recursively. This is why it is not
 786       * implemented as {@see WP_HTML_Processor::next_token()}, which instead calls
 787       * this method similarly to how {@see WP_HTML_Tag_Processor::next_token()}
 788       * calls the {@see WP_HTML_Tag_Processor::base_class_next_token()} method.
 789       *
 790       * @since 6.7.2 Added for internal support.
 791       *
 792       * @access private
 793       *
 794       * @return bool
 795       */
 796  	private function next_visitable_token(): bool {
 797          $this->current_element = null;
 798  
 799          if ( isset( $this->last_error ) ) {
 800              return false;
 801          }
 802  
 803          /*
 804           * Prime the events if there are none.
 805           *
 806           * @todo In some cases, probably related to the adoption agency
 807           *       algorithm, this call to step() doesn't create any new
 808           *       events. Calling it again creates them. Figure out why
 809           *       this is and if it's inherent or if it's a bug. Looping
 810           *       until there are events or until there are no more
 811           *       tokens works in the meantime and isn't obviously wrong.
 812           */
 813          if ( empty( $this->element_queue ) && $this->step() ) {
 814              return $this->next_visitable_token();
 815          }
 816  
 817          // Process the next event on the queue.
 818          $this->current_element = array_shift( $this->element_queue );
 819          if ( ! isset( $this->current_element ) ) {
 820              // There are no tokens left, so close all remaining open elements.
 821              while ( $this->state->stack_of_open_elements->pop() ) {
 822                  continue;
 823              }
 824  
 825              return empty( $this->element_queue ) ? false : $this->next_visitable_token();
 826          }
 827  
 828          $is_pop = WP_HTML_Stack_Event::POP === $this->current_element->operation;
 829  
 830          /*
 831           * The root node only exists in the fragment parser, and closing it
 832           * indicates that the parse is complete. Stop before popping it from
 833           * the breadcrumbs.
 834           */
 835          if ( 'root-node' === $this->current_element->token->bookmark_name ) {
 836              return $this->next_visitable_token();
 837          }
 838  
 839          // Adjust the breadcrumbs for this event.
 840          if ( $is_pop ) {
 841              array_pop( $this->breadcrumbs );
 842          } else {
 843              $this->breadcrumbs[] = $this->current_element->token->node_name;
 844          }
 845  
 846          // Avoid sending close events for elements which don't expect a closing.
 847          if ( $is_pop && ! $this->expects_closer( $this->current_element->token ) ) {
 848              return $this->next_visitable_token();
 849          }
 850  
 851          return true;
 852      }
 853  
 854      /**
 855       * Indicates if the current tag token is a tag closer.
 856       *
 857       * Example:
 858       *
 859       *     $p = WP_HTML_Processor::create_fragment( '<div></div>' );
 860       *     $p->next_tag( array( 'tag_name' => 'div', 'tag_closers' => 'visit' ) );
 861       *     $p->is_tag_closer() === false;
 862       *
 863       *     $p->next_tag( array( 'tag_name' => 'div', 'tag_closers' => 'visit' ) );
 864       *     $p->is_tag_closer() === true;
 865       *
 866       * @since 6.6.0 Subclassed for HTML Processor.
 867       *
 868       * @return bool Whether the current tag is a tag closer.
 869       */
 870  	public function is_tag_closer(): bool {
 871          return $this->is_virtual()
 872              ? ( WP_HTML_Stack_Event::POP === $this->current_element->operation && '#tag' === $this->get_token_type() )
 873              : parent::is_tag_closer();
 874      }
 875  
 876      /**
 877       * Indicates if the currently-matched token is virtual, created by a stack operation
 878       * while processing HTML, rather than a token found in the HTML text itself.
 879       *
 880       * @since 6.6.0
 881       *
 882       * @return bool Whether the current token is virtual.
 883       */
 884  	private function is_virtual(): bool {
 885          return (
 886              isset( $this->current_element->provenance ) &&
 887              'virtual' === $this->current_element->provenance
 888          );
 889      }
 890  
 891      /**
 892       * Indicates if the currently-matched tag matches the given breadcrumbs.
 893       *
 894       * A "*" represents a single tag wildcard, where any tag matches, but not no tags.
 895       *
 896       * At some point this function _may_ support a `**` syntax for matching any number
 897       * of unspecified tags in the breadcrumb stack. This has been intentionally left
 898       * out, however, to keep this function simple and to avoid introducing backtracking,
 899       * which could open up surprising performance breakdowns.
 900       *
 901       * Example:
 902       *
 903       *     $processor = WP_HTML_Processor::create_fragment( '<div><span><figure><img></figure></span></div>' );
 904       *     $processor->next_tag( 'img' );
 905       *     true  === $processor->matches_breadcrumbs( array( 'figure', 'img' ) );
 906       *     true  === $processor->matches_breadcrumbs( array( 'span', 'figure', 'img' ) );
 907       *     false === $processor->matches_breadcrumbs( array( 'span', 'img' ) );
 908       *     true  === $processor->matches_breadcrumbs( array( 'span', '*', 'img' ) );
 909       *
 910       * @since 6.4.0
 911       *
 912       * @param string[] $breadcrumbs DOM sub-path at which element is found, e.g. `array( 'FIGURE', 'IMG' )`.
 913       *                              May also contain the wildcard `*` which matches a single element, e.g. `array( 'SECTION', '*' )`.
 914       * @return bool Whether the currently-matched tag is found at the given nested structure.
 915       */
 916  	public function matches_breadcrumbs( $breadcrumbs ): bool {
 917          // Everything matches when there are zero constraints.
 918          if ( 0 === count( $breadcrumbs ) ) {
 919              return true;
 920          }
 921  
 922          // Start at the last crumb.
 923          $crumb = end( $breadcrumbs );
 924  
 925          if ( '*' !== $crumb && $this->get_tag() !== strtoupper( $crumb ) ) {
 926              return false;
 927          }
 928  
 929          for ( $i = count( $this->breadcrumbs ) - 1; $i >= 0; $i-- ) {
 930              $node  = $this->breadcrumbs[ $i ];
 931              $crumb = strtoupper( current( $breadcrumbs ) );
 932  
 933              if ( '*' !== $crumb && $node !== $crumb ) {
 934                  return false;
 935              }
 936  
 937              if ( false === prev( $breadcrumbs ) ) {
 938                  return true;
 939              }
 940          }
 941  
 942          return false;
 943      }
 944  
 945      /**
 946       * Indicates if the currently-matched node expects a closing
 947       * token, or if it will self-close on the next step.
 948       *
 949       * Most HTML elements expect a closer, such as a P element or
 950       * a DIV element. Others, like an IMG element are void and don't
 951       * have a closing tag. Special elements, such as SCRIPT and STYLE,
 952       * are treated just like void tags. Text nodes and self-closing
 953       * foreign content will also act just like a void tag, immediately
 954       * closing as soon as the processor advances to the next token.
 955       *
 956       * @since 6.6.0
 957       *
 958       * @param WP_HTML_Token|null $node Optional. Node to examine, if provided.
 959       *                                 Default is to examine current node.
 960       * @return bool|null Whether to expect a closer for the currently-matched node,
 961       *                   or `null` if not matched on any token.
 962       */
 963  	public function expects_closer( ?WP_HTML_Token $node = null ): ?bool {
 964          $token_name = $node->node_name ?? $this->get_token_name();
 965  
 966          if ( ! isset( $token_name ) ) {
 967              return null;
 968          }
 969  
 970          $token_namespace        = $node->namespace ?? $this->get_namespace();
 971          $token_has_self_closing = $node->has_self_closing_flag ?? $this->has_self_closing_flag();
 972  
 973          return ! (
 974              // Comments, text nodes, and other atomic tokens.
 975              '#' === $token_name[0] ||
 976              // Doctype declarations.
 977              'html' === $token_name ||
 978              // Void elements.
 979              ( 'html' === $token_namespace && self::is_void( $token_name ) ) ||
 980              // Special atomic elements.
 981              ( 'html' === $token_namespace && in_array( $token_name, array( 'IFRAME', 'NOEMBED', 'NOFRAMES', 'SCRIPT', 'STYLE', 'TEXTAREA', 'TITLE', 'XMP' ), true ) ) ||
 982              // Self-closing elements in foreign content.
 983              ( 'html' !== $token_namespace && $token_has_self_closing )
 984          );
 985      }
 986  
 987      /**
 988       * Steps through the HTML document and stop at the next tag, if any.
 989       *
 990       * @since 6.4.0
 991       *
 992       * @throws Exception When unable to allocate a bookmark for the next token in the input HTML document.
 993       *
 994       * @see self::PROCESS_NEXT_NODE
 995       * @see self::REPROCESS_CURRENT_NODE
 996       *
 997       * @param string $node_to_process Whether to parse the next node or reprocess the current node.
 998       * @return bool Whether a tag was matched.
 999       */
1000  	public function step( $node_to_process = self::PROCESS_NEXT_NODE ): bool {
1001          // Refuse to proceed if there was a previous error.
1002          if ( null !== $this->last_error ) {
1003              return false;
1004          }
1005  
1006          if ( self::REPROCESS_CURRENT_NODE !== $node_to_process ) {
1007              /*
1008               * Void elements still hop onto the stack of open elements even though
1009               * there's no corresponding closing tag. This is important for managing
1010               * stack-based operations such as "navigate to parent node" or checking
1011               * on an element's breadcrumbs.
1012               *
1013               * When moving on to the next node, therefore, if the bottom-most element
1014               * on the stack is a void element, it must be closed.
1015               */
1016              $top_node = $this->state->stack_of_open_elements->current_node();
1017              if ( isset( $top_node ) && ! $this->expects_closer( $top_node ) ) {
1018                  $this->state->stack_of_open_elements->pop();
1019              }
1020          }
1021  
1022          if ( self::PROCESS_NEXT_NODE === $node_to_process ) {
1023              parent::next_token();
1024              if ( WP_HTML_Tag_Processor::STATE_TEXT_NODE === $this->parser_state ) {
1025                  parent::subdivide_text_appropriately();
1026              }
1027          }
1028  
1029          // Finish stepping when there are no more tokens in the document.
1030          if (
1031              WP_HTML_Tag_Processor::STATE_INCOMPLETE_INPUT === $this->parser_state ||
1032              WP_HTML_Tag_Processor::STATE_COMPLETE === $this->parser_state
1033          ) {
1034              return false;
1035          }
1036  
1037          $adjusted_current_node = $this->get_adjusted_current_node();
1038          $is_closer             = $this->is_tag_closer();
1039          $is_start_tag          = WP_HTML_Tag_Processor::STATE_MATCHED_TAG === $this->parser_state && ! $is_closer;
1040          $token_name            = $this->get_token_name();
1041  
1042          if ( self::REPROCESS_CURRENT_NODE !== $node_to_process ) {
1043              $this->state->current_token = new WP_HTML_Token(
1044                  $this->bookmark_token(),
1045                  $token_name,
1046                  $this->has_self_closing_flag(),
1047                  $this->release_internal_bookmark_on_destruct
1048              );
1049          }
1050  
1051          $parse_in_current_insertion_mode = (
1052              0 === $this->state->stack_of_open_elements->count() ||
1053              'html' === $adjusted_current_node->namespace ||
1054              (
1055                  'math' === $adjusted_current_node->integration_node_type &&
1056                  (
1057                      ( $is_start_tag && ! in_array( $token_name, array( 'MGLYPH', 'MALIGNMARK' ), true ) ) ||
1058                      '#text' === $token_name
1059                  )
1060              ) ||
1061              (
1062                  'math' === $adjusted_current_node->namespace &&
1063                  'ANNOTATION-XML' === $adjusted_current_node->node_name &&
1064                  $is_start_tag && 'SVG' === $token_name
1065              ) ||
1066              (
1067                  'html' === $adjusted_current_node->integration_node_type &&
1068                  ( $is_start_tag || '#text' === $token_name )
1069              )
1070          );
1071  
1072          try {
1073              if ( ! $parse_in_current_insertion_mode ) {
1074                  return $this->step_in_foreign_content();
1075              }
1076  
1077              switch ( $this->state->insertion_mode ) {
1078                  case WP_HTML_Processor_State::INSERTION_MODE_INITIAL:
1079                      return $this->step_initial();
1080  
1081                  case WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HTML:
1082                      return $this->step_before_html();
1083  
1084                  case WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HEAD:
1085                      return $this->step_before_head();
1086  
1087                  case WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD:
1088                      return $this->step_in_head();
1089  
1090                  case WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD_NOSCRIPT:
1091                      return $this->step_in_head_noscript();
1092  
1093                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_HEAD:
1094                      return $this->step_after_head();
1095  
1096                  case WP_HTML_Processor_State::INSERTION_MODE_IN_BODY:
1097                      return $this->step_in_body();
1098  
1099                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE:
1100                      return $this->step_in_table();
1101  
1102                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_TEXT:
1103                      return $this->step_in_table_text();
1104  
1105                  case WP_HTML_Processor_State::INSERTION_MODE_IN_CAPTION:
1106                      return $this->step_in_caption();
1107  
1108                  case WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP:
1109                      return $this->step_in_column_group();
1110  
1111                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY:
1112                      return $this->step_in_table_body();
1113  
1114                  case WP_HTML_Processor_State::INSERTION_MODE_IN_ROW:
1115                      return $this->step_in_row();
1116  
1117                  case WP_HTML_Processor_State::INSERTION_MODE_IN_CELL:
1118                      return $this->step_in_cell();
1119  
1120                  case WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT:
1121                      return $this->step_in_select();
1122  
1123                  case WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT_IN_TABLE:
1124                      return $this->step_in_select_in_table();
1125  
1126                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE:
1127                      return $this->step_in_template();
1128  
1129                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_BODY:
1130                      return $this->step_after_body();
1131  
1132                  case WP_HTML_Processor_State::INSERTION_MODE_IN_FRAMESET:
1133                      return $this->step_in_frameset();
1134  
1135                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_FRAMESET:
1136                      return $this->step_after_frameset();
1137  
1138                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_BODY:
1139                      return $this->step_after_after_body();
1140  
1141                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_FRAMESET:
1142                      return $this->step_after_after_frameset();
1143  
1144                  // This should be unreachable but PHP doesn't have total type checking on switch.
1145                  default:
1146                      $this->bail( "Unaware of the requested parsing mode: '{$this->state->insertion_mode}'." );
1147              }
1148          } catch ( WP_HTML_Unsupported_Exception $e ) {
1149              /*
1150               * Exceptions are used in this class to escape deep call stacks that
1151               * otherwise might involve messier calling and return conventions.
1152               */
1153              return false;
1154          }
1155      }
1156  
1157      /**
1158       * Computes the HTML breadcrumbs for the currently-matched node, if matched.
1159       *
1160       * Breadcrumbs start at the outermost parent and descend toward the matched element.
1161       * They always include the entire path from the root HTML node to the matched element.
1162       *
1163       * Example:
1164       *
1165       *     $processor = WP_HTML_Processor::create_fragment( '<p><strong><em><img></em></strong></p>' );
1166       *     $processor->next_tag( 'IMG' );
1167       *     $processor->get_breadcrumbs() === array( 'HTML', 'BODY', 'P', 'STRONG', 'EM', 'IMG' );
1168       *
1169       * @since 6.4.0
1170       *
1171       * @return string[] Array of tag names representing path to matched node.
1172       */
1173  	public function get_breadcrumbs(): array {
1174          return $this->breadcrumbs;
1175      }
1176  
1177      /**
1178       * Returns the nesting depth of the current location in the document.
1179       *
1180       * Example:
1181       *
1182       *     $processor = WP_HTML_Processor::create_fragment( '<div><p></p></div>' );
1183       *     // The processor starts in the BODY context, meaning it has depth from the start: HTML > BODY.
1184       *     2 === $processor->get_current_depth();
1185       *
1186       *     // Opening the DIV element increases the depth.
1187       *     $processor->next_token();
1188       *     3 === $processor->get_current_depth();
1189       *
1190       *     // Opening the P element increases the depth.
1191       *     $processor->next_token();
1192       *     4 === $processor->get_current_depth();
1193       *
1194       *     // The P element is closed during `next_token()` so the depth is decreased to reflect that.
1195       *     $processor->next_token();
1196       *     3 === $processor->get_current_depth();
1197       *
1198       * @since 6.6.0
1199       *
1200       * @return int Nesting-depth of current location in the document.
1201       */
1202  	public function get_current_depth(): int {
1203          return count( $this->breadcrumbs );
1204      }
1205  
1206      /**
1207       * Normalizes an HTML fragment by serializing it.
1208       *
1209       * This method assumes that the given HTML snippet is found in BODY context.
1210       * For normalizing full documents or fragments found in other contexts, create
1211       * a new processor using {@see WP_HTML_Processor::create_fragment} or
1212       * {@see WP_HTML_Processor::create_full_parser} and call {@see WP_HTML_Processor::serialize}
1213       * on the created instances.
1214       *
1215       * Many aspects of an input HTML fragment may be changed during normalization.
1216       *
1217       *  - Attribute values will be double-quoted.
1218       *  - Duplicate attributes will be removed.
1219       *  - Omitted tags will be added.
1220       *  - Tag and attribute name casing will be lower-cased,
1221       *    except for specific SVG and MathML tags or attributes.
1222       *  - Text will be re-encoded, null bytes handled,
1223       *    and invalid UTF-8 replaced with U+FFFD.
1224       *  - Any incomplete syntax trailing at the end will be omitted,
1225       *    for example, an unclosed comment opener will be removed.
1226       *
1227       * Example:
1228       *
1229       *     echo WP_HTML_Processor::normalize( '<a href=#anchor v=5 href="/" enabled>One</a another v=5><!--' );
1230       *     // <a href="#anchor" v="5" enabled>One</a>
1231       *
1232       *     echo WP_HTML_Processor::normalize( '<div></p>fun<table><td>cell</div>' );
1233       *     // <div><p></p>fun<table><tbody><tr><td>cell</td></tr></tbody></table></div>
1234       *
1235       *     echo WP_HTML_Processor::normalize( '<![CDATA[invalid comment]]> syntax < <> "oddities"' );
1236       *     // <!--[CDATA[invalid comment]]--> syntax &lt; &lt;&gt; &quot;oddities&quot;
1237       *
1238       * @since 6.7.0
1239       *
1240       * @param string $html Input HTML to normalize.
1241       *
1242       * @return string|null Normalized output, or `null` if unable to normalize.
1243       */
1244  	public static function normalize( string $html ): ?string {
1245          return static::create_fragment( $html )->serialize();
1246      }
1247  
1248      /**
1249       * Returns normalized HTML for a fragment by serializing it.
1250       *
1251       * This differs from {@see WP_HTML_Processor::normalize} in that it starts with
1252       * a specific HTML Processor, which _must_ not have already started scanning;
1253       * it must be in the initial ready state and will be in the completed state once
1254       * serialization is complete.
1255       *
1256       * Many aspects of an input HTML fragment may be changed during normalization.
1257       *
1258       *  - Attribute values will be double-quoted.
1259       *  - Duplicate attributes will be removed.
1260       *  - Omitted tags will be added.
1261       *  - Tag and attribute name casing will be lower-cased,
1262       *    except for specific SVG and MathML tags or attributes.
1263       *  - Text will be re-encoded, null bytes handled,
1264       *    and invalid UTF-8 replaced with U+FFFD.
1265       *  - Any incomplete syntax trailing at the end will be omitted,
1266       *    for example, an unclosed comment opener will be removed.
1267       *
1268       * Example:
1269       *
1270       *     $processor = WP_HTML_Processor::create_fragment( '<a href=#anchor v=5 href="/" enabled>One</a another v=5><!--' );
1271       *     echo $processor->serialize();
1272       *     // <a href="#anchor" v="5" enabled>One</a>
1273       *
1274       *     $processor = WP_HTML_Processor::create_fragment( '<div></p>fun<table><td>cell</div>' );
1275       *     echo $processor->serialize();
1276       *     // <div><p></p>fun<table><tbody><tr><td>cell</td></tr></tbody></table></div>
1277       *
1278       *     $processor = WP_HTML_Processor::create_fragment( '<![CDATA[invalid comment]]> syntax < <> "oddities"' );
1279       *     echo $processor->serialize();
1280       *     // <!--[CDATA[invalid comment]]--> syntax &lt; &lt;&gt; &quot;oddities&quot;
1281       *
1282       * @since 6.7.0
1283       *
1284       * @return string|null Normalized HTML markup represented by processor,
1285       *                     or `null` if unable to generate serialization.
1286       */
1287  	public function serialize(): ?string {
1288          if ( WP_HTML_Tag_Processor::STATE_READY !== $this->parser_state ) {
1289              wp_trigger_error(
1290                  __METHOD__,
1291                  'An HTML Processor which has already started processing cannot serialize its contents. Serialize immediately after creating the instance.',
1292                  E_USER_WARNING
1293              );
1294              return null;
1295          }
1296  
1297          $html = '';
1298          while ( $this->next_token() ) {
1299              $html .= $this->serialize_token();
1300          }
1301  
1302          if ( null !== $this->get_last_error() ) {
1303              wp_trigger_error(
1304                  __METHOD__,
1305                  "Cannot serialize HTML Processor with parsing error: {$this->get_last_error()}.",
1306                  E_USER_WARNING
1307              );
1308              return null;
1309          }
1310  
1311          return $html;
1312      }
1313  
1314      /**
1315       * Serializes the currently-matched token.
1316       *
1317       * This method produces a fully-normative HTML string for the currently-matched token,
1318       * if able. If not matched at any token or if the token doesn't correspond to any HTML
1319       * it will return an empty string (for example, presumptuous end tags are ignored).
1320       *
1321       * @see static::serialize()
1322       *
1323       * @since 6.7.0
1324       * @since 6.9.0 Converted from protected to public method.
1325       *
1326       * @return string Serialization of token, or empty string if no serialization exists.
1327       */
1328  	public function serialize_token(): string {
1329          $html       = '';
1330          $token_type = $this->get_token_type();
1331  
1332          switch ( $token_type ) {
1333              case '#doctype':
1334                  $doctype = $this->get_doctype_info();
1335                  if ( null === $doctype ) {
1336                      break;
1337                  }
1338  
1339                  $html .= '<!DOCTYPE';
1340  
1341                  if ( $doctype->name ) {
1342                      $html .= " {$doctype->name}";
1343                  }
1344  
1345                  if ( null !== $doctype->public_identifier ) {
1346                      $quote = str_contains( $doctype->public_identifier, '"' ) ? "'" : '"';
1347                      $html .= " PUBLIC {$quote}{$doctype->public_identifier}{$quote}";
1348                  }
1349                  if ( null !== $doctype->system_identifier ) {
1350                      if ( null === $doctype->public_identifier ) {
1351                          $html .= ' SYSTEM';
1352                      }
1353                      $quote = str_contains( $doctype->system_identifier, '"' ) ? "'" : '"';
1354                      $html .= " {$quote}{$doctype->system_identifier}{$quote}";
1355                  }
1356  
1357                  $html .= '>';
1358                  break;
1359  
1360              case '#text':
1361                  $html .= htmlspecialchars( $this->get_modifiable_text(), ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8' );
1362                  break;
1363  
1364              // Unlike the `<>` which is interpreted as plaintext, this is ignored entirely.
1365              case '#presumptuous-tag':
1366                  break;
1367  
1368              case '#funky-comment':
1369              case '#comment':
1370                  $html .= "<!--{$this->get_full_comment_text()}-->";
1371                  break;
1372  
1373              case '#cdata-section':
1374                  $html .= "<![CDATA[{$this->get_modifiable_text()}]]>";
1375                  break;
1376          }
1377  
1378          if ( '#tag' !== $token_type ) {
1379              return $html;
1380          }
1381  
1382          $tag_name       = str_replace( "\x00", "\u{FFFD}", $this->get_tag() );
1383          $in_html        = 'html' === $this->get_namespace();
1384          $qualified_name = $in_html ? strtolower( $tag_name ) : $this->get_qualified_tag_name();
1385  
1386          if ( $this->is_tag_closer() ) {
1387              $html .= "</{$qualified_name}>";
1388              return $html;
1389          }
1390  
1391          $attribute_names = $this->get_attribute_names_with_prefix( '' );
1392          if ( ! isset( $attribute_names ) ) {
1393              $html .= "<{$qualified_name}>";
1394              return $html;
1395          }
1396  
1397          $html .= "<{$qualified_name}";
1398          foreach ( $attribute_names as $attribute_name ) {
1399              $html .= " {$this->get_qualified_attribute_name( $attribute_name )}";
1400              $value = $this->get_attribute( $attribute_name );
1401  
1402              if ( is_string( $value ) ) {
1403                  $html .= '="' . htmlspecialchars( $value, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5 ) . '"';
1404              }
1405  
1406              $html = str_replace( "\x00", "\u{FFFD}", $html );
1407          }
1408  
1409          if ( ! $in_html && $this->has_self_closing_flag() ) {
1410              $html .= ' /';
1411          }
1412  
1413          $html .= '>';
1414  
1415          // Flush out self-contained elements.
1416          if ( $in_html && in_array( $tag_name, array( 'IFRAME', 'NOEMBED', 'NOFRAMES', 'SCRIPT', 'STYLE', 'TEXTAREA', 'TITLE', 'XMP' ), true ) ) {
1417              $text = $this->get_modifiable_text();
1418  
1419              switch ( $tag_name ) {
1420                  case 'IFRAME':
1421                  case 'NOEMBED':
1422                  case 'NOFRAMES':
1423                      $text = '';
1424                      break;
1425  
1426                  case 'SCRIPT':
1427                  case 'STYLE':
1428                      break;
1429  
1430                  default:
1431                      $text = htmlspecialchars( $text, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8' );
1432              }
1433  
1434              $html .= "{$text}</{$qualified_name}>";
1435          }
1436  
1437          return $html;
1438      }
1439  
1440      /**
1441       * Parses next element in the 'initial' insertion mode.
1442       *
1443       * This internal function performs the 'initial' insertion mode
1444       * logic for the generalized WP_HTML_Processor::step() function.
1445       *
1446       * @since 6.7.0
1447       *
1448       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
1449       *
1450       * @see https://html.spec.whatwg.org/#the-initial-insertion-mode
1451       * @see WP_HTML_Processor::step
1452       *
1453       * @return bool Whether an element was found.
1454       */
1455  	private function step_initial(): bool {
1456          $token_name = $this->get_token_name();
1457          $token_type = $this->get_token_type();
1458          $op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
1459          $op         = "{$op_sigil}{$token_name}";
1460  
1461          switch ( $op ) {
1462              /*
1463               * > A character token that is one of U+0009 CHARACTER TABULATION,
1464               * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
1465               * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
1466               *
1467               * Parse error: ignore the token.
1468               */
1469              case '#text':
1470                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
1471                      return $this->step();
1472                  }
1473                  goto initial_anything_else;
1474                  break;
1475  
1476              /*
1477               * > A comment token
1478               */
1479              case '#comment':
1480              case '#funky-comment':
1481              case '#presumptuous-tag':
1482                  $this->insert_html_element( $this->state->current_token );
1483                  return true;
1484  
1485              /*
1486               * > A DOCTYPE token
1487               */
1488              case 'html':
1489                  $doctype = $this->get_doctype_info();
1490                  if ( null !== $doctype && 'quirks' === $doctype->indicated_compatibility_mode ) {
1491                      $this->compat_mode = WP_HTML_Tag_Processor::QUIRKS_MODE;
1492                  }
1493  
1494                  /*
1495                   * > Then, switch the insertion mode to "before html".
1496                   */
1497                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HTML;
1498                  $this->insert_html_element( $this->state->current_token );
1499                  return true;
1500          }
1501  
1502          /*
1503           * > Anything else
1504           */
1505          initial_anything_else:
1506          $this->compat_mode           = WP_HTML_Tag_Processor::QUIRKS_MODE;
1507          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HTML;
1508          return $this->step( self::REPROCESS_CURRENT_NODE );
1509      }
1510  
1511      /**
1512       * Parses next element in the 'before html' insertion mode.
1513       *
1514       * This internal function performs the 'before html' insertion mode
1515       * logic for the generalized WP_HTML_Processor::step() function.
1516       *
1517       * @since 6.7.0
1518       *
1519       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
1520       *
1521       * @see https://html.spec.whatwg.org/#the-before-html-insertion-mode
1522       * @see WP_HTML_Processor::step
1523       *
1524       * @return bool Whether an element was found.
1525       */
1526  	private function step_before_html(): bool {
1527          $token_name = $this->get_token_name();
1528          $token_type = $this->get_token_type();
1529          $is_closer  = parent::is_tag_closer();
1530          $op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
1531          $op         = "{$op_sigil}{$token_name}";
1532  
1533          switch ( $op ) {
1534              /*
1535               * > A DOCTYPE token
1536               */
1537              case 'html':
1538                  // Parse error: ignore the token.
1539                  return $this->step();
1540  
1541              /*
1542               * > A comment token
1543               */
1544              case '#comment':
1545              case '#funky-comment':
1546              case '#presumptuous-tag':
1547                  $this->insert_html_element( $this->state->current_token );
1548                  return true;
1549  
1550              /*
1551               * > A character token that is one of U+0009 CHARACTER TABULATION,
1552               * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
1553               * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
1554               *
1555               * Parse error: ignore the token.
1556               */
1557              case '#text':
1558                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
1559                      return $this->step();
1560                  }
1561                  goto before_html_anything_else;
1562                  break;
1563  
1564              /*
1565               * > A start tag whose tag name is "html"
1566               */
1567              case '+HTML':
1568                  $this->insert_html_element( $this->state->current_token );
1569                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HEAD;
1570                  return true;
1571  
1572              /*
1573               * > An end tag whose tag name is one of: "head", "body", "html", "br"
1574               *
1575               * Closing BR tags are always reported by the Tag Processor as opening tags.
1576               */
1577              case '-HEAD':
1578              case '-BODY':
1579              case '-HTML':
1580                  /*
1581                   * > Act as described in the "anything else" entry below.
1582                   */
1583                  goto before_html_anything_else;
1584                  break;
1585          }
1586  
1587          /*
1588           * > Any other end tag
1589           */
1590          if ( $is_closer ) {
1591              // Parse error: ignore the token.
1592              return $this->step();
1593          }
1594  
1595          /*
1596           * > Anything else.
1597           *
1598           * > Create an html element whose node document is the Document object.
1599           * > Append it to the Document object. Put this element in the stack of open elements.
1600           * > Switch the insertion mode to "before head", then reprocess the token.
1601           */
1602          before_html_anything_else:
1603          $this->insert_virtual_node( 'HTML' );
1604          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HEAD;
1605          return $this->step( self::REPROCESS_CURRENT_NODE );
1606      }
1607  
1608      /**
1609       * Parses next element in the 'before head' insertion mode.
1610       *
1611       * This internal function performs the 'before head' insertion mode
1612       * logic for the generalized WP_HTML_Processor::step() function.
1613       *
1614       * @since 6.7.0 Stub implementation.
1615       *
1616       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
1617       *
1618       * @see https://html.spec.whatwg.org/#the-before-head-insertion-mode
1619       * @see WP_HTML_Processor::step
1620       *
1621       * @return bool Whether an element was found.
1622       */
1623  	private function step_before_head(): bool {
1624          $token_name = $this->get_token_name();
1625          $token_type = $this->get_token_type();
1626          $is_closer  = parent::is_tag_closer();
1627          $op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
1628          $op         = "{$op_sigil}{$token_name}";
1629  
1630          switch ( $op ) {
1631              /*
1632               * > A character token that is one of U+0009 CHARACTER TABULATION,
1633               * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
1634               * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
1635               *
1636               * Parse error: ignore the token.
1637               */
1638              case '#text':
1639                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
1640                      return $this->step();
1641                  }
1642                  goto before_head_anything_else;
1643                  break;
1644  
1645              /*
1646               * > A comment token
1647               */
1648              case '#comment':
1649              case '#funky-comment':
1650              case '#presumptuous-tag':
1651                  $this->insert_html_element( $this->state->current_token );
1652                  return true;
1653  
1654              /*
1655               * > A DOCTYPE token
1656               */
1657              case 'html':
1658                  // Parse error: ignore the token.
1659                  return $this->step();
1660  
1661              /*
1662               * > A start tag whose tag name is "html"
1663               */
1664              case '+HTML':
1665                  return $this->step_in_body();
1666  
1667              /*
1668               * > A start tag whose tag name is "head"
1669               */
1670              case '+HEAD':
1671                  $this->insert_html_element( $this->state->current_token );
1672                  $this->state->head_element   = $this->state->current_token;
1673                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
1674                  return true;
1675  
1676              /*
1677               * > An end tag whose tag name is one of: "head", "body", "html", "br"
1678               * > Act as described in the "anything else" entry below.
1679               *
1680               * Closing BR tags are always reported by the Tag Processor as opening tags.
1681               */
1682              case '-HEAD':
1683              case '-BODY':
1684              case '-HTML':
1685                  goto before_head_anything_else;
1686                  break;
1687          }
1688  
1689          if ( $is_closer ) {
1690              // Parse error: ignore the token.
1691              return $this->step();
1692          }
1693  
1694          /*
1695           * > Anything else
1696           *
1697           * > Insert an HTML element for a "head" start tag token with no attributes.
1698           */
1699          before_head_anything_else:
1700          $this->state->head_element   = $this->insert_virtual_node( 'HEAD' );
1701          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
1702          return $this->step( self::REPROCESS_CURRENT_NODE );
1703      }
1704  
1705      /**
1706       * Parses next element in the 'in head' insertion mode.
1707       *
1708       * This internal function performs the 'in head' insertion mode
1709       * logic for the generalized WP_HTML_Processor::step() function.
1710       *
1711       * @since 6.7.0
1712       *
1713       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
1714       *
1715       * @see https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inhead
1716       * @see WP_HTML_Processor::step
1717       *
1718       * @return bool Whether an element was found.
1719       */
1720  	private function step_in_head(): bool {
1721          $token_name = $this->get_token_name();
1722          $token_type = $this->get_token_type();
1723          $is_closer  = parent::is_tag_closer();
1724          $op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
1725          $op         = "{$op_sigil}{$token_name}";
1726  
1727          switch ( $op ) {
1728              case '#text':
1729                  /*
1730                   * > A character token that is one of U+0009 CHARACTER TABULATION,
1731                   * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
1732                   * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
1733                   */
1734                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
1735                      // Insert the character.
1736                      $this->insert_html_element( $this->state->current_token );
1737                      return true;
1738                  }
1739  
1740                  goto in_head_anything_else;
1741                  break;
1742  
1743              /*
1744               * > A comment token
1745               */
1746              case '#comment':
1747              case '#funky-comment':
1748              case '#presumptuous-tag':
1749                  $this->insert_html_element( $this->state->current_token );
1750                  return true;
1751  
1752              /*
1753               * > A DOCTYPE token
1754               */
1755              case 'html':
1756                  // Parse error: ignore the token.
1757                  return $this->step();
1758  
1759              /*
1760               * > A start tag whose tag name is "html"
1761               */
1762              case '+HTML':
1763                  return $this->step_in_body();
1764  
1765              /*
1766               * > A start tag whose tag name is one of: "base", "basefont", "bgsound", "link"
1767               */
1768              case '+BASE':
1769              case '+BASEFONT':
1770              case '+BGSOUND':
1771              case '+LINK':
1772                  $this->insert_html_element( $this->state->current_token );
1773                  return true;
1774  
1775              /*
1776               * > A start tag whose tag name is "meta"
1777               */
1778              case '+META':
1779                  $this->insert_html_element( $this->state->current_token );
1780  
1781                  // All following conditions depend on "tentative" encoding confidence.
1782                  if ( 'tentative' !== $this->state->encoding_confidence ) {
1783                      return true;
1784                  }
1785  
1786                  /*
1787                   * > If the active speculative HTML parser is null, then:
1788                   * >   - If the element has a charset attribute, and getting an encoding from
1789                   * >     its value results in an encoding, and the confidence is currently
1790                   * >     tentative, then change the encoding to the resulting encoding.
1791                   */
1792                  $charset = $this->get_attribute( 'charset' );
1793                  if ( is_string( $charset ) ) {
1794                      $this->bail( 'Cannot yet process META tags with charset to determine encoding.' );
1795                  }
1796  
1797                  /*
1798                   * >   - Otherwise, if the element has an http-equiv attribute whose value is
1799                   * >     an ASCII case-insensitive match for the string "Content-Type", and
1800                   * >     the element has a content attribute, and applying the algorithm for
1801                   * >     extracting a character encoding from a meta element to that attribute's
1802                   * >     value returns an encoding, and the confidence is currently tentative,
1803                   * >     then change the encoding to the extracted encoding.
1804                   */
1805                  $http_equiv = $this->get_attribute( 'http-equiv' );
1806                  $content    = $this->get_attribute( 'content' );
1807                  if (
1808                      is_string( $http_equiv ) &&
1809                      is_string( $content ) &&
1810                      0 === strcasecmp( $http_equiv, 'Content-Type' )
1811                  ) {
1812                      $this->bail( 'Cannot yet process META tags with http-equiv Content-Type to determine encoding.' );
1813                  }
1814  
1815                  return true;
1816  
1817              /*
1818               * > A start tag whose tag name is "title"
1819               */
1820              case '+TITLE':
1821                  $this->insert_html_element( $this->state->current_token );
1822                  return true;
1823  
1824              /*
1825               * > A start tag whose tag name is "noscript", if the scripting flag is enabled
1826               * > A start tag whose tag name is one of: "noframes", "style"
1827               *
1828               * The scripting flag is never enabled in this parser.
1829               */
1830              case '+NOFRAMES':
1831              case '+STYLE':
1832                  $this->insert_html_element( $this->state->current_token );
1833                  return true;
1834  
1835              /*
1836               * > A start tag whose tag name is "noscript", if the scripting flag is disabled
1837               */
1838              case '+NOSCRIPT':
1839                  $this->insert_html_element( $this->state->current_token );
1840                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD_NOSCRIPT;
1841                  return true;
1842  
1843              /*
1844               * > A start tag whose tag name is "script"
1845               *
1846               * @todo Could the adjusted insertion location be anything other than the current location?
1847               */
1848              case '+SCRIPT':
1849                  $this->insert_html_element( $this->state->current_token );
1850                  return true;
1851  
1852              /*
1853               * > An end tag whose tag name is "head"
1854               */
1855              case '-HEAD':
1856                  $this->state->stack_of_open_elements->pop();
1857                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_HEAD;
1858                  return true;
1859  
1860              /*
1861               * > An end tag whose tag name is one of: "body", "html", "br"
1862               *
1863               * BR tags are always reported by the Tag Processor as opening tags.
1864               */
1865              case '-BODY':
1866              case '-HTML':
1867                  /*
1868                   * > Act as described in the "anything else" entry below.
1869                   */
1870                  goto in_head_anything_else;
1871                  break;
1872  
1873              /*
1874               * > A start tag whose tag name is "template"
1875               *
1876               * @todo Could the adjusted insertion location be anything other than the current location?
1877               */
1878              case '+TEMPLATE':
1879                  $this->state->active_formatting_elements->insert_marker();
1880                  $this->state->frameset_ok = false;
1881  
1882                  $this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE;
1883                  $this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE;
1884  
1885                  $this->insert_html_element( $this->state->current_token );
1886                  return true;
1887  
1888              /*
1889               * > An end tag whose tag name is "template"
1890               */
1891              case '-TEMPLATE':
1892                  if ( ! $this->state->stack_of_open_elements->contains( 'TEMPLATE' ) ) {
1893                      // @todo Indicate a parse error once it's possible.
1894                      return $this->step();
1895                  }
1896  
1897                  $this->generate_implied_end_tags_thoroughly();
1898                  if ( ! $this->state->stack_of_open_elements->current_node_is( 'TEMPLATE' ) ) {
1899                      // @todo Indicate a parse error once it's possible.
1900                  }
1901  
1902                  $this->state->stack_of_open_elements->pop_until( 'TEMPLATE' );
1903                  $this->state->active_formatting_elements->clear_up_to_last_marker();
1904                  array_pop( $this->state->stack_of_template_insertion_modes );
1905                  $this->reset_insertion_mode_appropriately();
1906                  return true;
1907          }
1908  
1909          /*
1910           * > A start tag whose tag name is "head"
1911           * > Any other end tag
1912           */
1913          if ( '+HEAD' === $op || $is_closer ) {
1914              // Parse error: ignore the token.
1915              return $this->step();
1916          }
1917  
1918          /*
1919           * > Anything else
1920           */
1921          in_head_anything_else:
1922          $this->state->stack_of_open_elements->pop();
1923          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_HEAD;
1924          return $this->step( self::REPROCESS_CURRENT_NODE );
1925      }
1926  
1927      /**
1928       * Parses next element in the 'in head noscript' insertion mode.
1929       *
1930       * This internal function performs the 'in head noscript' insertion mode
1931       * logic for the generalized WP_HTML_Processor::step() function.
1932       *
1933       * @since 6.7.0 Stub implementation.
1934       *
1935       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
1936       *
1937       * @see https://html.spec.whatwg.org/#parsing-main-inheadnoscript
1938       * @see WP_HTML_Processor::step
1939       *
1940       * @return bool Whether an element was found.
1941       */
1942  	private function step_in_head_noscript(): bool {
1943          $token_name = $this->get_token_name();
1944          $token_type = $this->get_token_type();
1945          $is_closer  = parent::is_tag_closer();
1946          $op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
1947          $op         = "{$op_sigil}{$token_name}";
1948  
1949          switch ( $op ) {
1950              /*
1951               * > A character token that is one of U+0009 CHARACTER TABULATION,
1952               * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
1953               * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
1954               *
1955               * Parse error: ignore the token.
1956               */
1957              case '#text':
1958                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
1959                      return $this->step_in_head();
1960                  }
1961  
1962                  goto in_head_noscript_anything_else;
1963                  break;
1964  
1965              /*
1966               * > A DOCTYPE token
1967               */
1968              case 'html':
1969                  // Parse error: ignore the token.
1970                  return $this->step();
1971  
1972              /*
1973               * > A start tag whose tag name is "html"
1974               */
1975              case '+HTML':
1976                  return $this->step_in_body();
1977  
1978              /*
1979               * > An end tag whose tag name is "noscript"
1980               */
1981              case '-NOSCRIPT':
1982                  $this->state->stack_of_open_elements->pop();
1983                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
1984                  return true;
1985  
1986              /*
1987               * > A comment token
1988               * >
1989               * > A start tag whose tag name is one of: "basefont", "bgsound",
1990               * > "link", "meta", "noframes", "style"
1991               */
1992              case '#comment':
1993              case '#funky-comment':
1994              case '#presumptuous-tag':
1995              case '+BASEFONT':
1996              case '+BGSOUND':
1997              case '+LINK':
1998              case '+META':
1999              case '+NOFRAMES':
2000              case '+STYLE':
2001                  return $this->step_in_head();
2002  
2003              /*
2004               * > An end tag whose tag name is "br"
2005               *
2006               * This should never happen, as the Tag Processor prevents showing a BR closing tag.
2007               */
2008          }
2009  
2010          /*
2011           * > A start tag whose tag name is one of: "head", "noscript"
2012           * > Any other end tag
2013           */
2014          if ( '+HEAD' === $op || '+NOSCRIPT' === $op || $is_closer ) {
2015              // Parse error: ignore the token.
2016              return $this->step();
2017          }
2018  
2019          /*
2020           * > Anything else
2021           *
2022           * Anything here is a parse error.
2023           */
2024          in_head_noscript_anything_else:
2025          $this->state->stack_of_open_elements->pop();
2026          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
2027          return $this->step( self::REPROCESS_CURRENT_NODE );
2028      }
2029  
2030      /**
2031       * Parses next element in the 'after head' insertion mode.
2032       *
2033       * This internal function performs the 'after head' insertion mode
2034       * logic for the generalized WP_HTML_Processor::step() function.
2035       *
2036       * @since 6.7.0 Stub implementation.
2037       *
2038       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
2039       *
2040       * @see https://html.spec.whatwg.org/#the-after-head-insertion-mode
2041       * @see WP_HTML_Processor::step
2042       *
2043       * @return bool Whether an element was found.
2044       */
2045  	private function step_after_head(): bool {
2046          $token_name = $this->get_token_name();
2047          $token_type = $this->get_token_type();
2048          $is_closer  = parent::is_tag_closer();
2049          $op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
2050          $op         = "{$op_sigil}{$token_name}";
2051  
2052          switch ( $op ) {
2053              /*
2054               * > A character token that is one of U+0009 CHARACTER TABULATION,
2055               * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
2056               * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
2057               */
2058              case '#text':
2059                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
2060                      // Insert the character.
2061                      $this->insert_html_element( $this->state->current_token );
2062                      return true;
2063                  }
2064                  goto after_head_anything_else;
2065                  break;
2066  
2067              /*
2068               * > A comment token
2069               */
2070              case '#comment':
2071              case '#funky-comment':
2072              case '#presumptuous-tag':
2073                  $this->insert_html_element( $this->state->current_token );
2074                  return true;
2075  
2076              /*
2077               * > A DOCTYPE token
2078               */
2079              case 'html':
2080                  // Parse error: ignore the token.
2081                  return $this->step();
2082  
2083              /*
2084               * > A start tag whose tag name is "html"
2085               */
2086              case '+HTML':
2087                  return $this->step_in_body();
2088  
2089              /*
2090               * > A start tag whose tag name is "body"
2091               */
2092              case '+BODY':
2093                  $this->insert_html_element( $this->state->current_token );
2094                  $this->state->frameset_ok    = false;
2095                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
2096                  return true;
2097  
2098              /*
2099               * > A start tag whose tag name is "frameset"
2100               */
2101              case '+FRAMESET':
2102                  $this->insert_html_element( $this->state->current_token );
2103                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_FRAMESET;
2104                  return true;
2105  
2106              /*
2107               * > A start tag whose tag name is one of: "base", "basefont", "bgsound",
2108               * > "link", "meta", "noframes", "script", "style", "template", "title"
2109               *
2110               * Anything here is a parse error.
2111               */
2112              case '+BASE':
2113              case '+BASEFONT':
2114              case '+BGSOUND':
2115              case '+LINK':
2116              case '+META':
2117              case '+NOFRAMES':
2118              case '+SCRIPT':
2119              case '+STYLE':
2120              case '+TEMPLATE':
2121              case '+TITLE':
2122                  /*
2123                   * > Push the node pointed to by the head element pointer onto the stack of open elements.
2124                   * > Process the token using the rules for the "in head" insertion mode.
2125                   * > Remove the node pointed to by the head element pointer from the stack of open elements. (It might not be the current node at this point.)
2126                   */
2127                  $this->bail( 'Cannot process elements after HEAD which reopen the HEAD element.' );
2128                  /*
2129                   * Do not leave this break in when adding support; it's here to prevent
2130                   * WPCS from getting confused at the switch structure without a return,
2131                   * because it doesn't know that `bail()` always throws.
2132                   */
2133                  break;
2134  
2135              /*
2136               * > An end tag whose tag name is "template"
2137               */
2138              case '-TEMPLATE':
2139                  return $this->step_in_head();
2140  
2141              /*
2142               * > An end tag whose tag name is one of: "body", "html", "br"
2143               *
2144               * Closing BR tags are always reported by the Tag Processor as opening tags.
2145               */
2146              case '-BODY':
2147              case '-HTML':
2148                  /*
2149                   * > Act as described in the "anything else" entry below.
2150                   */
2151                  goto after_head_anything_else;
2152                  break;
2153          }
2154  
2155          /*
2156           * > A start tag whose tag name is "head"
2157           * > Any other end tag
2158           */
2159          if ( '+HEAD' === $op || $is_closer ) {
2160              // Parse error: ignore the token.
2161              return $this->step();
2162          }
2163  
2164          /*
2165           * > Anything else
2166           * > Insert an HTML element for a "body" start tag token with no attributes.
2167           */
2168          after_head_anything_else:
2169          $this->insert_virtual_node( 'BODY' );
2170          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
2171          return $this->step( self::REPROCESS_CURRENT_NODE );
2172      }
2173  
2174      /**
2175       * Parses next element in the 'in body' insertion mode.
2176       *
2177       * This internal function performs the 'in body' insertion mode
2178       * logic for the generalized WP_HTML_Processor::step() function.
2179       *
2180       * @since 6.4.0
2181       *
2182       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
2183       *
2184       * @see https://html.spec.whatwg.org/#parsing-main-inbody
2185       * @see WP_HTML_Processor::step
2186       *
2187       * @return bool Whether an element was found.
2188       */
2189  	private function step_in_body(): bool {
2190          $token_name = $this->get_token_name();
2191          $token_type = $this->get_token_type();
2192          $op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
2193          $op         = "{$op_sigil}{$token_name}";
2194  
2195          switch ( $op ) {
2196              case '#text':
2197                  /*
2198                   * > A character token that is U+0000 NULL
2199                   *
2200                   * Any successive sequence of NULL bytes is ignored and won't
2201                   * trigger active format reconstruction. Therefore, if the text
2202                   * only comprises NULL bytes then the token should be ignored
2203                   * here, but if there are any other characters in the stream
2204                   * the active formats should be reconstructed.
2205                   */
2206                  if ( parent::TEXT_IS_NULL_SEQUENCE === $this->text_node_classification ) {
2207                      // Parse error: ignore the token.
2208                      return $this->step();
2209                  }
2210  
2211                  $this->reconstruct_active_formatting_elements();
2212  
2213                  /*
2214                   * Whitespace-only text does not affect the frameset-ok flag.
2215                   * It is probably inter-element whitespace, but it may also
2216                   * contain character references which decode only to whitespace.
2217                   */
2218                  if ( parent::TEXT_IS_GENERIC === $this->text_node_classification ) {
2219                      $this->state->frameset_ok = false;
2220                  }
2221  
2222                  $this->insert_html_element( $this->state->current_token );
2223                  return true;
2224  
2225              case '#comment':
2226              case '#funky-comment':
2227              case '#presumptuous-tag':
2228                  $this->insert_html_element( $this->state->current_token );
2229                  return true;
2230  
2231              /*
2232               * > A DOCTYPE token
2233               * > Parse error. Ignore the token.
2234               */
2235              case 'html':
2236                  return $this->step();
2237  
2238              /*
2239               * > A start tag whose tag name is "html"
2240               */
2241              case '+HTML':
2242                  if ( ! $this->state->stack_of_open_elements->contains( 'TEMPLATE' ) ) {
2243                      /*
2244                       * > Otherwise, for each attribute on the token, check to see if the attribute
2245                       * > is already present on the top element of the stack of open elements. If
2246                       * > it is not, add the attribute and its corresponding value to that element.
2247                       *
2248                       * This parser does not currently support this behavior: ignore the token.
2249                       */
2250                  }
2251  
2252                  // Ignore the token.
2253                  return $this->step();
2254  
2255              /*
2256               * > A start tag whose tag name is one of: "base", "basefont", "bgsound", "link",
2257               * > "meta", "noframes", "script", "style", "template", "title"
2258               * >
2259               * > An end tag whose tag name is "template"
2260               */
2261              case '+BASE':
2262              case '+BASEFONT':
2263              case '+BGSOUND':
2264              case '+LINK':
2265              case '+META':
2266              case '+NOFRAMES':
2267              case '+SCRIPT':
2268              case '+STYLE':
2269              case '+TEMPLATE':
2270              case '+TITLE':
2271              case '-TEMPLATE':
2272                  return $this->step_in_head();
2273  
2274              /*
2275               * > A start tag whose tag name is "body"
2276               *
2277               * This tag in the IN BODY insertion mode is a parse error.
2278               */
2279              case '+BODY':
2280                  if (
2281                      1 === $this->state->stack_of_open_elements->count() ||
2282                      'BODY' !== ( $this->state->stack_of_open_elements->at( 2 )->node_name ?? null ) ||
2283                      $this->state->stack_of_open_elements->contains( 'TEMPLATE' )
2284                  ) {
2285                      // Ignore the token.
2286                      return $this->step();
2287                  }
2288  
2289                  /*
2290                   * > Otherwise, set the frameset-ok flag to "not ok"; then, for each attribute
2291                   * > on the token, check to see if the attribute is already present on the body
2292                   * > element (the second element) on the stack of open elements, and if it is
2293                   * > not, add the attribute and its corresponding value to that element.
2294                   *
2295                   * This parser does not currently support this behavior: ignore the token.
2296                   */
2297                  $this->state->frameset_ok = false;
2298                  return $this->step();
2299  
2300              /*
2301               * > A start tag whose tag name is "frameset"
2302               *
2303               * This tag in the IN BODY insertion mode is a parse error.
2304               */
2305              case '+FRAMESET':
2306                  if (
2307                      1 === $this->state->stack_of_open_elements->count() ||
2308                      'BODY' !== ( $this->state->stack_of_open_elements->at( 2 )->node_name ?? null ) ||
2309                      false === $this->state->frameset_ok
2310                  ) {
2311                      // Ignore the token.
2312                      return $this->step();
2313                  }
2314  
2315                  /*
2316                   * > Otherwise, run the following steps:
2317                   */
2318                  $this->bail( 'Cannot process non-ignored FRAMESET tags.' );
2319                  break;
2320  
2321              /*
2322               * > An end tag whose tag name is "body"
2323               */
2324              case '-BODY':
2325                  if ( ! $this->state->stack_of_open_elements->has_element_in_scope( 'BODY' ) ) {
2326                      // Parse error: ignore the token.
2327                      return $this->step();
2328                  }
2329  
2330                  /*
2331                   * > Otherwise, if there is a node in the stack of open elements that is not either a
2332                   * > dd element, a dt element, an li element, an optgroup element, an option element,
2333                   * > a p element, an rb element, an rp element, an rt element, an rtc element, a tbody
2334                   * > element, a td element, a tfoot element, a th element, a thread element, a tr
2335                   * > element, the body element, or the html element, then this is a parse error.
2336                   *
2337                   * There is nothing to do for this parse error, so don't check for it.
2338                   */
2339  
2340                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_BODY;
2341                  /*
2342                   * The BODY element is not removed from the stack of open elements.
2343                   * Only internal state has changed, this does not qualify as a "step"
2344                   * in terms of advancing through the document to another token.
2345                   * Nothing has been pushed or popped.
2346                   * Proceed to parse the next item.
2347                   */
2348                  return $this->step();
2349  
2350              /*
2351               * > An end tag whose tag name is "html"
2352               */
2353              case '-HTML':
2354                  if ( ! $this->state->stack_of_open_elements->has_element_in_scope( 'BODY' ) ) {
2355                      // Parse error: ignore the token.
2356                      return $this->step();
2357                  }
2358  
2359                  /*
2360                   * > Otherwise, if there is a node in the stack of open elements that is not either a
2361                   * > dd element, a dt element, an li element, an optgroup element, an option element,
2362                   * > a p element, an rb element, an rp element, an rt element, an rtc element, a tbody
2363                   * > element, a td element, a tfoot element, a th element, a thread element, a tr
2364                   * > element, the body element, or the html element, then this is a parse error.
2365                   *
2366                   * There is nothing to do for this parse error, so don't check for it.
2367                   */
2368  
2369                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_BODY;
2370                  return $this->step( self::REPROCESS_CURRENT_NODE );
2371  
2372              /*
2373               * > A start tag whose tag name is one of: "address", "article", "aside",
2374               * > "blockquote", "center", "details", "dialog", "dir", "div", "dl",
2375               * > "fieldset", "figcaption", "figure", "footer", "header", "hgroup",
2376               * > "main", "menu", "nav", "ol", "p", "search", "section", "summary", "ul"
2377               */
2378              case '+ADDRESS':
2379              case '+ARTICLE':
2380              case '+ASIDE':
2381              case '+BLOCKQUOTE':
2382              case '+CENTER':
2383              case '+DETAILS':
2384              case '+DIALOG':
2385              case '+DIR':
2386              case '+DIV':
2387              case '+DL':
2388              case '+FIELDSET':
2389              case '+FIGCAPTION':
2390              case '+FIGURE':
2391              case '+FOOTER':
2392              case '+HEADER':
2393              case '+HGROUP':
2394              case '+MAIN':
2395              case '+MENU':
2396              case '+NAV':
2397              case '+OL':
2398              case '+P':
2399              case '+SEARCH':
2400              case '+SECTION':
2401              case '+SUMMARY':
2402              case '+UL':
2403                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2404                      $this->close_a_p_element();
2405                  }
2406  
2407                  $this->insert_html_element( $this->state->current_token );
2408                  return true;
2409  
2410              /*
2411               * > A start tag whose tag name is one of: "h1", "h2", "h3", "h4", "h5", "h6"
2412               */
2413              case '+H1':
2414              case '+H2':
2415              case '+H3':
2416              case '+H4':
2417              case '+H5':
2418              case '+H6':
2419                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2420                      $this->close_a_p_element();
2421                  }
2422  
2423                  if (
2424                      in_array(
2425                          $this->state->stack_of_open_elements->current_node()->node_name,
2426                          array( 'H1', 'H2', 'H3', 'H4', 'H5', 'H6' ),
2427                          true
2428                      )
2429                  ) {
2430                      // @todo Indicate a parse error once it's possible.
2431                      $this->state->stack_of_open_elements->pop();
2432                  }
2433  
2434                  $this->insert_html_element( $this->state->current_token );
2435                  return true;
2436  
2437              /*
2438               * > A start tag whose tag name is one of: "pre", "listing"
2439               */
2440              case '+PRE':
2441              case '+LISTING':
2442                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2443                      $this->close_a_p_element();
2444                  }
2445  
2446                  /*
2447                   * > If the next token is a U+000A LINE FEED (LF) character token,
2448                   * > then ignore that token and move on to the next one. (Newlines
2449                   * > at the start of pre blocks are ignored as an authoring convenience.)
2450                   *
2451                   * This is handled in `get_modifiable_text()`.
2452                   */
2453  
2454                  $this->insert_html_element( $this->state->current_token );
2455                  $this->state->frameset_ok = false;
2456                  return true;
2457  
2458              /*
2459               * > A start tag whose tag name is "form"
2460               */
2461              case '+FORM':
2462                  $stack_contains_template = $this->state->stack_of_open_elements->contains( 'TEMPLATE' );
2463  
2464                  if ( isset( $this->state->form_element ) && ! $stack_contains_template ) {
2465                      // Parse error: ignore the token.
2466                      return $this->step();
2467                  }
2468  
2469                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2470                      $this->close_a_p_element();
2471                  }
2472  
2473                  $this->insert_html_element( $this->state->current_token );
2474                  if ( ! $stack_contains_template ) {
2475                      $this->state->form_element = $this->state->current_token;
2476                  }
2477  
2478                  return true;
2479  
2480              /*
2481               * > A start tag whose tag name is "li"
2482               * > A start tag whose tag name is one of: "dd", "dt"
2483               */
2484              case '+DD':
2485              case '+DT':
2486              case '+LI':
2487                  $this->state->frameset_ok = false;
2488                  $node                     = $this->state->stack_of_open_elements->current_node();
2489                  $is_li                    = 'LI' === $token_name;
2490  
2491                  in_body_list_loop:
2492                  /*
2493                   * The logic for LI and DT/DD is the same except for one point: LI elements _only_
2494                   * close other LI elements, but a DT or DD element closes _any_ open DT or DD element.
2495                   */
2496                  if ( $is_li ? 'LI' === $node->node_name : ( 'DD' === $node->node_name || 'DT' === $node->node_name ) ) {
2497                      $node_name = $is_li ? 'LI' : $node->node_name;
2498                      $this->generate_implied_end_tags( $node_name );
2499                      if ( ! $this->state->stack_of_open_elements->current_node_is( $node_name ) ) {
2500                          // @todo Indicate a parse error once it's possible. This error does not impact the logic here.
2501                      }
2502  
2503                      $this->state->stack_of_open_elements->pop_until( $node_name );
2504                      goto in_body_list_done;
2505                  }
2506  
2507                  if (
2508                      'ADDRESS' !== $node->node_name &&
2509                      'DIV' !== $node->node_name &&
2510                      'P' !== $node->node_name &&
2511                      self::is_special( $node )
2512                  ) {
2513                      /*
2514                       * > If node is in the special category, but is not an address, div,
2515                       * > or p element, then jump to the step labeled done below.
2516                       */
2517                      goto in_body_list_done;
2518                  } else {
2519                      /*
2520                       * > Otherwise, set node to the previous entry in the stack of open elements
2521                       * > and return to the step labeled loop.
2522                       */
2523                      foreach ( $this->state->stack_of_open_elements->walk_up( $node ) as $item ) {
2524                          $node = $item;
2525                          break;
2526                      }
2527                      goto in_body_list_loop;
2528                  }
2529  
2530                  in_body_list_done:
2531                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2532                      $this->close_a_p_element();
2533                  }
2534  
2535                  $this->insert_html_element( $this->state->current_token );
2536                  return true;
2537  
2538              case '+PLAINTEXT':
2539                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2540                      $this->close_a_p_element();
2541                  }
2542  
2543                  /*
2544                   * @todo This may need to be handled in the Tag Processor and turn into
2545                   *       a single self-contained tag like TEXTAREA, whose modifiable text
2546                   *       is the rest of the input document as plaintext.
2547                   */
2548                  $this->bail( 'Cannot process PLAINTEXT elements.' );
2549                  break;
2550  
2551              /*
2552               * > A start tag whose tag name is "button"
2553               */
2554              case '+BUTTON':
2555                  if ( $this->state->stack_of_open_elements->has_element_in_scope( 'BUTTON' ) ) {
2556                      // @todo Indicate a parse error once it's possible. This error does not impact the logic here.
2557                      $this->generate_implied_end_tags();
2558                      $this->state->stack_of_open_elements->pop_until( 'BUTTON' );
2559                  }
2560  
2561                  $this->reconstruct_active_formatting_elements();
2562                  $this->insert_html_element( $this->state->current_token );
2563                  $this->state->frameset_ok = false;
2564  
2565                  return true;
2566  
2567              /*
2568               * > An end tag whose tag name is one of: "address", "article", "aside", "blockquote",
2569               * > "button", "center", "details", "dialog", "dir", "div", "dl", "fieldset",
2570               * > "figcaption", "figure", "footer", "header", "hgroup", "listing", "main",
2571               * > "menu", "nav", "ol", "pre", "search", "section", "summary", "ul"
2572               */
2573              case '-ADDRESS':
2574              case '-ARTICLE':
2575              case '-ASIDE':
2576              case '-BLOCKQUOTE':
2577              case '-BUTTON':
2578              case '-CENTER':
2579              case '-DETAILS':
2580              case '-DIALOG':
2581              case '-DIR':
2582              case '-DIV':
2583              case '-DL':
2584              case '-FIELDSET':
2585              case '-FIGCAPTION':
2586              case '-FIGURE':
2587              case '-FOOTER':
2588              case '-HEADER':
2589              case '-HGROUP':
2590              case '-LISTING':
2591              case '-MAIN':
2592              case '-MENU':
2593              case '-NAV':
2594              case '-OL':
2595              case '-PRE':
2596              case '-SEARCH':
2597              case '-SECTION':
2598              case '-SUMMARY':
2599              case '-UL':
2600                  if ( ! $this->state->stack_of_open_elements->has_element_in_scope( $token_name ) ) {
2601                      // @todo Report parse error.
2602                      // Ignore the token.
2603                      return $this->step();
2604                  }
2605  
2606                  $this->generate_implied_end_tags();
2607                  if ( ! $this->state->stack_of_open_elements->current_node_is( $token_name ) ) {
2608                      // @todo Record parse error: this error doesn't impact parsing.
2609                  }
2610                  $this->state->stack_of_open_elements->pop_until( $token_name );
2611                  return true;
2612  
2613              /*
2614               * > An end tag whose tag name is "form"
2615               */
2616              case '-FORM':
2617                  if ( ! $this->state->stack_of_open_elements->contains( 'TEMPLATE' ) ) {
2618                      $node                      = $this->state->form_element;
2619                      $this->state->form_element = null;
2620  
2621                      /*
2622                       * > If node is null or if the stack of open elements does not have node
2623                       * > in scope, then this is a parse error; return and ignore the token.
2624                       *
2625                       * @todo It's necessary to check if the form token itself is in scope, not
2626                       *       simply whether any FORM is in scope.
2627                       */
2628                      if (
2629                          null === $node ||
2630                          ! $this->state->stack_of_open_elements->has_element_in_scope( 'FORM' )
2631                      ) {
2632                          // Parse error: ignore the token.
2633                          return $this->step();
2634                      }
2635  
2636                      $this->generate_implied_end_tags();
2637                      if ( $node !== $this->state->stack_of_open_elements->current_node() ) {
2638                          // @todo Indicate a parse error once it's possible. This error does not impact the logic here.
2639                          $this->bail( 'Cannot close a FORM when other elements remain open as this would throw off the breadcrumbs for the following tokens.' );
2640                      }
2641  
2642                      $this->state->stack_of_open_elements->remove_node( $node );
2643                      return true;
2644                  } else {
2645                      /*
2646                       * > If the stack of open elements does not have a form element in scope,
2647                       * > then this is a parse error; return and ignore the token.
2648                       *
2649                       * Note that unlike in the clause above, this is checking for any FORM in scope.
2650                       */
2651                      if ( ! $this->state->stack_of_open_elements->has_element_in_scope( 'FORM' ) ) {
2652                          // Parse error: ignore the token.
2653                          return $this->step();
2654                      }
2655  
2656                      $this->generate_implied_end_tags();
2657  
2658                      if ( ! $this->state->stack_of_open_elements->current_node_is( 'FORM' ) ) {
2659                          // @todo Indicate a parse error once it's possible. This error does not impact the logic here.
2660                      }
2661  
2662                      $this->state->stack_of_open_elements->pop_until( 'FORM' );
2663                      return true;
2664                  }
2665                  break;
2666  
2667              /*
2668               * > An end tag whose tag name is "p"
2669               */
2670              case '-P':
2671                  if ( ! $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2672                      $this->insert_html_element( $this->state->current_token );
2673                  }
2674  
2675                  $this->close_a_p_element();
2676                  return true;
2677  
2678              /*
2679               * > An end tag whose tag name is "li"
2680               * > An end tag whose tag name is one of: "dd", "dt"
2681               */
2682              case '-DD':
2683              case '-DT':
2684              case '-LI':
2685                  if (
2686                      /*
2687                       * An end tag whose tag name is "li":
2688                       * If the stack of open elements does not have an li element in list item scope,
2689                       * then this is a parse error; ignore the token.
2690                       */
2691                      (
2692                          'LI' === $token_name &&
2693                          ! $this->state->stack_of_open_elements->has_element_in_list_item_scope( 'LI' )
2694                      ) ||
2695                      /*
2696                       * An end tag whose tag name is one of: "dd", "dt":
2697                       * If the stack of open elements does not have an element in scope that is an
2698                       * HTML element with the same tag name as that of the token, then this is a
2699                       * parse error; ignore the token.
2700                       */
2701                      (
2702                          'LI' !== $token_name &&
2703                          ! $this->state->stack_of_open_elements->has_element_in_scope( $token_name )
2704                      )
2705                  ) {
2706                      /*
2707                       * This is a parse error, ignore the token.
2708                       *
2709                       * @todo Indicate a parse error once it's possible.
2710                       */
2711                      return $this->step();
2712                  }
2713  
2714                  $this->generate_implied_end_tags( $token_name );
2715  
2716                  if ( ! $this->state->stack_of_open_elements->current_node_is( $token_name ) ) {
2717                      // @todo Indicate a parse error once it's possible. This error does not impact the logic here.
2718                  }
2719  
2720                  $this->state->stack_of_open_elements->pop_until( $token_name );
2721                  return true;
2722  
2723              /*
2724               * > An end tag whose tag name is one of: "h1", "h2", "h3", "h4", "h5", "h6"
2725               */
2726              case '-H1':
2727              case '-H2':
2728              case '-H3':
2729              case '-H4':
2730              case '-H5':
2731              case '-H6':
2732                  if ( ! $this->state->stack_of_open_elements->has_element_in_scope( '(internal: H1 through H6 - do not use)' ) ) {
2733                      /*
2734                       * This is a parse error; ignore the token.
2735                       *
2736                       * @todo Indicate a parse error once it's possible.
2737                       */
2738                      return $this->step();
2739                  }
2740  
2741                  $this->generate_implied_end_tags();
2742  
2743                  if ( ! $this->state->stack_of_open_elements->current_node_is( $token_name ) ) {
2744                      // @todo Record parse error: this error doesn't impact parsing.
2745                  }
2746  
2747                  $this->state->stack_of_open_elements->pop_until( '(internal: H1 through H6 - do not use)' );
2748                  return true;
2749  
2750              /*
2751               * > A start tag whose tag name is "a"
2752               */
2753              case '+A':
2754                  foreach ( $this->state->active_formatting_elements->walk_up() as $item ) {
2755                      switch ( $item->node_name ) {
2756                          case 'marker':
2757                              break 2;
2758  
2759                          case 'A':
2760                              $this->run_adoption_agency_algorithm();
2761                              $this->state->active_formatting_elements->remove_node( $item );
2762                              $this->state->stack_of_open_elements->remove_node( $item );
2763                              break 2;
2764                      }
2765                  }
2766  
2767                  $this->reconstruct_active_formatting_elements();
2768                  $this->insert_html_element( $this->state->current_token );
2769                  $this->state->active_formatting_elements->push( $this->state->current_token );
2770                  return true;
2771  
2772              /*
2773               * > A start tag whose tag name is one of: "b", "big", "code", "em", "font", "i",
2774               * > "s", "small", "strike", "strong", "tt", "u"
2775               */
2776              case '+B':
2777              case '+BIG':
2778              case '+CODE':
2779              case '+EM':
2780              case '+FONT':
2781              case '+I':
2782              case '+S':
2783              case '+SMALL':
2784              case '+STRIKE':
2785              case '+STRONG':
2786              case '+TT':
2787              case '+U':
2788                  $this->reconstruct_active_formatting_elements();
2789                  $this->insert_html_element( $this->state->current_token );
2790                  $this->state->active_formatting_elements->push( $this->state->current_token );
2791                  return true;
2792  
2793              /*
2794               * > A start tag whose tag name is "nobr"
2795               */
2796              case '+NOBR':
2797                  $this->reconstruct_active_formatting_elements();
2798  
2799                  if ( $this->state->stack_of_open_elements->has_element_in_scope( 'NOBR' ) ) {
2800                      // Parse error.
2801                      $this->run_adoption_agency_algorithm();
2802                      $this->reconstruct_active_formatting_elements();
2803                  }
2804  
2805                  $this->insert_html_element( $this->state->current_token );
2806                  $this->state->active_formatting_elements->push( $this->state->current_token );
2807                  return true;
2808  
2809              /*
2810               * > An end tag whose tag name is one of: "a", "b", "big", "code", "em", "font", "i",
2811               * > "nobr", "s", "small", "strike", "strong", "tt", "u"
2812               */
2813              case '-A':
2814              case '-B':
2815              case '-BIG':
2816              case '-CODE':
2817              case '-EM':
2818              case '-FONT':
2819              case '-I':
2820              case '-NOBR':
2821              case '-S':
2822              case '-SMALL':
2823              case '-STRIKE':
2824              case '-STRONG':
2825              case '-TT':
2826              case '-U':
2827                  $this->run_adoption_agency_algorithm();
2828                  return true;
2829  
2830              /*
2831               * > A start tag whose tag name is one of: "applet", "marquee", "object"
2832               */
2833              case '+APPLET':
2834              case '+MARQUEE':
2835              case '+OBJECT':
2836                  $this->reconstruct_active_formatting_elements();
2837                  $this->insert_html_element( $this->state->current_token );
2838                  $this->state->active_formatting_elements->insert_marker();
2839                  $this->state->frameset_ok = false;
2840                  return true;
2841  
2842              /*
2843               * > A end tag token whose tag name is one of: "applet", "marquee", "object"
2844               */
2845              case '-APPLET':
2846              case '-MARQUEE':
2847              case '-OBJECT':
2848                  if ( ! $this->state->stack_of_open_elements->has_element_in_scope( $token_name ) ) {
2849                      // Parse error: ignore the token.
2850                      return $this->step();
2851                  }
2852  
2853                  $this->generate_implied_end_tags();
2854                  if ( ! $this->state->stack_of_open_elements->current_node_is( $token_name ) ) {
2855                      // This is a parse error.
2856                  }
2857  
2858                  $this->state->stack_of_open_elements->pop_until( $token_name );
2859                  $this->state->active_formatting_elements->clear_up_to_last_marker();
2860                  return true;
2861  
2862              /*
2863               * > A start tag whose tag name is "table"
2864               */
2865              case '+TABLE':
2866                  /*
2867                   * > If the Document is not set to quirks mode, and the stack of open elements
2868                   * > has a p element in button scope, then close a p element.
2869                   */
2870                  if (
2871                      WP_HTML_Tag_Processor::QUIRKS_MODE !== $this->compat_mode &&
2872                      $this->state->stack_of_open_elements->has_p_in_button_scope()
2873                  ) {
2874                      $this->close_a_p_element();
2875                  }
2876  
2877                  $this->insert_html_element( $this->state->current_token );
2878                  $this->state->frameset_ok    = false;
2879                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
2880                  return true;
2881  
2882              /*
2883               * > An end tag whose tag name is "br"
2884               *
2885               * This is prevented from happening because the Tag Processor
2886               * reports all closing BR tags as if they were opening tags.
2887               */
2888  
2889              /*
2890               * > A start tag whose tag name is one of: "area", "br", "embed", "img", "keygen", "wbr"
2891               */
2892              case '+AREA':
2893              case '+BR':
2894              case '+EMBED':
2895              case '+IMG':
2896              case '+KEYGEN':
2897              case '+WBR':
2898                  $this->reconstruct_active_formatting_elements();
2899                  $this->insert_html_element( $this->state->current_token );
2900                  $this->state->frameset_ok = false;
2901                  return true;
2902  
2903              /*
2904               * > A start tag whose tag name is "input"
2905               */
2906              case '+INPUT':
2907                  $this->reconstruct_active_formatting_elements();
2908                  $this->insert_html_element( $this->state->current_token );
2909  
2910                  /*
2911                   * > If the token does not have an attribute with the name "type", or if it does,
2912                   * > but that attribute's value is not an ASCII case-insensitive match for the
2913                   * > string "hidden", then: set the frameset-ok flag to "not ok".
2914                   */
2915                  $type_attribute = $this->get_attribute( 'type' );
2916                  if ( ! is_string( $type_attribute ) || 'hidden' !== strtolower( $type_attribute ) ) {
2917                      $this->state->frameset_ok = false;
2918                  }
2919  
2920                  return true;
2921  
2922              /*
2923               * > A start tag whose tag name is one of: "param", "source", "track"
2924               */
2925              case '+PARAM':
2926              case '+SOURCE':
2927              case '+TRACK':
2928                  $this->insert_html_element( $this->state->current_token );
2929                  return true;
2930  
2931              /*
2932               * > A start tag whose tag name is "hr"
2933               */
2934              case '+HR':
2935                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2936                      $this->close_a_p_element();
2937                  }
2938                  $this->insert_html_element( $this->state->current_token );
2939                  $this->state->frameset_ok = false;
2940                  return true;
2941  
2942              /*
2943               * > A start tag whose tag name is "image"
2944               */
2945              case '+IMAGE':
2946                  /*
2947                   * > Parse error. Change the token's tag name to "img" and reprocess it. (Don't ask.)
2948                   *
2949                   * Note that this is handled elsewhere, so it should not be possible to reach this code.
2950                   */
2951                  $this->bail( "Cannot process an IMAGE tag. (Don't ask.)" );
2952                  break;
2953  
2954              /*
2955               * > A start tag whose tag name is "textarea"
2956               */
2957              case '+TEXTAREA':
2958                  $this->insert_html_element( $this->state->current_token );
2959  
2960                  /*
2961                   * > If the next token is a U+000A LINE FEED (LF) character token, then ignore
2962                   * > that token and move on to the next one. (Newlines at the start of
2963                   * > textarea elements are ignored as an authoring convenience.)
2964                   *
2965                   * This is handled in `get_modifiable_text()`.
2966                   */
2967  
2968                  $this->state->frameset_ok = false;
2969  
2970                  /*
2971                   * > Switch the insertion mode to "text".
2972                   *
2973                   * As a self-contained node, this behavior is handled in the Tag Processor.
2974                   */
2975                  return true;
2976  
2977              /*
2978               * > A start tag whose tag name is "xmp"
2979               */
2980              case '+XMP':
2981                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2982                      $this->close_a_p_element();
2983                  }
2984  
2985                  $this->reconstruct_active_formatting_elements();
2986                  $this->state->frameset_ok = false;
2987  
2988                  /*
2989                   * > Follow the generic raw text element parsing algorithm.
2990                   *
2991                   * As a self-contained node, this behavior is handled in the Tag Processor.
2992                   */
2993                  $this->insert_html_element( $this->state->current_token );
2994                  return true;
2995  
2996              /*
2997               * A start tag whose tag name is "iframe"
2998               */
2999              case '+IFRAME':
3000                  $this->state->frameset_ok = false;
3001  
3002                  /*
3003                   * > Follow the generic raw text element parsing algorithm.
3004                   *
3005                   * As a self-contained node, this behavior is handled in the Tag Processor.
3006                   */
3007                  $this->insert_html_element( $this->state->current_token );
3008                  return true;
3009  
3010              /*
3011               * > A start tag whose tag name is "noembed"
3012               * > A start tag whose tag name is "noscript", if the scripting flag is enabled
3013               *
3014               * The scripting flag is never enabled in this parser.
3015               */
3016              case '+NOEMBED':
3017                  $this->insert_html_element( $this->state->current_token );
3018                  return true;
3019  
3020              /*
3021               * > A start tag whose tag name is "select"
3022               */
3023              case '+SELECT':
3024                  $this->reconstruct_active_formatting_elements();
3025                  $this->insert_html_element( $this->state->current_token );
3026                  $this->state->frameset_ok = false;
3027  
3028                  switch ( $this->state->insertion_mode ) {
3029                      /*
3030                       * > If the insertion mode is one of "in table", "in caption", "in table body", "in row",
3031                       * > or "in cell", then switch the insertion mode to "in select in table".
3032                       */
3033                      case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE:
3034                      case WP_HTML_Processor_State::INSERTION_MODE_IN_CAPTION:
3035                      case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY:
3036                      case WP_HTML_Processor_State::INSERTION_MODE_IN_ROW:
3037                      case WP_HTML_Processor_State::INSERTION_MODE_IN_CELL:
3038                          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT_IN_TABLE;
3039                          break;
3040  
3041                      /*
3042                       * > Otherwise, switch the insertion mode to "in select".
3043                       */
3044                      default:
3045                          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT;
3046                          break;
3047                  }
3048                  return true;
3049  
3050              /*
3051               * > A start tag whose tag name is one of: "optgroup", "option"
3052               */
3053              case '+OPTGROUP':
3054              case '+OPTION':
3055                  if ( $this->state->stack_of_open_elements->current_node_is( 'OPTION' ) ) {
3056                      $this->state->stack_of_open_elements->pop();
3057                  }
3058                  $this->reconstruct_active_formatting_elements();
3059                  $this->insert_html_element( $this->state->current_token );
3060                  return true;
3061  
3062              /*
3063               * > A start tag whose tag name is one of: "rb", "rtc"
3064               */
3065              case '+RB':
3066              case '+RTC':
3067                  if ( $this->state->stack_of_open_elements->has_element_in_scope( 'RUBY' ) ) {
3068                      $this->generate_implied_end_tags();
3069  
3070                      if ( $this->state->stack_of_open_elements->current_node_is( 'RUBY' ) ) {
3071                          // @todo Indicate a parse error once it's possible.
3072                      }
3073                  }
3074  
3075                  $this->insert_html_element( $this->state->current_token );
3076                  return true;
3077  
3078              /*
3079               * > A start tag whose tag name is one of: "rp", "rt"
3080               */
3081              case '+RP':
3082              case '+RT':
3083                  if ( $this->state->stack_of_open_elements->has_element_in_scope( 'RUBY' ) ) {
3084                      $this->generate_implied_end_tags( 'RTC' );
3085  
3086                      $current_node_name = $this->state->stack_of_open_elements->current_node()->node_name;
3087                      if ( 'RTC' === $current_node_name || 'RUBY' === $current_node_name ) {
3088                          // @todo Indicate a parse error once it's possible.
3089                      }
3090                  }
3091  
3092                  $this->insert_html_element( $this->state->current_token );
3093                  return true;
3094  
3095              /*
3096               * > A start tag whose tag name is "math"
3097               */
3098              case '+MATH':
3099                  $this->reconstruct_active_formatting_elements();
3100  
3101                  /*
3102                   * @todo Adjust MathML attributes for the token. (This fixes the case of MathML attributes that are not all lowercase.)
3103                   * @todo Adjust foreign attributes for the token. (This fixes the use of namespaced attributes, in particular XLink.)
3104                   *
3105                   * These ought to be handled in the attribute methods.
3106                   */
3107                  $this->state->current_token->namespace = 'math';
3108                  $this->insert_html_element( $this->state->current_token );
3109                  if ( $this->state->current_token->has_self_closing_flag ) {
3110                      $this->state->stack_of_open_elements->pop();
3111                  }
3112                  return true;
3113  
3114              /*
3115               * > A start tag whose tag name is "svg"
3116               */
3117              case '+SVG':
3118                  $this->reconstruct_active_formatting_elements();
3119  
3120                  /*
3121                   * @todo Adjust SVG attributes for the token. (This fixes the case of SVG attributes that are not all lowercase.)
3122                   * @todo Adjust foreign attributes for the token. (This fixes the use of namespaced attributes, in particular XLink in SVG.)
3123                   *
3124                   * These ought to be handled in the attribute methods.
3125                   */
3126                  $this->state->current_token->namespace = 'svg';
3127                  $this->insert_html_element( $this->state->current_token );
3128                  if ( $this->state->current_token->has_self_closing_flag ) {
3129                      $this->state->stack_of_open_elements->pop();
3130                  }
3131                  return true;
3132  
3133              /*
3134               * > A start tag whose tag name is one of: "caption", "col", "colgroup",
3135               * > "frame", "head", "tbody", "td", "tfoot", "th", "thead", "tr"
3136               */
3137              case '+CAPTION':
3138              case '+COL':
3139              case '+COLGROUP':
3140              case '+FRAME':
3141              case '+HEAD':
3142              case '+TBODY':
3143              case '+TD':
3144              case '+TFOOT':
3145              case '+TH':
3146              case '+THEAD':
3147              case '+TR':
3148                  // Parse error. Ignore the token.
3149                  return $this->step();
3150          }
3151  
3152          if ( ! parent::is_tag_closer() ) {
3153              /*
3154               * > Any other start tag
3155               */
3156              $this->reconstruct_active_formatting_elements();
3157              $this->insert_html_element( $this->state->current_token );
3158              return true;
3159          } else {
3160              /*
3161               * > Any other end tag
3162               */
3163  
3164              /*
3165               * Find the corresponding tag opener in the stack of open elements, if
3166               * it exists before reaching a special element, which provides a kind
3167               * of boundary in the stack. For example, a `</custom-tag>` should not
3168               * close anything beyond its containing `P` or `DIV` element.
3169               */
3170              foreach ( $this->state->stack_of_open_elements->walk_up() as $node ) {
3171                  if ( 'html' === $node->namespace && $token_name === $node->node_name ) {
3172                      break;
3173                  }
3174  
3175                  if ( self::is_special( $node ) ) {
3176                      // This is a parse error, ignore the token.
3177                      return $this->step();
3178                  }
3179              }
3180  
3181              $this->generate_implied_end_tags( $token_name );
3182              if ( $node !== $this->state->stack_of_open_elements->current_node() ) {
3183                  // @todo Record parse error: this error doesn't impact parsing.
3184              }
3185  
3186              foreach ( $this->state->stack_of_open_elements->walk_up() as $item ) {
3187                  $this->state->stack_of_open_elements->pop();
3188                  if ( $node === $item ) {
3189                      return true;
3190                  }
3191              }
3192          }
3193  
3194          $this->bail( 'Should not have been able to reach end of IN BODY processing. Check HTML API code.' );
3195          // This unnecessary return prevents tools from inaccurately reporting type errors.
3196          return false;
3197      }
3198  
3199      /**
3200       * Parses next element in the 'in table' insertion mode.
3201       *
3202       * This internal function performs the 'in table' insertion mode
3203       * logic for the generalized WP_HTML_Processor::step() function.
3204       *
3205       * @since 6.7.0
3206       *
3207       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3208       *
3209       * @see https://html.spec.whatwg.org/#parsing-main-intable
3210       * @see WP_HTML_Processor::step
3211       *
3212       * @return bool Whether an element was found.
3213       */
3214  	private function step_in_table(): bool {
3215          $token_name = $this->get_token_name();
3216          $token_type = $this->get_token_type();
3217          $op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
3218          $op         = "{$op_sigil}{$token_name}";
3219  
3220          switch ( $op ) {
3221              /*
3222               * > A character token, if the current node is table,
3223               * > tbody, template, tfoot, thead, or tr element
3224               */
3225              case '#text':
3226                  $current_node      = $this->state->stack_of_open_elements->current_node();
3227                  $current_node_name = $current_node ? $current_node->node_name : null;
3228                  if (
3229                      $current_node_name && (
3230                          'TABLE' === $current_node_name ||
3231                          'TBODY' === $current_node_name ||
3232                          'TEMPLATE' === $current_node_name ||
3233                          'TFOOT' === $current_node_name ||
3234                          'THEAD' === $current_node_name ||
3235                          'TR' === $current_node_name
3236                      )
3237                  ) {
3238                      /*
3239                       * If the text is empty after processing HTML entities and stripping
3240                       * U+0000 NULL bytes then ignore the token.
3241                       */
3242                      if ( parent::TEXT_IS_NULL_SEQUENCE === $this->text_node_classification ) {
3243                          return $this->step();
3244                      }
3245  
3246                      /*
3247                       * This follows the rules for "in table text" insertion mode.
3248                       *
3249                       * Whitespace-only text nodes are inserted in-place. Otherwise
3250                       * foster parenting is enabled and the nodes would be
3251                       * inserted out-of-place.
3252                       *
3253                       * > If any of the tokens in the pending table character tokens
3254                       * > list are character tokens that are not ASCII whitespace,
3255                       * > then this is a parse error: reprocess the character tokens
3256                       * > in the pending table character tokens list using the rules
3257                       * > given in the "anything else" entry in the "in table"
3258                       * > insertion mode.
3259                       * >
3260                       * > Otherwise, insert the characters given by the pending table
3261                       * > character tokens list.
3262                       *
3263                       * @see https://html.spec.whatwg.org/#parsing-main-intabletext
3264                       */
3265                      if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
3266                          $this->insert_html_element( $this->state->current_token );
3267                          return true;
3268                      }
3269  
3270                      // Non-whitespace would trigger fostering, unsupported at this time.
3271                      $this->bail( 'Foster parenting is not supported.' );
3272                      break;
3273                  }
3274                  break;
3275  
3276              /*
3277               * > A comment token
3278               */
3279              case '#comment':
3280              case '#funky-comment':
3281              case '#presumptuous-tag':
3282                  $this->insert_html_element( $this->state->current_token );
3283                  return true;
3284  
3285              /*
3286               * > A DOCTYPE token
3287               */
3288              case 'html':
3289                  // Parse error: ignore the token.
3290                  return $this->step();
3291  
3292              /*
3293               * > A start tag whose tag name is "caption"
3294               */
3295              case '+CAPTION':
3296                  $this->state->stack_of_open_elements->clear_to_table_context();
3297                  $this->state->active_formatting_elements->insert_marker();
3298                  $this->insert_html_element( $this->state->current_token );
3299                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_CAPTION;
3300                  return true;
3301  
3302              /*
3303               * > A start tag whose tag name is "colgroup"
3304               */
3305              case '+COLGROUP':
3306                  $this->state->stack_of_open_elements->clear_to_table_context();
3307                  $this->insert_html_element( $this->state->current_token );
3308                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP;
3309                  return true;
3310  
3311              /*
3312               * > A start tag whose tag name is "col"
3313               */
3314              case '+COL':
3315                  $this->state->stack_of_open_elements->clear_to_table_context();
3316  
3317                  /*
3318                   * > Insert an HTML element for a "colgroup" start tag token with no attributes,
3319                   * > then switch the insertion mode to "in column group".
3320                   */
3321                  $this->insert_virtual_node( 'COLGROUP' );
3322                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP;
3323                  return $this->step( self::REPROCESS_CURRENT_NODE );
3324  
3325              /*
3326               * > A start tag whose tag name is one of: "tbody", "tfoot", "thead"
3327               */
3328              case '+TBODY':
3329              case '+TFOOT':
3330              case '+THEAD':
3331                  $this->state->stack_of_open_elements->clear_to_table_context();
3332                  $this->insert_html_element( $this->state->current_token );
3333                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
3334                  return true;
3335  
3336              /*
3337               * > A start tag whose tag name is one of: "td", "th", "tr"
3338               */
3339              case '+TD':
3340              case '+TH':
3341              case '+TR':
3342                  $this->state->stack_of_open_elements->clear_to_table_context();
3343                  /*
3344                   * > Insert an HTML element for a "tbody" start tag token with no attributes,
3345                   * > then switch the insertion mode to "in table body".
3346                   */
3347                  $this->insert_virtual_node( 'TBODY' );
3348                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
3349                  return $this->step( self::REPROCESS_CURRENT_NODE );
3350  
3351              /*
3352               * > A start tag whose tag name is "table"
3353               *
3354               * This tag in the IN TABLE insertion mode is a parse error.
3355               */
3356              case '+TABLE':
3357                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TABLE' ) ) {
3358                      return $this->step();
3359                  }
3360  
3361                  $this->state->stack_of_open_elements->pop_until( 'TABLE' );
3362                  $this->reset_insertion_mode_appropriately();
3363                  return $this->step( self::REPROCESS_CURRENT_NODE );
3364  
3365              /*
3366               * > An end tag whose tag name is "table"
3367               */
3368              case '-TABLE':
3369                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TABLE' ) ) {
3370                      // @todo Indicate a parse error once it's possible.
3371                      return $this->step();
3372                  }
3373  
3374                  $this->state->stack_of_open_elements->pop_until( 'TABLE' );
3375                  $this->reset_insertion_mode_appropriately();
3376                  return true;
3377  
3378              /*
3379               * > An end tag whose tag name is one of: "body", "caption", "col", "colgroup", "html", "tbody", "td", "tfoot", "th", "thead", "tr"
3380               */
3381              case '-BODY':
3382              case '-CAPTION':
3383              case '-COL':
3384              case '-COLGROUP':
3385              case '-HTML':
3386              case '-TBODY':
3387              case '-TD':
3388              case '-TFOOT':
3389              case '-TH':
3390              case '-THEAD':
3391              case '-TR':
3392                  // Parse error: ignore the token.
3393                  return $this->step();
3394  
3395              /*
3396               * > A start tag whose tag name is one of: "style", "script", "template"
3397               * > An end tag whose tag name is "template"
3398               */
3399              case '+STYLE':
3400              case '+SCRIPT':
3401              case '+TEMPLATE':
3402              case '-TEMPLATE':
3403                  /*
3404                   * > Process the token using the rules for the "in head" insertion mode.
3405                   */
3406                  return $this->step_in_head();
3407  
3408              /*
3409               * > A start tag whose tag name is "input"
3410               *
3411               * > If the token does not have an attribute with the name "type", or if it does, but
3412               * > that attribute's value is not an ASCII case-insensitive match for the string
3413               * > "hidden", then: act as described in the "anything else" entry below.
3414               */
3415              case '+INPUT':
3416                  $type_attribute = $this->get_attribute( 'type' );
3417                  if ( ! is_string( $type_attribute ) || 'hidden' !== strtolower( $type_attribute ) ) {
3418                      goto anything_else;
3419                  }
3420                  // @todo Indicate a parse error once it's possible.
3421                  $this->insert_html_element( $this->state->current_token );
3422                  return true;
3423  
3424              /*
3425               * > A start tag whose tag name is "form"
3426               *
3427               * This tag in the IN TABLE insertion mode is a parse error.
3428               */
3429              case '+FORM':
3430                  if (
3431                      $this->state->stack_of_open_elements->has_element_in_scope( 'TEMPLATE' ) ||
3432                      isset( $this->state->form_element )
3433                  ) {
3434                      return $this->step();
3435                  }
3436  
3437                  // This FORM is special because it immediately closes and cannot have other children.
3438                  $this->insert_html_element( $this->state->current_token );
3439                  $this->state->form_element = $this->state->current_token;
3440                  $this->state->stack_of_open_elements->pop();
3441                  return true;
3442          }
3443  
3444          /*
3445           * > Anything else
3446           * > Parse error. Enable foster parenting, process the token using the rules for the
3447           * > "in body" insertion mode, and then disable foster parenting.
3448           *
3449           * @todo Indicate a parse error once it's possible.
3450           */
3451          anything_else:
3452          $this->bail( 'Foster parenting is not supported.' );
3453      }
3454  
3455      /**
3456       * Parses next element in the 'in table text' insertion mode.
3457       *
3458       * This internal function performs the 'in table text' insertion mode
3459       * logic for the generalized WP_HTML_Processor::step() function.
3460       *
3461       * @since 6.7.0 Stub implementation.
3462       *
3463       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3464       *
3465       * @see https://html.spec.whatwg.org/#parsing-main-intabletext
3466       * @see WP_HTML_Processor::step
3467       *
3468       * @return bool Whether an element was found.
3469       */
3470  	private function step_in_table_text(): bool {
3471          $this->bail( 'No support for parsing in the ' . WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_TEXT . ' state.' );
3472      }
3473  
3474      /**
3475       * Parses next element in the 'in caption' insertion mode.
3476       *
3477       * This internal function performs the 'in caption' insertion mode
3478       * logic for the generalized WP_HTML_Processor::step() function.
3479       *
3480       * @since 6.7.0
3481       *
3482       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3483       *
3484       * @see https://html.spec.whatwg.org/#parsing-main-incaption
3485       * @see WP_HTML_Processor::step
3486       *
3487       * @return bool Whether an element was found.
3488       */
3489  	private function step_in_caption(): bool {
3490          $tag_name = $this->get_tag();
3491          $op_sigil = $this->is_tag_closer() ? '-' : '+';
3492          $op       = "{$op_sigil}{$tag_name}";
3493  
3494          switch ( $op ) {
3495              /*
3496               * > An end tag whose tag name is "caption"
3497               * > A start tag whose tag name is one of: "caption", "col", "colgroup", "tbody", "td", "tfoot", "th", "thead", "tr"
3498               * > An end tag whose tag name is "table"
3499               *
3500               * These tag handling rules are identical except for the final instruction.
3501               * Handle them in a single block.
3502               */
3503              case '-CAPTION':
3504              case '+CAPTION':
3505              case '+COL':
3506              case '+COLGROUP':
3507              case '+TBODY':
3508              case '+TD':
3509              case '+TFOOT':
3510              case '+TH':
3511              case '+THEAD':
3512              case '+TR':
3513              case '-TABLE':
3514                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'CAPTION' ) ) {
3515                      // Parse error: ignore the token.
3516                      return $this->step();
3517                  }
3518  
3519                  $this->generate_implied_end_tags();
3520                  if ( ! $this->state->stack_of_open_elements->current_node_is( 'CAPTION' ) ) {
3521                      // @todo Indicate a parse error once it's possible.
3522                  }
3523  
3524                  $this->state->stack_of_open_elements->pop_until( 'CAPTION' );
3525                  $this->state->active_formatting_elements->clear_up_to_last_marker();
3526                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
3527  
3528                  // If this is not a CAPTION end tag, the token should be reprocessed.
3529                  if ( '-CAPTION' === $op ) {
3530                      return true;
3531                  }
3532                  return $this->step( self::REPROCESS_CURRENT_NODE );
3533  
3534              /**
3535               * > An end tag whose tag name is one of: "body", "col", "colgroup", "html", "tbody", "td", "tfoot", "th", "thead", "tr"
3536               */
3537              case '-BODY':
3538              case '-COL':
3539              case '-COLGROUP':
3540              case '-HTML':
3541              case '-TBODY':
3542              case '-TD':
3543              case '-TFOOT':
3544              case '-TH':
3545              case '-THEAD':
3546              case '-TR':
3547                  // Parse error: ignore the token.
3548                  return $this->step();
3549          }
3550  
3551          /**
3552           * > Anything else
3553           * >   Process the token using the rules for the "in body" insertion mode.
3554           */
3555          return $this->step_in_body();
3556      }
3557  
3558      /**
3559       * Parses next element in the 'in column group' insertion mode.
3560       *
3561       * This internal function performs the 'in column group' insertion mode
3562       * logic for the generalized WP_HTML_Processor::step() function.
3563       *
3564       * @since 6.7.0
3565       *
3566       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3567       *
3568       * @see https://html.spec.whatwg.org/#parsing-main-incolgroup
3569       * @see WP_HTML_Processor::step
3570       *
3571       * @return bool Whether an element was found.
3572       */
3573  	private function step_in_column_group(): bool {
3574          $token_name = $this->get_token_name();
3575          $token_type = $this->get_token_type();
3576          $op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
3577          $op         = "{$op_sigil}{$token_name}";
3578  
3579          switch ( $op ) {
3580              /*
3581               * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
3582               * > U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
3583               */
3584              case '#text':
3585                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
3586                      // Insert the character.
3587                      $this->insert_html_element( $this->state->current_token );
3588                      return true;
3589                  }
3590  
3591                  goto in_column_group_anything_else;
3592                  break;
3593  
3594              /*
3595               * > A comment token
3596               */
3597              case '#comment':
3598              case '#funky-comment':
3599              case '#presumptuous-tag':
3600                  $this->insert_html_element( $this->state->current_token );
3601                  return true;
3602  
3603              /*
3604               * > A DOCTYPE token
3605               */
3606              case 'html':
3607                  // @todo Indicate a parse error once it's possible.
3608                  return $this->step();
3609  
3610              /*
3611               * > A start tag whose tag name is "html"
3612               */
3613              case '+HTML':
3614                  return $this->step_in_body();
3615  
3616              /*
3617               * > A start tag whose tag name is "col"
3618               */
3619              case '+COL':
3620                  $this->insert_html_element( $this->state->current_token );
3621                  $this->state->stack_of_open_elements->pop();
3622                  return true;
3623  
3624              /*
3625               * > An end tag whose tag name is "colgroup"
3626               */
3627              case '-COLGROUP':
3628                  if ( ! $this->state->stack_of_open_elements->current_node_is( 'COLGROUP' ) ) {
3629                      // @todo Indicate a parse error once it's possible.
3630                      return $this->step();
3631                  }
3632                  $this->state->stack_of_open_elements->pop();
3633                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
3634                  return true;
3635  
3636              /*
3637               * > An end tag whose tag name is "col"
3638               */
3639              case '-COL':
3640                  // Parse error: ignore the token.
3641                  return $this->step();
3642  
3643              /*
3644               * > A start tag whose tag name is "template"
3645               * > An end tag whose tag name is "template"
3646               */
3647              case '+TEMPLATE':
3648              case '-TEMPLATE':
3649                  return $this->step_in_head();
3650          }
3651  
3652          in_column_group_anything_else:
3653          /*
3654           * > Anything else
3655           */
3656          if ( ! $this->state->stack_of_open_elements->current_node_is( 'COLGROUP' ) ) {
3657              // @todo Indicate a parse error once it's possible.
3658              return $this->step();
3659          }
3660          $this->state->stack_of_open_elements->pop();
3661          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
3662          return $this->step( self::REPROCESS_CURRENT_NODE );
3663      }
3664  
3665      /**
3666       * Parses next element in the 'in table body' insertion mode.
3667       *
3668       * This internal function performs the 'in table body' insertion mode
3669       * logic for the generalized WP_HTML_Processor::step() function.
3670       *
3671       * @since 6.7.0
3672       *
3673       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3674       *
3675       * @see https://html.spec.whatwg.org/#parsing-main-intbody
3676       * @see WP_HTML_Processor::step
3677       *
3678       * @return bool Whether an element was found.
3679       */
3680  	private function step_in_table_body(): bool {
3681          $tag_name = $this->get_tag();
3682          $op_sigil = $this->is_tag_closer() ? '-' : '+';
3683          $op       = "{$op_sigil}{$tag_name}";
3684  
3685          switch ( $op ) {
3686              /*
3687               * > A start tag whose tag name is "tr"
3688               */
3689              case '+TR':
3690                  $this->state->stack_of_open_elements->clear_to_table_body_context();
3691                  $this->insert_html_element( $this->state->current_token );
3692                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
3693                  return true;
3694  
3695              /*
3696               * > A start tag whose tag name is one of: "th", "td"
3697               */
3698              case '+TH':
3699              case '+TD':
3700                  // @todo Indicate a parse error once it's possible.
3701                  $this->state->stack_of_open_elements->clear_to_table_body_context();
3702                  $this->insert_virtual_node( 'TR' );
3703                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
3704                  return $this->step( self::REPROCESS_CURRENT_NODE );
3705  
3706              /*
3707               * > An end tag whose tag name is one of: "tbody", "tfoot", "thead"
3708               */
3709              case '-TBODY':
3710              case '-TFOOT':
3711              case '-THEAD':
3712                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $tag_name ) ) {
3713                      // Parse error: ignore the token.
3714                      return $this->step();
3715                  }
3716  
3717                  $this->state->stack_of_open_elements->clear_to_table_body_context();
3718                  $this->state->stack_of_open_elements->pop();
3719                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
3720                  return true;
3721  
3722              /*
3723               * > A start tag whose tag name is one of: "caption", "col", "colgroup", "tbody", "tfoot", "thead"
3724               * > An end tag whose tag name is "table"
3725               */
3726              case '+CAPTION':
3727              case '+COL':
3728              case '+COLGROUP':
3729              case '+TBODY':
3730              case '+TFOOT':
3731              case '+THEAD':
3732              case '-TABLE':
3733                  if (
3734                      ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TBODY' ) &&
3735                      ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'THEAD' ) &&
3736                      ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TFOOT' )
3737                  ) {
3738                      // Parse error: ignore the token.
3739                      return $this->step();
3740                  }
3741                  $this->state->stack_of_open_elements->clear_to_table_body_context();
3742                  $this->state->stack_of_open_elements->pop();
3743                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
3744                  return $this->step( self::REPROCESS_CURRENT_NODE );
3745  
3746              /*
3747               * > An end tag whose tag name is one of: "body", "caption", "col", "colgroup", "html", "td", "th", "tr"
3748               */
3749              case '-BODY':
3750              case '-CAPTION':
3751              case '-COL':
3752              case '-COLGROUP':
3753              case '-HTML':
3754              case '-TD':
3755              case '-TH':
3756              case '-TR':
3757                  // Parse error: ignore the token.
3758                  return $this->step();
3759          }
3760  
3761          /*
3762           * > Anything else
3763           * > Process the token using the rules for the "in table" insertion mode.
3764           */
3765          return $this->step_in_table();
3766      }
3767  
3768      /**
3769       * Parses next element in the 'in row' insertion mode.
3770       *
3771       * This internal function performs the 'in row' insertion mode
3772       * logic for the generalized WP_HTML_Processor::step() function.
3773       *
3774       * @since 6.7.0
3775       *
3776       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3777       *
3778       * @see https://html.spec.whatwg.org/#parsing-main-intr
3779       * @see WP_HTML_Processor::step
3780       *
3781       * @return bool Whether an element was found.
3782       */
3783  	private function step_in_row(): bool {
3784          $tag_name = $this->get_tag();
3785          $op_sigil = $this->is_tag_closer() ? '-' : '+';
3786          $op       = "{$op_sigil}{$tag_name}";
3787  
3788          switch ( $op ) {
3789              /*
3790               * > A start tag whose tag name is one of: "th", "td"
3791               */
3792              case '+TH':
3793              case '+TD':
3794                  $this->state->stack_of_open_elements->clear_to_table_row_context();
3795                  $this->insert_html_element( $this->state->current_token );
3796                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_CELL;
3797                  $this->state->active_formatting_elements->insert_marker();
3798                  return true;
3799  
3800              /*
3801               * > An end tag whose tag name is "tr"
3802               */
3803              case '-TR':
3804                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TR' ) ) {
3805                      // Parse error: ignore the token.
3806                      return $this->step();
3807                  }
3808  
3809                  $this->state->stack_of_open_elements->clear_to_table_row_context();
3810                  $this->state->stack_of_open_elements->pop();
3811                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
3812                  return true;
3813  
3814              /*
3815               * > A start tag whose tag name is one of: "caption", "col", "colgroup", "tbody", "tfoot", "thead", "tr"
3816               * > An end tag whose tag name is "table"
3817               */
3818              case '+CAPTION':
3819              case '+COL':
3820              case '+COLGROUP':
3821              case '+TBODY':
3822              case '+TFOOT':
3823              case '+THEAD':
3824              case '+TR':
3825              case '-TABLE':
3826                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TR' ) ) {
3827                      // Parse error: ignore the token.
3828                      return $this->step();
3829                  }
3830  
3831                  $this->state->stack_of_open_elements->clear_to_table_row_context();
3832                  $this->state->stack_of_open_elements->pop();
3833                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
3834                  return $this->step( self::REPROCESS_CURRENT_NODE );
3835  
3836              /*
3837               * > An end tag whose tag name is one of: "tbody", "tfoot", "thead"
3838               */
3839              case '-TBODY':
3840              case '-TFOOT':
3841              case '-THEAD':
3842                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $tag_name ) ) {
3843                      // Parse error: ignore the token.
3844                      return $this->step();
3845                  }
3846  
3847                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TR' ) ) {
3848                      // Ignore the token.
3849                      return $this->step();
3850                  }
3851  
3852                  $this->state->stack_of_open_elements->clear_to_table_row_context();
3853                  $this->state->stack_of_open_elements->pop();
3854                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
3855                  return $this->step( self::REPROCESS_CURRENT_NODE );
3856  
3857              /*
3858               * > An end tag whose tag name is one of: "body", "caption", "col", "colgroup", "html", "td", "th"
3859               */
3860              case '-BODY':
3861              case '-CAPTION':
3862              case '-COL':
3863              case '-COLGROUP':
3864              case '-HTML':
3865              case '-TD':
3866              case '-TH':
3867                  // Parse error: ignore the token.
3868                  return $this->step();
3869          }
3870  
3871          /*
3872           * > Anything else
3873           * >   Process the token using the rules for the "in table" insertion mode.
3874           */
3875          return $this->step_in_table();
3876      }
3877  
3878      /**
3879       * Parses next element in the 'in cell' insertion mode.
3880       *
3881       * This internal function performs the 'in cell' insertion mode
3882       * logic for the generalized WP_HTML_Processor::step() function.
3883       *
3884       * @since 6.7.0
3885       *
3886       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3887       *
3888       * @see https://html.spec.whatwg.org/#parsing-main-intd
3889       * @see WP_HTML_Processor::step
3890       *
3891       * @return bool Whether an element was found.
3892       */
3893  	private function step_in_cell(): bool {
3894          $tag_name = $this->get_tag();
3895          $op_sigil = $this->is_tag_closer() ? '-' : '+';
3896          $op       = "{$op_sigil}{$tag_name}";
3897  
3898          switch ( $op ) {
3899              /*
3900               * > An end tag whose tag name is one of: "td", "th"
3901               */
3902              case '-TD':
3903              case '-TH':
3904                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $tag_name ) ) {
3905                      // Parse error: ignore the token.
3906                      return $this->step();
3907                  }
3908  
3909                  $this->generate_implied_end_tags();
3910  
3911                  /*
3912                   * @todo This needs to check if the current node is an HTML element, meaning that
3913                   *       when SVG and MathML support is added, this needs to differentiate between an
3914                   *       HTML element of the given name, such as `<center>`, and a foreign element of
3915                   *       the same given name.
3916                   */
3917                  if ( ! $this->state->stack_of_open_elements->current_node_is( $tag_name ) ) {
3918                      // @todo Indicate a parse error once it's possible.
3919                  }
3920  
3921                  $this->state->stack_of_open_elements->pop_until( $tag_name );
3922                  $this->state->active_formatting_elements->clear_up_to_last_marker();
3923                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
3924                  return true;
3925  
3926              /*
3927               * > A start tag whose tag name is one of: "caption", "col", "colgroup", "tbody", "td",
3928               * > "tfoot", "th", "thead", "tr"
3929               */
3930              case '+CAPTION':
3931              case '+COL':
3932              case '+COLGROUP':
3933              case '+TBODY':
3934              case '+TD':
3935              case '+TFOOT':
3936              case '+TH':
3937              case '+THEAD':
3938              case '+TR':
3939                  /*
3940                   * > Assert: The stack of open elements has a td or th element in table scope.
3941                   *
3942                   * Nothing to do here, except to verify in tests that this never appears.
3943                   */
3944  
3945                  $this->close_cell();
3946                  return $this->step( self::REPROCESS_CURRENT_NODE );
3947  
3948              /*
3949               * > An end tag whose tag name is one of: "body", "caption", "col", "colgroup", "html"
3950               */
3951              case '-BODY':
3952              case '-CAPTION':
3953              case '-COL':
3954              case '-COLGROUP':
3955              case '-HTML':
3956                  // Parse error: ignore the token.
3957                  return $this->step();
3958  
3959              /*
3960               * > An end tag whose tag name is one of: "table", "tbody", "tfoot", "thead", "tr"
3961               */
3962              case '-TABLE':
3963              case '-TBODY':
3964              case '-TFOOT':
3965              case '-THEAD':
3966              case '-TR':
3967                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $tag_name ) ) {
3968                      // Parse error: ignore the token.
3969                      return $this->step();
3970                  }
3971                  $this->close_cell();
3972                  return $this->step( self::REPROCESS_CURRENT_NODE );
3973          }
3974  
3975          /*
3976           * > Anything else
3977           * >   Process the token using the rules for the "in body" insertion mode.
3978           */
3979          return $this->step_in_body();
3980      }
3981  
3982      /**
3983       * Parses next element in the 'in select' insertion mode.
3984       *
3985       * This internal function performs the 'in select' insertion mode
3986       * logic for the generalized WP_HTML_Processor::step() function.
3987       *
3988       * @since 6.7.0
3989       *
3990       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3991       *
3992       * @see https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inselect
3993       * @see WP_HTML_Processor::step
3994       *
3995       * @return bool Whether an element was found.
3996       */
3997  	private function step_in_select(): bool {
3998          $token_name = $this->get_token_name();
3999          $token_type = $this->get_token_type();
4000          $op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
4001          $op         = "{$op_sigil}{$token_name}";
4002  
4003          switch ( $op ) {
4004              /*
4005               * > Any other character token
4006               */
4007              case '#text':
4008                  /*
4009                   * > A character token that is U+0000 NULL
4010                   *
4011                   * If a text node only comprises null bytes then it should be
4012                   * entirely ignored and should not return to calling code.
4013                   */
4014                  if ( parent::TEXT_IS_NULL_SEQUENCE === $this->text_node_classification ) {
4015                      // Parse error: ignore the token.
4016                      return $this->step();
4017                  }
4018  
4019                  $this->insert_html_element( $this->state->current_token );
4020                  return true;
4021  
4022              /*
4023               * > A comment token
4024               */
4025              case '#comment':
4026              case '#funky-comment':
4027              case '#presumptuous-tag':
4028                  $this->insert_html_element( $this->state->current_token );
4029                  return true;
4030  
4031              /*
4032               * > A DOCTYPE token
4033               */
4034              case 'html':
4035                  // Parse error: ignore the token.
4036                  return $this->step();
4037  
4038              /*
4039               * > A start tag whose tag name is "html"
4040               */
4041              case '+HTML':
4042                  return $this->step_in_body();
4043  
4044              /*
4045               * > A start tag whose tag name is "option"
4046               */
4047              case '+OPTION':
4048                  if ( $this->state->stack_of_open_elements->current_node_is( 'OPTION' ) ) {
4049                      $this->state->stack_of_open_elements->pop();
4050                  }
4051                  $this->insert_html_element( $this->state->current_token );
4052                  return true;
4053  
4054              /*
4055               * > A start tag whose tag name is "optgroup"
4056               * > A start tag whose tag name is "hr"
4057               *
4058               * These rules are identical except for the treatment of the self-closing flag and
4059               * the subsequent pop of the HR void element, all of which is handled elsewhere in the processor.
4060               */
4061              case '+OPTGROUP':
4062              case '+HR':
4063                  if ( $this->state->stack_of_open_elements->current_node_is( 'OPTION' ) ) {
4064                      $this->state->stack_of_open_elements->pop();
4065                  }
4066  
4067                  if ( $this->state->stack_of_open_elements->current_node_is( 'OPTGROUP' ) ) {
4068                      $this->state->stack_of_open_elements->pop();
4069                  }
4070  
4071                  $this->insert_html_element( $this->state->current_token );
4072                  return true;
4073  
4074              /*
4075               * > An end tag whose tag name is "optgroup"
4076               */
4077              case '-OPTGROUP':
4078                  $current_node = $this->state->stack_of_open_elements->current_node();
4079                  if ( $current_node && 'OPTION' === $current_node->node_name ) {
4080                      foreach ( $this->state->stack_of_open_elements->walk_up( $current_node ) as $parent ) {
4081                          break;
4082                      }
4083                      if ( $parent && 'OPTGROUP' === $parent->node_name ) {
4084                          $this->state->stack_of_open_elements->pop();
4085                      }
4086                  }
4087  
4088                  if ( $this->state->stack_of_open_elements->current_node_is( 'OPTGROUP' ) ) {
4089                      $this->state->stack_of_open_elements->pop();
4090                      return true;
4091                  }
4092  
4093                  // Parse error: ignore the token.
4094                  return $this->step();
4095  
4096              /*
4097               * > An end tag whose tag name is "option"
4098               */
4099              case '-OPTION':
4100                  if ( $this->state->stack_of_open_elements->current_node_is( 'OPTION' ) ) {
4101                      $this->state->stack_of_open_elements->pop();
4102                      return true;
4103                  }
4104  
4105                  // Parse error: ignore the token.
4106                  return $this->step();
4107  
4108              /*
4109               * > An end tag whose tag name is "select"
4110               * > A start tag whose tag name is "select"
4111               *
4112               * > It just gets treated like an end tag.
4113               */
4114              case '-SELECT':
4115              case '+SELECT':
4116                  if ( ! $this->state->stack_of_open_elements->has_element_in_select_scope( 'SELECT' ) ) {
4117                      // Parse error: ignore the token.
4118                      return $this->step();
4119                  }
4120                  $this->state->stack_of_open_elements->pop_until( 'SELECT' );
4121                  $this->reset_insertion_mode_appropriately();
4122                  return true;
4123  
4124              /*
4125               * > A start tag whose tag name is one of: "input", "keygen", "textarea"
4126               *
4127               * All three of these tags are considered a parse error when found in this insertion mode.
4128               */
4129              case '+INPUT':
4130              case '+KEYGEN':
4131              case '+TEXTAREA':
4132                  if ( ! $this->state->stack_of_open_elements->has_element_in_select_scope( 'SELECT' ) ) {
4133                      // Ignore the token.
4134                      return $this->step();
4135                  }
4136                  $this->state->stack_of_open_elements->pop_until( 'SELECT' );
4137                  $this->reset_insertion_mode_appropriately();
4138                  return $this->step( self::REPROCESS_CURRENT_NODE );
4139  
4140              /*
4141               * > A start tag whose tag name is one of: "script", "template"
4142               * > An end tag whose tag name is "template"
4143               */
4144              case '+SCRIPT':
4145              case '+TEMPLATE':
4146              case '-TEMPLATE':
4147                  return $this->step_in_head();
4148          }
4149  
4150          /*
4151           * > Anything else
4152           * >   Parse error: ignore the token.
4153           */
4154          return $this->step();
4155      }
4156  
4157      /**
4158       * Parses next element in the 'in select in table' insertion mode.
4159       *
4160       * This internal function performs the 'in select in table' insertion mode
4161       * logic for the generalized WP_HTML_Processor::step() function.
4162       *
4163       * @since 6.7.0
4164       *
4165       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4166       *
4167       * @see https://html.spec.whatwg.org/#parsing-main-inselectintable
4168       * @see WP_HTML_Processor::step
4169       *
4170       * @return bool Whether an element was found.
4171       */
4172  	private function step_in_select_in_table(): bool {
4173          $token_name = $this->get_token_name();
4174          $token_type = $this->get_token_type();
4175          $op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
4176          $op         = "{$op_sigil}{$token_name}";
4177  
4178          switch ( $op ) {
4179              /*
4180               * > A start tag whose tag name is one of: "caption", "table", "tbody", "tfoot", "thead", "tr", "td", "th"
4181               */
4182              case '+CAPTION':
4183              case '+TABLE':
4184              case '+TBODY':
4185              case '+TFOOT':
4186              case '+THEAD':
4187              case '+TR':
4188              case '+TD':
4189              case '+TH':
4190                  // @todo Indicate a parse error once it's possible.
4191                  $this->state->stack_of_open_elements->pop_until( 'SELECT' );
4192                  $this->reset_insertion_mode_appropriately();
4193                  return $this->step( self::REPROCESS_CURRENT_NODE );
4194  
4195              /*
4196               * > An end tag whose tag name is one of: "caption", "table", "tbody", "tfoot", "thead", "tr", "td", "th"
4197               */
4198              case '-CAPTION':
4199              case '-TABLE':
4200              case '-TBODY':
4201              case '-TFOOT':
4202              case '-THEAD':
4203              case '-TR':
4204              case '-TD':
4205              case '-TH':
4206                  // @todo Indicate a parse error once it's possible.
4207                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $token_name ) ) {
4208                      return $this->step();
4209                  }
4210                  $this->state->stack_of_open_elements->pop_until( 'SELECT' );
4211                  $this->reset_insertion_mode_appropriately();
4212                  return $this->step( self::REPROCESS_CURRENT_NODE );
4213          }
4214  
4215          /*
4216           * > Anything else
4217           */
4218          return $this->step_in_select();
4219      }
4220  
4221      /**
4222       * Parses next element in the 'in template' insertion mode.
4223       *
4224       * This internal function performs the 'in template' insertion mode
4225       * logic for the generalized WP_HTML_Processor::step() function.
4226       *
4227       * @since 6.7.0 Stub implementation.
4228       *
4229       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4230       *
4231       * @see https://html.spec.whatwg.org/#parsing-main-intemplate
4232       * @see WP_HTML_Processor::step
4233       *
4234       * @return bool Whether an element was found.
4235       */
4236  	private function step_in_template(): bool {
4237          $token_name = $this->get_token_name();
4238          $token_type = $this->get_token_type();
4239          $is_closer  = $this->is_tag_closer();
4240          $op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
4241          $op         = "{$op_sigil}{$token_name}";
4242  
4243          switch ( $op ) {
4244              /*
4245               * > A character token
4246               * > A comment token
4247               * > A DOCTYPE token
4248               */
4249              case '#text':
4250              case '#comment':
4251              case '#funky-comment':
4252              case '#presumptuous-tag':
4253              case 'html':
4254                  return $this->step_in_body();
4255  
4256              /*
4257               * > A start tag whose tag name is one of: "base", "basefont", "bgsound", "link",
4258               * > "meta", "noframes", "script", "style", "template", "title"
4259               * > An end tag whose tag name is "template"
4260               */
4261              case '+BASE':
4262              case '+BASEFONT':
4263              case '+BGSOUND':
4264              case '+LINK':
4265              case '+META':
4266              case '+NOFRAMES':
4267              case '+SCRIPT':
4268              case '+STYLE':
4269              case '+TEMPLATE':
4270              case '+TITLE':
4271              case '-TEMPLATE':
4272                  return $this->step_in_head();
4273  
4274              /*
4275               * > A start tag whose tag name is one of: "caption", "colgroup", "tbody", "tfoot", "thead"
4276               */
4277              case '+CAPTION':
4278              case '+COLGROUP':
4279              case '+TBODY':
4280              case '+TFOOT':
4281              case '+THEAD':
4282                  array_pop( $this->state->stack_of_template_insertion_modes );
4283                  $this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
4284                  $this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
4285                  return $this->step( self::REPROCESS_CURRENT_NODE );
4286  
4287              /*
4288               * > A start tag whose tag name is "col"
4289               */
4290              case '+COL':
4291                  array_pop( $this->state->stack_of_template_insertion_modes );
4292                  $this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP;
4293                  $this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP;
4294                  return $this->step( self::REPROCESS_CURRENT_NODE );
4295  
4296              /*
4297               * > A start tag whose tag name is "tr"
4298               */
4299              case '+TR':
4300                  array_pop( $this->state->stack_of_template_insertion_modes );
4301                  $this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
4302                  $this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
4303                  return $this->step( self::REPROCESS_CURRENT_NODE );
4304  
4305              /*
4306               * > A start tag whose tag name is one of: "td", "th"
4307               */
4308              case '+TD':
4309              case '+TH':
4310                  array_pop( $this->state->stack_of_template_insertion_modes );
4311                  $this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
4312                  $this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
4313                  return $this->step( self::REPROCESS_CURRENT_NODE );
4314          }
4315  
4316          /*
4317           * > Any other start tag
4318           */
4319          if ( ! $is_closer ) {
4320              array_pop( $this->state->stack_of_template_insertion_modes );
4321              $this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
4322              $this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
4323              return $this->step( self::REPROCESS_CURRENT_NODE );
4324          }
4325  
4326          /*
4327           * > Any other end tag
4328           */
4329          if ( $is_closer ) {
4330              // Parse error: ignore the token.
4331              return $this->step();
4332          }
4333  
4334          /*
4335           * > An end-of-file token
4336           */
4337          if ( ! $this->state->stack_of_open_elements->contains( 'TEMPLATE' ) ) {
4338              // Stop parsing.
4339              return false;
4340          }
4341  
4342          // @todo Indicate a parse error once it's possible.
4343          $this->state->stack_of_open_elements->pop_until( 'TEMPLATE' );
4344          $this->state->active_formatting_elements->clear_up_to_last_marker();
4345          array_pop( $this->state->stack_of_template_insertion_modes );
4346          $this->reset_insertion_mode_appropriately();
4347          return $this->step( self::REPROCESS_CURRENT_NODE );
4348      }
4349  
4350      /**
4351       * Parses next element in the 'after body' insertion mode.
4352       *
4353       * This internal function performs the 'after body' insertion mode
4354       * logic for the generalized WP_HTML_Processor::step() function.
4355       *
4356       * @since 6.7.0 Stub implementation.
4357       *
4358       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4359       *
4360       * @see https://html.spec.whatwg.org/#parsing-main-afterbody
4361       * @see WP_HTML_Processor::step
4362       *
4363       * @return bool Whether an element was found.
4364       */
4365  	private function step_after_body(): bool {
4366          $tag_name   = $this->get_token_name();
4367          $token_type = $this->get_token_type();
4368          $op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4369          $op         = "{$op_sigil}{$tag_name}";
4370  
4371          switch ( $op ) {
4372              /*
4373               * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
4374               * >   U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
4375               *
4376               * > Process the token using the rules for the "in body" insertion mode.
4377               */
4378              case '#text':
4379                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
4380                      return $this->step_in_body();
4381                  }
4382                  goto after_body_anything_else;
4383                  break;
4384  
4385              /*
4386               * > A comment token
4387               */
4388              case '#comment':
4389              case '#funky-comment':
4390              case '#presumptuous-tag':
4391                  $this->bail( 'Content outside of BODY is unsupported.' );
4392                  break;
4393  
4394              /*
4395               * > A DOCTYPE token
4396               */
4397              case 'html':
4398                  // Parse error: ignore the token.
4399                  return $this->step();
4400  
4401              /*
4402               * > A start tag whose tag name is "html"
4403               */
4404              case '+HTML':
4405                  return $this->step_in_body();
4406  
4407              /*
4408               * > An end tag whose tag name is "html"
4409               *
4410               * > If the parser was created as part of the HTML fragment parsing algorithm,
4411               * > this is a parse error; ignore the token. (fragment case)
4412               * >
4413               * > Otherwise, switch the insertion mode to "after after body".
4414               */
4415              case '-HTML':
4416                  if ( isset( $this->context_node ) ) {
4417                      return $this->step();
4418                  }
4419  
4420                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_BODY;
4421                  /*
4422                   * The HTML element is not removed from the stack of open elements.
4423                   * Only internal state has changed, this does not qualify as a "step"
4424                   * in terms of advancing through the document to another token.
4425                   * Nothing has been pushed or popped.
4426                   * Proceed to parse the next item.
4427                   */
4428                  return $this->step();
4429          }
4430  
4431          /*
4432           * > Parse error. Switch the insertion mode to "in body" and reprocess the token.
4433           */
4434          after_body_anything_else:
4435          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
4436          return $this->step( self::REPROCESS_CURRENT_NODE );
4437      }
4438  
4439      /**
4440       * Parses next element in the 'in frameset' insertion mode.
4441       *
4442       * This internal function performs the 'in frameset' insertion mode
4443       * logic for the generalized WP_HTML_Processor::step() function.
4444       *
4445       * @since 6.7.0 Stub implementation.
4446       *
4447       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4448       *
4449       * @see https://html.spec.whatwg.org/#parsing-main-inframeset
4450       * @see WP_HTML_Processor::step
4451       *
4452       * @return bool Whether an element was found.
4453       */
4454  	private function step_in_frameset(): bool {
4455          $tag_name   = $this->get_token_name();
4456          $token_type = $this->get_token_type();
4457          $op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4458          $op         = "{$op_sigil}{$tag_name}";
4459  
4460          switch ( $op ) {
4461              /*
4462               * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
4463               * >   U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
4464               * >
4465               * > Insert the character.
4466               *
4467               * This algorithm effectively strips non-whitespace characters from text and inserts
4468               * them under HTML. This is not supported at this time.
4469               */
4470              case '#text':
4471                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
4472                      return $this->step_in_body();
4473                  }
4474                  $this->bail( 'Non-whitespace characters cannot be handled in frameset.' );
4475                  break;
4476  
4477              /*
4478               * > A comment token
4479               */
4480              case '#comment':
4481              case '#funky-comment':
4482              case '#presumptuous-tag':
4483                  $this->insert_html_element( $this->state->current_token );
4484                  return true;
4485  
4486              /*
4487               * > A DOCTYPE token
4488               */
4489              case 'html':
4490                  // Parse error: ignore the token.
4491                  return $this->step();
4492  
4493              /*
4494               * > A start tag whose tag name is "html"
4495               */
4496              case '+HTML':
4497                  return $this->step_in_body();
4498  
4499              /*
4500               * > A start tag whose tag name is "frameset"
4501               */
4502              case '+FRAMESET':
4503                  $this->insert_html_element( $this->state->current_token );
4504                  return true;
4505  
4506              /*
4507               * > An end tag whose tag name is "frameset"
4508               */
4509              case '-FRAMESET':
4510                  /*
4511                   * > If the current node is the root html element, then this is a parse error;
4512                   * > ignore the token. (fragment case)
4513                   */
4514                  if ( $this->state->stack_of_open_elements->current_node_is( 'HTML' ) ) {
4515                      return $this->step();
4516                  }
4517  
4518                  /*
4519                   * > Otherwise, pop the current node from the stack of open elements.
4520                   */
4521                  $this->state->stack_of_open_elements->pop();
4522  
4523                  /*
4524                   * > If the parser was not created as part of the HTML fragment parsing algorithm
4525                   * > (fragment case), and the current node is no longer a frameset element, then
4526                   * > switch the insertion mode to "after frameset".
4527                   */
4528                  if ( ! isset( $this->context_node ) && ! $this->state->stack_of_open_elements->current_node_is( 'FRAMESET' ) ) {
4529                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_FRAMESET;
4530                  }
4531  
4532                  return true;
4533  
4534              /*
4535               * > A start tag whose tag name is "frame"
4536               *
4537               * > Insert an HTML element for the token. Immediately pop the
4538               * > current node off the stack of open elements.
4539               * >
4540               * > Acknowledge the token's self-closing flag, if it is set.
4541               */
4542              case '+FRAME':
4543                  $this->insert_html_element( $this->state->current_token );
4544                  $this->state->stack_of_open_elements->pop();
4545                  return true;
4546  
4547              /*
4548               * > A start tag whose tag name is "noframes"
4549               */
4550              case '+NOFRAMES':
4551                  return $this->step_in_head();
4552          }
4553  
4554          // Parse error: ignore the token.
4555          return $this->step();
4556      }
4557  
4558      /**
4559       * Parses next element in the 'after frameset' insertion mode.
4560       *
4561       * This internal function performs the 'after frameset' insertion mode
4562       * logic for the generalized WP_HTML_Processor::step() function.
4563       *
4564       * @since 6.7.0 Stub implementation.
4565       *
4566       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4567       *
4568       * @see https://html.spec.whatwg.org/#parsing-main-afterframeset
4569       * @see WP_HTML_Processor::step
4570       *
4571       * @return bool Whether an element was found.
4572       */
4573  	private function step_after_frameset(): bool {
4574          $tag_name   = $this->get_token_name();
4575          $token_type = $this->get_token_type();
4576          $op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4577          $op         = "{$op_sigil}{$tag_name}";
4578  
4579          switch ( $op ) {
4580              /*
4581               * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
4582               * >   U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
4583               * >
4584               * > Insert the character.
4585               *
4586               * This algorithm effectively strips non-whitespace characters from text and inserts
4587               * them under HTML. This is not supported at this time.
4588               */
4589              case '#text':
4590                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
4591                      return $this->step_in_body();
4592                  }
4593                  $this->bail( 'Non-whitespace characters cannot be handled in after frameset' );
4594                  break;
4595  
4596              /*
4597               * > A comment token
4598               */
4599              case '#comment':
4600              case '#funky-comment':
4601              case '#presumptuous-tag':
4602                  $this->insert_html_element( $this->state->current_token );
4603                  return true;
4604  
4605              /*
4606               * > A DOCTYPE token
4607               */
4608              case 'html':
4609                  // Parse error: ignore the token.
4610                  return $this->step();
4611  
4612              /*
4613               * > A start tag whose tag name is "html"
4614               */
4615              case '+HTML':
4616                  return $this->step_in_body();
4617  
4618              /*
4619               * > An end tag whose tag name is "html"
4620               */
4621              case '-HTML':
4622                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_FRAMESET;
4623                  /*
4624                   * The HTML element is not removed from the stack of open elements.
4625                   * Only internal state has changed, this does not qualify as a "step"
4626                   * in terms of advancing through the document to another token.
4627                   * Nothing has been pushed or popped.
4628                   * Proceed to parse the next item.
4629                   */
4630                  return $this->step();
4631  
4632              /*
4633               * > A start tag whose tag name is "noframes"
4634               */
4635              case '+NOFRAMES':
4636                  return $this->step_in_head();
4637          }
4638  
4639          // Parse error: ignore the token.
4640          return $this->step();
4641      }
4642  
4643      /**
4644       * Parses next element in the 'after after body' insertion mode.
4645       *
4646       * This internal function performs the 'after after body' insertion mode
4647       * logic for the generalized WP_HTML_Processor::step() function.
4648       *
4649       * @since 6.7.0 Stub implementation.
4650       *
4651       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4652       *
4653       * @see https://html.spec.whatwg.org/#the-after-after-body-insertion-mode
4654       * @see WP_HTML_Processor::step
4655       *
4656       * @return bool Whether an element was found.
4657       */
4658  	private function step_after_after_body(): bool {
4659          $tag_name   = $this->get_token_name();
4660          $token_type = $this->get_token_type();
4661          $op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4662          $op         = "{$op_sigil}{$tag_name}";
4663  
4664          switch ( $op ) {
4665              /*
4666               * > A comment token
4667               */
4668              case '#comment':
4669              case '#funky-comment':
4670              case '#presumptuous-tag':
4671                  $this->bail( 'Content outside of HTML is unsupported.' );
4672                  break;
4673  
4674              /*
4675               * > A DOCTYPE token
4676               * > A start tag whose tag name is "html"
4677               *
4678               * > Process the token using the rules for the "in body" insertion mode.
4679               */
4680              case 'html':
4681              case '+HTML':
4682                  return $this->step_in_body();
4683  
4684              /*
4685               * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
4686               * >   U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
4687               * >
4688               * > Process the token using the rules for the "in body" insertion mode.
4689               */
4690              case '#text':
4691                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
4692                      return $this->step_in_body();
4693                  }
4694                  goto after_after_body_anything_else;
4695                  break;
4696          }
4697  
4698          /*
4699           * > Parse error. Switch the insertion mode to "in body" and reprocess the token.
4700           */
4701          after_after_body_anything_else:
4702          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
4703          return $this->step( self::REPROCESS_CURRENT_NODE );
4704      }
4705  
4706      /**
4707       * Parses next element in the 'after after frameset' insertion mode.
4708       *
4709       * This internal function performs the 'after after frameset' insertion mode
4710       * logic for the generalized WP_HTML_Processor::step() function.
4711       *
4712       * @since 6.7.0 Stub implementation.
4713       *
4714       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4715       *
4716       * @see https://html.spec.whatwg.org/#the-after-after-frameset-insertion-mode
4717       * @see WP_HTML_Processor::step
4718       *
4719       * @return bool Whether an element was found.
4720       */
4721  	private function step_after_after_frameset(): bool {
4722          $tag_name   = $this->get_token_name();
4723          $token_type = $this->get_token_type();
4724          $op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4725          $op         = "{$op_sigil}{$tag_name}";
4726  
4727          switch ( $op ) {
4728              /*
4729               * > A comment token
4730               */
4731              case '#comment':
4732              case '#funky-comment':
4733              case '#presumptuous-tag':
4734                  $this->bail( 'Content outside of HTML is unsupported.' );
4735                  break;
4736  
4737              /*
4738               * > A DOCTYPE token
4739               * > A start tag whose tag name is "html"
4740               *
4741               * > Process the token using the rules for the "in body" insertion mode.
4742               */
4743              case 'html':
4744              case '+HTML':
4745                  return $this->step_in_body();
4746  
4747              /*
4748               * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
4749               * >   U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
4750               * >
4751               * > Process the token using the rules for the "in body" insertion mode.
4752               *
4753               * This algorithm effectively strips non-whitespace characters from text and inserts
4754               * them under HTML. This is not supported at this time.
4755               */
4756              case '#text':
4757                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
4758                      return $this->step_in_body();
4759                  }
4760                  $this->bail( 'Non-whitespace characters cannot be handled in after after frameset.' );
4761                  break;
4762  
4763              /*
4764               * > A start tag whose tag name is "noframes"
4765               */
4766              case '+NOFRAMES':
4767                  return $this->step_in_head();
4768          }
4769  
4770          // Parse error: ignore the token.
4771          return $this->step();
4772      }
4773  
4774      /**
4775       * Parses next element in the 'in foreign content' insertion mode.
4776       *
4777       * This internal function performs the 'in foreign content' insertion mode
4778       * logic for the generalized WP_HTML_Processor::step() function.
4779       *
4780       * @since 6.7.0 Stub implementation.
4781       *
4782       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4783       *
4784       * @see https://html.spec.whatwg.org/#parsing-main-inforeign
4785       * @see WP_HTML_Processor::step
4786       *
4787       * @return bool Whether an element was found.
4788       */
4789  	private function step_in_foreign_content(): bool {
4790          $tag_name   = $this->get_token_name();
4791          $token_type = $this->get_token_type();
4792          $op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4793          $op         = "{$op_sigil}{$tag_name}";
4794  
4795          /*
4796           * > A start tag whose name is "font", if the token has any attributes named "color", "face", or "size"
4797           *
4798           * This section drawn out above the switch to more easily incorporate
4799           * the additional rules based on the presence of the attributes.
4800           */
4801          if (
4802              '+FONT' === $op &&
4803              (
4804                  null !== $this->get_attribute( 'color' ) ||
4805                  null !== $this->get_attribute( 'face' ) ||
4806                  null !== $this->get_attribute( 'size' )
4807              )
4808          ) {
4809              $op = '+FONT with attributes';
4810          }
4811  
4812          switch ( $op ) {
4813              case '#text':
4814                  /*
4815                   * > A character token that is U+0000 NULL
4816                   *
4817                   * This is handled by `get_modifiable_text()`.
4818                   */
4819  
4820                  /*
4821                   * Whitespace-only text does not affect the frameset-ok flag.
4822                   * It is probably inter-element whitespace, but it may also
4823                   * contain character references which decode only to whitespace.
4824                   */
4825                  if ( parent::TEXT_IS_GENERIC === $this->text_node_classification ) {
4826                      $this->state->frameset_ok = false;
4827                  }
4828  
4829                  $this->insert_foreign_element( $this->state->current_token, false );
4830                  return true;
4831  
4832              /*
4833               * CDATA sections are alternate wrappers for text content and therefore
4834               * ought to follow the same rules as text nodes.
4835               */
4836              case '#cdata-section':
4837                  /*
4838                   * NULL bytes and whitespace do not change the frameset-ok flag.
4839                   */
4840                  $current_token        = $this->bookmarks[ $this->state->current_token->bookmark_name ];
4841                  $cdata_content_start  = $current_token->start + 9;
4842                  $cdata_content_length = $current_token->length - 12;
4843                  if ( strspn( $this->html, "\0 \t\n\f\r", $cdata_content_start, $cdata_content_length ) !== $cdata_content_length ) {
4844                      $this->state->frameset_ok = false;
4845                  }
4846  
4847                  $this->insert_foreign_element( $this->state->current_token, false );
4848                  return true;
4849  
4850              /*
4851               * > A comment token
4852               */
4853              case '#comment':
4854              case '#funky-comment':
4855              case '#presumptuous-tag':
4856                  $this->insert_foreign_element( $this->state->current_token, false );
4857                  return true;
4858  
4859              /*
4860               * > A DOCTYPE token
4861               */
4862              case 'html':
4863                  // Parse error: ignore the token.
4864                  return $this->step();
4865  
4866              /*
4867               * > A start tag whose tag name is "b", "big", "blockquote", "body", "br", "center",
4868               * > "code", "dd", "div", "dl", "dt", "em", "embed", "h1", "h2", "h3", "h4", "h5",
4869               * > "h6", "head", "hr", "i", "img", "li", "listing", "menu", "meta", "nobr", "ol",
4870               * > "p", "pre", "ruby", "s", "small", "span", "strong", "strike", "sub", "sup",
4871               * > "table", "tt", "u", "ul", "var"
4872               *
4873               * > A start tag whose name is "font", if the token has any attributes named "color", "face", or "size"
4874               *
4875               * > An end tag whose tag name is "br", "p"
4876               *
4877               * Closing BR tags are always reported by the Tag Processor as opening tags.
4878               */
4879              case '+B':
4880              case '+BIG':
4881              case '+BLOCKQUOTE':
4882              case '+BODY':
4883              case '+BR':
4884              case '+CENTER':
4885              case '+CODE':
4886              case '+DD':
4887              case '+DIV':
4888              case '+DL':
4889              case '+DT':
4890              case '+EM':
4891              case '+EMBED':
4892              case '+H1':
4893              case '+H2':
4894              case '+H3':
4895              case '+H4':
4896              case '+H5':
4897              case '+H6':
4898              case '+HEAD':
4899              case '+HR':
4900              case '+I':
4901              case '+IMG':
4902              case '+LI':
4903              case '+LISTING':
4904              case '+MENU':
4905              case '+META':
4906              case '+NOBR':
4907              case '+OL':
4908              case '+P':
4909              case '+PRE':
4910              case '+RUBY':
4911              case '+S':
4912              case '+SMALL':
4913              case '+SPAN':
4914              case '+STRONG':
4915              case '+STRIKE':
4916              case '+SUB':
4917              case '+SUP':
4918              case '+TABLE':
4919              case '+TT':
4920              case '+U':
4921              case '+UL':
4922              case '+VAR':
4923              case '+FONT with attributes':
4924              case '-BR':
4925              case '-P':
4926                  // @todo Indicate a parse error once it's possible.
4927                  foreach ( $this->state->stack_of_open_elements->walk_up() as $current_node ) {
4928                      if (
4929                          'math' === $current_node->integration_node_type ||
4930                          'html' === $current_node->integration_node_type ||
4931                          'html' === $current_node->namespace
4932                      ) {
4933                          break;
4934                      }
4935  
4936                      $this->state->stack_of_open_elements->pop();
4937                  }
4938                  goto in_foreign_content_process_in_current_insertion_mode;
4939          }
4940  
4941          /*
4942           * > Any other start tag
4943           */
4944          if ( ! $this->is_tag_closer() ) {
4945              $this->insert_foreign_element( $this->state->current_token, false );
4946  
4947              /*
4948               * > If the token has its self-closing flag set, then run
4949               * > the appropriate steps from the following list:
4950               * >
4951               * >   ↪ the token's tag name is "script", and the new current node is in the SVG namespace
4952               * >         Acknowledge the token's self-closing flag, and then act as
4953               * >         described in the steps for a "script" end tag below.
4954               * >
4955               * >   ↪ Otherwise
4956               * >         Pop the current node off the stack of open elements and
4957               * >         acknowledge the token's self-closing flag.
4958               *
4959               * Since the rules for SCRIPT below indicate to pop the element off of the stack of
4960               * open elements, which is the same for the Otherwise condition, there's no need to
4961               * separate these checks. The difference comes when a parser operates with the scripting
4962               * flag enabled, and executes the script, which this parser does not support.
4963               */
4964              if ( $this->state->current_token->has_self_closing_flag ) {
4965                  $this->state->stack_of_open_elements->pop();
4966              }
4967              return true;
4968          }
4969  
4970          /*
4971           * > An end tag whose name is "script", if the current node is an SVG script element.
4972           */
4973          if ( $this->is_tag_closer() && 'SCRIPT' === $this->state->current_token->node_name && 'svg' === $this->state->current_token->namespace ) {
4974              $this->state->stack_of_open_elements->pop();
4975              return true;
4976          }
4977  
4978          /*
4979           * > Any other end tag
4980           */
4981          if ( $this->is_tag_closer() ) {
4982              $node = $this->state->stack_of_open_elements->current_node();
4983              if ( $tag_name !== $node->node_name ) {
4984                  // @todo Indicate a parse error once it's possible.
4985              }
4986              in_foreign_content_end_tag_loop:
4987              if ( $node === $this->state->stack_of_open_elements->at( 1 ) ) {
4988                  return true;
4989              }
4990  
4991              /*
4992               * > If node's tag name, converted to ASCII lowercase, is the same as the tag name
4993               * > of the token, pop elements from the stack of open elements until node has
4994               * > been popped from the stack, and then return.
4995               */
4996              if ( 0 === strcasecmp( $node->node_name, $tag_name ) ) {
4997                  foreach ( $this->state->stack_of_open_elements->walk_up() as $item ) {
4998                      $this->state->stack_of_open_elements->pop();
4999                      if ( $node === $item ) {
5000                          return true;
5001                      }
5002                  }
5003              }
5004  
5005              foreach ( $this->state->stack_of_open_elements->walk_up( $node ) as $item ) {
5006                  $node = $item;
5007                  break;
5008              }
5009  
5010              if ( 'html' !== $node->namespace ) {
5011                  goto in_foreign_content_end_tag_loop;
5012              }
5013  
5014              in_foreign_content_process_in_current_insertion_mode:
5015              switch ( $this->state->insertion_mode ) {
5016                  case WP_HTML_Processor_State::INSERTION_MODE_INITIAL:
5017                      return $this->step_initial();
5018  
5019                  case WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HTML:
5020                      return $this->step_before_html();
5021  
5022                  case WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HEAD:
5023                      return $this->step_before_head();
5024  
5025                  case WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD:
5026                      return $this->step_in_head();
5027  
5028                  case WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD_NOSCRIPT:
5029                      return $this->step_in_head_noscript();
5030  
5031                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_HEAD:
5032                      return $this->step_after_head();
5033  
5034                  case WP_HTML_Processor_State::INSERTION_MODE_IN_BODY:
5035                      return $this->step_in_body();
5036  
5037                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE:
5038                      return $this->step_in_table();
5039  
5040                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_TEXT:
5041                      return $this->step_in_table_text();
5042  
5043                  case WP_HTML_Processor_State::INSERTION_MODE_IN_CAPTION:
5044                      return $this->step_in_caption();
5045  
5046                  case WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP:
5047                      return $this->step_in_column_group();
5048  
5049                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY:
5050                      return $this->step_in_table_body();
5051  
5052                  case WP_HTML_Processor_State::INSERTION_MODE_IN_ROW:
5053                      return $this->step_in_row();
5054  
5055                  case WP_HTML_Processor_State::INSERTION_MODE_IN_CELL:
5056                      return $this->step_in_cell();
5057  
5058                  case WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT:
5059                      return $this->step_in_select();
5060  
5061                  case WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT_IN_TABLE:
5062                      return $this->step_in_select_in_table();
5063  
5064                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE:
5065                      return $this->step_in_template();
5066  
5067                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_BODY:
5068                      return $this->step_after_body();
5069  
5070                  case WP_HTML_Processor_State::INSERTION_MODE_IN_FRAMESET:
5071                      return $this->step_in_frameset();
5072  
5073                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_FRAMESET:
5074                      return $this->step_after_frameset();
5075  
5076                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_BODY:
5077                      return $this->step_after_after_body();
5078  
5079                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_FRAMESET:
5080                      return $this->step_after_after_frameset();
5081  
5082                  // This should be unreachable but PHP doesn't have total type checking on switch.
5083                  default:
5084                      $this->bail( "Unaware of the requested parsing mode: '{$this->state->insertion_mode}'." );
5085              }
5086          }
5087  
5088          $this->bail( 'Should not have been able to reach end of IN FOREIGN CONTENT processing. Check HTML API code.' );
5089          // This unnecessary return prevents tools from inaccurately reporting type errors.
5090          return false;
5091      }
5092  
5093      /*
5094       * Internal helpers
5095       */
5096  
5097      /**
5098       * Creates a new bookmark for the currently-matched token and returns the generated name.
5099       *
5100       * @since 6.4.0
5101       * @since 6.5.0 Renamed from bookmark_tag() to bookmark_token().
5102       *
5103       * @throws Exception When unable to allocate requested bookmark.
5104       *
5105       * @return string|false Name of created bookmark, or false if unable to create.
5106       */
5107  	private function bookmark_token() {
5108          if ( ! parent::set_bookmark( ++$this->bookmark_counter ) ) {
5109              $this->last_error = self::ERROR_EXCEEDED_MAX_BOOKMARKS;
5110              throw new Exception( 'could not allocate bookmark' );
5111          }
5112  
5113          return "{$this->bookmark_counter}";
5114      }
5115  
5116      /*
5117       * HTML semantic overrides for Tag Processor
5118       */
5119  
5120      /**
5121       * Indicates the namespace of the current token, or "html" if there is none.
5122       *
5123       * @return string One of "html", "math", or "svg".
5124       */
5125  	public function get_namespace(): string {
5126          if ( ! isset( $this->current_element ) ) {
5127              return parent::get_namespace();
5128          }
5129  
5130          return $this->current_element->token->namespace;
5131      }
5132  
5133      /**
5134       * Returns the uppercase name of the matched tag.
5135       *
5136       * The semantic rules for HTML specify that certain tags be reprocessed
5137       * with a different tag name. Because of this, the tag name presented
5138       * by the HTML Processor may differ from the one reported by the HTML
5139       * Tag Processor, which doesn't apply these semantic rules.
5140       *
5141       * Example:
5142       *
5143       *     $processor = new WP_HTML_Tag_Processor( '<div class="test">Test</div>' );
5144       *     $processor->next_tag() === true;
5145       *     $processor->get_tag() === 'DIV';
5146       *
5147       *     $processor->next_tag() === false;
5148       *     $processor->get_tag() === null;
5149       *
5150       * @since 6.4.0
5151       *
5152       * @return string|null Name of currently matched tag in input HTML, or `null` if none found.
5153       */
5154  	public function get_tag(): ?string {
5155          if ( null !== $this->last_error ) {
5156              return null;
5157          }
5158  
5159          if ( $this->is_virtual() ) {
5160              return $this->current_element->token->node_name;
5161          }
5162  
5163          $tag_name = parent::get_tag();
5164  
5165          /*
5166           * > A start tag whose tag name is "image"
5167           * > Change the token's tag name to "img" and reprocess it. (Don't ask.)
5168           */
5169          return ( 'IMAGE' === $tag_name && 'html' === $this->get_namespace() )
5170              ? 'IMG'
5171              : $tag_name;
5172      }
5173  
5174      /**
5175       * Indicates if the currently matched tag contains the self-closing flag.
5176       *
5177       * No HTML elements ought to have the self-closing flag and for those, the self-closing
5178       * flag will be ignored. For void elements this is benign because they "self close"
5179       * automatically. For non-void HTML elements though problems will appear if someone
5180       * intends to use a self-closing element in place of that element with an empty body.
5181       * For HTML foreign elements and custom elements the self-closing flag determines if
5182       * they self-close or not.
5183       *
5184       * This function does not determine if a tag is self-closing,
5185       * but only if the self-closing flag is present in the syntax.
5186       *
5187       * @since 6.6.0 Subclassed for the HTML Processor.
5188       *
5189       * @return bool Whether the currently matched tag contains the self-closing flag.
5190       */
5191  	public function has_self_closing_flag(): bool {
5192          return $this->is_virtual() ? false : parent::has_self_closing_flag();
5193      }
5194  
5195      /**
5196       * Returns the node name represented by the token.
5197       *
5198       * This matches the DOM API value `nodeName`. Some values
5199       * are static, such as `#text` for a text node, while others
5200       * are dynamically generated from the token itself.
5201       *
5202       * Dynamic names:
5203       *  - Uppercase tag name for tag matches.
5204       *  - `html` for DOCTYPE declarations.
5205       *
5206       * Note that if the Tag Processor is not matched on a token
5207       * then this function will return `null`, either because it
5208       * hasn't yet found a token or because it reached the end
5209       * of the document without matching a token.
5210       *
5211       * @since 6.6.0 Subclassed for the HTML Processor.
5212       *
5213       * @return string|null Name of the matched token.
5214       */
5215  	public function get_token_name(): ?string {
5216          return $this->is_virtual()
5217              ? $this->current_element->token->node_name
5218              : parent::get_token_name();
5219      }
5220  
5221      /**
5222       * Indicates the kind of matched token, if any.
5223       *
5224       * This differs from `get_token_name()` in that it always
5225       * returns a static string indicating the type, whereas
5226       * `get_token_name()` may return values derived from the
5227       * token itself, such as a tag name or processing
5228       * instruction tag.
5229       *
5230       * Possible values:
5231       *  - `#tag` when matched on a tag.
5232       *  - `#text` when matched on a text node.
5233       *  - `#cdata-section` when matched on a CDATA node.
5234       *  - `#comment` when matched on a comment.
5235       *  - `#doctype` when matched on a DOCTYPE declaration.
5236       *  - `#presumptuous-tag` when matched on an empty tag closer.
5237       *  - `#funky-comment` when matched on a funky comment.
5238       *
5239       * @since 6.6.0 Subclassed for the HTML Processor.
5240       *
5241       * @return string|null What kind of token is matched, or null.
5242       */
5243  	public function get_token_type(): ?string {
5244          if ( $this->is_virtual() ) {
5245              /*
5246               * This logic comes from the Tag Processor.
5247               *
5248               * @todo It would be ideal not to repeat this here, but it's not clearly
5249               *       better to allow passing a token name to `get_token_type()`.
5250               */
5251              $node_name     = $this->current_element->token->node_name;
5252              $starting_char = $node_name[0];
5253              if ( 'A' <= $starting_char && 'Z' >= $starting_char ) {
5254                  return '#tag';
5255              }
5256  
5257              if ( 'html' === $node_name ) {
5258                  return '#doctype';
5259              }
5260  
5261              return $node_name;
5262          }
5263  
5264          return parent::get_token_type();
5265      }
5266  
5267      /**
5268       * Returns the value of a requested attribute from a matched tag opener if that attribute exists.
5269       *
5270       * Example:
5271       *
5272       *     $p = WP_HTML_Processor::create_fragment( '<div enabled class="test" data-test-id="14">Test</div>' );
5273       *     $p->next_token() === true;
5274       *     $p->get_attribute( 'data-test-id' ) === '14';
5275       *     $p->get_attribute( 'enabled' ) === true;
5276       *     $p->get_attribute( 'aria-label' ) === null;
5277       *
5278       *     $p->next_tag() === false;
5279       *     $p->get_attribute( 'class' ) === null;
5280       *
5281       * @since 6.6.0 Subclassed for HTML Processor.
5282       *
5283       * @param string $name Name of attribute whose value is requested.
5284       * @return string|true|null Value of attribute or `null` if not available. Boolean attributes return `true`.
5285       */
5286  	public function get_attribute( $name ) {
5287          return $this->is_virtual() ? null : parent::get_attribute( $name );
5288      }
5289  
5290      /**
5291       * Updates or creates a new attribute on the currently matched tag with the passed value.
5292       *
5293       * This function handles all necessary HTML encoding. Provide normal, unescaped string values.
5294       * The HTML API will encode the strings appropriately so that the browser will interpret them
5295       * as the intended value.
5296       *
5297       * Example:
5298       *
5299       *     // Renders “Eggs & Milk” in a browser, encoded as `<abbr title="Eggs &amp; Milk">`.
5300       *     $processor->set_attribute( 'title', 'Eggs & Milk' );
5301       *
5302       *     // Renders “Eggs &amp; Milk” in a browser, encoded as `<abbr title="Eggs &amp;amp; Milk">`.
5303       *     $processor->set_attribute( 'title', 'Eggs &amp; Milk' );
5304       *
5305       *     // Renders `true` as `<abbr title>`.
5306       *     $processor->set_attribute( 'title', true );
5307       *
5308       *     // Renders without the attribute for `false` as `<abbr>`.
5309       *     $processor->set_attribute( 'title', false );
5310       *
5311       * Special handling is provided for boolean attribute values:
5312       *  - When `true` is passed as the value, then only the attribute name is added to the tag.
5313       *  - When `false` is passed, the attribute gets removed if it existed before.
5314       *
5315       * @since 6.6.0 Subclassed for the HTML Processor.
5316       * @since 6.9.0 Escapes all character references instead of trying to avoid double-escaping.
5317       *
5318       * @param string      $name  The attribute name to target.
5319       * @param string|bool $value The new attribute value.
5320       * @return bool Whether an attribute value was set.
5321       */
5322  	public function set_attribute( $name, $value ): bool {
5323          return $this->is_virtual() ? false : parent::set_attribute( $name, $value );
5324      }
5325  
5326      /**
5327       * Remove an attribute from the currently-matched tag.
5328       *
5329       * @since 6.6.0 Subclassed for HTML Processor.
5330       *
5331       * @param string $name The attribute name to remove.
5332       * @return bool Whether an attribute was removed.
5333       */
5334  	public function remove_attribute( $name ): bool {
5335          return $this->is_virtual() ? false : parent::remove_attribute( $name );
5336      }
5337  
5338      /**
5339       * Gets lowercase names of all attributes matching a given prefix in the current tag.
5340       *
5341       * Note that matching is case-insensitive. This is in accordance with the spec:
5342       *
5343       * > There must never be two or more attributes on
5344       * > the same start tag whose names are an ASCII
5345       * > case-insensitive match for each other.
5346       *     - HTML 5 spec
5347       *
5348       * Example:
5349       *
5350       *     $p = new WP_HTML_Tag_Processor( '<div data-ENABLED class="test" DATA-test-id="14">Test</div>' );
5351       *     $p->next_tag( array( 'class_name' => 'test' ) ) === true;
5352       *     $p->get_attribute_names_with_prefix( 'data-' ) === array( 'data-enabled', 'data-test-id' );
5353       *
5354       *     $p->next_tag() === false;
5355       *     $p->get_attribute_names_with_prefix( 'data-' ) === null;
5356       *
5357       * @since 6.6.0 Subclassed for the HTML Processor.
5358       *
5359       * @see https://html.spec.whatwg.org/multipage/syntax.html#attributes-2:ascii-case-insensitive
5360       *
5361       * @param string $prefix Prefix of requested attribute names.
5362       * @return array|null List of attribute names, or `null` when no tag opener is matched.
5363       */
5364  	public function get_attribute_names_with_prefix( $prefix ): ?array {
5365          return $this->is_virtual() ? null : parent::get_attribute_names_with_prefix( $prefix );
5366      }
5367  
5368      /**
5369       * Adds a new class name to the currently matched tag.
5370       *
5371       * @since 6.6.0 Subclassed for the HTML Processor.
5372       *
5373       * @param string $class_name The class name to add.
5374       * @return bool Whether the class was set to be added.
5375       */
5376  	public function add_class( $class_name ): bool {
5377          return $this->is_virtual() ? false : parent::add_class( $class_name );
5378      }
5379  
5380      /**
5381       * Removes a class name from the currently matched tag.
5382       *
5383       * @since 6.6.0 Subclassed for the HTML Processor.
5384       *
5385       * @param string $class_name The class name to remove.
5386       * @return bool Whether the class was set to be removed.
5387       */
5388  	public function remove_class( $class_name ): bool {
5389          return $this->is_virtual() ? false : parent::remove_class( $class_name );
5390      }
5391  
5392      /**
5393       * Returns if a matched tag contains the given ASCII case-insensitive class name.
5394       *
5395       * @since 6.6.0 Subclassed for the HTML Processor.
5396       *
5397       * @todo When reconstructing active formatting elements with attributes, find a way
5398       *       to indicate if the virtually-reconstructed formatting elements contain the
5399       *       wanted class name.
5400       *
5401       * @param string $wanted_class Look for this CSS class name, ASCII case-insensitive.
5402       * @return bool|null Whether the matched tag contains the given class name, or null if not matched.
5403       */
5404  	public function has_class( $wanted_class ): ?bool {
5405          return $this->is_virtual() ? null : parent::has_class( $wanted_class );
5406      }
5407  
5408      /**
5409       * Generator for a foreach loop to step through each class name for the matched tag.
5410       *
5411       * This generator function is designed to be used inside a "foreach" loop.
5412       *
5413       * Example:
5414       *
5415       *     $p = WP_HTML_Processor::create_fragment( "<div class='free &lt;egg&lt;\tlang-en'>" );
5416       *     $p->next_tag();
5417       *     foreach ( $p->class_list() as $class_name ) {
5418       *         echo "{$class_name} ";
5419       *     }
5420       *     // Outputs: "free <egg> lang-en "
5421       *
5422       * @since 6.6.0 Subclassed for the HTML Processor.
5423       */
5424  	public function class_list() {
5425          return $this->is_virtual() ? null : parent::class_list();
5426      }
5427  
5428      /**
5429       * Returns the modifiable text for a matched token, or an empty string.
5430       *
5431       * Modifiable text is text content that may be read and changed without
5432       * changing the HTML structure of the document around it. This includes
5433       * the contents of `#text` nodes in the HTML as well as the inner
5434       * contents of HTML comments, Processing Instructions, and others, even
5435       * though these nodes aren't part of a parsed DOM tree. They also contain
5436       * the contents of SCRIPT and STYLE tags, of TEXTAREA tags, and of any
5437       * other section in an HTML document which cannot contain HTML markup (DATA).
5438       *
5439       * If a token has no modifiable text then an empty string is returned to
5440       * avoid needless crashing or type errors. An empty string does not mean
5441       * that a token has modifiable text, and a token with modifiable text may
5442       * have an empty string (e.g. a comment with no contents).
5443       *
5444       * @since 6.6.0 Subclassed for the HTML Processor.
5445       *
5446       * @return string
5447       */
5448  	public function get_modifiable_text(): string {
5449          return $this->is_virtual() ? '' : parent::get_modifiable_text();
5450      }
5451  
5452      /**
5453       * Indicates what kind of comment produced the comment node.
5454       *
5455       * Because there are different kinds of HTML syntax which produce
5456       * comments, the Tag Processor tracks and exposes this as a type
5457       * for the comment. Nominally only regular HTML comments exist as
5458       * they are commonly known, but a number of unrelated syntax errors
5459       * also produce comments.
5460       *
5461       * @see self::COMMENT_AS_ABRUPTLY_CLOSED_COMMENT
5462       * @see self::COMMENT_AS_CDATA_LOOKALIKE
5463       * @see self::COMMENT_AS_INVALID_HTML
5464       * @see self::COMMENT_AS_HTML_COMMENT
5465       * @see self::COMMENT_AS_PI_NODE_LOOKALIKE
5466       *
5467       * @since 6.6.0 Subclassed for the HTML Processor.
5468       *
5469       * @return string|null
5470       */
5471  	public function get_comment_type(): ?string {
5472          return $this->is_virtual() ? null : parent::get_comment_type();
5473      }
5474  
5475      /**
5476       * Removes a bookmark that is no longer needed.
5477       *
5478       * Releasing a bookmark frees up the small
5479       * performance overhead it requires.
5480       *
5481       * @since 6.4.0
5482       *
5483       * @param string $bookmark_name Name of the bookmark to remove.
5484       * @return bool Whether the bookmark already existed before removal.
5485       */
5486  	public function release_bookmark( $bookmark_name ): bool {
5487          return parent::release_bookmark( "_{$bookmark_name}" );
5488      }
5489  
5490      /**
5491       * Moves the internal cursor in the HTML Processor to a given bookmark's location.
5492       *
5493       * Be careful! Seeking backwards to a previous location resets the parser to the
5494       * start of the document and reparses the entire contents up until it finds the
5495       * sought-after bookmarked location.
5496       *
5497       * In order to prevent accidental infinite loops, there's a
5498       * maximum limit on the number of times seek() can be called.
5499       *
5500       * @throws Exception When unable to allocate a bookmark for the next token in the input HTML document.
5501       *
5502       * @since 6.4.0
5503       *
5504       * @param string $bookmark_name Jump to the place in the document identified by this bookmark name.
5505       * @return bool Whether the internal cursor was successfully moved to the bookmark's location.
5506       */
5507  	public function seek( $bookmark_name ): bool {
5508          // Flush any pending updates to the document before beginning.
5509          $this->get_updated_html();
5510  
5511          $actual_bookmark_name = "_{$bookmark_name}";
5512          $processor_started_at = $this->state->current_token
5513              ? $this->bookmarks[ $this->state->current_token->bookmark_name ]->start
5514              : 0;
5515          $bookmark_starts_at   = $this->bookmarks[ $actual_bookmark_name ]->start;
5516          $direction            = $bookmark_starts_at > $processor_started_at ? 'forward' : 'backward';
5517  
5518          /*
5519           * If seeking backwards, it's possible that the sought-after bookmark exists within an element
5520           * which has been closed before the current cursor; in other words, it has already been removed
5521           * from the stack of open elements. This means that it's insufficient to simply pop off elements
5522           * from the stack of open elements which appear after the bookmarked location and then jump to
5523           * that location, as the elements which were open before won't be re-opened.
5524           *
5525           * In order to maintain consistency, the HTML Processor rewinds to the start of the document
5526           * and reparses everything until it finds the sought-after bookmark.
5527           *
5528           * There are potentially better ways to do this: cache the parser state for each bookmark and
5529           * restore it when seeking; store an immutable and idempotent register of where elements open
5530           * and close.
5531           *
5532           * If caching the parser state it will be essential to properly maintain the cached stack of
5533           * open elements and active formatting elements when modifying the document. This could be a
5534           * tedious and time-consuming process as well, and so for now will not be performed.
5535           *
5536           * It may be possible to track bookmarks for where elements open and close, and in doing so
5537           * be able to quickly recalculate breadcrumbs for any element in the document. It may even
5538           * be possible to remove the stack of open elements and compute it on the fly this way.
5539           * If doing this, the parser would need to track the opening and closing locations for all
5540           * tokens in the breadcrumb path for any and all bookmarks. By utilizing bookmarks themselves
5541           * this list could be automatically maintained while modifying the document. Finding the
5542           * breadcrumbs would then amount to traversing that list from the start until the token
5543           * being inspected. Once an element closes, if there are no bookmarks pointing to locations
5544           * within that element, then all of these locations may be forgotten to save on memory use
5545           * and computation time.
5546           */
5547          if ( 'backward' === $direction ) {
5548  
5549              /*
5550               * When moving backward, stateful stacks should be cleared.
5551               */
5552              foreach ( $this->state->stack_of_open_elements->walk_up() as $item ) {
5553                  $this->state->stack_of_open_elements->remove_node( $item );
5554              }
5555  
5556              foreach ( $this->state->active_formatting_elements->walk_up() as $item ) {
5557                  $this->state->active_formatting_elements->remove_node( $item );
5558              }
5559  
5560              /*
5561               * **After** clearing stacks, more processor state can be reset.
5562               * This must be done after clearing the stack because those stacks generate events that
5563               * would appear on a subsequent call to `next_token()`.
5564               */
5565              $this->state->frameset_ok                       = true;
5566              $this->state->stack_of_template_insertion_modes = array();
5567              $this->state->head_element                      = null;
5568              $this->state->form_element                      = null;
5569              $this->state->current_token                     = null;
5570              $this->current_element                          = null;
5571              $this->element_queue                            = array();
5572  
5573              /*
5574               * The absence of a context node indicates a full parse.
5575               * The presence of a context node indicates a fragment parser.
5576               */
5577              if ( null === $this->context_node ) {
5578                  $this->change_parsing_namespace( 'html' );
5579                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_INITIAL;
5580                  $this->breadcrumbs           = array();
5581  
5582                  $this->bookmarks['initial'] = new WP_HTML_Span( 0, 0 );
5583                  parent::seek( 'initial' );
5584                  unset( $this->bookmarks['initial'] );
5585              } else {
5586  
5587                  /*
5588                   * Push the root-node (HTML) back onto the stack of open elements.
5589                   *
5590                   * Fragment parsers require this extra bit of setup.
5591                   * It's handled in full parsers by advancing the processor state.
5592                   */
5593                  $this->state->stack_of_open_elements->push(
5594                      new WP_HTML_Token(
5595                          'root-node',
5596                          'HTML',
5597                          false
5598                      )
5599                  );
5600  
5601                  $this->change_parsing_namespace(
5602                      $this->context_node->integration_node_type
5603                          ? 'html'
5604                          : $this->context_node->namespace
5605                  );
5606  
5607                  if ( 'TEMPLATE' === $this->context_node->node_name ) {
5608                      $this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE;
5609                  }
5610  
5611                  $this->reset_insertion_mode_appropriately();
5612                  $this->breadcrumbs = array_slice( $this->breadcrumbs, 0, 2 );
5613                  parent::seek( $this->context_node->bookmark_name );
5614              }
5615          }
5616  
5617          /*
5618           * Here, the processor moves forward through the document until it matches the bookmark.
5619           * do-while is used here because the processor is expected to already be stopped on
5620           * a token than may match the bookmarked location.
5621           */
5622          do {
5623              /*
5624               * The processor will stop on virtual tokens, but bookmarks may not be set on them.
5625               * They should not be matched when seeking a bookmark, skip them.
5626               */
5627              if ( $this->is_virtual() ) {
5628                  continue;
5629              }
5630              if ( $bookmark_starts_at === $this->bookmarks[ $this->state->current_token->bookmark_name ]->start ) {
5631                  return true;
5632              }
5633          } while ( $this->next_token() );
5634  
5635          return false;
5636      }
5637  
5638      /**
5639       * Sets a bookmark in the HTML document.
5640       *
5641       * Bookmarks represent specific places or tokens in the HTML
5642       * document, such as a tag opener or closer. When applying
5643       * edits to a document, such as setting an attribute, the
5644       * text offsets of that token may shift; the bookmark is
5645       * kept updated with those shifts and remains stable unless
5646       * the entire span of text in which the token sits is removed.
5647       *
5648       * Release bookmarks when they are no longer needed.
5649       *
5650       * Example:
5651       *
5652       *     <main><h2>Surprising fact you may not know!</h2></main>
5653       *           ^  ^
5654       *            \-|-- this `H2` opener bookmark tracks the token
5655       *
5656       *     <main class="clickbait"><h2>Surprising fact you may no…
5657       *                             ^  ^
5658       *                              \-|-- it shifts with edits
5659       *
5660       * Bookmarks provide the ability to seek to a previously-scanned
5661       * place in the HTML document. This avoids the need to re-scan
5662       * the entire document.
5663       *
5664       * Example:
5665       *
5666       *     <ul><li>One</li><li>Two</li><li>Three</li></ul>
5667       *                                 ^^^^
5668       *                                 want to note this last item
5669       *
5670       *     $p = new WP_HTML_Tag_Processor( $html );
5671       *     $in_list = false;
5672       *     while ( $p->next_tag( array( 'tag_closers' => $in_list ? 'visit' : 'skip' ) ) ) {
5673       *         if ( 'UL' === $p->get_tag() ) {
5674       *             if ( $p->is_tag_closer() ) {
5675       *                 $in_list = false;
5676       *                 $p->set_bookmark( 'resume' );
5677       *                 if ( $p->seek( 'last-li' ) ) {
5678       *                     $p->add_class( 'last-li' );
5679       *                 }
5680       *                 $p->seek( 'resume' );
5681       *                 $p->release_bookmark( 'last-li' );
5682       *                 $p->release_bookmark( 'resume' );
5683       *             } else {
5684       *                 $in_list = true;
5685       *             }
5686       *         }
5687       *
5688       *         if ( 'LI' === $p->get_tag() ) {
5689       *             $p->set_bookmark( 'last-li' );
5690       *         }
5691       *     }
5692       *
5693       * Bookmarks intentionally hide the internal string offsets
5694       * to which they refer. They are maintained internally as
5695       * updates are applied to the HTML document and therefore
5696       * retain their "position" - the location to which they
5697       * originally pointed. The inability to use bookmarks with
5698       * functions like `substr` is therefore intentional to guard
5699       * against accidentally breaking the HTML.
5700       *
5701       * Because bookmarks allocate memory and require processing
5702       * for every applied update, they are limited and require
5703       * a name. They should not be created with programmatically-made
5704       * names, such as "li_{$index}" with some loop. As a general
5705       * rule they should only be created with string-literal names
5706       * like "start-of-section" or "last-paragraph".
5707       *
5708       * Bookmarks are a powerful tool to enable complicated behavior.
5709       * Consider double-checking that you need this tool if you are
5710       * reaching for it, as inappropriate use could lead to broken
5711       * HTML structure or unwanted processing overhead.
5712       *
5713       * Bookmarks cannot be set on tokens that do no appear in the original
5714       * HTML text. For example, the HTML `<table><td>` stops at tags `TABLE`,
5715       * `TBODY`, `TR`, and `TD`. The `TBODY` and `TR` tags do not appear in
5716       * the original HTML and cannot be used as bookmarks.
5717       *
5718       * @since 6.4.0
5719       *
5720       * @param string $bookmark_name Identifies this particular bookmark.
5721       * @return bool Whether the bookmark was successfully created.
5722       */
5723  	public function set_bookmark( $bookmark_name ): bool {
5724          if ( $this->is_virtual() ) {
5725              _doing_it_wrong(
5726                  __METHOD__,
5727                  __( 'Cannot set bookmarks on tokens that do no appear in the original HTML text.' ),
5728                  '6.8.0'
5729              );
5730              return false;
5731          }
5732          return parent::set_bookmark( "_{$bookmark_name}" );
5733      }
5734  
5735      /**
5736       * Checks whether a bookmark with the given name exists.
5737       *
5738       * @since 6.5.0
5739       *
5740       * @param string $bookmark_name Name to identify a bookmark that potentially exists.
5741       * @return bool Whether that bookmark exists.
5742       */
5743  	public function has_bookmark( $bookmark_name ): bool {
5744          return parent::has_bookmark( "_{$bookmark_name}" );
5745      }
5746  
5747      /*
5748       * HTML Parsing Algorithms
5749       */
5750  
5751      /**
5752       * Closes a P element.
5753       *
5754       * @since 6.4.0
5755       *
5756       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
5757       *
5758       * @see https://html.spec.whatwg.org/#close-a-p-element
5759       */
5760  	private function close_a_p_element(): void {
5761          $this->generate_implied_end_tags( 'P' );
5762          $this->state->stack_of_open_elements->pop_until( 'P' );
5763      }
5764  
5765      /**
5766       * Closes elements that have implied end tags.
5767       *
5768       * @since 6.4.0
5769       * @since 6.7.0 Full spec support.
5770       *
5771       * @see https://html.spec.whatwg.org/#generate-implied-end-tags
5772       *
5773       * @param string|null $except_for_this_element Perform as if this element doesn't exist in the stack of open elements.
5774       */
5775  	private function generate_implied_end_tags( ?string $except_for_this_element = null ): void {
5776          $elements_with_implied_end_tags = array(
5777              'DD',
5778              'DT',
5779              'LI',
5780              'OPTGROUP',
5781              'OPTION',
5782              'P',
5783              'RB',
5784              'RP',
5785              'RT',
5786              'RTC',
5787          );
5788  
5789          $no_exclusions = ! isset( $except_for_this_element );
5790  
5791          while (
5792              ( $no_exclusions || ! $this->state->stack_of_open_elements->current_node_is( $except_for_this_element ) ) &&
5793              in_array( $this->state->stack_of_open_elements->current_node()->node_name, $elements_with_implied_end_tags, true )
5794          ) {
5795              $this->state->stack_of_open_elements->pop();
5796          }
5797      }
5798  
5799      /**
5800       * Closes elements that have implied end tags, thoroughly.
5801       *
5802       * See the HTML specification for an explanation why this is
5803       * different from generating end tags in the normal sense.
5804       *
5805       * @since 6.4.0
5806       * @since 6.7.0 Full spec support.
5807       *
5808       * @see WP_HTML_Processor::generate_implied_end_tags
5809       * @see https://html.spec.whatwg.org/#generate-implied-end-tags
5810       */
5811  	private function generate_implied_end_tags_thoroughly(): void {
5812          $elements_with_implied_end_tags = array(
5813              'CAPTION',
5814              'COLGROUP',
5815              'DD',
5816              'DT',
5817              'LI',
5818              'OPTGROUP',
5819              'OPTION',
5820              'P',
5821              'RB',
5822              'RP',
5823              'RT',
5824              'RTC',
5825              'TBODY',
5826              'TD',
5827              'TFOOT',
5828              'TH',
5829              'THEAD',
5830              'TR',
5831          );
5832  
5833          while ( in_array( $this->state->stack_of_open_elements->current_node()->node_name, $elements_with_implied_end_tags, true ) ) {
5834              $this->state->stack_of_open_elements->pop();
5835          }
5836      }
5837  
5838      /**
5839       * Returns the adjusted current node.
5840       *
5841       * > The adjusted current node is the context element if the parser was created as
5842       * > part of the HTML fragment parsing algorithm and the stack of open elements
5843       * > has only one element in it (fragment case); otherwise, the adjusted current
5844       * > node is the current node.
5845       *
5846       * @see https://html.spec.whatwg.org/#adjusted-current-node
5847       *
5848       * @since 6.7.0
5849       *
5850       * @return WP_HTML_Token|null The adjusted current node.
5851       */
5852  	private function get_adjusted_current_node(): ?WP_HTML_Token {
5853          if ( isset( $this->context_node ) && 1 === $this->state->stack_of_open_elements->count() ) {
5854              return $this->context_node;
5855          }
5856  
5857          return $this->state->stack_of_open_elements->current_node();
5858      }
5859  
5860      /**
5861       * Reconstructs the active formatting elements.
5862       *
5863       * > This has the effect of reopening all the formatting elements that were opened
5864       * > in the current body, cell, or caption (whichever is youngest) that haven't
5865       * > been explicitly closed.
5866       *
5867       * @since 6.4.0
5868       *
5869       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
5870       *
5871       * @see https://html.spec.whatwg.org/#reconstruct-the-active-formatting-elements
5872       *
5873       * @return bool Whether any formatting elements needed to be reconstructed.
5874       */
5875  	private function reconstruct_active_formatting_elements(): bool {
5876          /*
5877           * > If there are no entries in the list of active formatting elements, then there is nothing
5878           * > to reconstruct; stop this algorithm.
5879           */
5880          if ( 0 === $this->state->active_formatting_elements->count() ) {
5881              return false;
5882          }
5883  
5884          $last_entry = $this->state->active_formatting_elements->current_node();
5885          if (
5886  
5887              /*
5888               * > If the last (most recently added) entry in the list of active formatting elements is a marker;
5889               * > stop this algorithm.
5890               */
5891              'marker' === $last_entry->node_name ||
5892  
5893              /*
5894               * > If the last (most recently added) entry in the list of active formatting elements is an
5895               * > element that is in the stack of open elements, then there is nothing to reconstruct;
5896               * > stop this algorithm.
5897               */
5898              $this->state->stack_of_open_elements->contains_node( $last_entry )
5899          ) {
5900              return false;
5901          }
5902  
5903          $this->bail( 'Cannot reconstruct active formatting elements when advancing and rewinding is required.' );
5904      }
5905  
5906      /**
5907       * Runs the reset the insertion mode appropriately algorithm.
5908       *
5909       * @since 6.7.0
5910       *
5911       * @see https://html.spec.whatwg.org/multipage/parsing.html#reset-the-insertion-mode-appropriately
5912       */
5913  	private function reset_insertion_mode_appropriately(): void {
5914          // Set the first node.
5915          $first_node = null;
5916          foreach ( $this->state->stack_of_open_elements->walk_down() as $first_node ) {
5917              break;
5918          }
5919  
5920          /*
5921           * > 1. Let _last_ be false.
5922           */
5923          $last = false;
5924          foreach ( $this->state->stack_of_open_elements->walk_up() as $node ) {
5925              /*
5926               * > 2. Let _node_ be the last node in the stack of open elements.
5927               * > 3. _Loop_: If _node_ is the first node in the stack of open elements, then set _last_
5928               * >            to true, and, if the parser was created as part of the HTML fragment parsing
5929               * >            algorithm (fragment case), set node to the context element passed to
5930               * >            that algorithm.
5931               * > …
5932               */
5933              if ( $node === $first_node ) {
5934                  $last = true;
5935                  if ( isset( $this->context_node ) ) {
5936                      $node = $this->context_node;
5937                  }
5938              }
5939  
5940              // All of the following rules are for matching HTML elements.
5941              if ( 'html' !== $node->namespace ) {
5942                  continue;
5943              }
5944  
5945              switch ( $node->node_name ) {
5946                  /*
5947                   * > 4. If node is a `select` element, run these substeps:
5948                   * >   1. If _last_ is true, jump to the step below labeled done.
5949                   * >   2. Let _ancestor_ be _node_.
5950                   * >   3. _Loop_: If _ancestor_ is the first node in the stack of open elements,
5951                   * >      jump to the step below labeled done.
5952                   * >   4. Let ancestor be the node before ancestor in the stack of open elements.
5953                   * >   …
5954                   * >   7. Jump back to the step labeled _loop_.
5955                   * >   8. _Done_: Switch the insertion mode to "in select" and return.
5956                   */
5957                  case 'SELECT':
5958                      if ( ! $last ) {
5959                          foreach ( $this->state->stack_of_open_elements->walk_up( $node ) as $ancestor ) {
5960                              if ( 'html' !== $ancestor->namespace ) {
5961                                  continue;
5962                              }
5963  
5964                              switch ( $ancestor->node_name ) {
5965                                  /*
5966                                   * > 5. If _ancestor_ is a `template` node, jump to the step below
5967                                   * >    labeled _done_.
5968                                   */
5969                                  case 'TEMPLATE':
5970                                      break 2;
5971  
5972                                  /*
5973                                   * > 6. If _ancestor_ is a `table` node, switch the insertion mode to
5974                                   * >    "in select in table" and return.
5975                                   */
5976                                  case 'TABLE':
5977                                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT_IN_TABLE;
5978                                      return;
5979                              }
5980                          }
5981                      }
5982                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT;
5983                      return;
5984  
5985                  /*
5986                   * > 5. If _node_ is a `td` or `th` element and _last_ is false, then switch the
5987                   * >    insertion mode to "in cell" and return.
5988                   */
5989                  case 'TD':
5990                  case 'TH':
5991                      if ( ! $last ) {
5992                          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_CELL;
5993                          return;
5994                      }
5995                      break;
5996  
5997                      /*
5998                      * > 6. If _node_ is a `tr` element, then switch the insertion mode to "in row"
5999                      * >    and return.
6000                      */
6001                  case 'TR':
6002                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
6003                      return;
6004  
6005                  /*
6006                   * > 7. If _node_ is a `tbody`, `thead`, or `tfoot` element, then switch the
6007                   * >    insertion mode to "in table body" and return.
6008                   */
6009                  case 'TBODY':
6010                  case 'THEAD':
6011                  case 'TFOOT':
6012                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
6013                      return;
6014  
6015                  /*
6016                   * > 8. If _node_ is a `caption` element, then switch the insertion mode to
6017                   * >    "in caption" and return.
6018                   */
6019                  case 'CAPTION':
6020                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_CAPTION;
6021                      return;
6022  
6023                  /*
6024                   * > 9. If _node_ is a `colgroup` element, then switch the insertion mode to
6025                   * >    "in column group" and return.
6026                   */
6027                  case 'COLGROUP':
6028                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP;
6029                      return;
6030  
6031                  /*
6032                   * > 10. If _node_ is a `table` element, then switch the insertion mode to
6033                   * >     "in table" and return.
6034                   */
6035                  case 'TABLE':
6036                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
6037                      return;
6038  
6039                  /*
6040                   * > 11. If _node_ is a `template` element, then switch the insertion mode to the
6041                   * >     current template insertion mode and return.
6042                   */
6043                  case 'TEMPLATE':
6044                      $this->state->insertion_mode = end( $this->state->stack_of_template_insertion_modes );
6045                      return;
6046  
6047                  /*
6048                   * > 12. If _node_ is a `head` element and _last_ is false, then switch the
6049                   * >     insertion mode to "in head" and return.
6050                   */
6051                  case 'HEAD':
6052                      if ( ! $last ) {
6053                          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
6054                          return;
6055                      }
6056                      break;
6057  
6058                  /*
6059                   * > 13. If _node_ is a `body` element, then switch the insertion mode to "in body"
6060                   * >     and return.
6061                   */
6062                  case 'BODY':
6063                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
6064                      return;
6065  
6066                  /*
6067                   * > 14. If _node_ is a `frameset` element, then switch the insertion mode to
6068                   * >     "in frameset" and return. (fragment case)
6069                   */
6070                  case 'FRAMESET':
6071                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_FRAMESET;
6072                      return;
6073  
6074                  /*
6075                   * > 15. If _node_ is an `html` element, run these substeps:
6076                   * >     1. If the head element pointer is null, switch the insertion mode to
6077                   * >        "before head" and return. (fragment case)
6078                   * >     2. Otherwise, the head element pointer is not null, switch the insertion
6079                   * >        mode to "after head" and return.
6080                   */
6081                  case 'HTML':
6082                      $this->state->insertion_mode = isset( $this->state->head_element )
6083                          ? WP_HTML_Processor_State::INSERTION_MODE_AFTER_HEAD
6084                          : WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HEAD;
6085                      return;
6086              }
6087          }
6088  
6089          /*
6090           * > 16. If _last_ is true, then switch the insertion mode to "in body"
6091           * >     and return. (fragment case)
6092           *
6093           * This is only reachable if `$last` is true, as per the fragment parsing case.
6094           */
6095          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
6096      }
6097  
6098      /**
6099       * Runs the adoption agency algorithm.
6100       *
6101       * @since 6.4.0
6102       *
6103       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
6104       *
6105       * @see https://html.spec.whatwg.org/#adoption-agency-algorithm
6106       */
6107  	private function run_adoption_agency_algorithm(): void {
6108          $budget       = 1000;
6109          $subject      = $this->get_tag();
6110          $current_node = $this->state->stack_of_open_elements->current_node();
6111  
6112          if (
6113              // > If the current node is an HTML element whose tag name is subject
6114              $current_node && $subject === $current_node->node_name &&
6115              // > the current node is not in the list of active formatting elements
6116              ! $this->state->active_formatting_elements->contains_node( $current_node )
6117          ) {
6118              $this->state->stack_of_open_elements->pop();
6119              return;
6120          }
6121  
6122          $outer_loop_counter = 0;
6123          while ( $budget-- > 0 ) {
6124              if ( $outer_loop_counter++ >= 8 ) {
6125                  return;
6126              }
6127  
6128              /*
6129               * > Let formatting element be the last element in the list of active formatting elements that:
6130               * >   - is between the end of the list and the last marker in the list,
6131               * >     if any, or the start of the list otherwise,
6132               * >   - and has the tag name subject.
6133               */
6134              $formatting_element = null;
6135              foreach ( $this->state->active_formatting_elements->walk_up() as $item ) {
6136                  if ( 'marker' === $item->node_name ) {
6137                      break;
6138                  }
6139  
6140                  if ( $subject === $item->node_name ) {
6141                      $formatting_element = $item;
6142                      break;
6143                  }
6144              }
6145  
6146              // > If there is no such element, then return and instead act as described in the "any other end tag" entry above.
6147              if ( null === $formatting_element ) {
6148                  $this->bail( 'Cannot run adoption agency when "any other end tag" is required.' );
6149              }
6150  
6151              // > If formatting element is not in the stack of open elements, then this is a parse error; remove the element from the list, and return.
6152              if ( ! $this->state->stack_of_open_elements->contains_node( $formatting_element ) ) {
6153                  $this->state->active_formatting_elements->remove_node( $formatting_element );
6154                  return;
6155              }
6156  
6157              // > If formatting element is in the stack of open elements, but the element is not in scope, then this is a parse error; return.
6158              if ( ! $this->state->stack_of_open_elements->has_element_in_scope( $formatting_element->node_name ) ) {
6159                  return;
6160              }
6161  
6162              /*
6163               * > Let furthest block be the topmost node in the stack of open elements that is lower in the stack
6164               * > than formatting element, and is an element in the special category. There might not be one.
6165               */
6166              $is_above_formatting_element = true;
6167              $furthest_block              = null;
6168              foreach ( $this->state->stack_of_open_elements->walk_down() as $item ) {
6169                  if ( $is_above_formatting_element && $formatting_element->bookmark_name !== $item->bookmark_name ) {
6170                      continue;
6171                  }
6172  
6173                  if ( $is_above_formatting_element ) {
6174                      $is_above_formatting_element = false;
6175                      continue;
6176                  }
6177  
6178                  if ( self::is_special( $item ) ) {
6179                      $furthest_block = $item;
6180                      break;
6181                  }
6182              }
6183  
6184              /*
6185               * > If there is no furthest block, then the UA must first pop all the nodes from the bottom of the
6186               * > stack of open elements, from the current node up to and including formatting element, then
6187               * > remove formatting element from the list of active formatting elements, and finally return.
6188               */
6189              if ( null === $furthest_block ) {
6190                  foreach ( $this->state->stack_of_open_elements->walk_up() as $item ) {
6191                      $this->state->stack_of_open_elements->pop();
6192  
6193                      if ( $formatting_element->bookmark_name === $item->bookmark_name ) {
6194                          $this->state->active_formatting_elements->remove_node( $formatting_element );
6195                          return;
6196                      }
6197                  }
6198              }
6199  
6200              $this->bail( 'Cannot extract common ancestor in adoption agency algorithm.' );
6201          }
6202  
6203          $this->bail( 'Cannot run adoption agency when looping required.' );
6204      }
6205  
6206      /**
6207       * Runs the "close the cell" algorithm.
6208       *
6209       * > Where the steps above say to close the cell, they mean to run the following algorithm:
6210       * >   1. Generate implied end tags.
6211       * >   2. If the current node is not now a td element or a th element, then this is a parse error.
6212       * >   3. Pop elements from the stack of open elements stack until a td element or a th element has been popped from the stack.
6213       * >   4. Clear the list of active formatting elements up to the last marker.
6214       * >   5. Switch the insertion mode to "in row".
6215       *
6216       * @see https://html.spec.whatwg.org/multipage/parsing.html#close-the-cell
6217       *
6218       * @since 6.7.0
6219       */
6220  	private function close_cell(): void {
6221          $this->generate_implied_end_tags();
6222          // @todo Parse error if the current node is a "td" or "th" element.
6223          foreach ( $this->state->stack_of_open_elements->walk_up() as $element ) {
6224              $this->state->stack_of_open_elements->pop();
6225              if ( 'TD' === $element->node_name || 'TH' === $element->node_name ) {
6226                  break;
6227              }
6228          }
6229          $this->state->active_formatting_elements->clear_up_to_last_marker();
6230          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
6231      }
6232  
6233      /**
6234       * Inserts an HTML element on the stack of open elements.
6235       *
6236       * @since 6.4.0
6237       *
6238       * @see https://html.spec.whatwg.org/#insert-a-foreign-element
6239       *
6240       * @param WP_HTML_Token $token Name of bookmark pointing to element in original input HTML.
6241       */
6242  	private function insert_html_element( WP_HTML_Token $token ): void {
6243          $this->state->stack_of_open_elements->push( $token );
6244      }
6245  
6246      /**
6247       * Inserts a foreign element on to the stack of open elements.
6248       *
6249       * @since 6.7.0
6250       *
6251       * @see https://html.spec.whatwg.org/#insert-a-foreign-element
6252       *
6253       * @param WP_HTML_Token $token                     Insert this token. The token's namespace and
6254       *                                                 insertion point will be updated correctly.
6255       * @param bool          $only_add_to_element_stack Whether to skip the "insert an element at the adjusted
6256       *                                                 insertion location" algorithm when adding this element.
6257       */
6258  	private function insert_foreign_element( WP_HTML_Token $token, bool $only_add_to_element_stack ): void {
6259          $adjusted_current_node = $this->get_adjusted_current_node();
6260  
6261          $token->namespace = $adjusted_current_node ? $adjusted_current_node->namespace : 'html';
6262  
6263          if ( $this->is_mathml_integration_point() ) {
6264              $token->integration_node_type = 'math';
6265          } elseif ( $this->is_html_integration_point() ) {
6266              $token->integration_node_type = 'html';
6267          }
6268  
6269          if ( false === $only_add_to_element_stack ) {
6270              /*
6271               * @todo Implement the "appropriate place for inserting a node" and the
6272               *       "insert an element at the adjusted insertion location" algorithms.
6273               *
6274               * These algorithms mostly impacts DOM tree construction and not the HTML API.
6275               * Here, there's no DOM node onto which the element will be appended, so the
6276               * parser will skip this step.
6277               *
6278               * @see https://html.spec.whatwg.org/#insert-an-element-at-the-adjusted-insertion-location
6279               */
6280          }
6281  
6282          $this->insert_html_element( $token );
6283      }
6284  
6285      /**
6286       * Inserts a virtual element on the stack of open elements.
6287       *
6288       * @since 6.7.0
6289       *
6290       * @param string      $token_name    Name of token to create and insert into the stack of open elements.
6291       * @param string|null $bookmark_name Optional. Name to give bookmark for created virtual node.
6292       *                                   Defaults to auto-creating a bookmark name.
6293       * @return WP_HTML_Token Newly-created virtual token.
6294       */
6295  	private function insert_virtual_node( $token_name, $bookmark_name = null ): WP_HTML_Token {
6296          $here = $this->bookmarks[ $this->state->current_token->bookmark_name ];
6297          $name = $bookmark_name ?? $this->bookmark_token();
6298  
6299          $this->bookmarks[ $name ] = new WP_HTML_Span( $here->start, 0 );
6300  
6301          $token = new WP_HTML_Token( $name, $token_name, false );
6302          $this->insert_html_element( $token );
6303          return $token;
6304      }
6305  
6306      /*
6307       * HTML Specification Helpers
6308       */
6309  
6310      /**
6311       * Indicates if the current token is a MathML integration point.
6312       *
6313       * @since 6.7.0
6314       *
6315       * @see https://html.spec.whatwg.org/#mathml-text-integration-point
6316       *
6317       * @return bool Whether the current token is a MathML integration point.
6318       */
6319  	private function is_mathml_integration_point(): bool {
6320          $current_token = $this->state->current_token;
6321          if ( ! isset( $current_token ) ) {
6322              return false;
6323          }
6324  
6325          if ( 'math' !== $current_token->namespace || 'M' !== $current_token->node_name[0] ) {
6326              return false;
6327          }
6328  
6329          $tag_name = $current_token->node_name;
6330  
6331          return (
6332              'MI' === $tag_name ||
6333              'MO' === $tag_name ||
6334              'MN' === $tag_name ||
6335              'MS' === $tag_name ||
6336              'MTEXT' === $tag_name
6337          );
6338      }
6339  
6340      /**
6341       * Indicates if the current token is an HTML integration point.
6342       *
6343       * Note that this method must be an instance method with access
6344       * to the current token, since it needs to examine the attributes
6345       * of the currently-matched tag, if it's in the MathML namespace.
6346       * Otherwise it would be required to scan the HTML and ensure that
6347       * no other accounting is overlooked.
6348       *
6349       * @since 6.7.0
6350       *
6351       * @see https://html.spec.whatwg.org/#html-integration-point
6352       *
6353       * @return bool Whether the current token is an HTML integration point.
6354       */
6355  	private function is_html_integration_point(): bool {
6356          $current_token = $this->state->current_token;
6357          if ( ! isset( $current_token ) ) {
6358              return false;
6359          }
6360  
6361          if ( 'html' === $current_token->namespace ) {
6362              return false;
6363          }
6364  
6365          $tag_name = $current_token->node_name;
6366  
6367          if ( 'svg' === $current_token->namespace ) {
6368              return (
6369                  'DESC' === $tag_name ||
6370                  'FOREIGNOBJECT' === $tag_name ||
6371                  'TITLE' === $tag_name
6372              );
6373          }
6374  
6375          if ( 'math' === $current_token->namespace ) {
6376              if ( 'ANNOTATION-XML' !== $tag_name ) {
6377                  return false;
6378              }
6379  
6380              $encoding = $this->get_attribute( 'encoding' );
6381  
6382              return (
6383                  is_string( $encoding ) &&
6384                  (
6385                      0 === strcasecmp( $encoding, 'application/xhtml+xml' ) ||
6386                      0 === strcasecmp( $encoding, 'text/html' )
6387                  )
6388              );
6389          }
6390  
6391          $this->bail( 'Should not have reached end of HTML Integration Point detection: check HTML API code.' );
6392          // This unnecessary return prevents tools from inaccurately reporting type errors.
6393          return false;
6394      }
6395  
6396      /**
6397       * Returns whether an element of a given name is in the HTML special category.
6398       *
6399       * @since 6.4.0
6400       *
6401       * @see https://html.spec.whatwg.org/#special
6402       *
6403       * @param WP_HTML_Token|string $tag_name Node to check, or only its name if in the HTML namespace.
6404       * @return bool Whether the element of the given name is in the special category.
6405       */
6406  	public static function is_special( $tag_name ): bool {
6407          if ( is_string( $tag_name ) ) {
6408              $tag_name = strtoupper( $tag_name );
6409          } else {
6410              $tag_name = 'html' === $tag_name->namespace
6411                  ? strtoupper( $tag_name->node_name )
6412                  : "{$tag_name->namespace} {$tag_name->node_name}";
6413          }
6414  
6415          return (
6416              'ADDRESS' === $tag_name ||
6417              'APPLET' === $tag_name ||
6418              'AREA' === $tag_name ||
6419              'ARTICLE' === $tag_name ||
6420              'ASIDE' === $tag_name ||
6421              'BASE' === $tag_name ||
6422              'BASEFONT' === $tag_name ||
6423              'BGSOUND' === $tag_name ||
6424              'BLOCKQUOTE' === $tag_name ||
6425              'BODY' === $tag_name ||
6426              'BR' === $tag_name ||
6427              'BUTTON' === $tag_name ||
6428              'CAPTION' === $tag_name ||
6429              'CENTER' === $tag_name ||
6430              'COL' === $tag_name ||
6431              'COLGROUP' === $tag_name ||
6432              'DD' === $tag_name ||
6433              'DETAILS' === $tag_name ||
6434              'DIR' === $tag_name ||
6435              'DIV' === $tag_name ||
6436              'DL' === $tag_name ||
6437              'DT' === $tag_name ||
6438              'EMBED' === $tag_name ||
6439              'FIELDSET' === $tag_name ||
6440              'FIGCAPTION' === $tag_name ||
6441              'FIGURE' === $tag_name ||
6442              'FOOTER' === $tag_name ||
6443              'FORM' === $tag_name ||
6444              'FRAME' === $tag_name ||
6445              'FRAMESET' === $tag_name ||
6446              'H1' === $tag_name ||
6447              'H2' === $tag_name ||
6448              'H3' === $tag_name ||
6449              'H4' === $tag_name ||
6450              'H5' === $tag_name ||
6451              'H6' === $tag_name ||
6452              'HEAD' === $tag_name ||
6453              'HEADER' === $tag_name ||
6454              'HGROUP' === $tag_name ||
6455              'HR' === $tag_name ||
6456              'HTML' === $tag_name ||
6457              'IFRAME' === $tag_name ||
6458              'IMG' === $tag_name ||
6459              'INPUT' === $tag_name ||
6460              'KEYGEN' === $tag_name ||
6461              'LI' === $tag_name ||
6462              'LINK' === $tag_name ||
6463              'LISTING' === $tag_name ||
6464              'MAIN' === $tag_name ||
6465              'MARQUEE' === $tag_name ||
6466              'MENU' === $tag_name ||
6467              'META' === $tag_name ||
6468              'NAV' === $tag_name ||
6469              'NOEMBED' === $tag_name ||
6470              'NOFRAMES' === $tag_name ||
6471              'NOSCRIPT' === $tag_name ||
6472              'OBJECT' === $tag_name ||
6473              'OL' === $tag_name ||
6474              'P' === $tag_name ||
6475              'PARAM' === $tag_name ||
6476              'PLAINTEXT' === $tag_name ||
6477              'PRE' === $tag_name ||
6478              'SCRIPT' === $tag_name ||
6479              'SEARCH' === $tag_name ||
6480              'SECTION' === $tag_name ||
6481              'SELECT' === $tag_name ||
6482              'SOURCE' === $tag_name ||
6483              'STYLE' === $tag_name ||
6484              'SUMMARY' === $tag_name ||
6485              'TABLE' === $tag_name ||
6486              'TBODY' === $tag_name ||
6487              'TD' === $tag_name ||
6488              'TEMPLATE' === $tag_name ||
6489              'TEXTAREA' === $tag_name ||
6490              'TFOOT' === $tag_name ||
6491              'TH' === $tag_name ||
6492              'THEAD' === $tag_name ||
6493              'TITLE' === $tag_name ||
6494              'TR' === $tag_name ||
6495              'TRACK' === $tag_name ||
6496              'UL' === $tag_name ||
6497              'WBR' === $tag_name ||
6498              'XMP' === $tag_name ||
6499  
6500              // MathML.
6501              'math MI' === $tag_name ||
6502              'math MO' === $tag_name ||
6503              'math MN' === $tag_name ||
6504              'math MS' === $tag_name ||
6505              'math MTEXT' === $tag_name ||
6506              'math ANNOTATION-XML' === $tag_name ||
6507  
6508              // SVG.
6509              'svg DESC' === $tag_name ||
6510              'svg FOREIGNOBJECT' === $tag_name ||
6511              'svg TITLE' === $tag_name
6512          );
6513      }
6514  
6515      /**
6516       * Returns whether a given element is an HTML Void Element
6517       *
6518       * > area, base, br, col, embed, hr, img, input, link, meta, source, track, wbr
6519       *
6520       * @since 6.4.0
6521       *
6522       * @see https://html.spec.whatwg.org/#void-elements
6523       *
6524       * @param string $tag_name Name of HTML tag to check.
6525       * @return bool Whether the given tag is an HTML Void Element.
6526       */
6527  	public static function is_void( $tag_name ): bool {
6528          $tag_name = strtoupper( $tag_name );
6529  
6530          return (
6531              'AREA' === $tag_name ||
6532              'BASE' === $tag_name ||
6533              'BASEFONT' === $tag_name || // Obsolete but still treated as void.
6534              'BGSOUND' === $tag_name || // Obsolete but still treated as void.
6535              'BR' === $tag_name ||
6536              'COL' === $tag_name ||
6537              'EMBED' === $tag_name ||
6538              'FRAME' === $tag_name ||
6539              'HR' === $tag_name ||
6540              'IMG' === $tag_name ||
6541              'INPUT' === $tag_name ||
6542              'KEYGEN' === $tag_name || // Obsolete but still treated as void.
6543              'LINK' === $tag_name ||
6544              'META' === $tag_name ||
6545              'PARAM' === $tag_name || // Obsolete but still treated as void.
6546              'SOURCE' === $tag_name ||
6547              'TRACK' === $tag_name ||
6548              'WBR' === $tag_name
6549          );
6550      }
6551  
6552      /**
6553       * Gets an encoding from a given string.
6554       *
6555       * This is an algorithm defined in the WHAT-WG specification.
6556       *
6557       * Example:
6558       *
6559       *     'UTF-8' === self::get_encoding( 'utf8' );
6560       *     'UTF-8' === self::get_encoding( "  \tUTF-8 " );
6561       *     null    === self::get_encoding( 'UTF-7' );
6562       *     null    === self::get_encoding( 'utf8; charset=' );
6563       *
6564       * @see https://encoding.spec.whatwg.org/#concept-encoding-get
6565       *
6566       * @todo As this parser only supports UTF-8, only the UTF-8
6567       *       encodings are detected. Add more as desired, but the
6568       *       parser will bail on non-UTF-8 encodings.
6569       *
6570       * @since 6.7.0
6571       *
6572       * @param string $label A string which may specify a known encoding.
6573       * @return string|null Known encoding if matched, otherwise null.
6574       */
6575  	protected static function get_encoding( string $label ): ?string {
6576          /*
6577           * > Remove any leading and trailing ASCII whitespace from label.
6578           */
6579          $label = trim( $label, " \t\f\r\n" );
6580  
6581          /*
6582           * > If label is an ASCII case-insensitive match for any of the labels listed in the
6583           * > table below, then return the corresponding encoding; otherwise return failure.
6584           */
6585          switch ( strtolower( $label ) ) {
6586              case 'unicode-1-1-utf-8':
6587              case 'unicode11utf8':
6588              case 'unicode20utf8':
6589              case 'utf-8':
6590              case 'utf8':
6591              case 'x-unicode20utf8':
6592                  return 'UTF-8';
6593  
6594              default:
6595                  return null;
6596          }
6597      }
6598  
6599      /*
6600       * Constants that would pollute the top of the class if they were found there.
6601       */
6602  
6603      /**
6604       * Indicates that the next HTML token should be parsed and processed.
6605       *
6606       * @since 6.4.0
6607       *
6608       * @var string
6609       */
6610      const PROCESS_NEXT_NODE = 'process-next-node';
6611  
6612      /**
6613       * Indicates that the current HTML token should be reprocessed in the newly-selected insertion mode.
6614       *
6615       * @since 6.4.0
6616       *
6617       * @var string
6618       */
6619      const REPROCESS_CURRENT_NODE = 'reprocess-current-node';
6620  
6621      /**
6622       * Indicates that the current HTML token should be processed without advancing the parser.
6623       *
6624       * @since 6.5.0
6625       *
6626       * @var string
6627       */
6628      const PROCESS_CURRENT_NODE = 'process-current-node';
6629  
6630      /**
6631       * Indicates that the parser encountered unsupported markup and has bailed.
6632       *
6633       * @since 6.4.0
6634       *
6635       * @var string
6636       */
6637      const ERROR_UNSUPPORTED = 'unsupported';
6638  
6639      /**
6640       * Indicates that the parser encountered more HTML tokens than it
6641       * was able to process and has bailed.
6642       *
6643       * @since 6.4.0
6644       *
6645       * @var string
6646       */
6647      const ERROR_EXCEEDED_MAX_BOOKMARKS = 'exceeded-max-bookmarks';
6648  
6649      /**
6650       * Unlock code that must be passed into the constructor to create this class.
6651       *
6652       * This class extends the WP_HTML_Tag_Processor, which has a public class
6653       * constructor. Therefore, it's not possible to have a private constructor here.
6654       *
6655       * This unlock code is used to ensure that anyone calling the constructor is
6656       * doing so with a full understanding that it's intended to be a private API.
6657       *
6658       * @access private
6659       */
6660      const CONSTRUCTOR_UNLOCK_CODE = 'Use WP_HTML_Processor::create_fragment() instead of calling the class constructor directly.';
6661  }


Generated : Fri Oct 10 08:20:03 2025 Cross-referenced by PHPXref