PHPXRef 0.7.1 : WordPress Trunk (Updated Daily) : /wp-includes/html-api/class-wp-html-processor.php source

   1  <?php
   2  /**
   3   * HTML API: WP_HTML_Processor class
   4   *
   5   * @package WordPress
   6   * @subpackage HTML-API
   7   * @since 6.4.0
   8   */
   9  
  10  /**
  11   * Core class used to safely parse and modify an HTML document.
  12   *
  13   * The HTML Processor class properly parses and modifies HTML5 documents.
  14   *
  15   * It supports a subset of the HTML5 specification, and when it encounters
  16   * unsupported markup, it aborts early to avoid unintentionally breaking
  17   * the document. The HTML Processor should never break an HTML document.
  18   *
  19   * While the `WP_HTML_Tag_Processor` is a valuable tool for modifying
  20   * attributes on individual HTML tags, the HTML Processor is more capable
  21   * and useful for the following operations:
  22   *
  23   *  - Querying based on nested HTML structure.
  24   *
  25   * Eventually the HTML Processor will also support:
  26   *  - Wrapping a tag in surrounding HTML.
  27   *  - Unwrapping a tag by removing its parent.
  28   *  - Inserting and removing nodes.
  29   *  - Reading and changing inner content.
  30   *  - Navigating up or around HTML structure.
  31   *
  32   * ## Usage
  33   *
  34   * Use of this class requires three steps:
  35   *
  36   *   1. Call a static creator method with your input HTML document.
  37   *   2. Find the location in the document you are looking for.
  38   *   3. Request changes to the document at that location.
  39   *
  40   * Example:
  41   *
  42   *     $processor = WP_HTML_Processor::create_fragment( $html );
  43   *     if ( $processor->next_tag( array( 'breadcrumbs' => array( 'DIV', 'FIGURE', 'IMG' ) ) ) ) {
  44   *         $processor->add_class( 'responsive-image' );
  45   *     }
  46   *
  47   * #### Breadcrumbs
  48   *
  49   * Breadcrumbs represent the stack of open elements from the root
  50   * of the document or fragment down to the currently-matched node,
  51   * if one is currently selected. Call WP_HTML_Processor::get_breadcrumbs()
  52   * to inspect the breadcrumbs for a matched tag.
  53   *
  54   * Breadcrumbs can specify nested HTML structure and are equivalent
  55   * to a CSS selector comprising tag names separated by the child
  56   * combinator, such as "DIV > FIGURE > IMG".
  57   *
  58   * Since all elements find themselves inside a full HTML document
  59   * when parsed, the return value from `get_breadcrumbs()` will always
  60   * contain any implicit outermost elements. For example, when parsing
  61   * with `create_fragment()` in the `BODY` context (the default), any
  62   * tag in the given HTML document will contain `array( 'HTML', 'BODY', … )`
  63   * in its breadcrumbs.
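        *
        * As a minimal sketch, assuming the default `BODY` context and a made-up
        * input string, the implicit outermost elements appear like this:
        *
        *     $processor = WP_HTML_Processor::create_fragment( '<figure><img></figure>' );
        *     if ( null !== $processor && $processor->next_tag( 'IMG' ) ) {
        *         // Expected to include the implied HTML and BODY elements:
        *         // array( 'HTML', 'BODY', 'FIGURE', 'IMG' ).
        *         $breadcrumbs = $processor->get_breadcrumbs();
        *     }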
  64   *
  65   * Despite containing the implied outermost elements in their breadcrumbs,
  66   * tags may be found with the shortest-matching breadcrumb query. That is,
  67   * `array( 'IMG' )` matches all IMG elements and `array( 'P', 'IMG' )`
  68   * matches all IMG elements directly inside a P element. To ensure that no
   69   * partial matches erroneously match, it's possible to specify in a query
  70   * the full breadcrumb match all the way down from the root HTML element.
  71   *
  72   * Example:
  73   *
  74   *     $html = '<figure><img><figcaption>A <em>lovely</em> day outside</figcaption></figure>';
  75   *     //               ----- Matches here.
  76   *     $processor->next_tag( array( 'breadcrumbs' => array( 'FIGURE', 'IMG' ) ) );
  77   *
  78   *     $html = '<figure><img><figcaption>A <em>lovely</em> day outside</figcaption></figure>';
  79   *     //                                  ---- Matches here.
  80   *     $processor->next_tag( array( 'breadcrumbs' => array( 'FIGURE', 'FIGCAPTION', 'EM' ) ) );
  81   *
  82   *     $html = '<div><img></div><img>';
  83   *     //                       ----- Matches here, because IMG must be a direct child of the implicit BODY.
  84   *     $processor->next_tag( array( 'breadcrumbs' => array( 'BODY', 'IMG' ) ) );
  85   *
  86   * ## HTML Support
  87   *
  88   * This class implements a small part of the HTML5 specification.
  89   * It's designed to operate within its support and abort early whenever
  90   * encountering circumstances it can't properly handle. This is
   91   * the principal way in which this class remains as simple as possible
  92   * without cutting corners and breaking compliance.
  93   *
  94   * ### Supported elements
  95   *
  96   * If any unsupported element appears in the HTML input the HTML Processor
  97   * will abort early and stop all processing. This draconian measure ensures
  98   * that the HTML Processor won't break any HTML it doesn't fully understand.
  99   *
 100   * The HTML Processor supports all elements other than a specific set:
 101   *
 102   *  - Any element inside a TABLE.
 103   *  - Any element inside foreign content, including SVG and MATH.
 104   *  - Any element outside the IN BODY insertion mode, e.g. doctype declarations, meta, links.
 105   *
 106   * ### Supported markup
 107   *
 108   * Some kinds of non-normative HTML involve reconstruction of formatting elements and
 109   * re-parenting of mis-nested elements. For example, a DIV tag found inside a TABLE
 110   * may in fact belong _before_ the table in the DOM. If the HTML Processor encounters
 111   * such a case it will stop processing.
 112   *
 113   * The following list illustrates some common examples of unexpected HTML inputs that
 114   * the HTML Processor properly parses and represents:
 115   *
 116   *  - HTML with optional tags omitted, e.g. `<p>one<p>two`.
 117   *  - HTML with unexpected tag closers, e.g. `<p>one </span> more</p>`.
 118   *  - Non-void tags with self-closing flag, e.g. `<div/>the DIV is still open.</div>`.
 119   *  - Heading elements which close open heading elements of another level, e.g. `<h1>Closed by </h2>`.
 120   *  - Elements containing text that looks like other tags but isn't, e.g. `<title>The <img> is plaintext</title>`.
 121   *  - SCRIPT and STYLE tags containing text that looks like HTML but isn't, e.g. `<script>document.write('<p>Hi</p>');</script>`.
 122   *  - SCRIPT content which has been escaped, e.g. `<script><!-- document.write('<script>console.log("hi")</script>') --></script>`.
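        *
        * For instance, a sketch of the first case (the input string is only
        * illustrative): the second P implicitly closes the first, so both
        * paragraphs are expected to be siblings under BODY.
        *
        *     $processor = WP_HTML_Processor::create_fragment( '<p>one<p>two' );
        *     if ( null !== $processor && $processor->next_tag( 'P' ) && $processor->next_tag( 'P' ) ) {
        *         // The second P is not nested inside the first:
        *         // array( 'HTML', 'BODY', 'P' ).
        *         $breadcrumbs = $processor->get_breadcrumbs();
        *     }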
 123   *
 124   * ### Unsupported Features
 125   *
 126   * This parser does not report parse errors.
 127   *
 128   * Normally, when additional HTML or BODY tags are encountered in a document, if there
 129   * are any additional attributes on them that aren't found on the previous elements,
 130   * the existing HTML and BODY elements adopt those missing attribute values. This
 131   * parser does not add those additional attributes.
 132   *
 133   * In certain situations, elements are moved to a different part of the document in
  134   * processes called "adoption" and "fostering." Because the nodes move to a location
 135   * in the document that the parser had already processed, this parser does not support
 136   * these situations and will bail.
 137   *
 138   * @since 6.4.0
 139   *
 140   * @see WP_HTML_Tag_Processor
 141   * @see https://html.spec.whatwg.org/
 142   */
 143  class WP_HTML_Processor extends WP_HTML_Tag_Processor {
 144      /**
 145       * The maximum number of bookmarks allowed to exist at any given time.
 146       *
 147       * HTML processing requires more bookmarks than basic tag processing,
  148       * so this class constant from the Tag Processor is overridden.
 149       *
 150       * @since 6.4.0
 151       *
 152       * @var int
 153       */
 154      const MAX_BOOKMARKS = 100;
 155  
 156      /**
 157       * Holds the working state of the parser, including the stack of
 158       * open elements and the stack of active formatting elements.
 159       *
 160       * Initialized in the constructor.
 161       *
 162       * @since 6.4.0
 163       *
 164       * @var WP_HTML_Processor_State
 165       */
 166      private $state;
 167  
 168      /**
 169       * Used to create unique bookmark names.
 170       *
 171       * This class sets a bookmark for every tag in the HTML document that it encounters.
 172       * The bookmark name is auto-generated and increments, starting with `1`. These are
 173       * internal bookmarks and are automatically released when the referring WP_HTML_Token
 174       * goes out of scope and is garbage-collected.
 175       *
 176       * @since 6.4.0
 177       *
 178       * @see WP_HTML_Processor::$release_internal_bookmark_on_destruct
 179       *
 180       * @var int
 181       */
 182      private $bookmark_counter = 0;
 183  
 184      /**
 185       * Stores an explanation for why something failed, if it did.
 186       *
 187       * @see self::get_last_error
 188       *
 189       * @since 6.4.0
 190       *
 191       * @var string|null
 192       */
 193      private $last_error = null;
 194  
 195      /**
 196       * Stores context for why the parser bailed on unsupported HTML, if it did.
 197       *
 198       * @see self::get_unsupported_exception
 199       *
 200       * @since 6.7.0
 201       *
 202       * @var WP_HTML_Unsupported_Exception|null
 203       */
 204      private $unsupported_exception = null;
 205  
 206      /**
 207       * Releases a bookmark when PHP garbage-collects its wrapping WP_HTML_Token instance.
 208       *
 209       * This function is created inside the class constructor so that it can be passed to
 210       * the stack of open elements and the stack of active formatting elements without
 211       * exposing it as a public method on the class.
 212       *
 213       * @since 6.4.0
 214       *
 215       * @var Closure|null
 216       */
 217      private $release_internal_bookmark_on_destruct = null;
 218  
 219      /**
 220       * Stores stack events which arise during parsing of the
 221       * HTML document, which will then supply the "match" events.
 222       *
 223       * @since 6.6.0
 224       *
 225       * @var WP_HTML_Stack_Event[]
 226       */
 227      private $element_queue = array();
 228  
 229      /**
 230       * Stores the current breadcrumbs.
 231       *
 232       * @since 6.7.0
 233       *
 234       * @var string[]
 235       */
 236      private $breadcrumbs = array();
 237  
 238      /**
 239       * Current stack event, if set, representing a matched token.
 240       *
 241       * Because the parser may internally point to a place further along in a document
 242       * than the nodes which have already been processed (some "virtual" nodes may have
 243       * appeared while scanning the HTML document), this will point at the "current" node
 244       * being processed. It comes from the front of the element queue.
 245       *
 246       * @since 6.6.0
 247       *
 248       * @var WP_HTML_Stack_Event|null
 249       */
 250      private $current_element = null;
 251  
 252      /**
 253       * Context node if created as a fragment parser.
 254       *
 255       * @var WP_HTML_Token|null
 256       */
 257      private $context_node = null;
 258  
 259      /*
 260       * Public Interface Functions
 261       */
 262  
 263      /**
 264       * Creates an HTML processor in the fragment parsing mode.
 265       *
 266       * Use this for cases where you are processing chunks of HTML that
  267       * will be found within a bigger HTML document, such as rendered
  268       * block output that exists within a post, or `the_content` inside
  269       * a rendered site layout.
 270       *
 271       * Fragment parsing occurs within a context, which is an HTML element
 272       * that the document will eventually be placed in. It becomes important
 273       * when special elements have different rules than others, such as inside
 274       * a TEXTAREA or a TITLE tag where things that look like tags are text,
 275       * or inside a SCRIPT tag where things that look like HTML syntax are JS.
 276       *
  277       * The context value should be a representation of the tag in which the
 278       * HTML is found. For most cases this will be the body element. The HTML
 279       * form is provided because a context element may have attributes that
 280       * impact the parse, such as with a SCRIPT tag and its `type` attribute.
 281       *
 282       * ## Current HTML Support
 283       *
 284       *  - The only supported context is `<body>`, which is the default value.
 285       *  - The only supported document encoding is `UTF-8`, which is the default value.
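            *
            * A minimal usage sketch, assuming it runs inside a hypothetical filter
            * callback where `$html` holds a UTF-8 fragment. Any context other than
            * `<body>` currently returns `null`, so the result is checked before use:
            *
            *     $processor = WP_HTML_Processor::create_fragment( $html );
            *     if ( null === $processor ) {
            *         // Unsupported context or encoding; leave the HTML untouched.
            *         return $html;
            *     }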
 286       *
 287       * @since 6.4.0
 288       * @since 6.6.0 Returns `static` instead of `self` so it can create subclass instances.
 289       *
 290       * @param string $html     Input HTML fragment to process.
 291       * @param string $context  Context element for the fragment, must be default of `<body>`.
 292       * @param string $encoding Text encoding of the document; must be default of 'UTF-8'.
 293       * @return static|null The created processor if successful, otherwise null.
 294       */
 295  	public static function create_fragment( $html, $context = '<body>', $encoding = 'UTF-8' ) {
 296          if ( '<body>' !== $context || 'UTF-8' !== $encoding ) {
 297              return null;
 298          }
 299  
 300          $context_processor = static::create_full_parser( "<!DOCTYPE html>{$context}", $encoding );
 301          if ( null === $context_processor ) {
 302              return null;
 303          }
 304  
 305          while ( $context_processor->next_tag() ) {
 306              if ( ! $context_processor->is_virtual() ) {
 307                  $context_processor->set_bookmark( 'final_node' );
 308              }
 309          }
 310  
 311          if (
 312              ! $context_processor->has_bookmark( 'final_node' ) ||
 313              ! $context_processor->seek( 'final_node' )
 314          ) {
 315              _doing_it_wrong( __METHOD__, __( 'No valid context element was detected.' ), '6.8.0' );
 316              return null;
 317          }
 318  
 319          return $context_processor->create_fragment_at_current_node( $html );
 320      }
 321  
 322      /**
 323       * Creates an HTML processor in the full parsing mode.
 324       *
 325       * It's likely that a fragment parser is more appropriate, unless sending an
 326       * entire HTML document from start to finish. Consider a fragment parser with
 327       * a context node of `<body>`.
 328       *
 329       * UTF-8 is the only allowed encoding. If working with a document that
 330       * isn't UTF-8, first convert the document to UTF-8, then pass in the
 331       * converted HTML.
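            *
            * A sketch of that conversion, assuming the hypothetical `$latin1_html`
            * is known to be ISO-8859-1 and that PHP's mbstring extension is available:
            *
            *     $utf8_html = mb_convert_encoding( $latin1_html, 'UTF-8', 'ISO-8859-1' );
            *     $processor = WP_HTML_Processor::create_full_parser( $utf8_html );
            *     if ( null === $processor ) {
            *         // Creation failed; fall back to leaving the document unmodified.
            *     }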
 332       *
 333       * @param string      $html                    Input HTML document to process.
 334       * @param string|null $known_definite_encoding Optional. If provided, specifies the charset used
 335       *                                             in the input byte stream. Currently must be UTF-8.
 336       * @return static|null The created processor if successful, otherwise null.
 337       */
 338  	public static function create_full_parser( $html, $known_definite_encoding = 'UTF-8' ) {
 339          if ( 'UTF-8' !== $known_definite_encoding ) {
 340              return null;
 341          }
 342  
 343          $processor                             = new static( $html, self::CONSTRUCTOR_UNLOCK_CODE );
 344          $processor->state->encoding            = $known_definite_encoding;
 345          $processor->state->encoding_confidence = 'certain';
 346  
 347          return $processor;
 348      }
 349  
 350      /**
 351       * Constructor.
 352       *
 353       * Do not use this method. Use the static creator methods instead.
 354       *
 355       * @access private
 356       *
 357       * @since 6.4.0
 358       *
 359       * @see WP_HTML_Processor::create_fragment()
 360       *
 361       * @param string      $html                                  HTML to process.
 362       * @param string|null $use_the_static_create_methods_instead This constructor should not be called manually.
 363       */
 364  	public function __construct( $html, $use_the_static_create_methods_instead = null ) {
 365          parent::__construct( $html );
 366  
 367          if ( self::CONSTRUCTOR_UNLOCK_CODE !== $use_the_static_create_methods_instead ) {
 368              _doing_it_wrong(
 369                  __METHOD__,
 370                  sprintf(
 371                      /* translators: %s: WP_HTML_Processor::create_fragment(). */
 372                      __( 'Call %s to create an HTML Processor instead of calling the constructor directly.' ),
 373                      '<code>WP_HTML_Processor::create_fragment()</code>'
 374                  ),
 375                  '6.4.0'
 376              );
 377          }
 378  
 379          $this->state = new WP_HTML_Processor_State();
 380  
 381          $this->state->stack_of_open_elements->set_push_handler(
 382              function ( WP_HTML_Token $token ): void {
 383                  $is_virtual            = ! isset( $this->state->current_token ) || $this->is_tag_closer();
 384                  $same_node             = isset( $this->state->current_token ) && $token->node_name === $this->state->current_token->node_name;
 385                  $provenance            = ( ! $same_node || $is_virtual ) ? 'virtual' : 'real';
 386                  $this->element_queue[] = new WP_HTML_Stack_Event( $token, WP_HTML_Stack_Event::PUSH, $provenance );
 387  
 388                  $this->change_parsing_namespace( $token->integration_node_type ? 'html' : $token->namespace );
 389              }
 390          );
 391  
 392          $this->state->stack_of_open_elements->set_pop_handler(
 393              function ( WP_HTML_Token $token ): void {
 394                  $is_virtual            = ! isset( $this->state->current_token ) || ! $this->is_tag_closer();
 395                  $same_node             = isset( $this->state->current_token ) && $token->node_name === $this->state->current_token->node_name;
 396                  $provenance            = ( ! $same_node || $is_virtual ) ? 'virtual' : 'real';
 397                  $this->element_queue[] = new WP_HTML_Stack_Event( $token, WP_HTML_Stack_Event::POP, $provenance );
 398  
 399                  $adjusted_current_node = $this->get_adjusted_current_node();
 400  
 401                  if ( $adjusted_current_node ) {
 402                      $this->change_parsing_namespace( $adjusted_current_node->integration_node_type ? 'html' : $adjusted_current_node->namespace );
 403                  } else {
 404                      $this->change_parsing_namespace( 'html' );
 405                  }
 406              }
 407          );
 408  
 409          /*
 410           * Create this wrapper so that it's possible to pass
 411           * a private method into WP_HTML_Token classes without
 412           * exposing it to any public API.
 413           */
 414          $this->release_internal_bookmark_on_destruct = function ( string $name ): void {
 415              parent::release_bookmark( $name );
 416          };
 417      }
 418  
 419      /**
 420       * Creates a fragment processor at the current node.
 421       *
 422       * HTML Fragment parsing always happens with a context node. HTML Fragment Processors can be
 423       * instantiated with a `BODY` context node via `WP_HTML_Processor::create_fragment( $html )`.
 424       *
 425       * The context node may impact how a fragment of HTML is parsed. For example, consider the HTML
 426       * fragment `<td />Inside TD?</td>`.
 427       *
 428       * A BODY context node will produce the following tree:
 429       *
 430       *     └─#text Inside TD?
 431       *
 432       * Notice that the `<td>` tags are completely ignored.
 433       *
 434       * Compare that with an SVG context node that produces the following tree:
 435       *
 436       *     ├─svg:td
 437       *     └─#text Inside TD?
 438       *
 439       * Here, a `td` node in the `svg` namespace is created, and its self-closing flag is respected.
 440       * This is a peculiarity of parsing HTML in foreign content like SVG.
 441       *
 442       * Finally, consider the tree produced with a TABLE context node:
 443       *
 444       *     └─TBODY
 445       *       └─TR
 446       *         └─TD
 447       *           └─#text Inside TD?
 448       *
 449       * These examples demonstrate how important the context node may be when processing an HTML
 450       * fragment. Special care must be taken when processing fragments that are expected to appear
 451       * in specific contexts. SVG and TABLE are good examples, but there are others.
 452       *
 453       * @see https://html.spec.whatwg.org/multipage/parsing.html#html-fragment-parsing-algorithm
 454       *
 455       * @since 6.8.0
 456       *
 457       * @param string $html Input HTML fragment to process.
 458       * @return static|null The created processor if successful, otherwise null.
 459       */
 460  	private function create_fragment_at_current_node( string $html ) {
 461          if ( $this->get_token_type() !== '#tag' || $this->is_tag_closer() ) {
 462              _doing_it_wrong(
 463                  __METHOD__,
 464                  __( 'The context element must be a start tag.' ),
 465                  '6.8.0'
 466              );
 467              return null;
 468          }
 469  
 470          $tag_name  = $this->current_element->token->node_name;
 471          $namespace = $this->current_element->token->namespace;
 472  
 473          if ( 'html' === $namespace && self::is_void( $tag_name ) ) {
 474              _doing_it_wrong(
 475                  __METHOD__,
 476                  sprintf(
 477                      // translators: %s: A tag name like INPUT or BR.
 478                      __( 'The context element cannot be a void element, found "%s".' ),
 479                      $tag_name
 480                  ),
 481                  '6.8.0'
 482              );
 483              return null;
 484          }
 485  
 486          /*
 487           * Prevent creating fragments at nodes that require a special tokenizer state.
 488           * This is unsupported by the HTML Processor.
 489           */
 490          if (
 491              'html' === $namespace &&
 492              in_array( $tag_name, array( 'IFRAME', 'NOEMBED', 'NOFRAMES', 'SCRIPT', 'STYLE', 'TEXTAREA', 'TITLE', 'XMP', 'PLAINTEXT' ), true )
 493          ) {
 494              _doing_it_wrong(
 495                  __METHOD__,
 496                  sprintf(
 497                      // translators: %s: A tag name like IFRAME or TEXTAREA.
 498                      __( 'The context element "%s" is not supported.' ),
 499                      $tag_name
 500                  ),
 501                  '6.8.0'
 502              );
 503              return null;
 504          }
 505  
 506          $fragment_processor = new static( $html, self::CONSTRUCTOR_UNLOCK_CODE );
 507  
 508          $fragment_processor->compat_mode = $this->compat_mode;
 509  
 510          // @todo Create "fake" bookmarks for non-existent but implied nodes.
 511          $fragment_processor->bookmarks['root-node'] = new WP_HTML_Span( 0, 0 );
 512          $root_node                                  = new WP_HTML_Token(
 513              'root-node',
 514              'HTML',
 515              false
 516          );
 517          $fragment_processor->state->stack_of_open_elements->push( $root_node );
 518  
 519          $fragment_processor->bookmarks['context-node']   = new WP_HTML_Span( 0, 0 );
 520          $fragment_processor->context_node                = clone $this->current_element->token;
 521          $fragment_processor->context_node->bookmark_name = 'context-node';
 522          $fragment_processor->context_node->on_destroy    = null;
 523  
 524          $fragment_processor->breadcrumbs = array( 'HTML', $fragment_processor->context_node->node_name );
 525  
 526          if ( 'TEMPLATE' === $fragment_processor->context_node->node_name ) {
 527              $fragment_processor->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE;
 528          }
 529  
 530          $fragment_processor->reset_insertion_mode_appropriately();
 531  
 532          /*
 533           * > Set the parser's form element pointer to the nearest node to the context element that
 534           * > is a form element (going straight up the ancestor chain, and including the element
 535           * > itself, if it is a form element), if any. (If there is no such form element, the
 536           * > form element pointer keeps its initial value, null.)
 537           */
 538          foreach ( $this->state->stack_of_open_elements->walk_up() as $element ) {
 539              if ( 'FORM' === $element->node_name && 'html' === $element->namespace ) {
 540                  $fragment_processor->state->form_element                = clone $element;
 541                  $fragment_processor->state->form_element->bookmark_name = null;
 542                  $fragment_processor->state->form_element->on_destroy    = null;
 543                  break;
 544              }
 545          }
 546  
 547          $fragment_processor->state->encoding_confidence = 'irrelevant';
 548  
 549          /*
 550           * Update the parsing namespace near the end of the process.
 551           * This is important so that any push/pop from the stack of open
 552           * elements does not change the parsing namespace.
 553           */
 554          $fragment_processor->change_parsing_namespace(
 555              $this->current_element->token->integration_node_type ? 'html' : $namespace
 556          );
 557  
 558          return $fragment_processor;
 559      }
 560  
 561      /**
 562       * Stops the parser and terminates its execution when encountering unsupported markup.
 563       *
 564       * @throws WP_HTML_Unsupported_Exception Halts execution of the parser.
 565       *
 566       * @since 6.7.0
 567       *
  568       * @param string $message Explains which support is missing in order to parse the current node.
 569       */
 570  	private function bail( string $message ) {
 571          $here  = $this->bookmarks[ $this->state->current_token->bookmark_name ];
 572          $token = substr( $this->html, $here->start, $here->length );
 573  
 574          $open_elements = array();
 575          foreach ( $this->state->stack_of_open_elements->stack as $item ) {
 576              $open_elements[] = $item->node_name;
 577          }
 578  
 579          $active_formats = array();
 580          foreach ( $this->state->active_formatting_elements->walk_down() as $item ) {
 581              $active_formats[] = $item->node_name;
 582          }
 583  
 584          $this->last_error = self::ERROR_UNSUPPORTED;
 585  
 586          $this->unsupported_exception = new WP_HTML_Unsupported_Exception(
 587              $message,
 588              $this->state->current_token->node_name,
 589              $here->start,
 590              $token,
 591              $open_elements,
 592              $active_formats
 593          );
 594  
 595          throw $this->unsupported_exception;
 596      }
 597  
 598      /**
 599       * Returns the last error, if any.
 600       *
 601       * Various situations lead to parsing failure but this class will
 602       * return `false` in all those cases. To determine why something
  603       * failed it's possible to request the last error. This helps
  604       * distinguish whether a given tag couldn't be found or whether
  605       * content in the document caused the processor to give up and
  606       * abort processing.
 607       *
  608       * Example:
 609       *
 610       *     $processor = WP_HTML_Processor::create_fragment( '<template><strong><button><em><p><em>' );
 611       *     false === $processor->next_tag();
 612       *     WP_HTML_Processor::ERROR_UNSUPPORTED === $processor->get_last_error();
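            *
            * A sketch of telling the two outcomes apart, assuming `$processor` was
            * created with `create_fragment()` and that an IMG element is sought:
            *
            *     if ( ! $processor->next_tag( 'IMG' ) ) {
            *         if ( null === $processor->get_last_error() ) {
            *             // No further IMG was found before the end of the input.
            *         } else {
            *             // The processor gave up; treat the document as unprocessed.
            *         }
            *     }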
 613       *
 614       * @since 6.4.0
 615       *
 616       * @see self::ERROR_UNSUPPORTED
 617       * @see self::ERROR_EXCEEDED_MAX_BOOKMARKS
 618       *
 619       * @return string|null The last error, if one exists, otherwise null.
 620       */
 621  	public function get_last_error(): ?string {
 622          return $this->last_error;
 623      }
 624  
 625      /**
 626       * Returns context for why the parser aborted due to unsupported HTML, if it did.
 627       *
 628       * This is meant for debugging purposes, not for production use.
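            *
            * A debugging sketch, assuming `$processor` has already stopped on
            * unsupported markup. `getMessage()` is inherited from PHP's `Exception`:
            *
            *     $exception = $processor->get_unsupported_exception();
            *     if ( null !== $exception ) {
            *         error_log( 'HTML Processor bailed: ' . $exception->getMessage() );
            *     }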
 629       *
 630       * @since 6.7.0
 631       *
 632       * @see self::$unsupported_exception
 633       *
 634       * @return WP_HTML_Unsupported_Exception|null
 635       */
 636  	public function get_unsupported_exception() {
 637          return $this->unsupported_exception;
 638      }
 639  
 640      /**
 641       * Finds the next tag matching the $query.
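            *
            * A few query shapes, sketched with made-up tag and class names. Each
            * call is an independent illustration, since every call advances the
            * processor through the document:
            *
            *     // A string is shorthand for a single-entry breadcrumbs query.
            *     $processor->next_tag( 'IMG' );
            *
            *     // Without breadcrumbs, match on tag name and class name.
            *     $processor->next_tag( array( 'tag_name' => 'img', 'class_name' => 'wide' ) );
            *
            *     // The second element that is a direct child of a FIGURE.
            *     $processor->next_tag( array( 'breadcrumbs' => array( 'FIGURE', '*' ), 'match_offset' => 2 ) );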
 642       *
 643       * @todo Support matching the class name and tag name.
 644       *
 645       * @since 6.4.0
 646       * @since 6.6.0 Visits all tokens, including virtual ones.
 647       *
 648       * @throws Exception When unable to allocate a bookmark for the next token in the input HTML document.
 649       *
 650       * @param array|string|null $query {
 651       *     Optional. Which tag name to find, having which class, etc. Default is to find any tag.
 652       *
 653       *     @type string|null $tag_name     Which tag to find, or `null` for "any tag."
 654       *     @type string      $tag_closers  'visit' to pause at tag closers, 'skip' or unset to only visit openers.
 655       *     @type int|null    $match_offset Find the Nth tag matching all search criteria.
 656       *                                     1 for "first" tag, 3 for "third," etc.
 657       *                                     Defaults to first tag.
 658       *     @type string|null $class_name   Tag must contain this whole class name to match.
 659       *     @type string[]    $breadcrumbs  DOM sub-path at which element is found, e.g. `array( 'FIGURE', 'IMG' )`.
 660       *                                     May also contain the wildcard `*` which matches a single element, e.g. `array( 'SECTION', '*' )`.
 661       * }
 662       * @return bool Whether a tag was matched.
 663       */
 664  	public function next_tag( $query = null ): bool {
 665          $visit_closers = isset( $query['tag_closers'] ) && 'visit' === $query['tag_closers'];
 666  
 667          if ( null === $query ) {
 668              while ( $this->next_token() ) {
 669                  if ( '#tag' !== $this->get_token_type() ) {
 670                      continue;
 671                  }
 672  
 673                  if ( ! $this->is_tag_closer() || $visit_closers ) {
 674                      return true;
 675                  }
 676              }
 677  
 678              return false;
 679          }
 680  
 681          if ( is_string( $query ) ) {
 682              $query = array( 'breadcrumbs' => array( $query ) );
 683          }
 684  
 685          if ( ! is_array( $query ) ) {
 686              _doing_it_wrong(
 687                  __METHOD__,
 688                  __( 'Please pass a query array to this function.' ),
 689                  '6.4.0'
 690              );
 691              return false;
 692          }
 693  
 694          if ( isset( $query['tag_name'] ) ) {
 695              $query['tag_name'] = strtoupper( $query['tag_name'] );
 696          }
 697  
 698          $needs_class = ( isset( $query['class_name'] ) && is_string( $query['class_name'] ) )
 699              ? $query['class_name']
 700              : null;
 701  
 702          if ( ! ( array_key_exists( 'breadcrumbs', $query ) && is_array( $query['breadcrumbs'] ) ) ) {
 703              while ( $this->next_token() ) {
 704                  if ( '#tag' !== $this->get_token_type() ) {
 705                      continue;
 706                  }
 707  
 708                  if ( isset( $query['tag_name'] ) && $query['tag_name'] !== $this->get_token_name() ) {
 709                      continue;
 710                  }
 711  
 712                  if ( isset( $needs_class ) && ! $this->has_class( $needs_class ) ) {
 713                      continue;
 714                  }
 715  
 716                  if ( ! $this->is_tag_closer() || $visit_closers ) {
 717                      return true;
 718                  }
 719              }
 720  
 721              return false;
 722          }
 723  
 724          $breadcrumbs  = $query['breadcrumbs'];
 725          $match_offset = isset( $query['match_offset'] ) ? (int) $query['match_offset'] : 1;
 726  
 727          while ( $match_offset > 0 && $this->next_token() ) {
 728              if ( '#tag' !== $this->get_token_type() || $this->is_tag_closer() ) {
 729                  continue;
 730              }
 731  
 732              if ( isset( $needs_class ) && ! $this->has_class( $needs_class ) ) {
 733                  continue;
 734              }
 735  
 736              if ( $this->matches_breadcrumbs( $breadcrumbs ) && 0 === --$match_offset ) {
 737                  return true;
 738              }
 739          }
 740  
 741          return false;
 742      }
 743  
 744      /**
 745       * Finds the next token in the HTML document.
 746       *
 747       * This doesn't currently have a way to represent non-tags and doesn't process
 748       * semantic rules for text nodes. For access to the raw tokens consider using
 749       * WP_HTML_Tag_Processor instead.
 750       *
 751       * @since 6.5.0 Added for internal support; do not use.
 752       * @since 6.7.2 Refactored so subclasses may extend.
 753       *
 754       * @return bool Whether a token was parsed.
 755       */
 756  	public function next_token(): bool {
 757          return $this->next_visitable_token();
 758      }
 759  
 760      /**
 761       * Ensures internal accounting is maintained for HTML semantic rules while
 762       * the underlying Tag Processor class is seeking to a bookmark.
 763       *
 764       * This doesn't currently have a way to represent non-tags and doesn't process
 765       * semantic rules for text nodes. For access to the raw tokens consider using
 766       * WP_HTML_Tag_Processor instead.
 767       *
 768       * Note that this method may call itself recursively. This is why it is not
 769       * implemented as {@see WP_HTML_Processor::next_token()}, which instead calls
 770       * this method similarly to how {@see WP_HTML_Tag_Processor::next_token()}
 771       * calls the {@see WP_HTML_Tag_Processor::base_class_next_token()} method.
 772       *
 773       * @since 6.7.2 Added for internal support.
 774       *
 775       * @access private
 776       *
 777       * @return bool
 778       */
 779  	private function next_visitable_token(): bool {
 780          $this->current_element = null;
 781  
 782          if ( isset( $this->last_error ) ) {
 783              return false;
 784          }
 785  
 786          /*
 787           * Prime the events if there are none.
 788           *
 789           * @todo In some cases, probably related to the adoption agency
 790           *       algorithm, this call to step() doesn't create any new
 791           *       events. Calling it again creates them. Figure out why
 792           *       this is and if it's inherent or if it's a bug. Looping
 793           *       until there are events or until there are no more
 794           *       tokens works in the meantime and isn't obviously wrong.
 795           */
 796          if ( empty( $this->element_queue ) && $this->step() ) {
 797              return $this->next_visitable_token();
 798          }
 799  
 800          // Process the next event on the queue.
 801          $this->current_element = array_shift( $this->element_queue );
 802          if ( ! isset( $this->current_element ) ) {
 803              // There are no tokens left, so close all remaining open elements.
 804              while ( $this->state->stack_of_open_elements->pop() ) {
 805                  continue;
 806              }
 807  
 808              return empty( $this->element_queue ) ? false : $this->next_visitable_token();
 809          }
 810  
 811          $is_pop = WP_HTML_Stack_Event::POP === $this->current_element->operation;
 812  
 813          /*
 814           * The root node only exists in the fragment parser, and closing it
 815           * indicates that the parse is complete. Stop before popping it from
 816           * the breadcrumbs.
 817           */
 818          if ( 'root-node' === $this->current_element->token->bookmark_name ) {
 819              return $this->next_visitable_token();
 820          }
 821  
 822          // Adjust the breadcrumbs for this event.
 823          if ( $is_pop ) {
 824              array_pop( $this->breadcrumbs );
 825          } else {
 826              $this->breadcrumbs[] = $this->current_element->token->node_name;
 827          }
 828  
 829          // Avoid sending close events for elements which don't expect a closing.
 830          if ( $is_pop && ! $this->expects_closer( $this->current_element->token ) ) {
 831              return $this->next_visitable_token();
 832          }
 833  
 834          return true;
 835      }
 836  
 837      /**
 838       * Indicates if the current tag token is a tag closer.
 839       *
 840       * Example:
 841       *
 842       *     $p = WP_HTML_Processor::create_fragment( '<div></div>' );
 843       *     $p->next_tag( array( 'tag_name' => 'div', 'tag_closers' => 'visit' ) );
 844       *     $p->is_tag_closer() === false;
 845       *
 846       *     $p->next_tag( array( 'tag_name' => 'div', 'tag_closers' => 'visit' ) );
 847       *     $p->is_tag_closer() === true;
 848       *
 849       * @since 6.6.0 Subclassed for HTML Processor.
 850       *
 851       * @return bool Whether the current tag is a tag closer.
 852       */
 853  	public function is_tag_closer(): bool {
 854          return $this->is_virtual()
 855              ? ( WP_HTML_Stack_Event::POP === $this->current_element->operation && '#tag' === $this->get_token_type() )
 856              : parent::is_tag_closer();
 857      }
 858  
 859      /**
 860       * Indicates if the currently-matched token is virtual, created by a stack operation
 861       * while processing HTML, rather than a token found in the HTML text itself.
 862       *
 863       * @since 6.6.0
 864       *
 865       * @return bool Whether the current token is virtual.
 866       */
 867  	private function is_virtual(): bool {
 868          return (
 869              isset( $this->current_element->provenance ) &&
 870              'virtual' === $this->current_element->provenance
 871          );
 872      }
 873  
 874      /**
 875       * Indicates if the currently-matched tag matches the given breadcrumbs.
 876       *
 877       * A "*" is a single-tag wildcard: it matches exactly one tag of any name, never zero tags.
 878       *
 879       * At some point this function _may_ support a `**` syntax for matching any number
 880       * of unspecified tags in the breadcrumb stack. This has been intentionally left
 881       * out, however, to keep this function simple and to avoid introducing backtracking,
 882       * which could open up surprising performance breakdowns.
 883       *
 884       * Example:
 885       *
 886       *     $processor = WP_HTML_Processor::create_fragment( '<div><span><figure><img></figure></span></div>' );
 887       *     $processor->next_tag( 'img' );
 888       *     true  === $processor->matches_breadcrumbs( array( 'figure', 'img' ) );
 889       *     true  === $processor->matches_breadcrumbs( array( 'span', 'figure', 'img' ) );
 890       *     false === $processor->matches_breadcrumbs( array( 'span', 'img' ) );
 891       *     true  === $processor->matches_breadcrumbs( array( 'span', '*', 'img' ) );
 892       *
 893       * @since 6.4.0
 894       *
 895       * @param string[] $breadcrumbs DOM sub-path at which element is found, e.g. `array( 'FIGURE', 'IMG' )`.
 896       *                              May also contain the wildcard `*` which matches a single element, e.g. `array( 'SECTION', '*' )`.
 897       * @return bool Whether the currently-matched tag is found at the given nested structure.
 898       */
 899  	public function matches_breadcrumbs( $breadcrumbs ): bool {
 900          // Everything matches when there are zero constraints.
 901          if ( 0 === count( $breadcrumbs ) ) {
 902              return true;
 903          }
 904  
 905          // Start at the last crumb.
 906          $crumb = end( $breadcrumbs );
 907  
 908          if ( '*' !== $crumb && $this->get_tag() !== strtoupper( $crumb ) ) {
 909              return false;
 910          }
 911  
 912          for ( $i = count( $this->breadcrumbs ) - 1; $i >= 0; $i-- ) {
 913              $node  = $this->breadcrumbs[ $i ];
 914              $crumb = strtoupper( current( $breadcrumbs ) );
 915  
 916              if ( '*' !== $crumb && $node !== $crumb ) {
 917                  return false;
 918              }
 919  
 920              if ( false === prev( $breadcrumbs ) ) {
 921                  return true;
 922              }
 923          }
 924  
 925          return false;
 926      }
 927  
 928      /**
 929       * Indicates if the currently-matched node expects a closing
 930       * token, or if it will self-close on the next step.
 931       *
 932       * Most HTML elements expect a closer, such as a P element or
 933       * a DIV element. Others, like an IMG element, are void and don't
 934       * have a closing tag. Special elements, such as SCRIPT and STYLE,
 935       * are treated just like void tags. Text nodes and self-closing
 936       * foreign content will also act just like a void tag, immediately
 937       * closing as soon as the processor advances to the next token.
 938       *
 939       * @since 6.6.0
 940       *
 941       * @param WP_HTML_Token|null $node Optional. Node to examine, if provided.
 942       *                                 Default is to examine current node.
 943       * @return bool|null Whether to expect a closer for the currently-matched node,
 944       *                   or `null` if not matched on any token.
 945       */
 946  	public function expects_closer( ?WP_HTML_Token $node = null ): ?bool {
 947          $token_name = $node->node_name ?? $this->get_token_name();
 948  
 949          if ( ! isset( $token_name ) ) {
 950              return null;
 951          }
 952  
 953          $token_namespace        = $node->namespace ?? $this->get_namespace();
 954          $token_has_self_closing = $node->has_self_closing_flag ?? $this->has_self_closing_flag();
 955  
 956          return ! (
 957              // Comments, text nodes, and other atomic tokens.
 958              '#' === $token_name[0] ||
 959              // Doctype declarations.
 960              'html' === $token_name ||
 961              // Void elements.
 962              ( 'html' === $token_namespace && self::is_void( $token_name ) ) ||
 963              // Special atomic elements.
 964              ( 'html' === $token_namespace && in_array( $token_name, array( 'IFRAME', 'NOEMBED', 'NOFRAMES', 'SCRIPT', 'STYLE', 'TEXTAREA', 'TITLE', 'XMP' ), true ) ) ||
 965              // Self-closing elements in foreign content.
 966              ( 'html' !== $token_namespace && $token_has_self_closing )
 967          );
 968      }
 969  
 970      /**
 971       * Steps through the HTML document and stops at the next tag, if any.
 972       *
 973       * @since 6.4.0
 974       *
 975       * @throws Exception When unable to allocate a bookmark for the next token in the input HTML document.
 976       *
 977       * @see self::PROCESS_NEXT_NODE
 978       * @see self::REPROCESS_CURRENT_NODE
 979       *
 980       * @param string $node_to_process Whether to parse the next node or reprocess the current node.
 981       * @return bool Whether a tag was matched.
 982       */
 983  	public function step( $node_to_process = self::PROCESS_NEXT_NODE ): bool {
 984          // Refuse to proceed if there was a previous error.
 985          if ( null !== $this->last_error ) {
 986              return false;
 987          }
 988  
 989          if ( self::REPROCESS_CURRENT_NODE !== $node_to_process ) {
 990              /*
 991               * Void elements still hop onto the stack of open elements even though
 992               * there's no corresponding closing tag. This is important for managing
 993               * stack-based operations such as "navigate to parent node" or checking
 994               * on an element's breadcrumbs.
 995               *
 996               * When moving on to the next node, therefore, if the bottom-most node on
 997               * the stack does not expect a closer (e.g. a void element), it must be popped.
 998               */
 999              $top_node = $this->state->stack_of_open_elements->current_node();
1000              if ( isset( $top_node ) && ! $this->expects_closer( $top_node ) ) {
1001                  $this->state->stack_of_open_elements->pop();
1002              }
1003          }
1004  
1005          if ( self::PROCESS_NEXT_NODE === $node_to_process ) {
1006              parent::next_token();
1007              if ( WP_HTML_Tag_Processor::STATE_TEXT_NODE === $this->parser_state ) {
1008                  parent::subdivide_text_appropriately();
1009              }
1010          }
1011  
1012          // Finish stepping when there are no more tokens in the document.
1013          if (
1014              WP_HTML_Tag_Processor::STATE_INCOMPLETE_INPUT === $this->parser_state ||
1015              WP_HTML_Tag_Processor::STATE_COMPLETE === $this->parser_state
1016          ) {
1017              return false;
1018          }
1019  
1020          $adjusted_current_node = $this->get_adjusted_current_node();
1021          $is_closer             = $this->is_tag_closer();
1022          $is_start_tag          = WP_HTML_Tag_Processor::STATE_MATCHED_TAG === $this->parser_state && ! $is_closer;
1023          $token_name            = $this->get_token_name();
1024  
1025          if ( self::REPROCESS_CURRENT_NODE !== $node_to_process ) {
1026              $this->state->current_token = new WP_HTML_Token(
1027                  $this->bookmark_token(),
1028                  $token_name,
1029                  $this->has_self_closing_flag(),
1030                  $this->release_internal_bookmark_on_destruct
1031              );
1032          }
1033  
1034          $parse_in_current_insertion_mode = (
1035              0 === $this->state->stack_of_open_elements->count() ||
1036              'html' === $adjusted_current_node->namespace ||
1037              (
1038                  'math' === $adjusted_current_node->integration_node_type &&
1039                  (
1040                      ( $is_start_tag && ! in_array( $token_name, array( 'MGLYPH', 'MALIGNMARK' ), true ) ) ||
1041                      '#text' === $token_name
1042                  )
1043              ) ||
1044              (
1045                  'math' === $adjusted_current_node->namespace &&
1046                  'ANNOTATION-XML' === $adjusted_current_node->node_name &&
1047                  $is_start_tag && 'SVG' === $token_name
1048              ) ||
1049              (
1050                  'html' === $adjusted_current_node->integration_node_type &&
1051                  ( $is_start_tag || '#text' === $token_name )
1052              )
1053          );
1054  
1055          try {
1056              if ( ! $parse_in_current_insertion_mode ) {
1057                  return $this->step_in_foreign_content();
1058              }
1059  
1060              switch ( $this->state->insertion_mode ) {
1061                  case WP_HTML_Processor_State::INSERTION_MODE_INITIAL:
1062                      return $this->step_initial();
1063  
1064                  case WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HTML:
1065                      return $this->step_before_html();
1066  
1067                  case WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HEAD:
1068                      return $this->step_before_head();
1069  
1070                  case WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD:
1071                      return $this->step_in_head();
1072  
1073                  case WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD_NOSCRIPT:
1074                      return $this->step_in_head_noscript();
1075  
1076                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_HEAD:
1077                      return $this->step_after_head();
1078  
1079                  case WP_HTML_Processor_State::INSERTION_MODE_IN_BODY:
1080                      return $this->step_in_body();
1081  
1082                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE:
1083                      return $this->step_in_table();
1084  
1085                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_TEXT:
1086                      return $this->step_in_table_text();
1087  
1088                  case WP_HTML_Processor_State::INSERTION_MODE_IN_CAPTION:
1089                      return $this->step_in_caption();
1090  
1091                  case WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP:
1092                      return $this->step_in_column_group();
1093  
1094                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY:
1095                      return $this->step_in_table_body();
1096  
1097                  case WP_HTML_Processor_State::INSERTION_MODE_IN_ROW:
1098                      return $this->step_in_row();
1099  
1100                  case WP_HTML_Processor_State::INSERTION_MODE_IN_CELL:
1101                      return $this->step_in_cell();
1102  
1103                  case WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT:
1104                      return $this->step_in_select();
1105  
1106                  case WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT_IN_TABLE:
1107                      return $this->step_in_select_in_table();
1108  
1109                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE:
1110                      return $this->step_in_template();
1111  
1112                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_BODY:
1113                      return $this->step_after_body();
1114  
1115                  case WP_HTML_Processor_State::INSERTION_MODE_IN_FRAMESET:
1116                      return $this->step_in_frameset();
1117  
1118                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_FRAMESET:
1119                      return $this->step_after_frameset();
1120  
1121                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_BODY:
1122                      return $this->step_after_after_body();
1123  
1124                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_FRAMESET:
1125                      return $this->step_after_after_frameset();
1126  
1127                  // This should be unreachable but PHP doesn't have total type checking on switch.
1128                  default:
1129                      $this->bail( "Unaware of the requested parsing mode: '{$this->state->insertion_mode}'." );
1130              }
1131          } catch ( WP_HTML_Unsupported_Exception $e ) {
1132              /*
1133               * Exceptions are used in this class to escape deep call stacks that
1134               * otherwise might involve messier calling and return conventions.
1135               */
1136              return false;
1137          }
1138      }
1139  
1140      /**
1141       * Computes the HTML breadcrumbs for the currently-matched node, if matched.
1142       *
1143       * Breadcrumbs start at the outermost parent and descend toward the matched element.
1144       * They always include the entire path from the root HTML node to the matched element.
1145       *
1146       * Example:
1147       *
1148       *     $processor = WP_HTML_Processor::create_fragment( '<p><strong><em><img></em></strong></p>' );
1149       *     $processor->next_tag( 'IMG' );
1150       *     $processor->get_breadcrumbs() === array( 'HTML', 'BODY', 'P', 'STRONG', 'EM', 'IMG' );
1151       *
1152       * @since 6.4.0
1153       *
1154       * @return string[] Array of tag names representing path to matched node.
1155       */
1156  	public function get_breadcrumbs(): array {
1157          return $this->breadcrumbs;
1158      }
1159  
1160      /**
1161       * Returns the nesting depth of the current location in the document.
1162       *
1163       * Example:
1164       *
1165       *     $processor = WP_HTML_Processor::create_fragment( '<div><p></p></div>' );
1166       *     // The processor starts in the BODY context, meaning it has depth from the start: HTML > BODY.
1167       *     2 === $processor->get_current_depth();
1168       *
1169       *     // Opening the DIV element increases the depth.
1170       *     $processor->next_token();
1171       *     3 === $processor->get_current_depth();
1172       *
1173       *     // Opening the P element increases the depth.
1174       *     $processor->next_token();
1175       *     4 === $processor->get_current_depth();
1176       *
1177       *     // The P element is closed during `next_token()` so the depth is decreased to reflect that.
1178       *     $processor->next_token();
1179       *     3 === $processor->get_current_depth();
1180       *
1181       * @since 6.6.0
1182       *
1183       * @return int Nesting-depth of current location in the document.
1184       */
1185  	public function get_current_depth(): int {
1186          return count( $this->breadcrumbs );
1187      }
1188  
1189      /**
1190       * Normalizes an HTML fragment by serializing it.
1191       *
1192       * This method assumes that the given HTML snippet is found in BODY context.
1193       * For normalizing full documents or fragments found in other contexts, create
1194       * a new processor using {@see WP_HTML_Processor::create_fragment} or
1195       * {@see WP_HTML_Processor::create_full_parser} and call {@see WP_HTML_Processor::serialize}
1196       * on the created instances.
1197       *
1198       * Many aspects of an input HTML fragment may be changed during normalization.
1199       *
1200       *  - Attribute values will be double-quoted.
1201       *  - Duplicate attributes will be removed.
1202       *  - Omitted tags will be added.
1203       *  - Tag and attribute names will be lower-cased,
1204       *    except for specific SVG and MathML tags or attributes.
1205       *  - Text will be re-encoded, null bytes handled,
1206       *    and invalid UTF-8 replaced with U+FFFD.
1207       *  - Any incomplete syntax trailing at the end will be omitted,
1208       *    for example, an unclosed comment opener will be removed.
1209       *
1210       * Example:
1211       *
1212       *     echo WP_HTML_Processor::normalize( '<a href=#anchor v=5 href="/" enabled>One</a another v=5><!--' );
1213       *     // <a href="#anchor" v="5" enabled>One</a>
1214       *
1215       *     echo WP_HTML_Processor::normalize( '<div></p>fun<table><td>cell</div>' );
1216       *     // <div><p></p>fun<table><tbody><tr><td>cell</td></tr></tbody></table></div>
1217       *
1218       *     echo WP_HTML_Processor::normalize( '<![CDATA[invalid comment]]> syntax < <> "oddities"' );
1219       *     // <!--[CDATA[invalid comment]]--> syntax &lt; &lt;&gt; &quot;oddities&quot;
1220       *
1221       * @since 6.7.0
1222       *
1223       * @param string $html Input HTML to normalize.
1224       *
1225       * @return string|null Normalized output, or `null` if unable to normalize.
1226       */
1227  	public static function normalize( string $html ): ?string {
1228          return static::create_fragment( $html )->serialize();
1229      }
1230  
1231      /**
1232       * Returns normalized HTML for a fragment by serializing it.
1233       *
1234       * This differs from {@see WP_HTML_Processor::normalize} in that it starts with
1235       * a specific HTML Processor, which _must_ not have already started scanning;
1236       * it must be in the initial ready state and will be in the completed state once
1237       * serialization is complete.
1238       *
1239       * Many aspects of an input HTML fragment may be changed during normalization.
1240       *
1241       *  - Attribute values will be double-quoted.
1242       *  - Duplicate attributes will be removed.
1243       *  - Omitted tags will be added.
1244       *  - Tag and attribute names will be lower-cased,
1245       *    except for specific SVG and MathML tags or attributes.
1246       *  - Text will be re-encoded, null bytes handled,
1247       *    and invalid UTF-8 replaced with U+FFFD.
1248       *  - Any incomplete syntax trailing at the end will be omitted,
1249       *    for example, an unclosed comment opener will be removed.
1250       *
1251       * Example:
1252       *
1253       *     $processor = WP_HTML_Processor::create_fragment( '<a href=#anchor v=5 href="/" enabled>One</a another v=5><!--' );
1254       *     echo $processor->serialize();
1255       *     // <a href="#anchor" v="5" enabled>One</a>
1256       *
1257       *     $processor = WP_HTML_Processor::create_fragment( '<div></p>fun<table><td>cell</div>' );
1258       *     echo $processor->serialize();
1259       *     // <div><p></p>fun<table><tbody><tr><td>cell</td></tr></tbody></table></div>
1260       *
1261       *     $processor = WP_HTML_Processor::create_fragment( '<![CDATA[invalid comment]]> syntax < <> "oddities"' );
1262       *     echo $processor->serialize();
1263       *     // <!--[CDATA[invalid comment]]--> syntax &lt; &lt;&gt; &quot;oddities&quot;
1264       *
1265       * @since 6.7.0
1266       *
1267       * @return string|null Normalized HTML markup represented by processor,
1268       *                     or `null` if unable to generate serialization.
1269       */
1270  	public function serialize(): ?string {
1271          if ( WP_HTML_Tag_Processor::STATE_READY !== $this->parser_state ) {
1272              wp_trigger_error(
1273                  __METHOD__,
1274                  'An HTML Processor which has already started processing cannot serialize its contents. Serialize immediately after creating the instance.',
1275                  E_USER_WARNING
1276              );
1277              return null;
1278          }
1279  
1280          $html = '';
1281          while ( $this->next_token() ) {
1282              $html .= $this->serialize_token();
1283          }
1284  
1285          if ( null !== $this->get_last_error() ) {
1286              wp_trigger_error(
1287                  __METHOD__,
1288                  "Cannot serialize HTML Processor with parsing error: {$this->get_last_error()}.",
1289                  E_USER_WARNING
1290              );
1291              return null;
1292          }
1293  
1294          return $html;
1295      }
1296  
1297      /**
1298       * Serializes the currently-matched token.
1299       *
1300       * This method produces a fully-normative HTML string for the currently-matched token,
1301       * if able. If not matched at any token or if the token doesn't correspond to any HTML
1302       * it will return an empty string (for example, presumptuous end tags are ignored).
1303       *
1304       * @see static::serialize()
1305       *
1306       * @since 6.7.0
1307       * @since 6.9.0 Converted from protected to public method.
1308       *
1309       * @return string Serialization of token, or empty string if no serialization exists.
1310       */
1311  	public function serialize_token(): string {
1312          $html       = '';
1313          $token_type = $this->get_token_type();
1314  
1315          switch ( $token_type ) {
1316              case '#doctype':
1317                  $doctype = $this->get_doctype_info();
1318                  if ( null === $doctype ) {
1319                      break;
1320                  }
1321  
1322                  $html .= '<!DOCTYPE';
1323  
1324                  if ( $doctype->name ) {
1325                      $html .= " {$doctype->name}";
1326                  }
1327  
1328                  if ( null !== $doctype->public_identifier ) {
1329                      $quote = str_contains( $doctype->public_identifier, '"' ) ? "'" : '"';
1330                      $html .= " PUBLIC {$quote}{$doctype->public_identifier}{$quote}";
1331                  }
1332                  if ( null !== $doctype->system_identifier ) {
1333                      if ( null === $doctype->public_identifier ) {
1334                          $html .= ' SYSTEM';
1335                      }
1336                      $quote = str_contains( $doctype->system_identifier, '"' ) ? "'" : '"';
1337                      $html .= " {$quote}{$doctype->system_identifier}{$quote}";
1338                  }
1339  
1340                  $html .= '>';
1341                  break;
1342  
1343              case '#text':
1344                  $html .= htmlspecialchars( $this->get_modifiable_text(), ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8' );
1345                  break;
1346  
1347              // Unlike the `<>` which is interpreted as plaintext, this is ignored entirely.
1348              case '#presumptuous-tag':
1349                  break;
1350  
1351              case '#funky-comment':
1352              case '#comment':
1353                  $html .= "<!--{$this->get_full_comment_text()}-->";
1354                  break;
1355  
1356              case '#cdata-section':
1357                  $html .= "<![CDATA[{$this->get_modifiable_text()}]]>";
1358                  break;
1359          }
1360  
1361          if ( '#tag' !== $token_type ) {
1362              return $html;
1363          }
1364  
1365          $tag_name       = str_replace( "\x00", "\u{FFFD}", $this->get_tag() );
1366          $in_html        = 'html' === $this->get_namespace();
1367          $qualified_name = $in_html ? strtolower( $tag_name ) : $this->get_qualified_tag_name();
1368  
1369          if ( $this->is_tag_closer() ) {
1370              $html .= "</{$qualified_name}>";
1371              return $html;
1372          }
1373  
1374          $attribute_names = $this->get_attribute_names_with_prefix( '' );
1375          if ( ! isset( $attribute_names ) ) {
1376              $html .= "<{$qualified_name}>";
1377              return $html;
1378          }
1379  
1380          $html .= "<{$qualified_name}";
1381          foreach ( $attribute_names as $attribute_name ) {
1382              $html .= " {$this->get_qualified_attribute_name( $attribute_name )}";
1383              $value = $this->get_attribute( $attribute_name );
1384  
1385              if ( is_string( $value ) ) {
1386                  $html .= '="' . htmlspecialchars( $value, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8' ) . '"';
1387              }
1388  
1389              $html = str_replace( "\x00", "\u{FFFD}", $html );
1390          }
1391  
1392          if ( ! $in_html && $this->has_self_closing_flag() ) {
1393              $html .= ' /';
1394          }
1395  
1396          $html .= '>';
1397  
1398          // Flush out self-contained elements.
1399          if ( $in_html && in_array( $tag_name, array( 'IFRAME', 'NOEMBED', 'NOFRAMES', 'SCRIPT', 'STYLE', 'TEXTAREA', 'TITLE', 'XMP' ), true ) ) {
1400              $text = $this->get_modifiable_text();
1401  
1402              switch ( $tag_name ) {
1403                  case 'IFRAME':
1404                  case 'NOEMBED':
1405                  case 'NOFRAMES':
1406                      $text = '';
1407                      break;
1408  
1409                  case 'SCRIPT':
1410                  case 'STYLE':
1411                      break;
1412  
1413                  default:
1414                      $text = htmlspecialchars( $text, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8' );
1415              }
1416  
1417              $html .= "{$text}</{$qualified_name}>";
1418          }
1419  
1420          return $html;
1421      }
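
          /*
           * Illustrative usage sketch; this comment is not part of the class API.
           * The serialization above lowercases tag names in the HTML namespace and
           * wraps string attribute values in double quotes. Assuming WordPress 6.7+
           * is loaded, the public static `normalize()` helper is one way to observe
           * the effect on a whole fragment:
           *
           *     $normalized = WP_HTML_Processor::normalize( '<DIV Class=wide><IMG src=a.png>' );
           *     if ( null === $normalized ) {
           *         // The input contained markup the HTML Processor does not support.
           *     }
           */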
1422  
1423      /**
1424       * Parses next element in the 'initial' insertion mode.
1425       *
1426       * This internal function performs the 'initial' insertion mode
1427       * logic for the generalized WP_HTML_Processor::step() function.
1428       *
1429       * @since 6.7.0
1430       *
1431       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
1432       *
1433       * @see https://html.spec.whatwg.org/#the-initial-insertion-mode
1434       * @see WP_HTML_Processor::step
1435       *
1436       * @return bool Whether an element was found.
1437       */
1438  	private function step_initial(): bool {
1439          $token_name = $this->get_token_name();
1440          $token_type = $this->get_token_type();
1441          $op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
1442          $op         = "{$op_sigil}{$token_name}";
1443  
1444          switch ( $op ) {
1445              /*
1446               * > A character token that is one of U+0009 CHARACTER TABULATION,
1447               * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
1448               * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
1449               *
1450               * Ignore the token.
1451               */
1452              case '#text':
1453                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
1454                      return $this->step();
1455                  }
1456                  goto initial_anything_else;
1457                  break;
1458  
1459              /*
1460               * > A comment token
1461               */
1462              case '#comment':
1463              case '#funky-comment':
1464              case '#presumptuous-tag':
1465                  $this->insert_html_element( $this->state->current_token );
1466                  return true;
1467  
1468              /*
1469               * > A DOCTYPE token
1470               */
1471              case 'html':
1472                  $doctype = $this->get_doctype_info();
1473                  if ( null !== $doctype && 'quirks' === $doctype->indicated_compatability_mode ) {
1474                      $this->compat_mode = WP_HTML_Tag_Processor::QUIRKS_MODE;
1475                  }
1476  
1477                  /*
1478                   * > Then, switch the insertion mode to "before html".
1479                   */
1480                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HTML;
1481                  $this->insert_html_element( $this->state->current_token );
1482                  return true;
1483          }
1484  
1485          /*
1486           * > Anything else
1487           */
1488          initial_anything_else:
1489          $this->compat_mode           = WP_HTML_Tag_Processor::QUIRKS_MODE;
1490          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HTML;
1491          return $this->step( self::REPROCESS_CURRENT_NODE );
1492      }
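
          /*
           * Illustrative sketch; not part of the class. Each `step_*()` method folds
           * the token type, token name, and closer flag into a single `$op` string
           * so that one `switch` can dispatch on it. The standalone helper below is
           * hypothetical (`example_op_for_token` does not exist in WordPress) and
           * only mirrors that convention:
           *
           *     function example_op_for_token( string $token_type, string $token_name, bool $is_closer ): string {
           *         // Only tags receive a sigil: '+' for openers and '-' for closers.
           *         $op_sigil = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
           *         return "{$op_sigil}{$token_name}";
           *     }
           *
           *     example_op_for_token( '#tag', 'HEAD', false );         // '+HEAD'
           *     example_op_for_token( '#tag', 'HEAD', true );          // '-HEAD'
           *     example_op_for_token( '#comment', '#comment', false ); // '#comment'
           *     example_op_for_token( '#doctype', 'html', false );     // 'html'
           */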
1493  
1494      /**
1495       * Parses next element in the 'before html' insertion mode.
1496       *
1497       * This internal function performs the 'before html' insertion mode
1498       * logic for the generalized WP_HTML_Processor::step() function.
1499       *
1500       * @since 6.7.0
1501       *
1502       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
1503       *
1504       * @see https://html.spec.whatwg.org/#the-before-html-insertion-mode
1505       * @see WP_HTML_Processor::step
1506       *
1507       * @return bool Whether an element was found.
1508       */
1509  	private function step_before_html(): bool {
1510          $token_name = $this->get_token_name();
1511          $token_type = $this->get_token_type();
1512          $is_closer  = parent::is_tag_closer();
1513          $op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
1514          $op         = "{$op_sigil}{$token_name}";
1515  
1516          switch ( $op ) {
1517              /*
1518               * > A DOCTYPE token
1519               */
1520              case 'html':
1521                  // Parse error: ignore the token.
1522                  return $this->step();
1523  
1524              /*
1525               * > A comment token
1526               */
1527              case '#comment':
1528              case '#funky-comment':
1529              case '#presumptuous-tag':
1530                  $this->insert_html_element( $this->state->current_token );
1531                  return true;
1532  
1533              /*
1534               * > A character token that is one of U+0009 CHARACTER TABULATION,
1535               * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
1536               * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
1537               *
1538               * Ignore the token.
1539               */
1540              case '#text':
1541                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
1542                      return $this->step();
1543                  }
1544                  goto before_html_anything_else;
1545                  break;
1546  
1547              /*
1548               * > A start tag whose tag name is "html"
1549               */
1550              case '+HTML':
1551                  $this->insert_html_element( $this->state->current_token );
1552                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HEAD;
1553                  return true;
1554  
1555              /*
1556               * > An end tag whose tag name is one of: "head", "body", "html", "br"
1557               *
1558               * Closing BR tags are always reported by the Tag Processor as opening tags.
1559               */
1560              case '-HEAD':
1561              case '-BODY':
1562              case '-HTML':
1563                  /*
1564                   * > Act as described in the "anything else" entry below.
1565                   */
1566                  goto before_html_anything_else;
1567                  break;
1568          }
1569  
1570          /*
1571           * > Any other end tag
1572           */
1573          if ( $is_closer ) {
1574              // Parse error: ignore the token.
1575              return $this->step();
1576          }
1577  
1578          /*
1579           * > Anything else.
1580           *
1581           * > Create an html element whose node document is the Document object.
1582           * > Append it to the Document object. Put this element in the stack of open elements.
1583           * > Switch the insertion mode to "before head", then reprocess the token.
1584           */
1585          before_html_anything_else:
1586          $this->insert_virtual_node( 'HTML' );
1587          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HEAD;
1588          return $this->step( self::REPROCESS_CURRENT_NODE );
1589      }
1590  
1591      /**
1592       * Parses next element in the 'before head' insertion mode.
1593       *
1594       * This internal function performs the 'before head' insertion mode
1595       * logic for the generalized WP_HTML_Processor::step() function.
1596       *
1597       * @since 6.7.0
1598       *
1599       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
1600       *
1601       * @see https://html.spec.whatwg.org/#the-before-head-insertion-mode
1602       * @see WP_HTML_Processor::step
1603       *
1604       * @return bool Whether an element was found.
1605       */
1606  	private function step_before_head(): bool {
1607          $token_name = $this->get_token_name();
1608          $token_type = $this->get_token_type();
1609          $is_closer  = parent::is_tag_closer();
1610          $op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
1611          $op         = "{$op_sigil}{$token_name}";
1612  
1613          switch ( $op ) {
1614              /*
1615               * > A character token that is one of U+0009 CHARACTER TABULATION,
1616               * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
1617               * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
1618               *
1619               * Ignore the token.
1620               */
1621              case '#text':
1622                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
1623                      return $this->step();
1624                  }
1625                  goto before_head_anything_else;
1626                  break;
1627  
1628              /*
1629               * > A comment token
1630               */
1631              case '#comment':
1632              case '#funky-comment':
1633              case '#presumptuous-tag':
1634                  $this->insert_html_element( $this->state->current_token );
1635                  return true;
1636  
1637              /*
1638               * > A DOCTYPE token
1639               */
1640              case 'html':
1641                  // Parse error: ignore the token.
1642                  return $this->step();
1643  
1644              /*
1645               * > A start tag whose tag name is "html"
1646               */
1647              case '+HTML':
1648                  return $this->step_in_body();
1649  
1650              /*
1651               * > A start tag whose tag name is "head"
1652               */
1653              case '+HEAD':
1654                  $this->insert_html_element( $this->state->current_token );
1655                  $this->state->head_element   = $this->state->current_token;
1656                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
1657                  return true;
1658  
1659              /*
1660               * > An end tag whose tag name is one of: "head", "body", "html", "br"
1661               * > Act as described in the "anything else" entry below.
1662               *
1663               * Closing BR tags are always reported by the Tag Processor as opening tags.
1664               */
1665              case '-HEAD':
1666              case '-BODY':
1667              case '-HTML':
1668                  goto before_head_anything_else;
1669                  break;
1670          }
1671  
1672          if ( $is_closer ) {
1673              // Parse error: ignore the token.
1674              return $this->step();
1675          }
1676  
1677          /*
1678           * > Anything else
1679           *
1680           * > Insert an HTML element for a "head" start tag token with no attributes.
1681           */
1682          before_head_anything_else:
1683          $this->state->head_element   = $this->insert_virtual_node( 'HEAD' );
1684          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
1685          return $this->step( self::REPROCESS_CURRENT_NODE );
1686      }
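
          /*
           * Illustrative usage sketch; not part of the class. The "anything else"
           * branches of the 'before html' and 'before head' modes create HTML and
           * HEAD nodes that never appeared in the input. Assuming WordPress 6.7+ is
           * loaded, walking a full document should surface them in the breadcrumbs:
           *
           *     $processor = WP_HTML_Processor::create_full_parser( '<title>Hi</title>' );
           *     while ( null !== $processor && $processor->next_token() ) {
           *         // Breadcrumbs list the open elements from the root down to this token.
           *         $breadcrumbs = $processor->get_breadcrumbs() ?? array();
           *         echo implode( ' > ', $breadcrumbs ), "\n";
           *     }
           */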
1687  
1688      /**
1689       * Parses next element in the 'in head' insertion mode.
1690       *
1691       * This internal function performs the 'in head' insertion mode
1692       * logic for the generalized WP_HTML_Processor::step() function.
1693       *
1694       * @since 6.7.0
1695       *
1696       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
1697       *
1698       * @see https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inhead
1699       * @see WP_HTML_Processor::step
1700       *
1701       * @return bool Whether an element was found.
1702       */
1703  	private function step_in_head(): bool {
1704          $token_name = $this->get_token_name();
1705          $token_type = $this->get_token_type();
1706          $is_closer  = parent::is_tag_closer();
1707          $op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
1708          $op         = "{$op_sigil}{$token_name}";
1709  
1710          switch ( $op ) {
1711              case '#text':
1712                  /*
1713                   * > A character token that is one of U+0009 CHARACTER TABULATION,
1714                   * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
1715                   * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
1716                   */
1717                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
1718                      // Insert the character.
1719                      $this->insert_html_element( $this->state->current_token );
1720                      return true;
1721                  }
1722  
1723                  goto in_head_anything_else;
1724                  break;
1725  
1726              /*
1727               * > A comment token
1728               */
1729              case '#comment':
1730              case '#funky-comment':
1731              case '#presumptuous-tag':
1732                  $this->insert_html_element( $this->state->current_token );
1733                  return true;
1734  
1735              /*
1736               * > A DOCTYPE token
1737               */
1738              case 'html':
1739                  // Parse error: ignore the token.
1740                  return $this->step();
1741  
1742              /*
1743               * > A start tag whose tag name is "html"
1744               */
1745              case '+HTML':
1746                  return $this->step_in_body();
1747  
1748              /*
1749               * > A start tag whose tag name is one of: "base", "basefont", "bgsound", "link"
1750               */
1751              case '+BASE':
1752              case '+BASEFONT':
1753              case '+BGSOUND':
1754              case '+LINK':
1755                  $this->insert_html_element( $this->state->current_token );
1756                  return true;
1757  
1758              /*
1759               * > A start tag whose tag name is "meta"
1760               */
1761              case '+META':
1762                  $this->insert_html_element( $this->state->current_token );
1763  
1764                  // All following conditions depend on "tentative" encoding confidence.
1765                  if ( 'tentative' !== $this->state->encoding_confidence ) {
1766                      return true;
1767                  }
1768  
1769                  /*
1770                   * > If the active speculative HTML parser is null, then:
1771                   * >   - If the element has a charset attribute, and getting an encoding from
1772                   * >     its value results in an encoding, and the confidence is currently
1773                   * >     tentative, then change the encoding to the resulting encoding.
1774                   */
1775                  $charset = $this->get_attribute( 'charset' );
1776                  if ( is_string( $charset ) ) {
1777                      $this->bail( 'Cannot yet process META tags with charset to determine encoding.' );
1778                  }
1779  
1780                  /*
1781                   * >   - Otherwise, if the element has an http-equiv attribute whose value is
1782                   * >     an ASCII case-insensitive match for the string "Content-Type", and
1783                   * >     the element has a content attribute, and applying the algorithm for
1784                   * >     extracting a character encoding from a meta element to that attribute's
1785                   * >     value returns an encoding, and the confidence is currently tentative,
1786                   * >     then change the encoding to the extracted encoding.
1787                   */
1788                  $http_equiv = $this->get_attribute( 'http-equiv' );
1789                  $content    = $this->get_attribute( 'content' );
1790                  if (
1791                      is_string( $http_equiv ) &&
1792                      is_string( $content ) &&
1793                      0 === strcasecmp( $http_equiv, 'Content-Type' )
1794                  ) {
1795                      $this->bail( 'Cannot yet process META tags with http-equiv Content-Type to determine encoding.' );
1796                  }
1797  
1798                  return true;
1799  
1800              /*
1801               * > A start tag whose tag name is "title"
1802               */
1803              case '+TITLE':
1804                  $this->insert_html_element( $this->state->current_token );
1805                  return true;
1806  
1807              /*
1808               * > A start tag whose tag name is "noscript", if the scripting flag is enabled
1809               * > A start tag whose tag name is one of: "noframes", "style"
1810               *
1811               * The scripting flag is never enabled in this parser.
1812               */
1813              case '+NOFRAMES':
1814              case '+STYLE':
1815                  $this->insert_html_element( $this->state->current_token );
1816                  return true;
1817  
1818              /*
1819               * > A start tag whose tag name is "noscript", if the scripting flag is disabled
1820               */
1821              case '+NOSCRIPT':
1822                  $this->insert_html_element( $this->state->current_token );
1823                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD_NOSCRIPT;
1824                  return true;
1825  
1826              /*
1827               * > A start tag whose tag name is "script"
1828               *
1829               * @todo Could the adjusted insertion location be anything other than the current location?
1830               */
1831              case '+SCRIPT':
1832                  $this->insert_html_element( $this->state->current_token );
1833                  return true;
1834  
1835              /*
1836               * > An end tag whose tag name is "head"
1837               */
1838              case '-HEAD':
1839                  $this->state->stack_of_open_elements->pop();
1840                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_HEAD;
1841                  return true;
1842  
1843              /*
1844               * > An end tag whose tag name is one of: "body", "html", "br"
1845               *
1846               * Closing BR tags are always reported by the Tag Processor as opening tags.
1847               */
1848              case '-BODY':
1849              case '-HTML':
1850                  /*
1851                   * > Act as described in the "anything else" entry below.
1852                   */
1853                  goto in_head_anything_else;
1854                  break;
1855  
1856              /*
1857               * > A start tag whose tag name is "template"
1858               *
1859               * @todo Could the adjusted insertion location be anything other than the current location?
1860               */
1861              case '+TEMPLATE':
1862                  $this->state->active_formatting_elements->insert_marker();
1863                  $this->state->frameset_ok = false;
1864  
1865                  $this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE;
1866                  $this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE;
1867  
1868                  $this->insert_html_element( $this->state->current_token );
1869                  return true;
1870  
1871              /*
1872               * > An end tag whose tag name is "template"
1873               */
1874              case '-TEMPLATE':
1875                  if ( ! $this->state->stack_of_open_elements->contains( 'TEMPLATE' ) ) {
1876                      // @todo Indicate a parse error once it's possible.
1877                      return $this->step();
1878                  }
1879  
1880                  $this->generate_implied_end_tags_thoroughly();
1881                  if ( ! $this->state->stack_of_open_elements->current_node_is( 'TEMPLATE' ) ) {
1882                      // @todo Indicate a parse error once it's possible.
1883                  }
1884  
1885                  $this->state->stack_of_open_elements->pop_until( 'TEMPLATE' );
1886                  $this->state->active_formatting_elements->clear_up_to_last_marker();
1887                  array_pop( $this->state->stack_of_template_insertion_modes );
1888                  $this->reset_insertion_mode_appropriately();
1889                  return true;
1890          }
1891  
1892          /*
1893           * > A start tag whose tag name is "head"
1894           * > Any other end tag
1895           */
1896          if ( '+HEAD' === $op || $is_closer ) {
1897              // Parse error: ignore the token.
1898              return $this->step();
1899          }
1900  
1901          /*
1902           * > Anything else
1903           */
1904          in_head_anything_else:
1905          $this->state->stack_of_open_elements->pop();
1906          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_HEAD;
1907          return $this->step( self::REPROCESS_CURRENT_NODE );
1908      }
1909  
1910      /**
1911       * Parses next element in the 'in head noscript' insertion mode.
1912       *
1913       * This internal function performs the 'in head noscript' insertion mode
1914       * logic for the generalized WP_HTML_Processor::step() function.
1915       *
1916       * @since 6.7.0
1917       *
1918       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
1919       *
1920       * @see https://html.spec.whatwg.org/#parsing-main-inheadnoscript
1921       * @see WP_HTML_Processor::step
1922       *
1923       * @return bool Whether an element was found.
1924       */
1925  	private function step_in_head_noscript(): bool {
1926          $token_name = $this->get_token_name();
1927          $token_type = $this->get_token_type();
1928          $is_closer  = parent::is_tag_closer();
1929          $op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
1930          $op         = "{$op_sigil}{$token_name}";
1931  
1932          switch ( $op ) {
1933              /*
1934               * > A character token that is one of U+0009 CHARACTER TABULATION,
1935               * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
1936               * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
1937               *
1938               * > Process the token using the rules for the "in head" insertion mode.
1939               */
1940              case '#text':
1941                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
1942                      return $this->step_in_head();
1943                  }
1944  
1945                  goto in_head_noscript_anything_else;
1946                  break;
1947  
1948              /*
1949               * > A DOCTYPE token
1950               */
1951              case 'html':
1952                  // Parse error: ignore the token.
1953                  return $this->step();
1954  
1955              /*
1956               * > A start tag whose tag name is "html"
1957               */
1958              case '+HTML':
1959                  return $this->step_in_body();
1960  
1961              /*
1962               * > An end tag whose tag name is "noscript"
1963               */
1964              case '-NOSCRIPT':
1965                  $this->state->stack_of_open_elements->pop();
1966                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
1967                  return true;
1968  
1969              /*
1970               * > A comment token
1971               * >
1972               * > A start tag whose tag name is one of: "basefont", "bgsound",
1973               * > "link", "meta", "noframes", "style"
1974               */
1975              case '#comment':
1976              case '#funky-comment':
1977              case '#presumptuous-tag':
1978              case '+BASEFONT':
1979              case '+BGSOUND':
1980              case '+LINK':
1981              case '+META':
1982              case '+NOFRAMES':
1983              case '+STYLE':
1984                  return $this->step_in_head();
1985  
1986              /*
1987               * > An end tag whose tag name is "br"
1988               *
1989               * This should never happen, as the Tag Processor prevents showing a BR closing tag.
1990               */
1991          }
1992  
1993          /*
1994           * > A start tag whose tag name is one of: "head", "noscript"
1995           * > Any other end tag
1996           */
1997          if ( '+HEAD' === $op || '+NOSCRIPT' === $op || $is_closer ) {
1998              // Parse error: ignore the token.
1999              return $this->step();
2000          }
2001  
2002          /*
2003           * > Anything else
2004           *
2005           * Anything here is a parse error.
2006           */
2007          in_head_noscript_anything_else:
2008          $this->state->stack_of_open_elements->pop();
2009          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
2010          return $this->step( self::REPROCESS_CURRENT_NODE );
2011      }
2012  
2013      /**
2014       * Parses next element in the 'after head' insertion mode.
2015       *
2016       * This internal function performs the 'after head' insertion mode
2017       * logic for the generalized WP_HTML_Processor::step() function.
2018       *
2019       * @since 6.7.0
2020       *
2021       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
2022       *
2023       * @see https://html.spec.whatwg.org/#the-after-head-insertion-mode
2024       * @see WP_HTML_Processor::step
2025       *
2026       * @return bool Whether an element was found.
2027       */
2028  	private function step_after_head(): bool {
2029          $token_name = $this->get_token_name();
2030          $token_type = $this->get_token_type();
2031          $is_closer  = parent::is_tag_closer();
2032          $op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
2033          $op         = "{$op_sigil}{$token_name}";
2034  
2035          switch ( $op ) {
2036              /*
2037               * > A character token that is one of U+0009 CHARACTER TABULATION,
2038               * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
2039               * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
2040               */
2041              case '#text':
2042                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
2043                      // Insert the character.
2044                      $this->insert_html_element( $this->state->current_token );
2045                      return true;
2046                  }
2047                  goto after_head_anything_else;
2048                  break;
2049  
2050              /*
2051               * > A comment token
2052               */
2053              case '#comment':
2054              case '#funky-comment':
2055              case '#presumptuous-tag':
2056                  $this->insert_html_element( $this->state->current_token );
2057                  return true;
2058  
2059              /*
2060               * > A DOCTYPE token
2061               */
2062              case 'html':
2063                  // Parse error: ignore the token.
2064                  return $this->step();
2065  
2066              /*
2067               * > A start tag whose tag name is "html"
2068               */
2069              case '+HTML':
2070                  return $this->step_in_body();
2071  
2072              /*
2073               * > A start tag whose tag name is "body"
2074               */
2075              case '+BODY':
2076                  $this->insert_html_element( $this->state->current_token );
2077                  $this->state->frameset_ok    = false;
2078                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
2079                  return true;
2080  
2081              /*
2082               * > A start tag whose tag name is "frameset"
2083               */
2084              case '+FRAMESET':
2085                  $this->insert_html_element( $this->state->current_token );
2086                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_FRAMESET;
2087                  return true;
2088  
2089              /*
2090               * > A start tag whose tag name is one of: "base", "basefont", "bgsound",
2091               * > "link", "meta", "noframes", "script", "style", "template", "title"
2092               *
2093               * Anything here is a parse error.
2094               */
2095              case '+BASE':
2096              case '+BASEFONT':
2097              case '+BGSOUND':
2098              case '+LINK':
2099              case '+META':
2100              case '+NOFRAMES':
2101              case '+SCRIPT':
2102              case '+STYLE':
2103              case '+TEMPLATE':
2104              case '+TITLE':
2105                  /*
2106                   * > Push the node pointed to by the head element pointer onto the stack of open elements.
2107                   * > Process the token using the rules for the "in head" insertion mode.
2108                   * > Remove the node pointed to by the head element pointer from the stack of open elements. (It might not be the current node at this point.)
2109                   */
2110                  $this->bail( 'Cannot process elements after HEAD which reopen the HEAD element.' );
2111                  /*
2112                   * Do not leave this break in when adding support; it's here to prevent
2113                   * WPCS from getting confused at the switch structure without a return,
2114                   * because it doesn't know that `bail()` always throws.
2115                   */
2116                  break;
2117  
2118              /*
2119               * > An end tag whose tag name is "template"
2120               */
2121              case '-TEMPLATE':
2122                  return $this->step_in_head();
2123  
2124              /*
2125               * > An end tag whose tag name is one of: "body", "html", "br"
2126               *
2127               * Closing BR tags are always reported by the Tag Processor as opening tags.
2128               */
2129              case '-BODY':
2130              case '-HTML':
2131                  /*
2132                   * > Act as described in the "anything else" entry below.
2133                   */
2134                  goto after_head_anything_else;
2135                  break;
2136          }
2137  
2138          /*
2139           * > A start tag whose tag name is "head"
2140           * > Any other end tag
2141           */
2142          if ( '+HEAD' === $op || $is_closer ) {
2143              // Parse error: ignore the token.
2144              return $this->step();
2145          }
2146  
2147          /*
2148           * > Anything else
2149           * > Insert an HTML element for a "body" start tag token with no attributes.
2150           */
2151          after_head_anything_else:
2152          $this->insert_virtual_node( 'BODY' );
2153          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
2154          return $this->step( self::REPROCESS_CURRENT_NODE );
2155      }
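
          /*
           * Illustrative usage sketch; not part of the class. When `bail()` is
           * reached, for example on a LINK tag that follows a closed HEAD, parsing
           * stops and callers can inspect the reason. Assuming WordPress 6.7+ is
           * loaded:
           *
           *     $processor = WP_HTML_Processor::create_full_parser( '<head></head><link rel="stylesheet" href="a.css">' );
           *     while ( null !== $processor && $processor->next_token() ) {
           *         // Visit tokens until the document ends or the processor gives up.
           *     }
           *
           *     if ( null !== $processor && WP_HTML_Processor::ERROR_UNSUPPORTED === $processor->get_last_error() ) {
           *         // Leave the original HTML untouched rather than risk corrupting it.
           *         $exception = $processor->get_unsupported_exception();
           *     }
           */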
2156  
2157      /**
2158       * Parses next element in the 'in body' insertion mode.
2159       *
2160       * This internal function performs the 'in body' insertion mode
2161       * logic for the generalized WP_HTML_Processor::step() function.
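           *
           * Each token is dispatched on an "op" string built from an optional
           * sigil and the token name: `+` marks an opening tag, `-` marks a
           * closing tag, and non-tag tokens such as `#text` carry no sigil.
           *
           * The following is only a sketch of how the implied closing of P
           * elements in this mode might be observed through the public API.
           * It assumes WordPress is loaded; the input string is illustrative.
           *
           * Example:
           *
           *     $processor = WP_HTML_Processor::create_fragment( '<p>One<p>Two' );
           *     if ( null !== $processor ) {
           *         while ( $processor->next_tag() ) {
           *             // The second P should be reported as a sibling of the first,
           *             // because opening a P closes any P in button scope.
           *             echo implode( ' > ', $processor->get_breadcrumbs() ), "\n";
           *         }
           *     }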
2162       *
2163       * @since 6.4.0
2164       *
2165       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
2166       *
2167       * @see https://html.spec.whatwg.org/#parsing-main-inbody
2168       * @see WP_HTML_Processor::step
2169       *
2170       * @return bool Whether an element was found.
2171       */
2172  	private function step_in_body(): bool {
2173          $token_name = $this->get_token_name();
2174          $token_type = $this->get_token_type();
2175          $op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
2176          $op         = "{$op_sigil}{$token_name}";
2177  
2178          switch ( $op ) {
2179              case '#text':
2180                  /*
2181                   * > A character token that is U+0000 NULL
2182                   *
2183                   * Any successive sequence of NULL bytes is ignored and won't
2184                   * trigger active format reconstruction. Therefore, if the text
2185                   * only comprises NULL bytes then the token should be ignored
2186                   * here, but if there are any other characters in the stream
2187                   * the active formats should be reconstructed.
2188                   */
2189                  if ( parent::TEXT_IS_NULL_SEQUENCE === $this->text_node_classification ) {
2190                      // Parse error: ignore the token.
2191                      return $this->step();
2192                  }
2193  
2194                  $this->reconstruct_active_formatting_elements();
2195  
2196                  /*
2197                   * Whitespace-only text does not affect the frameset-ok flag.
2198                   * It is probably inter-element whitespace, but it may also
2199                   * contain character references which decode only to whitespace.
2200                   */
2201                  if ( parent::TEXT_IS_GENERIC === $this->text_node_classification ) {
2202                      $this->state->frameset_ok = false;
2203                  }
2204  
2205                  $this->insert_html_element( $this->state->current_token );
2206                  return true;
2207  
2208              case '#comment':
2209              case '#funky-comment':
2210              case '#presumptuous-tag':
2211                  $this->insert_html_element( $this->state->current_token );
2212                  return true;
2213  
2214              /*
2215               * > A DOCTYPE token
2216               * > Parse error. Ignore the token.
2217               */
2218              case 'html':
2219                  return $this->step();
2220  
2221              /*
2222               * > A start tag whose tag name is "html"
2223               */
2224              case '+HTML':
2225                  if ( ! $this->state->stack_of_open_elements->contains( 'TEMPLATE' ) ) {
2226                      /*
2227                       * > Otherwise, for each attribute on the token, check to see if the attribute
2228                       * > is already present on the top element of the stack of open elements. If
2229                       * > it is not, add the attribute and its corresponding value to that element.
2230                       *
2231                       * This parser does not currently support this behavior: ignore the token.
2232                       */
2233                  }
2234  
2235                  // Ignore the token.
2236                  return $this->step();
2237  
2238              /*
2239               * > A start tag whose tag name is one of: "base", "basefont", "bgsound", "link",
2240               * > "meta", "noframes", "script", "style", "template", "title"
2241               * >
2242               * > An end tag whose tag name is "template"
2243               */
2244              case '+BASE':
2245              case '+BASEFONT':
2246              case '+BGSOUND':
2247              case '+LINK':
2248              case '+META':
2249              case '+NOFRAMES':
2250              case '+SCRIPT':
2251              case '+STYLE':
2252              case '+TEMPLATE':
2253              case '+TITLE':
2254              case '-TEMPLATE':
2255                  return $this->step_in_head();
2256  
2257              /*
2258               * > A start tag whose tag name is "body"
2259               *
2260               * This tag in the IN BODY insertion mode is a parse error.
2261               */
2262              case '+BODY':
2263                  if (
2264                      1 === $this->state->stack_of_open_elements->count() ||
2265                      'BODY' !== ( $this->state->stack_of_open_elements->at( 2 )->node_name ?? null ) ||
2266                      $this->state->stack_of_open_elements->contains( 'TEMPLATE' )
2267                  ) {
2268                      // Ignore the token.
2269                      return $this->step();
2270                  }
2271  
2272                  /*
2273                   * > Otherwise, set the frameset-ok flag to "not ok"; then, for each attribute
2274                   * > on the token, check to see if the attribute is already present on the body
2275                   * > element (the second element) on the stack of open elements, and if it is
2276                   * > not, add the attribute and its corresponding value to that element.
2277                   *
2278                   * This parser does not currently support this behavior: ignore the token.
2279                   */
2280                  $this->state->frameset_ok = false;
2281                  return $this->step();
2282  
2283              /*
2284               * > A start tag whose tag name is "frameset"
2285               *
2286               * This tag in the IN BODY insertion mode is a parse error.
2287               */
2288              case '+FRAMESET':
2289                  if (
2290                      1 === $this->state->stack_of_open_elements->count() ||
2291                      'BODY' !== ( $this->state->stack_of_open_elements->at( 2 )->node_name ?? null ) ||
2292                      false === $this->state->frameset_ok
2293                  ) {
2294                      // Ignore the token.
2295                      return $this->step();
2296                  }
2297  
2298                  /*
2299                   * > Otherwise, run the following steps:
2300                   */
2301                  $this->bail( 'Cannot process non-ignored FRAMESET tags.' );
2302                  break;
2303  
2304              /*
2305               * > An end tag whose tag name is "body"
2306               */
2307              case '-BODY':
2308                  if ( ! $this->state->stack_of_open_elements->has_element_in_scope( 'BODY' ) ) {
2309                      // Parse error: ignore the token.
2310                      return $this->step();
2311                  }
2312  
2313                  /*
2314                   * > Otherwise, if there is a node in the stack of open elements that is not either a
2315                   * > dd element, a dt element, an li element, an optgroup element, an option element,
2316                   * > a p element, an rb element, an rp element, an rt element, an rtc element, a tbody
2317                   * > element, a td element, a tfoot element, a th element, a thead element, a tr
2318                   * > element, the body element, or the html element, then this is a parse error.
2319                   *
2320                   * There is nothing to do for this parse error, so don't check for it.
2321                   */
2322  
2323                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_BODY;
2324                  /*
2325                   * The BODY element is not removed from the stack of open elements.
2326                   * Only internal state has changed; this does not qualify as a "step"
2327                   * in terms of advancing through the document to another token.
2328                   * Nothing has been pushed or popped.
2329                   * Proceed to parse the next item.
2330                   */
2331                  return $this->step();
2332  
2333              /*
2334               * > An end tag whose tag name is "html"
2335               */
2336              case '-HTML':
2337                  if ( ! $this->state->stack_of_open_elements->has_element_in_scope( 'BODY' ) ) {
2338                      // Parse error: ignore the token.
2339                      return $this->step();
2340                  }
2341  
2342                  /*
2343                   * > Otherwise, if there is a node in the stack of open elements that is not either a
2344                   * > dd element, a dt element, an li element, an optgroup element, an option element,
2345                   * > a p element, an rb element, an rp element, an rt element, an rtc element, a tbody
2346                   * > element, a td element, a tfoot element, a th element, a thead element, a tr
2347                   * > element, the body element, or the html element, then this is a parse error.
2348                   *
2349                   * There is nothing to do for this parse error, so don't check for it.
2350                   */
2351  
2352                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_BODY;
2353                  return $this->step( self::REPROCESS_CURRENT_NODE );
2354  
2355              /*
2356               * > A start tag whose tag name is one of: "address", "article", "aside",
2357               * > "blockquote", "center", "details", "dialog", "dir", "div", "dl",
2358               * > "fieldset", "figcaption", "figure", "footer", "header", "hgroup",
2359               * > "main", "menu", "nav", "ol", "p", "search", "section", "summary", "ul"
2360               */
2361              case '+ADDRESS':
2362              case '+ARTICLE':
2363              case '+ASIDE':
2364              case '+BLOCKQUOTE':
2365              case '+CENTER':
2366              case '+DETAILS':
2367              case '+DIALOG':
2368              case '+DIR':
2369              case '+DIV':
2370              case '+DL':
2371              case '+FIELDSET':
2372              case '+FIGCAPTION':
2373              case '+FIGURE':
2374              case '+FOOTER':
2375              case '+HEADER':
2376              case '+HGROUP':
2377              case '+MAIN':
2378              case '+MENU':
2379              case '+NAV':
2380              case '+OL':
2381              case '+P':
2382              case '+SEARCH':
2383              case '+SECTION':
2384              case '+SUMMARY':
2385              case '+UL':
2386                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2387                      $this->close_a_p_element();
2388                  }
2389  
2390                  $this->insert_html_element( $this->state->current_token );
2391                  return true;
2392  
2393              /*
2394               * > A start tag whose tag name is one of: "h1", "h2", "h3", "h4", "h5", "h6"
2395               */
2396              case '+H1':
2397              case '+H2':
2398              case '+H3':
2399              case '+H4':
2400              case '+H5':
2401              case '+H6':
2402                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2403                      $this->close_a_p_element();
2404                  }
2405  
2406                  if (
2407                      in_array(
2408                          $this->state->stack_of_open_elements->current_node()->node_name,
2409                          array( 'H1', 'H2', 'H3', 'H4', 'H5', 'H6' ),
2410                          true
2411                      )
2412                  ) {
2413                      // @todo Indicate a parse error once it's possible.
2414                      $this->state->stack_of_open_elements->pop();
2415                  }
2416  
2417                  $this->insert_html_element( $this->state->current_token );
2418                  return true;
2419  
2420              /*
2421               * > A start tag whose tag name is one of: "pre", "listing"
2422               */
2423              case '+PRE':
2424              case '+LISTING':
2425                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2426                      $this->close_a_p_element();
2427                  }
2428  
2429                  /*
2430                   * > If the next token is a U+000A LINE FEED (LF) character token,
2431                   * > then ignore that token and move on to the next one. (Newlines
2432                   * > at the start of pre blocks are ignored as an authoring convenience.)
2433                   *
2434                   * This is handled in `get_modifiable_text()`.
2435                   */
2436  
2437                  $this->insert_html_element( $this->state->current_token );
2438                  $this->state->frameset_ok = false;
2439                  return true;
2440  
2441              /*
2442               * > A start tag whose tag name is "form"
2443               */
2444              case '+FORM':
2445                  $stack_contains_template = $this->state->stack_of_open_elements->contains( 'TEMPLATE' );
2446  
2447                  if ( isset( $this->state->form_element ) && ! $stack_contains_template ) {
2448                      // Parse error: ignore the token.
2449                      return $this->step();
2450                  }
2451  
2452                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2453                      $this->close_a_p_element();
2454                  }
2455  
2456                  $this->insert_html_element( $this->state->current_token );
2457                  if ( ! $stack_contains_template ) {
2458                      $this->state->form_element = $this->state->current_token;
2459                  }
2460  
2461                  return true;
2462  
2463              /*
2464               * > A start tag whose tag name is "li"
2465               * > A start tag whose tag name is one of: "dd", "dt"
2466               */
2467              case '+DD':
2468              case '+DT':
2469              case '+LI':
2470                  $this->state->frameset_ok = false;
2471                  $node                     = $this->state->stack_of_open_elements->current_node();
2472                  $is_li                    = 'LI' === $token_name;
2473  
2474                  in_body_list_loop:
2475                  /*
2476                   * The logic for LI and DT/DD is the same except for one point: LI elements _only_
2477                   * close other LI elements, but a DT or DD element closes _any_ open DT or DD element.
2478                   */
2479                  if ( $is_li ? 'LI' === $node->node_name : ( 'DD' === $node->node_name || 'DT' === $node->node_name ) ) {
2480                      $node_name = $is_li ? 'LI' : $node->node_name;
2481                      $this->generate_implied_end_tags( $node_name );
2482                      if ( ! $this->state->stack_of_open_elements->current_node_is( $node_name ) ) {
2483                          // @todo Indicate a parse error once it's possible. This error does not impact the logic here.
2484                      }
2485  
2486                      $this->state->stack_of_open_elements->pop_until( $node_name );
2487                      goto in_body_list_done;
2488                  }
2489  
2490                  if (
2491                      'ADDRESS' !== $node->node_name &&
2492                      'DIV' !== $node->node_name &&
2493                      'P' !== $node->node_name &&
2494                      self::is_special( $node )
2495                  ) {
2496                      /*
2497                       * > If node is in the special category, but is not an address, div,
2498                       * > or p element, then jump to the step labeled done below.
2499                       */
2500                      goto in_body_list_done;
2501                  } else {
2502                      /*
2503                       * > Otherwise, set node to the previous entry in the stack of open elements
2504                       * > and return to the step labeled loop.
2505                       */
2506                      foreach ( $this->state->stack_of_open_elements->walk_up( $node ) as $item ) {
2507                          $node = $item;
2508                          break;
2509                      }
2510                      goto in_body_list_loop;
2511                  }
2512  
2513                  in_body_list_done:
2514                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2515                      $this->close_a_p_element();
2516                  }
2517  
2518                  $this->insert_html_element( $this->state->current_token );
2519                  return true;
2520  
2521              case '+PLAINTEXT':
2522                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2523                      $this->close_a_p_element();
2524                  }
2525  
2526                  /*
2527                   * @todo This may need to be handled in the Tag Processor and turn into
2528                   *       a single self-contained tag like TEXTAREA, whose modifiable text
2529                   *       is the rest of the input document as plaintext.
2530                   */
2531                  $this->bail( 'Cannot process PLAINTEXT elements.' );
2532                  break;
2533  
2534              /*
2535               * > A start tag whose tag name is "button"
2536               */
2537              case '+BUTTON':
2538                  if ( $this->state->stack_of_open_elements->has_element_in_scope( 'BUTTON' ) ) {
2539                      // @todo Indicate a parse error once it's possible. This error does not impact the logic here.
2540                      $this->generate_implied_end_tags();
2541                      $this->state->stack_of_open_elements->pop_until( 'BUTTON' );
2542                  }
2543  
2544                  $this->reconstruct_active_formatting_elements();
2545                  $this->insert_html_element( $this->state->current_token );
2546                  $this->state->frameset_ok = false;
2547  
2548                  return true;
2549  
2550              /*
2551               * > An end tag whose tag name is one of: "address", "article", "aside", "blockquote",
2552               * > "button", "center", "details", "dialog", "dir", "div", "dl", "fieldset",
2553               * > "figcaption", "figure", "footer", "header", "hgroup", "listing", "main",
2554               * > "menu", "nav", "ol", "pre", "search", "section", "summary", "ul"
2555               */
2556              case '-ADDRESS':
2557              case '-ARTICLE':
2558              case '-ASIDE':
2559              case '-BLOCKQUOTE':
2560              case '-BUTTON':
2561              case '-CENTER':
2562              case '-DETAILS':
2563              case '-DIALOG':
2564              case '-DIR':
2565              case '-DIV':
2566              case '-DL':
2567              case '-FIELDSET':
2568              case '-FIGCAPTION':
2569              case '-FIGURE':
2570              case '-FOOTER':
2571              case '-HEADER':
2572              case '-HGROUP':
2573              case '-LISTING':
2574              case '-MAIN':
2575              case '-MENU':
2576              case '-NAV':
2577              case '-OL':
2578              case '-PRE':
2579              case '-SEARCH':
2580              case '-SECTION':
2581              case '-SUMMARY':
2582              case '-UL':
2583                  if ( ! $this->state->stack_of_open_elements->has_element_in_scope( $token_name ) ) {
2584                      // @todo Report parse error.
2585                      // Ignore the token.
2586                      return $this->step();
2587                  }
2588  
2589                  $this->generate_implied_end_tags();
2590                  if ( ! $this->state->stack_of_open_elements->current_node_is( $token_name ) ) {
2591                      // @todo Record parse error: this error doesn't impact parsing.
2592                  }
2593                  $this->state->stack_of_open_elements->pop_until( $token_name );
2594                  return true;
2595  
2596              /*
2597               * > An end tag whose tag name is "form"
2598               */
2599              case '-FORM':
2600                  if ( ! $this->state->stack_of_open_elements->contains( 'TEMPLATE' ) ) {
2601                      $node                      = $this->state->form_element;
2602                      $this->state->form_element = null;
2603  
2604                      /*
2605                       * > If node is null or if the stack of open elements does not have node
2606                       * > in scope, then this is a parse error; return and ignore the token.
2607                       *
2608                       * @todo It's necessary to check if the form token itself is in scope, not
2609                       *       simply whether any FORM is in scope.
2610                       */
2611                      if (
2612                          null === $node ||
2613                          ! $this->state->stack_of_open_elements->has_element_in_scope( 'FORM' )
2614                      ) {
2615                          // Parse error: ignore the token.
2616                          return $this->step();
2617                      }
2618  
2619                      $this->generate_implied_end_tags();
2620                      if ( $node !== $this->state->stack_of_open_elements->current_node() ) {
2621                          // @todo Indicate a parse error once it's possible. This error does not impact the logic here.
2622                          $this->bail( 'Cannot close a FORM when other elements remain open as this would throw off the breadcrumbs for the following tokens.' );
2623                      }
2624  
2625                      $this->state->stack_of_open_elements->remove_node( $node );
2626                      return true;
2627                  } else {
2628                      /*
2629                       * > If the stack of open elements does not have a form element in scope,
2630                       * > then this is a parse error; return and ignore the token.
2631                       *
2632                       * Note that unlike in the clause above, this is checking for any FORM in scope.
2633                       */
2634                      if ( ! $this->state->stack_of_open_elements->has_element_in_scope( 'FORM' ) ) {
2635                          // Parse error: ignore the token.
2636                          return $this->step();
2637                      }
2638  
2639                      $this->generate_implied_end_tags();
2640  
2641                      if ( ! $this->state->stack_of_open_elements->current_node_is( 'FORM' ) ) {
2642                          // @todo Indicate a parse error once it's possible. This error does not impact the logic here.
2643                      }
2644  
2645                      $this->state->stack_of_open_elements->pop_until( 'FORM' );
2646                      return true;
2647                  }
2648                  break;
2649  
2650              /*
2651               * > An end tag whose tag name is "p"
2652               */
2653              case '-P':
2654                  if ( ! $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2655                      $this->insert_html_element( $this->state->current_token );
2656                  }
2657  
2658                  $this->close_a_p_element();
2659                  return true;
2660  
2661              /*
2662               * > An end tag whose tag name is "li"
2663               * > An end tag whose tag name is one of: "dd", "dt"
2664               */
2665              case '-DD':
2666              case '-DT':
2667              case '-LI':
2668                  if (
2669                      /*
2670                       * An end tag whose tag name is "li":
2671                       * If the stack of open elements does not have an li element in list item scope,
2672                       * then this is a parse error; ignore the token.
2673                       */
2674                      (
2675                          'LI' === $token_name &&
2676                          ! $this->state->stack_of_open_elements->has_element_in_list_item_scope( 'LI' )
2677                      ) ||
2678                      /*
2679                       * An end tag whose tag name is one of: "dd", "dt":
2680                       * If the stack of open elements does not have an element in scope that is an
2681                       * HTML element with the same tag name as that of the token, then this is a
2682                       * parse error; ignore the token.
2683                       */
2684                      (
2685                          'LI' !== $token_name &&
2686                          ! $this->state->stack_of_open_elements->has_element_in_scope( $token_name )
2687                      )
2688                  ) {
2689                      /*
2690                       * This is a parse error; ignore the token.
2691                       *
2692                       * @todo Indicate a parse error once it's possible.
2693                       */
2694                      return $this->step();
2695                  }
2696  
2697                  $this->generate_implied_end_tags( $token_name );
2698  
2699                  if ( ! $this->state->stack_of_open_elements->current_node_is( $token_name ) ) {
2700                      // @todo Indicate a parse error once it's possible. This error does not impact the logic here.
2701                  }
2702  
2703                  $this->state->stack_of_open_elements->pop_until( $token_name );
2704                  return true;
2705  
2706              /*
2707               * > An end tag whose tag name is one of: "h1", "h2", "h3", "h4", "h5", "h6"
2708               */
2709              case '-H1':
2710              case '-H2':
2711              case '-H3':
2712              case '-H4':
2713              case '-H5':
2714              case '-H6':
2715                  if ( ! $this->state->stack_of_open_elements->has_element_in_scope( '(internal: H1 through H6 - do not use)' ) ) {
2716                      /*
2717                       * This is a parse error; ignore the token.
2718                       *
2719                       * @todo Indicate a parse error once it's possible.
2720                       */
2721                      return $this->step();
2722                  }
2723  
2724                  $this->generate_implied_end_tags();
2725  
2726                  if ( ! $this->state->stack_of_open_elements->current_node_is( $token_name ) ) {
2727                      // @todo Record parse error: this error doesn't impact parsing.
2728                  }
2729  
2730                  $this->state->stack_of_open_elements->pop_until( '(internal: H1 through H6 - do not use)' );
2731                  return true;
2732  
2733              /*
2734               * > A start tag whose tag name is "a"
2735               */
2736              case '+A':
2737                  foreach ( $this->state->active_formatting_elements->walk_up() as $item ) {
2738                      switch ( $item->node_name ) {
2739                          case 'marker':
2740                              break 2;
2741  
2742                          case 'A':
2743                              $this->run_adoption_agency_algorithm();
2744                              $this->state->active_formatting_elements->remove_node( $item );
2745                              $this->state->stack_of_open_elements->remove_node( $item );
2746                              break 2;
2747                      }
2748                  }
2749  
2750                  $this->reconstruct_active_formatting_elements();
2751                  $this->insert_html_element( $this->state->current_token );
2752                  $this->state->active_formatting_elements->push( $this->state->current_token );
2753                  return true;
2754  
2755              /*
2756               * > A start tag whose tag name is one of: "b", "big", "code", "em", "font", "i",
2757               * > "s", "small", "strike", "strong", "tt", "u"
2758               */
2759              case '+B':
2760              case '+BIG':
2761              case '+CODE':
2762              case '+EM':
2763              case '+FONT':
2764              case '+I':
2765              case '+S':
2766              case '+SMALL':
2767              case '+STRIKE':
2768              case '+STRONG':
2769              case '+TT':
2770              case '+U':
2771                  $this->reconstruct_active_formatting_elements();
2772                  $this->insert_html_element( $this->state->current_token );
2773                  $this->state->active_formatting_elements->push( $this->state->current_token );
2774                  return true;
2775  
2776              /*
2777               * > A start tag whose tag name is "nobr"
2778               */
2779              case '+NOBR':
2780                  $this->reconstruct_active_formatting_elements();
2781  
2782                  if ( $this->state->stack_of_open_elements->has_element_in_scope( 'NOBR' ) ) {
2783                      // Parse error.
2784                      $this->run_adoption_agency_algorithm();
2785                      $this->reconstruct_active_formatting_elements();
2786                  }
2787  
2788                  $this->insert_html_element( $this->state->current_token );
2789                  $this->state->active_formatting_elements->push( $this->state->current_token );
2790                  return true;
2791  
2792              /*
2793               * > An end tag whose tag name is one of: "a", "b", "big", "code", "em", "font", "i",
2794               * > "nobr", "s", "small", "strike", "strong", "tt", "u"
2795               */
2796              case '-A':
2797              case '-B':
2798              case '-BIG':
2799              case '-CODE':
2800              case '-EM':
2801              case '-FONT':
2802              case '-I':
2803              case '-NOBR':
2804              case '-S':
2805              case '-SMALL':
2806              case '-STRIKE':
2807              case '-STRONG':
2808              case '-TT':
2809              case '-U':
2810                  $this->run_adoption_agency_algorithm();
2811                  return true;
2812  
2813              /*
2814               * > A start tag whose tag name is one of: "applet", "marquee", "object"
2815               */
2816              case '+APPLET':
2817              case '+MARQUEE':
2818              case '+OBJECT':
2819                  $this->reconstruct_active_formatting_elements();
2820                  $this->insert_html_element( $this->state->current_token );
2821                  $this->state->active_formatting_elements->insert_marker();
2822                  $this->state->frameset_ok = false;
2823                  return true;
2824  
2825              /*
2826               * > An end tag token whose tag name is one of: "applet", "marquee", "object"
2827               */
2828              case '-APPLET':
2829              case '-MARQUEE':
2830              case '-OBJECT':
2831                  if ( ! $this->state->stack_of_open_elements->has_element_in_scope( $token_name ) ) {
2832                      // Parse error: ignore the token.
2833                      return $this->step();
2834                  }
2835  
2836                  $this->generate_implied_end_tags();
2837                  if ( ! $this->state->stack_of_open_elements->current_node_is( $token_name ) ) {
2838                      // This is a parse error.
2839                  }
2840  
2841                  $this->state->stack_of_open_elements->pop_until( $token_name );
2842                  $this->state->active_formatting_elements->clear_up_to_last_marker();
2843                  return true;
2844  
2845              /*
2846               * > A start tag whose tag name is "table"
2847               */
2848              case '+TABLE':
2849                  /*
2850                   * > If the Document is not set to quirks mode, and the stack of open elements
2851                   * > has a p element in button scope, then close a p element.
2852                   */
2853                  if (
2854                      WP_HTML_Tag_Processor::QUIRKS_MODE !== $this->compat_mode &&
2855                      $this->state->stack_of_open_elements->has_p_in_button_scope()
2856                  ) {
2857                      $this->close_a_p_element();
2858                  }
2859  
2860                  $this->insert_html_element( $this->state->current_token );
2861                  $this->state->frameset_ok    = false;
2862                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
2863                  return true;
2864  
2865              /*
2866               * > An end tag whose tag name is "br"
2867               *
2868               * This is prevented from happening because the Tag Processor
2869               * reports all closing BR tags as if they were opening tags.
2870               */
2871  
2872              /*
2873               * > A start tag whose tag name is one of: "area", "br", "embed", "img", "keygen", "wbr"
2874               */
2875              case '+AREA':
2876              case '+BR':
2877              case '+EMBED':
2878              case '+IMG':
2879              case '+KEYGEN':
2880              case '+WBR':
2881                  $this->reconstruct_active_formatting_elements();
2882                  $this->insert_html_element( $this->state->current_token );
2883                  $this->state->frameset_ok = false;
2884                  return true;
2885  
2886              /*
2887               * > A start tag whose tag name is "input"
2888               */
2889              case '+INPUT':
2890                  $this->reconstruct_active_formatting_elements();
2891                  $this->insert_html_element( $this->state->current_token );
2892  
2893                  /*
2894                   * > If the token does not have an attribute with the name "type", or if it does,
2895                   * > but that attribute's value is not an ASCII case-insensitive match for the
2896                   * > string "hidden", then: set the frameset-ok flag to "not ok".
2897                   */
2898                  $type_attribute = $this->get_attribute( 'type' );
2899                  if ( ! is_string( $type_attribute ) || 'hidden' !== strtolower( $type_attribute ) ) {
2900                      $this->state->frameset_ok = false;
2901                  }
2902  
2903                  return true;
2904  
2905              /*
2906               * > A start tag whose tag name is one of: "param", "source", "track"
2907               */
2908              case '+PARAM':
2909              case '+SOURCE':
2910              case '+TRACK':
2911                  $this->insert_html_element( $this->state->current_token );
2912                  return true;
2913  
2914              /*
2915               * > A start tag whose tag name is "hr"
2916               */
2917              case '+HR':
2918                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2919                      $this->close_a_p_element();
2920                  }
2921                  $this->insert_html_element( $this->state->current_token );
2922                  $this->state->frameset_ok = false;
2923                  return true;
2924  
2925              /*
2926               * > A start tag whose tag name is "image"
2927               */
2928              case '+IMAGE':
2929                  /*
2930                   * > Parse error. Change the token's tag name to "img" and reprocess it. (Don't ask.)
2931                   *
2932                   * Note that this is handled elsewhere, so it should not be possible to reach this code.
2933                   */
2934                  $this->bail( "Cannot process an IMAGE tag. (Don't ask.)" );
2935                  break;
2936  
2937              /*
2938               * > A start tag whose tag name is "textarea"
2939               */
2940              case '+TEXTAREA':
2941                  $this->insert_html_element( $this->state->current_token );
2942  
2943                  /*
2944                   * > If the next token is a U+000A LINE FEED (LF) character token, then ignore
2945                   * > that token and move on to the next one. (Newlines at the start of
2946                   * > textarea elements are ignored as an authoring convenience.)
2947                   *
2948                   * This is handled in `get_modifiable_text()`.
2949                   */
2950  
2951                  $this->state->frameset_ok = false;
2952  
2953                  /*
2954                   * > Switch the insertion mode to "text".
2955                   *
2956                   * As a self-contained node, this behavior is handled in the Tag Processor.
2957                   */
2958                  return true;
2959  
2960              /*
2961               * > A start tag whose tag name is "xmp"
2962               */
2963              case '+XMP':
2964                  if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2965                      $this->close_a_p_element();
2966                  }
2967  
2968                  $this->reconstruct_active_formatting_elements();
2969                  $this->state->frameset_ok = false;
2970  
2971                  /*
2972                   * > Follow the generic raw text element parsing algorithm.
2973                   *
2974                   * As a self-contained node, this behavior is handled in the Tag Processor.
2975                   */
2976                  $this->insert_html_element( $this->state->current_token );
2977                  return true;
2978  
2979              /*
2980               * > A start tag whose tag name is "iframe"
2981               */
2982              case '+IFRAME':
2983                  $this->state->frameset_ok = false;
2984  
2985                  /*
2986                   * > Follow the generic raw text element parsing algorithm.
2987                   *
2988                   * As a self-contained node, this behavior is handled in the Tag Processor.
2989                   */
2990                  $this->insert_html_element( $this->state->current_token );
2991                  return true;
2992  
2993              /*
2994               * > A start tag whose tag name is "noembed"
2995               * > A start tag whose tag name is "noscript", if the scripting flag is enabled
2996               *
2997               * The scripting flag is never enabled in this parser.
2998               */
2999              case '+NOEMBED':
3000                  $this->insert_html_element( $this->state->current_token );
3001                  return true;
3002  
3003              /*
3004               * > A start tag whose tag name is "select"
3005               */
3006              case '+SELECT':
3007                  $this->reconstruct_active_formatting_elements();
3008                  $this->insert_html_element( $this->state->current_token );
3009                  $this->state->frameset_ok = false;
3010  
3011                  switch ( $this->state->insertion_mode ) {
3012                      /*
3013                       * > If the insertion mode is one of "in table", "in caption", "in table body", "in row",
3014                       * > or "in cell", then switch the insertion mode to "in select in table".
3015                       */
3016                      case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE:
3017                      case WP_HTML_Processor_State::INSERTION_MODE_IN_CAPTION:
3018                      case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY:
3019                      case WP_HTML_Processor_State::INSERTION_MODE_IN_ROW:
3020                      case WP_HTML_Processor_State::INSERTION_MODE_IN_CELL:
3021                          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT_IN_TABLE;
3022                          break;
3023  
3024                      /*
3025                       * > Otherwise, switch the insertion mode to "in select".
3026                       */
3027                      default:
3028                          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT;
3029                          break;
3030                  }
3031                  return true;
3032  
3033              /*
3034               * > A start tag whose tag name is one of: "optgroup", "option"
3035               */
3036              case '+OPTGROUP':
3037              case '+OPTION':
3038                  if ( $this->state->stack_of_open_elements->current_node_is( 'OPTION' ) ) {
3039                      $this->state->stack_of_open_elements->pop();
3040                  }
3041                  $this->reconstruct_active_formatting_elements();
3042                  $this->insert_html_element( $this->state->current_token );
3043                  return true;
3044  
3045              /*
3046               * > A start tag whose tag name is one of: "rb", "rtc"
3047               */
3048              case '+RB':
3049              case '+RTC':
3050                  if ( $this->state->stack_of_open_elements->has_element_in_scope( 'RUBY' ) ) {
3051                      $this->generate_implied_end_tags();
3052  
3053                      if ( ! $this->state->stack_of_open_elements->current_node_is( 'RUBY' ) ) {
3054                          // @todo Indicate a parse error once it's possible.
3055                      }
3056                  }
3057  
3058                  $this->insert_html_element( $this->state->current_token );
3059                  return true;
3060  
3061              /*
3062               * > A start tag whose tag name is one of: "rp", "rt"
3063               */
3064              case '+RP':
3065              case '+RT':
3066                  if ( $this->state->stack_of_open_elements->has_element_in_scope( 'RUBY' ) ) {
3067                      $this->generate_implied_end_tags( 'RTC' );
3068  
3069                      $current_node_name = $this->state->stack_of_open_elements->current_node()->node_name;
3070                      if ( 'RTC' !== $current_node_name && 'RUBY' !== $current_node_name ) {
3071                          // @todo Indicate a parse error once it's possible.
3072                      }
3073                  }
3074  
3075                  $this->insert_html_element( $this->state->current_token );
3076                  return true;
3077  
3078              /*
3079               * > A start tag whose tag name is "math"
3080               */
3081              case '+MATH':
3082                  $this->reconstruct_active_formatting_elements();
3083  
3084                  /*
3085                   * @todo Adjust MathML attributes for the token. (This fixes the case of MathML attributes that are not all lowercase.)
3086                   * @todo Adjust foreign attributes for the token. (This fixes the use of namespaced attributes, in particular XLink.)
3087                   *
3088                   * These ought to be handled in the attribute methods.
3089                   */
3090                  $this->state->current_token->namespace = 'math';
3091                  $this->insert_html_element( $this->state->current_token );
3092                  if ( $this->state->current_token->has_self_closing_flag ) {
3093                      $this->state->stack_of_open_elements->pop();
3094                  }
3095                  return true;
3096  
3097              /*
3098               * > A start tag whose tag name is "svg"
3099               */
3100              case '+SVG':
3101                  $this->reconstruct_active_formatting_elements();
3102  
3103                  /*
3104                   * @todo Adjust SVG attributes for the token. (This fixes the case of SVG attributes that are not all lowercase.)
3105                   * @todo Adjust foreign attributes for the token. (This fixes the use of namespaced attributes, in particular XLink in SVG.)
3106                   *
3107                   * These ought to be handled in the attribute methods.
3108                   */
3109                  $this->state->current_token->namespace = 'svg';
3110                  $this->insert_html_element( $this->state->current_token );
3111                  if ( $this->state->current_token->has_self_closing_flag ) {
3112                      $this->state->stack_of_open_elements->pop();
3113                  }
3114                  return true;
3115  
3116              /*
3117               * > A start tag whose tag name is one of: "caption", "col", "colgroup",
3118               * > "frame", "head", "tbody", "td", "tfoot", "th", "thead", "tr"
3119               */
3120              case '+CAPTION':
3121              case '+COL':
3122              case '+COLGROUP':
3123              case '+FRAME':
3124              case '+HEAD':
3125              case '+TBODY':
3126              case '+TD':
3127              case '+TFOOT':
3128              case '+TH':
3129              case '+THEAD':
3130              case '+TR':
3131                  // Parse error. Ignore the token.
3132                  return $this->step();
3133          }
3134  
3135          if ( ! parent::is_tag_closer() ) {
3136              /*
3137               * > Any other start tag
3138               */
3139              $this->reconstruct_active_formatting_elements();
3140              $this->insert_html_element( $this->state->current_token );
3141              return true;
3142          } else {
3143              /*
3144               * > Any other end tag
3145               */
3146  
3147              /*
3148               * Find the corresponding tag opener in the stack of open elements, if
3149               * it exists before reaching a special element, which provides a kind
3150               * of boundary in the stack. For example, a `</custom-tag>` should not
3151               * close anything beyond its containing `P` or `DIV` element.
3152               */
3153              foreach ( $this->state->stack_of_open_elements->walk_up() as $node ) {
3154                  if ( 'html' === $node->namespace && $token_name === $node->node_name ) {
3155                      break;
3156                  }
3157  
3158                  if ( self::is_special( $node ) ) {
3159                      // Parse error: ignore the token.
3160                      return $this->step();
3161                  }
3162              }
3163  
3164              $this->generate_implied_end_tags( $token_name );
3165              if ( $node !== $this->state->stack_of_open_elements->current_node() ) {
3166                  // @todo Record parse error: this error doesn't impact parsing.
3167              }
3168  
3169              foreach ( $this->state->stack_of_open_elements->walk_up() as $item ) {
3170                  $this->state->stack_of_open_elements->pop();
3171                  if ( $node === $item ) {
3172                      return true;
3173                  }
3174              }
3175          }
3176  
3177          $this->bail( 'Should not have been able to reach end of IN BODY processing. Check HTML API code.' );
3178          // This unnecessary return prevents tools from inaccurately reporting type errors.
3179          return false;
3180      }
3181  
3182      /**
3183       * Parses next element in the 'in table' insertion mode.
3184       *
3185       * This internal function performs the 'in table' insertion mode
3186       * logic for the generalized WP_HTML_Processor::step() function.
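           *
           * As a rough illustration (a sketch assuming the public breadcrumbs
           * API, not an excerpt from the test suite), a `TR` placed directly
           * inside a `TABLE` is expected to gain an implied `TBODY` parent:
           *
           *     $processor = WP_HTML_Processor::create_fragment( '<table><tr><td>1</td></tr></table>' );
           *     if ( $processor->next_tag( 'TD' ) ) {
           *         // Expected: HTML, BODY, TABLE, TBODY, TR, TD.
           *         $breadcrumbs = $processor->get_breadcrumbs();
           *     }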
3187       *
3188       * @since 6.7.0
3189       *
3190       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3191       *
3192       * @see https://html.spec.whatwg.org/#parsing-main-intable
3193       * @see WP_HTML_Processor::step
3194       *
3195       * @return bool Whether an element was found.
3196       */
3197      private function step_in_table(): bool {
3198          $token_name = $this->get_token_name();
3199          $token_type = $this->get_token_type();
3200          $op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
3201          $op         = "{$op_sigil}{$token_name}";
3202  
3203          switch ( $op ) {
3204              /*
3205               * > A character token, if the current node is table,
3206               * > tbody, template, tfoot, thead, or tr element
3207               */
3208              case '#text':
3209                  $current_node      = $this->state->stack_of_open_elements->current_node();
3210                  $current_node_name = $current_node ? $current_node->node_name : null;
3211                  if (
3212                      $current_node_name && (
3213                          'TABLE' === $current_node_name ||
3214                          'TBODY' === $current_node_name ||
3215                          'TEMPLATE' === $current_node_name ||
3216                          'TFOOT' === $current_node_name ||
3217                          'THEAD' === $current_node_name ||
3218                          'TR' === $current_node_name
3219                      )
3220                  ) {
3221                      /*
3222                       * If the text is empty after processing HTML entities and stripping
3223                       * U+0000 NULL bytes then ignore the token.
3224                       */
3225                      if ( parent::TEXT_IS_NULL_SEQUENCE === $this->text_node_classification ) {
3226                          return $this->step();
3227                      }
3228  
3229                      /*
3230                       * This follows the rules for "in table text" insertion mode.
3231                       *
3232                       * Whitespace-only text nodes are inserted in-place. Otherwise
3233                       * foster parenting is enabled and the nodes would be
3234                       * inserted out-of-place.
3235                       *
3236                       * > If any of the tokens in the pending table character tokens
3237                       * > list are character tokens that are not ASCII whitespace,
3238                       * > then this is a parse error: reprocess the character tokens
3239                       * > in the pending table character tokens list using the rules
3240                       * > given in the "anything else" entry in the "in table"
3241                       * > insertion mode.
3242                       * >
3243                       * > Otherwise, insert the characters given by the pending table
3244                       * > character tokens list.
3245                       *
3246                       * @see https://html.spec.whatwg.org/#parsing-main-intabletext
3247                       */
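                          /*
                           * For illustration only (a sketch assuming the public
                           * error-reporting methods): whitespace between table tags is
                           * kept in place, while other text is expected to stop the parse.
                           *
                           *     $processor = WP_HTML_Processor::create_fragment( '<table>text' );
                           *     while ( $processor->next_token() ) {
                           *         continue;
                           *     }
                           *     // Expected to be WP_HTML_Processor::ERROR_UNSUPPORTED.
                           *     $error = $processor->get_last_error();
                           */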
3248                      if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
3249                          $this->insert_html_element( $this->state->current_token );
3250                          return true;
3251                      }
3252  
3253                      // Non-whitespace would trigger fostering, unsupported at this time.
3254                      $this->bail( 'Foster parenting is not supported.' );
3255                      break;
3256                  }
3257                  break;
3258  
3259              /*
3260               * > A comment token
3261               */
3262              case '#comment':
3263              case '#funky-comment':
3264              case '#presumptuous-tag':
3265                  $this->insert_html_element( $this->state->current_token );
3266                  return true;
3267  
3268              /*
3269               * > A DOCTYPE token
3270               */
3271              case 'html':
3272                  // Parse error: ignore the token.
3273                  return $this->step();
3274  
3275              /*
3276               * > A start tag whose tag name is "caption"
3277               */
3278              case '+CAPTION':
3279                  $this->state->stack_of_open_elements->clear_to_table_context();
3280                  $this->state->active_formatting_elements->insert_marker();
3281                  $this->insert_html_element( $this->state->current_token );
3282                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_CAPTION;
3283                  return true;
3284  
3285              /*
3286               * > A start tag whose tag name is "colgroup"
3287               */
3288              case '+COLGROUP':
3289                  $this->state->stack_of_open_elements->clear_to_table_context();
3290                  $this->insert_html_element( $this->state->current_token );
3291                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP;
3292                  return true;
3293  
3294              /*
3295               * > A start tag whose tag name is "col"
3296               */
3297              case '+COL':
3298                  $this->state->stack_of_open_elements->clear_to_table_context();
3299  
3300                  /*
3301                   * > Insert an HTML element for a "colgroup" start tag token with no attributes,
3302                   * > then switch the insertion mode to "in column group".
3303                   */
3304                  $this->insert_virtual_node( 'COLGROUP' );
3305                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP;
3306                  return $this->step( self::REPROCESS_CURRENT_NODE );
3307  
3308              /*
3309               * > A start tag whose tag name is one of: "tbody", "tfoot", "thead"
3310               */
3311              case '+TBODY':
3312              case '+TFOOT':
3313              case '+THEAD':
3314                  $this->state->stack_of_open_elements->clear_to_table_context();
3315                  $this->insert_html_element( $this->state->current_token );
3316                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
3317                  return true;
3318  
3319              /*
3320               * > A start tag whose tag name is one of: "td", "th", "tr"
3321               */
3322              case '+TD':
3323              case '+TH':
3324              case '+TR':
3325                  $this->state->stack_of_open_elements->clear_to_table_context();
3326                  /*
3327                   * > Insert an HTML element for a "tbody" start tag token with no attributes,
3328                   * > then switch the insertion mode to "in table body".
3329                   */
3330                  $this->insert_virtual_node( 'TBODY' );
3331                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
3332                  return $this->step( self::REPROCESS_CURRENT_NODE );
3333  
3334              /*
3335               * > A start tag whose tag name is "table"
3336               *
3337               * This tag in the IN TABLE insertion mode is a parse error.
3338               */
3339              case '+TABLE':
3340                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TABLE' ) ) {
3341                      return $this->step();
3342                  }
3343  
3344                  $this->state->stack_of_open_elements->pop_until( 'TABLE' );
3345                  $this->reset_insertion_mode_appropriately();
3346                  return $this->step( self::REPROCESS_CURRENT_NODE );
3347  
3348              /*
3349               * > An end tag whose tag name is "table"
3350               */
3351              case '-TABLE':
3352                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TABLE' ) ) {
3353                      // @todo Indicate a parse error once it's possible.
3354                      return $this->step();
3355                  }
3356  
3357                  $this->state->stack_of_open_elements->pop_until( 'TABLE' );
3358                  $this->reset_insertion_mode_appropriately();
3359                  return true;
3360  
3361              /*
3362               * > An end tag whose tag name is one of: "body", "caption", "col", "colgroup", "html", "tbody", "td", "tfoot", "th", "thead", "tr"
3363               */
3364              case '-BODY':
3365              case '-CAPTION':
3366              case '-COL':
3367              case '-COLGROUP':
3368              case '-HTML':
3369              case '-TBODY':
3370              case '-TD':
3371              case '-TFOOT':
3372              case '-TH':
3373              case '-THEAD':
3374              case '-TR':
3375                  // Parse error: ignore the token.
3376                  return $this->step();
3377  
3378              /*
3379               * > A start tag whose tag name is one of: "style", "script", "template"
3380               * > An end tag whose tag name is "template"
3381               */
3382              case '+STYLE':
3383              case '+SCRIPT':
3384              case '+TEMPLATE':
3385              case '-TEMPLATE':
3386                  /*
3387                   * > Process the token using the rules for the "in head" insertion mode.
3388                   */
3389                  return $this->step_in_head();
3390  
3391              /*
3392               * > A start tag whose tag name is "input"
3393               *
3394               * > If the token does not have an attribute with the name "type", or if it does, but
3395               * > that attribute's value is not an ASCII case-insensitive match for the string
3396               * > "hidden", then: act as described in the "anything else" entry below.
3397               */
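                  /*
                   * A sketch of the two outcomes, assuming the public API behaves as
                   * the code below suggests:
                   *
                   *     // Inserted in place as a child of TABLE.
                   *     $hidden = WP_HTML_Processor::create_fragment( '<table><input type="hidden">' );
                   *     $hidden->next_tag( 'INPUT' );
                   *
                   *     // Would need foster parenting, so parsing is expected to stop.
                   *     $text = WP_HTML_Processor::create_fragment( '<table><input type="text">' );
                   *     $text->next_tag( 'INPUT' ); // Expected to return false.
                   */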
3398              case '+INPUT':
3399                  $type_attribute = $this->get_attribute( 'type' );
3400                  if ( ! is_string( $type_attribute ) || 'hidden' !== strtolower( $type_attribute ) ) {
3401                      goto anything_else;
3402                  }
3403                  // @todo Indicate a parse error once it's possible.
3404                  $this->insert_html_element( $this->state->current_token );
3405                  return true;
3406  
3407              /*
3408               * > A start tag whose tag name is "form"
3409               *
3410               * This tag in the IN TABLE insertion mode is a parse error.
3411               */
3412              case '+FORM':
3413                  if (
3414                      $this->state->stack_of_open_elements->has_element_in_scope( 'TEMPLATE' ) ||
3415                      isset( $this->state->form_element )
3416                  ) {
3417                      return $this->step();
3418                  }
3419  
3420                  // This FORM is special because it immediately closes and cannot have other children.
3421                  $this->insert_html_element( $this->state->current_token );
3422                  $this->state->form_element = $this->state->current_token;
3423                  $this->state->stack_of_open_elements->pop();
3424                  return true;
3425          }
3426  
3427          /*
3428           * > Anything else
3429           * > Parse error. Enable foster parenting, process the token using the rules for the
3430           * > "in body" insertion mode, and then disable foster parenting.
3431           *
3432           * @todo Indicate a parse error once it's possible.
3433           */
3434          anything_else:
3435          $this->bail( 'Foster parenting is not supported.' );
3436      }
3437  
3438      /**
3439       * Parses next element in the 'in table text' insertion mode.
3440       *
3441       * This internal function performs the 'in table text' insertion mode
3442       * logic for the generalized WP_HTML_Processor::step() function.
3443       *
3444       * @since 6.7.0 Stub implementation.
3445       *
3446       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3447       *
3448       * @see https://html.spec.whatwg.org/#parsing-main-intabletext
3449       * @see WP_HTML_Processor::step
3450       *
3451       * @return bool Whether an element was found.
3452       */
3453      private function step_in_table_text(): bool {
3454          $this->bail( 'No support for parsing in the ' . WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_TEXT . ' state.' );
3455      }
3456  
3457      /**
3458       * Parses next element in the 'in caption' insertion mode.
3459       *
3460       * This internal function performs the 'in caption' insertion mode
3461       * logic for the generalized WP_HTML_Processor::step() function.
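           *
           * For example (an illustrative sketch, not an excerpt from the tests),
           * a `TR` start tag inside an open `CAPTION` is expected to close the
           * caption before being reprocessed in the 'in table' insertion mode:
           *
           *     $processor = WP_HTML_Processor::create_fragment( '<table><caption>Title<tr><td>' );
           *     if ( $processor->next_tag( 'TR' ) ) {
           *         // Expected: HTML, BODY, TABLE, TBODY, TR.
           *         $breadcrumbs = $processor->get_breadcrumbs();
           *     }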
3462       *
3463       * @since 6.7.0
3464       *
3465       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3466       *
3467       * @see https://html.spec.whatwg.org/#parsing-main-incaption
3468       * @see WP_HTML_Processor::step
3469       *
3470       * @return bool Whether an element was found.
3471       */
3472      private function step_in_caption(): bool {
3473          $tag_name = $this->get_tag();
3474          $op_sigil = $this->is_tag_closer() ? '-' : '+';
3475          $op       = "{$op_sigil}{$tag_name}";
3476  
3477          switch ( $op ) {
3478              /*
3479               * > An end tag whose tag name is "caption"
3480               * > A start tag whose tag name is one of: "caption", "col", "colgroup", "tbody", "td", "tfoot", "th", "thead", "tr"
3481               * > An end tag whose tag name is "table"
3482               *
3483               * These tag handling rules are identical except for the final instruction.
3484               * Handle them in a single block.
3485               */
3486              case '-CAPTION':
3487              case '+CAPTION':
3488              case '+COL':
3489              case '+COLGROUP':
3490              case '+TBODY':
3491              case '+TD':
3492              case '+TFOOT':
3493              case '+TH':
3494              case '+THEAD':
3495              case '+TR':
3496              case '-TABLE':
3497                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'CAPTION' ) ) {
3498                      // Parse error: ignore the token.
3499                      return $this->step();
3500                  }
3501  
3502                  $this->generate_implied_end_tags();
3503                  if ( ! $this->state->stack_of_open_elements->current_node_is( 'CAPTION' ) ) {
3504                      // @todo Indicate a parse error once it's possible.
3505                  }
3506  
3507                  $this->state->stack_of_open_elements->pop_until( 'CAPTION' );
3508                  $this->state->active_formatting_elements->clear_up_to_last_marker();
3509                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
3510  
3511                  // If this is not a CAPTION end tag, the token should be reprocessed.
3512                  if ( '-CAPTION' === $op ) {
3513                      return true;
3514                  }
3515                  return $this->step( self::REPROCESS_CURRENT_NODE );
3516  
3517              /*
3518               * > An end tag whose tag name is one of: "body", "col", "colgroup", "html", "tbody", "td", "tfoot", "th", "thead", "tr"
3519               */
3520              case '-BODY':
3521              case '-COL':
3522              case '-COLGROUP':
3523              case '-HTML':
3524              case '-TBODY':
3525              case '-TD':
3526              case '-TFOOT':
3527              case '-TH':
3528              case '-THEAD':
3529              case '-TR':
3530                  // Parse error: ignore the token.
3531                  return $this->step();
3532          }
3533  
3534          /*
3535           * > Anything else
3536           * >   Process the token using the rules for the "in body" insertion mode.
3537           */
3538          return $this->step_in_body();
3539      }
3540  
3541      /**
3542       * Parses next element in the 'in column group' insertion mode.
3543       *
3544       * This internal function performs the 'in column group' insertion mode
3545       * logic for the generalized WP_HTML_Processor::step() function.
3546       *
3547       * @since 6.7.0
3548       *
3549       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3550       *
3551       * @see https://html.spec.whatwg.org/#parsing-main-incolgroup
3552       * @see WP_HTML_Processor::step
3553       *
3554       * @return bool Whether an element was found.
3555       */
3556  	private function step_in_column_group(): bool {
3557          $token_name = $this->get_token_name();
3558          $token_type = $this->get_token_type();
3559          $op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
3560          $op         = "{$op_sigil}{$token_name}";
3561  
3562          switch ( $op ) {
3563              /*
3564               * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
3565               * > U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
3566               */
3567              case '#text':
3568                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
3569                      // Insert the character.
3570                      $this->insert_html_element( $this->state->current_token );
3571                      return true;
3572                  }
3573  
3574                  goto in_column_group_anything_else;
3575                  break;
3576  
3577              /*
3578               * > A comment token
3579               */
3580              case '#comment':
3581              case '#funky-comment':
3582              case '#presumptuous-tag':
3583                  $this->insert_html_element( $this->state->current_token );
3584                  return true;
3585  
3586              /*
3587               * > A DOCTYPE token
3588               */
3589              case 'html':
3590                  // @todo Indicate a parse error once it's possible.
3591                  return $this->step();
3592  
3593              /*
3594               * > A start tag whose tag name is "html"
3595               */
3596              case '+HTML':
3597                  return $this->step_in_body();
3598  
3599              /*
3600               * > A start tag whose tag name is "col"
3601               */
3602              case '+COL':
3603                  $this->insert_html_element( $this->state->current_token );
3604                  $this->state->stack_of_open_elements->pop();
3605                  return true;
3606  
3607              /*
3608               * > An end tag whose tag name is "colgroup"
3609               */
3610              case '-COLGROUP':
3611                  if ( ! $this->state->stack_of_open_elements->current_node_is( 'COLGROUP' ) ) {
3612                      // @todo Indicate a parse error once it's possible.
3613                      return $this->step();
3614                  }
3615                  $this->state->stack_of_open_elements->pop();
3616                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
3617                  return true;
3618  
3619              /*
3620               * > An end tag whose tag name is "col"
3621               */
3622              case '-COL':
3623                  // Parse error: ignore the token.
3624                  return $this->step();
3625  
3626              /*
3627               * > A start tag whose tag name is "template"
3628               * > An end tag whose tag name is "template"
3629               */
3630              case '+TEMPLATE':
3631              case '-TEMPLATE':
3632                  return $this->step_in_head();
3633          }
3634  
3635          in_column_group_anything_else:
3636          /*
3637           * > Anything else
3638           */
3639          if ( ! $this->state->stack_of_open_elements->current_node_is( 'COLGROUP' ) ) {
3640              // @todo Indicate a parse error once it's possible.
3641              return $this->step();
3642          }
3643          $this->state->stack_of_open_elements->pop();
3644          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
3645          return $this->step( self::REPROCESS_CURRENT_NODE );
3646      }
3647  
3648      /**
3649       * Parses next element in the 'in table body' insertion mode.
3650       *
3651       * This internal function performs the 'in table body' insertion mode
3652       * logic for the generalized WP_HTML_Processor::step() function.
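           *
           * Example (a sketch through the public API, assuming WordPress 6.7 or later;
           * the input markup is illustrative and not taken from the test suite). A `TD`
           * that opens directly inside a `TBODY` should receive an implied `TR` parent:
           *
           *     $processor = WP_HTML_Processor::create_fragment( '<table><tbody><td>cell' );
           *     if ( $processor->next_tag( 'TD' ) ) {
           *         // Expected: array( 'HTML', 'BODY', 'TABLE', 'TBODY', 'TR', 'TD' ).
           *         $breadcrumbs = $processor->get_breadcrumbs();
           *     }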
3653       *
3654       * @since 6.7.0
3655       *
3656       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3657       *
3658       * @see https://html.spec.whatwg.org/#parsing-main-intbody
3659       * @see WP_HTML_Processor::step
3660       *
3661       * @return bool Whether an element was found.
3662       */
3663  	private function step_in_table_body(): bool {
3664          $tag_name = $this->get_tag();
3665          $op_sigil = $this->is_tag_closer() ? '-' : '+';
3666          $op       = "{$op_sigil}{$tag_name}";
3667  
3668          switch ( $op ) {
3669              /*
3670               * > A start tag whose tag name is "tr"
3671               */
3672              case '+TR':
3673                  $this->state->stack_of_open_elements->clear_to_table_body_context();
3674                  $this->insert_html_element( $this->state->current_token );
3675                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
3676                  return true;
3677  
3678              /*
3679               * > A start tag whose tag name is one of: "th", "td"
3680               */
3681              case '+TH':
3682              case '+TD':
3683                  // @todo Indicate a parse error once it's possible.
3684                  $this->state->stack_of_open_elements->clear_to_table_body_context();
3685                  $this->insert_virtual_node( 'TR' );
3686                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
3687                  return $this->step( self::REPROCESS_CURRENT_NODE );
3688  
3689              /*
3690               * > An end tag whose tag name is one of: "tbody", "tfoot", "thead"
3691               */
3692              case '-TBODY':
3693              case '-TFOOT':
3694              case '-THEAD':
3695                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $tag_name ) ) {
3696                      // Parse error: ignore the token.
3697                      return $this->step();
3698                  }
3699  
3700                  $this->state->stack_of_open_elements->clear_to_table_body_context();
3701                  $this->state->stack_of_open_elements->pop();
3702                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
3703                  return true;
3704  
3705              /*
3706               * > A start tag whose tag name is one of: "caption", "col", "colgroup", "tbody", "tfoot", "thead"
3707               * > An end tag whose tag name is "table"
3708               */
3709              case '+CAPTION':
3710              case '+COL':
3711              case '+COLGROUP':
3712              case '+TBODY':
3713              case '+TFOOT':
3714              case '+THEAD':
3715              case '-TABLE':
3716                  if (
3717                      ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TBODY' ) &&
3718                      ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'THEAD' ) &&
3719                      ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TFOOT' )
3720                  ) {
3721                      // Parse error: ignore the token.
3722                      return $this->step();
3723                  }
3724                  $this->state->stack_of_open_elements->clear_to_table_body_context();
3725                  $this->state->stack_of_open_elements->pop();
3726                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
3727                  return $this->step( self::REPROCESS_CURRENT_NODE );
3728  
3729              /*
3730               * > An end tag whose tag name is one of: "body", "caption", "col", "colgroup", "html", "td", "th", "tr"
3731               */
3732              case '-BODY':
3733              case '-CAPTION':
3734              case '-COL':
3735              case '-COLGROUP':
3736              case '-HTML':
3737              case '-TD':
3738              case '-TH':
3739              case '-TR':
3740                  // Parse error: ignore the token.
3741                  return $this->step();
3742          }
3743  
3744          /*
3745           * > Anything else
3746           * >   Process the token using the rules for the "in table" insertion mode.
3747           */
3748          return $this->step_in_table();
3749      }
3750  
3751      /**
3752       * Parses next element in the 'in row' insertion mode.
3753       *
3754       * This internal function performs the 'in row' insertion mode
3755       * logic for the generalized WP_HTML_Processor::step() function.
3756       *
3757       * @since 6.7.0
3758       *
3759       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3760       *
3761       * @see https://html.spec.whatwg.org/#parsing-main-intr
3762       * @see WP_HTML_Processor::step
3763       *
3764       * @return bool Whether an element was found.
3765       */
3766  	private function step_in_row(): bool {
3767          $tag_name = $this->get_tag();
3768          $op_sigil = $this->is_tag_closer() ? '-' : '+';
3769          $op       = "{$op_sigil}{$tag_name}";
3770  
3771          switch ( $op ) {
3772              /*
3773               * > A start tag whose tag name is one of: "th", "td"
3774               */
3775              case '+TH':
3776              case '+TD':
3777                  $this->state->stack_of_open_elements->clear_to_table_row_context();
3778                  $this->insert_html_element( $this->state->current_token );
3779                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_CELL;
3780                  $this->state->active_formatting_elements->insert_marker();
3781                  return true;
3782  
3783              /*
3784               * > An end tag whose tag name is "tr"
3785               */
3786              case '-TR':
3787                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TR' ) ) {
3788                      // Parse error: ignore the token.
3789                      return $this->step();
3790                  }
3791  
3792                  $this->state->stack_of_open_elements->clear_to_table_row_context();
3793                  $this->state->stack_of_open_elements->pop();
3794                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
3795                  return true;
3796  
3797              /*
3798               * > A start tag whose tag name is one of: "caption", "col", "colgroup", "tbody", "tfoot", "thead", "tr"
3799               * > An end tag whose tag name is "table"
3800               */
3801              case '+CAPTION':
3802              case '+COL':
3803              case '+COLGROUP':
3804              case '+TBODY':
3805              case '+TFOOT':
3806              case '+THEAD':
3807              case '+TR':
3808              case '-TABLE':
3809                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TR' ) ) {
3810                      // Parse error: ignore the token.
3811                      return $this->step();
3812                  }
3813  
3814                  $this->state->stack_of_open_elements->clear_to_table_row_context();
3815                  $this->state->stack_of_open_elements->pop();
3816                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
3817                  return $this->step( self::REPROCESS_CURRENT_NODE );
3818  
3819              /*
3820               * > An end tag whose tag name is one of: "tbody", "tfoot", "thead"
3821               */
3822              case '-TBODY':
3823              case '-TFOOT':
3824              case '-THEAD':
3825                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $tag_name ) ) {
3826                      // Parse error: ignore the token.
3827                      return $this->step();
3828                  }
3829  
3830                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TR' ) ) {
3831                      // Ignore the token.
3832                      return $this->step();
3833                  }
3834  
3835                  $this->state->stack_of_open_elements->clear_to_table_row_context();
3836                  $this->state->stack_of_open_elements->pop();
3837                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
3838                  return $this->step( self::REPROCESS_CURRENT_NODE );
3839  
3840              /*
3841               * > An end tag whose tag name is one of: "body", "caption", "col", "colgroup", "html", "td", "th"
3842               */
3843              case '-BODY':
3844              case '-CAPTION':
3845              case '-COL':
3846              case '-COLGROUP':
3847              case '-HTML':
3848              case '-TD':
3849              case '-TH':
3850                  // Parse error: ignore the token.
3851                  return $this->step();
3852          }
3853  
3854          /*
3855           * > Anything else
3856           * >   Process the token using the rules for the "in table" insertion mode.
3857           */
3858          return $this->step_in_table();
3859      }
3860  
3861      /**
3862       * Parses next element in the 'in cell' insertion mode.
3863       *
3864       * This internal function performs the 'in cell' insertion mode
3865       * logic for the generalized WP_HTML_Processor::step() function.
3866       *
3867       * @since 6.7.0
3868       *
3869       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3870       *
3871       * @see https://html.spec.whatwg.org/#parsing-main-intd
3872       * @see WP_HTML_Processor::step
3873       *
3874       * @return bool Whether an element was found.
3875       */
3876  	private function step_in_cell(): bool {
3877          $tag_name = $this->get_tag();
3878          $op_sigil = $this->is_tag_closer() ? '-' : '+';
3879          $op       = "{$op_sigil}{$tag_name}";
3880  
3881          switch ( $op ) {
3882              /*
3883               * > An end tag whose tag name is one of: "td", "th"
3884               */
3885              case '-TD':
3886              case '-TH':
3887                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $tag_name ) ) {
3888                      // Parse error: ignore the token.
3889                      return $this->step();
3890                  }
3891  
3892                  $this->generate_implied_end_tags();
3893  
3894                  /*
3895                   * @todo This needs to check if the current node is an HTML element, meaning that
3896                   *       when SVG and MathML support is added, this needs to differentiate between an
3897                   *       HTML element of the given name, such as `<center>`, and a foreign element of
3898                   *       the same given name.
3899                   */
3900                  if ( ! $this->state->stack_of_open_elements->current_node_is( $tag_name ) ) {
3901                      // @todo Indicate a parse error once it's possible.
3902                  }
3903  
3904                  $this->state->stack_of_open_elements->pop_until( $tag_name );
3905                  $this->state->active_formatting_elements->clear_up_to_last_marker();
3906                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
3907                  return true;
3908  
3909              /*
3910               * > A start tag whose tag name is one of: "caption", "col", "colgroup", "tbody", "td",
3911               * > "tfoot", "th", "thead", "tr"
3912               */
3913              case '+CAPTION':
3914              case '+COL':
3915              case '+COLGROUP':
3916              case '+TBODY':
3917              case '+TD':
3918              case '+TFOOT':
3919              case '+TH':
3920              case '+THEAD':
3921              case '+TR':
3922                  /*
3923                   * > Assert: The stack of open elements has a td or th element in table scope.
3924                   *
3925                   * Nothing to do here, except to verify in tests that this never appears.
3926                   */
3927  
3928                  $this->close_cell();
3929                  return $this->step( self::REPROCESS_CURRENT_NODE );
3930  
3931              /*
3932               * > An end tag whose tag name is one of: "body", "caption", "col", "colgroup", "html"
3933               */
3934              case '-BODY':
3935              case '-CAPTION':
3936              case '-COL':
3937              case '-COLGROUP':
3938              case '-HTML':
3939                  // Parse error: ignore the token.
3940                  return $this->step();
3941  
3942              /*
3943               * > An end tag whose tag name is one of: "table", "tbody", "tfoot", "thead", "tr"
3944               */
3945              case '-TABLE':
3946              case '-TBODY':
3947              case '-TFOOT':
3948              case '-THEAD':
3949              case '-TR':
3950                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $tag_name ) ) {
3951                      // Parse error: ignore the token.
3952                      return $this->step();
3953                  }
3954                  $this->close_cell();
3955                  return $this->step( self::REPROCESS_CURRENT_NODE );
3956          }
3957  
3958          /*
3959           * > Anything else
3960           * >   Process the token using the rules for the "in body" insertion mode.
3961           */
3962          return $this->step_in_body();
3963      }
3964  
3965      /**
3966       * Parses next element in the 'in select' insertion mode.
3967       *
3968       * This internal function performs the 'in select' insertion mode
3969       * logic for the generalized WP_HTML_Processor::step() function.
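           *
           * Example (a sketch through the public API, assuming WordPress 6.7 or later;
           * the input markup is illustrative). A second `OPTION` start tag should close
           * the currently open `OPTION`, so the two become siblings instead of nesting:
           *
           *     $processor = WP_HTML_Processor::create_fragment( '<select><option>A<option>B' );
           *     $processor->next_tag( 'OPTION' );
           *     if ( $processor->next_tag( 'OPTION' ) ) {
           *         // Expected: array( 'HTML', 'BODY', 'SELECT', 'OPTION' ).
           *         $breadcrumbs = $processor->get_breadcrumbs();
           *     }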
3970       *
3971       * @since 6.7.0
3972       *
3973       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3974       *
3975       * @see https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inselect
3976       * @see WP_HTML_Processor::step
3977       *
3978       * @return bool Whether an element was found.
3979       */
3980  	private function step_in_select(): bool {
3981          $token_name = $this->get_token_name();
3982          $token_type = $this->get_token_type();
3983          $op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
3984          $op         = "{$op_sigil}{$token_name}";
3985  
3986          switch ( $op ) {
3987              /*
3988               * > Any other character token
3989               */
3990              case '#text':
3991                  /*
3992                   * > A character token that is U+0000 NULL
3993                   *
3994                   * If a text node only comprises null bytes then it should be
3995                   * entirely ignored and should not return to calling code.
3996                   */
3997                  if ( parent::TEXT_IS_NULL_SEQUENCE === $this->text_node_classification ) {
3998                      // Parse error: ignore the token.
3999                      return $this->step();
4000                  }
4001  
4002                  $this->insert_html_element( $this->state->current_token );
4003                  return true;
4004  
4005              /*
4006               * > A comment token
4007               */
4008              case '#comment':
4009              case '#funky-comment':
4010              case '#presumptuous-tag':
4011                  $this->insert_html_element( $this->state->current_token );
4012                  return true;
4013  
4014              /*
4015               * > A DOCTYPE token
4016               */
4017              case 'html':
4018                  // Parse error: ignore the token.
4019                  return $this->step();
4020  
4021              /*
4022               * > A start tag whose tag name is "html"
4023               */
4024              case '+HTML':
4025                  return $this->step_in_body();
4026  
4027              /*
4028               * > A start tag whose tag name is "option"
4029               */
4030              case '+OPTION':
4031                  if ( $this->state->stack_of_open_elements->current_node_is( 'OPTION' ) ) {
4032                      $this->state->stack_of_open_elements->pop();
4033                  }
4034                  $this->insert_html_element( $this->state->current_token );
4035                  return true;
4036  
4037              /*
4038               * > A start tag whose tag name is "optgroup"
4039               * > A start tag whose tag name is "hr"
4040               *
4041               * These rules are identical except for the treatment of the self-closing flag and
4042               * the subsequent pop of the HR void element, all of which is handled elsewhere in the processor.
4043               */
4044              case '+OPTGROUP':
4045              case '+HR':
4046                  if ( $this->state->stack_of_open_elements->current_node_is( 'OPTION' ) ) {
4047                      $this->state->stack_of_open_elements->pop();
4048                  }
4049  
4050                  if ( $this->state->stack_of_open_elements->current_node_is( 'OPTGROUP' ) ) {
4051                      $this->state->stack_of_open_elements->pop();
4052                  }
4053  
4054                  $this->insert_html_element( $this->state->current_token );
4055                  return true;
4056  
4057              /*
4058               * > An end tag whose tag name is "optgroup"
4059               */
4060              case '-OPTGROUP':
4061                  $current_node = $this->state->stack_of_open_elements->current_node();
4062                  if ( $current_node && 'OPTION' === $current_node->node_name ) {
4063                      foreach ( $this->state->stack_of_open_elements->walk_up( $current_node ) as $parent ) {
4064                          break;
4065                      }
4066                      if ( isset( $parent ) && 'OPTGROUP' === $parent->node_name ) {
4067                          $this->state->stack_of_open_elements->pop();
4068                      }
4069                  }
4070  
4071                  if ( $this->state->stack_of_open_elements->current_node_is( 'OPTGROUP' ) ) {
4072                      $this->state->stack_of_open_elements->pop();
4073                      return true;
4074                  }
4075  
4076                  // Parse error: ignore the token.
4077                  return $this->step();
4078  
4079              /*
4080               * > An end tag whose tag name is "option"
4081               */
4082              case '-OPTION':
4083                  if ( $this->state->stack_of_open_elements->current_node_is( 'OPTION' ) ) {
4084                      $this->state->stack_of_open_elements->pop();
4085                      return true;
4086                  }
4087  
4088                  // Parse error: ignore the token.
4089                  return $this->step();
4090  
4091              /*
4092               * > An end tag whose tag name is "select"
4093               * > A start tag whose tag name is "select"
4094               *
4095               * > It just gets treated like an end tag.
4096               */
4097              case '-SELECT':
4098              case '+SELECT':
4099                  if ( ! $this->state->stack_of_open_elements->has_element_in_select_scope( 'SELECT' ) ) {
4100                      // Parse error: ignore the token.
4101                      return $this->step();
4102                  }
4103                  $this->state->stack_of_open_elements->pop_until( 'SELECT' );
4104                  $this->reset_insertion_mode_appropriately();
4105                  return true;
4106  
4107              /*
4108               * > A start tag whose tag name is one of: "input", "keygen", "textarea"
4109               *
4110               * All three of these tags are considered a parse error when found in this insertion mode.
4111               */
4112              case '+INPUT':
4113              case '+KEYGEN':
4114              case '+TEXTAREA':
4115                  if ( ! $this->state->stack_of_open_elements->has_element_in_select_scope( 'SELECT' ) ) {
4116                      // Ignore the token.
4117                      return $this->step();
4118                  }
4119                  $this->state->stack_of_open_elements->pop_until( 'SELECT' );
4120                  $this->reset_insertion_mode_appropriately();
4121                  return $this->step( self::REPROCESS_CURRENT_NODE );
4122  
4123              /*
4124               * > A start tag whose tag name is one of: "script", "template"
4125               * > An end tag whose tag name is "template"
4126               */
4127              case '+SCRIPT':
4128              case '+TEMPLATE':
4129              case '-TEMPLATE':
4130                  return $this->step_in_head();
4131          }
4132  
4133          /*
4134           * > Anything else
4135           * >   Parse error: ignore the token.
4136           */
4137          return $this->step();
4138      }
4139  
4140      /**
4141       * Parses next element in the 'in select in table' insertion mode.
4142       *
4143       * This internal function performs the 'in select in table' insertion mode
4144       * logic for the generalized WP_HTML_Processor::step() function.
4145       *
4146       * @since 6.7.0
4147       *
4148       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4149       *
4150       * @see https://html.spec.whatwg.org/#parsing-main-inselectintable
4151       * @see WP_HTML_Processor::step
4152       *
4153       * @return bool Whether an element was found.
4154       */
4155  	private function step_in_select_in_table(): bool {
4156          $token_name = $this->get_token_name();
4157          $token_type = $this->get_token_type();
4158          $op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
4159          $op         = "{$op_sigil}{$token_name}";
4160  
4161          switch ( $op ) {
4162              /*
4163               * > A start tag whose tag name is one of: "caption", "table", "tbody", "tfoot", "thead", "tr", "td", "th"
4164               */
4165              case '+CAPTION':
4166              case '+TABLE':
4167              case '+TBODY':
4168              case '+TFOOT':
4169              case '+THEAD':
4170              case '+TR':
4171              case '+TD':
4172              case '+TH':
4173                  // @todo Indicate a parse error once it's possible.
4174                  $this->state->stack_of_open_elements->pop_until( 'SELECT' );
4175                  $this->reset_insertion_mode_appropriately();
4176                  return $this->step( self::REPROCESS_CURRENT_NODE );
4177  
4178              /*
4179               * > An end tag whose tag name is one of: "caption", "table", "tbody", "tfoot", "thead", "tr", "td", "th"
4180               */
4181              case '-CAPTION':
4182              case '-TABLE':
4183              case '-TBODY':
4184              case '-TFOOT':
4185              case '-THEAD':
4186              case '-TR':
4187              case '-TD':
4188              case '-TH':
4189                  // @todo Indicate a parse error once it's possible.
4190                  if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $token_name ) ) {
4191                      return $this->step();
4192                  }
4193                  $this->state->stack_of_open_elements->pop_until( 'SELECT' );
4194                  $this->reset_insertion_mode_appropriately();
4195                  return $this->step( self::REPROCESS_CURRENT_NODE );
4196          }
4197  
4198          /*
4199           * > Anything else
4200           */
4201          return $this->step_in_select();
4202      }
4203  
4204      /**
4205       * Parses next element in the 'in template' insertion mode.
4206       *
4207       * This internal function performs the 'in template' insertion mode
4208       * logic for the generalized WP_HTML_Processor::step() function.
4209       *
4210       * @since 6.7.0 Stub implementation.
4211       *
4212       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4213       *
4214       * @see https://html.spec.whatwg.org/#parsing-main-intemplate
4215       * @see WP_HTML_Processor::step
4216       *
4217       * @return bool Whether an element was found.
4218       */
4219  	private function step_in_template(): bool {
4220          $token_name = $this->get_token_name();
4221          $token_type = $this->get_token_type();
4222          $is_closer  = $this->is_tag_closer();
4223          $op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
4224          $op         = "{$op_sigil}{$token_name}";
4225  
4226          switch ( $op ) {
4227              /*
4228               * > A character token
4229               * > A comment token
4230               * > A DOCTYPE token
4231               */
4232              case '#text':
4233              case '#comment':
4234              case '#funky-comment':
4235              case '#presumptuous-tag':
4236              case 'html':
4237                  return $this->step_in_body();
4238  
4239              /*
4240               * > A start tag whose tag name is one of: "base", "basefont", "bgsound", "link",
4241               * > "meta", "noframes", "script", "style", "template", "title"
4242               * > An end tag whose tag name is "template"
4243               */
4244              case '+BASE':
4245              case '+BASEFONT':
4246              case '+BGSOUND':
4247              case '+LINK':
4248              case '+META':
4249              case '+NOFRAMES':
4250              case '+SCRIPT':
4251              case '+STYLE':
4252              case '+TEMPLATE':
4253              case '+TITLE':
4254              case '-TEMPLATE':
4255                  return $this->step_in_head();
4256  
4257              /*
4258               * > A start tag whose tag name is one of: "caption", "colgroup", "tbody", "tfoot", "thead"
4259               */
4260              case '+CAPTION':
4261              case '+COLGROUP':
4262              case '+TBODY':
4263              case '+TFOOT':
4264              case '+THEAD':
4265                  array_pop( $this->state->stack_of_template_insertion_modes );
4266                  $this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
4267                  $this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
4268                  return $this->step( self::REPROCESS_CURRENT_NODE );
4269  
4270              /*
4271               * > A start tag whose tag name is "col"
4272               */
4273              case '+COL':
4274                  array_pop( $this->state->stack_of_template_insertion_modes );
4275                  $this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP;
4276                  $this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP;
4277                  return $this->step( self::REPROCESS_CURRENT_NODE );
4278  
4279              /*
4280               * > A start tag whose tag name is "tr"
4281               */
4282              case '+TR':
4283                  array_pop( $this->state->stack_of_template_insertion_modes );
4284                  $this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
4285                  $this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
4286                  return $this->step( self::REPROCESS_CURRENT_NODE );
4287  
4288              /*
4289               * > A start tag whose tag name is one of: "td", "th"
4290               */
4291              case '+TD':
4292              case '+TH':
4293                  array_pop( $this->state->stack_of_template_insertion_modes );
4294                  $this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
4295                  $this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
4296                  return $this->step( self::REPROCESS_CURRENT_NODE );
4297          }
4298  
4299          /*
4300           * > Any other start tag
4301           */
4302          if ( ! $is_closer ) {
4303              array_pop( $this->state->stack_of_template_insertion_modes );
4304              $this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
4305              $this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
4306              return $this->step( self::REPROCESS_CURRENT_NODE );
4307          }
4308  
4309          /*
4310           * > Any other end tag
4311           */
4312          if ( $is_closer ) {
4313              // Parse error: ignore the token.
4314              return $this->step();
4315          }
4316  
4317          /*
4318           * > An end-of-file token
4319           */
4320          if ( ! $this->state->stack_of_open_elements->contains( 'TEMPLATE' ) ) {
4321              // Stop parsing.
4322              return false;
4323          }
4324  
4325          // @todo Indicate a parse error once it's possible.
4326          $this->state->stack_of_open_elements->pop_until( 'TEMPLATE' );
4327          $this->state->active_formatting_elements->clear_up_to_last_marker();
4328          array_pop( $this->state->stack_of_template_insertion_modes );
4329          $this->reset_insertion_mode_appropriately();
4330          return $this->step( self::REPROCESS_CURRENT_NODE );
4331      }
4332  
4333      /**
4334       * Parses next element in the 'after body' insertion mode.
4335       *
4336       * This internal function performs the 'after body' insertion mode
4337       * logic for the generalized WP_HTML_Processor::step() function.
4338       *
4339       * @since 6.7.0 Stub implementation.
4340       *
4341       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4342       *
4343       * @see https://html.spec.whatwg.org/#parsing-main-afterbody
4344       * @see WP_HTML_Processor::step
4345       *
4346       * @return bool Whether an element was found.
4347       */
4348  	private function step_after_body(): bool {
4349          $tag_name   = $this->get_token_name();
4350          $token_type = $this->get_token_type();
4351          $op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4352          $op         = "{$op_sigil}{$tag_name}";
4353  
4354          switch ( $op ) {
4355              /*
4356               * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
4357               * >   U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
4358               *
4359               * > Process the token using the rules for the "in body" insertion mode.
4360               */
4361              case '#text':
4362                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
4363                      return $this->step_in_body();
4364                  }
4365                  goto after_body_anything_else;
4366                  break;
4367  
4368              /*
4369               * > A comment token
4370               */
4371              case '#comment':
4372              case '#funky-comment':
4373              case '#presumptuous-tag':
4374                  $this->bail( 'Content outside of BODY is unsupported.' );
4375                  break;
4376  
4377              /*
4378               * > A DOCTYPE token
4379               */
4380              case 'html':
4381                  // Parse error: ignore the token.
4382                  return $this->step();
4383  
4384              /*
4385               * > A start tag whose tag name is "html"
4386               */
4387              case '+HTML':
4388                  return $this->step_in_body();
4389  
4390              /*
4391               * > An end tag whose tag name is "html"
4392               *
4393               * > If the parser was created as part of the HTML fragment parsing algorithm,
4394               * > this is a parse error; ignore the token. (fragment case)
4395               * >
4396               * > Otherwise, switch the insertion mode to "after after body".
4397               */
4398              case '-HTML':
4399                  if ( isset( $this->context_node ) ) {
4400                      return $this->step();
4401                  }
4402  
4403                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_BODY;
4404                  /*
4405                   * The HTML element is not removed from the stack of open elements.
4406                   * Only internal state has changed; this does not qualify as a "step"
4407                   * in terms of advancing through the document to another token.
4408                   * Nothing has been pushed or popped.
4409                   * Proceed to parse the next item.
4410                   */
4411                  return $this->step();
4412          }
4413  
4414          /*
4415           * > Parse error. Switch the insertion mode to "in body" and reprocess the token.
4416           */
4417          after_body_anything_else:
4418          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
4419          return $this->step( self::REPROCESS_CURRENT_NODE );
4420      }
4421  
4422      /**
4423       * Parses next element in the 'in frameset' insertion mode.
4424       *
4425       * This internal function performs the 'in frameset' insertion mode
4426       * logic for the generalized WP_HTML_Processor::step() function.
4427       *
4428       * @since 6.7.0 Stub implementation.
4429       *
4430       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4431       *
4432       * @see https://html.spec.whatwg.org/#parsing-main-inframeset
4433       * @see WP_HTML_Processor::step
4434       *
4435       * @return bool Whether an element was found.
4436       */
4437  	private function step_in_frameset(): bool {
4438          $tag_name   = $this->get_token_name();
4439          $token_type = $this->get_token_type();
4440          $op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4441          $op         = "{$op_sigil}{$tag_name}";
4442  
4443          switch ( $op ) {
4444              /*
4445               * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
4446               * >   U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
4447               * >
4448               * > Insert the character.
4449               *
4450               * This algorithm effectively strips non-whitespace characters from text and inserts
4451               * only the remaining whitespace. This is not supported at this time.
4452               */
4453              case '#text':
4454                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
4455                      return $this->step_in_body();
4456                  }
4457                  $this->bail( 'Non-whitespace characters cannot be handled in frameset.' );
4458                  break;
4459  
4460              /*
4461               * > A comment token
4462               */
4463              case '#comment':
4464              case '#funky-comment':
4465              case '#presumptuous-tag':
4466                  $this->insert_html_element( $this->state->current_token );
4467                  return true;
4468  
4469              /*
4470               * > A DOCTYPE token
4471               */
4472              case 'html':
4473                  // Parse error: ignore the token.
4474                  return $this->step();
4475  
4476              /*
4477               * > A start tag whose tag name is "html"
4478               */
4479              case '+HTML':
4480                  return $this->step_in_body();
4481  
4482              /*
4483               * > A start tag whose tag name is "frameset"
4484               */
4485              case '+FRAMESET':
4486                  $this->insert_html_element( $this->state->current_token );
4487                  return true;
4488  
4489              /*
4490               * > An end tag whose tag name is "frameset"
4491               */
4492              case '-FRAMESET':
4493                  /*
4494                   * > If the current node is the root html element, then this is a parse error;
4495                   * > ignore the token. (fragment case)
4496                   */
4497                  if ( $this->state->stack_of_open_elements->current_node_is( 'HTML' ) ) {
4498                      return $this->step();
4499                  }
4500  
4501                  /*
4502                   * > Otherwise, pop the current node from the stack of open elements.
4503                   */
4504                  $this->state->stack_of_open_elements->pop();
4505  
4506                  /*
4507                   * > If the parser was not created as part of the HTML fragment parsing algorithm
4508                   * > (fragment case), and the current node is no longer a frameset element, then
4509                   * > switch the insertion mode to "after frameset".
4510                   */
4511                  if ( ! isset( $this->context_node ) && ! $this->state->stack_of_open_elements->current_node_is( 'FRAMESET' ) ) {
4512                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_FRAMESET;
4513                  }
4514  
4515                  return true;
4516  
4517              /*
4518               * > A start tag whose tag name is "frame"
4519               *
4520               * > Insert an HTML element for the token. Immediately pop the
4521               * > current node off the stack of open elements.
4522               * >
4523               * > Acknowledge the token's self-closing flag, if it is set.
4524               */
4525              case '+FRAME':
4526                  $this->insert_html_element( $this->state->current_token );
4527                  $this->state->stack_of_open_elements->pop();
4528                  return true;
4529  
4530              /*
4531               * > A start tag whose tag name is "noframes"
4532               */
4533              case '+NOFRAMES':
4534                  return $this->step_in_head();
4535          }
4536  
4537          // Parse error: ignore the token.
4538          return $this->step();
4539      }
4540  
4541      /**
4542       * Parses next element in the 'after frameset' insertion mode.
4543       *
4544       * This internal function performs the 'after frameset' insertion mode
4545       * logic for the generalized WP_HTML_Processor::step() function.
4546       *
4547       * @since 6.7.0 Stub implementation.
4548       *
4549       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4550       *
4551       * @see https://html.spec.whatwg.org/#parsing-main-afterframeset
4552       * @see WP_HTML_Processor::step
4553       *
4554       * @return bool Whether an element was found.
4555       */
4556  	private function step_after_frameset(): bool {
4557          $tag_name   = $this->get_token_name();
4558          $token_type = $this->get_token_type();
4559          $op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4560          $op         = "{$op_sigil}{$tag_name}";
4561  
4562          switch ( $op ) {
4563              /*
4564               * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
4565               * >   U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
4566               * >
4567               * > Insert the character.
4568               *
4569               * This algorithm effectively strips non-whitespace characters from text and inserts
4570               * only the remaining whitespace under HTML. This is not supported at this time.
4571               */
4572              case '#text':
4573                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
4574                      return $this->step_in_body();
4575                  }
4576                  $this->bail( 'Non-whitespace characters cannot be handled in after frameset.' );
4577                  break;
4578  
4579              /*
4580               * > A comment token
4581               */
4582              case '#comment':
4583              case '#funky-comment':
4584              case '#presumptuous-tag':
4585                  $this->insert_html_element( $this->state->current_token );
4586                  return true;
4587  
4588              /*
4589               * > A DOCTYPE token
4590               */
4591              case 'html':
4592                  // Parse error: ignore the token.
4593                  return $this->step();
4594  
4595              /*
4596               * > A start tag whose tag name is "html"
4597               */
4598              case '+HTML':
4599                  return $this->step_in_body();
4600  
4601              /*
4602               * > An end tag whose tag name is "html"
4603               */
4604              case '-HTML':
4605                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_FRAMESET;
4606                  /*
4607                   * The HTML element is not removed from the stack of open elements.
4608                   * Only internal state has changed; this does not qualify as a "step"
4609                   * in terms of advancing through the document to another token.
4610                   * Nothing has been pushed or popped.
4611                   * Proceed to parse the next item.
4612                   */
4613                  return $this->step();
4614  
4615              /*
4616               * > A start tag whose tag name is "noframes"
4617               */
4618              case '+NOFRAMES':
4619                  return $this->step_in_head();
4620          }
4621  
4622          // Parse error: ignore the token.
4623          return $this->step();
4624      }
4625  
4626      /**
4627       * Parses next element in the 'after after body' insertion mode.
4628       *
4629       * This internal function performs the 'after after body' insertion mode
4630       * logic for the generalized WP_HTML_Processor::step() function.
4631       *
4632       * @since 6.7.0 Stub implementation.
4633       *
4634       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4635       *
4636       * @see https://html.spec.whatwg.org/#the-after-after-body-insertion-mode
4637       * @see WP_HTML_Processor::step
4638       *
4639       * @return bool Whether an element was found.
4640       */
4641  	private function step_after_after_body(): bool {
4642          $tag_name   = $this->get_token_name();
4643          $token_type = $this->get_token_type();
4644          $op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4645          $op         = "{$op_sigil}{$tag_name}";
4646  
4647          switch ( $op ) {
4648              /*
4649               * > A comment token
4650               */
4651              case '#comment':
4652              case '#funky-comment':
4653              case '#presumptuous-tag':
4654                  $this->bail( 'Content outside of HTML is unsupported.' );
4655                  break;
4656  
4657              /*
4658               * > A DOCTYPE token
4659               * > A start tag whose tag name is "html"
4660               *
4661               * > Process the token using the rules for the "in body" insertion mode.
4662               */
4663              case 'html':
4664              case '+HTML':
4665                  return $this->step_in_body();
4666  
4667              /*
4668               * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
4669               * >   U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
4670               * >
4671               * > Process the token using the rules for the "in body" insertion mode.
4672               */
4673              case '#text':
4674                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
4675                      return $this->step_in_body();
4676                  }
4677                  goto after_after_body_anything_else;
4678                  break;
4679          }
4680  
4681          /*
4682           * > Parse error. Switch the insertion mode to "in body" and reprocess the token.
4683           */
4684          after_after_body_anything_else:
4685          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
4686          return $this->step( self::REPROCESS_CURRENT_NODE );
4687      }
4688  
4689      /**
4690       * Parses next element in the 'after after frameset' insertion mode.
4691       *
4692       * This internal function performs the 'after after frameset' insertion mode
4693       * logic for the generalized WP_HTML_Processor::step() function.
4694       *
4695       * @since 6.7.0 Stub implementation.
4696       *
4697       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4698       *
4699       * @see https://html.spec.whatwg.org/#the-after-after-frameset-insertion-mode
4700       * @see WP_HTML_Processor::step
4701       *
4702       * @return bool Whether an element was found.
4703       */
4704  	private function step_after_after_frameset(): bool {
4705          $tag_name   = $this->get_token_name();
4706          $token_type = $this->get_token_type();
4707          $op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4708          $op         = "{$op_sigil}{$tag_name}";
4709  
4710          switch ( $op ) {
4711              /*
4712               * > A comment token
4713               */
4714              case '#comment':
4715              case '#funky-comment':
4716              case '#presumptuous-tag':
4717                  $this->bail( 'Content outside of HTML is unsupported.' );
4718                  break;
4719  
4720              /*
4721               * > A DOCTYPE token
4722               * > A start tag whose tag name is "html"
4723               *
4724               * > Process the token using the rules for the "in body" insertion mode.
4725               */
4726              case 'html':
4727              case '+HTML':
4728                  return $this->step_in_body();
4729  
4730              /*
4731               * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
4732               * >   U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
4733               * >
4734               * > Process the token using the rules for the "in body" insertion mode.
4735               *
4736               * This algorithm effectively strips non-whitespace characters from text and inserts
4737               * only the remaining whitespace under HTML. This is not supported at this time.
4738               */
4739              case '#text':
4740                  if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
4741                      return $this->step_in_body();
4742                  }
4743                  $this->bail( 'Non-whitespace characters cannot be handled in after after frameset.' );
4744                  break;
4745  
4746              /*
4747               * > A start tag whose tag name is "noframes"
4748               */
4749              case '+NOFRAMES':
4750                  return $this->step_in_head();
4751          }
4752  
4753          // Parse error: ignore the token.
4754          return $this->step();
4755      }
4756  
4757      /**
4758       * Parses next element in the 'in foreign content' insertion mode.
4759       *
4760       * This internal function performs the 'in foreign content' insertion mode
4761       * logic for the generalized WP_HTML_Processor::step() function.
4762       *
4763       * @since 6.7.0 Stub implementation.
4764       *
4765       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4766       *
4767       * @see https://html.spec.whatwg.org/#parsing-main-inforeign
4768       * @see WP_HTML_Processor::step
4769       *
4770       * @return bool Whether an element was found.
4771       */
4772  	private function step_in_foreign_content(): bool {
4773          $tag_name   = $this->get_token_name();
4774          $token_type = $this->get_token_type();
4775          $op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4776          $op         = "{$op_sigil}{$tag_name}";
4777  
4778          /*
4779           * > A start tag whose name is "font", if the token has any attributes named "color", "face", or "size"
4780           *
4781           * This section is drawn out above the switch to more easily incorporate
4782           * the additional rules based on the presence of the attributes.
4783           */
4784          if (
4785              '+FONT' === $op &&
4786              (
4787                  null !== $this->get_attribute( 'color' ) ||
4788                  null !== $this->get_attribute( 'face' ) ||
4789                  null !== $this->get_attribute( 'size' )
4790              )
4791          ) {
4792              $op = '+FONT with attributes';
4793          }
4794  
4795          switch ( $op ) {
4796              case '#text':
4797                  /*
4798                   * > A character token that is U+0000 NULL
4799                   *
4800                   * This is handled by `get_modifiable_text()`.
4801                   */
4802  
4803                  /*
4804                   * Whitespace-only text does not affect the frameset-ok flag.
4805                   * It is probably inter-element whitespace, but it may also
4806                   * contain character references which decode only to whitespace.
4807                   */
4808                  if ( parent::TEXT_IS_GENERIC === $this->text_node_classification ) {
4809                      $this->state->frameset_ok = false;
4810                  }
4811  
4812                  $this->insert_foreign_element( $this->state->current_token, false );
4813                  return true;
4814  
4815              /*
4816               * CDATA sections are alternate wrappers for text content and therefore
4817               * ought to follow the same rules as text nodes.
4818               */
4819              case '#cdata-section':
4820                  /*
4821                   * NULL bytes and whitespace do not change the frameset-ok flag.
4822                   */
4823                  $current_token        = $this->bookmarks[ $this->state->current_token->bookmark_name ];
4824                  $cdata_content_start  = $current_token->start + 9;
4825                  $cdata_content_length = $current_token->length - 12;
4826                  if ( strspn( $this->html, "\0 \t\n\f\r", $cdata_content_start, $cdata_content_length ) !== $cdata_content_length ) {
4827                      $this->state->frameset_ok = false;
4828                  }
4829  
4830                  $this->insert_foreign_element( $this->state->current_token, false );
4831                  return true;
4832  
4833              /*
4834               * > A comment token
4835               */
4836              case '#comment':
4837              case '#funky-comment':
4838              case '#presumptuous-tag':
4839                  $this->insert_foreign_element( $this->state->current_token, false );
4840                  return true;
4841  
4842              /*
4843               * > A DOCTYPE token
4844               */
4845              case 'html':
4846                  // Parse error: ignore the token.
4847                  return $this->step();
4848  
4849              /*
4850               * > A start tag whose tag name is "b", "big", "blockquote", "body", "br", "center",
4851               * > "code", "dd", "div", "dl", "dt", "em", "embed", "h1", "h2", "h3", "h4", "h5",
4852               * > "h6", "head", "hr", "i", "img", "li", "listing", "menu", "meta", "nobr", "ol",
4853               * > "p", "pre", "ruby", "s", "small", "span", "strong", "strike", "sub", "sup",
4854               * > "table", "tt", "u", "ul", "var"
4855               *
4856               * > A start tag whose name is "font", if the token has any attributes named "color", "face", or "size"
4857               *
4858               * > An end tag whose tag name is "br", "p"
4859               *
4860               * Closing BR tags are always reported by the Tag Processor as opening tags.
4861               */
4862              case '+B':
4863              case '+BIG':
4864              case '+BLOCKQUOTE':
4865              case '+BODY':
4866              case '+BR':
4867              case '+CENTER':
4868              case '+CODE':
4869              case '+DD':
4870              case '+DIV':
4871              case '+DL':
4872              case '+DT':
4873              case '+EM':
4874              case '+EMBED':
4875              case '+H1':
4876              case '+H2':
4877              case '+H3':
4878              case '+H4':
4879              case '+H5':
4880              case '+H6':
4881              case '+HEAD':
4882              case '+HR':
4883              case '+I':
4884              case '+IMG':
4885              case '+LI':
4886              case '+LISTING':
4887              case '+MENU':
4888              case '+META':
4889              case '+NOBR':
4890              case '+OL':
4891              case '+P':
4892              case '+PRE':
4893              case '+RUBY':
4894              case '+S':
4895              case '+SMALL':
4896              case '+SPAN':
4897              case '+STRONG':
4898              case '+STRIKE':
4899              case '+SUB':
4900              case '+SUP':
4901              case '+TABLE':
4902              case '+TT':
4903              case '+U':
4904              case '+UL':
4905              case '+VAR':
4906              case '+FONT with attributes':
4907              case '-BR':
4908              case '-P':
4909                  // @todo Indicate a parse error once it's possible.
4910                  foreach ( $this->state->stack_of_open_elements->walk_up() as $current_node ) {
4911                      if (
4912                          'math' === $current_node->integration_node_type ||
4913                          'html' === $current_node->integration_node_type ||
4914                          'html' === $current_node->namespace
4915                      ) {
4916                          break;
4917                      }
4918  
4919                      $this->state->stack_of_open_elements->pop();
4920                  }
4921                  goto in_foreign_content_process_in_current_insertion_mode;
4922          }
4923  
4924          /*
4925           * > Any other start tag
4926           */
4927          if ( ! $this->is_tag_closer() ) {
4928              $this->insert_foreign_element( $this->state->current_token, false );
4929  
4930              /*
4931               * > If the token has its self-closing flag set, then run
4932               * > the appropriate steps from the following list:
4933               * >
4934               * >   ↪ the token's tag name is "script", and the new current node is in the SVG namespace
4935               * >         Acknowledge the token's self-closing flag, and then act as
4936               * >         described in the steps for a "script" end tag below.
4937               * >
4938               * >   ↪ Otherwise
4939               * >         Pop the current node off the stack of open elements and
4940               * >         acknowledge the token's self-closing flag.
4941               *
4942               * The rules for SCRIPT below pop the element off of the stack of open elements,
4943               * exactly as the Otherwise condition does, so there's no need to separate these
4944               * checks. The two cases only differ when a parser operates with the scripting
4945               * flag enabled and executes the script, which this parser does not support.
4946               */
4947              if ( $this->state->current_token->has_self_closing_flag ) {
4948                  $this->state->stack_of_open_elements->pop();
4949              }
4950              return true;
4951          }
4952  
4953          /*
4954           * > An end tag whose name is "script", if the current node is an SVG script element.
4955           */
4956          if ( $this->is_tag_closer() && 'SCRIPT' === $this->state->current_token->node_name && 'svg' === $this->state->current_token->namespace ) {
4957              $this->state->stack_of_open_elements->pop();
4958              return true;
4959          }
4960  
4961          /*
4962           * > Any other end tag
4963           */
4964          if ( $this->is_tag_closer() ) {
4965              $node = $this->state->stack_of_open_elements->current_node();
4966              if ( $tag_name !== $node->node_name ) {
4967                  // @todo Indicate a parse error once it's possible.
4968              }
4969              in_foreign_content_end_tag_loop:
4970              if ( $node === $this->state->stack_of_open_elements->at( 1 ) ) {
4971                  return true;
4972              }
4973  
4974              /*
4975               * > If node's tag name, converted to ASCII lowercase, is the same as the tag name
4976               * > of the token, pop elements from the stack of open elements until node has
4977               * > been popped from the stack, and then return.
4978               */
4979              if ( 0 === strcasecmp( $node->node_name, $tag_name ) ) {
4980                  foreach ( $this->state->stack_of_open_elements->walk_up() as $item ) {
4981                      $this->state->stack_of_open_elements->pop();
4982                      if ( $node === $item ) {
4983                          return true;
4984                      }
4985                  }
4986              }
4987  
4988              foreach ( $this->state->stack_of_open_elements->walk_up( $node ) as $item ) {
4989                  $node = $item;
4990                  break;
4991              }
4992  
4993              if ( 'html' !== $node->namespace ) {
4994                  goto in_foreign_content_end_tag_loop;
4995              }
4996  
4997              in_foreign_content_process_in_current_insertion_mode:
4998              switch ( $this->state->insertion_mode ) {
4999                  case WP_HTML_Processor_State::INSERTION_MODE_INITIAL:
5000                      return $this->step_initial();
5001  
5002                  case WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HTML:
5003                      return $this->step_before_html();
5004  
5005                  case WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HEAD:
5006                      return $this->step_before_head();
5007  
5008                  case WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD:
5009                      return $this->step_in_head();
5010  
5011                  case WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD_NOSCRIPT:
5012                      return $this->step_in_head_noscript();
5013  
5014                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_HEAD:
5015                      return $this->step_after_head();
5016  
5017                  case WP_HTML_Processor_State::INSERTION_MODE_IN_BODY:
5018                      return $this->step_in_body();
5019  
5020                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE:
5021                      return $this->step_in_table();
5022  
5023                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_TEXT:
5024                      return $this->step_in_table_text();
5025  
5026                  case WP_HTML_Processor_State::INSERTION_MODE_IN_CAPTION:
5027                      return $this->step_in_caption();
5028  
5029                  case WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP:
5030                      return $this->step_in_column_group();
5031  
5032                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY:
5033                      return $this->step_in_table_body();
5034  
5035                  case WP_HTML_Processor_State::INSERTION_MODE_IN_ROW:
5036                      return $this->step_in_row();
5037  
5038                  case WP_HTML_Processor_State::INSERTION_MODE_IN_CELL:
5039                      return $this->step_in_cell();
5040  
5041                  case WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT:
5042                      return $this->step_in_select();
5043  
5044                  case WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT_IN_TABLE:
5045                      return $this->step_in_select_in_table();
5046  
5047                  case WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE:
5048                      return $this->step_in_template();
5049  
5050                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_BODY:
5051                      return $this->step_after_body();
5052  
5053                  case WP_HTML_Processor_State::INSERTION_MODE_IN_FRAMESET:
5054                      return $this->step_in_frameset();
5055  
5056                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_FRAMESET:
5057                      return $this->step_after_frameset();
5058  
5059                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_BODY:
5060                      return $this->step_after_after_body();
5061  
5062                  case WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_FRAMESET:
5063                      return $this->step_after_after_frameset();
5064  
5065                  // This should be unreachable but PHP doesn't have total type checking on switch.
5066                  default:
5067                      $this->bail( "Unaware of the requested parsing mode: '{$this->state->insertion_mode}'." );
5068              }
5069          }
5070  
5071          $this->bail( 'Should not have been able to reach end of IN FOREIGN CONTENT processing. Check HTML API code.' );
5072          // This unnecessary return prevents tools from inaccurately reporting type errors.
5073          return false;
5074      }
5075  
5076      /*
5077       * Internal helpers
5078       */
5079  
5080      /**
5081       * Creates a new bookmark for the currently-matched token and returns the generated name.
5082       *
5083       * @since 6.4.0
5084       * @since 6.5.0 Renamed from bookmark_tag() to bookmark_token().
5085       *
5086       * @throws Exception When unable to allocate requested bookmark.
5087       *
5088       * @return string|false Name of created bookmark, or false if unable to create.
5089       */
5090  	private function bookmark_token() {
5091          if ( ! parent::set_bookmark( ++$this->bookmark_counter ) ) {
5092              $this->last_error = self::ERROR_EXCEEDED_MAX_BOOKMARKS;
5093              throw new Exception( 'could not allocate bookmark' );
5094          }
5095  
5096          return "{$this->bookmark_counter}";
5097      }
5098  
5099      /*
5100       * HTML semantic overrides for Tag Processor
5101       */
5102  
5103      /**
5104       * Indicates the namespace of the current token, or "html" if there is none.
5105       *
5106       * @return string One of "html", "math", or "svg".
5107       */
5108  	public function get_namespace(): string {
5109          if ( ! isset( $this->current_element ) ) {
5110              return parent::get_namespace();
5111          }
5112  
5113          return $this->current_element->token->namespace;
5114      }
5115  
5116      /**
5117       * Returns the uppercase name of the matched tag.
5118       *
5119       * The semantic rules for HTML specify that certain tags be reprocessed
5120       * with a different tag name. Because of this, the tag name presented
5121       * by the HTML Processor may differ from the one reported by the HTML
5122       * Tag Processor, which doesn't apply these semantic rules.
5123       *
5124       * Example:
5125       *
5126       *     $processor = new WP_HTML_Tag_Processor( '<div class="test">Test</div>' );
5127       *     $processor->next_tag() === true;
5128       *     $processor->get_tag() === 'DIV';
5129       *
5130       *     $processor->next_tag() === false;
5131       *     $processor->get_tag() === null;
5132       *
5133       * @since 6.4.0
5134       *
5135       * @return string|null Name of currently matched tag in input HTML, or `null` if none found.
5136       */
5137  	public function get_tag(): ?string {
5138          if ( null !== $this->last_error ) {
5139              return null;
5140          }
5141  
5142          if ( $this->is_virtual() ) {
5143              return $this->current_element->token->node_name;
5144          }
5145  
5146          $tag_name = parent::get_tag();
5147  
5148          /*
5149           * > A start tag whose tag name is "image"
5150           * > Change the token's tag name to "img" and reprocess it. (Don't ask.)
5151           */
5152          return ( 'IMAGE' === $tag_name && 'html' === $this->get_namespace() )
5153              ? 'IMG'
5154              : $tag_name;
5155      }
5156  
5157      /**
5158       * Indicates if the currently matched tag contains the self-closing flag.
5159       *
5160       * No HTML elements ought to have the self-closing flag, and on HTML elements
5161       * the flag is ignored. For void elements this is benign because they "self close"
5162       * automatically. For non-void HTML elements, though, problems appear if someone
5163       * uses a self-closing tag in place of that element with an empty body.
5164       * For HTML foreign elements and custom elements the self-closing flag determines if
5165       * they self-close or not.
5166       *
5167       * This function does not determine if a tag is self-closing,
5168       * but only if the self-closing flag is present in the syntax.
5169       *
5170       * @since 6.6.0 Subclassed for the HTML Processor.
5171       *
5172       * @return bool Whether the currently matched tag contains the self-closing flag.
5173       */
5174  	public function has_self_closing_flag(): bool {
5175          return $this->is_virtual() ? false : parent::has_self_closing_flag();
5176      }
5177  
5178      /**
5179       * Returns the node name represented by the token.
5180       *
5181       * This matches the DOM API value `nodeName`. Some values
5182       * are static, such as `#text` for a text node, while others
5183       * are dynamically generated from the token itself.
5184       *
5185       * Dynamic names:
5186       *  - Uppercase tag name for tag matches.
5187       *  - `html` for DOCTYPE declarations.
5188       *
5189       * Note that if the Tag Processor is not matched on a token
5190       * then this function will return `null`, either because it
5191       * hasn't yet found a token or because it reached the end
5192       * of the document without matching a token.
5193       *
5194       * @since 6.6.0 Subclassed for the HTML Processor.
5195       *
5196       * @return string|null Name of the matched token.
5197       */
5198  	public function get_token_name(): ?string {
5199          return $this->is_virtual()
5200              ? $this->current_element->token->node_name
5201              : parent::get_token_name();
5202      }
5203  
5204      /**
5205       * Indicates the kind of matched token, if any.
5206       *
5207       * This differs from `get_token_name()` in that it always
5208       * returns a static string indicating the type, whereas
5209       * `get_token_name()` may return values derived from the
5210       * token itself, such as a tag name or processing
5211       * instruction tag.
5212       *
5213       * Possible values:
5214       *  - `#tag` when matched on a tag.
5215       *  - `#text` when matched on a text node.
5216       *  - `#cdata-section` when matched on a CDATA node.
5217       *  - `#comment` when matched on a comment.
5218       *  - `#doctype` when matched on a DOCTYPE declaration.
5219       *  - `#presumptuous-tag` when matched on an empty tag closer.
5220       *  - `#funky-comment` when matched on a funky comment.
5221       *
5222       * @since 6.6.0 Subclassed for the HTML Processor.
5223       *
5224       * @return string|null What kind of token is matched, or null.
5225       */
5226  	public function get_token_type(): ?string {
5227          if ( $this->is_virtual() ) {
5228              /*
5229               * This logic comes from the Tag Processor.
5230               *
5231               * @todo It would be ideal not to repeat this here, but it's not clearly
5232               *       better to allow passing a token name to `get_token_type()`.
5233               */
5234              $node_name     = $this->current_element->token->node_name;
5235              $starting_char = $node_name[0];
5236              if ( 'A' <= $starting_char && 'Z' >= $starting_char ) {
5237                  return '#tag';
5238              }
5239  
5240              if ( 'html' === $node_name ) {
5241                  return '#doctype';
5242              }
5243  
5244              return $node_name;
5245          }
5246  
5247          return parent::get_token_type();
5248      }
5249  
5250      /**
5251       * Returns the value of a requested attribute from a matched tag opener if that attribute exists.
5252       *
5253       * Example:
5254       *
5255       *     $p = WP_HTML_Processor::create_fragment( '<div enabled class="test" data-test-id="14">Test</div>' );
5256       *     $p->next_token() === true;
5257       *     $p->get_attribute( 'data-test-id' ) === '14';
5258       *     $p->get_attribute( 'enabled' ) === true;
5259       *     $p->get_attribute( 'aria-label' ) === null;
5260       *
5261       *     $p->next_tag() === false;
5262       *     $p->get_attribute( 'class' ) === null;
5263       *
5264       * @since 6.6.0 Subclassed for HTML Processor.
5265       *
5266       * @param string $name Name of attribute whose value is requested.
5267       * @return string|true|null Value of attribute or `null` if not available. Boolean attributes return `true`.
5268       */
5269  	public function get_attribute( $name ) {
5270          return $this->is_virtual() ? null : parent::get_attribute( $name );
5271      }
5272  
5273      /**
5274       * Updates or creates a new attribute on the currently matched tag with the passed value.
5275       *
5276       * For boolean attributes special handling is provided:
5277       *  - When `true` is passed as the value, then only the attribute name is added to the tag.
5278       *  - When `false` is passed, the attribute gets removed if it existed before.
5279       *
5280       * For string attributes, the value is escaped using the `esc_attr` function.
5281       *
5282       * @since 6.6.0 Subclassed for the HTML Processor.
5283       *
5284       * @param string      $name  The attribute name to target.
5285       * @param string|bool $value The new attribute value.
5286       * @return bool Whether an attribute value was set.
5287       */
5288  	public function set_attribute( $name, $value ): bool {
5289          return $this->is_virtual() ? false : parent::set_attribute( $name, $value );
5290      }
5291  
5292      /**
5293       * Removes an attribute from the currently-matched tag.
5294       *
5295       * @since 6.6.0 Subclassed for HTML Processor.
5296       *
5297       * @param string $name The attribute name to remove.
5298       * @return bool Whether an attribute was removed.
5299       */
5300  	public function remove_attribute( $name ): bool {
5301          return $this->is_virtual() ? false : parent::remove_attribute( $name );
5302      }
5303  
5304      /**
5305       * Gets lowercase names of all attributes matching a given prefix in the current tag.
5306       *
5307       * Note that matching is case-insensitive. This is in accordance with the spec:
5308       *
5309       * > There must never be two or more attributes on
5310       * > the same start tag whose names are an ASCII
5311       * > case-insensitive match for each other.
5312       *     - HTML 5 spec
5313       *
5314       * Example:
5315       *
5316       *     $p = new WP_HTML_Tag_Processor( '<div data-ENABLED class="test" DATA-test-id="14">Test</div>' );
5317       *     $p->next_tag( array( 'class_name' => 'test' ) ) === true;
5318       *     $p->get_attribute_names_with_prefix( 'data-' ) === array( 'data-enabled', 'data-test-id' );
5319       *
5320       *     $p->next_tag() === false;
5321       *     $p->get_attribute_names_with_prefix( 'data-' ) === null;
5322       *
5323       * @since 6.6.0 Subclassed for the HTML Processor.
5324       *
5325       * @see https://html.spec.whatwg.org/multipage/syntax.html#attributes-2:ascii-case-insensitive
5326       *
5327       * @param string $prefix Prefix of requested attribute names.
5328       * @return array|null List of attribute names, or `null` when no tag opener is matched.
5329       */
5330  	public function get_attribute_names_with_prefix( $prefix ): ?array {
5331          return $this->is_virtual() ? null : parent::get_attribute_names_with_prefix( $prefix );
5332      }
5333  
5334      /**
5335       * Adds a new class name to the currently matched tag.
5336       *
5337       * @since 6.6.0 Subclassed for the HTML Processor.
5338       *
5339       * @param string $class_name The class name to add.
5340       * @return bool Whether the class was set to be added.
5341       */
5342  	public function add_class( $class_name ): bool {
5343          return $this->is_virtual() ? false : parent::add_class( $class_name );
5344      }
5345  
5346      /**
5347       * Removes a class name from the currently matched tag.
5348       *
5349       * @since 6.6.0 Subclassed for the HTML Processor.
5350       *
5351       * @param string $class_name The class name to remove.
5352       * @return bool Whether the class was set to be removed.
5353       */
5354  	public function remove_class( $class_name ): bool {
5355          return $this->is_virtual() ? false : parent::remove_class( $class_name );
5356      }
5357  
5358      /**
5359       * Returns whether the matched tag contains the given ASCII case-insensitive class name.
5360       *
5361       * @since 6.6.0 Subclassed for the HTML Processor.
5362       *
5363       * @todo When reconstructing active formatting elements with attributes, find a way
5364       *       to indicate if the virtually-reconstructed formatting elements contain the
5365       *       wanted class name.
5366       *
5367       * @param string $wanted_class Look for this CSS class name, ASCII case-insensitive.
5368       * @return bool|null Whether the matched tag contains the given class name, or null if not matched.
5369       */
5370  	public function has_class( $wanted_class ): ?bool {
5371          return $this->is_virtual() ? null : parent::has_class( $wanted_class );
5372      }
5373  
5374      /**
5375       * Generator for a foreach loop to step through each class name for the matched tag.
5376       *
5377       * This generator function is designed to be used inside a "foreach" loop.
5378       *
5379       * Example:
5380       *
5381       *     $p = WP_HTML_Processor::create_fragment( "<div class='free &lt;egg&gt;\tlang-en'>" );
5382       *     $p->next_tag();
5383       *     foreach ( $p->class_list() as $class_name ) {
5384       *         echo "{$class_name} ";
5385       *     }
5386       *     // Outputs: "free <egg> lang-en "
5387       *
5388       * @since 6.6.0 Subclassed for the HTML Processor.
5389       */
5390  	public function class_list() {
5391          return $this->is_virtual() ? null : parent::class_list();
5392      }
5393  
5394      /**
5395       * Returns the modifiable text for a matched token, or an empty string.
5396       *
5397       * Modifiable text is text content that may be read and changed without
5398       * changing the HTML structure of the document around it. This includes
5399       * the contents of `#text` nodes in the HTML as well as the inner
5400       * contents of HTML comments, Processing Instructions, and others, even
5401       * though these nodes aren't part of a parsed DOM tree. It also includes
5402       * the contents of SCRIPT and STYLE tags, of TEXTAREA tags, and of any
5403       * other section in an HTML document which cannot contain HTML markup (DATA).
5404       *
5405       * If a token has no modifiable text then an empty string is returned to
5406       * avoid needless crashing or type errors. An empty string does not mean
5407       * that a token has modifiable text, and a token with modifiable text may
5408       * have an empty string (e.g. a comment with no contents).
5409       *
5410       * @since 6.6.0 Subclassed for the HTML Processor.
5411       *
5412       * @return string
5413       */
5414  	public function get_modifiable_text(): string {
5415          return $this->is_virtual() ? '' : parent::get_modifiable_text();
5416      }
5417  
5418      /**
5419       * Indicates what kind of comment produced the comment node.
5420       *
5421       * Because there are different kinds of HTML syntax which produce
5422       * comments, the Tag Processor tracks and exposes this as a type
5423       * for the comment. Nominally only regular HTML comments exist as
5424       * they are commonly known, but a number of unrelated syntax errors
5425       * also produce comments.
5426       *
5427       * @see self::COMMENT_AS_ABRUPTLY_CLOSED_COMMENT
5428       * @see self::COMMENT_AS_CDATA_LOOKALIKE
5429       * @see self::COMMENT_AS_INVALID_HTML
5430       * @see self::COMMENT_AS_HTML_COMMENT
5431       * @see self::COMMENT_AS_PI_NODE_LOOKALIKE
5432       *
5433       * @since 6.6.0 Subclassed for the HTML Processor.
5434       *
5435       * @return string|null
5436       */
5437  	public function get_comment_type(): ?string {
5438          return $this->is_virtual() ? null : parent::get_comment_type();
5439      }
5440  
5441      /**
5442       * Removes a bookmark that is no longer needed.
5443       *
5444       * Releasing a bookmark frees up the small
5445       * performance overhead it requires.
5446       *
5447       * @since 6.4.0
5448       *
5449       * @param string $bookmark_name Name of the bookmark to remove.
5450       * @return bool Whether the bookmark already existed before removal.
5451       */
5452  	public function release_bookmark( $bookmark_name ): bool {
5453          return parent::release_bookmark( "_{$bookmark_name}" );
5454      }
5455  
5456      /**
5457       * Moves the internal cursor in the HTML Processor to a given bookmark's location.
5458       *
5459       * Be careful! Seeking backwards to a previous location resets the parser to the
5460       * start of the document and reparses the entire contents up until it finds the
5461       * sought-after bookmarked location.
5462       *
5463       * In order to prevent accidental infinite loops, there's a
5464       * maximum limit on the number of times seek() can be called.
5465       *
5466       * @throws Exception When unable to allocate a bookmark for the next token in the input HTML document.
5467       *
5468       * @since 6.4.0
5469       *
5470       * @param string $bookmark_name Jump to the place in the document identified by this bookmark name.
5471       * @return bool Whether the internal cursor was successfully moved to the bookmark's location.
5472       */
5473  	public function seek( $bookmark_name ): bool {
5474          // Flush any pending updates to the document before beginning.
5475          $this->get_updated_html();
5476  
5477          $actual_bookmark_name = "_{$bookmark_name}";
5478          $processor_started_at = $this->state->current_token
5479              ? $this->bookmarks[ $this->state->current_token->bookmark_name ]->start
5480              : 0;
5481          $bookmark_starts_at   = $this->bookmarks[ $actual_bookmark_name ]->start;
5482          $direction            = $bookmark_starts_at > $processor_started_at ? 'forward' : 'backward';
5483  
5484          /*
5485           * If seeking backwards, it's possible that the sought-after bookmark exists within an element
5486           * which has been closed before the current cursor; in other words, it has already been removed
5487           * from the stack of open elements. This means that it's insufficient to simply pop off elements
5488           * from the stack of open elements which appear after the bookmarked location and then jump to
5489           * that location, as the elements which were open before won't be re-opened.
5490           *
5491           * In order to maintain consistency, the HTML Processor rewinds to the start of the document
5492           * and reparses everything until it finds the sought-after bookmark.
5493           *
5494           * There are potentially better ways to do this: cache the parser state for each bookmark and
5495           * restore it when seeking; store an immutable and idempotent register of where elements open
5496           * and close.
5497           *
5498           * If caching the parser state it will be essential to properly maintain the cached stack of
5499           * open elements and active formatting elements when modifying the document. This could be a
5500           * tedious and time-consuming process as well, and so for now will not be performed.
5501           *
5502           * It may be possible to track bookmarks for where elements open and close, and in doing so
5503           * be able to quickly recalculate breadcrumbs for any element in the document. It may even
5504           * be possible to remove the stack of open elements and compute it on the fly this way.
5505           * If doing this, the parser would need to track the opening and closing locations for all
5506           * tokens in the breadcrumb path for any and all bookmarks. By utilizing bookmarks themselves
5507           * this list could be automatically maintained while modifying the document. Finding the
5508           * breadcrumbs would then amount to traversing that list from the start until the token
5509           * being inspected. Once an element closes, if there are no bookmarks pointing to locations
5510           * within that element, then all of these locations may be forgotten to save on memory use
5511           * and computation time.
5512           */
5513          if ( 'backward' === $direction ) {
5514  
5515              /*
5516               * When moving backward, stateful stacks should be cleared.
5517               */
5518              foreach ( $this->state->stack_of_open_elements->walk_up() as $item ) {
5519                  $this->state->stack_of_open_elements->remove_node( $item );
5520              }
5521  
5522              foreach ( $this->state->active_formatting_elements->walk_up() as $item ) {
5523                  $this->state->active_formatting_elements->remove_node( $item );
5524              }
5525  
5526              /*
5527               * **After** clearing stacks, more processor state can be reset.
5528               * This must be done after clearing the stack because those stacks generate events that
5529               * would appear on a subsequent call to `next_token()`.
5530               */
5531              $this->state->frameset_ok                       = true;
5532              $this->state->stack_of_template_insertion_modes = array();
5533              $this->state->head_element                      = null;
5534              $this->state->form_element                      = null;
5535              $this->state->current_token                     = null;
5536              $this->current_element                          = null;
5537              $this->element_queue                            = array();
5538  
5539              /*
5540               * The absence of a context node indicates a full parse.
5541               * The presence of a context node indicates a fragment parser.
5542               */
5543              if ( null === $this->context_node ) {
5544                  $this->change_parsing_namespace( 'html' );
5545                  $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_INITIAL;
5546                  $this->breadcrumbs           = array();
5547  
5548                  $this->bookmarks['initial'] = new WP_HTML_Span( 0, 0 );
5549                  parent::seek( 'initial' );
5550                  unset( $this->bookmarks['initial'] );
5551              } else {
5552  
5553                  /*
5554                   * Push the root-node (HTML) back onto the stack of open elements.
5555                   *
5556                   * Fragment parsers require this extra bit of setup.
5557                   * It's handled in full parsers by advancing the processor state.
5558                   */
5559                  $this->state->stack_of_open_elements->push(
5560                      new WP_HTML_Token(
5561                          'root-node',
5562                          'HTML',
5563                          false
5564                      )
5565                  );
5566  
5567                  $this->change_parsing_namespace(
5568                      $this->context_node->integration_node_type
5569                          ? 'html'
5570                          : $this->context_node->namespace
5571                  );
5572  
5573                  if ( 'TEMPLATE' === $this->context_node->node_name ) {
5574                      $this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE;
5575                  }
5576  
5577                  $this->reset_insertion_mode_appropriately();
5578                  $this->breadcrumbs = array_slice( $this->breadcrumbs, 0, 2 );
5579                  parent::seek( $this->context_node->bookmark_name );
5580              }
5581          }
5582  
5583          /*
5584           * Here, the processor moves forward through the document until it matches the bookmark.
5585           * do-while is used here because the processor is expected to already be stopped on
5586           * a token that may match the bookmarked location.
5587           */
5588          do {
5589              /*
5590               * The processor will stop on virtual tokens, but bookmarks may not be set on them.
5591               * They should not be matched when seeking a bookmark, so skip them.
5592               */
5593              if ( $this->is_virtual() ) {
5594                  continue;
5595              }
5596              if ( $bookmark_starts_at === $this->bookmarks[ $this->state->current_token->bookmark_name ]->start ) {
5597                  return true;
5598              }
5599          } while ( $this->next_token() );
5600  
5601          return false;
5602      }
5603  
5604      /**
5605       * Sets a bookmark in the HTML document.
5606       *
5607       * Bookmarks represent specific places or tokens in the HTML
5608       * document, such as a tag opener or closer. When applying
5609       * edits to a document, such as setting an attribute, the
5610       * text offsets of that token may shift; the bookmark is
5611       * kept updated with those shifts and remains stable unless
5612       * the entire span of text in which the token sits is removed.
5613       *
5614       * Release bookmarks when they are no longer needed.
5615       *
5616       * Example:
5617       *
5618       *     <main><h2>Surprising fact you may not know!</h2></main>
5619       *           ^  ^
5620       *            \-|-- this `H2` opener bookmark tracks the token
5621       *
5622       *     <main class="clickbait"><h2>Surprising fact you may no…
5623       *                             ^  ^
5624       *                              \-|-- it shifts with edits
5625       *
5626       * Bookmarks provide the ability to seek to a previously-scanned
5627       * place in the HTML document. This avoids the need to re-scan
5628       * the entire document.
5629       *
5630       * Example:
5631       *
5632       *     <ul><li>One</li><li>Two</li><li>Three</li></ul>
5633       *                                 ^^^^
5634       *                                 want to note this last item
5635       *
5636       *     $p = new WP_HTML_Tag_Processor( $html );
5637       *     $in_list = false;
5638       *     while ( $p->next_tag( array( 'tag_closers' => $in_list ? 'visit' : 'skip' ) ) ) {
5639       *         if ( 'UL' === $p->get_tag() ) {
5640       *             if ( $p->is_tag_closer() ) {
5641       *                 $in_list = false;
5642       *                 $p->set_bookmark( 'resume' );
5643       *                 if ( $p->seek( 'last-li' ) ) {
5644       *                     $p->add_class( 'last-li' );
5645       *                 }
5646       *                 $p->seek( 'resume' );
5647       *                 $p->release_bookmark( 'last-li' );
5648       *                 $p->release_bookmark( 'resume' );
5649       *             } else {
5650       *                 $in_list = true;
5651       *             }
5652       *         }
5653       *
5654       *         if ( 'LI' === $p->get_tag() ) {
5655       *             $p->set_bookmark( 'last-li' );
5656       *         }
5657       *     }
5658       *
5659       * Bookmarks intentionally hide the internal string offsets
5660       * to which they refer. They are maintained internally as
5661       * updates are applied to the HTML document and therefore
5662       * retain their "position" - the location to which they
5663       * originally pointed. The inability to use bookmarks with
5664       * functions like `substr` is therefore intentional to guard
5665       * against accidentally breaking the HTML.
5666       *
5667       * Because bookmarks allocate memory and require processing
5668       * for every applied update, they are limited and require
5669       * a name. They should not be created with programmatically-made
5670       * names, such as "li_{$index}" with some loop. As a general
5671       * rule they should only be created with string-literal names
5672       * like "start-of-section" or "last-paragraph".
5673       *
5674       * Bookmarks are a powerful tool to enable complicated behavior.
5675       * Consider double-checking that you need this tool if you are
5676       * reaching for it, as inappropriate use could lead to broken
5677       * HTML structure or unwanted processing overhead.
5678       *
5679       * Bookmarks cannot be set on tokens that do not appear in the original
5680       * HTML text. For example, scanning the HTML `<table><td>` stops at the
5681       * tags `TABLE`, `TBODY`, `TR`, and `TD`. The `TBODY` and `TR` tags do not
5682       * appear in the original HTML and cannot be bookmarked.
5683       *
5684       * @since 6.4.0
5685       *
5686       * @param string $bookmark_name Identifies this particular bookmark.
5687       * @return bool Whether the bookmark was successfully created.
5688       */
5689  	public function set_bookmark( $bookmark_name ): bool {
5690          if ( $this->is_virtual() ) {
5691              _doing_it_wrong(
5692                  __METHOD__,
5693                  __( 'Cannot set bookmarks on tokens that do not appear in the original HTML text.' ),
5694                  '6.8.0'
5695              );
5696              return false;
5697          }
5698          return parent::set_bookmark( "_{$bookmark_name}" );
5699      }
5700  
5701      /**
5702       * Checks whether a bookmark with the given name exists.
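           *
           * Example (a minimal sketch; the HTML and the bookmark name
           * are illustrative assumptions):
           *
           *     $processor = WP_HTML_Processor::create_fragment( '<p>Hello</p>' );
           *     if ( $processor->next_tag( 'P' ) && $processor->set_bookmark( 'first-p' ) ) {
           *         // The bookmark was just created, so this check should pass.
           *         $has_it = $processor->has_bookmark( 'first-p' );
           *         $processor->release_bookmark( 'first-p' );
           *     }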
5703       *
5704       * @since 6.5.0
5705       *
5706       * @param string $bookmark_name Name to identify a bookmark that potentially exists.
5707       * @return bool Whether that bookmark exists.
5708       */
5709  	public function has_bookmark( $bookmark_name ): bool {
5710          return parent::has_bookmark( "_{$bookmark_name}" );
5711      }
5712  
5713      /*
5714       * HTML Parsing Algorithms
5715       */
5716  
5717      /**
5718       * Closes a P element.
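           *
           * Example (a sketch of when this is expected to run, assuming the
           * "in body" rules for a `DIV` opener; the input is illustrative):
           *
           *     $processor = WP_HTML_Processor::create_fragment( '<p>One<div>Two</div>' );
           *     if ( $processor->next_tag( 'DIV' ) ) {
           *         // The open P should be closed before the DIV is inserted,
           *         // so P is not expected among the DIV's breadcrumbs.
           *         $breadcrumbs = $processor->get_breadcrumbs();
           *     }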
5719       *
5720       * @since 6.4.0
5721       *
5722       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
5723       *
5724       * @see https://html.spec.whatwg.org/#close-a-p-element
5725       */
5726  	private function close_a_p_element(): void {
5727          $this->generate_implied_end_tags( 'P' );
5728          $this->state->stack_of_open_elements->pop_until( 'P' );
5729      }
5730  
5731      /**
5732       * Closes elements that have implied end tags.
5733       *
5734       * @since 6.4.0
5735       * @since 6.7.0 Full spec support.
5736       *
5737       * @see https://html.spec.whatwg.org/#generate-implied-end-tags
5738       *
5739       * @param string|null $except_for_this_element Perform as if this element doesn't exist in the stack of open elements.
5740       */
5741  	private function generate_implied_end_tags( ?string $except_for_this_element = null ): void {
5742          $elements_with_implied_end_tags = array(
5743              'DD',
5744              'DT',
5745              'LI',
5746              'OPTGROUP',
5747              'OPTION',
5748              'P',
5749              'RB',
5750              'RP',
5751              'RT',
5752              'RTC',
5753          );
5754  
5755          $no_exclusions = ! isset( $except_for_this_element );
5756  
5757          while (
5758              ( $no_exclusions || ! $this->state->stack_of_open_elements->current_node_is( $except_for_this_element ) ) &&
5759              in_array( $this->state->stack_of_open_elements->current_node()->node_name, $elements_with_implied_end_tags, true )
5760          ) {
5761              $this->state->stack_of_open_elements->pop();
5762          }
5763      }
5764  
5765      /**
5766       * Closes elements that have implied end tags, thoroughly.
5767       *
5768       * See the HTML specification for an explanation of why this is
5769       * different from generating end tags in the normal sense.
5770       *
5771       * @since 6.4.0
5772       * @since 6.7.0 Full spec support.
5773       *
5774       * @see WP_HTML_Processor::generate_implied_end_tags
5775       * @see https://html.spec.whatwg.org/#generate-implied-end-tags
5776       */
5777  	private function generate_implied_end_tags_thoroughly(): void {
5778          $elements_with_implied_end_tags = array(
5779              'CAPTION',
5780              'COLGROUP',
5781              'DD',
5782              'DT',
5783              'LI',
5784              'OPTGROUP',
5785              'OPTION',
5786              'P',
5787              'RB',
5788              'RP',
5789              'RT',
5790              'RTC',
5791              'TBODY',
5792              'TD',
5793              'TFOOT',
5794              'TH',
5795              'THEAD',
5796              'TR',
5797          );
5798  
5799          while ( in_array( $this->state->stack_of_open_elements->current_node()->node_name, $elements_with_implied_end_tags, true ) ) {
5800              $this->state->stack_of_open_elements->pop();
5801          }
5802      }
5803  
5804      /**
5805       * Returns the adjusted current node.
5806       *
5807       * > The adjusted current node is the context element if the parser was created as
5808       * > part of the HTML fragment parsing algorithm and the stack of open elements
5809       * > has only one element in it (fragment case); otherwise, the adjusted current
5810       * > node is the current node.
5811       *
5812       * @see https://html.spec.whatwg.org/#adjusted-current-node
5813       *
5814       * @since 6.7.0
5815       *
5816       * @return WP_HTML_Token|null The adjusted current node.
5817       */
5818  	private function get_adjusted_current_node(): ?WP_HTML_Token {
5819          if ( isset( $this->context_node ) && 1 === $this->state->stack_of_open_elements->count() ) {
5820              return $this->context_node;
5821          }
5822  
5823          return $this->state->stack_of_open_elements->current_node();
5824      }
5825  
5826      /**
5827       * Reconstructs the active formatting elements.
5828       *
5829       * > This has the effect of reopening all the formatting elements that were opened
5830       * > in the current body, cell, or caption (whichever is youngest) that haven't
5831       * > been explicitly closed.
5832       *
5833       * @since 6.4.0
5834       *
5835       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
5836       *
5837       * @see https://html.spec.whatwg.org/#reconstruct-the-active-formatting-elements
5838       *
5839       * @return bool Whether any formatting elements needed to be reconstructed.
5840       */
5841  	private function reconstruct_active_formatting_elements(): bool {
5842          /*
5843           * > If there are no entries in the list of active formatting elements, then there is nothing
5844           * > to reconstruct; stop this algorithm.
5845           */
5846          if ( 0 === $this->state->active_formatting_elements->count() ) {
5847              return false;
5848          }
5849  
5850          $last_entry = $this->state->active_formatting_elements->current_node();
5851          if (
5852  
5853              /*
5854               * > If the last (most recently added) entry in the list of active formatting elements is a marker;
5855               * > stop this algorithm.
5856               */
5857              'marker' === $last_entry->node_name ||
5858  
5859              /*
5860               * > If the last (most recently added) entry in the list of active formatting elements is an
5861               * > element that is in the stack of open elements, then there is nothing to reconstruct;
5862               * > stop this algorithm.
5863               */
5864              $this->state->stack_of_open_elements->contains_node( $last_entry )
5865          ) {
5866              return false;
5867          }
5868  
5869          $this->bail( 'Cannot reconstruct active formatting elements when advancing and rewinding is required.' );
5870      }
5871  
5872      /**
5873       * Runs the "reset the insertion mode appropriately" algorithm.
5874       *
5875       * @since 6.7.0
5876       *
5877       * @see https://html.spec.whatwg.org/multipage/parsing.html#reset-the-insertion-mode-appropriately
5878       */
5879  	private function reset_insertion_mode_appropriately(): void {
5880          // Set the first node.
5881          $first_node = null;
5882          foreach ( $this->state->stack_of_open_elements->walk_down() as $first_node ) {
5883              break;
5884          }
5885  
5886          /*
5887           * > 1. Let _last_ be false.
5888           */
5889          $last = false;
5890          foreach ( $this->state->stack_of_open_elements->walk_up() as $node ) {
5891              /*
5892               * > 2. Let _node_ be the last node in the stack of open elements.
5893               * > 3. _Loop_: If _node_ is the first node in the stack of open elements, then set _last_
5894               * >            to true, and, if the parser was created as part of the HTML fragment parsing
5895               * >            algorithm (fragment case), set node to the context element passed to
5896               * >            that algorithm.
5897               * > …
5898               */
5899              if ( $node === $first_node ) {
5900                  $last = true;
5901                  if ( isset( $this->context_node ) ) {
5902                      $node = $this->context_node;
5903                  }
5904              }
5905  
5906              // All of the following rules are for matching HTML elements.
5907              if ( 'html' !== $node->namespace ) {
5908                  continue;
5909              }
5910  
5911              switch ( $node->node_name ) {
5912                  /*
5913                   * > 4. If node is a `select` element, run these substeps:
5914                   * >   1. If _last_ is true, jump to the step below labeled done.
5915                   * >   2. Let _ancestor_ be _node_.
5916                   * >   3. _Loop_: If _ancestor_ is the first node in the stack of open elements,
5917                   * >      jump to the step below labeled done.
5918                   * >   4. Let ancestor be the node before ancestor in the stack of open elements.
5919                   * >   …
5920                   * >   7. Jump back to the step labeled _loop_.
5921                   * >   8. _Done_: Switch the insertion mode to "in select" and return.
5922                   */
5923                  case 'SELECT':
5924                      if ( ! $last ) {
5925                          foreach ( $this->state->stack_of_open_elements->walk_up( $node ) as $ancestor ) {
5926                              if ( 'html' !== $ancestor->namespace ) {
5927                                  continue;
5928                              }
5929  
5930                              switch ( $ancestor->node_name ) {
5931                                  /*
5932                                   * > 5. If _ancestor_ is a `template` node, jump to the step below
5933                                   * >    labeled _done_.
5934                                   */
5935                                  case 'TEMPLATE':
5936                                      break 2;
5937  
5938                                  /*
5939                                   * > 6. If _ancestor_ is a `table` node, switch the insertion mode to
5940                                   * >    "in select in table" and return.
5941                                   */
5942                                  case 'TABLE':
5943                                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT_IN_TABLE;
5944                                      return;
5945                              }
5946                          }
5947                      }
5948                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT;
5949                      return;
5950  
5951                  /*
5952                   * > 5. If _node_ is a `td` or `th` element and _last_ is false, then switch the
5953                   * >    insertion mode to "in cell" and return.
5954                   */
5955                  case 'TD':
5956                  case 'TH':
5957                      if ( ! $last ) {
5958                          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_CELL;
5959                          return;
5960                      }
5961                      break;
5962  
5963                  /*
5964                   * > 6. If _node_ is a `tr` element, then switch the insertion mode to "in row"
5965                   * >    and return.
5966                   */
5967                  case 'TR':
5968                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
5969                      return;
5970  
5971                  /*
5972                   * > 7. If _node_ is a `tbody`, `thead`, or `tfoot` element, then switch the
5973                   * >    insertion mode to "in table body" and return.
5974                   */
5975                  case 'TBODY':
5976                  case 'THEAD':
5977                  case 'TFOOT':
5978                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
5979                      return;
5980  
5981                  /*
5982                   * > 8. If _node_ is a `caption` element, then switch the insertion mode to
5983                   * >    "in caption" and return.
5984                   */
5985                  case 'CAPTION':
5986                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_CAPTION;
5987                      return;
5988  
5989                  /*
5990                   * > 9. If _node_ is a `colgroup` element, then switch the insertion mode to
5991                   * >    "in column group" and return.
5992                   */
5993                  case 'COLGROUP':
5994                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP;
5995                      return;
5996  
5997                  /*
5998                   * > 10. If _node_ is a `table` element, then switch the insertion mode to
5999                   * >     "in table" and return.
6000                   */
6001                  case 'TABLE':
6002                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
6003                      return;
6004  
6005                  /*
6006                   * > 11. If _node_ is a `template` element, then switch the insertion mode to the
6007                   * >     current template insertion mode and return.
6008                   */
6009                  case 'TEMPLATE':
6010                      $this->state->insertion_mode = end( $this->state->stack_of_template_insertion_modes );
6011                      return;
6012  
6013                  /*
6014                   * > 12. If _node_ is a `head` element and _last_ is false, then switch the
6015                   * >     insertion mode to "in head" and return.
6016                   */
6017                  case 'HEAD':
6018                      if ( ! $last ) {
6019                          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
6020                          return;
6021                      }
6022                      break;
6023  
6024                  /*
6025                   * > 13. If _node_ is a `body` element, then switch the insertion mode to "in body"
6026                   * >     and return.
6027                   */
6028                  case 'BODY':
6029                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
6030                      return;
6031  
6032                  /*
6033                   * > 14. If _node_ is a `frameset` element, then switch the insertion mode to
6034                   * >     "in frameset" and return. (fragment case)
6035                   */
6036                  case 'FRAMESET':
6037                      $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_FRAMESET;
6038                      return;
6039  
6040                  /*
6041                   * > 15. If _node_ is an `html` element, run these substeps:
6042                   * >     1. If the head element pointer is null, switch the insertion mode to
6043                   * >        "before head" and return. (fragment case)
6044                   * >     2. Otherwise, the head element pointer is not null, switch the insertion
6045                   * >        mode to "after head" and return.
6046                   */
6047                  case 'HTML':
6048                      $this->state->insertion_mode = isset( $this->state->head_element )
6049                          ? WP_HTML_Processor_State::INSERTION_MODE_AFTER_HEAD
6050                          : WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HEAD;
6051                      return;
6052              }
6053          }
6054  
6055          /*
6056           * > 16. If _last_ is true, then switch the insertion mode to "in body"
6057           * >     and return. (fragment case)
6058           *
6059           * This is only reachable if `$last` is true, as per the fragment parsing case.
6060           */
6061          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
6062      }
6063  
6064      /**
6065       * Runs the adoption agency algorithm.
6066       *
6067       * @since 6.4.0
6068       *
6069       * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
6070       *
6071       * @see https://html.spec.whatwg.org/#adoption-agency-algorithm
6072       */
6073  	private function run_adoption_agency_algorithm(): void {
6074          $budget       = 1000;
6075          $subject      = $this->get_tag();
6076          $current_node = $this->state->stack_of_open_elements->current_node();
6077  
6078          if (
6079              // > If the current node is an HTML element whose tag name is subject
6080              $current_node && $subject === $current_node->node_name &&
6081              // > the current node is not in the list of active formatting elements
6082              ! $this->state->active_formatting_elements->contains_node( $current_node )
6083          ) {
6084              $this->state->stack_of_open_elements->pop();
6085              return;
6086          }
6087  
6088          $outer_loop_counter = 0;
6089          while ( $budget-- > 0 ) {
6090              if ( $outer_loop_counter++ >= 8 ) {
6091                  return;
6092              }
6093  
6094              /*
6095               * > Let formatting element be the last element in the list of active formatting elements that:
6096               * >   - is between the end of the list and the last marker in the list,
6097               * >     if any, or the start of the list otherwise,
6098               * >   - and has the tag name subject.
6099               */
6100              $formatting_element = null;
6101              foreach ( $this->state->active_formatting_elements->walk_up() as $item ) {
6102                  if ( 'marker' === $item->node_name ) {
6103                      break;
6104                  }
6105  
6106                  if ( $subject === $item->node_name ) {
6107                      $formatting_element = $item;
6108                      break;
6109                  }
6110              }
6111  
6112              // > If there is no such element, then return and instead act as described in the "any other end tag" entry above.
6113              if ( null === $formatting_element ) {
6114                  $this->bail( 'Cannot run adoption agency when "any other end tag" is required.' );
6115              }
6116  
6117              // > If formatting element is not in the stack of open elements, then this is a parse error; remove the element from the list, and return.
6118              if ( ! $this->state->stack_of_open_elements->contains_node( $formatting_element ) ) {
6119                  $this->state->active_formatting_elements->remove_node( $formatting_element );
6120                  return;
6121              }
6122  
6123              // > If formatting element is in the stack of open elements, but the element is not in scope, then this is a parse error; return.
6124              if ( ! $this->state->stack_of_open_elements->has_element_in_scope( $formatting_element->node_name ) ) {
6125                  return;
6126              }
6127  
6128              /*
6129               * > Let furthest block be the topmost node in the stack of open elements that is lower in the stack
6130               * > than formatting element, and is an element in the special category. There might not be one.
6131               */
6132              $is_above_formatting_element = true;
6133              $furthest_block              = null;
6134              foreach ( $this->state->stack_of_open_elements->walk_down() as $item ) {
6135                  if ( $is_above_formatting_element && $formatting_element->bookmark_name !== $item->bookmark_name ) {
6136                      continue;
6137                  }
6138  
6139                  if ( $is_above_formatting_element ) {
6140                      $is_above_formatting_element = false;
6141                      continue;
6142                  }
6143  
6144                  if ( self::is_special( $item ) ) {
6145                      $furthest_block = $item;
6146                      break;
6147                  }
6148              }
6149  
6150              /*
6151               * > If there is no furthest block, then the UA must first pop all the nodes from the bottom of the
6152               * > stack of open elements, from the current node up to and including formatting element, then
6153               * > remove formatting element from the list of active formatting elements, and finally return.
6154               */
6155              if ( null === $furthest_block ) {
6156                  foreach ( $this->state->stack_of_open_elements->walk_up() as $item ) {
6157                      $this->state->stack_of_open_elements->pop();
6158  
6159                      if ( $formatting_element->bookmark_name === $item->bookmark_name ) {
6160                          $this->state->active_formatting_elements->remove_node( $formatting_element );
6161                          return;
6162                      }
6163                  }
6164              }
6165  
6166              $this->bail( 'Cannot extract common ancestor in adoption agency algorithm.' );
6167          }
6168  
6169          $this->bail( 'Cannot run adoption agency when looping required.' );
6170      }
6171  
6172      /**
6173       * Runs the "close the cell" algorithm.
6174       *
6175       * > Where the steps above say to close the cell, they mean to run the following algorithm:
6176       * >   1. Generate implied end tags.
6177       * >   2. If the current node is not now a td element or a th element, then this is a parse error.
6178       * >   3. Pop elements from the stack of open elements stack until a td element or a th element has been popped from the stack.
6179       * >   4. Clear the list of active formatting elements up to the last marker.
6180       * >   5. Switch the insertion mode to "in row".
6181       *
6182       * @see https://html.spec.whatwg.org/multipage/parsing.html#close-the-cell
6183       *
6184       * @since 6.7.0
6185       */
6186  	private function close_cell(): void {
6187          $this->generate_implied_end_tags();
6188          // @todo Parse error if the current node is not a "td" or "th" element.
6189          foreach ( $this->state->stack_of_open_elements->walk_up() as $element ) {
6190              $this->state->stack_of_open_elements->pop();
6191              if ( 'TD' === $element->node_name || 'TH' === $element->node_name ) {
6192                  break;
6193              }
6194          }
6195          $this->state->active_formatting_elements->clear_up_to_last_marker();
6196          $this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
6197      }
6198  
6199      /**
6200       * Inserts an HTML element on the stack of open elements.
6201       *
6202       * @since 6.4.0
6203       *
6204       * @see https://html.spec.whatwg.org/#insert-an-html-element
6205       *
6206       * @param WP_HTML_Token $token Token representing the element to push onto the stack of open elements.
6207       */
6208  	private function insert_html_element( WP_HTML_Token $token ): void {
6209          $this->state->stack_of_open_elements->push( $token );
6210      }
6211  
6212      /**
6213       * Inserts a foreign element onto the stack of open elements.
6214       *
6215       * @since 6.7.0
6216       *
6217       * @see https://html.spec.whatwg.org/#insert-a-foreign-element
6218       *
6219       * @param WP_HTML_Token $token                     Insert this token. The token's namespace and
6220       *                                                 integration node type will be set accordingly.
6221       * @param bool          $only_add_to_element_stack Whether to skip the "insert an element at the adjusted
6222       *                                                 insertion location" algorithm when adding this element.
6223       */
6224  	private function insert_foreign_element( WP_HTML_Token $token, bool $only_add_to_element_stack ): void {
6225          $adjusted_current_node = $this->get_adjusted_current_node();
6226  
6227          $token->namespace = $adjusted_current_node ? $adjusted_current_node->namespace : 'html';
6228  
6229          if ( $this->is_mathml_integration_point() ) {
6230              $token->integration_node_type = 'math';
6231          } elseif ( $this->is_html_integration_point() ) {
6232              $token->integration_node_type = 'html';
6233          }
6234  
6235          if ( false === $only_add_to_element_stack ) {
6236              /*
6237               * @todo Implement the "appropriate place for inserting a node" and the
6238               *       "insert an element at the adjusted insertion location" algorithms.
6239               *
6240               * These algorithms mostly impact DOM tree construction and not the HTML API.
6241               * Here, there's no DOM node onto which the element will be appended, so the
6242               * parser will skip this step.
6243               *
6244               * @see https://html.spec.whatwg.org/#insert-an-element-at-the-adjusted-insertion-location
6245               */
6246          }
6247  
6248          $this->insert_html_element( $token );
6249      }
6250  
6251      /**
6252       * Inserts a virtual element on the stack of open elements.
6253       *
6254       * @since 6.7.0
6255       *
6256       * @param string      $token_name    Name of token to create and insert into the stack of open elements.
6257       * @param string|null $bookmark_name Optional. Name to give bookmark for created virtual node.
6258       *                                   Defaults to auto-creating a bookmark name.
6259       * @return WP_HTML_Token Newly-created virtual token.
6260       */
6261  	private function insert_virtual_node( $token_name, $bookmark_name = null ): WP_HTML_Token {
6262          $here = $this->bookmarks[ $this->state->current_token->bookmark_name ];
6263          $name = $bookmark_name ?? $this->bookmark_token();
6264  
6265          $this->bookmarks[ $name ] = new WP_HTML_Span( $here->start, 0 );
6266  
6267          $token = new WP_HTML_Token( $name, $token_name, false );
6268          $this->insert_html_element( $token );
6269          return $token;
6270      }
6271  
6272      /*
6273       * HTML Specification Helpers
6274       */
6275  
6276      /**
6277       * Indicates if the current token is a MathML integration point.
6278       *
6279       * @since 6.7.0
6280       *
6281       * @see https://html.spec.whatwg.org/#mathml-text-integration-point
6282       *
6283       * @return bool Whether the current token is a MathML integration point.
6284       */
6285  	private function is_mathml_integration_point(): bool {
6286          $current_token = $this->state->current_token;
6287          if ( ! isset( $current_token ) ) {
6288              return false;
6289          }
6290  
6291          if ( 'math' !== $current_token->namespace || 'M' !== $current_token->node_name[0] ) {
6292              return false;
6293          }
6294  
6295          $tag_name = $current_token->node_name;
6296  
6297          return (
6298              'MI' === $tag_name ||
6299              'MO' === $tag_name ||
6300              'MN' === $tag_name ||
6301              'MS' === $tag_name ||
6302              'MTEXT' === $tag_name
6303          );
6304      }
6305  
6306      /**
6307       * Indicates if the current token is an HTML integration point.
6308       *
6309       * Note that this method must be an instance method with access
6310       * to the current token, since it needs to examine the attributes
6311       * of the currently-matched tag, if it's in the MathML namespace.
6312       * Otherwise it would be required to scan the HTML and ensure that
6313       * no other accounting is overlooked.
6314       *
6315       * @since 6.7.0
6316       *
6317       * @see https://html.spec.whatwg.org/#html-integration-point
6318       *
6319       * @return bool Whether the current token is an HTML integration point.
6320       */
6321  	private function is_html_integration_point(): bool {
6322          $current_token = $this->state->current_token;
6323          if ( ! isset( $current_token ) ) {
6324              return false;
6325          }
6326  
6327          if ( 'html' === $current_token->namespace ) {
6328              return false;
6329          }
6330  
6331          $tag_name = $current_token->node_name;
6332  
6333          if ( 'svg' === $current_token->namespace ) {
6334              return (
6335                  'DESC' === $tag_name ||
6336                  'FOREIGNOBJECT' === $tag_name ||
6337                  'TITLE' === $tag_name
6338              );
6339          }
6340  
6341          if ( 'math' === $current_token->namespace ) {
6342              if ( 'ANNOTATION-XML' !== $tag_name ) {
6343                  return false;
6344              }
6345  
6346              $encoding = $this->get_attribute( 'encoding' );
6347  
6348              return (
6349                  is_string( $encoding ) &&
6350                  (
6351                      0 === strcasecmp( $encoding, 'application/xhtml+xml' ) ||
6352                      0 === strcasecmp( $encoding, 'text/html' )
6353                  )
6354              );
6355          }
6356  
6357          $this->bail( 'Should not have reached end of HTML Integration Point detection: check HTML API code.' );
6358          // This unnecessary return prevents tools from inaccurately reporting type errors.
6359          return false;
6360      }
6361  
6362      /**
6363       * Returns whether an element of a given name is in the HTML special category.
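           *
           * Example (a sketch of expected results, based on the list in
           * the method body; string names are matched case-insensitively):
           *
           *     true  === WP_HTML_Processor::is_special( 'div' );
           *     true  === WP_HTML_Processor::is_special( 'DIV' );
           *     false === WP_HTML_Processor::is_special( 'span' );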
6364       *
6365       * @since 6.4.0
6366       *
6367       * @see https://html.spec.whatwg.org/#special
6368       *
6369       * @param WP_HTML_Token|string $tag_name Node to check, or only its name if in the HTML namespace.
6370       * @return bool Whether the element of the given name is in the special category.
6371       */
6372      public static function is_special( $tag_name ): bool {
6373          if ( is_string( $tag_name ) ) {
6374              $tag_name = strtoupper( $tag_name );
6375          } else {
6376              $tag_name = 'html' === $tag_name->namespace
6377                  ? strtoupper( $tag_name->node_name )
6378                  : "{$tag_name->namespace} {$tag_name->node_name}";
6379          }
6380  
6381          return (
6382              'ADDRESS' === $tag_name ||
6383              'APPLET' === $tag_name ||
6384              'AREA' === $tag_name ||
6385              'ARTICLE' === $tag_name ||
6386              'ASIDE' === $tag_name ||
6387              'BASE' === $tag_name ||
6388              'BASEFONT' === $tag_name ||
6389              'BGSOUND' === $tag_name ||
6390              'BLOCKQUOTE' === $tag_name ||
6391              'BODY' === $tag_name ||
6392              'BR' === $tag_name ||
6393              'BUTTON' === $tag_name ||
6394              'CAPTION' === $tag_name ||
6395              'CENTER' === $tag_name ||
6396              'COL' === $tag_name ||
6397              'COLGROUP' === $tag_name ||
6398              'DD' === $tag_name ||
6399              'DETAILS' === $tag_name ||
6400              'DIR' === $tag_name ||
6401              'DIV' === $tag_name ||
6402              'DL' === $tag_name ||
6403              'DT' === $tag_name ||
6404              'EMBED' === $tag_name ||
6405              'FIELDSET' === $tag_name ||
6406              'FIGCAPTION' === $tag_name ||
6407              'FIGURE' === $tag_name ||
6408              'FOOTER' === $tag_name ||
6409              'FORM' === $tag_name ||
6410              'FRAME' === $tag_name ||
6411              'FRAMESET' === $tag_name ||
6412              'H1' === $tag_name ||
6413              'H2' === $tag_name ||
6414              'H3' === $tag_name ||
6415              'H4' === $tag_name ||
6416              'H5' === $tag_name ||
6417              'H6' === $tag_name ||
6418              'HEAD' === $tag_name ||
6419              'HEADER' === $tag_name ||
6420              'HGROUP' === $tag_name ||
6421              'HR' === $tag_name ||
6422              'HTML' === $tag_name ||
6423              'IFRAME' === $tag_name ||
6424              'IMG' === $tag_name ||
6425              'INPUT' === $tag_name ||
6426              'KEYGEN' === $tag_name ||
6427              'LI' === $tag_name ||
6428              'LINK' === $tag_name ||
6429              'LISTING' === $tag_name ||
6430              'MAIN' === $tag_name ||
6431              'MARQUEE' === $tag_name ||
6432              'MENU' === $tag_name ||
6433              'META' === $tag_name ||
6434              'NAV' === $tag_name ||
6435              'NOEMBED' === $tag_name ||
6436              'NOFRAMES' === $tag_name ||
6437              'NOSCRIPT' === $tag_name ||
6438              'OBJECT' === $tag_name ||
6439              'OL' === $tag_name ||
6440              'P' === $tag_name ||
6441              'PARAM' === $tag_name ||
6442              'PLAINTEXT' === $tag_name ||
6443              'PRE' === $tag_name ||
6444              'SCRIPT' === $tag_name ||
6445              'SEARCH' === $tag_name ||
6446              'SECTION' === $tag_name ||
6447              'SELECT' === $tag_name ||
6448              'SOURCE' === $tag_name ||
6449              'STYLE' === $tag_name ||
6450              'SUMMARY' === $tag_name ||
6451              'TABLE' === $tag_name ||
6452              'TBODY' === $tag_name ||
6453              'TD' === $tag_name ||
6454              'TEMPLATE' === $tag_name ||
6455              'TEXTAREA' === $tag_name ||
6456              'TFOOT' === $tag_name ||
6457              'TH' === $tag_name ||
6458              'THEAD' === $tag_name ||
6459              'TITLE' === $tag_name ||
6460              'TR' === $tag_name ||
6461              'TRACK' === $tag_name ||
6462              'UL' === $tag_name ||
6463              'WBR' === $tag_name ||
6464              'XMP' === $tag_name ||
6465  
6466              // MathML.
6467              'math MI' === $tag_name ||
6468              'math MO' === $tag_name ||
6469              'math MN' === $tag_name ||
6470              'math MS' === $tag_name ||
6471              'math MTEXT' === $tag_name ||
6472              'math ANNOTATION-XML' === $tag_name ||
6473  
6474              // SVG.
6475              'svg DESC' === $tag_name ||
6476              'svg FOREIGNOBJECT' === $tag_name ||
6477              'svg TITLE' === $tag_name
6478          );
6479      }
6480  
6481      /**
6482       * Returns whether a given element is an HTML Void Element.
6483       *
6484       * > area, base, br, col, embed, hr, img, input, link, meta, source, track, wbr
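           *
           * Example (a sketch of expected results; the check is case-insensitive
           * and also covers a few obsolete elements not in the quote above):
           *
           *     true  === WP_HTML_Processor::is_void( 'img' );
           *     true  === WP_HTML_Processor::is_void( 'KEYGEN' );
           *     false === WP_HTML_Processor::is_void( 'div' );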
6485       *
6486       * @since 6.4.0
6487       *
6488       * @see https://html.spec.whatwg.org/#void-elements
6489       *
6490       * @param string $tag_name Name of HTML tag to check.
6491       * @return bool Whether the given tag is an HTML Void Element.
6492       */
6493      public static function is_void( $tag_name ): bool {
6494          $tag_name = strtoupper( $tag_name );
6495  
6496          return (
6497              'AREA' === $tag_name ||
6498              'BASE' === $tag_name ||
6499              'BASEFONT' === $tag_name || // Obsolete but still treated as void.
6500              'BGSOUND' === $tag_name || // Obsolete but still treated as void.
6501              'BR' === $tag_name ||
6502              'COL' === $tag_name ||
6503              'EMBED' === $tag_name ||
6504              'FRAME' === $tag_name || // Obsolete but still treated as void.
6505              'HR' === $tag_name ||
6506              'IMG' === $tag_name ||
6507              'INPUT' === $tag_name ||
6508              'KEYGEN' === $tag_name || // Obsolete but still treated as void.
6509              'LINK' === $tag_name ||
6510              'META' === $tag_name ||
6511              'PARAM' === $tag_name || // Obsolete but still treated as void.
6512              'SOURCE' === $tag_name ||
6513              'TRACK' === $tag_name ||
6514              'WBR' === $tag_name
6515          );
6516      }
6517  
6518      /**
6519       * Gets an encoding from a given string.
6520       *
6521       * This is an algorithm defined in the WHATWG Encoding specification.
6522       *
6523       * Example:
6524       *
6525       *     'UTF-8' === self::get_encoding( 'utf8' );
6526       *     'UTF-8' === self::get_encoding( "  \tUTF-8 " );
6527       *     null    === self::get_encoding( 'UTF-7' );
6528       *     null    === self::get_encoding( 'utf8; charset=' );
6529       *
6530       * @see https://encoding.spec.whatwg.org/#concept-encoding-get
6531       *
6532       * @todo As this parser only supports UTF-8, only the UTF-8
6533       *       encodings are detected. Add more as desired, but the
6534       *       parser will bail on non-UTF-8 encodings.
6535       *
6536       * @since 6.7.0
6537       *
6538       * @param string $label A string which may specify a known encoding.
6539       * @return string|null Known encoding if matched, otherwise null.
6540       */
6541      protected static function get_encoding( string $label ): ?string {
6542          /*
6543           * > Remove any leading and trailing ASCII whitespace from label.
6544           */
6545          $label = trim( $label, " \t\f\r\n" );
6546  
6547          /*
6548           * > If label is an ASCII case-insensitive match for any of the labels listed in the
6549           * > table below, then return the corresponding encoding; otherwise return failure.
6550           */
6551          switch ( strtolower( $label ) ) {
6552              case 'unicode-1-1-utf-8':
6553              case 'unicode11utf8':
6554              case 'unicode20utf8':
6555              case 'utf-8':
6556              case 'utf8':
6557              case 'x-unicode20utf8':
6558                  return 'UTF-8';
6559  
6560              default:
6561                  return null;
6562          }
6563      }
6564  
6565      /*
6566       * Constants that would pollute the top of the class if they were found there.
6567       */
6568  
6569      /**
6570       * Indicates that the next HTML token should be parsed and processed.
6571       *
6572       * @since 6.4.0
6573       *
6574       * @var string
6575       */
6576      const PROCESS_NEXT_NODE = 'process-next-node';
6577  
6578      /**
6579       * Indicates that the current HTML token should be reprocessed in the newly-selected insertion mode.
6580       *
6581       * @since 6.4.0
6582       *
6583       * @var string
6584       */
6585      const REPROCESS_CURRENT_NODE = 'reprocess-current-node';
6586  
6587      /**
6588       * Indicates that the current HTML token should be processed without advancing the parser.
6589       *
6590       * @since 6.5.0
6591       *
6592       * @var string
6593       */
6594      const PROCESS_CURRENT_NODE = 'process-current-node';
6595  
6596      /**
6597       * Indicates that the parser encountered unsupported markup and has bailed.
6598       *
6599       * @since 6.4.0
6600       *
6601       * @var string
6602       */
6603      const ERROR_UNSUPPORTED = 'unsupported';
6604  
6605      /**
6606       * Indicates that the parser encountered more HTML tokens than it
6607       * was able to process and has bailed.
6608       *
6609       * @since 6.4.0
6610       *
6611       * @var string
6612       */
6613      const ERROR_EXCEEDED_MAX_BOOKMARKS = 'exceeded-max-bookmarks';
6614  
6615      /**
6616       * Unlock code that must be passed into the constructor to create this class.
6617       *
6618       * This class extends the WP_HTML_Tag_Processor, which has a public class
6619       * constructor. Therefore, it's not possible to have a private constructor here.
6620       *
6621       * This unlock code is used to ensure that anyone calling the constructor is
6622       * doing so with a full understanding that it's intended to be a private API.
6623       *
6624       * @access private
6625       */
6626      const CONSTRUCTOR_UNLOCK_CODE = 'Use WP_HTML_Processor::create_fragment() instead of calling the class constructor directly.';
6627  }