wpseek.com
A WordPress-centric search engine for devs and theme authors



_wp_scrub_utf8_fallback › WordPress Function

Since: 6.9.0
Deprecated: n/a
_wp_scrub_utf8_fallback ( $bytes )
Access:
  • private
Parameters:
  • (string) $bytes UTF-8 encoded string which might contain spans of invalid bytes.
    Required: Yes
See:
  • wp_scrub_utf8()
Returns:
  • (string) Input string with spans of invalid bytes swapped with the replacement character.
Defined at:
Codex:

Fallback mechanism for replacing invalid spans of UTF-8 bytes.

Example: 'Pi�a' === _wp_scrub_utf8_fallback( "Pi\xF1a" ); // “ñ” is 0xF1 in Windows-1252.
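To see why the byte 0xF1 in the example counts as invalid, it may help to compare the Windows-1252 and UTF-8 encodings of "ñ". The sketch below is illustrative only: `example_is_valid_utf8()` is a hypothetical helper, not a WordPress function, and it relies on PCRE's `/u` modifier rejecting subjects that are not well-formed UTF-8.

```php
<?php
/**
 * Hypothetical helper (not part of WordPress): reports whether $bytes
 * is well-formed UTF-8.
 */
function example_is_valid_utf8( string $bytes ): bool {
	// With the /u modifier, preg_match() returns false (not 1) when the
	// subject is not valid UTF-8.
	return 1 === preg_match( '//u', $bytes );
}

$windows_1252 = "Pi\xF1a";   // "ñ" stored as the single byte 0xF1.
$utf8         = "Pi\u{F1}a"; // "ñ" (U+00F1) stored as the two bytes 0xC3 0xB1.
```

In UTF-8, 0xF1 can only begin a four-byte sequence, so when it is followed by the ASCII letter "a" it forms an invalid span, and the scrubber swaps it for U+FFFD.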


Source

function _wp_scrub_utf8_fallback( string $bytes ): string {
	$bytes_length   = strlen( $bytes );
	$next_byte_at   = 0;
	$was_at         = 0;
	$invalid_length = 0;
	$scrubbed       = '';

	while ( $next_byte_at <= $bytes_length ) {
		_wp_scan_utf8( $bytes, $next_byte_at, $invalid_length );

		if ( $next_byte_at >= $bytes_length ) {
			if ( 0 === $was_at ) {
				return $bytes;
			}

			return $scrubbed . substr( $bytes, $was_at, $next_byte_at - $was_at - $invalid_length );
		}

		$scrubbed .= substr( $bytes, $was_at, $next_byte_at - $was_at );
		$scrubbed .= "\u{FFFD}";

		$next_byte_at += $invalid_length;
		$was_at        = $next_byte_at;
	}

	return $scrubbed;
}
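
The loop above alternates between copying a valid span and emitting one replacement character for each invalid span. A minimal standalone sketch of that pattern follows, assuming a hypothetical `example_valid_utf8_length()` helper in place of `_wp_scan_utf8()`, whose implementation is not shown here. Both function names are illustrative and are not WordPress APIs. This sketch replaces every invalid byte individually, which may differ from how `_wp_scan_utf8()` sizes an invalid span.

```php
<?php
/**
 * Hypothetical stand-in for the scanner: returns the length in bytes of the
 * run of well-formed UTF-8 that starts at offset $at (0 if there is none).
 */
function example_valid_utf8_length( string $bytes, int $at ): int {
	// One alternative per well-formed UTF-8 byte sequence; \G anchors the
	// match at the offset passed to preg_match().
	$pattern = '/\G(?:[\x00-\x7F]'
		. '|[\xC2-\xDF][\x80-\xBF]'
		. '|\xE0[\xA0-\xBF][\x80-\xBF]'
		. '|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}'
		. '|\xED[\x80-\x9F][\x80-\xBF]'
		. '|\xF0[\x90-\xBF][\x80-\xBF]{2}'
		. '|[\xF1-\xF3][\x80-\xBF]{3}'
		. '|\xF4[\x80-\x8F][\x80-\xBF]{2})++/';

	return 1 === preg_match( $pattern, $bytes, $match, 0, $at ) ? strlen( $match[0] ) : 0;
}

/**
 * Illustrative scrubber: copies valid spans and substitutes U+FFFD for each
 * invalid byte (an assumption of this sketch, not WordPress behavior).
 */
function example_scrub_utf8( string $bytes ): string {
	$length   = strlen( $bytes );
	$at       = 0;
	$scrubbed = '';

	while ( $at < $length ) {
		// Copy the valid span starting at the current offset, if any.
		$valid_length = example_valid_utf8_length( $bytes, $at );
		$scrubbed    .= substr( $bytes, $at, $valid_length );
		$at          += $valid_length;

		// Anything left at this offset is invalid: replace one byte and move on.
		if ( $at < $length ) {
			$scrubbed .= "\u{FFFD}";
			$at       += 1;
		}
	}

	return $scrubbed;
}
```

Under these assumptions, `example_scrub_utf8( "Pi\xF1a" )` returns `Pi`, then U+FFFD, then `a`, matching the documented example. Well-formed input comes back unchanged.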