
Remove multiple whitespaces

I'm getting $row['message'] from a MySQL database and I need to remove all whitespace like \n \t and so on.

$row['message'] = "This is   a Text \n and so on \t     Text text.";

should be formatted to:

$row['message'] = 'This is a Text and so on Text text.';

I tried:

 $ro = preg_replace('/\s\s+/', ' ',$row['message']);
 echo $ro;

but it doesn't remove \n or \t, just single spaces. Can anyone tell me how to do that?

The newline and tab characters are in single quotes, so you want them literal?
I fixed the quoting of the code section with the \n and \t by changing it to double quotes.


You need:

$ro = preg_replace('/\s+/', ' ', $row['message']);

Your pattern \s\s+ means one whitespace character (space, tab, or newline) followed by one or more whitespace characters. In effect, it replaces runs of two or more whitespace characters with a single space.

What you want is to replace one or more whitespace characters with a single space, so you can use the pattern \s\s* or \s+ (recommended).
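
As a minimal sketch of the difference between the two patterns (the first string is the one from the question; `$single` is a hypothetical extra input in which every tab or newline stands alone):

```php
<?php
// Sample text from the question: runs of spaces plus a newline and a tab.
$message = "This is   a Text \n and so on \t     Text text.";

// \s+ matches every run of one or more whitespace characters.
$collapsed = preg_replace('/\s+/', ' ', $message);
echo $collapsed, "\n"; // This is a Text and so on Text text.

// Hypothetical input where each tab or newline stands alone.
$single = "does\nthis\twork";

// \s\s+ needs at least two whitespace characters, so nothing is replaced.
var_dump(preg_replace('/\s\s+/', ' ', $single) === $single); // bool(true)

// \s+ also replaces a lone newline or tab.
echo preg_replace('/\s+/', ' ', $single), "\n"; // does this work
```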


His method is better than this: why would you replace one space with one space?
He also wants \n and \t to be replaced with space. Now his pattern does not match these, say for $x = "does\nthis\twork"; The OP wants all whitespace to be replaced with a single space.
@codaddict, how can we keep \n and remove all other multiple spaces and tabs from the string? Please help me.
Can you be more specific why "\s+" is recommended?
Note that in PHP, \s does not include the "vertical tab", chr(11), in older PCRE versions (before 8.34). To include it too, use the POSIX space character class: [[:space:]]+ php.net/manual/en/regexp.reference.character-classes.php
uınbɐɥs
<?php
$str = "This is  a string       with
spaces, tabs and newlines present";

$stripped = preg_replace(array('/\s{2,}/', '/[\t\n]/'), ' ', $str);

echo $str;
echo "\n---\n";
echo "$stripped";
?>

This outputs

This is  a string   with
spaces, tabs and newlines present
---
This is a string with spaces, tabs and newlines present

You are a true lifesaver. I was about to jump out of the window over this.
Neat, still helpful
Well I'd never have thought about this, but 12 years later it still works well!
Amal Murali
preg_replace('/[\s]+/mu', ' ', $var);

\s already matches tabs and newlines, so the regex above appears to be sufficient.


Square brackets aren't needed here because there's only one thing inside them. The /m won't have an effect, as there are no ^ or $ anchors, and the /u won't have any effect except to slow it down slightly and die if the input string is not valid UTF-8 (it doesn't affect what \s matches, but it would affect \pZ).
Lukas Liesis

Simplified to one function:

function removeWhiteSpace($text)
{
    $text = preg_replace('/[\t\n\r\0\x0B]/', '', $text);
    $text = preg_replace('/([\s])\1+/', ' ', $text);
    $text = trim($text);
    return $text;
}

Based on Danuel O'Neal's answer.


ghostdog74
$str='This is   a Text \n and so on Text text.';
print preg_replace("/[[:blank:]]+/"," ",$str);

This is the one that worked best for me. Also, I would add trim to remove whitespace at the beginning and end of the string.
@Dziamid You can do it with trim(preg_replace(...))
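
A rough sketch of the trim(preg_replace(...)) idea from the comment above, assuming a hypothetical input string with leading and trailing whitespace. Because [[:blank:]] only matches spaces and tabs, this variant keeps newlines, whereas \s+ flattens everything onto one line:

```php
<?php
// Hypothetical input: leading/trailing whitespace, tabs, and one newline.
$str = "  This is \t  a Text\nand so on \t Text text.  ";

// [[:blank:]] matches only spaces and tabs, so the newline is kept.
$keepNewlines = trim(preg_replace('/[[:blank:]]+/', ' ', $str));
var_dump($keepNewlines === "This is a Text\nand so on Text text."); // bool(true)

// \s also matches newlines, so everything ends up on one line.
$oneLine = trim(preg_replace('/\s+/', ' ', $str));
var_dump($oneLine === "This is a Text and so on Text text."); // bool(true)
```

Which of the two fits depends on whether line breaks in the message carry meaning.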
nickf

I can't replicate the problem here:

$x = "this    \n \t\t \n    works.";
var_dump(preg_replace('/\s\s+/', ' ', $x));
// string(11) "this works."

I'm not sure if it was just a transcription error or not, but in your example you're using a single-quoted string. \n and \t are only treated as newline and tab inside a double-quoted string. That is:

'\n\t' != "\n\t"
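
To make the quoting difference concrete, here is a small sketch (the variable names are only for illustration):

```php
<?php
// Single quotes: backslash-n and backslash-t stay as two characters each.
$single = '\n\t';
// Double quotes: PHP converts them to a real newline and a real tab.
$double = "\n\t";

var_dump(strlen($single)); // int(4)
var_dump(strlen($double)); // int(2)

// \s+ only matches real whitespace, so the single-quoted text is untouched.
var_dump(preg_replace('/\s+/', ' ', 'a\nb')); // string(4) "a\nb"
var_dump(preg_replace('/\s+/', ' ', "a\nb")); // string(3) "a b"
```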

Edit: as Codaddict pointed out, \s\s+ won't replace a single tab character. I still don't think using \s+ is an efficient solution though, so how about this instead:

preg_replace('/(?:\s\s+|\n|\t)/', ' ', $x);

+1, True. For a string with plenty of single spaces (which usually is the case) it is inefficient to replace a space with space.
@codaddict: to test your hypothesis, I wrote a quick script to run through 1000 of each replacement and check the timing of each. For the string '+1, True. For a string with plenty of single spaces (which usually is the case) it is inefficient to replace a space with space. – codaddict Feb 24 \'10 at 13:32', one thousand \s+ preg_replace() calls took 0.010547876358032 seconds, and one thousand (?:\s\s+|\n|\t) preg_replace() calls took 0.013049125671387 seconds, making it almost 30% slower.
You may want to add "\r" to that last example as some computers do use a single "\r" on its own (Apple Mac?)
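
The timing script mentioned in the comments was not posted, so the following is only a guess at what such a comparison might look like. `timePattern` is a hypothetical helper, the subject string is shortened, and the numbers will vary by machine and PHP version:

```php
<?php
// Hypothetical helper: time $iterations preg_replace() calls for one pattern.
function timePattern(string $pattern, string $subject, int $iterations = 1000): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        preg_replace($pattern, ' ', $subject);
    }
    return microtime(true) - $start;
}

// Sample subject with mostly single spaces, as in the comment above.
$subject = 'For a string with plenty of single spaces it is inefficient to replace a space with space.';

// Print the elapsed time for each pattern.
foreach (['/\s+/', '/(?:\s\s+|\n|\t)/'] as $pattern) {
    printf("%-22s %.6f s\n", $pattern, timePattern($pattern, $subject));
}
```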
middus
preg_replace('/(\s\s+|\t|\n)/', ' ', $row['message']);

This replaces all tabs, all newlines and all combinations of multiple spaces, tabs and newlines with a single space.


\t and \n are already included in \s, so your regex is nearly the same as \s\s+ (it only adds matches for a lone \t or \n); \s\s+ is better written as \s{2,}, as in @Alex Polo's answer.
Danuel O'Neal
<?php
#This should help some newbies
# REGEX NOTES FROM DANUEL
# I wrote these functions for my own php framework
# Feel Free to make it better
# If it gets more complicated than this. You need to do more software engineering/logic.
# ([\s])  // capture one whitespace character
# \1      // followed by that same character
# +       // one or more times

class whitespace{

    // Collapse runs of the same repeated whitespace character into one space.
    static function remove_doublewhitespace($s = null){
        return preg_replace('/([\s])\1+/', ' ', $s);
    }

    // Remove all whitespace.
    static function remove_whitespace($s = null){
        return preg_replace('/[\s]+/', '', $s);
    }

    // Remove tabs, line breaks, NUL bytes and vertical tabs.
    static function remove_whitespace_feed($s = null){
        return preg_replace('/[\t\n\r\0\x0B]/', '', $s);
    }

    static function smart_clean($s = null){
        return trim(self::remove_doublewhitespace(self::remove_whitespace_feed($s)));
    }
}
$string = " Hey   yo, what's \t\n\tthe sc\r\nen\n\tario! \n";
echo whitespace::smart_clean($string);

What is static function remove_whitespace for? You define it but never use it.
These each have their use, but none of them achieves what the question asks for, which is to replace multiple consecutive whitespace characters with just one. Your "remove_doublewhitespace" only replaces runs of the same whitespace character, so it would replace "\n\n\n" with a ' ', but it would not do anything to " \r\n".
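
The point about the backreference can be checked with a short sketch (the input strings here are made up for illustration):

```php
<?php
$pattern = '/([\s])\1+/';

// Three identical newlines: the backreference \1 matches, so they collapse.
var_dump(preg_replace($pattern, ' ', "a\n\n\nb")); // string(3) "a b"

// Space, carriage return, newline: no character repeats, so nothing changes.
$mixed = "a \r\nb";
var_dump(preg_replace($pattern, ' ', $mixed) === $mixed); // bool(true)

// \s+ treats any mix of whitespace as one run.
var_dump(preg_replace('/\s+/', ' ', $mixed)); // string(3) "a b"
```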
hharek

Without preg_replace()

$str = "This is   a Text \n and so on \t     Text text.";
$str = str_replace(["\r", "\n", "\t"], " ", $str);
while (strpos($str, "  ") !== false)
{
    $str = str_replace("  ", " ", $str);
}
echo $str;

matsolof

This is what I would use:

a. Make sure to use double quotes, for example:

$row['message'] = "This is   a Text \n and so on \t     Text text.";

b. To remove extra whitespace, use:

$ro = preg_replace('/\s+/', ' ', $row['message']); 
echo $ro;

It may not be the fastest solution, but I think it requires the least code, and it should work. I've never used MySQL, though, so I may be wrong.


Alex Polo

All you need is to run it as follows:

echo preg_replace('/\s{2,}/', ' ', "This is   a Text \n and so on \t     Text text."); // This is a Text and so on Text text.

Catalin T.

I use this code and pattern:

preg_replace('/\\s+/', ' ',$data)

$data = 'This is   a Text 
   and so on         Text text on multiple lines and with        whitespaces';
$data= preg_replace('/\\s+/', ' ',$data);
echo $data;

You may test this on http://writecodeonline.com/php/


It works for me even in MariaDB with this query: SELECT search_able, REGEXP_REPLACE (search_able,"\\s+",' ') FROM book where id =260. Thanks a lot.
BigBlast

In truth, I think you want something like this:

preg_replace('/\n+|\t+|\s+/',' ',$string);

Heman G

This will replace every run of two or more whitespace characters (spaces and newlines as well as tabs) with a single tab:

preg_replace("/\s{2,}/", "\t", $string);

Shahbaz Khan

Without preg_replace, using a loop:

<?php

$str = "This is   a Text \n and so on \t     Text text.";
$str_length = strlen($str);
$str_arr = str_split($str);
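for ($i = 0; $i < $str_length; $i++) {
    // Leave non-whitespace characters untouched.
    if (!ctype_space($str_arr[$i])) {
        continue;
    }
    if (isset($str_arr[$i + 1]) && ctype_space($str_arr[$i + 1])) {
        // More whitespace follows: drop this character.
        unset($str_arr[$i]);
    } else {
        // Last whitespace character of a run: normalize it to a space.
        $str_arr[$i] = ' ';
    }
}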

 echo implode("", $str_arr) ; 

 ?>