Case-insensitive search and replace with sed

I'm trying to use SED to extract text from a log file. I can do a search-and-replace without too much trouble:

sed 's/foo/bar/' mylog.txt

However, I want to make the search case-insensitive. From what I've googled, it looks like appending i to the end of the command should work:

sed 's/foo/bar/i' mylog.txt

However, this gives me an error message:

sed: 1: "s/foo/bar/i": bad flag in substitute command: 'i'

What's going wrong here, and how do I fix it?

Can you try updating your copy of sed? I is a GNU extension which might not be available with your copy of sed.
EDIT: I struck through the OS X qualification, as the OP accepted an answer that doesn’t work on OS X. (As another answer indicated, sed on OS X does not support case-insensitive matching, contrary to Apple documentation.)
@danorton: Thanks for that; in case you derived the sense that the Apple documentation promises something the implementation doesn't deliver from my answer below: man sed IS consistent with the implementation - no mention of (and no support in practice) for case-insensitive matching; if you found a piece of documentation claiming otherwise, please let us know.
@mklement0, yes, sorry, I stand corrected. The Apple documentation does not make any claim of case-insensitive matching for sed.
FWIW, the GNU versions of the tools whose BSD version comes with OS X are available from various package managers. I have the full suite of text utilities installed via Homebrew with a g prefix, so I can use gsed or gdate when I need a feature not found in the stock version.

mklement0

Update: Starting with macOS Big Sur (11.0), sed now does support the I flag for case-insensitive matching, so the command in the question should now work (BSD sed doesn't report its version, but you can go by the date at the bottom of the man page, which should be March 27, 2017 or more recent); a simple example:

# BSD sed on macOS Big Sur and above (and GNU sed, the default on Linux)
$ sed 's/ö/@/I' <<<'FÖO'
F@O   # `I` matched the uppercase Ö correctly against its lowercase counterpart

Note: I (uppercase) is the documented form of the flag, but i works as well.

Similarly, starting with macOS Big Sur (11.0) awk now is locale-aware (awk --version should report 20200816 or more recent):

# BSD awk on macOS Big Sur and above (and GNU awk, the default on Linux)
$ awk '{ print tolower($0) }' <<<'FÖO'
föo  # non-ASCII character Ö was properly lowercased

The following applies to macOS up to Catalina (10.15):

To be clear: On macOS, sed - which is the BSD implementation - does NOT support case-insensitive matching - hard to believe, but true. The formerly accepted answer, which itself shows a GNU sed command, gained that status because of the perl-based solution mentioned in the comments.

To make that Perl solution work with foreign characters as well, via UTF-8, use something like:

perl -C -Mutf8 -pe 's/öœ/oo/i' <<< "FÖŒ" # -> "Foo"

-C turns on UTF-8 support for streams and files, assuming the current locale is UTF-8-based.

-Mutf8 tells Perl to interpret the source code as UTF-8 (in this case, the string passed to -pe) - this is the shorter equivalent of the more verbose -e 'use utf8;'. (Thanks, Mark Reed.)

(Note that using awk is not an option either, as awk on macOS (i.e., BWK awk and BSD awk) appears to be completely unaware of locales altogether - its tolower() and toupper() functions ignore foreign characters (and sub() / gsub() don't have case-insensitivity flags to begin with).)

A note on the relationship of sed and awk to the POSIX standard:

BSD sed and awk limit their functionality mostly to what the POSIX sed and POSIX awk specs mandate, whereas their GNU counterparts implement many more extensions.
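
Since support for the I flag depends on which sed is installed, a script can probe for it once and fall back to Perl otherwise. The following is only a minimal sketch of that idea; the function name ci_replace is hypothetical and does not come from any answer here.

```shell
# Probe: does this sed accept the I flag? (GNU sed and newer macOS sed do.)
if printf 'x\n' | sed 's/X/y/I' >/dev/null 2>&1; then
  # Hypothetical helper: replace foo with bar, ignoring case, using sed.
  ci_replace() { sed 's/foo/bar/gI'; }
else
  # Fallback for seds that reject the flag: Perl's /i modifier.
  ci_replace() { perl -pe 's/foo/bar/gi'; }
fi

printf 'FOO Foo foo\n' | ci_replace   # prints: bar bar bar
```

The probe relies on sed exiting with a non-zero status when it rejects the flag, which is what the error message in the question indicates.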


mklement0

Editor's note: This solution doesn't work on macOS (out of the box), because it only applies to GNU sed, whereas macOS comes with BSD sed.

Capitalize the 'I'.

sed 's/foo/bar/I' file

I saw this also, and tried it... but I still get the same error message.
BSD sed has a lot of limitations, it seems. I would do this in PERL (i.e., perl -pe 's/foo/bar/i'), if that's the case.
The default install of OS X Lion gives the error: sed: 1: "s/foo/bar/I": bad flag in substitute command: 'I'
The I suffix is not a portable use of sed. POSIX sed uses only Basic Regular Expressions (BREs), which are surprisingly limited. They don't even support the + (you have to use \{1,\} instead), let alone case insensitive matching. The only portable way to do it with sed is to check for something like /[hH][eE][lL][lL][oO]/, which is often going to be impractical.
That needs to be /gI, otherwise it will just operate on the first match.
Remi Guan

Another work-around for sed on Mac OS X is to install gsed from MacPorts or Homebrew and then create the alias sed='gsed'.
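
One caveat with the alias approach: an alias defined in a shell startup file generally does not apply inside shell scripts, so a script may be better off selecting the binary explicitly. A small sketch, assuming GNU sed is installed under the name gsed as described above; the variable name SED is only illustrative.

```shell
# Prefer gsed (GNU sed from Homebrew/MacPorts) when present, else plain sed.
if command -v gsed >/dev/null 2>&1; then
  SED=gsed
else
  SED=sed   # assumption: the system sed is GNU sed or otherwise accepts I
fi

printf 'Foo FOO foo\n' | "$SED" 's/foo/bar/Ig'   # prints: bar bar bar
```

On a Mac without gsed and with an older stock sed, the else branch would still fail with the "bad flag" error from the question.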


gsed "s/a/b/Ig" works, thanks! Why should a good working answer get a downvote?
this answer is great. used brew install gnu-sed then went to my ~/.bash_profile and added the alias. Thanks @davmat
Better to do brew install gnu-sed --with-default-names - this will override the default sed.
@Mar0ux the --with-default-names is now deprecated: brew.sh I added the gnu-sed to my PATH, but I believe there are other workarounds now: SE question
Benjamin W.

If you are doing pattern matching first, e.g.,

/pattern/s/xx/yy/g

then you want to put the I after the pattern:

/pattern/Is/xx/yy/g

Example:

echo Fred | sed '/fred/Is//willma/g'

returns willma; without the I, it returns the string untouched (Fred).


On macOS I get: sed: 1: "/fred/Is//willma/g": invalid command code I
Good tip. Here's how I use it on a complex search: sed -r '/'"$PATTERN"'/I,${s//'$YELLOW'&'$NO_COLOR'/g;b};$q3' . It prints the text, and if pattern (case-insensitive) was found, it highlights text in yellow (ansi color). If not found - returns exit code 3.
Benjamin W.

The sed FAQ addresses the closely related case-insensitive search. It points out that a) many versions of sed support a flag for it and b) it's awkward to do in sed, you should rather use awk or Perl.

But to do it in POSIX sed, they suggest three options (adapted for substitution here):

Convert to uppercase and store original line in hold space; this won't work for substitutions, though, as the original content will be restored before printing, so it's only good for inserting or appending lines based on a case-insensitive match.

Maybe the possibilities are limited to FOO, Foo and foo. These can be covered by

s/FOO/bar/;s/[Ff]oo/bar/

To search for all possible matches, one can use bracket expressions for each character:

s/[Ff][Oo][Oo]/bar/
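
Writing the bracket expressions by hand gets tedious for longer words, so they can be generated. The sketch below uses a hypothetical helper, ci_pattern, built on awk; it handles ASCII letters only and does not escape regex metacharacters. It also shows the hold-space approach for appending a line after a case-insensitive match. Both parts stick to features that, as far as I know, are in POSIX sed and awk.

```shell
# Hypothetical helper: turn a plain ASCII word such as "foo" into the
# bracket-expression pattern "[Ff][Oo][Oo]". Regex metacharacters in the
# input are passed through unescaped.
ci_pattern() {
  printf '%s\n' "$1" | awk '{
    out = ""
    for (i = 1; i <= length($0); i++) {
      c = substr($0, i, 1)
      if (c ~ /[A-Za-z]/) {
        out = out "[" toupper(c) tolower(c) "]"
      } else {
        out = out c
      }
    }
    print out
  }'
}

ci_pattern foo                                             # prints: [Ff][Oo][Oo]
printf 'FOO Foo foo\n' | sed "s/$(ci_pattern foo)/bar/g"   # prints: bar bar bar

# Hold-space variant: save the line (h), uppercase the working copy (y),
# queue an appended line if the uppercased copy matches, then restore (g).
printf 'one\nFoo two\nthree\n' | sed '
h
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
/FOO/a\
-- previous line mentions foo --
g
'
```

The last command prints the three input lines unchanged, with the marker line inserted after "Foo two".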


@D.Shawley That's not contradicting anything in the answer, right? Or did you want to add context by linking to the official spec? I can add it to the answer.
Nothing contradictory here. I was happy to see someone referencing POSIX and wanted to add a link. The majority of the answers here were busily bemoaning the "non-standard" macOS implementation of sed which bothered me.
@D.Shawley Added a link to the spec now :)
Alastair Irvine

The Mac version of sed seems a bit limited. One way to work around this is to use a Linux container (via Docker) which has a usable version of sed:

cat your_file.txt | docker run -i busybox /bin/sed -r 's/[0-9]{4}/****/Ig'

this is a particularly heinous thing to do. If anyone is even considering this seriously, just install a GNU sed locally.
Overkill but useful general approach to know!
SolarBear

Use the following to replace all occurrences:

sed 's/foo/bar/gI' mylog.txt

gojimmypi

I had a similar need, and came up with this:

This command simply finds all the files:

grep -i -l -r foo ./* 

This one excludes this_shell.sh (in case you put the command in a script called this_shell.sh), tees the output to the console so you can see what happened, and then uses sed on each file name found to replace the text foo with bar:

grep -i -l -r --exclude "this_shell.sh" foo ./* | tee  /dev/fd/2 | while read -r x; do sed -b -i 's/foo/bar/gi' "$x"; done 

I chose this method because I didn't like having the timestamps changed for files that were not modified. Feeding in the grep result means only the files containing the target text are processed (which likely improves speed as well).

Be sure to back up your files and test before using. This may not work in some environments for files with embedded spaces. (?)
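
Regarding the concern about file names with embedded spaces: one way around it, assuming GNU grep, GNU xargs and GNU sed, is to pass NUL-terminated names through the pipeline instead of reading them line by line. The sketch below runs in a throwaway directory; the file names are made up for illustration.

```shell
# Assumes GNU grep (-Z), GNU xargs (-0, -r) and GNU sed (-i, I flag).
workdir=$(mktemp -d)
printf 'Foo and FOO\n' > "$workdir/has space.txt"
printf 'nothing here\n' > "$workdir/other.txt"

# -l lists matching files, -Z ends each name with NUL instead of a newline;
# xargs -0 splits on NUL, -r skips running sed when nothing matched.
grep -rilZ -- 'foo' "$workdir" | xargs -0 -r sed -i 's/foo/bar/gI'

cat "$workdir/has space.txt"   # prints: bar and bar
```

As in the original approach, only files that grep reports are handed to sed, so files without a match are never rewritten.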


Deb

The following should be fine:

  sed -i 's/foo/bar/gi' mylog.txt