
How can I make my match non greedy in vim?

I have a big HTML file that has lots of markup that looks like this:

<p class="MsoNormal" style="margin: 0in 0in 0pt;">
  <span style="font-size: small; font-family: Times New Roman;">stuff here</span>
</p>

I'm trying to do a Vim search-and-replace to get rid of all class="" and style="" but I'm having trouble making the match ungreedy.

My first attempt was this

%s/style=".*?"//g

but Vim doesn't seem to like the ?. Unfortunately removing the ? makes the match too greedy.

How can I make my match ungreedy?

I think Paul's answer is good. Just to say that "?" does not mean optional in vim (if this is what you want to achieve using "?")
@LB, in many languages, .*? means match any character but be non-greedy. That's what he is trying to achieve.
For people not knowing the term ungreedy/non-greedy: it is also called lazy

Randy Morris

Instead of .* use .\{-}.

%s/style=".\{-}"//g

Also, see :help non-greedy
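
To see the difference outside Vim, here is a minimal Python sketch. Python's re module uses the Perl-style .*? spelling for what Vim writes as .\{-}. It matches class= rather than style= on the first line of the question's markup, because that line has a second quoted attribute after it, which is where a greedy match overshoots. The variable names are only for illustration.

```python
import re

# First line of the markup from the question.
line = '<p class="MsoNormal" style="margin: 0in 0in 0pt;">'

# Greedy: .* runs on to the LAST double quote in the line,
# so the style attribute is swallowed as well.
greedy = re.sub(r'class=".*"', '', line)

# Lazy: .*? stops at the FIRST closing quote.
# This is the behaviour Vim's .\{-} gives you.
lazy = re.sub(r'class=".*?"', '', line)

print(greedy)
print(lazy)
```

The greedy version removes both attributes in one match, while the lazy one removes only class="MsoNormal" and leaves the style attribute alone.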


Not very intuitive, is this something that only vim does?
Everything has its own regular expression language... that's one of the biggest issues with regex.
Lots of these tools matured around the same time and independently developed their own dialect of a regular expression language. Many of these tools also were trying to solve different problems so it makes sense that the syntax could be -potentially wildly- different across these implementations. We have to accept that this is just how the real world works even though it sometimes makes our lives harder as developers. Luckily many tools at least provide a Perl-compatible implementation of regex these days. Unfortunately Vim is not one of them.
If anyone like myself defaults their search to \v (very magic flag) you'll want to use .{-}.
@Shurane @Ziggy Mnemonic: controls the number of repetitions like {1,3} does (braces). The minus sign - means: repeat as little as possible (little == minus) ;)
Vilhelm Gray

Non-greedy search in Vim is done using the \{-} operator. Like this:

%s/style=".\{-}"//g

just try:

:help non-greedy

Paul Tomblin

What's wrong with

%s/style="[^"]*"//g

Although, for my own benefit, I'd still like to better understand the ungreedy thing.
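
For this particular markup the negated character class and the lazy quantifier should give the same result: [^"]* cannot step over a double quote, so it has to stop at the first closing quote. A small Python sketch of that comparison, assuming the attribute values never contain an escaped quote (the variable names are illustrative only):

```python
import re

# First line of the markup from the question.
line = '<p class="MsoNormal" style="margin: 0in 0in 0pt;">'

# Negated class: match any run of characters that are not a double quote.
negated = re.sub(r'class="[^"]*"', '', line)

# Lazy quantifier, for comparison (Vim: .\{-}).
lazy = re.sub(r'class=".*?"', '', line)

print(negated)
print(negated == lazy)
```

The negated class has the side benefit of using the same syntax in Vim, sed, and Perl-style engines.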
FrDarryl

If you're more comfortable with PCRE regex syntax, which supports the non-greedy operator ?, as you asked in the OP, and doesn't require backwhacking grouping and cardinality operators (an utterly counterintuitive Vim syntax requirement, since you're not matching literal characters but specifying operators), and you have [g]vim compiled with the perl feature (test using :ver and inspect the features; if +perl is there you're good to go),

try search/replace using

:perldo s///

Example. Swap src and alt attributes in img tag:

<p class="logo"><a href="/"><img src="/caminoglobal_en/includes/themes/camino/images/header_logo.png" alt=""></a></p>

:perldo s/(src=".*?")\s+(alt=".*?")/$2 $1/

<p class="logo"><a href="/"><img alt="" src="/caminoglobal_en/includes/themes/camino/images/header_logo.png"></a></p>

perldo works great, but unfortunately it does not highlight the selected text while you type the regex.
You can't use perldo for interactive regex find/replace like you can with the native Vim substitute s/. Or is it possible? I'd love to be wrong about that.
William Pursell

I've found that a good solution to this type of question is:

:%!sed ...

(or perl if you prefer). In other words, rather than learning Vim's regex peculiarities, use a tool you already know. Using perl would make the ? modifier work to make the match non-greedy.


Good point, but being able to do /pattern to check that you're matching the pattern correctly before applying it, and using the c modifier in your Vim regular expression, is also nice :)
This is correct. None of the solutions here is truly non-greedy! If you have to match [0-9]\{7} in a line with lots of text and several occurrences of that pattern, no solution here will do. The solutions here only work for simple things (which, to be fair, is what was asked). But if you are doing a little more than searching until the next quotation mark, Vim won't help.
JJoao

With \v (as suggested in several comments)

:%s/\v(style|class)\=".{-}"//g
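
As a rough cross-check of what this substitution does, here is the same pattern in Python. In Vim's very magic mode a bare = is a quantifier, which is why it is written \= above; in Python = is an ordinary character. The second pattern, which also removes the whitespace in front of each attribute, is my own addition and not part of the answer.

```python
import re

# The markup from the question.
html = '''<p class="MsoNormal" style="margin: 0in 0in 0pt;">
  <span style="font-size: small; font-family: Times New Roman;">stuff here</span>
</p>'''

# Same idea as :%s/\v(style|class)\=".{-}"//g
cleaned = re.sub(r'(style|class)=".*?"', '', html)

# Assumption (not in the answer): also drop the whitespace before
# each attribute so no stray spaces are left inside the tags.
tidy = re.sub(r'\s+(?:style|class)=".*?"', '', html)

print(cleaned)
print(tidy)
```

The plain version leaves the separating spaces behind inside each tag, which is harmless in HTML but easy to clean up with the second pattern if it bothers you.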

bain

The plugin eregex.vim handles the Perl-style non-greedy operators *? and +?


@xsilenT github.com/othree/eregex.vim : "It is recommended to install the script using Vundle or pathogen."
Sorry, but I don't know how to use Vundle or pathogen.
Rob Wells

G'day,

Vim's regexp processing is not too brilliant. I've found that the regexp syntax for sed is about the right match for vim's capabilities.

I usually set the search highlighting on (:set hlsearch) and then play with the regexp after entering a slash to enter search mode.

Edit: Mark, that trick to minimise greedy matching is also covered in Dale Dougherty's excellent book "Sed & Awk" (sanitised Amazon link).

Chapter Three "Understanding Regular Expression Syntax" is an excellent intro to the more primitive regexp capabilities involved with sed and awk. Only a short read and highly recommended.

HTH

cheers,


Vim's regex processing is actually quite nice. It can do things that sed can't, like match on line/column numbers or match based on per-language classification of characters as keywords or identifiers or whitespace. It also has zero-width assertions and the ability to put expressions in the right side of a replacement. If you use \v it helps clean the syntax up a lot.
@Brian, cheers. I'll do a help regex and see what I've been missing.
@RobWells, Sed & Awk, which is indeed a very good book imho, does not explicitly spend any words on greedy/lazy quantifiers. As a proof, there is absolutely no occurrence of the words greed or greedy in the book, and there's only one, but unrelated, occurrence of the word lazy.
@EnricoMariaDeAngelis It is covered, but the example does not use the term explicitly. It is about how to tailor your regex to use the "not" operator to achieve non-greedy matches. The terms greedy and lazy arrived with Perl's NFA engine, when operators were introduced specifically to modify greedy match behaviour.