
How to use regex with find command?

I have some images named with generated uuid1 string. For example 81397018-b84a-11e0-9d2a-001b77dc0bed.jpg. I want to find out all these images using "find" command:

find . -regex "[a-f0-9\-]\{36\}\.jpg".

But it doesn't work. Something wrong with the regex? Could someone help me with this?

Maybe change the regextype. The default is Emacs regular expressions, whatever that means.

Susam Pal
find . -regextype sed -regex ".*/[a-f0-9\-]\{36\}\.jpg"

Note that you need to specify .*/ in the beginning because find matches the whole path.

Example:

susam@nifty:~/so$ find . -name "*.jpg"
./foo-111.jpg
./test/81397018-b84a-11e0-9d2a-001b77dc0bed.jpg
./81397018-b84a-11e0-9d2a-001b77dc0bed.jpg
susam@nifty:~/so$ 
susam@nifty:~/so$ find . -regextype sed -regex ".*/[a-f0-9\-]\{36\}\.jpg"
./test/81397018-b84a-11e0-9d2a-001b77dc0bed.jpg
./81397018-b84a-11e0-9d2a-001b77dc0bed.jpg
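The whole-path behaviour can also be reproduced without any particular regex dialect. The following is a minimal sketch in a scratch directory; the file names are made up for illustration, and the patterns avoid counted repetition so that they should behave the same under find's default regex syntax.

```shell
#!/bin/sh
# Scratch directory so nothing outside it is touched.
dir=$(mktemp -d)
mkdir "$dir/test"
touch "$dir/foo-111.jpg" "$dir/test/foo-222.jpg"
cd "$dir" || exit 1

# Pattern describes only the file name: nothing can match, because every
# path that "find ." tests starts with "./".
name_only=$(find . -regex 'foo-[0-9]*\.jpg')

# Same pattern with the ".*/" prefix covering the directory part.
with_prefix=$(find . -regex '.*/foo-[0-9]*\.jpg' | LC_ALL=C sort)

printf 'name only:   [%s]\n' "$name_only"
printf 'with prefix: [%s]\n' "$with_prefix"
```

The first command prints nothing, while the second lists both files, which is the same difference as between the command in the question and the one above.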

My version of find:

$ find --version
find (GNU findutils) 4.4.2
Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Eric B. Decker, James Youngman, and Kevin Dalley.
Built using GNU gnulib version e5573b1bad88bfabcda181b9e0125fb0c52b7d3b
Features enabled: D_TYPE O_NOFOLLOW(enabled) LEAF_OPTIMISATION FTS() CBO(level=0) 
susam@nifty:~/so$ 
susam@nifty:~/so$ find . -regextype foo -regex ".*/[a-f0-9\-]\{36\}\.jpg"
find: Unknown regular expression type `foo'; valid types are `findutils-default', `awk', `egrep', `ed', `emacs', `gnu-awk', `grep', `posix-awk', `posix-basic', `posix-egrep', `posix-extended', `posix-minimal-basic', `sed'.

@Tom it's the way regex in find works. According to the man page, the regex matches the whole file path, directories included, which means there's an implicit "^ ... $" surrounding your regex. It must match the WHOLE result line.
I don't think you need the / in .*/ because .* matches zero or more of (almost) any character.
For those (like me) who didn't read the regex properly first time: Note the backslashes preceding special regex characters, e.g.: \{36\}
I had trouble finding the full list of regex types (manpage is not up to date): valid types are 'findutils-default', 'awk', 'egrep', 'ed', 'emacs', 'gnu-awk', 'grep', 'posix-awk', 'posix-basic', 'posix-egrep', 'posix-extended', 'posix-minimal-basic', 'sed'.
Make sure to put the -regextype flag before the -regex flag, otherwise it does not apply!
Paŭlo Ebermann

The -regex find expression matches the whole name, including the relative path from the current directory. For find . this always starts with ./, then any directories.

Also, these are emacs regular expressions, which have other escaping rules than the usual egrep regular expressions.

If these are all directly in the current directory, then

find . -regex '\./[a-f0-9\-]\{36\}\.jpg'

should work. (I'm not really sure - I can't get the counted repetition to work here.) You can switch to egrep expressions by -regextype posix-egrep:

find . -regextype posix-egrep -regex '\./[a-f0-9\-]{36}\.jpg'

(Note that everything said here is for GNU find, I don't know anything about the BSD one which is also the default on Mac.)


I had parentheses for multiple matching strings in my regex, so the posix-egrep type worked for me.
Something to note, -regextype is an option for GNU find and not BSD (at least not Mac BSD-like) find. If this option is not available, be sure to install GNU find. If on a Mac that's possible with the brew package findutils. Find is then available via gfind.
regextype posix-egrep did the task for me. I think the default is regextype emacs.
posix-egrep can be shortened to just egrep
yarian

Judging from other answers, it seems this might be find's fault.

However you can do it this way instead:

find . * | grep -P "[a-f0-9\-]{36}\.jpg"

You might have to tweak the grep a bit and use different options depending on what you want but it works.


Worked well for me and provides a great degree of freedom with respect to the regex.
A downside with this is that you can't take advantage of find's -prune functionality which will skip over certain directories altogether. Most often this isn't really important, but it is worth mentioning.
-prune will still work, I guess. It would be more dangerous to use -exec - it would run on all files and not just those that grep allows to pass.
find . * does the same as plain find (a shorter command).
Stan Kurdziel

On Mac OS X (BSD find), this has the same effect as the accepted answer:

$ find -E . -regex ".*/[a-f0-9\-]{36}\.jpg"

man find says -E uses extended regex support

NOTE: the .*/ prefix is needed to match a complete path.

For comparison purposes, here's the GNU/Linux version:

$ find . -regextype sed -regex ".*/[a-f0-9\-]\{36\}\.jpg"
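If one script has to run on both platforms, the two variants can be wrapped in a small function. This is only a sketch: find_uuid_jpgs is a hypothetical helper name, and it assumes the installed find is either GNU find or a BSD find that accepts -E (other implementations are not handled).

```shell
#!/bin/sh
# find_uuid_jpgs DIR: hypothetical helper, not part of find itself.
# Assumption: find is either GNU find or a BSD find that accepts -E.
find_uuid_jpgs() {
    pattern='.*/[a-f0-9-]{36}\.jpg'
    if find --version 2>/dev/null | grep -q 'GNU findutils'; then
        # GNU find: choose the dialect with -regextype, placed before -regex.
        find "$1" -regextype posix-extended -regex "$pattern"
    else
        # BSD find: -E switches -regex to extended regular expressions.
        find -E "$1" -regex "$pattern"
    fi
}

# Usage in a scratch directory with one matching and one non-matching file.
dir=$(mktemp -d)
touch "$dir/81397018-b84a-11e0-9d2a-001b77dc0bed.jpg" "$dir/foo-111.jpg"
matches=$(find_uuid_jpgs "$dir")
printf '%s\n' "$matches"
rm -rf "$dir"
```

Both branches use the same extended-syntax pattern, so only the flag that selects the dialect differs between the two platforms.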

Seems -E is not available on Ubuntu (tested on WSL Ubuntu)
@Clever Little Monkey - No, the accepted answer should work on Ubuntu, this variation is for Mac OS X specifically (or perhaps another BSD variant like FreeBSD)
The -E option is not available on the OpenBSD version of find
binbjz

Simple way - you can specify .* in the beginning because find matches the whole path.

$ find . -regextype egrep -regex '.*[a-f0-9\-]{36}\.jpg$'

find version

$ find --version
find (GNU findutils) 4.6.0
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Eric B. Decker, James Youngman, and Kevin Dalley.
Features enabled: D_TYPE O_NOFOLLOW(enabled) LEAF_OPTIMISATION FTS(FTS_CWDFD) CBO(level=2)

"You can specify .* in the beginning because find matches the whole path" is a subtle but very good point. If you are in dir/ and search for samplefile.txt with find . -regex 'samplefile.*', find won't match anything.
I do prefer egrep to sed - so thanks
thiton

Try to use single quotes (') to avoid shell escaping of your string. Remember that the expression needs to match the whole path, i.e. needs to look like:

find . -regex '\./[a-f0-9-]*\.jpg'

Apart from that, it seems that my find (GNU 4.4.2) only knows basic regular expressions, especially not the {36} syntax. I think you'll have to make do without it.


jhoepken

The regular expression has to match the whole path that find prints, including the leading ./, not just the file name. In your example, the

find . -regex "[a-f0-9\-]\{36\}\.jpg"

should be changed into

find . -regex "./[a-f0-9\-]\{36\}\.jpg"

On most Linux systems, some regular expression syntax is not recognized by the default regex type, so you may have to specify -regextype explicitly. With posix-extended the braces are written without backslashes:

find . -regextype posix-extended -regex "./[a-f0-9\-]{36}\.jpg"

Mark

If you want to maintain cross-platform compatibility, I could find no built-in regex search option that works across different versions of find in a consistent way.

Combine with grep

As suggested by @yarian, you could run an over-inclusive find and then run the output through grep:

find . | grep -E '<POSIX regex>'

This is likely to be slow, but it will give you cross-platform regex search if you need a full regular expression and can't reformulate your search as a glob.
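As a concrete sketch of this approach for the UUID-named images from the question (the scratch directory and file names are made up for illustration; it assumes no file name contains a newline, since grep works line by line):

```shell
#!/bin/sh
dir=$(mktemp -d)
mkdir "$dir/test"
touch "$dir/foo-111.jpg" \
      "$dir/81397018-b84a-11e0-9d2a-001b77dc0bed.jpg" \
      "$dir/test/81397018-b84a-11e0-9d2a-001b77dc0bed.jpg"
cd "$dir" || exit 1

# find lists every regular file; grep -E keeps the paths whose last
# component is 36 hex-or-dash characters followed by ".jpg".
# "/" and "$" pin the pattern to the final path component.
matches=$(find . -type f | grep -E '/[a-f0-9-]{36}\.jpg$' | LC_ALL=C sort)
printf '%s\n' "$matches"
```

Because grep searches within each line instead of matching the whole line, no .*/ prefix is needed here; the leading / and trailing $ do the anchoring.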

Rewrite as a glob

The -name option is compatible with globs which will provide limited (but cross-platform) pattern matching.

You can use the basic shell patterns *, ? and [...]. Note that brace expansion ({}) and ** are features of the shell, not of find, so -name does not interpret them. Although not as powerful as full regex, you might be able to reformulate your search as a glob depending on your use case.

Internet search for globs - many tutorials detailing full functionality are available online
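For the UUID-named images from the question, a glob version might look like the sketch below (scratch directory and file names are made up for illustration). It is looser than the regex: it checks the length and the dash positions, but not that the other characters are hex digits.

```shell
#!/bin/sh
dir=$(mktemp -d)
touch "$dir/foo-111.jpg" "$dir/81397018-b84a-11e0-9d2a-001b77dc0bed.jpg"
cd "$dir" || exit 1

# "?" matches exactly one character, so this glob fixes the 8-4-4-4-12
# layout of a UUID. -name is tested against the file name only, so no
# directory prefix is needed.
matches=$(find . -type f -name '????????-????-????-????-????????????.jpg')
printf '%s\n' "$matches"
```

Replacing each ? with [0-9a-f] would restrict the characters as well, at the cost of a much longer pattern.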


Kevin

One thing I don't see covered is how to combine regular expressions with regular find syntax.

E.g., to find core dump files on BSD / Linux, I change to the root directory I want to scan (e.g. cd /) and then execute:

find \( -path "./dev" -o -path "./sys" -o -path "./proc" \) -prune -o -type f -regextype sed -regex ".*\.core$" -exec du -h {} \; 2> /dev/null

So I am using the prune command to exclude multiple system directories, before doing regular expression on the remaining files. Any error output (stderr) is deleted.

The important part is to use the Find syntax first, then OR (-o) with the regular expression.
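The same structure can be tried safely in a scratch directory. This is a reduced sketch, not the command above: the directory and file names are invented, and -regextype is left out because the pattern uses nothing specific to one regex dialect.

```shell
#!/bin/sh
dir=$(mktemp -d)
mkdir "$dir/proc" "$dir/home"
touch "$dir/proc/fake.core" "$dir/home/app.core" "$dir/home/notes.txt"
cd "$dir" || exit 1

# Left of -o: when the path is ./proc, -prune stops find from descending.
# Right of -o: every other regular file is tested against the regex.
# The explicit -print keeps the pruned directory itself out of the output.
matches=$(find . -path ./proc -prune -o -type f -regex '.*\.core' -print)
printf '%s\n' "$matches"
```

Only ./home/app.core is printed: ./proc/fake.core is never visited, and notes.txt fails the regex.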