
Printing everything except the first field with awk

I have a file that looks like this:

AE  United Arab Emirates
AG  Antigua & Barbuda
AN  Netherlands Antilles
AS  American Samoa
BA  Bosnia and Herzegovina
BF  Burkina Faso
BN  Brunei Darussalam

And I'd like to invert the order, printing first everything except $1 and then $1:

United Arab Emirates AE

How can I do the "everything except field 1" trick?

Hi @cfisher, it can be done without a loop and without the extra space.
The formulation of the question is kind of misleading. My two cents: "How to move the first field to the last position in awk"

Peter Mortensen

$1="" leaves a space as Ben Jackson mentioned, so use a for loop:

awk '{for (i=2; i<=NF; i++) print $i}' filename

So if your string was "one two three", the output will be:

two
three

If you want the result in one row, you could do as follows:

awk '{for (i=2; i<NF; i++) printf "%s ", $i; print $NF}' filename

This will give you: "two three"
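
As a minimal sketch of the same loop idea, the fields can be joined into a string first and printed once, which avoids both the trailing space and the risk of a field being interpreted as a printf format. The function name rest_of_fields is made up for illustration; it reads stdin or any files passed as arguments.

```shell
# Hypothetical helper: print fields 2..NF of each line, joined by OFS.
rest_of_fields() {
    awk '{
        out = ""
        for (i = 2; i <= NF; i++)
            out = out (i > 2 ? OFS : "") $i   # separator only between fields
        print out
    }' "$@"
}

printf 'one two three\n' | rest_of_fields   # prints: two three
```

Because the separator is OFS rather than a hard-coded space, passing a different OFS with -v inside the function would change the joiner without touching the loop.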


And an extra trailing space.
Better to use: awk '{for(i=2;i<=NF;i++){ printf("%s",( (i>2) ? OFS : "" ) $i) } ; print "" ;}' which prints fields 2 to NF, adding the output field separator as needed (i.e., everywhere except before $2). The final print "" emits a newline to end the current output line. This also works if you change FS/OFS (i.e., the separator won't always be a space).
The second one worked real nice for me. The first one, not so much. Not really sure why. It diced up the whole text.
Ben Jackson

Assigning $1 works but it will leave a leading space: awk '{first = $1; $1 = ""; print $0, first; }'

You can also find the number of columns in NF and use that in a loop.
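
The leading space described above can be made visible with a small experiment. This is only a sketch: the two function names are invented for illustration, and the sub(/^ /, "") fix assumes the default OFS of a single space.

```shell
# Hypothetical helpers; both read stdin or the files given as arguments.
move_first_keep_space() {
    awk '{ first = $1; $1 = ""; print $0, first }' "$@"
}
move_first_trim() {
    # Same idea, but strip the single leading OFS left behind by $1 = "".
    awk '{ first = $1; $1 = ""; sub(/^ /, ""); print $0, first }' "$@"
}

# Brackets are added by sed only to make the leading space visible.
printf 'AE  United Arab Emirates\n' | move_first_keep_space | sed 's/^/[/; s/$/]/'
printf 'AE  United Arab Emirates\n' | move_first_trim | sed 's/^/[/; s/$/]/'
```

The first command should show a space right after the opening bracket; the second should not.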


For the totally lazy; here is klashxx' code.
Great. Got rid of the leading space with sed: awk '{first = $1; $1=""; print $0}' | sed 's/^ //'
The space is easily removed with VIM pressing 'Ctrl+V Gd' in normal mode
to remove leading space you can also just use gsub : awk '/>/ {first = $1; $1=""; gsub(/^ /,""); print $0, first}' somefile
Ciro Santilli Путлер Капут 六四事

Use the cut command with -f 2- (POSIX) or --complement (not POSIX):

$ echo a b c | cut -f 2- -d ' '
b c
$ echo a b c | cut -f 1 -d ' '
a
$ echo a b c | cut -f 1,2 -d ' '
a b
$ echo a b c | cut -f 1 -d ' ' --complement
b c
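
One thing to watch with the file from the question: cut treats every single space as a delimiter, so the two spaces after the country code produce an empty second field. A small sketch, assuming the sample line below matches the real file:

```shell
line='AE  United Arab Emirates'   # two spaces after the code, as in the question

# Field 2 is the empty string between the two spaces, so a leading space remains.
printf '%s\n' "$line" | cut -d ' ' -f 2-

# Starting at field 3 skips the empty field as well.
printf '%s\n' "$line" | cut -d ' ' -f 3-
```

This is the flip side of cut not collapsing repeated spaces: the spacing is preserved, but the field numbers shift.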

While not answering the question specific to awk, I found this most useful as awk was removing duplicate spaces, and cut does not.
echo a b c | cut -d' ' -f 2- is an alternative
Nice - @Luis solution works on the Mac, which doesn't support --complement
Peter Mortensen

Maybe the most concise way:

$ awk '{$(NF+1)=$1;$1=""}sub(FS,"")' infile
United Arab Emirates AE
Antigua & Barbuda AG
Netherlands Antilles AN
American Samoa AS
Bosnia and Herzegovina BA
Burkina Faso BF
Brunei Darussalam BN

Explanation:

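$(NF+1)=$1: Creates a new last field holding a copy of the first field.

$1="": Sets the original first field to the empty string.

sub(FS,""): After the first two actions, {$(NF+1)=$1;$1=""}, removes the leading field separator with sub. Because sub returns the number of substitutions made (1 here), it also acts as a true pattern, so the final print is implicit.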
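
A side effect worth knowing, sketched below with an invented function name: since the print depends on sub actually replacing something, a completely empty input line produces no output at all. This assumes the default FS of a single space.

```shell
# Hypothetical wrapper around the one-liner above.
rotate_concise() {
    awk '{ $(NF+1) = $1; $1 = "" } sub(FS, "")' "$@"
}

# The empty middle line has no separator to remove, so sub returns 0
# and that line is not printed.
printf 'AE  United Arab Emirates\n\nBF  Burkina Faso\n' | rotate_concise
```

If blank lines must be preserved, replacing the pattern with a separate statement plus a trailing 1 would be one way to do it.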


Maybe I'm missing something, but this didn't work for me on gawk 5.1 -- it just moved the first field to the end of the print instead of ignoring it.
That's because the formulation of the question is misleading @SorenBjornstad. Should be: "How to move the first field to the last position"
This is a beautiful solution to the problem of having a leading or trailing space when doing the $1="" (or $NF="", or whatever you're doing). +1 from me.
NeronLeVelu
awk '{sub($1 FS,"")}7' YourFile

Remove the first field and its separator, then print the result (7 is a non-zero value, so the default action of printing $0 runs).
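
To illustrate the point about 7: any non-zero constant works as an always-true pattern, so 7 and the more common 1 behave the same. A quick sketch with made-up function names:

```shell
# Hypothetical helpers that differ only in the constant used as the pattern.
drop_first_7() { awk '{ sub($1 FS, "") } 7' "$@"; }
drop_first_1() { awk '{ sub($1 FS, "") } 1' "$@"; }

printf 'a b c\n' | drop_first_7   # prints: b c
printf 'a b c\n' | drop_first_1   # prints: b c
```

Note that $1 is used as a regular expression here, so a first field containing characters such as . or * may match more or less than intended.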


Best answer! Upvoted. How's it different from just using 1? I wonder the usage of this pattern and wanted to understand that. thanks!
dubiousjim
awk '{ saved = $1; $1 = ""; print substr($0, 2), saved }'

Setting the first field to "" leaves a single copy of OFS at the start of $0. Assuming that OFS is only a single character (by default, it's a single space), we can remove it with substr($0, 2). Then we append the saved copy of $1.
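
If OFS may be longer than one character, a possible variation is to skip length(OFS) characters instead of exactly one. This is a sketch, not part of the answer above; rotate_with_ofs and its argument order are invented for illustration.

```shell
# Hypothetical helper: first argument is the OFS to use, the rest are input files.
rotate_with_ofs() {
    sep=$1; shift
    awk -v OFS="$sep" '{
        saved = $1
        $1 = ""                                   # $0 now starts with one copy of OFS
        print substr($0, length(OFS) + 1), saved  # skip that copy, whatever its length
    }' "$@"
}

printf 'AE  United Arab Emirates\n' | rotate_with_ofs ' | '
# prints: United | Arab | Emirates | AE
```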


Chris Koknat

If you're open to a Perl solution...

perl -lane 'print join " ",@F[1..$#F,0]' file

is a simple solution with an input/output separator of one space, which produces:

United Arab Emirates AE
Antigua & Barbuda AG
Netherlands Antilles AN
American Samoa AS
Bosnia and Herzegovina BA
Burkina Faso BF
Brunei Darussalam BN

This next one is slightly more complex

perl -F'  ' -lane 'print join "  ",@F[1..$#F,0]' file

and assumes that the input/output separator is two spaces:

United Arab Emirates  AE
Antigua & Barbuda  AG
Netherlands Antilles  AN
American Samoa  AS
Bosnia and Herzegovina  BA
Burkina Faso  BF
Brunei Darussalam  BN

These command-line options are used:

-n loop around every line of the input file, do not automatically print every line

-l removes newlines before processing, and adds them back in afterwards

-a autosplit mode – split input lines into the @F array. Defaults to splitting on whitespace

-F autosplit modifier, in this example splits on '  ' (two spaces)

-e execute the following perl code

@F is the array of words in each line, indexed starting with 0
$#F is the index of the last element of @F (one less than the number of words)
@F[1..$#F] is an array slice of element 1 through the last element
@F[1..$#F,0] is an array slice of element 1 through the last element plus element 0


I ran it and got an extra number at the end, so I used this version instead: perl -lane 'shift @F; print join " ", @F'
fedorqui

Let's shift every field one position to the left and put the original first field last:

$ awk '{a=$1; for (i=2; i<=NF; i++) $(i-1)=$i; $NF=a}1' file
United Arab Emirates AE
Antigua & Barbuda AG
Netherlands Antilles AN
American Samoa AS
Bosnia and Herzegovina BA
Burkina Faso BF
Brunei Darussalam BN

Explanation

a=$1 save the first value into a temporary variable.

for (i=2; i<=NF; i++) $(i-1)=$i save the Nth field value into the (N-1)th field.

$NF=a save the first value ($1) into the last field.

{}1 true condition to make awk perform the default action: {print $0}.

This way, if you happen to have another field separator, the result is also good:

$ cat c
AE-United-Arab-Emirates
AG-Antigua-&-Barbuda
AN-Netherlands-Antilles
AS-American-Samoa
BA-Bosnia-and-Herzegovina
BF-Burkina-Faso
BN-Brunei-Darussalam

$ awk 'BEGIN{OFS=FS="-"}{a=$1; for (i=2; i<=NF; i++) $(i-1)=$i; $NF=a}1' c
United-Arab-Emirates-AE
Antigua-&-Barbuda-AG
Netherlands-Antilles-AN
American-Samoa-AS
Bosnia-and-Herzegovina-BA
Burkina-Faso-BF
Brunei-Darussalam-BN

Dennis Williamson

The field separator in gawk (at least) can be a string as well as a character (it can also be a regex). If your data is consistent, then this will work:

awk -F "  " '{print $2,$1}' inputfile

That's two spaces between the double quotes.
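
If the gap is sometimes wider than two spaces, one possible variation is to give FS as a regular expression meaning "two or more spaces". This is a sketch under the same assumption as above, namely that runs of two or more spaces only occur between the code and the name; the function name is invented.

```shell
# Hypothetical helper: split on runs of two or more spaces, then swap.
swap_on_wide_gap() {
    awk -F '  +' '{ print $2, $1 }' "$@"
}

# Second line uses three spaces after the code to show the regex at work.
printf 'AE  United Arab Emirates\nAG   Antigua & Barbuda\n' | swap_on_wide_gap
```

Single spaces inside the country name are left alone because they never match the two-or-more pattern.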


Best answer for the situation at hand, but, technically, this doesn't answer the question of how to print everything but the first field.
@DanMoulding: As long as the file is consistent in the use of two spaces to separate the country code and there are no other occurrences of two spaces together, my answer does address the question.
People who land on this question get here because they want to know how to print everything but the first field (see the question title). That's how I landed here. Your answer shows how to print the first field followed by the second field. While this is probably the best solution to the OP's particular situation, it doesn't solve the general problem of how to print everything but the first field.
Arkku

awk '{ tmp = $1; sub(/^[^ ]+ +/, ""); print $0, tmp }'


Community

Option 1

There is a solution that works with some versions of awk:

awk '{ $(NF+1)=$1;$1="";$0=$0;} NF=NF ' infile.txt

Explanation:

       $(NF+1)=$1                          # add a new field equal to field 1.
                  $1=""                    # erase the contents of field 1.
                        $0=$0;} NF=NF      # force a re-calc of fields.
                                           # and use NF to promote a print.

Result:

United Arab Emirates AE
Antigua & Barbuda AG
Netherlands Antilles AN
American Samoa AS
Bosnia and Herzegovina BA
Burkina Faso BF
Brunei Darussalam BN

However that might fail with older versions of awk.

Option 2

awk '{ $(NF+1)=$1;$1="";sub(OFS,"");}1' infile.txt

That is:

awk '{                                      # call awk.
       $(NF+1)=$1;                          # Add one trailing field.
                  $1="";                    # Erase first field.
                        sub(OFS,"");        # remove leading OFS.
                                    }1'     # print the line.

Note that what needs to be erased is the OFS, not the FS. The line gets re-calculated when the field $1 is assigned. That changes all runs of FS to one OFS.

But even that option still fails with several delimiters, as is clearly shown by changing the OFS:

awk -v OFS=';' '{ $(NF+1)=$1;$1="";sub(OFS,"");}1' infile.txt

That line will output:

United;Arab;Emirates;AE
Antigua;&;Barbuda;AG
Netherlands;Antilles;AN
American;Samoa;AS
Bosnia;and;Herzegovina;BA
Burkina;Faso;BF
Brunei;Darussalam;BN

That reveals that runs of FS are being changed to one OFS. The only way to avoid that is to avoid the field re-calculation. One function that can avoid re-calc is sub. The first field could be captured, then removed from $0 with sub, and then both re-printed.
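
The difference between the two behaviours can be seen in isolation with a tiny experiment (function names are made up for illustration): assigning to any field rebuilds $0 with OFS, while sub edits $0 directly and leaves the original spacing alone.

```shell
# Hypothetical helpers contrasting field assignment with sub on $0.
rebuild_record() { awk '{ $1 = $1 } 1' "$@"; }           # forces $0 to be rebuilt
edit_with_sub()  { awk '{ sub(/^a/, "x") } 1' "$@"; }    # edits $0 in place

printf 'a   b    c\n' | rebuild_record   # prints: a b c
printf 'a   b    c\n' | edit_with_sub    # prints: x   b    c
```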

Option 3

awk '{ a=$1;sub("[^"FS"]+["FS"]+",""); print $0, a;}' infile.txt
       a=$1                                   # capture first field.
       sub( "                                 # replace: 
             [^"FS"]+                         # A run of non-FS
                     ["FS"]+                  # followed by a run of FS.
                            " , ""            # for nothing.
                                  )           # Defaults to $0 (the whole line).
       print $0, a                   # Print in reverse order, with OFS.


United Arab Emirates AE
Antigua & Barbuda AG
Netherlands Antilles AN
American Samoa AS
Bosnia and Herzegovina BA
Burkina Faso BF
Brunei Darussalam BN

Even if we change the FS, the OFS and/or add more delimiters, it works. If the input file is changed to:

AE..United....Arab....Emirates
AG..Antigua....&...Barbuda
AN..Netherlands...Antilles
AS..American...Samoa
BA..Bosnia...and...Herzegovina
BF..Burkina...Faso
BN..Brunei...Darussalam

And the command changes to:

awk -vFS='.' -vOFS=';' '{a=$1;sub("[^"FS"]+["FS"]+",""); print $0,a;}' infile.txt

The output will be (still preserving delimiters):

United....Arab....Emirates;AE
Antigua....&...Barbuda;AG
Netherlands...Antilles;AN
American...Samoa;AS
Bosnia...and...Herzegovina;BA
Burkina...Faso;BF
Brunei...Darussalam;BN

The command could be expanded to several fields, but only with modern awks and with --re-interval option active. This command on the original file:

awk -vn=2 '{a=$1;b=$2;sub("([^"FS"]+["FS"]+){"n"}","");print $0,a,b;}' infile.txt

Will output this:

Arab Emirates AE United
& Barbuda AG Antigua
Antilles AN Netherlands
Samoa AS American
and Herzegovina BA Bosnia
Faso BF Burkina
Darussalam BN Brunei
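
Where interval expressions are not available, a similar multi-field move can be sketched with a plain loop that strips one field per iteration. This is an alternative, not the command above: rotate_n is an invented name, and the sketch assumes fields are separated by spaces only and that lines have no leading blanks.

```shell
# Hypothetical helper: first argument is how many leading fields to move to the end.
rotate_n() {
    n=$1; shift
    awk -v n="$n" '{
        moved = ""
        for (i = 1; i <= n && NF > 0; i++) {
            moved = moved OFS $1      # remember the current first field
            sub(/^[^ ]+ */, "")       # drop it and the spaces after it from $0
        }
        print $0 moved                # remaining text, then the moved fields
    }' "$@"
}

printf 'AE  United Arab Emirates\n' | rotate_n 2
# prints: Arab Emirates AE United
```

Because only sub touches $0, the spacing between the remaining fields is preserved, as in Option 3.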

ZeBadger

There's a sed option too...

 sed 's/\([^ ]*\)  \(.*\)/\2 \1/' inputfile.txt

Explained...

Swap
\([^ ]*\) = Match everything up to the first space, store as group 1
\(.*\)    = Match everything else, store as group 2
With
\2        = Insert group 2
\1        = Insert group 1

More thoroughly explained...

s    = Substitute
/    = Beginning of source pattern
\(   = start storing this value
[^ ] = text not matching the space character
*    = 0 or more of the previous pattern
\)   = stop storing this value
(two spaces) = match the two-space separator literally
\(   = start storing this value
.    = any character
*    = 0 or more of the previous pattern
\)   = stop storing this value
/    = End of source pattern, beginning of replacement
\2   = Retrieve the 2nd stored value
\1   = Retrieve the 1st stored value
/    = end of replacement
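
A slightly more tolerant variant, as a sketch: with extended regular expressions (the -E flag, supported by GNU and BSD sed) the pattern can be anchored and can accept one or more spaces after the first word. The function name is invented for illustration.

```shell
# Hypothetical helper: move the first space-delimited word to the end of the line.
swap_first_sed() {
    sed -E 's/^([^ ]+) +(.*)$/\2 \1/' "$@"
}

printf 'AE  United Arab Emirates\n' | swap_first_sed
# prints: United Arab Emirates AE
```

Lines without any space do not match the pattern and pass through unchanged.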

Kjetil S.

If you're open to another Perl solution:

perl -ple 's/^(\S+)\s+(.*)/$2 $1/' file

Wesley Rice

A first stab at it seems to work for your particular case.

awk '{ f = $1; sub(/^[A-Z][A-Z][ ][ ]/, ""); print $0, f }'

Rondo

Yet another way...

...this rejoins the fields 2 thru NF with the FS and outputs one line per line of input

awk '{for (i=2;i<=NF;i++){printf "%s", $i; if (i < NF) {printf "%s", FS};}printf "%s", RS}'

I use this with git to see what files have been modified in my working dir:

git diff| \
    grep '\-\-git'| \
    awk '{print$NF}'| \
    awk -F"/" '{for (i=2;i<=NF;i++){printf "%s", $i; if (i < NF) {printf "%s", FS};}printf "%s", RS}'

This was before I learned about git diff --name-only
Julian

Another easy way, using the cat command:

cat filename | awk '{print $2,$3,$4,$5,$6,$1}' > newfilename

I downvoted because this is not a dynamic approach. With this you need to know the number of arguments and assume your data is consistent. Data is almost never consistent and your approach must take this into account most of the time.