Pythonic way to combine for-loop and if-statement

python loops if-statement for-loop

I know how to use both for loops and if statements on separate lines, such as:

>>> a = [2,3,4,5,6,7,8,9,0]
>>> xyz = [0,12,4,6,242,7,9]
>>> for x in xyz:
...     if x in a:
...         print(x)
...
0
4
6
7
9

And I know I can use a list comprehension to combine these when the statements are simple, such as:

print([x for x in xyz if x in a])

But what I can't find is a good example anywhere (to copy and learn from) demonstrating a complex set of commands (not just "print x") that occur following a combination of a for loop and some if statements. Something that I would expect looks like:

for x in xyz if x not in a:
    print(x...)

Is this just not the way Python is supposed to work?

That's how it is... don't overcomplicate things by trying to simplify them. Pythonic does not mean to avoid every explicit for loop and if statement.

You can use the list generated in your list comprehension in a for loop. That would somewhat look like your last example.

@Chewy, proper data structures will make the code faster, not syntactic sugar. For example, x in a is slow if a is a list.

This is Python, an interpreted language; why is anyone discussing how fast code is at all?

@ArtOfWarfare maybe because it is being used in places where it shouldn't. Where speed really matters.

Ski3r3n

You can use generator expressions like this:

gen = (x for x in xyz if x not in a)

for x in gen:
    print(x)

gen = (y for (x,y) in enumerate(xyz) if x not in a) prints only 12 when I run for x in gen: print x. Why the unexpected behavior with enumerate?

Possible, but not nicer than the original for and if blocks.

@ChewyChunks. That would work but the call to enumerate is redundant.
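A sketch of why the enumerate version prints only 12, using the lists from the question: enumerate yields (index, value) pairs, so in that expression x is the index and y is the element. The indices run from 0 to 6, and only index 1 is missing from a, so only xyz[1] passes the filter. The names by_index and by_value are made up for illustration.

```python
a = [2, 3, 4, 5, 6, 7, 8, 9, 0]
xyz = [0, 12, 4, 6, 242, 7, 9]

# enumerate yields (index, value) pairs; here x is the index, y the element.
by_index = [y for (x, y) in enumerate(xyz) if x not in a]

# Filtering on the element itself needs no enumerate at all.
by_value = [x for x in xyz if x not in a]

print(by_index)  # [12]
print(by_value)  # [12, 242]
```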

I really miss being able to write for x in xyz if x: in Python.

for x in (x for x in xyz if x not in a): works for me, but why you shouldn't just be able to do for x in xyz if x not in a:, I'm not sure...

johnsyweb

As per The Zen of Python (if you are wondering whether your code is "Pythonic", that's the place to go):

Beautiful is better than ugly.

Explicit is better than implicit.

Simple is better than complex.

Flat is better than nested.

Readability counts.

The Pythonic way of getting the sorted intersection of two sets is:

>>> sorted(set(a).intersection(xyz))
[0, 4, 6, 7, 9]

Or those elements that are in xyz but not in a:

>>> sorted(set(xyz).difference(a))
[12, 242]

But for a more complicated loop you may want to flatten it by iterating over a well-named generator expression and/or calling out to a well-named function. Trying to fit everything on one line is rarely "Pythonic".
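As a rough sketch of that advice, assuming the lists from the question: the filter lives in a named generator expression and the "complex" body lives in a named function. The helper describe and the names known, values_missing_from_a and results are hypothetical, chosen only for illustration.

```python
a = [2, 3, 4, 5, 6, 7, 8, 9, 0]
xyz = [0, 12, 4, 6, 242, 7, 9]


def describe(value):
    """Hypothetical stand-in for a more complex loop body."""
    parity = "even" if value % 2 == 0 else "odd"
    return "{} is {} and not in a".format(value, parity)


known = set(a)  # membership tests against a set do not scan the whole list
values_missing_from_a = (x for x in xyz if x not in known)

results = []
for value in values_missing_from_a:
    results.append(describe(value))

for line in results:
    print(line)
```

The loop header stays on one flat line, and the condition and the body each have a name that says what they do.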

Update following additional comments on your question and the accepted answer

I'm not sure what you are trying to do with enumerate, but if a is a dictionary, you probably want to use the keys, like this:

>>> a = {
...     2: 'Turtle Doves',
...     3: 'French Hens',
...     4: 'Colly Birds',
...     5: 'Gold Rings',
...     6: 'Geese-a-Laying',
...     7: 'Swans-a-Swimming',
...     8: 'Maids-a-Milking',
...     9: 'Ladies Dancing',
...     0: 'Camel Books',
... }
>>>
>>> xyz = [0, 12, 4, 6, 242, 7, 9]
>>>
>>> known_things = sorted(set(a.keys()).intersection(xyz))
>>> unknown_things = sorted(set(xyz).difference(a.keys()))
>>>
>>> for thing in known_things:
...     print('I know about', a[thing])
...
I know about Camel Books
I know about Colly Birds
I know about Geese-a-Laying
I know about Swans-a-Swimming
I know about Ladies Dancing
>>> print('...but...')
...but...
>>>
>>> for thing in unknown_things:
...     print("I don't know what happened on the {0}th day of Christmas".format(thing))
...
I don't know what happened on the 12th day of Christmas
I don't know what happened on the 242th day of Christmas

Sounds like from the comments below, I should be studying up on generators. I've never used them. Thanks. Is a generator faster than the equivalent combination of FOR and IF statements? I've also used sets, but sometimes redundant elements in a list are information I can't discard.
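On the point about repeated elements: a minimal sketch, assuming a variant of xyz that contains duplicates (this data is invented for illustration). Only the lookup side is turned into a set, while the loop still walks the original list, so order and repeats survive.

```python
a = [2, 3, 4, 5, 6, 7, 8, 9, 0]
xyz = [0, 12, 4, 4, 6, 242, 7, 9, 12]  # assumed data with repeated elements

known = set(a)  # used only for membership tests

# Iterating the list itself keeps order and duplicates.
kept = [x for x in xyz if x in known]

# A pure set operation would collapse the repeated 4.
collapsed = sorted(set(xyz) & known)

print(kept)       # [0, 4, 4, 6, 7, 9]
print(collapsed)  # [0, 4, 6, 7, 9]
```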

@ChewyChunks: Generators are not the only way to be Pythonic!

@Johnsyweb, if you're going to quote the Zen of Python: "There should be one-- and preferably only one --obvious way to do it."

@Wooble: There should. I quoted that section in my answer to another question around the same time!

The Python language fails on three counts of the Zen of Python, and I disagree with the other three (explicit, simple, flat). I'm no newbie: it has been my primary language for 30 months, and I have done major projects with it every year since 2012. Is this comment off topic? Not necessarily, given that the Zen was put in relief in the question.

WestCoastProjects

The following is a simplification/one liner from the accepted answer:

a = [2,3,4,5,6,7,8,9,0]
xyz = [0,12,4,6,242,7,9]

for x in (x for x in xyz if x not in a):
    print(x)

12
242

Notice that the generator was kept inline. This was tested on Python 2.7 and Python 3.6 (notice the parens in the print ;) )

It is honestly cumbersome even so: the x is mentioned four times.

Alexander Oh

I personally think this is the prettiest version:

a = [2,3,4,5,6,7,8,9,0]
xyz = [0,12,4,6,242,7,9]
for x in filter(lambda w: w in a, xyz):
    print(x)

Edit

If you are very keen to avoid lambda, you can use partial function application together with the operator module (which provides function equivalents of most operators).

https://docs.python.org/2/library/operator.html#module-operator

from operator import contains
from functools import partial
print(list(filter(partial(contains, a), xyz)))

filter(a.__contains__, xyz). Usually when people use lambda, they really need something much simpler.

I think you misunderstood something. __contains__ is a method like any other, only it is a special method, meaning it can be called indirectly by an operator (in in this case). But it can also be called directly, it is a part of the public API. Private names are specifically defined as having at most one trailing underscore, to provide exception for special method names - and they are subject to name mangling when lexically in class scopes. See docs.python.org/3/reference/datamodel.html#specialnames and docs.python.org/3.6/tutorial/classes.html#private-variables .

It is certainly ok, but two imports just to be able to refer to a method that's accessible using just an attribute seems weird (operators are usually used when double dispatch is essential, but in is singly dispatched wrt right operand). Besides, note that operator also exports contains method under the name __contains__, so it surely is not a private name. I think you'll just have to learn to live with the fact that not every double underscore means "keep away". :-]

I think your lambda needs fixing to include not : lambda w: not w in a, xyz

The filter seems more elegant, especially for complex conditions that would become defined functions instead of lambdas; naming the lambda function might add some readability. The generator seems better when the iterated elements are a modification of the list items.
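A small sketch of that distinction, assuming the lists from the question. The predicate is_large_and_unknown and its threshold of 100 are invented for illustration; the point is only that a compound condition reads better as a named function, and that a generator expression fits when the items are also transformed.

```python
a = [2, 3, 4, 5, 6, 7, 8, 9, 0]
xyz = [0, 12, 4, 6, 242, 7, 9]


def is_large_and_unknown(value):
    """Hypothetical compound condition: not in a and greater than 100."""
    return value not in a and value > 100


# filter: keep the items themselves.
selected = list(filter(is_large_and_unknown, xyz))

# generator expression: keep a modification of the items.
halved = list(x // 2 for x in xyz if is_large_and_unknown(x))

print(selected)  # [242]
print(halved)    # [121]
```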

Wim Feijen

I would probably use:

for x in xyz: 
    if x not in a:
        print(x...)

@KirillTitov Yes, Python is a fundamentally non-functional language (this is purely imperative coding), and I agree with this answer's author that this is the way Python is set up to be written. Attempting to use functional constructs leads to poorly reading or non-Pythonic results. I can code functionally in every other language I use (Scala, Kotlin, JavaScript, R, Swift, ...), but it is difficult and awkward in Python.

sloth

a = [2,3,4,5,6,7,8,9,0]
xyz = [0,12,4,6,242,7,9]  
set(a) & set(xyz)  
set([0, 9, 4, 6, 7])

Very Zen, @lazyr, but would not help me improve a complex code block that depends on iterating through one list and ignoring matching elements in another list. Is it faster to treat the first list as a set and compare union / difference with a second, growing "ignore" list?

Try this

import time

a = [2,3,4,5,6,7,8,9,0]
xyz = [0,12,4,6,242,7,9]
start = time.time()
print(set(a) & set(xyz))
print(time.time() - start)

@ChewyChunks if either of the lists change during the iteration it will probably be faster to check each element against the ignore list -- except you should make it an ignore set. Checking for membership in sets is very fast: if x in ignore: ....
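A minimal sketch of the ignore-set idea, assuming invented data and an ignore set that grows while the loop runs. Each element is checked directly with x in ignore rather than by rebuilding and comparing whole sets on every pass.

```python
xyz = [0, 12, 4, 12, 6, 0, 242]  # assumed input with repeats
ignore = {4, 6}                  # assumed starting ignore set

processed = []
for x in xyz:
    if x in ignore:      # single membership test against the set
        continue
    processed.append(x)  # stand-in for the real, more complex work
    ignore.add(x)        # later repeats of x will now be skipped

print(processed)  # [0, 12, 242]
```

Adding to ignore inside the loop is safe here because the loop iterates over xyz, not over the set being modified.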

@lazyr I just rewrote my code using an ignore set instead of an ignore list. It appears to run much slower. (To be fair, I was comparing with if set(a) - set(ignore) == set([]):, so perhaps that's why it was much slower than checking membership.) I'll test this again in the future on a much simpler example than what I'm writing.

Lauritz V. Thaulow

You can use generators too, if generator expressions become too involved or complex:

def gen():
    for x in xyz:
        if x in a:
            yield x

for x in gen():
    print(x)

This is a bit more useful to me. I've never looked at generators. They sound scary (because I saw them in modules that were generally a pain to use).
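For anyone else new to generators, here is a hedged variation on the function above that takes its inputs as parameters instead of reading the global names xyz and a. The name items_in and its signature are made up for this sketch.

```python
def items_in(candidates, allowed):
    """Yield each candidate that also appears in allowed (hypothetical helper)."""
    allowed = set(allowed)  # build the lookup set once, on first iteration
    for x in candidates:
        if x in allowed:
            yield x


a = [2, 3, 4, 5, 6, 7, 8, 9, 0]
xyz = [0, 12, 4, 6, 242, 7, 9]

matches = list(items_in(xyz, a))
for x in matches:
    print(x)
```

Nothing in the function body runs until something iterates over the generator, which is why the result is wrapped in list() before being reused.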

Khanis Rok

I liked Alex's answer, because a filter is exactly an if applied to a list. If you want to explore a subset of a list given a condition, this seems to be the most natural way:

mylist = [1,2,3,4,5]
another_list = [2,3,4]

wanted = lambda x:x in another_list

for x in filter(wanted, mylist):
    print(x)

This method is useful for separation of concerns: if the condition changes, the only code to modify is the function itself.

mylist = [1,2,3,4,5]

wanted = lambda x:(x**0.5) > 10**0.3

for x in filter(wanted, mylist):
    print(x)

The generator method seems better when you don't want the members of the list themselves but a modification of them:

mylist = [1,2,3,4,5]

wanted = lambda x:(x**0.5) > 10**0.3

generator = (x**0.5 for x in mylist if wanted(x))

for x in generator:
    print(x)

Also, filters work with generators, although in this case it isn't efficient

mylist = [1,2,3,4,5]

wanted = lambda x:(x**0.5) > 10**0.3

generator = (x**0.9 for x in mylist)

for x in filter(wanted, generator):
    print(x)

But of course, it would still be nice to write like this:

mylist = [1,2,3,4,5]

wanted = lambda x:(x**0.5) > 10**0.3

# for x in filter(wanted, mylist):
for x in mylist if wanted(x):
    print(x)

Chung-Yen Hung

Use intersection or intersection_update

intersection:

a = [2,3,4,5,6,7,8,9,0]
xyz = [0,12,4,6,242,7,9]
ans = sorted(set(a).intersection(set(xyz)))

intersection_update:

a = [2,3,4,5,6,7,8,9,0]
xyz = [0,12,4,6,242,7,9]
b = set(a)
b.intersection_update(xyz)

Then b is your answer.

peawormsworth

A simple way to find unique common elements of lists a and b:

a = [1,2,3]
b = [3,6,2]
for both in set(a) & set(b):
    print(both)

m.hasheminejad

Based on the article here: https://towardsdatascience.com/a-comprehensive-hands-on-guide-to-transfer-learning-with-real-world-applications-in-deep-learning-212bf3b2f27a I used the following code for the same purpose, and it worked just fine:

an_array = [x for x in xyz if x not in a]

This line is part of a larger program: xyz is an array that must be defined and assigned beforehand, and so is the variable a.

Using a generator expression (as recommended in the selected answer) causes some difficulties, because the result is not an array.
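To illustrate that difference, a minimal sketch using the lists from the question: a generator can be iterated only once and does not support indexing, but wrapping it in list() gives an ordinary list when one is needed. The names as_list and leftover are chosen for illustration.

```python
a = [2, 3, 4, 5, 6, 7, 8, 9, 0]
xyz = [0, 12, 4, 6, 242, 7, 9]

gen = (x for x in xyz if x not in a)

as_list = list(gen)   # consumes the generator and builds a real list
leftover = list(gen)  # the generator is now exhausted

print(as_list)     # [12, 242]
print(as_list[0])  # 12, indexing works on the list
print(leftover)    # []
```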
