Why is the use of len(SEQUENCE) in condition values considered incorrect by Pylint?

python conditional-statements pylint

Considering this code snippet:

from os import walk

def collect_files(mydir):
    files = []
    for (dirpath, _, filenames) in walk(mydir):
        ...  # More code that modifies files
    if len(files) == 0:  # <-- C1801
        return None
    return files

Pylint flagged the line with the if statement with this message:

[pylint] C1801:Do not use len(SEQUENCE) as condition value

At first glance, rule C1801 did not sound very reasonable to me, and the definition in the reference guide does not explain why this is a problem. In fact, it downright calls it an incorrect use.

len-as-condition (C1801): Do not use len(SEQUENCE) as condition value
Used when Pylint detects incorrect use of len(sequence) inside conditions.

My search attempts have also failed to provide me a deeper explanation. I do understand that a sequence's length property may be lazily evaluated, and that __len__ can be programmed to have side effects, but it is questionable whether that alone is problematic enough for Pylint to call such a use incorrect. Hence, before I simply configure my project to ignore the rule, I would like to know whether I am missing something in my reasoning.

When is the use of len(SEQ) as a condition value problematic? What major situations is Pylint attempting to avoid with C1801?

Because you can evaluate the truthiness of the sequence directly. pylint wants you to do if files: or if not files:

len doesn't know the context in which it is called, so if computing the length means traversing the entire sequence, it must; it doesn't know that the result is just being compared to 0. Computing the boolean value can stop after it sees the first element, regardless of how long the sequence actually is. I think pylint is being a tad opinionated here, though; I can't think of any situation where it is wrong to use len, just that it's a worse option than the alternative.
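
The difference described in this comment can be sketched with a made-up container. LinkedChain below is hypothetical (it is not from any library): its __len__ has to visit every element to count them, while its __bool__ can answer after looking at the first one. The visited counter exists only to make the cost visible.

```python
class LinkedChain:
    """Hypothetical container, used only for illustration."""

    def __init__(self, items):
        self.items = list(items)
        self.visited = 0  # counts how many elements were touched

    def __len__(self):
        # Simulates a container that must traverse everything to count.
        count = 0
        for _ in self.items:
            self.visited += 1
            count += 1
        return count

    def __bool__(self):
        # Only needs to know whether a first element exists.
        for _ in self.items:
            self.visited += 1
            return True
        return False


chain = LinkedChain(range(1000))

chain.visited = 0
empty_by_len = len(chain) == 0
visits_for_len = chain.visited  # every element was touched

chain.visited = 0
empty_by_bool = not chain
visits_for_bool = chain.visited  # only the first element was touched

print(visits_for_len, visits_for_bool)
```

For the built-in list, tuple, str and dict types, len() is a constant-time lookup, so this cost difference only shows up with custom containers like the one assumed here.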

@E_net4 I think that PEP-8 is probably the place to start.

@E_net4 I've submitted an issue about not giving reasonable suggestion as a potential enhancement on Pylint's GitHub.

SEQUENCES need an 'empty()' or 'isempty()' like C++ imo.

Anthony Geoghegan

When is the use of len(SEQ) as a condition value problematic? What major situations is Pylint attempting to avoid with C1801?

It’s not really problematic to use len(SEQUENCE) – though it may not be as efficient (see chepner’s comment). Regardless, Pylint checks code for compliance with the PEP 8 style guide which states that

For sequences, (strings, lists, tuples), use the fact that empty sequences are false.

Yes: if not seq:
     if seq:

No: if len(seq):
    if not len(seq):

As an occasional Python programmer, who flits between languages, I’d consider the len(SEQUENCE) construct to be more readable and explicit (“Explicit is better than implicit”). However, using the fact that an empty sequence evaluates to False in a Boolean context is considered more “Pythonic”.

How to make this work then: if len(fnmatch.filter(os.listdir(os.getcwd()), 'f_*')):

@Marichyasana I guess things like that can (theoretically) be written as if next(iter(...), None) is not None: (if the sequence can't contain None). That's long, but the len(fnmatch...) is long too; both need to be split.
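
Since fnmatch.filter returns a list, the condition from the question above can probably be tested for truthiness directly, without len() or next(iter(...)). A minimal sketch follows; has_matching_file is a hypothetical helper name, and the temporary directory and file name are made up for the demonstration.

```python
import fnmatch
import os
import tempfile


def has_matching_file(directory, pattern):
    """Return True if any entry in `directory` matches `pattern`."""
    # fnmatch.filter returns a list, and an empty list is falsy.
    return bool(fnmatch.filter(os.listdir(directory), pattern))


# Demonstration in a throwaway directory (names are made up for the example).
with tempfile.TemporaryDirectory() as tmp:
    before = has_matching_file(tmp, "f_*")
    open(os.path.join(tmp, "f_example.txt"), "w").close()
    after = has_matching_file(tmp, "f_*")

print(before, after)
```

Inside an if statement the bool() call is unnecessary: if fnmatch.filter(os.listdir(os.getcwd()), 'f_*'): behaves the same way.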

I'm also an occasional Python user, and I often have the impression that the "Pythonic way" got kind of tangled in its own ambiguity.

Just a general question: can these PEP recommendations be revised? Another reason why len(s) == 0 is superior, in my opinion, is that it generalizes to other types of sequences, such as pandas.Series and NumPy arrays. if not s: does not, and in that case you would need a separate check for each kind of array-like object (e.g. pd.DataFrame.empty).

By the way, none of the collections.abc classes declares a __bool__ method. In other words, how can I be sure that I can use bool(seq) if all I know is that it's a collections.abc.Collection? Moreover, some libraries declare that checking bool(collection) is forbidden for their classes.

Cameron Hayne

Note that the use of len(seq) is in fact required (instead of just checking the bool value of seq) when using NumPy arrays.

import numpy

a = numpy.array(range(10))
if a:
    print("a is not empty")

results in an exception: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

And hence for code that uses both Python lists and NumPy arrays, the C1801 message is less than helpful.
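
For code that has to accept both kinds of object, one option is a small helper that relies only on len(). The sketch below is an illustration under assumptions: is_empty is a hypothetical helper name, and AmbiguousArray is a made-up stand-in that mimics how a multi-element NumPy array refuses truth testing, so that the example runs without NumPy installed.

```python
class AmbiguousArray:
    """Hypothetical stand-in that mimics how a multi-element NumPy array
    refuses truth testing; it is not part of NumPy."""

    def __init__(self, values):
        self.values = list(values)

    def __len__(self):
        return len(self.values)

    def __bool__(self):
        raise ValueError("The truth value of this array is ambiguous.")


def is_empty(seq):
    """Return True if `seq` has no elements, relying only on len()."""
    return len(seq) == 0


results = [is_empty([]), is_empty([1, 2]), is_empty(AmbiguousArray(range(10)))]

# Truth testing the stand-in fails, just as `if a:` does in the answer above.
try:
    bool(AmbiguousArray(range(10)))
    truth_test_failed = False
except ValueError:
    truth_test_failed = True

print(results, truth_test_failed)
```

With real NumPy arrays, len(a) reports only the size of the first dimension, so a.size == 0 may be the more appropriate check for multi-dimensional data.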

I agree with your statement. With issue #1405 now raised, I hope to see C1801 either reformed to something useful or disabled by default.

Plus, it is useless for checking if a sequence has a given number of elements. It's only good for checking if it is completely empty, in the best of cases.

Peter Mortensen

This was an issue in Pylint, and it no longer considers len(x) == 0 as incorrect.

You should not use a bare len(x) as a condition. Comparing len(x) against an explicit value, such as if len(x) == 0 or if len(x) > 0, is totally fine and not prohibited by PEP 8.

From PEP 8:

# Correct:
if not seq:
if seq:

# Wrong:
if len(seq):
if not len(seq):

Note that explicitly testing for the length is not prohibited. The Zen of Python states:

Explicit is better than implicit.

In the choice between if not seq and if not len(seq), both are implicit, but the behaviour is different. But if len(seq) == 0 or if len(seq) > 0 are explicit comparisons and are in many contexts the correct behaviour.

In Pylint, PR 2815 has fixed this bug, first reported as issue 2684. It will continue to complain about if len(seq), but it will no longer complain about if len(seq) > 0. The PR was merged 2019-03-19, so if you are using Pylint 2.4 (released 2019-09-14) or later, you should not see this problem.

JayRizzo

Pylint was failing for my code and research led me to this post:

../filename.py:49:11: C1801: Do not use `len(SEQUENCE)` to determine if a sequence is empty (len-as-condition)
../filename.py:49:34: C1801: Do not use `len(SEQUENCE)` to determine if a sequence is empty (len-as-condition)

This was my code before:

import os

def list_empty_folders(directory):
    """The Module Has Been Built to list empty Mac Folders."""
    for (fullpath, dirnames, filenames) in os.walk(directory):
        if len(dirnames) == 0 and len(filenames) == 0:
            print("Exists: {} : Absolute Path: {}".format(
                os.path.exists(fullpath), os.path.abspath(fullpath)))

This was my code after the fix. By calling the int method __trunc__() on each length, I seem to have satisfied PEP 8/Pylint, and it doesn't seem to have a negative impact on my code:

import os

def list_empty_folders(directory):
    """The Module Has Been Built to list empty Mac Folders."""
    for (fullpath, dirnames, filenames) in os.walk(directory):
        if len(dirnames).__trunc__() == 0 and len(filenames).__trunc__() == 0:
            print("Exists: {} : Absolute Path: {}".format(
                os.path.exists(fullpath), os.path.abspath(fullpath)))

My Fix

Adding .__trunc__() to each len() result seems to have silenced the warning.

I do not see a difference in the behaviour, but if anyone knows specifics that I am missing, please let me know.

You are calling __trunc__() on the output of len(seq), which (somewhat redundantly) truncates the length value to an integer. It only "feints" the lint without addressing the reason behind it. Didn't the suggestion in the accepted answer work for you?

Not in my attempts. I understand the redundancy, but even though the developers addressed this in github.com/PyCQA/pylint/issues/1405 and 2684 and the fix has been merged, I still see the warning after updating my Pylint, although to my understanding it shouldn't appear. I just wanted to share this because it worked for me, even if it isn't entirely appropriate. To clarify: even if it is redundant, when you do a len(seq) == 0 comparison, __trunc__ shouldn't have to do anything, because the values are already integers, right?

Exactly, it is already an integer, and __trunc__() doesn't do anything meaningful. Note that I did not refer to the comparison as being redundant, but to this attempt at truncating the length. The warning only disappears because it only expects an expression of the form len(seq) == 0. I believe that the lint in this case would expect you to replace the if statement with the following: if not dirnames and not filenames:
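
The replacement suggested in this comment could look roughly like the sketch below. One change is an assumption made for the example: the function returns the matching paths instead of printing them, so that the result can be checked. The directory names in the demonstration are made up.

```python
import os
import tempfile


def list_empty_folders(directory):
    """Return the absolute paths of folders that contain no entries."""
    empty = []
    for fullpath, dirnames, filenames in os.walk(directory):
        # Empty lists are falsy, so no len() call is needed.
        if not dirnames and not filenames:
            empty.append(os.path.abspath(fullpath))
    return empty


# Demonstration in a throwaway directory tree (names are made up).
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "empty_dir"))
    os.mkdir(os.path.join(tmp, "full_dir"))
    open(os.path.join(tmp, "full_dir", "file.txt"), "w").close()
    found = [os.path.basename(path) for path in list_empty_folders(tmp)]

print(found)
```

This works because os.walk yields dirnames and filenames as plain lists, for which truth testing and len(...) == 0 are equivalent.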

Testing for truthiness has the unintended consequences of being "always true" if the __bool__ function isn't defined in the underlying sequence.
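
It may help to narrow this down: when a class does not define __bool__, Python's truth testing falls back to __len__, and only objects that define neither are always truthy. A minimal sketch with two hypothetical classes (WithLen and WithNeither are made up for the example):

```python
class WithLen:
    """Hypothetical container that defines __len__ but not __bool__."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


class WithNeither:
    """Hypothetical object that defines neither __len__ nor __bool__."""


empty_with_len = bool(WithLen([]))    # falls back to __len__(), which is 0
filled_with_len = bool(WithLen([1]))  # falls back to __len__(), which is 1
with_neither = bool(WithNeither())    # no __bool__ or __len__: always truthy

print(empty_with_len, filled_with_len, with_neither)
```

Because collections.abc.Collection requires __len__, truth testing on such an object generally reflects emptiness, unless the class overrides __bool__ to do something else (as NumPy arrays do).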
