Considering this code snippet:
from os import walk

files = []
for (dirpath, _, filenames) in walk(mydir):
    # More code that modifies files
if len(files) == 0:  # <-- C1801
    return None
I was alarmed by Pylint with this message regarding the line with the if statement:
[pylint] C1801:Do not use len(SEQUENCE) as condition value
The rule C1801, at first glance, did not sound very reasonable to me, and the definition on the reference guide does not explain why this is a problem. In fact, it downright calls it an incorrect use.
len-as-condition (C1801): Do not use len(SEQUENCE) as condition value
Used when Pylint detects incorrect use of len(sequence) inside conditions.
My search attempts have also failed to provide me a deeper explanation. I do understand that a sequence's length property may be lazily evaluated, and that __len__ can be programmed to have side effects, but it is questionable whether that alone is problematic enough for Pylint to call such a use incorrect. Hence, before I simply configure my project to ignore the rule, I would like to know whether I am missing something in my reasoning.
When is the use of len(SEQ) as a condition value problematic? What major situations is Pylint attempting to avoid with C1801?
if files: or if not files:
len doesn't know the context in which it is called, so if computing the length means traversing the entire sequence, it must; it doesn't know that the result is just being compared to 0. Computing the boolean value can stop after it sees the first element, regardless of how long the sequence actually is. I think pylint is being a tad opinionated here, though; I can't think of any situation where it is wrong to use len, just that it's a worse option than the alternative.
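To make the cost argument in that comment concrete, here is a minimal sketch. The LinkedList class is hypothetical and exists only for illustration. For built-in lists, tuples and strings, len() is already constant time, so the difference only shows up in custom containers like this one.

```python
class LinkedList:
    """Minimal singly linked list; a hypothetical example, not a stdlib type."""

    def __init__(self, items=()):
        self._head = None
        # Build the list back to front so iteration order matches `items`.
        for item in reversed(list(items)):
            self._head = (item, self._head)

    def __len__(self):
        # O(n): has to walk every node to count them.
        count = 0
        node = self._head
        while node is not None:
            count += 1
            node = node[1]
        return count

    def __bool__(self):
        # O(1): only needs to know whether a first node exists.
        return self._head is not None


empty = LinkedList()
full = LinkedList(range(1000))

print(bool(empty), len(empty) == 0)  # False True
print(bool(full), len(full) == 0)    # True False
```

Both tests agree on the answer; only the amount of work differs. If the class did not define __bool__, Python would fall back to __len__ and the two forms would cost the same.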
When is the use of len(SEQ) as a condition value problematic? What major situations is Pylint attempting to avoid with C1801?
It’s not really problematic to use len(SEQUENCE), though it may not be as efficient (see chepner’s comment). Regardless, Pylint checks code for compliance with the PEP 8 style guide which states that
For sequences, (strings, lists, tuples), use the fact that empty sequences are false.
Yes: if not seq:
     if seq:
No: if len(seq):
    if not len(seq):
As an occasional Python programmer, who flits between languages, I’d consider the len(SEQUENCE) construct to be more readable and explicit (“Explicit is better than implicit”). However, using the fact that an empty sequence evaluates to False in a Boolean context is considered more “Pythonic”.
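Applied to the snippet in the question, the PEP 8 form might look like the sketch below. The function name collect_files and the way paths are gathered are my assumptions, since the question elides the loop body.

```python
import os


def collect_files(top):
    """Return every file path under `top`, or None if there are none."""
    files = []
    for dirpath, _, filenames in os.walk(top):
        # Assumed loop body: the question only says "code that modifies files".
        files.extend(os.path.join(dirpath, name) for name in filenames)
    if not files:  # an empty list is falsy, so no len() call is needed
        return None
    return files
```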
Note that the use of len(seq) is in fact required (instead of just checking the bool value of seq) when using NumPy arrays.
import numpy

a = numpy.array(range(10))
if a:
    print("a is not empty")
results in an exception: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
And hence for code that uses both Python lists and NumPy arrays, the C1801 message is less than helpful.
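For code that has to accept both kinds of input, one option is a small helper that only ever relies on len(). The sketch below avoids a NumPy dependency by using a hypothetical stand-in class, AmbiguousArray, that refuses bool() the way a multi-element array does. It assumes one-dimensional inputs that support len().

```python
class AmbiguousArray:
    """Hypothetical stand-in that refuses bool(), like a multi-element NumPy array."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __bool__(self):
        raise ValueError("truth value is ambiguous")


def is_empty(seq):
    """Emptiness test that relies only on len(), never on truthiness."""
    return len(seq) == 0


print(is_empty([]))                         # True
print(is_empty(AmbiguousArray(range(10))))  # False
```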
This was an issue in Pylint, and it no longer considers len(x) == 0 as incorrect.
You should not use a bare len(x) as a condition. Comparing len(x) against an explicit value, such as if len(x) == 0 or if len(x) > 0, is totally fine and not prohibited by PEP 8.
From PEP 8:
# Correct:
if not seq:
if seq:

# Wrong:
if len(seq):
if not len(seq):
Note that explicitly testing for the length is not prohibited. The Zen of Python states:
Explicit is better than implicit.
In the choice between if not seq and if not len(seq), both are implicit, but the behaviour is different. But if len(seq) == 0 or if len(seq) > 0 are explicit comparisons and are in many contexts the correct behaviour.
In Pylint, PR 2815 has fixed this bug, first reported as issue 2684. It will continue to complain about if len(seq), but it will no longer complain about if len(seq) > 0. The PR was merged 2019-03-19, so if you are using Pylint 2.4 (released 2019-09-14) you should not see this problem.
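The three spellings side by side, as a rough sketch. The comments only restate what this answer says about Pylint 2.4; I have not checked other versions, and describe is a made-up name.

```python
def describe(seq):
    # Bare len() as the condition: the form this answer says Pylint still reports.
    if len(seq):
        bare = "non-empty"
    else:
        bare = "empty"

    # Explicit comparison: the form this answer says Pylint 2.4 no longer reports.
    if len(seq) > 0:
        explicit = "non-empty"
    else:
        explicit = "empty"

    # Truthiness test: the form PEP 8 recommends for sequences.
    truthy = "non-empty" if seq else "empty"

    return bare, explicit, truthy


print(describe([]))      # ('empty', 'empty', 'empty')
print(describe([1, 2]))  # ('non-empty', 'non-empty', 'non-empty')
```

For plain lists all three give the same answer; the choice is about style and about which objects you expect to be passed in.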
Pylint was failing for my code and research led me to this post:
../filename.py:49:11: C1801: Do not use `len(SEQUENCE)` to determine if a sequence is empty (len-as-condition)
../filename.py:49:34: C1801: Do not use `len(SEQUENCE)` to determine if a sequence is empty (len-as-condition)
This was my code before:
import os


def list_empty_folders(directory):
    """The Module Has Been Build to list empty Mac Folders."""
    for (fullpath, dirnames, filenames) in os.walk(directory):
        if len(dirnames) == 0 and len(filenames) == 0:
            print("Exists: {} : Absolute Path: {}".format(
                os.path.exists(fullpath), os.path.abspath(fullpath)))
This is my code after the fix. By using the int attribute __trunc__(), I seem to have satisfied PEP 8/Pylint, and it doesn't seem to have a negative impact on my code:
def list_empty_folders(directory):
    """The Module Has Been Build to list empty Mac Folders."""
    for (fullpath, dirnames, filenames) in os.walk(directory):
        if len(dirnames).__trunc__() == 0 and len(filenames).__trunc__() == 0:
            print("Exists: {} : Absolute Path: {}".format(
                os.path.exists(fullpath), os.path.abspath(fullpath)))
My Fix
Adding .__trunc__() to the result of len() seems to have settled the need.
I do not see a difference in the behaviour, but if anyone knows specifics that I am missing, please let me know.
__trunc__() on the output of len(seq), which (somewhat redundantly) truncates the length value to an integer. It only "feints" the lint without addressing the reason behind it. Didn't the suggestion in the accepted answer work for you?
this worked for me, even if it isn't entirely appropriate. But to clarify: even if it is redundant, when you are doing a len(seq) == 0 comparison, trunc shouldn't have to do anything, as the values are already integers, right?
__trunc__() doesn't do anything meaningful. Note that I did not refer to the comparison as being redundant, but to this attempt at truncating the length. The warning only disappears because it only expects an expression of the form len(seq) == 0. I believe that the lint in this case would expect you to replace the if statement with the following: if not dirnames and not filenames:
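Spelled out, the replacement suggested in that comment could look like the sketch below. Collecting and returning the matching paths is my addition so the result can be checked; the original function only prints.

```python
import os


def list_empty_folders(directory):
    """Print every folder under `directory` that has no subfolders and no files."""
    empty = []  # not in the original; added so callers can inspect the result
    for fullpath, dirnames, filenames in os.walk(directory):
        # Empty lists are falsy, so this replaces both len(...) == 0 checks.
        if not dirnames and not filenames:
            print("Exists: {} : Absolute Path: {}".format(
                os.path.exists(fullpath), os.path.abspath(fullpath)))
            empty.append(fullpath)
    return empty
```

os.walk yields plain lists for dirnames and filenames, so the truthiness test is safe here and no __trunc__() workaround is needed.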
__bool__ function isn't defined in the underlying sequence.
if len(fnmatch.filter(os.listdir(os.getcwd()), 'f_*')):
if next(iter(...), None) is not None: (if the sequence can't contain None). That's long, but the len(fnmatch...) is long too; both need to be split.
len(s) == 0 is superior in my opinion is that it is generalizable for other types of sequences. For example, pandas.Series and numpy arrays. if not s: is not on the other hand, and in that case you would need to use a separate evaluation for all possible types of array-like objects (i.e. pd.DataFrame.empty).
of collections.abc classes states __bool__ method. In other words, how can I be sure that I can use bool(seq) if I know that it's a collections.abc.Collection? Moreover, some libraries declare that it's forbidden to check bool(collection) for their classes.
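On the collections.abc question: Python's truth-value testing calls __bool__ if the class defines it, otherwise falls back to __len__, and otherwise treats the object as true. A Collection must provide __len__, so bool() works on it through that fallback unless the class overrides __bool__. A minimal sketch, where Bag is a hypothetical class written for this example:

```python
from collections.abc import Collection


class Bag(Collection):
    """Hypothetical Collection that defines __len__ but no __bool__."""

    def __init__(self, items=()):
        self._items = list(items)

    def __contains__(self, item):
        return item in self._items

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


print(hasattr(Bag, "__bool__"))  # False
print(bool(Bag()))               # False
print(bool(Bag([0])))            # True
```

This does not cover classes that define __bool__ specifically to raise, as the comment mentions some libraries do; for those, an explicit len(x) == 0 remains the safer test.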