In Django, what's the difference between the following two:
Article.objects.values_list('comment_id', flat=True).distinct()
VS
Article.objects.values('comment_id').distinct()
My goal is to get a list of unique comment ids under each Article. I've read the documentation (and in fact have used both approaches). The results overtly seem similar.
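With values_list('comment_id', flat=True) you can test membership directly, for example:
if self.id in Article.objects.values_list('comment_id', flat=True):
while using values() you need to access the dictionary. For a plain membership test, what about this instead?
Article.objects.filter(comment_id=self.id).exists()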
The values() method returns a QuerySet containing dictionaries:
<QuerySet [{'comment_id': 1}, {'comment_id': 2}]>
The values_list() method returns a QuerySet containing tuples:
<QuerySet [(1,), (2,)]>
If you are using values_list() with a single field, you can use flat=True to return a QuerySet of single values instead of 1-tuples:
<QuerySet [1, 2]>
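To make the three shapes concrete, here is a minimal plain-Python sketch that needs no Django. The `rows` data is hypothetical and only stands in for the Article table; the lists mimic the shape of each result, not the QuerySet objects themselves.

```python
# Hypothetical rows standing in for the Article table; no Django required.
rows = [{"id": 10, "comment_id": 1}, {"id": 11, "comment_id": 2}]

# Shape of values('comment_id'): one dict per row.
as_dicts = [{"comment_id": r["comment_id"]} for r in rows]
# Shape of values_list('comment_id'): one 1-tuple per row.
as_tuples = [(r["comment_id"],) for r in rows]
# Shape of values_list('comment_id', flat=True): bare values.
as_flat = [r["comment_id"] for r in rows]

# Reading the same ids back out of each shape.
ids_from_dicts = [d["comment_id"] for d in as_dicts]
ids_from_tuples = [t[0] for t in as_tuples]

# A direct membership test only works on the flat shape.
found_in_flat = 2 in as_flat
found_in_dicts = 2 in as_dicts

print(as_dicts)
print(as_tuples)
print(as_flat)
print(found_in_flat, found_in_dicts)
```

The flat shape is the convenient one when all you want is the ids themselves; the dict shape pays off when you select several fields and want to read them by name.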
values()
Returns a QuerySet that returns dictionaries, rather than model instances, when used as an iterable.
values_list()
Returns a QuerySet that returns tuples, rather than model instances, when used as an iterable.
distinct()
Eliminates duplicate rows from the query results.
Example:
>>> list(Article.objects.values_list('id', flat=True)) # flat=True returns bare values instead of 1-tuples
[1, 2, 3, 4, 5, 6]
>>> list(Article.objects.values('id'))
[{'id': 1}, {'id': 2}, {'id': 3}, {'id': 4}, {'id': 5}, {'id': 6}]
distinct() eliminates the duplicate elements from the query results, not from the database.
You can also get the different values in Python with:
set(Article.objects.values_list('comment_id', flat=True))
but it is better to use distinct() to eliminate duplicates at the database level. A QuerySet (with or without distinct()) will stream data only when needed.
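The difference between deduplicating in the database and deduplicating in Python can be sketched with the standard-library sqlite3 module. This assumes the ORM turns .distinct() into a SELECT DISTINCT query; the `article` table and its data are hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical article table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, comment_id INTEGER)")
conn.executemany(
    "INSERT INTO article (comment_id) VALUES (?)",
    [(1,), (2,), (2,), (1,), (3,)],
)

# Deduplicate in the database: only the unique values are returned.
db_unique = [row[0] for row in conn.execute("SELECT DISTINCT comment_id FROM article")]

# Deduplicate in Python: all five rows are fetched first, then collapsed.
all_rows = [row[0] for row in conn.execute("SELECT comment_id FROM article")]
py_unique = set(all_rows)

print(sorted(db_unique))
print(len(all_rows), sorted(py_unique))
conn.close()
```

Both routes end with the same values, but the second one transfers every row before discarding the duplicates, which matters on large tables.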
"values()" returns a QuerySet of dictionaries.
For example:
print(User.objects.all().values()) # Return all fields
# <QuerySet [{'id': 1, 'name': 'John'}, {'id': 2, 'name': 'Tom'}]>
print(User.objects.all().values("name")) # Return "name" field
# <QuerySet [{'name': 'John'}, {'name': 'Tom'}]>
"values_list()" returns a QuerySet of tuples.
For example:
print(User.objects.all().values_list()) # Return all fields
# <QuerySet [(1, 'John'), (2, 'Tom')]>
print(User.objects.all().values_list("name")) # Return "name" field
# <QuerySet [('John',), ('Tom',)]>
"values_list()" with "flat=True" returns a QuerySet of single values. With "flat=True" you can pass no field or exactly one field. The field name is a positional argument, so it must come before the keyword argument "flat=True".
For example:
print(User.objects.all().values_list(flat=True)) # Return "id" field
# <QuerySet [1, 2]>
print(User.objects.all().values_list("name", flat=True)) # Return "name" field
# <QuerySet ['John', 'Tom']>
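print(User.objects.all().values_list(flat=True, "name")) # SyntaxError: positional argument follows keyword argument
print(User.objects.all().values_list("id", "name", flat=True)) # TypeError: flat=True is not valid with more than one field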
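The first of those two errors comes from Python itself rather than from Django: a positional argument may not follow a keyword argument in any call. A small sketch using the built-in compile() shows this without importing Django; `call_is_valid_python` is a hypothetical helper, and the function name inside the strings is never actually called.

```python
def call_is_valid_python(source):
    """Return True if the expression parses, False on a SyntaxError."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

# Keyword argument before a positional one: rejected by the parser.
bad_call = 'values_list(flat=True, "name")'
# Positional field name first, then flat=True: parses fine.
good_call = 'values_list("name", flat=True)'

bad_ok = call_is_valid_python(bad_call)
good_ok = call_is_valid_python(good_call)
print(bad_ok, good_ok)
```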
The best place to understand the difference is the official documentation on values / values_list. It has many useful examples and explains it very clearly. The Django docs are very user friendly.
Here's a short snippet to keep SO reviewers happy:
values
Returns a QuerySet that returns dictionaries, rather than model instances, when used as an iterable.
And read the section which follows it:
values_list
This is similar to values() except that instead of returning dictionaries, it returns tuples when iterated over.
distinct() does not work any differently in the two cases. The important thing is which data structure you want to work with.
values() returns a QuerySet and not a list. Although the object returned by values() looks like a list, it doesn't behave like one in some cases. For example, it won't be JSON serializable unless we convert it into a list.
You can convert the result of values_list to a true Python list by just using the list function:
list(Article.objects.values_list('comment_id', flat=True).distinct())
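The serialization point can be illustrated without Django. In this sketch `LazyRows` is a hypothetical stand-in for a QuerySet: it can be iterated and indexed like a list, but json.dumps rejects it until it is converted with list().

```python
import json

class LazyRows:
    """Hypothetical stand-in for a QuerySet: list-like, but not a list."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

rows = LazyRows([1, 2, 3])

# json.dumps only knows built-in types, so the list-like object is rejected.
try:
    json.dumps(rows)
    serialized_directly = True
except TypeError:
    serialized_directly = False

# Converting to a real list first makes it serializable.
as_json = json.dumps(list(rows))
print(serialized_directly, as_json)
```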