How to sort by column in descending order in Spark SQL?

scala apache-spark apache-spark-sql

I tried df.orderBy("col1").show(10), but it sorted in ascending order. df.sort("col1").show(10) also sorts in ascending order. I looked on Stack Overflow, and the answers I found were all outdated or referred to RDDs. I'd like to use the native DataFrame API in Spark.

He means "df.sort("col1").show(10) also sorts in ascending order"

This solution worked perfectly for me: stackoverflow.com/a/38575271/5957143

Gabber

You can also sort by a column after importing the Spark SQL functions:

import org.apache.spark.sql.functions._
df.orderBy(asc("col1"))

import org.apache.spark.sql.functions._
df.sort(desc("col1"))

Or by importing sqlContext.implicits._, which enables the $"col" column syntax:

import sqlContext.implicits._
df.orderBy($"col1".desc)

import sqlContext.implicits._
df.sort($"col1".desc)

Also, when you order by all columns ascending, the asc function is not necessary: df.orderBy("col1", "col2").
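
For anyone who wants to try both forms end to end, here is a minimal sketch. It assumes Spark 2.x or later, where SparkSession replaces sqlContext and the implicits come from spark.implicits._. The object name, the helper col1Descending, and the sample values are made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.desc

object SortDescSketch {
  // Hypothetical helper: returns the values of col1 sorted in descending order.
  // Assumes col1 is the first column and has integer type.
  def col1Descending(df: DataFrame): Seq[Int] =
    df.orderBy(desc("col1")).collect().map(_.getInt(0)).toSeq

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (an assumption, not from the question).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("sort-desc-sketch")
      .getOrCreate()
    import spark.implicits._ // enables toDF and the $"col" syntax

    val df = Seq(3, 1, 2).toDF("col1") // made-up sample data

    df.orderBy(desc("col1")).show() // function form
    df.sort($"col1".desc).show()    // column-method form, same ordering

    println(col1Descending(df))
    spark.stop()
  }
}
```

orderBy is an alias for sort on a DataFrame, so which form to use is a matter of taste.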

Sky

The sort method is defined on org.apache.spark.sql.DataFrame:

df.sort($"col1", $"col2".desc)

Note the $ prefix and the .desc on the column inside sort. A column without .desc (col1 here) is sorted ascending.

import org.apache.spark.sql.functions._ and import sqlContext.implicits._ also get you a lot of nice functionality.

@Vedom: Shows a syntax error: df.sort($"Time1", $"Time2".desc) SyntaxError: invalid syntax at the $ symbol

@kaks, you need to import the functions/implicits as described above to avoid that error.

Nic Scozzaro

PySpark only

I came across this post when looking to do the same in PySpark. The easiest way is to just add the parameter ascending=False:

df.orderBy("col1", ascending=False).show(10)

Reference: http://spark.apache.org/docs/2.1.0/api/python/pyspark.sql.html#pyspark.sql.DataFrame.orderBy

The question is tagged scala, but this answer applies to Python only: both this syntax and the function signature are Python-only.

Paul Reiners

import org.apache.spark.sql.functions.{asc, desc}

df.orderBy(desc("columnname1"), desc("columnname2"), asc("columnname3"))

This duplicates the answer posted 3 years earlier by @AmitDubey. It should be removed in favor of that one.

OneCricketeer

df.sort($"ColumnName".desc).show()

zx485

In the case of Java:

With DataFrames, when applying a join (an inner join here), we can sort in ascending order after selecting the distinct rows of each DataFrame:

Dataset<Row> d1 = e_data.distinct().join(s_data.distinct(), "e_id").orderBy("salary");

Here e_id is the column the join is applied on, and the result is sorted by salary in ascending order.

Also, we can use Spark SQL as:

SQLContext sqlCtx = spark.sqlContext();
sqlCtx.sql("select * from global_temp.salary order by salary desc").show();

where spark is the SparkSession and salary is a global temp view.
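
Since the question is tagged scala, here is a rough Scala sketch of the same Spark SQL approach. It assumes Spark 2.2 or later (for createOrReplaceGlobalTempView). The object name, the helper bySalaryDesc, and the sample rows are made up; only the view name salary and the query come from the answer above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object SqlOrderBySketch {
  // Hypothetical helper: registers df as the global temp view "salary"
  // and returns its rows ordered by the salary column, highest first.
  def bySalaryDesc(spark: SparkSession, df: DataFrame): DataFrame = {
    // Global temp views are resolved through the reserved global_temp database.
    df.createOrReplaceGlobalTempView("salary")
    spark.sql("SELECT * FROM global_temp.salary ORDER BY salary DESC")
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (an assumption).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("sql-order-by-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows with columns e_id and salary.
    val salaries = Seq((1, 5000), (2, 7000), (3, 6000)).toDF("e_id", "salary")

    bySalaryDesc(spark, salaries).show()
    spark.stop()
  }
}
```

A session-scoped view (createOrReplaceTempView, queried without the global_temp prefix) would work just as well if the view does not need to be shared across sessions.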
