
How to sort by column in descending order in Spark SQL?

I tried df.orderBy("col1").show(10) but it sorted in ascending order. df.sort("col1").show(10) also sorts in ascending order. I looked on Stack Overflow, and the answers I found were all outdated or referred to RDDs. I'd like to use the native DataFrame API in Spark.

He means that df.sort("col1").show(10) also sorts in ascending order.
This solution worked perfectly for me: stackoverflow.com/a/38575271/5957143

Gabber

You can also sort by a column after importing the Spark SQL functions:

import org.apache.spark.sql.functions._
df.orderBy(asc("col1"))

Or

import org.apache.spark.sql.functions._
df.sort(desc("col1"))

Or, after importing sqlContext.implicits._:

import sqlContext.implicits._
df.orderBy($"col1".desc)

Or

import sqlContext.implicits._
df.sort($"col1".desc)
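
The variants above can be checked against each other with a small self-contained sketch. The local SparkSession settings, the object name, and the sample data are assumptions made for illustration; they are not part of the original answer.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.desc

object SortDescExample {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; master and appName are assumptions.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("sort-desc-example")
      .getOrCreate()
    import spark.implicits._ // enables toDF and the $"col" syntax

    // Hypothetical sample data.
    val df = Seq(("a", 3), ("b", 1), ("c", 2)).toDF("col1", "col2")

    // Three ways to request a descending sort on col2.
    val viaFunction = df.orderBy(desc("col2"))
    val viaDollar   = df.sort($"col2".desc)
    val viaApply    = df.orderBy(df("col2").desc)

    // All three should yield col2 in the order 3, 2, 1.
    val expected = Seq(3, 2, 1)
    Seq(viaFunction, viaDollar, viaApply).foreach { sorted =>
      assert(sorted.select("col2").as[Int].collect().toSeq == expected)
    }

    viaFunction.show()
    spark.stop()
  }
}
```

In spark-shell a session named spark already exists, so only the import and the orderBy/sort lines would be needed there.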

Also, when you're ordering by all columns in ascending order, the asc keyword is not necessary: ..orderBy("col1", "col2").
Sky

The sort method is defined on org.apache.spark.sql.DataFrame:

df.sort($"col1", $"col2".desc)

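Note the $ prefix and the .desc call inside sort: $ turns the name into a column, and .desc sorts that column in descending order (here col1 stays ascending and col2 is descending).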


import org.apache.spark.sql.functions._ and import sqlContext.implicits._ also get you a lot of nice functionality.
@Vedom: Shows a syntax error: df.sort($"Time1", $"Time2".desc) SyntaxError: invalid syntax at the $ symbol
@kaks, you need to import functions/implicits as described above to avoid that error.
Nic Scozzaro

PySpark only

I came across this post when looking to do the same in PySpark. The easiest way is to just add the parameter ascending=False:

df.orderBy("col1", ascending=False).show(10)

Reference: http://spark.apache.org/docs/2.1.0/api/python/pyspark.sql.html#pyspark.sql.DataFrame.orderBy


The question is marked with a scala tag, but this answer is for Python only, as both this syntax and the function signature are Python-only.
Paul Reiners
import org.apache.spark.sql.functions.{asc, desc}

df.orderBy(desc("columnname1"),desc("columnname2"),asc("columnname3"))

This is a duplicate of the answer given 3 years earlier by @AmitDubey. It should be removed in favor of that one.
OneCricketeer
df.sort($"ColumnName".desc).show()

zx485

In the case of Java:

If we use DataFrames, then while applying a join (here an inner join), we can sort in ascending order after selecting the distinct elements in each DataFrame:

Dataset<Row> d1 = e_data.distinct().join(s_data.distinct(), "e_id").orderBy("salary");

where e_id is the column on which the join is applied, and the result is sorted by salary in ascending order.

Also, we can use Spark SQL as:

SQLContext sqlCtx = spark.sqlContext();
sqlCtx.sql("select * from global_temp.salary order by salary desc").show();

where spark is the SparkSession and salary is a global temp view.
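
For the SQL route to work, the view has to be registered before it is queried. A minimal sketch in Scala (rather than Java, to match the rest of the thread) is shown here; the object name, the local session settings, and the sample rows are assumptions made for illustration.

```scala
import org.apache.spark.sql.SparkSession

object GlobalTempSortExample {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; master and appName are assumptions.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("global-temp-sort-example")
      .getOrCreate()
    import spark.implicits._ // enables toDF on local collections

    // Hypothetical sample data with the column names used in the answer.
    val employees = Seq((1, 3000), (2, 5000), (3, 4000)).toDF("e_id", "salary")

    // Register the DataFrame as a global temp view named "salary";
    // global temp views are addressed through the global_temp database.
    employees.createOrReplaceGlobalTempView("salary")

    val sorted = spark.sql("SELECT * FROM global_temp.salary ORDER BY salary DESC")

    // Salaries should come back highest first.
    assert(sorted.select("salary").as[Int].collect().toSeq == Seq(5000, 4000, 3000))

    sorted.show()
    spark.stop()
  }
}
```

A session-scoped view created with createOrReplaceTempView would be queried by its bare name instead of through global_temp.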

