
View tabular file such as CSV from command line [closed]

Closed 3 years ago. This question does not meet Stack Overflow guidelines and is not currently accepting answers. Stack Overflow does not allow questions about general computing hardware and software, or questions seeking recommendations for books, tools, software libraries, and more.

Anyone know of a command-line CSV viewer for Linux/OS X? I'm thinking of something like less but that spaces out the columns in a more readable way. (I'd be fine with opening it with OpenOffice Calc or Excel, but that's way too overpowered for just looking at the data like I need to.) Having horizontal and vertical scrolling would be great.

Since I can't give an answer: SC-IM is a CLI viewer and editor for tables that can also open CSV. github.com/andmarti1424/sc-im

Lekensteyn

You can also use this:

column -s, -t < somefile.csv | less -#2 -N -S

column is a standard unix program that is very convenient -- it finds the appropriate width of each column, and displays the text as a nicely formatted table.

Note: whenever you have empty fields, you need to put some kind of placeholder in it, otherwise the column gets merged with following columns. The following example demonstrates how to use sed to insert a placeholder:

$ cat data.csv
1,2,3,4,5
1,,,,5
$ column -s, -t < data.csv
1  2  3  4  5
1  5
$ sed 's/,,/, ,/g;s/,,/, ,/g' data.csv | column -s, -t
1  2  3  4  5
1           5

Note that the replacement of ,, with , , is done twice. If you do it only once, 1,,,4 becomes 1, ,,4 because matches cannot overlap: the second comma has already been consumed by the first match.
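
To see why one pass is not enough, here is a small Python sketch (not part of the original answer; the function names are made up for illustration). It mimics the doubled sed substitution with two non-overlapping replacements and compares it with a single regex pass that uses a lookahead, so that every comma directly followed by another comma gets a space after it.

```python
import re


def fill_empty_twice(line: str) -> str:
    """Mimic the sed command: run the same non-overlapping replacement twice."""
    once = line.replace(",,", ", ,")
    return once.replace(",,", ", ,")


def fill_empty_lookahead(line: str) -> str:
    """Single pass: add a space after each comma that is directly followed by a comma."""
    return re.sub(r",(?=,)", ", ", line)


if __name__ == "__main__":
    for sample in ("1,,,4", "1,,,,5"):
        single_pass = sample.replace(",,", ", ,")  # what one sed pass would produce
        print(repr(sample), "once:", repr(single_pass))
        print("  twice:    ", repr(fill_empty_twice(sample)))
        print("  lookahead:", repr(fill_empty_lookahead(sample)))
```

Both variants only fill empty fields between two commas; an empty field at the very start or end of a line is left untouched.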


I really like this option -- it's good to know about column. I ended up making this a short shell script (most of it is boilerplate "how do I use it?" and error checking code). github.com/benjaminoakes/utilities/blob/master/view-csv
The 'Debian GNU/Linux' version of column has the '-n' option: "By default, the column command will merge multiple adjacent delimiters into a single delimiter when using the -t option; this option disables that behavior. This option is a Debian GNU/Linux extension."
It seems to break if you have column values (quoted) with commas in them. Any idea how to fix this?
from man column: -n By default, the column command will merge multiple adjacent delimiters into a single delimiter when using the -t option; this option disables that behavior. This option is a Debian GNU/Linux extension.
Unfortunately if a value contains a comma, it will be split even if it is quoted.
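
One possible workaround for the quoted-comma problem is to let a real CSV parser split the fields and do the padding yourself. The following is a minimal sketch using only Python's standard csv module, not a replacement for column; the function name align_csv and the gap parameter are invented for this example.

```python
import csv
import io
from itertools import zip_longest
from typing import List


def align_csv(text: str, gap: int = 2) -> List[str]:
    """Return the CSV rows as lines of left-aligned, space-padded columns."""
    # csv.reader understands quoted fields, so "x, y" stays a single cell.
    rows = list(csv.reader(io.StringIO(text)))
    # Transpose rows into columns; short rows are padded with empty cells.
    columns = zip_longest(*rows, fillvalue="")
    widths = [max(len(cell) for cell in column) for column in columns]
    lines = []
    for row in rows:
        padded = row + [""] * (len(widths) - len(row))
        cells = [cell.ljust(width) for cell, width in zip(padded, widths)]
        lines.append((" " * gap).join(cells).rstrip())
    return lines


if __name__ == "__main__":
    sample = 'a,b,c\n1,,3\n"x, y",2,\n'
    print("\n".join(align_csv(sample)))
```

Empty fields need no placeholder here because each cell is padded to its column width. Like column, this reads the whole input before printing anything, so very large files will use a corresponding amount of memory.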
Boris Verkhovskiy

You can install csvtool (on Ubuntu) via

sudo apt-get install csvtool

and then run:

csvtool readable filename | view -

This will make it nice and pretty inside of a read-only vim instance, even if you have some cells with very long values.


For those not on Debian-based distros, this tool seems to originate from here: docs.camlcity.org/docs/godisrc/ocaml-csv-1.1.6.tar.gz Unfortunately the "homepage" link is dead, and I don't see an easy way to download the whole archive in one go.
The tool can't handle files of 100 MB or more.
This tool is available from the ocaml-csv package in the base for me in Centos7
Sandeep

Have a look at csvkit. It provides a set of tools that adhere to the UNIX philosophy (meaning they are small, simple, single-purposed and can be combined).

Here is an example that extracts the ten most populated cities in Germany from the free Maxmind World Cities database and displays the result in a console-readable format:

$ csvgrep -e iso-8859-1 -c 1 -m "de" worldcitiespop | csvgrep -c 5 -r "\d+" \
  | csvsort -r -c 5 -l | csvcut -c 1,2,4,6 | head -n 11 | csvlook
-----------------------------------------------------
|  line_number | Country | AccentCity | Population  |
-----------------------------------------------------
|  1           | de      | Berlin     | 3398362     |
|  2           | de      | Hamburg    | 1733846     |
|  3           | de      | Munich     | 1246133     |
|  4           | de      | Cologne    | 968823      |
|  5           | de      | Frankfurt  | 648034      |
|  6           | de      | Dortmund   | 594255      |
|  7           | de      | Stuttgart  | 591688      |
|  8           | de      | Düsseldorf | 577139      |
|  9           | de      | Essen      | 576914      |
|  10          | de      | Bremen     | 546429      |
-----------------------------------------------------

Csvkit is platform independent because it is written in Python.
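
For readers who want to see what the stages of that pipeline do, here is a rough standard-library sketch of the same steps: filter by country, keep rows with a numeric population, sort descending, number the rows, and take the first ten. It is an approximation for illustration, not how csvkit is implemented. The sample uses a simplified three-column layout, and the rows "Kleinstadt" and "Elsewhere" are made-up placeholders; the other figures are taken from the output above.

```python
import csv
import io

# Simplified sample; "Kleinstadt" and "Elsewhere" are placeholder rows.
SAMPLE = """Country,AccentCity,Population
de,Hamburg,1733846
de,Berlin,3398362
de,Kleinstadt,
xx,Elsewhere,999
de,Munich,1246133
"""


def top_cities(text, country="de", limit=10):
    """Return (line_number, country, city, population) tuples, largest first."""
    rows = [
        row
        for row in csv.DictReader(io.StringIO(text))
        # Roughly the two csvgrep stages: match the country, require digits.
        if row["Country"] == country and row["Population"].isdigit()
    ]
    # Roughly csvsort -r -c 5: numeric sort, descending.
    rows.sort(key=lambda row: int(row["Population"]), reverse=True)
    # Roughly -l plus csvcut plus head: number the rows and keep the first few.
    return [
        (number, row["Country"], row["AccentCity"], int(row["Population"]))
        for number, row in enumerate(rows[:limit], start=1)
    ]


if __name__ == "__main__":
    for line in top_cities(SAMPLE):
        print(*line, sep=" | ")
```

To run this against a real file, the text could be read with open(path, encoding="iso-8859-1", newline=""), which corresponds to the -e iso-8859-1 option in the command above.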


Works great on my Mac. Very useful for reading large files.
I like Csvkit. csvlook | less -S
To get csvkit you can just pip install it: pip install csvkit. Enjoy!
The link to the Maxmind database is dead
One can use brew also to install this, just run brew install csvkit
Scott Hansen

Tabview, a lightweight Python curses command-line CSV file viewer (which also handles other tabular Python data, such as a list of lists), is available on GitHub.

Features:

Python 2.7+, 3.x

Unicode support

Spreadsheet-like view for easily visualizing tabular data

Vim-like navigation (h,j,k,l, g(top), G(bottom), 12G goto line 12, m - mark, ' - goto mark, etc.)

Toggle persistent header row

Dynamically resize column widths and gap

Sort ascending or descending by any column. 'Natural' order sort for numeric values.

Full-text search, n and p to cycle between search results

'Enter' to view the full cell contents

Yank cell contents to clipboard

F1 or ? for keybindings

Can also use from python command line to visualize any tabular data (e.g. list-of-lists)


Great tool. Opened a huge file that crashed csvtool and openoffice. Very fast too.
After 'pip install tabview' on windows successfully, how do I launch the program? I can use 'tabview file.csv' on linux successfully, but windows does not seem to work. Thanks!
I don't believe the curses module is available on Windows. Sorry! There may be a third party module available but I haven't done any development for Windows.
@CiroSantilli烏坎事件2016六四事件法轮功, unfortunately not yet. I'm hoping to put some time into tabview soon...it's been rather dormant for a while here. :(
TabView now recommends VisiData which is just an amazing interactive viewer for CSV files. jsvine.github.io/intro-to-visidata
Matt Ball

If you're a vimmer, use the CSV plugin, which is juuust beautiful:

https://www.256bit.org/%7Echrisbra/csv.gif


Too slow, but the idea is cool nonetheless.
user3751385

The nodejs package tecfu/tty-table can be globally installed to do precisely this:

apt-get install nodejs
npm i -g tty-table
cat data.csv | tty-table

https://i.stack.imgur.com/3KcvX.png

It can also handle streams.

For more info, see the docs for terminal usage here.


Please leave a reason if you downvote. This package works and works well.
nodejs is a webserver platform. You should not recommend someone to cut bread with a chainsaw.
node is a general purpose scripting system with CLI bindings, how is that different from using a perl one-liner or something from CPAN?
I really like this option, but when I pipe it to less, it doesn't look right. Do you know if something extra is required to make it work with less?
This package breaks if the file contains many columns (in particular more than the horizontal width of the terminal screen can handle) and doesn't align them properly thereafter.
smartmic

xsv is more than a viewer. I recommend it for most CSV tasks on the command line, especially when dealing with large datasets.


Rust-based, by the same author as ripgrep. Very cool.
pisswillis

Ofri's answer gives you everything you asked for. But if you don't want to remember the command, you can add this to your ~/.bashrc (or equivalent):

csview()
{
    local file="$1"
    sed "s/,/\t/g" "$file" | less -S
}

This is exactly the same as Ofri's answer, except that I have wrapped it in a shell function and use the less -S option to stop lines from wrapping (this makes less behave more like office/oocalc).

Open a new shell (or type source ~/.bashrc in your current shell) and run the command using:

csview <filename>


This doesn't handle commas inside quoted fields.
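
If quoted commas matter, one option is to do the comma-to-tab conversion with a CSV parser instead of sed. This is only a sketch under that assumption; csv_to_tsv is a made-up name, and replacing tabs and newlines inside fields with spaces is a simplification chosen for this example.

```python
import csv
import io


def csv_to_tsv(text: str) -> str:
    """Convert CSV text to tab-separated text, respecting quoted fields."""
    out = io.StringIO()
    for row in csv.reader(io.StringIO(text)):
        # Flatten tabs and newlines inside a field so each record stays on one line.
        cleaned = [cell.replace("\t", " ").replace("\n", " ") for cell in row]
        out.write("\t".join(cleaned) + "\n")
    return out.getvalue()


if __name__ == "__main__":
    sample = 'name,note\n"Smith, J",ok\n'
    print(csv_to_tsv(sample), end="")
```

Used as a filter, for example by writing csv_to_tsv(sys.stdin.read()) to standard output, the result can be piped into less -S in the same way as the sed version.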
Tom Weiss

I used pisswillis's answer for a long time.

csview()
{
    local file="$1"
    sed "s/,/\t/g" "$file" | less -S
}

But then combined some code I found at http://chrisjean.com/2011/06/17/view-csv-data-from-the-command-line which works better for me:

csview()
{
    local file="$1"
    sed -e 's/,,/, ,/g' -e 's/,,/, ,/g' "$file" | column -s, -t | less -#5 -N -S
}

The reason it works better for me is that it handles wide columns better.


Nikos Alexandris

Yet another multi-purpose tool for manipulating CSV (and more): Miller. From its own description, it is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON. (link to github repository: https://github.com/johnkerl/miller)


I landed here through a search engine so if somebody finds this answer, here's a handy miller command line that pretty-prints CSV headers, draws table borders and right-aligns the column values: mlr --icsv --opprint --barred --right cat YOUR_FILE.csv (replace --icsv with --itsv if your file is TSV).
Ofri Raviv

Here's a (probably too) simple option:

sed "s/,/\t/g" filename.csv | less

That was my first inclination as well. But you have to insert enough tabs to match the longest value for your column... Started getting a little complicated and I thought "someone else must have done this already."
You're also ignoring the fact that commas might be quoted and therefore not separators. (amongst other things)
stefan.schroedl

tblless in the Tabulator package wraps the unix column command, and also aligns numeric columns.


It's not bad, it works reliably but formatting can definitely be better, and it doesn't infer maximum column widths well -- it kind of blindly enforces an arbitrary limit. Miller (mlr) definitely does it better.
Nico Schlömer

I've created tablign for these (and other) purposes. Install with

pip install tablign

and

$ cat test.csv
Header1,Header2,Header3
Pizza,Artichoke dip,Bob's Special of the Day
BLT,Ham on rye with the works,
$ tablign test.csv
Header1 , Header2                   , Header3
Pizza   , Artichoke dip             , Bob's Special of the Day
BLT     , Ham on rye with the works ,

It also works if the data is separated by something other than commas. Most importantly, it preserves the delimiters, so you can also use it to style your ASCII tables without sacrificing your [Markdown,CSV,LaTeX] syntax.


Collecting tablify Could not find a version that satisfies the requirement tablify (from versions: ) No matching distribution found for tablify
@masterxilo I'd renamed it to tablign. Fixed in the description.
Perfect, just works.
Looks good but uses a lot of memory (a couple of GB) on a 70 MB file.
Jean Vincent

I wrote this csv_view.sh to format CSVs from the command line. It reads the entire file to figure out the optimal width of each column (it requires perl, assumes there are no commas in fields, and also uses less):


#!/bin/bash

perl -we '
  sub max( @ ) {
    my $max = shift;

    map { $max = $_ if $_ > $max } @_;
    return $max;
  }

  sub transpose( @ ) {
    my @matrix = @_;
    my $width  = scalar @{ $matrix[ 0 ] };
    my $height = scalar @matrix;

    return map { my $x = $_; [ map { $matrix[ $_ ][ $x ] } 0 .. $height - 1 ] } 0 .. $width - 1;
  }

  # Read all lines, as arrays of fields
  my @lines = map { s/\r?\n$//; [ split /,/ ] } <>;

  my $widths =
    # Build a pack expression based on column lengths
    join "",

    # For each column get the longest length plus 1
    map { 'A' . ( 1 + max map { length } @$_ ) }

    # Get arrays of columns
    transpose

    @lines
  ;

  # Format all lines with pack
  map { print pack( $widths, @$_ ) . "\n" } @lines;
' "$1" | less -NS


Raptor

Using TxtSushi you can do:

csvtopretty filename.csv | less -S

Downvote for not being a one line install procedure. I don't have the time to compile this :(. If you could provide a package that would be awesome.
@masterxilo that's not a valid reason to downvote. Many packages today require several steps to install. Plus, it would probably be faster to install than to write the comment.
pratyahara

Tabview is really good. It handled 200+ MB files and displayed them nicely, whereas LibreOffice and the csv plugin in gvim were buggy with them.

The Anaconda version is available here: https://anaconda.org/bioconda/tabview


James Durbin

I wrote a script, viewtab, in Groovy for just this purpose. You invoke it like:

viewtab filename.csv

It is basically a super-lightweight spreadsheet that can be invoked from the command line, handles CSV and tab separated files, can read VERY large files that Excel and Numbers choke on, and is very fast. It's not command-line in the sense of being text-only, but it is platform independent and will probably fit the bill for many people looking for a solution to the problem of quickly inspecting many or large CSV files while working in a command line environment.

The script and how to install it are described here:

http://bayesianconspiracy.blogspot.com/2012/06/quick-csvtab-file-viewer.html


Rufus Pollock

There's this short command line script in python: https://github.com/rgrp/csv2ascii/blob/master/csv2ascii.py

Just download and place in your path. Usage is like

csv2ascii.py [options] csv-file-path

Convert csv file at csv-file-path to ascii form returning the result on stdout. If csv-file-path = '-' then read from stdin.

Options:

  -h, --help            show this help message and exit
  -w WIDTH, --width=WIDTH
                        Width of ascii output
  -c COLUMNS, --columns=COLUMNS
                        Only display this number of columns