
Where to get "UTF-8" string literal in Java?

I'm trying to use a constant instead of a string literal in this piece of code:

new InputStreamReader(new FileInputStream(file), "UTF-8")

"UTF-8" appears in the code rather often, and it would be much better to refer to a static final variable instead. Do you know where I can find such a variable in the JDK?

BTW, on second thought, such constants are bad design: Public Static Literals ... Are Not a Solution for Data Duplication

Note: if you are already on Java 7, use Files.newBufferedWriter(Path path, Charset cs) from NIO.
That's some really bad advice from your link. He wants you to make a wrapper class for every possible string constant you might use?

Jameson

In Java 1.7+, java.nio.charset.StandardCharsets defines constants for Charset including UTF_8.

import java.nio.charset.StandardCharsets;

...

StandardCharsets.UTF_8.name();

For Android: minSdk 19


do you use .toString() on that?
.toString() will work, but the proper method is .name(). 99.9% of the time, toString() is not the answer.
btw .displayName() will also work unless it is overridden for localization as intended.
You don't really need to call name() at all. You can directly pass the Charset object into the InputStreamReader constructor.
And there are other libs out there which do require a String, perhaps because of legacy reasons. In such cases, I keep a Charset object around, typically derived from StandardCharsets, and use name() if needed.
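To make the two comments above concrete, here is a minimal sketch that passes the Charset object straight to InputStreamReader and falls back to name() only where a String is required. The class and method names (CharsetReaderSketch, decode, legacyName) are made up for illustration, and a ByteArrayInputStream stands in for the FileInputStream from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the JDK.
class CharsetReaderSketch {

    // Decodes the bytes using the InputStreamReader(InputStream, Charset) constructor.
    static String decode(byte[] bytes) {
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8)) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // For legacy APIs that only accept a charset name, derive it from the constant.
    static String legacyName() {
        return StandardCharsets.UTF_8.name();
    }

    public static void main(String[] args) {
        byte[] bytes = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(bytes));
        System.out.println(legacyName());
    }
}
```

With the Charset overload there is no UnsupportedEncodingException to handle, which is one reason to prefer it over the String overload.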
yegor256

Now I use the org.apache.commons.lang3.CharEncoding.UTF_8 constant from commons-lang.


For those using Lang 3.0: org.apache.commons.lang3.CharEncoding.UTF_8. (Note "lang3").
If you're using Java 1.7, see @Roger's answer below since it's part of the standard library.
P.S. "@Roger's answer below" is now @Roger's answer above. ☝
That class is deprecated, since Java 7 introduced java.nio.charset.StandardCharsets.
JuanMoreno

The Google Guava library (which I'd highly recommend anyway, if you're doing work in Java) has a Charsets class with static fields like Charsets.UTF_8, Charsets.UTF_16, etc.

Since Java 7 you should just use java.nio.charset.StandardCharsets instead for comparable constants.

Note that these constants aren't strings, they're actual Charset instances. Most standard APIs that take a charset name also have an overload that takes a Charset object, which you should use instead.
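As a rough illustration of the difference between the two kinds of overload, the sketch below encodes the same text once by charset name and once by Charset object, using String.getBytes. The class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the JDK.
class CharsetOverloadSketch {

    // Name-based overload: declares a checked exception even for "UTF-8".
    static byte[] encodeByName(String text) {
        try {
            return text.getBytes("UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Cannot happen for UTF-8, but the compiler still requires the handler.
            throw new IllegalStateException(e);
        }
    }

    // Charset-based overload: no checked exception, no string literal to mistype.
    static byte[] encodeByCharset(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decoding also has a Charset-based constructor.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Both methods produce the same bytes; the Charset version simply removes the try/catch noise.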


So, should be Charsets.UTF_8.name()?
@kilaka Yeah, use name() instead of displayName(), since name() is final and displayName() is not.
@Buffalo: Please read my answer again: it recommends using java.nio.charset.StandardCharsets when possible, which is not third party code. Additionally, the Guava Charsets definitions are not "constantly modified" and AFAIK have never broken backwards compatibility, so I don't think your criticism is warranted.
@Buffalo: That's as it may be, but I doubt your issues had anything to do with the Charsets class. If you want to complain about Guava, that's fine, but this is not the place for those complaints.
Please do not include a multi-megabyte library to get one string constant.
cosjav

In case this page comes up in someone's web search: as of Java 1.7 you can use java.nio.charset.StandardCharsets to get access to constant definitions of the standard charsets.


I have been trying to use this but it does not seem to work. 'Charset.defaultCharset());' seems to work after including 'java.nio.charset.*' but I can't seem to explicitly refer to UTF8 when I am trying to use 'Files.readAllLines'.
@Roger What seems to be the problem? From what I can see you can just call: Files.readAllLines(Paths.get("path-to-some-file"), StandardCharsets.UTF_8);
I don't know what the problem was, but it worked for me after changing something which I can't remember.
^^^ You probably had to change the target platform in the IDE. If 1.6 was your latest JDK when you installed the IDE, it probably picked it as the default & kept it as the default long after you'd updated both the IDE and JDK themselves in-place.
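For reference, a small self-contained sketch of the Files.readAllLines call discussed above, paired with Files.newBufferedWriter so it can run without an existing file. The temporary file and the class and method names are assumptions made for the example.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper class, not part of the JDK.
class ReadAllLinesSketch {

    // Writes the lines to a temporary file as UTF-8, then reads them back as UTF-8.
    static List<String> roundTrip(List<String> lines) {
        try {
            Path tmp = Files.createTempFile("utf8-sketch", ".txt");
            try {
                try (BufferedWriter writer =
                        Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
                    for (String line : lines) {
                        writer.write(line);
                        writer.newLine();
                    }
                }
                return Files.readAllLines(tmp, StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If this fails to compile with "cannot find symbol StandardCharsets", the project is most likely targeting a pre-1.7 platform, as the last comment suggests.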
Alfredo Carrillo

This constant is also available (along with others such as UTF-16 and US-ASCII) in the class org.apache.commons.codec.CharEncoding.


tskuzzy

There are none (at least in the standard Java library). Character sets vary from platform to platform so there isn't a standard list of them in Java.

There are some 3rd party libraries which contain these constants though. One of these is Guava (Google core libraries): http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/base/Charsets.html


It took me a second to catch on to this... Guava's Charsets constants are (no surprise) Charsets, not Strings. InputStreamReader has another constructor that takes a Charset rather than a string. If you really need the string, it's e.g. Charsets.UTF_8.name().
Character sets may vary from platform to platform, but UTF-8 is guaranteed to exist.
All charsets defined in StandardCharsets are guaranteed to exist in every Java implementation on every platform.
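The guarantee mentioned in the comments can be checked with a few lines. This is only a sketch with a hypothetical class name; it lists the six StandardCharsets fields and asks Charset.isSupported about each.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the JDK.
class GuaranteedCharsetsSketch {

    // The six constants defined by java.nio.charset.StandardCharsets.
    static final Charset[] STANDARD = {
        StandardCharsets.US_ASCII,
        StandardCharsets.ISO_8859_1,
        StandardCharsets.UTF_8,
        StandardCharsets.UTF_16BE,
        StandardCharsets.UTF_16LE,
        StandardCharsets.UTF_16
    };

    // Returns true if the running JVM reports every standard charset as supported.
    static boolean allSupported() {
        for (Charset cs : STANDARD) {
            if (!Charset.isSupported(cs.name())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allSupported());
    }
}
```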
Andrew Tobilko

You can use the Charset.defaultCharset() API or the file.encoding system property.

But if you want your own constant, you'll need to define it yourself.


The default charset is usually determined by the OS and locale settings. I don't think there is any guarantee that it remains the same across multiple Java invocations, so this is no replacement for a constant specifying "utf-8".
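A short sketch of why the comment matters: encoding with the default charset gives platform-dependent bytes, while encoding with an explicit constant does not. The class and method names are invented for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the JDK.
class DefaultVsExplicitSketch {

    // Result depends on the JVM's default charset (OS, locale, file.encoding).
    static byte[] encodeWithDefault(String text) {
        return text.getBytes(Charset.defaultCharset());
    }

    // Result is the same on every platform.
    static byte[] encodeWithUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // LATIN SMALL LETTER E WITH ACUTE
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("default bytes:   " + encodeWithDefault(text).length);
        System.out.println("UTF-8 bytes:     " + encodeWithUtf8(text).length);
    }
}
```

The UTF-8 length is always 2 for this character; the default-charset length may be 1, 2, or something else depending on the machine.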
Mostafa Vatanpour

In Java 1.7+

Do not use the "UTF-8" string; pass a Charset parameter instead:

import java.nio.charset.StandardCharsets;

...

new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8);

JJD

If you are using OkHttp for Java/Android you can use the following constant:

import com.squareup.okhttp.internal.Util;

Util.UTF_8; // Charset
Util.UTF_8.name(); // String

This constant has been removed from OkHttp, so the next option is Charset.forName("UTF-8").name() when you need to support Android versions below API 19; otherwise you can use StandardCharsets.UTF_8.name().
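If StandardCharsets is not available on the target platform, one possible approach is to define the constant once yourself with Charset.forName. The holder class below (MyCharsets) is purely hypothetical.

```java
import java.nio.charset.Charset;

// Hypothetical project-local holder, for platforms without StandardCharsets.
final class MyCharsets {

    // UTF-8 must be supported by every Java platform, so this lookup does not fail.
    static final Charset UTF_8 = Charset.forName("UTF-8");

    // String form for APIs that only accept a charset name.
    static final String UTF_8_NAME = UTF_8.name();

    private MyCharsets() {
        // no instances
    }
}
```

Call sites can then use MyCharsets.UTF_8 or MyCharsets.UTF_8_NAME, and the literal appears in exactly one place.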
Vazgen Torosyan

Constant definitions for the standard charsets. These charsets are guaranteed to be available on every implementation of the Java platform. Since 1.7.

 import java.nio.charset.Charset;
 import java.nio.charset.StandardCharsets;
 Charset utf8 = StandardCharsets.UTF_8;

sendon1982

The class org.apache.commons.lang3.CharEncoding, which holds the UTF_8 constant, is deprecated because Java 7 introduced java.nio.charset.StandardCharsets. From its Javadoc:

@see JRE character encoding names

@since 2.1

@deprecated Java 7 introduced {@link java.nio.charset.StandardCharsets}, which defines these constants as {@link Charset} objects. Use {@link Charset#name()} to get the string values provided in this class.

This class will be removed in a future release.