I'm trying to use a constant instead of a string literal in this piece of code:
new InputStreamReader(new FileInputStream(file), "UTF-8")
"UTF-8" appears in the code rather often, and it would be much better to refer to some static final variable instead. Do you know where I can find such a variable in the JDK?
BTW, on second thought, such constants are bad design: Public Static Literals ... Are Not a Solution for Data Duplication
Files.newBufferedWriter(Path path, Charset cs) from NIO.
In Java 1.7+, java.nio.charset.StandardCharsets defines constants for Charset, including UTF_8.
import java.nio.charset.StandardCharsets;
...
StandardCharsets.UTF_8.name();
For Android: minSdk 19
Now I use the org.apache.commons.lang3.CharEncoding.UTF_8 constant from commons-lang.
org.apache.commons.lang3.CharEncoding.UTF_8. (Note "lang3").
The Google Guava library (which I'd highly recommend anyway, if you're doing work in Java) has a Charsets class with static fields like Charsets.UTF_8, Charsets.UTF_16, etc.
Since Java 7 you should just use java.nio.charset.StandardCharsets instead for comparable constants.
Note that these constants aren't strings, they're actual Charset instances. All standard APIs that take a charset name also have an overload that takes a Charset object, which you should use instead.
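To illustrate the point about Charset overloads, here is a minimal sketch using String.getBytes(Charset) and new String(byte[], Charset). The class and method names (CharsetOverloadDemo, roundTrip, utf8Length) are made up for this example. Unlike the String-name overloads, these do not declare the checked UnsupportedEncodingException.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
final class CharsetOverloadDemo {

    // Encodes to UTF-8 bytes and decodes back, using the Charset overloads.
    static String roundTrip(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String text = "h\u00e9llo"; // "e" with acute accent takes two bytes in UTF-8
        System.out.println(roundTrip(text).equals(text));
        System.out.println(utf8Length(text));
    }
}
```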
java.nio.charset.StandardCharsets when possible, which is not third party code. Additionally, the Guava Charsets definitions are not "constantly modified" and AFAIK have never broken backwards compatibility, so I don't think your criticism is warranted.
Charsets class. If you want to complain about Guava, that's fine, but this is not the place for those complaints.
In case this page comes up in someone's web search: as of Java 1.7 you can use java.nio.charset.StandardCharsets to get access to constant definitions of the standard charsets.
Files.readAllLines(Paths.get("path-to-some-file"), StandardCharsets.UTF_8);
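A slightly fuller sketch of the same call, in case a runnable version helps. It writes a temporary file and reads it back, passing StandardCharsets.UTF_8 on both sides. The class name ReadLinesDemo and the temp-file prefix are assumptions for the example, and UncheckedIOException requires Java 8.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
final class ReadLinesDemo {

    // Writes the lines to a temp file as UTF-8, then reads them back as UTF-8.
    static List<String> writeThenRead(List<String> lines) {
        try {
            Path tmp = Files.createTempFile("charset-demo", ".txt");
            try {
                Files.write(tmp, lines, StandardCharsets.UTF_8);
                return Files.readAllLines(tmp, StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead(Arrays.asList("first", "second")));
    }
}
```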
This constant is available (among others such as UTF-16, US-ASCII, etc.) in the class org.apache.commons.codec.CharEncoding as well.
There are none (at least in the standard Java library). Character sets vary from platform to platform so there isn't a standard list of them in Java.
There are some 3rd party libraries which contain these constants though. One of these is Guava (Google core libraries): http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/base/Charsets.html
StandardCharsets are guaranteed to exist in every Java implementation on every platform.
You can use the Charset.defaultCharset() API or the file.encoding property.
But if you want your own constant, you'll need to define it yourself.
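A rough sketch of both options, assuming a holder class named Encodings (a made-up name). Note that the default charset depends on the platform and JVM version, so it is not necessarily UTF-8; the self-defined constant is.

```java
import java.nio.charset.Charset;

// Hypothetical constants holder; the class and field names are illustrative only.
final class Encodings {
    static final String UTF_8_NAME = "UTF-8";
    // UTF-8 is one of the charsets every Java platform must support,
    // so this lookup is not expected to fail.
    static final Charset UTF_8 = Charset.forName(UTF_8_NAME);

    private Encodings() {
    }

    public static void main(String[] args) {
        // These two values depend on the environment the JVM runs in.
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));
        // The self-defined constant is the same everywhere.
        System.out.println("own constant:    " + UTF_8.name());
    }
}
```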
In Java 1.7+
Do not use the "UTF-8" string; instead use the Charset type parameter:
import java.nio.charset.StandardCharsets;
...
new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8);
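For completeness, a self-contained sketch around that constructor call. The class Utf8ReaderDemo and its methods are invented for the example; the temp file only exists to make the snippet runnable, and UncheckedIOException assumes Java 8 or later.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical demo class; the names are illustrative only.
final class Utf8ReaderDemo {

    // Reads the first line of the file, decoding it as UTF-8 via the Charset overload.
    static String readFirstLine(File file) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes a UTF-8 encoded sample line to a temp file and reads it back.
    static String demo() {
        try {
            File tmp = File.createTempFile("utf8-demo", ".txt");
            try {
                Files.write(tmp.toPath(), "gr\u00fc\u00dfe\n".getBytes(StandardCharsets.UTF_8));
                return readFirstLine(tmp);
            } finally {
                tmp.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```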
If you are using OkHttp for Java/Android you can use the following constant:
import com.squareup.okhttp.internal.Util;
Util.UTF_8; // Charset
Util.UTF_8.name(); // String
Use Charset.forName("UTF-8").name() when you need to support Android versions below API 19; otherwise you can use StandardCharsets.UTF_8.name().
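A small sketch showing that the two routes yield the same name and an equal Charset, assuming a plain JVM (on Android the same calls apply, with StandardCharsets requiring API 19). The class name Utf8Name is made up.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
final class Utf8Name {

    // Works wherever java.nio.charset.Charset is available.
    static String viaForName() {
        return Charset.forName("UTF-8").name();
    }

    // Requires Java 7+ (or Android API 19+).
    static String viaStandardCharsets() {
        return StandardCharsets.UTF_8.name();
    }

    public static void main(String[] args) {
        System.out.println(viaForName());
        System.out.println(viaStandardCharsets());
        // Charsets are equal when their canonical names match.
        System.out.println(Charset.forName("UTF-8").equals(StandardCharsets.UTF_8));
    }
}
```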
Constant definitions for the standard Charsets. These charsets are guaranteed to be available on every implementation of the Java platform. Since: 1.7
package java.nio.charset;
Charset utf8 = StandardCharsets.UTF_8;
The class org.apache.commons.lang3.CharEncoding (and with it CharEncoding.UTF_8) is deprecated now that Java 7 has introduced java.nio.charset.StandardCharsets:
@see JRE character encoding names
@since 2.1
@deprecated Java 7 introduced {@link java.nio.charset.StandardCharsets}, which defines these constants as
{@link Charset} objects. Use {@link Charset#name()} to get the string values provided in this class.
This class will be removed in a future release.
.toString() will work but the proper function is .name(). 99.9% toString is not the answer.
.displayName() will also work unless it is overridden for localization as intended.
.name() at all. You can directly pass the Charset object into the InputStreamReader constructor.
String, perhaps because of legacy reasons. In such cases, I keep a Charset object around, typically derived from StandardCharsets, and use name() if needed.
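As one possible illustration of that last pattern: URLEncoder.encode(String, String) is an API that takes the charset as a name, so a Charset kept as a constant can be passed via name(). The class LegacyNameDemo and its method are assumptions for this sketch.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
final class LegacyNameDemo {
    // Keep the Charset object around and derive the name only where an API demands a String.
    private static final Charset CHARSET = StandardCharsets.UTF_8;

    static String encodeQueryValue(String value) {
        try {
            return URLEncoder.encode(value, CHARSET.name());
        } catch (UnsupportedEncodingException e) {
            // Not expected, since the name comes from a charset the JVM already provides.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeQueryValue("a b&c"));
    }
}
```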