Why can't I use switch statement on a String?

erickson

Switch statements with String cases have been implemented in Java SE 7, at least 16 years after they were first requested. A clear reason for the delay was not provided, but it likely had to do with performance.

Implementation in JDK 7

The feature has now been implemented in javac with a "de-sugaring" process; a clean, high-level syntax using String constants in case declarations is expanded at compile-time into more complex code following a pattern. The resulting code uses JVM instructions that have always existed.

A switch with String cases is translated into two switches during compilation. The first maps each string to a unique integer—its position in the original switch. This is done by first switching on the hash code of the label. The corresponding case is an if statement that tests string equality; if there are collisions on the hash, the test is a cascading if-else-if. The second switch mirrors that in the original source code, but substitutes the case labels with their corresponding positions. This two-step process makes it easy to preserve the flow control of the original switch.
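To make the two-step pattern concrete, here is a hand-written approximation of what the expansion could look like for a three-label switch. It is only a sketch: the class and method names are made up for illustration, and the real javac output differs in detail. The labels "FB" and "Ea" are chosen because they share the hash code 2236, so the collision branch is exercised.

```java
// Hand-written approximation of the two-switch pattern; not actual javac output.
final class DesugaredSwitchSketch {

    // Source-level equivalent of:
    //   switch (s) {
    //     case "FB":    return "first";
    //     case "Ea":    return "second";
    //     case "Hello": return "third";
    //     default:      return "other";
    //   }
    static String classify(String s) {
        // Step 1: map the string to the position of its case label (-1 = no match).
        int position = -1;
        switch (s.hashCode()) {
            case 2236: // "FB".hashCode() and "Ea".hashCode() are both 2236
                if (s.equals("FB")) {
                    position = 0;
                } else if (s.equals("Ea")) {
                    position = 1;
                }
                break;
            case 69609650: // "Hello".hashCode()
                if (s.equals("Hello")) {
                    position = 2;
                }
                break;
            default:
                break;
        }

        // Step 2: the original control flow, switching on positions instead of strings.
        switch (position) {
            case 0:
                return "first";
            case 1:
                return "second";
            case 2:
                return "third";
            default:
                return "other";
        }
    }
}
```

As with a real String switch, passing null throws a NullPointerException, here from the call to hashCode().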

Switches in the JVM

For more technical depth on switch, you can refer to the JVM Specification, where the compilation of switch statements is described. In a nutshell, there are two different JVM instructions that can be used for a switch, depending on the sparsity of the constants used by the cases. Both depend on using integer constants for each case to execute efficiently.

If the constants are dense, they are used as an index (after subtracting the lowest value) into a table of instruction pointers—the tableswitch instruction.

If the constants are sparse, a binary search for the correct case is performed—the lookupswitch instruction.

In de-sugaring a switch on String objects, both instructions are likely to be used. The lookupswitch is suitable for the first switch on hash codes to find the original position of the case. The resulting ordinal is a natural fit for a tableswitch.
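The difference between the two shapes can be seen in a small experiment. The sketch below uses a hypothetical class, SwitchShapes, with one switch over consecutive labels and one over widely spaced labels. Which instruction the compiler actually picks is its own decision, so treat the comments as the typical outcome rather than a guarantee.

```java
final class SwitchShapes {

    // Consecutive labels 0..3: a compiler can typically emit a tableswitch here.
    static String dense(int ordinal) {
        switch (ordinal) {
            case 0: return "zero";
            case 1: return "one";
            case 2: return "two";
            case 3: return "three";
            default: return "unknown";
        }
    }

    // Widely spaced labels (String hash codes): typically a lookupswitch.
    static String sparse(int hash) {
        switch (hash) {
            case 2236: return "FB or Ea";   // "FB".hashCode() == "Ea".hashCode()
            case 69609650: return "Hello";  // "Hello".hashCode()
            default: return "unknown";
        }
    }
}
```

After compiling, running javap -c SwitchShapes prints the bytecode, so you can check which instruction was chosen for each method.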

Both instructions require the integer constants assigned to each case to be sorted at compile time. At runtime, while the O(1) performance of tableswitch generally appears better than the O(log(n)) performance of lookupswitch, it requires some analysis to determine whether the table is dense enough to justify the space–time tradeoff. Bill Venners wrote a great article that covers this in more detail, along with an under-the-hood look at other Java flow control instructions.

Before JDK 7

Prior to JDK 7, enum could approximate a String-based switch. This uses the static valueOf method generated by the compiler on every enum type. For example:

Pill p = Pill.valueOf(str);
switch(p) {
  case RED:  pop();  break;
  case BLUE: push(); break;
}

It might be faster to just use if-else-if instead of a hash for a string-based switch. I have found dictionaries to be quite expensive when only storing a few items.

An if-elseif-elseif-elseif-else might be faster, but I'd take the cleaner code 99 times out of 100. Strings, being immutable, cache their hash code, so "computing" the hash is fast. One would have to profile the code to determine what benefit there is.

The reason given against adding switch(String) is that it wouldn't meet the performance guarantees expected from switch() statements. They didn't want to "mislead" developers. Frankly, I don't think they should guarantee the performance of switch() to begin with.

If you are just using Pill to take some action based on str, I would argue if-else is preferable. It lets you handle str values outside the range RED, BLUE without catching an exception from valueOf or manually checking for a match against the name of each enum constant, which just adds unnecessary overhead. In my experience it has only made sense to use valueOf to transform into an enumeration if a type-safe representation of the String value was needed later on.

@fernal73 It depends on how many ifs you have cascaded, and whether the switch string's hash code has already been computed. For two or three, it could probably be faster. At some point though, the switch statement will probably perform better. More importantly, for many cases, the switch statement is probably more readable.

JeeBee

If you have a place in your code where you can switch on a String, then it may be better to refactor the String to be an enumeration of the possible values, which you can switch on. Of course, you limit the potential values of Strings you can have to those in the enumeration, which may or may not be desired.

Of course, your enumeration could have an entry for 'other' and a fromString(String) method; then you could have:

ValueEnum enumval = ValueEnum.fromString(myString);
switch (enumval) {
   case MILK: lap(); break;
   case WATER: sip(); break;
   case BEER: quaff(); break;
   case OTHER: 
   default: dance(); break;
}

This technique also lets you decide on issues such as case insensitivity, aliases, etc., instead of depending on a language designer to come up with the "one size fits all" solution.
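One possible shape for such an enum is sketched below. The fromString method is not part of any library, and the choices it makes (trimming, upper-casing with Locale.ROOT, the "ALE" alias) are illustrative assumptions, not the author's implementation.

```java
import java.util.Locale;

enum ValueEnum {
    MILK, WATER, BEER, OTHER;

    // Hypothetical helper: never throws; unknown or null input maps to OTHER.
    static ValueEnum fromString(String text) {
        if (text == null) {
            return OTHER;
        }
        // Normalize so that "milk", " Milk " and "MILK" all match.
        String key = text.trim().toUpperCase(Locale.ROOT);
        if (key.equals("ALE")) { // example alias, purely illustrative
            return BEER;
        }
        try {
            return valueOf(key);
        } catch (IllegalArgumentException e) {
            // valueOf rejects names that are not constants of this enum.
            return OTHER;
        }
    }
}
```

Because the fallback is handled inside fromString, the switch in the calling code never has to deal with an exception for unexpected input.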

Agree with JeeBee: if you are switching on strings, you probably need an enum. The string usually represents something going to an interface (user or otherwise) that may or may not change in the future, so it is better to replace it with an enum.

See xefer.com/2006/12/switchonstring for a nice write-up of this method.

@DavidSchmitt The write-up has one major flaw. It catches all exceptions instead of only the ones that are actually thrown by the method.

Patrick M

The following is a complete example based on JeeBee's post, using Java enums instead of a custom method.

Note that in Java SE 7 and later you can use a String object in the switch statement's expression instead.

public class Main {

    /**
    * @param args the command line arguments
    */
    public static void main(String[] args) {

      String current = args[0];
      Days currentDay = Days.valueOf(current.toUpperCase());

      switch (currentDay) {
          case MONDAY:
          case TUESDAY:
          case WEDNESDAY:
              System.out.println("boring");
              break;
          case THURSDAY:
              System.out.println("getting better");
              // no break here: execution falls through and also prints "much better"
          case FRIDAY:
          case SATURDAY:
          case SUNDAY:
              System.out.println("much better");
              break;

      }
  }

  public enum Days {

    MONDAY,
    TUESDAY,
    WEDNESDAY,
    THURSDAY,
    FRIDAY,
    SATURDAY,
    SUNDAY
  }
}

James Curran

Switches based on integers can be optimized to very efficient code. Switches based on other data types can only be compiled to a series of if() statements.

For that reason C & C++ only allow switches on integer types, since it was pointless with other types.

The designers of C# decided that the style was important, even if there was no advantage.

The designers of Java apparently thought like the designers of C.

Switches based on any hashable object may be implemented very efficiently using a hash table – see .NET. So your reason isn't completely correct.

Yeah, and this is the thing I don't understand. Are they afraid hashing objects will, in the long run, become too expensive?

@Nalandial: actually, with a little effort on the part of the compiler, it's not expensive at all because when the set of strings is known, it's pretty easy to generate a perfect hash (this isn't done by .NET, though; probably not worth the effort, either).

@Nalandial & @Konrad Rudolph - While hashing a String (due to its immutable nature) seems like a solution to this problem, you have to remember that all non-final Objects can have their hashing functions overridden. This makes it difficult to ensure consistency in a switch at compile time.

You can also construct a DFA to match the string (like regular expression engines do). Possibly even more efficient than hashing.
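As a rough illustration of the state-machine idea, the sketch below builds a trie, which is a simple form of DFA, over a fixed set of labels and walks one state per input character. TrieMatcher and its methods are hypothetical names. The transitions are stored in hash maps for brevity, so this shows the structure only and says nothing about real performance.

```java
import java.util.HashMap;
import java.util.Map;

final class TrieMatcher {

    private static final class Node {
        final Map<Character, Node> next = new HashMap<>();
        int caseIndex = -1; // -1 means "no label ends in this state"
    }

    private final Node root = new Node();

    // Registers a label and the case index it should resolve to.
    void add(String label, int caseIndex) {
        Node node = root;
        for (int i = 0; i < label.length(); i++) {
            node = node.next.computeIfAbsent(label.charAt(i), c -> new Node());
        }
        node.caseIndex = caseIndex;
    }

    // Follows one transition per character; returns the case index, or -1 if nothing matches.
    int match(String input) {
        Node node = root;
        for (int i = 0; i < input.length(); i++) {
            node = node.next.get(input.charAt(i));
            if (node == null) {
                return -1;
            }
        }
        return node.caseIndex;
    }
}
```

The returned index could then feed an ordinary integer switch, in the same way the hash-based approach does.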

spongebob

An example of direct String usage since 1.7 may be shown as well:

public static void main(String[] args) {

    switch (args[0]) {
        case "Monday":
        case "Tuesday":
        case "Wednesday":
            System.out.println("boring");
            break;
        case "Thursday":
            System.out.println("getting better");
            // no break here: execution falls through and also prints "much better"
        case "Friday":
        case "Saturday":
        case "Sunday":
            System.out.println("much better");
            break;
    }

}

DJClayworth

James Curran succinctly says: "Switches based on integers can be optimized to very efficient code. Switches based on other data types can only be compiled to a series of if() statements. For that reason C & C++ only allow switches on integer types, since it was pointless with other types."

My opinion, and it's only that, is that as soon as you start switching on non-primitives you need to start thinking about "equals" versus "==". Firstly, comparing two strings can be a fairly lengthy procedure, adding to the performance problems that are mentioned above. Secondly, if there is switching on strings, there will be demand for switching on strings ignoring case, switching on strings considering or ignoring locale, switching on strings based on regex, and so on. I would approve of a decision that saved a lot of time for the language developers at the cost of a small amount of time for programmers.

Technically, regexes already "switch", as they are basically just state machines; they merely have only two "cases", matched and not matched. (Not taking into account things like [named] groups/etc., though.)

docs.oracle.com/javase/7/docs/technotes/guides/language/… states: The Java compiler generates generally more efficient bytecode from switch statements that use String objects than from chained if-then-else statements.

PhiLho

Besides the good arguments above, I will add that a lot of people today see switch as an obsolete remnant of Java's procedural past (going back to C).

I don't fully share this opinion. I think switch can be useful in some cases, at least because of its speed, and anyway it is better than the series of cascading numerical else ifs I have seen in some code...

But indeed, it is worth looking at each case where you need a switch and seeing whether it can be replaced by something more OO: for example enums in Java 1.5+, perhaps a hash table or some other collection (sometimes I regret we don't have (anonymous) functions as first-class citizens, as in Lua, which doesn't have switch, or JavaScript), or even polymorphism.
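As one example of the "more OO" direction, an enum can carry the behavior itself, so callers need no switch at all. The Drink enum and its action() method below are invented for illustration and borrow the drink names from JeeBee's answer.

```java
enum Drink {
    MILK  { @Override String action() { return "lap"; } },
    WATER { @Override String action() { return "sip"; } },
    BEER  { @Override String action() { return "quaff"; } };

    // Each constant supplies its own behavior; dispatch happens through polymorphism.
    abstract String action();
}
```

A call such as Drink.valueOf("BEER").action() then selects the behavior directly. Adding a new constant forces you to implement action() for it, which a switch with a default branch would not.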

"sometimes I regret we don't have (anonymous) functions as first-class citizens" That's no longer true.

@dorukayhan Yes, of course. But do you want to add a comment at all answers from the last ten years to tell the world we can have them if we update to newer versions of Java? :-D

hyper-neutrino

If you are not using JDK 7 or higher, you can use hashCode() to simulate it. String.hashCode() always returns equal values for equal strings and usually returns different values for different strings, so it is fairly reliable. It is not collision-free, though: as @Lii mentioned in a comment, different strings such as "FB" and "Ea" can produce the same hash code. See the documentation.

So, the code would look like this:

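String s = "<Your String>";

// Case labels must be compile-time constants, and a method call such as
// "Hello".hashCode() is not one, so the hash values are written out as literals.
switch(s.hashCode()) {
case 69609650: break;   // "Hello".hashCode()
case 1871675633: break; // "Goodbye".hashCode()
}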

That way, you are technically switching on an int.

Alternatively, you could use the following code:

import java.util.HashMap;
import java.util.Map;

public final class Switch<T> {
    private final Map<T, Runnable> cases = new HashMap<T, Runnable>();

    public void addCase(T object, Runnable action) {
        this.cases.put(object, action);
    }

    public void SWITCH(T object) {
        // The map lookup uses hashCode() and equals(), so the class works with any object.
        Runnable action = this.cases.get(object);
        if (action != null) {
            action.run();
        }
    }
}
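The same map-based idea also works without a wrapper class, and on Java 8+ lambdas keep it short. The sketch below is a hypothetical Dispatcher that adds a default action, the equivalent of a default: branch. The StringBuilder only records which branch ran so the behavior is easy to observe.

```java
import java.util.HashMap;
import java.util.Map;

final class Dispatcher {
    private final Map<String, Runnable> cases = new HashMap<>();
    private final StringBuilder history = new StringBuilder();

    Dispatcher() {
        cases.put("Hello", () -> history.append("greeting;"));
        cases.put("Goodbye", () -> history.append("farewell;"));
    }

    // Runs the action registered for key, or the default action if there is none.
    void dispatch(String key) {
        cases.getOrDefault(key, () -> history.append("unknown;")).run();
    }

    String history() {
        return history.toString();
    }
}
```

Unlike a switch, the set of cases here can be changed at runtime, at the cost of losing compile-time checking of the labels.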

Two different strings can have the same hash code, so if you switch on hash codes the wrong case branch might be taken.

@Lii Thanks for pointing this out! It's unlikely, though, but I wouldn't trust it working. "FB" and "Ea" have the same hashcode, so it's not impossible to find a collision. The second code is probably more reliable.
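The collision is easy to reproduce. In the sketch below (hypothetical class and method names), the unguarded version trusts the hash code alone and takes the "FB" branch for the input "Ea", while the guarded version confirms the match with equals() first.

```java
final class HashCollisionDemo {

    // Unguarded: trusts the hash code alone.
    static String unguarded(String s) {
        switch (s.hashCode()) {
            case 2236: return "FB";   // 2236 is "FB".hashCode(), but also "Ea".hashCode()
            default:   return "other";
        }
    }

    // Guarded: confirms the match with equals() before taking the branch.
    static String guarded(String s) {
        switch (s.hashCode()) {
            case 2236: return s.equals("FB") ? "FB" : "other";
            default:   return "other";
        }
    }
}
```

The extra equals() call is the same safeguard the Java 7 compiler adds when it translates a String switch.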

I am surprised this compiles, as case statements had to, I thought, always be constant values, and String.hashCode() is not such (even if in practice calculation has never changed between JVMs).

@StaxMan Hm, interesting, I never stopped to check that. You are right: case labels have to be compile-time constants, so the hash values must be written out as int literals rather than as hashCode() calls.

Charles Goodwin

For years we've been using a(n open source) preprocessor for this.

//#switch(target)
case "foo": code;
//#end

Preprocessed files are named Foo.jpp and get processed into Foo.java with an ant script.

Advantage is it is processed into Java that runs on 1.0 (although typically we only supported back to 1.4). Also it was far easier to do this (lots of string switches) compared to fudging it with enums or other workarounds - code was a lot easier to read, maintain, and understand. IIRC (can't provide statistics or technical reasoning at this point) it was also faster than the natural Java equivalents.

Disadvantages are you aren't editing Java so it's a bit more workflow (edit, process, compile/test) plus an IDE will link back to the Java which is a little convoluted (the switch becomes a series of if/else logic steps) and the switch case order is not maintained.

I wouldn't recommend it for 1.7+ but it's useful if you want to program Java that targets earlier JVMs (since Joe public rarely has the latest installed).

You can get it from SVN or browse the code online. You'll need EBuild to build it as-is.

You don't need the 1.7 JVM to run code with a String switch. The 1.7 compiler turns the String switch into something that uses previously existing byte code.

plugwash

Other answers have said this was added in Java 7 and given workarounds for earlier versions. This answer tries to answer the "why".

Java was a reaction to the over-complexities of C++. It was designed to be a simple clean language.

String got a little bit of special case handling in the language but it seems clear to me that the designers were trying to keep the amount of special casing and syntactic sugar to a minimum.

Switching on strings is fairly complex under the hood, since strings are not simple primitive types. It was not a common feature at the time Java was designed and doesn't really fit in well with the minimalist design. Especially as they had decided not to special-case == for strings, it would be (and is) a bit strange for case to work where == doesn't.

Between 1.0 and 1.4 the language itself stayed pretty much the same. Most of the enhancements to Java were on the library side.

That all changed with Java 5, when the language was substantially extended. Further extensions followed in versions 7 and 8. I expect that this change of attitude was driven by the rise of C#.

Narrative about switch(String) fits with history, timeline, context cpp/cs.

It was a big mistake not to implement this feature; everything else is a cheap excuse. Java lost many users over the years because of the lack of progress and the designers' stubborn refusal to evolve the language. Fortunately, they completely changed direction and attitude after JDK 7.

dreamcrash

The technicalities were nicely explained in this answer. I just wanted to add that with Java 12 switch expressions (a preview feature in Java 12 and 13, standard since Java 14) you can do it with the following syntax:

String translation(String cat_language) {
    return switch (cat_language) {
        case "miau miau" -> "I am to run";
        case "miauuuh" -> "I am to sleep";
        case "mi...au?" ->  "leave me alone";
        default ->  "eat";
    };
}

Iskuskov Alexander

JEP 354: Switch Expressions (Preview) in JDK 13 and JEP 361: Switch Expressions (Standard) in JDK 14 extend the switch statement so it can be used as an expression.

Now you can:

directly assign variable from switch expression,

use new form of switch label (case L ->): The code to the right of a "case L ->" switch label is restricted to be an expression, a block, or (for convenience) a throw statement.

use multiple constants per case, separated by commas,

and there is no more break with a value: to yield a value from a switch expression, the break with value statement is dropped in favor of a yield statement.

So the demo from the answers (1, 2) might look like this:

  public static void main(String[] args) {
    switch (args[0]) {
      case "Monday", "Tuesday", "Wednesday" -> System.out.println("boring");
      case "Thursday" -> System.out.println("getting better");
      case "Friday", "Saturday", "Sunday" -> System.out.println("much better");
    }
  }
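For completeness, here is a small sketch that uses the switch as an expression, with comma-separated constants and a yield inside a block. It assumes Java 14 or later, and the DayMood class and mood method are invented for the example.

```java
final class DayMood {

    // The switch is an expression here, so its result can be returned directly.
    static String mood(String day) {
        return switch (day) {
            case "Monday", "Tuesday", "Wednesday" -> "boring";
            case "Thursday" -> "getting better";
            case "Friday", "Saturday", "Sunday" -> {
                String base = "much better";
                yield base + "!"; // yield hands a value back from a block
            }
            default -> "unknown"; // a switch expression over String needs a default
        };
    }
}
```

With the arrow form there is no fall-through, so "Thursday" produces only "getting better".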

Imtiaz Shakil Siddique

In Java 11+ it's possible with variables too. The only condition is that the variable must be a constant.

For Example:

final String LEFT = "left";
final String RIGHT = "right";
final String UP = "up";
final String DOWN = "down";

String var = ...;

switch (var) {
    case LEFT:
    case RIGHT:
    case DOWN:
    default:
        return 0;
}

PS. I've not tried this with earlier JDKs, so please update the answer if it's supported there too.

info: labels must be "constant expressions" since version 7: JLS 14.11

Conete Cristian

Not very pretty, but here is another way for Java 6 and below:

String runFct = 
        queryType.equals("eq") ? "method1":
        queryType.equals("L_L")? "method2":
        queryType.equals("L_R")? "method3":
        queryType.equals("L_LR")? "method4":
            "method5";
// Requires: import java.lang.reflect.Method; both calls throw checked exceptions.
Method m = this.getClass().getMethod(runFct);
m.invoke(this);
