
Safe integer parsing in Ruby

I have a string, say '123', and I want to convert it to the integer 123.

I know you can simply do some_string.to_i, but that converts 'lolipops' to 0, which is not the effect I have in mind. I want it to blow up in my face when I try to convert something invalid, with a nice and painful Exception. Otherwise, I can't distinguish between a valid 0 and something that just isn't a number at all.

EDIT: I was looking for the standard way of doing it, without regex trickery.


Community

Ruby has this functionality built in:

Integer('1001')                                    # => 1001  
Integer('1001 nights')  
# ArgumentError: invalid value for Integer: "1001 nights"  

As noted in the answer by Joseph Pecoraro, you might want to watch for strings that are valid non-decimal numbers, such as those starting with 0x for hex and 0b for binary, and the potentially trickier case of numbers starting with zero, which are parsed as octal.

Ruby 1.9.2 added an optional second argument for the radix, so the issue above can be avoided:

Integer('23')                                     # => 23
Integer('0x23')                                   # => 35
Integer('023')                                    # => 19
Integer('0x23', 10)
# => #<ArgumentError: invalid value for Integer: "0x23">
Integer('023', 10)                                # => 23
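
As a minimal sketch of how the radix argument might be wrapped for reuse (the helper name parse_decimal is hypothetical, not part of Ruby): it forces base 10 and lets ArgumentError propagate. Two things worth knowing, assuming a reasonably recent Ruby: Integer still accepts surrounding whitespace and underscores between digits, and the exception: false keyword (Ruby 2.6 and later) returns nil instead of raising.

```ruby
# Hypothetical helper: parse a base-10 integer or raise ArgumentError.
def parse_decimal(str)
  Integer(str, 10)
end

parse_decimal('123')     # => 123
parse_decimal('023')     # => 23   (leading zero is not treated as octal)
parse_decimal('-42')     # => -42
parse_decimal(' 42 ')    # => 42   (surrounding whitespace is accepted)
parse_decimal('1_000')   # => 1000 (underscores between digits are accepted)

begin
  parse_decimal('0x23')
rescue ArgumentError => e
  e.class                # => ArgumentError
end

# Ruby 2.6+: return nil instead of raising.
Integer('lolipops', 10, exception: false)   # => nil
```

If whitespace or underscores should also be rejected, a pattern check before the conversion is still needed.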

Purfideas

This might work:

i.to_i if i.match(/^\d+$/)

PSA: in Ruby, ^ and $ match at the start and end of each line, not of the whole string, which differs subtly from most other regexp flavors. You probably mean to use \A and \Z instead.
To be pedantic, the anchors suggested by @pje may still be incorrect, depending on the desired behavior. Consider using \z in place of \Z, because the documentation describes the capitalized Z anchor as: "Matches end of string. If string ends with a newline, it matches just before newline" -- ruby-doc.org/core-2.1.1/Regexp.html
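
To make the anchor difference concrete, here is a small sketch (the sample strings are just illustrations) comparing ^/$, \A/\Z and \A/\z on input that contains a newline:

```ruby
# ^ and $ match at line boundaries, so one all-digit line is enough.
"123\nabc" =~ /^\d+$/      # => 0   (matches, although the string is not a number)
"123\nabc" =~ /\A\d+\z/    # => nil

# \Z tolerates one trailing newline; \z does not.
"123\n" =~ /\A\d+\Z/       # => 0
"123\n" =~ /\A\d+\z/       # => nil
"123"   =~ /\A\d+\z/       # => 0
```

Whether a trailing newline should count as valid depends on where the string comes from; input read with gets, for example, usually ends in one.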
Joseph Pecoraro

Also be aware of the effects that the currently accepted solution may have on parsing hex, octal, and binary numbers:

>> Integer('0x15')
# => 21  
>> Integer('0b10')
# => 2  
>> Integer('077')
# => 63

In Ruby, numbers that start with 0x or 0X are hex, 0b or 0B are binary, and just 0 are octal. If this is not the desired behavior, you may want to combine Integer with one of the other solutions that first check whether the string matches a pattern, such as the /\d+/ regular expressions.
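
A minimal sketch of that combination, assuming Ruby 2.4 or later for String#match? (the name strict_decimal and the error message are made up for illustration): check the pattern first, then convert with an explicit base.

```ruby
# Hypothetical helper: accept only an optional minus sign followed by
# ASCII decimal digits, then convert with an explicit base of 10.
def strict_decimal(str)
  unless str.is_a?(String) && str.match?(/\A-?[0-9]+\z/)
    raise ArgumentError, "invalid decimal integer: #{str.inspect}"
  end
  Integer(str, 10)
end

strict_decimal('077')   # => 77
strict_decimal('-15')   # => -15

# Each of these raises ArgumentError, and the message is printed.
%w[0x15 0b10 1_000 12abc].each do |bad|
  begin
    strict_decimal(bad)
  rescue ArgumentError => e
    puts e.message
  end
end
```

The pattern rejects prefixes, underscores and whitespace before Integer ever sees the string, so the base argument only serves to keep a leading zero from being read as octal.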


That's what I'd expect from the conversion though
In Ruby 1.9, you can pass the base as a second argument.
ian

Another unexpected behavior with the accepted solution (in Ruby 1.8; 1.9 is fine):

>> Integer(:foobar)
=> 26017
>> Integer(:yikes)
=> 26025

So if you're not sure what is being passed in, make sure you call .to_s on it first.


Tested in Ruby 1.9: Integer(:foobar) raises TypeError (can't convert Symbol into Integer).
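
Since Ruby 1.9 and later raise TypeError for a symbol (or nil) but ArgumentError for a malformed string, code that handles arbitrary input may want a single error class. One possible sketch, with a hypothetical helper name strict_integer:

```ruby
# Hypothetical helper: convert with Integer(), reporting unsupported
# types (TypeError) as ArgumentError too, so callers rescue one class.
def strict_integer(value)
  Integer(value)
rescue TypeError => e
  raise ArgumentError, e.message
end

strict_integer('42')   # => 42

[:foobar, nil, 'yikes'].each do |bad|
  begin
    strict_integer(bad)
  rescue ArgumentError => e
    puts "#{bad.inspect}: #{e.class}"
  end
end
```

Because no base is passed here, prefixed strings such as '0x1A' are still accepted.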
ian

I like Myron's answer, but it suffers from the Ruby disease of "I no longer use Java/C# so I'm never going to use inheritance again". Opening any class can be fraught with danger and should be done sparingly, especially when the class is part of Ruby's core library. I'm not saying never do it, but it's usually easy to avoid, and there are better options available, e.g.

class IntegerInString < String

  def initialize( s )
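    # \A and \z anchor the whole string; ^ and $ would accept any matching line of a multi-line string.
    fail ArgumentError, "The string '#{s}' is not an integer in a string, it's just a string." unless s =~ /\A-?[0-9]+\z/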
    super
  end
end

Then when you wish to use a string that could be a number it's clear what you're doing and you don't clobber any core class, e.g.

n = IntegerInString.new "2"
n.to_i
# => 2

IntegerInString.new "blob"
ArgumentError: The string 'blob' is not an integer in a string, it's just a string.

You can add all sorts of other checks in the initialize, like checking for binary numbers etc. The main thing though, is that Ruby is for people and being for people means clarity. Naming an object via its variable name and its class name makes things much clearer.


Anonymous

I had to deal with this in my last project, and my implementation was similar, but a bit different:

require 'test/unit' # needed by the test case at the bottom

class NotAnIntError < StandardError
end

class String
  def is_int?    
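    # \A and \z anchor the whole string; ^ and $ would accept any matching line of a multi-line string.
    self =~ /\A-?[0-9]+\z/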
  end

  def safe_to_i
    return self.to_i if is_int?
    raise NotAnIntError, "The string '#{self}' is not a valid integer.", caller
  end
end

class Integer
  def safe_to_i
    return self
  end            
end

class StringExtensions < Test::Unit::TestCase

  def test_is_int
    assert "98234".is_int?
    assert "-2342".is_int?
    assert "02342".is_int?
    assert !"+342".is_int?
    assert !"3-42".is_int?
    assert !"342.234".is_int?
    assert !"a342".is_int?
    assert !"342a".is_int?
  end

  def test_safe_to_i
    assert 234234 == 234234.safe_to_i
    assert 237 == "237".safe_to_i
    begin
      "a word".safe_to_i
      fail 'safe_to_i did not raise the expected error.'
    rescue NotAnIntError 
      # this is what we expect..
    end
  end

end

dusan
someString = "asdfasd123"
number = someString.to_i
if someString != number.to_s
  puts "oops, this isn't a number"
end

Probably not the cleanest way to do it, but it should work.
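
One caveat with the round-trip comparison, sketched below with a hypothetical helper round_trips?: it also rejects strings that most people would consider valid numbers but that are not in canonical form.

```ruby
# Hypothetical helper wrapping the round-trip comparison above.
def round_trips?(str)
  str == str.to_i.to_s
end

round_trips?('123')          # => true
round_trips?('-5')           # => true
round_trips?('asdfasd123')   # => false
round_trips?('007')          # => false ("007".to_i.to_s is "7")
round_trips?('+5')           # => false
round_trips?(' 42')          # => false
```

Whether that strictness is a bug or a feature depends on the input you expect.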


Community

Re: Chris's answer

Your implementation lets things like "1a" or "b2" through. How about this instead:

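def safeParse2(strToParse)
  # \A and \Z anchor the whole string; note that \Z still allows one trailing newline.
  if strToParse =~ /\A\d+\Z/
    strToParse.to_i
  else
    # ArgumentError is more specific than Exception, which should rarely be raised or rescued directly.
    raise ArgumentError, "#{strToParse.inspect} is not a valid integer"
  end
end

["100", "1a", "b2", "t"].each do |number|
  begin
    puts safeParse2(number)
  rescue ArgumentError
    puts "#{number} is invalid"
  end
end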

This outputs:

100
1a is invalid
b2 is invalid
t is invalid

To be pedantic, the anchors suggested by @pje and used here may be incorrect, depending on the desired behavior. Consider using \z in place of \Z, because the documentation describes the capitalized Z anchor as: "Matches end of string. If string ends with a newline, it matches just before newline" -- ruby-doc.org/core-2.1.1/Regexp.html