
Why does the MongoDB Java driver use a random number generator in a conditional?

I saw the following code in this commit for MongoDB's Java Connection driver, and it appears at first to be a joke of some sort. What does the following code do?

if (!((_ok) ? true : (Math.random() > 0.1))) {
    return res;
}

(EDIT: the code has been updated since posting this question)

Which part of it is confusing you?
I think it's confusing. This code is executed in a catch block!
@MarkoTopolnik: Is it? It could be written much more clearly as if (!ok || Math.random() < 0.1) (or something similar).
github.com/mongodb/mongo-java-driver/commit/… You are not the first; see the comment on that line.
@msangel Those guys seem to be criticising the logic, not the coding style.

Erik Schierboom

After inspecting the history of that line, my main conclusion is that there has been some incompetent programming at work.

1. That line is gratuitously convoluted. The general form a ? true : b for boolean a, b is equivalent to the simple a || b.

2. The surrounding negation and excessive parentheses convolute things further. Keeping De Morgan's laws in mind, it is a trivial observation that this piece of code amounts to

    if (!_ok && Math.random() <= 0.1)
        return res;

3. The commit that originally introduced this logic had

    if (_ok == true) {
        _logger.log( Level.WARNING , "Server seen down: " + _addr, e );
    } else if (Math.random() < 0.1) {
        _logger.log( Level.WARNING , "Server seen down: " + _addr );
    }

   This is another example of incompetent coding, but notice the reversed logic: here the event is logged either when _ok is true or in 10% of the other cases, whereas the code in 2. returns 10% of the time and logs 90% of the time. So the later commit ruined not only clarity, but correctness itself. I think in the code you have posted we can actually see how the author intended to transform the original if-then more or less literally into its negation, as required for the early-return condition. But then he messed up and inserted an effective "double negative" by reversing the inequality sign.

4. Coding style issues aside, stochastic logging is quite a dubious practice all by itself, especially since the log entry does not document its own peculiar behavior. The intention is, obviously, to reduce restatements of the same fact: that the server is currently down. The appropriate solution is to log only changes of the server state, not each observation of it, let alone a random selection of 10% of such observations. Yes, that takes just a little bit more effort, so let's see some.
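The simplification of the posted condition can be checked mechanically. The following is only a sketch: the class and method names are made up for illustration, and the random draw is passed in as a parameter r (instead of calling Math.random() directly) so that both forms can be compared on the same input.

```java
// Illustrative check, not code from the driver.
final class ConditionCheck {
    // The condition exactly as posted in the question, with r standing in for Math.random().
    static boolean original(boolean ok, double r) {
        return !((ok) ? true : (r > 0.1));
    }

    // The simplified form obtained via a ? true : b == a || b and De Morgan's laws.
    static boolean simplified(boolean ok, double r) {
        return !ok && r <= 0.1;
    }

    public static void main(String[] args) {
        double[] samples = {0.0, 0.05, 0.1, 0.1000001, 0.5, 0.999};
        for (boolean ok : new boolean[] {true, false}) {
            for (double r : samples) {
                if (original(ok, r) != simplified(ok, r)) {
                    throw new AssertionError("mismatch at ok=" + ok + ", r=" + r);
                }
            }
        }
        System.out.println("Both forms agree on all sampled inputs.");
    }
}
```

With ok false, the early return fires only for draws at or below 0.1, which is the roughly 10% case; every other draw falls through to the logging code.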

I can only hope that all this evidence of incompetence, accumulated from inspecting just three lines of code, does not speak fairly of the project as a whole, and that this piece of work will be cleaned up ASAP.
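As a rough illustration of logging only state changes, here is a minimal sketch. ServerStateLogger, observe, and the in-memory messages list are hypothetical names introduced for this example (the list stands in for a real logger); none of this is taken from the driver.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: records a message only when the observed server state changes.
final class ServerStateLogger {
    private Boolean lastOk = null;                    // null until the first observation
    final List<String> messages = new ArrayList<>();  // stand-in for a real logger

    // Call this on every observation; only transitions produce a log entry.
    synchronized void observe(String addr, boolean ok) {
        if (lastOk != null && lastOk == ok) {
            return; // same state as last time: nothing new to report
        }
        lastOk = ok;
        messages.add((ok ? "Server seen up: " : "Server seen down: ") + addr);
    }
}
```

With this shape, a server that stays down for a thousand consecutive checks produces a single "down" entry, and the log also shows when it came back.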


Additionally, this appears to be, as far as I can tell, the official 10gen Java driver for MongoDB, so in addition to having an opinion on the Java driver, I think it gives me an opinion on the code of MongoDB.
Excellent analysis of just a few lines of code, I might just turn it into an interview question! Your fourth point is the real key to why there's something fundamentally wrong with this project (the others could be dismissed as unfortunate programmer's bugs).
@ChrisTravers It is the official mongo java driver for mongo.
msangel

https://github.com/mongodb/mongo-java-driver/commit/d51b3648a8e1bf1a7b7886b7ceb343064c9e2225#commitcomment-3315694

11 hours ago by gareth-rees:

Presumably the idea is to log only about 1/10 of the server failures (and so avoid massively spamming the log), without incurring the cost of maintaining a counter or timer. (But surely maintaining a timer would be affordable?)
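For comparison, the timer-based alternative mentioned in that comment could look roughly like the sketch below. LogThrottle and shouldLog are hypothetical names; the current time is passed in as a parameter (in real use it would typically come from System.nanoTime()) so the behavior is deterministic.

```java
// Hypothetical throttle: allows at most one log entry per interval.
final class LogThrottle {
    private final long intervalNanos;
    private long lastLogNanos;
    private boolean loggedBefore = false;

    LogThrottle(long intervalNanos) {
        this.intervalNanos = intervalNanos;
    }

    // Returns true if enough time has passed since the last permitted entry.
    // The first call always returns true.
    synchronized boolean shouldLog(long nowNanos) {
        if (loggedBefore && nowNanos - lastLogNanos < intervalNanos) {
            return false;
        }
        loggedBefore = true;
        lastLogNanos = nowNanos;
        return true;
    }
}
```

The state is one long and one boolean per throttle, so the bookkeeping cost is small next to the cost of the failed network call that triggered the log entry.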


Not to nitpick but: 1/10th of the time it will return res, so it will log the other 9/10 times.
@Supericy That's definitely not nitpicking. That's just yet more evidence of this person's terrible coding practices.
tpdi

Add a class member initialized to negative 1:

  private int logit = -1;

In the try block, make the test:

 if( !ok && (logit = (logit + 1 ) % 10)  == 0 ) { //log error

This always logs the first error, then every tenth subsequent error. Logical operators "short-circuit", so logit only gets incremented on an actual error.

If you want the first and every tenth of all errors, regardless of the connection, make logit a static class member instead of an instance member.

As has been noted, this should be thread-safe:

private synchronized int getLogit() {
   return (logit = (logit + 1 ) % 10);
}

In the try block, make the test:

 if( !ok && getLogit() == 0 ) { //log error

Note: I don't think throwing out 90% of the errors is a good idea.
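A lock-free variant of the same counter is also possible with java.util.concurrent.atomic.AtomicInteger. This is only a sketch of that alternative; the wrapper class EveryTenth is a hypothetical name, and logit and getLogit are carried over from the code above.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper around the counter from the answer.
final class EveryTenth {
    // Starts at -1 so that the first call yields 0.
    private final AtomicInteger logit = new AtomicInteger(-1);

    // Atomically advances the counter modulo 10 and returns the new value:
    // 0 on the first call, then 0 again on every tenth call after that.
    int getLogit() {
        return logit.updateAndGet(n -> (n + 1) % 10);
    }
}
```

The test at the call site stays the same: if (!ok && counter.getLogit() == 0), where counter is an EveryTenth instance.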


Neeme Praks

I have seen this kind of thing before.

There was a piece of code that could answer certain 'questions' that came from another 'black box' piece of code. In the case it could not answer them, it would forward them to another piece of 'black box' code that was really slow.

So sometimes previously unseen new 'questions' would show up, and they would show up in a batch, like 100 of them in a row.

The programmer was happy with how the program was working, but he wanted some way of improving the software in the future in case new answerable questions were discovered.

So the solution was to log unknown questions, but as it turned out, there were thousands of different ones. The logs got too big, and there was no benefit in speeding these up, since they had no obvious answers. But every once in a while, a batch of questions would show up that could be answered.

Since the logs were getting too big, and the logging was getting in the way of logging the really important things, he arrived at this solution:

Only log a random 5%. This cleans up the logs, whilst in the long run still showing what questions/answers could be added.

So, if an unknown event occurred, it would be logged in a random fraction of these cases.

I think this is similar to what you are seeing here.

I did not like this way of working, so I removed this piece of code, and just logged these messages to a different file, so they were all present, but not clobbering the general logfile.
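In Java, routing such messages to their own file could be sketched with java.util.logging as follows. The class name UnknownQuestionLog, the logger name "unknown-questions", and the file path are all assumptions made for this example; the original code base is not shown here and may have used a different logging library.

```java
import java.io.IOException;
import java.util.logging.FileHandler;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

// Hypothetical setup for a logger that writes only to its own file.
final class UnknownQuestionLog {
    static Logger create(String path) throws IOException {
        Logger logger = Logger.getLogger("unknown-questions");
        // Do not forward records to the root logger, so the general log stays clean.
        logger.setUseParentHandlers(false);
        FileHandler handler = new FileHandler(path, true); // true = append to the file
        handler.setFormatter(new SimpleFormatter());
        logger.addHandler(handler);
        return logger;
    }
}
```

Every message is kept, so nothing is lost to sampling, and the dedicated file can be rotated or discarded independently of the main log.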


Except that we're talking about a database driver here... wrong problem space, IMO!
@StevenSchlansker I never said this was a good practice. I removed this piece of code, and just logged these messages to a different file.