PROTOCOL_CONNECTION_LOST while streaming #1892

Open
TheRoSS opened this issue Dec 6, 2017 · 8 comments

@TheRoSS

TheRoSS commented Dec 6, 2017

The issue is the same as #566.
It was marked as closed, but the problem is still here.
I am using node 0.12.18 with mysqljs 2.15.0.

We have a database with ~1.5 million rows in someTable and the following code with a long-running async result handler:

// dbConf and done() are defined elsewhere
var mysql = require("mysql");

var db = mysql.createConnection(dbConf);
var q = db.query("SELECT * FROM someTable");
var count = 0;

q.on("end", done);
q.on("error", function (err) { console.error(err.stack || err); });

q.on("result", function (row) {
    console.log("row:", ++count, row.id);
    db.pause();

    // simulate a long async result handler: resume after 1 second
    setTimeout(function () {
        db.resume();
    }, 1000);
});

The script fails after about 15 minutes with the error PROTOCOL_CONNECTION_LOST, while processing roughly the 900th row.

The cause of this error can be revealed with tcpdump. Monitor the traffic between your script and the database server in real time: the server finishes transferring the data in about 5 minutes, and the db.pause()/db.resume() calls do not affect that. The server's response is buffered by the driver and waits to be processed by the "long async result handler". Meanwhile the server itself sits idle and, after its inactivity timeout (10 minutes), closes the connection.
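
For reference, a tcpdump command along these lines is enough to watch that traffic (assuming the MySQL server listens on the default port 3306):

tcpdump -i any -nn 'tcp port 3306'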

@dougwilson
Member

Thanks for the report @TheRoSS! If the fix I made for that previous issue didn't work for you, then I'm really out of ideas on how to fix this. If you can provide some information on how to solve it, or a pull request with an implementation, that may be the only way forward.

@TheRoSS
Author

TheRoSS commented Dec 6, 2017

The hack with

setInterval(function () {
    db.ping();    // or db.query("SELECT 1")
}, 10000);

does not work, because all queries are enqueued and will only be executed AFTER the streaming query is done.

I see a solution: intercept the TCP stream "end" event from the database server and "close" the query, so that setInterval hacks like this can work.

TheRoSS closed this as completed Dec 6, 2017
TheRoSS reopened this Dec 6, 2017
@dougwilson
Member

Hi @TheRoSS, I'm not quite sure I understand what you're describing. From my understanding of your description, if you react to the "end" event, that is Node.js signaling that the TCP socket has already closed, so there wouldn't be a way to rescue the connection any longer. Perhaps you want to code up a solution that works for you? Other than that, I can take a stab at something, I just need detailed steps to reproduce the issue, perhaps just some DDL and data to load that would cause your above code to error out in the same way.

@TheRoSS
Author

TheRoSS commented Dec 7, 2017

No. I mean:

  1. Catch the end of the data stream from the database (or detect by some other means that all data from the server has been received)
  2. Unlock the query queue for a special command, db.ping (for user queries the queue stays locked until the streamed query has finished)
  3. Start an unref'd setInterval that calls db.ping at a rather large interval (10 seconds or 1 minute, for example)
  4. When handling of the streamed query finishes (successfully or with an error), clearInterval

Sadly, I do not know the internals of your driver. I need some time to investigate the code and create a patch. For now, only you can say whether my idea is applicable or not. A rough sketch of how steps 1-4 could look from the application side follows.
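
A minimal sketch of that flow, assuming a hypothetical "allRowsBuffered" event for step 1 (no such event exists in the driver today) and assuming step 2 lets db.ping() bypass the query queue:

var keepAlive = null;

// step 1: "allRowsBuffered" is a hypothetical event standing in for
// "the driver has received all data from the server"
q.on("allRowsBuffered", function () {
    // steps 2-3: periodically ping so the server sees activity;
    // unref() so the timer does not keep the process alive on its own
    keepAlive = setInterval(function () {
        db.ping(function (err) {
            if (err) console.error(err.stack || err);
        });
    }, 10000);
    keepAlive.unref();
});

// step 4: stop pinging once the streamed query is finished
function stopKeepAlive() {
    if (keepAlive) clearInterval(keepAlive);
}
q.on("end", stopKeepAlive);
q.on("error", stopKeepAlive);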

@dougwilson
Member

Yea, still not 100% sure I understand. I can take a stab at something, I just need detailed steps to reproduce the issue, perhaps just some DDL and data to load that would cause your above code to error out in the same way.

@TheRoSS
Author

TheRoSS commented Dec 7, 2017

To reproduce the issue you need to:

  1. Use any database table with ~1000 rows
  2. Adjust the inactivity timeout to a reasonable value. By default the network timeout is 10 minutes (on my server at least), but you can set the wait_timeout MySQL parameter in the server's my.cnf to a smaller value so you wait less (a per-session alternative is sketched after this comment)
  3. Run the script from my first message
  4. (optional) Run tcpdump to monitor the network traffic between the script and the MySQL server

From the tcpdump monitoring you will see that the server completes the data transmission within the first few seconds. But we have a 1 second async handler for every row, so the script is expected to run for 1000+ seconds. If your inactivity timeout is 10 minutes, the script will fail while handling roughly the 600th item.
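
For step 2, a minimal sketch of shrinking the timeout per session instead of editing my.cnf (assuming the server allows SET SESSION wait_timeout, which is the default):

var mysql = require("mysql");
var db = mysql.createConnection(dbConf);

// shrink the server-side inactivity timeout for this connection only,
// so the failure shows up after ~30 rows instead of ~600
db.query("SET SESSION wait_timeout = 30", function (err) {
    if (err) throw err;
    // ...now run the streaming query from the first message on `db`
});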

@TheRoSS
Author

TheRoSS commented Dec 7, 2017

I looked at the code...
The problem is not simple.
To solve it, the driver would have to catch the end of the data at the moment the server sends it. But the driver reads and buffers the byte stream from the server, not packets. If I call pause(), the driver stops parsing the byte stream, and the EOF packet sits deep inside the buffered byte stream.

To work correctly with streams, the driver should buffer packets rather than bytes. The bytes should be parsed and assembled into packets as soon as they are received from the server, and a pause() call should pause packet handling, not the byte stream. Then the EOF packet can be caught in time.
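
A conceptual sketch of that idea (not the actual mysqljs internals; parsePackets, onPacket and onEof are placeholders): bytes are parsed into packets as soon as they arrive, pause() only stops packet delivery, and the EOF packet is therefore seen immediately:

function PacketBuffer(socket, parsePackets, onPacket, onEof) {
    var queue = [];
    var paused = false;

    socket.on("data", function (chunk) {
        // keep reading and parsing even while paused
        parsePackets(chunk).forEach(function (packet) {
            if (packet.isEof) onEof();   // seen the moment the server sends it
            queue.push(packet);
        });
        drain();
    });

    function drain() {
        while (!paused && queue.length) {
            onPacket(queue.shift());
        }
    }

    return {
        pause:  function () { paused = true; },
        resume: function () { paused = false; drain(); }
    };
}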

@TheRoSS
Author

TheRoSS commented Dec 7, 2017

As a workaround for async handling of rows in streaming mode, for those who need to work with large amounts of data, I can propose splitting the process into two phases: first read the data in streaming mode and handle it synchronously to build a set of changes, then apply those changes asynchronously.
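
A minimal sketch of that two-phase shape (processRow and applyChanges are placeholders for your own logic):

var changes = [];
var q = db.query("SELECT * FROM someTable");

q.on("error", function (err) { console.error(err.stack || err); });

// phase 1: stream the rows and do only cheap synchronous work,
// so the connection never sits idle long enough to be closed
q.on("result", function (row) {
    changes.push(processRow(row));
});

// phase 2: the slow async work happens after the streaming is done
q.on("end", function () {
    applyChanges(changes, function (err) {
        if (err) console.error(err.stack || err);
        db.end();
    });
});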
