Cp-ansible: Kafka Connect Module Output

Created on 15 Sep 2020  ·  12 Comments  ·  Source: confluentinc/cp-ansible

This is a great feature that makes life much easier, but its output is unfortunately often not very helpful. When deploying n connectors you get a 400 response regardless of whether the JDBC password in one connector is wrong or the format of another connector is unsuitable. At least, that is my impression so far. I haven't had time to look into the code in more detail, but maybe someone can answer this: is this due to the Connector API, or could it be improved in the module?

bug help wanted

All 12 comments

@Fobhep Thanks for the question. Where do you get the 400 error? Do you mean after Connect restarts and the health check runs?

If so, the health check simply checks to see if we can query the list of Connectors from the Connect API. So if it ends in 400, this means that Connect has failed to start for some reason.
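As a rough illustration of that kind of check (not the actual cp-ansible task), here is a minimal Python sketch that treats Connect as healthy when `GET /connectors` answers with HTTP 200. The function name `connect_is_healthy` and the injectable `opener` parameter are hypothetical and exist only to make the sketch easy to try out.

```python
import urllib.error
import urllib.request


def connect_is_healthy(base_url, opener=urllib.request.urlopen):
    """Return True if GET {base_url}/connectors answers with HTTP 200.

    `opener` is a hypothetical hook so the check can be exercised
    without a running Connect worker.
    """
    try:
        with opener(f"{base_url}/connectors", timeout=10) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers HTTP errors such as 400 as well as refused connections.
        return False
```

Against a real worker this would be called as `connect_is_healthy("http://localhost:8083")`, assuming the REST listener is on the default port 8083.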

Can you confirm where you are receiving the 400 error?

Thanks

The error happens when running the kafka-connector deployment task, and Ansible returns either
"Request timed out" or "Bad Request".

After digging in the logs I then managed to find exceptions indicating that, e.g., the password for one connector was wrong.

@Fobhep Are you able to share which connectors you tried deploying and which one has the misconfiguration? We want to reproduce this in house.

We think it may be an issue in the Python library: if creating a new connector fails, it doesn't return the error code from the API, whereas if an update to an existing connector fails, it does.

@JumaX In that particular customer scenario it was JDBC connectors only.

Another thing I noticed only now:

Sometimes I get a

"HTTP Error: 409 Conflict", but the module itself is saying "changed: true".

I am aware that the REST API may return 409 when POSTing while a rebalance is in progress.
But shouldn't the module still fail if the POST was not carried out?
Or does 409 mean the POST was carried out, but a rebalance was going on at the same time?

Anything new here? This REST API for adding connectors seems to have a mind of its own. I just added a set of 6 JDBC Oracle connectors to it (3 source, 3 sink).
The first time I got a 400 Bad Request, and nothing was configured... ok.
I retried with the exact same config. Now 1 of 6 is deployed, and I still got a 400 Bad Request....

This was added as a contribution from the community. I've spoken with the author, and he is making it a priority to review this during the week.

@Fobhep @JumaX Resuming work on this issue now, sorry for the late reply. I'll rewrite the error management so that we get an explicit message/result for each connector.
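To make "an explicit message/result for each connector" concrete, here is a minimal Python sketch, not the code from the module. It assumes the Connect REST API returns a JSON error body with a `message` field on failure. The helper names `describe_connect_error` and `create_connector` are hypothetical.

```python
import json
import urllib.error
import urllib.request


def describe_connect_error(status_code, body):
    """Turn a Connect REST error response into a one-line message."""
    try:
        payload = json.loads(body)
        message = payload.get("message", body)
    except (ValueError, AttributeError):
        # Body was not JSON, or not a JSON object: report it verbatim.
        message = body
    return f"HTTP {status_code}: {message}"


def create_connector(base_url, name, config):
    """POST one connector; return (ok, message) instead of raising."""
    request = urllib.request.Request(
        f"{base_url}/connectors",
        data=json.dumps({"name": name, "config": config}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return True, f"HTTP {response.status}"
    except urllib.error.HTTPError as err:
        body = err.read().decode("utf-8", "replace")
        return False, describe_connect_error(err.code, body)
```

Calling `create_connector` once per connector and collecting the `(ok, message)` pairs would give a per-connector result rather than a single 400 for the whole batch.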

I'll also see if there's a way to wait for a rebalance to finish. The 409 is indeed the response we get when there's a rebalance, which is why initially I did not treat it as an error, but it's true it masks an error if there's one, which is unfortunate.
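One simple way to handle the rebalance case, sketched here as an assumption rather than what the PR does, is to retry while the API answers 409 and to report the final status instead of treating 409 as success. The function `post_with_rebalance_retry` and its parameters are hypothetical; `send` stands for whatever performs the actual POST and returns the HTTP status code.

```python
import time


def post_with_rebalance_retry(send, attempts=5, delay_seconds=2.0, sleep=time.sleep):
    """Call send() until it returns a status other than 409, or attempts run out.

    `send` is any zero-argument callable that returns an HTTP status code.
    The last status seen is returned, so a final 409 is reported, not hidden.
    """
    status = None
    for attempt in range(attempts):
        status = send()
        if status != 409:
            return status
        if attempt < attempts - 1:
            # Give the rebalance some time to finish before retrying.
            sleep(delay_seconds)
    return status
```

With this shape the caller can mark the task as changed only on a success status and fail it if the result is still 409 after the last attempt.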

Quick update: I have completely rewritten the error management and I have added a status check on the tasks of connectors, which means that if a connector fails to initialize, it will be detected and returned as an error. Preparing a PR now.
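A status check of this kind could look roughly like the following Python sketch. It is not the code from the PR. It assumes the JSON returned by `GET /connectors/{name}/status` has been parsed into a dict containing a `connector` object and a `tasks` list, each with a `state` and, on failure, a `trace`. The function name `find_failures` is hypothetical.

```python
def find_failures(status):
    """Return a list of (component, trace) pairs for everything in FAILED state.

    `status` is the parsed JSON from GET /connectors/{name}/status.
    """
    failures = []
    connector = status.get("connector", {})
    if connector.get("state") == "FAILED":
        failures.append(("connector", connector.get("trace", "")))
    for task in status.get("tasks", []):
        if task.get("state") == "FAILED":
            failures.append((f"task {task.get('id')}", task.get("trace", "")))
    return failures
```

An empty list means no failed component was found in the status payload; anything else can be surfaced as the error message for that connector.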

@ldom any status updates about this?

@jamuska PR is there but has not been merged yet (https://github.com/confluentinc/cp-ansible/pull/490). I guess Justin is waiting for the molecule tests. I'll work on them this week.

@ldom @jamuska Correct, we are waiting on the molecule tests. Let me know if I can be of assistance @ldom.
