Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DRILL-7903: Update mongo driver from 3.12 to 4.2 #2201

Merged
merged 2 commits into from
Apr 28, 2021

Conversation

vvysotskyi
Copy link
Member

DRILL-7903: Update mongo driver from 3.12 to 4.2

Description

  • Replaced old mongo-java-driver with the latest version of mongodb-driver-sync.
  • Replaced de.flapdoodle.embed.mongo usage with testcontainers to avoid dependencies on old mongo-java-driver.
  • Removed dependency on mongo from the java-exec module (and moved Bson reader).

Documentation

Yes.

Testing

All unit tests pass (now unit tests use docker containers with real mongo).

Copy link
Member

@luocooong luocooong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vvysotskyi Nice work. Thanks for your PR.

@@ -64,9 +64,9 @@
<scope>test</scope>
</dependency>
<dependency>
<groupId>de.flapdoodle.embed</groupId>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The de.flapdoodle.embed is a good embedded mongo framework. It is a pity that it cannot support our test now. Can you explain that why to using the based on docker framework to instead this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

de.flapdoodle.embed uses mongo-java-driver and is incompatible with mongodb-driver-sync, so we have to move out to another solution for testing.

return list1.size() - list2.size();
}
};
private static final Comparator<List<MongoSubScanSpec>> LIST_SIZE_COMPARATOR = Comparator.comparingInt(List::size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More simple (with Java Lambda), but not easy to read and debug. (Not a problem)

@@ -530,8 +503,7 @@ public ScanStats getScanStats() {
if (recordCount != 0) {
//toJson should use client's codec, otherwise toJson could fail on
// some types not known to DocumentCodec, e.g. DBRef.
final DocumentCodec codec =
new DocumentCodec(client.getMongoClientOptions().getCodecRegistry(), new BsonTypeClassMap());
DocumentCodec codec = new DocumentCodec(db.getCodecRegistry(), new BsonTypeClassMap());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change due to a version requirement (driver 4.2) or exist an issue before?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, API was changed, there is no method for MongoClientOptions anymore.

MongoDatabase database = storagePlugin.getClient().getDatabase(CONFIG);
//Identify the primary shard of the queried database.
MongoCollection<Document> collection = database.getCollection(DATABASES);
Bson filter = new Document(ID, this.scanSpec.getDbName());
Bson projection = new Document(PRIMARY, select);
Document document = collection.find(filter).projection(projection).first();
Preconditions.checkNotNull(document);
Document document = Objects.requireNonNull(collection.find(filter).projection(projection).first());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An user had reported a problem on our Slack channel (some days ago):

I am trying to query a Sharded cluster of MongoDB using embedded mode and distributed mode. But both is giving an error of:
query: select * from mongo.db.collection limit 1;
warning UserRemoteException : SYSTEM ERROR: ClassCastException: org.bson.types.ObjectId cannot be cast to java.lang.String
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: ClassCastException: org.bson.types.ObjectId cannot be cast to java.lang.String
Please, refer to logs for more information.
[Error Id: 43733103-4cac-4aef-a09f-275df4b4ee4b on ubuntu:31010]
store.mongo.all_text_mode : true
store.mongo.bson.record.reader: false
store.mongo.read_numbers_as_double: true
It throws an error only when collection is sharded, on non-sharded collection, it is working fine.

Could you please check that the failed not an issues? thanks

Drill 1.18
MongoDB 4.4 Sharded (Not a Replica Sets)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reproduced it locally and fixed it.

.version(Version.Main.V3_4)
.net(new Net(LOCALHOST, MONGOS_PORT, Network.localhostIsIPv6()))
.cmdOptions(cmdOptions).build();
try {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a new practice to apply the same opeation (In my production environment):

.....
rs.initiate(cfg);
while(db.isMaster().ismaster == false) { }
db = db.getSiblingDB('admin');
db.createUser({user:"admin", pwd:"0000", roles:["root"]});
db.createUser({user:"users", pwd:"0000", roles:["userAdminAnyDatabase"]});
.....

mongo --quiet init.js

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didn't understand what you proposed and how to use it...
This code initializes replica set as it is recommended in official mongo instructions: https://docs.mongodb.com/manual/tutorial/deploy-replica-set/
And after that, waits until it is initialized.

Code part related to creating users defined in the createMongoUser() method and it is quite flexible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a problem. It's only a new practice to initial cluster that using the script to execute once instead of calling the execInContainer many times.

Copy link
Member Author

@vvysotskyi vvysotskyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@luocooong, thanks for the review, I have addressed your comments, could you please look again?

@@ -64,9 +64,9 @@
<scope>test</scope>
</dependency>
<dependency>
<groupId>de.flapdoodle.embed</groupId>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

de.flapdoodle.embed uses mongo-java-driver and is incompatible with mongodb-driver-sync, so we have to move out to another solution for testing.

@@ -530,8 +503,7 @@ public ScanStats getScanStats() {
if (recordCount != 0) {
//toJson should use client's codec, otherwise toJson could fail on
// some types not known to DocumentCodec, e.g. DBRef.
final DocumentCodec codec =
new DocumentCodec(client.getMongoClientOptions().getCodecRegistry(), new BsonTypeClassMap());
DocumentCodec codec = new DocumentCodec(db.getCodecRegistry(), new BsonTypeClassMap());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, API was changed, there is no method for MongoClientOptions anymore.

.version(Version.Main.V3_4)
.net(new Net(LOCALHOST, MONGOS_PORT, Network.localhostIsIPv6()))
.cmdOptions(cmdOptions).build();
try {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didn't understand what you proposed and how to use it...
This code initializes replica set as it is recommended in official mongo instructions: https://docs.mongodb.com/manual/tutorial/deploy-replica-set/
And after that, waits until it is initialized.

Code part related to creating users defined in the createMongoUser() method and it is quite flexible.

MongoDatabase database = storagePlugin.getClient().getDatabase(CONFIG);
//Identify the primary shard of the queried database.
MongoCollection<Document> collection = database.getCollection(DATABASES);
Bson filter = new Document(ID, this.scanSpec.getDbName());
Bson projection = new Document(PRIMARY, select);
Document document = collection.find(filter).projection(projection).first();
Preconditions.checkNotNull(document);
Document document = Objects.requireNonNull(collection.find(filter).projection(projection).first());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reproduced it locally and fixed it.

Copy link
Member

@luocooong luocooong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vvysotskyi +1
Thanks for the revision.

@luocooong luocooong merged commit ed0c304 into apache:master Apr 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants