Welcome to the Solr MongoDB Importer project. This project provides MongoDB support for the Solr Data Import Handler.
- Retrieve data from a MongoDB collection
- Authenticate using MongoDB authentication
- Map Mongo fields to Solr fields with the mapMongoFields option (to access nested fields, use "." (dot) as the path separator, e.g. Params.Size)
- Date conversion of field value to required format
- Delta import available
- Upgrade Solr to 7.3
- Upgrade Mongo Driver to 3.6.4
- Change to Maven and create fat jar
MongoDataSource - Provides a MongoDB datasource
- database (required) - The name of the database you want to connect to
- host (optional - default: localhost) - for a replica set, add comma-separated values
- port (optional - default: 27017) - for different ports in a replica set, add comma-separated values
- username (optional)
- password (optional)
- mapMongoFields (optional - default: true)
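As an illustration of the options above, a replica-set connection with authentication might be declared roughly as follows. This is only a sketch: the host names, user name, and password are hypothetical placeholders, and listing one port per host is an assumption based on the comma-separated format described above.

```xml
<!-- Hypothetical hosts and credentials; replace them with your own values. -->
<!-- Assumption: the n-th port in the list belongs to the n-th host. -->
<dataSource name="MongoSource"
            type="MongoDataSource"
            database="test"
            host="mongo1.example.com,mongo2.example.com,mongo3.example.com"
            port="27017,27018,27019"
            username="solrImporter"
            password="changeMe"/>
```

For a single local instance, the database attribute alone should be enough, since host and port fall back to localhost and 27017.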
MongoEntityProcessor - Use with the MongoDataSource to query a MongoDB collection
- collection (required)
- query (required)
- deltaQuery (optional)
- deltaImportQuery (optional)
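A minimal entity that uses only the required attributes could look like the sketch below. It assumes that the query attribute accepts a MongoDB filter written in the same single-quoted JSON style as the deltaQuery in the sample data-config.xml of this README, and that a data source named MongoSource has already been declared; the filter itself is just an illustration.

```xml
<!-- Sketch: selects documents from the "products" collection whose "deleted" flag is "false". -->
<!-- Assumption: the filter syntax mirrors the deltaQuery style used in the sample configuration. -->
<entity name="product"
        processor="MongoEntityProcessor"
        datasource="MongoSource"
        collection="products"
        query="{'deleted':'false'}">
  <field column="_id" name="id"/>
</entity>
```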
MongoMapperTransformer - Map MongoDB fields to your Solr schema
- mongoField (required)
- dateFormat (optional)
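The two attributes above might be used as in the following fragment. It is a sketch that assumes the enclosing entity declares transformer="MongoMapperTransformer"; the Solr field names are hypothetical, and the date pattern is the one used in the sample data-config.xml of this README.

```xml
<!-- Maps the nested Mongo value Params.Size (dot-separated path) to a Solr field named "size". -->
<field column="size" name="size" mongoField="Params.Size"/>
<!-- Applies a date pattern to a "created" value that is not stored as a date object. -->
<field column="created" name="created" mongoField="created" dateFormat="yyyy-MM-dd HH:mm:ss"/>
```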
Build your own Jar using Maven pom.xml
You will also need to place the required libraries (the jars matched by the lib directives below) inside the Solr core/collection1/lib folder.
Add lib directives to your solrconfig.xml
    <lib dir="./lib/" regex="solr-mongo-importer.*\.jar"/>
    <lib dir="./lib/" regex="mongo-java-driver.*\.jar"/>
Add the field configuration below to schema.xml
    <field name="name" type="string" indexed="true" stored="true"/>
    <field name="size" type="int" indexed="true" stored="true"/>
    <field name="created" type="date" indexed="true" stored="true"/>
Declare the data-config file in solrconfig.xml by adding the code below
    <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
      <lst name="defaults">
        <str name="config">data-config.xml</str>
      </lst>
    </requestHandler>
Add the documents below to the Mongo collection
    use test;
    db.products.update({"name":"Prod1"},{$set: { "attrib":{"size":1}, "deleted":"false"}, $currentDate: {lastmodified: true, created: true}}, {upsert: true, multi:true});
    db.products.update({"name":"Prod2"},{$set: { "attrib":{"size":2}, "deleted":"false"}, $currentDate: {lastmodified: true, created: true}}, {upsert: true, multi:true});
Run the update below ONLY to test the delete functionality (it marks Prod1 as deleted)
    use test;
    db.products.update({"name":"Prod1"},{$set: {"deleted":"true"}, $currentDate: {lastmodified: true}}, {upsert: true, multi:true});
Create a data-config.xml file in the path collection1\conf\ (which by default holds solrconfig.xml and schema.xml)
Here is a sample data-config.xml showing the use of all components
    <?xml version="1.0" encoding="UTF-8" ?>
    <dataConfig>
      <dataSource name="MongoSource" type="MongoDataSource" database="test"/>
      <document name="products">
        <entity name="product"
                processor="MongoEntityProcessor"
                query='{$where: "${dataimporter.request.clean} != false || this.lastmodified > ISODate(\"${dataimporter.last_index_time}\")"}'
                collection="products"
                datasource="MongoSource"
                deltaQuery="{'UpdateDate':{$gt:{$date:'${dih.last_index_time}'}}}"
                deltaImportQuery="{'_id':'${dih.delta._id}'}"
                transformer="MongoMapperTransformer"
                mapMongoFields="true">
          <!-- If the mongoField name and the field declared in schema.xml are the same, there is no need to declare the mapping below.
               If they differ, map the mongoField to the field in schema.xml (e.g. mongoField="EmpNumber" to name="EmployeeNumber"). -->
          <field column="_id" name="id"/> <!-- docId has long type -->
          <field column="name" name="name" mongoField="name"/>
          <field column="size" name="size" mongoField="attrib.size"/>
          <field column="created" name="created" mongoField="created" dateFormat="yyyy-MM-dd HH:mm:ss"/>
          <field column="$skipDoc" mongoField="deleted"/>
          <field column="$deleteDocById" mongoField="_id"/>
        </entity>
      </document>
    </dataConfig>
- Date transformation is only needed for non-date objects

## Usage
To run a full import (deletes all data in the index and does a fresh full import):
https://localhost:8983/solr/collection1/dataimport?command=full-import&clean=true&indent=true&wt=json
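To run a delta import (imports only the data modified according to the query, and deletes data based on $deleteDocById and $skipDoc in data-config.xml), call full-import with clean=false; the $where clause in the sample query then selects only documents whose lastmodified is later than the last index time: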
https://localhost:8983/solr/collection1/dataimport?command=full-import&clean=false&indent=true&wt=json