Importing and Indexing a .csv File in Solr
1) Download the latest version of Apache Solr from the following location: http://lucene.apache.org/solr/downloads.html (the examples below use solr-6.0.0).
2) Once the Solr zip file is downloaded, unzip it into the "C:\Program Files" folder. The extracted folder contains subfolders such as bin, contrib, dist, docs, example, licenses and server.
3) Open a command prompt in administrative mode and go to Solr's bin directory to start Solr.
Ex:- C:\Program Files\solr-6.0.0\bin > solr start
To stop Solr:
Ex:- C:\Program Files\solr-6.0.0\bin > solr stop -all
4) Now it's time to create a core, which is used to import, index, and search data. Use the following command.
Ex:- C:\Program Files\solr-6.0.0\bin > solr create -c jcg -d basic_configs
5) The command window shows the following output.
Creating new core 'jcg' using command:
http://localhost:8983/solr/admin/cores?action=CREATE&name=jcg&instanceDir=jcg
{ "responseHeader":{ "status":0, "QTime":663}, "core":"jcg"}
6) Open the Solr Admin UI in a browser:
http://localhost:8983/solr
7) The new core (jcg) appears in the core selector. Select it to see the statistics of the core.
8) Creating the core also creates two folders inside the core folder, named "conf" and "data". The "conf" folder contains important files such as solrconfig.xml and managed-schema.
9) solrconfig.xml => Must provide an update request handler (not the Data Import Handler) or another handler that accepts CSV, so that the .csv file can be indexed. Add it if it is not already there. (Mandatory)
Ex:- <requestHandler name="/update" class="solr.UpdateRequestHandler"/>
10) managed-schema => Add a "field" for each column of the .csv file. (Mandatory)
{ Note:- Each field name must match a column name in the .csv file, and its type (int, string, etc.) must be compatible with the values in that column. }
Ex:- <uniqueKey>id</uniqueKey>
<!-- Fields added for books.csv load-->
<field name="cat" type="text_general" indexed="true" stored="true"/>
{ Note :- If there is no "id" column in the .csv file, delete <uniqueKey>id</uniqueKey> and <field name="id" type="string"........../> from the managed-schema file. }
11) Copy the .csv file into the example\exampledocs folder.
12) Stop Solr and start it again so that the configuration changes take effect.
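13) Execute the following command to index the .csv file, replacing CoreAdminname with the name of your core (jcg in this example) and filename.csv with the name of your file.
Ex:- C:\Program Files\solr-6.0.0\example\exampledocs > java -Dtype=text/csv -Dc=CoreAdminname -jar post.jar filename.csv
14) Solr indexes the .csv file. Now go to the Query panel of the core in the Admin UI and search.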
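The schema check from step 10 and the upload from step 13 can also be sketched in Python instead of post.jar. This is only an illustration under assumptions, not part of the original procedure: it assumes Solr is reachable at http://localhost:8983/solr and that the core's /update handler accepts a request body sent with Content-Type text/csv (the same type passed to post.jar). The helper names (csv_columns, schema_fields, missing_fields, update_url, post_csv) are hypothetical. The check only looks at explicit <field> entries, so a column covered by a dynamic field would still be reported as missing.

```python
import csv
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumed default location of a local Solr instance.
DEFAULT_BASE = "http://localhost:8983/solr"


def csv_columns(csv_path):
    """Return the column names from the first row of the CSV file."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle))


def schema_fields(schema_path):
    """Return the names of all explicit <field> elements in the schema file."""
    root = ET.parse(schema_path).getroot()
    return {field.get("name") for field in root.iter("field")}


def missing_fields(csv_path, schema_path):
    """List CSV columns that have no matching <field> entry in the schema."""
    declared = schema_fields(schema_path)
    return [name for name in csv_columns(csv_path) if name not in declared]


def update_url(core, base=DEFAULT_BASE):
    """Build the /update URL for a core, asking Solr to commit after loading."""
    return f"{base}/{urllib.parse.quote(core)}/update?commit=true"


def post_csv(csv_path, core, base=DEFAULT_BASE):
    """POST the CSV file to the core's /update handler and return the response text."""
    with open(csv_path, "rb") as handle:
        body = handle.read()
    request = urllib.request.Request(
        update_url(core, base),
        data=body,
        headers={"Content-Type": "text/csv"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

For example, missing_fields("books.csv", schema_path), with schema_path pointing at the schema file in the core's conf folder, lists the columns that still need a <field> entry. Once that list is empty and Solr is running, post_csv("books.csv", "jcg") uploads the file.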