ElasticSearch in Ravello in 30 minutes


Create an application in Ravello from a clone:

I have two machines: k1 for ElasticSearch (no direct SSH access) and saltok, which I use as an SSH jump host with a private key.


Installers & instructions:

ElasticSearch: https://www.elastic.co/downloads/elasticsearch

cerebro: https://github.com/lmenezes/cerebro

fscrawler: https://github.com/dadoonet/fscrawler

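Configure and start ElasticSearch:

Edit elasticsearch-6.0.0/config/elasticsearch.yml (for example with vi) and add:

network.host: 0.0.0.0
transport.host: localhost
transport.tcp.port: 9300

Keeping transport.host on localhost lets the node answer HTTP requests from other machines while still running in development mode.

Then install X-Pack and start the node:

elasticsearch-6.0.0/bin/elasticsearch-plugin install x-pack
elasticsearch-6.0.0/bin/elasticsearch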
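
The three elasticsearch.yml settings can also be applied from a script instead of an interactive editor. The following is only a minimal sketch: `set_es_option` is a hypothetical helper (not part of ElasticSearch), and the `ES_CONFIG` path assumes the archive was unpacked in the current directory. It appends a `key: value` line only when the key is not already set, so re-running it does not duplicate entries.

```shell
# Hypothetical helper, not part of ElasticSearch: append "key: value" to a
# YAML file unless a line starting with "key:" is already present.
set_es_option() {
  local file="$1" key="$2" value="$3"
  # index() is a literal (non-regex) match, so the dots in keys such as
  # network.host are not treated as wildcards. Commented-out defaults
  # ("#network.host: ...") do not count as already set.
  if [ -f "$file" ] && awk -v k="$key:" 'index($0, k) == 1 { found = 1 } END { exit !found }' "$file"; then
    return 0
  fi
  printf '%s: %s\n' "$key" "$value" >> "$file"
}

# Assumed location of the config file, relative to the current directory.
ES_CONFIG="${ES_CONFIG:-elasticsearch-6.0.0/config/elasticsearch.yml}"

# Usage (run from the directory that contains elasticsearch-6.0.0):
#   set_es_option "$ES_CONFIG" network.host 0.0.0.0
#   set_es_option "$ES_CONFIG" transport.host localhost
#   set_es_option "$ES_CONFIG" transport.tcp.port 9300
```

The helper never rewrites an existing value; if a key is already set to something else, it still has to be changed by hand.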

Crawl a website into a directory:

wget --no-clobber --convert-links --random-wait -r -p --level 10 -E -e robots=off -U mozilla https://javiermugueta.wordpress.com

Configure fscrawler:

fscrawler-2.4/bin/fscrawler javi --loop 1 --rest --username elastic --upgrade

Edit the job's config file (/home/oracle/.fscrawler/javi/_settings.json) and set the directory to index:

{
  "name" : "javi",
  "fs" : {
    "url" : "/oracle/javi",
...
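
The same edit can be scripted. This is a rough sketch, not an fscrawler feature: `set_fs_url` is a hypothetical helper, and it assumes that `fs.url` is the only `"url"` key in the settings file and that the target path contains no `|`, `&` or backslash characters.

```shell
# Hypothetical helper: point the "url" entry of an fscrawler _settings.json
# at the directory to index. Assumes "url" appears only once in the file.
set_fs_url() {
  local settings="$1" dir="$2" tmp
  tmp="$(mktemp)" || return 1
  # Replace only the quoted value; '|' is the sed delimiter so that the
  # path may contain '/'.
  if sed "s|\"url\"[[:space:]]*:[[:space:]]*\"[^\"]*\"|\"url\" : \"$dir\"|" "$settings" > "$tmp"; then
    # Write back through the original file to keep its permissions.
    cat "$tmp" > "$settings"
  fi
  rm -f "$tmp"
}

# Usage:
#   set_fs_url /home/oracle/.fscrawler/javi/_settings.json /oracle/javi
```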

Launch fscrawler:

fscrawler-2.4/bin/fscrawler javi --loop 1 --username elastic

Launch cerebro:

cerebro-0.7.1/bin/cerebro

Connect to the cerebro UI:

http://k1-cotscfbp21sep20172-yji6xufr.srv.ravcloud.com:9000


Make a query:

javi/_search?q=Almost every cloud should have its (i)PaaS
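
The query text contains spaces and parentheses, so when it is sent from a shell rather than typed into cerebro it needs to be URL-encoded. A minimal sketch under a few assumptions: `urlencode` and `search_url` are hypothetical helper names, the query is ASCII-only, and ElasticSearch is reachable at http://localhost:9200 (override with the `ES_URL` variable).

```shell
# Percent-encode every character that is not unreserved in a URL.
# Sketch for ASCII input only.
urlencode() {
  local s="$1" out="" c rest
  while [ -n "$s" ]; do
    rest="${s#?}"        # everything after the first character
    c="${s%"$rest"}"     # the first character itself
    case "$c" in
      [a-zA-Z0-9.~_-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;
    esac
    s="$rest"
  done
  printf '%s\n' "$out"
}

# Build the _search URL for an index name and a free-text query.
search_url() {
  printf '%s/%s/_search?q=%s\n' "${ES_URL:-http://localhost:9200}" "$1" "$(urlencode "$2")"
}

# Usage (curl prompts for the password of the elastic user):
#   curl -u elastic "$(search_url javi 'Almost every cloud should have its (i)PaaS')"
```

curl can also do the encoding itself with `-G --data-urlencode 'q=...'`; the helper just makes the resulting URL visible.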


In addition, I crawled the whole of www.intratext.com into a directory and indexed it with fscrawler: more than 1.6 million docs indexed!
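
As a rough sanity check, the number of files on disk can be compared with the docs count that cerebro shows for the index. A minimal sketch, where `count_crawled_files` is a hypothetical helper name; the two numbers may not match exactly, since fscrawler can skip or filter some files.

```shell
# Count regular files under a directory, recursively.
# (File names containing newlines would be miscounted.)
count_crawled_files() {
  find "$1" -type f | wc -l | tr -d '[:space:]'
}

# Usage:
#   count_crawled_files /oracle/javi
```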


Enjoy 😉
