Some plain, straightforward technical posts: Importing MySQL into elasticsearch with elasticsearch-river-jdbc

Monday, July 7, 2014

GitHub repo: elasticsearch-river-jdbc

(Wherever this post says ES, it stands for elasticsearch.)

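Why use elasticsearch-river-jdbc?
I wanted to find out how to get MySQL data into elasticsearch. Elasticsearch has no built-in way to read from MySQL, so something outside the core has to pull the rows and trigger reindexing (rebuilding the index). elasticsearch-river-jdbc is one of the options for doing that.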

Elasticsearch River
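In elasticsearch, a river is a data source, and also the mechanism for syncing data from another system into ES. It is usually packaged as a plugin: a service that runs inside an ES node, reads data from the source, and indexes it into ES. At the time of writing, the officially provided rivers include couchdb, rabbitmq, twitter, and wikipedia.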

Download and installation

./bin/plugin --install jdbc --url http://xbib.org/repository/org/xbib/elasticsearch/plugin/elasticsearch-river-jdbc/1.2.1.1/elasticsearch-river-jdbc-1.2.1.1-plugin.zip

Example: creating a river
This sample imports the posts table of my MySQL database phalconblog into the elasticsearch index phalconblog, under the type name posts.

curl -XPUT 'localhost:9200/_river/my_jdbc_river/_meta' -d '{
  "type": "jdbc",
  "jdbc": {
    "strategy": "simple",
    "driver": "com.mysql.jdbc.Driver",
    "url": "jdbc:mysql://localhost:3306/phalconblog",
    "user": "root",
    "password": "",
    "sql": "select * from posts",
    "index": "phalconblog",
    "type": "posts"
  }
}'

Creating a river means sending a JSON document. For what each parameter does, see the JDBC River parameters documentation.

River created successfully
If the river is created successfully, a JSON response comes back:

{
  "_index": "_river",
  "_type": "my_jdbc_river",
  "_id": "_meta",
  "_version": 1,
  "created": true
}

You can then visit http://localhost:9200/phalconblog/posts/_search?q=* to look at the returned data. If everything worked, the rows of the MySQL posts table should now be there.

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "phalconblog",
        "_type": "posts",
        "_id": "tovXtRGsRXOF5pR0cUTxBA",
        "_score": 1,
        "_source": {
          "id": 1,
          "title": "你好嗎",
          "body": "c9s",
          "excerpt": "2014/06/12",
          "published": "2014-06-13T00:00:00.000Z",
          "updated": "2014-07-13T00:00:00.000Z",
          "pinged": "2",
          "to_ping": "2"
        }
      }
    ]
  }
}
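
Note that the _id in the response above is a generated string, not the row's id. As far as I understand the plugin's README, a column labeled _id is used as the document id, so aliasing the primary key should let repeated runs overwrite the same documents instead of adding duplicates. Below is a minimal sketch of that idea. The function name river_definition is made up for illustration, and it assumes posts has a primary-key column named id (as the sample document suggests).

```shell
#!/bin/sh
# Print a river definition whose SQL aliases the primary key to _id.
# Assumption: the posts table has a primary-key column named "id".
# river_definition is a hypothetical helper name, not part of the plugin.
river_definition() {
  cat <<'EOF'
{
  "type": "jdbc",
  "jdbc": {
    "strategy": "simple",
    "driver": "com.mysql.jdbc.Driver",
    "url": "jdbc:mysql://localhost:3306/phalconblog",
    "user": "root",
    "password": "",
    "sql": "select *, id as _id from posts",
    "index": "phalconblog",
    "type": "posts"
  }
}
EOF
}

# To register it (needs a running ES node, so it is left commented out):
# river_definition | curl -XPUT 'localhost:9200/_river/my_jdbc_river/_meta' -d @-
```

Only the "sql" value differs from the definition used earlier; everything else is the same request.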

When an error occurs

java.sql.SQLException: No suitable driver found for jdbc:mysql: .....

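Yep, this is exactly the error I ran into.
It happens because the JDBC river does not bundle drivers from the various database vendors, so you have to download the MySQL driver jar yourself and put it in elasticsearch's lib folder.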

Download mysql-connector-java, extract it, put the mysql-connector-java-X.X.XX-bin.jar file into the lib folder of your elasticsearch installation, and restart elasticsearch.
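
The copy step above can be sketched as a small shell function. This is only an illustration: install_driver is a made-up name, and both arguments are placeholders for paths on your own machine. It checks that the jar and the lib folder exist before copying, so a wrong path fails loudly instead of silently doing nothing.

```shell
#!/bin/sh
# Copy a MySQL Connector/J jar into the lib folder of an elasticsearch install.
# install_driver is a hypothetical helper; both arguments are placeholders.
install_driver() {
  jar="$1"      # path to the extracted mysql-connector-java-X.X.XX-bin.jar
  es_home="$2"  # elasticsearch directory that contains bin/ and lib/
  if [ ! -f "$jar" ]; then
    echo "driver jar not found: $jar" >&2
    return 1
  fi
  if [ ! -d "$es_home/lib" ]; then
    echo "no lib folder under: $es_home" >&2
    return 1
  fi
  cp "$jar" "$es_home/lib/" && echo "copied $(basename "$jar") to $es_home/lib"
}
```

Call it with the jar path first and the elasticsearch directory second, then restart elasticsearch so the driver is picked up.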

Common commands

List all rivers

curl -XGET 'http://localhost:9200/_river/_search?q=*'

Delete a river

curl -XDELETE 'localhost:9200/_river/my_jdbc_river'

Delete an index
For example, to delete an index named jdbc:

curl -XDELETE 'http://localhost:9200/jdbc/'
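
When I want to redo an import from scratch, the two delete commands above go together. Here is a rough sketch that wraps them. reset_import, ES_URL and DRY_RUN are names I made up for this example, not anything ES provides. It deletes the river first so nothing keeps writing, then the index. Deleting the index throws away its data, so try it with DRY_RUN set first.

```shell
#!/bin/sh
# Remove a river and the index it filled, so the import can be recreated.
# ES_URL, DRY_RUN and reset_import are illustrative names, not part of ES.
ES_URL="${ES_URL:-http://localhost:9200}"

reset_import() {
  river="$1"  # river name, e.g. my_jdbc_river
  index="$2"  # index name, e.g. phalconblog
  # River first (stops further indexing), then the index itself.
  for target in "_river/$river" "$index"; do
    if [ -n "$DRY_RUN" ]; then
      # Only print the command that would be sent.
      echo "curl -XDELETE '$ES_URL/$target'"
    else
      curl -XDELETE "$ES_URL/$target"
      echo
    fi
  done
}
```

For example, with DRY_RUN=1 set, reset_import my_jdbc_river phalconblog prints the two curl commands without contacting ES.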
