基于Docker的hbase-phoenix-all-in-one试用问题总结

一、docker hub上搜索phoenix镜像

安装命令:

docker run --detach -p 2181:2181 -p 8765:8765 -p 15165:15165 -p 16000:16000 -p 16010:16010 -p 16020:16020 --name th_phoenix boostport/hbase-phoenix-all-in-one:latest

进入容器bash:

docker exec -it phoenix bash

执行sqlline:

bash-4.4# export HBASE_CONF_DIR=/opt/hbase/conf/
bash-4.4# /opt/phoenix-server/bin/sqlline.py localhost

二、存储笔记

经过验证,Phoenix的table,默认仅建立一个命名为0的Family Column,但是在创建表的时候可以将列分散到多个Family Column中。

三、性能测试

官网FAQ中讲述,在中等集群中,可以做到100M(一亿)条记录全表扫描仅需要用20S。

插入测试:但是博主在Windows10上使用Docker镜像测试,对于以下测试表,采用SSD硬盘单线程插入速度大概在每秒200个点左右,10线程的话,总计每秒接近700条记录左右。采用普通机械硬盘,10线程插入数量为500条记录。

count(*)测试:200万条记录时,count(*)需要用50-60秒;

select 单条:速度在0.3秒左右;

select avg()等1万条:速度在1秒左右;

测试表结构:

CREATE TABLE wx_program (id BIGINT PRIMARY KEY, status VARCHAR,liaowen DOUBLE,luwen1 DOUBLE,luwen2 DOUBLE,liaowei DOUBLE,xSudu DOUBLE,D1 DOUBLE,zSudu DOUBLE,G1 DOUBLE,kaomuT DOUBLE,kaoliaoT DOUBLE,kaomuTem DOUBLE,kaoliaoTem DOUBLE,lengqueT DOUBLE,chuiguanSudu DOUBLE,jinliaoT DOUBLE,chuiguanQiya DOUBLE,pLength DOUBLE,danzhongJunzhi DOUBLE,danzhongBiaozhuncha DOUBLE,mohouJunzhi DOUBLE,mohouBiaozhuncha DOUBLE,danzhong DOUBLE,mohou DOUBLE,file VARBINARY, name VARCHAR, ax DOUBLE ARRAY)

insert测试代码:

import phoenixdb
import phoenixdb.cursor
import time
import numpy
import random


database_url = 'http://localhost:8765/'
conn = phoenixdb.connect(database_url, autocommit=True)
# CREATE TABLE wx_program (id BIGINT PRIMARY KEY, status VARCHAR,liaowen DOUBLE,luwen1 DOUBLE,luwen2 DOUBLE,liaowei DOUBLE,xSudu DOUBLE,D1 DOUBLE,zSudu DOUBLE,G1 DOUBLE,kaomuT DOUBLE,kaoliaoT DOUBLE,kaomuTem DOUBLE,kaoliaoTem DOUBLE,lengqueT DOUBLE,chuiguanSudu DOUBLE,jinliaoT DOUBLE,chuiguanQiya DOUBLE,pLength DOUBLE,danzhongJunzhi DOUBLE,danzhongBiaozhuncha DOUBLE,mohouJunzhi DOUBLE,mohouBiaozhuncha DOUBLE,danzhong DOUBLE,mohou DOUBLE,file VARBINARY, name VARCHAR, ax DOUBLE ARRAY)
cursor = conn.cursor()
# cursor.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username VARCHAR)")
# cursor.execute("UPSERT INTO users VALUES (?, ?)", (1, 'admin'))
# cursor.execute("SELECT * FROM users")
# print(cursor.fetchall())

# cursor = conn.cursor(cursor_factory=phoenixdb.cursor.DictCursor)
# cursor.execute("SELECT * FROM users WHERE id=1")
# print (cursor.fetchone()['USERNAME'])

i = 0
while True:
    id  = int(time.time()*1000*1000)
    tag = ["left","right","up","down"]

    cursor.execute("UPSERT INTO wx_program (id, status, liaowen, luwen1, luwen2, liaowei, xSudu, D1, zSudu, G1, kaomuT, kaoliaoT, kaomuTem, kaoliaoTem, lengqueT, chuiguanSudu, jinliaoT, chuiguanQiya, pLength, danzhongJunzhi, danzhongBiaozhuncha, mohouJunzhi, mohouBiaozhuncha, danzhong, mohou, name) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    (
         id, 
         tag[random.randint(0, len(tag)-1)],
         round(numpy.random.normal(10, 0.1), 5),
         round(numpy.random.normal(20, 0.1), 5),
         round(numpy.random.normal(30, 0.1), 5),
         round(numpy.random.normal(40, 0.1), 5),
         round(numpy.random.normal(50, 0.1), 5),
         round(numpy.random.normal(60, 0.1), 5),
         round(numpy.random.normal(70, 0.1), 5),
         round(numpy.random.normal(80, 0.1), 5),
         round(numpy.random.normal(90, 0.1), 5),
         round(numpy.random.normal(100, 0.1), 5),
         round(numpy.random.normal(110, 0.1), 5),
         round(numpy.random.normal(120, 0.1), 5),
         round(numpy.random.normal(130, 0.1), 5),
         round(numpy.random.normal(140, 0.1), 5),
         round(numpy.random.normal(150, 0.1), 5),
         round(numpy.random.normal(160, 0.1), 5),
         round(numpy.random.normal(170, 0.1), 5),
         round(numpy.random.normal(180, 0.1), 5),
         round(numpy.random.normal(190, 0.1), 5),
         round(numpy.random.normal(200, 0.1), 5),
         round(numpy.random.normal(210, 0.1), 5),
         round(numpy.random.normal(220, 0.1), 5),
         round(numpy.random.normal(230, 0.1), 5),
         "danny"
         ))
    i += 1
    if i%1000==0:
        print("i=" +str(i)+ " id:" + str(id))

 

三、已知缺点

  1. sdk 语言仅支持Java和Python两种;
  2. Python版本目前不支持任何Array类型,但可以使用ARRAY_TO_STRING函数暂时代替,例如select id,ARRAY_TO_STRING(val,’,’)as val from ph_test_tb12;

参考链接:

  • https://python-phoenixdb.readthedocs.io/en/latest/
  • https://github.com/Boostport/hbase-phoenix-all-in-one/issues
分享到:更多 ()