Installing and Using Trino, a Big Data Query Engine

Introduction

Trino is a distributed SQL query engine designed to query large data sets spread across one or more heterogeneous data sources.

I. Preparation

Official site: https://trino.io/

  1. Trino 406 tarball (https://repo1.maven.org/maven2/io/trino/trino-server/406/trino-server-406.tar.gz), which can be found via https://trino.io/docs/current/installation/deployment.html
  2. JDK 17
  3. Python 2.6.x, 2.7.x, or 3.x

II. Install JDK 17

cd /usr/local
mkdir java
cd java
wget https://download.oracle.com/java/17/archive/jdk-17.0.6_linux-x64_bin.tar.gz
tar -zxvf jdk-17.0.6_linux-x64_bin.tar.gz
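
Before moving on, it can help to confirm that the java binary on PATH really reports major version 17. The following is a minimal sketch, not part of the original steps; the function names java_major_version and check_java are hypothetical helpers, and the parsing assumes the usual `java -version` output with the version in double quotes.

```shell
# Extract the major version from a Java version string.
# Handles the modern scheme (17.0.6, 17-ea) and the legacy one (1.8.0_362 -> 8).
java_major_version() {
  case "$1" in
    1.*) printf '%s\n' "$1" | cut -d. -f2 ;;
    *)   printf '%s\n' "$1" | cut -d. -f1 | cut -d- -f1 | cut -d+ -f1 ;;
  esac
}

# Check that the java binary found on PATH reports major version 17.
check_java() {
  if ! command -v java >/dev/null 2>&1; then
    echo "java not found on PATH" >&2
    return 1
  fi
  # `java -version` writes to stderr; the version is the first quoted token.
  found="$(java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }')"
  major="$(java_major_version "$found")"
  if [ "$major" = "17" ]; then
    echo "OK: Java $found"
  else
    echo "Unexpected Java version: $found" >&2
    return 1
  fi
}
```

To use it, put /usr/local/java/jdk-17.0.6/bin at the front of PATH in the current shell and then run check_java.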

III. Install Trino

cd /usr/local
wget https://repo1.maven.org/maven2/io/trino/trino-server/406/trino-server-406.tar.gz
tar -zxvf trino-server-406.tar.gz
cd trino-server-406/bin

vim launcher
# Add PATH=/usr/local/java/jdk-17.0.6/bin/:$PATH above the line exec "$(dirname "$0")/launcher.py" "$@"
# That is:
    PATH=/usr/local/java/jdk-17.0.6/bin/:$PATH
    exec "$(dirname "$0")/launcher.py" "$@"
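
The same edit can be scripted instead of made by hand in vim. This is a minimal sketch, assuming the stock launcher script contains a single line that starts with exec; patch_launcher is a hypothetical helper name, not something shipped with Trino.

```shell
# Insert a PATH line directly above the exec line of a launcher script.
# $1: path to the launcher file, $2: JDK bin directory to prepend to PATH.
patch_launcher() {
  launcher_file="$1"
  jdk_bin="$2"
  path_line="PATH=${jdk_bin}:\$PATH"

  # Do nothing if the line is already present, so the edit can be re-run safely.
  if grep -qF "$path_line" "$launcher_file"; then
    return 0
  fi

  tmp_file="$(mktemp)"
  # Print the PATH line before any line that starts with "exec ", then the line itself.
  awk -v line="$path_line" '/^exec / { print line } { print }' "$launcher_file" > "$tmp_file"
  # Write back with cat so the launcher keeps its original permissions.
  cat "$tmp_file" > "$launcher_file"
  rm -f "$tmp_file"
}
```

For the layout used in this article, the call would be patch_launcher /usr/local/trino-server-406/bin/launcher /usr/local/java/jdk-17.0.6/bin.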

IV. Configure Trino jvm.config

cd /usr/local/trino-server-406
mkdir ./etc
vim etc/jvm.config

# Fill in the following content
-server
-Xmx4G
-XX:InitialRAMPercentage=80
-XX:MaxRAMPercentage=80
-XX:G1HeapRegionSize=32M
-XX:+ExplicitGCInvokesConcurrent
-XX:+ExitOnOutOfMemoryError
-XX:+HeapDumpOnOutOfMemoryError
-XX:-OmitStackTraceInFastThrow
-XX:ReservedCodeCacheSize=512M
-XX:PerMethodRecompilationCutoff=10000
-XX:PerBytecodeRecompilationCutoff=10000
-Djdk.attach.allowAttachSelf=true
-Djdk.nio.maxCachedBufferSize=2000000
-XX:+UnlockDiagnosticVMOptions
-XX:+UseAESCTRIntrinsics
# Disable Preventive GC for performance reasons (JDK-8293861)
-XX:-G1UsePreventiveGC

V. Configure Trino node.properties

cd /usr/local/trino-server-406/etc
vim node.properties

# Fill in the following content
node.environment=production
node.id=67d8c4ab-191d-46c4-afb3-f657daab2493
node.data-dir=/var/trino/data

# Notes:
    1. Every node in the cluster must use the same node.environment value
    2. Every node must have its own unique node.id
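
Because each node needs its own node.id, the file can be generated per machine rather than copied. The sketch below is one way to do that; new_node_id and write_node_properties are hypothetical helper names, and the UUID sources (uuidgen, /proc/sys/kernel/random/uuid, python3) are assumptions about what the host provides. It deliberately refuses to overwrite an existing file, since a node's id should stay the same across restarts.

```shell
# Print a random lowercase UUID using whichever source is available.
new_node_id() {
  if command -v uuidgen >/dev/null 2>&1; then
    uuidgen | tr '[:upper:]' '[:lower:]'
  elif [ -r /proc/sys/kernel/random/uuid ]; then
    cat /proc/sys/kernel/random/uuid
  else
    python3 -c 'import uuid; print(uuid.uuid4())'
  fi
}

# Write node.properties with a fresh node.id.
# $1: etc directory, $2: environment name, $3: data directory.
write_node_properties() {
  etc_dir="$1"
  environment="$2"
  data_dir="$3"
  target="$etc_dir/node.properties"

  # Keep an existing file so the node.id does not change between runs.
  if [ -e "$target" ]; then
    echo "$target already exists, leaving it unchanged" >&2
    return 0
  fi

  mkdir -p "$etc_dir"
  cat > "$target" <<EOF
node.environment=$environment
node.id=$(new_node_id)
node.data-dir=$data_dir
EOF
}
```

With the paths from this article, each node would run write_node_properties /usr/local/trino-server-406/etc production /var/trino/data.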

VI. Configure Trino config.properties

cd /usr/local/trino-server-406/etc
vim config.properties

# Fill in the following content
1. Minimal configuration for the coordinator:
coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=8086
discovery.uri=http://nn-1:8086

2. Minimal configuration for the workers:
coordinator=false
http-server.http.port=8086
discovery.uri=http://nn-1:8086

3. Alternatively, to set up a single machine for testing that acts as both coordinator and worker, use this configuration:
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8086
discovery.uri=http://nn-1:8086

# Note: discovery.uri is the same on every node and points to the coordinator
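
The three variants above differ only in two properties, so they can be produced from one small function. This is an illustrative sketch; write_config_properties and the role names coordinator, worker and single are hypothetical, and the default port 8086 simply mirrors the value used in this article.

```shell
# Write config.properties for a given role.
# $1: role (coordinator | worker | single), $2: etc directory,
# $3: coordinator host name, $4: HTTP port (defaults to 8086).
write_config_properties() {
  role="$1"
  etc_dir="$2"
  coordinator_host="$3"
  port="${4:-8086}"

  case "$role" in
    coordinator) is_coordinator=true;  include_coordinator=false ;;
    worker)      is_coordinator=false; include_coordinator= ;;
    single)      is_coordinator=true;  include_coordinator=true ;;
    *) echo "unknown role: $role" >&2; return 1 ;;
  esac

  mkdir -p "$etc_dir"
  {
    echo "coordinator=$is_coordinator"
    # Workers do not set node-scheduler.include-coordinator at all.
    if [ -n "$include_coordinator" ]; then
      echo "node-scheduler.include-coordinator=$include_coordinator"
    fi
    echo "http-server.http.port=$port"
    # Every role points discovery.uri at the coordinator.
    echo "discovery.uri=http://$coordinator_host:$port"
  } > "$etc_dir/config.properties"
}
```

For example, write_config_properties worker /usr/local/trino-server-406/etc nn-1 would write the worker variant on a worker node.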

VII. Configure Trino log.properties

cd /usr/local/trino-server-406/etc
vim log.properties

# Fill in the following content
io.trino=INFO

Allowed values: DEBUG, INFO, WARN, ERROR

VIII. Configure a catalog, using Iceberg as an example

cd /usr/local/trino-server-406/etc

mkdir catalog

vim catalog/iceberg.properties

# Fill in the following content
connector.name=iceberg
iceberg.file-format=PARQUET
iceberg.catalog.type=HIVE_METASTORE
hive.metastore.uri=thrift://nn-2:9083,thrift://nn-1:9083
hive.config.resources=/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
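
Trino names each catalog after its properties file in etc/catalog, so iceberg.properties becomes the catalog iceberg. A quick way to see which catalog names a directory would produce is sketched below; list_catalogs is a hypothetical helper, not a Trino command.

```shell
# Print the catalog names implied by the *.properties files in a directory.
# $1: path to the catalog directory (for example .../etc/catalog).
list_catalogs() {
  catalog_dir="$1"
  for properties_file in "$catalog_dir"/*.properties; do
    # Skip the unexpanded pattern when the directory has no matching files.
    [ -e "$properties_file" ] || continue
    basename "$properties_file" .properties
  done
}
```

Running list_catalogs /usr/local/trino-server-406/etc/catalog after this step should print iceberg; an empty result suggests the file was saved outside the catalog directory.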

IX. Start and Use

cd /usr/local/trino-server-406/bin
./launcher start
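
The launcher returns before the server has finished starting, so a script that goes on to run queries may want to wait for it. The sketch below polls the server's /v1/info endpoint; wait_for_trino is a hypothetical helper, it assumes curl is installed, and it assumes the JSON response carries a "starting" field that turns false once startup completes.

```shell
# Poll a Trino server until it reports that startup has finished.
# $1: base URL, $2: number of attempts (default 30), $3: seconds between attempts (default 2).
wait_for_trino() {
  base_url="$1"
  attempts="${2:-30}"
  delay="${3:-2}"
  attempt=0
  while [ "$attempt" -lt "$attempts" ]; do
    # -f makes curl fail on HTTP errors; the grep tolerates optional whitespace in the JSON.
    if curl -fsS --max-time 5 "$base_url/v1/info" 2>/dev/null \
        | grep -Eq '"starting"[[:space:]]*:[[:space:]]*false'; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 1
}
```

With the configuration from this article the call would be wait_for_trino http://nn-1:8086. If it times out, ./launcher status and the logs under the data directory configured in node.data-dir are the next places to look.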

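JDBC connection URL:
jdbc:trino://nn-1:8086/iceberg?user=admin
# The same connection in SQLAlchemy URL form: trino://admin@nn-1:8086/iceberg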
