Using NVIDIA Jetson Development Boards

GaryGao99 · Published 2025-09-04 18:56:43

1. What is NVIDIA Jetson

Jetson is NVIDIA's family of embedded development boards (see the figure above).

Technical specifications:

2. Hardware architecture

Architecture diagram

Arm CPU

Arm Cortex-A78AE

12 CPU cores

64 KB L1 instruction cache

64 KB L1 data cache

256 KB L2 cache

Maximum clock frequency: 2.2 GHz

Ampere architecture GPU

2 Graphics Processing Clusters (GPCs)

8 Texture Processing Clusters (TPCs)

16 Streaming Multiprocessors (SMs)

192 KB of L1 cache per SM

4 MB of L2 cache

128 CUDA cores per SM
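
As a quick sanity check, the counts in the GPU list can be combined to get the total number of CUDA cores and the grouping of SMs. The following is a minimal shell sketch; the variable names are illustrative and not part of any NVIDIA tool.

```shell
# Figures taken from the GPU list above.
gpc_count=2
tpc_count=8
sm_count=16
cores_per_sm=128

# Total CUDA cores = number of SMs x CUDA cores per SM.
total_cuda_cores=$((sm_count * cores_per_sm))

# How the units nest: TPCs per GPC and SMs per TPC.
tpcs_per_gpc=$((tpc_count / gpc_count))
sms_per_tpc=$((sm_count / tpc_count))

echo "Total CUDA cores: ${total_cuda_cores}"
echo "TPCs per GPC: ${tpcs_per_gpc}, SMs per TPC: ${sms_per_tpc}"
```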

3. Benchmark

4. Check your device model

root@ubuntu:~# jtop

The jtop dashboard shows the board model together with live CPU, GPU, and memory usage.
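
If jtop is not installed, the model string can usually also be read from the device tree. The sketch below assumes the file /proc/device-tree/model exists, which is typical on Jetson Linux but is an assumption here; print_board_model is a hypothetical helper name.

```shell
# Print the board model string from a device-tree "model" file.
# The default path is an assumption about a typical Jetson system.
print_board_model() {
    model_file="${1:-/proc/device-tree/model}"
    if [ -r "$model_file" ]; then
        # The device-tree string is NUL-terminated; strip the NUL before printing.
        tr -d '\0' < "$model_file"
        echo
    else
        echo "model file not found: $model_file" >&2
        return 1
    fi
}

# Usage on the device:
#   print_board_model
```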

Monitor GPU status

root@ubuntu:~# tegrastats

RAM 3216/16384MB (lfb 1234x4MB) SWAP 0/0MB  
GPU 25%@1122 EMC 12%@1600 APE 150 MTS fg 0%  
GR3D 22% CV 0% NVENC 0% NVDEC 0%

Here, GR3D is the GPU utilization.
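
To log only the GPU utilization, the GR3D field can be filtered out of the tegrastats stream. This is a minimal sketch: gr3d_percent is a hypothetical helper, and it assumes the field appears either as "GR3D 22%" (as in the sample above) or as "GR3D_FREQ 22%", which some releases print.

```shell
# Extract the GR3D (GPU utilization) percentage from tegrastats output on stdin.
# Lines without a GR3D field produce no output.
gr3d_percent() {
    sed -n 's/.*GR3D\(_FREQ\)\{0,1\} \([0-9][0-9]*\)%.*/\2/p'
}

# Usage on the device (prints one number per sample):
#   tegrastats | gr3d_percent
```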

5. Usage

Install jetson-containers

git clone https://github.com/dusty-nv/jetson-containers
bash jetson-containers/install.sh

To build images, edit /etc/docker/daemon.json and add "default-runtime": "nvidia".
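
For reference, a daemon.json with this setting often looks like the example below. The "runtimes" block is an assumption about what the NVIDIA container runtime has already registered on the device; keep whatever entries your existing file contains and only add the "default-runtime" key. The sketch writes the example to a scratch file rather than to /etc so it can be reviewed first.

```shell
# Write an example daemon.json to a scratch file for review (not to /etc directly).
example_file="${TMPDIR:-/tmp}/daemon.json.example"
cat > "$example_file" <<'EOF'
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}
EOF

# Check that the file is valid JSON before merging it into /etc/docker/daemon.json.
python3 -m json.tool "$example_file" > /dev/null && echo "valid JSON: $example_file"
```

Validating first is worthwhile because Docker will not start if daemon.json is malformed.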

After adding it, restart Docker and verify the setting:

$ sudo systemctl restart docker
$ sudo docker info | grep 'Default Runtime'
Default Runtime: nvidia

Edit the ".env" file (the INDEX_HOST environment variable must be set, otherwise the image build fails).

Load the environment variables:

# activate it
source .env
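
Since a missing INDEX_HOST only shows up later as a build error, it can help to check for it right after sourcing the file. A minimal sketch, where require_env is a hypothetical helper and not part of jetson-containers:

```shell
# Fail early with a clear message if a required variable is unset or empty.
require_env() {
    var_name="$1"
    # Look up the value of the variable whose name was passed in.
    eval "var_value=\${$var_name:-}"
    if [ -z "$var_value" ]; then
        echo "error: $var_name is not set; check .env" >&2
        return 1
    fi
}

# Usage after "source .env":
#   require_env INDEX_HOST && echo "INDEX_HOST=$INDEX_HOST"
```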

5.1 Benchmark

Set the Jetson to its high-power mode:

# check the current power mode
$ sudo nvpmodel -q
NV Power Mode: MODE_30W
2

# set it to mode 0 (typically the highest)
$ sudo nvpmodel -m 0

# reboot if necessary, and confirm the changes
$ sudo nvpmodel -q
NV Power Mode: MAXN
0

Then run the MLC benchmark script:

bash jetson-containers/packages/llm/mlc/benchmarks.sh

Results (rates in tokens per second):

model	prefill_rate	decode_rate
Llama-2-7b	536.70	36.29
Qwen2.5-0.5B	2269.55	81.71
Qwen2.5-1.5B	413.98	45.27
Qwen2.5-7B	506.35	32.82

Measured in high-power mode (60 W), the Llama-2-7B decode_rate peaks at about 36, roughly 23% below the official figure of 47 shown in the chart above.
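
The size of that gap can be computed directly from the two numbers. A small sketch using awk for the floating-point arithmetic; shortfall_percent is an illustrative helper name:

```shell
# Percentage by which a measured rate falls short of a reference rate.
shortfall_percent() {
    awk -v measured="$1" -v reference="$2" \
        'BEGIN { printf "%.1f\n", (reference - measured) / reference * 100 }'
}

# Measured Llama-2-7b decode_rate from the table vs. the official figure of 47.
shortfall_percent 36.29 47   # prints 22.8
```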

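5.2 llamaspeak

llamaspeak uses a cascaded ASR + LLM + TTS pipeline. Before starting the service below, start the ASR (Riva) service first; otherwise ASR will not work.

Download riva_quickstart_arm64_2.19.0

After downloading, unzip riva_quickstart_arm64_2.19.0.zip

Start the Riva service (an NGC key is required):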

bash riva_init.sh
bash riva_start.sh

Start llamaspeak:

jetson-containers run $(autotag nano_llm) \
  python3 -m nano_llm.agents.web_chat --api=mlc \
    --model /models/Meta-Llama-3-8B-Instruct \
    --asr=riva --tts=piper

If the Llama model cannot be downloaded from Hugging Face, download it from ModelScope instead.

Once the service is running, open https://IP_ADDRESS:8050 in a browser.