NVIDIA Jetson Xavier NX Benchmarks

❯ 7z b
7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,6 CPUs LE)
LE
CPU Freq: - 64000000 64000000 - 128000000 256000000 512000000 - 2048000000
RAM size: 7763 MB, # CPU hardware threads: 6
RAM usage: 1323 MB, # Benchmark threads: 6
                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS
22:       9309   466   1945   9057  |     113067   568   1699   9642
23:       9036   454   2026   9207  |     110032   564   1688   9521
24:       9186   465   2122   9877  |     108373   567   1678   9512
25:       9337   469   2275  10661  |     106344   564   1679   9464
----------------------------------  | ------------------------------
Avr:             464   2092   9701  |              566   1686   9535
Tot:             515   1889   9618
❯ 7z b
7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,6 CPUs LE)
LE
CPU Freq: 1597 1763 1797 1796 1795 1786 1792 1792 1791
RAM size: 3713 MB, # CPU hardware threads: 6
RAM usage: 1323 MB, # Benchmark threads: 6
                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS
22:       5848   525   1083   5690  |     118603   535   1892  10115
23:       5615   530   1080   5722  |     114888   532   1868   9941
24:       5511   539   1099   5926  |     113151   536   1853   9932
25:       5223   540   1105   5964  |     107952   530   1812   9607
----------------------------------  | ------------------------------
Avr:             533   1092   5826  |              533   1856   9899
Tot:             533   1474   7862
❯ 7z b
7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,6 CPUs LE)
LE
CPU Freq: 64000000 - - - - - 512000000 - -
RAM size: 3793 MB, # CPU hardware threads: 6
RAM usage: 1323 MB, # Benchmark threads: 6
                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS
22:       4468   476    913   4347  |      94209   522   1540   8034
23:       4114   473    886   4192  |      92694   525   1528   8021
24:       3965   472    902   4264  |      90617   525   1516   7954
25:       3840   480    913   4385  |      88555   525   1501   7881
----------------------------------  | ------------------------------
Avr:             475    904   4297  |              524   1521   7972
Tot:             500   1213   6135
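The Tot row in each 7z run appears to be the plain average of the compression and decompression Avr rows, and the numbers above are consistent with that to within rounding. A minimal Python sketch of that check follows; the labels `run 1` to `run 3` and the helper names are illustrative assumptions, not anything printed by 7-Zip.

```python
# Avr rows copied from the three `7z b` runs above, in order of appearance.
# Each tuple is (usage %, R/U MIPS, rating MIPS).
# The run labels are hypothetical; the logs do not name the machines.
runs = {
    "run 1": {"compress": (464, 2092, 9701), "decompress": (566, 1686, 9535)},
    "run 2": {"compress": (533, 1092, 5826), "decompress": (533, 1856, 9899)},
    "run 3": {"compress": (475, 904, 4297), "decompress": (524, 1521, 7972)},
}


def total_row(run):
    """Average the compression and decompression columns (assumed Tot formula)."""
    return tuple((c + d) / 2 for c, d in zip(run["compress"], run["decompress"]))


def relative_rating(all_runs, baseline="run 1"):
    """Express each run's total rating as a fraction of the baseline run."""
    base = total_row(all_runs[baseline])[2]
    return {name: total_row(run)[2] / base for name, run in all_runs.items()}


for name, run in runs.items():
    usage, ru, rating = total_row(run)
    print(f"{name}: usage {usage:.1f}%  R/U {ru:.1f} MIPS  rating {rating:.1f} MIPS")

for name, ratio in relative_rating(runs).items():
    print(f"{name}: {ratio:.2f}x the total rating of run 1")
```

Comparing the total rating rather than a single dictionary size smooths over the run-to-run variation visible in the per-row speeds.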
❯ sysbench --test=memory --memory-block-size=1M --memory-total-size=8G run
sysbench 1.0.18 (using system LuaJIT 2.1.0-beta3)
Running the test with following options:
Number of threads: 1
Initializing random number generator from current time
Running memory speed test with the following options:
block size: 1024KiB
total size: 8192MiB
operation: write
scope: global
Initializing worker threads...
Threads started!
Total operations: 8192 (15042.72 per second)
8192.00 MiB transferred (15042.72 MiB/sec)
General statistics:
total time: 0.5421s
total number of events: 8192
Latency (ms):
min: 0.06
avg: 0.07
max: 0.50
95th percentile: 0.08
sum: 532.89
Threads fairness:
events (avg/stddev): 8192.0000/0.00
execution time (avg/stddev): 0.5329/0.00
❯ sysbench --test=memory --memory-block-size=1M --memory-total-size=4G run
sysbench 0.4.12: multi-threaded system evaluation benchmark
Running the test with following options:
Number of threads: 1
Doing memory operations speed test
Memory block size: 1024K
Memory transfer size: 4096M
Memory operations type: write
Memory scope type: global
Threads started!
Done.
Operations performed: 4096 ( 1236.33 ops/sec)
4096.00 MB transferred (1236.33 MB/sec)
Test execution summary:
total time: 3.3130s
total number of events: 4096
total time taken by event execution: 3.3096
per-request statistics:
min: 0.29ms
avg: 0.81ms
max: 2.31ms
approx. 95 percentile: 1.57ms
Threads fairness:
events (avg/stddev): 4096.0000/0.00
execution time (avg/stddev): 3.3096/0.00
rock64@rockpro64:~$ sysbench --test=memory --memory-block-size=1M --memory-total-size=8G run
WARNING: the --test option is deprecated. You can pass a script name or path on the command line without any options.
sysbench 1.0.18 (using system LuaJIT 2.1.0-beta3)
Running the test with following options:
Number of threads: 1
Initializing random number generator from current time
Running memory speed test with the following options:
block size: 1024KiB
total size: 8192MiB
operation: write
scope: global
Initializing worker threads...
Threads started!
Total operations: 8192 ( 7796.96 per second)
8192.00 MiB transferred (7796.96 MiB/sec)
General statistics:
total time: 1.0473s
total number of events: 8192
Latency (ms):
min: 0.12
avg: 0.13
max: 0.68
95th percentile: 0.13
sum: 1044.03
Threads fairness:
events (avg/stddev): 8192.0000/0.00
execution time (avg/stddev): 1.0440/0.00
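The throughput sysbench reports can be sanity-checked by dividing the amount transferred by the total time. The sketch below does that for the three runs above. The labels are illustrative assumptions (only the last run shows a hostname), and the 0.4.12 figure is treated as MiB because the block size is 1024K, which is also an assumption.

```python
# (label, amount transferred, total time in seconds, reported throughput)
# copied from the three sysbench memory runs above. Labels are hypothetical.
runs = [
    ("sysbench 1.0.18, first run", 8192, 0.5421, 15042.72),
    ("sysbench 0.4.12", 4096, 3.3130, 1236.33),
    ("sysbench 1.0.18, rockpro64", 8192, 1.0473, 7796.96),
]


def throughput(transferred, seconds):
    """Amount transferred per second of wall-clock time."""
    return transferred / seconds


def relative_error(derived, reported):
    """Fractional difference between the derived and the reported figure."""
    return abs(derived - reported) / reported


for label, transferred, seconds, reported in runs:
    derived = throughput(transferred, seconds)
    err = relative_error(derived, reported)
    print(f"{label}: derived {derived:.0f}/s, reported {reported:.0f}/s, "
          f"difference {err:.2%}")
```

The derived values land within about one percent of the reported ones. The small gap is plausibly because sysbench times the operations themselves rather than the whole run.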
docker run -it --runtime=nvidia --rm -v $(pwd):/work -w /work nvcr.io/nvidia/l4t-pytorch:r32.4.2-pth1.5-py3
root@40cb50bd2bc3:/work/pytorch# time python3 mnist.py --epochs=1
Train Epoch: 1 [0/60000 (0%)] Loss: 2.333409
..
Test set: Average loss: 0.0564, Accuracy: 9802/10000 (98%)
real 0m52.126s
user 1m2.100s
sys 0m8.316s
root@40cb50bd2bc3:/work/pytorch# time python3 mnist.py --epochs=1 --no-cuda
Train Epoch: 1 [0/60000 (0%)] Loss: 2.311259
..
Test set: Average loss: 0.0520, Accuracy: 9825/10000 (98%)
real 12m59.672s
user 65m59.436s
sys 0m44.076s
> docker run -it --rm -v $(pwd):/work -w /work --rm pytorch/pytorch bash
> time python3 mnist.py --epochs=1
....
real 1m39.388s
user 5m32.886s
sys 1m6.857s
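To compare the PyTorch runs, the `time` output can be converted to seconds and turned into two ratios: the wall-clock speedup of the CUDA run over the `--no-cuda` run, and user time divided by real time as a rough count of busy CPU cores. A small Python sketch follows; the dictionary keys and helper names are illustrative assumptions.

```python
import re


def to_seconds(stamp):
    """Convert a shell `time` value such as '12m59.672s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time value: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# (real, user) values copied from the three PyTorch runs above.
# The keys are hypothetical labels built from the image name and flags.
timings = {
    "l4t-pytorch, CUDA": ("0m52.126s", "1m2.100s"),
    "l4t-pytorch, --no-cuda": ("12m59.672s", "65m59.436s"),
    "pytorch/pytorch": ("1m39.388s", "5m32.886s"),
}


def speedup(slow_key, fast_key):
    """Wall-clock time of the slow run divided by that of the fast run."""
    return to_seconds(timings[slow_key][0]) / to_seconds(timings[fast_key][0])


def busy_cores(key):
    """user / real: roughly how many cores were kept busy on average."""
    real, user = timings[key]
    return to_seconds(user) / to_seconds(real)


print(f"CUDA vs --no-cuda speedup: "
      f"{speedup('l4t-pytorch, --no-cuda', 'l4t-pytorch, CUDA'):.1f}x")
for key in timings:
    print(f"{key}: about {busy_cores(key):.1f} cores busy")
```

On these numbers the CUDA run finishes one epoch roughly 15 times faster than the CPU-only run in the same container, while the CPU-only run keeps about five of the six cores busy.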
> python3 mnist.py
...
2020-06-12 19:59:30.666323: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3560 MB memory) -> physical GPU (device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2)
2020-06-12 19:59:32.241779: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
step: 100, loss: 0.373517, accuracy: 0.898438
step: 200, loss: 0.250729, accuracy: 0.933594
...
Test Accuracy: 0.966400
real 0m52.721s
user 0m51.516s
sys 0m5.936s
> python3 mnist.py
...
2020-06-12 19:57:30.661145: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.2'; dlerror: libcudart.so.10.2: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/cuda-10.2/targets/aarch64-linux/lib:
2020-06-12 19:57:30.661284: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
step: 100, loss: 0.347883, accuracy: 0.914062
step: 200, loss: 0.248891, accuracy: 0.933594
...
Test Accuracy: 0.964100
real 0m51.953s
user 1m14.548s
sys 0m4.824s
