Starting from tqdm, a Third-Party Python Progress Bar Library

tqdm

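A new progress bar library, tqdm, has been getting a lot of attention lately. It claims a per-iteration overhead more than 10 times lower than that of the older python-progressbar library.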

Overhead is low -- about 60ns per iteration (80ns with gui=True). By comparison, the well established ProgressBar has an 800ns/iter overhead.

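On a first read of its source, the code organization is clearly inherited from python-progressbar, although the main module has grown from 357 lines to 614. So where does the tenfold performance gain come from?
Before answering that question, I want to use this post to explain how a progress bar works and then, based on that principle, implement a simple one in a few lines of code.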
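
To make the "overhead per iteration" figures quoted above more concrete, here is a rough sketch of how such a number could be measured. It does not use tqdm or python-progressbar: passthrough is a hypothetical stand-in for a progress-bar wrapper, and per_iteration_overhead_ns is an illustrative helper name, so the numbers it prints say nothing about either library.

```python
import time


def passthrough(iterable):
    """Hypothetical stand-in for a progress-bar wrapper: count items, yield them."""
    count = 0
    for item in iterable:
        count += 1
        yield item


def per_iteration_overhead_ns(wrapper, n=200_000):
    """Return the extra nanoseconds per iteration that `wrapper` adds over a bare loop."""
    # Time a bare loop over range(n).
    start = time.perf_counter_ns()
    for _ in range(n):
        pass
    bare = time.perf_counter_ns() - start

    # Time the same loop with the wrapper around the iterable.
    start = time.perf_counter_ns()
    for _ in wrapper(range(n)):
        pass
    wrapped = time.perf_counter_ns() - start

    return (wrapped - bare) / n


print("{:.1f} ns/iter".format(per_iteration_overhead_ns(passthrough)))
```

The result varies from machine to machine and from run to run, so it is only useful for comparing wrappers measured under the same conditions.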

How a progress bar works

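The principle behind a progress bar is actually very simple: it just keeps rewriting the current line of output in the shell.
That brings us to the control characters used in text output. Let's look at the ones that matter here.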

  • \r = CR (Carriage Return) // moves the cursor to the beginning of the current line without advancing to the next line.
  • \n = LF (Line Feed) // in the traditional sense, moves the cursor down to the next line without returning to the beginning of the line. In a *nix environment, \n also moves the cursor to the beginning of the line.
  • \r\n = CR + LF // a combination of \r and \n: moves to the next line and returns the cursor to the beginning of the line.
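
The effect of \r can be sketched with a tiny model of a terminal line. This is a simplified illustration, not how any real terminal is implemented: visible_line is a hypothetical helper, and it assumes that only \r and \n are special and that every other character overwrites the cell under the cursor.

```python
def visible_line(output):
    """Return what the last terminal line shows after `output` is written.

    Simplified model (an assumption): only '\\r' and '\\n' are special, and
    every other character overwrites the cell under the cursor.
    """
    cells = []   # characters currently on the line
    cursor = 0   # column of the cursor
    for ch in output:
        if ch == "\r":
            cursor = 0             # back to column 0; existing content stays
        elif ch == "\n":
            cells, cursor = [], 0  # start a fresh line (*nix behaviour)
        elif cursor < len(cells):
            cells[cursor] = ch     # overwrite an existing character
            cursor += 1
        else:
            cells.append(ch)       # extend the line
            cursor += 1
    return "".join(cells)


print(repr(visible_line("10% \r20% \r30% ")))  # '30% '
print(repr(visible_line("100%\r99%")))         # '99%%'
```

The second call shows why a shorter update can leave stale characters behind: \r moves the cursor but does not erase anything, so a progress line either needs padding or must never get shorter.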

With that in mind, implementing a progress bar is straightforward; see the code below.

Bash implementation

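#!/usr/bin/env bash
# Print 1% .. 100% on one line; \r returns the cursor to column 0 after each update.
for i in {1..100}
do
    echo -ne "$i% \r"
    sleep 0.01
done
# End with a newline so the shell prompt starts on a fresh line.
echo -ne "\n"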

However, echo -n has obvious portability problems:

-n Do not print the trailing newline character. This may also be achieved by appending \c to the end of the string, as is done by iBCS2 compatible systems. Note that this option as well as the effect of \c are implementation-defined in IEEE Std 1003.1-2001 (POSIX.1) as amended by Cor. 1-2002. Applications aiming for maximum portability are strongly encouraged to use printf(1) to suppress the newline character.

Some shells may provide a builtin echo command which is similar or identical to this utility. Most notably, the builtin echo in sh(1) does not accept the -n option. Consult the builtin(1) manual page.

Using printf is recommended instead.

#!/usr/bin/env bash
# Same loop as before, using printf, which never appends a newline on its own.
for i in {1..100}
do
    printf "%s%% \r" "$i"
    sleep 0.01
done
printf "\n"

Python implementation

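Python mainly relies on the standard output stream from the sys module of the standard library; sys.stdout provides convenient methods for printing to the shell. The individual methods are not covered in detail here.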

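import sys
import time

for i in range(1, 101):
    # Write the percentage, then '\r' so the next write starts at column 0.
    sys.stdout.write('{}% \r'.format(i))
    # Flush explicitly so each update is displayed immediately.
    sys.stdout.flush()
    time.sleep(0.01)
# End with a newline so the shell prompt starts on a fresh line.
sys.stdout.write('\n')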
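
Building on the same principle, the percentage can be turned into a visual bar. The following is a minimal sketch, not code taken from tqdm or python-progressbar; format_bar and run, the width of 20 characters, and the '#' fill character are all illustrative choices.

```python
import sys
import time


def format_bar(done, total, width=20):
    """Render a fixed-width bar such as '[#####               ]  25%'."""
    filled = width * done // total
    percent = 100 * done // total
    return "[{}{}] {:3d}%".format("#" * filled, " " * (width - filled), percent)


def run(total=100, delay=0.01, stream=sys.stdout):
    """Redraw the bar in place once per step, then finish with a newline."""
    for done in range(1, total + 1):
        # '\r' first: return to column 0, then overwrite the previous bar.
        stream.write("\r" + format_bar(done, total))
        stream.flush()
        time.sleep(delay)
    stream.write("\n")
```

Calling run() draws the bar over roughly one second. Because the rendered string always has the same length, each redraw fully covers the previous one and no stale characters are left behind.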

References