How can I use an OCR tool to instantly extract text from a screen area?

On Ubuntu 12.10, if I type

gnome-screenshot -a | tesseract output

it returns:

 ** Message: Unable to use GNOME Shell's builtin screenshot interface, resorting to fallback X11.

How can I select text on the screen and convert it to text (to the clipboard or to a document)?

Thank you!

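There may already be a tool that does this, but you can also write a simple script that combines a screenshot tool with tesseract, as you were trying to do.

Take this script as an example (on my system I saved it as /usr/local/bin/screen_ts):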

    #!/bin/bash
    # Dependencies: tesseract-ocr imagemagick scrot

    select tesseract_lang in eng rus equ ;do break;done
    # Quick language menu, add more if you need other languages.

    SCR_IMG=`mktemp`
    trap "rm $SCR_IMG*" EXIT

    scrot -s $SCR_IMG.png -q 100
    # increase quality with option -q from default 75 to 100
    # Typo "$SCR_IMG.png000" does not continue with same name.

    mogrify -modulate 100,0 -resize 400% $SCR_IMG.png
    #should increase detection rate

    # pass the language picked in the menu above (falls back to eng)
    tesseract -l "${tesseract_lang:-eng}" $SCR_IMG.png $SCR_IMG &> /dev/null
    cat $SCR_IMG.txt
    exit

And with clipboard support:

    #!/bin/bash
    # Dependencies: tesseract-ocr imagemagick scrot xsel

    select tesseract_lang in eng rus equ ;do break;done
    # quick language menu, add more if you need other languages.

    SCR_IMG=`mktemp`
    trap "rm $SCR_IMG*" EXIT

    scrot -s $SCR_IMG.png -q 100
    # increase image quality with option -q from default 75 to 100

    mogrify -modulate 100,0 -resize 400% $SCR_IMG.png
    #should increase detection rate

    # pass the language picked in the menu above (falls back to eng)
    tesseract -l "${tesseract_lang:-eng}" $SCR_IMG.png $SCR_IMG &> /dev/null
    cat $SCR_IMG.txt | xsel -bi
    exit

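It uses scrot to capture the screen area, tesseract to recognize the text, and cat to display the result. The clipboard version additionally uses xsel to pipe the output to the clipboard.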

Sample usage

Note: scrot, xsel, imagemagick and tesseract-ocr are not installed by default, but they are available from the default repositories.
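Since these tools are not installed by default, a script could check for them before taking the screenshot. The following is only a minimal sketch: `missing_commands` is a hypothetical helper name, and the command names in the usage comment are assumptions (the imagemagick package provides `mogrify`, and tesseract-ocr provides `tesseract`).

```shell
# Hypothetical helper: print every command from the argument list
# that cannot be found on PATH, one name per line.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Possible use at the top of one of the scripts above:
#   missing="$(missing_commands scrot xsel mogrify tesseract)"
#   [ -z "$missing" ] || { echo "Please install: $missing" >&2; exit 1; }
```

Printing the missing names instead of aborting on the first one lets the user install everything in one go.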

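You could replace scrot with gnome-screenshot, but that may take a fair amount of work. As for the output, you can use anything that can read a text file (open it in a text editor, show the recognized text as a notification, and so on).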
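As a rough illustration of the different output options, the text file written by tesseract could be handed to a small dispatcher. This is a sketch under assumptions: `deliver_text` and its sink names are made up for this example, `notify-send` is assumed to be installed (it is not among the dependencies listed above), and gedit is only a fallback when `$EDITOR` is unset.

```shell
# Hypothetical helper: send the recognized text in FILE to a chosen sink.
# Usage: deliver_text FILE [stdout|clipboard|notify|editor]
deliver_text() {
    local file="$1" sink="${2:-stdout}"
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
    case "$sink" in
        stdout)    cat "$file" ;;                                  # print to the terminal
        clipboard) xsel -bi < "$file" ;;                           # copy to the clipboard
        notify)    notify-send "OCR result" "$(cat "$file")" ;;    # desktop notification
        editor)    "${EDITOR:-gedit}" "$file" ;;                   # open in a text editor
        *)         echo "unknown sink: $sink" >&2; return 2 ;;
    esac
}
```

In the scripts above, `cat $SCR_IMG.txt` would then become something like `deliver_text "$SCR_IMG.txt" notify`.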

GUI version of the script

Here is a simple graphical version of the OCR script, including a language selection dialog:

    #!/bin/bash
    # DEPENDENCIES: tesseract-ocr imagemagick scrot yad xsel
    # AUTHOR:       Glutanimate 2013 (http://askubuntu.com/users/81372/)
    # NAME:         ScreenOCR
    # LICENSE:      GNU GPLv3
    #
    # BASED ON:     OCR script by Salem (http://askubuntu.com/a/280713/81372)

    TITLE=ScreenOCR # set yad variables
    ICON=gnome-screenshot

    # - tesseract won't work if LC_ALL is unset so we set it here
    # - you might want to delete or modify this line if you
    #   have a different locale:

    export LC_ALL=en_US.UTF-8

    # language selection dialog
    LANG=$(yad \
        --width 300 --entry --title "$TITLE" \
        --image=$ICON \
        --window-icon=$ICON \
        --button="ok:0" --button="cancel:1" \
        --text "Select language:" \
        --entry-text \
        "eng" "ita" "deu")

    # - You can modify the list of available languages by editing the line above
    # - Make sure to use the same ISO codes tesseract does (man tesseract for details)
    # - Languages will of course only work if you have installed their respective
    #   language packs (https://code.google.com/p/tesseract-ocr/downloads/list)

    RET=$? # check return status

    if [ "$RET" = 252 ] || [ "$RET" = 1 ]  # WM-Close or "cancel"
      then
          exit
    fi

    echo "Language set to $LANG"

    SCR_IMG=`mktemp` # create tempfile
    trap "rm $SCR_IMG*" EXIT # make sure tempfiles get deleted afterwards

    scrot -s $SCR_IMG.png -q 100 #take screenshot of area

    mogrify -modulate 100,0 -resize 400% $SCR_IMG.png # postprocess to prepare for OCR
    tesseract -l $LANG $SCR_IMG.png $SCR_IMG # OCR in given language
    cat $SCR_IMG.txt | xsel -bi # pass to clipboard (tesseract writes to $SCR_IMG.txt)
    exit

Aside from the dependencies listed above, you will need to install the Zenity fork YAD from the webupd8 PPA for the script to work.

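I don't know whether anybody needs my solution, but here is one that runs under Wayland.

It shows the recognized text in a text editor, and if you pass the argument "yes" you also get a translation from the Google Translate tool (an internet connection is required). Before you can use it, install tesseract-ocr, imagemagick and google-trans. When the text you want to recognize is visible, start the script, for example with Alt + F2 in GNOME, then drag the cursor around the text. That's it. The script has only been tested on GNOME; it can be adapted for other window managers. To translate the text into a different language, replace the language ID (`de`) in the `trans` command (line 25 of the original script).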

    #!/bin/bash
    # Dependencies: tesseract-ocr imagemagick google-trans

    # pass "yes" as the first argument to translate the result; defaults to "no"
    translate="${1:-no}"

    SCR_IMG=`mktemp`
    trap "rm $SCR_IMG*" EXIT

    # select an area with the mouse and save it to the temp file
    gnome-screenshot -a -f $SCR_IMG.png

    mogrify -modulate 100,0 -resize 400% $SCR_IMG.png
    #should increase detection rate

    tesseract $SCR_IMG.png $SCR_IMG &> /dev/null
    if [ "$translate" = "yes" ] ; then
        trans :de file://$SCR_IMG.txt -o $SCR_IMG.translate.txt
        gnome-text-editor $SCR_IMG.translate.txt
    else
        gnome-text-editor $SCR_IMG.txt
    fi
    exit

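I just wrote a blog post on how to use screenshots in the modern day. Although it is aimed at readers in China, the screencast and the code are in English. OCR is just one of its functions.

My OCR function:

Opens the result in konsole + vimx or gedit for further editing.
For vimx + English, spell checking is enabled.
Supports dynamic language selection, nothing hard-coded.
Shows a progress dialog while the slow convert and tesseract steps run.

Function code: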

    function ocr () {
        tmpj="$1"
        tmpocr="$2"
        tmpocr_p="$3"
        atom="$(tesseract --list-langs 2>&1)";
        atom=(`echo "${atom#*:}"`);
        atom=(`echo "$(printf 'FALSE\n%s\n' "${atom[@]}")"`);
        atom[0]='True'
        ans=(`yad --center --height=200 --width=300 --separator='|' --on-top --list --title '' --text='Select Languages:' --radiolist --column '✓' --column 'Languages' "${atom[@]}" 2>/dev/null`) &&
        ans="$(echo "${ans:5:-1}")" &&
        convert "$tmpj[x2000]" -unsharp 15.6x7.8+2.69+0 "$tmpocr_p" | yad --on-top --title '' --text='Converting ...' --progress --pulsate --auto-close 2>/dev/null &&
        tesseract "$tmpocr_p" "$tmpocr" -l "$ans" 2>>/tmp/tesseract.log | yad --percentage=50 --on-top --title '' --text='Tesseracting ...' --progress --pulsate --auto-close 2>/dev/null &&
        if [[ "$ans" == 'eng' ]]; then konsole -e "vimx -c 'setlocal spell spelllang=en_us' -n $tmpocr.txt" 2>/dev/null; else gedit "$tmpocr.txt"; fi
        rm "$tmpocr_p"
    }
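The dynamic language selection in the function above packs the list handling into a few dense array expressions. The same idea can be sketched as two small filters, which may be easier to follow. This is not the author's code: `parse_langs` and `radiolist_rows` are hypothetical names, and the sketch assumes that the header line printed by `tesseract --list-langs` ends with a colon.

```shell
# Hypothetical helper: read `tesseract --list-langs` output on stdin and
# print one language code per line, skipping the header and blank lines.
parse_langs() {
    local line
    while IFS= read -r line; do
        case "$line" in
            *:) continue ;;                 # header, e.g. "List of available languages (3):"
            '') continue ;;                 # blank line
            *)  printf '%s\n' "$line" ;;
        esac
    done
}

# Hypothetical helper: turn language codes on stdin into alternating
# state/label rows for a yad --radiolist, preselecting the first entry.
radiolist_rows() {
    local first=1 lang
    while IFS= read -r lang; do
        if [ "$first" -eq 1 ]; then
            printf 'TRUE\n%s\n' "$lang"
            first=0
        else
            printf 'FALSE\n%s\n' "$lang"
        fi
    done
}

# Possible use (requires tesseract):
#   tesseract --list-langs 2>&1 | parse_langs | radiolist_rows
```

The resulting rows would be passed to yad after the two `--column` options, as the function above does with `"${atom[@]}"`.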

Calling code:

    for cmd in "mktemp" "convert" "tesseract" "gedit" "konsole" "vimx" "yad"; do
        command -v $cmd >/dev/null 2>&1 || { LANG=POSIX; xmessage "Require $cmd but it's not installed. Aborting." >&2; exit 1; }; :;
    done
    tmpj="$(mktemp /tmp/`date +"%s_%Y-%m-%d"`_XXXXXXXXXX.png)"
    tmpocr="$(mktemp -u /tmp/`date +"%s_%Y-%m-%d"`_ocr_XXXXX)"
    tmpocr_p="$tmpocr"+'.png'
    gnome-screenshot -a -f "$tmpj" 2>&1 >/dev/null | ts >>/tmp/gnome_area_PrtSc_error.log
    ocr $tmpj $tmpocr $tmpocr_p &

Combine these two pieces of code in a single shell script to run them.

