In this tutorial, you will learn how to set up your development environment on a Raspberry Pi 3B/4B.
Step #1: Install Operating System level dependencies
When you’re ready, go ahead and update your system:
$ sudo apt-get update
$ sudo apt-get upgrade
From there, we’ll install various libraries required to install OpenCV:
$ sudo apt-get install libhdf5-dev libhdf5-serial-dev
$ sudo apt-get install libqtgui4 libqtwebkit4 libqt4-test python3-pyqt5
$ sudo apt-get install libatlas-base-dev
$ sudo apt-get install libjasper-dev
Next, we’ll install packages required to install Tesseract:
$ sudo apt-get install libicu-dev libpango1.0-dev libcairo2-dev
$ sudo apt-get install automake ca-certificates g++ git libtool libleptonica-dev make pkg-config
$ sudo apt-get install --no-install-recommends asciidoc docbook-xsl xsltproc
$ sudo apt-get install libpng-dev libjpeg8-dev libtiff5-dev zlib1g-dev
Step #2: Compiling and Installing Tesseract 4.1.1 from Source
In this step, we will compile and install Tesseract from source. You might wonder why we don’t simply install Tesseract using apt-get. In our testing, several Tesseract options, such as character whitelisting and blacklisting, did not work when Tesseract was installed using apt-get.
Using the following commands, we will download Tesseract 4.1.1, configure the build, compile Tesseract, and install it:
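To make the whitelisting and blacklisting options concrete, here is a minimal sketch of how they are typically expressed as a Tesseract option string. The helper name build_tesseract_config is hypothetical and not part of any library; tessedit_char_whitelist and tessedit_char_blacklist are Tesseract configuration variables set with the -c flag, and --psm selects the page segmentation mode.

```python
def build_tesseract_config(whitelist="", blacklist="", psm=7):
    """Build a Tesseract option string (hypothetical helper for illustration).

    whitelist: characters Tesseract is allowed to output
    blacklist: characters Tesseract must never output
    psm: page segmentation mode (7 treats the image as a single text line)
    """
    options = ["--psm {}".format(psm)]
    if whitelist:
        options.append("-c tessedit_char_whitelist={}".format(whitelist))
    if blacklist:
        options.append("-c tessedit_char_blacklist={}".format(blacklist))
    return " ".join(options)


if __name__ == "__main__":
    # Restrict OCR output to digits only
    print(build_tesseract_config(whitelist="0123456789"))
```

Once pytesseract is installed (later in this tutorial), the returned string can be passed as the config argument of pytesseract.image_to_string. The sketch does no quoting, so it assumes the character sets contain no spaces.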
$ wget -O tesseract.zip https://github.com/tesseract-ocr/tesseract/archive/4.1.1.zip
$ unzip tesseract.zip
$ mv tesseract-4.1.1 tesseract
$ cd tesseract
$ ./autogen.sh
$ ./configure
$ make
$ sudo make install
$ sudo ldconfig
Lastly, we will download the English language and OSD (orientation and script detection) Tesseract model files, which would have been installed automatically had we installed Tesseract using apt-get:
$ cd /usr/local/share/tessdata/
$ sudo wget https://github.com/tesseract-ocr/tessdata_fast/raw/master/eng.traineddata
$ sudo wget https://github.com/tesseract-ocr/tessdata_fast/raw/master/osd.traineddata
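If you want to confirm that both model files landed in the tessdata directory, a small check along these lines may help. This is only a sketch: the function name missing_traineddata is hypothetical, and the path assumes the default /usr/local prefix used by the source build above.

```python
import os


def missing_traineddata(tessdata_dir, languages=("eng", "osd")):
    """Return the language codes whose .traineddata file is absent."""
    missing = []
    for lang in languages:
        path = os.path.join(tessdata_dir, lang + ".traineddata")
        if not os.path.isfile(path):
            missing.append(lang)
    return missing


if __name__ == "__main__":
    # Assumed default location for a Tesseract build installed under /usr/local
    print(missing_traineddata("/usr/local/share/tessdata"))
```

An empty list means both files were found; any language code that is printed needs to be downloaded again.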
Step #3: Install pip and virtual environments
In this step, we will set up pip and Python virtual environments.
We will use pip, the de facto Python package manager.
Let’s download and install pip:
$ wget https://bootstrap.pypa.io/get-pip.py
$ sudo python3 get-pip.py
Next, let’s install the virtual environment tools:
$ pip3 install virtualenv virtualenvwrapper
From here, we need to update our bash profile to accommodate virtualenvwrapper. Open up the ~/.bashrc file with Nano or another text editor:
$ nano ~/.bashrc
And insert the following lines at the end of the file:
# virtualenv and virtualenvwrapper
export WORKON_HOME=$HOME/.local/bin/.virtualenvs
export VIRTUALENVWRAPPER_PYTHON=/usr/bin/python3
export VIRTUALENVWRAPPER_VIRTUALENV=$HOME/.local/bin/virtualenv
source $HOME/.local/bin/virtualenvwrapper.sh
Save the file (ctrl + x, y, enter) and exit to your terminal.
Go ahead and source/load the changes into your profile:
$ source ~/.bashrc
Now we’re ready to create a Python 3 virtual environment for optical character recognition, named ocr:
$ mkvirtualenv ocr -p python3
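To check that the ocr environment is actually active (mkvirtualenv normally activates the environment it creates, and workon ocr re-activates it in a new shell), you can run a short script like the one below with that environment's python. It is a sketch under the assumption that the interpreter exposes the usual prefix attributes; in_virtualenv is a hypothetical helper name.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; newer ones (and venv)
    # make sys.base_prefix differ from sys.prefix instead.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

With the environment active, the printed prefix should point inside the WORKON_HOME directory configured in ~/.bashrc.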
Step #4: Installing packages into your ocr virtual environment
Let’s first install OpenCV, PyTesseract, and the Pi Camera Python package using the following commands:
$ workon ocr
$ pip install numpy opencv-contrib-python
$ pip install pytesseract
$ pip install "picamera[array]"
Next, we will install other computer vision and machine learning libraries:
$ pip install pillow scipy
$ pip install scikit-learn scikit-image
$ pip install imutils matplotlib
$ pip install requests beautifulsoup4
$ pip install textblob progressbar pandas
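After these installs, a quick way to spot a package that failed to install is to ask Python's import system whether each module can be located. The sketch below does that without importing anything heavy; find_missing and EXPECTED_MODULES are hypothetical names, and the list maps the pip packages above to their import names.

```python
import importlib.util

# Import names differ from pip package names for several of these packages
# (for example, opencv-contrib-python is imported as cv2, pillow as PIL).
EXPECTED_MODULES = [
    "numpy", "cv2", "pytesseract", "picamera", "PIL", "scipy", "sklearn",
    "skimage", "imutils", "matplotlib", "requests", "bs4", "textblob",
    "progressbar", "pandas",
]


def find_missing(modules):
    """Return the module names that the import system cannot locate."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(EXPECTED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All expected modules were found.")
```

Run it with the ocr environment active. Locating a module does not guarantee that it imports cleanly, so treat a clean result as a first check rather than proof.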
Since there are no official TensorFlow 2.2 pre-compiled binaries available for the Raspberry Pi, we will install it from unofficial pre-compiled binaries using the following commands:
$ cd ~
$ workon ocr
$ wget https://github.com/lhelontra/tensorflow-on-arm/releases/download/v2.2.0/tensorflow-2.2.0-cp37-none-linux_armv7l.whl
$ pip install tensorflow-2.2.0-cp37-none-linux_armv7l.whl
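The wheel filename encodes what it was built for: cp37 means CPython 3.7 and linux_armv7l means 32-bit ARM Linux. If pip rejects the wheel, a rough comparison like the following may show why. It is a simplified sketch, not pip's full tag-matching logic, and wheel_matches is a hypothetical helper.

```python
import platform
import sys


def wheel_matches(wheel_name, python_version, machine):
    """Compare a wheel filename's tags with an interpreter and CPU type.

    python_version: (major, minor) tuple, e.g. (3, 7)
    machine: a value such as platform.machine() returns, e.g. "armv7l"
    """
    # Wheel filenames end with <python tag>-<abi tag>-<platform tag>.whl
    stem = wheel_name[:-len(".whl")]
    python_tag, _abi_tag, platform_tag = stem.split("-")[-3:]
    expected_python = "cp{}{}".format(*python_version)
    return python_tag == expected_python and platform_tag.endswith(machine)


if __name__ == "__main__":
    wheel = "tensorflow-2.2.0-cp37-none-linux_armv7l.whl"
    ok = wheel_matches(wheel, sys.version_info[:2], platform.machine())
    print("wheel matches this system:", ok)
```

A False result suggests the active interpreter is not Python 3.7 or the system is not reporting armv7l, in which case this particular wheel is unlikely to install.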
That’s it! Your Raspberry Pi is now ready to run code examples from the OCR Book.
Note:
- A few chapters in the book require additional packages, so always check the beginning of each chapter to make sure you have installed everything it needs.
- Unfortunately, there appear to be no precompiled PyTorch binaries available for the Raspberry Pi, so you won’t be able to run code from the chapter(s) that use the EasyOCR package.