
Haider Ali Punjabi

Posted on • Originally published at blog.haideralipunjabi.com

Automating Android Games with Python & Pytesseract: Sudoku

Introduction

I wrote a Python script to automate a Sudoku game on Android after watching Engineer Man's YouTube videos doing the same for other games.

The script can be divided into 5 parts:

  1. Connecting to an Android device using ADB, and getting the screenshot of the game from it
  2. Using Pillow to process the screenshot for pytesseract
  3. Using pytesseract to extract the Sudoku Game Grid to a 2D List in Python.
  4. Solving the Sudoku Game
  5. Sending the solved input to your Android Device using Python

Of these five, I will focus mostly on parts 2, 3, and 5, since parts 1 and 4 have already been covered extensively elsewhere.

Link to the game I automated: https://play.google.com/store/apps/details?id=com.quarzo.sudoku

The complete code is available on the following repository:

GitHub: haideralipunjabi/sudoku_automate

You can also watch the script in action on:

Libraries Used

Tutorial

1 (a). Using ADB to Connect to your Device

Most tutorials on the internet use wired ADB, which discourages many people from using this method. I will be using wireless ADB, which isn't very difficult to set up.

  1. Go to your Phone Settings > System > Developer Options (this might vary between phones, so if yours is different, look it up on the internet)
  2. Turn on Android Debugging and ADB over Network.

ADB Over Network

  3. Note the IP Address and Port shown under ADB over Network
  4. Install ADB on your computer
  5. Go to your command-line / command prompt and enter

    adb connect <ip-address>:<port>

Use the IP Address and Port from Step 3

  6. When connecting for the first time, you will need to authorize the connection on your phone.
  7. Your device should be connected to your PC over WiFi.

1 (b). Using ADB with Python (pure-python-adb)

You can define the following function to connect to the first ADB device connected to your computer using Python

from ppadb.client import Client

def connect_device():
    adb = Client(host='127.0.0.1',port=5037)
    devices = adb.devices()

    if len(devices) == 0:
        print("No Devices Attached")
        quit()
    return devices[0]

We will be using this function later to return an instance of ppadb.device.Device which will be used to take a screenshot, and send input to your device.

1 (c). Taking a Screenshot and saving it

pure-python-adb makes it very easy to capture a screenshot of your device. The screencap function is all you need to get the screenshot; use Python's file I/O to save it to screen.png.

def take_screenshot(device):
    image = device.screencap()
    with open('screen.png', 'wb') as f:
        f.write(image)

Screenshot of Sudoku

2. Processing the screenshot with Pillow

On the raw screenshot, the accuracy of any OCR will be very low. To increase accuracy, I used Pillow to process the screenshot so that it shows only the numbers, in black on a white background.

To do that, we first convert the image to grayscale (a single channel) using image.convert('L'). This converts the colors to shades of grey (0-255).

Grayscale Screenshot of Sudoku

After this, we need the numbers (which are the darkest pixels, at or very near black) in black, and everything else in white. For this, we use image.point() so that all greys > 50 become white (255) and the rest (the numbers) become black (0). I also increased the contrast and sharpness a bit to be on the safe side.

Processed Screenshot of Sudoku

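from PIL import ImageEnhance

def process_image(image):
    image = image.convert('L')      # Grayscale: one channel, values 0-255
    # Threshold: dark pixels (the digits) become black, the rest white
    image = image.point(lambda x: 255 if x > 50 else 0, mode='L')
    image = ImageEnhance.Contrast(image).enhance(10)
    image = ImageEnhance.Sharpness(image).enhance(2)
    return image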

3. Extracting the numbers from the image using pytesseract

Using pytesseract on the whole image might give us the numbers, but it won't tell us in which box the number was present. So, I use Pillow to crop each box and then use pytesseract on the cropped images. Before using pytesseract, I defined some functions to give me the coordinates of each box and to give me a cropped image of each box.
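
The two helpers are not reproduced in this post, so here is a minimal sketch of what they might look like. It assumes the grid is a square region of equally sized boxes; GRID_LEFT, GRID_TOP, BOX_SIZE and MARGIN are hypothetical values that you would have to measure on your own screenshot, and the real script may compute them differently.

```python
# Hypothetical layout constants: measure these on your own screenshot.
GRID_LEFT = 0      # x of the grid's left edge, in pixels (assumed)
GRID_TOP = 300     # y of the grid's top edge, in pixels (assumed)
BOX_SIZE = 120     # width and height of one box, in pixels (assumed)
MARGIN = 10        # pixels trimmed from each side to skip grid lines (assumed)


def get_coords(row, col):
    """Return (x1, y1, x2, y2) for the box at the given row and column."""
    x1 = GRID_LEFT + col * BOX_SIZE
    y1 = GRID_TOP + row * BOX_SIZE
    return (x1, y1, x1 + BOX_SIZE, y1 + BOX_SIZE)


def get_box(image, row, col):
    """Crop one box out of a Pillow image, trimming the grid lines."""
    x1, y1, x2, y2 = get_coords(row, col)
    return image.crop((x1 + MARGIN, y1 + MARGIN, x2 - MARGIN, y2 - MARGIN))
```

Trimming a small margin keeps the grid lines out of the crop, which might otherwise be read as stray characters.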

Since Sudoku has a 9x9 grid, I use two for loops from 0 to 8 to loop over each box. pytesseract wasn't accurate enough with its default configuration, so I had to pass the config --psm 10 --oem 0.

  • The --psm argument defines the Page Segmentation Mode. 10 stands for Treat the image as a single character. This seemed most appropriate since I am passing cropped images of each box.
  • The --oem argument defines the OCR Engine Mode. 0 stands for Legacy Engine Only.

The following function will extract the numbers from the passed image and return a 9x9 2D List with the numbers.

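import pytesseract
from progress.bar import Bar    # Progress bar; assumed to come from the "progress" package

def get_grid_from_image(image):
    grid = []
    bar = Bar("Processing: ", max=81)
    for i in range(9):
        row = []
        for j in range(9):
            # OCR one box; strip() removes any trailing whitespace in the result
            digit = pytesseract.image_to_string(
                get_box(image, i, j), config='--psm 10 --oem 0').strip()
            if digit.isdigit():     # If pytesseract returned a digit
                row.append(int(digit))
            else:
                row.append(0)       # Blank box or unrecognised text
            bar.next()
        grid.append(row)
    bar.finish()
    return grid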
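
OCR output for a single box can include trailing whitespace or stray characters, which would make a plain isdigit() check fail or let through values that are not valid Sudoku digits. A minimal sketch of a stricter conversion follows; parse_cell is a hypothetical helper, not part of the original script.

```python
def parse_cell(raw):
    """Convert raw OCR text for one box into an int, using 0 for blanks."""
    text = raw.strip()          # drop newlines, spaces, and form feeds
    if len(text) == 1 and text in "123456789":
        return int(text)
    return 0                    # empty box or unrecognised text
```

It could be used in place of the isdigit() branch, for example row.append(parse_cell(digit)).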

4. Solving the Sudoku Game

Now that we have the 9x9 Sudoku grid, we need to solve it. Solving Sudoku is a topic that has been covered a lot, so I simply copied the solver code from geeksforgeeks.org.

Here's the GeeksforGeeks article on Sudoku
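
For readers who want something self-contained, here is a minimal backtracking sketch. It is not the GeeksforGeeks code; the name solve_sudoku is chosen only to match the call in the final script, and is_valid is a hypothetical helper. It assumes the grid is a 9x9 list of lists with 0 for blanks, and it fills the grid in place.

```python
def is_valid(grid, row, col, digit):
    """Check whether digit can be placed at (row, col) without a clash."""
    if digit in grid[row]:
        return False
    if any(grid[r][col] == digit for r in range(9)):
        return False
    top, left = 3 * (row // 3), 3 * (col // 3)
    for r in range(top, top + 3):
        for c in range(left, left + 3):
            if grid[r][c] == digit:
                return False
    return True


def solve_sudoku(grid):
    """Fill the 0 cells of a 9x9 grid in place; return True if solved."""
    for row in range(9):
        for col in range(9):
            if grid[row][col] != 0:
                continue
            for digit in range(1, 10):
                if is_valid(grid, row, col, digit):
                    grid[row][col] = digit
                    if solve_sudoku(grid):
                        return True
                    grid[row][col] = 0      # undo and try the next digit
            return False                    # no digit fits this cell
    return True                             # no empty cell left
```

If the OCR step misreads a digit, the puzzle may become unsolvable, in which case this sketch returns False and leaves the grid as it was.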

5. Sending the solved input to your Android Device using Python

To send the input, I first filtered the solved Sudoku grid so that only the values that were missing in the original game are sent. I used the get_coords function from earlier to get the coordinates of each box and then calculated its centre. I sent a touch at that centre using ADB to select the box, and then sent the digit for it.

def automate_game(org_grid, solved_grid):
    for i in range(9):
        for j in range(9):
            if org_grid[i][j] == 0:     # If the box was blank in the game
                x1, y1, x2, y2 = get_coords(i, j)
                center = (x1 + (x2 - x1)/2, y1 + (y2-y1)/2)     # Calculating the center of the box (to select it)
                solution = solved_grid[i][j]
                device.shell(
                    f'input touchscreen swipe {center[0]} {center[1]} {center[0]} {center[1]} 5')
                device.shell(f'input text {solution}')

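To check what would be sent before touching the device, the same logic can be split into a step that only builds the shell commands. This is a sketch under assumptions: build_commands is a hypothetical helper, it takes get_coords as a parameter so it can be tried without a device, and it uses `input tap` with integer coordinates instead of the zero-length swipe above, which assumes the game registers a plain tap.

```python
def build_commands(org_grid, solved_grid, get_coords):
    """Return the adb shell commands needed to fill every blank box."""
    commands = []
    for i in range(9):
        for j in range(9):
            if org_grid[i][j] != 0:
                continue                      # box was already filled in
            x1, y1, x2, y2 = get_coords(i, j)
            cx, cy = (x1 + x2) // 2, (y1 + y2) // 2   # integer centre
            commands.append(f"input tap {cx} {cy}")
            commands.append(f"input text {solved_grid[i][j]}")
    return commands
```

Each returned string can then be passed to device.shell() in order, or simply printed as a dry run.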

Running the code

All the code that I wrote is in functions and they are called one by one. Note that the grid that I get in step 3 isn't passed directly to step 4. I use deepcopy to create a copy of it, so that I can compare the solved grid with the unsolved/original one in step 5.

if __name__ == "__main__":
    # Connect the device using ADB
    device = adb.connect_device()
    # Take Screenshot of the screen and save it in screen.png
    adb.take_screenshot(device)
    image = Image.open('screen.png')
    image = process_image(image)        # Process the image for OCR
    org_grid = get_grid_from_image(image)      # Convert the Image to 2D list using OCR / Pytesseract
    solved_grid = deepcopy(org_grid)        # Deepcopy is used to prevent the function from modifying the original sudoku game
    solve_sudoku(solved_grid)
    automate_game(org_grid, solved_grid)        # Input the solved game into your device

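The deepcopy step matters because the grid is a list of lists: a shallow copy would share its rows with the original, so solving the copy would also fill in the original and leave nothing to compare against. A small illustration with a 2x2 stand-in grid:

```python
from copy import copy, deepcopy

grid = [[5, 0], [0, 3]]          # tiny stand-in for the 9x9 grid

shallow = copy(grid)             # new outer list, same inner row lists
deep = deepcopy(grid)            # new outer list and new row lists

shallow[0][1] = 9                # also changes grid[0][1]
deep[1][0] = 7                   # leaves grid untouched

print(grid)                      # [[5, 9], [0, 3]]
```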
