Improve your workflow using Automation in Python

Hi developers, I am Yash Makan, and today we are going to talk about automation. This is a complete guide to automating tasks on different platforms: mobile, PC and the web. Whether you are a beginner or an expert in automation, you'll surely learn something.

Table of Contents

  • Introduction
  • What is Automation?
  • Types of Automation in Python
    • Software Automation
      • PC
      • Web
      • Mobile
    • Hardware Automation
      • Raspberry Pi
  • Template Matching
    • What is Template Matching?
    • How does it work and how to achieve template matching in Python?
  • Automating PC 💻
    • Tools & Libraries
      • PyAutoGUI
      • OS & Subprocess
      • File Management & shutil
      • Template Matching
    • .pyw to run files in the background
  • Automating Web 🕸
    • Tools & Libraries
      • requests
      • beautifulsoup
      • selenium
      • asyncio & concurrent.futures
  • Automating Mobile 📱
    • Tools
      • ADB
    • How to automate android phones using ADB?
    • What can you automate?
  • Conclusion
  • Contact Me
    • Website
    • Twitter

Introduction

Humans can perform a lot of things that computers aren't very adept at yet. Exercising rigorous critical and contextual judgement, for example, or practising empathy. Computers, on the other hand, are fantastic at accomplishing things that we either can't or wouldn't want to do because of their consistency, accuracy, and rapid-fire speed. As an example, consider the performance of repeated, time-consuming tasks. We like to be challenged and rewarded for doing complex things that allow us to grow and improve in our skill sets. In general, we don't like plugging the same numbers into a spreadsheet over and over again. This is one of the best examples of a profitable co-working relationship between humans and computers. Using a programming language such as Python, we can automate the completion of repetitive operations in an efficient and effective manner.

What is Automation?

You might already have some idea of what automation means, but it's always good to brush up on your knowledge with a more precise definition.

Automation: A process in which a manually performed action is transformed into one that happens automatically.

In our case, a program completes the task on its own, without the need for user participation.

Types of Automation in Python

There are two main types of automation programs:

  1. Software Automation
    1. PC - Automation scripts on the PC are very useful, and many need nothing beyond Python's standard library. You can automate file management and even turn your PC into a mini Jarvis of your own, creating shortcuts and commands that cut down the time spent on repetitive daily tasks. Some examples of automation on a PC include
      1. File Creation: Filling out PDFs and Excel files
      2. File Conversion: Converting image files
      3. Performing quick math equations
      4. Game Automation: Python can play games like GTA or Watch Dogs for you
      5. Bulk file renaming and sorting etc.
    2. Web - You can automate the web using libraries like requests, Beautiful Soup🥣, Selenium and many more. Web automation is a must-know skill, as it comes in handy in almost any programming job. Here are some ideas for web automation projects
      1. Email Automation: make a bot that reads your emails, sends replies while you are at work, or deletes spam.
      2. Bot Automation: create a bot for Twitter or Instagram that likes your friends' posts or even replies to them. You can also create a Discord bot for your server.
      3. Trading: you can even create a trading program that trades stocks, forex or even crypto for you. Isn't that cool!
      4. There's much more you can do, like news, weather or lyrics scraper programs.
    3. Mobile - Many of you may not know this, but you can even automate your mobile phone using Python. Here I am talking specifically about Android phones, not iPhones. You can use tools like ADB (Android Debug Bridge) to control your Android phone from Python. You can automate things like
      1. Game Automation: Python playing Temple Run, Subway Surfers or even PUBG or COD. You name it😮.
      2. Sending WhatsApp messages automatically
      3. SMS Automation: Send good morning messages to your relatives or send birthday wishes on time without having to remember. 🐍 I got you covered bro!
      4. Anything else you can imagine: any gesture, like a swipe or a tap, can be performed with Python and ADB.
  2. Hardware Automation
    1. Raspberry Pi - This is a little different. The Raspberry Pi is a small computer to which you can connect servos and smart devices like lights or a TV, and control them using gestures or even voice. Some cool project ideas with a Raspberry Pi are
      1. IoT Home Automation: Control lights or a TV using voice commands, gestures etc.
      2. Surveillance Camera with face recognition to open doors
      3. Weather station to detect rain or sunny weather. You can even predict the weather with models such as LSTM or KNN.

I am not writing about hardware automation in this article but if you are interested then do let me know in the discussion below... 🙂

Template Matching

It is important to discuss template matching before explaining automation on different devices, because it solves many problems that commonly come up when developing automated tasks. It is mainly used when automating mobile and PC. So let's first learn what template matching is.

What is Template Matching?

Template matching: a technique in digital image processing for finding small parts of an image which match a template image. It can be used in manufacturing as a part of quality control, a way to navigate a mobile robot, or as a way to detect edges in images.
source: Wikipedia

If the definition above is confusing, don't worry, I am going to explain it in depth. Let's take an example.

Say I give you the task of writing a Python script that opens Notepad and types a predefined text.

Suppose the Notepad icon sits on your desktop at (100, 130), so you write a program that clicks on (100, 130) and then types the text.

Do you think this approach is correct? What if you later move the Notepad icon to (500, 100)? Will the program still work?

The answer is no. We need a universal way to open the application from the desktop, so that even if the user moves the icon, the program still detects its location automatically and clicks it. The technique that lets the computer detect and recognize the Notepad icon on its own is template matching.

How does it work and how to achieve template matching in Python?

Let's understand this without code first.

There are two images involved: sample_image, a screenshot of the screen, and target_image, a small picture of the thing we are looking for (here, the Notepad icon). We slide target_image over sample_image and compare the two pixel by pixel at every position. The position where they match best gives the (x, y) of the target_image, i.e. the Notepad icon, and we then click on that location. This is how template matching works. Let's also see the Python code that does what we just discussed.

import cv2 # pip install opencv-python

def get_location(target_image, sample_image):
    # read screenshot image
    img_rgb = cv2.imread(sample_image)
    # convert BGR to grayscale image
    img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
    # read target image
    template = cv2.imread(target_image, 0)
    # extracting height and width of target image (shape is rows, columns)
    height, width = template.shape
    # template match using cv2
    res = cv2.matchTemplate(img_gray, template, cv2.TM_SQDIFF)
    # with TM_SQDIFF the best match is the minimum of res, so take the min location as top_left
    _, _, top_left, _ = cv2.minMaxLoc(res)
    # calculating the bottom right position of target image
    bottom_right = (top_left[0] + width, top_left[1] + height)
    # calculating the x, y position of the centre of the match
    return (top_left[0]+bottom_right[0])//2, (top_left[1]+bottom_right[1])//2

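import pyautogui # pip install pyautogui

# save a screenshot of the current screen as the sample image
pyautogui.screenshot("sample_image.png")
# cv2.imread expects file paths, so pass the paths rather than the file contents
x, y = get_location("target_image.png", "sample_image.png")
pyautogui.click(x, y)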
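
To make the pixel-by-pixel search concrete, here is a small dependency-free sketch of the idea behind cv2.matchTemplate with TM_SQDIFF: slide the target over the sample, sum the squared pixel differences at each position, and keep the position with the smallest sum. The function name find_template and the tiny nested-list "images" are made up for illustration; real code should use OpenCV, which is far faster.

```python
def find_template(sample, target):
    """Return ((x, y), score) of the best match of target inside sample.

    Both images are lists of rows of grayscale values. The score is the
    sum of squared differences, so 0 means a pixel-perfect match.
    """
    sample_h, sample_w = len(sample), len(sample[0])
    target_h, target_w = len(target), len(target[0])
    best_pos, best_score = None, None
    # try every top-left position where the target fits inside the sample
    for y in range(sample_h - target_h + 1):
        for x in range(sample_w - target_w + 1):
            score = sum(
                (sample[y + dy][x + dx] - target[dy][dx]) ** 2
                for dy in range(target_h)
                for dx in range(target_w)
            )
            if best_score is None or score < best_score:
                best_pos, best_score = (x, y), score
    return best_pos, best_score


# a 4x5 "screenshot" with a 2x2 "icon" whose top-left corner is at x=2, y=1
sample = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 9, 0],
    [0, 0, 0, 0, 0],
]
target = [
    [9, 8],
    [7, 9],
]

(x, y), score = find_template(sample, target)
# centre of the matched region, the point you would click
center = (x + len(target[0]) // 2, y + len(target) // 2)
print((x, y), score, center)
```

Note that the smallest score is returned even when nothing on screen really matches, so comparing the score against a threshold is a reasonable way to decide that the icon is not visible at all.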

That's it! Now you know what template matching is, how it works and how to code it in Python. Let's continue...

Automating PC 💻

Alright, now let's discuss automating things on your PC. The tools and libraries that can be used for automating tasks on a PC are

  • PyAutoGUI: PyAutoGUI is an amazing library created by Al Sweigart (asweigart on GitHub). It has many features, such as clicking, taking screenshots, sending keystrokes and typing text. This tool is a must-have if you are developing an automation script for a PC. It is compatible with Windows, macOS and Linux.

    pip install pyautogui

  • os & subprocess: These two libraries are essential if you want to run OS commands, capture their output, call external CLI tools from your program, or create and delete files and folders. You don't have to install them; they ship with Python.

  • File management & shutil: If you are planning to automate file management, the built-in open() function is very important. It lets you read a file as text or bytes and write or append data to it. You can also use shutil, which offers a number of high-level operations on files and collections of files, in particular functions for copying, moving and removing them.

  • Template Matching
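
As a rough illustration of the os and subprocess bullet above, the sketch below runs an external command and captures its output, then creates and deletes a folder and a file. It uses the Python interpreter itself as the external command so that it runs on any machine; the folder and file names are arbitrary.

```python
import os
import subprocess
import sys
import tempfile

# run an external command and capture what it prints
result = subprocess.run(
    [sys.executable, "-c", "print('hello from a child process')"],
    capture_output=True,
    text=True,
    check=True,  # raise CalledProcessError if the command fails
)
output = result.stdout.strip()
print(output)

# create a folder and a file inside a throwaway directory
base = tempfile.mkdtemp()
folder = os.path.join(base, "reports")
os.makedirs(folder, exist_ok=True)
file_path = os.path.join(folder, "notes.txt")
with open(file_path, "w", encoding="utf-8") as f:
    f.write("created by a script\n")
created = os.path.exists(file_path)

# delete them again
os.remove(file_path)  # delete the file
os.rmdir(folder)      # delete the now-empty folder
os.rmdir(base)
removed = not os.path.exists(folder)
print(created, removed)
```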

Fun Fact: On Windows, a script saved with the .pyw extension runs without opening a console window, so it can work quietly in the background. If you want to make something like a file sorter for your downloads folder, you can put the code in a .pyw file. You can also add this .pyw file to your machine's startup programs so that you don't have to run it by hand again and again.
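
Here is a minimal sketch of such a file sorter, assuming you want one subfolder per file extension. The function name sort_folder and the "other" category are my own choices for illustration. The demo runs on a temporary folder so that no real files are touched; to use it for real you would pass the path of your downloads folder instead.

```python
import shutil
import tempfile
from pathlib import Path


def sort_folder(folder):
    """Move every file in folder into a subfolder named after its extension."""
    folder = Path(folder)
    moved = {}
    for item in list(folder.iterdir()):
        if not item.is_file():
            continue  # leave existing subfolders alone
        # "photo.JPG" goes to "jpg"; files without an extension go to "other"
        category = item.suffix.lower().lstrip(".") or "other"
        destination = folder / category
        destination.mkdir(exist_ok=True)
        shutil.move(str(item), str(destination / item.name))
        moved[item.name] = category
    return moved


# demo on a throwaway folder that is deleted again afterwards
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp)
    for name in ["photo.JPG", "report.pdf", "README"]:
        (demo / name).write_text("demo")
    moved = sort_folder(demo)
    # relative paths of all files after sorting
    sorted_paths = sorted(
        p.relative_to(demo).as_posix() for p in demo.rglob("*") if p.is_file()
    )

print(moved)
print(sorted_paths)
```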

Automating Web 🕸

Knowing how to automate tasks on the web is very important. It helps whenever you want to avoid a repetitive task or scrape data from a website. The possibilities are endless. Some tools and libraries for web automation are

  • requests: The requests library sends HTTP requests using methods like GET, POST and PUT. You can use it to fetch HTML, JSON or XML data from any URL. It is a very basic package, and you will end up using it in one way or another. Note that it is not part of Python's standard library; you have to install it using

    pip install requests

  • BeautifulSoup: This library is also very important and lets you pull data out of HTML and XML files. It parses the document and can find any tag by name, id, class or CSS selector (it does not support XPath). It can be installed using

    pip install beautifulsoup4

  • Selenium: This library is very powerful. It lets you use the web like a real user. Its features include typing in input fields, clicking buttons, switching tabs, taking screenshots, downloading files, executing JavaScript and much more. It is used for slightly more complex automation projects. Selenium drives the browser through a WebDriver and can run on different browsers like Chrome, Firefox and Safari. Selenium can also run headless, meaning the browser window is hidden from the user. You can install selenium using

    pip install selenium

  • webbot: This library is built on top of selenium. Selenium can feel a little complex because you have to work with id, class, CSS and XPath selectors; webbot, by contrast, is extremely easy to use. You can install webbot using

    pip install webbot

  • webdriver: As mentioned earlier, webbot and selenium depend on a WebDriver for Chrome, Firefox, Safari or whichever browser you use.
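
To show how requests and BeautifulSoup fit together, here is a small sketch. The HTML snippet, the "post" class and the function names fetch_html and extract_links are invented for this example; a real site will need its own selectors. The parsing part runs on the inline snippet, so nothing is downloaded unless you call fetch_html yourself.

```python
import requests  # pip install requests
from bs4 import BeautifulSoup  # pip install beautifulsoup4


def fetch_html(url):
    """Download a page and return its HTML, raising on HTTP errors."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def extract_links(html):
    """Return (text, href) pairs for every link inside elements with class 'post'."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        (a.get_text(strip=True), a["href"])
        for a in soup.select(".post a[href]")  # CSS selector
    ]


SAMPLE_HTML = """
<html><body>
  <h1 id="title">My blog</h1>
  <div class="post"><a href="/automation">Automation in Python</a></div>
  <div class="post"><a href="/adb">Controlling Android with ADB</a></div>
  <a href="/about">About</a>
</body></html>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
title = soup.find(id="title").get_text()  # find a tag by its id
links = extract_links(SAMPLE_HTML)
print(title)
print(links)
# for a live page: extract_links(fetch_html("https://example.com"))
```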

Automating Mobile 📱

I think most of you don't know this, but you can also automate tasks on your mobile phone using Python. You don't need a special Python library for this, but you do have to install ADB.

ADB: ADB (Android Debug Bridge) is a command-line tool that lets you communicate with Android devices.

How to automate android phones using ADB?

To automate Android phones with Python you first have to install ADB, which ships with the Android SDK Platform-Tools. After installation, we can start writing code in Python.

Let's first import the dependencies. As mentioned earlier, we import subprocess to call the ADB command-line tool.

import cv2, subprocess

Then we define a function called adb, to which the user passes the command; the function returns the output from the terminal.

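def adb(command):
    # pass the command as one string: with shell=True a list misbehaves outside Windows,
    # and the shell is needed for redirections such as "> file.png"
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True)
    (out, _) = proc.communicate()
    return out.decode('utf-8')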
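
Before sending commands it helps to check that a phone is actually connected and authorized. The sketch below is one possible way to do that, not part of the gist: adb_args and run_adb are hypothetical helpers that pass the arguments as a list (so no shell quoting is involved, but also no "> file" redirection), and parse_devices reads the text that `adb devices` prints. The serial numbers in the sample output are made up.

```python
import subprocess


def adb_args(args, serial=None):
    """Build the full argument list for an adb call, optionally for one device."""
    prefix = ["adb"] if serial is None else ["adb", "-s", serial]
    return prefix + [str(a) for a in args]


def run_adb(args, serial=None):
    """Run adb without a shell and return its standard output as text."""
    result = subprocess.run(
        adb_args(args, serial), capture_output=True, text=True, check=True
    )
    return result.stdout


def parse_devices(output):
    """Turn the text printed by `adb devices` into a {serial: state} dict."""
    devices = {}
    for line in output.splitlines():
        # device lines look like "<serial>\t<state>"; the header has no tab
        if "\t" in line:
            serial, state = line.split("\t", 1)
            devices[serial.strip()] = state.strip()
    return devices


# sample text in the format `adb devices` prints; the serials are invented
sample_output = (
    "List of devices attached\n"
    "emulator-5554\tdevice\n"
    "R58M12ABCDE\tunauthorized\n"
    "\n"
)
devices = parse_devices(sample_output)
# only devices in the "device" state accept commands
ready = [serial for serial, state in devices.items() if state == "device"]
print(ready)
# with a phone attached: parse_devices(run_adb(["devices"]))
```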

What can you automate?

I have already created a gist for you if you want to access the ADB code. You can find the gist here

  1. Swipe Gesture

    def swipe(start_x, start_y, end_x, end_y, duration_ms):
        adb("adb shell input swipe {} {} {} {} {}".format(start_x, start_y, end_x, end_y, duration_ms))
    
  2. Clicking

    def click(tap_x, tap_y):
        adb("adb shell input tap {} {}".format(tap_x, tap_y))
    
  3. Take Screenshots

    def take_screenshot(final):
        adb(f"adb exec-out screencap -p > ./images/{final}.png")
    
  4. Send SMS

    def send_sms(number, body):
        adb(f'adb shell am start -a android.intent.action.SENDTO -d sms:{number} --es sms_body "{body}" --ez exit_on_sent true')
    
  5. Calling

    def call(number):
        adb(f"adb shell am start -a android.intent.action.CALL -d tel:{number}")
    
  6. Download File from phone to PC

    def download(path, output_path):
        adb(f"adb pull {path} {output_path}") #/sdcard/xxx.mp4
    
  7. Remove File

    def remove(path):
        adb(f"adb shell rm {path}") #/sdcard/...
    
  8. ScreenRecord

    def screen_record(name, time):
        adb(f"adb shell screenrecord /sdcard/{name} --time-limit {time}")
        download(f"/sdcard/{name}",f"./mobile/{name}")
        remove(f"/sdcard/{name}")
    
  9. Power On/Off

    def switch_phone_on_off():
        adb("adb shell input keyevent 26")
    
  10. Keyevents (list of keyevent codes here)

    def keyevent(key):
        adb(f"adb shell input keyevent {key}")
    
  11. Send whatsapp message

    import time

    def send_whatsapp_message(phone, message):
        # open the chat with the given phone number in WhatsApp
        adb(f'adb shell am start -a android.intent.action.VIEW -d "https://api.whatsapp.com/send?phone={phone}"')
        time.sleep(1)  # give the chat a moment to open
        # "input text" expects spaces to be written as %s
        text = message.replace(" ", "%s")
        adb(f'adb shell input text "{text}"')
        # move focus right four times (key code 22), then press Enter (66)
        for _ in range(4):
            adb('adb shell input keyevent 22')
        adb('adb shell input keyevent 66')
    
  12. Template Matching: You can use template matching to find an icon's position in a screenshot and then click it. If you don't know template matching, I've explained it earlier in this blog.
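
The bare numbers passed to keyevent are easier to read with names. As a small sketch, the enum below lists a handful of codes with the values defined in Android's KeyEvent class, and keyevent_command (a name I made up) builds the same command string that the adb() helper above expects.

```python
from enum import IntEnum


class KeyCode(IntEnum):
    """A few Android key codes, using the values from android.view.KeyEvent."""
    HOME = 3
    BACK = 4
    DPAD_RIGHT = 22
    VOLUME_UP = 24
    VOLUME_DOWN = 25
    POWER = 26
    ENTER = 66


def keyevent_command(key):
    """Build the adb command string for a KeyCode member or a plain int."""
    return f"adb shell input keyevent {int(key)}"


power_command = keyevent_command(KeyCode.POWER)
enter_command = keyevent_command(66)
print(power_command)
print(enter_command)
# with the adb() helper defined earlier: adb(keyevent_command(KeyCode.HOME))
```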

Conclusion

I would like to conclude this article by saying that the possibilities are endless. You can automate almost anything you can imagine, whether it's on your PC, your mobile or the web. I hope you liked my article. If you did, make sure to hit the ❤ icon below, and don't forget to 🔖 bookmark it. If you are having trouble with anything, let me know in the discussion below, and if you have a specific topic in mind for my next blog, let me know as well. I will add it to my bucket list 📒. You can also follow me on Twitter to get notified about my blogs. With that being said, I hope to see you again. Till then, b-bye!

Contact Me

Twitter: @Yash_Makan
