“Fall in love with the process and the results will follow.” - Bradley Whitford
From here on…
The last post in this blog column was an overview of why we would opt for AI over the traditional programming process. It also dropped hints on some common terms in this AI arena.
To jog our memories, the two main reasons for building an AI model (Guru99, G2) are:
to reduce or avoid repetitive tasks - “build an intelligent application from scratch”
to enhance an existing monotonous process - “add machine learning or deep learning to pre-existing software application”
Here is an outline of what we will be covering in the coming weeks.
With an AI model, a “machine acts as a blueprint of the human mind, by being able to understand, analyze, and learn from data through specially designed algorithms.” (IOTforAll)
This process can be achieved by:
Machine Learning (ML), where structured data is supplied
E.g. Google Maps is regularly used by all of us in our day-to-day lives.
An essential upgrade from the previous GPS systems was learning from traffic data and providing the commuter with the least congested route.
Deep Learning (DL), which uses a sample representation of unstructured data
This step is usually taken after an initial AI model built with machine learning has been vetted.
Natural language processing (NLP), speech recognition and computer vision are more specific applications that result from DL.
Machines and humans read and interpret each other's language through NLP.
E.g. Virtual chatbots are a very common application of this model.
Repetitive tasks such as answering FAQs or recording a client’s name and address to process an order
Implementing speech recognition enables consumers such as yourself to interact with virtual/digital assistants.
E.g. Siri on iPhones or Alexa from Amazon
In computer vision models, the machine marks a face using x and y coordinates with pre-determined width and height parameters. It then identifies facial features such as eyes, nose, etc., labeling them as “landmarks”. (IOTforAll)
E.g. Facebook detects and tags your face in the photos you upload to their site.
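For the curious, here is a rough sketch of that bounding-box idea in code. It assumes the OpenCV library and a placeholder image file called “group_photo.jpg” (neither of which is prescribed by the sources above); landmark detection would be a further step on top of this.

```python
# A minimal face-detection sketch using OpenCV's pre-trained Haar cascade.
# "group_photo.jpg" is just a placeholder file name for illustration.
import cv2

image = cv2.imread("group_photo.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Load a pre-trained frontal-face detector that ships with OpenCV.
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

# Each detection comes back as (x, y, width, height) -- the same
# bounding-box idea described above.
faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imwrite("group_photo_faces.jpg", image)
print(f"Found {len(faces)} face(s)")
```

The pre-built platforms mentioned later wrap this same idea, plus far more sophisticated models, behind a friendlier interface.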
Data can be structured or unstructured.
Structured data
E.g. photographs you have labeled with your name or your friend’s name, depending on whose face appears in them.
Unstructured data
E.g. these same photographs simply named “DSC001”, “DSC002”, etc.
Typically, unstructured data is helpful for use cases where you require clustering analysis. (See the next section for more information on this concept.)
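To make the clustering idea a little more concrete, here is a tiny sketch assuming the scikit-learn library, with made-up numbers standing in for whatever features you might extract from those unlabeled photos.

```python
# A minimal clustering sketch with scikit-learn (an assumed library choice).
# The "features" below are invented numbers standing in for whatever numeric
# representation you would extract from the unlabeled photos.
import numpy as np
from sklearn.cluster import KMeans

# Pretend each row is a feature vector for one photo: DSC001, DSC002, ...
photo_features = np.array([
    [0.9, 0.1],   # DSC001
    [0.8, 0.2],   # DSC002
    [0.1, 0.9],   # DSC003
    [0.2, 0.8],   # DSC004
])

# Ask for two groups; the algorithm decides which photos belong together
# without ever being told whose face is in them.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(photo_features)
print(kmeans.labels_)  # e.g. [0 0 1 1] -- photos grouped purely by similarity
```

No labels were needed here: the grouping emerges from the data itself, which is exactly what makes unstructured data workable.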
Our options
The AI software you choose can be in the form of “algorithms, libraries, or frameworks of code, or developer kits libraries” (G2). As per SoftwareTestingHelp, these broadly fall under four types of AI software:
AI platforms - these are drag-and-drop applications with built-in algorithms and a code framework.
Chatbots - these specifically use NLP for conversations between machine and human. The answers they provide become more accurate with more interactions. (A toy sketch follows after this list.)
DL - these algorithms are built on artificial neural networks (mimicking the human brain’s neural connections). The advantage is that training by humans is not always required.
ML - a wide range of libraries is available to facilitate these algorithms.
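As promised above, here is a toy sketch of the question-in/answer-out loop a chatbot automates. It uses plain keyword matching rather than real NLP, and the FAQ entries are invented purely for illustration.

```python
# A toy FAQ "chatbot": simple keyword matching, not real NLP, but it shows
# the basic question-in / answer-out loop that a chatbot automates.
FAQ = {
    "opening hours": "We are open 9am-5pm, Monday to Friday.",
    "shipping": "Standard shipping takes 3-5 business days.",
    "returns": "You can return any item within 30 days of purchase.",
}

def answer(question: str) -> str:
    """Return the first FAQ reply whose keyword appears in the question."""
    question = question.lower()
    for keyword, reply in FAQ.items():
        if keyword in question:
            return reply
    return "Sorry, I don't know that one yet. A human will get back to you."

print(answer("What are your opening hours?"))
print(answer("How long does shipping take?"))
```

A real chatbot replaces the keyword lookup with NLP models, which is why its answers can keep improving as it sees more conversations.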
In the table below, you will find some options for getting started with building an AI model, as far as AI platforms and ML software go. These have been further classified into pre-built models and customizable models. You could acquire a basic framework from various open sources and tweak the model as per your needs, or start fresh all by yourself. The table also lists the types of data (i.e. text, video/image or audio) these options can analyze.
Additional Notes:
You use images to train.
You use audio files to train.
You use written text files to train.
On a side note, if you are feeling adventurous, you can build models yourself using languages like Python, Java and R. TensorFlow is the recommended library for processing various types of data.
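If you do take that route, the skeleton of a TensorFlow model is surprisingly short. Below is a minimal sketch trained on made-up numbers (not a real dataset) just to show the define-compile-train-predict rhythm; the layer sizes and epoch count are arbitrary choices for illustration.

```python
# A minimal TensorFlow/Keras sketch: a tiny model that learns whether a
# number is above 0.5. The training data is generated purely for illustration.
import numpy as np
import tensorflow as tf

# Toy training data: one feature in, one binary label out.
x_train = np.random.rand(1000, 1)
y_train = (x_train > 0.5).astype("float32")

# Define a small network: one hidden layer, one output probability.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Train briefly, then predict on two unseen values.
model.fit(x_train, y_train, epochs=5, verbose=0)
print(model.predict(np.array([[0.2], [0.9]])))  # low vs. high probability
```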
In the coming weeks, we will be building models for an autocorrect/spellcheck function, detecting an object or person in an image, and detecting sound in an audio clip.
Which options would you choose?
Fun Note:
“Machine learning is the art of study of algorithms that learn from examples and experiences.” (Guru99)
Every time I read this definition, I feel fascinated observing my newborn. This is real-world training of her brain. I call out the names, colours or action words associated with the objects she picks up. For example, I say, “This is a blue cup. We turn the lid of the blue cup to open. There is water inside the blue cup. Put your lips on here and slurp from the blue cup.” Her data set includes my dialogue and the image of a blue cup - a combination of image and sound training.