Analytics India Magazine has published an article on ‘10 Emerging Analytics Startups in India to watch out in 2020’.

10 Emerging Analytics Startups in India to watch out in 2020

We are happy to announce that testAIng.com, a company founded in 2019, has found a place in this esteemed list.
We are all the more delighted because our Convener, Vipul Kocher, is the CEO of testAIng.com.

testAIng.com helps customers by testing AI systems using tools – its own as well as third parties’ – to support the testing processes, including automation. testAIng.com has brought together top thought leaders in AI and in the testing space. Its patent-pending frameworks for AI testing are another key differentiating factor.

Congratulations to testAIng.com and to every member associated with it. Visit www.testaing.com to know more about the company.

Testing Artificial Intelligence (AI) Systems

Artificial Intelligence (AI) systems fall within the domain of scientific software. Scientific software helps in decision making, critical thinking, and cognitive analysis. Testing scientific software is tough in general, and so is testing AI systems. In fact, the scope of testing AI systems goes beyond functional and non-functional areas. This article discusses the approach to testing AI, along with some myths related to AI.

Why Test AI?

AI can also go wrong; it can fail! (See the tai Newsletter, June 2019.) It is time to break the prevailing myths around AI. Some of the leading myths about AI that Gartner [6] addresses are:

Myth #1: AI works in the same way as the human brain

AI has not yet become as mature as the human brain. The category of problems solved by AI systems is narrow: they are trained to perform a specific task and assist humans in solving pressing problems. Deep learning is a form of machine learning that emulates aspects of the human brain, but it has not yet reached the brain’s unique capability.

Myth #2: Intelligent machines learn on their own

Can ‘AI’ learn and evolve on its own? The answer is an emphatic ‘No’. Human intervention is always needed to develop AI-based systems and solutions, even when it comes to upgrading or optimizing software. The need for humans will always be there; for example, humans label the data that is used in machine learning. Taking inputs from humans is at the core of AI technology, which also suggests that AI will not simply take away the jobs of humans.

Myth #3: AI is free of bias

Since AI systems learn from the training datasets provided by humans, there will always be an intrinsic bias in AI-based solutions. There is no way to eliminate this bias completely, but diversified datasets and diversity in the development and testing teams help reduce selection and confirmation bias.

Because of these myths, people think there is no need to test AI systems. The truth is that AI systems do need to be tested, but with a different approach.

Why is Testing of AI Systems Different?

Since AI involves training machines to learn, unlearn, and optimize specific tasks during their lifecycle, there is a specific need to train and test them ethically. The development of AI systems differs from the development of traditional software. The differences appear not only in the software development life cycle (SDLC) but also in the software testing life cycle (STLC). ISO 25010 lists many non-functional quality characteristics, such as performance, security, reliability, robustness, and usability. However, AI systems also need to be hammered on the anvil of ‘ethics’ before being deployed in a real-time environment.

1. Ethical Behaviour

Research work of Rik Marselis [2] shows that most quality characteristics do not cover all the relevant aspects of testing AI systems. The extended quality characteristics specifically crucial for AI systems are:

  • Intelligent behaviour, which signifies the ability to learn and comprehend, with transparency of choices.
  • Morality, as related to ethics, privacy, and human friendliness.
  • Personality, which consists of an individual’s distinctive characteristics.

These quality characteristics can be assessed with the use of XAI (eXplainable AI), where the developer/tester has an answer to the question: why did the machine make a specific decision? XAI is a rapidly developing field; to explore more about XAI, stay tuned for our upcoming newsletters.
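
To make the idea concrete, here is a minimal sketch of explainability by design: train an inherently interpretable model and read back the rules behind its predictions. The dataset, model, and depth are purely illustrative assumptions, not a prescribed XAI technique.

```python
# A minimal sketch of "explainability by design": train an interpretable
# model and read back the rules behind each prediction.
# Dataset and model choice are illustrative only.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(iris.data, iris.target)

# The exported rules answer "why did the machine make a specific decision?"
print(export_text(clf, feature_names=list(iris.feature_names)))

# Global view: which inputs mattered most overall.
for name, importance in zip(iris.feature_names, clf.feature_importances_):
    print(f"{name}: {importance:.2f}")
```

For opaque models such as deep networks, post-hoc techniques (surrogate models, perturbation analysis) play a similar role.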

2. The Test Oracle Problem

The oracle problem remains one of the most challenging problems in software testing, and it is exacerbated when testing AI, since the outcomes are non-deterministic. Without test oracle automation, humans must determine whether the outcome of the testing process is correct or incorrect, and very little research has been done in this direction. In machine learning and AI algorithms there is often no oracle without human intervention; for example, image recognition is solved by supervised machine learning (ML) algorithms, which begin with humans labelling the dataset correctly for training and testing. Metamorphic testing is used to mitigate the oracle problem. We will discuss this in detail in our upcoming newsletters.
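
As a small, hedged illustration of the idea (dataset, model, and metamorphic relation are chosen for simplicity, not taken from any specific framework): for a k-nearest-neighbours classifier, scaling every feature of both training and test data by the same positive constant preserves relative distances, so the predictions should not change. The relation acts as a partial oracle even though we never state what the “correct” label is.

```python
# A minimal sketch of metamorphic testing: we have no oracle for the
# "correct" label, but we do know a relation between two runs that must hold.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)
test_idx = rng.choice(len(X), size=20, replace=False)

def predictions(scale: float) -> np.ndarray:
    # Scale every feature of both training and test data by the same constant.
    model = KNeighborsClassifier(n_neighbors=3)
    model.fit(X * scale, y)
    return model.predict(X[test_idx] * scale)

source_output = predictions(scale=1.0)
followup_output = predictions(scale=10.0)

# The metamorphic relation acts as a partial oracle.
assert np.array_equal(source_output, followup_output), "Metamorphic relation violated"
print("Metamorphic relation held for", len(test_idx), "test inputs")
```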

3. Criticality of Input Data

The ‘input data’ with which ML models are trained and tested plays a major role in explaining the systems’ outcomes. Some important points to consider are:

  • Generating corner cases for ML systems is tough and costly.
  • Testing ML systems depends, to quite an extent, on the imagination and creativity of the tester, who must consider every possible boundary-case scenario (a small probing sketch follows this list).
  • Simulating ML systems does not always guarantee a foolproof outcome, as opposed to traditional software.
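
One simple, illustrative way to start probing such boundary behaviour (the model and the probe inputs below are assumptions, not a complete strategy) is to feed feature-wise extremes and out-of-range values to a trained model and check that it still returns well-formed outputs:

```python
# A minimal sketch: probe an ML model with feature-wise extremes and
# out-of-range values and check it still returns well-formed probabilities.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000).fit(X, y)

corner_cases = np.vstack([
    X.min(axis=0),          # all features at their observed minimum
    X.max(axis=0),          # all features at their observed maximum
    X.max(axis=0) * 10,     # far outside the training distribution
    np.zeros(X.shape[1]),   # degenerate all-zero input
])

probabilities = model.predict_proba(corner_cases)
assert np.all(probabilities >= 0) and np.allclose(probabilities.sum(axis=1), 1.0)
print(probabilities.round(3))
```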

References:

Books:

Tom van de Ven, Rik Marselis and Humayun Shaukat. Testing in the Digital Age: AI Makes the Difference. Kleine Uil, Uitgeverij, 2018. ISBN 978-90-75414-87-5.

Whitepaper:

Testing of Artificial Intelligence. https://www.sogeti.com/globalassets/global/downloads/reports/testing-ofartificial-intelligence_sogetireport_11_12_2017-.pdf

Articles:

EuroSTAR 2018 tutorial by Rik Marselis: Testing Intelligent Machines. https://www.slideshare.net/RikMarselis/eurostar-2018tutorial-rik-marselis-testingintelligent-machines

Test your Machine Learning Algorithm with Metamorphic Testing. https://medium.com/trustableai/testing-aiwith-metamorphic-testing61d690001f5c

Testing scientific software: A systematic literature review. https://www.sciencedirect.com/science/article/abs/pii/S0950584914001232

Debunking the key myths around Artificial Intelligence: Gartner. https://content.techgig.com/debunking-the-key-myths-aroundartificial-intelligence-gartner/articleshow/68007651.cms

Sonika Bengani

Righteousness/Dharma (धर्मं) at tai

Machines Can Win Over Humans?

We may think that winning a world league is the pinnacle of human achievement, and that only world champions get there.

But have you ever realized that Artificial Intelligence (AI) programs can also play these games and beat the world champions? There are many games in which AI programs have defeated world champions. For anyone hearing this for the first time, it sounds ridiculous: machines winning over humans!?

Well, machines aided by Artificial Intelligence can do marvels. They can think and apply logic on the fly, much as humans do!

Today, Artificial Intelligence (AI) has entered our mainstream lives. We now see increasingly complex human tasks being performed by powerful AI programs.

One of the strongest proofs of AI programs coming of age and demonstrating human-level ability has been in the domain of games. This is an area where humans have excelled at digesting the potential moves and dynamically deriving winning strategies.

This goes far beyond the conventional automation of routine tasks, where Artificial Intelligence has been demonstrating its capability for a few decades.

Let’s explore some of these complex, human-like tasks demonstrated in the recent past by Artificial Intelligence programs.

  1. Poker

In early 2017, we saw how an Artificial Intelligence program was able to defeat humans at Poker.

Poker is a card game oriented towards gambling, involving skill and an extremely complex strategy. It also involves dynamic human thinking, on-the-fly strategy formation, and flexible optimization to choose the right path among millions of possible paths. The game is very complex and is played by people who bet according to their whims. Two Artificial Intelligence programs have recently proved that they can be better at playing Poker, beating professional human card players at the popular poker game Texas Hold’em. One of these programs is known as DeepStack.

  2. AlphaGo

A similar achievement was demonstrated in yet another complex game, Go, in October 2015.

Go is a highly complex game involving a maze of choices and paths.

Go is played by two people, as in checkers: one player plays with black stones and the other with white stones.

Players of Go use their logic and understanding of the context to place the stones.

AlphaGo is an AI program developed by Google DeepMind (part of Alphabet Inc.) in London. It was used to play the board game Go in October 2015, when it became the first computer Go program to beat a professional Go player.

  3. Jeopardy

Before this, there was the highly publicized game show “Jeopardy!”, on which an IBM AI program beat the show’s champions. In 2011, IBM Watson won over two of the world’s greatest Jeopardy champions: Ken Jennings, who had won 74 consecutive matches, and Brad Rutter, who had won the biggest prize of $3.25 million. These two players were the best of the decade and had won millions of dollars between them. Watson was named after the founder of IBM.

Watson (the then-unknown Jeopardy contestant) represented years of work to contest against these two human champions. This opponent of the two world champions:

  • never smiled, nor had any emotions;
  • was kept in a separate room, with its answers transmitted to the computer in the studio;
  • never spoke; its answers were shown in text format;
  • produced a good amount of noise and so was kept away from humans;
  • had lights on its representative desk (with the software connecting to the server) that turned green for a right answer and orange for a wrong answer.

Watson consisted of ten racks of IBM Power 750 servers. Most of the time, it was Watson who answered correctly, not the world champions.

This is a classic example proving that AI-aided machines can beat human beings.

The complexity of Jeopardy lies in the variety and complexity of the questions that can be generated from an immense base of world knowledge.

Watson’s software was able to digest this volume of world knowledge and convert it into answers to the questions thrown at it, at an extremely fast pace.

Watson stage replica in Jeopardy! contest, Mountain View, California

Interns demonstrating Watson capabilities in Jeopardy! exhibition match, 2011

IBM Watson

The forerunner of these achievements was the remarkable defeat of world chess champion Grandmaster Garry Kasparov by IBM’s AI program Deep Blue. The matches were played between the world champion Kasparov and an IBM supercomputer aided with Artificial Intelligence, called Deep Blue. The first match was played in Philadelphia in 1996 and won by Kasparov. The second was played in 1997 in New York City and won by Deep Blue, the first defeat of a reigning world chess champion by a machine in a match.

With this win by Deep Blue, it seemed symbolic that Artificial Intelligence was about to catch up with human intelligence. Some critics flatly dismissed the power of Artificial Intelligence, pointing out that Deep Blue relied on brute computational force to evaluate millions of positions.

  4. Chinook – Checkers

Even before this win, there had been successful instances of AI programs beating human champions at games. For example, the Chinook software defeated the World Checkers Champion.

English draughts, also known as American 8 x 8 checkers, is a strategy board game for two players, involving diagonal moves of uniform game pieces and mandatory captures by jumping over the opponent’s pieces.

The game demands good logic and strategy to beat the opponent, and several notable advances in game-playing AI started with English draughts. In the 1950s, Arthur Samuel created the first checkers-playing program. Scientists at the University of Alberta later created a program called Chinook, which won a world championship title against human players, and by 2007 the same team had solved the game of checkers completely.

Exploring the above cases, one may wonder whether we can expect AI programs to play physical games like kabaddi, cricket, or kho-kho. To play such games, we homo sapiens use both body and mind, whereas all the examples quoted above involved a software program with no physical body. Well, it is possible to have a robot play instead of a purely software AI program; it does not look out of reach.

It may well be that we will soon be watching a cricket final of Homo sapiens vs. robots!!

About me: I am Neelima Vobugari, founder of Tarah AI. I have more than 15 years of experience in IT, in technologies ranging from mainframes, web technologies, and XML to data science, ML, and AI. In this blog, I will publish articles and videos that help you understand and learn ML and AI. These articles are aimed not just at people who want to build their careers in AI, but also at entrepreneurs and CEOs who wish to use AI in their businesses. By 2020 I want to help at least 100,000 people learn and use ML and AI.

Neelima Vobugari

B.Tech, MBA, Data Science Specialist from Johns Hopkins University

Strategic Lens for Machine Learning

With more digitization in place, the data generated and collected is increasing day by day. Each of us uses social media applications like Facebook, Twitter, LinkedIn, etc. Facebook alone generates 500+ terabytes of data every day, a figure Facebook revealed back in 2012.

Since then, the number of users on these social media sites has roughly tripled, and the number of Internet of Things (IoT) devices used in our day-to-day operations has also grown. To process all this data and extract good business insights from it, we use Machine Learning techniques. Machine Learning gives computers the ability to learn from experience without being explicitly programmed.

Machine Learning is a process in which we explore, examine, and visualize data to gain relevant insights and understanding, and to solve a variety of learning problems such as correlation, prediction, summarization, pattern matching, and fault detection.

There are two ways to classify the algorithms: first by the type of learning, and second by the type of data the algorithms are applied to.

Machine Learning applies different algorithms to data to solve problems; these algorithms help computers learn insights from the data. There are two broad, popular categories of learning:

  • Supervised Learning: applied to problems where a target or output variable is present, that is, the variable to be predicted or classified exists in the data.
  • Unsupervised Learning: here there is no target or output variable. When the algorithm is applied to the data, a pattern is extracted from it, and this pattern helps in understanding the data better. A classic example of unsupervised learning is Market Basket Analysis for retail shopping.

The other way to classify problems is by the type of data: structured or unstructured.

Data can be structured or unstructured.

  • Structured data has a column for each data item and the data is labeled; ideally, it can be represented as a table.

Examples: data from an Excel sheet, an RDBMS, time-series data, transaction data, and other databases.

  • Unstructured data: free-flowing data without an inherent structure, typically in audio, video, text, or image format.

Examples: emails, Facebook feeds, surveillance videos, traffic videos, speeches, images from cameras, etc.

These two dimensions give us a lens for viewing Machine Learning algorithms, which can be seen as a combination of:

  • Supervised on Structured Data
  • Unsupervised on Structured Data
  • Supervised on Unstructured Data
  • Unsupervised on Unstructured Data

Machine Learning Decision Matrix

Let us examine each quadrant.

A) Supervised Learning on Structured Data → Some of the examples in this category are:

Predicting the time taken to reach your office: suppose we have a dataset with many parameters like starting location, target location, vehicle type, brand of the vehicle, age of the rider, gender of the rider, etc. Since this is supervised learning, the output variable, i.e., the time taken to reach the office, is also part of the dataset.
  
Predicting whether it will rain or not, based on the given data. The input dataset could contain variables like humidity, cloudiness, etc.
  
Predicting the price of a stock in the stock market, based on parameters like past price, segment, etc.
  
Predicting whether a student passes an exam or not, based on marks, GRE scores, etc.
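
A minimal sketch of this quadrant, loosely following the commute-time example above (the table is synthetic and the column names are invented for illustration):

```python
# Supervised learning on structured (tabular) data: predict commute time.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 500
df = pd.DataFrame({
    "distance_km": rng.uniform(2, 40, n),
    "vehicle_type": rng.integers(0, 3, n),   # 0=bike, 1=car, 2=bus (encoded)
    "rider_age": rng.integers(18, 65, n),
})
# In supervised learning the output variable is part of the dataset.
df["minutes_to_office"] = df["distance_km"] * 2.2 + df["vehicle_type"] * 8 + rng.normal(0, 5, n)

X_train, X_test, y_train, y_test = train_test_split(
    df.drop(columns="minutes_to_office"), df["minutes_to_office"], random_state=0)

model = RandomForestRegressor(random_state=0).fit(X_train, y_train)
print("R^2 on held-out data:", round(model.score(X_test, y_test), 3))
```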

B) Unsupervised Learning on Structured Data → Some of the examples in this category are:

Market Basket Analysis based on supermarket data.
  
Customer Segmentation based on Structured Sales Data
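
A minimal sketch of this quadrant: customer segmentation with k-means on a small synthetic sales table (the features and data are invented for illustration):

```python
# Unsupervised learning on structured data: k-means customer segmentation.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)
# Columns: annual spend, visits per month (made-up sales data).
customers = np.column_stack([
    rng.gamma(shape=2.0, scale=500.0, size=300),
    rng.poisson(lam=4, size=300),
])

segments = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(
    StandardScaler().fit_transform(customers))

for label in range(3):
    print(f"Segment {label}: {np.sum(segments == label)} customers")
```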

C) Supervised Learning on Unstructured Data → Some of the examples in this category are:

Predicting whether an email is spam or genuine.
  
Classifying images of handwritten digits into numbers.
  
Sentiment analysis using tweets or other text in unstructured format.
  
Predicting cancer based on image recognition, e.g., whether a tumour is malignant or benign.
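
A minimal sketch of this quadrant: spam-versus-genuine classification of short texts (the tiny corpus is made up; a text vectorizer first turns the unstructured text into a structured numeric matrix):

```python
# Supervised learning on unstructured data: simple spam detection.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = [
    "Win a free prize now, click here",
    "Lowest price guaranteed, limited offer",
    "Meeting moved to 3pm, see agenda attached",
    "Please review the test report before release",
]
labels = ["spam", "spam", "genuine", "genuine"]

# TF-IDF converts free text into features; then an ordinary classifier applies.
model = make_pipeline(TfidfVectorizer(), MultinomialNB()).fit(texts, labels)
print(model.predict(["Free offer, click now", "Report attached for review"]))
```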

D) Unsupervised Learning on Unstructured Data → Some of the examples are:

Automatically figuring out whether two images show the same creature.
  
Recommendations for movies based on reviews

The other examples in this quadrant are:

Finding different groups of tweets segregated by topic.

Figuring out trending topics based on Facebook statuses and/or tweets.
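
A minimal sketch of this quadrant: grouping short posts by topic without any labels (the posts are invented examples):

```python
# Unsupervised learning on unstructured data: cluster short posts by topic.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

posts = [
    "Great batting in the cricket final today",
    "What a catch in the last over of the match",
    "New phone camera takes amazing photos",
    "Battery life on this phone is impressive",
]

vectors = TfidfVectorizer(stop_words="english").fit_transform(posts)
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
for post, cluster in zip(posts, clusters):
    print(cluster, "-", post)
```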

So, what we have presented is a simple 2 x 2 lens for viewing Machine Learning.

The above matrix helps business analysts take a systematic approach to analyzing the applicability of Machine Learning to a given business case. Further down the line, it can serve as a feed to data scientists for setting up the appropriate Machine Learning pipeline to solve the business problem, which can range from simple segmentation of data to complex feature detection in videos.

We have termed this matrix the “Machine Learning Decision Matrix”. Explore it and learn!

Neelima Vobugari

B.Tech, MBA, Data Science Specialist from Johns Hopkins University

Why is AI required to have explainability?

In today’s world of machine-learning-dominated Artificial Intelligence applications, there is a renewed push for the agenda of explainability. The triggers for explainability are manifold:
  • The comfort of knowing what you are handing control over to.
  • Knowing precisely how each input component contributes to a decision helps us troubleshoot better when something goes wrong.
  • Compliance and legislative requirements force visibility into models, for traceability of decisions.
  • A better understanding of the causal relationships between inputs and outputs helps in prescribing the right remedy.
  • Deeper insight into the local influence of inputs on specific output cases helps us derive the right test cases for a quality audit of an AI system.
  • A transparent approach lets us do a good deal of sensitivity analysis, perturbing the system and testing the outputs under mutated conditions.
  • Enhanced coverage of all kinds of scenarios, including extreme and corner cases, follows once a good base for explainability is set up.
  • “What-if” analysis for explainability helps in exploring the influence of the full range of input features.
  • Today’s highly successful deep-learning-driven models are extremely opaque, which makes them less amenable to explainability, so suitable perturbation-based, reverse-engineering approaches are vital for deep learning explainability (a small sketch of such perturbation-based analysis appears below).
  • In safety-critical systems such as defence and healthcare, it is vital that diagnoses or decisions powered by machine learning systems be accountable; explainable AI is mandatory for such safety-critical systems.
  • A crucial by-product of incomplete analysis of the inputs to AI systems is biased systems; explainability as part of the lifecycle enables more complete coverage of the AI system.
  • Last but not least, explainability will be the foundation of trust, and this trust will lead to AI systems being applied more widely and in broader use cases.
So we can safely say that explainable AI is the need of the hour, and every earnest attempt should be made to develop a broad-based agenda in the AI/ML/DL community covering the people, process, and technology dimensions of explainable AI.
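
As a concrete, hedged illustration of the perturbation-based analysis referred to above (the dataset and model are illustrative only): permutation importance shuffles one input feature at a time and measures how much the model’s score degrades, giving a simple sensitivity view of which inputs drive the output.

```python
# A minimal sketch of perturbation-style explainability: permutation importance.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)

# Features whose perturbation hurts the score most are the strongest influences.
for idx in result.importances_mean.argsort()[::-1][:5]:
    print(f"{X.columns[idx]}: {result.importances_mean[idx]:.3f}")
```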

Dr. Srinivas Padmanabhuni

Ph.D. in AI from University of Alberta, Edmonton, Canada

Why do we need to let go of our programming instinct in ML-based AI?

In the modern world of software, we are used to the paradigm of software engineering. The discipline of software engineering is predicated upon following a rigid regimen of quality programming, with expert coders and programmers deployed to build systems. Hence, a basic necessity often stressed is that developers need to be highly skilled in programming and coding to develop robust systems.

Is this also true for building robust AI/ML systems?

The answer happens to be NO.

The issue is that industry professionals do not yet fully appreciate how different AI applications are from the usual, commonplace software applications. The difference is illustrated below.

In normal software development, our endeavour is to develop a program based on the logic of the expected flow, which in turn is based on the domain knowledge and know-how of the developer, who translates requirements into corresponding code.

But in the context of AI applications developed by running ML and DL algorithms on large collections of data, the intent is as outlined in the image below.

The key message is that in AI we feed data to the AI algorithm and expect a running program (also known as a model) as output, one that summarizes the hidden patterns, equations, and relationships present in the data.

Such an inverted mode of operation raises several questions:

Q1. Can we inject our own logic into the program, based on our understanding of the domain?

Q2. If we don’t know the program logic, how are we expected to ascertain the correct behaviour of the program generated by the AI?

Q3. Lastly, how do we observe changes to the behaviour of the program when the logic changes?

Yes, there are several such questions that come to mind when we deal with AI based on ML.

Here are some perspectives.

  1. The only way to generate a new model is either by changing the data fed into the AI algorithm or by changing the conditions of the AI algorithm, such as its hyperparameters.
  2. No, we cannot inject our own logic into the generated model/program.
  3. We can perturb the data and observe the changes in the generated program; this is the aspect currently being explored in the context of testing deep learning and machine learning models for explainability.
  4. We do not often have the luxury of knowing the exact logic of the model/program generated by the AI algorithm.
  5. Very often there is a critical need to change the model because it is unsuitable for a certain class of inputs.
  6. Similarly, the applicability of the model also decreases under changed data conditions, as in concept drift, where the underlying truth of the data has changed considerably.
  7. Last but not least, this lack of control, of not being able to debug or modify the program itself, puts a lot of risk into the deployment of models/programs.
  8. These risks can be mitigated through prudent coverage of the input data scenarios fed into the AI algorithm, to ensure that fair and unbiased coverage of the data is manifested in the generated model.
  9. Further, techniques like cross-validation and leave-one-out validation should be put in place during model generation to reduce the risk of over-dependence on any one part of the data (a small sketch follows this list).
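
A minimal sketch of the validation techniques mentioned in points 8 and 9, using an off-the-shelf dataset and model purely for illustration:

```python
# k-fold cross-validation and leave-one-out validation reduce over-dependence
# on any single slice of the training data.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, LeaveOneOut

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

kfold_scores = cross_val_score(model, X, y, cv=5)
print("5-fold accuracy:", kfold_scores.mean().round(3))

loo_scores = cross_val_score(model, X, y, cv=LeaveOneOut())
print("Leave-one-out accuracy:", loo_scores.mean().round(3))
```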

In parting, our message is: AI is data in, program out.

Keep this phenomenon in sight and accordingly place your bets on the right combination of checks, processes, and validations as part of the overall process, to overcome this lack of visibility into, and manipulability of, the program.

Hence, follow a strict, well-laid-out, end-to-end process in AI; for example, adopt the CRISP-DM process.

Happy AI-ing..

Dr. Srinivas Padmanabhuni

Ph.D. in AI from University of Alberta, Edmonton, Canada

Failures of Artificial Intelligence

AI Fails!

1. In a recent incident, Amazon had to scrap an AI-enabled recruiting tool owing to the inherent bias it was showing against recruiting women. This bias was a result of the statistical dominance of men in the industry, reflected in the data the tool learned from.

2. Microsoft unveiled a bot named Tay to experiment with end users’ conversational data, learning from users’ conversations in the process. However, owing to racist and illogical inputs fed in by end users, the bot took on an unpleasant tone, leading Microsoft to withdraw it.

3. In a widely publicized incident leading to loss of life, an Uber self-driving car killed a pedestrian at night. As per [4], the pedestrian was detected by the car’s vision system, but the advice to brake was not enacted because the automatic braking system had been turned off.

4. In a recent security analysis [5], painting lane markings in a wrong direction led the vision system of a Tesla car to steer the car into oncoming traffic.

What do all these incidents mean?

These are a few of the many real-life accidents or erroneous behaviours exhibited by AI-powered systems in the modern world, in applications as diverse as driverless cars, HR systems, and chat applications. They highlight the real dangers that must be managed through end-to-end software management of AI-powered systems, and more specifically the need for verification and validation in the context of AI-powered systems. These needs translate into thorough end-to-end testing of AI-powered systems. In addition to the usual requirements for testing functional systems, these systems bring additional desiderata: primarily testing for ethics, testing for biases, and testing for explainability.

Thus, there is a systematic need to set an agenda for testing AI systems: a clarion call for putting in place the processes, technologies, and techniques for end-to-end testing of AI systems.

Is there a notion the other way round?

While all the above makes the case for testing AI systems, with a view to avoiding failures or accidents, is there also a case for using AI as a technology within current testing processes? The answer is an astounding YES. AI, by virtue of relying on large data repositories, aims to derive practical insights and actionable knowledge from those repositories. Testing processes in the Software Development Life Cycle (SDLC) generate a number of data elements, ranging from test cases and requirement specs to code files, bugs, and bug fixes. These are a huge source of data for a range of AI processes. In view of this, our attempt is to highlight a range of SDLC testing processes that can use AI techniques on these data sources for actionable insights and/or automation.

Let us list a few examples below (a small sketch of the first, defect prediction, follows the list):

  • Automated Defect Prediction
  • Automated Test Case generation from Text/Images
  • Automated Test Evaluation
  • Test Management Optimization
  • Automated GUI testing
  • Test prioritization
  • Automated Test Data Generation
  • Test Oracle Emulation
  • Test coverage analysis
  • Automated Traceability
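
As a hedged sketch of the first item, automated defect prediction (the code metrics, data, and labels below are invented for illustration): train a classifier on historical module metrics labelled with whether the module later produced a defect, then use it to rank new modules for testing attention.

```python
# A minimal sketch of automated defect prediction on synthetic code metrics.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 400
modules = pd.DataFrame({
    "lines_of_code": rng.integers(50, 5000, n),
    "cyclomatic_complexity": rng.integers(1, 60, n),
    "recent_changes": rng.integers(0, 30, n),
})
# Hypothetical historical label: did the module later produce a defect?
risk = (0.0004 * modules["lines_of_code"]
        + 0.03 * modules["cyclomatic_complexity"]
        + 0.05 * modules["recent_changes"]
        + rng.normal(0, 0.3, n))
defect = risk > risk.median()

X_train, X_test, y_train, y_test = train_test_split(modules, defect, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
print("Held-out accuracy:", round(model.score(X_test, y_test), 3))
```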

Thus, we are just about to step into an El Dorado of potential gold mines, making test processes cheaper, better, and faster by using state-of-the-art AI techniques in testing.

Happy testAIing!