Beginning to explore artificial intelligence and computer vision

What is artificial intelligence?

Exploring artificial intelligence

Academic researchers have always understood the definition of artificial intelligence in different ways. A widely accepted definition is:

  • Artificial intelligence is a technology that uses machines to simulate human cognitive abilities.

Artificial intelligence covers a wide range of abilities, including insight, learning, reasoning and decision-making.

From the perspective of industry applications, the core ability of artificial intelligence is to make judgments or predictions based on a given input.

The rise of deep learning and the three booms of AI.

The Turing test, the cornerstone of artificial intelligence

Three core elements of artificial intelligence

Three core elements of AI: data, algorithms and computing resources.



When you give a computer a task, you tell it not only what to do but also how to do it. The set of instructions describing how to do it is called an algorithm.

  • Traditional algorithms – traversal
  • Smarter algorithms – gradient descent
  • More complex algorithms – machine learning
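The difference between the first two kinds of algorithm in the list above can be sketched on a toy problem. This is a minimal illustration, not taken from the original material: the objective `f`, the search range, the step size and the learning rate are all assumptions chosen for the example. Traversal evaluates every candidate, while gradient descent follows the slope toward the minimum.

```python
def f(x):
    """Hypothetical objective: a parabola whose minimum is at x = 3."""
    return (x - 3.0) ** 2


def grad_f(x):
    """Derivative of f with respect to x."""
    return 2.0 * (x - 3.0)


def traversal_search(lo, hi, step):
    """Traversal: evaluate f at every grid point and keep the best one."""
    n = int(round((hi - lo) / step))
    candidates = [lo + i * step for i in range(n + 1)]
    return min(candidates, key=f)


def gradient_descent(x0, learning_rate=0.1, steps=100):
    """Gradient descent: repeatedly take a small step against the gradient."""
    x = x0
    for _ in range(steps):
        x -= learning_rate * grad_f(x)
    return x


# Assumed search range [-10, 10] with a grid step of 0.5.
x_traversal = traversal_search(-10.0, 10.0, 0.5)
x_gd = gradient_descent(x0=-10.0)
print(x_traversal, x_gd)
```

Traversal needs one evaluation per grid point and can only be as precise as the grid, whereas gradient descent needs the derivative but reaches the minimum without visiting every candidate.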

Compute Resource/Power

Breakthroughs in computing power: traditional CPUs and new computing acceleration technologies.

Smart chips

Relationship between artificial intelligence technologies

  • Machine learning: a way to achieve artificial intelligence

Machine learning is an interdisciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory and other subjects. It is the core of artificial intelligence and the fundamental way to make computers intelligent, and its applications span all fields of artificial intelligence. It mainly uses induction and synthesis rather than deduction.

  • Deep learning: a technology that implements machine learning.

It uses a deep neural network to make the model more complex, so that the model gains a deeper understanding of the data. It is a machine learning method based on learning representations of data. The motivation is to build neural networks that simulate the way the human brain analyzes and learns; such a network imitates the mechanisms of the human brain to interpret data such as images, sounds and text. The essence of deep learning is to learn more useful features by building a machine learning model with many hidden layers and training it on massive amounts of data, and ultimately to improve the accuracy of classification or prediction.

  • Artificial neural network: a machine learning algorithm

Neural networks generally have the structure input layer -> hidden layer -> output layer. Generally speaking, a neural network with more than two hidden layers is called a deep neural network. Deep learning is a machine learning method that uses a deep architecture such as a deep neural network.
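The input layer -> hidden layer -> output layer structure can be sketched as a forward pass in plain Python. This is a minimal sketch under stated assumptions: the layer sizes, the random weights, the sigmoid activation and the helper names (`dense`, `make_layer`, `forward`) are all hypothetical choices for illustration, and no training is shown.

```python
import math
import random


def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum plus bias, then sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def make_layer(n_in, n_out, rng):
    """Create random weights (assumed range [-1, 1]) and zero biases."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(x, layers):
    """Pass the input through every layer in order."""
    activation = x
    for weights, biases in layers:
        activation = dense(activation, weights, biases)
    return activation


rng = random.Random(0)
# Assumed sizes: 3 inputs, three hidden layers of 4 units, 1 output.
sizes = [3, 4, 4, 4, 1]
layers = [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
hidden_layer_count = len(layers) - 1  # the last layer is the output layer
output = forward([1.0, 0.0, 1.0], layers)
print(hidden_layer_count, output)
```

With three hidden layers, this toy network meets the "more than two hidden layers" criterion for a deep neural network given above.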

What is machine learning?

Artificial intelligence is a technology that uses machines to simulate human cognitive abilities.

  • Traditional artificial intelligence methods: logical reasoning, expert systems (answering questions based on manually defined rules), etc.;

  • Contemporary artificial intelligence generally acquires the ability to make predictions and judgments through learning, that is, machine learning.

Cat: round head, short face, five toes on each forelimb and four toes on each hind limb, with sharp, curved claws at the ends of the toes. The claws are retractable. Nocturnal. (Source: Baidu Encyclopedia)

Typical machine learning process

What is a neural network?

  • How do people think? –Biological Neural Network


    1. External stimuli are converted into electrical signals at the nerve endings and transmitted to nerve cells (also called neurons).
    2. Numerous neurons form the nerve center.
    3. The nerve center integrates the various signals to make a judgement.
    4. Following the instructions of the nerve center, the human body responds to the external stimuli.
  • How does the machine think? –Artificial neural networks

    Artificial neuron

    Input: x1, x2, x3
    Output: output
    Simplified model: by convention, each input takes only one of two possible values, 1 or 0.

    If all inputs are 1, every condition is met and the output is 1.

    If all inputs are 0, no condition is met and the output is 0.

    Is the watermelon good or bad?
    Color: green; root: curled up; knocking sound: dull. Result: good melon.
    Should the family go on a spring outing?
    Price: high or low; weather: good or bad; family: able to go or not.
  • The logical architecture of the neural network
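The simplified artificial neuron described above, with binary inputs and a binary output, can be sketched as a threshold unit. The weights and threshold below are hypothetical values picked for the spring outing example; the original material only fixes the two extreme cases (all inputs 1 gives 1, all inputs 0 gives 0), so the behaviour for mixed inputs is an assumption of this sketch.

```python
def neuron(inputs, weights, threshold):
    """Return 1 when the weighted sum of the inputs reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


# Hypothetical importance of each condition: price, weather, family.
weights = [2, 4, 3]
# Hypothetical threshold: the outing happens when the score is at least 5.
threshold = 5

all_met = neuron([1, 1, 1], weights, threshold)   # every condition holds
none_met = neuron([0, 0, 0], weights, threshold)  # no condition holds
only_weather = neuron([0, 1, 0], weights, threshold)
price_and_family = neuron([1, 0, 1], weights, threshold)
print(all_met, none_met, only_weather, price_and_family)
```

Changing the weights or the threshold changes which mixed combinations of conditions produce a 1, which is what a learning procedure would adjust.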

What is deep learning?

Deep neural network & deep learning

  • Traditional neural networks have evolved to include multiple hidden layers.

  • Neural networks with multiple hidden layers are called deep neural networks, and machine learning research based on deep neural networks is called deep learning.

The foreseeable future of artificial intelligence

Computer vision

  • Typical technology:

Face detection, tracking, recognition and attribute analysis, pedestrian and vehicle detection, tracking, recognition and attribute analysis, text detection and recognition, object detection and recognition

  • Typical application:

Face authentication, intelligent transportation, robot vision (such as drones), image search engine, image and video understanding, image and video beautification

Speech Recognition

  • Typical technology:

Speech recognition, voiceprint recognition, multi-microphone array systems

  • Typical application:

Voice input, voice control, intelligent assistant, machine translation, robot hearing

Natural language

  • Typical technology:

Word and sentence embeddings, semantic modeling

  • Typical application:

Chatbot, smart assistant, smart customer service, video understanding, machine translation

Computer vision (CV)

What is CV?

Several more rigorous definitions:

  • “The construction of explicit, meaningful descriptions of physical objects from images” (Ballard & Brown, 1982)

  • “Computing properties of the 3-D world from one or more digital images” (Trucco & Verri, 1998)

  • “To make useful decisions about real physical objects and scenes based on sensed images” (Stockman & Shapiro, 2001)

Overview in one sentence:

When a computer has the ability to see, recognize and think about what it sees, we can say that the computer has vision. That is computer vision.

Deep learning and CV

Application of CV

Image Classification

Image Classification - Convolutional Neural Network (CNN)

Rectified linear unit layer (ReLU)

Pooling layer (pool)
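The two layer types named above can be sketched in plain Python on a small feature map. This is an illustrative sketch only: the 4x4 input values are made up, and 2x2 max pooling with stride 2 is an assumed (though common) pooling choice, not something the original material specifies.

```python
def relu(feature_map):
    """ReLU: replace every negative value with 0, keep the rest unchanged."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2; assumes even height and width."""
    pooled = []
    for r in range(0, len(feature_map), 2):
        row = []
        for c in range(0, len(feature_map[0]), 2):
            window = [
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map, e.g. the output of a convolution.
feature_map = [
    [1.0, -2.0, 3.0, 0.5],
    [-1.0, 4.0, -3.0, 2.0],
    [0.0, -5.0, 6.0, -1.0],
    [2.0, 1.0, -4.0, -2.0],
]

activated = relu(feature_map)
pooled = max_pool_2x2(activated)
print(pooled)  # [[4.0, 3.0], [2.0, 6.0]]
```

ReLU keeps the map the same size and removes negative responses; pooling then halves each dimension while keeping the strongest response in every 2x2 window.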

Target Detection


Target Tracking

Semantic Image Segmentation

Instance Segmentation

CV skills tree construction