Correct Answer: Pooling
Explanation: Pooling layers in convolutional neural networks (CNNs) apply downsampling operations to the feature maps by selecting the maximum or average value within each pooling window, reducing the spatial dimensions while preserving important features.
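To make the operation concrete, here is a minimal NumPy sketch of 2x2 max pooling; the max_pool2d helper is our own illustration, not a library function:

```python
import numpy as np

def max_pool2d(x, size=2, stride=2):
    """Max pooling over a single-channel feature map (illustrative sketch)."""
    out_h = (x.shape[0] - size) // stride + 1
    out_w = (x.shape[1] - size) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # keep only the strongest response in each pooling window
            out[i, j] = x[i*stride:i*stride+size, j*stride:j*stride+size].max()
    return out

fmap = np.array([[1, 3, 2, 4],
                 [5, 6, 1, 2],
                 [7, 2, 9, 0],
                 [1, 4, 3, 8]], dtype=float)
print(max_pool2d(fmap))  # [[6. 4.] [7. 9.]] -- spatial size halved, maxima kept
```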
Correct Answer: Activation
Explanation: Activation layers in convolutional neural networks (CNNs) introduce non-linearity by applying an activation function, such as ReLU or sigmoid, to the output of convolutional or pooling operations, enabling the network to learn complex patterns.
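As a quick illustration, applying ReLU and sigmoid to a vector of pre-activation values:

```python
import numpy as np

z = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])  # pre-activation outputs of a layer
relu = np.maximum(0.0, z)                  # ReLU zeroes negatives, keeps positives
sigmoid = 1.0 / (1.0 + np.exp(-z))         # sigmoid squashes values into (0, 1)
print(relu)     # [0.  0.  0.  1.5 3. ]
print(sigmoid)  # approx [0.12 0.38 0.5 0.82 0.95]
```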
Correct Answer: Normalization
Explanation: Normalization layers in convolutional neural networks (CNNs) are often applied to normalize the feature maps or adjust their scale and distribution, improving training stability and convergence, such as batch normalization or layer normalization.
Correct Answer: Convolution
Explanation: Convolutional layers in convolutional neural networks (CNNs) are responsible for learning hierarchical patterns and features through the application of learnable filters or kernels across the input data, extracting features at different spatial scales.
Correct Answer: Convolution
Explanation: Convolutional layers in convolutional neural networks (CNNs) apply the convolution operation to extract features from the input data through the application of learnable filters or kernels, capturing localized patterns and structures.
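A minimal sketch of the valid, stride-1 convolution a CNN layer computes, shown here with a fixed Sobel edge filter standing in for a learned kernel:

```python
import numpy as np

def conv2d(x, k):
    """Valid cross-correlation of a 2-D input with a kernel, as CNN layers compute it."""
    kh, kw = k.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (x[i:i+kh, j:j+kw] * k).sum()  # weighted sum over the window
    return out

image = np.arange(16, dtype=float).reshape(4, 4)  # constant left-to-right gradient
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)     # classic vertical-edge filter
print(conv2d(image, sobel_x))  # [[8. 8.] [8. 8.]]; a CNN would learn k by backprop
```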
Correct Answer: Object Recognition
Explanation: Object recognition in computer vision involves identifying and locating objects within an image or video, commonly used for tasks like image classification and object detection.
Correct Answer: Image Generation
Explanation: Image generation in computer vision involves generating new images or modifying existing ones based on learned patterns and features, commonly used for tasks like image inpainting, style transfer, and generative adversarial networks (GANs).
Correct Answer: Image Compression
Explanation: Image compression with convolutional neural networks (CNNs) involves reducing the storage size of images while minimizing loss of visual information, enabling efficient storage and transmission of images across networks and devices.
Correct Answer: Object Recognition
Explanation: Object recognition in computer vision involves identifying and classifying the main objects within an image or video, commonly used for tasks like autonomous driving, surveillance, and content-based image retrieval.
Correct Answer: Recurrent Connectivity
Explanation: Recurrent neural networks (RNNs) utilize recurrent connectivity, allowing information to persist and be shared across time steps, which is crucial for processing sequential data such as time series, text, or speech.
Correct Answer: Cell State
Explanation: In recurrent neural networks (RNNs), the cell state is a crucial component that allows the network to maintain an internal state or memory, enabling the capture of long-range dependencies in sequential data.
Correct Answer: Recurrent Operation
Explanation: In recurrent neural networks (RNNs), the recurrent operation allows information from the current time step and the previous hidden state to be combined and passed to the next time step, facilitating sequential data processing.
Correct Answer: Long Short-Term Memory (LSTM)
Explanation: Long Short-Term Memory (LSTM) networks address the vanishing gradient problem in recurrent neural networks (RNNs) by introducing a cell state that carries information across many time steps, together with forget, input, and output gates that control the flow of information into and out of that state.
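A minimal NumPy sketch of a single LSTM step, with randomly initialized weights purely to show how the gates regulate the cell state:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step; W, U, b stack the input, forget, output, candidate blocks."""
    z = W @ x + U @ h_prev + b
    i, f, o, g = np.split(z, 4)
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    c = f * c_prev + i * g   # forget gate scales old memory, input gate admits new
    h = o * np.tanh(c)       # output gate decides how much of the state to expose
    return h, c

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = rng.normal(size=(4 * n_hid, n_in))
U = rng.normal(size=(4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):  # carry state across a short input sequence
    h, c = lstm_step(x, h, c, W, U, b)
print(h)
```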
Correct Answer: Gated Recurrent Unit (GRU)
Explanation: Gated Recurrent Unit (GRU) networks simplify the design of recurrent neural networks (RNNs) by combining the forget and input gates into a single update gate, reducing the number of parameters compared to LSTM while still addressing the vanishing gradient problem.
Correct Answer: Unfolding
Explanation: In recurrent neural networks (RNNs), unfolding refers to the process of representing the network architecture over multiple time steps to handle sequential data, creating a directed acyclic graph (DAG) structure.
Correct Answer: Bidirectional Recurrent Neural Network (BiRNN)
Explanation: Bidirectional Recurrent Neural Networks (BiRNNs) process input sequences in both forward and backward directions, allowing the network to capture information from both past and future contexts, enabling better understanding of the sequential data.
Correct Answer: Long Short-Term Memory (LSTM)
Explanation: Long Short-Term Memory (LSTM) networks are commonly used in natural language processing (NLP) for tasks such as text generation, machine translation, and sentiment analysis due to their ability to capture long-range dependencies and mitigate the vanishing gradient problem.
Correct Answer: Gated Recurrent Unit (GRU)
Explanation: Gated Recurrent Unit (GRU) networks are commonly used in natural language processing (NLP) for tasks such as text generation and machine translation due to their computational efficiency, achieved by combining the forget and input gates into a single update gate.
Correct Answer: Language Modeling
Explanation: Language modeling in natural language processing (NLP) involves predicting the next word in a sequence based on the context provided by the preceding words, a task commonly addressed using recurrent neural networks (RNNs) such as LSTMs or GRUs.
Correct Answer: Long Short-Term Memory (LSTM)
Explanation: Long Short-Term Memory (LSTM) networks are particularly suitable for natural language processing (NLP) tasks that require understanding and generating sequential data, such as text summarization and dialogue generation, due to their ability to capture long-range dependencies.
Correct Answer: Text Classification
Explanation: Text classification in natural language processing (NLP) involves categorizing text documents into predefined categories or labels, a task commonly addressed using recurrent neural networks (RNNs) for sequence modeling and capturing contextual information.
Correct Answer: Bidirectional Recurrent Neural Network (BiRNN)
Explanation: Bidirectional Recurrent Neural Networks (BiRNNs) are capable of capturing bidirectional contextual information and are suitable for natural language processing (NLP) tasks such as named entity recognition and part-of-speech tagging, where contextual information from both past and future words is important.
Correct Answer: Tokenization
Explanation: Tokenization in natural language processing (NLP) refers to the process of converting text data into a more structured format, such as tokens or words, for further analysis, enabling tasks like text classification, sentiment analysis, and language modeling.
Correct Answer: Stemming
Explanation: Stemming in natural language processing (NLP) involves reducing words to their base or root form by removing suffixes, allowing variants of words to be treated as the same token, although it may not always produce valid words.
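For example, with NLTK's Porter stemmer (assuming the nltk package is installed):

```python
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
for word in ["running", "ponies", "connection", "caresses"]:
    print(word, "->", stemmer.stem(word))
# running -> run, ponies -> poni, connection -> connect, caresses -> caress
# note that "poni" is not a valid English word, as the explanation above warns
```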
Correct Answer: Part-of-Speech Tagging
Explanation: Part-of-Speech (POS) tagging in natural language processing (NLP) involves identifying the syntactic roles of words within a sentence, such as nouns, verbs, adjectives, and adverbs, aiding in tasks like grammar parsing and semantic analysis.
Correct Answer: Lemmatization
Explanation: Lemmatization in natural language processing (NLP) involves grouping together inflected forms of a word to their base or dictionary form, considering the word’s morphological properties such as tense and case, resulting in valid words.
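For example, with NLTK's WordNet lemmatizer (assuming nltk is installed; the WordNet data must be downloaded once):

```python
import nltk
from nltk.stem import WordNetLemmatizer

nltk.download("wordnet", quiet=True)  # lexicon the lemmatizer consults
lemmatizer = WordNetLemmatizer()
print(lemmatizer.lemmatize("running", pos="v"))  # run
print(lemmatizer.lemmatize("geese"))             # goose (default pos is noun)
print(lemmatizer.lemmatize("better", pos="a"))   # good -- a valid dictionary form
```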
Correct Answer: Lemmatization
Explanation: Lemmatization in natural language processing (NLP) is particularly useful for tasks that require understanding the semantics of words, such as information retrieval and question answering, as it produces valid dictionary forms of words.
Correct Answer: Tokenization
Explanation: Tokenization in natural language processing (NLP) breaks down text into individual words or tokens, facilitating further analysis and processing by converting unstructured text data into a structured format.
Correct Answer: Stemming
Explanation: Stemming in natural language processing (NLP) is useful for tasks such as text indexing and search, as it reduces words to their base forms, increasing the likelihood of matching similar words and improving retrieval accuracy.
Correct Answer: Stemming
Explanation: Stemming in natural language processing (NLP) aims to reduce words to their base or root form by removing suffixes and prefixes, facilitating tasks such as text retrieval and indexing.
Correct Answer: Porter Stemmer
Explanation: The Porter Stemmer algorithm is a commonly used approach in stemming, which reduces words to their root forms by applying a set of heuristic rules, widely used in natural language processing (NLP) tasks.
Correct Answer: Lemmatization
Explanation: Lemmatization in natural language processing (NLP) aims to group together inflected forms of a word to their base or dictionary form, considering the word’s morphological properties such as tense and case.
Correct Answer: Dictionary-based Lemmatization
Explanation: Dictionary-based lemmatization typically involves utilizing a dictionary or vocabulary to map words to their corresponding base forms, ensuring accurate lemmatization by referencing known word forms.
Correct Answer: Tokenization
Explanation: Tokenization in natural language processing (NLP) breaks down text into individual words or tokens, facilitating further analysis and processing by converting unstructured text data into a structured format.
Correct Answer: Words
Explanation: In tokenization, text is typically segmented into individual words or tokens, which are the basic units of analysis for natural language processing (NLP) tasks.
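A toy regex-based word tokenizer illustrates the idea; production pipelines typically rely on library tokenizers such as NLTK's word_tokenize or subword tokenizers:

```python
import re

text = "Tokenization converts unstructured text into tokens!"
tokens = re.findall(r"\w+", text.lower())  # crude split on word characters
print(tokens)
# ['tokenization', 'converts', 'unstructured', 'text', 'into', 'tokens']
```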
Correct Answer: Lemmatization
Explanation: Lemmatization in natural language processing (NLP) is useful for tasks such as named entity recognition and sentiment analysis, as it reduces words to their base forms, improving analysis and understanding of the text.
Correct Answer: Analyzing the emotional tone or sentiment expressed in text
Explanation: The primary objective of sentiment analysis in natural language processing (NLP) is to analyze the emotional tone or sentiment expressed in text, determining whether the sentiment is positive, negative, or neutral.
Correct Answer: Document-level sentiment analysis
Explanation: Document-level sentiment analysis involves classifying the sentiment of a given text into predefined categories such as positive, negative, or neutral, considering the overall sentiment expressed in the entire document.
Correct Answer: Word Embeddings
Explanation: Word embeddings in sentiment analysis involve representing words or phrases in a text as dense numerical vectors in a continuous vector space, capturing semantic relationships between words and enabling machine learning models to analyze sentiment based on these representations.
Correct Answer: Aspect-based sentiment analysis
Explanation: Aspect-based sentiment analysis focuses on identifying the sentiment associated with specific aspects or features mentioned in a text, such as product reviews, enabling more fine-grained analysis of sentiment.
Correct Answer: TF-IDF
Explanation: Term Frequency-Inverse Document Frequency (TF-IDF) assigns weights to words based on their frequency within a document and their rarity across the corpus, producing features that help a sentiment classifier judge which terms are most informative about the sentiment of a text.
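A small sketch of TF-IDF features over hypothetical review snippets, using scikit-learn's TfidfVectorizer:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

reviews = ["great phone, great battery",
           "terrible battery life",
           "great value"]
vec = TfidfVectorizer()
X = vec.fit_transform(reviews)   # rows: documents, columns: per-word TF-IDF weights
print(vec.get_feature_names_out())
print(X.toarray().round(2))      # words shared across documents get lower IDF weight
```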
Correct Answer: Supervised Learning
Explanation: In sentiment analysis, supervised learning involves training a machine learning model on labeled data, where each text sample is associated with a sentiment label, enabling the model to predict the sentiment of unseen text based on patterns learned from the training data.
Correct Answer: Sentence-level sentiment analysis
Explanation: Sentence-level sentiment analysis focuses on determining the sentiment expressed within individual sentences or phrases, enabling more granular analysis of sentiment compared to document-level sentiment analysis.
Correct Answer: Entity-level sentiment analysis
Explanation: Entity-level sentiment analysis involves identifying the sentiment associated with specific entities mentioned in a text, such as people, organizations, or products, enabling targeted analysis of sentiment towards individual entities within a document.
Correct Answer: Clustering documents into thematic groups
Explanation: The primary objective of topic modeling in natural language processing (NLP) is to cluster documents into thematic groups based on the underlying topics or themes present in the text.
Correct Answer: Latent Semantic Analysis (LSA)
Explanation: Latent Semantic Analysis (LSA) is a statistical technique commonly used in topic modeling to uncover hidden patterns and structures within a collection of text documents, enabling the identification of underlying topics.
Correct Answer: Latent Dirichlet Allocation (LDA)
Explanation: In topic modeling, Latent Dirichlet Allocation (LDA) is a generative probabilistic model that represents documents as a mixture of topics, where each topic is characterized by a distribution over words, allowing for the discovery of latent topics within the document collection.
Correct Answer: Latent Dirichlet Allocation (LDA)
Explanation: Latent Dirichlet Allocation (LDA) is a topic modeling algorithm based on the assumption that each document exhibits multiple topics, and each word in the document is attributable to one of the document’s topics, allowing for the discovery of topics in the document collection.
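A minimal scikit-learn sketch of LDA on a toy corpus; the documents and the choice of two topics are purely illustrative:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = ["the cat sat on the mat", "dogs and cats are pets",
        "stock markets fell today", "investors sold shares"]
vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
words = vec.get_feature_names_out()
for k, topic in enumerate(lda.components_):  # word distribution per topic
    print(f"topic {k}:", [words[i] for i in topic.argsort()[-3:]])
print(lda.transform(X).round(2))             # per-document topic mixtures
```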
Correct Answer: Latent Semantic Analysis (LSA)
Explanation: Latent Semantic Analysis (LSA) represents documents as vectors derived from a term-document frequency matrix and applies singular value decomposition (SVD) to uncover latent semantic relationships between words and documents, enabling clustering and analysis of document similarities.
Correct Answer: Grouping documents into thematic clusters based on topic similarity
Explanation: In topic modeling, document clustering involves grouping documents into thematic clusters based on topic similarity, enabling the organization and exploration of large document collections.
Correct Answer: Non-Negative Matrix Factorization (NMF)
Explanation: Non-Negative Matrix Factorization (NMF) is a topic modeling algorithm that factorizes a term-document matrix into two non-negative matrices, one representing topics as weighted combinations of words and the other representing documents as mixtures of topics.
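A minimal scikit-learn sketch on the same kind of toy corpus; note that scikit-learn factorizes a documents-by-terms matrix, so W holds document-topic mixtures and H holds topic-word weights:

```python
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["the cat sat on the mat", "dogs and cats are pets",
        "stock markets fell today", "investors sold shares"]
V = TfidfVectorizer(stop_words="english").fit_transform(docs)
nmf = NMF(n_components=2, random_state=0)
W = nmf.fit_transform(V)   # documents as non-negative mixtures of topics
H = nmf.components_        # topics as non-negative weights over words
print(W.round(2))
print(H.round(2))
```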
Correct Answer: Discovering the underlying topics within a document collection
Explanation: In topic modeling, topic inference refers to the process of discovering the underlying topics within a document collection, enabling the extraction of meaningful themes and structures from the text data.
Correct Answer: Representing words as dense numerical vectors
Explanation: The primary objective of word embeddings in natural language processing (NLP) is to represent words as dense numerical vectors in a continuous vector space, capturing semantic relationships between words and enabling machine learning models to understand and process language more effectively.
Correct Answer: Word2Vec
Explanation: Word2Vec is a popular word embedding technique based on the Skip-gram and Continuous Bag of Words (CBOW) models, which are trained on large corpora to learn word representations by capturing semantic relationships between words.
Correct Answer: Skip-gram
Explanation: In the Word2Vec model, the Skip-gram approach involves predicting the context words surrounding a target word given its occurrence in a sentence, aiming to learn word embeddings that capture semantic relationships between words.
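A minimal Gensim sketch of training Skip-gram embeddings on a toy corpus (sg=1 selects Skip-gram; with this little data the learned neighbors are noisy and purely illustrative):

```python
from gensim.models import Word2Vec

sentences = [["the", "cat", "sat", "on", "the", "mat"],
             ["the", "dog", "sat", "on", "the", "rug"]]
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1,
                 sg=1, epochs=50, seed=0, workers=1)  # sg=1 -> Skip-gram
print(model.wv["cat"][:5])                   # dense vector learned for "cat"
print(model.wv.most_similar("cat", topn=2))  # nearest words in embedding space
```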
Correct Answer: GloVe
Explanation: GloVe (Global Vectors for Word Representation) is a word embedding technique based on the idea of global word-word co-occurrence statistics and factorization of a word-context matrix, aiming to learn word representations that capture both global and local semantic relationships between words.
Correct Answer: FastText
Explanation: FastText is a word embedding technique that incorporates subword information and character n-grams to represent words as vectors, enabling the model to handle out-of-vocabulary words and capture morphological similarities between words.
Correct Answer: The co-occurrence statistics of words in a corpus
Explanation: In the GloVe (Global Vectors for Word Representation) model, the word-context matrix represents the co-occurrence statistics of words in a corpus, capturing the relationships between words based on their contextual usage.
Correct Answer: ELMo
Explanation: ELMo (Embeddings from Language Models) is a word embedding technique capable of generating contextualized word representations by considering the surrounding words in a sentence, capturing nuances in word meaning based on their context.
Correct Answer: Generating contextualized word representations
Explanation: The primary objective of transformer models in natural language processing (NLP) is to generate contextualized word representations by capturing complex relationships between words in a sentence or document.
Correct Answer: BERT (Bidirectional Encoder Representations from Transformers)
Explanation: BERT (Bidirectional Encoder Representations from Transformers) is pre-trained using a masked language modeling (MLM) objective and a next sentence prediction (NSP) task, enabling it to generate bidirectional contextualized word representations.
Correct Answer: Predicting randomly masked words in a sentence based on the surrounding context
Explanation: In BERT (Bidirectional Encoder Representations from Transformers), the masked language modeling (MLM) objective involves predicting randomly masked words in a sentence based on the surrounding context, enabling the model to understand bidirectional relationships between words.
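A short sketch of MLM-style prediction with the Hugging Face transformers fill-mask pipeline and the bert-base-uncased checkpoint (assumes the library and the model weights are available):

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
for pred in fill("The capital of France is [MASK].", top_k=3):
    print(pred["token_str"], round(pred["score"], 3))
# "paris" should rank near the top, predicted from the bidirectional context
```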
Correct Answer: GPT (Generative Pre-trained Transformer)
Explanation: GPT (Generative Pre-trained Transformer) is based on a decoder-only architecture and is trained using an autoregressive language modeling (LM) objective, where the model predicts the next word in a sequence given the previous context.
Correct Answer: Transformer (original architecture)
Explanation: The original Transformer model introduced an architecture built entirely on self-attention, allowing the model to focus on relevant parts of the input sequence during processing; self-attention has since become a fundamental component of transformer-based models like BERT and GPT.
Correct Answer: Focusing on relevant parts of the input sequence during processing
Explanation: In transformer models like BERT and GPT, the self-attention mechanism allows the model to focus on relevant parts of the input sequence during processing, enabling the generation of contextualized word representations by attending to different words in the input sequence.
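A minimal NumPy sketch of single-head scaled dot-product self-attention, with random projection matrices standing in for learned weights:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence X of shape (tokens, dim)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])         # token-to-token relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the sequence
    return weights @ V                              # each output mixes all tokens

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                         # 4 tokens, 8-dim embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)          # (4, 8)
```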
Correct Answer: XLNet (eXtreme Language Understanding Transformer)
Explanation: XLNet captures bidirectional context through a permutation-based autoregressive pre-training objective, enabling it to generate bidirectional contextualized word representations similar to BERT without relying on masked inputs.
Correct Answer: Detecting objects and patterns in images
Explanation: The primary objective of image processing in computer vision is to detect objects and patterns in images, enabling tasks such as object recognition, segmentation, and scene understanding.
Correct Answer: Image enhancement
Explanation: Image enhancement is an image processing technique that involves enhancing the visual quality of images by adjusting parameters such as brightness, contrast, and sharpness, improving the overall appearance of the image.
Correct Answer: Extracting regions of interest from images
Explanation: In image processing, image segmentation involves dividing an image into multiple regions or segments, each representing a distinct object or region of interest, enabling further analysis and understanding of the image content.
Correct Answer: Edge detection
Explanation: Edge detection is an image processing technique that involves detecting the boundaries of objects and regions in an image by identifying abrupt changes in intensity or color, which typically indicate the presence of edges or boundaries.
Correct Answer: Histogram equalization
Explanation: Histogram equalization is an image processing operation that aims to equalize the distribution of pixel intensities in an image, redistributing pixel values to improve the overall contrast and visual appearance of the image.
Correct Answer: Filtering and modifying the shapes of objects in images
Explanation: In image processing, morphological operations are primarily used for filtering and modifying the shapes of objects in images, enabling tasks such as noise reduction, object extraction, and image enhancement.
Correct Answer: Morphological operations
Explanation: Morphological operations in image processing involve modifying the shapes of objects in an image based on a structured pattern or kernel, such as dilation, erosion, opening, and closing, enabling various image processing tasks.
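A small OpenCV sketch of dilation, erosion, and opening on a toy binary image (assumes the opencv-python package):

```python
import numpy as np
import cv2

img = np.zeros((9, 9), dtype=np.uint8)
img[3:6, 3:6] = 255                       # a 3x3 white square on a black background
kernel = np.ones((3, 3), dtype=np.uint8)  # the structuring element
dilated = cv2.dilate(img, kernel)         # grows the square to 5x5
eroded = cv2.erode(img, kernel)           # shrinks it to a single pixel
opened = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)  # erosion then dilation
print(dilated.sum() // 255, eroded.sum() // 255)        # 25 1
```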
Correct Answer: Removing noise and artifacts from images
Explanation: The primary objective of image filtering in image processing is to remove noise and artifacts from images, improving the quality and clarity of the image for further analysis and interpretation.
Correct Answer: Red, Green, Blue
Explanation: The RGB color model uses three primary color channels: Red, Green, and Blue, which combine in various intensities to produce a wide range of colors in digital images.
Correct Answer: By adjusting the intensity of each channel from 0 to 255
Explanation: In the RGB color model, colors are represented using combinations of the primary color channels (Red, Green, and Blue) by adjusting the intensity of each channel from 0 to 255, where 0 represents no intensity (black) and 255 represents maximum intensity (full color).
Correct Answer: Grayscale
Explanation: Grayscale image representation encodes images using a single channel representing the intensity of light, typically ranging from 0 (black) to 255 (white), with shades of gray in between, making it simpler than the RGB color model.
Correct Answer: Using a single intensity value per pixel
Explanation: In the grayscale representation of images, pixel values are typically encoded using a single intensity value per pixel, representing the brightness or luminance of the pixel, without the need for separate color channels as in the RGB color model.
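A minimal sketch of collapsing an RGB image to a single intensity channel using the standard BT.601 luminance weights:

```python
import numpy as np

rgb = np.random.default_rng(0).integers(0, 256, size=(4, 4, 3))  # toy RGB image
# weights reflect human sensitivity: green contributes most to perceived brightness
gray = (0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]).astype(np.uint8)
print(rgb.shape, "->", gray.shape)  # (4, 4, 3) -> (4, 4): one value per pixel
```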
Correct Answer: Grayscale
Explanation: Grayscale image representation is commonly used for image processing tasks such as edge detection and feature extraction due to its simplicity and effectiveness in capturing structural information while reducing computational complexity compared to color representations like RGB.
Correct Answer: Detecting and localizing objects within images
Explanation: The primary objective of object detection algorithms in computer vision is to detect and localize objects within images, enabling tasks such as identifying the presence and location of objects of interest.
Correct Answer: YOLO (You Only Look Once)
Explanation: YOLO (You Only Look Once) is known for its real-time performance and single-shot detection capability, allowing it to detect objects in images with high efficiency and accuracy in a single pass through the network.
Correct Answer: By dividing the image into a grid of cells and predicting bounding boxes and class probabilities for each cell
Explanation: In YOLO (You Only Look Once), objects are detected and localized within images by dividing the image into a grid of cells and predicting bounding boxes and class probabilities for each cell, enabling efficient and accurate object detection in a single pass through the network.
Correct Answer: SSD (Single Shot MultiBox Detector)
Explanation: SSD (Single Shot MultiBox Detector) utilizes a fixed set of default bounding boxes at different scales and aspect ratios to predict object locations and categories in images, enabling efficient and accurate object detection with a single shot.
Correct Answer: Faster R-CNN (Faster Region-based Convolutional Neural Network)
Explanation: Faster R-CNN (Faster Region-based Convolutional Neural Network) is known for its region proposal network (RPN), which shares convolutional features with the detection head and enables accurate object detection by generating region proposals and refining them based on learned features.
Correct Answer: Faster R-CNN (Faster Region-based Convolutional Neural Network)
Explanation: Faster R-CNN (Faster Region-based Convolutional Neural Network) combines region proposal generation with object classification in a single end-to-end network architecture, enabling efficient and accurate object detection with improved speed and performance.
Correct Answer: Reducing false positive detections by filtering out redundant bounding boxes
Explanation: In object detection algorithms like YOLO and SSD, non-maximum suppression (NMS) is used to reduce false positive detections by filtering out redundant bounding boxes and keeping only the most confident predictions for each object.
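A minimal NumPy sketch of greedy NMS over hypothetical boxes (rows of x1, y1, x2, y2) and confidence scores:

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression; returns indices of the boxes to keep."""
    order = scores.argsort()[::-1]  # highest-confidence box first
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        # intersection of the kept box with every remaining box
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(0, x2 - x1) * np.maximum(0, y2 - y1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + areas - inter)
        order = rest[iou <= iou_thresh]  # drop boxes overlapping the kept one
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # [0, 2]: the near-duplicate box 1 is suppressed
```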
Correct Answer: Verifying or identifying individuals based on facial features
Explanation: The primary objective of face recognition in computer vision is to verify or identify individuals based on their facial features, enabling applications such as biometric authentication and surveillance.
Correct Answer: Convolutional Neural Networks (CNNs)
Explanation: Convolutional Neural Networks (CNNs) are commonly used for representing and encoding facial features in face recognition systems due to their ability to learn complex hierarchical features directly from raw image data.
Correct Answer: Face encoding
Explanation: In face recognition systems, the process of capturing and encoding facial features from images or video frames is known as face encoding, where facial characteristics are extracted and represented as numerical vectors for further processing and comparison.
Correct Answer: Face verification
Explanation: Face verification in face recognition involves a one-to-one comparison of an individual's facial features against the stored template of a claimed identity to determine whether the person is who they claim to be, typically used for authentication purposes.
Correct Answer: Face detection
Explanation: Face detection in face recognition focuses on detecting and localizing faces within images or video frames, identifying the regions of interest that contain facial information for further processing.
Correct Answer: Face alignment
Explanation: Face alignment in face recognition involves correcting for variations in pose, scale, and illumination to ensure that facial features are accurately aligned and extracted for subsequent processing, improving the overall accuracy of face recognition systems.
Correct Answer: Euclidean distance
Explanation: Euclidean distance is commonly used in face recognition systems to quantify the similarity between facial feature vectors for verification or identification, measuring the straight-line distance between the feature vectors in the multi-dimensional feature space.
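A small sketch with hypothetical embedding vectors, showing how a smaller Euclidean distance signals a likely match:

```python
import numpy as np

emb_a = np.array([0.1, 0.8, -0.3, 0.5])   # hypothetical face embedding, person A
emb_b = np.array([0.2, 0.7, -0.2, 0.6])   # person A again, different photo
emb_c = np.array([-0.9, 0.1, 0.8, -0.4])  # a different person

def dist(u, v):
    return np.linalg.norm(u - v)  # straight-line distance in embedding space

print(dist(emb_a, emb_b))  # small -> likely the same identity
print(dist(emb_a, emb_c))  # large -> likely different identities
# a verification system accepts a match when the distance falls below a tuned threshold
```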
Correct Answer: Siamese networks
Explanation: Siamese networks are a technique used in face recognition to learn a discriminative embedding space where faces of the same identity are closer together, enabling accurate face verification and identification tasks.
Correct Answer: Identifying the semantic meaning of objects in images
Explanation: The primary objective of semantic segmentation in computer vision is to identify the semantic meaning of objects in images by assigning class labels to each pixel, enabling detailed understanding of the scene and its contents.
Correct Answer: Convolutional Neural Networks (CNNs)
Explanation: Convolutional Neural Networks (CNNs) are commonly used for pixel-level labeling of objects and regions in semantic segmentation tasks due to their ability to learn hierarchical features and capture spatial dependencies within images.
Correct Answer: Semantic labeling
Explanation: In semantic segmentation, the process of assigning a class label to each pixel in an image is known as semantic labeling, where each pixel is classified into predefined categories representing different objects or regions.
Correct Answer: Convolutional Neural Networks (CNNs)
Explanation: Convolutional Neural Networks (CNNs) are commonly used for semantic segmentation tasks due to their ability to capture spatial information and learn hierarchical features, making them well-suited for pixel-level labeling of objects and regions in images.
Correct Answer: Intersection over Union (IoU)
Explanation: Intersection over Union (IoU) is commonly used as an evaluation metric to measure the accuracy of semantic segmentation models by calculating the overlap between predicted and ground truth segmentation masks, providing a measure of segmentation performance.
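A minimal sketch of IoU between two toy boolean segmentation masks:

```python
import numpy as np

def mask_iou(pred, gt):
    """IoU between two boolean masks: overlap area divided by combined area."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0  # two empty masks agree perfectly

pred = np.zeros((6, 6), dtype=bool)
pred[1:4, 1:4] = True  # predicted 3x3 region
gt = np.zeros((6, 6), dtype=bool)
gt[2:5, 2:5] = True    # ground-truth 3x3 region, shifted by one pixel
print(round(mask_iou(pred, gt), 3))  # 4 / 14 = 0.286
```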
Correct Answer: Virtual memory
Explanation: Virtual memory is responsible for managing memory resources and facilitating data exchange between the CPU and storage devices by providing an abstraction layer that allows the CPU to access more memory than physically available.