By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.
Exam-Ready Study Guide for AI-102
Azure Cognitive Search is a fully managed cloud search service that enables AI-powered information retrieval from structured and unstructured data (PDFs, images, databases, etc.). It’s critical in ML pipelines where semantic search, document processing, and knowledge extraction are needed, such as:
- Enterprise document search (e.g., legal contracts, medical records, customer support tickets).
- AI-enriched knowledge bases (e.g., extracting entities, key phrases, and relationships from invoices or research papers).
- Hybrid search (combining keyword and vector search for RAG applications).
- Knowledge mining (e.g., building a chatbot that answers questions from internal company documents).
Real-world scenario: A healthcare provider wants to extract patient diagnoses, medications, and lab results from unstructured clinical notes (PDFs, scanned forms) and make them searchable for doctors. They use Azure Cognitive Search with AI enrichment (OCR, entity recognition, key phrase extraction) to index the documents, then expose the results via a secure API for a custom EHR dashboard.
Azure Cognitive Search (ACS): Microsoft’s managed search-as-a-service for full-text, vector, and hybrid search, renamed Azure AI Search in November 2023. Best for AI-enriched document processing (unlike Azure AI Document Intelligence, which focuses on structured data extraction from forms).
Index: A searchable data structure (like a database table) that stores documents (JSON objects) with fields (e.g., title, content, entities). Supports filtering, sorting, and faceting.
Indexer: A crawler that automatically extracts data from a data source (Blob Storage, SQL DB, Cosmos DB) and populates an index. Can run on a schedule or be triggered manually.
Skillset: A pipeline of AI enrichments (e.g., OCR, entity recognition, translation) applied to unstructured data during indexing. Uses prebuilt skills (e.g., EntityRecognition, KeyPhraseExtraction) or custom skills (Azure Functions, ML models).
AI Enrichment: The process of applying AI models (via Cognitive Services or custom ML) to extract structured data from unstructured content (e.g., text, images). Example: Using Azure AI Document Intelligence (formerly Form Recognizer) to pull tables from PDFs.
Knowledge Store: Persistent storage in Azure Storage (Blob Storage or Table Storage) where enriched data (from skillsets) is projected (saved) for downstream analytics (e.g., Power BI, Synapse). Unlike an index, which is optimized for search, a knowledge store is for long-term storage and analysis.
Vector Search: Uses embeddings (from Azure OpenAI, Hugging Face, or custom models) to enable semantic similarity search (e.g., "Find documents about 'heart disease' even if they don’t contain the exact phrase").
Semantic Search (semantic ranker; introduced in preview): An enhanced search mode in ACS that uses deep learning models to re-rank results for relevance (e.g., understanding synonyms, context). It is a separate feature from vector search and does not require Azure OpenAI embeddings.
Cognitive Services (Azure AI Services): Prebuilt AI models (e.g., Text Analytics, Computer Vision, Translator) used in skillsets for enrichment. Example: SentimentAnalysis skill for classifying document tone.
Custom Skills: User-defined functions (Azure Functions, Logic Apps, or ML models) that extend skillsets (e.g., calling a custom NER model hosted in Azure ML).
Projection: The process of saving enriched data from a skillset into a knowledge store (e.g., tables, objects, or files). Example: Storing extracted entities in Azure Table Storage for analytics.
Data Source: The origin of data (e.g., Blob Storage, SQL DB, Cosmos DB) that an indexer crawls to populate an index.
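To see how the terms above fit together, here is a minimal in-memory sketch of the flow: an indexer reads a data source, runs each document through a skillset, writes the result to an index, and projects rows into a knowledge store. It calls no Azure API. Every name in it (`key_phrase_skill`, `language_skill`, `run_indexer`) is hypothetical, and the "skills" are toy stand-ins for the real AI models.

```python
from typing import Callable, Dict, List, Tuple

Document = Dict[str, object]
Skill = Callable[[Document], Document]


def key_phrase_skill(doc: Document) -> Document:
    """Toy stand-in for key phrase extraction: keeps words longer than 6 letters."""
    words = str(doc["content"]).replace(".", "").split()
    phrases = sorted({w.lower() for w in words if len(w) > 6})
    return {**doc, "keyPhrases": phrases}


def language_skill(doc: Document) -> Document:
    """Toy stand-in for language detection: always reports English."""
    return {**doc, "language": "en"}


def run_indexer(
    data_source: List[Document], skillset: List[Skill]
) -> Tuple[Dict[str, Document], List[Dict[str, str]]]:
    """Crawl the data source, enrich each document, fill the index and knowledge store."""
    index: Dict[str, Document] = {}
    knowledge_store: List[Dict[str, str]] = []
    for raw in data_source:
        enriched = dict(raw)
        for skill in skillset:  # the skillset is an ordered pipeline of enrichments
            enriched = skill(enriched)
        index[str(enriched["id"])] = enriched  # searchable copy
        for phrase in enriched.get("keyPhrases", []):  # projection: one table row per phrase
            knowledge_store.append({"documentId": str(enriched["id"]), "keyPhrase": phrase})
    return index, knowledge_store


if __name__ == "__main__":
    source = [{"id": "1", "content": "Patient reports chest pain. Prescribed aspirin."}]
    idx, store = run_indexer(source, [key_phrase_skill, language_skill])
    print(idx["1"]["keyPhrases"])
    print(len(store), "rows projected")
```

The index holds one enriched document per key, while the knowledge store holds flattened rows suited to analytics. That mirrors the index versus knowledge store distinction in the definitions.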
```json
{
  "name": "clinical-notes-index",
  "fields": [
    { "name": "id", "type": "Edm.String", "key": true },
    { "name": "content", "type": "Edm.String", "searchable": true },
    { "name": "entities", "type": "Collection(Edm.String)", "filterable": true },
    { "name": "language", "type": "Edm.String", "filterable": true }
  ]
}
```
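An index definition like the one above can be sanity-checked locally before it is sent to the service. The sketch below checks two rules: field names are unique, and there is exactly one key field of type `Edm.String`. The function name `validate_index_definition` is hypothetical, and the check covers only these rules, not the full schema.

```python
from typing import Dict, List


def validate_index_definition(definition: Dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems: List[str] = []
    fields = definition.get("fields", [])

    names = [f.get("name") for f in fields]
    if len(names) != len(set(names)):
        problems.append("field names must be unique")

    key_fields = [f for f in fields if f.get("key")]
    if len(key_fields) != 1:
        problems.append(f"expected exactly one key field, found {len(key_fields)}")
    elif key_fields[0].get("type") != "Edm.String":
        problems.append("the key field must be of type Edm.String")

    return problems


clinical_notes_index = {
    "name": "clinical-notes-index",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True},
        {"name": "content", "type": "Edm.String", "searchable": True},
        {"name": "entities", "type": "Collection(Edm.String)", "filterable": True},
        {"name": "language", "type": "Edm.String", "filterable": True},
    ],
}

if __name__ == "__main__":
    print(validate_index_definition(clinical_notes_index))
```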
```json
{
  "name": "clinical-notes-skillset",
  "skills": [
    {
      "@odata.type": "#Microsoft.Skills.Text.EntityRecognitionSkill",
      "context": "/document",
      "inputs": [ { "name": "text", "source": "/document/content" } ],
      "outputs": [ { "name": "entities", "targetName": "entities" } ]
    },
    {
      "@odata.type": "#Microsoft.Skills.Text.KeyPhraseExtractionSkill",
      "context": "/document",
      "inputs": [ { "name": "text", "source": "/document/content" } ],
      "outputs": [ { "name": "keyPhrases", "targetName": "keyPhrases" } ]
    }
  ]
}
```
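Each skill output is written into the enrichment tree at `context` + `/` + `targetName`, and later skills (or field mappings) read from those paths. As a rough illustration, the sketch below lists the paths a skillset would produce and flags any input whose `source` is not available. The helper names are hypothetical, and it assumes simple paths only (no `/*` collection paths).

```python
from typing import Dict, Iterable, List


def output_paths(skillset: Dict) -> List[str]:
    """List the enrichment-tree paths created by the skillset's outputs."""
    paths: List[str] = []
    for skill in skillset["skills"]:
        context = skill.get("context", "/document")
        for output in skill["outputs"]:
            node = output.get("targetName", output["name"])
            paths.append(f"{context}/{node}")
    return paths


def unresolved_inputs(skillset: Dict, source_fields: Iterable[str] = ("content",)) -> List[str]:
    """Return input sources that neither the source document nor any skill provides."""
    available = {f"/document/{name}" for name in source_fields} | set(output_paths(skillset))
    return [
        inp["source"]
        for skill in skillset["skills"]
        for inp in skill["inputs"]
        if inp["source"] not in available
    ]


clinical_notes_skillset = {
    "name": "clinical-notes-skillset",
    "skills": [
        {
            "@odata.type": "#Microsoft.Skills.Text.EntityRecognitionSkill",
            "context": "/document",
            "inputs": [{"name": "text", "source": "/document/content"}],
            "outputs": [{"name": "entities", "targetName": "entities"}],
        },
        {
            "@odata.type": "#Microsoft.Skills.Text.KeyPhraseExtractionSkill",
            "context": "/document",
            "inputs": [{"name": "text", "source": "/document/content"}],
            "outputs": [{"name": "keyPhrases", "targetName": "keyPhrases"}],
        },
    ],
}

if __name__ == "__main__":
    print(output_paths(clinical_notes_skillset))
    print(unresolved_inputs(clinical_notes_skillset))
```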
```json
{
  "name": "clinical-notes-indexer",
  "dataSourceName": "clinical-notes-blob",
  "targetIndexName": "clinical-notes-index",
  "skillsetName": "clinical-notes-skillset"
}
```
The knowledge store is configured as part of the skillset definition (here, clinical-notes-skillset), not on the indexer:
```json
"knowledgeStore": {
  "storageConnectionString": "DefaultEndpointsProtocol=https;AccountName=...",
  "projections": [
    {
      "tables": [ { "tableName": "entities", "generatedKeyName": "entityId" } ],
      "objects": [],
      "files": []
    }
  ]
}
```
```http
GET https://[service-name].search.windows.net/indexes/[index-name]/docs?search=heart%20disease&$select=id,content,entities&api-version=2023-11-01
```
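Building that query string by hand is error-prone because the search text must be URL-encoded. The sketch below assembles the same kind of GET URL with the standard library. The function name `build_search_url` and the sample service name are hypothetical, and the `api-version` default is an assumption taken from the vector query example in this guide.

```python
from typing import Sequence
from urllib.parse import quote, urlencode


def build_search_url(
    service_name: str,
    index_name: str,
    search_text: str,
    select: Sequence[str],
    api_version: str = "2023-11-01",  # assumed version; adjust to the one you target
) -> str:
    """Build a simple full-text search URL for the documents endpoint."""
    params = {
        "search": search_text,
        "$select": ",".join(select),
        "api-version": api_version,
    }
    # quote() encodes spaces as %20; "$" and "," are left readable.
    query = urlencode(params, quote_via=quote, safe="$,")
    return f"https://{service_name}.search.windows.net/indexes/{index_name}/docs?{query}"


if __name__ == "__main__":
    print(build_search_url("my-service", "clinical-notes-index", "heart disease", ["id", "content", "entities"]))
```

A real request would also need an authentication header such as an API key, which this sketch leaves out.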
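```json
{
  "name": "embedding",
  "type": "Collection(Edm.Single)",
  "searchable": true,
  "dimensions": 1536,
  "vectorSearchProfile": "vector-profile"
}
```
In API version 2023-11-01, the vector field references a vector search profile, which in turn names the algorithm configuration (earlier preview versions used vectorSearchConfiguration and algorithmConfigurations instead):
```json
"vectorSearch": {
  "algorithms": [
    {
      "name": "vector-config",
      "kind": "hnsw",
      "hnswParameters": { "m": 4, "efConstruction": 400, "efSearch": 500, "metric": "cosine" }
    }
  ],
  "profiles": [
    { "name": "vector-profile", "algorithm": "vector-config" }
  ]
}
```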
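The `cosine` metric in the configuration above ranks documents by the angle between their embedding and the query embedding. The sketch below shows that ranking on tiny made-up 2-dimensional vectors. It is an exact brute-force comparison written for illustration only: the service's HNSW algorithm is approximate and works differently internally, and the names `cosine_similarity` and `top_k` are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], embeddings: Dict[str, Sequence[float]], k: int) -> List[str]:
    """Return the ids of the k documents most similar to the query (exact search)."""
    ranked = sorted(
        embeddings,
        key=lambda doc_id: cosine_similarity(query, embeddings[doc_id]),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    print(top_k([1.0, 0.1], docs, k=2))
```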
```http
POST https://[service-name].search.windows.net/indexes/[index-name]/docs/search?api-version=2023-11-01

{
  "vectorQueries": [
    {
      "kind": "vector",
      "vector": [0.1, 0.2, ..., 0.1536],
      "fields": "embedding",
      "k": 5
    }
  ]
}
```
Correct answer: Azure Cognitive Search (Document Intelligence is for forms, not search).
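Azure Cognitive Search vs. Azure AI Search (Semantic Search):
Exam trap: A question asks which service provides semantic search. Azure AI Search is the current name of Azure Cognitive Search (renamed in November 2023), not a separate service, so semantic search is a feature of that one service under either name.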
Knowledge Store vs. Index:
Solution: Use incremental indexing or split data into smaller batches.
Vector search limits (as of 2024):
Exam trap: A question asks about scaling vector search—know that partitioning is required for large datasets.
Skillset limits:
Answer:
Tricky question:
A legal firm wants to extract key clauses, dates, and parties from 50,000 contracts stored in Blob Storage and make them searchable via a web app. They also need to analyze extracted data in Power BI. Which Azure services should they use?
Answer:
- Correct: Azure Cognitive Search (indexing + AI enrichment) + Azure Blob Storage (data source) + Azure Table Storage (knowledge store for Power BI) + Cognitive Services (prebuilt skills for entity extraction).
- Wrong: Azure AI Document Intelligence (this is for forms, not full-text search).
A retail company wants to implement semantic search for product recommendations (e.g., "Find me a red dress under $50"). They already use Azure OpenAI for embeddings. What’s the minimum Azure service they need?
Answer:
- Correct: Azure Cognitive Search (supports vector search with Azure OpenAI embeddings).
- Wrong: A separate "Azure AI Search" service in addition to Cognitive Search (Azure AI Search is the newer name for the same service, so only one search service is needed).
A data engineer notices that their indexer fails after processing 10,000 documents. What’s the most likely cause, and how should they fix it?
Answer:
- Cause: Indexer batch limit (max 10,000 documents per run).
- Fix: Split data into smaller batches or use incremental indexing (track lastModified timestamps).
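The fix in the last answer combines two ideas: process only documents changed since the previous run (a high-water mark on lastModified), and push them in bounded batches. The sketch below shows both on plain Python data. The function names and the in-memory document list are hypothetical, and built-in indexers track changes themselves, so this only illustrates the logic.

```python
from datetime import datetime, timezone
from typing import Dict, Iterator, List


def changed_since(documents: List[Dict], high_water_mark: datetime) -> List[Dict]:
    """Keep documents modified after the last successful run, oldest first."""
    changed = [d for d in documents if d["lastModified"] > high_water_mark]
    return sorted(changed, key=lambda d: d["lastModified"])


def in_batches(items: List[Dict], batch_size: int) -> Iterator[List[Dict]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


if __name__ == "__main__":
    docs = [
        {"id": str(day), "lastModified": datetime(2024, 1, day, tzinfo=timezone.utc)}
        for day in range(1, 6)
    ]
    last_run = datetime(2024, 1, 2, tzinfo=timezone.utc)
    for batch in in_batches(changed_since(docs, last_run), batch_size=2):
        print([d["id"] for d in batch])
        # After a batch succeeds, the new high-water mark is the batch's latest timestamp.
        last_run = batch[-1]["lastModified"]
```

Advancing the high-water mark only after a batch succeeds means a failed run can resume from the last completed batch instead of starting over.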