Study Guide: Cloud ML - Azure AI Engineer Associate (Exam AI-102): Azure AI Document Intelligence – Prebuilt Models (Invoice, Receipt, ID), Custom Models (Template, Neural)
Source: https://www.fatskills.com/hesi/chapter/cloud-ml-cert-azure-ai-azure-ai-document-intelligence-prebuilt-models-invoice-receipt-id-custom-models-template-neural

By Fatskills Exam Guides Team — the exam nerds behind 28,500+ quizzes and 2.1M practice questions across 500+ global exams.

Azure AI Document Intelligence – Prebuilt & Custom Models (AI-102 Exam Study Guide)

What This Is

Azure AI Document Intelligence (formerly Form Recognizer) is a cloud-based AI service that extracts text, key-value pairs, tables, and structure from documents (PDFs, images, scans) using prebuilt models (for invoices, receipts, IDs) or custom models (for domain-specific layouts). It’s critical in automated document processing pipelines, such as:

  • Accounts payable automation (extracting invoice data into ERP systems like SAP).
  • KYC (Know Your Customer) workflows (validating IDs and passports).
  • Contract analysis (pulling clauses from legal documents into databases).
  • Healthcare claims processing (reading medical forms and lab reports).

Unlike plain OCR (Optical Character Recognition), which returns only raw text, Document Intelligence interprets document layout and field semantics (e.g., "this text is a total amount" vs. "this is a vendor name") and can handle noisy scans, handwriting, and multi-page files.


Key Terms & Services

  • Azure AI Document Intelligence (formerly Form Recognizer): Microsoft’s managed document understanding service that extracts structured data from unstructured documents. Best for high-accuracy extraction without manual rule-writing.

  • Prebuilt Models: Ready-to-use models for common document types (invoices, receipts, IDs, business cards, tax forms). No training required—just send a document and get JSON output.

  • Custom Models (Template & Neural):

    • Template Model: Uses fixed layout rules (e.g., "the total is always in the bottom-right corner"). Works well for structured, consistent documents (e.g., standardized forms).
    • Neural Model: Uses deep learning to generalize across varied layouts (e.g., invoices from different vendors). Requires more labeled data but handles unstructured documents better.

  • Layout Model: A prebuilt model that extracts text, tables, and selection marks (checkboxes) from any document. Useful for generic document parsing before applying a specialized model.

  • Labeling Tool (Document Intelligence Studio): A web-based UI for manually labeling documents to train custom models. Supports collaborative labeling and exports data in the correct format for training.

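  • Training Data (Labeled Documents): For custom models, plan on 5–10 labeled documents for Template or 10–50+ for Neural. The service accepts as few as five labeled documents for either type, but Neural models generally need the larger set to generalize across layouts. Labels define the fields to extract (e.g., "InvoiceNumber," "DueDate").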

  • API Endpoints (REST & SDKs):

    • Analyze API: Submits a document for processing (supports PDF, JPEG, PNG, TIFF).
    • Get Analyze Result API: Retrieves extraction results in JSON format (key-value pairs, tables, confidence scores).
    • SDKs: Available for Python, .NET, Java, JavaScript (e.g., azure-ai-formrecognizer Python package).

  • Confidence Scores: Each extracted field has a confidence score (0–1). Use this to filter low-confidence results or trigger human review.

  • Batch Processing: Document Intelligence supports asynchronous batch processing for large document sets (e.g., processing 10,000 invoices overnight).

  • Azure Blob Storage Integration: Documents can be stored in Blob Storage and referenced by URL (instead of uploading directly to the API).

  • Azure Cognitive Search Integration: Extracted data can be indexed in Azure Cognitive Search (since renamed Azure AI Search) for full-text search and analytics (e.g., "Find all invoices from Vendor X in Q3 2023").

  • Cost Model:

    • Pay-per-document (prebuilt models: ~$0.01–$0.05 per page; custom models: ~$0.05–$0.10 per page).
    • Free tier: 500 pages/month for prebuilt models.
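To make the pay-per-page arithmetic concrete, the sketch below turns the approximate ranges quoted in this guide into a low/high estimate for a batch. The rates are this guide's ballpark figures rather than official list prices, and `estimate_cost_range` is a hypothetical helper name.

```python
from decimal import Decimal

# Approximate per-page price ranges quoted in this guide (USD).
# These are ballpark figures, not official pricing.
PRICE_RANGES = {
    "prebuilt": (Decimal("0.01"), Decimal("0.05")),
    "custom": (Decimal("0.05"), Decimal("0.10")),
}


def estimate_cost_range(pages: int, model_kind: str) -> tuple[Decimal, Decimal]:
    """Return the (low, high) estimated cost in USD for `pages` pages."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    low_rate, high_rate = PRICE_RANGES[model_kind]
    return pages * low_rate, pages * high_rate


low, high = estimate_cost_range(10_000, "prebuilt")
print(f"10,000 prebuilt pages: ${low} to ${high}")
```

Using `Decimal` avoids floating-point rounding surprises when multiplying small per-page rates by large page counts.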

Step-by-Step / Process Flow

1. Choose Between Prebuilt or Custom Model

  • Use a prebuilt model if:
  • Your documents match standard formats (invoices, receipts, IDs, business cards).
  • You need quick deployment with no training data.
  • Use a custom model if:
  • Your documents have unique layouts (e.g., medical forms, custom contracts).
  • You need higher accuracy for domain-specific fields.
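The decision rule above can be sketched as a small function. This is only a study aid under simplifying assumptions: `choose_model` and its return labels are hypothetical, and the set of prebuilt types is limited to the ones this guide names.

```python
# Prebuilt document types named in this guide (not an exhaustive service list).
PREBUILT_TYPES = {"invoice", "receipt", "id", "business card", "tax form"}


def choose_model(doc_type: str, layouts_vary: bool) -> str:
    """Apply this guide's rule of thumb for picking a model family."""
    if doc_type.strip().lower() in PREBUILT_TYPES:
        # Standard format: no training data needed.
        return "prebuilt"
    # No matching prebuilt type, so a custom model is required.
    return "custom-neural" if layouts_vary else "custom-template"


print(choose_model("Invoice", layouts_vary=False))
print(choose_model("medical claim form", layouts_vary=False))
print(choose_model("contract", layouts_vary=True))
```

A real decision would also weigh accuracy on a test set, since a standard document type with unusual layouts can still justify a custom model.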

2. Set Up Azure AI Document Intelligence

  1. Create a Document Intelligence resource in the Azure Portal (under "AI + Machine Learning").
  2. Copy the endpoint URL and API key (needed for API calls).
  3. (Optional) Enable managed identity for secure access to Blob Storage.
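To show where the endpoint and key end up, here is a minimal sketch that builds, but does not send, a REST Analyze request with the standard library. It assumes the v3.1 REST API (`api-version=2023-07-31`); the resource name, key, and document URL are placeholders, and `build_analyze_request` is a hypothetical helper.

```python
import json
import urllib.request

API_VERSION = "2023-07-31"  # assumption: v3.1 REST API


def build_analyze_request(endpoint: str, key: str, model_id: str,
                          document_url: str) -> urllib.request.Request:
    """Build (but do not send) an Analyze request for a document URL."""
    url = (f"{endpoint.rstrip('/')}/formrecognizer/documentModels/"
           f"{model_id}:analyze?api-version={API_VERSION}")
    body = json.dumps({"urlSource": document_url}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,  # the API key from the portal
            "Content-Type": "application/json",
        },
    )


request = build_analyze_request(
    "https://example-resource.cognitiveservices.azure.com",  # placeholder
    "YOUR_API_KEY",
    "prebuilt-invoice",
    "https://example.com/invoice1.pdf",  # placeholder
)
print(request.get_method(), request.full_url)
```

When such a request is actually sent, the service answers with `202 Accepted` and an `Operation-Location` header; polling that URL is the Get Analyze Result step.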

3. Use a Prebuilt Model (Example: Invoice Processing)

  1. Upload a document (PDF, image) to Blob Storage or use a local file.
  2. Call the Analyze API (Python example):

```python
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

# Values copied from the resource's "Keys and Endpoint" page
endpoint = "YOUR_ENDPOINT"
key = "YOUR_API_KEY"
document_url = "https://yourstorage.blob.core.windows.net/invoices/invoice1.pdf"

client = DocumentAnalysisClient(endpoint, AzureKeyCredential(key))

# Analysis is asynchronous: start the operation, then wait for the result
poller = client.begin_analyze_document_from_url("prebuilt-invoice", document_url)
result = poller.result()
```

  3. Parse the JSON output (extract fields like `VendorName`, `InvoiceTotal`, `DueDate`).
  4. Filter low-confidence fields (e.g., only accept `confidence > 0.8`).
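The parse-and-filter steps can be sketched without the SDK by working on the JSON directly. The payload below is a hand-written, heavily trimmed sample shaped like a REST analyze result (typed values under `value<Type>` keys plus a `confidence`), not real service output; `field_value` and `split_by_confidence` are hypothetical helpers.

```python
import json

# Hand-written, trimmed sample shaped like an analyze result (not real output).
SAMPLE_RESULT = json.loads("""
{
  "analyzeResult": {
    "documents": [{
      "docType": "invoice",
      "fields": {
        "VendorName": {"type": "string", "valueString": "Contoso", "confidence": 0.95},
        "InvoiceTotal": {"type": "currency",
                         "valueCurrency": {"amount": 110.0, "currencySymbol": "$"},
                         "confidence": 0.91},
        "DueDate": {"type": "date", "valueDate": "2023-11-15", "confidence": 0.62}
      }
    }]
  }
}
""")


def field_value(field: dict):
    """Return the typed value (the 'value<Type>' key), else the raw content."""
    for key, value in field.items():
        if key.startswith("value"):
            return value
    return field.get("content")


def split_by_confidence(result: dict, threshold: float = 0.8):
    """Split the first document's fields into accepted and flagged dicts."""
    fields = result["analyzeResult"]["documents"][0]["fields"]
    accepted, flagged = {}, {}
    for name, field in fields.items():
        target = accepted if field.get("confidence", 0.0) > threshold else flagged
        target[name] = field_value(field)
    return accepted, flagged


accepted, flagged = split_by_confidence(SAMPLE_RESULT)
print("accepted:", accepted)
print("needs review:", flagged)
```

With the Python SDK the same idea applies to `result.documents[0].fields`, where each field exposes `value` and `confidence` attributes.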

4. Train a Custom Model (Example: Medical Claim Forms)

  1. Label documents in Document Intelligence Studio:
    • Upload 5–50 sample documents.
    • Define fields to extract (e.g., "PatientName," "ProcedureCode").
    • Draw bounding boxes around each field.
  2. Train the model:
    • Choose Template (for fixed layouts) or Neural (for varied layouts).
    • Submit the training job via API or Studio.
  3. Test the model:
    • Upload a new document and check extraction accuracy.
    • Adjust labels if needed and retrain.
  4. Use the model: once training succeeds, the model is addressable by its unique model ID in Analyze API calls, with no separate deployment step.
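When the training job is submitted through the REST API rather than Studio, the request carries a small JSON body. The sketch below builds that body under the assumption of the v3.1 API, where it is POSTed to `{endpoint}/formrecognizer/documentModels:build?api-version=2023-07-31`; the model ID and container URL are placeholders, and `build_model_request_body` is a hypothetical helper.

```python
import json


def build_model_request_body(model_id: str, build_mode: str,
                             container_sas_url: str) -> str:
    """Return the JSON body for a custom model build request."""
    if build_mode not in ("template", "neural"):
        raise ValueError("build_mode must be 'template' or 'neural'")
    return json.dumps({
        "modelId": model_id,
        "buildMode": build_mode,
        # Labeled training files live in an Azure Blob Storage container.
        "azureBlobSource": {"containerUrl": container_sas_url},
    })


body = build_model_request_body(
    "medical-claims-v1",  # hypothetical model ID
    "template",
    "https://yourstorage.blob.core.windows.net/training?<sas-token>",  # placeholder
)
print(body)
```

The `modelId` chosen here is the same identifier later passed to the Analyze API in place of a prebuilt name such as `prebuilt-invoice`.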

5. Integrate with Downstream Systems

  • Store extracted data in Azure SQL Database, Cosmos DB, or Blob Storage.
  • Trigger workflows (e.g., Power Automate for approvals, Logic Apps for ERP updates).
  • Index in Azure Cognitive Search for search and analytics.
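Whichever destination is chosen, the extracted fields usually need to be flattened into one record per document first. The following is a minimal sketch of that step; the record layout (`id`, `sourceUrl`, `processedAt`) and the name `to_index_record` are illustrative assumptions, not a schema required by any Azure service.

```python
import json


def to_index_record(document_id: str, source_url: str,
                    processed_at: str, fields: dict) -> str:
    """Flatten extracted fields into one JSON document for storage or indexing."""
    record = {
        **fields,
        # Metadata keys are written last so a field named "id" cannot override them.
        "id": document_id,
        "sourceUrl": source_url,
        "processedAt": processed_at,
    }
    return json.dumps(record, sort_keys=True)


record_json = to_index_record(
    "inv-001",  # hypothetical document key
    "https://yourstorage.blob.core.windows.net/invoices/invoice1.pdf",
    "2023-11-01T00:00:00Z",
    {"VendorName": "Contoso", "InvoiceTotal": 110.0},
)
print(record_json)
```

A flat record like this maps naturally onto a SQL row, a Cosmos DB item, or a search index document.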

Common Mistakes

  • Mistake: Using a prebuilt model for a custom document type (e.g., trying to extract "PatientID" from a medical form using the "Invoice" model).
    Correction: Train a custom model if your document doesn’t match a prebuilt type. Prebuilt models only work for standard formats.

  • Mistake: Assuming the Layout model extracts key-value pairs (e.g., expecting it to know "Total Amount" is a field).
    Correction: The Layout model only extracts text, tables, and selection marks; it doesn’t understand semantics. Use a prebuilt or custom model for key-value extraction.

  • Mistake: Not filtering low-confidence results (e.g., accepting all extracted fields without checking confidence scores).
    Correction: Always filter fields by confidence (e.g., if field.confidence < 0.7: flag_for_review()). Low-confidence fields often contain errors.

  • Mistake: Training a Neural model with only 5 documents (Neural models need 10–50+ labeled samples for good accuracy).
    Correction: Use a Template model if you have <10 labeled documents. Neural models require more data but generalize better.

  • Mistake: Storing documents in non-Azure storage (e.g., AWS S3) and trying to use them directly with Document Intelligence.
    Correction: Upload documents to Azure Blob Storage (required for custom model training data) or send local files in API calls. Document Intelligence has no built-in connector for AWS/GCP storage.

Certification Exam Insights

What the AI-102 Exam Tests

  1. Prebuilt vs. Custom Model Selection
    • Tricky trap: The exam may describe a scenario with slightly non-standard invoices (e.g., "invoices from 50 different vendors with varying layouts"). The correct answer is not always a prebuilt model; if layouts vary significantly, a Neural custom model is better.
    • Key rule: If the document type is not in the prebuilt list (invoice, receipt, ID, business card, tax form), you must use a custom model.

  2. When to Use Document Intelligence vs. Other Azure Services
    • Azure Cognitive Search: Best for full-text search (e.g., "Find all documents containing 'urgent'"). Use Document Intelligence first to extract structured data, then index it in Cognitive Search.
    • Azure Computer Vision (OCR): Only extracts raw text (no key-value pairs or tables). Use Document Intelligence if you need structured data.
    • Azure Applied AI Services (e.g., Metrics Advisor, Immersive Reader): Not for document extraction; these are for anomaly detection and accessibility.

  3. Cost and Performance Tradeoffs
    • Prebuilt models are cheaper (~$0.01–$0.05/page) but less flexible.
    • Custom models cost more (~$0.05–$0.10/page) but handle unique layouts.
    • Batch processing is cheaper than real-time API calls for large volumes.

  4. Confidence Scores and Error Handling
    • The exam may ask: "What should you do if a field’s confidence score is 0.6?"
      • Answer: Flag for human review or fall back to a rule-based system.
    • Never ignore confidence scores; they indicate extraction reliability.
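The "flag for review or fall back to a rule-based system" answer can be sketched as a small routing function. Everything here is illustrative: the 0.7 threshold, the regular expression, the route labels, and the name `resolve_total` are assumptions, and a production rule would need to reflect the actual document format.

```python
import re

# Hypothetical rule: a currency amount that follows the standalone word "Total".
TOTAL_PATTERN = re.compile(r"\btotal\b\D{0,20}?(\d[\d,]*\.\d{2})", re.IGNORECASE)


def resolve_total(extracted: str, confidence: float, raw_text: str,
                  threshold: float = 0.7):
    """Return (value, route) for an extracted invoice total."""
    if confidence >= threshold:
        return extracted, "accepted"
    match = TOTAL_PATTERN.search(raw_text)
    if match and match.group(1) == extracted:
        # The rule-based check agrees with the model, so accept the value.
        return extracted, "accepted-by-rule"
    # Low confidence and no agreement: send to a person.
    return extracted, "human-review"


page_text = "Subtotal 100.00\nTotal: $110.00"
print(resolve_total("110.00", 0.95, page_text))
print(resolve_total("110.00", 0.60, page_text))
print(resolve_total("170.00", 0.60, page_text))
```

The word boundary in the pattern keeps "Subtotal" from matching, so the fallback only confirms a value that appears next to the actual total label.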

Quick Check Questions

Question 1

A healthcare provider needs to extract patient names, procedure codes, and insurance IDs from 10,000 scanned medical claim forms. The forms have consistent layouts but include handwritten notes. Which Azure AI Document Intelligence approach should they use?

Answer: Train a custom Template model.

Explanation: Prebuilt models don’t support medical forms, and while Neural models handle handwriting better, Template models work well for consistent layouts and require less training data.


Question 2

A retail company wants to process receipts from thousands of stores, each with slightly different layouts. They need high accuracy but have limited labeled data. Which model should they use?

Answer: Use the prebuilt Receipt model first, then train a custom Neural model if needed.

Explanation: The prebuilt Receipt model will work for most cases and cannot itself be fine-tuned. If its accuracy is insufficient, they can label a small dataset (10–20 receipts) and train a separate custom Neural model for better generalization.


Question 3

A logistics company is extracting shipping labels with barcodes, sender/recipient addresses, and package weights. They want to minimize costs and avoid training a custom model. Which Azure service should they use?

Answer: Azure AI Document Intelligence Layout model, with the barcode add-on capability enabled.

Explanation: The Layout model extracts text and tables without any training, and recent API versions can decode barcodes as an optional add-on feature. The Read OCR capability in Computer Vision returns text only and does not decode barcodes. This avoids custom model training costs.


Last-Minute Cram Sheet

  1. Prebuilt models: Invoice, Receipt, ID, Business Card, Tax (US W-2), Layout. No training needed.
  2. Custom models:
    • Template: Fixed layouts, 5–10 labeled docs, cheaper.
    • Neural: Varied layouts, 10–50+ labeled docs, more accurate but expensive.
  3. Confidence scores: Always filter low-confidence fields (e.g., confidence < 0.7).
  4. Document Intelligence vs. Computer Vision:
    • Document Intelligence = structured data extraction (key-value pairs, tables).
    • Computer Vision = raw text + object detection (no semantics).
  5. Storage: Keep documents in Azure Blob Storage (required for custom model training data) or send local files; there is no built-in connector for AWS S3/GCP Cloud Storage.
  6. Cost:
    • Prebuilt: ~$0.01–$0.05/page.
    • Custom: ~$0.05–$0.10/page.
    • Free tier: 500 pages/month for prebuilt models.
  7. Batch processing: Use asynchronous API for large document sets.
  8. Trap: The Layout model does NOT extract key-value pairs, only text, tables, and selection marks.
  9. Trap: Neural models need more data; don’t use them if you only have 5 labeled docs (use Template instead).