Microsoft study guide

How to pass Microsoft Azure Data Fundamentals (DP-900)

17 min read · 4 domains covered · Free practice, no sign-up

Microsoft Azure Data Fundamentals (DP-900) is an entry-level certification that checks whether you can describe core data concepts and recognise which Azure data service fits a given workload. It does not ask you to build anything or write production code. It asks you to know what structured, semi-structured, and unstructured data are, what separates a transactional workload from an analytical one, and which managed Azure service Microsoft positions for each job. Almost every question is a definition, a comparison, or a short scenario that resolves to naming the right service or the right concept.

It suits people early on an Azure data path: analysts, students, career changers, and developers or administrators from other clouds who need a credential that proves they understand the vocabulary and the service map. There are no prerequisites, and most candidates do not have deep hands-on Azure experience. That is by design. The exam rewards clean recall of definitions and the ability to tell similar services apart, not operational depth.

The pass-or-fail line is precision on the distinctions Microsoft repeats. Blob versus File versus Table versus Queue. Azure SQL Database versus Managed Instance versus SQL on a VM. A star schema fact table versus a dimension table. A Power BI report versus a dashboard versus a paginated report. The questions reuse the same handful of contrasts, so a candidate who has drilled those pairs until they are automatic clears the exam comfortably, while one who only half-remembers them loses marks to the distractor that looks almost right.

DP-900 is a vocabulary-and-service-map exam: nearly every question is a definition or a short scenario that resolves to naming the correct data concept or the Azure service Microsoft positions for that workload.

Difficulty

Foundational

Best for

People starting an Azure data journey: data analysts, students, career changers, and developers or administrators from other platforms who want a recognised credential proving they understand core data concepts and the Azure data service map.

Prerequisites

None. No prior Azure or database experience is required. Familiarity with everyday data ideas such as tables, files, and reports helps, and a free Azure account to click through the services makes the concepts stick faster.

Questions: typically 40 to 60
Time allowed: 45 min
Pass mark: 700 / 1000
Exam cost (USD): $99
Practice questions: 262

How this exam thinks

This exam thinks in definitions and best-fit matches. It hands you a concept to identify or a short workload description, and the right answer is the textbook definition or the single Azure service Microsoft positions for that scenario. When several services could loosely apply, pick the one whose stated purpose matches the workload exactly: the access pattern, the data shape, or the management model named in the question is the deciding signal. Transactional and high-volume small reads and writes point to operational stores; large-scale aggregation and reporting point to analytics services. Treat the vendor's own description of each service as the source of truth, and let the distinctive feature in the question, such as global distribution, unstructured binary data, or automated patching, select the answer.

What each domain tests and how to study it

The DP-900 blueprint is split across 4 domains. The weights below are approximate shares of the exam; Microsoft publishes each weight as a range, so see the official exam guide for the authoritative breakdown.

  1. Describe Core Data Concepts

    30% of exam

    What you must be able to do. Be able to classify any data example as structured, semi-structured, or unstructured, name the storage option and workload type it belongs to, and identify which data role owns the task.

    In one sentence. The vocabulary domain: data formats, storage options, transactional versus analytical workloads, and who does what on a data team.

    Recall check: answer these from memory first
    • Classify each as structured, semi-structured, or unstructured: a customer table with a fixed schema, JSON profiles whose fields differ per record, and a folder of scanned images.
    • What is the difference between a transactional (OLTP) workload and an analytical (OLAP) workload, and which one is about running the business versus understanding it?
    • Name the data engineer, data analyst, and data scientist responsibilities in one line each.

    What it tests. The shared language the rest of the exam builds on. Representing data as structured (relational tables with a fixed schema), semi-structured (JSON and other documents whose fields vary between records), and unstructured (images, video, free text, and binary files); identifying storage options including file stores, databases, and data lakes; telling transactional (OLTP) workloads, which handle many small fast reads and writes, apart from analytical (OLAP) workloads, which aggregate large volumes for reporting; and recognising the data engineer, data analyst, and data scientist roles and what each is responsible for. It also covers file formats such as row-based Avro and the descriptive JSON header it carries.

    How to study it. Drill classification until it is instant: take any example and label it structured, semi-structured, or unstructured, then say which storage option and workload it implies. Build a three-column table for the data roles and fill it from memory: the data engineer builds and maintains pipelines and storage, the data analyst models and visualises data for decisions, the data scientist applies statistics and machine learning. Learn the two workload types as a contrast, not in isolation: OLTP is many small fast transactions for running the business, OLAP is heavy aggregation over large history for understanding it. Practise on the sample questions and read every worked explanation, watching for the one word that fixes the category, such as fields varying between records meaning semi-structured.
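
The classification drill above can be made concrete with a small sketch. It is only an illustration: the sample customers, profiles, and the helper `shared_and_varying_keys` are hypothetical names invented for this example, and the few bytes stand in for a real scanned image.

```python
import json

# Structured: every row has the same columns in the same order (fixed schema).
customer_columns = ("id", "name", "country")
customer_rows = [(1, "Ana", "PT"), (2, "Ben", "UK")]

# Semi-structured: JSON documents carry keys, but the keys vary per record.
profiles = [
    json.loads('{"id": 1, "name": "Ana", "email": "ana@example.com"}'),
    json.loads('{"id": 2, "name": "Ben", "phones": ["555-0100"]}'),
]

# Unstructured: raw bytes with no inherent fields (stand-in for a scanned image).
scanned_image = bytes([0x89, 0x50, 0x4E, 0x47])


def shared_and_varying_keys(docs):
    """Return (keys present in every document, keys present in only some)."""
    key_sets = [set(doc) for doc in docs]
    shared = set.intersection(*key_sets)
    varying = set.union(*key_sets) - shared
    return shared, varying


shared, varying = shared_and_varying_keys(profiles)
print(sorted(shared))   # ['id', 'name']
print(sorted(varying))  # ['email', 'phones']
```

The varying keys are the signal to watch for: the documents still have keys, so they are semi-structured, whereas the byte string has no fields to inspect at all.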

    Easy to confuse

    • Semi-structured versus unstructured data. Semi-structured data carries some organising tags or keys even if the fields vary between records, such as JSON documents; unstructured data has no inherent field structure at all, such as images, video, or free text. Varying fields still count as semi-structured, not unstructured.
    • Transactional (OLTP) versus analytical (OLAP) workload. OLTP handles many small, fast reads and writes for the live application and keeps data current; OLAP aggregates large volumes of historical data for reporting and analysis. One runs the business in real time, the other analyses it after the fact.
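
To see the OLTP-versus-OLAP contrast in miniature, the sketch below uses Python's built-in `sqlite3` module as a stand-in for a real database. The `orders` table and its values are made up for illustration, and a single in-memory database is used only to show the two access patterns side by side; in practice the two workloads usually run on different services.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")

# OLTP-style: many small writes, each committed as its own transaction.
for region, amount in [("EU", 20.0), ("EU", 35.0), ("US", 50.0)]:
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO orders (region, amount) VALUES (?, ?)", (region, amount)
        )

# OLTP-style: a small point read by primary key.
one_order = conn.execute(
    "SELECT region, amount FROM orders WHERE id = ?", (2,)
).fetchone()

# OLAP-style: one query that scans and aggregates the accumulated history.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()

print(one_order)  # ('EU', 35.0)
print(totals)     # [('EU', 55.0), ('US', 50.0)]
```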

    Worked example from the DP-900 bank

    Free sample · Describe Core Data Concepts · medium

    Storing customer profiles as JSON documents whose fields vary between records is an example of semi-structured data. Is this statement correct?

    • A. Yes (Correct)
    • B. No
    JSON documents with fields that differ between records are semi-structured, not unstructured.

    Why A is correct: JSON is a common semi-structured format precisely because its documents can vary in their specific fields between instances.

    Why B is wrong: Answering No is wrong because semi-structured data is defined as having some structure while allowing variation between instances, and JSON documents with varying fields are the standard example.

  2. Identify Considerations for Relational Data on Azure

    23% of exam

    What you must be able to do. Be able to define the core relational objects (table, row, key, view, index, stored procedure) and select the correct Azure SQL deployment option from its management model.

    In one sentence. The relational domain: core relational concepts plus choosing the right Azure SQL family member by who manages what.

    Recall check: answer these from memory first
    • In a relational table, what does a row represent, what does a column represent, and what does the primary key do?
    • Define a view and a stored procedure, and give the one feature that separates them.
    • For which Azure SQL option do you manage the operating system and engine patching yourself, and which two options hand that to Microsoft?

    What it tests. Relational fundamentals and the Azure SQL service map. Describing relational concepts: a table models an entity, a row is one instance of that entity, a primary key uniquely identifies each row, normalisation reduces redundancy, an index speeds lookups, a view is a virtual table from a SELECT query, and a stored procedure packages SQL that runs on command and can take parameters. Then identifying relational Azure data services: the Azure SQL family (Azure SQL Database, Azure SQL Managed Instance, and SQL Server on Azure Virtual Machines) and Azure Database for the open-source engines MySQL, PostgreSQL, and MariaDB, distinguished mainly by how much of the operating system and engine Microsoft manages for you.

    How to study it. Split this domain into two lists and drill each. First, the relational object definitions: write one sentence each for table, row, primary key, foreign key, index, view, and stored procedure, then test yourself by reading a definition and naming the object. The view-versus-stored-procedure and view-versus-index contrasts are common traps, so do them by hand. Second, the Azure SQL management ladder: SQL Server on a VM (IaaS) means you patch the OS and the engine yourself; Azure SQL Managed Instance and Azure SQL Database (PaaS) hand patching, backups, and recovery to Microsoft, with Managed Instance offering near-full SQL Server compatibility and Database being the fully managed single-database PaaS. Learn the responsibility split as the deciding factor, because that is how the exam frames the choice.
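
The management ladder described above can be written down as a small lookup for self-testing. This is a study aid only: the dictionary, its field names, and `options_patched_by` are illustrative and are not part of any Azure SDK.

```python
# Who patches what, as described in this guide. Keys and field names are
# illustrative labels, not identifiers from an Azure API.
AZURE_SQL_OPTIONS = {
    "SQL Server on Azure Virtual Machines": {"model": "IaaS", "patching": "you"},
    "Azure SQL Managed Instance": {"model": "PaaS", "patching": "Microsoft"},
    "Azure SQL Database": {"model": "PaaS", "patching": "Microsoft"},
}


def options_patched_by(owner):
    """Return the Azure SQL options whose OS and engine patching falls to `owner`."""
    return sorted(
        name
        for name, traits in AZURE_SQL_OPTIONS.items()
        if traits["patching"] == owner
    )


print(options_patched_by("you"))
# ['SQL Server on Azure Virtual Machines']
print(options_patched_by("Microsoft"))
# ['Azure SQL Database', 'Azure SQL Managed Instance']
```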

    Easy to confuse

    • View versus stored procedure. A view is a virtual table defined by a SELECT query that you read like a table; a stored procedure is saved SQL that runs on command, can take parameters, and can modify data. A view exposes rows; a procedure executes logic.
    • Azure SQL Database versus SQL Server on Azure Virtual Machines. Azure SQL Database is fully managed PaaS where Microsoft patches and backs up the engine for you; SQL Server on Azure VMs is IaaS where you own the operating system and decide when to update the OS and SQL Server software. The deciding factor is who is responsible for patching.
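
The view, index, and stored-procedure distinctions are easier to hold once you have seen them run. The sketch below uses Python's built-in `sqlite3` module rather than Azure SQL, and SQLite has no stored procedures, so the Python function `move_customer` is a hypothetical stand-in for one: parameterised logic that runs on command and modifies data. Table and object names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, country TEXT);
    INSERT INTO customer (name, country) VALUES ('Ana', 'PT'), ('Ben', 'UK');
    -- An index speeds lookups on a column; it is not something you query.
    CREATE INDEX idx_customer_country ON customer (country);
    -- A view is a virtual table defined by a SELECT; you read it like a table.
    CREATE VIEW uk_customers AS
        SELECT id, name FROM customer WHERE country = 'UK';
""")


def move_customer(conn, customer_id, new_country):
    """Stand-in for a stored procedure: takes parameters and modifies data."""
    with conn:
        conn.execute(
            "UPDATE customer SET country = ? WHERE id = ?",
            (new_country, customer_id),
        )


before = conn.execute("SELECT name FROM uk_customers ORDER BY name").fetchall()
move_customer(conn, 1, "UK")
after = conn.execute("SELECT name FROM uk_customers ORDER BY name").fetchall()

print(before)  # [('Ben',)]
print(after)   # [('Ana',), ('Ben',)]
```

The view stores no rows of its own, which is why it reflects the update immediately: it exposes rows, while the procedure-like function executes logic.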

    Worked example from the DP-900 bank

    Free sample · Identify Considerations for Relational Data on Azure · medium

    For which Azure SQL option are you responsible for deciding when to update or upgrade the operating system and SQL Server software, and for managing the OS yourself?

    • A. Azure SQL Database, where Microsoft automatically updates and patches the SQL Server software for you
    • B. Azure SQL Managed Instance, which provides fully automated updates, backups, and recovery
    • C. SQL Server on Azure Virtual Machines, where you manage the operating system and SQL Server updates (Correct)
    • D. Azure SQL Database Hyperscale, which separates storage and compute and scales rapidly on demand
    Only SQL Server on Azure VMs (IaaS) makes OS and engine patching your responsibility; both PaaS options automate it.

    Why A is wrong: SQL Database is fully managed PaaS with automated updates, so you do not manage OS or engine patching.

    Why B is wrong: Managed Instance is PaaS with automated software update management, so the OS and engine patching is handled for you.

    Why C is correct: With SQL Server on Azure VMs you have full control over the OS and SQL Server configuration and it is up to you to decide when to update or upgrade the OS and database software.

    Why D is wrong: Hyperscale is a SQL Database service tier and remains fully managed PaaS, so Microsoft handles patching.

  3. Describe Considerations for Working with Non-Relational Data on Azure

    18% of exam

    What you must be able to do. Be able to match an unstructured or non-relational workload to the correct Azure Storage service (blob, file, table, queue) and state what makes Azure Cosmos DB distinctive.

    In one sentence. The non-relational domain: picking the right Azure Storage service by data shape, and knowing Cosmos DB for global distribution.

    Recall check: answer these from memory first
    • Match each to a workload: storing images and video, a mountable cross-platform file share, simple key-value rows, and messages passed asynchronously between application components.
    • What is the difference between Azure Blob storage and Azure Files in how applications reach the data?
    • How does Azure Cosmos DB make data available in additional regions, and what does the service do automatically when you add one?

    What it tests. Azure Storage and Cosmos DB selection. The four Azure Storage services and their jobs: Blob storage holds massive unstructured binary data such as images, video, logs, and backups as objects read through a storage API; Azure Files offers managed file shares reachable over SMB and NFS and via a URL with a shared access signature token; Table storage keeps key-value and key-attribute data in rows; Queue storage holds messages for asynchronous processing between application components. Then the capabilities of Azure Cosmos DB: a globally distributed, multi-model NoSQL service where you add Azure regions to an account at any time and the service automatically replicates the data to each one, supporting several APIs and low-latency access at global scale.

    How to study it. Memorise the four-way Azure Storage split by the data shape and access need each one serves, then practise reading a workload and naming the service: unstructured binary objects mean Blob, a mountable file share means Files, simple key-value rows mean Table, and asynchronous messages between components mean Queue. The Blob-versus-Files trap appears often, so anchor it: Blob is object storage accessed through an API, Files is a file share you mount or reach with a SAS URL. For Cosmos DB, fix the headline features: turnkey global distribution where adding a region triggers automatic replication, multi-model support through multiple APIs, and low latency at global scale. The exam usually tests Cosmos DB by its global distribution behaviour, so know that adding a region replicates automatically rather than requiring manual export and import.

    Easy to confuse

    • Azure Blob storage versus Azure Files. Blob storage is object storage for massive unstructured binary data, read and written through the Blob storage API; Azure Files is a managed file share you mount over SMB or NFS or reach via a URL with a SAS token. Blob is objects via an API, Files is a mountable share.
    • Azure Table storage versus Azure Queue storage. Table storage persists structured key-value and key-attribute data as rows you query; Queue storage holds messages temporarily so application components can communicate asynchronously. Table is for storing data, Queue is for passing messages.
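
The four-way Azure Storage split reads naturally as a lookup from data shape to service. The sketch below is a flashcard-style drill, not Azure code: the phrases used as keys and the function `pick_storage` are hypothetical labels chosen for this example.

```python
# Data shape or access need -> the Azure Storage service this guide pairs it with.
# The key phrases are illustrative; exam scenarios word them in many ways.
STORAGE_BY_NEED = {
    "unstructured binary objects": "Azure Blob storage",
    "mountable file share": "Azure Files",
    "key-value rows": "Azure Table storage",
    "asynchronous messages": "Azure Queue storage",
}


def pick_storage(need):
    """Return the service for a recognised need, or fail loudly for anything else."""
    try:
        return STORAGE_BY_NEED[need]
    except KeyError:
        raise ValueError(
            f"no rule for {need!r}; reread the scenario for its distinctive feature"
        ) from None


print(pick_storage("mountable file share"))   # Azure Files
print(pick_storage("asynchronous messages"))  # Azure Queue storage
```

Failing on an unknown need is deliberate: if a scenario does not match one of the four shapes, the answer is probably not an Azure Storage service at all.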

    Worked example from the DP-900 bank

    Free sample · Describe Considerations for Working with Non-Relational Data on Azure · medium

    An application running on Azure Cosmos DB needs to serve users on multiple continents with data kept close to each of them. How does Cosmos DB make data available in additional regions?

    • A. You must export the data and import it into a separate database deployed in each region you want to serve.
    • B. You add Azure regions to the account at any time, and the service automatically replicates the data to each one. (Correct)
    • C. You configure a scheduled nightly job that copies the container's items to read replicas in the other regions.
    • D. You deploy a separate Cosmos DB account per region and keep them in sync through application code on every write.
    Cosmos DB offers turnkey global distribution: add a region and replication to it happens automatically.

    Why A is wrong: Cosmos DB replicates automatically; manual export and re-import into separate databases is not how its global distribution works.

    Why B is correct: Cosmos DB is built for global distribution: you add Azure regions to the account at any time and the service automatically replicates your data to each region.

    Why C is wrong: Replication in Cosmos DB is continuous and automatic, not a scheduled batch copy that the customer sets up.

    Why D is wrong: A single account spans multiple regions with built-in replication, so per-region accounts synced by app code are unnecessary.

  4. Describe an Analytics Workload on Azure

    29% of exam

    What you must be able to do. Be able to describe data warehousing and the star schema, name the large-scale and real-time analytics services, and tell Power BI reports, dashboards, and paginated reports apart.

    In one sentence. The analytics domain: warehousing and star schemas, Synapse and Fabric and data lakes, stream processing, and Power BI visualisation.

    Recall check: answer these from memory first
    • In a star schema, what does the central fact table hold and what do the surrounding dimension tables hold, and which one carries the numbers you aggregate?
    • What makes Microsoft Fabric distinct as an analytics platform, and what shared storage do all its workloads use?
    • In Power BI, how do a report, a dashboard, and a paginated report differ in purpose?

    What it tests. The analytics stack from ingestion to visualisation. Large-scale analytics elements: Azure Synapse Analytics, Microsoft Fabric as a unified end-to-end SaaS analytics platform whose workloads all share OneLake storage, and data lakes; the star schema, where a central fact table holds the numeric values that can be aggregated and surrounding dimension tables hold the descriptive entities you group and slice by. Real-time analytics: stream processing and event ingestion, and the difference between batch and stream processing. Data visualisation in Microsoft Power BI: interactive reports, dashboards that pin visuals from reports, paginated reports for pixel-perfect printable output, and storage modes such as Direct Lake that connect a semantic model directly to OneLake files without a separate import step.

    How to study it. Learn the star schema as a pair: the fact table holds numbers you aggregate, the dimension tables hold the entities you aggregate by, and a fact joined directly to its dimensions is a star schema while dimensions branching into further detail tables make it a snowflake. Fix the platform identities: Microsoft Fabric is the unified SaaS analytics platform built on shared OneLake, Synapse Analytics is the enterprise analytics service, and a data lake stores raw files of any shape. For real-time, hold the batch-versus-stream contrast: batch processes data in scheduled chunks, stream processes events continuously as they arrive. For Power BI, separate the three artefact types by purpose: a report is an interactive multi-page analysis, a dashboard is a single-page canvas of pinned visuals, and a paginated report is built for precise printable layouts. Note that Direct Lake mode reads OneLake files directly without importing them first.
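
The batch-versus-stream contrast can be shown with a few lines of plain Python. This is a toy model under simple assumptions: the list `readings` stands in for incoming events, and the function names are invented for the example rather than taken from any Azure service.

```python
def batch_total(events):
    """Batch: wait for the whole chunk to exist, then process it in one pass."""
    return sum(events)


def stream_totals(events):
    """Stream: update the result as each event arrives, yielding every step."""
    running = 0
    for value in events:
        running += value
        yield running


readings = [3, 5, 2]
print(batch_total(readings))          # 10
print(list(stream_totals(readings)))  # [3, 8, 10]
```

Both approaches end on the same total; the difference is that the stream version has a usable answer after every event, while the batch version has nothing until the chunk is complete.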

    Easy to confuse

    • Fact table versus dimension table (star schema). The fact table holds the numeric, aggregatable measures such as sales amounts; the dimension tables hold the descriptive entities such as product, customer, and time that you group and slice those numbers by. Numbers live in the fact table, attributes in the dimensions.
    • Power BI report versus dashboard. A report is an interactive, often multi-page analysis built on a single semantic model; a dashboard is a single-page canvas that pins selected visuals, which can come from several reports, for an at-a-glance view. A report is the deep analysis, a dashboard is the curated summary.
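
A tiny star schema makes the fact-versus-dimension split visible. The sketch below uses Python's built-in `sqlite3` module as a stand-in for a data warehouse; the table names `fact_sales` and `dim_product` and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension table: descriptive entities you group and slice by.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    -- Fact table: numeric measures, keyed to the dimensions.
    CREATE TABLE fact_sales (
        sale_id INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES dim_product (product_id),
        amount REAL
    );
    INSERT INTO dim_product VALUES (1, 'Bikes'), (2, 'Helmets');
    INSERT INTO fact_sales (product_id, amount)
        VALUES (1, 500.0), (1, 700.0), (2, 40.0);
""")

# Aggregate the numbers from the fact table, grouped by a dimension attribute.
rows = conn.execute("""
    SELECT d.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_id = f.product_id
    GROUP BY d.category
    ORDER BY d.category
""").fetchall()

print(rows)  # [('Bikes', 1200.0), ('Helmets', 40.0)]
```

The query shape is the thing to remember: `SUM` is applied to a column in the fact table, and `GROUP BY` uses a column from the dimension table.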

    Worked example from the DP-900 bank

    Free sample · Describe an Analytics Workload on Azure · easy

    Which description best characterises Microsoft Fabric as a platform for large-scale analytics?

    • A. A relational engine for transactional workloads where data is normalised for fast row writes
    • B. A cloud analytics platform built on Apache Spark and optimised for code-first engineering
    • C. A standalone integration service for building pipelines that target on-premises sources
    • D. A unified end-to-end SaaS analytics platform whose workloads all share OneLake storage (Correct)
    Microsoft Fabric is a single SaaS analytics platform built on shared OneLake storage.

    Why A is wrong: That describes a transactional database tuned for writes, not Fabric, which is built for analytics across shared OneLake storage.

    Why B is wrong: Being built on Apache Spark for code-first engineering describes Azure Databricks, a different service.

    Why C is wrong: A standalone pipeline service for hybrid on-premises sources describes Azure Data Factory, not the unified Fabric platform.

    Why D is correct: Microsoft Fabric is described as a unified, end-to-end SaaS analytics platform built on OneLake, where every Fabric workload reads from and writes to the same tenant-wide data lake.

A study plan that works

  1. Map the four domains and book a date

    Day 1

    Read the official skills outline and note the four domains and their weights. Core Data Concepts and the Analytics Workload are the two heaviest, so they deserve the most time. Book a provisional exam date now: a fixed date converts open-ended study into a plan and makes it far more likely that you actually sit the exam.

  2. Lock the core vocabulary

    Week 1

    Before touching any Azure service, nail the shared language: structured versus semi-structured versus unstructured data, transactional versus analytical workloads, the file and database and data lake storage options, and the three data roles. Use the recall prompts in this guide: cover the answer, classify or define from memory, then reveal. If you cannot label an example on sight, you do not own it yet.

  3. Drill the relational concepts and Azure SQL ladder

    Week 2

    Write one-line definitions for table, row, key, index, view, and stored procedure, then test by reading a definition and naming the object. Learn the Azure SQL management ladder by who patches what: SQL on a VM is your responsibility, Managed Instance and SQL Database are Microsoft's. Practise the view-versus-stored-procedure and Database-versus-VM calls until the responsibility split alone decides them.

  4. Build the non-relational storage decision tree

    Week 3

    Memorise the four Azure Storage services by data shape: Blob for unstructured objects, Files for mountable shares, Table for key-value rows, Queue for messages. Drill the Blob-versus-Files trap. Then learn Cosmos DB by its global distribution behaviour: add a region and replication happens automatically. Read the worked explanation on every sample question, including the ones you got right.

  5. Cover the analytics stack and Power BI

    Week 4

    Learn the star schema as a fact-versus-dimension pair, fix the identities of Synapse, Fabric, OneLake, and data lakes, and hold the batch-versus-stream contrast. Separate the Power BI report, dashboard, and paginated report by purpose. This domain is heavy, so do not leave it short, and tie each service back to whether the workload is large-scale, real-time, or visualisation.

  6. Drill weak domains and space the review

    Week 5

    Use your per-domain accuracy to attack the two domains dragging you down rather than re-reading what you already know. Then space it: revisit each domain's recall prompts after a few days and again a week later. Spaced review of this kind sticks far better than cramming the night before.

  7. Sit a timed mock and calibrate

    Week 6

    Take at least one full timed mock under exam conditions to rehearse pacing and the flag-and-return habit. Treat the score as a per-domain readiness signal, not a single number, and review every missed question, naming the definition or service distinction you misread, before you book or sit.
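
Steps 6 and 7 both lean on per-domain accuracy rather than a single score. If you keep your own tally of practice answers, a sketch like the following shows one way to compute it. The function names, the short domain labels, and the sample session are all hypothetical, and the choice to surface the two weakest domains simply mirrors step 6.

```python
from collections import defaultdict


def accuracy_by_domain(results):
    """results: iterable of (domain, was_correct) pairs from a practice session."""
    seen = defaultdict(int)
    right = defaultdict(int)
    for domain, was_correct in results:
        seen[domain] += 1
        right[domain] += int(was_correct)
    return {domain: right[domain] / seen[domain] for domain in seen}


def weakest(accuracy, count=2):
    """Return the `count` domains with the lowest accuracy, weakest first."""
    return sorted(accuracy, key=accuracy.get)[:count]


session = [
    ("Core data concepts", True), ("Core data concepts", True),
    ("Relational", True), ("Relational", False),
    ("Non-relational", False), ("Non-relational", False),
    ("Analytics", True), ("Analytics", True), ("Analytics", False),
]

acc = accuracy_by_domain(session)
print(weakest(acc))  # ['Non-relational', 'Relational']
```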

Know when you're ready

Readiness for DP-900 is a measured score on questions you have not seen before, not a feeling that the services sound familiar. Those are different things, and the gap between them is where people slip on a foundational exam they assumed was easy. Re-reading notes builds recognition, and recognition feels like knowledge, so confidence rises while real recall lags. The fix is to test yourself: if you can read a fresh definition or scenario, name the concept or the single Azure service that fits, and say why each near-miss distractor is wrong, you know it. If you can only nod along to an explanation, you do not yet. Aim to clear every one of the four domains comfortably on unseen questions across more than one session, with particular confidence on the Blob-versus-Files, Azure SQL deployment, fact-versus-dimension, and Power BI artefact distinctions the exam reuses. The practice bank, with a worked explanation and a reason every distractor is wrong on every question, is where you find out whether you are there. Not before.

Exam-day tips

  • Read each scenario for the distinctive feature. The one detail that stands out, such as unstructured binary data, global distribution, or automated patching, is what selects the service, so find it before judging the options.
  • Tell the storage services apart by data shape. Unstructured objects mean Blob, a mountable share means Files, key-value rows mean Table, and asynchronous messages mean Queue; do not default to the one you have heard of most.
  • Use the management model to pick the Azure SQL option. If you patch the OS and engine yourself it is SQL on a VM; if Microsoft does it for you it is Azure SQL Database or Managed Instance.
  • Keep the star schema pair straight. Numbers and measures live in the fact table, descriptive entities live in the dimension tables, and a fact joined directly to its dimensions is a star schema.
  • Separate the Power BI artefacts by purpose. A report is the interactive analysis, a dashboard is the pinned single-page summary, and a paginated report is the pixel-perfect printable layout.
  • Watch the yes-or-no and true-false items closely. They hinge on one precise claim, so read the statement against the textbook definition and do not let a plausible-sounding phrase override it.
  • Flag and move on. Answer every question once before dwelling on a hard one, then return to the flagged items with the time you have left.

Frequently asked questions

Is DP-900 hard?

No. It is a foundational, beginner-level exam that tests definitions and best-fit service matches, not building or coding. The difficulty is precision on the distinctions Microsoft reuses, such as Blob versus Files or the Azure SQL deployment options. Candidates who drill those pairs until they are automatic clear it comfortably.

Do I need Azure experience or a database background before sitting DP-900?

No. There are no prerequisites and most candidates do not have deep hands-on experience. Everyday familiarity with tables, files, and reports helps, and a free Azure account to click through Blob storage, Azure SQL, and Power BI makes the concepts stick faster, but none of it is required.

How long should I study for DP-900?

Most people are ready in four to six weeks of steady study. Those with a data or IT background often need less, while complete beginners should give extra time to the two heaviest domains, Core Data Concepts and the Analytics Workload, and to the service-selection distinctions the exam reuses.

What is on the DP-900 exam?

Four domains: core data concepts, relational data on Azure, non-relational data on Azure, and analytics workloads on Azure. Together they cover data formats and workloads, the Azure SQL family, Azure Storage and Cosmos DB, and the analytics stack from data lakes and Synapse and Fabric through to Power BI visualisation.

Does DP-900 require coding or writing SQL?

No. You should recognise what relational objects such as views and stored procedures do and reason about data concepts, but you are never asked to write a query or any code. The exam is about identifying concepts and matching workloads to the right Azure service.

Is DP-900 useful, and what can I take next?

It is a solid first step that proves you understand core data concepts and the Azure data service map, which is valuable for analysts, students, and career changers. A common next step is a role-based data certification such as the Azure Data Engineer or Power BI Data Analyst track once you have hands-on practice.

How many practice questions should I do before booking?

Enough that every domain clears comfortably on questions you have not seen and a timed mock feels relaxed on pacing. Quality of review beats raw volume: on every question, read the worked explanation and name the definition or service distinction that picked the answer, including on the ones you got right.

Is DP-900 worth it as a starting point on the Azure data path?

It is a solid entry point for analysts, career changers, and IT professionals who need a recognised way to demonstrate they understand core data concepts and the Azure data service map before moving into a more specialised role. The value is mostly in establishing a foundation and signalling intent rather than proving deep technical skill, which is by design at foundational level. Common next steps are the role-based certifications that follow naturally from it, such as the DP-300 for database administration, the PL-300 for Power BI analysis, or the DP-600 and DP-700 for Microsoft Fabric engineering.

Examworthy is not affiliated with or endorsed by Microsoft. This guide is original study material based on the public exam blueprint. We never reproduce live exam items. DP-900 and related marks belong to their respective owners.