GitHub study guide

How to pass GitHub Copilot (GH-300)

29 min read · 6 domains covered · Free practice, no sign-up

The GitHub Copilot certification (GH-300) tests one habit above feature recall: choosing the right Copilot surface, plan, or data-handling control for a stated need, and remembering that the developer stays accountable for validating whatever Copilot produces. GitHub hands you a short developer situation with a constraint - explain code without editing it, draft a shell script at the terminal, stop a secrets file being read as context, classify bespoke categories that a plain instruction keeps getting wrong - and asks which feature, setting, or prompting technique fits. The hard part is rarely knowing that "Copilot Chat" exists. It is knowing which surface wins when two or three options could plausibly work and only one matches what the scenario actually asked for.

It suits working developers, team leads, and administrators who already use GitHub Copilot day to day and need to prove they can use it effectively and responsibly across the IDE, the terminal, and chat. The exam draws on six weighted domains. Using Copilot features carries the most marks, followed by responsible use, developer productivity, privacy and content exclusions, prompt engineering, and the data and architecture model. There is no enforced prerequisite, but the questions assume genuine exposure to inline completions, Copilot Chat, the Agent, Edit and Plan modes, the GitHub Copilot CLI, content exclusions, and the plan tiers.

The exam rewards decision rules over memorised marketing. Most questions are scenarios where the wrong answers are plausible-sounding misuses of real features: a content exclusion proposed as a way to make output deterministic, duplication detection sold as a licence guarantee, a Copilot Space pitched as an organisation-wide off switch. The skill being tested is reading the need, picking the feature built for it, and rejecting the option that quietly misdescribes what a real feature does. Running throughout is one responsible-AI thread: Copilot predicts plausible text, fluency is not correctness, and accepted output must be reviewed by the accountable human before it is trusted.

GH-300 is a pick-the-right-surface exam: almost every question is a developer scenario with a stated need, and the right answer is the Copilot surface, plan, data-handling control, or prompting technique built for it - with the human always accountable for validating the output.

Difficulty

Foundational

Best for

Working developers, team leads, and administrators who already use GitHub Copilot in the IDE, terminal, and chat and want to prove they can choose the right surface, configure data controls, craft effective prompts, and operate the tool responsibly under real constraints.

Prerequisites

None enforced. Genuine hands-on use of GitHub Copilot - inline completions, Copilot Chat, Agent and Edit and Plan modes, the GitHub Copilot CLI, and at least exposure to content exclusions and the plan tiers - is what actually carries you through the scenarios.

Questions: 60
Time allowed: 90 min
Pass mark: 700 / 1000
Exam cost (USD): $99
Practice questions: 273

How this exam thinks

One habit decides this exam: read the scenario for the need, then pick the Copilot surface, plan, data control, or prompting technique built for it. Almost every question is a short developer situation - explain this code, draft this command, stop this file being read, steer this output - and several options will sound capable. Only one matches what the scenario actually asked for, and the rest are usually plausible misuses of real features.

The surface split is the spine. Inline ghost-text completions finish code as you type; Copilot Chat answers questions and explains existing code in natural language without editing; Edit Mode makes a scoped multi-file change you review inline; Agent Mode runs tools and acts autonomously across the project; Plan Mode breaks a task into ordered steps before implementing; the GitHub Copilot CLI suggests and explains shell commands at the terminal. Pure explanation with no edits is Chat. A live terminal command is the CLI. A scoped rewrite is Edit Mode. Autonomous multi-step work is Agent Mode. Pick by what the scenario needs done, not by which feature is newest.
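
The split can be written down as a small lookup, which is one way to drill it from the need. This is a study aid only, a minimal sketch: the need labels and the `pick_surface` function are invented for illustration and are not part of any Copilot API.

```python
# Study aid only: the need labels and pick_surface are hypothetical names,
# not part of any Copilot API. The mapping restates the guide's decision rule.
SURFACE_BY_NEED = {
    "complete code as I type": "Inline completions",
    "explain existing code, no edits": "Copilot Chat",
    "scoped multi-file change I review inline": "Edit Mode",
    "autonomous multi-step work with tools": "Agent Mode",
    "break the task into ordered steps first": "Plan Mode",
    "suggest or explain a shell command at the terminal": "GitHub Copilot CLI",
}


def pick_surface(need: str) -> str:
    """Return the surface built for a stated need, or raise if it is unknown."""
    try:
        return SURFACE_BY_NEED[need]
    except KeyError:
        raise ValueError(f"no rule recorded for need: {need!r}") from None


if __name__ == "__main__":
    print(pick_surface("explain existing code, no edits"))
```

Keying the table by the need rather than by the feature mirrors how the questions are phrased: the scenario states the need, and the surface is what you look up.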

The data and control questions reward knowing what each control really does and rejecting the option that misdescribes it. Content exclusions stop named files being used as context for completions and chat at the repository level; they are not a way to make output deterministic, hide a standard from the model, or drive CLI generation. Duplication detection compares outbound suggestions against public code and suppresses matches; it is not a licence guarantee. Organisation Copilot policies are the central on or off switch for a feature across all members; a Space or an instructions file is not. Custom instructions inject persistent project conventions on every request. Throughout, one responsible-AI rule overrides everything: Copilot predicts plausible text, so fluent confidence is never proof of correctness, and the accountable developer must validate accepted output - especially for recent external facts, hardcoded secrets, and licence concerns - before trusting it.

What each domain tests and how to study it

The GH-300 blueprint is split across 6 domains. Weights are the official share of the exam; see the official exam guide for the authoritative breakdown.

  1. Use GitHub Copilot responsibly

    18% of exam

    What you must be able to do. Given a scenario where Copilot output looks convincing, choose the responsible response: name the generative-AI risk on show, pick the mitigation that builds understanding rather than bans the tool or invents a fake safeguard, and verify recent or external facts against an authoritative source before acting.

    In one sentence. The responsibility domain: recognising that Copilot predicts plausible text, that fluency is never proof of correctness, and that the accountable human must validate output before trusting it.

    Recall check: answer these from memory first
    • A convincing Copilot suggestion calls a helper whose name and signature look entirely real. Which generative-AI risk is this, and why does the suggestion still have to be validated?
    • Juniors keep accepting suggestions for algorithms they do not understand. What is the mitigation that builds skill, and why is banning the tool or inventing a tier-based gate wrong?
    • Chat returns a confident, specific regulatory clause and date for a change that happened after the model's training cut-off. What is the responsible way to treat that answer, and why does no plan or filter rescue it?

    What it tests. Operating GitHub Copilot responsibly when the output is confident and convincing. Naming the risks and limitations of generative AI such as hallucination, where the model produces a fluent, plausible suggestion that may reference an API or behaviour that does not exist; bias, where output reflects skewed training-data patterns; and over-reliance, where developers accept code they do not understand and erode their own skill. Identifying the harm on show and the strategy that mitigates it - using Chat to explain unfamiliar suggestions and verifying the logic, rather than banning the tool or inventing a plan-based gate. Explaining why fluent, confident phrasing is produced by the same prediction process whether or not the content is correct, so tone is never a correctness signal. Treating confident answers about facts after the model's training cut-off as unverified and confirming them against the authoritative source, because no plan, grounding, or filter validates external facts for you.

    How to study it. Make one rule your reflex: confidence is not correctness, and the accountable developer validates output. Drill the named risks until you can label them from a scenario - hallucination is a confident suggestion referencing something that does not exist, bias is skewed treatment of one group of inputs, over-reliance is accepting code you do not understand. For each, learn the real mitigation and reject the fake ones: the answer to over-reliance is using Copilot to explain and verify, never banning juniors or inventing a tier that withholds algorithms. Practise the post-cut-off trap specifically: when Chat states a recent regulatory clause or date confidently, the responsible move is to treat it as unverified and confirm it against the authoritative source, not to trust that some Enterprise feed or proxy filter kept it current. Watch for distractors that attribute fact-checking powers to features that have none - grounding in open files, the proxy, or the audit log do not certify accuracy.

    Easy to confuse

    • Hallucination versus bias. Hallucination is when the model produces a confident, plausible output that may reference an API or behaviour that does not actually exist; bias is when output reflects skewed training-data patterns and treats one group of inputs less fairly than another. A convincing but non-existent helper is hallucination, not bias.
    • Fluent confidence versus verified correctness. Fluent, confident phrasing is produced by the same prediction process whether or not the content is right, so tone carries no information about accuracy. The model does not check facts, run tests, or clear duplication before sounding sure, so confidence must never be read as a correctness signal.
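
    One cheap first check for the hallucinated-helper case is to confirm that the suggested name exists at all. The sketch below uses only the Python standard library. `helper_exists` is a hypothetical name chosen for this example, and `json.load_fast` is a deliberately invented attribute standing in for a hallucinated helper. Passing the check shows only that the name is real, not that the suggestion is correct, so review and tests are still needed.

```python
import importlib


def helper_exists(module_name: str, attr_name: str) -> bool:
    """Report whether module_name can be imported and exposes attr_name.

    Hypothetical helper for illustration. A True result only rules out a
    fabricated name; it says nothing about whether the call is used correctly.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr_name)


if __name__ == "__main__":
    print(helper_exists("json", "loads"))      # real function in the stdlib
    print(helper_exists("json", "load_fast"))  # invented name, stands in for a hallucination
```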

    Worked example from the GH-300 bank

    Free sample · Use GitHub Copilot responsibly · medium

    A developer asks GitHub Copilot Chat to summarise a recent regulatory change affecting their payment flow, and Chat returns a confident, specific answer citing a clause and an effective date. The change happened after the model's training data was collected. What is the responsible way to treat this answer?

    • A. Accept the answer because Chat grounds responses in the open files and repository context, which keeps regulatory facts current and trustworthy.
    • B. Accept the answer if the team is on Copilot Enterprise, since the enterprise tier connects Chat to live regulatory feeds that keep such facts up to date.
    • C. Accept the answer because the proxy post-processes responses and would have filtered out a clause or date that was factually incorrect.
    • D. Treat the clause and date as unverified, because the model can state recent facts fluently yet wrongly, and confirm them against the authoritative regulatory source before acting. (Correct)
    Treat confident Copilot answers about post-cut-off external facts as unverified and confirm them against authoritative sources before acting. Large language models generate fluent answers even for events after their training cut-off, with no guarantee of accuracy. Recent external facts such as regulatory clauses must be verified against the authoritative source, because no Copilot plan, grounding, or filter validates them.

    Why A is wrong: Repository grounding is real but applies to code context, not external regulatory facts, so it does not make the cited clause or date reliable.

    Why B is wrong: This is tempting because Enterprise adds capabilities, but no plan gives Chat a live regulatory feed, so the tier does not make recent external facts trustworthy.

    Why C is wrong: The proxy screens for unsafe content, not factual accuracy, so it cannot validate a regulatory clause or effective date.

    Why D is correct: Models present plausible answers about events after their cut-off without genuine knowledge, so the fluent specifics must be verified against the authoritative source before they are relied upon.

  2. Use GitHub Copilot features

    28% of exam

    What you must be able to do. Given a task and a need, choose the right Copilot surface - inline completions, Copilot Chat, Edit Mode, Agent Mode, Plan Mode, or the GitHub Copilot CLI - and the right extension or governance mechanism such as MCP, Spaces, instructions files, the correct plan tier, or an organisation Copilot policy.

    In one sentence. The largest domain: matching each task to the Copilot surface built for it, knowing what each plan unlocks, and using policies, MCP, Spaces and instructions to extend and govern the tool.

    Recall check: answer these from memory first
    • A developer wants Copilot to explain in prose how existing retry logic works, without changing any code. Which surface fits, and why not Edit, Agent, or Plan Mode?
    • An administrator wants to switch one Copilot capability off for every developer in an organisation in a single action. Which mechanism does that, and why is a custom instructions file or a Space the wrong answer?
    • A developer wants the GitHub Copilot CLI to draft a multi-step backup script. What is the intended way to obtain it, and why are Spaces, prompt files, or content exclusions not how the CLI generates scripts?

    What it tests. Choosing the right Copilot surface and extension for a task. Enabling Copilot in the IDE and using inline ghost-text completions; using Copilot Chat for conversational explanations of existing code with no edits; using Edit Mode for a scoped multi-file rewrite the developer reviews inline; using Agent Mode to run tools and act autonomously across a project; using Plan Mode to break a task into ordered steps before implementing; installing and using the GitHub Copilot CLI to draft and explain shell commands and scripts at the terminal. Extending Copilot with Model Context Protocol servers, Copilot Spaces for shared context, Spark, pull request summaries, code review, and instructions files. Comparing the plan tiers and the features each unlocks, including Copilot Business and Copilot Enterprise. Managing organisation-wide policies, feature availability, audit logs, and subscriptions, including the REST API and the organisation Copilot policy as the central control for turning a feature on or off for every member.

    How to study it. Build the surface decision tree first and drill it from the need, because this domain is selection above all. Fix the six surfaces by what they do: inline completes as you type, Chat explains in prose without editing, Edit Mode does a scoped reviewable rewrite, Agent Mode acts autonomously with tools, Plan Mode sequences steps before building, the CLI suggests and explains shell commands at the terminal. For a pure explanation with no edits, it is Chat; for a live terminal command, the CLI; for an autonomous cross-project change, Agent Mode. Learn the extension features by their real job and reject misuses: a custom instructions file injects persistent conventions, a Copilot Space holds shared context, an MCP server exposes tools the model can call, and none of these is an organisation off switch - that is the organisation Copilot policy. Separate the admin controls too: organisation policies govern feature availability centrally, audit logs record activity, and the REST API manages subscriptions and seats. Drill the plan comparison so you know what Business and Enterprise each unlock rather than guessing.

    Easy to confuse

    • Agent Mode versus Edit Mode. Edit Mode makes a scoped, developer-requested multi-file change that you review inline before accepting; Agent Mode acts autonomously, running tools and tracing the project to apply whatever changes it judges necessary. Choose Edit Mode when the scope is defined and you stay in control of each change, Agent Mode when the task is open-ended and you delegate the steps.
    • Copilot Chat versus the GitHub Copilot CLI. Copilot Chat answers natural-language questions and explains existing code inside the IDE without leaving it; the GitHub Copilot CLI suggests and explains shell commands and scripts directly at the terminal. For a live terminal command the developer cannot recall, the CLI is the surface, not Chat.
    • Organisation Copilot policy versus a custom instructions file or Copilot Space. An organisation Copilot policy is the central control that enables or disables a feature for every member in one action; a custom instructions file injects conventions into requests and a Copilot Space holds shared context, but neither turns a feature on or off organisation-wide. Central feature availability is always the policy.

    Worked example from the GH-300 bank

    Free sample · Use GitHub Copilot features · medium

    A developer opens an unfamiliar service file and wants Copilot to explain in prose how the retry logic works and why the team chose exponential backoff, without changing any code. Which Copilot surface best fits this need?

    • A. Use Copilot Chat to ask a natural-language question about the retry logic and receive a conversational explanation of the existing behaviour. (Correct)
    • B. Use Agent Mode so Copilot can autonomously run tools, trace the call graph and apply any refactors it judges necessary across the project.
    • C. Use Edit Mode to request a scoped multi-file rewrite of the retry logic that the developer can then review inline before accepting.
    • D. Use Plan Mode to have Copilot break the explanation task into ordered steps before it begins implementing the requested changes.
    Recognise that Copilot Chat is the surface for conversational explanations of existing code when no edits are required. Copilot Chat handles question-and-answer interactions and explains existing code in natural language without applying edits, so it serves a pure explanation need where the other modes would alter files or scope a build that the developer did not request.

    Why A is correct: Copilot Chat answers conversational questions and explains existing code in prose without modifying files, which matches a read-only explanation request exactly.

    Why B is wrong: Agent Mode is built to take autonomous action and edit files, which exceeds and contradicts a read-only request to merely have the existing logic explained.

    Why C is wrong: Edit Mode is designed to propose and apply code changes for review, but the developer wants an explanation and explicitly does not want any code changed.

    Why D is wrong: Plan Mode scopes implementation work into steps, but no change is wanted here, so producing a build plan answers a question that was never asked.

  3. Understand GitHub Copilot data and architecture

    13% of exam

    What you must be able to do. Given a question about what travels where, describe the inline-completion data flow through the Copilot proxy, the proxy filtering and post-processing pipeline including duplication detection and what the developer sees when a candidate is filtered, and the limitation that a model only uses its training data or supplied context.

    In one sentence. The architecture domain: how an inline completion's prompt is built and sent through the Copilot proxy, what the filtering pipeline does to candidates, and why brand-new private information must be supplied as context.

    Recall check: answer these from memory first
    • Walk through what travels to the service for one inline completion, and name two descriptions that overstate it.
    • A candidate completion is discarded during proxy filtering. What does the developer see in the editor, and what do they not see?
    • Why is the model unlikely to apply an internal coding standard published this morning with no public footprint, and what would make it able to?

    What it tests. Understanding how GitHub Copilot handles data and what its model can and cannot know. Describing the inline-completion data flow: the editor builds a prompt from nearby code and open-file context, sends it through the Copilot proxy to the cloud-hosted model, and returns a suggestion - it does not upload the whole repository per keystroke or run an offline model locally. Describing the proxy filtering pipeline and post-processing, including duplication detection that compares each candidate against publicly available code and suppresses matches above a length threshold, and the behaviour when a candidate is filtered: no ghost text appears for that request rather than a recoloured or warned suggestion. Describing the limitations of large language models: a model can only use what was in its training data or what is supplied as context, so an internal standard published today with no public footprint must be provided explicitly. Describing plan-level differences in how data is used and shared.

    How to study it. Trace one inline completion end to end until you can narrate it: the editor assembles a prompt from nearby and open-file context, sends it through the Copilot proxy to a cloud model, and returns a suggestion. Reject the distractors that exaggerate this - no whole-repo upload per keystroke, no local execution of the project, no bundled offline model. Learn the filtering pipeline as something acting on outbound candidates: duplication detection compares a candidate to public code and suppresses close matches, and a filtered candidate simply produces no ghost text for that request, not a flagged or recoloured version. Internalise the core LLM limitation as a test: the model only knows its training data plus the supplied context, so anything new and private - a standard published this morning - has to be given as context, and no context window silently ingests every org file for you. Separate this clean model fact from the privacy controls: duplication detection is about outbound public-code matches, content exclusions are about withholding named files as context.

    Easy to confuse

    • Inline completion prompt versus a whole-repository upload. An inline completion sends a prompt built from nearby code and open-file context through the proxy to the model; it does not upload the entire local repository on every keystroke, run the project, or use a bundled offline model. The context is assembled per request, not the whole repo.
    • Duplication detection versus content exclusions. Duplication detection compares outbound candidate suggestions against publicly available code and suppresses matches; content exclusions stop named local files being used as context for completions and chat. One filters the suggestions going out to the developer against public code; the other withholds files from what is read in as context.
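
    To make the filtered-candidate behaviour concrete, here is a toy model of it. Everything in it is an assumption made for illustration: the whitespace tokenisation, the `threshold` value, and the function names are invented, and the real proxy pipeline is not described at this level of detail in this guide. What it does reflect from the text is the outcome: a candidate that matches public code above a length threshold is suppressed, and the editor receives nothing to show.

```python
from typing import List, Optional


def longest_shared_run(candidate: List[str], public: List[str]) -> int:
    """Length of the longest contiguous token run found in both sequences."""
    best = 0
    # prev[j] holds the shared-run length ending at the previous candidate
    # token and public[j-1].
    prev = [0] * (len(public) + 1)
    for c_tok in candidate:
        curr = [0] * (len(public) + 1)
        for j, p_tok in enumerate(public, start=1):
            if c_tok == p_tok:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def filter_candidate(candidate: str, public_snippets: List[str],
                     threshold: int = 8) -> Optional[str]:
    """Return the candidate, or None when it overlaps public code too much.

    None models the editor outcome in the text: no ghost text for this request.
    The threshold of 8 tokens is an arbitrary illustrative value.
    """
    cand_tokens = candidate.split()
    for snippet in public_snippets:
        if longest_shared_run(cand_tokens, snippet.split()) > threshold:
            return None
    return candidate


if __name__ == "__main__":
    public = ["def add ( a , b ) : return a + b # sum two numbers"]
    print(filter_candidate(public[0], public))
    print(filter_candidate("def mul ( x , y ) : return x * y", public))
```

    Returning `None` rather than a flagged string is the point of the sketch: the suppressed candidate is not recoloured or annotated, it is simply absent.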

    Worked example from the GH-300 bank

    Free sample · Understand GitHub Copilot data and architecture · easy

    An engineer expects GitHub Copilot to reason about a new internal coding standard published this morning that has no presence in any public source. Why is the underlying language model unlikely to apply this standard on its own?

    • A. The context window automatically ingests every file in the organisation regardless of what is shared in the request.
    • B. The standard was not in the training data and is not provided as context, so the model has no basis to apply it. (Correct)
    • C. Content exclusions have hidden the standard from the model because every internal document is excluded by default.
    • D. Copilot Enterprise refuses to reason about any standard newer than its most recent seat assignment date.
    Recognise that an LLM only uses its training data or supplied context, so brand new private information must be provided explicitly. A language model has no awareness of information that was neither in its training data nor supplied as context for the request. A standard published today with no public footprint meets neither condition, so it must be given as context to be applied.

    Why A is wrong: The context window is bounded and only holds what is supplied for a given request, so it does not silently ingest every organisation file, making this incorrect.

    Why B is correct: An LLM only draws on its training data or context supplied at request time, so a brand new internal standard that is neither will not be applied unless it is given as context.

    Why C is wrong: Internal documents are not excluded by default; exclusions must be configured, so this misrepresents how content exclusions behave.

    Why D is wrong: Seat assignment dates have no bearing on what the model can reason about, so tying model knowledge to seat dates is factually wrong.

  4. Apply prompt engineering and context crafting

    13% of exam

    What you must be able to do. Given a weak or inconsistent Copilot result, choose the prompting fix: structure the prompt as goal then context and constraints then expected output, switch from zero-shot to few-shot when category boundaries are company-specific, and supply consistent examples that each pair an input with its exact expected output.

    In one sentence. The prompting domain: structuring requests clearly, knowing when few-shot examples beat a plain instruction, and understanding that Copilot assembles context from the surrounding code before sending the prompt.

    Recall check: answer these from memory first
    • A vague one-line request returns unfocused code. What prompt structure produces a focused answer, and why do a single word or several crammed requests fail?
    • A plain instruction keeps misclassifying tickets into five company-specific categories. What is the most effective prompt adjustment, and why does raising urgency not help?
    • Name the two properties of few-shot examples that most directly drive consistent output, and explain why mixed conventions cause mixed results.

    What it tests. Crafting prompts and context that produce focused, on-target output. Structuring a prompt as goal, then the relevant inputs and constraints, then the expected output, so Copilot has a clear ordered picture rather than the vague result of a one-word or scattered request. Applying zero-shot prompting for standard tasks and few-shot prompting when labels are company-specific and their boundaries cannot be inferred from names alone, supplying labelled examples that demonstrate the intended boundaries. Knowing the properties of effective few-shot examples: they share one consistent convention and pair each representative input with its exact expected output, so the model imitates a single clear mapping rather than competing ones. Explaining the prompt process flow - that Copilot assembles context from the surrounding code before sending the prompt, which is why a populated file yields stronger suggestions than an empty buffer - and how chat history is used.

    How to study it. Practise rewriting weak prompts into the goal-then-context-then-expected-output shape until it is automatic, and learn why the wrong answers fail: a single ambiguous word, several unrelated requests crammed together, or the same broad sentence repeated all starve the model of specifics. Drill the zero-shot versus few-shot decision on the bespoke-category case: when category names carry too little meaning, labelled examples demonstrate the boundaries, and raising urgency or retrieving the bare names does not. Learn the two properties that make few-shot work and quote them in your head - one consistent convention throughout, and each input paired with its exact expected output - because mixed conventions are the direct cause of mixed output. Tie context quality to the prompt process flow: Copilot assembles surrounding code as context before sending, so a populated project file beats an empty scratch buffer for the same one-line prompt. Reject distractors that invent mechanisms - Copilot does not train a personalised model at request time or route by file size.

    Easy to confuse

    • Zero-shot versus few-shot prompting. Zero-shot gives an instruction with no examples and works when the task is standard and the labels self-explain; few-shot supplies labelled examples and is needed when categories are company-specific and their boundaries cannot be inferred from names alone. Bespoke labels that keep going wrong are the signal to add examples.
    • A richer context window versus a personalised model trained at request time. A populated file yields stronger suggestions because Copilot assembles more relevant surrounding code as context before sending the prompt, not because it trains a personalised model on the open file at request time or routes larger files to a bigger model. The gain is context assembly, not on-the-fly training.
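
    The two few-shot properties are easier to hold onto with a concrete prompt in front of you. The sketch below assembles one in the goal, then context and constraints, then expected output order, and appends examples that all use the same `Ticket:` / `Category:` convention. The category names, the example tickets, and `build_few_shot_prompt` are invented for illustration, and only three categories are shown where a real prompt would cover every category. The result is ordinary text you could paste into Copilot Chat, not a call to any Copilot API.

```python
from typing import List, Tuple

# Hypothetical company-specific categories and tickets, invented for this sketch.
EXAMPLES: List[Tuple[str, str]] = [
    ("Card was charged twice for one order", "Ledger"),
    ("Cannot log in after changing my email", "Gatekeeper"),
    ("Courier marked the parcel delivered but it never arrived", "Last-mile"),
]


def build_few_shot_prompt(ticket: str, examples: List[Tuple[str, str]]) -> str:
    """Assemble goal, context and constraints, expected output, then examples."""
    categories = ", ".join(sorted({label for _, label in examples}))
    lines = [
        "Goal: classify the support ticket into exactly one category.",
        f"Context and constraints: the only valid categories are {categories}.",
        "Expected output: the category name alone, on one line.",
        "",
    ]
    # Every example uses the same Ticket/Category layout so the model sees
    # one mapping to imitate rather than competing conventions.
    for text, label in examples:
        lines += [f"Ticket: {text}", f"Category: {label}", ""]
    # The new ticket ends with an open Category line for the model to complete.
    lines += [f"Ticket: {ticket}", "Category:"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_few_shot_prompt("Refund has not reached my account", EXAMPLES))
```

    Generating every example through one loop is a simple guard against the mixed-convention problem: there is only one layout in the code, so there can be only one in the prompt.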

    Worked example from the GH-300 bank

    Free sample · Apply prompt engineering and context crafting · medium

    A developer must classify free-text support tickets into five bespoke categories whose names and boundaries are specific to their company and not standard industry terms. A plain instruction keeps placing tickets in the wrong category. What is the most effective adjustment to the prompt?

    • A. Add a few labelled example tickets, one per category, so Copilot learns the company-specific boundaries from demonstrations rather than from the category names alone. (Correct)
    • B. Keep the zero-shot instruction but raise its urgency, telling Copilot the classification is critical so it tries harder to pick the correct bespoke category.
    • C. Move the category list into a Copilot Spaces context so the names are retrieved automatically, leaving the model to interpret each boundary unaided.
    • D. Apply a content exclusion to the ticket data so Copilot cannot misread old tickets, which removes the source of the wrong category assignments.
    Use few-shot examples when category labels are company-specific and their boundaries cannot be inferred from names alone. When categories are idiosyncratic, their names carry too little meaning for zero-shot classification, so labelled example tickets demonstrate the intended boundaries and let the model generalise, which neither urgency wording nor context retrieval of the bare names can do.

    Why A is correct: Bespoke category boundaries are hard to infer from names, and labelled examples teach the distinctions directly, so few-shot demonstrations correct the misclassification.

    Why B is wrong: Urgency language does not convey what the company-specific categories actually mean, so the model still lacks the boundary information it needs to classify correctly.

    Why C is wrong: Spaces can surface the names as context, but the model still infers boundaries from names alone, which is the very gap that labelled examples are needed to close.

    Why D is wrong: Content exclusions restrict context sources and do nothing to define bespoke categories, so they cannot fix classification driven by unclear category boundaries.

  5. Improve developer productivity with GitHub Copilot

    14% of exam

    What you must be able to do. Given a productivity task - generate a function, enforce test conventions, find a shell command, or harden a generated test - choose the practice or surface that delivers it, and apply responsible testing: stub external services, assert the real behaviour, and review the output yourself.

    In one sentence. The productivity domain: using Copilot for generation, tests, documentation, and learning, while remembering that a generated test still needs stubbing, real assertions, and human review.

    Recall check: answer these from memory first
    • You want Copilot to draft a full function body before you write any code. What prompt practice gives the best first draft, and why is an empty def or a content exclusion wrong?
    • A team wants every test suggestion to follow a fixed fixture, naming pattern, and layout without restating the rules each time. Which mechanism enforces this, and why not a per-session few-shot paste?
    • A generated integration test hits a real payment gateway and asserts only that the call returned without error. Name three corrections that make it responsible and effective.

    What it tests. Using GitHub Copilot to do real work faster and well. Generating accurate code from a descriptive signature plus a comment naming the inputs, expected output, and an example, rather than an empty or contextless prompt; refactoring, documenting, and modernising legacy code. Enforcing persistent test conventions - a shared fixture, naming pattern, and arrange-act-assert layout - with a repository custom instructions file so they apply automatically without per-session repetition. Generating unit and integration tests, identifying edge cases, and writing assertions, then correcting a generated test responsibly by stubbing the external service, asserting the real persistence behaviour rather than only a non-error call, and reviewing it yourself because the developer stays accountable. Reducing context switching by using the GitHub Copilot CLI to suggest and explain shell commands in the terminal without opening a browser, and using Copilot to suggest security and performance improvements.

    How to study it. Learn each productivity task as a match between a need and the right practice or surface, and keep the accountability thread running through all of it. For code generation, the answer is a descriptive signature plus a comment naming inputs, expected output, and an example - not an empty def or a content exclusion. For persistent conventions across every request, the mechanism is a repository custom instructions file, not a per-session few-shot paste or an MCP tool called on demand. For a forgotten shell command without leaving the terminal, it is the GitHub Copilot CLI. Drill the responsible-testing multi-select hardest: a generated integration test that hits a live gateway and asserts only a non-error call must be fixed by stubbing the external service, asserting the record was actually persisted with expected fields, and being reviewed by the accountable developer - and a content exclusion does nothing to make a live external call deterministic. Reject any option that treats a clean Copilot run or successful compile as sufficient evidence of correctness.
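    As an illustration of the code-generation practice above, here is a minimal sketch of a descriptive signature plus a comment that names the inputs, the expected output, and an example. The function name, parameters, and rounding rule are hypothetical, chosen only for this sketch. The body stands in for a suggestion that the developer has already reviewed; it is not a claim about what Copilot would actually produce.

```python
def apply_discount(price_cents: int, discount_percent: int) -> int:
    # Inputs: price_cents is a non-negative price in cents;
    #         discount_percent is a whole number from 0 to 100.
    # Output: the discounted price in cents, rounded down to a whole cent.
    # Example: apply_discount(1999, 10) returns 1799.
    #
    # Everything above is the prompt the developer writes (hypothetical names).
    # The body below stands in for a suggestion the developer has reviewed.
    if price_cents < 0 or not 0 <= discount_percent <= 100:
        raise ValueError("price must be non-negative and discount within 0-100")
    return price_cents * (100 - discount_percent) // 100
```

    Compare this with an empty `def apply_discount():`, which gives Copilot nothing to infer the units, the valid range, or the rounding rule from.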

    Easy to confuse

    • Custom instructions file versus a per-session few-shot prompt. A repository custom instructions file records conventions once and is supplied to Copilot on every request automatically, so test suggestions honour the fixture, naming, and layout without repetition; a few-shot prompt demonstrates the conventions afresh each session and must be re-pasted every time. Persistent, automatic enforcement is the instructions file.
    • Stubbing an external service versus a content exclusion over its module. Stubbing or mocking the external service removes the slowness, flakiness, and side effects of a real call so the test verifies behaviour deterministically; a content exclusion only stops Copilot reading the module as context and does nothing to make a live external call behave deterministically at run time. Test reliability comes from the stub, not the exclusion.

    Worked example from the GH-300 bank

    Free sample | Improve developer productivity with GitHub Copilot | medium

    A developer has asked Copilot Chat to generate an integration test that confirms an order service persists a record after it calls an external payment gateway. The generated test runs the real gateway and asserts only that the call returned without error. Before relying on this test, which corrections reflect responsible, effective testing practice? Select THREE.

    • A. Replace the live gateway call with a stub or mock so the test does not depend on a third-party service, removing the slowness, flakiness, and side effects of hitting the real endpoint. (Correct)
    • B. Accept the test unchanged because it was produced by Copilot and compiled successfully, treating a clean run against the live gateway as sufficient evidence that the persistence works.
    • C. Add assertions that the order record was actually written to the store with the expected fields, since the test must verify the persistence behaviour it claims to cover rather than only the call. (Correct)
    • D. Review the generated test for correctness yourself before trusting it, because the developer remains accountable for validating Copilot output and a passing run does not confirm the test is right. (Correct)
    • E. Add a content exclusion over the payment module so Copilot cannot read it, on the assumption that this makes the live external call behave deterministically during the test run.
    Make a generated integration test reliable by stubbing external services, asserting the real persistence behaviour, and validating the test yourself, since Copilot output needs correction and verification. An integration test that hits a real gateway and asserts only a non-error call is slow, flaky, and proves nothing about persistence, so responsible practice replaces the third party with a stub, asserts the record was stored with expected fields, and has the accountable developer review the test, none of which content exclusions or blind acceptance achieve.

    Why A is correct: Stubbing the external gateway isolates the integration under test and removes the slowness and non-determinism of a live call, which is the responsible way to test around a third party.

    Why B is wrong: A clean run that only checks the call returned says nothing about persistence and ignores the developer's duty to validate, so accepting it unchanged is not responsible.

    Why C is correct: Asserting the record was stored with expected fields verifies the behaviour the test claims to check, since asserting only that the call returned proves nothing about persistence.

    Why D is correct: Responsible use keeps the developer accountable for validating generated tests, so reviewing the test for correctness is required before relying on it regardless of whether it ran.

    Why E is wrong: Content exclusions govern context only and never change runtime behaviour, so they cannot stabilise a live external call or fix the test's missing persistence checks.
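    The corrections in A and C can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the test from the question: `OrderService`, its `place_order` method, the gateway's `charge` call, and the dictionary used as a store are all hypothetical stand-ins. A real integration test would more likely assert against a test database than an in-memory dict.

```python
from unittest.mock import Mock


class OrderService:
    """Hypothetical service: charges through a gateway, then persists the order."""

    def __init__(self, gateway, store):
        self.gateway = gateway
        self.store = store

    def place_order(self, order_id, amount_cents):
        receipt = self.gateway.charge(amount_cents)
        record = {
            "order_id": order_id,
            "amount_cents": amount_cents,
            "payment_ref": receipt["ref"],
            "status": "paid",
        }
        self.store[order_id] = record
        return record


def test_order_is_persisted_after_payment():
    # Arrange: stub the gateway so no real endpoint is ever called.
    gateway = Mock()
    gateway.charge.return_value = {"ref": "stub-123"}
    store = {}  # in-memory stand-in for the real store (assumption)
    service = OrderService(gateway, store)

    # Act
    service.place_order("order-1", 2500)

    # Assert: the stub was called once, and the record was actually
    # written with the expected fields, not merely "no error raised".
    gateway.charge.assert_called_once_with(2500)
    assert store["order-1"] == {
        "order_id": "order-1",
        "amount_cents": 2500,
        "payment_ref": "stub-123",
        "status": "paid",
    }
```

    Correction D has no code equivalent: the developer still reads the test and confirms it checks what it claims to check.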

  6. Configure privacy, content exclusions, and safeguards

    14% of exam

    What you must be able to do. Given a privacy, ownership, or safety concern, choose the right safeguard: repository content exclusions to stop named files being read as context, duplication detection to suppress public-code matches, and human review for licence and secret hygiene because no filter certifies licence status or vets a generated secret.

    In one sentence: The privacy domain: configuring content exclusions to withhold named files, enabling duplication detection to suppress public-code matches, and knowing that licence and secret hygiene still rest on human review.

    Recall check: answer these from memory first
    • A team wants Copilot to stop using a named secrets file as context for completions and chat across a repository. Which control does this, and why is .gitignore or duplication detection wrong?
    • Duplication detection is enabled, yet an accepted suggestion resembles a copyleft snippet. What is the right licence-hygiene practice, and why is an enabled filter not proof of a clear licence?
    • Copilot suggests code that hardcodes an API token. Why can it produce such a string, and what is the correct response?

    What it tests. Controlling what Copilot can access and handling its output safely. Configuring repository content exclusions to stop named files - such as a deployment-secrets file - being used as context for code completions and IDE chat, which is distinct from .gitignore, duplication detection, or MCP. Describing ownership and the limitations of Copilot output, and enabling duplication detection, which compares each candidate suggestion against publicly available code and suppresses those that match above a length threshold. It is not a licence guarantee, so a developer must still review accepted output for similarity to licensed sources and confirm the licence position before merging. Applying security warnings and treating output correctly: Copilot can predict a plausible token-shaped literal, so a hardcoded secret in a suggestion is untrusted output to remove and replace with a reference to a secure store. Troubleshooting issues with suggestions and content exclusions.

    How to study it. Pin down what each safeguard actually does and, just as important, what it does not. Content exclusions are configured by path at the repository level on Copilot Business and Enterprise and stop the matched files being sent as context for completions and chat - they are not .gitignore, not duplication detection, and not an MCP feature, and they do not make output deterministic. Duplication detection compares outbound candidates against public code and suppresses matches above a length threshold; learn the hard limit that it is not a licence guarantee, so accepted code that resembles a copyleft snippet still needs human review of similarity and licence position before merging. For secrets, internalise the chain: Copilot predicts plausible code, so it can emit a token-shaped literal, which makes that output untrusted - remove the hardcoded value and load the secret from a secure store, rather than trusting any filter to vet it. Across this domain the accountable developer is the real safeguard; every distractor that treats an enabled filter as proof is wrong.
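    The secrets chain above can be sketched as follows. This is a minimal illustration that assumes the secure store exposes the secret to the process as an environment variable. The variable name `PAYMENTS_API_TOKEN` and the helper `load_api_token` are hypothetical, and a team's real secret manager may use a different mechanism.

```python
import os

# A suggestion might contain a line that assigns a token-shaped string
# literal directly in code. Treat that as untrusted output and remove it.


def load_api_token(env=os.environ, name="PAYMENTS_API_TOKEN"):
    """Return the token supplied by the environment, or fail loudly if absent.

    Assumption: the secure store injects the secret as an environment
    variable; the variable name here is hypothetical.
    """
    token = env.get(name)
    if not token:
        raise RuntimeError(f"{name} is not set; configure it in your secret store")
    return token
```

    Failing loudly when the variable is missing keeps a placeholder or empty value from silently reaching a live call.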

    Easy to confuse

    • Content exclusions versus .gitignore. Content exclusions are a Copilot setting configured by path at the repository level that stop the matched files being used as context for completions and chat; .gitignore only tells Git which untracked files to leave out of version control and withholds nothing from Copilot requests. To keep a secrets file out of Copilot context, use content exclusions, not .gitignore.
    • Duplication detection versus a licence guarantee. Duplication detection lowers the chance of surfacing verbatim public code by suppressing close matches, but it does not certify a suggestion's licence status; accepted output may still resemble licensed material, so the accountable developer must review similarity and confirm the licence position before merging. An enabled filter is not licence proof.

    Worked example from the GH-300 bank

    Free sample | Configure privacy, content exclusions, and safeguards | medium

    A developer accepts a substantial Copilot suggestion that happens to resemble a snippet from a copyleft project, even though duplication detection is enabled. Which practice best maintains good licence hygiene for the accepted output?

    • A. Rely on duplication detection alone, treating an enabled filter as proof that the accepted code carries no licence obligations whatsoever.
    • B. Review the suggestion for similarity to licensed sources and validate its licence position before merging, since the developer stays accountable. (Correct)
    • C. Assume that because the developer owns accepted output, any resemblance to licensed code transfers no obligations onto the project.
    • D. Disable duplication detection so the suggestion arrives faster, then document the snippet's origin only if a reviewer later asks about it.
    Understand that the accountable developer must review accepted suggestions for licence concerns, as duplication detection alone is not a licence guarantee. Duplication detection lowers the chance of reproducing public code but does not certify a suggestion's licence status, and accepted output may still resemble licensed material. The accountable developer must review for similarity and confirm the licence position before merging.

    Why A is wrong: Duplication detection reduces but does not eliminate the chance of a match, so treating it as proof of a clean licence status removes the human check that is still needed.

    Why B is correct: Because the developer is accountable for accepted output, reviewing it for licence concerns and confirming its position before merging is the practice that maintains licence hygiene.

    Why C is wrong: Owning the accepted code does not extinguish third-party licence obligations that attached to matching material, so this misreads ownership as a licence waiver.

    Why D is wrong: Disabling the filter removes a safeguard against public code matches and defers licence review, which weakens rather than maintains licence hygiene.

A study plan that works

  1. Map the blueprint and book a date

    Day 1

    Read the six domains and their weights. Book a provisional date now: a fixed date turns open-ended study into a plan and makes it far more likely that you actually sit the exam. Note that Using Copilot features at 28 percent and Responsible use at 18 percent are nearly half the exam between them, so plan the heaviest study there, with productivity, privacy, prompting, and architecture filling the rest.

  2. Build the surface and control decision trees

    Week 1

    Before drilling any domain, build the maps the whole exam rests on. Fix the six surfaces by what each does (inline, Copilot Chat, Edit Mode, Agent Mode, Plan Mode, GitHub Copilot CLI) and the controls by their real job (content exclusions, duplication detection, organisation Copilot policy, custom instructions, Copilot Spaces, Model Context Protocol). Use the recall prompts in this guide: cover the answer, choose the surface or control from the need, then reveal. If you cannot pick from the need alone, you do not own it yet.

  3. Go deep on features and responsible use

    Weeks 1 to 2

    These two domains are nearly half the exam, so they get the most time. Drill the surface split until choosing Chat for a pure explanation and the CLI for a terminal command is automatic, and learn what each plan tier and the organisation policy unlock. In parallel, make the responsible-AI rule a reflex: confidence is not correctness, the accountable developer validates output, and recent external facts must be confirmed against the authoritative source. Read the worked explanation on every practice question, including the ones you got right.

  4. Lock productivity and privacy controls

    Weeks 2 to 3

    Productivity and the privacy controls are dependable marks once drilled as need-to-mechanism matches. Fix code generation from a descriptive signature and comment, the custom instructions file for persistent test conventions, and the GitHub Copilot CLI for terminal commands. For privacy, separate content exclusions (withhold named files as context) from duplication detection (suppress public-code matches), and learn the limits: neither is a licence guarantee, and a hardcoded secret in a suggestion is untrusted output to replace.

  5. Cover prompting and the data and architecture model

    Week 3

    Drill the prompt structure (goal, then context and constraints, then expected output), the zero-shot versus few-shot decision, and the two properties of effective few-shot examples. Trace one inline completion end to end through the Copilot proxy and learn what the filtering pipeline does to a candidate, plus the core LLM limitation that a model only uses training data or supplied context. Reject the distractors that overstate the data flow or invent request-time training.
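    One way to rehearse the prompt structure and the two few-shot properties is to assemble a prompt by hand. The sketch below is illustrative only: the helper name, the ticket texts, and the category labels are hypothetical, and the layout is one reasonable convention rather than a format Copilot requires.

```python
def build_few_shot_prompt(goal, context, examples, new_input):
    """Assemble goal, then context and labelled examples, then the expected output."""
    lines = [f"Goal: {goal}", f"Context and constraints: {context}", "", "Examples:"]
    for text, label in examples:
        # Every example follows the same Input/Label convention and pairs
        # one input with its exact expected output.
        lines.append(f"Input: {text}")
        lines.append(f"Label: {label}")
    lines.append("")
    lines.append("Expected output: reply with exactly one label and nothing else.")
    lines.append(f"Input: {new_input}")
    lines.append("Label:")
    return "\n".join(lines)


# Hypothetical bespoke categories that a plain instruction might keep getting wrong.
prompt = build_few_shot_prompt(
    goal="Classify each support ticket into one internal category.",
    context="Allowed labels are BILLING-DISPUTE and ACCESS-RESET.",
    examples=[
        ("I was charged twice this month", "BILLING-DISPUTE"),
        ("I cannot sign in after the password change", "ACCESS-RESET"),
    ],
    new_input="My invoice shows a plan I never bought",
)
print(prompt)
```

    Dropping the examples list turns the same structure into a zero-shot prompt, which is the simpler choice when the labels are not company-specific.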

  6. Drill weak domains, then space the review

    Week 4

    Use your per-domain accuracy to attack the domains dragging you down, not to re-read what you already know. Then space it: revisit each domain's recall prompts after a few days and again later in the week. Spaced review tends to retain far more than cramming, and it is the cheapest gain available before the exam.

  7. Sit a timed mock and calibrate

    Weeks 4 to 5

    Take at least one full timed mock under exam conditions to rehearse pacing and the flag-and-return habit across the whole question set in the time allowed. Treat the score as a per-domain readiness signal, not a single number, and review every missed question - naming the need you misread and the feature you confused - before you book or sit.

Know when you're ready

Readiness for the GitHub Copilot certification is a score on fresh scenarios you have not seen before, not a feeling that the features are familiar. Those are different things, and the gap between them is where people fail. Re-reading the docs and using Copilot daily build fluency, and fluency feels like knowledge, so confidence rises while real recall does not. The fix is to test yourself: if you can read a new scenario, name the need, and pick the right surface or control while explaining why each other option misdescribes a real feature, you know it; if you can only nod along to an explanation, you do not yet.

Be especially wary of early confidence on the feature list. Knowing that Copilot Chat, Agent Mode, content exclusions, and duplication detection exist is the easy half; choosing between them under a stated need, when two would sound capable, and rejecting the plausible misuse of a real feature, is the half the exam actually tests. Trust your measured per-domain accuracy over your gut, and aim to clear every domain comfortably on unseen questions across more than one session, not to scrape a single pass.

This guide gives you the map. The practice bank is where you find out whether you can navigate it, with a worked explanation and a reason every distractor is wrong on every question. Readiness scoring tells you when you are there. Not before.

Exam-day tips

  • Read the scenario for the need first. The task - explain without editing, draft a terminal command, withhold a file, steer output - is what picks the surface or control, so find it before you judge the options.
  • Pick the surface by what must happen, not by which feature is newest. Pure explanation with no edits is Copilot Chat; a live shell command is the GitHub Copilot CLI; a scoped reviewable rewrite is Edit Mode; autonomous multi-step work is Agent Mode.
  • Treat fluency as a trap, never a signal. Copilot produces confident phrasing whether or not the content is correct, so a convincing suggestion still has to be validated, and recent external facts must be confirmed against the authoritative source.
  • Know what each control really does and reject the misuse. Content exclusions withhold named files as context, duplication detection suppresses public-code matches, the organisation Copilot policy is the central feature off switch - none of these makes output deterministic or guarantees a licence.
  • Remember the developer stays accountable. Any option that treats an enabled filter, a clean run, or a successful compile as proof of correctness or a clear licence is the wrong answer; the human reviews and validates.
  • Use few-shot only when labels are bespoke. When category boundaries are company-specific and a plain instruction keeps failing, labelled examples that share one convention and pair each input with its exact output are the fix - not raising urgency.
  • Flag and move on. Cover every question once before you spend time on a hard one; collecting the clear marks first protects the ones you actually know within the time limit.

Frequently asked questions

Is the GitHub Copilot certification hard?

It is a foundational exam, and the difficulty is judgement rather than recall. Most questions are scenarios where several Copilot features could sound capable and only one fits the stated need, with the wrong answers being plausible misuses of real features. Scenario practice with worked explanations matters far more than memorising what each feature does.

How long should I study for GH-300?

Most candidates who already use GitHub Copilot day to day are ready in four to five weeks of steady study. Less hands-on exposure means more time on the two heaviest domains, Using Copilot features and Responsible use, and on the surface and control decision trees the whole exam rests on.

Do I need to be an expert developer for this exam?

No. You need to read developer scenarios and reason about which Copilot surface, plan, or control fits, plus apply prompt-crafting and responsible-use rules. It is about using GitHub Copilot effectively and responsibly, not about advanced algorithms, so comfort with everyday IDE, terminal, and chat use is what carries you.

Which domains should I focus on?

Using Copilot features at 28 percent and Using GitHub Copilot responsibly at 18 percent are nearly half the exam, so they deserve the most time. Developer productivity and the privacy and content-exclusion controls are each meaningful, and prompting and the data and architecture model round out the rest with reliable marks if drilled as decision rules.

What is the difference between Copilot Chat, Edit Mode, and Agent Mode on this exam?

Copilot Chat answers questions and explains existing code in natural language without making edits; Edit Mode makes a scoped, developer-requested multi-file change you review inline; Agent Mode acts autonomously, running tools and applying changes it judges necessary across the project. The need in the scenario - explain, scoped rewrite, or delegated multi-step work - picks the surface.

How should I think about content exclusions and duplication detection?

Keep them separate. Content exclusions stop named files being used as context for completions and chat at the repository level; duplication detection compares outbound suggestions against public code and suppresses matches. Neither makes output deterministic and neither is a licence guarantee, so the accountable developer still reviews accepted output for licence and secret concerns.

Does the exam expect me to trust Copilot's output?

The opposite. A recurring theme is that Copilot predicts plausible text, so fluent confidence is never proof of correctness, the same prediction process produces confident output whether it is right or wrong, and recent external facts can be stated wrongly. The accountable developer must validate accepted code, tests, secrets, and licence positions before relying on them.

How many practice questions should I do before booking?

Enough that every domain clears comfortably on questions you have not seen, and a full timed mock feels comfortable on pacing. Quality of review beats raw volume: on every question, read the explanation and name the need that picked the answer and the feature each distractor misdescribed, including on the ones you got right.

Is the GitHub Copilot certification (GH-300) worth it?

It is a useful credential for developers and technical leads who use GitHub Copilot regularly and want a recognised way to demonstrate they can apply it effectively and responsibly, not just that they have the subscription enabled. Because the exam tests decision-making over feature recall, the preparation helps practitioners become more deliberate about which surface to use and why, which is a practical skill rather than exam trivia. The broader value depends on your context: organisations rolling out AI developer tooling at scale tend to value structured proof of responsible use over raw enthusiasm.

Examworthy is not affiliated with or endorsed by GitHub. This guide is original study material based on the public exam blueprint. We never reproduce live exam items. GH-300 and related marks belong to their respective owners.