Smoke Testing vs Sanity Testing: When to Run Each
Smoke testing and sanity testing are two of the most frequently confused terms in software quality. Both are quick, both happen early, and both serve as gate checks before deeper testing begins. But they answer different questions and run at different times, and the distinction matters if your team wants to use its limited testing time efficiently. Smoke testing asks "is this build stable enough to test?" Sanity testing asks "does this specific change work as expected?"
The confusion is understandable because both tests are shallow by design. Neither one is thorough. They are intentionally fast and focused. But applying them incorrectly, or treating them as interchangeable, creates gaps in your testing process that show up as bugs in production.
What smoke testing does and when to run it
Smoke testing is a broad, shallow check across an application's core functionality. The name comes from hardware testing: plug it in and see if smoke comes out. In software, you deploy a new build and verify that the most critical paths work at a basic level. Can a user log in? Does the main dashboard load? Can you create the primary object your application manages? Does the API return responses instead of 500 errors?
The purpose of a smoke test is to determine whether a build is worth testing further. If login is broken, there is no point running your full regression suite. If the API gateway is returning connection errors, your integration tests will all fail for the same reason. Smoke testing catches these fundamental problems early so your team does not waste time investigating failures that stem from a single root cause.
A good smoke test suite has specific characteristics:
- Covers breadth, not depth. A smoke suite touches every major module of the application but only tests the most basic operation in each. You are checking that the plumbing works, not that every faucet delivers the right temperature.
- Runs in under 10 minutes. If your smoke tests take longer than that, they include too much. The entire point is rapid feedback. Most effective smoke suites run in 2 to 5 minutes.
- Executes on every build. Smoke tests are the first automated gate in your deployment pipeline. They run immediately after a build completes, before any other testing stage. For more on sequencing, see where QA fits in your CI/CD pipeline.
- Fails loudly. A smoke test failure should block the pipeline. There is no "acceptable" smoke test failure because each test represents a core function that must work for the application to be usable at all.
For a startup running a SaaS product, a typical smoke suite might include 10 to 20 tests covering: authentication, the main CRUD operations for your primary resource, navigation to key pages, API health endpoints, and at least one payment-related check if you process transactions.
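A suite like that can be sketched as a flat list of shallow checks, each touching one core path. This is a minimal illustration, not a real implementation: the endpoints and the stub client below are hypothetical placeholders for whatever HTTP client and routes your application actually has.

```python
# Sketch of a smoke suite: broad, shallow checks over core paths.
# StubClient stands in for a real HTTP client hitting your app;
# the endpoint paths are hypothetical, not a real API.

class StubClient:
    """Placeholder for a real HTTP client; always returns 200 here."""
    def get(self, path):
        return 200

    def post(self, path, data=None):
        return 200

def run_smoke_suite(client):
    """Run every shallow check; return the names of any that failed."""
    checks = {
        "login": lambda: client.post("/api/login", {"user": "smoke"}) == 200,
        "dashboard": lambda: client.get("/dashboard") == 200,
        "create_primary_resource": lambda: client.post("/api/projects", {}) == 200,
        "api_health": lambda: client.get("/healthz") == 200,
    }
    return [name for name, check in checks.items() if not check()]

failures = run_smoke_suite(StubClient())
# An empty failures list means the build is worth testing further.
```

Note that each check asserts only "did it respond at all," not correctness of the response body; that depth is deliberately left to later stages.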
What sanity testing does and when to run it
Sanity testing is narrow and focused. Instead of checking whether the whole application works at a basic level, a sanity test verifies that a specific change or fix works as intended. If a developer fixed a bug in the invoice calculation, a sanity test verifies that the calculation now produces the correct result. It does not check login, the dashboard, or any other area.
The name also comes from a simple idea: a sanity check confirms that the change is rational and complete before investing more time in thorough testing. If the bug fix does not actually fix the bug, or if the new feature does not perform its basic function, there is no reason to run a full regression cycle against it.
Sanity testing is usually run after a specific change has been deployed to a staging or QA environment. It is targeted rather than comprehensive:
- Focuses on the changed area. Only the functionality directly related to the change gets tested. If the change was to the search algorithm, sanity testing covers search. Everything else is left for regression testing.
- Often performed manually. Because sanity tests are ad hoc and tied to specific changes, they are frequently done by a developer or QA engineer manually verifying the fix before marking it ready for deeper testing.
- Does not require a full test plan. Sanity testing is informal and judgment-based. The tester uses their understanding of the change to verify the most important aspects quickly.
- Runs after smoke testing passes. Sanity testing assumes the build is stable. If smoke tests have not passed, sanity testing results are unreliable because failures might stem from the build instability rather than the specific change.
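To make the invoice example from above concrete, here is a sketch of a targeted sanity check. The scenario is hypothetical: suppose the bug was applying tax before subtracting a flat discount, and the fix reverses the order. The check verifies only that one calculation and nothing else.

```python
# Sketch of a sanity check for a hypothetical invoice-calculation fix.
# The bug: tax was applied before the flat discount was subtracted.
# The fix: subtract the discount first, then apply tax.

def invoice_total(subtotal, flat_discount, tax_rate):
    """Fixed version: discount the subtotal, then apply tax."""
    return round((subtotal - flat_discount) * (1 + tax_rate), 2)

def sanity_check_invoice_fix():
    """Verify only the changed calculation, nothing else."""
    # The buggy order (100.00 * 1.08 - 10.00) gave 98.00;
    # the fixed order gives (100.00 - 10.00) * 1.08 = 97.20.
    assert invoice_total(100.00, 10.00, 0.08) == 97.20
    return True
```

If this check fails, the ticket goes straight back to the developer; there is no point spending regression time on a fix that does not fix the bug.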
The key differences in a practical context
The simplest way to remember the difference: smoke testing is about the build, sanity testing is about the change. Here is how they compare across the dimensions that matter for your workflow.
Smoke testing is broad and shallow. It covers the entire application at a surface level. Sanity testing is narrow and slightly deeper. It covers only the changed functionality but examines it more carefully.
Smoke testing is almost always automated because it runs on every build and needs to be fast and consistent. Sanity testing is often manual because it requires human judgment about what to check for a specific change. That said, for recurring changes in well-understood areas, automating sanity checks is worthwhile.
Smoke testing answers "should we continue testing this build?" A failure means the build is rejected and goes back to development. Sanity testing answers "does this change accomplish what it was supposed to?" A failure means the specific ticket or fix goes back to the developer, but the rest of the build may still be testable.
In your pipeline, the sequence is: build, smoke test, sanity test specific changes, then run regression and deeper functional testing. Each step gates the next. This layered approach ensures you never waste expensive testing effort on an unstable foundation.
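The gating sequence can be sketched as an ordered list of stages where each stage runs only if the one before it passed. The stage functions below are trivial stand-ins for real build, smoke, sanity, and regression steps.

```python
# Sketch of gated pipeline stages: each runs only if the previous passed.
# Stage callables here are trivial stand-ins for real CI steps.

def run_pipeline(stages):
    """Run (name, fn) pairs in order; stop at the first failure.

    Returns (names_of_stages_that_ran, all_passed).
    """
    ran = []
    for name, stage in stages:
        ran.append(name)
        if not stage():
            return ran, False  # gate closed: later stages never run
    return ran, True

stages = [
    ("build", lambda: True),
    ("smoke", lambda: False),  # a smoke failure blocks everything after it
    ("sanity", lambda: True),
    ("regression", lambda: True),
]
ran, passed = run_pipeline(stages)
# ran == ["build", "smoke"]; sanity and regression never execute
```

The point of the structure is visible in the example: when smoke fails, the more expensive sanity and regression stages are never even attempted.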
When teams confuse these and what goes wrong
The most common mistake is running only smoke tests and assuming sanity coverage is included. Smoke tests verify that login works, but they do not verify that the password reset flow you just refactored sends the correct email. The smoke test passes, the team proceeds to regression, and nobody specifically checks whether the changed feature actually works until a customer reports the problem.
The reverse mistake is less common but equally costly: running sanity tests on specific changes without first confirming the build is stable. A developer deploys a fix, sanity-tests it in staging, and marks it as verified. But staging was already broken by a different change that went in an hour earlier. The sanity test "passed" in an environment that does not represent what will ship.
Another pattern that causes problems is using the term "smoke test" to describe what is actually a full regression suite. When "smoke tests" take 45 minutes to run, they are no longer smoke tests. They have accumulated scope without anyone noticing, and the team has lost the rapid feedback mechanism that smoke testing is supposed to provide. If your smoke suite is growing beyond 10 minutes, it is time to split it into a true smoke tier and a separate regression tier. The definitive regression testing guide covers how to structure tiered test suites effectively.
Building both into your workflow
Implementing both smoke and sanity testing does not require a large investment. Most teams can get both practices running within a single sprint.
For smoke testing, start by identifying the 10 to 15 most critical paths in your application. Write automated tests for each using whatever framework you already have in place. These tests should be fast and independent of each other. Add them as the first test stage in your CI/CD pipeline with a hard fail gate. If any smoke test fails, the pipeline stops and alerts the team.
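One way to wire the hard fail gate, sketched here with hypothetical check names: a small script runs the smoke checks and returns a nonzero exit code on any failure, which is what most CI systems interpret as "stop the pipeline."

```python
# Sketch of a hard fail gate for CI: run smoke checks, return a
# process exit code. Check names and wiring are hypothetical.
import sys

def smoke_gate(checks):
    """Return an exit code: 0 if every check passes, 1 otherwise."""
    failed = [name for name, check in checks if not check()]
    for name in failed:
        print(f"SMOKE FAILURE: {name}", file=sys.stderr)
    return 1 if failed else 0

code = smoke_gate([("api_health", lambda: True)])
# In CI you would call sys.exit(code): any nonzero exit code halts
# the pipeline and triggers the team alert.
```

Because the checks are independent, you can also run them in parallel later without changing the gate's contract: zero means proceed, nonzero means stop.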
For sanity testing, establish a simple process: before any change moves from "in review" to "ready for QA," someone other than the author verifies that the specific change works as intended in the deployed environment. This can be a developer, a QA engineer, or a dedicated tester who brings fresh eyes. The key is that it happens consistently, not just when someone remembers to do it.
Track the results. When a bug makes it to regression testing or production that a smoke test or sanity check should have caught, note it. Over time, this data tells you whether your gate checks are effective or need adjustment. Teams that track the right QA metrics consistently improve their testing efficiency sprint over sprint.
If your team is looking for a structured approach to implementing these testing gates without pulling engineers off feature work, a managed QA service can own the sanity and regression layers while your developers focus on building. See how it works to understand the integration model.
Ready to level up your QA?
Book a free 30-minute call and see how Pinpoint plugs into your pipeline with zero overhead.