Files
Sharepoint-Toolbox/.planning/research/FEATURES.md
Dev 853f47c4a6 docs: complete v2.3 project research (STACK, FEATURES, ARCHITECTURE, PITFALLS)
Research covers all five v2.3 features: automated app registration, app removal,
auto-take ownership, group expansion in HTML reports, and report consolidation toggle.
No new NuGet packages required. Build order and phase implications documented.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 10:58:58 +02:00

31 KiB

Feature Landscape

Domain: MSP IT admin desktop tool — Tenant Management & Report Enhancements Milestone: v2.3 Researched: 2026-04-09 Overall confidence: HIGH (verified via official Graph API docs, PnP docs, and direct codebase inspection)


Scope Boundary

This file covers only the five net-new features in v2.3:

  1. Automated app registration on target tenant (with guided fallback)
  2. App removal from target tenant
  3. Auto-take ownership of SharePoint sites on access denied (global toggle)
  4. Expand groups in HTML reports (clickable to show members)
  5. Report consolidation toggle (merge duplicate user entries across locations)

Everything else is already shipped. Dependencies on existing code are called out explicitly.


Feature 1: Automated App Registration on Target Tenant

What it is

During profile creation/editing, the app can register itself as an Azure AD application on the target tenant. This eliminates the current manual step where admins must open Entra portal, create an app registration, copy the Client ID, and paste it into the profile form.

How it works (technical)

Graph API app registration is a two-phase operation when performed programmatically:

Phase 1 — Create the Application object: POST /applications with displayName and optionally requiredResourceAccess (permission declarations). Returns appId (client ID) and id (object ID). Requires delegated permission Application.ReadWrite.All (least privilege for delegated scenarios), or the calling user must hold a role of Application Developer, Cloud Application Administrator, or higher.

Phase 2 — Create the Service Principal: POST /servicePrincipals with appId from Phase 1. This is a required explicit step when registering via Graph API — the portal creates the SP automatically, the API does not. Requires the same Application.ReadWrite.All delegated permission.

Phase 3 — Grant admin consent for required permissions: POST /servicePrincipals/{resourceId}/appRoleAssignedTo for each application permission (SharePoint, Graph scopes needed). The calling user must hold Cloud Application Administrator or Global Administrator to grant tenant-wide consent. Requires delegated permissions Application.Read.All + AppRoleAssignment.ReadWrite.All.

Phase 4 — Store credentials (optional): POST /applications/{id}/addPassword to create a client secret. The secretText is only returned once at creation — must be stored immediately in the profile. Alternatively, the app can use a MSAL public client flow (interactive login), which does not require a client secret.

Guided fallback: If the calling user lacks Application.ReadWrite.All or admin consent cannot be granted programmatically (e.g., tenant has restricted app consent policies), the automated path fails. The fallback shows step-by-step instructions + a deep-link to the Entra portal app registration wizard, with the appId field pre-fill-ready when the admin returns.

Table Stakes

Feature Why Expected Complexity Notes
Create app registration on target tenant via Graph API MSPs manage 10-50 tenants; manual Entra portal steps per-tenant is the biggest onboarding friction High 4 API calls; requires Application.ReadWrite.All + admin consent grant scope in the calling token
Return and store the Client ID automatically The resulting appId must be wired into the TenantProfile as the registered clientId Low Phase 1 response body contains appId; persist to TenantProfile.ClientId
Guided fallback (manual instructions + portal deep-link) if automated path fails Tenant admin consent policies may block programmatic app creation Medium Detect 403/insufficient_scope errors; render a modal with numbered steps and a link to https://entra.microsoft.com/#blade/Microsoft_AAD_RegisteredApps/CreateApplicationBlade
Progress/status feedback during multi-step registration 4 API calls; each can fail independently Low Use existing OperationProgress pattern; surface per-step status in the UI

Differentiators

Feature Value Proposition Complexity Notes
Pre-configure required Graph/SharePoint permissions in the app manifest Avoids admin having to manually tick permissions in Entra portal after creation Medium Include requiredResourceAccess in POST /applications body, targeting Graph SP (appId 00000003-0000-0000-c000-000000000000) and SharePoint SP (appId 00000003-0000-0ff1-ce00-000000000000)
Verify existing registration before creating a new one Prevents duplicate registrations on re-run or retry Low GET /applications?$filter=displayName eq '{name}' before POST; surface existing one if found

Anti-Features

Anti-Feature Why Avoid What to Do Instead
Store client secrets in the profile JSON Secrets at rest in a local JSON file are a liability; the app uses delegated (interactive) auth, not app-only Use MSAL interactive delegated flow; no client secret needed at runtime
Certificate-based credential management Cert lifecycle (expiry, rotation) is out of scope for an MSP admin tool Interactive user auth handles token refresh automatically
Silent background retry on consent failures The calling user may not have the right role; silent retry without user action would spin indefinitely Detect error class, surface actionable UI immediately

Feature Dependencies

Existing:
  GraphClientFactory     → provides authenticated GraphServiceClient for target tenant
  TenantProfile.ClientId → stores the resulting appId after registration
  ProfileManagementDialog → hosts the registration trigger button
  OperationProgress      → used for per-step status display

New:
  IAppRegistrationService / AppRegistrationService
    → CreateApplicationAsync(tenantId, displayName) : Task<Application>
    → CreateServicePrincipalAsync(appId) : Task<ServicePrincipal>
    → GrantAdminConsentAsync(servicePrincipalId) : Task (app role assignments)
    → RemoveApplicationAsync(appId) : Task (Feature 2)
  AppRegistrationFallbackDialog (new WPF dialog)
    → renders numbered steps when automated path fails
    → deep-link button to Entra portal

Key existing code note: GraphClientFactory already acquires delegated tokens with the tenant's registered clientId. For the registration flow specifically, the app needs a token scoped to the management tenant (where Entra lives), not just SharePoint/Graph read scopes. The MSAL PCA must request Application.ReadWrite.All and AppRoleAssignment.ReadWrite.All for the registration step — these are broader than the app's normal operation scopes and will trigger a new consent prompt if not previously consented.

Complexity Assessment

Sub-task Complexity Reason
POST /applications + POST /servicePrincipals Medium Two sequential calls; error handling at each step
Grant admin consent (appRoleAssignment per permission) High Must look up resource SP IDs, match appRole GUIDs by permission name; 4-6 role assignments needed
TenantProfile.ClientId persistence after registration Low Existing JSON serialization; add one field
UI: Register button in ProfileManagementDialog Low Button + status label; hooks into existing async command pattern
Guided fallback modal Medium Error detection logic + WPF dialog with instructional content
Localization EN/FR Low ~12-16 new keys
Unit tests for AppRegistrationService High Requires mocking Graph SDK Application/ServicePrincipal/AppRoleAssignment calls; 15-20 test cases

Feature 2: App Removal from Target Tenant

What it is

Inverse of Feature 1. When a tenant profile is deleted or when the admin explicitly removes the registration, the app deletes the Azure AD application object from the target tenant.

How it works (technical)

DELETE /applications/{id} — soft-deletes the application object (moved to deleted items for 30 days). Requires Application.ReadWrite.All delegated. The {id} here is the object ID (not the appId/client ID) — must be resolved first via GET /applications?$filter=appId eq '{clientId}'.

Optionally: DELETE /directory/deletedItems/{id} for permanent deletion (requires Application.ReadWrite.All; irreversible within the 30-day window — avoid).

Table Stakes

Feature Why Expected Complexity Notes
Remove app registration when profile is deleted Avoid Entra app sprawl in client tenants; clean exit Medium Resolve object ID by appId, then DELETE /applications/{id}
Confirmation prompt before removal Deletion is irreversible within the session; accidental removal would break other automations using the same app Low Modal confirm dialog with clientId displayed
Graceful handling when app no longer exists Re-run, manual deletion in portal, or already removed Low Handle 404 as success (idempotent delete)

Anti-Features

Anti-Feature Why Avoid What to Do Instead
Permanent hard-delete from deletedItems 30-day soft-delete is a safety net; no MSP needs immediate permanent removal Soft-delete only (default DELETE /applications behavior)
Auto-remove on profile deletion without prompt Silent data destruction is never acceptable Always require explicit user confirmation

Feature Dependencies

Existing:
  AppRegistrationService (from Feature 1)
    + RemoveApplicationAsync(clientId) : Task
      → GET /applications?$filter=appId eq '{clientId}' → resolve object ID
      → DELETE /applications/{objectId}

ProfileManagementDialog
  → Remove Registration button (separate from Delete Profile)
  → or confirmation step during profile deletion flow

Complexity Assessment

Sub-task Complexity Reason
RemoveApplicationAsync (resolve then delete) Low-Medium Two calls; 404 idempotency handling
Confirmation dialog Low Reuse existing ConfirmationDialog pattern
Wire into profile deletion flow Low Existing profile delete command; add optional app removal step

Feature 3: Auto-Take Ownership on Access Denied (Global Toggle)

What it is

When the scanner hits a site with "Access Denied", and the global toggle is on, the app automatically adds the scanning account as a Site Collection Administrator for that site, retries the scan, and (optionally) removes itself afterward. The admin controls this with a global on/off.

How it works (technical)

Option A — PnP Framework (preferred): context.Web.Context.Site.Owner = user combined with Tenant.SetSiteProperties(siteUrl, owners: loginName) or SPOTenantContext.SetSiteAdmin(siteUrl, loginName, isAdmin: true) — all available via PnP.Framework which is already a project dependency.

The calling account must hold the SharePoint Administrator role at the tenant level (not just site admin) to add itself as site collection admin to a site it is currently denied from.

Option B — Graph API: POST /sites/{siteId}/permissions with roles: ["owner"] — grants the service principal owner access to a specific site. This works for application permissions with Sites.FullControl.All but requires additional Graph permission scopes not currently in use.

Recommended: PnP Framework path (Option A) because:

  • PnP.Framework is already a dependency (no new package)
  • The app already uses delegated PnP context for all SharePoint operations
  • Tenant.SetSiteAdmin is a single method call, well-understood in the MSP ecosystem
  • Graph site permissions path requires Sites.FullControl.All which is a very broad app permission

Self-healing sequence:

  1. Site scan returns 401/403
  2. If toggle is ON: call SetSiteAdmin(siteUrl, currentUserLogin, isAdmin: true)
  3. Retry the scan operation
  4. If auto-remove-after is ON: call SetSiteAdmin(siteUrl, currentUserLogin, isAdmin: false)
  5. Log the takeover action (site, timestamp, user) for audit trail

Table Stakes

Feature Why Expected Complexity Notes
Global toggle in Settings to enable/disable auto-ownership Some MSPs want this; others consider it too aggressive for compliance reasons Low Boolean field in AppSettings; surfaced in Settings tab
Take ownership on 401/403 and retry the scan The core capability; without the retry it is pointless Medium Error interception in the scan pipeline; conditional branch
Audit log of takeover actions Compliance requirement — admin must know which sites were temporarily owned Low Extend existing Serilog logging; optionally surface in the scan results list
Respect the toggle per-scan-run (not retroactive) Some scans are read-only audits; the toggle state at run-start should be captured Low Capture AppSettings.AutoTakeOwnership at scan start; pass through as scan context

Differentiators

Feature Value Proposition Complexity Notes
Auto-remove ownership after scan completes Least-privilege principle; the scanning account should not retain admin rights beyond the scan Medium Track which sites were auto-granted; remove in a finally block or post-scan cleanup step
Per-scan results column showing "Ownership Taken" flag Transparency — admin sees which sites required escalation Low Add a flag to the ScanResultItem model; render as icon/badge in the results DataGrid

Anti-Features

Anti-Feature Why Avoid What to Do Instead
Auto-take enabled by default Too aggressive for compliance-conscious MSPs; could violate client change-control policies Default OFF; require explicit opt-in
Permanently retain ownership Violates least-privilege; creates audit exposure for the MSP Always remove after scan unless admin explicitly retains
Silent ownership changes with no audit trail Undiscoverable by the client tenant's own admins Log every takeover with timestamp and account UPN

Feature Dependencies

Existing:
  AppSettings              + AutoTakeOwnership (bool, default false)
                           + AutoRemoveOwnershipAfterScan (bool, default true)
  IPnPContextFactory       → provides PnP context for Tenant-level operations
  PermissionsScanService   → where 401/403 errors currently surface per site
  ScanResultItem           → add OwnershipTakenFlag (bool)

New:
  ISiteOwnershipService / SiteOwnershipService
    → TakeOwnershipAsync(siteUrl, loginName) : Task
    → RemoveOwnershipAsync(siteUrl, loginName) : Task
    → Uses PnP.Framework Tenant.SetSiteAdmin

ScanContext record (or existing scan parameters)
  → Carry AutoTakeOwnership bool captured at scan-start

Key existing code note: The existing BulkOperationRunner pattern handles per-item continue-on- error. The ownership-takeover path should not be grafted into BulkOperationRunner directly; instead it wraps the per-site scan call with a retry decorator that intercepts 401/403 and invokes SiteOwnershipService before retrying.

Complexity Assessment

Sub-task Complexity Reason
AppSettings fields + Settings UI toggle Low Trivial bool fields; existing settings pattern
ISiteOwnershipService + PnP SetSiteAdmin calls Low-Medium Well-known PnP API; success/failure handling
Error interception + retry in scan pipeline Medium Must not break existing error reporting; retry must not loop on non-permissions errors
Auto-remove in finally block after scan Medium Must track which sites were granted and clean up even if scan errors
Audit log / results column Low Extend existing model + DataGrid
Localization EN/FR Low ~8-10 new keys
Unit tests High Retry logic + cleanup path requires careful mock setup; ~15 test cases

Feature 4: Expand Groups in HTML Reports

What it is

In HTML permission reports, security groups currently appear as a flat entry (e.g., "IT Team — Edit"). With this feature, each group row has an expand/collapse toggle that shows its members inline, without leaving the report page. The expanded members list is embedded at report-generation time (not lazy-loaded via an API call when the report is opened).

How it works (technical)

At report generation time, for each permission entry that is a group:

  1. Resolve group membership: GET /groups/{id}/members?$select=displayName,userPrincipalName
  2. Embed member data inline in the HTML as a hidden <tbody> or <div> with a stable CSS class
  3. Emit a small inline <script> block (vanilla JS, no external dependencies) that toggles display:none on the child rows when the group header is clicked

The <details>/<summary> HTML5 approach is also viable and requires zero JavaScript, but gives less control over styling and the expand icon placement. The onclick/toggle pattern with a <span> chevron is more consistent with the existing report CSS.

Group member resolution requires the GroupMember.Read.All Graph permission (delegated) or Group.Read.All. The app likely already consumes Group.Read.All for the existing group-in- permissions display — confirm scope list during implementation.

Depth limit: Nested groups (groups-within-groups) should be resolved one level deep only. Full recursive expansion of nested groups can return hundreds of entries and is overkill for a permissions audit. Mark nested group entries with a "nested group — expand separately" note.

Table Stakes

Feature Why Expected Complexity Notes
Group rows in HTML report are expandable to show members A flat "IT Team — Edit" entry is not auditable; admins need to see who is actually in the group Medium Member data embedded at generation time; vanilla JS toggle
Collapsed by default Reports may have dozens of groups; expanded by default would be overwhelming Low CSS display:none on child rows by default; toggle on click
Member count shown on collapsed group row Gives the admin a preview of group size without expanding Low memberCount available from group metadata or members.length at generation time
Groups without members (empty) still render correctly Empty groups exist; collapsed empty list should not crash or show a spinner Low Conditional render: no chevron and "(0 members)" label when empty

Differentiators

Feature Value Proposition Complexity Notes
"Expand all / Collapse all" button in report header Useful for small reports or print-to-PDF workflows Low Two buttons calling a JS querySelectorAll('.group-members').forEach(...)
Distinguish direct members vs nested group members visually Clear hierarchy: direct members vs members-via-nested-group Medium Color code or indent nested group entries; requires recursive resolution with depth tracking

Anti-Features

Anti-Feature Why Avoid What to Do Instead
Live API call when user clicks expand (lazy load in browser) HTML reports are static files — often emailed or archived offline; API calls from a saved HTML file will fail Embed all member data at generation time, unconditionally
Full recursive group expansion (unlimited depth) Deep nesting can multiply entries 100x; makes reports unusable One level deep; label nested group entries as such
Add group expansion to CSV exports CSV is flat by nature CSV stays flat; group expansion is HTML-only

Feature Dependencies

Existing:
  IGraphGroupService (or existing permission resolution code)
    → MemberResolutionAsync(groupId) : Task<IEnumerable<GroupMemberEntry>>
    → Uses existing GraphClientFactory

  HtmlExportService (and all other HTML exporters that include group entries)
    → Pass group members into the template at generation time
    → New: HtmlGroupExpansionHelper
        → Renders group header row with expand chevron + member count
        → Renders hidden member rows
        → Emits the inline toggle JS snippet once per report (idempotent)

  PermissionEntry model (or equivalent)
    → Add: ResolvedMembers (IList<GroupMemberEntry>?, nullable — only populated for groups)

Key existing code note: v2.2 already has a shared HtmlReportHeaderBuilder. The group expansion helper follows the same pattern — a shared renderer called from each HTML export service that emits the group expand/collapse markup and the one-time JS snippet.

Complexity Assessment

Sub-task Complexity Reason
Group member resolution at export time Medium Graph call per group; rate-limit awareness; empty group handling
HTML template for expandable group rows Medium Markup + CSS; inline vanilla JS toggle
Embed member data in report model Low Extend permission entry model; nullable field
Wire up in all HTML exporters that render groups Medium Multiple exporters (permissions, user access); each needs the helper
Localization EN/FR Low ~6-8 new keys
Unit tests Medium Mock member resolution; verify HTML output contains toggle structure

Feature 5: Report Consolidation Toggle (Merge Duplicate Entries)

What it is

In the permissions report, a user who appears in multiple groups — or has direct AND group-based access — currently generates multiple rows (one per access path). With consolidation ON, these rows are merged into a single row showing the user's highest-permission level and a count of access paths.

Example before: "Alice — Edit (via IT Team)", "Alice — Read (direct)" Example after: "Alice — Edit (2 access paths)" [with a detail-expand or tooltip]

How it works (technically)

Pure in-memory post-processing on the list of resolved permission entries:

  1. Group entries by UPN (or object ID for robustness)
  2. For each group: keep the highest-privilege entry, aggregate source paths into a list
  3. Annotate the merged entry with access path count and source summary
  4. The toggle lives in the export settings or the results toolbar — not a permanent report setting

This is entirely client-side (in-memory in C#) — no additional API calls needed.

Privilege ordering must be well-defined: FullControl > Edit/Contribute > Read > Limited Access > View Only

Table Stakes

Feature Why Expected Complexity Notes
Consolidation toggle in the report/export UI Auditors want one row per user for a clean headcount view; default OFF preserves existing behavior Low Toggle in ViewModel; filters the display/export collection
Merge duplicate user rows, keep highest permission Core consolidation logic Medium LINQ GroupBy on UPN + MaxBy on permission level; requires a defined privilege enum ordering
Show access path count on consolidated row "Alice — Edit (3 access paths)" is auditable; silent deduplication is not Low Derived count from the group; add to the display model
Consolidated export to both HTML and CSV The toggle must apply equally to all export formats Low-Medium Apply consolidation in the ViewModel before passing to export services

Differentiators

Feature Value Proposition Complexity Notes
Expand consolidated row to see individual access paths (HTML only) Same expand pattern as Feature 4 (groups); user sees "why Edit" on click Medium Reuse the group expansion HTML pattern; embed source paths as hidden child rows
Summary line: "X users, Y consolidated entries" in report header Gives auditors the before/after count immediately Low Simple count comparison; rendered in the report header

Anti-Features

Anti-Feature Why Avoid What to Do Instead
Consolidation ON by default Breaks existing workflow; MSPs relying on multi-path audit output would lose data silently Default OFF; opt-in per export run
Permanent merge (no way to see individual paths) Auditors must be able to see all access paths for security review Always preserve the unexpanded detail; consolidation is a view layer only
Merge across sites A user's Edit on Site A and Read on Site B are not the same permission; cross-site merge would lose site context Consolidate within a site only; separate sections per site remain intact

Feature Dependencies

Existing:
  PermissionEntry / UserAccessEntry models
    → No schema changes needed; consolidation is a view-model transform

  PermissionsScanViewModel / UserAccessAuditViewModel
    → Add: IsConsolidated (bool toggle, default false)
    → Add: ConsolidatedResults (computed from raw results via LINQ on toggle change)

  HtmlExportService / CsvExportService
    → Accept either raw or consolidated entry list based on toggle state

New:
  PermissionConsolidationService (or static helper)
    → Consolidate(IEnumerable<PermissionEntry>, siteScope) : IEnumerable<ConsolidatedEntry>
    → Defines PermissionLevel enum with ordering for MaxBy

Key existing code note: The app already has a detail-level toggle (simplified vs full permissions view — shipped in v1.1). The consolidation toggle follows the same UX pattern: a toolbar toggle that switches between two display modes. Reuse that toggle component and the pattern of maintaining a filtered/transformed display collection alongside the raw results.

Complexity Assessment

Sub-task Complexity Reason
PermissionLevel enum with ordering Low Define once; used by consolidation service and existing simplified view
PermissionConsolidationService (LINQ GroupBy + MaxBy) Low-Medium Straightforward transformation; edge cases around tie-breaking and LimitedAccess entries
ViewModel toggle + computed consolidated collection Low Mirrors existing simplified/detail toggle pattern
Wire consolidated list into export services Low Both exporters already accept IEnumerable; no signature change needed
HTML: expand access paths for consolidated entries Medium Reuse group expansion markup from Feature 4
Localization EN/FR Low ~8-10 new keys
Unit tests Medium Consolidation logic has many edge cases (direct+group, multiple groups, empty)

Cross-Feature Dependencies

Graph API scopes (cumulative for this milestone):
  Application.ReadWrite.All     → Features 1+2 (app registration/removal)
  AppRoleAssignment.ReadWrite.All → Feature 1 (consent grant)
  GroupMember.Read.All          → Feature 4 (group expansion)
  SharePoint Sites.FullControl.All → Optional alt path for Feature 3 (avoid if possible)

Model changes:
  AppSettings                   + AutoTakeOwnership (bool)
                                + AutoRemoveOwnershipAfterScan (bool)
                                + (no new branding fields — v2.2 shipped those)
  TenantProfile                 + ClientId may be auto-populated (Feature 1)
  PermissionEntry               + ResolvedMembers (Feature 4)
  ScanResultItem                + OwnershipTakenFlag (Feature 3)

New services (all injectable, interface-first):
  IAppRegistrationService       → Features 1+2
  ISiteOwnershipService         → Feature 3
  IPermissionConsolidationService → Feature 5
  HtmlGroupExpansionHelper      → Feature 4 (not a full service, a renderer helper)

Shared infrastructure (no changes needed):
  GraphClientFactory            → unchanged
  BulkOperationRunner           → unchanged; Feature 3 wraps around it
  HtmlReportHeaderBuilder       → extended by Feature 4 helper
  Serilog logging               → unchanged; Features 1+3 add audit log entries

No new NuGet packages are needed for Features 3-5. Features 1-2 are already covered by the Microsoft Graph SDK which is a current dependency. The self-contained EXE size is not expected to increase.


Build Order Recommendation

Sequence by lowest-risk-to-highest-risk, each independently releasable:

  1. Report Consolidation Toggle (Feature 5) — Pure in-memory LINQ; zero new API calls; zero risk to existing pipeline. Builds confidence before touching external APIs.

  2. Group Expansion in HTML Reports (Feature 4) — Graph call at export time; reuses existing GraphClientFactory; lower blast radius than account/registration operations.

  3. Auto-Take Ownership Toggle (Feature 3) — Modifies tenant state (site admin changes); must be tested on a non-production tenant. PnP path is well-understood.

  4. App Registration (Feature 1) — Modifies Entra configuration on the target tenant; highest blast radius if something goes wrong; save for last when the rest of the milestone is stable.

  5. App Removal (Feature 2) — Depends on Feature 1 infra (AppRegistrationService); build immediately after Feature 1 is stable and tested.

Defer to a later milestone:

  • Certificate-based credentials for registered apps (out of scope by design)
  • Cross-site consolidation (different problem domain)
  • Recursive group expansion beyond 1 level (complexity/value ratio too low)

Sources