diff --git a/.planning/research/ARCHITECTURE.md b/.planning/research/ARCHITECTURE.md new file mode 100644 index 0000000..775a7a5 --- /dev/null +++ b/.planning/research/ARCHITECTURE.md @@ -0,0 +1,581 @@ +# Architecture Research + +**Domain:** C#/WPF SharePoint Online Administration Desktop Tool +**Researched:** 2026-04-02 +**Confidence:** HIGH + +## Standard Architecture + +### System Overview + +``` +┌─────────────────────────────────────────────────────────────────────┐ +│ PRESENTATION LAYER │ +│ ┌──────────────┐ ┌─────────────────────────────────────────────┐ │ +│ │ MainWindow │ │ Feature Views (XAML) │ │ +│ │ Shell.xaml │ │ Permissions │ Storage │ Search │ Templates │ │ +│ │ │ │ Duplicates │ Bulk │ Reports │ Settings │ │ +│ └──────┬───────┘ └──────────────────────┬────────────────────┘ │ +│ │ DataContext binding │ DataContext binding │ +├─────────┴─────────────────────────────────┴────────────────────────┤ +│ VIEWMODEL LAYER │ +│ ┌─────────────┐ ┌──────────────────────────────────────────────┐ │ +│ │ MainWindow │ │ Feature ViewModels │ │ +│ │ ViewModel │ │ PermissionsVM │ StorageVM │ SearchVM │ │ +│ │ (nav/shell)│ │ TemplatesVM │ BulkOpsVM │ DuplicatesVM │ │ +│ └──────┬──────┘ └───────────────────────┬──────────────────────┘ │ +│ │ ICommand, ObservableProperty │ AsyncRelayCommand │ +├─────────┴─────────────────────────────────┴────────────────────────┤ +│ SERVICE LAYER │ +│ ┌────────────────┐ ┌─────────────────┐ ┌──────────────────────┐ │ +│ │ AuthService │ │ SharePoint │ │ Cross-Cutting │ │ +│ │ SessionManager │ │ Feature Services │ │ Services │ │ +│ │ TenantSession │ │ PermissionsService│ │ ReportExportService │ │ +│ │ │ │ StorageService │ │ LocalizationService │ │ +│ │ │ │ SearchService │ │ DialogService │ │ +│ │ │ │ TemplateService │ │ SettingsService │ │ +│ └────────┬───────┘ └────────┬────────┘ └──────────────────────┘ │ +│ │ ClientContext │ IProgress, CancellationToken │ +├───────────┴────────────────────┴────────────────────────────────────┤ +│ 
INFRASTRUCTURE / INTEGRATION LAYER │ +│ ┌──────────────────┐ ┌───────────────────┐ ┌──────────────────┐ │ +│ │ PnP Framework │ │ Microsoft Graph │ │ Local Storage │ │ +│ │ AuthManager │ │ GraphServiceClient │ │ JSON Files │ │ +│ │ ClientContext │ │ (Graph operations) │ │ Profiles │ │ +│ │ (CSOM ops) │ │ │ │ Templates │ │ +│ └──────────────────┘ └───────────────────┘ └──────────────────┘ │ +└─────────────────────────────────────────────────────────────────────┘ +``` + +### Component Responsibilities + +| Component | Responsibility | Typical Implementation | +|-----------|----------------|------------------------| +| MainWindow Shell | Tab navigation, tenant selector, app chrome, log panel | XAML with TabControl or navigation frame | +| Feature Views | User input forms, result grids, progress indicators | UserControl XAML, zero code-behind | +| Feature ViewModels | Commands, observable state, orchestrates services | ObservableObject subclass, AsyncRelayCommand | +| AuthService / SessionManager | Multi-tenant session lifecycle, token cache, active tenant state | Singleton, MSAL token cache per tenant | +| TenantSession | Per-tenant PnP ClientContext + auth token | Immutable record, created by AuthService | +| SharePoint Feature Services | Domain logic that calls PnP Framework or Graph | Stateless class, injectable, cancellable | +| ReportExportService | HTML/CSV generation from result models | Stateless, template-based string builder | +| LocalizationService | Key-based EN/FR translation, dynamic language switch | Singleton, loads lang/*.json, INotifyPropertyChanged | +| SettingsService | Read/write JSON settings, profiles, templates | Singleton, file I/O wrapped in async | +| DialogService | Open files, show message boxes, pick folders | Interface + WPF implementation, testable | + +--- + +## Recommended Project Structure + +``` +SharepointToolbox/ +├── App.xaml # Application entry, DI container bootstrap +├── App.xaml.cs # Host builder, service registration +│ +├── 
Core/ # Domain models — no WPF dependencies +│ ├── Models/ +│ │ ├── PermissionEntry.cs +│ │ ├── StorageMetrics.cs +│ │ ├── SiteTemplate.cs +│ │ ├── TenantProfile.cs +│ │ └── SearchResult.cs +│ ├── Interfaces/ +│ │ ├── IAuthService.cs +│ │ ├── IPermissionsService.cs +│ │ ├── IStorageService.cs +│ │ ├── ISearchService.cs +│ │ ├── ITemplateService.cs +│ │ ├── IBulkOpsService.cs +│ │ ├── IDuplicateService.cs +│ │ ├── IReportExportService.cs +│ │ ├── ISettingsService.cs +│ │ ├── ILocalizationService.cs +│ │ └── IDialogService.cs +│ └── Exceptions/ +│ ├── SharePointConnectionException.cs +│ └── AuthenticationException.cs +│ +├── Services/ # Business logic + infrastructure +│ ├── Auth/ +│ │ ├── AuthService.cs # PnP AuthenticationManager wrapper +│ │ ├── SessionManager.cs # Multi-tenant session store +│ │ └── TenantSession.cs # Per-tenant PnP ClientContext holder +│ ├── SharePoint/ +│ │ ├── PermissionsService.cs # Recursive permission scanning +│ │ ├── StorageService.cs # Storage metric traversal +│ │ ├── SearchService.cs # KQL-based search via PnP/Graph +│ │ ├── TemplateService.cs # Capture & apply site templates +│ │ ├── DuplicateService.cs # File/folder duplicate detection +│ │ └── BulkOpsService.cs # Transfer, site creation, member add +│ ├── Reporting/ +│ │ ├── HtmlReportService.cs # Self-contained HTML + JS reports +│ │ └── CsvExportService.cs # CSV export +│ ├── LocalizationService.cs # EN/FR key-value translations +│ ├── SettingsService.cs # JSON profiles, templates, settings +│ └── DialogService.cs # WPF dialog abstractions +│ +├── ViewModels/ # WPF-aware but UI-framework-agnostic +│ ├── MainWindowViewModel.cs # Shell nav, tenant switcher, log +│ ├── Permissions/ +│ │ └── PermissionsViewModel.cs +│ ├── Storage/ +│ │ └── StorageViewModel.cs +│ ├── Search/ +│ │ └── SearchViewModel.cs +│ ├── Templates/ +│ │ └── TemplatesViewModel.cs +│ ├── Duplicates/ +│ │ └── DuplicatesViewModel.cs +│ ├── BulkOps/ +│ │ └── BulkOpsViewModel.cs +│ └── Settings/ +│ └── 
SettingsViewModel.cs +│ +├── Views/ # XAML — no business logic +│ ├── MainWindow.xaml +│ ├── Permissions/ +│ │ └── PermissionsView.xaml +│ ├── Storage/ +│ │ └── StorageView.xaml +│ ├── Search/ +│ │ └── SearchView.xaml +│ ├── Templates/ +│ │ └── TemplatesView.xaml +│ ├── Duplicates/ +│ │ └── DuplicatesView.xaml +│ ├── BulkOps/ +│ │ └── BulkOpsView.xaml +│ └── Settings/ +│ └── SettingsView.xaml +│ +├── Controls/ # Reusable WPF controls +│ ├── TenantSelectorControl.xaml +│ ├── LogPanelControl.xaml +│ ├── ProgressOverlayControl.xaml +│ └── StorageChartControl.xaml # LiveCharts2 wrapper +│ +├── Converters/ # IValueConverter implementations +│ ├── BytesToStringConverter.cs +│ ├── BoolToVisibilityConverter.cs +│ └── PermissionColorConverter.cs +│ +├── Resources/ # Styles, brushes, theme +│ ├── Styles.xaml +│ └── Colors.xaml +│ +├── Lang/ # Language files +│ ├── en.json +│ └── fr.json +│ +└── Infrastructure/ + └── Behaviors/ # XAML attached behaviors (no code-behind workaround) + └── ScrollToBottomBehavior.cs +``` + +### Structure Rationale + +- **Core/**: Pure C# — no WPF references. Interfaces here make services testable. Models are plain data classes. +- **Services/**: All domain logic and I/O. Injected via constructor DI. No static state. +- **ViewModels/**: Mirror the feature structure. Depend on service interfaces, never on concrete implementations. +- **Views/**: XAML-only. No logic. `DataContext` set by DI or ViewModelLocator pattern at startup. +- **Controls/**: Reusable UI widgets that encapsulate chart, log, and progress concerns. + +--- + +## Architectural Patterns + +### Pattern 1: ObservableObject + AsyncRelayCommand (CommunityToolkit.Mvvm) + +**What:** Use `ObservableObject` as base class for all ViewModels. Use `[ObservableProperty]` source-gen attribute for bindable properties. Use `AsyncRelayCommand` (with `CancellationToken`) for all SharePoint operations. + +**When to use:** All ViewModels. This is the standard pattern for .NET 8 + WPF. 
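To make the boilerplate reduction concrete, the sketch below hand-writes roughly what a field marked `[ObservableProperty]` stands in for. It is an illustration only: `ManualPermissionsViewModel` is a hypothetical name, it uses no toolkit types, and the code the toolkit actually generates contains additional hooks (such as partial change callbacks) that are omitted here.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Runtime.CompilerServices;

// Usage: subscribe, assign the same value twice, and record the notifications.
var vm = new ManualPermissionsViewModel();
var notified = new List<string?>();
vm.PropertyChanged += (_, e) => notified.Add(e.PropertyName);

vm.IsRunning = true;   // value changes, so PropertyChanged fires
vm.IsRunning = true;   // same value, so nothing fires

Console.WriteLine(string.Join(",", notified)); // prints: IsRunning

// Hypothetical hand-written ViewModel (no toolkit, no source generator).
public class ManualPermissionsViewModel : INotifyPropertyChanged
{
    private bool _isRunning;

    public event PropertyChangedEventHandler? PropertyChanged;

    // Roughly what "[ObservableProperty] private bool _isRunning;" replaces.
    public bool IsRunning
    {
        get => _isRunning;
        set
        {
            // Skip the notification when the value did not change.
            if (EqualityComparer<bool>.Default.Equals(_isRunning, value)) return;
            _isRunning = value;
            OnPropertyChanged();
        }
    }

    // CallerMemberName supplies the property name ("IsRunning") automatically.
    protected void OnPropertyChanged([CallerMemberName] string? propertyName = null)
        => PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(propertyName));
}
```

Every bindable property needs this same getter, setter, and equality check, which is the repetition the attribute removes.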
+ +**Trade-offs:** Source generators require C# 10+. Generated partial class syntax is unfamiliar at first but eliminates most of the property and command boilerplate. + +**Example:** +```csharp +public partial class PermissionsViewModel : ObservableObject +{ + private readonly IPermissionsService _permissionsService; + + // Site to scan; the generator exposes this field as the SiteUrl property. + [ObservableProperty] + private string _siteUrl = string.Empty; + + [ObservableProperty] + private bool _isRunning; + + [ObservableProperty] + private string _statusMessage = string.Empty; + + [ObservableProperty] + private ObservableCollection<PermissionEntry> _results = new(); + + public IAsyncRelayCommand RunReportCommand { get; } + + public PermissionsViewModel(IPermissionsService permissionsService) + { + _permissionsService = permissionsService; + // Concurrent executions are disallowed by default (AsyncRelayCommandOptions.None in CommunityToolkit.Mvvm 8.x). + RunReportCommand = new AsyncRelayCommand(RunReportAsync); + } + + private async Task RunReportAsync(CancellationToken cancellationToken) + { + IsRunning = true; + StatusMessage = "Scanning permissions..."; + try + { + // Progress<string> is created here, on the UI thread, so callbacks update StatusMessage safely. + var results = await _permissionsService.ScanAsync( + SiteUrl, cancellationToken, + new Progress<string>(msg => StatusMessage = msg)); + Results = new ObservableCollection<PermissionEntry>(results); + } + finally { IsRunning = false; } + } +} +``` + +### Pattern 2: Multi-Tenant Session Manager + +**What:** A singleton `SessionManager` holds a dictionary of `TenantSession` objects keyed by tenant URL. When the user selects a tenant profile, the session is reused if still valid (MSAL token cache handles token refresh). No re-authentication unless the token is expired and silent refresh fails. + +**When to use:** Every SharePoint service operation resolves `IAuthService.GetSessionAsync(tenantUrl)` before calling PnP Framework. + +**Trade-offs:** MSAL token cache must be persisted across app restarts for seamless reconnect. For interactive login, MSAL `PublicClientApplicationBuilder` with `WithParentActivityOrWindow` is required on Windows to avoid a blank browser window. 
+ +**Example:** +```csharp +public class SessionManager +{ + // One cached session per tenant URL. + private readonly ConcurrentDictionary<string, TenantSession> _sessions = new(); + + public async Task<TenantSession> GetOrCreateSessionAsync( + TenantProfile profile, CancellationToken ct) + { + if (_sessions.TryGetValue(profile.TenantUrl, out var session) + && !session.IsExpired) + return session; + + var authManager = new PnP.Framework.AuthenticationManager( + profile.ClientId, + openBrowserCallback: url => Process.Start(new ProcessStartInfo(url) { UseShellExecute = true })); + + var ctx = await authManager.GetContextAsync(profile.TenantUrl); + var newSession = new TenantSession(profile, ctx, authManager); + _sessions[profile.TenantUrl] = newSession; + return newSession; + } +} +``` + +### Pattern 3: IProgress\<T\> + CancellationToken for All Long Operations + +**What:** Every service method that calls SharePoint accepts `IProgress<T>` and `CancellationToken`. The ViewModel creates `Progress<T>` (which marshals callbacks to the UI thread automatically) and `CancellationTokenSource`. + +**When to use:** All SharePoint service methods. This replaces the PowerShell runspace + timer polling pattern from the existing app. + +**Trade-offs:** `Progress<T>` captures the current SynchronizationContext on creation, so it must be created on the UI thread (i.e., inside the ViewModel, not inside the service). + +**Example:** +```csharp +// In ViewModel (UI thread context): +var cts = new CancellationTokenSource(); +CancelCommand = new RelayCommand(() => cts.Cancel()); +var progress = new Progress<OperationProgress>(p => StatusMessage = p.Message); + +// In Service (any thread): +public async Task<List<PermissionEntry>> ScanAsync( + string siteUrl, + CancellationToken ct, + IProgress<OperationProgress> progress) +{ + progress.Report(new OperationProgress("Connecting...")); + // No "using" here: SessionManager owns the cached session, so the service must not dispose it. + var session = await _sessionManager.GetOrCreateSessionAsync(..., ct); + // ... recursive scanning ... 
+ ct.ThrowIfCancellationRequested(); + progress.Report(new OperationProgress($"Found {results.Count} entries")); + return results; +} +``` + +### Pattern 4: Messenger for Cross-ViewModel Events + +**What:** Use `CommunityToolkit.Mvvm.Messaging.WeakReferenceMessenger` for decoupled communication between ViewModels (e.g., "tenant switched" notifies all feature VMs to reset state, "log entry added" updates the log panel ViewModel). + +**When to use:** When two ViewModels need to communicate without direct reference (shell ↔ feature VMs, service callbacks ↔ log panel). + +**Trade-offs:** Weak references mean recipients must be alive (held by DI container). Don't use for per-request data passing — use method return values for that. + +### Pattern 5: Dependency Injection via Microsoft.Extensions.Hosting + +**What:** Bootstrap the app with `Host.CreateDefaultBuilder()` in `App.xaml.cs`. Register all services, ViewModels, and the main window in the DI container. Use constructor injection everywhere — no service locator anti-pattern. 
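The preference for constructor injection over a service locator can be illustrated without any container. In the minimal sketch below, the synchronous `CountEntries` member and `FakePermissionsService` are assumptions made purely for illustration; they are not the project's real interface, which would be asynchronous.

```csharp
using System;

// Because the dependency arrives through the constructor, a test can pass a fake
// without touching a container or a global locator.
var vm = new PermissionsViewModel(new FakePermissionsService());
vm.Refresh("https://contoso.sharepoint.com/sites/demo");
Console.WriteLine(vm.StatusMessage); // prints: Found 3 entries

// Hypothetical, simplified service contract.
public interface IPermissionsService
{
    int CountEntries(string siteUrl);
}

// Test double that needs no SharePoint connection.
public sealed class FakePermissionsService : IPermissionsService
{
    public int CountEntries(string siteUrl) => 3;
}

public sealed class PermissionsViewModel
{
    private readonly IPermissionsService _permissionsService;

    public string StatusMessage { get; private set; } = string.Empty;

    // The constructor states exactly what this class needs to work.
    public PermissionsViewModel(IPermissionsService permissionsService)
        => _permissionsService = permissionsService
           ?? throw new ArgumentNullException(nameof(permissionsService));

    public void Refresh(string siteUrl)
        => StatusMessage = $"Found {_permissionsService.CountEntries(siteUrl)} entries";
}
```

With a service locator, the same class would fetch its dependency from a global object inside `Refresh`, hiding the requirement from callers and from tests.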
+ +**Example:** +```csharp +// App.xaml.cs +// Generic type arguments below are inferred from the project structure above; adjust to the actual registrations. +protected override void OnStartup(StartupEventArgs e) +{ + _host = Host.CreateDefaultBuilder() + .ConfigureServices(services => + { + // Core services (singletons) + services.AddSingleton<IAuthService, AuthService>(); + services.AddSingleton<SessionManager>(); + services.AddSingleton<ISettingsService, SettingsService>(); + services.AddSingleton<ILocalizationService, LocalizationService>(); + services.AddSingleton<IDialogService, DialogService>(); + + // Feature services (transient: no shared state) + services.AddTransient<IPermissionsService, PermissionsService>(); + services.AddTransient<IStorageService, StorageService>(); + services.AddTransient<ISearchService, SearchService>(); + + // ViewModels + services.AddTransient<MainWindowViewModel>(); + services.AddTransient<PermissionsViewModel>(); + services.AddTransient<StorageViewModel>(); + + // Views + services.AddSingleton<MainWindow>(); + }) + .Build(); + + _host.Start(); + var mainWindow = _host.Services.GetRequiredService<MainWindow>(); + mainWindow.DataContext = _host.Services.GetRequiredService<MainWindowViewModel>(); + mainWindow.Show(); +} +``` + +--- + +## Data Flow + +### SharePoint Operation Request Flow + +``` +User clicks "Run" button + ↓ +View command binding triggers AsyncRelayCommand.ExecuteAsync() + ↓ +ViewModel validates inputs → creates CancellationTokenSource + Progress<T> + ↓ +ViewModel calls IFeatureService.ScanAsync(params, ct, progress) + ↓ +Service calls SessionManager.GetOrCreateSessionAsync(profile, ct) + ↓ +SessionManager checks cache → reuses token or triggers interactive login + ↓ +Service executes PnP Framework / Graph SDK calls (async, awaited) + ↓ +Service reports incremental progress → IProgress<T>.Report() → UI thread + ↓ +Service returns result collection to ViewModel + ↓ +ViewModel updates ObservableCollection<T> → WPF binding refreshes DataGrid + ↓ +ViewModel sets IsRunning = false → progress overlay hides +``` + +### Authentication & Session Flow + +``` +User selects tenant profile from dropdown + ↓ +MainWindowViewModel calls SessionManager.SetActiveProfile(profile) + ↓ +SessionManager publishes TenantChangedMessage via WeakReferenceMessenger + ↓ +All feature ViewModels receive message → reset their state/results + ↓ +On first operation: SessionManager.GetOrCreateSessionAsync() + ↓ + [Cache hit: 
token valid] → return existing ClientContext immediately + [Cache miss / expired] → PnP AuthManager.GetContextAsync() + ↓ + MSAL silent token refresh attempt + ↓ + [Silent fails] → open browser for interactive login + ↓ + User authenticates → token cached by MSAL + ↓ + ClientContext returned to caller +``` + +### Report Export Flow + +``` +Service returns List<T> to ViewModel + ↓ +User clicks "Export CSV" or "Export HTML" + ↓ +ViewModel calls IReportExportService.ExportAsync(results, format, outputPath) + ↓ +ReportExportService generates file (string building, no blocking I/O on UI thread) + ↓ +ViewModel calls IDialogService.OpenFile(outputPath) to auto-open result +``` + +### State Management + +``` +AppState (DI-managed singletons): + SessionManager → active profile, tenant sessions dict + SettingsService → user prefs, data folder, profiles list + LocalizationService → current language, translation dict + +Per-Operation State (ViewModel-local): + ObservableCollection<T> → bound to DataGrid + CancellationTokenSource → cancel button binding + IsRunning (bool) → progress overlay binding + StatusMessage (string) → progress label binding +``` + +--- + +## Component Boundaries + +### What Communicates With What + +| Boundary | Communication Method | Direction | Notes | +|----------|---------------------|-----------|-------| +| View ↔ ViewModel | WPF data binding (two-way for inputs, one-way for results) | Both | No code-behind | +| ViewModel ↔ Service | Constructor-injected interface, async method call | VM → Service | Services return Task\<T\> | +| ViewModel ↔ ViewModel | WeakReferenceMessenger messages | Broadcast | Tenant switch, log events | +| Service ↔ SessionManager | `GetOrCreateSessionAsync()` | Service → SessionMgr | Every SharePoint call | +| SessionManager ↔ PnP Framework | `AuthenticationManager.GetContextAsync()` | SessionMgr → PnP | On cache miss only | +| Service ↔ Graph SDK | `GraphServiceClient` method calls | Service → Graph | For Graph-only operations | +| 
SettingsService ↔ FileSystem | `System.Text.Json` + `File.ReadAllTextAsync/WriteAllTextAsync` | Both | Async I/O | +| LocalizationService ↔ Views | XAML binding to translated string properties | Service → View | Via singleton binding | + +### What Must NOT Cross Boundaries + +- Views must not call services directly; all calls go through ViewModel commands +- Services must not reference any WPF types (`System.Windows.*`); use `IProgress<T>` for UI feedback +- ViewModels must not instantiate `ClientContext` or `AuthenticationManager` directly; they obtain sessions only via `IAuthService` +- SessionManager is the only class that holds `ClientContext` objects; services receive them per operation + +--- + +## Build Order (Dependency Graph) + +Components should be built in the following order, because later phases depend on earlier ones: + +``` +Phase 1: Foundation + └── Core/Models/* (no dependencies) + └── Core/Interfaces/* (no dependencies) + └── Core/Exceptions/* (no dependencies) + +Phase 2: Infrastructure Services + └── SettingsService (depends on Core models) + └── LocalizationService (depends on lang files) + └── DialogService (depends on WPF — implement last in phase) + └── AuthService / SessionManager (depends on PnP Framework NuGet) + +Phase 3: Feature Services (depend on Auth + Core) + └── PermissionsService + └── StorageService + └── SearchService + └── TemplateService + └── DuplicateService + └── BulkOpsService + +Phase 4: Reporting (depends on Feature Services output models) + └── HtmlReportService + └── CsvExportService + +Phase 5: ViewModels (depend on service interfaces) + └── MainWindowViewModel (shell, nav, tenant selector) + └── Feature ViewModels (Permissions, Storage, Search, Templates, Duplicates, BulkOps) + └── SettingsViewModel + +Phase 6: Views + App Bootstrap (depend on ViewModels + DI) + └── XAML Views (bind to ViewModels) + └── Controls (TenantSelector, LogPanel, Charts) + └── App.xaml.cs DI container wiring +``` + +--- + +## Scaling Considerations + +This is a local desktop 
tool with a single user. "Scaling" means handling larger SharePoint tenants, not more users. + +| Concern | Approach | +|---------|----------| +| Large site collections (1000+ sites) | Async streaming with early cancellation; paginated PnP calls; virtualized DataGrid | +| Deep permission hierarchies | Configurable scan depth; user can limit scope to top-level only | +| Large file search results | Server-side KQL filtering first, client-side regex only as secondary pass | +| Multiple simultaneous operations | Each ViewModel has its own CancellationTokenSource; operations are isolated | +| Session token expiry during long scan | MSAL silent refresh + retry on 401; surface error to user if re-auth needed | + +--- + +## Anti-Patterns + +### Anti-Pattern 1: `Dispatcher.Invoke` in Services + +**What people do:** Call `Application.Current.Dispatcher.Invoke()` inside service classes to update UI state. +**Why it's wrong:** Couples the service layer to WPF, makes services untestable, and can deadlock when the UI thread is itself blocked waiting on the service. +**Do this instead:** The service accepts an `IProgress<T>` parameter. `Progress<T>` marshals callbacks to the UI thread automatically via the captured SynchronizationContext. + +### Anti-Pattern 2: Giant "God ViewModel" + +**What people do:** Create one MainViewModel with all feature logic, mirroring the monolithic PowerShell script. +**Why it's wrong:** Replicates the exact problem being solved. Hard to navigate, hard to test, merge conflicts on every change. +**Do this instead:** One ViewModel per feature tab. MainWindowViewModel owns only shell navigation, active tenant, and log state. + +### Anti-Pattern 3: Storing ClientContext as a Long-Lived Static + +**What people do:** Cache `ClientContext` in a static field for reuse. +**Why it's wrong:** `ClientContext` is not thread-safe and has an auth token that expires. A static field also makes per-tenant management impossible. +**Do this instead:** `SessionManager` manages ClientContext lifetime. Services request a context per operation. 
PnP Framework handles token refresh. + +### Anti-Pattern 4: Blocking Async on Sync Context + +**What people do:** Call `.Result` or `.Wait()` on Tasks inside WPF event handlers to avoid `async void`. +**Why it's wrong:** The blocked UI thread is the same thread the awaited continuation needs, so the call deadlocks on the WPF SynchronizationContext and the UI freezes permanently. +**Do this instead:** Use `async void` only for top-level event handlers (acceptable in WPF), or bind all user actions to `AsyncRelayCommand`. + +### Anti-Pattern 5: Silent Catch Blocks (porting the existing bug) + +**What people do:** Wrap PnP calls in `catch {}` or `catch { /* ignore */ }` to prevent crashes. +**Why it's wrong:** The existing PowerShell app has 38 such blocks, which produce silent failures, missing data, and phantom "success" states. +**Do this instead:** Catch specific exceptions (for example CSOM's `ServerException` and MSAL's `MsalException`, or the project's own `SharePointConnectionException` and `AuthenticationException`). Log with full stack trace via `ILogger`. Surface a user-visible error message via the ViewModel's `ErrorMessage` property. + +--- + +## Integration Points + +### External Services + +| Service | Integration Pattern | Library | Notes | +|---------|---------------------|---------|-------| +| SharePoint Online (CSOM) | PnP Framework `ClientContext` | `PnP.Framework` NuGet | Use for permissions, storage, templates, bulk ops | +| SharePoint Search | PnP Framework `SearchRequest` | `PnP.Framework` NuGet | KQL queries; paginated | +| Microsoft Graph | `GraphServiceClient` | `Microsoft.Graph` NuGet | Use for user/group lookups, Teams data | +| Azure AD / MSAL | `PublicClientApplication` via PnP `AuthenticationManager` | Built into `PnP.Framework` | Interactive browser login; token cache callback | +| WPF Charts | `LiveCharts2` or `OxyPlot.Wpf` | NuGet | Storage metrics visualization; LiveCharts2 preferred for richer WPF binding | + +### Internal Boundaries + +| Boundary | Communication | Notes | +|----------|---------------|-------| +| SessionManager ↔ Feature Services | `TenantSession` passed per operation | Services do not store sessions | +| 
LocalizationService ↔ XAML | Singleton bound via `StaticResource`; properties fire `INotifyPropertyChanged` on language switch | All UI text goes through this | +| ReportExportService ↔ ViewModels | Called after operation completes; returns file path | Self-contained HTML with embedded JS/CSS | +| SettingsService ↔ all singletons | Read at startup; written on change | JSON format must match existing `Sharepoint_Settings.json` schema for migration | + +--- + +## Sources + +- [Introduction to MVVM Toolkit - Microsoft Learn](https://learn.microsoft.com/en-us/dotnet/communitytoolkit/mvvm/) — HIGH confidence +- [AsyncRelayCommand - CommunityToolkit](https://learn.microsoft.com/en-us/dotnet/communitytoolkit/mvvm/asyncrelaycommand) — HIGH confidence +- [PnP Framework AuthenticationManager API](https://pnp.github.io/pnpframework/api/PnP.Framework.AuthenticationManager.html) — HIGH confidence +- [PnP Framework Getting Started](https://pnp.github.io/pnpframework/using-the-framework/readme.html) — HIGH confidence +- [Acquire and cache tokens with MSAL - Microsoft Learn](https://learn.microsoft.com/en-us/entra/identity-platform/msal-acquire-cache-tokens) — HIGH confidence +- [WPF Development Best Practices 2024 - MESCIUS](https://medium.com/mesciusinc/wpf-development-best-practices-for-2024-9e5062c71350) — MEDIUM confidence +- [Modern WPF Development: MVVM and Prism - Einfochips](https://www.einfochips.com/blog/modern-wpf-development-leveraging-mvvm-and-prism-for-enterprise-app/) — MEDIUM confidence +- [Async Programming Patterns for MVVM - Microsoft Learn](https://learn.microsoft.com/en-us/archive/msdn-magazine/2014/april/async-programming-patterns-for-asynchronous-mvvm-applications-commands) — HIGH confidence + +--- + +*Architecture research for: C#/WPF SharePoint Online administration desktop tool* +*Researched: 2026-04-02* diff --git a/.planning/research/FEATURES.md b/.planning/research/FEATURES.md new file mode 100644 index 0000000..0a2017b --- /dev/null +++ 
b/.planning/research/FEATURES.md @@ -0,0 +1,192 @@ +# Feature Research + +**Domain:** SharePoint Online administration and auditing desktop tool (MSP / IT admin) +**Researched:** 2026-04-02 +**Confidence:** MEDIUM (competitive landscape from web sources; no Context7 for SaaS tools; Microsoft docs HIGH confidence) + +## Feature Landscape + +### Table Stakes (Users Expect These) + +Features that IT admins and MSPs assume exist in any SharePoint admin tool. Missing these makes the product feel broken or incomplete. + +| Feature | Why Expected | Complexity | Notes | +|---------|--------------|------------|-------| +| Permissions report (site-level) | Every audit tool has this; admins must prove who has access where | MEDIUM | Must show owners, members, guests, external users, and broken inheritance | +| Export to CSV | Standard workflow — admins paste into tickets, compliance reports, Excel | LOW | Already in current app; keep for all reports | +| Multi-site permissions scan | Admins manage dozens of sites; per-site-only scan is unusable at scale | HIGH | Requires batching Graph API calls; throttling management needed | +| Storage metrics per site | Native M365 admin center only shows tenant-level; per-site is expected | MEDIUM | Already in current app; retain and improve | +| Interactive login / Azure AD OAuth | No client secret storage expected; browser-based auth is the norm | MEDIUM | Already implemented; new version adds session caching | +| Site template management | Re-using structure across client sites is a core MSP workflow | MEDIUM | Already in current app; port to C# | +| File search across sites | Finding content across a tenant is a day-1 admin task | MEDIUM | Already in current app; Graph driveItem search | +| Bulk operations (user add/remove, site creation) | Manual one-by-one is unacceptable at MSP scale | HIGH | Already in current app; async required to avoid UI freeze | +| Error reporting (not silent failures) | Admins need to know when scans fail 
partially | LOW | Current app has 38 silent catch blocks — critical fix | +| Localization (EN + FR) | Already exists; removing it would break existing users | LOW | Key-based translation system already in place | +| Export to interactive HTML | Shareable reports without requiring recipients to have the tool | MEDIUM | Already in current app; retain embedded JS for sorting/filtering | + +### Differentiators (Competitive Advantage) + +Features that are not universally provided, or are done poorly by competitors, where this tool can create genuine advantage. + +| Feature | Value Proposition | Complexity | Notes | +|---------|-------------------|------------|-------| +| Multi-tenant session caching | MSPs switch between 10-30 client tenants daily; re-auth per client wastes 2-3 min each | HIGH | Token cache per tenant profile; MSAL token cache serialization; core MSP differentiator | +| User access export across selected sites | "Show me everything User X can access across these 15 sites" — native M365 can't do this for arbitrary site subsets | HIGH | Requires enumerating group memberships, direct assignments, and inherited access across n sites; high Graph API volume | +| Simplified permissions view (plain language) | Compliance reports today require admins to translate "Contribute" to "can edit files" — untrained staff can't read them | MEDIUM | Jargon-free labels, summary counts, color coding; configurable detail level | +| Storage graph by file type (pie + bar toggle) | Native admin center shows totals only; file-type breakdown identifies what's consuming quota (videos, backups, etc.) 
| MEDIUM | Requires Graph driveItem enumeration with file extension grouping; recharts-style WPF chart control | +| Duplicate file detection | Reduces storage waste; no native Microsoft tool provides this simply | HIGH | Hash-based (SHA256/MD5) or name+size matching; large tenant = Graph throttling challenge | +| Folder structure provisioning | Create standardized folder trees on new sites from a template — critical for MSPs onboarding clients | MEDIUM | Already in current app; differentiating because competitors (ShareGate) don't focus on this | +| Offline profile / tenant registry | Store tenant URLs, display names, notes locally — instant context switching without re-entering URLs | LOW | JSON-backed, local only — simple but missing from all SaaS tools by design | +| Operation progress and cancellation | SaaS tools run jobs server-side; desktop tool must show real-time progress and allow cancel mid-scan | MEDIUM | CancellationToken throughout async operations; progress reporting via IProgress | + +### Anti-Features (Commonly Requested, Often Problematic) + +Features that seem valuable but create disproportionate complexity, maintenance burden, or scope creep for this tool's purpose. 
+ +| Feature | Why Requested | Why Problematic | Alternative | +|---------|---------------|-----------------|-------------| +| Permission change alerts / real-time monitoring | Admins want to know when permissions change | Requires persistent background service, webhook registration in Azure, certificate lifecycle management — turns a desktop tool into a service | Run scheduled audit scans manually or via Windows Task Scheduler; export diffs between runs | +| Automated remediation (auto-revoke permissions) | "Fix it for me" saves time | One wrong rule destroys access for a client's entire org; liability risk; requires undo capability and audit trail that equals a full compliance system | Surface recommendations, let admin click to apply one at a time | +| SQLite or database storage | Faster queries on large datasets | Adds install dependency, schema migration complexity, and breaks the "single EXE" distribution model | JSON with chunked loading; lazy evaluation; paginated display | +| Cloud sync / shared tenant registry | Team of admins sharing tenant configs | Requires auth system, conflict resolution, server infrastructure — out of scope for local tool | Export/import JSON profiles; share config files manually | +| AI-powered governance recommendations | Microsoft is adding this to native admin center (SharePoint Admin Agent, Copilot-licensed) | Requires Copilot license, Graph calls with high latency, and competes directly with Microsoft's own roadmap | Focus on raw data accuracy and export quality; let Microsoft handle AI summaries | +| Cross-platform (Mac/Linux) support | Some admins use Macs | WPF is Windows-only; rewrite to MAUI/Avalonia is a full project — not justified for current user base | Confirmed out of scope in PROJECT.md | +| Version history management / rollback | Admins sometimes need to see version bloat | Version management is a deep separate problem; Graph API pagination for versions is complex and slow at scale | Surface version storage totals 
in storage metrics; flag libraries with high version counts | +| SharePoint content migration | Admins ask to move content between tenants or sites | Migration is a fully separate product category (ShareGate, AvePoint); competing here is a multi-year investment | Refer to ShareGate or native SharePoint migration for content moves | + +## Feature Dependencies + +``` +Multi-tenant session caching + └──requires──> Tenant profile registry (JSON-backed) + └──required by──> All features (auth gate) + +User access export across selected sites + └──requires──> Multi-site permissions scan + └──requires──> Multi-tenant session caching + +Simplified permissions view + └──enhances──> Permissions report (site-level) + └──enhances──> User access export across selected sites + +Storage graph by file type + └──requires──> Storage metrics per site + └──requires──> Graph driveItem enumeration (file extension data) + +Duplicate file detection + └──requires──> File search across sites (file enumeration infrastructure) + └──conflicts──> Automated remediation (deletion without undo = data loss risk) + +Bulk operations + └──requires──> Operation progress and cancellation + └──requires──> Error reporting (not silent failures) + +Export (CSV / HTML) + └──enhances──> All report features + └──required by──> Compliance audit workflows + +Folder structure provisioning + └──requires──> Site template management +``` + +### Dependency Notes + +- **Multi-tenant session caching requires Tenant profile registry:** Without a registry of tenant URLs and display names, the session cache has nothing to key against. The tenant profile JSON must exist before any feature can authenticate. +- **User access export requires multi-site permissions scan:** The "all accesses for user X" feature is essentially a filtered multi-site permissions scan. The scanning infrastructure must exist first. 
+- **Simplified permissions view enhances reports:** This is a presentation layer on top of raw permissions data — it cannot exist without the underlying data model. +- **Storage graph by file type requires Graph driveItem enumeration:** The native Graph storage reports do not include file type breakdown. This requires enumerating files with their extensions, which is a heavier Graph operation than summary-only calls. +- **Duplicate detection requires file enumeration infrastructure:** The file search feature already enumerates files; duplicate detection reuses that path but adds hash computation or name+size matching on top. +- **Bulk operations require cancellation support:** Long-running bulk operations that cannot be cancelled will freeze or force-kill the app. CancellationToken must be threaded through before bulk ops are exposed to users. +- **Duplicate detection conflicts with automated remediation:** Surfacing duplicates is safe; auto-deleting them without undo is not. Keep these concerns separate. + +## MVP Definition + +### Launch With (v1) + +Minimum viable product — sufficient to replace the existing PowerShell tool completely. 
+ +- [ ] Tenant profile registry with multi-tenant session caching — without this, no feature works +- [ ] Permissions report (site-level) with CSV + HTML export — core audit use case +- [ ] Storage metrics per site — currently used daily +- [ ] File search across sites — currently used daily +- [ ] Bulk operations (member add, site creation, transfer) with progress + cancel — currently used; async required +- [ ] Site template management — core MSP provisioning workflow +- [ ] Folder structure provisioning — paired with templates +- [ ] Duplicate file detection — currently used for storage cleanup +- [ ] Error reporting (no silent failures) — current app's biggest reliability issue +- [ ] Localization (EN/FR) — existing users depend on this + +### Add After Validation (v1.x) + +Features to add once core parity is confirmed working. + +- [ ] User access export across selected sites — new feature; high value for MSP audits; add once multi-site scan is stable +- [ ] Simplified permissions view (plain language) — presentation enhancement; add after raw data model is solid +- [ ] Storage graph by file type (pie + bar toggle) — visualization enhancement on top of existing storage metrics + +### Future Consideration (v2+) + +Features to defer until product-market fit is established. 
+ +- [ ] Scheduled scan runs via Windows Task Scheduler integration — requires stable CLI/headless mode first +- [ ] Permission comparison between two points in time (diff report) — useful for compliance but requires snapshot storage +- [ ] Export to XLSX (full Excel format, not just CSV) — requested but not critical; CSV opens in Excel adequately + +## Feature Prioritization Matrix + +| Feature | User Value | Implementation Cost | Priority | +|---------|------------|---------------------|----------| +| Tenant profile registry + session caching | HIGH | MEDIUM | P1 | +| Permissions report (site-level) | HIGH | MEDIUM | P1 | +| Storage metrics per site | HIGH | MEDIUM | P1 | +| File search across sites | HIGH | MEDIUM | P1 | +| Bulk operations with progress/cancel | HIGH | HIGH | P1 | +| Error reporting (no silent failures) | HIGH | LOW | P1 | +| Site template management | HIGH | MEDIUM | P1 | +| Folder structure provisioning | MEDIUM | MEDIUM | P1 | +| Duplicate file detection | MEDIUM | HIGH | P1 | +| Localization (EN/FR) | MEDIUM | LOW | P1 | +| User access export across selected sites | HIGH | HIGH | P2 | +| Simplified permissions view | HIGH | MEDIUM | P2 | +| Storage graph by file type | MEDIUM | MEDIUM | P2 | +| Permission diff / snapshot comparison | MEDIUM | HIGH | P3 | +| XLSX export | LOW | LOW | P3 | +| Scheduled scans (headless/CLI) | LOW | HIGH | P3 | + +**Priority key:** +- P1: Must have for v1 launch (parity with existing PowerShell tool) +- P2: Should have — add after v1 validated; new features from PROJECT.md active requirements +- P3: Nice to have, future consideration + +## Competitor Feature Analysis + +| Feature | ShareGate | ManageEngine SharePoint Manager Plus | AdminDroid | Our Approach | +|---------|-----------|---------------------------------------|------------|--------------| +| Permissions matrix report | Yes — visual matrix, CSV export | Yes — granular permission level reports | Yes — site users/groups report | Yes — with 
plain-language layer on top | +| Multi-tenant management | Yes — SaaS, per-tenant login | Yes — web-based | Yes — cloud SaaS | Yes — local session cache, instant switch, offline profiles | +| Storage reporting | Basic | Basic tenant-level | Basic | Enhanced — file-type breakdown, pie/bar toggle | +| Duplicate detection | No | No | No | Yes — differentiator | +| Folder structure provisioning | No | No | No | Yes — differentiator | +| Site templates | Migration focus | No | No | Yes — admin provisioning focus | +| Bulk operations | Yes — migration-focused | Limited | No | Yes — admin-operations focus (not migration) | +| User access export (cross-site) | Partial — site-by-site | Partial | Partial | Yes — arbitrary site subset, single export | +| Plain language permissions | No | No | No | Yes — differentiator for untrained users | +| Local desktop app (no SaaS) | No — cloud | No — cloud | No — cloud | Yes — core constraint and privacy advantage | +| Offline / no internet needed | No | No | No | Yes (after auth token cached) | +| Price | ~$6K/year | Subscription | Subscription | Tool cost (one-time dev, distributed free or licensed) | + +## Sources + +- [ShareGate SharePoint audit tool feature page](https://sharegate.com/sharepoint-audit-tool) — MEDIUM confidence (marketing page) +- [ManageEngine SharePoint Manager Plus permissions auditing](https://www.manageengine.com/sharepoint-management-reporting/sharepoint-permission-auditing-tool.html) — MEDIUM confidence +- [Microsoft Data access governance reports — site permissions for users](https://learn.microsoft.com/en-us/sharepoint/data-access-governance-site-permissions-users-report) — HIGH confidence +- [Microsoft SharePoint Advanced Management overview](https://learn.microsoft.com/en-us/sharepoint/advanced-management) — HIGH confidence +- [sprobot.io: 9 must-have features for SharePoint storage reporting](https://www.sprobot.io/blog/how-to-choose-the-right-sharepoint-storage-reporting-tool-9-must-have-features) — 
MEDIUM confidence +- [AdminDroid SharePoint Online auditing](https://admindroid.com/microsoft-365-sharepoint-online-auditing) — MEDIUM confidence +- [CIAOPS: Best ways to monitor and audit permissions across SharePoint M365](https://blog.ciaops.com/2025/04/27/best-ways-to-monitor-and-audit-permissions-across-a-sharepoint-environment-in-microsoft-365/) — MEDIUM confidence +- [ShareGate: How to generate a SharePoint user permissions report](https://sharegate.com/blog/build-the-perfect-sharepoint-permissions-report) — MEDIUM confidence +- [Microsoft SharePoint storage reports admin center](https://learn.microsoft.com/en-us/microsoft-365/admin/activity-reports/sharepoint-storage-reports?view=o365-worldwide) — HIGH confidence + +--- +*Feature research for: SharePoint Online administration/auditing desktop tool (C#/WPF, MSP/IT admin)* +*Researched: 2026-04-02* diff --git a/.planning/research/PITFALLS.md b/.planning/research/PITFALLS.md new file mode 100644 index 0000000..b4cfe60 --- /dev/null +++ b/.planning/research/PITFALLS.md @@ -0,0 +1,383 @@ +# Pitfalls Research + +**Domain:** C#/WPF SharePoint Online administration desktop tool (PowerShell-to-C# rewrite) +**Researched:** 2026-04-02 +**Confidence:** HIGH (critical pitfalls verified via official docs, PnP GitHub issues, and known existing codebase problems) + +--- + +## Critical Pitfalls + +### Pitfall 1: Calling PnP/CSOM Methods Synchronously on the UI Thread + +**What goes wrong:** +`AuthenticationManager.GetContext()`, `ExecuteQuery()`, and similar PnP Framework / CSOM calls are blocking network operations. If called directly on the WPF UI thread — even inside a button click handler — the entire window freezes until the call completes. This is precisely what causes the UI freezes in the current PowerShell app, and the problem migrates verbatim into C# if async patterns are not used from day one. + +A subtler variant: using `.Result` or `.Wait()` on a `Task` from the UI thread. 
The UI thread holds a `SynchronizationContext`; the async continuation needs that same context to resume; deadlock ensues. The application hangs with no exception and no feedback. + +**Why it happens:** +Developers migrating from PowerShell think in sequential terms and instinctively port one-liner calls directly to event handler bodies. The WPF framework does not prevent synchronous blocking — it just stops processing messages, which looks like a freeze. + +**How to avoid:** +- Every SharePoint/PnP call must be wrapped in `await Task.Run(...)` or use the async overloads directly (`ExecuteQueryRetryAsync`, `GetContextAsync`). +- Never use `.Result`, `.Wait()`, or `Task.GetAwaiter().GetResult()` on the UI thread. +- Establish a project-wide convention: all ViewModels execute SharePoint operations through `async Task` methods with `CancellationToken` parameters. Codify this in architecture docs from Phase 1. +- Use `ConfigureAwait(false)` in all service/repository layer code (below ViewModel level) so continuations do not need to return to the UI thread unnecessarily. + +**Warning signs:** +- Any `void` method containing a PnP call. +- Any `Task.Result` or `.Wait()` in ViewModel or code-behind. +- Button click handlers that are not `async`. +- Application hangs for seconds at a time when switching tenants or starting operations. + +**Phase to address:** Foundation/infrastructure phase (first phase). This pattern must be established before any feature work begins. Retrofitting async throughout a codebase is one of the most expensive rewrites possible. + +--- + +### Pitfall 2: Replicating Silent Error Suppression from the PowerShell Original + +**What goes wrong:** +The existing codebase has 38 empty `catch` blocks and 27 instances of `-ErrorAction SilentlyContinue`. During a rewrite, developers under time pressure port the "working" behavior, which means they replicate the silent failures. 
The C# version appears to work in demos but hides the same class of bugs: group member additions that silently did nothing, storage scans that silently skipped folders, JSON loads that silently returned empty defaults from corrupted files. + +**Why it happens:** +Port-from-working-code instinct. The original returned a result (even if wrong), so the C# version is written to also return a result without questioning whether an error was swallowed. Also, `try { ... } catch (Exception) { }` in C# is syntactically shorter and less ceremonial than PowerShell's equivalent, making it easy to write reflexively. + +**How to avoid:** +- Treat every `catch` block as code that requires a positive decision: log and recover, log and rethrow, or log and surface to the user. A `catch` that does none of these three things is a bug. +- Adopt a structured logging pattern (e.g., `ILogger` with `Microsoft.Extensions.Logging`) from Phase 1 so logging is never optional. +- Create a custom `SharePointOperationException` hierarchy that preserves original exceptions and adds context (which site, which operation, which user) before rethrowing. This prevents exception swallowing during the port. +- In PR reviews, flag any empty or log-only catch blocks that do not surface the error to the user as a defect. + +**Warning signs:** +- Any `catch (Exception ex) { }` with no body. +- Any `catch` block that only calls `_logger.LogWarning` but returns a success result to the caller. +- Operations that complete in < 1 second when they should take 5–10 seconds (silent skip). +- Users reporting "the button did nothing" with no error shown. + +**Phase to address:** Foundation/infrastructure phase. Define the error handling strategy and base exception types before porting any features. 
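
The "log and rethrow with context" rule above can be sketched as follows. This is a minimal illustration, not the project's actual types: `SharePointOperationException` follows the name suggested in the text, while `SharePointGuard`, `RunAsync`, and the `log` delegate are hypothetical stand-ins (a real implementation would likely take an `ILogger`).

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical base exception: keeps the original failure as InnerException
// and records which site and which operation were involved.
public class SharePointOperationException : Exception
{
    public string SiteUrl { get; }
    public string Operation { get; }

    public SharePointOperationException(string siteUrl, string operation, Exception inner)
        : base($"{operation} failed for {siteUrl}: {inner.Message}", inner)
    {
        SiteUrl = siteUrl;
        Operation = operation;
    }
}

// Hypothetical helper that makes the catch decision explicit in one place.
public static class SharePointGuard
{
    public static async Task<T> RunAsync<T>(
        string siteUrl,
        string operation,
        Func<CancellationToken, Task<T>> body,
        Action<string> log,
        CancellationToken ct = default)
    {
        try
        {
            return await body(ct).ConfigureAwait(false);
        }
        catch (OperationCanceledException)
        {
            throw; // cancellation is not a failure; let it propagate unchanged
        }
        catch (Exception ex)
        {
            // Log, then rethrow with context. Never return a "success" value here.
            log($"{operation} failed for {siteUrl}: {ex}");
            throw new SharePointOperationException(siteUrl, operation, ex);
        }
    }
}
```

A ViewModel can then catch `SharePointOperationException` at the command boundary and translate it into a user-facing message, which keeps the "surface to the user" decision in one layer.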
+ +--- + +### Pitfall 3: SharePoint List View Threshold (5 000 Items) Causing Unhandled Exceptions + +**What goes wrong:** +Any CSOM or PnP Framework call that queries a SharePoint list without explicit pagination throws a `Microsoft.SharePoint.Client.ServerException` with message "The attempted operation is prohibited because it exceeds the list view threshold" when the list contains more than 5 000 items. In the current PowerShell code this is partially masked by `-ErrorAction SilentlyContinue`. In C# it becomes an unhandled exception that crashes the operation unless explicitly caught and handled. + +Real tenant libraries with 5 000+ files are common. Permissions reports, storage scans, and file search are all affected. + +**Why it happens:** +Developers test against small tenant sites during development. The threshold is not hit, tests pass, the feature ships. First production use against a real client library fails. + +**How to avoid:** +- All `GetItems`, `GetListItems`, and folder-enumeration calls must use `CamlQuery` with `RowLimit` set to a page size (500–2 000), iterating with `ListItemCollectionPosition` until exhausted. +- For Graph SDK paths, use the `PageIterator` pattern; never call `.GetAsync()` on a collection without a `$top` parameter. +- The storage recursion function (`Collect-FolderStorage` equivalent) must default to depth 3–4, not 999, and show estimated time before starting. +- Write an integration test against a seeded list of 6 000 items before shipping each feature that enumerates list items. + +**Warning signs:** +- Any `GetItems` call without a `CamlQuery` with explicit `RowLimit`. +- Any Graph SDK call to list items without `.Top(n)`. +- `ServerException` appearing in logs from client sites but not in dev testing. + +**Phase to address:** Each feature phase that touches list enumeration (permissions, storage, file search). The pagination helper should be a shared utility written in the foundation phase and reused everywhere. 
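
A shared pagination helper along the lines described above might look like the sketch below. It assumes the CSOM types from `Microsoft.SharePoint.Client` and the `ExecuteQueryRetryAsync` extension that PnP Framework adds; `ListPaging`, `ForEachPageAsync`, and the `onPage` callback are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.SharePoint.Client; // CSOM types; PnP Framework contributes ExecuteQueryRetryAsync

public static class ListPaging
{
    // Hypothetical helper: reads a list one page at a time so that no single
    // query requests more rows than the list view threshold allows.
    public static async Task ForEachPageAsync(
        ClientContext ctx,
        Microsoft.SharePoint.Client.List list,
        int pageSize,
        Action<IReadOnlyList<ListItem>> onPage,
        CancellationToken ct)
    {
        var query = new CamlQuery
        {
            // Paged RowLimit with no filter or sort clause; pageSize is expected
            // to stay within the 500 to 2 000 range recommended above.
            ViewXml = $"<View Scope='RecursiveAll'><RowLimit Paged='TRUE'>{pageSize}</RowLimit></View>"
        };

        do
        {
            ct.ThrowIfCancellationRequested();

            ListItemCollection page = list.GetItems(query);
            ctx.Load(page);
            await ctx.ExecuteQueryRetryAsync();

            onPage(new List<ListItem>(page));

            // A null position means the last page has been read.
            query.ListItemCollectionPosition = page.ListItemCollectionPosition;
        }
        while (query.ListItemCollectionPosition != null);
    }
}
```

Handing each page to a callback, rather than returning one large list, keeps memory bounded and lets the caller report progress per page.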
+ +--- + +### Pitfall 4: Multi-Tenant Token Cache Race Conditions and Stale Tokens + +**What goes wrong:** +The design requires cached authentication sessions so users can switch between client tenants without re-authenticating. MSAL.NET token caches are not thread-safe by default. If two background operations run concurrently against different tenants, cache read/write races produce corrupted cache state, silent auth failures, or one tenant's token being used for another tenant's request. + +A secondary problem: when an Azure AD app registration's permissions change (e.g., a new Graph scope is granted), MSAL returns the cached token for the old scope. The operation fails with a 403 but looks like a permissions error, not a stale cache error, sending the developer on a false debugging path. + +**Why it happens:** +Multi-tenant caching is not covered in most MSAL.NET tutorials, which show single-tenant flows. The token cache API (`TokenCacheCallback`, `BeforeAccessNotification`, `AfterAccessNotification`) is low-level and easy to implement incorrectly. + +**How to avoid:** +- Use `Microsoft.Identity.Client.Extensions.Msal` (`MsalCacheHelper`) for file-based, cross-process-safe token persistence. This is the Microsoft-recommended approach for desktop public client apps. +- The `AuthenticationManager` instance in PnP Framework accepts a `tokenCacheCallback`; wire it to `MsalCacheHelper` so cache is persisted safely per-tenant. +- Scope the `IPublicClientApplication` instance per-ClientId (app registration), not per-tenant URL. Different tenants share the same client app but have different account entries in the cache. +- Implement an explicit "clear cache for tenant" action in the UI so users can force re-authentication when permissions change. +- Never share a single `AuthenticationManager` instance across concurrent operations on different tenants without locking. + +**Warning signs:** +- Intermittent 401 or 403 errors that resolve after restarting the app. 
+- User reports "wrong tenant data shown" (cross-tenant token bleed). +- `MsalUiRequiredException` thrown only on the second or third operation of a session. + +**Phase to address:** Authentication/multi-tenant infrastructure phase (early, before any feature uses the auth layer). + +--- + +### Pitfall 5: WPF ObservableCollection Updates from Background Threads + +**What goes wrong:** +Populating a `DataGrid` or `ListView` bound to an `ObservableCollection<T>` from a background `Task` or `Task.Run` throws a `NotSupportedException`: "This type of CollectionView does not support changes to its SourceCollection from a thread different from the Dispatcher thread." The exception crashes the background operation. If it is swallowed (see Pitfall 2), the UI simply does not update. + +This maps directly to the current app's runspace-to-UI communication via synchronized hashtables polled by a timer. The C# version must use the Dispatcher or an equivalent MVVM toolkit mechanism. + +**Why it happens:** +The delegate passed to `Task.Run` executes on a thread-pool thread, not the UI thread. Developers add items to the collection inside that delegate. It may appear to work in small-scale testing but fails under load. + +**How to avoid:** +- Never add items to an `ObservableCollection<T>` from a non-UI thread. +- Preferred pattern: collect results into a plain `List<T>` on the background thread, then `await Application.Current.Dispatcher.InvokeAsync(() => { Items = new ObservableCollection<T>(list); })` in one atomic swap. +- For streaming progress (show items as they arrive), use `BindingOperations.EnableCollectionSynchronization` with a lock object at initialization, then add items with the lock held. +- Use `IProgress<T>` with `Progress<T>` (captures the UI `SynchronizationContext` at construction) to report incremental results safely. + +**Warning signs:** +- `InvalidOperationException` or `NotSupportedException` in logs referencing `CollectionView`. 
+- UI lists that do not update despite background operation completing. +- Items appearing out of order or partially in lists. + +**Phase to address:** Foundation/infrastructure phase. Define the progress-reporting and collection-update patterns before porting any feature that returns lists of results. + +--- + +### Pitfall 6: WPF Trimming Breaks Self-Contained EXE + +**What goes wrong:** +Publishing a WPF app as a self-contained single EXE with `PublishTrimmed=true` silently removes types that WPF and XAML use via reflection at runtime. The app compiles and publishes successfully but crashes at startup or throws `TypeInitializationException` when opening a window whose XAML references a type that was trimmed. PnP Framework and MSAL also use reflection heavily; trimming removes their internal types. + +**Why it happens:** +The .NET trimmer performs static analysis and removes code it cannot prove is referenced. XAML data binding, converters, `DataTemplateSelector`, `IValueConverter`, and `DynamicResource` are resolved at runtime via reflection; the trimmer cannot see these references. + +**How to avoid:** +- Do not use `PublishTrimmed=true` for WPF + PnP Framework + MSAL projects. The EXE will be larger (~150 MB self-contained is expected and acceptable per PROJECT.md). +- Use `PublishSingleFile=true` with `SelfContained=true` and `IncludeAllContentForSelfExtract=true`, but without trimming. This bundles the runtime into the EXE correctly. +- Verify the single-file output in CI by running the EXE on a clean machine (no .NET installed) before each release. +- Set `<PublishReadyToRun>true</PublishReadyToRun>` for startup performance improvement instead of trimming. + +**Warning signs:** +- Publish profile has `<PublishTrimmed>true</PublishTrimmed>`. +- "Works on dev machine, crashes on client machine" with `TypeInitializationException` or `MissingMethodException`. +- EXE is suspiciously small (< 50 MB for a self-contained WPF app). + +**Phase to address:** Distribution/packaging phase. 
Establish the publish profile with correct flags before any release packaging work. + +--- + +### Pitfall 7: Async Void in Command Handlers Swallows Exceptions + +**What goes wrong:** +In WPF, button `Click` event handlers are `void`-returning delegates. Developers writing `async void` handlers (e.g., `private async void OnRunButtonClick(...)`) create methods where exceptions thrown after an `await` are raised on the `SynchronizationContext` rather than returned as a faulted `Task`. These exceptions cannot be caught by a caller and will crash the process (or be silently eaten by `Application.DispatcherUnhandledException` without the stack context needed to debug them). + +**Why it happens:** +MVVM `ICommand` requires a `void Execute(object parameter)` signature. New C# developers write `async void Execute(...)` without understanding the consequence. The `CommunityToolkit.Mvvm` provides `AsyncRelayCommand` to solve this correctly, but it is not the obvious choice. + +**How to avoid:** +- Never write `async void` anywhere in the codebase except the required WPF event handler entry points in code-behind, and only when those entry points immediately delegate to an `async Task` ViewModel method. +- Use `AsyncRelayCommand` from `CommunityToolkit.Mvvm` for all commands that invoke async operations. It wraps the `Task`, exposes `ExecutionTask`, `IsRunning`, and `IsCancellationRequested`, and handles exceptions via `AsyncRelayCommandOptions.FlowExceptionsToTaskScheduler`. +- Wire a global `Application.DispatcherUnhandledException` handler and `TaskScheduler.UnobservedTaskException` handler that log full stack traces and show a user-facing error dialog. This is the last line of defense. + +**Warning signs:** +- Any `async void` method outside of a `MainWindow.xaml.cs` entry point. +- Commands implemented as `async void Execute(...)` in ViewModels. +- Exceptions that appear in logs with no originating ViewModel context. 
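
The command pattern and the last-line-of-defense handlers described for Pitfall 7 could be wired roughly as below. This is a sketch under assumptions: it uses `AsyncRelayCommand` from `CommunityToolkit.Mvvm` and the standard WPF and TPL unhandled-exception events, while `ScanViewModel`, `RunScanAsync`, `ReportFatal`, and the shape of the `App` class are hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
using System.Windows;
using CommunityToolkit.Mvvm.ComponentModel;
using CommunityToolkit.Mvvm.Input;

// Hypothetical ViewModel: the command wraps an async Task method,
// so no async void appears anywhere in the ViewModel layer.
public sealed class ScanViewModel : ObservableObject
{
    public IAsyncRelayCommand RunScanCommand { get; }

    public ScanViewModel()
    {
        RunScanCommand = new AsyncRelayCommand(RunScanAsync);
    }

    private async Task RunScanAsync(CancellationToken ct)
    {
        // Placeholder for a real service call that accepts the token.
        await Task.Delay(TimeSpan.FromMilliseconds(100), ct);
    }
}

// Hypothetical application class showing the two global handlers.
public class App : Application
{
    protected override void OnStartup(StartupEventArgs e)
    {
        base.OnStartup(e);

        // Exceptions that reach the dispatcher, including those from async void handlers.
        DispatcherUnhandledException += (_, args) =>
        {
            ReportFatal(args.Exception);
            MessageBox.Show(args.Exception.Message, "Unexpected error",
                MessageBoxButton.OK, MessageBoxImage.Error);
            args.Handled = true;
        };

        // Faulted tasks nobody awaited; raised off the UI thread, so only log here.
        TaskScheduler.UnobservedTaskException += (_, args) =>
        {
            ReportFatal(args.Exception);
            args.SetObserved();
        };
    }

    private static void ReportFatal(Exception ex)
    {
        Trace.WriteLine(ex.ToString()); // full stack trace for support
    }
}
```

Binding a button's `Command` to `RunScanCommand` removes the need for a `Click` handler entirely, which is usually the simplest way to keep `async void` out of the codebase.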
+ +**Phase to address:** Foundation/infrastructure phase (MVVM base classes and command patterns established before any feature code). + +--- + +### Pitfall 8: SharePoint API Throttling Not Handled (429/503) + +**What goes wrong:** +SharePoint Online and Microsoft Graph enforce per-app, per-tenant throttling. Bulk operations (permissions scan across 50+ sites, storage scan on 10 000+ folders, bulk member additions) generate enough API calls to trigger HTTP 429 or 503 responses. Without explicit retry-after handling, the operation fails partway through with an unhandled `HttpRequestException` and leaves the user with partial results and no indication of how to resume. + +**Why it happens:** +PnP.PowerShell handled this invisibly for the PowerShell app. PnP Framework in C# does have built-in retry via `ExecuteQueryRetryAsync`, but developers unfamiliar with C#-side PnP may use the raw CSOM `ExecuteQuery()` or direct `HttpClient` calls that lack this protection. + +**How to avoid:** +- Always use `ExecuteQueryRetryAsync` (never `ExecuteQuery`) for all CSOM batch calls. +- When using Graph SDK, use the `GraphServiceClient` with the default retry handler enabled — it handles 429 with `Retry-After` header respect automatically. +- For multi-site bulk operations, add a short delay (100–300 ms) between site connections to avoid burst throttling. Implement a configurable concurrency limit (default: sequential or max 3 parallel). +- Surface throttling events in the progress log: "Rate limited, retrying in 15s…" so the user knows the operation is paused, not hung. + +**Warning signs:** +- Raw `ExecuteQuery()` calls anywhere in the codebase. +- `HttpRequestException` with 429 status in logs. +- Operations that fail consistently at the same approximate item count across multiple runs. + +**Phase to address:** Foundation/infrastructure phase for the retry handler; each feature phase must use the established pattern. 
+ +--- + +### Pitfall 9: Resource Disposal Gaps in Long-Running Operations + +**What goes wrong:** +`ClientContext` objects returned by `AuthenticationManager.GetContext()` are `IDisposable`. If a background `Task` is cancelled or throws an exception mid-operation, a `ClientContext` created in the try block is not disposed if the `finally` block is missing. Over a long session (MSP workflow: dozens of tenant switches, multiple scans), leaked `ClientContext` objects accumulate unmanaged resources and eventually cause connection refusals or memory degradation. This is the C# equivalent of the runspace disposal gaps in the current codebase. + +**Why it happens:** +`using` statements are the idiomatic C# solution, but they are easy to get wrong in async code. Developers use `try/catch` without `finally`, or return an un-awaited `Task` so the `using` scope is exited before the work completes. + +**How to avoid:** +- Always obtain `ClientContext` inside a `using` statement or a C# 8+ `using` declaration: `using var ctx = await authManager.GetContextAsync(url, token)`. `ClientContext` implements `IDisposable`; `await using` applies only to `IAsyncDisposable` types. +- Wrap the entire operation body in `try/finally` with disposal in the `finally` block when a `using` scope is not applicable. +- When a `CancellationToken` is triggered, let the `OperationCanceledException` propagate naturally; the `using` / `finally` will still execute. +- Add a unit test for the "cancelled mid-operation" path that verifies `ClientContext.Dispose()` is called. + +**Warning signs:** +- `GetContext` calls without `using`. +- `catch (Exception) { return; }` that bypasses a `ClientContext` created earlier in the method. +- Memory growth over a multi-hour MSP session visible in Task Manager. + +**Phase to address:** Foundation/infrastructure phase (define the context acquisition pattern); validate it in each feature phase. 
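
The context acquisition pattern can be captured once in a small helper so that disposal does not depend on each feature remembering it. The sketch below is generic over any `IDisposable` so it can be unit-tested without SharePoint; `ContextScope` and `UseAsync` are hypothetical names, and `TContext` stands in for `ClientContext`.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ContextScope
{
    // Hypothetical helper: acquires a disposable context, runs the body,
    // and disposes the context on success, failure, or cancellation.
    public static async Task<TResult> UseAsync<TContext, TResult>(
        Func<CancellationToken, Task<TContext>> acquire,
        Func<TContext, CancellationToken, Task<TResult>> body,
        CancellationToken ct)
        where TContext : IDisposable
    {
        // The using declaration keeps the context alive until the awaited body
        // has finished, then disposes it even if an exception propagates.
        using TContext ctx = await acquire(ct).ConfigureAwait(false);
        return await body(ctx, ct).ConfigureAwait(false);
    }
}
```

Because the body is awaited inside the `using` scope, an `OperationCanceledException` thrown mid-operation still runs `Dispose()` before it reaches the caller, which is the path the unit test mentioned above should cover.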
+ +--- + +### Pitfall 10: JSON Settings Corruption on Concurrent Writes + +**What goes wrong:** +The app writes profiles, settings, and templates to JSON files on disk. If the user triggers two rapid operations (e.g., saves a profile while a background scan completes and updates settings), both code paths may attempt to write the same file simultaneously. The second write overwrites a partially-written first write, producing a truncated or syntactically invalid JSON file. On next startup, the file fails to parse and silently returns empty defaults — erasing all user profiles. + +This is a known bug in the current app (CONCERNS.md: "Profile JSON file: no transaction semantics"). + +**Why it happens:** +File I/O is not inherently thread-safe. `System.Text.Json`'s `JsonSerializer.SerializeAsync` writes to a stream but does not protect the file from concurrent access by another code path. + +**How to avoid:** +- Serialize all writes to each JSON file through a single `SemaphoreSlim(1)` per file. Acquire before reading or writing, release in `finally`. +- Use write-then-replace: write to `filename.tmp`, validate the JSON by deserializing it, then `File.Move(tmp, original, overwrite: true)`. An interrupted write leaves the original intact. +- On startup, if the primary file is invalid, check for a `.tmp` or `.bak` version before falling back to defaults — and log which fallback was used. + +**Warning signs:** +- Profile file occasionally empty after normal use. +- `JsonException` on startup that the user cannot reproduce on demand. +- App loaded with correct profiles yesterday, empty profiles today. + +**Phase to address:** Foundation/infrastructure phase (data access layer). Must be solved before any feature persists data. 
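
The serialized, write-then-replace approach can be sketched with `System.Text.Json` and a per-file `SemaphoreSlim`. `JsonFileStore<T>` and `SaveAsync` are hypothetical names; the sketch assumes one store instance is shared per file (for example, registered as a singleton), since the lock lives on the instance.

```csharp
using System.IO;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical store: one instance per JSON file, shared by all writers.
public sealed class JsonFileStore<T> where T : class
{
    private readonly string _path;
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);

    public JsonFileStore(string path) => _path = path;

    public async Task SaveAsync(T value, CancellationToken ct = default)
    {
        await _gate.WaitAsync(ct).ConfigureAwait(false);
        try
        {
            string tmp = _path + ".tmp";

            // 1. Write the new content to a temporary file.
            await using (FileStream stream = File.Create(tmp))
            {
                await JsonSerializer.SerializeAsync(stream, value, cancellationToken: ct)
                    .ConfigureAwait(false);
            }

            // 2. Validate it by reading it back; a JsonException here leaves
            //    the original file untouched.
            await using (FileStream check = File.OpenRead(tmp))
            {
                await JsonSerializer.DeserializeAsync<T>(check, cancellationToken: ct)
                    .ConfigureAwait(false);
            }

            // 3. Replace the original only after validation succeeded.
            File.Move(tmp, _path, overwrite: true);
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

If a write is interrupted between steps 1 and 3, the original file is still intact and the leftover `.tmp` file can be inspected or discarded on the next startup.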
+ +--- + +## Technical Debt Patterns + +| Shortcut | Immediate Benefit | Long-term Cost | When Acceptable | +|----------|-------------------|----------------|-----------------| +| Copy PowerShell logic verbatim into a `Task.Run` | Fast initial port, works locally | Inherits all silent failures, no cancellation, no progress reporting | Never — always re-examine the logic | +| `async void` command handlers | Compiles and runs | Exceptions crash app silently; no cancellation propagation | Only for WPF event entry points that immediately call `async Task` | +| Direct `ExecuteQuery()` without retry | Simpler call site | Crashes on throttling for real client tenants | Never — use `ExecuteQueryRetryAsync` | +| Single shared `AuthenticationManager` instance | Simple instantiation | Token cache race conditions under concurrent operations | Only if all operations are strictly sequential (initial MVP, clearly documented) | +| Load entire list into memory before display | Simple binding | `OutOfMemoryException` on libraries with 50k+ items | Only for lists known to be small and bounded (e.g., profiles list) | +| No `CancellationToken` propagation | Simpler method signatures | Operations cannot be cancelled; UI stuck waiting | Never for operations > 2 seconds | +| Hard-code English fallback strings in code | Quick to write | Breaks FR locale; strings diverge from key system | Never — always use resource keys | + +--- + +## Integration Gotchas + +| Integration | Common Mistake | Correct Approach | +|-------------|----------------|------------------| +| PnP Framework `GetContext` | Calling on UI thread synchronously | Always `await Task.Run(() => authManager.GetContext(...))` or use `GetContextAsync` | +| MSAL token cache (multi-tenant) | One `IPublicClientApplication` per call | One `IPublicClientApplication` per ClientId, long-lived, with `MsalCacheHelper` wired | +| SharePoint list enumeration | No `RowLimit` in `CamlQuery` | Always paginate with `RowLimit` ≤ 2 000 and 
`ListItemCollectionPosition` | +| Graph SDK paging | Calling `.GetAsync()` on collections without `$top` | Use `PageIterator` or explicit `.Top(n)` on every collection request | +| PnP `ExecuteQueryRetryAsync` | Forgetting to `await`; using synchronous `ExecuteQuery` | Always `await ctx.ExecuteQueryRetryAsync()` | +| WPF `ObservableCollection<T>` | Modifying from `Task.Run` lambda | Collect into `List<T>`, then assign via `Dispatcher.InvokeAsync` | +| PnP Management Shell client ID | Using the shared PnP app ID in a multi-tenant production tool | Register a dedicated Azure AD app per deployment; don't rely on PnP's shared registration | +| SharePoint Search API (KQL) | No result limit, assuming all results returned | Always set `RowLimit`; results capped at 500 per page, max 50 000 total | + +--- + +## Performance Traps + +| Trap | Symptoms | Prevention | When It Breaks | +|------|----------|------------|----------------| +| Loading all `ObservableCollection<T>` items before displaying any | UI freezes until entire operation completes | Use `IProgress<T>` to stream items as they arrive; enable UI virtualization | Any list > ~500 items | +| WPF virtualization disabled by `ScrollViewer.CanContentScroll=False` or grouping | DataGrid scroll is sluggish with 200+ rows | Never disable `CanContentScroll`; set `VirtualizingPanel.IsVirtualizingWhenGrouping=True` | > 200 rows in a DataGrid | +| Adding items to `ObservableCollection<T>` one-by-one from background | Thousands of UI binding notifications; UI jank | Batch-load: assign `new ObservableCollection<T>(list)` once | > 50 items added in a loop | +| Permissions scan without depth limit | Scan takes hours on deep folder structures | Default depth 3–4; show estimated time; require explicit user override for deeper | Sites with > 5 folder levels | +| HTML report built entirely in memory | `OutOfMemoryException` or report generation takes minutes | Stream HTML to file; write rows as they are produced, not after full scan | > 10 000 rows in 
report | +| Sequential site processing for multi-site reports | Report for 20 sites takes 20× single-site time | Process up to 3 sites concurrently with `SemaphoreSlim`; show per-site progress | > 5 sites selected | +| Duplicate `Connect-PnPOnline` calls per operation | Redundant browser popups or token refreshes | Cache authenticated `ClientContext` per (tenant, clientId) for session lifetime | Any operation that reconnects unnecessarily | + +--- + +## Security Mistakes + +| Mistake | Risk | Prevention | +|---------|------|------------| +| Storing Client ID in plaintext JSON profile | Low on its own (Client ID is not a secret), but combined with tenant URL it eases targeted phishing | Document that Client ID is not a secret; optionally encrypt the profile file with DPAPI `ProtectedData.Protect` for defence-in-depth | +| Writing temp files with tenant credentials to `%TEMP%` | File readable by other processes on the same user account; not cleaned up on crash | Use `SecureString` in-memory for transient auth data; delete temp files in `finally` blocks; prefer named pipes or in-memory channels | +| No validation of tenant URL format before connecting | Typo sends auth token to wrong endpoint; user confused by misleading auth error | Validate against regex `^https://[a-zA-Z0-9-]+\.sharepoint\.com` before any connection attempt | +| Logging full exception messages that include HTTP request URLs | Tenant URLs and item paths exposed in log files readable on shared machines | Strip or redact SharePoint URLs in log output at `Debug` level; keep them out of `Information`-level user-visible logs | +| Bundling PnP Management Shell client ID (shared multi-tenant app) | App uses a shared identity not owned by the deploying organisation; harder to audit and revoke | Require each deployment to use a dedicated app registration; document the registration steps clearly | + +--- + +## UX Pitfalls + +| Pitfall | User Impact | Better Approach | 
+|---------|-------------|-----------------| +| No cancellation for operations > 5 seconds | User closes app via Task Manager; loses in-progress results; must restart | Every operation exposed in UI must accept a `CancellationToken`; show a "Cancel" button that is always enabled during operation | +| Progress bar with no ETA or item count | User cannot judge whether to wait or cancel | Show "Scanned X of Y sites" or "X items found"; update every 0.5 s minimum | +| Error messages showing raw exception text | Non-technical admin users see stack traces and `ServerException: CSOM call failed` | Translate known error types to plain-language messages; offer a "Copy technical details" link for support escalation | +| Silent success on bulk operations with partial failures | User thinks all 50 members were added; 12 failed silently | Show a per-item result summary: "38 added successfully, 12 failed — see details" | +| Language switches require app restart | FR-speaking users see flickering English then French on startup | Load correct language before any UI is shown; apply language from settings before `InitializeComponent` | +| Permissions report jargon ("Full Control", "Contribute", "Limited Access") shown raw | Non-technical stakeholders do not understand the report | Map SharePoint permission levels to plain-language equivalents in the report output; keep raw names in a "technical details" expandable section | + +--- + +## "Looks Done But Isn't" Checklist + +- [ ] **Multi-tenant session switching:** Verify that switching from Tenant A to Tenant B does not return Tenant A's data. Test with two real tenants, not two sites in the same tenant. +- [ ] **Operation cancellation:** Verify that pressing Cancel stops the operation within 2 seconds and leaves no zombie threads or unreleased `ClientContext` objects. 
+- [ ] **5 000+ item libraries:** Verify permissions report and storage scan complete without `ServerException` on a real library with > 5 000 items (not a test tenant with 50 items). +- [ ] **Self-contained EXE on clean machine:** Install the EXE on a machine with no .NET runtime installed; verify startup and a complete workflow before every release. +- [ ] **JSON file corruption recovery:** Corrupt a profile JSON file manually; verify the app starts, logs the corruption, does not silently return empty profiles, and preserves the backup. +- [ ] **Concurrent writes:** Simultaneously trigger "Save profile" and "Export settings" from two rapid button clicks; verify neither file is truncated. +- [ ] **Large HTML reports:** Generate a permissions report for a site with > 5 000 items; verify the HTML file opens in a browser in < 10 seconds and the DataGrid is scrollable. +- [ ] **FR locale completeness:** Switch to French; verify no UI string shows an untranslated key or hardcoded English text. +- [ ] **Throttling recovery:** Simulate a 429 response; verify the operation pauses, logs "Retrying in Xs", and completes successfully after the retry interval. 
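
The "JSON file corruption recovery" and "Concurrent writes" items above can be exercised against a small write-then-replace store. The following is a minimal sketch, not the project's actual `SettingsService`: the `JsonFileStore<T>` name, the `.tmp` and `.corrupt` suffixes, and the round-trip validation step are all assumptions made for illustration. It uses only `System.Text.Json`, `SemaphoreSlim`, and `System.IO`.

```csharp
using System;
using System.IO;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not from the codebase): serialises access to one JSON
// file and replaces it only after the new content has been validated.
public sealed class JsonFileStore<T> where T : class
{
    private readonly string _path;
    private readonly SemaphoreSlim _gate = new(1, 1); // one writer/reader at a time

    public JsonFileStore(string path) => _path = path;

    public async Task SaveAsync(T value, CancellationToken ct = default)
    {
        await _gate.WaitAsync(ct);
        try
        {
            string tmp = _path + ".tmp"; // assumed suffix
            await File.WriteAllTextAsync(tmp, JsonSerializer.Serialize(value), ct);

            // Validate by round-tripping the temp file before touching the real one.
            _ = JsonSerializer.Deserialize<T>(await File.ReadAllTextAsync(tmp, ct))
                ?? throw new InvalidDataException("Temp file deserialised to null.");

            // Replace the target in a single move; the old file stays intact until here.
            File.Move(tmp, _path, overwrite: true);
        }
        finally
        {
            _gate.Release();
        }
    }

    public async Task<T?> LoadAsync(CancellationToken ct = default)
    {
        await _gate.WaitAsync(ct);
        try
        {
            if (!File.Exists(_path)) return null;
            try
            {
                return JsonSerializer.Deserialize<T>(await File.ReadAllTextAsync(_path, ct));
            }
            catch (JsonException)
            {
                // Keep a copy of the corrupt file, then surface the failure
                // instead of silently returning an empty profile list.
                File.Copy(_path, _path + ".corrupt", overwrite: true);
                throw;
            }
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

A stress test in the spirit of the checklist would fire many `SaveAsync` calls through `Task.WhenAll` and then assert that `LoadAsync` still returns a parseable object.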
+ +--- + +## Recovery Strategies + +| Pitfall | Recovery Cost | Recovery Steps | +|---------|---------------|----------------| +| Async/sync deadlocks introduced in foundation | HIGH — requires refactoring all affected call chains | Identify all `.Result`/`.Wait()` calls with a codebase grep; convert bottom-up (services first, then ViewModels) | +| Silent failures ported from PowerShell | MEDIUM — requires audit of every catch block | Search all `catch` blocks; classify each as log-and-recover, log-and-rethrow, or log-and-surface; fix one feature at a time | +| Token cache corruption | LOW — clear the cache file and re-authenticate | Expose a "Clear cached sessions" action in the UI; document in troubleshooting guide | +| JSON profile file corruption | LOW if backup exists, HIGH if no backup | Implement write-then-replace before first release; add backup-on-corrupt logic to deserializer | +| WPF trimming breaks EXE | MEDIUM — need to republish with trimming disabled | Update publish profile, re-run publish, retest EXE on clean machine | +| Missing pagination on large lists | MEDIUM — need to refactor per-feature enumeration | Create shared pagination helper; replace calls feature by feature; test each against 6 000-item library | + +--- + +## Pitfall-to-Phase Mapping + +| Pitfall | Prevention Phase | Verification | +|---------|------------------|--------------| +| Sync/async deadlocks on UI thread | Phase 1: Foundation — establish async-first patterns | Code review checklist: no `.Result`/`.Wait()` in any ViewModel or event handler | +| Silent error suppression replication | Phase 1: Foundation — define error handling strategy and base types | Automated lint rule (Roslyn analyser or SonarQube) flagging empty catch blocks | +| SharePoint 5 000-item threshold | Phase 1: Foundation — write shared paginator; reused in all features | Integration test against 6 000-item library for every feature that enumerates lists | +| Multi-tenant token cache race | Phase 1: 
Foundation — auth layer with `MsalCacheHelper` | Test: two concurrent operations on different tenants return correct data | +| ObservableCollection cross-thread updates | Phase 1: Foundation — define progress-reporting pattern | Automated test: populate collection from background thread; verify no exception | +| WPF trimming breaks EXE | Final distribution phase | CI step: run published EXE on a clean Windows VM, assert startup and one workflow completes | +| Async void command handlers | Phase 1: Foundation — establish MVVM base with `AsyncRelayCommand` | Code review: no `async void` in ViewModel files | +| API throttling unhandled | Phase 1: Foundation — retry handler; applied by every feature | Load test: run storage scan against a tenant with rate-limiting; verify retry log entry | +| Resource disposal gaps | Phase 1: Foundation — context acquisition pattern | Unit test: cancel a long operation mid-run; verify `ClientContext.Dispose` called | +| JSON concurrent write corruption | Phase 1: Foundation — write-then-replace + `SemaphoreSlim` | Stress test: 100 concurrent save calls; verify file always parseable after all complete | + +--- + +## Sources + +- PnP Framework GitHub issue #961: `AuthenticationManager.GetContext` freeze in C# desktop app — https://github.com/pnp/pnpframework/issues/961 +- PnP Framework GitHub issue #447: `AuthenticationManager.GetContext` hanging in ASP.NET — https://github.com/pnp/pnpframework/issues/447 +- Microsoft Learn: Token cache serialization (MSAL.NET) — https://learn.microsoft.com/en-us/entra/msal/dotnet/how-to/token-cache-serialization +- Microsoft Learn: SharePoint Online list view threshold — https://learn.microsoft.com/en-us/troubleshoot/sharepoint/lists-and-libraries/items-exceeds-list-view-threshold +- Microsoft Learn: Single-file publishing overview — https://learn.microsoft.com/en-us/dotnet/core/deploying/single-file/overview +- dotnet/wpf GitHub issue #4216: `PublishTrimmed` causes `Unhandled Exception` in 
self-contained WPF app — https://github.com/dotnet/wpf/issues/4216 +- dotnet/wpf GitHub issue #6096: Trimming for WPF — https://github.com/dotnet/wpf/issues/6096 +- Microsoft .NET Blog: Await, and UI, and deadlocks — https://devblogs.microsoft.com/dotnet/await-and-ui-and-deadlocks-oh-my/ +- Microsoft Learn: AsyncRelayCommand (CommunityToolkit.Mvvm) — https://learn.microsoft.com/en-us/dotnet/communitytoolkit/mvvm/asyncrelaycommand +- Microsoft Learn: Graph SDK paging — https://learn.microsoft.com/en-us/graph/sdks/paging +- Microsoft Learn: Graph throttling guidance — https://learn.microsoft.com/en-us/graph/throttling +- Rick Strahl's Web Log: Async and Async Void Event Handling in WPF — https://weblog.west-wind.com/posts/2022/Apr/22/Async-and-Async-Void-Event-Handling-in-WPF +- Existing codebase CONCERNS.md audit (2026-04-02) — `.planning/codebase/CONCERNS.md` + +--- + +*Pitfalls research for: C#/WPF SharePoint Online administration desktop tool (PowerShell-to-C# rewrite)* +*Researched: 2026-04-02* diff --git a/.planning/research/STACK.md b/.planning/research/STACK.md new file mode 100644 index 0000000..8ca4321 --- /dev/null +++ b/.planning/research/STACK.md @@ -0,0 +1,204 @@ +# Stack Research + +**Domain:** C#/WPF desktop administration tool for SharePoint Online (multi-tenant MSP) +**Researched:** 2026-04-02 +**Confidence:** HIGH (core framework choices), MEDIUM (charting library) + +--- + +## Recommended Stack + +### Core Technologies + +| Technology | Version | Purpose | Why Recommended | +|------------|---------|---------|-----------------| +| .NET 10 LTS | 10.x | Target runtime | Released November 2025, LTS until November 2028 — the current LTS. Avoid .NET 8 (ends November 2026) and .NET 9 (STS; its support window also closes in 2026). WPF support is first-class and actively improved in .NET 10. | +| WPF (.NET 10) | built-in | UI framework | Windows-only per project constraint. Modern MVVM data binding, richer styling than WinForms.
The existing codebase uses WinForms; WPF is the correct upgrade path for richer UI. | +| C# 14 | built-in with .NET 10 | Language | Current language version shipping with .NET 10 SDK. | + +### SharePoint / Microsoft 365 API + +| Library | Version | Purpose | Why Recommended | +|---------|---------|---------|-----------------| +| PnP.Framework | 1.18.0 | SharePoint CSOM extensions, provisioning engine, site templates, permissions | Directly replaces PnP.PowerShell patterns the existing app uses. Contains the PnP Provisioning Engine needed for the site templates feature. Targets .NET Standard 2.0, so it runs on .NET 10 via compatibility. This is the correct choice for a CSOM-heavy migration — use PnP.Core SDK only when starting greenfield with Graph-first design. | +| Microsoft.Graph | 5.103.0 | Microsoft Graph API access (Teams, Groups, users across tenants) | Required for Teams site management, user enumeration across tenants. Complements PnP.Framework, which is CSOM-first. Use Graph SDK for Graph-native operations; use PnP.Framework for SharePoint-specific provisioning. | + +**Note on PnP.Core SDK vs PnP.Framework:** PnP Core SDK is the modern Graph-first replacement for PnP Framework, but PnP Framework is the right choice here because: (1) this is a migration from PnP.PowerShell, which is CSOM-based, (2) the PnP Provisioning Engine for site templates lives in PnP.Framework, not PnP Core SDK, (3) the existing feature set maps directly to PnP.Framework's extension methods. + +### Authentication + +| Library | Version | Purpose | Why Recommended | +|---------|---------|---------|-----------------| +| Microsoft.Identity.Client (MSAL.NET) | 4.83.1 | Azure AD interactive browser login, token acquisition | The underlying auth library used by both PnP.Framework and Microsoft.Graph SDK. Use directly for multi-tenant session management.
| +| Microsoft.Identity.Client.Extensions.Msal | 4.83.3 | Token cache persistence to disk | Required for multi-tenant session caching — serializes the MSAL token cache to encrypted local storage so users don't re-authenticate on each app launch or tenant switch. PnP.Framework 1.18.0 already depends on this (>= 4.70.2). | +| Microsoft.Identity.Client.Desktop | 4.82.1 | Windows-native broker support (WAM) | Enables Windows Authentication Manager integration for WPF apps. Provides system-level SSO. Add `.WithWindowsBroker()` to the PublicClientApplicationBuilder. | + +**Multi-tenant session caching pattern:** Create one `PublicClientApplication` per tenant, serialize each tenant's token cache separately using `MsalCacheHelper` from Extensions.Msal. Store serialized caches in `%AppData%\SharepointToolbox\tokens\{tenantId}.bin`. PnP.Framework's `AuthenticationManager.CreateWithInteractiveLogin()` accepts a custom MSAL app instance — wire the cached app here. + +### MVVM Infrastructure + +| Library | Version | Purpose | Why Recommended | +|---------|---------|---------|-----------------| +| CommunityToolkit.Mvvm | 8.4.2 | MVVM base classes, source-generated commands and properties, messaging | Microsoft-maintained, ships with .NET Community Toolkit. Source generators eliminate 90% of MVVM boilerplate. `[ObservableProperty]`, `[RelayCommand]`, `[INotifyPropertyChanged]` attributes generate all property change plumbing at compile time. The standard choice for WPF/MVVM in 2025-2026. | +| Microsoft.Extensions.Hosting | 10.x | Generic Host for DI, configuration, lifetime management | Provides `IServiceCollection` DI container, `IConfiguration`, and structured app startup/shutdown lifecycle in WPF. Avoids manual service locator patterns. Wire WPF `Application.Startup` into the host lifetime. | +| Microsoft.Extensions.DependencyInjection | 10.x | DI container | Included with Hosting. Register ViewModels, services, and repositories as scoped/singleton/transient services. 
| + +### Logging + +| Library | Version | Purpose | Why Recommended | +|---------|---------|---------|-----------------| +| Serilog | 4.3.1 | Structured logging | Industry standard for .NET desktop apps. Structured log events (not just strings) make post-mortem debugging of the existing app's 38 silent catch blocks tractable. File sink for persistent logs, debug sink for development. | +| Serilog.Extensions.Logging | 10.0.0 | Bridge Serilog into ILogger | Allows injecting `ILogger` everywhere while Serilog handles the actual output. One configuration point. | +| Serilog.Sinks.File | latest | Write logs to rolling files | `%AppData%\SharepointToolbox\logs\log-.txt` with daily rolling. Essential for diagnosing auth and SharePoint API failures in the field. | + +### Data Serialization + +| Library | Version | Purpose | Why Recommended | +|---------|---------|---------|-----------------| +| System.Text.Json | built-in .NET 10 | JSON read/write for profiles, settings, templates | Built into .NET, no NuGet dependency, faster and less memory-hungry than Newtonsoft.Json. Sufficient for the simple config/profile/template structures this app needs. The existing PowerShell app uses JSON — `System.Text.Json` with source generators enables AOT-safe deserialization, important for self-contained EXE size. | + +**Why not Newtonsoft.Json:** Slower, adds ~500KB to the EXE, no AOT support. Only justified when you need LINQ-to-JSON or highly polymorphic deserialization — neither of which applies here. + +### Data Visualization (Charts) + +| Library | Version | Purpose | Why Recommended | +|---------|---------|---------|-----------------| +| ScottPlot.WPF | 5.1.57 | Pie and bar charts for storage metrics | Stable, actively maintained (weekly releases), MIT licensed, no paid tier. Supports pie, bar, and all chart types needed. Renders via SkiaSharp — fast even for large datasets. LiveCharts2 is still RC for WPF (2.0.0-rc6.1 as of April 2026) and introduces unnecessary risk. 
OxyPlot is mature but lacks interactive features and has poor performance on large datasets. ScottPlot 5.x is the stable choice. | + +### Report Generation + +| Library | Version | Purpose | Why Recommended | +|---------|---------|---------|-----------------| +| CsvHelper | latest stable | CSV export | Industry standard for .NET CSV serialization. Handles encoding, quoting, header generation. Replaces manual string concatenation. | +| No HTML library needed | — | HTML reports | Generate HTML reports via `StringBuilder` or T4/Scriban text templates with embedded JS (Chart.js or DataTables). Self-contained HTML files require no server. Keep it simple — a `ReportBuilder` service class is sufficient. | + +### Localization + +| Library | Version | Purpose | Why Recommended | +|---------|---------|---------|-----------------| +| .NET Resource files (.resx) | built-in | EN/FR localization | ResX is the standard WPF localization approach for a two-language desktop app. Compile-time safety, strong tooling in Visual Studio, no runtime switching complexity. The existing app uses a key-based translation system — ResX maps directly. Use `Properties/Resources.en.resx` and `Properties/Resources.fr.resx`. Runtime language switching (if needed later) is achievable via `Thread.CurrentThread.CurrentUICulture`. | + +### Distribution + +| Tool | Version | Purpose | Why Recommended | +|------|---------|---------|-----------------| +| `dotnet publish` with PublishSingleFile + SelfContained | .NET 10 SDK | Single self-contained EXE | Built-in SDK feature. Set `<PublishSingleFile>true</PublishSingleFile>`, `<SelfContained>true</SelfContained>`, `<RuntimeIdentifier>win-x64</RuntimeIdentifier>`. No third-party tool needed. Expected output size: ~150-200MB (runtime + SkiaSharp from ScottPlot). | + +--- + +## Project File Configuration + +```xml +<PropertyGroup> + <TargetFramework>net10.0-windows</TargetFramework> + <UseWPF>true</UseWPF> + <Nullable>enable</Nullable> + <ImplicitUsings>enable</ImplicitUsings> + + <PublishSingleFile>true</PublishSingleFile> + <SelfContained>true</SelfContained> + <RuntimeIdentifier>win-x64</RuntimeIdentifier> + + <PublishTrimmed>false</PublishTrimmed> +</PropertyGroup> +``` + +**Note on trimming:** Do NOT enable `PublishTrimmed` with PnP.Framework or MSAL.NET. Both libraries use reflection internally and are not trim-safe.
The EXE will be larger (~150-200MB) but reliable. Trimming would require extensive `[DynamicDependency]` annotations and is not worth the effort. + +--- + +## Installation (NuGet Package References) + +```xml + + + + + + + + + + + + + + + + + + + + + + + +``` + +--- + +## Alternatives Considered + +| Category | Recommended | Alternative | Why Not | +|----------|-------------|-------------|---------| +| .NET version | .NET 10 LTS | .NET 8 LTS | .NET 8 support ends November 2026 — too soon for a new project to start on | +| .NET version | .NET 10 LTS | .NET 9 STS | .NET 9 is a short-term-support release whose support also ends in 2026, leaving a new project almost no supported lifetime | +| SharePoint API | PnP.Framework | PnP Core SDK | PnP Core SDK is Graph-first and not yet feature-complete for CSOM-heavy provisioning operations. Wrong choice for a migration from PnP.PowerShell patterns. | +| MVVM toolkit | CommunityToolkit.Mvvm | Prism | Prism adds module/region/navigation complexity appropriate for large enterprise apps. This is a focused admin tool — CommunityToolkit.Mvvm is leaner and Microsoft-maintained. | +| Charts | ScottPlot.WPF | LiveCharts2 | LiveCharts2 WPF package is still RC (2.0.0-rc6.1). Unstable API surface is inappropriate for production. | +| Charts | ScottPlot.WPF | OxyPlot | OxyPlot has poor performance on large datasets and limited interactivity. Low activity/maintenance compared to ScottPlot 5. | +| JSON | System.Text.Json | Newtonsoft.Json | Newtonsoft.Json adds ~500KB to EXE, is slower, and has no AOT support. Not needed for simple config structures. | +| Localization | ResX (.resx files) | WPF ResourceDictionary XAML | ResourceDictionary localization is more complex, harder to maintain with tooling, and overkill for a two-language app. ResX provides compile-time safety. | +| HTML reports | T4/StringBuilder | Razor / Blazor Hybrid | A dedicated template engine adds a dependency for what is a one-time file generation task. StringBuilder or Scriban (lightweight) is sufficient.
| +| Logging | Serilog | Microsoft.Extensions.Logging (built-in) | Built-in logging lacks file sinks and structured event support without additional providers. Serilog is de facto standard for desktop .NET apps. | + +--- + +## What NOT to Use + +| Avoid | Why | Use Instead | +|-------|-----|-------------| +| LiveCharts2 WPF | Still in RC (2.0.0-rc6.1 as of April 2026) — unstable API, potential breaking changes before 2.0 GA | ScottPlot.WPF 5.1.57 (stable, weekly releases) | +| PnP Core SDK (as primary SharePoint lib) | Graph-first design doesn't match the CSOM-heavy provisioning/permissions operations being migrated. The PnP Provisioning Engine is only in PnP.Framework | PnP.Framework 1.18.0 | +| Prism Framework | Overengineered for this use case. Adds module system, region navigation complexity that doesn't match a single-window admin tool | CommunityToolkit.Mvvm 8.4.2 | +| PublishTrimmed=true | PnP.Framework and MSAL.NET use reflection and are not trim-safe. Trimming causes runtime crashes | Keep trimming disabled; accept larger EXE | +| .NET 8 as target | EOL November 2026 — a new project started now should not immediately be on a near-EOL runtime | .NET 10 LTS (supported until November 2028) | +| SQLite / LiteDB | Out of scope per project constraints. JSON is sufficient for profiles, settings, templates. | System.Text.Json with file-based storage | +| DeviceLogin / client secrets for auth | Per project memory note: MSP workflow requires interactive login, never DeviceLogin for PnP registration | MSAL interactive browser login via `WithInteractiveBrowser()` | +| WinForms | The existing app is WinForms. The rewrite targets WPF explicitly for MVVM data binding and richer styling | WPF | + +--- + +## Version Compatibility Notes + +| Concern | Detail | +|---------|--------| +| PnP.Framework on .NET 10 | PnP.Framework targets .NET Standard 2.0, .NET 8.0, .NET 9.0. It runs on .NET 10 via .NET Standard 2.0 compatibility. 
No explicit .NET 10 TFM yet (as of April 2026), but the .NET Standard 2.0 path is stable. | +| MSAL version pinning | PnP.Framework 1.18.0 requires `Microsoft.Identity.Client.Extensions.Msal >= 4.70.2`. Installing 4.83.3 satisfies this constraint. Pin to 4.83.x to avoid drift. | +| Microsoft.Graph SDK major version | Use 5.x only. The 4.x to 5.x upgrade introduced Kiota-generated code with significant breaking changes. Do not mix 4.x and 5.x packages. | +| CommunityToolkit.Mvvm source generators | 8.4.2 introduces partial properties support requiring C# 13 / .NET 9+ SDK. On .NET 10 this is fully supported. | +| ScottPlot.WPF + SkiaSharp | ScottPlot 5.x bundles SkiaSharp. Ensure no version conflict if SkiaSharp is pulled in by another dependency. ScottPlot.WPF 5.1.57 bundles SkiaSharp 2.88.x. | + +--- + +## Sources + +- NuGet: https://www.nuget.org/packages/PnP.Framework/ — version 1.18.0 confirmed, .NET targets confirmed +- NuGet: https://www.nuget.org/packages/Microsoft.Graph/ — version 5.103.0 confirmed +- NuGet: https://www.nuget.org/packages/microsoft.identity.client — version 4.83.1 confirmed +- NuGet: https://www.nuget.org/packages/Microsoft.Identity.Client.Extensions.Msal/ — version 4.83.3 confirmed +- NuGet: https://www.nuget.org/packages/CommunityToolkit.Mvvm/ — version 8.4.2 confirmed +- NuGet: https://www.nuget.org/packages/ScottPlot.WPF — version 5.1.57 (stable), 5.1.58 (latest as of March 2026) +- NuGet: https://www.nuget.org/packages/serilog/ — version 4.3.1 confirmed +- Microsoft Learn: https://learn.microsoft.com/en-us/dotnet/core/deploying/single-file/overview — PublishSingleFile guidance, .NET 8+ SelfContained behavior change +- .NET Blog: https://devblogs.microsoft.com/dotnet/announcing-dotnet-10/ — .NET 10 LTS November 2025 GA +- .NET Support Policy: https://dotnet.microsoft.com/en-us/platform/support/policy/dotnet-core — LTS lifecycle dates +- PnP Framework GitHub: https://github.com/pnp/pnpframework — .NET targets, auth patterns +- PnP 
Framework vs Core comparison: https://github.com/pnp/pnpframework/issues/620 — authoritative guidance on which library to use +- MSAL token cache: https://learn.microsoft.com/en-us/entra/msal/dotnet/how-to/token-cache-serialization — cache serialization patterns +- CommunityToolkit 8.4 announcement: https://devblogs.microsoft.com/dotnet/announcing-the-dotnet-community-toolkit-840/ — partial properties, .NET 10 support + +--- + +*Stack research for: SharePoint Online administration desktop tool (C#/WPF)* +*Researched: 2026-04-02* diff --git a/.planning/research/SUMMARY.md b/.planning/research/SUMMARY.md new file mode 100644 index 0000000..c8112e3 --- /dev/null +++ b/.planning/research/SUMMARY.md @@ -0,0 +1,212 @@ +# Project Research Summary + +**Project:** SharePoint Toolbox — C#/WPF SharePoint Online Administration Desktop Tool +**Domain:** SharePoint Online administration, auditing, and provisioning (MSP / IT admin) +**Researched:** 2026-04-02 +**Confidence:** HIGH + +## Executive Summary + +This project is a full rewrite of a PowerShell-based SharePoint Online administration toolbox into a standalone C#/WPF desktop application targeting MSP administrators who manage 10–30 client tenants simultaneously. The research confirms that the correct technical path is .NET 10 LTS with WPF, PnP.Framework (not PnP.Core SDK) as the SharePoint library, and CommunityToolkit.Mvvm for the MVVM layer. The key architectural constraint is that multi-tenant session caching — holding MSAL token caches per tenant with `MsalCacheHelper` — must be the very first infrastructure component built, because every single feature gates on it. The recommended architecture is a strict four-layer MVVM pattern (View → ViewModel → Service → Infrastructure) with no WPF types below the ViewModel layer, constructor-injected interfaces throughout, and `AsyncRelayCommand` for every SharePoint operation. 
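
The layering rule described above (no WPF types below the ViewModel layer, constructor-injected interfaces, cancellable and progress-reporting operations) might look roughly like the sketch below. Every type name here (`IStorageService`, `FakeStorageService`, `StorageScanCoordinator`, `SiteStorage`) is hypothetical and exists only to illustrate the shape; the fake service stands in for real PnP.Framework calls, and the coordinator stands in for a feature ViewModel without depending on CommunityToolkit.Mvvm.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// All names below are hypothetical; they illustrate the layering only.
public sealed record SiteStorage(string SiteUrl, long Bytes);

// Service-layer contract: no WPF types, cancellable, reports progress.
public interface IStorageService
{
    Task<IReadOnlyList<SiteStorage>> ScanAsync(
        IReadOnlyList<string> siteUrls,
        IProgress<string> progress,
        CancellationToken ct);
}

// Stand-in implementation; a real one would call PnP.Framework here.
public sealed class FakeStorageService : IStorageService
{
    public async Task<IReadOnlyList<SiteStorage>> ScanAsync(
        IReadOnlyList<string> siteUrls, IProgress<string> progress, CancellationToken ct)
    {
        var results = new List<SiteStorage>();
        for (int i = 0; i < siteUrls.Count; i++)
        {
            ct.ThrowIfCancellationRequested();
            await Task.Delay(10, ct); // placeholder for the remote call
            results.Add(new SiteStorage(siteUrls[i], 1024L * (i + 1)));
            progress.Report($"Scanned {i + 1} of {siteUrls.Count} sites");
        }
        return results;
    }
}

// ViewModel-layer orchestration: depends on the interface only and owns the
// CancellationTokenSource for the operation that is currently running.
public sealed class StorageScanCoordinator
{
    private readonly IStorageService _service;
    private CancellationTokenSource? _cts;

    public StorageScanCoordinator(IStorageService service) => _service = service;

    public async Task<IReadOnlyList<SiteStorage>> RunAsync(
        IReadOnlyList<string> siteUrls, IProgress<string> progress)
    {
        _cts = new CancellationTokenSource();
        try
        {
            return await _service.ScanAsync(siteUrls, progress, _cts.Token);
        }
        finally
        {
            _cts.Dispose();
            _cts = null;
        }
    }

    // Safe to call when nothing is running.
    public void Cancel() => _cts?.Cancel();
}
```

In the real application, `RunAsync` would presumably sit behind an `AsyncRelayCommand`, and the `IProgress<string>` would be a `Progress<T>` created on the UI thread so that reports are marshalled back to it. Because the service depends on no WPF types, it can be tested with a plain console or unit-test host.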
+ +The feature scope is well-defined: parity with the existing PowerShell tool is the v1 MVP (permissions reports, storage metrics, file search, bulk operations, site templates, duplicate detection, error reporting, EN/FR localization). Three new features are justified for a v1.x release once core parity is validated — user access export across sites, simplified plain-language permissions view, and storage charts by file type. These represent genuine competitive differentiation against SaaS tools like ShareGate and ManageEngine, which are cloud-based, subscription-priced, and do not offer local offline operation or MSP-grade multi-tenant context switching. + +The most dangerous risk is not technical complexity but porting discipline: the existing codebase has 38 silent catch blocks and no async discipline. The single highest-priority constraint for the entire project is that async patterns (`AsyncRelayCommand`, `IProgress`, `CancellationToken`, `ExecuteQueryRetryAsync`) must be established in the foundation phase and enforced through code review before any feature work begins. Retrofitting these patterns after-the-fact is among the most expensive refactors possible in a WPF codebase. Similarly, the write-then-replace JSON persistence pattern and SharePoint pagination helpers must be built once in the foundation and reused everywhere — building these per-feature guarantees divergence and bugs. + +## Key Findings + +### Recommended Stack + +The stack is fully resolved with high confidence. All package versions are confirmed on NuGet as of 2026-04-02. The runtime is .NET 10 LTS (EOL November 2028); .NET 8 was explicitly rejected because it reaches EOL in November 2026 — too soon for a new project. PnP.Framework 1.18.0 is the correct SharePoint library choice because this is a CSOM-heavy migration from PnP.PowerShell patterns and the PnP Provisioning Engine (required for site templates) lives only in PnP.Framework, not in PnP.Core SDK. 
Do not use `PublishTrimmed=true` — PnP.Framework and MSAL use reflection and are not trim-safe; the self-contained EXE will be approximately 150–200 MB, which is acceptable per project constraints. + +**Core technologies:** +- **.NET 10 LTS + WPF**: Windows-only per constraint; richer MVVM binding than WinForms (the existing framework) +- **PnP.Framework 1.18.0**: CSOM operations, PnP Provisioning Engine, site templates — the direct C# equivalent of PnP.PowerShell +- **Microsoft.Graph 5.103.0**: Teams, groups, user enumeration across tenants — Graph-native operations only +- **MSAL.NET 4.83.1 + Extensions.Msal 4.83.3 + Desktop 4.82.1**: Multi-tenant token cache per tenant, Windows broker (WAM) support +- **CommunityToolkit.Mvvm 8.4.2**: Source-generated `[ObservableProperty]`, `[RelayCommand]`, `AsyncRelayCommand` — eliminates MVVM boilerplate +- **Microsoft.Extensions.Hosting 10.x**: DI container (`IServiceCollection`), app lifetime, `IConfiguration` +- **Serilog 4.3.1 + file sink**: Structured logging to rolling files in `%AppData%\SharepointToolbox\logs\` — essential for diagnosing the silent failures in the existing app +- **ScottPlot.WPF 5.1.57**: Pie and bar charts for storage metrics — stable MIT-licensed library (LiveCharts2 WPF is still RC as of April 2026) +- **System.Text.Json (built-in)**: JSON profiles, settings, templates — no Newtonsoft.Json dependency +- **CsvHelper**: CSV export — replaces manual string concatenation +- **.resx localization**: EN/FR compile-time-safe resource files + +### Expected Features + +The feature scope is well-researched. Competitive analysis against ShareGate, ManageEngine SharePoint Manager Plus, and AdminDroid confirms that local offline operation, instant multi-tenant switching, plain-language permissions, and folder structure provisioning are genuine differentiators that no competitor SaaS tool offers. 
+ +**Must have (table stakes — v1 parity):** +- Tenant profile registry + multi-tenant session caching — everything gates on this +- Permissions report (site-level) with CSV + HTML export +- Storage metrics per site +- File search across sites +- Bulk operations (member add, site creation, transfer) with progress and cancellation +- Site template management + folder structure provisioning +- Duplicate file detection +- Error reporting (replace 38 silent catch blocks with visible failures) +- Localization (EN/FR) — existing users depend on this + +**Should have (competitive differentiators — v1.x):** +- User access export across selected sites — "everything User X can access across 15 sites" — no native M365 equivalent +- Simplified permissions view (plain language) — "can edit files" instead of "Contribute" +- Storage graph by file type (pie + bar toggle) — file-type breakdown competitors don't provide + +**Defer (v2+):** +- Scheduled scan runs via Windows Task Scheduler (requires stable CLI/headless mode first) +- Permission comparison/diff between two time points (requires snapshot storage) +- XLSX export (CSV opens in Excel adequately for v1) + +**Anti-features to reject outright:** real-time permission change alerts (requires persistent Azure service), automated remediation (liability risk), cloud sync, AI governance recommendations (Microsoft's own roadmap). + +### Architecture Approach + +The recommended architecture is a strict four-layer MVVM pattern hosted in `Microsoft.Extensions.Hosting`. The application is organized as: Views (XAML only, zero code-behind) → ViewModels (CommunityToolkit.Mvvm, one per feature tab) → Services (domain logic, stateless, constructor-injected via interfaces) → Infrastructure (PnP.Framework, Microsoft.Graph, local JSON files). Cross-ViewModel communication uses `WeakReferenceMessenger` (e.g., tenant-switched event resets all feature VM state). 
A singleton `SessionManager` is the only class that holds `ClientContext` objects — services request a context per operation and never store it. The `Core/` folder contains pure C# models and interfaces with no WPF references, making all services independently testable. + +**Major components:** +1. **AuthService / SessionManager** — multi-tenant MSAL token cache, `TenantSession` per tenant, active profile state; singleton; every feature gates on this +2. **Feature Services (6)** — PermissionsService, StorageService, SearchService, TemplateService, DuplicateService, BulkOpsService — stateless, cancellable, progress-reporting; registered as transient +3. **ReportExportService + CsvExportService** — self-contained HTML reports (embedded JS/CSS) and CSV generation; called after operation completes +4. **SettingsService** — JSON profiles, templates, settings with write-then-replace pattern and `SemaphoreSlim` concurrency guard; singleton +5. **MainWindowViewModel** — shell navigation, tenant selector, log panel; delegates all feature logic to feature ViewModels via DI +6. **Feature ViewModels (7)** — one per tab (Permissions, Storage, Search, Templates, Duplicates, BulkOps, Settings); own `CancellationTokenSource` and `ObservableCollection` per operation + +### Critical Pitfalls + +10 pitfalls were identified. All 10 are addressed in Phase 1 (Foundation) — none can be deferred to feature phases. + +1. **Sync calls on the UI thread** — Never use `.Result`/`.Wait()` on the UI thread; every PnP call must use `await` with the async overload or `Task.Run`; use `AsyncRelayCommand` for all commands. Establish this pattern before any feature work begins or retrofitting costs will be severe. + +2. **Porting silent error suppression** — The existing app has 38 empty catch blocks. Every `catch` in the C# rewrite must do one of three things: log-and-recover, log-and-rethrow, or log-and-surface to the user. Treat empty catch as a build defect from day one. + +3. 
**SharePoint 5,000-item list view threshold** — All CSOM list enumeration must use `CamlQuery` with `RowLimit` ≤ 2,000 and `ListItemCollectionPosition` pagination. Build a shared pagination helper in Phase 1 and mandate its use in every feature that enumerates list items. + +4. **Multi-tenant token cache race conditions** — Use `MsalCacheHelper` (Microsoft.Identity.Client.Extensions.Msal) for file-based per-tenant token cache serialization. Scope `IPublicClientApplication` per ClientId, not per tenant URL. Provide a "Clear cached sessions" UI action. + +5. **JSON settings file corruption on concurrent writes** — Use write-then-replace (`filename.tmp` → validate → `File.Move`) plus `SemaphoreSlim(1)` per file. Implement this before any feature persists data. This is a known bug in the existing app per CONCERNS.md. + +6. **WPF `ObservableCollection<T>` updates from background threads** — Collect results into a `List<T>` on the background thread, then assign `new ObservableCollection<T>(list)` in a single step via `Dispatcher.InvokeAsync`. Use `IProgress<T>` for streaming. Never modify an `ObservableCollection<T>` from inside `Task.Run`. + +7. **`async void` command handlers** — Use `AsyncRelayCommand` exclusively for async operations. Exceptions thrown from an `async void` method after an `await` cannot be caught by the caller and surface as unhandled exceptions. Wire `Application.DispatcherUnhandledException` and `TaskScheduler.UnobservedTaskException` as last-resort handlers. + +8. **API throttling (429/503)** — Always use `ExecuteQueryRetryAsync` (never `ExecuteQuery`). For Graph SDK, the default retry handler respects `Retry-After` automatically. Surface retry events to the user as progress messages. + +9. **`ClientContext` resource disposal gaps** — Always obtain `ClientContext` inside a `using` statement or declaration (`ClientContext` implements `IDisposable`, so `await using` does not apply). Verify via unit tests that `Dispose()` is called on cancellation. + +10. **WPF trimming breaks self-contained EXE** — Never set `PublishTrimmed=true`. Accept the ~150–200 MB EXE size. Use `PublishReadyToRun=true` for startup performance instead. 
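
The write-then-replace pattern from pitfall 5 can be sketched as follows. This is a minimal illustration that uses only the .NET base class library. `AtomicJsonFile<T>` and its member names are hypothetical and do not come from the existing codebase, and the validation step (a round-trip deserialization of the temporary file) is one assumed reading of "validate".

```csharp
using System;
using System.IO;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: illustrates write-then-replace with a per-file gate.
public sealed class AtomicJsonFile<T> where T : class
{
    private readonly string _path;

    // One semaphore per file serializes writers inside this process.
    private readonly SemaphoreSlim _gate = new(1, 1);

    public AtomicJsonFile(string path) => _path = path;

    public async Task SaveAsync(T value, CancellationToken ct = default)
    {
        await _gate.WaitAsync(ct).ConfigureAwait(false);
        try
        {
            string tmp = _path + ".tmp";
            string json = JsonSerializer.Serialize(
                value, new JsonSerializerOptions { WriteIndented = true });

            // Step 1: write the new content to a temporary file.
            await File.WriteAllTextAsync(tmp, json, ct).ConfigureAwait(false);

            // Step 2: validate by reading the temporary file back and parsing it.
            string written = await File.ReadAllTextAsync(tmp, ct).ConfigureAwait(false);
            if (JsonSerializer.Deserialize<T>(written) is null)
                throw new InvalidDataException($"Validation failed for '{tmp}'.");

            // Step 3: replace the real file only after validation succeeded.
            File.Move(tmp, _path, overwrite: true);
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

A caller would hold one shared instance per settings file (for example, inside a singleton service) and call `await store.SaveAsync(settings)`. The semaphore only guards writers within the same process, so this sketch assumes a single running instance of the application.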
+ +## Implications for Roadmap + +Based on the combined research, the dependency graph from ARCHITECTURE.md and FEATURES.md, and the pitfall-to-phase mapping from PITFALLS.md, the following phase structure is strongly recommended: + +### Phase 1: Foundation and Infrastructure +**Rationale:** All 10 critical pitfalls must be resolved before feature work begins. The dependency graph in FEATURES.md shows that every feature requires the tenant profile registry and session caching layer. Establishing async patterns, error handling, DI container, logging, and JSON persistence now prevents the most expensive retrofits. +**Delivers:** Runnable WPF shell with tenant selector, multi-tenant session caching (MSAL + MsalCacheHelper), DI container wiring, Serilog logging, SettingsService with write-then-replace persistence, ResX localization scaffolding, shared pagination helper, shared `AsyncRelayCommand` pattern, global exception handlers. +**Addresses:** Tenant profile registry (prerequisite for all features), EN/FR localization scaffolding, error reporting infrastructure. +**Avoids:** All 10 pitfalls — async deadlocks, silent errors, list view threshold, token cache races, JSON corruption, ObservableCollection threading, async void, throttling, disposal gaps, trimming. +**Research flag:** Standard patterns — `Microsoft.Extensions.Hosting` + `CommunityToolkit.Mvvm` + `MsalCacheHelper` are well-documented. No additional research needed. + +### Phase 2: Permissions and Audit Core +**Rationale:** Permissions reporting is the highest-value daily-use feature and the canonical audit use case. Building it second validates that the auth layer and pagination helper work under real conditions before other features depend on them. It also forces the error reporting UX to be finalized early. 
+**Delivers:** Site-level permissions report with recursive scan (configurable depth), CSV export, self-contained HTML export, plain progress feedback ("Scanning X of Y sites"), error surface for failed scans (no silent failures). +**Addresses:** Permissions report (table stakes P1), CSV + HTML export (table stakes P1), error reporting (table stakes P1). +**Avoids:** 5,000-item threshold (pagination helper reuse), silent errors (error handling from Phase 1), sync/async deadlock (AsyncRelayCommand from Phase 1). +**Research flag:** Standard patterns — PnP Framework permission scanning is well-documented. PnP permissions API is HIGH confidence. + +### Phase 3: Storage Metrics and File Operations +**Rationale:** Storage metrics and file search are the other two daily-use features in the existing tool. They reuse the auth session and export infrastructure from Phases 1–2. Duplicate detection depends on the file enumeration infrastructure built for file search, so these belong together. +**Delivers:** Storage metrics per site (total + breakdown), file search across sites (KQL-based), duplicate file detection (hash or name+size matching), storage data export (CSV + HTML). +**Addresses:** Storage metrics (P1), file search (P1), duplicate detection (P1). +**Avoids:** Large collection streaming (IProgress pattern from Phase 1), Graph SDK pagination (`PageIterator`), API throttling (retry handler from Phase 1). +**Research flag:** Duplicate detection against large tenants under Graph throttling may need tactical research during planning — hash-based detection at scale has specific pagination constraints. + +### Phase 4: Bulk Operations and Provisioning +**Rationale:** Bulk operations (member add, site creation, transfer) and site/folder template management are the remaining P1 features. They are the highest-complexity features (HIGH implementation cost in FEATURES.md) and benefit from stable async/cancel/progress infrastructure from Phase 1. 
Folder provisioning depends on site template management — build together. +**Delivers:** Bulk member add/remove, bulk site creation, ownership transfer, site template capture and apply, folder structure provisioning from template. +**Addresses:** Bulk operations with progress/cancel (P1), site template management (P1), folder structure provisioning (P1). +**Avoids:** Operation cancellation (CancellationToken threading from Phase 1), partial-failure reporting (error surface from Phase 2), API throttling (retry handler from Phase 1). +**Research flag:** PnP Provisioning Engine for site templates may need specific research during planning — template schema and apply behavior are documented but edge cases (Teams-connected sites, modern vs. classic) need validation. + +### Phase 5: New Differentiating Features (v1.x) +**Rationale:** These three features are new capabilities (not existing-tool parity) that depend on stable v1 infrastructure. User access export across sites requires multi-site permissions scan from Phase 2. Storage charts require storage metrics from Phase 3. Plain-language permissions view is a presentation layer on top of the permissions data model from Phase 2. Grouping them as v1.x avoids blocking the v1 release on new development. +**Delivers:** User access export across arbitrary site subsets (cross-site access report for a single user), simplified plain-language permissions view (jargon-free labels, color coding), storage graph by file type (pie/bar toggle via ScottPlot.WPF). +**Addresses:** User access export (P2), simplified permissions view (P2), storage graph by file type (P2). +**Uses:** ScottPlot.WPF 5.1.57, existing PermissionsService and StorageService from Phases 2–3. +**Research flag:** User access export across sites involves enumerating group memberships, direct assignments, and inherited access across N sites — the Graph API volume and correct enumeration approach may need targeted research. 
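
The plain-language permissions view described for Phase 5 could start from a simple lookup like the sketch below. Only the `Contribute` to "can edit files" mapping comes from this document; the other labels are illustrative assumptions, and `PlainLanguagePermissions` is a hypothetical name. The keys are the English names of built-in SharePoint permission levels.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: maps permission level names to jargon-free labels.
public static class PlainLanguagePermissions
{
    private static readonly IReadOnlyDictionary<string, string> Labels =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            // Labels other than "Contribute" are assumptions for illustration.
            ["Full Control"] = "can manage everything, including permissions",
            ["Edit"] = "can edit files and manage lists",
            ["Contribute"] = "can edit files",
            ["Read"] = "can view and download files",
            ["View Only"] = "can view files in the browser",
        };

    // Falls back to the raw level name for custom or unknown permission levels.
    public static string Describe(string roleName) =>
        Labels.TryGetValue(roleName, out var label) ? label : roleName;
}
```

For example, `PlainLanguagePermissions.Describe("Contribute")` returns `"can edit files"`, while a custom level such as `"Site Auditors"` is returned unchanged, so nothing is hidden from the report when no label exists.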
+ +### Phase 6: Distribution and Hardening +**Rationale:** This phase covers packaging, end-to-end validation on clean machines, the FR locale completeness check, and the "looks done but isn't" checklist from PITFALLS.md. It must be done before any release, not as an afterthought. +**Delivers:** Single self-contained EXE (`PublishSingleFile=true`, `SelfContained=true`, `PublishTrimmed=false`, `win-x64`), validated on a machine with no .NET runtime, FR locale fully tested, throttling recovery verified, JSON corruption recovery verified, cancellation verified, 5,000+ item library tested. +**Avoids:** WPF trimming crash (Pitfall 10), "works on dev machine" surprises. +**Research flag:** Standard patterns — `dotnet publish` single-file configuration is well-documented. + +### Phase Ordering Rationale + +- **Foundation first** is mandatory: all 10 pitfalls map to Phase 1. The auth layer and async patterns are prerequisites for every subsequent phase. Starting features before the foundation is solid replicates the original app's architectural problems. +- **Permissions before storage/search** because permissions validates the pagination helper, auth layer, and export pipeline under real conditions with the most complex data model. +- **Bulk ops and provisioning after core read operations** because they have higher risk (they write to client tenants) and should be tested against a validated auth layer and error surface. +- **New v1.x features after v1 parity** to avoid blocking the release on non-parity features. The three P2 features are all presentation or cross-cutting enhancements on top of stable Phase 2–3 data models. +- **Distribution last** because EXE packaging must be validated against the complete feature set. 
+ +### Research Flags + +Phases likely needing `/gsd:research-phase` during planning: +- **Phase 3 (Duplicate detection):** Hash-based detection under Graph throttling constraints at large scale — specific pagination strategy and concurrency limits for file enumeration need validation. +- **Phase 4 (Site templates):** PnP Provisioning Engine behavior for Teams-connected sites, modern site template schema edge cases, and apply-template behavior on non-empty sites need verification. +- **Phase 5 (User access export):** Graph API approach for enumerating all permissions for a single user across N sites (group memberships + direct assignments + inherited) — the correct API sequence and volume implications need targeted research. + +Phases with standard patterns (skip research-phase): +- **Phase 1 (Foundation):** `Microsoft.Extensions.Hosting` + `CommunityToolkit.Mvvm` + `MsalCacheHelper` patterns are extensively documented in official Microsoft sources. +- **Phase 2 (Permissions):** PnP Framework permission scanning APIs are HIGH confidence from official PnP documentation. +- **Phase 6 (Distribution):** `dotnet publish` single-file configuration is straightforward and well-documented. 
+ +## Confidence Assessment + +| Area | Confidence | Notes | +|------|------------|-------| +| Stack | HIGH | All package versions verified on NuGet; .NET lifecycle dates confirmed on Microsoft support policy page; PnP.Framework vs PnP.Core SDK choice verified against authoritative GitHub issue | +| Features | MEDIUM | Microsoft docs (permissions reports, storage reports, Graph API) are HIGH; competitor feature analysis from marketing pages is MEDIUM; no direct API testing performed | +| Architecture | HIGH | MVVM patterns from Microsoft Learn (official); PnP Framework auth patterns from official PnP docs; `MsalCacheHelper` from official MSAL.NET docs | +| Pitfalls | HIGH | Critical pitfalls verified via official docs, PnP GitHub issues, and direct audit of the existing codebase (CONCERNS.md); async deadlock and WPF trimming pitfalls confirmed via dotnet/wpf GitHub issues | + +**Overall confidence:** HIGH + +### Gaps to Address + +- **PnP Provisioning Engine for Teams-connected sites:** The behavior of `PnP.Framework`'s provisioning engine when applied to Teams-connected modern team sites (vs. classic or communication sites) is not fully documented. Validate during Phase 4 planning with a dedicated research spike. +- **User cross-site access enumeration via Graph API:** The correct Graph API sequence for "all permissions for user X across N sites" (covering group memberships, direct site assignments, and SharePoint group memberships) has multiple possible approaches with different throttling profiles. Validate the most efficient approach during Phase 5 planning. +- **Graph API volume for duplicate detection:** Enumerating file hashes across a large tenant (100k+ files) via `driveItem` Graph calls has unclear throttling limits at that scale. Two questions need validation: the practical concurrency limit, and whether SHA256 computation must happen client-side. 
+- **ScottPlot.WPF XAML integration:** ScottPlot 5.x WPF XAML control integration patterns are less documented than the WinForms equivalent. Validate the `WpfPlot` control binding approach during Phase 5 planning. + +## Sources + +### Primary (HIGH confidence) +- Microsoft Learn: MSAL token cache serialization — https://learn.microsoft.com/en-us/entra/msal/dotnet/how-to/token-cache-serialization +- Microsoft Learn: Single-file publishing overview — https://learn.microsoft.com/en-us/dotnet/core/deploying/single-file/overview +- Microsoft Learn: AsyncRelayCommand (CommunityToolkit.Mvvm) — https://learn.microsoft.com/en-us/dotnet/communitytoolkit/mvvm/asyncrelaycommand +- Microsoft Learn: SharePoint Online list view threshold — https://learn.microsoft.com/en-us/troubleshoot/sharepoint/lists-and-libraries/items-exceeds-list-view-threshold +- Microsoft Learn: Graph SDK paging — https://learn.microsoft.com/en-us/graph/sdks/paging +- Microsoft Learn: Graph throttling guidance — https://learn.microsoft.com/en-us/graph/throttling +- PnP Framework GitHub: https://github.com/pnp/pnpframework — .NET targets, auth patterns +- PnP Framework vs Core authoritative comparison: https://github.com/pnp/pnpframework/issues/620 +- PnP Framework auth issues: https://github.com/pnp/pnpframework/issues/961, /447 +- dotnet/wpf trimming issues: https://github.com/dotnet/wpf/issues/4216, /6096 +- .NET 10 announcement: https://devblogs.microsoft.com/dotnet/announcing-dotnet-10/ +- .NET support policy: https://dotnet.microsoft.com/en-us/platform/support/policy/dotnet-core +- CommunityToolkit 8.4 announcement: https://devblogs.microsoft.com/dotnet/announcing-the-dotnet-community-toolkit-840/ +- Existing codebase CONCERNS.md audit (2026-04-02) + +### Secondary (MEDIUM confidence) +- ShareGate SharePoint audit tool feature page — https://sharegate.com/sharepoint-audit-tool +- ManageEngine SharePoint Manager Plus — 
https://www.manageengine.com/sharepoint-management-reporting/sharepoint-permission-auditing-tool.html +- AdminDroid SharePoint Online auditing — https://admindroid.com/microsoft-365-sharepoint-online-auditing +- sprobot.io: 9 must-have features for SharePoint storage reporting — https://www.sprobot.io/blog/how-to-choose-the-right-sharepoint-storage-reporting-tool-9-must-have-features +- WPF Development Best Practices 2024 — https://medium.com/mesciusinc/wpf-development-best-practices-for-2024-9e5062c71350 +- Rick Strahl: Async and Async Void Event Handling in WPF — https://weblog.west-wind.com/posts/2022/Apr/22/Async-and-Async-Void-Event-Handling-in-WPF + +### Tertiary (LOW confidence) +- NuGet: ScottPlot.WPF XAML control documentation — sparse; WpfPlot binding patterns need hands-on validation + +--- +*Research completed: 2026-04-02* +*Ready for roadmap: yes*