docs(15): research phase domain for consolidation data model

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Dev
2026-04-09 11:32:23 +02:00
parent e3ff27a673
commit 9031fd3473

View File

@@ -0,0 +1,447 @@
# Phase 15: Consolidation Data Model - Research
**Researched:** 2026-04-09
**Domain:** C# data modeling, LINQ grouping, pure data transformation
**Confidence:** HIGH
## Summary
Phase 15 creates a `ConsolidatedPermissionEntry` model and a `PermissionConsolidator` service that merges flat `UserAccessEntry` rows into consolidated rows grouped by a composite key (`UserLogin` + `PermissionLevel` + `AccessType` + `GrantedThrough`). This is a pure data transformation with zero API calls, making it fully testable in isolation.
The codebase already has an established pattern for composite-key grouping in `DuplicatesService.MakeKey()` -- a static method that builds a pipe-delimited string key from selected fields, used with LINQ `GroupBy`. The consolidator should follow this exact pattern. The project uses C# records extensively for immutable data models, xUnit + Moq for testing, and organizes code into `Core/Models`, `Core/Helpers`, and `Services` directories.
**Primary recommendation:** Create `LocationInfo` record in `Core/Models`, `ConsolidatedPermissionEntry` record in `Core/Models`, and a static `PermissionConsolidator` class in `Core/Helpers` following the `MakeKey()` + LINQ `GroupBy` pattern from `DuplicatesService`.
<user_constraints>
## User Constraints (from CONTEXT.md)
### Locked Decisions
| Decision | Value |
|---|---|
| Consolidation scope | User access audit report only -- site-centric permission report is unchanged |
| Source model | `UserAccessEntry` (already normalized, one user per row) |
| Consolidation is opt-in | Defaults to OFF; toggle wired in Phase 16 |
| No API calls | Pure data transformation -- no Graph or CSOM calls |
| Existing exports unchanged | When consolidation is not applied, output is identical to pre-v2.3 |
### Discussed Areas (Locked)
- Consolidation key: merge when `UserLogin` + `PermissionLevel` + `AccessType` + `GrantedThrough` all match
- Merged locations: `List<LocationInfo>` with `LocationCount` convenience property
- `LocationInfo` is a lightweight record: `{ string SiteUrl, string SiteTitle, string ObjectTitle, string ObjectUrl, string ObjectType }`
- `ConsolidatedPermissionEntry` holds all key fields plus `List<LocationInfo>`
- Presentation decisions deferred to Phase 16
### Deferred Ideas (OUT OF SCOPE)
- Consolidation toggle UI (Phase 16)
- Consolidated view rendering in HTML exports (Phase 16)
- Group expansion within consolidated rows (Phase 17)
- Consolidation in CSV exports (out of scope per REQUIREMENTS.md)
- "Same permission set across sites" consolidation for site-centric report (not planned)
</user_constraints>
<phase_requirements>
## Phase Requirements
| ID | Description | Research Support |
|----|-------------|-----------------|
| RPT-04 | Consolidated reports merge rows for the same user with identical access levels across multiple locations into a single row | `PermissionConsolidator` service with `MakeKey()` composite key grouping over `UserAccessEntry` list, producing `ConsolidatedPermissionEntry` with `List<LocationInfo>` |
</phase_requirements>
## Standard Stack
### Core
| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| .NET 10.0 | net10.0-windows | Target framework | Project already targets this |
| C# records | N/A | Immutable model types | Project pattern -- `UserAccessEntry`, `PermissionEntry`, `SiteInfo` all use positional records |
| LINQ | N/A | GroupBy for consolidation | Project pattern -- `DuplicatesService` uses `GroupBy(item => MakeKey(...))` |
### Testing
| Library | Version | Purpose | Why Standard |
|---------|---------|---------|-------------|
| xUnit | 2.9.3 | Test framework | Already in test project |
| Moq | 4.20.72 | Mocking | Already in test project (not needed for pure logic tests) |
| Microsoft.NET.Test.Sdk | 17.14.1 | Test runner | Already in test project |
**Installation:** No new packages needed. All required libraries are already present.
## Architecture Patterns
### Recommended Project Structure
```
SharepointToolbox/
Core/
Models/
LocationInfo.cs # NEW - lightweight record for merged location data
ConsolidatedPermissionEntry.cs # NEW - consolidated row model
Helpers/
PermissionConsolidator.cs # NEW - static consolidation logic
Services/
(no changes in Phase 15)
SharepointToolbox.Tests/
Helpers/
PermissionConsolidatorTests.cs # NEW - pure logic tests
```
### Pattern 1: Positional Record for Immutable Models
**What:** All data models in this project use C# positional records (constructor-defined properties).
**When to use:** For all new model types.
**Example (from existing code):**
```csharp
// Source: SharepointToolbox/Core/Models/UserAccessEntry.cs
public record UserAccessEntry(
string UserDisplayName,
string UserLogin,
string SiteUrl,
string SiteTitle,
string ObjectType,
string ObjectTitle,
string ObjectUrl,
string PermissionLevel,
AccessType AccessType,
string GrantedThrough,
bool IsHighPrivilege,
bool IsExternalUser
);
```
### Pattern 2: MakeKey Composite Key Grouping
**What:** Static method builds a pipe-delimited string key from fields, used with LINQ `GroupBy` to find duplicates/consolidations.
**When to use:** When grouping rows by a multi-field composite key.
**Example (from existing code):**
```csharp
// Source: SharepointToolbox/Services/DuplicatesService.cs (lines 217-226)
internal static string MakeKey(DuplicateItem item, DuplicateScanOptions opts)
{
var parts = new List<string> { item.Name.ToLowerInvariant() };
if (opts.MatchSize && item.SizeBytes.HasValue) parts.Add(item.SizeBytes.Value.ToString());
// ... more optional parts
return string.Join("|", parts);
}
// Usage (lines 36-47):
var groups = allItems
.GroupBy(item => MakeKey(item, options))
.Where(g => g.Count() >= 2)
.Select(g => new DuplicateGroup { ... })
.ToList();
```
### Pattern 3: Group Result Model
**What:** A class/record holding the group key plus a list of grouped items.
**When to use:** For storing grouped results.
**Example (from existing code):**
```csharp
// Source: SharepointToolbox/Core/Models/DuplicateGroup.cs
public class DuplicateGroup
{
public string GroupKey { get; set; } = string.Empty;
public string Name { get; set; } = string.Empty;
public List<DuplicateItem> Items { get; set; } = new();
}
```
### Anti-Patterns to Avoid
- **Mutable consolidation state:** Do not use mutable dictionaries that accumulate state across calls. Use LINQ `GroupBy` + `Select` for a single-pass functional transformation.
- **Modifying UserAccessEntry:** The source model is a record and must remain unchanged. Consolidation produces a NEW type, not a modified version.
- **Case-sensitive key matching:** `UserLogin` should use case-insensitive comparison in the key (following `DuplicatesService` pattern of `.ToLowerInvariant()`). `PermissionLevel` and `GrantedThrough` should also be case-insensitive. `AccessType` is an enum so comparison is already exact.
## Don't Hand-Roll
| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| Composite key hashing | Custom hash function | Pipe-delimited string via `MakeKey()` | Matches existing codebase pattern, readable, debuggable |
| Case-insensitive grouping | Custom IEqualityComparer | `ToLowerInvariant()` in key parts | Simpler, matches `DuplicatesService` pattern |
| Immutable models | Classes with manual equality | C# `record` types | Project convention, value equality built-in |
## Common Pitfalls
### Pitfall 1: Case Sensitivity in Consolidation Key
**What goes wrong:** "Full Control" vs "full control" or "alice@contoso.com" vs "Alice@contoso.com" produce different keys, breaking consolidation.
**Why it happens:** String fields from SharePoint are not consistently cased.
**How to avoid:** Use `.ToLowerInvariant()` on `UserLogin`, `PermissionLevel`, and `GrantedThrough` when building the key (not on the stored values).
**Warning signs:** Tests pass with carefully-cased data but fail with real-world data.
### Pitfall 2: UserDisplayName Varies Across Sites
**What goes wrong:** Same user (`alice@contoso.com`) may have `UserDisplayName` = "Alice Smith" on one site and "Alice M. Smith" on another.
**Why it happens:** Display names are site-local metadata, not identity-tied.
**How to avoid:** `UserDisplayName` is NOT part of the consolidation key (keyed on `UserLogin`). For the consolidated entry, pick the first occurrence or most common display name.
**Warning signs:** Same user appearing as two separate consolidated rows.
### Pitfall 3: IsHighPrivilege and IsExternalUser Across Merged Rows
**What goes wrong:** If a user has "Full Control" (IsHighPrivilege=true) at 3 sites, the consolidated row must preserve `IsHighPrivilege=true`. These boolean flags should be consistent across merged rows (since the key includes PermissionLevel), but worth asserting.
**Why it happens:** All rows in a group share the same PermissionLevel, so IsHighPrivilege will be identical. IsExternalUser is login-based, also consistent.
**How to avoid:** Take the value from the first entry in the group (they should all match). Add a debug assertion or test to verify consistency.
### Pitfall 4: Empty Input Handling
**What goes wrong:** Consolidator crashes or returns unexpected results on empty list.
**Why it happens:** LINQ operations on empty sequences can throw or return surprising defaults.
**How to avoid:** Return empty list immediately for empty input. Test this case explicitly.
### Pitfall 5: LocationInfo Field Completeness
**What goes wrong:** A location is missing `ObjectTitle` or `ObjectUrl` because the original entry had empty strings.
**Why it happens:** Some SharePoint objects (like site collections) may not populate all fields.
**How to avoid:** Pass through whatever the source entry has -- empty strings are valid. Do not filter or skip locations with missing fields.
## Code Examples
### New Model: LocationInfo
```csharp
// File: SharepointToolbox/Core/Models/LocationInfo.cs
namespace SharepointToolbox.Core.Models;
/// <summary>
/// Lightweight record for a single location within a consolidated permission entry.
/// Contains the site and object details from the original UserAccessEntry.
/// </summary>
public record LocationInfo(
string SiteUrl,
string SiteTitle,
string ObjectTitle,
string ObjectUrl,
string ObjectType
);
```
### New Model: ConsolidatedPermissionEntry
```csharp
// File: SharepointToolbox/Core/Models/ConsolidatedPermissionEntry.cs
namespace SharepointToolbox.Core.Models;
/// <summary>
/// A merged permission entry representing one user's identical access
/// across multiple locations. Key fields: UserLogin + PermissionLevel +
/// AccessType + GrantedThrough. Location-varying fields collected in Locations list.
/// </summary>
public record ConsolidatedPermissionEntry(
string UserDisplayName,
string UserLogin,
string PermissionLevel,
AccessType AccessType,
string GrantedThrough,
bool IsHighPrivilege,
bool IsExternalUser,
IReadOnlyList<LocationInfo> Locations
)
{
/// <summary>Convenience property for display and sorting.</summary>
public int LocationCount => Locations.Count;
}
```
### New Helper: PermissionConsolidator (MakeKey pattern)
```csharp
// File: SharepointToolbox/Core/Helpers/PermissionConsolidator.cs
namespace SharepointToolbox.Core.Helpers;
using SharepointToolbox.Core.Models;
/// <summary>
/// Merges flat UserAccessEntry rows into consolidated entries where
/// rows with identical (UserLogin, PermissionLevel, AccessType, GrantedThrough)
/// are collapsed into a single entry with multiple locations.
/// </summary>
public static class PermissionConsolidator
{
/// <summary>
/// Builds a pipe-delimited composite key for consolidation grouping.
/// All string parts are lowercased for case-insensitive matching.
/// </summary>
internal static string MakeKey(UserAccessEntry entry)
{
return string.Join("|",
entry.UserLogin.ToLowerInvariant(),
entry.PermissionLevel.ToLowerInvariant(),
entry.AccessType.ToString(),
entry.GrantedThrough.ToLowerInvariant());
}
/// <summary>
/// Consolidates a flat list of UserAccessEntry into grouped entries.
/// Each group shares the same composite key; location-specific fields
/// are collected into a LocationInfo list.
/// </summary>
public static IReadOnlyList<ConsolidatedPermissionEntry> Consolidate(
IReadOnlyList<UserAccessEntry> entries)
{
if (entries.Count == 0)
return Array.Empty<ConsolidatedPermissionEntry>();
return entries
.GroupBy(e => MakeKey(e))
.Select(g =>
{
var first = g.First();
var locations = g.Select(e => new LocationInfo(
e.SiteUrl, e.SiteTitle, e.ObjectTitle, e.ObjectUrl, e.ObjectType
)).ToList();
return new ConsolidatedPermissionEntry(
UserDisplayName: first.UserDisplayName,
UserLogin: first.UserLogin,
PermissionLevel: first.PermissionLevel,
AccessType: first.AccessType,
GrantedThrough: first.GrantedThrough,
IsHighPrivilege: first.IsHighPrivilege,
IsExternalUser: first.IsExternalUser,
Locations: locations);
})
.OrderBy(c => c.UserLogin)
.ThenBy(c => c.PermissionLevel)
.ToList();
}
}
```
### Test Pattern (follows existing conventions)
```csharp
// File: SharepointToolbox.Tests/Helpers/PermissionConsolidatorTests.cs
// Pattern follows DuplicatesServiceTests.cs and UserAccessAuditServiceTests.cs
namespace SharepointToolbox.Tests.Helpers;
using SharepointToolbox.Core.Helpers;
using SharepointToolbox.Core.Models;
public class PermissionConsolidatorTests
{
private static UserAccessEntry MakeEntry(
string userLogin = "alice@contoso.com",
string displayName = "Alice",
string siteUrl = "https://contoso.sharepoint.com",
string siteTitle = "Contoso",
string objectType = "List",
string objectTitle = "Docs",
string objectUrl = "https://contoso.sharepoint.com/Docs",
string permissionLevel = "Read",
AccessType accessType = AccessType.Direct,
string grantedThrough = "Direct Permissions",
bool isHighPrivilege = false,
bool isExternalUser = false) =>
new(displayName, userLogin, siteUrl, siteTitle, objectType,
objectTitle, objectUrl, permissionLevel, accessType,
grantedThrough, isHighPrivilege, isExternalUser);
[Fact]
public void Empty_input_returns_empty()
{
var result = PermissionConsolidator.Consolidate(Array.Empty<UserAccessEntry>());
Assert.Empty(result);
}
[Fact]
public void Single_entry_returns_single_consolidated_with_one_location()
{
var entries = new[] { MakeEntry() };
var result = PermissionConsolidator.Consolidate(entries);
Assert.Single(result);
Assert.Equal(1, result[0].LocationCount);
}
// ... see Validation Architecture for full test map
}
```
## UserAccessEntry Field Reference
Complete field list from `SharepointToolbox/Core/Models/UserAccessEntry.cs`:
| Field | Type | In Key? | In LocationInfo? | In Consolidated? | Notes |
|-------|------|---------|-------------------|-------------------|-------|
| `UserDisplayName` | `string` | No | No | Yes (first occurrence) | May vary across sites for same user |
| `UserLogin` | `string` | **YES** | No | Yes | Identity key, case-insensitive |
| `SiteUrl` | `string` | No | **YES** | No (in Locations) | Varies per merged row |
| `SiteTitle` | `string` | No | **YES** | No (in Locations) | Varies per merged row |
| `ObjectType` | `string` | No | **YES** | No (in Locations) | "Site Collection", "Site", "List", "Folder" |
| `ObjectTitle` | `string` | No | **YES** | No (in Locations) | Varies per merged row |
| `ObjectUrl` | `string` | No | **YES** | No (in Locations) | Varies per merged row |
| `PermissionLevel` | `string` | **YES** | No | Yes | e.g. "Full Control", "Contribute" |
| `AccessType` | `AccessType` enum | **YES** | No | Yes | Direct, Group, Inherited |
| `GrantedThrough` | `string` | **YES** | No | Yes | e.g. "Direct Permissions" |
| `IsHighPrivilege` | `bool` | No | No | Yes | Derived from PermissionLevel, consistent within group |
| `IsExternalUser` | `bool` | No | No | Yes | Derived from UserLogin, consistent within group |
## State of the Art
| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| Flat `UserAccessEntry` list only | Flat + optional consolidated view | Phase 15 (new) | Adds consolidated model alongside existing flat model |
**Key design choice:** The consolidator returns a NEW type (`ConsolidatedPermissionEntry`), not a modified `UserAccessEntry`. This means:
- Existing code consuming `IReadOnlyList<UserAccessEntry>` is completely unaffected
- Phase 16 will need to handle both types (or convert) in the HTML export
- The opt-in pattern means the consolidator is only called when the user enables it
## Open Questions
1. **Ordering of consolidated results**
- What we know: Original entries come back in site-scan order. Consolidated entries need an explicit sort.
- What's unclear: Should sorting be by UserLogin, then PermissionLevel? Or follow some other order?
- Recommendation: Default to `UserLogin` then `PermissionLevel` (alphabetical). Phase 16 can add sort controls.
2. **Location ordering within a consolidated entry**
- What we know: Locations come from different sites scanned in order.
- What's unclear: Should locations be sorted by SiteTitle? SiteUrl?
- Recommendation: Preserve insertion order (scan order) for now. Phase 16 can sort in presentation.
## Validation Architecture
### Test Framework
| Property | Value |
|----------|-------|
| Framework | xUnit 2.9.3 |
| Config file | `SharepointToolbox.Tests/SharepointToolbox.Tests.csproj` (implicit xUnit config via SDK) |
| Quick run command | `dotnet test SharepointToolbox.Tests --filter "FullyQualifiedName~PermissionConsolidator" --no-build -v q` |
| Full suite command | `dotnet test SharepointToolbox.Tests -v q` |
### Phase Requirements to Test Map
| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|--------|----------|-----------|-------------------|-------------|
| RPT-04-a | Empty input returns empty list | unit | `dotnet test --filter "PermissionConsolidatorTests.Empty_input_returns_empty" -v q` | Wave 0 |
| RPT-04-b | Single entry produces 1 consolidated row with 1 location | unit | `dotnet test --filter "PermissionConsolidatorTests.Single_entry" -v q` | Wave 0 |
| RPT-04-c | Same key across 3 sites merges into 1 row with 3 locations | unit | `dotnet test --filter "PermissionConsolidatorTests.*merges*" -v q` | Wave 0 |
| RPT-04-d | Different keys remain separate rows | unit | `dotnet test --filter "PermissionConsolidatorTests.*separate*" -v q` | Wave 0 |
| RPT-04-e | Case-insensitive key matching | unit | `dotnet test --filter "PermissionConsolidatorTests.*case*" -v q` | Wave 0 |
| RPT-04-f | MakeKey produces expected pipe-delimited format | unit | `dotnet test --filter "PermissionConsolidatorTests.MakeKey*" -v q` | Wave 0 |
| RPT-04-g | 10-row input with 3 duplicate pairs produces 7 consolidated rows | unit | `dotnet test --filter "PermissionConsolidatorTests.*ten_row*" -v q` | Wave 0 |
| RPT-04-h | LocationCount matches Locations.Count | unit | `dotnet test --filter "PermissionConsolidatorTests.*LocationCount*" -v q` | Wave 0 |
| RPT-04-i | IsHighPrivilege and IsExternalUser preserved correctly | unit | `dotnet test --filter "PermissionConsolidatorTests.*flags*" -v q` | Wave 0 |
| RPT-04-j | Existing exports compile unchanged (build verification) | smoke | `dotnet build SharepointToolbox -v q` | Existing |
### Sampling Rate
- **Per task commit:** `dotnet test SharepointToolbox.Tests --filter "FullyQualifiedName~PermissionConsolidator" --no-build -v q`
- **Per wave merge:** `dotnet test SharepointToolbox.Tests -v q`
- **Phase gate:** Full suite green before `/gsd:verify-work`
### Wave 0 Gaps
- [ ] `SharepointToolbox.Tests/Helpers/PermissionConsolidatorTests.cs` -- covers RPT-04 (all sub-tests)
- No framework install needed -- xUnit already configured
- No new packages needed
## Sources
### Primary (HIGH confidence)
- `SharepointToolbox/Core/Models/UserAccessEntry.cs` -- complete field list of source model (12 fields)
- `SharepointToolbox/Services/DuplicatesService.cs` -- `MakeKey()` pattern (lines 217-226) and `GroupBy` usage (lines 36-47)
- `SharepointToolbox/Core/Models/DuplicateGroup.cs` -- group result model pattern
- `SharepointToolbox/Services/UserAccessAuditService.cs` -- producer of `UserAccessEntry` list, shows transformation flow
- `SharepointToolbox/Services/Export/UserAccessHtmlExportService.cs` -- downstream consumer, accepts `IReadOnlyList<UserAccessEntry>`
- `SharepointToolbox.Tests/Services/DuplicatesServiceTests.cs` -- test pattern for `MakeKey()` logic
- `SharepointToolbox.Tests/Services/UserAccessAuditServiceTests.cs` -- test conventions (helper factories, naming, xUnit patterns)
- `.planning/phases/15-consolidation-data-model/15-CONTEXT.md` -- locked decisions
### Secondary (MEDIUM confidence)
- `.planning/REQUIREMENTS.md` -- RPT-04 requirement definition
- `.planning/ROADMAP.md` -- Phase 15 success criteria (10-row test, 7-row output)
## Metadata
**Confidence breakdown:**
- Standard stack: HIGH -- all libraries already in project, no new dependencies
- Architecture: HIGH -- directly following established `MakeKey()` + `GroupBy` pattern from `DuplicatesService`
- Pitfalls: HIGH -- identified from analyzing actual field semantics in `UserAccessEntry` and `UserAccessAuditService`
- Test patterns: HIGH -- derived from reading actual test files in the project
**Research date:** 2026-04-09
**Valid until:** 2026-05-09 (stable -- pure data transformation, no external dependencies)