Files
Sharepoint-Toolbox/.planning/phases/15-consolidation-data-model/15-RESEARCH.md
2026-04-09 11:32:23 +02:00

22 KiB

Phase 15: Consolidation Data Model - Research

Researched: 2026-04-09 Domain: C# data modeling, LINQ grouping, pure data transformation Confidence: HIGH

Summary

Phase 15 creates a ConsolidatedPermissionEntry model and a PermissionConsolidator service that merges flat UserAccessEntry rows into consolidated rows grouped by a composite key (UserLogin + PermissionLevel + AccessType + GrantedThrough). This is a pure data transformation with zero API calls, making it fully testable in isolation.

The codebase already has an established pattern for composite-key grouping in DuplicatesService.MakeKey() -- a static method that builds a pipe-delimited string key from selected fields, used with LINQ GroupBy. The consolidator should follow this exact pattern. The project uses C# records extensively for immutable data models, xUnit + Moq for testing, and organizes code into Core/Models, Core/Helpers, and Services directories.

Primary recommendation: Create LocationInfo record in Core/Models, ConsolidatedPermissionEntry record in Core/Models, and a static PermissionConsolidator class in Core/Helpers following the MakeKey() + LINQ GroupBy pattern from DuplicatesService.

<user_constraints>

User Constraints (from CONTEXT.md)

Locked Decisions

Decision Value
Consolidation scope User access audit report only -- site-centric permission report is unchanged
Source model UserAccessEntry (already normalized, one user per row)
Consolidation is opt-in Defaults to OFF; toggle wired in Phase 16
No API calls Pure data transformation -- no Graph or CSOM calls
Existing exports unchanged When consolidation is not applied, output is identical to pre-v2.3

Discussed Areas (Locked)

  • Consolidation key: merge when UserLogin + PermissionLevel + AccessType + GrantedThrough all match
  • Merged locations: List<LocationInfo> with LocationCount convenience property
  • LocationInfo is a lightweight record: { string SiteUrl, string SiteTitle, string ObjectTitle, string ObjectUrl, string ObjectType }
  • ConsolidatedPermissionEntry holds all key fields plus List<LocationInfo>
  • Presentation decisions deferred to Phase 16

Deferred Ideas (OUT OF SCOPE)

  • Consolidation toggle UI (Phase 16)
  • Consolidated view rendering in HTML exports (Phase 16)
  • Group expansion within consolidated rows (Phase 17)
  • Consolidation in CSV exports (out of scope per REQUIREMENTS.md)
  • "Same permission set across sites" consolidation for site-centric report (not planned) </user_constraints>

<phase_requirements>

Phase Requirements

ID Description Research Support
RPT-04 Consolidated reports merge rows for the same user with identical access levels across multiple locations into a single row PermissionConsolidator service with MakeKey() composite key grouping over UserAccessEntry list, producing ConsolidatedPermissionEntry with List<LocationInfo>
</phase_requirements>

Standard Stack

Core

Library Version Purpose Why Standard
.NET 10.0 net10.0-windows Target framework Project already targets this
C# records N/A Immutable model types Project pattern -- UserAccessEntry, PermissionEntry, SiteInfo all use positional records
LINQ N/A GroupBy for consolidation Project pattern -- DuplicatesService uses GroupBy(item => MakeKey(...))

Testing

Library Version Purpose Why Standard
xUnit 2.9.3 Test framework Already in test project
Moq 4.20.72 Mocking Already in test project (not needed for pure logic tests)
Microsoft.NET.Test.Sdk 17.14.1 Test runner Already in test project

Installation: No new packages needed. All required libraries are already present.

Architecture Patterns

SharepointToolbox/
  Core/
    Models/
      LocationInfo.cs              # NEW - lightweight record for merged location data
      ConsolidatedPermissionEntry.cs  # NEW - consolidated row model
    Helpers/
      PermissionConsolidator.cs    # NEW - static consolidation logic
  Services/
    (no changes in Phase 15)

SharepointToolbox.Tests/
  Helpers/
    PermissionConsolidatorTests.cs  # NEW - pure logic tests

Pattern 1: Positional Record for Immutable Models

What: All data models in this project use C# positional records (constructor-defined properties). When to use: For all new model types. Example (from existing code):

// Source: SharepointToolbox/Core/Models/UserAccessEntry.cs
public record UserAccessEntry(
    string UserDisplayName,
    string UserLogin,
    string SiteUrl,
    string SiteTitle,
    string ObjectType,
    string ObjectTitle,
    string ObjectUrl,
    string PermissionLevel,
    AccessType AccessType,
    string GrantedThrough,
    bool IsHighPrivilege,
    bool IsExternalUser
);

Pattern 2: MakeKey Composite Key Grouping

What: Static method builds a pipe-delimited string key from fields, used with LINQ GroupBy to find duplicates/consolidations. When to use: When grouping rows by a multi-field composite key. Example (from existing code):

// Source: SharepointToolbox/Services/DuplicatesService.cs (lines 217-226)
internal static string MakeKey(DuplicateItem item, DuplicateScanOptions opts)
{
    var parts = new List<string> { item.Name.ToLowerInvariant() };
    if (opts.MatchSize && item.SizeBytes.HasValue) parts.Add(item.SizeBytes.Value.ToString());
    // ... more optional parts
    return string.Join("|", parts);
}

// Usage (lines 36-47):
var groups = allItems
    .GroupBy(item => MakeKey(item, options))
    .Where(g => g.Count() >= 2)
    .Select(g => new DuplicateGroup { ... })
    .ToList();

Pattern 3: Group Result Model

What: A class/record holding the group key plus a list of grouped items. When to use: For storing grouped results. Example (from existing code):

// Source: SharepointToolbox/Core/Models/DuplicateGroup.cs
public class DuplicateGroup
{
    public string GroupKey { get; set; } = string.Empty;
    public string Name { get; set; } = string.Empty;
    public List<DuplicateItem> Items { get; set; } = new();
}

Anti-Patterns to Avoid

  • Mutable consolidation state: Do not use mutable dictionaries that accumulate state across calls. Use LINQ GroupBy + Select for a single-pass functional transformation.
  • Modifying UserAccessEntry: The source model is a record and must remain unchanged. Consolidation produces a NEW type, not a modified version.
  • Case-sensitive key matching: UserLogin should use case-insensitive comparison in the key (following DuplicatesService pattern of .ToLowerInvariant()). PermissionLevel and GrantedThrough should also be case-insensitive. AccessType is an enum so comparison is already exact.

Don't Hand-Roll

Problem Don't Build Use Instead Why
Composite key hashing Custom hash function Pipe-delimited string via MakeKey() Matches existing codebase pattern, readable, debuggable
Case-insensitive grouping Custom IEqualityComparer ToLowerInvariant() in key parts Simpler, matches DuplicatesService pattern
Immutable models Classes with manual equality C# record types Project convention, value equality built-in

Common Pitfalls

Pitfall 1: Case Sensitivity in Consolidation Key

What goes wrong: "Full Control" vs "full control" or "alice@contoso.com" vs "Alice@contoso.com" produce different keys, breaking consolidation. Why it happens: String fields from SharePoint are not consistently cased. How to avoid: Use .ToLowerInvariant() on UserLogin, PermissionLevel, and GrantedThrough when building the key (not on the stored values). Warning signs: Tests pass with carefully-cased data but fail with real-world data.

Pitfall 2: UserDisplayName Varies Across Sites

What goes wrong: Same user (alice@contoso.com) may have UserDisplayName = "Alice Smith" on one site and "Alice M. Smith" on another. Why it happens: Display names are site-local metadata, not identity-tied. How to avoid: UserDisplayName is NOT part of the consolidation key (keyed on UserLogin). For the consolidated entry, pick the first occurrence or most common display name. Warning signs: Same user appearing as two separate consolidated rows.

Pitfall 3: IsHighPrivilege and IsExternalUser Across Merged Rows

What goes wrong: If a user has "Full Control" (IsHighPrivilege=true) at 3 sites, the consolidated row must preserve IsHighPrivilege=true. These boolean flags should be consistent across merged rows (since the key includes PermissionLevel), but worth asserting. Why it happens: All rows in a group share the same PermissionLevel, so IsHighPrivilege will be identical. IsExternalUser is login-based, also consistent. How to avoid: Take the value from the first entry in the group (they should all match). Add a debug assertion or test to verify consistency.

Pitfall 4: Empty Input Handling

What goes wrong: Consolidator crashes or returns unexpected results on empty list. Why it happens: LINQ operations on empty sequences can throw or return surprising defaults. How to avoid: Return empty list immediately for empty input. Test this case explicitly.

Pitfall 5: LocationInfo Field Completeness

What goes wrong: A location is missing ObjectTitle or ObjectUrl because the original entry had empty strings. Why it happens: Some SharePoint objects (like site collections) may not populate all fields. How to avoid: Pass through whatever the source entry has -- empty strings are valid. Do not filter or skip locations with missing fields.

Code Examples

New Model: LocationInfo

// File: SharepointToolbox/Core/Models/LocationInfo.cs
namespace SharepointToolbox.Core.Models;

/// <summary>
/// Lightweight record for a single location within a consolidated permission entry.
/// Contains the site and object details from the original UserAccessEntry.
/// </summary>
public record LocationInfo(
    string SiteUrl,
    string SiteTitle,
    string ObjectTitle,
    string ObjectUrl,
    string ObjectType
);

New Model: ConsolidatedPermissionEntry

// File: SharepointToolbox/Core/Models/ConsolidatedPermissionEntry.cs
namespace SharepointToolbox.Core.Models;

/// <summary>
/// A merged permission entry representing one user's identical access
/// across multiple locations. Key fields: UserLogin + PermissionLevel +
/// AccessType + GrantedThrough. Location-varying fields collected in Locations list.
/// </summary>
public record ConsolidatedPermissionEntry(
    string UserDisplayName,
    string UserLogin,
    string PermissionLevel,
    AccessType AccessType,
    string GrantedThrough,
    bool IsHighPrivilege,
    bool IsExternalUser,
    IReadOnlyList<LocationInfo> Locations
)
{
    /// <summary>Convenience property for display and sorting.</summary>
    public int LocationCount => Locations.Count;
}

New Helper: PermissionConsolidator (MakeKey pattern)

// File: SharepointToolbox/Core/Helpers/PermissionConsolidator.cs
namespace SharepointToolbox.Core.Helpers;

using SharepointToolbox.Core.Models;

/// <summary>
/// Merges flat UserAccessEntry rows into consolidated entries where
/// rows with identical (UserLogin, PermissionLevel, AccessType, GrantedThrough)
/// are collapsed into a single entry with multiple locations.
/// </summary>
public static class PermissionConsolidator
{
    /// <summary>
    /// Builds a pipe-delimited composite key for consolidation grouping.
    /// All string parts are lowercased for case-insensitive matching.
    /// </summary>
    internal static string MakeKey(UserAccessEntry entry)
    {
        return string.Join("|",
            entry.UserLogin.ToLowerInvariant(),
            entry.PermissionLevel.ToLowerInvariant(),
            entry.AccessType.ToString(),
            entry.GrantedThrough.ToLowerInvariant());
    }

    /// <summary>
    /// Consolidates a flat list of UserAccessEntry into grouped entries.
    /// Each group shares the same composite key; location-specific fields
    /// are collected into a LocationInfo list.
    /// </summary>
    public static IReadOnlyList<ConsolidatedPermissionEntry> Consolidate(
        IReadOnlyList<UserAccessEntry> entries)
    {
        if (entries.Count == 0)
            return Array.Empty<ConsolidatedPermissionEntry>();

        return entries
            .GroupBy(e => MakeKey(e))
            .Select(g =>
            {
                var first = g.First();
                var locations = g.Select(e => new LocationInfo(
                    e.SiteUrl, e.SiteTitle, e.ObjectTitle, e.ObjectUrl, e.ObjectType
                )).ToList();

                return new ConsolidatedPermissionEntry(
                    UserDisplayName: first.UserDisplayName,
                    UserLogin: first.UserLogin,
                    PermissionLevel: first.PermissionLevel,
                    AccessType: first.AccessType,
                    GrantedThrough: first.GrantedThrough,
                    IsHighPrivilege: first.IsHighPrivilege,
                    IsExternalUser: first.IsExternalUser,
                    Locations: locations);
            })
            .OrderBy(c => c.UserLogin)
            .ThenBy(c => c.PermissionLevel)
            .ToList();
    }
}

Test Pattern (follows existing conventions)

// File: SharepointToolbox.Tests/Helpers/PermissionConsolidatorTests.cs
// Pattern follows DuplicatesServiceTests.cs and UserAccessAuditServiceTests.cs

namespace SharepointToolbox.Tests.Helpers;

using SharepointToolbox.Core.Helpers;
using SharepointToolbox.Core.Models;

public class PermissionConsolidatorTests
{
    private static UserAccessEntry MakeEntry(
        string userLogin = "alice@contoso.com",
        string displayName = "Alice",
        string siteUrl = "https://contoso.sharepoint.com",
        string siteTitle = "Contoso",
        string objectType = "List",
        string objectTitle = "Docs",
        string objectUrl = "https://contoso.sharepoint.com/Docs",
        string permissionLevel = "Read",
        AccessType accessType = AccessType.Direct,
        string grantedThrough = "Direct Permissions",
        bool isHighPrivilege = false,
        bool isExternalUser = false) =>
        new(displayName, userLogin, siteUrl, siteTitle, objectType,
            objectTitle, objectUrl, permissionLevel, accessType,
            grantedThrough, isHighPrivilege, isExternalUser);

    [Fact]
    public void Empty_input_returns_empty()
    {
        var result = PermissionConsolidator.Consolidate(Array.Empty<UserAccessEntry>());
        Assert.Empty(result);
    }

    [Fact]
    public void Single_entry_returns_single_consolidated_with_one_location()
    {
        var entries = new[] { MakeEntry() };
        var result = PermissionConsolidator.Consolidate(entries);
        Assert.Single(result);
        Assert.Equal(1, result[0].LocationCount);
    }

    // ... see Validation Architecture for full test map
}

UserAccessEntry Field Reference

Complete field list from SharepointToolbox/Core/Models/UserAccessEntry.cs:

Field Type In Key? In LocationInfo? In Consolidated? Notes
UserDisplayName string No No Yes (first occurrence) May vary across sites for same user
UserLogin string YES No Yes Identity key, case-insensitive
SiteUrl string No YES No (in Locations) Varies per merged row
SiteTitle string No YES No (in Locations) Varies per merged row
ObjectType string No YES No (in Locations) "Site Collection", "Site", "List", "Folder"
ObjectTitle string No YES No (in Locations) Varies per merged row
ObjectUrl string No YES No (in Locations) Varies per merged row
PermissionLevel string YES No Yes e.g. "Full Control", "Contribute"
AccessType AccessType enum YES No Yes Direct, Group, Inherited
GrantedThrough string YES No Yes e.g. "Direct Permissions"
IsHighPrivilege bool No No Yes Derived from PermissionLevel, consistent within group
IsExternalUser bool No No Yes Derived from UserLogin, consistent within group

State of the Art

Old Approach Current Approach When Changed Impact
Flat UserAccessEntry list only Flat + optional consolidated view Phase 15 (new) Adds consolidated model alongside existing flat model

Key design choice: The consolidator returns a NEW type (ConsolidatedPermissionEntry), not a modified UserAccessEntry. This means:

  • Existing code consuming IReadOnlyList<UserAccessEntry> is completely unaffected
  • Phase 16 will need to handle both types (or convert) in the HTML export
  • The opt-in pattern means the consolidator is only called when the user enables it

Open Questions

  1. Ordering of consolidated results

    • What we know: Original entries come back in site-scan order. Consolidated entries need an explicit sort.
    • What's unclear: Should sorting be by UserLogin, then PermissionLevel? Or follow some other order?
    • Recommendation: Default to UserLogin then PermissionLevel (alphabetical). Phase 16 can add sort controls.
  2. Location ordering within a consolidated entry

    • What we know: Locations come from different sites scanned in order.
    • What's unclear: Should locations be sorted by SiteTitle? SiteUrl?
    • Recommendation: Preserve insertion order (scan order) for now. Phase 16 can sort in presentation.

Validation Architecture

Test Framework

Property Value
Framework xUnit 2.9.3
Config file SharepointToolbox.Tests/SharepointToolbox.Tests.csproj (implicit xUnit config via SDK)
Quick run command dotnet test SharepointToolbox.Tests --filter "FullyQualifiedName~PermissionConsolidator" --no-build -v q
Full suite command dotnet test SharepointToolbox.Tests -v q

Phase Requirements to Test Map

Req ID Behavior Test Type Automated Command File Exists?
RPT-04-a Empty input returns empty list unit dotnet test --filter "PermissionConsolidatorTests.Empty_input_returns_empty" -v q Wave 0
RPT-04-b Single entry produces 1 consolidated row with 1 location unit dotnet test --filter "PermissionConsolidatorTests.Single_entry" -v q Wave 0
RPT-04-c Same key across 3 sites merges into 1 row with 3 locations unit dotnet test --filter "PermissionConsolidatorTests.*merges*" -v q Wave 0
RPT-04-d Different keys remain separate rows unit dotnet test --filter "PermissionConsolidatorTests.*separate*" -v q Wave 0
RPT-04-e Case-insensitive key matching unit dotnet test --filter "PermissionConsolidatorTests.*case*" -v q Wave 0
RPT-04-f MakeKey produces expected pipe-delimited format unit dotnet test --filter "PermissionConsolidatorTests.MakeKey*" -v q Wave 0
RPT-04-g 10-row input with 3 duplicate pairs produces 7 consolidated rows unit dotnet test --filter "PermissionConsolidatorTests.*ten_row*" -v q Wave 0
RPT-04-h LocationCount matches Locations.Count unit dotnet test --filter "PermissionConsolidatorTests.*LocationCount*" -v q Wave 0
RPT-04-i IsHighPrivilege and IsExternalUser preserved correctly unit dotnet test --filter "PermissionConsolidatorTests.*flags*" -v q Wave 0
RPT-04-j Existing exports compile unchanged (build verification) smoke dotnet build SharepointToolbox -v q Existing

Sampling Rate

  • Per task commit: dotnet test SharepointToolbox.Tests --filter "FullyQualifiedName~PermissionConsolidator" --no-build -v q
  • Per wave merge: dotnet test SharepointToolbox.Tests -v q
  • Phase gate: Full suite green before /gsd:verify-work

Wave 0 Gaps

  • SharepointToolbox.Tests/Helpers/PermissionConsolidatorTests.cs -- covers RPT-04 (all sub-tests)
  • No framework install needed -- xUnit already configured
  • No new packages needed

Sources

Primary (HIGH confidence)

  • SharepointToolbox/Core/Models/UserAccessEntry.cs -- complete field list of source model (12 fields)
  • SharepointToolbox/Services/DuplicatesService.cs -- MakeKey() pattern (lines 217-226) and GroupBy usage (lines 36-47)
  • SharepointToolbox/Core/Models/DuplicateGroup.cs -- group result model pattern
  • SharepointToolbox/Services/UserAccessAuditService.cs -- producer of UserAccessEntry list, shows transformation flow
  • SharepointToolbox/Services/Export/UserAccessHtmlExportService.cs -- downstream consumer, accepts IReadOnlyList<UserAccessEntry>
  • SharepointToolbox.Tests/Services/DuplicatesServiceTests.cs -- test pattern for MakeKey() logic
  • SharepointToolbox.Tests/Services/UserAccessAuditServiceTests.cs -- test conventions (helper factories, naming, xUnit patterns)
  • .planning/phases/15-consolidation-data-model/15-CONTEXT.md -- locked decisions

Secondary (MEDIUM confidence)

  • .planning/REQUIREMENTS.md -- RPT-04 requirement definition
  • .planning/ROADMAP.md -- Phase 15 success criteria (10-row test, 7-row output)

Metadata

Confidence breakdown:

  • Standard stack: HIGH -- all libraries already in project, no new dependencies
  • Architecture: HIGH -- directly following established MakeKey() + GroupBy pattern from DuplicatesService
  • Pitfalls: HIGH -- identified from analyzing actual field semantics in UserAccessEntry and UserAccessAuditService
  • Test patterns: HIGH -- derived from reading actual test files in the project

Research date: 2026-04-09 Valid until: 2026-05-09 (stable -- pure data transformation, no external dependencies)