---
phase: 03-storage
plan: "04"
subsystem: search
tags: [csom, sharepoint-search, kql, duplicates, pagination]

# Dependency graph
requires:
  - phase: 03-01
    provides: ISearchService, IDuplicatesService, SearchOptions, DuplicateScanOptions, SearchResult, DuplicateItem, DuplicateGroup, OperationProgress models and interfaces

provides:
  - SearchService: KQL-based file search with 500-row pagination and 50,000-item hard cap
  - DuplicatesService: file duplicates via Search API + folder duplicates via CAML FSObjType=1
  - MakeKey composite key logic for grouping duplicates by name+size+dates+counts

affects: [03-05, 03-07, 03-08]

# Tech tracking
tech-stack:
  added: []
  patterns:
    - "KeywordQuery + SearchExecutor pattern: executor.ExecuteQuery(kq) registers query, then ExecuteQueryRetryHelper.ExecuteQueryRetryAsync executes it"
    - "StringCollection.Add loop: SelectProperties is StringCollection, not List<string>; must add properties one-by-one"
    - "StartRow pagination: += BatchSize per iteration, hard stop at MaxStartRow (50,000)"
    - "goto done pattern for early exit from nested pagination loop when MaxResults reached"

key-files:
  created:
    - SharepointToolbox/Services/SearchService.cs
    - SharepointToolbox/Services/DuplicatesService.cs
  modified: []

key-decisions:
  - "SearchService uses SelectProperties.Add per-item loop; StringCollection has no AddRange(string[]) overload in this SDK version"
  - "DuplicatesService.MakeKey internal static method matches inline test helper in DuplicatesServiceTests exactly; deliberate design to ensure test parity"
  - "DuplicatesService file mode re-implements pagination inline (not delegating to SearchService); avoids coupling between services with different result models"

patterns-established:
  - "KQL SelectProperties: Add each property in a foreach loop, never AddRange with array"
  - "Search pagination: do/while with startRow <= MaxStartRow guard, break on empty table"
  - "Folder CAML: FSObjType=1 (not FileSystemObjectType); wrong name returns zero results"

requirements-completed: [SRCH-01, SRCH-02, DUPL-01, DUPL-02]

# Metrics
duration: 2min
completed: 2026-04-02
---

# Phase 03 Plan 04: SearchService and DuplicatesService Summary

**KQL file search with 500-row StartRow pagination (50k cap) and composite-key duplicate detection for files (Search API) and folders (CAML FSObjType=1)**
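
The StartRow pagination named above can be illustrated without a live tenant. What follows is a minimal sketch, not the code in `SearchService.cs`: `fetchPage` is a hypothetical stand-in for one KeywordQuery/SearchExecutor round trip, and `PaginationSketch`, `FetchAll`, and `maxResults` are names invented here. Only the 500-row batch size, the 50,000 start-row cap, and the break-on-empty rule come from this summary.

```csharp
using System;
using System.Collections.Generic;

public static class PaginationSketch
{
    public const int BatchSize = 500;       // rows requested per query (from the summary)
    public const int MaxStartRow = 50_000;  // hard stop for StartRow (from the summary)

    // fetchPage(startRow, rowLimit) is a hypothetical stand-in for one search round trip.
    public static List<T> FetchAll<T>(Func<int, int, IReadOnlyList<T>> fetchPage, int maxResults)
    {
        var results = new List<T>();
        int startRow = 0;
        do
        {
            IReadOnlyList<T> page = fetchPage(startRow, BatchSize);
            if (page.Count == 0) break; // empty table: nothing left to read

            foreach (T row in page)
            {
                results.Add(row);
                // The summary mentions a `goto done` exit from the nested loop;
                // returning from this helper has the same effect in the sketch.
                if (results.Count >= maxResults) return results;
            }

            startRow += BatchSize;
        } while (startRow <= MaxStartRow);

        return results;
    }
}
```

Keeping the round trip behind a delegate is a choice made for this sketch only, so the loop can be exercised with an in-memory list; the summary states that both services implement the loop inline.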

## Performance

- **Duration:** 2 min
- **Started:** 2026-04-02T14:09:25Z
- **Completed:** 2026-04-02T14:12:09Z
- **Tasks:** 2
- **Files modified:** 2 (both newly created)

## Accomplishments

- SearchService implements a full KQL builder (extension, date range, creator, editor, library filters) with paginated retrieval of up to 50,000 items
- DuplicatesService supports both file mode (Search API) and folder mode (CAML FSObjType=1) with client-side composite-key grouping
- MakeKey logic matches the inline test scaffold from Plan 03-01 DuplicatesServiceTests; all 5 pure-logic tests pass
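
The folder-mode query relies on the internal field name `FSObjType`, where the value 1 selects folders. The following is a hedged sketch of what such a CAML view could look like. `FolderCamlSketch`, `BuildFolderViewXml`, and the `rowLimit` parameter are hypothetical, and the recursive scope and paging are assumptions, since the summary does not describe the rest of the query.

```csharp
using System;

public static class FolderCamlSketch
{
    // Builds ViewXml for a recursive, paged query that returns folders only.
    // rowLimit is a hypothetical parameter; the page size used by
    // DuplicatesService is not stated in the summary.
    public static string BuildFolderViewXml(int rowLimit)
    {
        if (rowLimit <= 0) throw new ArgumentOutOfRangeException(nameof(rowLimit));

        return
            "<View Scope='RecursiveAll'>" +
              "<Query><Where><Eq>" +
                "<FieldRef Name='FSObjType' />" +    // internal name; 1 = folder, 0 = file
                "<Value Type='Integer'>1</Value>" +
              "</Eq></Where></Query>" +
              $"<RowLimit Paged='TRUE'>{rowLimit}</RowLimit>" +
            "</View>";
    }
}
```

In CSOM the resulting string would typically be assigned to `CamlQuery.ViewXml`, with `ListItemCollectionPosition` carrying the paging state between calls.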

## Task Commits

Each task was committed atomically:

1. **Task 1: Implement SearchService** - `9e3d501` (feat)
2. **Task 2: Implement DuplicatesService** - `df5f79d` (feat)

## Files Created/Modified

- `SharepointToolbox/Services/SearchService.cs` - KQL search with pagination, vti_history filter, regex client-side filter, KQL length validation
- `SharepointToolbox/Services/DuplicatesService.cs` - File/folder duplicate detection, MakeKey composite grouping, CAML folder enumeration
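
To make the SearchService responsibilities listed above more concrete, here is a rough sketch of a KQL builder with length validation and the two client-side filters. Everything in it is an assumption rather than the shipped code: the class and method names, the `IsDocument:1` base clause, the choice of managed properties (`FileExtension`, `Created`, `Author`, `ModifiedBy`, `Path`), the 4,096-character limit (the documented default for programmatic KQL queries), and the way `_vti_history` paths are recognized.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

public static class KqlSketch
{
    // Assumed limit; the value enforced by SearchService is not stated in the summary.
    public const int MaxKqlLength = 4096;

    public static string Build(
        IEnumerable<string>? extensions = null,
        DateTime? createdAfter = null,
        DateTime? createdBefore = null,
        string? createdBy = null,
        string? modifiedBy = null,
        string? libraryUrl = null)
    {
        var clauses = new List<string> { "IsDocument:1" }; // assumed base clause

        // Normalize ".pdf" / "pdf" and OR the extensions together.
        var exts = (extensions ?? Enumerable.Empty<string>())
            .Select(e => e.Trim().TrimStart('.'))
            .Where(e => e.Length > 0)
            .Select(e => "FileExtension:" + e)
            .ToList();
        if (exts.Count > 0) clauses.Add("(" + string.Join(" OR ", exts) + ")");

        if (createdAfter.HasValue) clauses.Add("Created>=" + Day(createdAfter.Value));
        if (createdBefore.HasValue) clauses.Add("Created<=" + Day(createdBefore.Value));
        if (!string.IsNullOrWhiteSpace(createdBy)) clauses.Add("Author:" + Quote(createdBy));
        if (!string.IsNullOrWhiteSpace(modifiedBy)) clauses.Add("ModifiedBy:" + Quote(modifiedBy));
        if (!string.IsNullOrWhiteSpace(libraryUrl)) clauses.Add("Path:" + Quote(libraryUrl));

        string kql = string.Join(" AND ", clauses);
        if (kql.Length > MaxKqlLength)
            throw new ArgumentException($"KQL is {kql.Length} characters; the limit is {MaxKqlLength}.");
        return kql;
    }

    // Client-side filters: drop version-history paths, then apply an optional regex.
    public static bool PassesClientFilters(string path, Regex? pattern = null) =>
        path.IndexOf("/_vti_history/", StringComparison.OrdinalIgnoreCase) < 0
        && (pattern == null || pattern.IsMatch(path));

    private static string Day(DateTime d) =>
        d.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);

    // Simplification: embedded double quotes are stripped rather than escaped.
    private static string Quote(string value) =>
        "\"" + value.Replace("\"", "") + "\"";
}
```

For example, `KqlSketch.Build(new[] { ".pdf", "docx" }, new DateTime(2024, 1, 1), null, "Jane Doe")` returns `IsDocument:1 AND (FileExtension:pdf OR FileExtension:docx) AND Created>=2024-01-01 AND Author:"Jane Doe"`.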

## Decisions Made

- `SelectProperties` is a `StringCollection`, so `AddRange(string[])` does not compile. Fixed with an inline per-item `foreach` loop that calls `Add` (Rule 1 auto-fix applied during the first build of Task 1).
- DuplicatesService re-implements file pagination inline rather than delegating to SearchService because the result types differ (`DuplicateItem` vs `SearchResult`) and the two services have different lifecycles.
- `MakeKey` is `internal static` to match the test project's inline copy, which makes it possible to verify parity without a live CSOM context.
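
Composite-key grouping of the kind `MakeKey` performs can be sketched as follows. This is an illustration under stated assumptions, not the method in `DuplicatesService.cs`: the `ItemSketch` and `KeyOptionsSketch` types, the option names, the lower-cased name, the day-level date format, and the `|` separator are all hypothetical. The summary only says the key combines name, size, dates, and counts.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical shapes; the real DuplicateItem and DuplicateScanOptions come from Plan 03-01.
public sealed class ItemSketch
{
    public string Name { get; set; } = "";
    public long? SizeBytes { get; set; }
    public DateTime? Created { get; set; }
    public DateTime? Modified { get; set; }
    public int? FolderCount { get; set; }
    public int? FileCount { get; set; }
}

public sealed class KeyOptionsSketch
{
    public bool MatchSize { get; set; } = true;
    public bool MatchCreated { get; set; }
    public bool MatchModified { get; set; }
    public bool MatchFolderCount { get; set; }
    public bool MatchFileCount { get; set; }
}

public static class DuplicateKeySketch
{
    // The name is always part of the key; every other component is opt-in.
    public static string MakeKey(ItemSketch item, KeyOptionsSketch opts)
    {
        var inv = CultureInfo.InvariantCulture;
        var parts = new List<string> { item.Name.ToLowerInvariant() };
        if (opts.MatchSize) parts.Add(item.SizeBytes?.ToString(inv) ?? "");
        if (opts.MatchCreated) parts.Add(item.Created?.ToString("yyyy-MM-dd", inv) ?? "");
        if (opts.MatchModified) parts.Add(item.Modified?.ToString("yyyy-MM-dd", inv) ?? "");
        if (opts.MatchFolderCount) parts.Add(item.FolderCount?.ToString(inv) ?? "");
        if (opts.MatchFileCount) parts.Add(item.FileCount?.ToString(inv) ?? "");
        return string.Join("|", parts);
    }

    // Items sharing a key form a group; groups with a single member are not duplicates.
    public static List<List<ItemSketch>> GroupDuplicates(
        IEnumerable<ItemSketch> items, KeyOptionsSketch opts) =>
        items.GroupBy(i => MakeKey(i, opts))
             .Where(g => g.Count() > 1)
             .Select(g => g.ToList())
             .ToList();
}
```

Because a function like this is pure, it can be tested with plain in-memory objects, which is the property the parity tests mentioned above depend on.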

## Deviations from Plan

### Auto-fixed Issues

**1. [Rule 1 - Bug] StringCollection.AddRange(string[]) does not exist**
- **Found during:** Task 1 (SearchService build)
- **Issue:** `kq.SelectProperties.AddRange(new[] { ... })` does not compile: `SelectProperties` is a `StringCollection`, which has no `AddRange` taking `string[]`, and the extension method overload requires a `List<string>` receiver
- **Fix:** Replaced with a `foreach` loop calling `kq.SelectProperties.Add(prop)` for each property name
- **Files modified:** `SharepointToolbox/Services/SearchService.cs`, `SharepointToolbox/Services/DuplicatesService.cs`
- **Verification:** `dotnet build` reported 0 errors after the fix; the same fix was applied proactively in DuplicatesService before its first build
- **Committed in:** `9e3d501` (Task 1 commit)

---

**Total deviations:** 1 auto-fixed (Rule 1 - bug)
**Impact on plan:** Minor API surface mismatch in the plan's code listing; the fix is purely syntactic, with no behavioral difference.
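
The shape of this fix can be shown in isolation. The sketch below does not reference CSOM: `AddOnlyCollection` is a hypothetical stand-in for a collection that, like `SelectProperties` as described in this summary, exposes `Add(string)` but no `AddRange(string[])`. The `AddAll` helper is also invented here; the summary says the services use an inline `foreach` rather than a helper.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a collection with Add(string) and no AddRange(string[]).
public sealed class AddOnlyCollection
{
    private readonly List<string> _items = new List<string>();
    public void Add(string value) => _items.Add(value);
    public IReadOnlyList<string> Items => _items;
}

public static class SelectPropertiesSketch
{
    // Hypothetical helper wrapping the per-item loop.
    public static void AddAll(this AddOnlyCollection target, IEnumerable<string> names)
    {
        if (target == null) throw new ArgumentNullException(nameof(target));
        if (names == null) throw new ArgumentNullException(nameof(names));

        foreach (string name in names)
            target.Add(name); // one Add call per property name
    }
}
```

A call such as `props.AddAll(new[] { "Title", "Path", "Size" })` then adds the names in order; the property names here are illustrative only.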

## Issues Encountered

- The `-x` flag in `dotnet test ... -x` was not recognized by the `dotnet test` CLI on this machine (MSBuild switch error). The flag was removed and the tests ran correctly without it.

## User Setup Required

None - no external service configuration required.

## Next Phase Readiness

- SearchService and DuplicatesService are complete and compile cleanly
- Wave 2 can now continue: 03-05 (Search/Duplicate exports) and 03-06 (Localization) are ready to proceed in parallel with 03-03 (Storage exports)
- The 5 MakeKey tests pass; CSOM integration tests will remain skipped until a live tenant is available

---
*Phase: 03-storage*
*Completed: 2026-04-02*

## Self-Check: PASSED

- SharepointToolbox/Services/SearchService.cs: FOUND
- SharepointToolbox/Services/DuplicatesService.cs: FOUND
- .planning/phases/03-storage/03-04-SUMMARY.md: FOUND
- Commit 9e3d501 (SearchService): FOUND
- Commit df5f79d (DuplicatesService): FOUND