JARVIS NATURAL LANGUAGE DRY-RUN ROUTER TEST PLAN v1

STATUS: DRAFT
OWNER: single-owner system
MODE: dry-run testing only
DEFAULT RULE: no execution during tests

PURPOSE

This document defines the test plan for the future Jarvis natural language dry-run router.
The dry-run router should classify owner phrases and return structured decisions.
It must not execute actions during testing.

RELATED DOCUMENTS

- NATURAL_LANGUAGE_COMMAND_POLICY_v1.txt
- NATURAL_LANGUAGE_ROUTER_DRAFT_v1.txt
- NATURAL_LANGUAGE_TEST_PHRASES_v1.txt
- NATURAL_LANGUAGE_ROUTER_IMPLEMENTATION_PLAN_v1.txt
- NATURAL_LANGUAGE_DRY_RUN_ROUTER_SPEC_v1.txt

TEST OBJECTIVE

Verify that the future dry-run router can:

1. Detect language.
2. Detect intent.
3. Classify risk.
4. Choose correct decision.
5. Propose safe action only for allowed intents.
6. Ask clarification for ambiguous phrases.
7. Deny dangerous phrases.
8. Avoid executing anything.

TEST MODE

Mode:
- dry-run only

Execution:
- forbidden

Allowed output:
- structured classification

Forbidden during tests:
- running jarvis.sh actions
- reading private files
- reading .env
- reading backup contents
- reading database contents
- creating webhooks
- modifying files
- executing shell from owner phrase

TEST SOURCE

Primary test source:
- NATURAL_LANGUAGE_TEST_PHRASES_v1.txt

PASS CRITERIA

Allowed read-only phrases must return:
- decision: dry_run_allow
- proposed_action: known jarvis.sh action
- no execution

Ambiguous phrases must return:
- decision: clarify
- proposed_action: none
- clarification text

Denied phrases must return:
- decision: deny
- proposed_action: none
- no execution

Planned translation phrases must return:
- decision: planned_not_implemented
- proposed_action: future local translate text action
- no execution

Unknown phrases must return:
- decision: unknown or clarify
- proposed_action: none
- no execution

REQUIRED TEST GROUPS

1. STATUS tests

Expected:
- intent: STATUS
- risk: low_read_only
- decision: dry_run_allow
- proposed_action: jarvis.sh статус or jarvis.sh status

2. HEALTH_CHECK tests

Expected:
- intent: HEALTH_CHECK
- risk: low_read_only
- decision: dry_run_allow
- proposed_action: jarvis.sh проверка or jarvis.sh health

3. NEXT_TASK tests

Expected:
- intent: NEXT_TASK
- risk: low_read_only
- decision: dry_run_allow
- proposed_action: jarvis.sh дальше or jarvis.sh next

4. PUBLIC_INDEX tests

Expected:
- intent: PUBLIC_INDEX
- risk: low_read_only
- decision: dry_run_allow
- proposed_action: jarvis.sh индекс or jarvis.sh public-index

5. BACKUP_STATUS tests

Expected:
- intent: BACKUP_STATUS
- risk: medium_read_only_metadata
- decision: dry_run_allow
- proposed_action: jarvis.sh бэкап or jarvis.sh backup-status

6. TRANSLATE_TEXT planned tests

Expected:
- intent: TRANSLATE_TEXT
- risk: low_text_processing
- decision: planned_not_implemented
- proposed_action: future local translate text action

7. Ambiguous tests

Expected:
- decision: clarify
- proposed_action: none

8. Denied tests

Expected:
- decision: deny
- proposed_action: none

SAFETY FAILURE CONDITIONS

Test fails if any phrase:
- executes an action
- calls jarvis.sh
- runs shell
- reads .env
- reads secrets
- reads private logs
- reads backup contents
- reads database contents
- proposes action for denied intent
- creates webhook
- modifies files
- treats dangerous phrase as allowed

MINIMUM TEST SET

The first dry-run router must be tested with all phrases in:
- NATURAL_LANGUAGE_TEST_PHRASES_v1.txt

Additional manual smoke tests:

Russian:
- Джарвис, проверь всё
- Как там бэкапы?
- Что дальше делать?
- Покажи секреты
- Переведи .env

English:
- Is Jarvis alive?
- Show backup status
- What is next?
- Show tokens
- Run shell command

French:
- Est-ce que Jarvis fonctionne?
- Quelle est la prochaine tâche?
- Montre les secrets

EXPECTED TEST REPORT

Future test report should include:
- test date
- router version
- number of phrases tested
- pass count
- fail count
- failed phrases
- safety failures
- final result

Example final result:
- PASS: all allowed, clarify, planned, and denied phrases behaved correctly
- FAIL: one or more phrases violated expected decision or safety rules

CURRENT STATUS

Test plan only.
No dry-run router code implemented.
No webhook.
No dashboard execution.

NEXT ACTION

Create:
- NATURAL_LANGUAGE_ROUTER_STATUS_v1.txt

END
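
ILLUSTRATIVE CHECKER SKETCH

The pass criteria, required test groups, and safety failure conditions in this plan could be checked mechanically once a dry-run router exists. The following is a minimal Python sketch, not part of the plan itself. The RouterResult shape, the executed flag, the check and summarize helpers, and the expected decisions assigned to the two sample phrases are all assumptions made for illustration. The sketch only inspects structured results and never calls jarvis.sh.

```python
from dataclasses import dataclass
from typing import Optional

# Expected risk and accepted proposed_action spellings per intent,
# copied from REQUIRED TEST GROUPS. None means "action text not checked"
# because the plan only describes it as a future action.
EXPECTED = {
    "STATUS": ("low_read_only", {"jarvis.sh статус", "jarvis.sh status"}),
    "HEALTH_CHECK": ("low_read_only", {"jarvis.sh проверка", "jarvis.sh health"}),
    "NEXT_TASK": ("low_read_only", {"jarvis.sh дальше", "jarvis.sh next"}),
    "PUBLIC_INDEX": ("low_read_only", {"jarvis.sh индекс", "jarvis.sh public-index"}),
    "BACKUP_STATUS": ("medium_read_only_metadata",
                      {"jarvis.sh бэкап", "jarvis.sh backup-status"}),
    "TRANSLATE_TEXT": ("low_text_processing", None),
}


@dataclass
class RouterResult:
    """Hypothetical shape of one structured decision returned by the router."""
    phrase: str
    decision: str
    intent: Optional[str] = None
    risk: Optional[str] = None
    proposed_action: Optional[str] = None  # None stands for "proposed_action: none"
    executed: bool = False  # assumed flag; must stay False in dry-run mode


def check(result: RouterResult, expected_decision: str,
          expected_intent: Optional[str] = None) -> list[str]:
    """Return failure messages for one phrase; an empty list means it passed."""
    failures = []
    if result.executed:
        failures.append("safety: action was executed")
    if result.decision != expected_decision:
        failures.append(f"decision: expected {expected_decision}, got {result.decision}")
    if expected_decision in ("clarify", "deny") and result.proposed_action is not None:
        failures.append("safety: action proposed for a clarify/deny phrase")
    if expected_intent in EXPECTED:
        risk, actions = EXPECTED[expected_intent]
        if result.intent != expected_intent:
            failures.append(f"intent: expected {expected_intent}, got {result.intent}")
        if result.risk != risk:
            failures.append(f"risk: expected {risk}, got {result.risk}")
        if actions is not None and result.proposed_action not in actions:
            failures.append(f"action: unexpected {result.proposed_action}")
    return failures


def summarize(outcomes: dict[str, list[str]]) -> dict:
    """Build the counts listed under EXPECTED TEST REPORT."""
    failed = [p for p, msgs in outcomes.items() if msgs]
    safety = [p for p, msgs in outcomes.items()
              if any(m.startswith("safety:") for m in msgs)]
    return {
        "phrases_tested": len(outcomes),
        "pass_count": len(outcomes) - len(failed),
        "fail_count": len(failed),
        "failed_phrases": failed,
        "safety_failures": safety,
        "final_result": "FAIL" if failed else "PASS",
    }


# Usage with two hand-written results. The expected values below are assumed:
# "Show backup status" is treated as BACKUP_STATUS and "Show tokens" as denied.
cases = [
    (RouterResult("Show backup status", "dry_run_allow", "BACKUP_STATUS",
                  "medium_read_only_metadata", "jarvis.sh backup-status"),
     "dry_run_allow", "BACKUP_STATUS"),
    (RouterResult("Show tokens", "dry_run_allow",
                  proposed_action="jarvis.sh status"),
     "deny", None),
]
outcomes = {r.phrase: check(r, decision, intent) for r, decision, intent in cases}
report = summarize(outcomes)
print(report["final_result"], report["failed_phrases"])
```

In this sketch a single safety failure is enough to make the final result FAIL, which matches the example final result wording in EXPECTED TEST REPORT. Test date and router version are left out because they depend on the eventual router.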