# CSV Data File Standards

**Purpose:** CSV files supply structured data that workflows need and LLMs cannot generate.

---

## When to Use CSV

Use CSV for data that:

- Is domain-specific and not in training data
- Is too large for prompt context
- Needs structured lookup or reference
- Must be consistent across sessions

**Don't use for:**

- Web-searchable information
- Common programming syntax
- General knowledge
- Things LLMs can generate

---

## CSV Structure

```csv
category,name,pattern,description
"collaboration","Think Aloud Protocol","user speaks thoughts → facilitator captures","Make thinking visible during work"
"creative","SCAMPER","substitute→combine→adapt→modify→put→eliminate→reverse","Systematic creative thinking"
```

**Rules:**

- Header row required, with descriptive column names
- Consistent data types per column
- UTF-8 encoding
- All columns must be used in the workflow

---

## Common Use Cases

### 1. Method Registry

Advanced Elicitation uses CSV to select techniques dynamically:

```csv
category,name,pattern
collaboration,Think Aloud,user speaks thoughts → facilitator captures
advanced,Six Thinking Hats,view problem from 6 perspectives
```

### 2. Knowledge Base Index

Map keywords to document locations for surgical lookup:

```csv
keywords,document_path,section
"nutrition,macros",data/nutrition-reference.md,## Daily Targets
```

### 3. Configuration Lookup

Map scenarios to parameters. Quote any field that holds a comma-separated list so it stays in one column:

```csv
scenario,required_steps,output_sections
"2D Platformer","step-01,step-03,step-07","movement,physics,collision"
```

---

## Best Practices

- Keep files small (<1MB if possible)
- No unused columns
- Document each CSV's purpose
- Validate data quality
- Use compact values (short codes rather than full descriptions)

---

## Validation Checklist

For each CSV file:

- [ ] Purpose is essential (can't be generated by LLM)
- [ ] All columns are used somewhere
- [ ] Properly formatted (consistent, UTF-8)
- [ ] Documented with examples
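
The mechanical items in the checklist above (UTF-8 encoding, a header row, column usage, consistent field counts) can be checked automatically. The following is a minimal sketch using only Python's standard library, not a tool defined by this standard. The function name `validate_csv` and the `used_columns` parameter are hypothetical: the caller is assumed to know which columns the workflow actually reads.

```python
import csv
import io


def validate_csv(raw: bytes, used_columns: set[str]) -> list[str]:
    """Return a list of problems found; an empty list means every check passed.

    Hypothetical helper: `used_columns` is supplied by the caller and names
    the columns the workflow reads.
    """
    # Rule: UTF-8 encoding. "utf-8-sig" also accepts a leading byte-order mark.
    try:
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError as exc:
        return [f"not valid UTF-8: {exc}"]

    rows = list(csv.reader(io.StringIO(text, newline="")))
    if not rows:
        return ["file is empty, a header row is required"]

    problems = []

    # Rule: header row required, with usable column names.
    header = [name.strip() for name in rows[0]]
    if any(not name for name in header):
        problems.append("header contains an empty column name")
    if len(set(header)) != len(header):
        problems.append("header contains duplicate column names")

    # Rule: all columns must be used in the workflow.
    for name in header:
        if name and name not in used_columns:
            problems.append(f"column {name!r} is not used by the workflow")

    # A wrong field count usually means a comma inside an unquoted field.
    for row_no, row in enumerate(rows[1:], start=2):
        if not row:
            continue  # csv.reader yields an empty list for a blank line
        if len(row) != len(header):
            problems.append(
                f"row {row_no}: expected {len(header)} fields, got {len(row)}"
            )
    return problems


# Usage: a multi-value field left unquoted spills into extra columns.
unquoted = (
    b"scenario,required_steps,output_sections\n"
    b'"2D Platformer",step-01,step-03,step-07,movement,physics,collision\n'
)
columns = {"scenario", "required_steps", "output_sections"}
print(validate_csv(unquoted, columns))
# prints ['row 2: expected 3 fields, got 7']
```

This sketch does not check data types per column or whether the file's purpose is essential; those items still need human review.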