# USDA Vision Database Schema This document provides a comprehensive overview of the database schema for the USDA Vision pecan processing experiment management system. ## Database Overview - **Database Type**: PostgreSQL (via Supabase) - **Version**: PostgreSQL 17 - **Authentication**: Supabase Auth with Row Level Security (RLS) - **Extensions**: uuid-ossp for UUID generation ## Core Tables ### 1. Authentication & User Management #### `auth.users` (Supabase managed) - Standard Supabase authentication table - Contains user authentication data #### `public.roles` ```sql CREATE TABLE public.roles ( id UUID PRIMARY KEY DEFAULT uuid_generate_v4(), name TEXT UNIQUE NOT NULL CHECK (name IN ('admin', 'conductor', 'analyst', 'data recorder')), description TEXT NOT NULL, created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(), updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW() ); ``` **Roles:** - `admin`: Full system access with user management capabilities - `conductor`: Operational access for conducting experiments and managing data - `analyst`: Read-only access for data analysis and reporting - `data recorder`: Data entry and recording capabilities #### `public.user_profiles` ```sql CREATE TABLE public.user_profiles ( id UUID PRIMARY KEY REFERENCES auth.users(id) ON DELETE CASCADE, email TEXT NOT NULL, role_id UUID REFERENCES public.roles(id), -- Nullable (legacy) status TEXT DEFAULT 'active' CHECK (status IN ('active', 'disabled')), created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(), updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW() ); ``` #### `public.user_roles` (Junction Table) ```sql CREATE TABLE public.user_roles ( id UUID PRIMARY KEY DEFAULT uuid_generate_v4(), user_id UUID NOT NULL REFERENCES public.user_profiles(id) ON DELETE CASCADE, role_id UUID NOT NULL REFERENCES public.roles(id) ON DELETE CASCADE, assigned_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(), assigned_by UUID REFERENCES public.user_profiles(id), UNIQUE(user_id, role_id) ); ``` ### 2. Experiment Management #### `public.experiment_phases` ```sql CREATE TABLE public.experiment_phases ( id UUID PRIMARY KEY DEFAULT uuid_generate_v4(), name TEXT NOT NULL UNIQUE, description TEXT, -- Optional description for the phase created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(), updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(), created_by UUID NOT NULL REFERENCES public.user_profiles(id) ); ``` **Purpose**: Groups experiments into logical phases for better organization and navigation. #### `public.experiments` ```sql CREATE TABLE public.experiments ( id UUID PRIMARY KEY DEFAULT uuid_generate_v4(), experiment_number INTEGER UNIQUE NOT NULL, reps_required INTEGER NOT NULL CHECK (reps_required > 0), soaking_duration_hr FLOAT NOT NULL CHECK (soaking_duration_hr >= 0), air_drying_time_min INTEGER NOT NULL CHECK (air_drying_time_min >= 0), plate_contact_frequency_hz FLOAT NOT NULL CHECK (plate_contact_frequency_hz > 0), throughput_rate_pecans_sec FLOAT NOT NULL CHECK (throughput_rate_pecans_sec > 0), crush_amount_in FLOAT NOT NULL CHECK (crush_amount_in >= 0), entry_exit_height_diff_in FLOAT NOT NULL, results_status TEXT NOT NULL DEFAULT 'valid' CHECK (results_status IN ('valid', 'invalid')), completion_status BOOLEAN NOT NULL DEFAULT false, phase_id UUID REFERENCES public.experiment_phases(id) ON DELETE SET NULL, created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(), updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(), created_by UUID NOT NULL REFERENCES public.user_profiles(id) ); ``` **Purpose**: Experiment blueprints/templates that define the parameters for pecan processing experiments. #### `public.experiment_repetitions` ```sql CREATE TABLE public.experiment_repetitions ( id UUID PRIMARY KEY DEFAULT uuid_generate_v4(), experiment_id UUID NOT NULL REFERENCES public.experiments(id) ON DELETE CASCADE, repetition_number INTEGER NOT NULL CHECK (repetition_number > 0), scheduled_date TIMESTAMP WITH TIME ZONE, schedule_status TEXT NOT NULL DEFAULT 'pending schedule' CHECK (schedule_status IN ('pending schedule', 'scheduled', 'canceled', 'aborted')), completion_status BOOLEAN NOT NULL DEFAULT false, is_locked BOOLEAN NOT NULL DEFAULT false, locked_at TIMESTAMP WITH TIME ZONE, locked_by UUID REFERENCES public.user_profiles(id), created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(), updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(), created_by UUID NOT NULL REFERENCES public.user_profiles(id), CONSTRAINT unique_repetition_per_experiment UNIQUE (experiment_id, repetition_number) ); ``` **Purpose**: Individual repetitions of experiment blueprints that can be scheduled and executed. ### 3. Data Entry System #### `public.experiment_phase_drafts` ```sql CREATE TABLE public.experiment_phase_drafts ( id UUID PRIMARY KEY DEFAULT uuid_generate_v4(), experiment_id UUID NOT NULL REFERENCES public.experiments(id) ON DELETE CASCADE, repetition_id UUID NOT NULL REFERENCES public.experiment_repetitions(id) ON DELETE CASCADE, user_id UUID NOT NULL REFERENCES public.user_profiles(id), phase_name TEXT NOT NULL CHECK (phase_name IN ('pre-soaking', 'air-drying', 'cracking', 'shelling')), status TEXT NOT NULL DEFAULT 'draft' CHECK (status IN ('draft', 'submitted', 'withdrawn')), draft_name TEXT, -- Optional name for the draft created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(), updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(), submitted_at TIMESTAMP WITH TIME ZONE, withdrawn_at TIMESTAMP WITH TIME ZONE ); ``` **Purpose**: Phase-specific draft records for experiment data entry with status tracking. #### `public.experiment_phase_data` ```sql CREATE TABLE public.experiment_phase_data ( id UUID PRIMARY KEY DEFAULT uuid_generate_v4(), phase_draft_id UUID NOT NULL REFERENCES public.experiment_phase_drafts(id) ON DELETE CASCADE, phase_name TEXT NOT NULL CHECK (phase_name IN ('pre-soaking', 'air-drying', 'cracking', 'shelling')), -- Pre-soaking phase data batch_initial_weight_lbs FLOAT CHECK (batch_initial_weight_lbs >= 0), initial_shell_moisture_pct FLOAT CHECK (initial_shell_moisture_pct >= 0 AND initial_shell_moisture_pct <= 100), initial_kernel_moisture_pct FLOAT CHECK (initial_kernel_moisture_pct >= 0 AND initial_kernel_moisture_pct <= 100), soaking_start_time TIMESTAMP WITH TIME ZONE, -- Air-drying phase data airdrying_start_time TIMESTAMP WITH TIME ZONE, post_soak_weight_lbs FLOAT CHECK (post_soak_weight_lbs >= 0), post_soak_kernel_moisture_pct FLOAT CHECK (post_soak_kernel_moisture_pct >= 0 AND post_soak_kernel_moisture_pct <= 100), post_soak_shell_moisture_pct FLOAT CHECK (post_soak_shell_moisture_pct >= 0 AND post_soak_shell_moisture_pct <= 100), avg_pecan_diameter_in FLOAT CHECK (avg_pecan_diameter_in >= 0), -- Cracking phase data cracking_start_time TIMESTAMP WITH TIME ZONE, -- Shelling phase data shelling_start_time TIMESTAMP WITH TIME ZONE, bin_1_weight_lbs FLOAT CHECK (bin_1_weight_lbs >= 0), bin_2_weight_lbs FLOAT CHECK (bin_2_weight_lbs >= 0), bin_3_weight_lbs FLOAT CHECK (bin_3_weight_lbs >= 0), discharge_bin_weight_lbs FLOAT CHECK (discharge_bin_weight_lbs >= 0), bin_1_full_yield_oz FLOAT CHECK (bin_1_full_yield_oz >= 0), bin_2_full_yield_oz FLOAT CHECK (bin_2_full_yield_oz >= 0), bin_3_full_yield_oz FLOAT CHECK (bin_3_full_yield_oz >= 0), bin_1_half_yield_oz FLOAT CHECK (bin_1_half_yield_oz >= 0), bin_2_half_yield_oz FLOAT CHECK (bin_2_half_yield_oz >= 0), bin_3_half_yield_oz FLOAT CHECK (bin_3_half_yield_oz >= 0), created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(), updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(), CONSTRAINT unique_phase_per_draft UNIQUE (phase_draft_id, phase_name) ); ``` **Purpose**: Phase-specific measurement data for experiments. #### `public.pecan_diameter_measurements` ```sql CREATE TABLE public.pecan_diameter_measurements ( id UUID PRIMARY KEY DEFAULT uuid_generate_v4(), phase_data_id UUID NOT NULL REFERENCES public.experiment_phase_data(id) ON DELETE CASCADE, measurement_number INTEGER NOT NULL CHECK (measurement_number >= 1 AND measurement_number <= 10), diameter_in FLOAT NOT NULL CHECK (diameter_in >= 0), created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(), CONSTRAINT unique_measurement_per_phase UNIQUE (phase_data_id, measurement_number) ); ``` **Purpose**: Individual pecan diameter measurements (up to 10 per phase). ### 3. Conductor Availability Management #### `public.conductor_availability` ```sql CREATE TABLE public.conductor_availability ( id UUID PRIMARY KEY DEFAULT uuid_generate_v4(), user_id UUID NOT NULL REFERENCES public.user_profiles(id) ON DELETE CASCADE, available_from TIMESTAMP WITH TIME ZONE NOT NULL, available_to TIMESTAMP WITH TIME ZONE NOT NULL, notes TEXT, -- Optional notes about the availability status TEXT NOT NULL DEFAULT 'active' CHECK (status IN ('active', 'cancelled')), created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(), updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(), created_by UUID NOT NULL REFERENCES public.user_profiles(id), -- Ensure available_to is after available_from CONSTRAINT valid_time_range CHECK (available_to > available_from), -- Ensure availability is in the future (can be modified if needed for past records) CONSTRAINT future_availability CHECK (available_from >= NOW() - INTERVAL '1 day') ); ``` **Purpose**: Stores conductor availability windows for experiment scheduling with overlap prevention. #### `public.experiment_phase_assignments` ```sql CREATE TABLE public.experiment_phase_assignments ( id UUID PRIMARY KEY DEFAULT uuid_generate_v4(), experiment_id UUID NOT NULL REFERENCES public.experiments(id) ON DELETE CASCADE, repetition_id UUID NOT NULL REFERENCES public.experiment_repetitions(id) ON DELETE CASCADE, conductor_id UUID NOT NULL REFERENCES public.user_profiles(id) ON DELETE CASCADE, phase_name TEXT NOT NULL CHECK (phase_name IN ('pre-soaking', 'air-drying', 'cracking', 'shelling')), scheduled_start_time TIMESTAMP WITH TIME ZONE NOT NULL, scheduled_end_time TIMESTAMP WITH TIME ZONE NOT NULL, status TEXT NOT NULL DEFAULT 'scheduled' CHECK (status IN ('scheduled', 'in-progress', 'completed', 'cancelled')), notes TEXT, -- Optional notes about the assignment created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(), updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(), created_by UUID NOT NULL REFERENCES public.user_profiles(id), -- Ensure scheduled_end_time is after scheduled_start_time CONSTRAINT valid_scheduled_time_range CHECK (scheduled_end_time > scheduled_start_time), -- Ensure unique assignment per conductor per phase per repetition CONSTRAINT unique_conductor_phase_assignment UNIQUE (repetition_id, conductor_id, phase_name) ); ``` **Purpose**: Assigns conductors to specific experiment repetition phases with scheduled times for future scheduling functionality. ## Key Functions ### Authentication & Authorization - `public.get_user_role()`: Returns the first role for a user (backward compatibility) - `public.get_user_roles()`: Returns all roles for a user as an array - `public.is_admin()`: Checks if user has admin role - `public.has_role(role_name TEXT)`: Checks if user has specific role - `public.has_any_role(role_names TEXT[])`: Checks if user has any of the specified roles - `public.can_manage_experiments()`: Checks if user can manage experiments (admin or conductor) ### User Management - `public.generate_temp_password()`: Generates secure temporary passwords - `public.create_user_with_roles(user_email, role_names, temp_password)`: Creates users with multiple roles ### Data Validation - `validate_repetition_number()`: Ensures repetition numbers don't exceed experiment requirements - `public.check_repetition_lock_before_withdrawal()`: Prevents withdrawal of locked repetitions - `public.check_availability_overlap()`: Prevents overlapping availabilities for the same conductor - `public.adjust_overlapping_availability()`: Automatically adjusts overlapping availabilities (alternative approach) ### Availability Management - `public.get_available_conductors(start_time, end_time)`: Returns conductors available for a specific time range - `public.is_conductor_available(conductor_id, start_time, end_time)`: Checks if a conductor is available for a specific time range ### Timestamp Management - `public.handle_updated_at()`: Updates the updated_at timestamp - `public.handle_phase_draft_status_change()`: Manages submitted_at and withdrawn_at timestamps ## Row Level Security (RLS) Policies ### Access Control Summary - **Admin**: Full access to all tables and operations - **Conductor**: Can manage experiments, experiment phases, and repetitions, view all data, manage their own availability - **Analyst**: Read-only access to all data - **Data Recorder**: Can create and manage their own phase drafts and data ### Key RLS Features - All authenticated users can view experiments, experiment phases, and repetitions - Admin and conductor roles can manage experiment phases - Users can only modify their own phase drafts (unless admin) - Users can only manage their own availability (unless admin) - Conductors can view their own experiment phase assignments - Locked repetitions prevent draft modifications - Submitted drafts cannot be withdrawn if repetition is locked - Role-based access control for user management ## Indexes The database includes comprehensive indexing for performance: - Primary key indexes on all tables - Foreign key indexes for joins - Status and date indexes for filtering - Composite indexes for common query patterns ## Constraints ### Data Integrity - Check constraints on numeric values (weights, percentages, etc.) - Enum constraints on status fields - Unique constraints to prevent duplicates - Foreign key constraints for referential integrity ### Business Rules - Repetition numbers cannot exceed experiment requirements - Only one submitted draft per user per phase per repetition - Moisture percentages must be between 0-100 - Weights and measurements must be non-negative - Conductor availabilities cannot overlap for the same user - Availability windows must have valid time ranges (end > start) - Only one conductor assignment per phase per repetition - Scheduled times must have valid ranges (end > start) ## Migration History The schema has evolved through several migrations: 1. **RBAC Schema**: Initial role-based access control 2. **Multiple Roles Support**: Enhanced user role management 3. **Experiments Table**: Core experiment management 4. **Repetitions System**: Separated experiments from repetitions 5. **Data Entry System**: Phase-specific draft and data management 6. **Constraint Fixes**: Refined business rules and constraints 7. **Experiment Phases**: Added experiment grouping for better organization 8. **Conductor Availability**: Added availability management and experiment phase assignments for scheduling This schema supports a comprehensive pecan processing experiment management system with robust security, data integrity, and flexible role-based access control.