# OCR Implementation for Health Metrics Processing

This document explains the OCR (Optical Character Recognition) implementation for processing health metrics images in the FitnessApp.

## Overview

The FitnessApp now includes an OCR implementation that extracts health metric values from uploaded images. Users can take a photo of their smart scale display, and the app extracts and stores the metrics automatically.

## Components

The OCR implementation consists of the following components:

1. **OCR Configuration (`ocr_config.php`)**: Contains configuration settings for the OCR service, including API keys, regex patterns for extracting metrics, and debug settings.

2. **OCR Service (`ocr_service.php`)**: A PHP class that provides OCR functionality using various providers (OCR.space, Google Cloud Vision, Tesseract).

3. **Image Processing (`process_health_image.php`)**: The main script that handles image uploads, processes them using the OCR service, and stores the extracted metrics in the database.

4. **Testing Tools**: Scripts for testing the OCR implementation, including `test_ocr_service.php`.

## How It Works

1. **Image Upload**: The user uploads an image of their health metrics display through the app.

2. **OCR Processing**: The `process_health_image.php` script receives the image and passes it to the `OCRService` class.

3. **Text Extraction**: The `OCRService` class extracts text from the image using one of the configured OCR providers (OCR.space by default).

4. **Metric Extraction**: The service then applies regex patterns to the extracted text to identify health metrics values.

5. **Database Storage**: The extracted metrics are stored in the `manual_health_metrics` table in the database.

6. **Fallback Mechanism**: If the OCR service extracts fewer than 3 metrics, the script falls back to default values so that the app continues to function.
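The fallback in step 6 might look roughly like the sketch below. The threshold of 3 comes from the description above; the function name `applyMetricFallback`, the placeholder default values, and the choice to replace the whole result (rather than fill in only the missing metrics) are assumptions made for illustration, not the actual code in `process_health_image.php`.

```php
<?php
// Threshold taken from the description above; everything else is illustrative.
const MIN_METRICS_REQUIRED = 3;

/**
 * Return the extracted metrics, or the defaults when too few were found.
 * The 'used_fallback' flag lets the caller tell real readings from defaults.
 */
function applyMetricFallback(array $extracted, array $defaults): array
{
    // Drop null entries so only real matches count toward the threshold.
    $found = array_filter($extracted, fn($value) => $value !== null);

    if (count($found) < MIN_METRICS_REQUIRED) {
        return ['metrics' => $defaults, 'used_fallback' => true];
    }
    return ['metrics' => $found, 'used_fallback' => false];
}

// Placeholder defaults, not the values the app really uses.
$defaults = ['weight' => 0.0, 'body_fat' => 0.0, 'bmi' => 0.0];

// Only one usable metric was extracted, so the defaults are returned.
$result = applyMetricFallback(['weight' => 182.4, 'bmi' => null], $defaults);
```

Returning a flag alongside the metrics is one way to keep default values from being mistaken for real measurements once they reach the database.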

## OCR Providers

The implementation supports multiple OCR providers:

1. **OCR.space API** (Default): A hosted OCR API with a free tier, used here for general text extraction.

2. **Google Cloud Vision API**: A more advanced OCR service that provides higher accuracy but requires a Google Cloud account.

3. **Tesseract OCR**: An open-source OCR engine that can be installed on the server.

## Configuration

The OCR implementation can be configured in the `ocr_config.php` file:

```php
// OCR Service to use: 'google_vision', 'tesseract', 'ocr_space'
$OCR_SERVICE = 'ocr_space';

// OCR.space API Key
$OCR_SPACE_API_KEY = 'your-api-key';

// Debug settings
$OCR_DEBUG = true;  // Set to false in production
```
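Because a mistyped provider name or a placeholder API key only shows up when an upload fails, a small sanity check on these settings can be useful. The sketch below is an assumption about how such a check could be written; `validateOcrConfig` is a hypothetical helper, not part of `ocr_config.php`.

```php
<?php
// Values mirror the configuration shown above.
$OCR_SERVICE = 'ocr_space';
$OCR_SPACE_API_KEY = 'your-api-key';

/**
 * Hypothetical helper: check the configured provider and API key before use.
 * Returns a list of problems; an empty list means the settings look usable.
 */
function validateOcrConfig(string $service, string $ocrSpaceKey): array
{
    $problems = [];
    $supported = ['google_vision', 'tesseract', 'ocr_space'];

    if (!in_array($service, $supported, true)) {
        $problems[] = "Unknown OCR service '$service'";
    }
    // The placeholder from the sample config is treated as "not configured".
    if ($service === 'ocr_space' && ($ocrSpaceKey === '' || $ocrSpaceKey === 'your-api-key')) {
        $problems[] = 'OCR.space API key is missing or still the placeholder';
    }
    return $problems;
}

// With the sample values above, the placeholder key is reported.
$problems = validateOcrConfig($OCR_SERVICE, $OCR_SPACE_API_KEY);
```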

## Regex Patterns

The implementation uses regex patterns to extract health metrics from the OCR text. These patterns can be customized in the `ocr_config.php` file:

```php
$HEALTH_METRICS_PATTERNS = [
    'weight' => [
        'pattern' => '/(?:weight|wt)[:\s]+(\d+\.?\d*)\s*(?:lbs?|pounds?)/i',
        'group' => 1
    ],
    // ... other patterns ...
];
```
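To show how a pattern table of this shape can be applied to OCR output, here is a minimal sketch. The `weight` entry is copied from the configuration above; the `bmi` entry, the function name `extractMetricsFromText`, and the sample text are illustrative assumptions, and the real `OCRService` may apply the patterns differently.

```php
<?php
// Same structure as $HEALTH_METRICS_PATTERNS; the 'bmi' entry is an assumption.
$patterns = [
    'weight' => [
        'pattern' => '/(?:weight|wt)[:\s]+(\d+\.?\d*)\s*(?:lbs?|pounds?)/i',
        'group' => 1,
    ],
    'bmi' => [
        'pattern' => '/bmi[:\s]+(\d+\.?\d*)/i',
        'group' => 1,
    ],
];

/**
 * Run every pattern against the OCR text and collect the captured numbers.
 * Metrics whose pattern does not match are simply left out of the result.
 */
function extractMetricsFromText(string $text, array $patterns): array
{
    $metrics = [];
    foreach ($patterns as $name => $spec) {
        if (preg_match($spec['pattern'], $text, $match) === 1 && isset($match[$spec['group']])) {
            $metrics[$name] = (float) $match[$spec['group']];
        }
    }
    return $metrics;
}

// Made-up OCR output; "Body Fat" has no pattern here, so it is not extracted.
$ocrText = "Weight: 182.4 lbs\nBMI 24.7\nBody Fat 18%";
$metrics = extractMetricsFromText($ocrText, $patterns);
```

Note that the `weight` pattern requires a pound unit after the number, so text such as `Wt 80 kg` produces no match.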

## Testing

You can test the OCR implementation using the `test_ocr_service.php` script:

1. Generate a test image with known health metrics values
2. Process the image using the OCR service
3. Compare the extracted values with the expected values
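The comparison in step 3 could be sketched as follows. `compareMetrics` and its tolerance parameter are hypothetical and are not taken from `test_ocr_service.php`; the sample values are made up.

```php
<?php
/**
 * Hypothetical comparison helper: report metrics that are missing or that
 * differ from the expected value by more than $tolerance.
 */
function compareMetrics(array $expected, array $extracted, float $tolerance = 0.05): array
{
    $failures = [];
    foreach ($expected as $name => $want) {
        if (!array_key_exists($name, $extracted)) {
            $failures[$name] = 'missing';
        } elseif (abs($extracted[$name] - $want) > $tolerance) {
            $failures[$name] = "expected $want, got {$extracted[$name]}";
        }
    }
    return $failures;
}

$expected  = ['weight' => 182.4, 'bmi' => 24.7, 'body_fat' => 18.0];
$extracted = ['weight' => 182.4, 'bmi' => 24.1]; // e.g. the OCR misread one digit
$failures  = compareMetrics($expected, $extracted);

// An empty $failures array means every expected metric was read correctly.
foreach ($failures as $name => $reason) {
    echo "FAIL $name: $reason\n";
}
```

A small tolerance avoids spurious failures from floating-point rounding while still catching misread digits.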

## Debugging

The OCR implementation includes detailed logging to help with debugging. The logs are stored in the `ocr_debug.log` file.
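A logger of this kind can be quite small. The sketch below assumes that logging is gated by the `$OCR_DEBUG` setting and that entries are appended as timestamped lines; `ocrDebugLog` is a hypothetical function, and the temporary directory is used only so the example does not write into the application folder.

```php
<?php
$OCR_DEBUG = true; // setting from ocr_config.php

/**
 * Hypothetical logger: append a timestamped line when debugging is enabled.
 * Returns true when a line was written, false otherwise.
 */
function ocrDebugLog(string $message, bool $enabled, string $logFile): bool
{
    if (!$enabled) {
        return false;
    }
    $line = sprintf("[%s] %s\n", date('Y-m-d H:i:s'), $message);
    // FILE_APPEND keeps earlier entries; LOCK_EX avoids interleaved writes.
    return file_put_contents($logFile, $line, FILE_APPEND | LOCK_EX) !== false;
}

// Temporary location for this sketch only.
$logFile = sys_get_temp_dir() . '/ocr_debug.log';
ocrDebugLog('OCR returned 42 characters of text', $OCR_DEBUG, $logFile);
```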

## Supported Metrics

The following health metrics are supported:

- Weight (lbs)
- Body Fat Percentage (%)
- Lean Body Mass (lbs)
- Muscle Mass (lbs)
- BMI
- Water Percentage (%)
- Muscle Mass Percentage (%)
- Bone Mass Percentage (%)
- BMR (kcal)
- Visceral Fat
- Body Fat Mass (lbs)
- Bone Mass (lbs)

## Future Improvements

1. **Improved Regex Patterns**: Refine the regex patterns to better handle different formats and units.

2. **Image Preprocessing**: Add image preprocessing steps to improve OCR accuracy, such as contrast enhancement and noise reduction.

3. **Provider Fallback**: Implement a fallback mechanism that tries the other supported OCR providers if the configured one fails.

4. **Custom OCR Model**: Train a custom OCR model specifically for health metrics displays.
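As an illustration of the provider fallback idea in item 3, the following sketch tries providers in order and returns the first non-empty result. It is a possible design, not existing code: `recognizeWithFallback` is hypothetical, and the two closures are stubs standing in for real provider calls.

```php
<?php
/**
 * Try each provider in order and return the first non-empty text.
 * $providers maps a provider name to a callable that takes an image path.
 */
function recognizeWithFallback(string $imagePath, array $providers): array
{
    $errors = [];
    foreach ($providers as $name => $recognize) {
        try {
            $text = trim((string) $recognize($imagePath));
            if ($text !== '') {
                return ['provider' => $name, 'text' => $text, 'errors' => $errors];
            }
            $errors[$name] = 'empty result';
        } catch (Throwable $e) {
            // Record the failure and move on to the next provider.
            $errors[$name] = $e->getMessage();
        }
    }
    return ['provider' => null, 'text' => '', 'errors' => $errors];
}

// Stub providers standing in for real API calls.
$providers = [
    'ocr_space' => function (string $path): string {
        throw new RuntimeException('HTTP 503');
    },
    'tesseract' => fn(string $path): string => "Weight: 182.4 lbs",
];

$result = recognizeWithFallback('scale.jpg', $providers);
```

Keeping the per-provider errors in the result makes it possible to log why earlier providers were skipped.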

## Getting Started

To use the OCR implementation:

1. Sign up for an OCR.space API key at https://ocr.space/
2. Update the `$OCR_SPACE_API_KEY` in `ocr_config.php` with your API key
3. Test the implementation using `test_ocr_service.php`
4. Upload a health metrics image through the app 