# Troy Email Fetcher API with Real-Time Polling

A Flask-based API service that fetches emails from Microsoft Outlook/Exchange using Microsoft Graph API, extracts booking details from Twine scheduling emails, and provides real-time email notifications through a polling service.

## Features

- ✅ **Email Fetching**: Fetch emails from Microsoft Outlook using Graph API
- ✅ **Automatic Extraction**: Extract booking details (name, phone, email, locations, etc.) from Twine emails
- ✅ **Real-Time Polling**: Background service that continuously checks for new emails
- ✅ **Smart Tracking**: Prevents duplicate processing of emails
- ✅ **Callback Notifications**: Automatic notifications when new emails arrive
- ✅ **RESTful API**: Easy-to-use endpoints for email management
- ✅ **State Persistence**: Survives server restarts with state tracking

## Prerequisites

- Python 3.7+
- Microsoft Azure AD App Registration with:
  - Client ID
  - Client Secret (the VALUE, not the ID)
  - Tenant ID
  - Mail.Read permission granted

## Installation

1. **Clone or download the project files**

2. **Install dependencies:**
   ```bash
   pip install flask python-dotenv requests msal
   ```

3. **Configure your credentials:**
   
   Set environment variables (preferred), or edit the defaults in `api.py`:
   ```bash
   export EMAIL_ADDRESS="your-email@domain.com"
   export CLIENT_ID="your-client-id"
   export CLIENT_SECRET="your-client-secret-value"
   export TENANT_ID="your-tenant-id"
   ```

   Or create a `.env` file:
   ```env
   EMAIL_ADDRESS=your-email@domain.com
   CLIENT_ID=your-client-id
   CLIENT_SECRET=your-client-secret-value
   TENANT_ID=your-tenant-id
   ```

## Configuration

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `EMAIL_ADDRESS` | `info@troyrides.com` | Your email address to check |
| `CLIENT_ID` | - | Azure AD Application Client ID |
| `CLIENT_SECRET` | - | Azure AD Application Client Secret (VALUE) |
| `TENANT_ID` | - | Azure AD Tenant ID |
| `DEFAULT_SENDER` | `scheduling@usetwine.com` | Default sender email to filter |
| `DEFAULT_FOLDER` | `inbox` | Email folder to check |
| `DEFAULT_FROM_DATE` | `2025-12-17` | Default date filter |
| `POLLING_INTERVAL` | `60` | Polling interval in seconds |
| `POLLING_AUTO_START` | `true` | Auto-start polling on server start |
| `POLLING_ENABLED` | `true` | Enable/disable polling feature |
| `TRACKING_FILE` | `processed_emails.json` | File to store processed email IDs |
| `CALLBACK_URL` | `None` | External callback URL (optional) |
| `USE_INTERNAL_CALLBACK` | `true` | Use internal callback endpoint |
| `HOST` | `0.0.0.0` | Server host |
| `PORT` | `5000` | Server port |

## Running the API

### Start the server:

```bash
python3 api.py
```

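By default the server binds to `0.0.0.0:5000`, meaning all network interfaces, so it is reachable at `http://<server-address>:5000` unless you configure a different host or port.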

### With environment variables:

```bash
PORT=5000 HOST=0.0.0.0 python3 api.py
```

## API Endpoints

### Health Check

```bash
GET /health
```

**Response:**
```json
{
  "status": "healthy",
  "service": "Troy Email Fetcher API"
}
```

### Fetch Emails

```bash
GET /api/emails?from_date=2025-12-17&sender=scheduling@usetwine.com&limit=100
```

**Query Parameters:**
- `sender` (optional): Email sender to filter (default: scheduling@usetwine.com)
- `folder` (optional): Email folder (default: inbox)
- `unread_only` (optional): Only unread emails (default: false)
- `from_date` (optional): Filter emails from this date (YYYY-MM-DD)
- `limit` (optional): Maximum emails to return (default: 100)

**Response:**
```json
{
  "success": true,
  "count": 5,
  "emails": [
    {
      "id": "AAMkADA5YzVkNjM2...",
      "subject": "Troy Rides: Twine Customer 110123",
      "from": "Twine Updates <scheduling@usetwine.com>",
      "date": "2025-12-27T04:52:57Z",
      "body": "...",
      "extracted_details": {
        "name": "Morena Gonzales",
        "phone": "(562) 606-8046",
        "email": "withlove.morena@gmail.com",
        "address": null,
        "date_time": "2025-12-31 9:30 PM",
        "pick_up_location": "DTLA to WeHo",
        "drop_off_location": "WeHo to DTLA",
        "number_of_passengers": "2"
      }
    }
  ]
}
```
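
The `sender`, `from_date`, and `unread_only` parameters plausibly map onto a Microsoft Graph `$filter` expression. The sketch below shows one way to build such a filter. The function `build_graph_filter` is hypothetical and may differ from what `fetch.py` does; the property names (`receivedDateTime`, `from/emailAddress/address`, `isRead`) are standard Graph message properties.

```python
from datetime import date
from typing import Optional


def build_graph_filter(sender: Optional[str] = None,
                       from_date: Optional[str] = None,
                       unread_only: bool = False) -> str:
    """Return an OData $filter string for Graph's message listing."""
    clauses = []
    if from_date:
        # Validate YYYY-MM-DD (raises ValueError otherwise) and use midnight UTC.
        day = date.fromisoformat(from_date)
        clauses.append(f"receivedDateTime ge {day.isoformat()}T00:00:00Z")
    if sender:
        # OData escapes a single quote inside a string literal by doubling it.
        escaped = sender.replace("'", "''")
        clauses.append(f"from/emailAddress/address eq '{escaped}'")
    if unread_only:
        clauses.append("isRead eq false")
    return " and ".join(clauses)
```

The resulting string would be passed as the `$filter` query parameter; an empty string means no filter is applied.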

### Get Email Summary

```bash
GET /api/emails/summary?from_date=2025-12-17
```

Same parameters as `/api/emails`, but returns only essential fields (no full body).

### Get Email by ID

```bash
GET /api/emails/{email_id}
```
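
For Python clients, the three fetch endpoints above can be addressed with a small URL helper. This is a sketch: `BASE_URL` is the placeholder host used elsewhere in this README, and the helper names are hypothetical. Percent-encoding the email ID is a defensive choice; whether the server route accepts every encoded character is not stated here.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://your-server:5000"  # placeholder host


def emails_url(summary: bool = False, **params) -> str:
    """Build the URL for /api/emails or /api/emails/summary."""
    path = "/api/emails/summary" if summary else "/api/emails"
    cleaned = {}
    for key, value in params.items():
        if value is None:
            continue  # let the server apply its default
        if isinstance(value, bool):
            value = "true" if value else "false"
        cleaned[key] = value
    query = urlencode(cleaned)
    return f"{BASE_URL}{path}?{query}" if query else f"{BASE_URL}{path}"


def email_by_id_url(email_id: str) -> str:
    """Build the URL for /api/emails/{email_id}, percent-encoding the ID."""
    return f"{BASE_URL}/api/emails/{quote(email_id, safe='')}"
```

The returned URL can be fetched with any HTTP client, for example `requests.get(emails_url(from_date="2025-12-17"), timeout=10)`.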

### Polling Service Endpoints

#### Start Polling

```bash
POST /api/polling/start?interval=60
```

**Query Parameters:**
- `interval` (optional): Polling interval in seconds (default: configured interval)

**Response:**
```json
{
  "success": true,
  "message": "Polling started",
  "status": {
    "is_running": true,
    "interval": 60,
    "last_check_time": "2026-01-05T10:00:00",
    "emails_processed_count": 0,
    "sender_email": "scheduling@usetwine.com",
    "folder": "inbox"
  }
}
```

#### Stop Polling

```bash
POST /api/polling/stop
```

#### Get Polling Status

```bash
GET /api/polling/status
```

**Response:**
```json
{
  "success": true,
  "status": {
    "is_running": true,
    "interval": 60,
    "last_check_time": "2026-01-05T10:05:00",
    "emails_processed_count": 5,
    "sender_email": "scheduling@usetwine.com",
    "folder": "inbox",
    "last_error": null,
    "processed_emails_tracked": 5
  }
}
```

#### Update Polling Configuration

```http
POST /api/polling/config
Content-Type: application/json

{
  "interval": 120
}
```
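
The same request can be prepared from Python with the standard library. This is a sketch: `polling_config_request` is a hypothetical helper, and the host is the placeholder used in this README.

```python
import json
import urllib.request


def polling_config_request(interval: int,
                           base_url: str = "http://your-server:5000") -> urllib.request.Request:
    """Prepare a POST /api/polling/config request with a JSON body."""
    if interval <= 0:
        raise ValueError("interval must be a positive number of seconds")
    body = json.dumps({"interval": interval}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/polling/config",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen(request, timeout=10)`.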

#### Email Notification Callback

```http
POST /api/polling/callback
Content-Type: application/json

{
  "email": {
    "id": "email_id",
    "subject": "...",
    "from": "...",
    "date": "...",
    "extracted_details": {...}
  },
  "timestamp": "2026-01-05T10:05:01",
  "source": "email_poller"
}
```

This endpoint is automatically called by the polling service when new emails are detected. You can also call it manually for testing.

## How Polling Works

### Overview

The polling service runs in a background thread and continuously checks for new emails at configurable intervals (default: 60 seconds).

### Flow Diagram

```
┌─────────────────────────────────────────────────┐
│  Background Polling Thread (runs every 60s)   │
│                                                 │
│  1. Check for emails since last check          │
│  2. Filter by sender (scheduling@usetwine.com) │
│  3. Compare with processed_emails.json         │
│  4. Process new emails                         │
│  5. Extract booking details                    │
│  6. Call callback endpoint                     │
│  7. Save state                                 │
│  8. Wait for next interval                     │
└─────────────────────────────────────────────────┘
```

### Step-by-Step Process

1. **Polling Loop Starts**: Background thread begins checking every `POLLING_INTERVAL` seconds (60 by default)
2. **Fetch Emails**: Queries Microsoft Graph API for emails received since the last check time
3. **Filter by Sender**: Only processes emails from `scheduling@usetwine.com`
4. **Check if New**: Compares email IDs with `processed_emails.json`
5. **Process New Emails**: 
   - Extracts booking details (name, phone, email, locations, etc.)
   - Marks as processed
   - Logs all details
6. **Callback Notification**: Automatically calls `/api/polling/callback` with email data
7. **Save State**: Updates `poller_state.json` with last check time and count
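
The steps above can be sketched as a small background poller. This is an illustrative outline, not the code in `email_poller.py`: the names `poll_once` and `Poller` are hypothetical, `fetch` stands in for the Graph query, and `notify` stands in for the callback. State persistence is left out for brevity.

```python
import threading
from typing import Callable, Iterable, Optional, Set


def poll_once(fetch: Callable[[], Iterable[dict]],
              processed_ids: Set[str],
              notify: Callable[[dict], None]) -> int:
    """Run one polling cycle and return the number of new emails handled."""
    new_count = 0
    for email in fetch():
        if email["id"] in processed_ids:
            continue  # already handled in an earlier cycle
        processed_ids.add(email["id"])  # mark as processed
        notify(email)                   # then send the callback
        new_count += 1
    return new_count


class Poller:
    """Runs poll_once on a daemon thread until stop() is called."""

    def __init__(self, fetch, notify, interval: float = 60.0) -> None:
        self.fetch = fetch
        self.notify = notify
        self.interval = interval
        self.processed_ids: Set[str] = set()
        self.last_error: Optional[str] = None
        self._stop = threading.Event()
        self._thread: Optional[threading.Thread] = None

    def start(self) -> None:
        if self._thread and self._thread.is_alive():
            return  # already running
        self._stop.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        if self._thread:
            self._thread.join()

    def _run(self) -> None:
        while not self._stop.is_set():
            try:
                poll_once(self.fetch, self.processed_ids, self.notify)
                self.last_error = None
            except Exception as exc:  # keep the loop alive on transient errors
                self.last_error = str(exc)
            # Event.wait doubles as an interruptible sleep between cycles.
            self._stop.wait(self.interval)
```

Waiting on a `threading.Event` rather than calling `time.sleep` lets `stop()` take effect immediately instead of after the current interval.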

### State Files

The system creates two state files:

1. **`processed_emails.json`**: Tracks which emails have been processed
   ```json
   {
     "processed_emails": {
       "email_id_1": {
         "timestamp": "2026-01-05T10:05:01",
         "email_data": {...}
       }
     }
   }
   ```

2. **`poller_state.json`**: Stores polling state
   ```json
   {
     "last_check_timestamp": "2026-01-05T10:06:00",
     "emails_processed_count": 5,
     "last_updated": "2026-01-05T10:06:00"
   }
   ```
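
A minimal sketch of reading and writing the tracking file in the format shown above. The helper names are hypothetical, and the write-to-temp-then-rename step is an assumption about how to avoid a half-written file after a crash, not a description of the existing implementation.

```python
import json
import os
import tempfile
from datetime import datetime


def load_processed(path: str) -> dict:
    """Return the processed_emails mapping, or {} if the file does not exist."""
    try:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh).get("processed_emails", {})
    except FileNotFoundError:
        return {}


def mark_processed(processed: dict, email: dict) -> None:
    """Record an email under its ID with the time it was processed."""
    processed[email["id"]] = {
        "timestamp": datetime.now().isoformat(timespec="seconds"),
        "email_data": email,
    }


def save_processed(path: str, processed: dict) -> None:
    """Write the file atomically: dump to a temp file, then rename over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump({"processed_emails": processed}, fh, indent=2)
        os.replace(tmp_path, path)  # atomic when both paths are on one filesystem
    except Exception:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```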

## Usage Examples

### 1. Start Polling Service

```bash
curl -X POST 'http://your-server:5000/api/polling/start?interval=60'
```

### 2. Check Polling Status

```bash
curl 'http://your-server:5000/api/polling/status'
```

### 3. Fetch Emails Manually

```bash
curl 'http://your-server:5000/api/emails?from_date=2025-12-17'
```

### 4. Get Email Summary

```bash
curl 'http://your-server:5000/api/emails/summary?from_date=2025-12-17'
```

### 5. Stop Polling

```bash
curl -X POST 'http://your-server:5000/api/polling/stop'
```

### 6. Test Callback Endpoint

```bash
curl -X POST 'http://your-server:5000/api/polling/callback' \
  -H 'Content-Type: application/json' \
  -d '{
    "email": {
      "id": "test-123",
      "subject": "Test Email",
      "from": "test@example.com",
      "date": "2026-01-05T10:00:00Z",
      "extracted_details": {
        "name": "Test User",
        "phone": "123-456-7890"
      }
    },
    "timestamp": "2026-01-05T10:00:00",
    "source": "manual_test"
  }'
```

## Callback Notifications

### How It Works

When a new email is detected, the polling service automatically calls the callback endpoint with the email data.

### Default Behavior

By default, the system uses an **internal callback endpoint** (`/api/polling/callback`) that:
- Receives email notifications
- Logs all email details to `logs/api.log`
- Returns a confirmation response

### Custom Callback URL

To use an external webhook:

```bash
export CALLBACK_URL="https://your-external-api.com/webhook/emails"
```

### Callback Payload

The callback receives this JSON structure:

```json
{
  "email": {
    "id": "AAMkADA5YzVkNjM2...",
    "subject": "Troy Rides: Twine Customer 110123",
    "from": "Twine Updates <scheduling@usetwine.com>",
    "date": "2025-12-27T04:52:57Z",
    "extracted_details": {
      "name": "Morena Gonzales",
      "phone": "(562) 606-8046",
      "email": "withlove.morena@gmail.com",
      "address": null,
      "date_time": "2025-12-31 9:30 PM",
      "pick_up_location": "DTLA to WeHo",
      "drop_off_location": "WeHo to DTLA",
      "number_of_passengers": "2"
    }
  },
  "timestamp": "2026-01-05T10:05:01",
  "source": "email_poller"
}
```
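
The payload above could be assembled and posted roughly as follows. This is a sketch with hypothetical helper names. In particular, the precedence shown in `callback_target` (an external `CALLBACK_URL` wins over the internal endpoint) is an assumption and is not confirmed by this README.

```python
import json
import urllib.request
from datetime import datetime
from typing import Optional


def build_payload(email: dict, source: str = "email_poller") -> dict:
    """Wrap an email dict in the callback envelope shown above."""
    return {
        "email": email,
        "timestamp": datetime.now().isoformat(timespec="seconds"),
        "source": source,
    }


def callback_target(callback_url: Optional[str],
                    use_internal: bool = True,
                    host: str = "127.0.0.1",
                    port: int = 5000) -> Optional[str]:
    """Pick where to send the notification (assumed precedence: external first)."""
    if callback_url:
        return callback_url
    if use_internal:
        return f"http://{host}:{port}/api/polling/callback"
    return None


def send_callback(url: str, payload: dict, timeout: float = 10.0) -> int:
    """POST the payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```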

### Extending the Callback

You can add custom logic in `api.py` in the `email_notification_callback()` function (around line 676):

```python
# Add your custom processing:
# - Save to database
# - Send to another service
# - Trigger webhooks
# - Send notifications (SMS, email, etc.)
```
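
As one example of custom processing, the sketch below validates the payload and appends the booking details to a JSON Lines file. The function `handle_notification` and the file name `bookings.jsonl` are hypothetical; the function is framework-independent, so it could be called from inside `email_notification_callback()` with the parsed request body.

```python
import json
from pathlib import Path

REQUIRED_KEYS = ("email", "timestamp", "source")


def handle_notification(payload: dict, out_path: str = "bookings.jsonl") -> dict:
    """Validate a callback payload and append one record per email to a file."""
    missing = [key for key in REQUIRED_KEYS if key not in payload]
    if missing:
        raise ValueError(f"payload is missing keys: {', '.join(missing)}")
    email = payload["email"]
    record = {
        "email_id": email.get("id"),
        "received": payload["timestamp"],
        "details": email.get("extracted_details") or {},
    }
    # Append mode keeps earlier records; each line is one JSON document.
    with Path(out_path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record
```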

## Logging

All activities are logged to `logs/api.log`:

- Email fetching operations
- Polling service activities
- Email processing details
- Callback notifications
- Errors and warnings

### Monitor Logs

```bash
# Watch logs in real-time
tail -f logs/api.log

# Filter for email notifications
tail -f logs/api.log | grep "EMAIL NOTIFICATION"

# Filter for polling activities
tail -f logs/api.log | grep "polling"
```
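
If you add your own modules and want them to write to the same location, a file logger can be set up along these lines. This is a sketch: the logger name `troy_email_api` and the message format are assumptions, not values taken from `api.py`.

```python
import logging
from pathlib import Path


def configure_logging(log_dir: str = "logs",
                      filename: str = "api.log",
                      level: int = logging.INFO) -> logging.Logger:
    """Create the log directory if needed and attach a single file handler."""
    Path(log_dir).mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger("troy_email_api")  # hypothetical logger name
    logger.setLevel(level)
    if not logger.handlers:  # avoid duplicate lines if called more than once
        handler = logging.FileHandler(str(Path(log_dir) / filename), encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger
```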

## Troubleshooting

### Polling Not Starting

1. **Check if polling is enabled:**
   ```bash
   curl 'http://your-server:5000/api/polling/status'
   ```

2. **Check logs:**
   ```bash
   tail -f logs/api.log
   ```

3. **Manually start polling:**
   ```bash
   curl -X POST 'http://your-server:5000/api/polling/start'
   ```

### No Emails Found

1. **Verify authentication:**
   ```bash
   curl 'http://your-server:5000/api/debug'
   ```

2. **Check date filter:**
   - Make sure `from_date` is not later than the emails you expect to see
   - Try without a date filter: `GET /api/emails`

3. **Verify sender email:**
   - Check if emails are from `scheduling@usetwine.com`
   - Try a different sender filter

### 401 Unauthorized Errors

The system automatically handles token expiration and re-authenticates. If you see persistent 401 errors:

1. **Check credentials:**
   - Verify that `CLIENT_SECRET` is the secret VALUE, not the Secret ID
   - Ensure `CLIENT_ID` and `TENANT_ID` match your Azure AD app registration

2. **Check permissions:**
   - Verify that the Mail.Read permission is granted
   - Confirm that admin consent has been granted
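
The re-authentication behaviour described above amounts to retrying once with a fresh token. The sketch below shows that pattern with injected callables, so it makes no claim about the actual code in `fetch.py`. `call_with_reauth`, `request`, and `get_token` are hypothetical names; with MSAL, `get_token` could wrap `ConfidentialClientApplication.acquire_token_for_client`.

```python
from typing import Callable, Tuple


def call_with_reauth(request: Callable[[str], Tuple[int, dict]],
                     get_token: Callable[[bool], str]) -> Tuple[int, dict]:
    """Call request(token); on a 401, fetch a fresh token and retry exactly once.

    get_token(force_refresh) returns an access token, bypassing any cache
    when force_refresh is True.
    """
    token = get_token(False)
    status, body = request(token)
    if status == 401:
        token = get_token(True)
        status, body = request(token)
    return status, body
```

If the second attempt also returns 401, the problem is likely in the credentials or permissions rather than an expired token, which is when the checks listed above apply.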

### Callback Not Working

1. **Check if callback URL is set:**
   - Default: Uses internal callback automatically
   - Check logs for callback attempts

2. **Test callback manually:**
   ```bash
   curl -X POST 'http://your-server:5000/api/polling/callback' \
     -H 'Content-Type: application/json' \
     -d '{"email": {"id": "test"}, "timestamp": "2026-01-05T10:00:00", "source": "test"}'
   ```

3. **Check callback logs:**
   ```bash
   tail -f logs/api.log | grep "EMAIL NOTIFICATION"
   ```

## File Structure

```
.
├── api.py                 # Main Flask API application
├── fetch.py               # Email fetching and extraction logic
├── email_poller.py        # Polling service implementation
├── processed_emails.json  # Tracks processed emails (auto-generated)
├── poller_state.json      # Polling state (auto-generated)
├── logs/
│   └── api.log           # Application logs
└── README.md             # This file
```

## Security Notes

- **Never commit credentials**: Use environment variables or a `.env` file, and add `.env` to `.gitignore`
- **Client Secret**: Use the SECRET VALUE, not the Secret ID
- **HTTPS**: Use HTTPS in production
- **Authentication**: Consider adding authentication to API endpoints in production

## Production Deployment

### Using Gunicorn

```bash
pip install gunicorn
gunicorn -w 4 -b 0.0.0.0:5000 api:app
```

Note that each Gunicorn worker is a separate process. If the polling service starts when the `api` module is loaded, `-w 4` would run four pollers writing to the same state files, so consider `-w 1` for the process that polls. If polling only starts when `api.py` is run directly, start it explicitly with `POST /api/polling/start`.

### Using systemd

Create `/etc/systemd/system/troy-email-api.service`:

```ini
[Unit]
Description=Troy Email Fetcher API
After=network.target

[Service]
Type=simple
User=your-user
WorkingDirectory=/path/to/project
Environment="PATH=/path/to/venv/bin"
ExecStart=/path/to/venv/bin/python api.py
Restart=always

[Install]
WantedBy=multi-user.target
```

Then:
```bash
sudo systemctl enable troy-email-api
sudo systemctl start troy-email-api
```

### Using Supervisor

Create `/etc/supervisor/conf.d/troy-email-api.conf`:

```ini
[program:troy-email-api]
command=/path/to/venv/bin/python /path/to/project/api.py
directory=/path/to/project
user=your-user
autostart=true
autorestart=true
stderr_logfile=/var/log/troy-email-api.err.log
stdout_logfile=/var/log/troy-email-api.out.log
```

## Support

For issues or questions:
1. Check the logs: `logs/api.log`
2. Review the troubleshooting section
3. Verify your Microsoft Graph API credentials and permissions

## License

[Your License Here]
