# Development Applications RAG Chatbot

A Retrieval-Augmented Generation (RAG) chatbot that answers questions about development applications, planning data, and council information, using your existing database as the knowledge source.

## Features

- **RAG-powered responses**: Uses LangChain and OpenAI to generate contextual responses
- **FAISS vector database**: Fast similarity search for relevant information
- **Database integration**: Leverages your existing development applications data
- **RESTful API**: Easy integration with any frontend application
- **Web interface**: Responsive, browser-based chat interface
- **Source attribution**: Shows which applications/documents were used for responses
- **Confidence scoring**: Indicates how confident the system is in its response

## Architecture

```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   User Query    │───▶│  RAG Service     │───▶│  FAISS Vector   │
│                 │    │                  │    │   Database      │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                              │
                              ▼
                       ┌──────────────────┐
                       │  OpenAI LLM      │
                       │  (GPT-3.5-turbo) │
                       └──────────────────┘
                              │
                              ▼
                       ┌──────────────────┐
                       │  Contextual      │
                       │  Response        │
                       └──────────────────┘
```

## Setup Instructions

### 1. Install Dependencies

```bash
pip install -r requirements.txt
```

### 2. Environment Configuration

Make sure your Django settings include:
- `OPENAI_API_KEY`: Your OpenAI API key
- Database configuration (already configured)
- CORS settings (already configured)
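
One way to supply the key without committing it to source control is to read it from an environment variable in `settings.py`. The following is a minimal sketch, assuming the environment variable is also named `OPENAI_API_KEY`; the helper `read_required_setting` is hypothetical and not part of this project.

```python
import os


def read_required_setting(name, environ=None):
    """Return a non-empty setting from the environment, or fail early."""
    environ = os.environ if environ is None else environ
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Missing required setting: {name}")
    return value


# Assumed usage inside settings.py:
# OPENAI_API_KEY = read_required_setting("OPENAI_API_KEY")
```

Failing at startup makes a missing key visible immediately rather than on the first chat request.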

### 3. Build Vector Store

Before using the chatbot, build the FAISS vector store from your database:

```bash
python manage.py build_vector_store
```

To force a rebuild (for example, after adding new data):

```bash
python manage.py build_vector_store --force
```

### 4. Run the Server

```bash
python manage.py runserver
```

## API Endpoints

### Chat Endpoint
- **URL**: `/api/chat/`
- **Method**: `POST`
- **Body**: `{"message": "Your question here"}`
- **Response**: 
```json
{
    "success": true,
    "response": "Generated response",
    "sources": [
        {
            "application_id": "D-101-2023",
            "council_name": "City of Darebin",
            "document_type": "Application Form",
            "source_type": "extracted_data"
        }
    ],
    "confidence": "high"
}
```
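
A client might call this endpoint along the following lines. This is a sketch using only the Python standard library; it assumes the server is running at `http://localhost:8000`, and the function names (`build_chat_request`, `summarize_chat_response`, `ask`) are illustrative rather than part of the project.

```python
import json
import urllib.request

CHAT_URL = "http://localhost:8000/api/chat/"  # assumes the default runserver host and port


def build_chat_request(message, url=CHAT_URL):
    """Build a POST request carrying the documented {"message": ...} body."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize_chat_response(raw_json):
    """Extract the answer, confidence, and cited application IDs from a response."""
    data = json.loads(raw_json)
    if not data.get("success"):
        raise ValueError("chat request was not successful")
    source_ids = [s.get("application_id") for s in data.get("sources", [])]
    return data["response"], data.get("confidence"), source_ids


def ask(message):
    """Send one question to a running server and return the summarized reply."""
    with urllib.request.urlopen(build_chat_request(message), timeout=60) as resp:
        return summarize_chat_response(resp.read().decode("utf-8"))
```

With the server running, `ask("Tell me about applications in City of Darebin")` would return a tuple of the answer text, the confidence label, and the list of source application IDs.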

### Statistics Endpoint
- **URL**: `/api/chatbot/stats/`
- **Method**: `GET`
- **Response**: Database statistics and vector store status

### Rebuild Vector Store
- **URL**: `/api/chatbot/rebuild/`
- **Method**: `POST`
- **Response**: Confirmation of rebuild completion

### Search Applications
- **URL**: `/api/applications/search/?q=query&council=council_name`
- **Method**: `GET`
- **Parameters**: 
  - `q`: Search query
  - `council`: Filter by council
  - `decision`: Filter by decision
  - `development_type`: Filter by development type
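
Filter values such as council names contain spaces, so they need URL encoding. The sketch below builds a search URL from the documented parameters; the base URL assumes a local development server, and `build_search_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

SEARCH_URL = "http://localhost:8000/api/applications/search/"  # assumed local dev host


def build_search_url(q, council=None, decision=None, development_type=None):
    """Return the search URL, including only the filters that were provided."""
    params = {"q": q}
    optional = {
        "council": council,
        "decision": decision,
        "development_type": development_type,
    }
    params.update({key: value for key, value in optional.items() if value})
    return f"{SEARCH_URL}?{urlencode(params)}"
```

For example, `build_search_url("garage", council="City of Darebin")` encodes the council name as `City+of+Darebin`.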

### Get Application Details
- **URL**: `/api/applications/<application_id>/`
- **Method**: `GET`
- **Response**: Detailed application information with PDF data

### Get Councils
- **URL**: `/api/councils/`
- **Method**: `GET`
- **Response**: List of all councils in the database

### Get Development Types
- **URL**: `/api/development-types/`
- **Method**: `GET`
- **Response**: List of all development types

## Web Interface

Access the chatbot web interface at: `http://localhost:8000/chatbot/`

Features:
- Real-time chat interface
- Message history
- Source attribution display
- Confidence indicators
- Statistics display
- Responsive design

## Example Queries

The chatbot can answer questions like:

- "How many development applications are there in the database?"
- "What types of developments are most common?"
- "Tell me about applications in City of Darebin"
- "What are the recent planning decisions?"
- "Show me applications with high costs"
- "What development applications involve residential buildings?"
- "Tell me about the applicant for application D-101-2023"
- "What are the lot sizes for recent applications?"

## How It Works

1. **Document Creation**: The system creates documents from your database data including:
   - Application metadata (ID, council, decision, dates, costs, etc.)
   - Extracted PDF data (land descriptions, applicant info, development details)

2. **Vectorization**: Documents are split into chunks and converted to embeddings using SentenceTransformers

3. **Storage**: Embeddings are stored in a FAISS vector database for fast similarity search

4. **Retrieval**: When a user asks a question, the system:
   - Converts the query to embeddings
   - Searches for similar documents in the vector store
   - Retrieves the most relevant context

5. **Generation**: The retrieved context is sent to OpenAI's GPT-3.5-turbo along with the user's question to generate a contextual response

6. **Response**: The system returns the generated response along with source information and confidence level
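
The retrieval and prompt-building steps above can be illustrated with a dependency-free toy. This is not the project's implementation: bag-of-words count vectors stand in for SentenceTransformers embeddings, a sorted list stands in for the FAISS index, and the example records are made up. It only shows the shape of the flow (embed, rank by similarity, build a prompt from the top results).

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query, best first."""
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda doc: cosine(query_vec, embed(doc)), reverse=True)
    return ranked[:k]


def build_prompt(query, context_docs):
    """Combine retrieved context and the question into a single LLM prompt."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Made-up example records, for illustration only.
docs = [
    "Application D-101-2023 in City of Darebin proposes a residential building.",
    "Council meeting minutes about parking permits.",
    "Application for a commercial warehouse extension.",
]
top = retrieve("residential building application in Darebin", docs, k=1)
prompt = build_prompt("Which application involves a residential building?", top)
```

In the real system the prompt would then be sent to the LLM, and the retrieved documents would supply the source attribution returned with the answer.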

## Configuration

### RAG Service Configuration

You can modify the RAG service behavior in `shaoApp/rag_service.py`:

- **Chunk size**: Modify `chunk_size` in `RecursiveCharacterTextSplitter`
- **Embedding model**: Change `model_name` in `HuggingFaceEmbeddings`
- **LLM model**: Change `model` in `ChatOpenAI`
- **Temperature**: Adjust `temperature` for response creativity
- **Search results**: Modify `k` parameter in `search_similar_documents`
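
To get a feel for how chunk size affects the number of chunks (and therefore the number of embeddings to compute and store), here is a simplified sketch. It is a plain fixed-size character splitter with overlap, not the algorithm used by `RecursiveCharacterTextSplitter`, which prefers to split on separators; the sizes shown are arbitrary.

```python
def split_text(text, chunk_size=500, chunk_overlap=50):
    """Split text into fixed-size character chunks that overlap their neighbours."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the final chunk already reaches the end of the text
        start += step
    return chunks


sample = "x" * 1000
large_chunks = split_text(sample, chunk_size=500, chunk_overlap=50)
small_chunks = split_text(sample, chunk_size=200, chunk_overlap=20)
```

Smaller chunks give more precise matches but produce more vectors; larger chunks carry more context per match but use more of the LLM's context window.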

### Vector Store Location

The FAISS vector store is saved in: `media/faiss_index/`

## Troubleshooting

### Common Issues

1. **Vector store not found**: Run `python manage.py build_vector_store`
2. **OpenAI API errors**: Check your API key in settings
3. **Memory issues**: Reduce the chunk size or use a smaller embedding model
4. **Slow responses**: Consider using a GPU for embeddings or reducing the number of search results

### Performance Optimization

- Use a GPU for embeddings if one is available
- Adjust chunk size based on your data
- Implement caching for frequently asked questions
- Consider using a smaller embedding model for faster processing
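
Caching for frequently asked questions could be sketched as below, using an in-process `functools.lru_cache`. This is one possible approach, not something the project currently does; `make_cached_answerer` and `fake_rag_answer` are hypothetical names, and the fake function stands in for the real RAG call.

```python
from functools import lru_cache


def normalize_question(question):
    """Lowercase and collapse whitespace so trivial variants share a cache entry."""
    return " ".join(question.lower().split())


def make_cached_answerer(answer_fn, maxsize=256):
    """Wrap answer_fn so repeated (normalized) questions skip the expensive call."""
    cached = lru_cache(maxsize=maxsize)(answer_fn)

    def answer(question):
        return cached(normalize_question(question))

    return answer


# Demonstration with a stand-in for the real RAG call.
calls = []


def fake_rag_answer(question):
    calls.append(question)
    return f"answer to: {question}"


answer = make_cached_answerer(fake_rag_answer)
first = answer("How many applications are there?")
second = answer("  how many   applications are there? ")
```

An in-process cache is per worker and would need clearing after the vector store is rebuilt; a shared cache is worth considering for multi-process deployments.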

## Security Considerations

- The API is currently open (`AllowAny` permissions); implement authentication before deploying to production
- Consider rate limiting for API endpoints
- Validate and sanitize user inputs
- Monitor API usage and costs
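
Input validation for the chat endpoint might look like the following sketch. The length limit and the function name `validate_chat_payload` are assumptions for illustration; adjust them to the project's needs.

```python
MAX_MESSAGE_LENGTH = 2000  # arbitrary illustrative limit


def validate_chat_payload(payload):
    """Return a cleaned message string, or raise ValueError with a reason."""
    if not isinstance(payload, dict):
        raise ValueError("request body must be a JSON object")
    message = payload.get("message")
    if not isinstance(message, str):
        raise ValueError("'message' must be a string")
    # Drop non-printable characters (keeping newlines and tabs), then trim.
    cleaned = "".join(ch for ch in message if ch.isprintable() or ch in "\n\t").strip()
    if not cleaned:
        raise ValueError("'message' must not be empty")
    if len(cleaned) > MAX_MESSAGE_LENGTH:
        raise ValueError(f"'message' exceeds {MAX_MESSAGE_LENGTH} characters")
    return cleaned
```

Capping message length also bounds the size of each prompt sent to the LLM, which helps keep API costs predictable.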

## Future Enhancements

- User authentication and session management
- Conversation history persistence
- Advanced filtering and search capabilities
- Export functionality for search results
- Integration with external planning databases
- Multi-language support
- Voice interface
- Mobile app

## Support

For issues or questions:
1. Check the Django logs for error messages
2. Verify your database has data
3. Ensure the vector store is built
4. Check your OpenAI API key and quota

## License

This chatbot is part of your development applications system and follows the same licensing terms. 