Machine learning example with .pkl files and Flask

Below is a quick example of setting up a machine learning model using Python. You train a model, create a scoring script, wrap it in a Flask API, and host the API endpoint in a Docker container. JSON posted to this API is scored by the model, and the prediction is returned in the response.

Instructions

Step 1: Create and Save the Model

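Create a script train_model.py in the project root to train and save the model. The script writes to app/model.pkl, so create the app/ directory first; otherwise open() fails with FileNotFoundError: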

import pickle
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load the Iris dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a simple Logistic Regression model
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)

# Save the model to a file
with open('app/model.pkl', 'wb') as f:
    pickle.dump(model, f)

Run this script from the project root to generate app/model.pkl:
```
python train_model.py
```
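
The split in train_model.py sets aside X_test and y_test, but the script never uses them. A minimal sketch of scoring the model on that held-out data, and of confirming that a pickled copy predicts the same values, might look like the following. The function name train_and_evaluate is hypothetical, and the sketch pickles to memory rather than to app/model.pkl.

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def train_and_evaluate():
    """Train the same model as train_model.py and score it on the test split."""
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    model = LogisticRegression(max_iter=200)
    model.fit(X_train, y_train)

    # Accuracy on data the model did not see during training
    accuracy = accuracy_score(y_test, model.predict(X_test))

    # Round-trip through pickle (in memory) and confirm predictions match
    restored = pickle.loads(pickle.dumps(model))
    same_predictions = (restored.predict(X_test) == model.predict(X_test)).all()
    return accuracy, bool(same_predictions)


accuracy, same_predictions = train_and_evaluate()
print(f"Test accuracy: {accuracy:.3f}")
print(f"Pickled model matches original: {same_predictions}")
```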

Step 2: Create the Scoring Script

Create a scoring script scoring_script.py in the app/ directory:

import pandas as pd

def score_function(model, input_data):
    """
    This function takes a model and input data (as a DataFrame) and returns the predictions.
    """
    predictions = model.predict(input_data)
    return predictions
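
score_function passes its input straight to model.predict, so a missing or non-numeric feature only surfaces as an exception raised inside scikit-learn. One option is to check the payload before scoring. The sketch below is a hypothetical helper (validate_payload and EXPECTED_FEATURES are not part of the original scripts) that assumes the four keys used in input.json and returns the values in a fixed order.

```python
from numbers import Real

# Assumed feature keys, in the order the model was trained on
EXPECTED_FEATURES = ("feature1", "feature2", "feature3", "feature4")


def validate_payload(data, expected=EXPECTED_FEATURES):
    """Return the feature values as a list of floats in the expected order.

    Raises ValueError if the payload is not a dict, has missing or
    unexpected keys, or contains a non-numeric value.
    """
    if not isinstance(data, dict):
        raise ValueError("payload must be a JSON object")

    missing = [key for key in expected if key not in data]
    extra = [key for key in data if key not in expected]
    if missing or extra:
        raise ValueError(f"missing keys: {missing}, unexpected keys: {extra}")

    row = []
    for key in expected:
        value = data[key]
        # bool is a subclass of int, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, Real):
            raise ValueError(f"{key} must be a number, got {value!r}")
        row.append(float(value))
    return row
```

The returned list can be wrapped as [row] and passed to model.predict, which avoids depending on the key order of the incoming JSON.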

Step 3: Create the Flask API

Create main.py in the app/ directory:

from flask import Flask, request, jsonify
import pickle
import pandas as pd
from scoring_script import score_function  # Assuming your scoring logic is in scoring_script.py

app = Flask(__name__)

# Define your API token
API_TOKEN = 'your_secure_api_token'

# Load the model
with open('app/model.pkl', 'rb') as f:
    model = pickle.load(f)

def check_token(request):
    # The client must send the raw token as the Authorization header value
    return request.headers.get('Authorization') == API_TOKEN

@app.route('/predict', methods=['POST'])
def predict():
    if not check_token(request):
        return jsonify({'error': 'Unauthorized access'}), 401

    try:
        data = request.get_json(force=True)
        # Build a one-row DataFrame from the JSON object. The model was fitted on a
        # plain array, so the keys must arrive in the same order as the training features.
        df = pd.DataFrame([data])
        prediction = score_function(model, df)
        return jsonify({'prediction': prediction.tolist()})
    except Exception as e:
        # Return an error status (not the default 200) so clients can detect the failure
        return jsonify({'error': str(e)}), 400

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
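
main.py hard-codes API_TOKEN and compares it with ==. A common alternative is to read the token from an environment variable and compare it in constant time. The sketch below uses only the standard library. The environment variable name API_TOKEN and the helpers load_token and is_authorized are assumptions for illustration, not part of the original code.

```python
import hmac
import os


def load_token(env_var="API_TOKEN"):
    """Read the expected token from the environment; fail early if unset."""
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"environment variable {env_var} is not set")
    return token


def is_authorized(header_value, expected_token):
    """Compare the Authorization header value with the expected token.

    hmac.compare_digest takes time independent of where the inputs differ,
    which avoids leaking information through response timing.
    """
    if header_value is None:
        return False
    return hmac.compare_digest(
        header_value.encode("utf-8"), expected_token.encode("utf-8")
    )
```

Inside check_token this would be called as is_authorized(request.headers.get('Authorization'), load_token()), with the token supplied to the container through docker run -e.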

Step 4: Create `requirements.txt`

List the necessary dependencies:
```
flask
pandas
scikit-learn
```
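
Because these requirements are unpinned, the container may install a different scikit-learn version from the one that produced model.pkl, and scikit-learn does not guarantee that pickled models load correctly across versions. One way to record the versions from the training environment is sketched below. pinned_requirements is a hypothetical helper built on the standard library's importlib.metadata.

```python
from importlib.metadata import PackageNotFoundError, version


def pinned_requirements(packages):
    """Return 'name==version' lines for the packages installed locally.

    Packages that are not installed are left unpinned.
    """
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            lines.append(name)
    return lines


print("\n".join(pinned_requirements(["flask", "pandas", "scikit-learn"])))
```

Running this in the environment used for train_model.py and pasting the output into requirements.txt keeps the container on the same versions.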

Step 5: Create the Dockerfile

Create a Dockerfile in the project root:

# Use the official Python image from the Docker Hub
FROM python:3.9-slim

# Set the working directory in the container
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY . /app

# Install any needed packages specified in requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

# Make port 5000 available to the world outside this container
EXPOSE 5000

# Tell the flask command which file contains the application
ENV FLASK_APP=app/main.py

# Start the Flask development server when the container launches
CMD ["flask", "run", "--host=0.0.0.0"]

Step 6: Create the Input JSON

Create input.json in the project root:

{
    "feature1": 5.1,
    "feature2": 3.5,
    "feature3": 1.4,
    "feature4": 0.2
}
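
The four values above match the first sample of the Iris dataset (sepal length, sepal width, petal length, petal width, in centimetres). The model never sees the key names feature1 to feature4 during training, so only the order of the values matters. A small sketch of producing the file from Python under that assumption follows; build_payload and write_input_json are hypothetical helpers.

```python
import json

# Iris measurements in the dataset's column order (cm)
SAMPLE = [5.1, 3.5, 1.4, 0.2]


def build_payload(values):
    """Map positional feature values to the keys feature1, feature2, ..."""
    return {f"feature{i}": value for i, value in enumerate(values, start=1)}


def write_input_json(path, values):
    """Write the payload to `path` as indented JSON and return the payload."""
    payload = build_payload(values)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(payload, f, indent=4)
    return payload
```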

Step 7: Build and Run the Docker Container

Navigate to your project directory:
```
cd project
```
Build the Docker image:
```
docker build -t my-scoring-app .
```
Run the Docker container:
```
docker run -p 5000:5000 my-scoring-app
```

Step 8: Make an Authenticated Request

Use curl to send a request with the JSON file:

curl -X POST -H "Content-Type: application/json" -H "Authorization: your_secure_api_token" -d @input.json http://localhost:5000/predict

Replace your_secure_api_token with the token you defined in main.py. The command sends the contents of input.json as the request body, and the JSON response contains the model's prediction.
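
The same request can be built from Python with the standard library, which is convenient for scripted checks. The sketch below assumes the URL and header layout of the curl command; build_request and predict are hypothetical helper names.

```python
import json
import urllib.request


def build_request(payload, token, url="http://localhost:5000/predict"):
    """Build a POST request equivalent to the curl command."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # main.py expects the raw token, without a "Bearer " prefix
            "Authorization": token,
        },
        method="POST",
    )


def predict(payload, token, url="http://localhost:5000/predict"):
    """Send the request and return the decoded JSON response.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses such as 401.
    """
    request = build_request(payload, token, url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the container running, predict(payload, token) returns the parsed response as a dict, where payload is the dict loaded from input.json.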