Machine Learning Example with .pkl Files and Flask

Below is a quick example of setting up a machine learning model using Python. You train a model, create a scoring script, wrap it in a Flask API, and host the API endpoint in a Docker container. JSON posted to this API is scored by the model, and the prediction is returned as JSON.

Instructions

Step 1: Create and Save the Model

  1. Create a script train_model.py in the project root to train and save the model:

    import os
    import pickle
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    
    # Load the Iris dataset
    iris = load_iris()
    X, y = iris.data, iris.target
    
    # Split the data into training and test sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    
    # Train a simple Logistic Regression model
    model = LogisticRegression(max_iter=200)
    model.fit(X_train, y_train)
    
    # Save the model to app/model.pkl, creating the app/ directory if it does not exist yet
    os.makedirs('app', exist_ok=True)
    with open('app/model.pkl', 'wb') as f:
        pickle.dump(model, f)
    
  2. Run this script from the project root to generate app/model.pkl:

    python train_model.py
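
The split in train_model.py leaves X_test and y_test unused. As a minimal sketch, and not part of the original script, the held-out set could be used to sanity-check the model and to confirm that a pickle round trip preserves its predictions. The names test_accuracy, restored, and same_predictions are illustrative.

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Same data, split, and model settings as train_model.py
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)

# Accuracy on the 20% of samples the model never saw during training
test_accuracy = accuracy_score(y_test, model.predict(X_test))

# Serialize to bytes and back, then compare predictions before and after
restored = pickle.loads(pickle.dumps(model))
same_predictions = bool((restored.predict(X_test) == model.predict(X_test)).all())

print(f"held-out accuracy: {test_accuracy:.3f}")
print(f"round trip preserves predictions: {same_predictions}")
```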
    

Step 2: Create the Scoring Script

  1. Create a scoring script scoring_script.py in the app/ directory:

    def score_function(model, input_data):
        """
        Take a fitted model and input data (a pandas DataFrame, one row per
        sample) and return the model's predictions.
        """
        predictions = model.predict(input_data)
        return predictions
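
score_function returns the numeric class indices produced by the model. For the Iris data used in Step 1, those indices correspond to species names. A minimal sketch of that mapping follows. ConstantModel is a hypothetical stand-in for the unpickled model so the example runs without model.pkl, and label_predictions is an illustrative helper, not part of the original scripts.

```python
# Species names in the order scikit-learn's load_iris() encodes them (0, 1, 2).
IRIS_CLASS_NAMES = ["setosa", "versicolor", "virginica"]


class ConstantModel:
    """Hypothetical stand-in for the unpickled model: always predicts class 0."""

    def predict(self, rows):
        return [0 for _ in rows]


def score_function(model, input_data):
    """Same contract as scoring_script.score_function: delegate to model.predict."""
    return model.predict(input_data)


def label_predictions(predictions, class_names=IRIS_CLASS_NAMES):
    """Translate numeric class indices into human-readable labels."""
    return [class_names[int(index)] for index in predictions]


rows = [[5.1, 3.5, 1.4, 0.2]]
labels = label_predictions(score_function(ConstantModel(), rows))
print(labels)  # ['setosa'], because the stand-in always predicts class 0
```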
    

Step 3: Create the Flask API

  1. Create main.py in the app/ directory:

    import os
    import pickle
    
    import pandas as pd
    from flask import Flask, request, jsonify
    
    from scoring_script import score_function  # scoring logic lives in app/scoring_script.py
    
    app = Flask(__name__)
    
    # Define your API token (placeholder; replace it with your own secret value)
    API_TOKEN = 'your_secure_api_token'
    
    # Load the model once at startup, relative to this file rather than the working directory
    MODEL_PATH = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'model.pkl')
    with open(MODEL_PATH, 'rb') as f:
        model = pickle.load(f)
    
    def check_token(request):
        """Return True if the Authorization header matches API_TOKEN."""
        return request.headers.get('Authorization') == API_TOKEN
    
    @app.route('/predict', methods=['POST'])
    def predict():
        if not check_token(request):
            return jsonify({'error': 'Unauthorized access'}), 401
    
        try:
            data = request.get_json(force=True)
            # One JSON object becomes one row; column order follows the key order in the request body
            df = pd.DataFrame([data])
            prediction = score_function(model, df)
            return jsonify({'prediction': prediction.tolist()})
        except Exception as e:
            # Report failures with an error status instead of the default 200
            return jsonify({'error': str(e)}), 400
    
    if __name__ == '__main__':
        app.run(host='0.0.0.0', port=5000)
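
The token check in main.py compares plain strings and keeps the token in the source file. One possible hardening, sketched here under assumptions, is to read the token from an environment variable and compare it with hmac.compare_digest from the standard library. The environment variable name API_TOKEN and the helper token_is_valid are illustrative choices, not part of the original main.py.

```python
import hmac
import os

# Assumption: the token is supplied through an environment variable named
# API_TOKEN; the fallback is the placeholder used in main.py.
API_TOKEN = os.environ.get("API_TOKEN", "your_secure_api_token")


def token_is_valid(header_value, expected=API_TOKEN):
    """Return True only if the Authorization header equals the expected token."""
    if header_value is None:
        # request.headers.get('Authorization') returns None when the header is absent.
        return False
    # compare_digest avoids short-circuiting on the first differing character,
    # which makes timing-based guessing of the token harder.
    return hmac.compare_digest(header_value.encode("utf-8"), expected.encode("utf-8"))
```

Inside check_token, the body could then become `return token_is_valid(request.headers.get('Authorization'))`.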
    

Step 4: Create requirements.txt

  1. Create requirements.txt in the project root and list the necessary dependencies:

    flask
    pandas
    scikit-learn
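
A pickled scikit-learn model is not guaranteed to load correctly under a different library version, so it can help to install the same versions in the container that were used for training. The sketch below is an illustration rather than part of the original steps. It prints name==version lines for packages installed in the current environment using importlib.metadata; the helper name pinned_requirement_lines is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def pinned_requirement_lines(packages):
    """Return 'name==version' for each package installed in this environment."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            # Skip packages that are not installed here instead of guessing a version.
            continue
    return lines


if __name__ == "__main__":
    for line in pinned_requirement_lines(["flask", "pandas", "scikit-learn"]):
        print(line)
```

Running this in the environment where train_model.py was executed gives lines that could replace the unpinned names in requirements.txt.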
    

Step 5: Create the Dockerfile

  1. Create a Dockerfile in the project root:

    # Use the official Python image from the Docker Hub
    FROM python:3.9-slim
    
    # Set the working directory in the container
    WORKDIR /app
    
    # Copy the current directory contents into the container at /app
    COPY . /app
    
    # Install any needed packages specified in requirements.txt
    RUN pip install --no-cache-dir -r requirements.txt
    
    # Document that the app listens on port 5000 (the port is published with -p at run time)
    EXPOSE 5000
    
    # Tell the flask command where the application lives
    ENV FLASK_APP=app/main.py
    
    # Start the Flask development server when the container launches
    CMD ["flask", "run", "--host=0.0.0.0"]
    

Step 6: Create the Input JSON

  1. Create input.json in the project root:

    {
        "feature1": 5.1,
        "feature2": 3.5,
        "feature3": 1.4,
        "feature4": 0.2
    }
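
The keys feature1 to feature4 are placeholders. The model from Step 1 was trained on the four Iris measurements in a fixed column order, and the API builds its DataFrame in whatever order the keys arrive. As a hedged sketch, a small helper could validate a payload and put its values in a fixed order before scoring. FEATURE_ORDER and to_feature_row are hypothetical names, and the key names simply mirror input.json.

```python
import json

# Assumption: feature1..feature4 correspond to the Iris columns in training order
# (sepal length, sepal width, petal length, petal width).
FEATURE_ORDER = ["feature1", "feature2", "feature3", "feature4"]


def to_feature_row(payload, feature_order=FEATURE_ORDER):
    """Return the payload's values as floats in the fixed feature order."""
    missing = [name for name in feature_order if name not in payload]
    if missing:
        raise ValueError(f"missing features: {', '.join(missing)}")
    unexpected = [name for name in payload if name not in feature_order]
    if unexpected:
        raise ValueError(f"unexpected features: {', '.join(unexpected)}")
    return [float(payload[name]) for name in feature_order]


# Keys deliberately out of order to show that the row is reordered.
payload = json.loads('{"feature2": 3.5, "feature1": 5.1, "feature4": 0.2, "feature3": 1.4}')
row = to_feature_row(payload)
print(row)  # [5.1, 3.5, 1.4, 0.2]
```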
    

Step 7: Build and Run the Docker Container

  1. Navigate to your project directory:

    cd project
    
  2. Build the Docker image:

    docker build -t my-scoring-app .
    
  3. Run the Docker container:

    docker run -p 5000:5000 my-scoring-app
    

Step 8: Make an Authenticated Request

  1. Use curl to send a request with the JSON file:

    curl -X POST -H "Content-Type: application/json" -H "Authorization: your_secure_api_token" -d @input.json http://localhost:5000/predict
    

Replace your_secure_api_token with the actual token you defined in main.py. The command sends the contents of input.json as the request body and returns the model's prediction as JSON.
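
The same request can also be made from Python. The sketch below uses only the standard library's urllib.request and assumes the container from Step 7 is listening on http://localhost:5000. build_predict_request and send are illustrative names.

```python
import json
import urllib.request


def build_predict_request(url, token, features):
    """Build a POST request equivalent to the curl command in Step 8."""
    body = json.dumps(features).encode("utf-8")
    headers = {"Content-Type": "application/json", "Authorization": token}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def send(request, timeout=10):
    """Send the request and decode the JSON response (requires a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_predict_request(
    "http://localhost:5000/predict",
    "your_secure_api_token",
    {"feature1": 5.1, "feature2": 3.5, "feature3": 1.4, "feature4": 0.2},
)
# With the container running: print(send(request))
```

Note that urlopen raises urllib.error.HTTPError for error statuses such as 401, so a wrong token surfaces as an exception rather than as a returned dictionary.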