MauricioGarciaS cea5eda985

feat(recommendations): Added services recommendation (ml_service) and trainer (ml_trainer) (#1275 )

* Created two services: recommendation training and recommendation serving

* Deleted Docker temporary

* Added features based in signals information

* Added method to get sessions features using PG

* Added same utils and core elements into ml_trainer

* Added checks before training models, added handler for model serving

* Updated serving API and recommendation functions to use frontend signals features

* reorganized modules to have base image and for both serving and training

* Added Dockerfiles and base Dockerfile

* Solved issue while ordering sessions by relevance

* Added method to save user feedback of recommendations

* Added security authorization

* Updated Dockerfile

* fixed issues with secret insertion to API

* Updated feedback structure

* Added git for dags

* Solved issue of insertion on recommendation feedback

* Changed update method from def to async def and it is called during startup

* Solved issues of airflow running mlflow in dag

* Changes sanity checks and added middleware params

* base path renaming

* Changed update method to a interval method which loads one model each 10s if there are models to download

* Added sql files for recommendation service and trainer

* Cleaned files and added documentation for methods and classes

* Added README file

* Renamed endpoints, changed None into empty array and updated readme

* refactor(recommendation): optimized query

* style(recommendation): changed import to top file, renamed endpoints parameters, function optimization

* refactor(recommendation): .gitignore

* refactor(recommendation): .gitignore

* refactor(recommendation): Optimized Dockerfiles

* refactor(recommendation): changed imports

* refactor(recommendation): optimized requests

* refactor(recommendation): optimized requests

* Fixed boot for fastapi, updated some queries

* Fixed issues while downloading models and while returning json response from API

* limited number of recommendations and set a minimum score to present recommendations

* fix(recommendation): fixed some queries and updated prediction method

* Added env value to control number of predictions to make

* docs(recommendation): Added third party libraries used in recommendation service

* frozen requirements

* Update base_crons.py

added `misfire_grace_time` to recommendation crons

---------

Co-authored-by: Taha Yassine Kraiem <tahayk2@gmail.com>

2023-06-07 15:58:33 +02:00


Recommendations

index

Build image
  1. Recommendations service image
  2. Trainer service image
Trainer service
  1. Trainer env params
Recommendation service
  1. Recommendation env params

Build image

To build the recommendation and trainer images, first create a base image by running the following command:

docker build -t recommendations_base .

This adds the files from core and utils, which are shared by ml_service and ml_trainer, and installs the common dependencies.

Recommendations service image

Inside ml_service, run docker build to create the recommendation service image:

cd ml_service/
docker build -t recommendations .
cd ../

Trainer service image

Inside ml_trainer, run docker build to create the trainer service image:

cd ml_trainer/
docker build -t trainer .
cd ../
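
The build order described above (base image first, then the two service images) can also be scripted. The following is a minimal Python sketch, assuming it is run from the directory that holds the base Dockerfile together with ml_service/ and ml_trainer/. The names `BUILD_STEPS`, `build_plan` and `build_all` are hypothetical and not part of the repository.

```python
import subprocess
from pathlib import Path

# (image tag, build context relative to the recommendations root), in build order.
# The base image comes first, as the README requires it before the other two.
BUILD_STEPS = [
    ("recommendations_base", "."),
    ("recommendations", "ml_service"),
    ("trainer", "ml_trainer"),
]


def build_plan(root="."):
    """Return the docker commands as (argv, working_directory) pairs, in order."""
    root_path = Path(root)
    return [
        (["docker", "build", "-t", tag, "."], str(root_path / context))
        for tag, context in BUILD_STEPS
    ]


def build_all(root=".", dry_run=False):
    """Run each build in order and stop at the first failure."""
    for argv, cwd in build_plan(root):
        print(f"[{cwd}] {' '.join(argv)}")
        if not dry_run:
            # check=True raises CalledProcessError if docker exits non-zero.
            subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    build_all(dry_run=True)  # set dry_run=False to actually build the images
```

With `dry_run=True` the script only prints the three commands, which is a quick way to confirm the order before running a real build.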

Trainer service

The trainer is an orchestration service in charge of training models and saving them to S3. It uses Airflow Directed Acyclic Graphs (DAGs) for orchestration and MLflow to monitor training runs and maintain a model registry on S3.

Trainer env params

 pg_host=
 pg_port=
 pg_user=
 pg_password=
 pg_dbname=
 pg_host_ml=
 pg_port_ml=
 pg_user_ml=
 pg_password_ml=
 pg_dbname_ml='mlruns'
 PG_POOL='true'
 MODELS_S3_BUCKET= #'s3://path/to/bucket'
 pg_user_airflow=
 pg_password_airflow=
 pg_dbname_airflow='airflow'
 pg_host_airflow=
 pg_port_airflow=
 AIRFLOW_HOME=/app/airflow
 airflow_secret_key=
 airflow_admin_password=
 crons_train='0 0 * * *'
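
Before starting the trainer it can help to confirm that the settings above are present. The following is a minimal sketch using only the Python standard library. It assumes that every variable shown without a default value is required and that `crons_train` is a standard five-field cron expression; neither assumption is confirmed by the service code. `REQUIRED_TRAINER_VARS`, `missing_vars` and `split_cron` are hypothetical names.

```python
import os

# Variables listed above without a default value (assumed to be required).
REQUIRED_TRAINER_VARS = [
    "pg_host", "pg_port", "pg_user", "pg_password", "pg_dbname",
    "pg_host_ml", "pg_port_ml", "pg_user_ml", "pg_password_ml",
    "MODELS_S3_BUCKET",
    "pg_host_airflow", "pg_port_airflow", "pg_user_airflow", "pg_password_airflow",
    "airflow_secret_key", "airflow_admin_password",
]

# Field order of a standard five-field cron expression.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def missing_vars(env, required=REQUIRED_TRAINER_VARS):
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


def split_cron(expression):
    """Split a five-field cron expression into a field-name -> value mapping."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 cron fields, got {len(parts)}: {expression!r}")
    return dict(zip(CRON_FIELDS, parts))


if __name__ == "__main__":
    missing = missing_vars(os.environ)
    if missing:
        print("Missing trainer settings:", ", ".join(missing))
    # '0 0 * * *' is the value shown above: minute 0 of hour 0, every day.
    print(split_cron(os.environ.get("crons_train", "0 0 * * *")))
```

Read as a standard cron expression, the `crons_train` value `'0 0 * * *'` schedules training once a day at midnight.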

Recommendation service

The recommendation service is a FastAPI server that uses MLflow to read models from S3 and serve them. It also collects user feedback and saves it to PostgreSQL for retraining.

Recommendation env params

 pg_host=
 pg_port=
 pg_user=
 pg_password=
 pg_dbname=
 pg_host_ml=
 pg_port_ml=
 pg_user_ml=
 pg_password_ml=
 pg_dbname_ml='mlruns'
 PG_POOL='true'
 API_AUTH_KEY=
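
Both services take two sets of PostgreSQL settings: the unsuffixed `pg_*` variables and the `pg_*_ml` variables, whose database defaults to `mlruns`. The following is a minimal sketch of turning either set into a `postgresql://` connection URL with the Python standard library. The helper `pg_url` is hypothetical, and the services may assemble their connections differently.

```python
import os
from urllib.parse import quote


def pg_url(env, suffix="", default_dbname=None):
    """Build a postgresql:// URL from the pg_* variables ending in `suffix`."""

    def get(key):
        return env.get(f"pg_{key}{suffix}", "")

    # Percent-encode credentials so characters such as '@' or '/' stay valid.
    user = quote(get("user"), safe="")
    password = quote(get("password"), safe="")
    dbname = get("dbname") or default_dbname or ""
    return f"postgresql://{user}:{password}@{get('host')}:{get('port')}/{dbname}"


if __name__ == "__main__":
    # Main database, read from the unsuffixed variables.
    print(pg_url(os.environ))
    # Second database, read from the *_ml variables, with the README's default name.
    print(pg_url(os.environ, suffix="_ml", default_dbname="mlruns"))
```

Keeping the suffix as a parameter means one helper covers both sets of variables, and it would cover the trainer's `pg_*_airflow` set as well.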
